U.S. patent application number 09/796230, filed with the patent office on 2001-02-28 and published on 2001-08-30, is for simultaneous measurement of gene expression and genomic abnormalities using nucleic acid microarrays.
Invention is credited to Bao, Yijia, Che, Diping, Li, Wan-Liang, Muller, Uwe Richard, Seelig, Steven A., Shi, Jufang.
Application Number | 20010018183 09/796230 |
Family ID | 22917233 |
Filed Date | 2001-02-28 |
United States Patent
Application |
20010018183 |
Kind Code |
A1 |
Bao, Yijia ; et al. |
August 30, 2001 |
Simultaneous measurement of gene expression and genomic
abnormalities using nucleic acid microarrays
Abstract
The invention comprises a multi-color, comparative hybridization
assay method using an array of nucleic acid target elements
attached to a solid support for the simultaneous detection of both
gene expression and chromosomal abnormalities in a tissue sample.
The method of the invention employs a comparative hybridization of
a tissue mRNA or cDNA sample labeled in a first fluorescent color,
a tissue chromosomal DNA sample labeled in a second fluorescent
color, and at least one reference nucleic acid labeled in a third
fluorescent color, to the array. The fluorescent color presence and
intensity at each of at least two target elements are detected and
the fluorescent ratios (i) of the first and third colors and (ii)
the second and third colors determined. Gene expression and
chromosomal abnormalities are thus simultaneously detected.
Inventors: |
Bao, Yijia; (Naperville,
IL) ; Che, Diping; (Westmont, IL) ; Li,
Wan-Liang; (Lisle, IL) ; Muller, Uwe Richard;
(Plano, IL) ; Seelig, Steven A.; (Naperville,
IL) ; Shi, Jufang; (Hinsdale, IL) |
Correspondence
Address: |
VYSIS, INC
LAW DEPARTMENT
3100 WOODCREEK DRIVE
DOWNERS GROVE
IL
60515
|
Family ID: |
22917233 |
Appl. No.: |
09/796230 |
Filed: |
February 28, 2001 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
09796230 | Feb 28, 2001 | |
09243067 | Feb 2, 1999 | |
Current U.S.
Class: |
435/6.13 ;
435/6.1 |
Current CPC
Class: |
B01J 2219/00596
20130101; B01J 2219/00608 20130101; B01J 2219/00612 20130101; C12Q
1/6837 20130101; B01J 2219/00659 20130101; B01J 2219/00637
20130101; B01J 2219/00529 20130101; B01J 2219/00707 20130101; B01J
2219/00605 20130101; C40B 40/06 20130101; B01J 2219/00722
20130101 |
Class at
Publication: |
435/6 |
International
Class: |
C12Q 001/68 |
Government Interests
[0001] The United States has certain rights in this invention
pursuant to a grant for ATP Project No. 94-05-0021, Award No.
70NANB5H1108 from the National Institute of Standards and
Technology.
Claims
We claim:
1. A method for simultaneous detection of gene expression and
chromosomal abnormality in a tissue sample comprising: (a)
providing an array of nucleic acid target elements attached to a
solid support wherein the nucleic acid target elements comprise
polynucleotide sequences substantially complementary under
preselected hybridization conditions to nucleic acids indicative of
gene expression and of chromosomal sequence of a tissue sample; (b)
providing at least three labeled nucleic acid populations: (i) a
mRNA or cDNA population labeled with a first marker and derived
from the tissue sample, (ii) a chromosomal DNA population labeled
with a second marker and derived from the tissue sample, and (iii)
at least one reference nucleic acid population labeled with a third
marker; (c) contacting the array with the labeled nucleic acid
populations under hybridization conditions; and (d) detecting
presence and intensity of each of the first, second and third
markers to at least two target elements.
2. The method of claim 1 wherein the target elements comprise
genomic DNA.
3. The method of claim 1 wherein the target elements comprise
cDNA.
4. The method of claim 1 wherein the tissue sample is from a
human.
5. The method of claim 1 wherein the array comprises cDNA and
genomic DNA target elements.
6. The method of claim 1 wherein the array comprises target
elements at a density in the range of 100 to 10,000 target elements
per square centimeter.
7. The method of claim 1 wherein the first, second and third
markers each comprise a different fluorescent label.
8. The method of claim 1 further comprising processing data from
the detecting step (c) in a programmed computer, storing raw and
processed data in a database and displaying raw and processed
data.
9. The method of claim 1 further comprising addition of unlabeled
blocking nucleic acid.
10. The method of claim 4 further comprising use of data derived
from the method in selection of therapy for a human.
11. The method of claim 1 further comprising determining
fluorescent ratios at each target element (i) between the first and
third colors and (ii) between the second and third colors.
12. The method of claim 1 wherein the tissue comprises a cell line
sample.
13. The method of claim 1 wherein the tissue sample comprises one
cell.
14. The method of claim 1 wherein the tissue sample comprises a
human tumor sample.
15. The method of claim 1 wherein the tissue sample comprises blood
cells.
16. The method of claim 2 wherein the genomic DNA comprises human
genomic DNA having a complexity in a range of 20 kb to 250 kb.
17. The method of claim 3 wherein the cDNA comprises cDNA having a
complexity in a range of 100 bp to 5,000 bp.
18. The method of claim 1 wherein the target nucleic acid elements
comprise at least one peptide nucleic acid.
19. The method of claim 1 wherein the method is performed in a
mesoscale device.
20. A method for simultaneous detection of gene expression and
chromosomal abnormality in a tissue sample comprising: (a)
providing an array of nucleic acid target elements comprising
genomic DNA attached to a solid support wherein the nucleic acid
target elements comprise polynucleotide sequences substantially
complementary under preselected hybridization conditions to nucleic
acids indicative of gene expression and of chromosomal sequence of
a tissue sample; (b) providing at least three labeled nucleic acid
populations: (i) a mRNA or cDNA population labeled with a first
fluorescent color and derived from the tissue sample, (ii) a
chromosomal DNA population labeled with a second fluorescent color
and derived from the tissue sample, and (iii) at least one
reference nucleic acid population labeled with a third fluorescent
color; (c) contacting the array with the labeled nucleic acid
populations under hybridization conditions; and (d) detecting
presence and intensity of each of the first, second and third
fluorescent colors at at least two target elements.
21. The method of claim 20 wherein the array comprises at least 100
target elements on a planar surface of a substrate.
22. The method of claim 20 wherein the array comprises target
elements at a density in the range of 100 to 10,000 target elements
per square centimeter.
23. The method of claim 20 further comprising determining
fluorescent ratios at each target element (i) between the first and
third colors and (ii) between the second and third colors.
24. The method of claim 20 further comprising processing data from
the detecting step (c) in a programmed computer, storing raw and
processed data in a database and displaying raw and processed
data.
25. The method of claim 20 further comprising addition of unlabeled
blocking nucleic acid.
26. The method of claim 26 further comprising use of data from the
method in selection of therapy for a human.
27. The method of claim 20 wherein the chromosomal DNA population
is produced by a method comprising PCR.
28. The method of claim 20 wherein the tissue sample comprises one
cell.
29. The method of claim 20 wherein the tissue sample comprises a
human tumor sample.
30. The method of claim 20 wherein the tissue sample comprises
blood cells.
31. The method of claim 20 wherein the tissue sample comprises a
human blastomere cell or a human polar body.
32. The method of claim 20 wherein the tissue sample is produced by
microdissection.
33. The method of claim 20 wherein the method is performed in a
mesoscale device.
34. A method for simultaneous detection of gene expression and
chromosomal abnormality in a tissue sample comprising: (a)
providing an array of nucleic acid target elements attached to a
solid support wherein the nucleic acid target elements comprise
polynucleotide sequences substantially complementary under
preselected hybridization conditions to nucleic acids indicative of
gene expression and of chromosomal sequence of a tissue sample; (b)
providing at least three labeled nucleic acid populations: (i) a
mRNA or cDNA population labeled with a first fluorescent color and
derived from the tissue sample, (iii) at least one reference
nucleic acid population labeled with a third fluorescent color; (c)
contacting the array with the labeled nucleic acid populations
under hybridization conditions; and (d) detecting presence and
intensity of each of the first, second and third fluorescent colors
at at least two target elements.
35. The method of claim 34 wherein the target nucleic acid elements
comprise oligomers in the range of 8 bp to about 100 bp.
36. The method of claim 34 wherein the array comprises at least 100
target elements.
37. The method of claim 34 wherein the array comprises target
elements at a density in the range of 100 to 10,000 target elements
per square centimeter.
38. The method of claim 34 further comprising determining
fluorescent ratios at each target element (i) between the first and
third colors and (ii) between the second and third colors.
39. The method of claim 34 further comprising processing data from
the detecting step (c) in a programmed computer, storing raw and
processed data in a database and displaying raw and processed
data.
40. The method of claim 34 further comprising addition of unlabeled
blocking nucleic acid.
41. The method of claim 34 further comprising use of data from the
method in selection of a therapy for a human.
42. The method of claim 34 wherein the chromosomal DNA population
is produced by a method comprising PCR.
43. The method of claim 34 wherein the tissue sample comprises one
cell.
44. The method of claim 34 wherein the tissue sample comprises a
human tumor sample.
45. The method of claim 34 wherein the tissue sample comprises
blood cells.
46. The method of claim 34 wherein the tissue sample is produced by
microdissection.
47. The method of claim 34 wherein the cDNA comprises cDNA having a
complexity in a range of 100 bp to 5,000 bp.
48. The method of claim 34 wherein the target nucleic acid elements
comprise at least one peptide nucleic acid.
49. The method of claim 34 wherein the method is performed in a
mesoscale device.
50. The method of claim 1 wherein the target elements comprise
polynucleotides in the range of 8 bp to about 100 bp.
51. The method of claim 4 wherein the tissue sample comprises
bladder tissue.
52. The method of claim 4 wherein the tissue sample comprises lung
tissue.
53. The method of claim 4 wherein the tissue sample comprises
prostate tissue.
54. The method of claim 4 wherein the tissue sample comprises
breast tissue.
55. The method of claim 4 wherein the tissue sample comprises
esophageal tissue.
56. The method of claim 4 wherein the tissue sample comprises
cervical tissue.
57. The method of claim 4 wherein the tissue sample comprises
ovarian tissue.
58. The method of claim 4 wherein the tissue sample comprises colon
tissue.
59. The method of claim 4 wherein the tissue sample comprises brain
tissue.
60. The method of claim 4 wherein the tissue sample comprises
stomach tissue.
61. The method of claim 4 wherein the tissue sample comprises skin
tissue.
62. The method of claim 4 wherein the tissue sample comprises
pancreas tissue.
63. The method of claim 4 wherein the tissue sample comprises a
human blastomere.
64. The method of claim 4 wherein the tissue sample comprises a
human polar body.
65. The method of claim 1 comprising use of at least two reference
nucleic acid populations.
66. The method of claim 1 comprising use of at least four reference
nucleic acid populations.
67. The method of claim 20 comprising use of at least two reference
nucleic acid populations.
68. The method of claim 34 comprising use of at least two reference
nucleic acid populations.
69. The method of claim 4 comprising use of at least two reference
nucleic acid populations.
70. The method of claim 4 wherein the tissue sample comprises a
cancer cell line.
71. The method of claim 20 wherein at least four separate
fluorescently labeled nucleic acid populations are hybridized with
the array.
72. The method of claim 26 wherein at least eight separate
fluorescently labeled nucleic acid populations are hybridized with
the array.
73. The method of claim 5 wherein at least four separate
fluorescently labeled nucleic acid populations are hybridized with
the array.
74. The method of claim 34 wherein at least four separate
fluorescently labeled nucleic acid populations are hybridized with
the array.
75. The method of claim 34 wherein at least eight separate
fluorescently labeled nucleic acid populations are hybridized with
the array.
76. The method of claim 8 which further comprises: displaying at
least one chromosome ideogram with array data.
77. The method of claim 24 which further comprises: displaying at
least one chromosome ideogram with array data.
78. The method of claim 46 which further comprises: displaying at
least one chromosome ideogram with array data.
Description
FIELD OF THE INVENTION
[0002] This invention relates generally to the assessment of
nucleic acids in human or animal tissue samples. More particularly,
the invention relates to the simultaneous measurement in tissue
samples of gene expression and of chromosome abnormalities.
BACKGROUND OF THE INVENTION
[0003] Abnormalities in the expression of genes, both in the timing
and level of expression of particular genes, are a fundamental
cause of cancer and other human disease. Abnormalities in genomic
DNA, i.e. in chromosomes, are also a fundamental cause of cancer
and other human disease, often leading to the over-expression or
under-expression of genes. Some chromosomal abnormalities, such as
balanced translocations and inversions between chromosomes, and
base pair changes, do not involve a change in DNA sequence copy
number. Other genomic DNA abnormalities comprise changes in DNA
sequence copy number from the normal one copy per chromosome. These
genomic DNA abnormalities often are referred to as gene
amplification for copy number increase and gene deletion for copy
number decrease. For example, one aggressive form of breast cancer,
occurring in about 25-30% of breast cancers, results from the gene
amplification and over-expression of the Her-2/neu oncogene, which
is located on chromosome 17 at band q12. Breast cancer patients
with this genetic abnormality have a significantly poorer
prognosis, both for overall survival and disease-free survival,
than patients without this abnormality. In addition,
over-expression of the Her-2 gene occurs, in the absence of gene
amplification of the chromosomal locus of the gene, at an earlier,
less aggressive stage of the disease. See Borg, et al., "Her-2/neu
Activity in Human Breast Cancer," Cancer Research 50, 4332-4337
(Jul. 15, 1990). Proper assessment and management of breast cancer
thus requires tests to measure the presence of Her-2 gene
expression and Her-2 gene chromosomal copy number.
[0004] Chromosomal abnormalities such as Her-2 gene copy number can
be assessed by assays using fluorescent in situ hybridization
("FISH"). FISH assays involve hybridization of DNA probes to
chromosomal DNA present in morphologically intact metaphase spreads
or interphase cells of tissue samples. The U.S. Food and Drug
Administration recently approved a diagnostic FISH test,
PathVysion™ Her-2, available from Vysis, Inc. (Downers Grove,
Ill.) for detection of Her-2 copy number and prediction of outcome
of adriamycin therapy in node positive breast cancer patients.
[0005] Cancer also involves abnormalities in multiple genes,
leading to multiple forms of the disease, as exemplified by breast
cancer, wherein the Her-2 oncogene is not abnormal in the majority
of cases. So-called "DNA Chip" or "microarray" tests using
hybridization to a two dimensional array of multiple nucleic acid
probes attached to a solid substrate assess multiple gene
expression abnormalities simultaneously. See for example, U.S. Pat.
No. 5,445,934, "Array of Oligonucleotides on Solid Substrate,"
Fodor, et al., U.S. Pat. No. 5,800,992, "Method of Detecting
Nucleic Acids," Fodor, et al., and U.S. Pat. No. 5,807,552,
"Methods for Fabricating Microarrays of Biological Substances,"
Brown, et al. The microarray gene expression tests are of growing
use in the development of new drugs targeted at particular
diseases.
[0006] Multiple gene expression at the protein level also can be
examined by the use of "microdot" immunoassays, which are two
dimensional arrays of immobilized antigens on a substrate. See U.S.
Pat. No. 5,486,452, "Devices and Kits for Immunological Analysis,"
Gordon, et al., priority date Feb. 3, 1982, and Ekins, et al,
Analytica Chimica Acta, 227:73-96 (1989). The immobilized antigens
of Gordon, et al. include nucleic acids and are disclosed as
arrayed at densities of 10⁵ per 10 square centimeters (or
1,000 per cm²). Gordon, et al. further disclose the array has
"intrinsic resolution" below the size of pipetting devices common
in 1982, see Gordon, et al. at column 17, and can thus contain
antigens at higher densities. Gordon, et al. disclose that the
arrays can be manufactured by use of mechanical transfer apparatus,
miniaturized applicators, lithographic procedures or high speed
electronic printing.
[0007] U.S. Pat. No. 5,665,549, "Comparative Genomic Hybridization
(CGH)," Pinkel, et al., discloses a method for simultaneous
assessment of multiple genetic abnormalities. CGH involves the
comparative, multi-color hybridization of a reference nucleic acid
population labeled in one fluorescent color and a sample nucleic
acid population labeled in a second fluorescent color to all or
part of a reference genome, such as a human metaphase chromosome
spread. Comparison of the resulting fluorescence intensity at
locations in the reference genome permits determination of copy
number of chromosomal sequences, or of expressed gene sequences, in
the sample population. Microarray-based CGH tests have also been
disclosed for the assessment of multiple genomic DNA or gene
expression abnormalities, see U.S. Pat. No. 5,830,645, "Comparative
Fluorescent Hybridization to Nucleic Acid Arrays," Pinkel, et al.;
co-pending and commonly assigned U.S. Patent Application Serial
Number 09/085,625, "Improvements of Biological Assays for Analyte
Detection," Muller, et al.; and Pinkel, et al., "High resolution
analysis of DNA copy number variation using comparative genomic
hybridization to microarrays," Nature Genetics, Vol. 20, Oct. 1998,
pp. 207-211. Pinkel, et al. in Nature Genetics disclose the
capability of CGH to a microarray target to detect a single copy
change in genomic DNA.
[0008] To date, assessment of gene expression and of chromosomal
abnormalities requires separate tests on a tissue sample, leading
to extra sample processing and reagent costs. Separate testing for
gene expression and chromosomal abnormalities can also require more
tissue than is available. The prior art does not disclose
simultaneous measurement of gene expression and chromosomal
abnormalities with a multi-color hybridization to a microarray. It
is an object of this invention to circumvent separate testing by
performing simultaneous testing for gene expression and chromosomal
abnormalities on a tissue sample. It is another object to
simultaneously test gene expression and chromosomal abnormalities
on a single nucleic acid microarray. Other objects of the invention
will be detailed below.
SUMMARY OF THE INVENTION
[0009] The invention comprises a multi-color, comparative
hybridization assay method using an array of nucleic acid target
elements attached to a solid support for the simultaneous detection
of both gene expression and chromosomal abnormalities in a tissue
sample. The method of the invention employs a comparative
hybridization of a tissue mRNA or cDNA sample labeled with a first
detectable marker, a tissue genomic DNA sample labeled with a
second detectable marker, and at least one reference nucleic acid
labeled with a third detectable marker, to the array. Each marker's
presence and intensity at each target element is detected and the
ratios of the markers, for example, (1) of the first and third
markers and (2) the second and third markers, are determined for
each of the target elements. Gene expression and chromosomal
abnormalities are thus simultaneously detected by analysis of the
marker ratios. In a preferred embodiment, the markers are each
fluorescent labels.
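The ratio determination described in this paragraph can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicants' implementation: the function name `marker_ratios`, the dictionary layout, and the intensity values are hypothetical, and background subtraction and normalization are omitted.

```python
def marker_ratios(intensities):
    """Return (first/third, second/third) marker ratios per target element.

    `intensities` maps a target element name (hypothetical labels) to a
    tuple (first, second, third) of marker intensities, where the first
    marker labels the tissue mRNA/cDNA, the second labels the tissue
    genomic DNA, and the third labels the reference nucleic acid.
    Elements with a non-positive reference signal are skipped.
    """
    ratios = {}
    for element, (first, second, third) in intensities.items():
        if third <= 0:
            continue  # no usable reference signal at this element
        ratios[element] = (first / third, second / third)
    return ratios


# Hypothetical intensities: (expression, genomic, reference)
example = {
    "element_A": (1200.0, 400.0, 400.0),
    "element_B": (300.0, 800.0, 400.0),
    "element_C": (50.0, 60.0, 0.0),
}
ratios = marker_ratios(example)
print(ratios)
```

In this made-up data, element_A shows an elevated first/third ratio with an unchanged second/third ratio, while element_B shows the reverse; element_C is dropped because it has no reference signal.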
[0010] The invention has broad utility in human disease management
by providing more complete genetic assessment data to guide therapy
selection, in human and animal drug development programs by
assessing therapeutic candidate effects, and in bacterial and viral
pathogen diagnosis. Particular cancers, which are characterized by
gene amplification coupled with over-expression of the mRNA for the
amplified gene, may be more aggressive diseases and need more
aggressive therapies. The mechanism that drives over-expression
could be fundamental in understanding what therapeutic
interventions may be appropriate. Thus, the characterization of
both gene expression and amplification by the methods of the
invention can lead to improved cancer therapy.
[0011] In a preferred embodiment, the invention comprises a method
for simultaneous detection of gene expression and chromosomal
abnormality in a tissue sample comprising:
[0012] (a) providing a microarray of nucleic acid target elements
attached to a solid support wherein the nucleic acid target
elements comprise polynucleotide sequences substantially
complementary under preselected hybridization conditions to nucleic
acids present in a tissue sample, which are indicative of gene
expression and indicative of chromosomal sequence;
[0013] (b) providing at least three labeled probe nucleic acid
populations:
[0014] (i) a cDNA population labeled in a first fluorescent color
and derived from mRNA from the tissue sample,
[0015] (ii) a chromosomal DNA population labeled in a second
fluorescent color and derived from the tissue sample, and
[0016] (iii) at least one reference nucleic acid population labeled
in a third fluorescent color;
[0017] (c) contacting the microarray with the labeled nucleic acid
populations under hybridization conditions; and
[0018] (d) detecting presence and intensity of each of the first,
second and third fluorescent label colors on at least two target
elements.
[0019] Measurement and comparison of hybridization of message,
genomic and reference nucleic acids at the same target elements
provides the simultaneous assessment of expression and genomic
changes. The invention also comprises use of multiple reference
nucleic acids, for example, a genomic reference DNA labeled in the
third fluorescent color and a reference cDNA population labeled in
a fourth fluorescent color. The nucleic acid target elements can be
either genomic DNA, oligomer DNA or cDNA. A preferred embodiment
comprises an array with a mixture of genomic DNA target elements
and oligomer DNA or cDNA target elements, with the oligomer
DNA/cDNA targets measuring expression and the genomic DNA targets
measuring chromosomal change. It is also preferred to use a
microarray having a target element density capable of measuring
1,000 different gene and genomic loci in less than one square
centimeter of chip surface.
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] FIGS. 1(a) through 1(e) depict the components of a preferred
hybridization cartridge for use in performing the inventive
methods.
[0021] FIGS. 2(a) through 2(h) depict data from a nucleic acid
microarray after hybridization with tissue cDNA and genomic DNA
populations, each derived from a human cancer cell line, one
labeled red, the other green, and a total human genomic DNA
reference population labeled orange, which show the capability of
the method of the invention to detect simultaneously both gene
expression and chromosomal abnormalities on the same nucleic acid
microarray.
DETAILED DESCRIPTION OF THE INVENTION
[0022] (1) Definitions
[0023] The following abbreviations are used herein:
[0024] bp--base pair
[0025] CGH--Comparative Genomic Hybridization
[0026] DAPI--4',6-diamidino-2-phenylindole
[0027] dCTP--deoxycytidine triphosphate
[0028] DNA--deoxyribonucleic acid (in either single- or
double-stranded form, including analogs that can function in a
similar manner)
[0029] dUTP--deoxyuridine triphosphate
[0030] FISH--fluorescence in situ hybridization
[0031] kb--kilobase
[0032] mm--millimeter
[0033] mRNA--messenger RNA
[0034] ng--nanogram
[0035] nl--nanoliter
[0036] RNA--ribonucleic acid in either single- or double-stranded
form, including analogs that can function in a similar manner
[0037] µg--microgram
[0038] µl--microliter
[0039] µm--micrometer
[0040] µM--micromolar
[0041] The terms "nucleic acid" and "nucleic acid molecule" refer to
a deoxyribonucleotide or ribonucleotide polymer in either single-
or double-stranded form, including known analogs of natural
nucleotides that can function in a similar manner as naturally
occurring nucleotides.
[0042] The term "exon" refers to any segment of an interrupted gene
that is represented in the mature mRNA product. Some protein coding
genes do have exons that are non-coding, e.g., exon 1 of the human
c-myc gene. Perhaps all protein coding genes have first and last
exons that are partially coding.
[0043] The terms "single copy sequence" or "unique sequence" refer
to a nucleic acid sequence that is typically present only once per
haploid genome, such as the coding exon sequences of a gene.
[0044] The term "complexity" is used herein according to the standard
meaning of this term as established by Britten, et al., Methods in
Enzymol., 29:363 (1974). See also Cantor and Schimmel, Biophysical
Chemistry: Part III at 1228-1230, for further explanation of
nucleic acid complexity.
[0045] The term "target element" refers to a region of a substrate
surface that contains immobilized or attached nucleic acids capable
of hybridization to nucleic acids isolated from a tissue
sample.
[0046] "Bind(s) substantially" refers to complementary
hybridization between a tissue nucleic acid and a target element
nucleic acid and embraces minor mismatches that can be accommodated
by reducing the stringency of the hybridization media to achieve
the desired detection of the tissue polynucleotide sequence.
[0047] The terms "specific hybridization" or "specifically
hybridizes with" refer to hybridization in which a tissue nucleic
acid binds substantially to target element nucleic acid and does
not bind substantially to other nucleic acids in the array under
defined stringency conditions. One of skill will recognize that
relaxing the stringency of the hybridizing conditions will allow
sequence mismatches to be tolerated. The degree of mismatch
tolerated can be controlled by suitable adjustment of the
hybridization conditions.
[0048] One of skill will also recognize that the precise sequence
of the particular nucleic acids described herein can be modified to
a certain degree to produce tissue nucleic acid probes or target
element nucleic acids that are "substantially identical" to others,
and retain the ability to bind substantially to a complementary
nucleic acid. Such modifications are specifically covered by
reference to individual sequences herein. The term "substantial
identity" of polynucleotide sequences means that a polynucleotide
comprises a sequence that has at least 90% sequence identity, and
more preferably at least 95%, compared to a reference sequence
using the methods described below using standard parameters.
[0049] Two nucleic acid sequences are said to be "identical" if the
sequence of nucleotides in the two sequences is the same when
aligned for maximum correspondence as described below. The term
"complementary to" is used herein to mean that the complementary
sequence is complementary to all or a portion of a reference
polynucleotide sequence.
[0050] Sequence comparisons between two (or more) polynucleotides
are typically performed by comparing the sequences
over a "comparison window" to identify and compare local regions of
sequence similarity. A "comparison window," as used herein, refers
to a segment of at least about 20 contiguous positions, usually
about 50 to about 200, more usually about 100 to 150, in which a
sequence may be compared to a reference sequence of the same number
of contiguous positions after the two sequences are optimally
aligned.
[0051] Optimal alignment of sequences for comparison may be
conducted by the local homology algorithm of Smith and Waterman,
Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm
of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by the search
for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci.
(U.S.A.) 85:2444 (1988), and by computerized implementations of
these algorithms.
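As an illustration of the first of these methods, the following Python sketch computes a Smith-Waterman local alignment score by dynamic programming. The scoring values (match, mismatch, and a linear gap penalty) are assumptions chosen for the example and do not come from this document; the sketch returns only the best score, not the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring parameters are illustrative assumptions. A linear gap
    penalty is used, and cell scores are floored at zero so that an
    alignment may start and end anywhere in either sequence.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment
                prev[j - 1] + step, # extend with a match or mismatch
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


score_same = smith_waterman_score("ACGT", "ACGT")
score_none = smith_waterman_score("AAAA", "TTTT")
score_part = smith_waterman_score("TTACGTTT", "GGACGGG")
print(score_same, score_none, score_part)
```

Keeping only two rows of the matrix is enough when just the score is needed; recovering the aligned region would require storing the full matrix and tracing back from the best cell.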
[0052] "Percentage of sequence identity" is determined by comparing
two optimally aligned sequences over a comparison window, wherein
the portion of the polynucleotide sequence in the comparison window
may comprise additions or deletions (i.e., gaps) as compared to the
reference sequence (which does not comprise additions or deletions)
for optimal alignment of the two sequences. The percentage is
calculated by determining the number of positions at which the
identical nucleic acid base occurs in both sequences to yield the
number of matched positions, dividing the number of matched
positions by the total number of positions in the window of
comparison and multiplying the result by 100 to yield the
percentage of sequence identity.
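The calculation described above can be written as a short Python function. This is a sketch under stated assumptions: the two inputs are taken to be already optimally aligned over the comparison window, gaps are represented by a hypothetical "-" character, and the function name `percent_identity` is illustrative.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of sequence identity over an aligned comparison window.

    Both strings must already be aligned and of equal length. A position
    counts as matched only when the same base occurs in both sequences;
    gap positions never match but still count toward the window length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matched / len(aligned_a)


# Hypothetical 10-position window with one mismatch and one gap
identity = percent_identity("ACGTACGTAC", "ACGTTCG-AC")
print(identity)
```

Here 8 of the 10 window positions carry the identical base, giving 80 percent identity, which would fall below the 90 percent level used above to define substantial identity.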
[0053] Another indication that nucleotide sequences are
substantially identical is if two molecules hybridize to the same
sequence under stringent conditions. Stringent conditions are
sequence dependent and will be different in different
circumstances. Generally, stringent conditions are selected to be
about 5° to about 25° C. lower than the thermal
melting point (Tm) for the specific sequence at a defined ionic
strength and pH. The Tm is the temperature (under defined ionic
strength and pH) at which the strands of a DNA duplex or RNA-DNA
hybrid are half dissociated or denatured.
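As a worked example of this temperature window, the Python sketch below returns the range lying 5° to 25° C. below a given Tm. The function name and the example Tm of 70° C. are hypothetical; the Tm itself must be determined separately for the specific sequence, ionic strength and pH.

```python
def stringent_temperature_window(tm_celsius, min_offset=5.0, max_offset=25.0):
    """Return (lowest, highest) temperature, in degrees C, of the window
    lying between max_offset and min_offset degrees below the Tm."""
    return (tm_celsius - max_offset, tm_celsius - min_offset)


# Hypothetical Tm of 70 degrees C for illustration only
low, high = stringent_temperature_window(70.0)
print(low, high)
```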
[0054] As used herein, a "probe" is defined as a population or
collection of tissue nucleic acid molecules (either RNA or DNA)
capable of binding to a target element comprising nucleic acid of
complementary sequence through one or more types of chemical bonds,
usually through hydrogen bond formation. The probe populations are
directly or indirectly labeled as described below. The probe
populations are typically of high complexity, for instance, being
prepared from total genomic DNA or total mRNA isolated from a
tissue cell or tissue cell population.
[0055] (2) Overview
[0056] The methods of the invention combine the capability of
assessment of a large number of nucleic acids provided by
microarray test formats with the multi-color, comparative
hybridization power of CGH to assess simultaneously both gene
expression and genomic abnormalities in the same tissue sample. The
methods of the invention employ hybridization, under suitable
hybridization conditions, of nucleic acid populations derived from a
tissue sample to a nucleic acid array comprising multiple nucleic
acid target elements. The nucleic acid target elements
comprise either genomic DNA, oligomer or cDNA nucleic acids
complementary to expressed gene sequences, or a mixture of the two.
The nucleic acid populations are separately labeled with different
detectable markers and comprise (1) a mixture of mRNA or its
complementary cDNA, which is representative of gene expression in
the tissue sample, and (2) a mixture of genomic DNA, which is
representative of the genomic status of the tissue sample. The
labeled nucleic acid populations are co-hybridized to the array
with one or more reference nucleic acid populations, with each
reference population also labeled with its own different detectable
marker. Preferably, all of the nucleic acid populations applied to
the array are each labeled with different fluorescent markers. The
reference nucleic acid or nucleic acids is or are chosen to permit
assessment of the gene expression state and genomic state of the
tissue sample relative to the reference or references. After a
suitable hybridization time, the fluorescent color presence and
intensity are detected at each target element of the array.
Comparison of the fluorescent ratios between colors at a particular
target element provides measurement of the copy number for genomic
DNA sequences and for cDNA sequences, which are complementary to
that target element.
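As a non-limiting sketch of the ratio comparison, the following
assumes a single reference population and background-corrected
intensities for three colors at one target element. The class and
function names, and the intensity values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ElementSignal:
    """Background-corrected intensities at one target element (arbitrary units)."""
    expression: float  # first color: tissue mRNA or cDNA
    genomic: float     # second color: tissue genomic DNA
    reference: float   # third color: reference nucleic acid


def color_ratios(signal: ElementSignal) -> Tuple[float, float]:
    """Return (expression/reference, genomic/reference) for one target element."""
    if signal.reference <= 0:
        raise ValueError("reference intensity must be positive")
    return (signal.expression / signal.reference,
            signal.genomic / signal.reference)


# Hypothetical element showing elevated expression and elevated copy number.
expr_ratio, genomic_ratio = color_ratios(ElementSignal(300.0, 200.0, 100.0))
print(expr_ratio, genomic_ratio)
```

Where more than one reference population is used, each tissue color
would be divided by the intensity of its own reference color instead.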
[0057] A genomic DNA sequence generally contains both one or more
"exon" sequences, which code for all or part of the RNA expressed
gene sequence, and one or more "intron," non-coding sequences,
which also often contain repeat sequences replicated at many points
in the human genome. A genomic target element can thus serve as a
hybridization target for the expressed gene sequences that map to
the particular genomic sequence. Similarly, a target element
complementary to a particular expressed gene sequence is also
complementary to the exon sequences of genomic DNA. Hence, a
genomic DNA target element and a cDNA target element can each be
used in an array format for hybridization to either genomic DNA or
expressed gene sequence nucleic acids. The array format used in the
methods of the invention comprises a microarray of separate nucleic
acid target elements each complementary to (1) a particular genomic
DNA sequence or (2) a particular expressed gene sequence. A mixture
of target elements comprising some target elements complementary to
(1) and some complementary to (2) can also be used.
[0058] A significant advantage of the methods of the invention is
the simultaneous determination of both gene expression and
chromosomal abnormality. Some aggressive, virulent forms of cancer
are characterized by both over-expression of one or more oncogenes
and gene amplification of the chromosomal locus of each oncogene,
such as breast cancer involving Her-2. Testing for over-expression
of the oncogene alone is inadequate for the complete
characterization of the disease state. Simultaneous testing of the
same tissue sample for both gene expression and chromosomal
abnormalities with the methods of the invention thus advantageously
identifies both over-expression and the molecular causes of
over-expression and thereby enables appropriate prognostic
assessment and therapy selection.
[0059] The choice of genomic, cDNA or a mixture of target elements
can vary with the tissue and analysis sought. For example, cDNA
target elements are advantageous because the effect of repeat
sequences present in some genomic DNAs is decreased and more
precise detection of expressed genes is possible. Genomic DNA
target elements are advantageous because the higher complexities
can produce greater signal. A mixture of genomic DNA and cDNA
target elements can also be used to provide more detailed genomic
and expression analysis.
[0060] (3) Nucleic Acids in the Target Elements
[0061] The nucleic acid sequences of the target elements can
comprise any type of nucleic acid or nucleic acid analog, including
without limitation, RNA, DNA, peptide nucleic acids or mixtures
thereof, and can be present as clones also comprising vector
sequences or can be substantially pure. Arrays comprising peptide
nucleic acids are disclosed in U.S. Pat. No. 5,821,060, "DNA
Sequencing, Mapping and Diagnostic Procedures Using Hybridization
Chips and Unlabeled DNA," H. Arlinghaus, et al.
[0062] The nucleic acids of a target element typically have their
origin in a defined region of a selected genome (for example a
clone or several contiguous clones from a human or animal genomic
library), or correspond to a functional genetic unit of a selected
genome, which may or may not be complete (for example a full or
partial cDNA sequence). The target nucleic acids can also comprise
inter-Alu or Degenerate Oligonucleotide Primer PCR products derived
from cloned DNA.
[0063] The nucleic acids of a target element can, for example,
contain specific genes or be from a chromosomal region suspected of
being present at increased or decreased copy number in cells of
interest, e.g., tumor cells. For example, separate target elements
can comprise DNA complementary to each of the oncogene loci listed
in Table 2 below. The target element may also contain an mRNA or
cDNA derived from such mRNA, suspected of being transcribed at
abnormal levels, for example, expressed genes mapping to the gene
loci in Table 2 below.
[0064] Alternatively, a target element may comprise nucleic acids
of unknown significance or location. An array of such elements
could represent locations that sample, either continuously or at
discrete points, any desired portion of a genome, including, but
not limited to, an entire genome, a single chromosome, or a portion
of a chromosome. The number of target elements and the complexity
of the nucleic acids in each would determine the density of
analysis. For example, an array of 300 target elements, with each
target containing DNA from a different genomic clone, could sample,
i.e., analyze, the entire human genome at 10 megabase intervals. An
array of 3,000 target elements, with each containing 100 kb of
genomic DNA, could give substantially complete coverage at one
megabase intervals of the unique sequence regions of the human
genome. Similarly, an array of target elements comprising nucleic
acids from anonymous cDNA clones or complementary to Expressed
Sequence Tags ("ESTs") would permit identification of those
expressed gene sequences that might be differently expressed in
some cells of interest, thereby focusing attention on study of
these genes or identification of expression abnormalities for
diagnosis.
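The sampling figures given above follow from dividing the genome size
by the number of target elements. The sketch below assumes a human
genome of roughly 3,000 megabases, a round figure that is implied by
but not stated in the text; the function names are hypothetical.

```python
# Assumption: approximate haploid human genome size, in megabases.
HUMAN_GENOME_MB = 3000.0


def sampling_interval_mb(n_elements: int, genome_mb: float = HUMAN_GENOME_MB) -> float:
    """Average spacing between sampled locations, in megabases."""
    return genome_mb / n_elements


def fraction_represented(n_elements: int, insert_kb: float,
                         genome_mb: float = HUMAN_GENOME_MB) -> float:
    """Fraction of the genome physically present on the array."""
    return (n_elements * insert_kb / 1000.0) / genome_mb


print(sampling_interval_mb(300))    # 300 genomic clones
print(sampling_interval_mb(3000))   # 3,000 elements of 100 kb each
print(fraction_represented(3000, 100.0))
```

Under this assumption, 300 elements sample the genome every 10
megabases and 3,000 elements every one megabase, in agreement with
the figures above.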
[0065] One of skill will recognize that each target element can
comprise a mixture of target nucleic acids of different lengths and
sequences. A target element will generally contain more than one
copy of a cloned or synthesized piece of DNA, and each copy can be
broken into fragments of different lengths. The length and
complexity of the target element sequences of the invention are not
critical to the invention. One of skill can adjust these factors to
provide optimum hybridization and signal production for a given
hybridization procedure, and to provide the required resolution
among different genes or genomic locations.
[0066] The target elements can comprise oligomers, such as those in
the range of 8 to about 100 bp, preferably 20 to 80 bp, and more
preferably about 40 to about 60 bp, which can be readily
synthesized using widely available synthesizer machines. Oligomers
in target elements can also be synthesized in situ on the array
substrate by any methods, such as those known in the art. The
oligomer sequence information can be obtained from any convenient
source, including nucleic acid sequence data banks, such as
GENBANK, commercial databases such as LIFESEQ from Incyte
Pharmaceuticals, Inc. (Palo Alto, Calif.), or EST data such as that
produced by use of SAGE (serial analysis of gene expression). For
oligomer or partial cDNA elements, one need only synthesize a
partial sequence complementary to a part of the mRNA for the gene
or complementary to an identifiable, critical sequence for the gene
(critical in the sense of the sequences coding for the functional
parts of the expressed protein, e.g., of the receptor binding
site).
[0067] The target elements can comprise partial or full-length cDNA
sequences, either synthesized for smaller cDNAs or cloned,
preferably having a complexity in the range of about 100 bp to
about 5,000 bp. cDNA target elements can be readily obtained from
expressed gene sequence cDNA libraries from a desired tissue, which
are produced using conventional methods or obtained from commercial
sources, such as the libraries maintained by Genome Systems, Inc.
(St. Louis, Mo.), Research Genetics (Huntsville, Ala.) and
Clontech (South San Francisco, Calif.).
[0068] The target elements can comprise genomic DNA sequences of
any complexity, but generally of a complexity of about 20,000 bp to
about 250,000 bp, and preferably about 50,000 bp to about 175,000
bp. Genomic DNA can be obtained from any mapped genomic clones
produced by standard cloning procedures or obtained from commercial
sources, such as the chromosome specific libraries maintained by
the American Type Culture Collection (Rockville, Md.), hereinafter
ATCC. A preferred genomic library source is the human DNA BAC
library maintained by Genome Systems.
[0069] The identification of genomic DNA or cDNA selected for use
in the target elements can be determined by the location of
chromosomal sequences known or identified as amplified or deleted
or of genes over- or under-expressed. The identification of genomic
or cDNA clones is done by designing primer sequence pairs using,
for example, genetic data in Gene Map '98 maintained by the U.S.
National Institutes of Health or the Genome Data Base at
http://gdbwww.gdb.org/gdbtop.html. For example, the Her-2 gene is
believed to comprise about 40 kb of genomic sequence and a PCR
primer pair can be designed based upon the published Her-2
sequence. The PCR primer pair or the PCR amplicon product can then
be used to screen a genomic DNA library to identify clones
containing complementary sequences. The genomic DNA clones
identified in the screen can be used on an array in the method of
the invention to identify genomic abnormality at the Her-2
locus.
[0070] For use of arrays that detect viruses and viral gene
expression simultaneously with detection of human genetic
abnormalities, the target elements can comprise sequences
complementary to known or identified viral sequences. The array
target elements can also be designed to detect viral integration
sites in the human or an animal genome. Use of such a pathogen
array is medically significant, for example, because of the known
ties of human papilloma virus to human cervical cancer and H.
pylori to human gastrointestinal cancer. Similarly, known bacterial
gene sequences can be used to design the nucleic acids of the
target elements. Pathogen sequence-based arrays also can be used in
food and environmental testing.
[0071] (4) Target Elements
[0072] The target elements can be of varying dimension, shape and
area. The target elements can comprise physically separated spots
produced by printing methods, for example, mechanical transfer,
gravure, ink jet or imprint methods. The target elements also can
be closely abutted such as those produced by the photolithographic
in situ array synthesis of U.S. Pat. No. 5,445,934. The target
elements are preferably generally round in shape on a planar
surface. Generally, smaller elements are preferred, with a typical
target element being less than 500 microns in diameter.
Particularly preferred target element sizes are between about 5
microns and 250 microns in diameter to achieve high density.
[0073] The target element density can be any desired density and is
preferably one typical of nucleic acid microarrays, i.e. greater
than about 100 target elements per square centimeter. For the
preferred use in human disease management, the target element
density is preferably in the range of about 100 to about 10,000
target elements per square centimeter of chip surface. Higher or
lower densities can be desirable and higher densities can be
preferred for use in drug development to permit examination of
higher numbers of expressed gene sequences.
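To relate the stated densities to element counts and spacing, the
following sketch assumes target elements laid out on a regular square
grid. The function names are hypothetical; the 10.5 mm by 6 mm area
used in the example is the preferred microarray area given in the
cartridge description of this document.

```python
import math


def element_count(length_mm: float, width_mm: float, density_per_cm2: float) -> int:
    """Number of target elements that fit in a rectangular array area."""
    area_mm2 = length_mm * width_mm
    # 1 square centimeter = 100 square millimeters.
    return round(density_per_cm2 * area_mm2 / 100.0)


def grid_pitch_um(density_per_cm2: float) -> float:
    """Center-to-center spacing, in microns, on a square grid at this density."""
    # Each element occupies 1/density cm^2, and 1 cm = 10,000 microns.
    return 10000.0 / math.sqrt(density_per_cm2)


print(element_count(10.5, 6.0, 1000))
print(grid_pitch_um(10000))
```

At 10,000 target elements per square centimeter the grid pitch is 100
microns, so spot diameters would need to be smaller than that for the
elements to remain physically separated.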
[0074] (5) Array Manufacture
[0075] The microarray can be manufactured in any desired manner and
both robotic deposition and synthesis in situ methods for array
manufacturing are known. See for example, U.S. Pat. Nos. 5,486,452,
5,830,645, 5,807,552, 5,800,992 and 5,445,934. It is preferred to
manufacture the microarray using a robotic deposition method and
apparatus, which employs robotic deposition of nucleic acids
through a capillary needle or pin as disclosed in co-pending,
commonly assigned U.S. patent application Ser. No. 09/085,625,
filed May 27, 1998, "Improvements of Biological Assays for Analyte
Detection," Muller, et al. (hereinafter "Muller, et al."), to
produce a two dimensional microarray of physically separated or
"spotted" target elements immobilized in rows and columns on a
chromium-coated substrate.
[0076] A robotic applicator with multiple capillary needles can be
used. A single needle applicator using a pin which is washed
between applications of different nucleic acids, or using a robotic
pin changer also can be used. The needle used is preferably a 33
gauge, one-inch long stainless steel capillary syringe needle. The
needle is connected to a nucleic acid reservoir, preferably a Luer
lock syringe tip. A preferred needle and reservoir is available
commercially from EFD (East Providence, R.I.). It is preferred to
use multiple capillary needles, each depositing a different nucleic
acid, thereby eliminating a washing step between depositions.
[0077] Any suitable amount of nucleic acid is deposited in each
target element, with the target element size dependent on the
amount deposited. For each target element, the amount can be from
about 0.05 nl to about 5.0 nl of a nucleic acid solution of 1
.mu.g/.mu.l nucleic acid concentration. For a density of 1,000
target elements/cm.sup.2, the individual amount deposited per
target element is about 0.2 nl to about 2.0 nl of 1 .mu.g/.mu.l
solution. The nucleic acid is provided in any solvent that will
permit deposition of denatured nucleic acid. Preferably, the
nucleic acid is provided in 100 mM NaOH at 1 .mu.g/.mu.l
concentration.
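Because 1 .mu.g/.mu.l is the same concentration as 1 ng/nl, the
deposited volume in nanoliters converts directly to nanograms of
nucleic acid. A minimal sketch follows; the function names and the
1,000-element example are illustrative assumptions.

```python
def mass_per_element_ng(volume_nl: float, conc_ug_per_ul: float = 1.0) -> float:
    """Nanograms of nucleic acid deposited in one target element."""
    # 1 ug/ul equals 1 ng/nl, so nl multiplied by ug/ul yields ng directly.
    return volume_nl * conc_ug_per_ul


def total_mass_ug(n_elements: int, volume_nl: float,
                  conc_ug_per_ul: float = 1.0) -> float:
    """Micrograms of nucleic acid consumed in printing n_elements spots."""
    return n_elements * mass_per_element_ng(volume_nl, conc_ug_per_ul) / 1000.0


# Hypothetical array of 1,000 elements printed at 2.0 nl per element.
print(mass_per_element_ng(2.0))
print(total_mass_ug(1000, 2.0))
```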
[0078] To assist robotic manufacturing, automated tracking and
labeling methods and apparatus can be used, for example, in
delivering the correct nucleic acid for deposition at a particular
target element. For example, bar coding or transponder labeling or
tracking of capillary pins containing different nucleic acids are
useful to assure delivery of the correct nucleic acid to the
desired target element. The use of bar coding or transponder
labeling also permits better computer control of the manufacturing
process.
[0079] A microarray comprising both cDNA and genomic DNA target
elements can be produced in any arrangement. For example, the cDNA
elements can be located in one portion of the array or can be
interspersed among the genomic DNA target elements. Although the
regularity of a two dimensional array on a planar substrate surface
is preferred to permit easy fluorescence detection and analysis,
the array can be manufactured in any desired configuration.
[0080] Individual target elements can appear only once or can be
replicated to provide statistical power to analysis of results. For
arrays with densities under 3,000 target elements per cm.sup.2, it
is preferred to manufacture the array so that each target element
is replicated three times on the array, to provide better
calibration of the results. Applicants have determined that when
using a microarray of less than one cm.sup.2 of substrate surface
area, the replicates can be placed adjacent each other or separated
without material effect on the results.
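One way replicate target elements can be combined is sketched below.
The text does not prescribe a particular statistic, so the use of the
mean and the coefficient of variation, the function name, and the
example ratios are all assumptions made for illustration.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def summarize_replicates(ratios: Sequence[float]) -> Tuple[float, float]:
    """Return (mean ratio, coefficient of variation) for replicate elements."""
    if not ratios:
        raise ValueError("at least one replicate is required")
    avg = mean(ratios)
    # The sample standard deviation needs two or more replicates.
    if len(ratios) < 2 or avg == 0:
        return avg, 0.0
    return avg, stdev(ratios) / avg


# Hypothetical fluorescence ratios from three replicate target elements.
avg, cv = summarize_replicates([0.9, 1.0, 1.1])
print(f"mean ratio {avg:.2f}, CV {cv:.1%}")
```

A large coefficient of variation for a given target element could be
used to flag that element for exclusion or repeat testing.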
[0081] Preferably, individual microarrays are manufactured on a
large substrate plate or wafer, which is scored using procedures
well known in the semiconductor industry for breakup into
individual chips. Chromium-coated glass plates or wafers are
available commercially from Nanofilm (Westlake Village, Calif.) and
can be scored using conventional procedures. Thus, multiple chips
can be manufactured at once on the same wafer with one robotic
applicator, and then separated into individual chips. Before
printing, the wafers are preferably washed using, in order,
distilled water, isopropanol, methanol and distilled water washes.
Nitrogen is used to blow-off excess water and the rinsed wafers are
dried.
[0082] The preferred Muller, et al. apparatus uses X-Y and Z axis
controllers for the capillary pin applicator with application of a
burst of low air pressure to deposit each nucleic acid. It is
further preferred to use a suitable Z-axis controller on the
apparatus of Muller, et al. to avoid contact of the capillary pin
with the substrate surface. Positioning the pin above the surface,
preferably about 100 .mu.m above, permits better spot size
regularity and use of lower air pressure.
[0083] When beginning printing, the plate or wafer is equilibrated
to room temperature. The Z-axis height of each chip is then
determined for use by the robot controller. Preferably, the
printing starts with deposition of a 300 .mu.m diameter "marker" spot
in one corner of each chip for alignment control. The nitrogen
pressure is low, preferably about 1 psi or less, and is a pressure
sufficient to deposit the particular nucleic acid given its
viscosity and amount to be deposited. The nitrogen pulse length is
generally about 10 milliseconds.
[0084] It is also preferred to include various control target
elements such as, for example, target elements comprising: (1)
total genomic DNA, (2) vector DNA, (3) a pooled mixture of genomic
DNA or cDNA from each target element, (4) total RNA from a normal
tissue, or (5) total genomic or cDNA from a tissue with known
abnormalities. The control target elements can also include a
series of target elements each comprising a nucleic acid of known
copy number for a particular expressed gene or genomic sequence.
For example, genomic DNA extracted from cell lines with 1, 2, 3, 4
and 5 copies of the human X chromosome can be used.
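Control elements of known copy number can serve as a calibration
series. The sketch below fits a straight line to hypothetical ratios
from the X chromosome controls and inverts it to estimate an unknown
copy number. The linear response, the function names, and the data
values are assumptions for illustration; measured ratios may not
scale linearly with copy number.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_copies(ratio: float, slope: float, intercept: float) -> float:
    """Invert the calibration line to estimate copy number from a ratio."""
    return (ratio - intercept) / slope


# Hypothetical control ratios for cell lines with 1 to 5 X chromosomes,
# measured against a reference carrying two copies.
x_copies = [1, 2, 3, 4, 5]
control_ratios = [0.5, 1.0, 1.5, 2.0, 2.5]
slope, intercept = fit_line(x_copies, control_ratios)
print(estimate_copies(1.5, slope, intercept))
```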
[0085] For quality control of the preferred robotic deposition
manufacturing, it is preferred to image the produced arrays using a
stereo microscope and a CCD camera. An image of each chip is
captured and analyzed. Chips with missing, mis-sized or misshapen
target elements are identified and marked.
[0086] When using cloned cDNA or cloned genomic DNA, the vector
sequences can be removed before deposition with any suitable
process or retained if they do not significantly interfere with the
hybridization. For cloned genomic DNA and cDNA, it is preferred to
not remove the vector sequences.
[0087] Any suitable substrate can be used, including those
disclosed in U.S. Pat. Nos. 5,445,934 and 5,807,552. The substrate
can be for example, without limitation, glass, plastics such as
polystyrene, polyethylene, polycarbonate, polysulfone and
polyester, metals such as chromium and copper, metal coated
substrates and filters of any material. The substrate surface
bearing the immobilized nucleic acids is preferably planar, but any
desired surface can be used including, for example, a substrate
having ridges or grooves to separate the array target elements. The
nucleic acids can also be attached to beads, which are separately
identifiable. The planar chromium-coated glass substrate of Muller,
et al. is preferred.
[0088] The nucleic acids of the target elements can be attached to
the substrate in any suitable manner that makes them available for
hybridization, including covalent or non-covalent binding. The
non-covalent attachment method of Muller, et al. is preferred.
[0089] (6) Tissue Nucleic Acids
[0090] The nucleic acid populations can be derived from any tissue
source, including human, plant and animal tissue. The tissue sample
comprises any tissue, including a newly obtained sample, a frozen
sample, a biopsy sample, a blood sample, an amniocentesis sample,
preserved tissue such as a paraffin-embedded fixed tissue sample
(i.e., a tissue block), or a cell culture. Thus, the tissue sample
can comprise a whole blood sample, a skin sample, epithelial cells,
soft tissue cells, fetal cells, amniocytes, lymphocytes,
granulocytes, suspected tumor cells, organ tissue, blastomeres and
polar bodies. The tissue to be tested can be derived from a
micro-dissection process to produce a more homogeneous cell
population. Paraffin fixed tissue is pre-treated with any suitable
process to remove the wax, and a paraffin pretreatment kit is
available commercially from Vysis, Inc. Any suitable amount of
tissue can be used, including a single cell, such as a human
blastomere cell to be tested during in vitro fertilization
procedures. Where only one or a few cells are available, such as
when testing human fetal cells separated from maternal blood
samples, a nucleic acid amplification technique to amplify the
amount of nucleic acid can be used.
[0091] The nucleic acid populations derived from the tissue are
produced by any suitable nucleic acid separation or purification
process. Nucleic acid separation methods for both genomic DNA and
for messenger RNA are available commercially, such as the QIAamp
tissue kit for DNA isolation from Qiagen. For example, mRNA can be
extracted from the tissue and then converted to cDNA by treatment
with reverse transcriptase. If insufficient cDNA is available, the
cDNA can be amplified by polymerase chain reaction. This well known
process is called RT/PCR. It is also possible to convert the cDNA
into a complementary RNA ("cRNA").
[0092] In general, where greater than about one million cells of
tissue are available, the tissue nucleic acids can be extracted and
used without amplification. If fewer than about one million cells
are available, a nucleic acid amplification or concentration is
preferably used. Preferably, such an amplification technique is
PCR. Care and appropriate controls should be used with PCR to avoid
or identify any artefacts introduced.
[0093] (7) Reference Nucleic Acids
[0094] The reference nucleic acid population is any suitable
nucleic acid collection chosen to serve as a reference. For
example, the reference population can be total human genomic DNA
from normal tissue, total mRNA extracted from a normal sample of
the tissue to be tested and converted to cDNA, or a synthetic or
naturally-occurring mixture of cDNA for particular expressed genes.
The reference can be a cRNA population. The reference also can
include a "spiked," known amount of a particular genomic or cDNA
sequence to enable control analysis.
[0095] (8) Labeling
[0096] The labels used can be any suitable non-radioactive marker
detectable by any detection method. For example, the labels can be
fluorescent molecules or can be proteins, haptens or enzymes. Also,
"mass spec" labels, such as different isotopes of tin, can readily
be detected after hybridization to the array by a laser removal and
mass spectrometry process, such as MALDI (matrix-assisted laser
desorption-ionization). See Wu, et al., Analytical Chemistry 66,
1637 (1994) and Wu, et al., Rapid Communications in Mass
Spectrometry, 7, 142 (1993). Preferably the labels are each
fluorescent markers having sufficient spectral separation to be
readily distinguished from each other without need of extensive
"cross-talk" correction, such as fluorescein, Texas Red and 5-(and
6-)carboxytetramethyl rhodamine. An extensive list of fluorescent
label compounds useful for attachment to nucleic acids appears in
U.S. Pat. No. 5,491,224, "Direct Label Transaminated DNA Probe
Compositions for Chromosome Identification and Methods for their
Manufacture," Bittner, et al. Fluorescent compounds suitable for
use are available commercially from Molecular Probes (Eugene,
Oreg.). Indirect labels, such as biotin and phycoerythrin, that are
fluorescently labeled after hybridization to the array by contact
with a fluorescent protein, such as avidin labeled with
fluorescein, also can be used.
[0097] The reference population(s) and the tissue nucleic acid
populations are labeled in any suitable manner, such as by end
labeling, nick translation or chemical transformation. Preferably,
during either the RT or PCR processing, a label incorporation step
is used to label the resulting cDNA in a desired fluorescent color.
The separated chromosomal DNA can be labeled using any suitable
labeling chemistry, including end-labeling, nick translation and
chemical labeling. It is preferred to use nick translation to label
the chromosomal DNA in a suitable fluorescent color using a
fluorescent dUTP or dCTP. Manufacture of suitable fluorescently
labeled dCTP is disclosed in K. Cruickshank, Anal. Biochemistry,
"Quantitation of Fluorescent Nucleotide Incorporation by Capillary
Gel Electrophoresis and Laser Induced Fluorescent Detection," (in
press), hereinafter referred to as "Cruickshank." Suitable nick
translation kits are available commercially.
[0098] Preferably, for use of total human genomic DNA as the
reference population, the labeling is done by a bisulfite-catalyzed
transamination process as disclosed in U.S. Pat. No. 5,506,350,
"Production of Chromosome Region Specific DNA Sequences and
Transamination," Bittner, et al. Total human genomic DNA labeled by
such a process is available commercially from Vysis, Inc. (Downers
Grove, Ill.).
[0099] The labeling method used preferably results in a label
content of each nucleic acid population of about 0.3 to about 6.0
mole percent labeled nucleotides when using direct attachment of
fluorophores to the nucleic acids. The quantities of each labeled
tissue nucleic acid and reference nucleic acid to be used are
preferably in the range of about 100 ng to about 1 .mu.g,
preferably about 300 ng to about 425 ng.
[0100] (9) Array Hybridization
[0101] The tissue and reference nucleic acid populations are
hybridized to the array under suitable hybridization conditions,
i.e., stringency, for a time selected to permit detection of
hybridization of single copy genomic sequences. The hybridization
conditions include choice of buffer, denaturant, such as formamide,
salt additives and accelerant. Hybridization buffers containing
formamide and dextran sulfate at specified pH and salt conditions,
such as LSI Hybridization Buffer (Vysis, Inc.), are available
commercially. The buffer will preferably have a pH of about 6.8 to
about 7.2, a salt content of about 1.5.times.SSC to about
2.5.times.SSC, and a formamide content of about 40-50%. Suitable
conditions can include a temperature of about 40 to about 80
degrees centigrade for a time sufficient to detect signal over
background for both the genomic and expression measurements, namely
about 1 to about 72 hours, preferably 12-24 hours. Hybridization
accelerators, such as dextran sulfate, can be used if desired.
Adequate diffusion of the tissue and reference nucleic acid
populations into contact with all target
elements is necessary. This can be achieved by simple diffusion, or
by accelerating diffusion or overcoming diffusion limitations using
any suitable means including mechanical mixing, such as by rocking,
or fluidic diffusion, such as by microfluidic pumping of the
labeled populations in and out of a hybridization chamber
containing the array. The post-hybridization wash is preferably at
a stringency greater than that of the hybridization.
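The numeric ranges recited above lend themselves to a simple
parameter check. The following sketch is illustrative only; the
dictionary keys and function name are hypothetical, and the ranges
are those stated in the text as preferred or suitable.

```python
from typing import Dict, List, Tuple

# (low, high) limits taken from the ranges recited in the text.
HYB_RANGES: Dict[str, Tuple[float, float]] = {
    "ph": (6.8, 7.2),
    "ssc_x": (1.5, 2.5),             # salt content, in multiples of SSC
    "formamide_pct": (40.0, 50.0),
    "temperature_c": (40.0, 80.0),
    "time_h": (1.0, 72.0),
}


def out_of_range(conditions: Dict[str, float]) -> List[str]:
    """Return the names of the parameters that fall outside the recited ranges."""
    return [name for name, (low, high) in HYB_RANGES.items()
            if not low <= conditions[name] <= high]


# Hypothetical protocol: 2xSSC, 50% formamide, pH 7.0, 40 degrees C, 16 hours.
protocol = {"ph": 7.0, "ssc_x": 2.0, "formamide_pct": 50.0,
            "temperature_c": 40.0, "time_h": 16.0}
print(out_of_range(protocol))
```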
[0102] When using an array comprising human genomic DNA target
elements, it is also preferable to add to the hybridization mix an
excess of unlabeled human repeat sequence DNA, such as Cot1 DNA
available from Life Technologies, Inc., to suppress the
non-specific signal resulting from hybridization of labeled repeat
sequences present in the tissue nucleic acid population or in a
reference genomic DNA, if used. Use of unlabeled repeat sequence
DNA is generally in amounts of about 0.02 to about 5.0 .mu.g per 1
ng of total labeled genomic DNA (both tissue and reference), and
preferably about 0.1 to 0.5 .mu.g per 1 ng total labeled genomic
DNA.
[0103] The hybridization can be performed in any suitable apparatus
that will maintain the populations in contact with the array for a
suitable time. For example, the labeled populations can be added to
the array, covered with a cover slip and then incubated in an oven
at the preselected temperature. Preferably, a cover slip designed
to provide a desired hybridization volume between its bottom
surface and the top of the array substrate is used. The labeled
populations can be added to an array contained in a sealed
cartridge apparatus, such as disclosed in European Patent
Application 0 695 941 A1, "Method and Apparatus for Packaging a
Chip," published 7 February 1996, by microfluidic injection and
circulation. The hybridization also can be carried out in a
miniaturized hybridization and assay chip, such as that disclosed
in PCT Patent Application WO 97/02357, "Integrated Nucleic Acid
Diagnostic Device," published 23 January 1997. Such miniaturized
chips are referred to as manufactured on a mesoscale, i.e.,
manufactured having volumes for fluid pathways and reaction
chambers measured in amounts of 10.sup.-8 and 10.sup.-9 liters.
[0104] FIGS. 2(a) through 2(e) show components of a preferred
hybridization cartridge. FIG. 2(a) displays the first component, a
chromium coated glass "chip" 30 containing the immobilized nucleic
acid target elements 31 of the microarray 32. The microarray 32 is
preferably located in the center of the chip 30, as shown. In a
preferred format, the chip is 25.4 mm long.times.16.93 mm
wide.times.0.7 mm thick; and the microarray covers a 10.5 mm
long.times.6 mm wide area. Shown in FIG. 2(b), the second component
is a "probe clip" 33, depicted with two alternate shapes, square
and circular, for "array window" 34. The probe clip 33 can be made
from any suitable material, preferably plastic. The array window 34
is of a clear material, and is located and sized to permit ready
imaging of the microarray. The probe clip 33 forms a hybridization
chamber and fits snugly over the array as a retainer and
protective cover. Preferably, the array window 34 is 1.27 mm in
diameter, centrally located in a 25.4 mm long.times.16.76 mm wide
probe clip 33.
[0105] FIGS. 2(c) and 2(d) are top and side views of the fourth
component, a chip holder 36, preferably made of a sturdy, injection
moldable plastic, such as high-impact polystyrene, which is capable
of withstanding necessary hybridization temperatures without loss
of physical stability. The chip holder 36 can be of any desirable
dimension for holding the chip, and preferably is 25.4 mm
wide.times.76.2 mm long.times.3.2 mm thick. As shown, near one end,
the chip holder 36 contains a cavity 37, preferably 26 mm
long.times.18.5 mm wide.times.1.7 mm deep, sized to accept the chip
30 bearing the microarray 32. The cavity 37 along its length is
also slightly wider, preferably 0.5 mm on each side, to create an
access gap 38 to permit easier addition and removal of the probe
clip and microscope cover slip. The surface of the cavity bottom is
scored with shallow grooves to facilitate spreading of adhesive or
fixative designed to hold the chip in place. The chip holder 36 at
the end opposite the cavity 37 can be lightly scored across the
width of the holder on its upper surface to provide a more
grippable surface for the user. The chip holder bottom can be
grooved to facilitate alignment in an array reader.
[0106] In manufacture of the completed cartridge, a microarray with
desired target elements is manufactured as described above, and is
then glued with any suitable adhesive into the bottom of cavity 37.
The chip holder 36 bearing the array can then be shrink wrapped,
and enclosed in a kit with the probe clip 33, a cover slip used in
array imaging, and any other desirable reagents for labeling or
extracting nucleic acids and/or performing the hybridization. To
carry out the method of the invention, the user applies the
hybridization solution comprising an appropriate buffer and the
labeled nucleic acid populations (reference and tissue) to the
surface of the microarray, and places the probe clip 33 on top of
the microarray. The completed cartridge is depicted in FIG. 2(e).
Also shown superimposed in FIG. 2(e) is the camera field of view 35
for the preferred imaging system of Che. The cartridge is then
incubated in an oven, with desired humidity control at the desired
hybridization temperature for the desired time.
[0107] When the hybridization is completed, the probe clip 33 is
removed and the chip washed at a desired stringency, preferably, in
order, with 2.times.SSC at room temperature for 5 minutes, with
2.times.SSC and 50% formamide at 40.degree. C. for 30 minutes, and
with 2.times.SSC at room temperature for 10 minutes, to remove
unhybridized probe. Gel/Mount (Biomeda, Foster City, Calif.) and DAPI
are applied to the array and an 18 mm.times.18 mm glass microscope
cover slip is sealed over the array, still in holder 36. The
covered chip is then imaged to detect the hybridization
results.
[0108] (10) Array Detection
[0109] After hybridization, the fluorescence presence and intensity
for each label color is detected and determined by any suitable
detector or reader apparatus and method. Laser-based array scanning
detectors are known to the art, see U.S. Pat. No. 5,578,832,
"Method and Apparatus for Imaging a Sample on a Device," Trulson,
et al. Optical waveguide detection methods for array hybridization
also have been disclosed, see U.S. Pat. No. 5,843,651, "Light
Scattering Optical Waveguide Method for Detecting Specific Binding
Events," D. Stimpson, et al. Preferably, a large field imaging
apparatus and method, such as disclosed in co-pending, commonly
assigned U.S. patent application Ser. No. 09/049,798, "Large-Field
Fluorescent Imaging Device," filed Mar. 27, 1998, D. Che, (herein
referred to as "Che") is used.
[0110] The large-field fluorescence imaging apparatus of Che uses
reflective optics to couple the excitation beam generated by a
high-power white light source onto the microarray surface to
provide a high illumination intensity, and combines the high
illumination intensity with the high detection efficiency of an
array detector to provide a high image acquisition rate. The white
light generated by the light source is collimated and filtered with
a computer-controlled filter to provide the excitation beam. The
excitation beam is passed through a field stop to form a
well-defined beam pattern and then projected onto the array surface
with a concave mirror. The concave mirror is disposed to image the
field stop on the sample to define an illumination area which
matches the field of view of the imaging optics. The fluorescent
light generated in the sample is color filtered to reject scattered
light of excitation color and imaged by the imaging optics onto the
array detector to produce a fluorescent image of the sample.
[0111] The array imaging apparatus and method may employ digital
image processing algorithms used in a programmed computer for data
analysis, storage and display of digital image data from the
imaging apparatus. Any suitable digital image processing, data
storage and display software can be used for analysis of the array
hybridization results. Digital imaging methods are known to those
skilled in the art, for example, as disclosed in U.S. Pat. No.
5,665,549, "Comparative Genomic Hybridization," Kallioniemi, et al.,
and U.S. Pat. No. 5,830,645.
[0112] The hybridization images are preferably captured and
analyzed by use of a high resolution digital imaging camera, such
as a SenSys 1600 Camera with PSI interface from Photometrics
(Scottsdale, Ariz.), which receives the large field image directly
from the detection optics. Any other suitable camera can also be
used. The raw image data captured by the camera is stored in any
suitable computer data base or data storage file. The raw image
data is processed using suitable image analysis algorithms to
determine the marker intensity at each target element of the
microarray. Image analysis algorithms are well known to those
skilled in the art, and a package of a large number of such
algorithms is available as IPLab from Scanalytics (Fairfax,
Va.).
[0113] Preferably, the image analysis algorithms carry out the
following operations, implemented in appropriate computer software:
(i) background correction, as necessary; (ii) array target element
or "spot" segmentation for identification of individual array
elements; (iii) spot grid assignment of a column and row number to
each spot; (iv) spot data analysis, including verification of
validity and presence of artifacts, averaging of data for replicate
spots, normalization of data from all spots, and multi-experiment
comparison and analysis; (v) single spot calculations, including
the total intensity of each fluorescent marker color, the average
DAPI counterstain intensity, the mean, mode, median and correlation
coefficient of the per pixel ratios of fluorescent intensities, and
the ratio of total tissue nucleic acid marker intensity to
reference intensity, termed as the "mass ratio"; (vi) target
summary analysis, including the number of valid replicates for a
spot, the mean and coefficient of variation of the per spot mass
ratios and the correlation coefficient of per pixel ratios across
all spots. Preferably, the image analysis used standardizes the
mean mass ratio such that the modal value is 1.00 using a
window-based estimate of the mode.
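The single spot calculations and the modal standardization described above can be sketched as follows. This is a minimal illustration only, not the software of the preferred embodiment: the function names, the window width of 0.1 and the rule of taking the mean of the densest window as the mode estimate are assumptions made for the example.

```python
from statistics import mean, median


def mass_ratio(tissue_pixels, reference_pixels):
    """Ratio of total tissue intensity to total reference intensity for one spot."""
    return sum(tissue_pixels) / sum(reference_pixels)


def per_pixel_ratio_stats(tissue_pixels, reference_pixels):
    """Mean and median of the per pixel tissue/reference ratios for one spot."""
    ratios = [t / r for t, r in zip(tissue_pixels, reference_pixels) if r > 0]
    return {"mean": mean(ratios), "median": median(ratios)}


def window_mode(values, width=0.1):
    """Estimate the mode as the mean of the densest window of the given width.

    The window width is a hypothetical tuning parameter.
    """
    ordered = sorted(values)
    best = []
    start = 0
    for end in range(len(ordered)):
        # Shrink the window from the left until it is no wider than `width`.
        while ordered[end] - ordered[start] > width:
            start += 1
        if end - start + 1 > len(best):
            best = ordered[start:end + 1]
    return mean(best)


def standardize_to_mode(mass_ratios, width=0.1):
    """Divide every mass ratio by the estimated mode so the modal value is 1.00."""
    mode = window_mode(mass_ratios, width)
    return [r / mode for r in mass_ratios]
```

With mass ratios of 0.78, 0.80, 0.82, 0.40, 1.60 and 0.79, the densest window holds the four values near 0.8, so after standardization those spots fall close to 1.00 while the two outlying spots remain clearly separated.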
[0114] The fluorescent data at each target element can be compared
automatically to produce the ratio between any desired tissue and
reference or between tissues. For example, when using four tissue
nucleic acids (primary tumor genomic DNA and cDNA and metastasis
genomic DNA and cDNA) with two references (total genomic and total
cDNA from normal tissue of the same cell type as the tumor), at
least eight different ratios can be calculated (the ratio of each
reference with each tissue).
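The count of eight ratios in the example above follows from pairing each of the four tissue nucleic acids with each of the two references. A short sketch, in which the sample labels are hypothetical names chosen for the example:

```python
from itertools import combinations, product

# Hypothetical labels for the four tissue populations and two references.
TISSUES = ["primary_gDNA", "primary_cDNA", "metastasis_gDNA", "metastasis_cDNA"]
REFERENCES = ["normal_gDNA", "normal_cDNA"]


def tissue_reference_pairs(tissues, references):
    """Every (tissue, reference) pair for which a ratio can be formed."""
    return list(product(tissues, references))


def tissue_tissue_pairs(tissues):
    """Every unordered pair of tissues, for ratios taken between tissues."""
    return list(combinations(tissues, 2))


def spot_ratios(intensities, pairs):
    """Ratios at one target element; `intensities` maps a label to its intensity."""
    return {(num, den): intensities[num] / intensities[den] for num, den in pairs}
```

Here `tissue_reference_pairs(TISSUES, REFERENCES)` yields the eight tissue-to-reference ratios, and `tissue_tissue_pairs(TISSUES)` yields six further tissue-to-tissue comparisons.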
[0115] The image analysis also preferably comprises implementation
of criteria set by the individual user for valid analyses,
including (vii) exclusion of spots with pixels having saturated
tissue or reference color channels; (viii) spot size and shape
criteria for exclusion; and (ix) a "correlation coefficient" exclusion
for spots with correlation coefficient values below threshold. The
array data analysis can also include comparison algorithms to
compare data from individual tests to data bases containing disease
genotypes and phenotypes (i.e. listing of gene expression and
chromosome abnormalities for particular diseases), which can
identify possible diagnosis or choice of therapy based upon
individual test results.
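The user-set exclusion criteria (vii) to (ix) could be applied to each spot along the following lines. This is a sketch under stated assumptions: the spot record layout, the saturation level of 4095 counts, the area limits and the circularity measure used as the shape criterion are all hypothetical values chosen for illustration, and only the 0.8 correlation threshold is taken from the description.

```python
from math import pi


def exclusion_reasons(spot, saturation=4095, min_area=20, max_area=400,
                      min_circularity=0.7, min_correlation=0.8):
    """Return the list of reasons a spot fails the validity criteria.

    `spot` is a hypothetical record with keys "channels" (label -> pixel
    values), "area" (pixels), "perimeter" (pixels) and "correlation".
    An empty list means the spot is accepted.
    """
    reasons = []
    # (vii) any saturated pixel in a tissue or reference color channel
    for label, pixels in spot["channels"].items():
        if max(pixels) >= saturation:
            reasons.append("saturated:" + label)
    # (viii) spot size and shape; circularity equals 1.0 for a perfect circle
    area = spot["area"]
    if not min_area <= area <= max_area:
        reasons.append("size")
    if 4 * pi * area / spot["perimeter"] ** 2 < min_circularity:
        reasons.append("shape")
    # (ix) per pixel correlation coefficient below threshold
    if spot["correlation"] < min_correlation:
        reasons.append("correlation")
    return reasons
```

Returning the reasons rather than a simple pass or fail flag allows the display software to mark excluded spots and report why each was rejected.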
[0116] The image analysis preferably uses computer display and
printing algorithms, such as those, for example, known to one of
skill in the art, for computer monitor display and computer
printing. The data display can include "pseudo-color" images
selected by the user for the individual fluorescent colors of the
tissue and reference nucleic acids. The array data display can be
coupled with display of conventional chromosome ideograms to more
clearly detail chromosome abnormalities and expressed gene
abnormalities identified by the method of the invention. See U.S.
Pat. No. 5,665,549, FIG. 9, for an exemplary ideogram. Preferably,
the array data is also displayed so that spots excluded from
analysis are marked for ready identification by the user. This can
be done by displaying that target element in an "error color" or
with a colored circle around it.
[0117] In the preferred embodiment, the array reader and software
automatically capture four images of each chip, specific for: (1)
the DAPI counterstain (blue), (2) the tissue DNA (green), (3) the
tissue cDNA (red), and (4) the reference DNA (orange). These images
are referred to as color planes. However, images for more or
different color planes can be taken. The image analysis portion of
the software preferably uses one of the colors (preferably the DAPI
image) to identify target elements and their location in the grid.
Once all spots are identified the software analyses each pixel
under each spot for its intensity in each of the remaining color
planes. Suitable algorithms are employed to determine the local
background for each of these color planes, which is then subtracted
from the total intensity of each color. The background corrected
intensities can then be averaged for all pixels under a particular
target spot or group of spots, and this average intensity per pixel
(e.g., A for DAPI intensity, B for tissue DNA intensity, C for
tissue cDNA intensity and D for reference intensity) can be used
for various analyses.
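The background correction and per pixel averaging that produce A, B, C and D can be illustrated as below. The description leaves the background algorithm open, so the use of the median of nearby off-spot pixels as the local background is an assumption, as are the plane names used as dictionary keys.

```python
from statistics import mean, median

# Hypothetical plane names mapped to the letters used in the description.
PLANES = {"A": "dapi", "B": "tissue_dna", "C": "tissue_cdna", "D": "reference"}


def background_corrected_mean(spot_pixels, background_pixels):
    """Average intensity per pixel of one spot in one color plane, after
    subtracting a local background taken as the median of nearby
    off-spot pixels (an assumed estimator)."""
    local_background = median(background_pixels)
    return mean([p - local_background for p in spot_pixels])


def spot_intensities(spot, background):
    """Return A, B, C and D for one spot.

    `spot` and `background` map each plane name to the pixel values under
    the spot and in its surrounding background region, respectively.
    """
    return {letter: background_corrected_mean(spot[plane], background[plane])
            for letter, plane in PLANES.items()}
```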
[0118] For example, the intensity A may be used as an indicator of
target spot quality, since the intensity of DAPI staining is a
function of total amount of DNA attached at the target spot. Below
a certain value for A (under controlled staining conditions) the
amount of target element DNA may become rate limiting. The
intensity D of the reference DNA can be used as an indicator for
the efficiency of hybridization, since this reagent is preferably
provided in a pre-determined concentration and is quality
controlled.
[0119] In the preferred analysis, the most important information is
the ratio of background corrected tissue intensity over background
corrected reference intensity; i.e. for the above example the
ratios of B/D and C/D. If more than one reference is used, then
additional ratios can be taken to give informative data. These
ratios can be determined for a group of spots, a single spot, or
for each pixel under each spot.
[0120] In the most preferred mode, and for the example listed
above, the B/D and C/D intensity ratios are determined for each
pixel, and these ratios should be independent of the absolute
intensity in any of the colors. In other words, a plot of B versus D,
for example, for each pixel under each spot should yield a scatter
around a straight line, which should pass through the origin (i.e.,
intersect both the X and Y axes at 0) if the background correction
was appropriate.
(Appropriate algorithms can generate such a plot by "clicking" on a
given target spot or group of spots in the display.) This plot
reveals two types of information:
[0121] First, the amount of scatter around the linear regression
line is indicative of the quality of the data, and can be
statistically evaluated to generate a correlation coefficient,
which for ideal spots is 1 (i.e. all pixel values fall on the
regression line). A value less than 1 indicates less than perfect
data, and a value of 0.8 or less is preferably taken as an
indicator that data from such a spot should be considered suspect.
This scatter plot can be generated for a single spot or group of
spots. Second, the slope of this regression line is the B/D or C/D
intensity ratio, respectively, for a given spot or group of
spots.
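The two quantities read from the per pixel plot, namely the slope of the regression line and the correlation coefficient, can be computed as in the following sketch. Ordinary least squares with a fitted intercept is assumed here; the description does not specify the regression method, and the function names are illustrative.

```python
from math import sqrt


def pixel_regression(reference, tissue):
    """Least-squares fit of tissue intensity against reference intensity.

    Returns (slope, intercept, r). The slope estimates the B/D or C/D
    ratio, the intercept should be near 0 if background correction was
    appropriate, and r is the correlation coefficient.
    """
    n = len(reference)
    mean_x = sum(reference) / n
    mean_y = sum(tissue) / n
    sxx = sum((x - mean_x) ** 2 for x in reference)
    syy = sum((y - mean_y) ** 2 for y in tissue)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reference, tissue))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


def is_suspect(r, threshold=0.8):
    """Flag a spot whose correlation coefficient is 0.8 or less."""
    return r <= threshold
```

For pixel pairs lying exactly on a line through the origin, the fit returns an intercept of 0 and a correlation coefficient of 1, which is the ideal case described above.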
[0122] In order to extract the desired biological information, the
B/D or C/D ratio is preferably normalized with respect to a control
spot or group of spots, for which these ratios can be correlated to
a known level of DNA or RNA sequence in the test probe mixture.
This is done as follows:
[0123] For analysis of genomic DNA the assumption is made that most
of the tissue DNA sequences are in fact present in their normal
copy number, i.e. two per genome (except for sequences from the sex
chromosomes if the test tissue is from a male donor). For the
reference DNA this is assumed to be true for all sequences (other
than those from X or Y chromosomes if the reference DNA is from a
male donor). Based on these assumptions the software compares the
B/D or C/D ratios of all target spots and selects a group of ratios
that appear to be very similar. This group of ratios is assumed to
represent targets that are normal in the test tissue, and the
average of that ratio is used to normalize all other ratios. In
other words, the B/D or C/D ratios of all spots will be divided by
the average B/D or C/D ratio, respectively, of this "normal group."
Thus, the B/D or C/D ratios of all normal spots should be close to
1, while the B/D or C/D ratios from targets that are aneuploid
(present in copy numbers larger or smaller than 2), will be around
0.5 or less (deletions) or 1.5 or above (additions or
amplifications).
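The normalization to a "normal group" can be sketched as follows. The description does not state how the group of very similar ratios is selected, so this example assumes the group consists of all spots within a fractional tolerance of the median ratio. The tolerance of 0.15 and the calling cutoffs of 0.75 and 1.25 are likewise hypothetical values placed between the expected levels of about 0.5, 1 and 1.5.

```python
from statistics import mean, median


def normalize_to_normal_group(ratios, tolerance=0.15):
    """Divide every spot ratio by the average ratio of the "normal group".

    `ratios` maps a target name to its B/D (or C/D) ratio. The normal
    group is assumed to be the spots within `tolerance` (as a fraction)
    of the median ratio.
    """
    center = median(ratios.values())
    normal = [r for r in ratios.values() if abs(r - center) <= tolerance * center]
    baseline = mean(normal)
    return {name: r / baseline for name, r in ratios.items()}


def call_copy_number(normalized_ratio, loss_cutoff=0.75, gain_cutoff=1.25):
    """Classify a normalized ratio; the cutoffs are illustrative only."""
    if normalized_ratio <= loss_cutoff:
        return "loss"
    if normalized_ratio >= gain_cutoff:
        return "gain"
    return "normal"
```

Using the median to seed the group relies on the stated assumption that most tissue sequences are present at normal copy number; a sample in which most targets are altered would need a different selection rule.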
[0124] The inventive combination of simultaneous expression and
genomic analysis allows a correlation of the expression level to
the gene copy number, by using the ratios described above as
follows:
[0125] Assume that an assay was performed in which B is the
intensity for the tissue genomic DNA, C is the intensity for tissue
mRNA (cDNA) and D is the intensity for the reference genomic DNA.
Then, the ratios to be obtained are as follows:
[0126] (B/D)=background corrected average pixel intensity ratio
[0127] (Bg/Dg)=background corrected average pixel intensity ratio
average for "normal" subgroup
[0128] (B/D)/(Bg/Dg)=normalized B/D ratio=Bn/Dn
[0129] (C/D)=background corrected average pixel intensity ratio
[0130] (Cg/Dg)=background corrected average pixel intensity ratio
average for "normal" subgroup
[0131] (C/D)/(Cg/Dg)=normalized C/D ratio=Cn/Dn
[0132] The Bn/Dn ratio reveals the number of genomic copies of a
given target sequence, the Cn/Dn ratio reveals the relative number
of mRNA copies per genomic sequence, and the Cn/Bn ratio would
indicate whether the relative mRNA copy number correlates with a
relative change in the genomic copy number.
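The three normalized ratios defined above can be computed from B, C and D together with the "normal" subgroup averages. In this sketch the function and argument names are illustrative, and Cn/Bn is obtained by dividing Cn/Dn by Bn/Dn, which follows algebraically because the reference term cancels.

```python
def normalized_ratios(b, c, d, bg_over_dg, cg_over_dg):
    """Return Bn/Dn, Cn/Dn and Cn/Bn for one target element.

    b, c, d    -- background corrected average pixel intensities for tissue
                  genomic DNA, tissue cDNA and reference genomic DNA
    bg_over_dg -- average B/D ratio of the "normal" subgroup
    cg_over_dg -- average C/D ratio of the "normal" subgroup
    """
    bn_dn = (b / d) / bg_over_dg
    cn_dn = (c / d) / cg_over_dg
    return {"Bn/Dn": bn_dn, "Cn/Dn": cn_dn, "Cn/Bn": cn_dn / bn_dn}
```

For instance, with B = 400, C = 800, D = 100 and both subgroup averages equal to 2.0, the result is Bn/Dn = 2, Cn/Dn = 4 and Cn/Bn = 2, which would suggest that expression has risen by more than the genomic copy number alone accounts for.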
[0133] (11) Example Arrays
[0134] Exemplary of the types of microarrays useful in the method
of the invention is a prenatal array of about 100 target elements
without replicates, which comprise genomic DNA sequences from (a)
the unique sequence regions immediately adjacent the repeat
sequence regions of (i) all human telomeres and (ii) all human
centromeres (taken from both p and q arm); (b) the "microdeletion"
syndrome regions for DiGeorge, Smith-Magenis, Downs, Williams,
Velocardiofacial, Alagille, Miller-Dieker, Wolf-Hirschhorn, Cri du
Chat, Cat Eye, Langer-Giedion, Kallmann and Prader-Willi/Angelman
syndromes; and (c) deletion regions identified with sterylsulfatase
deficiency, muscular dystrophy and male infertility, and those
believed tied to mental retardation that involve deletion of the
sub-telomeric, unique sequence regions on each chromosome.
[0135] Table 1 lists human genomic DNA clones useful in such an
array. This prenatal array has powerful medical utility because of
its capability to reliably detect multiple gross chromosomal
changes causing inherited disease. The human prenatal array is also
useful for post-natal testing, for fetal cell testing and for
pre-implantation genetic testing on blastomeres and polar bodies.
Table 1 includes the chromosomal loci and the disease correlated to
each locus.
TABLE 1 Prenatal Chip-Loci To Detect Copy Number Abnormalities in
Non-Cancer Genetic Diseases
Gene or Chrom. Locus | Cyto. Loc. | Disease
1p tel | 1 p tel | Mental Retardation, other
p58 | 1 p36 | 1p36 deletion syndrome
1 near cen | | aneusomy & region marker
1q tel | 1 q tel | Mental Retardation, other
2p tel | 2 p tel | Mental Retardation, other
2 near cen | | aneusomy & region marker
2q tel | 2 q tel | Mental Retardation, other
3p tel | 3 p tel | Mental Retardation, other
3 near cen | | aneusomy & region marker
3q tel | 3 q tel | Mental Retardation, other
4p tel | 4 p tel | Mental Retardation, other
WHSCR/WHSC | 4 p16.3 | Wolf-Hirschhorn syndrome
4 near cen | | aneusomy & region marker
4q tel | 4 q tel | Mental Retardation, other
D5S23 | 5p15.2 | Cri du chat syndrome
5p tel | 5 p tel | Mental Retardation, other
5 near cen | | aneusomy & region marker
5q tel | 5 q tel | Mental Retardation, other
6p tel | 6 p tel | Mental Retardation, other
6 near cen | | aneusomy & region marker
6q tel | 6 q tel | Mental Retardation, other
7p tel | 7 p tel | Mental Retardation, other
7 near cen | | aneusomy & region marker
7q tel | 7 q tel | Mental Retardation, other
Elastin | 7 q11.23 | Williams syndrome
8p tel | 8 p tel | Mental Retardation, other
8 near cen | | aneusomy & region marker
8q tel | 8 q tel | Mental Retardation, other
EXT1 | 8 q24.1 | Langer-Giedion syndrome
9p tel | 9 p tel | Mental Retardation, other
9 near cen | | aneusomy & region marker
9q tel | 9 q tel | Mental Retardation, other
10p tel | 10 p tel | Mental Retardation, other
10 near cen | | aneusomy & region marker
10q tel | 10 q tel | Mental Retardation, other
WI-8545 | 10p14-p13 | Velocardiofacial/DiGeorge syndromes
11p tel | 11 p tel | Mental Retardation, other
11 near cen | | aneusomy & region marker
11q tel | 11 q tel | Mental Retardation, other
12p tel | 12 p tel | Mental Retardation, other
12 near cen | | aneusomy & region marker
12q tel | 12 q tel | Mental Retardation, other
13 near cen | | chromosome ploidy & region marker
13q tel | 13 q tel | Mental Retardation, other
RB1 | 13 q14 | Trisomy 13, other
14q tel | 14 q tel | Mental Retardation, other
14 near cen | | chromosome ploidy & region marker
15q tel | 15 q tel | Mental Retardation, other
15 near cen | |
SNRPN | 15 q11-q13 | Prader-Willi/Angelman syndromes
D15S10 | 15 q11-q13 | Prader-Willi/Angelman syndromes
16p tel | 16 p tel | Mental Retardation, other
16 near cen | | aneusomy & region marker
16q tel | 16 q tel | Mental Retardation, other
17p tel | 17 p tel | Mental Retardation, other
FLII | 17 p11 | Smith-Magenis syndrome
PMP22 or adjac | 17 p12 | CMT1A/HNPP
D17S258 | 17 p13 | Miller-Dieker syndrome/Isolated Lissencephaly
LIS1 | 17 p13 | Miller-Dieker syndrome/Isolated Lissencephaly
17 near cen | 17 p13 | aneusomy & region marker
17q tel | 17 q tel | Mental Retardation, other
18 near cen | | aneusomy & region marker
18p tel | 18 p tel | Mental Retardation, other
18q tel | 18 q tel | Mental Retardation, other
18p11.3 probe | 18 q11.3 | Tri/Iso Chromosome 18p
19p tel | 19 p tel | Mental Retardation, other
19 near cen | | aneusomy & region marker
19q tel | 19 q tel | Mental Retardation, other
20p tel | 20 p tel | Mental Retardation, other
JAG1 | 20 p 11 | Alagille syndrome
20 near cen | | aneusomy & region marker
20q tel | 20 q tel | Mental Retardation, other
21q tel | 21 q tel | Mental Retardation, other
21 near cen | | aneusomy & region marker
MNB or D21S55 | 21 q22.1 | Down syndrome
ERG | 21 q22.1 | Down syndrome
22q tel | 22 q tel | Mental Retardation, other
22q near cen | | Cat Eye syndrome
GSCL | 22 q11 | Velocardiofacial/DiGeorge syndromes
HIRA, TUPLE 1 | 22q11 | Velocardiofacial/DiGeorge syndromes
X/Y p tel | X/Y p tel | Mental Retardation, other
STS | X p22.3 | Ichthyosis, x-linked
KAL | X p22.3 | Kallmann syndrome
AR | Xq11-q12 | aneusomy & region marker
XIST | Xq13.2 | Region marker
Dystrophin exon | Xp 21 | Muscular Dystrophy
X/Y q tel | X/Y q tel | Mental Retardation, other
SRY | Y p11.3 | xx males, etc.
AZFB | Yq11.2 | male infertility/Yq marker
AZFC | Yq12 | male infertility/Yq marker
[0136] Another example is the AmpliOnc.TM. genomic DNA target
element array containing genomic sequences for each of the 52
oncogene or amplified gene loci listed in Table 2.
TABLE 2 AmpliOnc Loci
Gene or Chrom. Locus | Cyto. Location | Cancer Association
NRAS | 1p13.2 | Breast cell line
MYCL1 | 1p34.3 | Small cell lung cancer cell line, neuroblastoma cell line
FGR | 1p36.2-p36.1 |
LAMC2 | 1q25-q31 | Breast cell line
REL | 2p13-p12 | Non-Hodgkin's Lymphoma
ALK | 2p23 | lymphoma
MYCN (N-myc) | 2p24.3-q24.1 | Neuroblastoma
RAF1 | 3p25 | Non-small cell lung cancer
TERC (hTR) | 3q26 | Cervical, Head & Neck, Lung
PIK3CA | 3q26.3 | Ovarian
BCL6 | 3q27 | lymphoma
PDGFRA | 4q11-q12 | Glioblastoma
MYB | 6q22 | Colorectal; Leukemia; Melanoma
ESR1 (ER, ESR) | 6q25.1 | Breast
EGFR (ERBB1, ERBB) | 7p12.3-p12.1 | Glioma; Head & Neck
PGY1, MDR1 | 7q21 | Drug resistant cell lines
MET | 7q31 | Gastric
FGFR1, FLG | 8p11.2-p11.1 | Breast
MOS | 8q11 | Breast
ETO, MTG8, CBFA2T1 | 8q22 | leukemia
MYC (c-myc) | 8q24.12-q24.13 | Small cell lung, Breast, Esophageal, Cervical, Ovarian, Head & Neck, etc.
ABL1 (ABL) | 9q34.1 | CML
FGFR2 (BEK) | 10q26 | Breast
HRAS | 11p15.5 | Colorectal, Bladder
CCND1 (Cyclin D1, BCL1) | 11q13 | Head & Neck, Esophageal, Breast, Hepatic, Ovarian
FGF4 (HSTF1, HST) | 11q13 | Breast, Ovarian
FGF3 (INT2) | 11q13 | Breast, Ovarian, Gastric, Melanoma, Head & Neck
EMS1 | 11q13 | Breast, Bladder
GARP (D11S833E) | 11q13.5-q14 | Breast
PAK1 | 11q13.5-q14 | Breast
MLL (ALL1) | 11q23 | leukemia
KRAS2 | 12p12.1 | Colorectal, Gastric, Adrenocortical, Lung giant cell
CCND2 (Cyclin D2) | 12p13 | Lymphoma, CLL
TEL (ETV6) | 12p13 | leukemia
WNT1 (INT1) | 12q12-q13 | Retinoblastoma
SAS; CDK4 | 12q13-q14 | Sarcoma, glioma
GLI | 12q13.2-q13.3 | Sarcoma, glioma
MDM2 | 12q14.3-q15 | Sarcoma, glioma
AKT1 | 14q32.3 | Gastric
PML | 15q22 | leukemia
IGF1R | 15q25-q26 | rare amplicon
FES | 15q26.1 | rare amplicon
MRP | 16p13.1 | Drug resistant cell lines
MYH11 | 16p13.13-p13.12 | leukemia
CBFB | 16q22 | leukemia
RARA | 17q12 | leukemia
HER-2/neu (EGFR2) | 17q12-21 | Breast, Ovarian, Gastric
TOP2A | 17q21-q22 |
YES1 | 18p11.3 | Gastric
BCL2-3' segment | 18q21.3 | Non-Hodgkin's Lymphoma
BCL2-5' segment | 18q21.3 | Non-Hodgkin's Lymphoma
INSR (insulin receptor) | 19p13.2 | Breast
JUNB | 19p13.2 | HeLa cell lines
CCNE (Cyclin E) | 19q12 | Gastric, Ovarian
BCL3 | 19q13 | lymphoma
AIB1 | 20q12 | Breast
CSE1L (CAS) | 20q13 | Breast
MYBL2 | 20q13.1 | Breast
PTPN1 | 20q13.1-q13.2 | Breast
ZNF217 (ZABC1) | 20q13.2 | Breast
STK15 (BTAK, aurora 2) | 20q13.2 | Breast, ovarian, colon, prostate, neuroblastoma and cervical
AML1 (CBFA2) | 21q22.3 | leukemia
BCR | 22q11.21 | leukemia
EWSR1 (EWS) | 22q12 | sarcoma
PDGFB (SIS) | 22q12.3-q13.1 | Rhabdomyosarcoma, liposarcoma
AR | Xq11.2-q12 | Prostate
Note: Alternate names for a gene are shown in parentheses.
[0137] Genomic DNA target elements derived from the clones listed
in Table 2 contain human genomic DNA inserts of about 50 kb to
about 200 kb in a PAC, P1 or BAC vector. This array is produced
without separation of the vector sequences. Use of this array
permits simultaneous identification of genomic amplification of
each of these oncogene loci, as well as expression of the genes
which map into these regions.
[0138] Yet another example is an AmpliOnc II array, which contains
genomic DNA from the oncogene loci of Table 2, supplemented by
genomic DNA from the human tumor suppressor gene loci for: the p53,
RB1, WT1, APC, NF1, NF2, VHL, MEN1, MEN2A, DPC4, MSH2, MLH1, PMS1,
PMS2, P57/KIP2, PTCH, BRCA1, BRCA2, P16/CDKN2, EXT1, EXT2,
PTEN/MMAC1, ATM, and TP73 genes. The genomic DNA target elements
are produced by selecting genomic DNA clones from a human genomic
library that map to the loci for these tumor suppressor genes. This
selection is done by the preparation of PCR primer pairs from the
loci or genes and subsequent library screening to identify the
clones. In this embodiment, the clones for the tumor suppressor
loci can be about 20 kb to 250 kb, and are preferably about 50 kb
to about 200 kb in complexity.
[0139] (12) Utility of the Invention
[0140] The methods of the invention have significant utility in the
fields of genetic research, human disease management, human disease
clinical research, human disease drug development and
pharmacogenomics, human genetic research, animal drug development,
animal disease management, animal genetic research, and plant
genetic research. In particular, by enabling more precise genetic
detailing of suspected cancerous tissue, the invention will provide
improved disease management through more tailored diagnosis and
therapy selection. The methods can also be used to determine the
presence of viruses, viral integration into chromosomes and
expression of viral genes. The method can also be used to
simultaneously detect human genomic DNA abnormalities, human gene
expression and gene expression of bacterial genes.
[0141] The methods of the invention are particularly useful for
genomic disease management of cancer and other disease. For
example, the methods are useful for categorizing genotype and
phenotype of cancer, including those of the breast, prostate, lung
(small cell and non-small cell), ovary, cervix, kidney, head and
neck, pancreas, stomach, brain, soft tissue and skin, and of
various blood or lymphatic system cancers such as leukemias and
lymphomas. Once the tumor tissue genotype and phenotype are
categorized by the method of the invention, the physician can
combine this data with other clinical data to determine diagnosis,
prognosis, therapy and predict response to therapy.
[0142] The capabilities provided by the multi-color methods of the
invention enable rapid comparative testing in drug development. For
example, a cancer cell line can be dosed with a putative drug
compound and at desired time intervals thereafter a cell sample can
be removed. Each of the removed cell samples, for example,
collected at time 0, 10, 20 and 30 hours after dosing, is treated
to extract nucleic acids, which are then each labeled with a
separate fluor. The four populations are then applied to the array
with appropriate reference. The time-tracked effects of the drug on
expression and initial chromosome status are thus assessed.
Chromosomal change generally occurs over longer time periods and is
not expected to change in this example. The method also can be
applied to assess drug efficacy in drug resistant cell lines,
particularly as drug resistance can be caused by gene
amplification.
EXAMPLES
[0143] The following examples are intended to be merely
illustrative of the invention and are not to be construed as
limiting.
Example 1
[0144] (A) Procedures
[0145] (i) Test Array Manufacture:
[0146] Four inch.times.four inch chromium-coated plates (Nanofilm)
were scored by U.S. Precision Glass Company (Elgin, Ill.), and the
scoring marked 24 equally sized chips. A 180 target element
microarray was made on each chip. Before nucleic acid deposition,
the plate was washed consecutively with distilled water,
isopropanol, methanol and distilled water, allowed to dry and
equilibrated to room temperature. The microarray was deposited
centrally in each chip and occupied about 5 mm.times.6 mm of chip
surface. The microarray was made using a computer-controlled,
single needle fluid deposition robot supplied by New Precision
Technologies (Northbrook, Ill.). The robot was modified by addition
of a laser-based Z-axis controller, a pressure regulatable nitrogen
gas line hooked to the deposition pin and a platen sized to hold
twelve, 4".times.4" plates. The robot used multiple deposition
pins, each a 33 gauge, one-inch long steel capillary syringe needle
linked to a Luer lock syringe tip from EFD. The capillary pins were
each loaded with a different genomic DNA by loading into the Luer
lock portion of the needle. The needle was changed manually after
deposition of each target element on all chips on the platen. The
microarray was made with approximately 400 micron spacing between
target element centers in both the X and Y directions.
[0147] The robot was controlled with computer software provided
with the robot, which was modified to bring the capillary pin into
contact with the chip surface and, at the contact moment, to apply
a microburst of nitrogen pressure to the top of the pin. The
contact and microburst period was about 10 milliseconds per target
element. The gas pressure was about 1 psi and was regulated
manually, as necessary, to force sufficient amounts of the viscous
genomic DNA out of the pin. The control conditions were set to
deposit about 0.3 nl of 1 .mu.g/.mu.l nucleic acid in 100 mM NaOH
per spot. The deposited elements were approximately round, with
variations noticeable under microscope examination after DAPI
staining. The spot size also varied with the viscosity of the DNA.
Individual chips were separated manually.
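The deposition figures given above imply the approximate mass of DNA per target element and the footprint of the array. The arithmetic can be checked with the short sketch below. The 12 by 15 grid arrangement is an assumption made only for this example (the description gives the total of 180 elements, the spacing and the approximate area, but not the numbers of rows and columns).

```python
def dna_per_spot_ng(volume_nl, concentration_ug_per_ul):
    """Mass of DNA deposited per spot, in nanograms.

    1 ug/ul equals 1 ng/nl, so nanoliters times ug/ul gives nanograms.
    """
    return volume_nl * concentration_ug_per_ul


def grid_extent_mm(columns, rows, pitch_um):
    """Center-to-center width and height of a rectangular grid, in mm."""
    return ((columns - 1) * pitch_um / 1000, (rows - 1) * pitch_um / 1000)
```

At about 0.3 nl of 1 .mu.g/.mu.l solution, each element receives about 0.3 ng of DNA, or about 54 ng for all 180 elements. A hypothetical 12 by 15 grid at 400 micron spacing spans about 4.4 mm by 5.6 mm between outermost spot centers, which is consistent with the stated array area of about 5 mm.times.6 mm.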
[0148] The microarray comprised spots with genomic DNA from 31
human putative amplified gene loci, one spot of total human genomic
DNA, three control spots of pooled genomic DNA, each spot a pool of
equal amounts of genomic DNA for ten of these oncogene loci, and
one spot of lambda phage DNA. These thirty-six spots were
replicated five times each on the microarray to produce the one
hundred-eighty spot microarray. The 31 human putative amplified
gene loci are listed below, and were genomic human DNA inserted
into BAC, PAC or P1 cloning vectors. Each of the genomic DNA for
these loci was produced with DNA of a single BAC, PAC or P1 clone,
although the individual insert sizes were not uniform. These BAC
clones were obtained by screening the available genomic libraries
with a primer sequence for each locus, as follows:
GENE LOCUS | CLONE NO. | LIBRARY SOURCE.sup.1
MYCL1 | RMC01P052 | UCSF
FGR | RMC01P057 | UCSF
REL | BAC-274-P9 | GS
N-MYC | PAC-254-N16 | GS
RAF1 | BAC-98-L2 | GS
PIK3CA | PAC-97-B16 | GS
PDGFRA | BAC-619-M20 | GS
MYB | BAC-268-N4 | GS
EGFR | BAC-246-M20 | GS
MET | BAC-54-J7 | RG
FLG | BAC-566-K20 | GS
C-MYC | P1-469 | GS
ABL | PAC-763-A4 | RG
BEK | BAC-126-B28 | GS
HRAS1 | BAC-137-C7 | GS
BCL1 | PAC-128-18 | GS
INT2 | BAC-36-F16 | GS
KRAS | BAC-490-C21 | GS
WNT1 | BAC-400-H17 | GS
GLI | RMC12P001 | UCSF
CDK4 | BAC-561-N1 | GS
MDM2 | BAC-82-N15 | GS
AKT1 | BAC-466-A19 | GS
FES | P11-2298 | GS
HER2 | P1-506 | GS
YES1 | BAC-8-P19 | GS
JUNB | BAC-104-C10 | GS
20q13.2 | BAC-97 | GS
PDGFB | RMC22P003 | UCSF
AR | PAC-1097-P11 | RG
.sup.1GS is Genome Systems; RG is Research Genetics; UCSF is the
LBL/UCSF Resource for Molecular Cytogenetics, University of
California, San Francisco, Cancer Center. The clone number for each
locus is shown. Human insert sizes ranged from about 60 kb to about
212 kb; not all inserts were measured. Chromosome location for each
is in Table 2 above.
[0149] (ii) Tissue Extractions and Labeling:
[0150] For each of SJSA-1 and Colo 320 cell lines, obtained from
ATCC, the cells were centrifuged at 7,000 rpm at 4.degree. C. to
produce cell pellets. Supernatant was discarded. The pellets were
resuspended in Solution #2 of DNA Extraction Kit from Stratagene.
The pellets were homogenized using a mechanical homogenizer at
medium setting. Pronase was added to produce a pronase
concentration of 100 .mu.g/ml in each tube. Tubes were incubated
with shaking at 60.degree. C. for one hour. Tubes were placed on
ice for 10 minutes. Stratagene DNA Extraction Kit Solution #3 was
added and the tubes again placed on ice for 5 minutes. Tubes were
centrifuged for 15 minutes at 8,000 rpm at 4.degree. C. to pellet
the protein precipitate. The supernatant was decanted. RNase was
added to the supernatant to produce an RNase concentration of 20
.mu.g/ml and the supernatant incubated at 37.degree. C. for 15
minutes. Two times the volume of ethanol was added and then
centrifuged for 15 minutes at 10,000 rpm. Supernatant was decanted.
The DNA pellets were dried under vacuum with a Speed Vac. The DNA
pellets were resuspended in water and 995 .mu.l of 50 mM sodium
hydroxide added.
[0151] Cy-5 dUTP, from Amersham (Arlington Heights, Ill.), and a
fluorescein labeled dCTP, produced according to Cruickshank, were
used in nick translation to label the extracted DNA. The nick
translation of SJSA-1 DNA for Cy-5 dUTP incorporation used a standard
protocol with a Promega (Madison, Wis.) nick translation kit. For
Colo 320, 10 .mu.l of nick translation enzyme and 5 .mu.l of nick
translation buffer (both from Vysis, Inc.) were mixed with 1 .mu.g
of extracted Colo 320 DNA, 4 .mu.l each of dATP, dGTP and dTTP, 1
.mu.l of dCTP, 2 .mu.l of fluorescein dCTP, produced according to
Cruickshank, and sufficient water to produce 50 .mu.l of solution.
The mix was incubated at 37.degree. C. for 30 minutes. The enzyme
was heat inactivated by heating at 80.degree. C. for 10 minutes.
The solution was G-25 Spin Column purified and the labeled probe
dried with Speed Vac. for 40 minutes.
[0152] (iii) Hybridization:
[0153] The nick translated DNAs (415 ng each), reference DNA (415
ng SpectrumOrange Total Human DNA (Vysis, Inc.)), and Cot-1 DNA (100
.mu.g) (LTI, Bethesda, Md.) were mixed with about 15 .mu.l LSI
Hybridization Buffer (Vysis, Inc.) to produce 25 .mu.l of
hybridization mix. The hybridization mix was pipetted onto the chip
contained in a chip holder shown in FIG. 1. The chip was glued in
place in the holder using RTV 103 silicone rubber sealant (GE,
Waterford, N.Y.). The probe clip 33 of FIG. 1 was applied as
described above. The holder was then incubated at 37.degree. C.
overnight in an enclosed moisture chamber. After hybridization, the
probe clip was removed and the chip washed with 2.times.SSC at room
temperature for 5 minutes, then 2.times.SSC and 50% formamide at
40.degree. C. for 30 minutes, and then 2.times.SSC at room
temperature for 10 minutes. The washed chips were dried at room
temperature in the dark. Ten .mu.l of GEL/Mount.TM. and DAPI were
added and an 18 mm.times.18 mm glass cover clip was placed over the
array in the holder.
[0154] (iv) Image Capturing and Analysis:
[0155] A bread-board imaging apparatus of Che was used to capture
large field images of the hybridized array through the array
window, without removal of the probe clip or cover slip. The
bread-board imager included a dual filter wheel (Ludl), and single
band pass filters (Chroma Technology, Brattleboro, Vt.) for each of
DAPI, fluorescein, SpectrumOrange and Cy5 were used for excitation
and emission. Image data was processed using a Macintosh computer
running algorithms that carried out the following steps: (1) each
target element spot is located from the DAPI image and assigned its
grid location; (2) fluorescent intensities for each fluor at each
spot are determined; (3) fluorescent ratios, by mode, median and
mass, are calculated for each spot; (4) exclusion criteria based on
spot size and intensity threshold are applied; (5) composite images
are produced and displayed on a computer monitor; (6) displayed
images include white circles drawn around each spot and the number
of its grid location; (7) images can be printed on conventional
computer-based printers; and (8) raw and processed data and images
are stored.
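Steps (2) through (4) can be sketched as follows. This is a minimal illustration, not the software used in this Example. It assumes that each spot is available as two equal-length lists of non-negative, background-corrected pixel intensities (test fluor and reference fluor), that the ratio "by mass" is the quotient of the summed intensities, that "by median" is the median of the per-pixel ratios, and that "by mode" is the most populated bin of the per-pixel ratios. The function names, the bin width and the exclusion thresholds are hypothetical.

```python
from statistics import median


def spot_ratios(test_px, ref_px, bin_width=0.05):
    """Return (by_mode, by_median, by_mass) test/reference ratios for one spot.

    test_px, ref_px: equal-length lists of non-negative, background-corrected
    pixel intensities (assumed input format).
    """
    # Per-pixel ratios; pixels without reference signal are skipped.
    ratios = [t / r for t, r in zip(test_px, ref_px) if r > 0]
    if not ratios:
        raise ValueError("spot has no usable reference pixels")

    # "By mass" (assumed meaning): total test signal over total reference signal.
    by_mass = sum(test_px) / sum(ref_px)

    # "By median": median of the per-pixel ratios.
    by_median = median(ratios)

    # "By mode" (assumed meaning): nearest-bin histogram, most populated bin.
    counts = {}
    for x in ratios:
        b = round(x / bin_width)
        counts[b] = counts.get(b, 0) + 1
    by_mode = max(counts, key=counts.get) * bin_width

    return by_mode, by_median, by_mass


def passes_exclusion(ref_px, min_pixels=20, min_mean_ref=50.0):
    """Hypothetical exclusion test on spot size and reference intensity."""
    if len(ref_px) < min_pixels:
        return False
    return sum(ref_px) / len(ref_px) >= min_mean_ref


# Illustrative spot: five pixels at ratio 1.0 and one outlier at 3.0.
mode_r, median_r, mass_r = spot_ratios([100] * 5 + [300], [100] * 6)
```

In this toy spot the single bright pixel pulls the by-mass ratio above 1 while the mode and median stay at 1, which is one plausible reason for reporting all three estimates side by side.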
[0156] (B) Results
[0157] The fluorescence ratio for the Colo 320 compared to
reference is shown in Table 3. As Table 3 indicates, the oncogene
CMYC was amplified 32 fold in the Colo 320 cells. This compares to
the known amplification of CMYC in Colo 320 of 29.+-.6 fold
(calculated from average of published data). A pseudo-colored
composite image of the hybridization results showed significant
color intensity for the CMYC elements, which also indicated
amplification of the CMYC locus. Table 4 shows the fluorescent
ratio analysis results for the SJSA-1 cells compared to reference.
Table 4 shows the GLI (9.4 fold), MDM2 (7.5 fold) and CDK4/SAS
(12.1 fold) loci are each amplified in SJSA-1 cells. A
pseudo-colored composite image of the hybridization results showed
significant color intensity for the GLI, MDM2 and CDK4/SAS
elements, also indicating amplification. Table 5 shows the
fluorescent ratio of the Colo 320 signal compared to the SJSA-1
signal for most targets is around 1. However, the low ratios for
GLI (0.12), MDM2 (0.13) and CDK4/SAS (0.09) indicate that these gene
loci were amplified in SJSA-1 cells relative to the Colo 320 cells.
The high ratio for target CMYC (40) indicates the CMYC amplification
in the Colo 320 cells. The gene amplification observed with three
probes (two sample probes and one reference probe) hybridized
simultaneously to one chip was similar to that obtained by separate
hybridizations of the SJSA-1 and Colo 320 DNAs onto separate chips.
(Subsequent to data collection, it was learned that the clone for
the AKT2 locus was not correctly mapped. The data shown in Tables
3, 4 and 5 and in FIG. 2(a) through 2(h) for the AKT-2 target
element are, thus, not meaningful.)
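As a rough consistency check, the Colo 320/SJSA-1 ratios of Table 5 can be compared with the quotient of the corresponding by-mass ratios of Tables 3 and 4, since both test samples were measured against the same reference DNA. The sketch below uses only values copied from the three tables; the variable and function names are illustrative, and each table was normalized separately, so only approximate agreement should be expected.

```python
# By-mass normalized ratios copied from Tables 3, 4 and 5.
colo_vs_ref = {"GLI": 1.16, "MDM2": 0.98, "C-MYC": 32.30, "CDK4": 1.12}   # Table 3
sjsa_vs_ref = {"GLI": 9.39, "MDM2": 7.51, "C-MYC": 0.81, "CDK4": 12.06}   # Table 4
colo_vs_sjsa = {"GLI": 0.12, "MDM2": 0.13, "C-MYC": 40.45, "CDK4": 0.09}  # Table 5


def predicted_ratio(target):
    """Colo 320/SJSA-1 ratio implied by the two test/reference ratios."""
    return colo_vs_ref[target] / sjsa_vs_ref[target]


def relative_difference(target):
    """Relative gap between the implied ratio and the Table 5 value."""
    measured = colo_vs_sjsa[target]
    return abs(predicted_ratio(target) - measured) / measured


for target in colo_vs_sjsa:
    print(f"{target}: predicted {predicted_ratio(target):.2f}, "
          f"measured {colo_vs_sjsa[target]:.2f}")
```

For these four targets the implied and directly measured ratios differ by less than 5%, which is consistent with the statement that the three-probe hybridization gave results similar to the separate test/reference comparisons.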
[0158] This Example 1 is the first demonstration known to the
applicants of a comparative hybridization of more than two
separately-labeled nucleic acid populations to the same array.
These results demonstrate the simultaneous hybridization of three
separately-labeled nucleic acid populations to a microarray to
detect status of tissue nucleic acids.
TABLE 3 Test/Reference ratio analysis for the hybridization
results of Example 1. CMYC amplification in Colo 320 cells was
observed.

Norm. Ratio:       (by mode)    (by median)    (by mass)
Tgt. Name     #     Mean   Cu     Mean   Cu     Mean   Cu   CorrC.
1    THD      5     0.96   4%     1.04   3%     1.02   3%   0.951
2    Lamb     5     1.99  23%     2.47  13%     2.01  36%   0.446
3    PDGFB    5     0.81  11%     0.96   3%     0.96   3%   0.934
4    EGFR     5     0.83  12%     0.97   3%     0.94   3%   0.880
5    PDGFRA   5     0.68   4%     0.86   2%     0.83   2%   0.969
6    MYB      5     0.68  12%     0.75   6%     0.75   4%   0.941
7    WNT 1    5     1.21   6%     1.29   3%     1.29   3%   0.973
8    HRAS 1   5     1.48   9%     1.70   5%     1.65   4%   0.961
9    MET      5     0.80  15%     0.91   2%     0.90   3%   0.940
10   BEK      5     0.61   5%     0.77  14%     0.75  10%   0.943
11   HER2     5     1.11  10%     1.22   3%     1.16   1%   0.956
12   BCL 1    5     0.68   8%     0.75   4%     0.75   3%   0.961
13   YES 1    5     0.85   3%     0.94   1%     0.93   1%   0.970
14   RAF1     5     0.91  28%     1.09   2%     0.99   4%   0.931
15   GLI      5     1.04   7%     1.15   2%     1.16   3%   0.949
16   MDM2     5     0.88   4%     0.97   3%     0.98   3%   0.968
17   C-MYC    5    28.74   6%    33.37   4%    32.30   2%   0.976
18   20Q13.2  5     0.77   6%     0.88   5%     0.86   3%   0.976
19   REL      5     0.97   2%     1.07   2%     1.04   2%   0.946
20   MYCL1    5     0.99   9%     1.14   5%     1.09   4%   0.957
21   FGR      5     0.92  21%     0.94   3%     0.93   2%   0.970
22   FES      5     0.87   7%     0.98   4%     0.96   4%   0.962
23   ABL      5     1.12  10%     1.33   6%     1.25   1%   0.947
24   INT2     5     0.72   4%     0.86   4%     0.84   3%   0.952
25   PIK3CA   5     0.83  11%     0.89   3%     0.87   7%   0.952
26   N-MYC    5     1.02   5%     1.13   2%     1.12   2%   0.792
27   AKT2     5     1.15   7%     1.21   4%     1.22   4%   0.964
28   FLG      5     1.03   8%     1.12   5%     1.12   4%   0.913
29   JUNB     5     0.92   4%     0.99   1%     0.97   1%   0.834
30   AKT1     5     1.01   2%     1.06   4%     1.03   2%   0.906
31   KRAS     5     0.90  11%     1.02   6%     1.00   6%   0.965
32   CDK4     5     1.02   5%     1.17   2%     1.12   2%   0.968
33   A.R      5     0.78   4%     0.85   2%     0.84   3%   0.961
34   c1       5     0.96   9%     1.12   7%     1.10   7%   0.852
35   c2       5     4.94  22%     5.68  11%     5.27   9%   0.967
36   c3       5     0.93   3%     1.01   2%     1.01   1%   0.976
All           178          9%            4%            4%   0.928
Normalizer          0.40          0.38          0.37
[0159]
TABLE 4 Test/Reference ratio analysis for the hybridization
results of Example 1. GLI, MDM2 and CDK4/SAS amplification in
SJSA-1 cells was observed.

Norm. Ratio:       (by mode)    (by median)    (by mass)
Tgt. Name     #     Mean   Cu     Mean   Cu     Mean   Cu   CorrC.
1    THD      5     1.39   3%     1.15   2%     1.18   3%   0.976
2    Lamb     5     0.93  16%     0.65  18%     0.61  57%   0.563
3    PDGFB    5     1.21   8%     0.98   4%     0.99   2%   0.973
4    EGFR     5     1.40  16%     1.14   6%     1.15   4%   0.968
5    PDGFRA   5     1.25   3%     0.98   2%     0.99   2%   0.988
6    MYB      5     1.24  11%     1.01   6%     1.06   4%   0.980
7    WNT 1    5     1.30   6%     1.04   4%     1.03   4%   0.976
8    HRAS 1   5     1.15   7%     0.91   7%     0.93   5%   0.980
9    MET      5     1.31   6%     1.00   4%     1.03   3%   0.977
10   BEK      5     1.25   5%     0.92   6%     0.92   8%   0.941
11   HER2     5     1.12   2%     0.85   1%     0.90   2%   0.976
12   BCL 1    5     2.49   4%     1.94   4%     1.96   3%   0.987
13   YES 1    5     1.32   2%     1.09   1%     1.08   1%   0.988
14   RAF1     5     1.20  10%     0.92   4%     1.01   1%   0.969
15   GLI      5    11.55   4%     9.18   2%     9.39   3%   0.982
16   MDM2     5    10.21  11%     7.39  12%     7.51  10%   0.976
17   C-MYC    5     1.03   4%     0.81   2%     0.81   2%   0.984
18   20Q13.2  5     1.14   8%     0.98   3%     0.99   2%   0.983
19   REL      5     1.27   2%     1.06   2%     0.99  13%   0.821
20   MYCL1    5     1.40   4%     1.09   3%     1.13   1%   0.987
21   FGR      5     1.23   5%     0.97   3%     0.99   3%   0.986
22   FES      5     1.19   2%     0.95   2%     0.94   2%   0.979
23   ABL      5     0.92  12%     0.67  16%     0.71  10%   0.968
24   INT2     5     1.78   4%     1.44   2%     1.50   2%   0.980
25   PIK3CA   5     1.03   5%     0.88   5%     0.85   7%   0.745
26   N-MYC    5     1.47   5%     1.24   2%     1.16   1%   0.987
27   AKT2     5     1.23   6%     1.01   3%     1.03   3%   0.968
28   FLG      5     1.66   5%     1.35   1%     1.35   1%   0.956
29   JUNB     5     1.26   4%     1.01   1%     1.03   3%   0.949
30   AKT1     5     1.11   2%     0.91   2%     0.92   3%   0.972
31   KRAS     5     1.23  11%     1.05   2%     1.06   1%   0.989
32   CDK4     5    15.46   5%    11.69   6%    12.06   2%   0.976
33   A.R      5     0.98   3%     0.77   2%     0.77   2%   0.986
34   c1       5     1.44   9%     1.16   3%     1.19   2%   0.951
35   c2       5     3.71  15%     2.68   6%     3.01   5%   0.978
36   c3       5     4.09   2%     3.29   3%     3.33   2%   0.989
All           176          6%            4%            5%   0.954
Normalizer          1.00          1.20          1.21
[0160]
TABLE 5 Test/Reference ratio analysis for the hybridization
results of Example 1. GLI, MDM2 and CDK4/SAS amplification in
SJSA-1 cells and CMYC amplification in Colo 320 cells were
observed.

Norm. Ratio:       (by mode)    (by median)    (by mass)
Tgt. Name     #     Mean   Cu     Mean   Cu     Mean   Cu   CorrC.
1    THD      5     0.92   6%     0.91   5%     0.88   4%   0.934
2    Lamb     5     3.24   0%     4.05  21%     4.39  52%   0.228
3    PDGFB    5     0.88   8%     0.98   5%     0.97   2%   0.904
4    EGFR     5     0.70  11%     0.88   7%     0.85   6%   0.856
5    PDGFRA   5     0.77   6%     0.91  13%     0.86   4%   0.963
6    MYB      5     0.62   3%     0.77  11%     0.72   3%   0.936
7    WNT 1    5     1.14   8%     1.24   2%     1.28   8%   0.919
8    HRAS 1   5     1.58  12%     1.96  12%     1.82   9%   0.944
9    MET      5     0.77  13%     0.91   6%     0.90   6%   0.928
10   BEK      5     1.22  94%     1.06  19%     0.90  19%   0.823
11   HER2     5     1.26  11%     1.48   3%     1.31   2%   0.933
12   BCL 1    5      0.3  10%     1.38   8%     0.39   6%   0.951
13   YES 1    5     0.83   4%     0.86   2%     0.88   2%   0.979
14   RAF1     5     1.10  16%     1.22   2%     1.00   4%   0.902
15   GLI      5     0.12   2%     0.13   2%     0.12   3%   0.937
16   MDM2     5     0.12  11%     0.14  17%     0.13  13%   0.960
17   C-MYC    5    36.47   6%    43.14  10%    40.45   2%   0.967
18   20Q13.2  5     0.88   6%     0.92   4%     0.89   2%   0.928
19   REL      5     0.98   6%     1.00   2%     1.02   2%   0.969
20   MYCL1    5     0.93  13%     1.06   8%     0.98   5%   0.959
21   FGR      5     0.88   6%     0.96   5%     0.96   5%   0.949
22   FES      5     0.99   7%     1.04   5%     1.03   3%   0.948
23   ABL      5     2.00  39%     2.18  27%     1.99  30%   0.926
24   INT2     5     0.57   8%     0.60   5%     0.57   5%   0.924
25   PIK3CA   5     1.44  70%     1.00   6%     1.05   4%   0.925
26   N-MYC    5     0.91   9%     1.01   1%     0.99   3%   0.959
27   AKT2     5     1.15   9%     1.18   2%     1.20   2%   0.906
28   FLG      5     0.85  12%     0.83   4%     0.84   4%   0.865
29   JUNB     5     0.97   5%     0.98   3%     0.97   3%   0.918
30   AKT1     5     1.21   6%     1.17   3%     1.15   2%   0.893
31   KRAS     5     0.91   9%     0.96   9%     0.96   7%   0.968
32   CDK4     5     0.09   4%     0.11   9%     0.09   3%   0.960
33   A.R      5     1.00   6%     1.08   2%     1.12   1%   0.960
34   c1       5     0.93  11%     0.99   5%     0.93   3%   0.824
35   c2       5     2.68  18%     2.40   8%     1.78   5%   0.939
36   c3       5     0.29   3%     0.31   5%     0.31   3%   0.966
All           180         13%            7%            7%   0.954
Normalizer          0.31          0.32          0.30
Example 2
[0161] (A) Procedures
[0162] (i) Array:
[0163] The same 180 element microarray of Example 1 was used.
[0164] (ii) Tissue Extraction and Labeling:
[0165] Two cell lines were used in this experiment, Colo 320 and
K562, both from ATCC. Five million cells of each were spun down
(1.5K for 10 min.) to pellet. After decanting, 100 .mu.l RNase
solution and 300 .mu.l lysis solution were added to the pellet and
the mixture was vortexed briefly at high speed. The mRNA for each
cell line was isolated by nitrocellulose-polyT using the isolation
protocol provided by the manufacturer (Ambion, Tex.).
[0166] The isolated mRNA was ethanol precipitated and reverse
transcribed in the presence of Cy-5-dCTP (Amersham) using a
conventional protocol primed with random pN9 to produce the
Cy-5 labeled cDNA probe, of which one-fifth was used for each
hybridization assay (one million cells for each assay). DNA was
isolated for each cell line with conventional phenol-chloroform
extraction and labeled with nick translation in the presence of
fluorescein dCTP as in Example 1 to produce the labeled gDNA.
[0167] (iii) Hybridization:
[0168] Each hybridization had a total volume of 25 .mu.l
consisting of 15 .mu.l LSI hybridization buffer (Vysis, Inc.), 200
ng cell line gDNA probe, 200 ng cell line cDNA probe, 200 ng
SpectrumOrange Total Human Genomic DNA (Vysis, Inc.) as the
reference, 20 .mu.g salmon sperm DNA and 40 .mu.g Cot-1 DNA.
Hybridization was to microarrays in chip holders with probe clip as
in Example 1, and was carried out at 42.degree. C. in an enclosed
moisture chamber for three days. For each cell line, the
hybridization was duplicated on two chips. The overall process is
shown below: 1
[0169] (iv) Image Capturing and Data Analysis:
[0170] Fluorescent images of hybridized chips were taken and
analyzed, as in Example 1, with the breadboard dual-filter wheel
imaging system of Che. Single-band pass filters were used for both
excitation and emission. Images were analyzed with the same
software as in Example 1.
[0171] (B) Results
[0172] General description of figures: Data are presented as
scatter plots and/or bar graphs. The scatter plots, with each point
corresponding to a particular target clone, serve as statistical
representation of data sets. The information for any given target
clone can be extracted from the bar graphs.
[0173] (i) Signal Intensity:
[0174] The intensities of background corrected signal for the genes
in the microarray were comparable between tissue cDNA (average of
165 counts for 10 seconds exposure) and tissue gDNA (average of 187
counts for 10 s exposure). Background associated with cDNA
detection was higher, 132 counts as compared to 73 counts for gDNA.
For both cDNA and gDNA, even the weakest signals were well above
background (S/B>1) with 60 seconds exposure, provided that
enough probe was deposited on the chip.
[0175] (ii) Data Reliability:
[0176] FIG. 2(a) shows the correlation of genomic DNA hybridization
data obtained from two hybridizations for each of the cell lines.
The linear regression correlations of the data for Colo 320 and K562 are
0.9963 and 0.9999, respectively, indicating high reliability of the
data. As expected, the ratios of the tissue gDNA over human
reference gDNA formed a cluster for a majority of the target
element genes (around one after normalization). Ratios that were
distant from the cluster indicate gene amplifications in the cell
lines for the corresponding genes (CMYC in Colo 320 and ABL in
K562). It is interesting to note that for both cell lines that were
tested, the "normal" cluster spans a ratio range from 0.5 to 1.5.
Within this range, the values of the ratio were highly reproducible
between experiments and they were distributed such that it was
believed unreliable to identify any particular gene within this
cluster as deleted or amplified.
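The two checks described above, replicate agreement and membership in the "normal" cluster, can be sketched as follows. This is an illustration only: it assumes the reported "linear regression correlation" is the Pearson correlation coefficient between the ratios from the two replicate hybridizations (it may instead be its square), and it takes the 0.5 to 1.5 range from the text as the cluster limits. The function names and the example values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def outside_normal_cluster(ratios, low=0.5, high=1.5):
    """Return the targets whose normalized ratio falls outside [low, high]."""
    return {name: r for name, r in ratios.items() if r < low or r > high}


# Hypothetical normalized gDNA/reference ratios from two replicate chips.
chip_1 = {"T1": 0.95, "T2": 1.10, "T3": 0.80, "T4": 25.0}
chip_2 = {"T1": 0.97, "T2": 1.05, "T3": 0.85, "T4": 24.0}

targets = sorted(chip_1)
r = pearson_r([chip_1[t] for t in targets], [chip_2[t] for t in targets])
candidates = outside_normal_cluster(chip_1)
```

Note that a single strongly amplified target dominates the correlation coefficient, so a value close to 1 mainly confirms that the outlier is reproduced; the spread inside the cluster is better judged from the scatter plot itself.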
[0177] FIG. 2(b) shows the reliability of gene expression
hybridization data obtained from two hybridizations for each cell
line. The linear regression correlations of the two sets of data for
Colo 320 and K562 were 0.9989 and 0.9790, respectively.
[0178] (iii) Assay Multiplexing:
[0179] FIG. 2(c) (for K562 cell line) and FIG. 2(d) (for Colo 320
cell line) demonstrate the assay multiplexing achieved with the new
assay format. With a separate genomic DNA assay, one could detect
only the genomic copy numbers (relative to human reference) of the
target sequences (green bars). With an expression cDNA assay, one
could only detect the expression profile (some equivalence of red
bars). With the method of the invention, the genomic and expression
data were acquired simultaneously.
[0180] (iv) Use of Normal Human Total gDNA as Reference for
Expression Assay:
[0181] Normally, because of lack of a "universal" or "normal"
reference, the expression levels of two samples can be compared
reliably only when the expression assays for the two samples are
performed on the same chip in separate assays. Example 2 used total
normal human gDNA as the reference nucleic acid for expression
assay. When using the tissue cDNA and reference gDNA labeled with
fluorochromes of different color, after hybridization, the
fluorescent intensity ratio of the two colors should reflect the
initial concentration ratio of the cDNA and reference gDNA in the
probe solution. If a particular reference gDNA is readily available
and its copy numbers of gene specific sequences do not change (i.e.,
are "stable") or vary only negligibly, then it can be used as a
universal reference for all expression assays. The expression
profile can be expressed as the ratio of cDNA over reference gDNA
as shown in FIG. 2(e). This ratio profile depends on the sample and
on the sample only. In other words, if two expression assays of the
same sample are carried out in two separate hybridizations on two
different chips comprising the same array, the expression profiles
obtained from the two assays should differ only by a scaling factor
which is constant for all targets. Different samples will exhibit
different expression profiles (expressed as ratio to reference
genomic DNA). Comparison of FIGS. 2(b) and 2(e) shows that the
expression profiles are indeed dependent on the sample and on the
sample only. With the use of total human genomic DNA as a reference
for expression analysis in the methods of the invention, the
expression profiles of different samples can be compared even if
the assays are carried out separately and independently.
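The scaling-factor argument can be illustrated with a short sketch. It assumes each expression profile is a mapping from target name to the cDNA/reference-gDNA ratio, estimates the constant factor as the median of the per-target quotients, and then measures how far the rescaled profile deviates from the other one. The function names and the example values are hypothetical and are not taken from the figures.

```python
from statistics import median


def scale_factor(profile_a, profile_b):
    """Estimate k such that profile_b is approximately k * profile_a."""
    quotients = [profile_b[t] / profile_a[t]
                 for t in profile_a
                 if t in profile_b and profile_a[t] > 0]
    if not quotients:
        raise ValueError("profiles share no usable targets")
    return median(quotients)


def max_relative_deviation(profile_a, profile_b):
    """Largest relative gap between profile_b and the rescaled profile_a."""
    k = scale_factor(profile_a, profile_b)
    shared = [t for t in profile_a
              if t in profile_b and profile_a[t] > 0 and profile_b[t] > 0]
    return max(abs(profile_b[t] - k * profile_a[t]) / profile_b[t]
               for t in shared)


# Hypothetical cDNA/gDNA ratio profiles of one sample on two chips.
assay_1 = {"T1": 0.50, "T2": 1.20, "T3": 3.00}
assay_2 = {"T1": 1.00, "T2": 2.40, "T3": 6.00}

k = scale_factor(assay_1, assay_2)
deviation = max_relative_deviation(assay_1, assay_2)
```

A small deviation after rescaling is what the text predicts for two assays of the same sample; two different samples would be expected to leave target-specific deviations that no single factor removes.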
[0182] (v) Correlating Genomic Amplification to Gene
Over-Expression:
[0183] FIGS. 2(f) and 2(g) are plots of genomic copy number vs cDNA
(both relative to reference genomic DNA) for K562 and Colo 320 cell
lines, respectively.
[0184] As expected, within a cell line, except for the amplified
genes, the expression levels for the rest of the genes analyzed
varied widely while their genomic copy number remained relatively
constant. As shown in FIG. 2(e), in both cell lines, for some
genes, such as JUNB, HRAS1, GLI, the cDNAs are more abundant while
for others, such as PDGFRA, BEK, MDM2, the cDNAs are less abundant.
Significantly, for C-MYC and ABL, the expression levels are very
different for the two cell lines and the trend is in accordance
with their amplification at the genomic level. The over-expression
of C-MYC in Colo 320 and ABL in K562 can be attributed to gene
amplification. FIG. 2(h) is the plot of "gene expression" ratio vs
"gene copy number" ratio between the two cell lines. Interestingly,
there was a remarkable correlation between the two quantities
(linear regression results: Y=0.262X+0.724, correlation 0.985). In
the graph, genes that are unamplified in both cell lines form a
cluster, while genes that are unequally amplified in the two cell
lines are separated from the cluster. This graph, or more
generally, the simultaneous genomic and expression assay,
facilitates reliable attribution of over- or under-expression to
gene amplification or deletion.
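The regression reported for FIG. 2(h) can be reproduced in form with an ordinary least-squares fit of the expression ratio (Y) on the copy number ratio (X). The sketch below shows such a fit and evaluates the reported line; only the coefficients 0.262 and 0.724 come from the text, and the function names are hypothetical. Whether the original analysis used ordinary least squares is an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept) of y = slope*x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def reported_fit(copy_number_ratio):
    """Expression ratio predicted by the line reported in the text."""
    return 0.262 * copy_number_ratio + 0.724


# A gene with equal copy number in both cell lines (X = 1).
unamplified = reported_fit(1.0)
```

For X = 1 the reported line gives Y = 0.986, close to 1, which matches the description of unamplified genes forming a cluster. The slope below 1 suggests that, in this data set, a given fold difference in copy number corresponded to a smaller fold difference in the measured expression ratio.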
[0185] The specification of this application is not intended to be
limiting as to the scope of the invention. All patents, patent
applications and published references cited herein are hereby
incorporated by reference. The scope of the invention is determined
by the following claims, including any and all equivalents
thereof.
* * * * *
References