U.S. patent application number 12/735843 was filed with the patent office on 2011-07-21 for identification of pediatric onset inflammatory disease loci and methods of use thereof for the diagnosis and treatment of the same.
Invention is credited to Jonathan Bradfield, Struan Grant, Hakon Hakonarson, Marcin Imielinski.
Application Number | 20110177502 12/735843 |
Document ID | / |
Family ID | 40986189 |
Filed Date | 2011-07-21 |
United States Patent Application | 20110177502 |
Kind Code | A1 |
Hakonarson; Hakon ; et al. | July 21, 2011 |
IDENTIFICATION OF PEDIATRIC ONSET INFLAMMATORY DISEASE LOCI AND
METHODS OF USE THEREOF FOR THE DIAGNOSIS AND TREATMENT OF THE
SAME
Abstract
Compositions and methods for detection and treatment of
inflammatory bowel disease are provided.
Inventors: | Hakonarson; Hakon; (Malvern, PA) ; Bradfield; Jonathan; (Philadelphia, PA) ; Imielinski; Marcin; (Cambridge, MA) ; Grant; Struan; (Philadelphia, PA) |
Family ID: | 40986189 |
Appl. No.: | 12/735843 |
Filed: | February 19, 2009 |
PCT Filed: | February 19, 2009 |
PCT NO: | PCT/US09/34586 |
371 Date: | April 1, 2011 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
61029841 | Feb 19, 2008 | |
61059486 | Jun 6, 2008 | |
Current U.S. Class: | 435/6.11 ; 435/29; 536/23.1 |
Current CPC Class: | C12Q 1/6883 20130101; C12Q 2600/156 20130101; C12Q 2600/158 20130101; C12Q 2600/136 20130101; C12Q 2600/172 20130101 |
Class at Publication: | 435/6.11 ; 536/23.1; 435/29 |
International Class: | C12Q 1/68 20060101 C12Q001/68; C07H 21/00 20060101 C07H021/00; C12Q 1/02 20060101 C12Q001/02 |
Claims
1. A method for detecting a propensity for developing IBD, the
method comprising: detecting the presence of a single nucleotide
polymorphism on chromosome 20q13 in a target polynucleotide wherein
if said single nucleotide polymorphism is present, said patient has
an increased risk for developing IBD, wherein said single
nucleotide polymorphism is a T at rs2315008 or an A at rs4809330 in
the TNFRSF6B gene.
2. A method for detecting a propensity for developing inflammatory
bowel disease (IBD), the method comprising: detecting the presence
of at least one single nucleotide polymorphism in a target
polynucleotide wherein if said at least one single nucleotide
polymorphism is present, said patient has an increased risk for
developing IBD, wherein said at least one single nucleotide
polymorphism is set forth in a Table selected from the group
consisting of Table 6A, Table 6B, Table 13, Table 14, Table 15,
Table 16, Table 17, Table 18, and Table 19.
3. A method as claimed in claim 1, wherein the target nucleic acid
is amplified prior to detection.
4. The method of claim 1, wherein the step of detecting the
presence of said single nucleotide polymorphism further comprises
the step of analyzing a polynucleotide sample to determine the
presence of said single nucleotide polymorphism by performing a
process selected from the group consisting of detection of specific
hybridization, measurement of allele size, restriction fragment
length polymorphism analysis, allele-specific hybridization
analysis, single base primer extension reaction, and sequencing of
an amplified polynucleotide.
5. A method as claimed in claim 1, wherein the target nucleic
acid is DNA.
6. The method of claim 1, wherein nucleic acids comprising said
polymorphism are obtained from an isolated cell of the human
subject.
7. A method for detecting a propensity for developing IBD, the
method comprising: detecting the presence of a single nucleotide
polymorphism on chromosome 21q21 wherein if said single nucleotide
polymorphism is present, said patient has an increased risk for
developing IBD, wherein said single nucleotide polymorphism is an A
at rs2836878 in the PSMG1 gene.
8. A method as claimed in claim 7, wherein the target nucleic acid
is amplified prior to detection.
9. The method of claim 7, wherein the step of detecting the
presence of said single nucleotide polymorphism further comprises
the step of analyzing a polynucleotide sample to determine the
presence of said single nucleotide polymorphism by performing a
process selected from the group consisting of detection of specific
hybridization, measurement of allele size, restriction fragment
length polymorphism analysis, allele-specific hybridization
analysis, single base primer extension reaction, and sequencing of
an amplified polynucleotide.
10. A method as claimed in claim 7, wherein the target nucleic
acid is DNA.
11. The method of claim 7, wherein nucleic acids comprising said
polymorphism are obtained from an isolated cell of the human
subject.
12. An isolated nucleic acid comprising a single nucleotide
polymorphism associated with an increased risk of developing IBD
selected from the group consisting of a T at rs2315008, or an A at
rs4809330 in the TNFRSF6B gene and an A at rs2836878 in the PSMG1
gene.
13. A solid support comprising a nucleic acid comprising the
polymorphism of claim 12.
14. A method for identifying agents which modulate aberrant
physiological processes associated with IBD, comprising, a)
providing colonic biopsy samples expressing a single nucleotide
polymorphism as claimed in claim 12; b) providing colonic biopsy
samples which express the cognate sequences which lack the
polymorphisms of step a); c) contacting the cells of steps a) and
b) with a test agent and d) analyzing whether said agent alters an
aberrant physiological process associated with IBD in samples of
step a) relative to those of step b), thereby identifying agents
which modulate inflammatory bowel disease.
15. The method of claim 14, wherein said aberrant physiological
process associated with IBD is selected from the group consisting
of a defect in the colonic mucosal barrier, defects in bacterial
clearance and dysregulation of immune responses to commensal
intestinal bacteria.
16. The method as claimed in claim 2, wherein the target nucleic
acid is amplified prior to detection.
17. The method as claimed in claim 2, wherein the step of detecting
the presence of said single nucleotide polymorphism further
comprises the step of analyzing a polynucleotide sample to
determine the presence of said single nucleotide polymorphism by
performing a process selected from the group consisting of
detection of specific hybridization, measurement of allele size,
restriction fragment length polymorphism analysis, allele-specific
hybridization analysis, single base primer extension reaction, and
sequencing of an amplified polynucleotide.
18. A method as claimed in claim 2, wherein the target nucleic
acid is DNA.
19. The method of claim 2, wherein nucleic acids comprising said
polymorphism are obtained from an isolated cell of the human
subject.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional
Applications 61/029,841 and 61/059,486 filed Feb. 19, 2008 and Jun.
6, 2008 respectively, the entire disclosures of each being
incorporated herein by reference.
FIELD OF THE INVENTION
[0002] This invention relates to the fields of inflammatory
disorders and genetic testing. More specifically, the invention
provides compositions and methods for the diagnosis and treatment
of inflammatory bowel disease (IBD) in pediatric and adult
patients.
BACKGROUND OF THE INVENTION
[0003] Several publications and patent documents are cited
throughout the specification in order to describe the state of the
art to which this invention pertains. Each of these citations is
incorporated by reference herein as though set forth in full.
[0004] Inflammatory bowel disease (IBD) is a common inflammatory
disorder with complex etiology that involves both genetic and
environmental triggers, including but not limited to defects in
bacterial clearance, a defective mucosal barrier and persistent
dysregulation of the immune response to commensal intestinal
bacteria¹⁻³. IBD is characterized by two distinct phenotypes:
Crohn's disease (CD) and ulcerative colitis (UC). Among children,
CD is twice as common as UC. CD can affect any part of the gut with
discontinuous penetrating lesions and is characterized by full
thickness (transmural), discrete inflammation which leads to
stricturing and fistulization, and can occur in the large and small
bowel. UC, by contrast, presents as a confluent inflammation of
the colon, nearly always involving the rectum, ranging from
proctitis to pancolitis, and is characterized by mucosal
inflammation⁴. CD affects 100-250/100,000 and UC affects
80-100/100,000 individuals in the UK and the USA.
[0005] Recurrence of both CD and UC among families⁵⁻⁸,
twin studies⁹, phenotype concordance among families¹⁰⁻¹²,
identification of specific genetic risk factors, and environmental
components all demonstrate that both disorders are complex genetic
diseases.
[0006] Linkage studies facilitated the 'positional cloning' of the
first two genes involved in the pathogenesis of the disease¹³,
including CARD15 (caspase recruitment domain family, member 15;
also known as NOD2), which is now considered the first and most
widely replicated CD susceptibility gene¹⁴⁻¹⁶. The IBD5
locus, a site on chromosome 5q31, and its association with CD¹⁷⁻¹⁹
have not been further resolved due to extensive linkage
disequilibrium (LD) in the region²⁰.
[0007] With the more recent introduction of genome-wide association
(GWA) technology, several genes involved in the pathogenesis of IBD
have been uncovered. Duerr et al. were the first to report a highly
significant association between CD and sequence variants in the
interleukin 23 receptor (IL23R) gene on chromosome 1p31 in
non-Jewish, ileal CD cases of European ancestry using the HumanHap
317K gene chip from Illumina²⁰. A coding variant, rs11209026
(Arg381Gln), was shown to confer a strong protective effect against
the disease and was then replicated in the same study in separate
cohorts of patients with CD or UC. Others have replicated this
finding, including our own laboratory in a cohort with pediatric
onset CD²¹, lending further support for the protective role of
the IL23R gene in IBD²¹. Around the same time, Hampe et
al.²² reported an independent association of a nsSNP in the
autophagy-related 16-like 1 gene (ATG16L1) on chromosome 2q37.1²²
(a threonine-to-alanine substitution at amino acid position
300 of the protein, T300A) and confirmed the previously reported
variants in the SLC22A4 and CARD15 genes.
[0008] Rioux et al.²³ presented a follow-up GWA study to their
IL23R finding in ileal CD and two independent replication studies,
identifying several new regions of association to CD. Specifically,
in addition to the previously established CARD15 and IL23R
associations, they also reported strong association with
independent replication to variation within an intergenic region on
10q21, and within the genomic regions encoding PHOX2B, NCF4 and FAM92B.
They also independently identified strong and significantly
replicated association with the coding variant in ATG16L1.
[0009] The Wellcome Trust Case Control Consortium²⁴ described
a joint GWA study (using the Affymetrix GeneChip 500K platform)
carried out in the British population, which examined 2,000
individuals for each of seven major diseases, including CD, against
a shared set of approximately 3,000 controls; they identified in
the case-control comparison nine independent association signals at
P < 5×10⁻⁷, thereby corroborating the ATG16L1, 5q31,
IL23R, 10q21 and 5p13.1 loci²⁵. Their study also identified
four further new strong association signals, located on chromosomes
3p21, 5q33, 10q24 and 18p11. Parkes et al. also reported replication
for the signals in the ATG16L1 and IRGM genes²⁷. We have also
successfully demonstrated the association of ATG16L1 variation in
our cohort of pediatric onset CD²⁸.
[0010] Given that genetic variants associated with CD do not
account for the entire genetic risk, additional studies are necessary
to identify and characterize novel IBD genes. GWA studies
have confirmed that genetic variants associated with IBD are indeed
common and contribute only modestly to overall disease risk. As
such, a barrier to performing further studies is the need for the large
sample sizes necessary to identify additional variants with smaller
effect size; however, an alternative strategy is to ascertain
individuals with a younger age of disease onset, as has been
carried out with Alzheimer's disease, type 2 diabetes and breast
cancer. Such a tactic is attractive for IBD for several reasons.
First, CD-affected children are more likely to have colonic CD than
adults. Second, UC-affected children are more likely to have
extensive colitis than adults. Third, a young age of IBD onset is
associated with a greater family history of IBD. Taken together,
childhood onset IBD demonstrates unique characteristics in
phenotype, severity and family history, all of which justify
ascertaining children with IBD for GWA studies to potentially
identify new IBD genes.
SUMMARY OF THE INVENTION
[0011] In accordance with the present invention, compositions and
methods are provided for diagnosis and treatment of pediatric IBD.
An exemplary method entails detecting the presence of a single
nucleotide polymorphism set forth in the Tables provided in the
Examples below in a target polynucleotide wherein if the single
nucleotide polymorphism is present, the patient has an increased
risk for developing IBD. Exemplary single nucleotide polymorphisms
associated with the development of IBD reside on chromosome 20q13
or chromosome 21q22 and include, without limitation, a T at rs2315008
or an A at rs4809330 in the TNFRSF6B gene on chromosome 20 and an A
at rs2836878 in the PSMG1 gene on chromosome 21. Notably, several
other loci have been identified herein which comprise alterations
associated with the IBD phenotype. The methods of the invention can
include alternative means for detecting the disclosed
polymorphisms. For example, such methods of detection can further
comprise processes such as specific hybridization, measurement of
allele size, restriction fragment length polymorphism analysis,
allele-specific hybridization analysis, single base primer
extension reaction, and sequencing of an amplified
polynucleotide.
[0012] In yet another aspect, nucleic acid molecules useful for
amplifying the nucleic acids encoding the single nucleotide
polymorphisms disclosed herein are provided. Also provided are
solid supports comprising suitable nucleic acid targets to
facilitate detection of such SNPs in patient samples. A suitable
solid support for this process includes a microarray.
[0013] Finally, the invention also encompasses screening methods to
identify agents which modulate the aberrant physiological processes
associated with IBD observed in the SNP-containing cells described
herein. An exemplary method entails a) providing colonic biopsy
samples comprising at least one of a T at rs2315008, an A at
rs4809330 in the TNFRSF6B gene and/or an A at rs2836878 in the
PSMG1 gene; b) providing cells which express these gene(s) but lack
the cognate polymorphisms; c) contacting each cell type with
a test agent; and d) analyzing whether said agent alters an aberrant
physiological process associated with IBD in the samples of step a)
relative to those of step b), thereby identifying agents which
modulate IBD. Aberrant physiological processes associated with the
IBD phenotype include, without limitation, defects in the colonic
mucosal barrier, defects in bacterial clearance and dysregulation
of immune responses to commensal intestinal bacteria. Each of the
SNPs described herein can be assessed in this manner, alone or in
combination.
[0014] Also provided are transgenic mice comprising the
SNP-containing nucleic acid molecules described herein. Such mice
provide a superior in vivo screening tool to identify agents which
modulate the progression and development of IBD.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] FIG. 1. FIG. 1A: Linkage disequilibrium (D') between SNPs at
the 20q13 locus in the control cohort together with the
corresponding Haploview gene track. The association signal resides
in a region of LD that harbors the genes RTEL1, TNFRSF6B, ARFRP1,
ZGPAT and LIME1. FIG. 1B: Colonic PSMG1 and DSCAM Expression. Colon
biopsies were obtained from healthy controls (n=11), and affected
segments for CD patients with ileo-colonic (n=18) or colon-only
(n=14) location and UC patients (n=10). RNA was prepared and the
global pattern of gene expression was determined using the
Affymetrix GeneChip Human Genome HG-U133 Plus 2.0 array.
[0016] Results for A) PSMG1, and B) DSCAM are shown. *p=0.004,
**p=0.003 vs. control.
[0017] FIG. 2. Linkage disequilibrium (D') between SNPs at the
21q22 locus in the control cohort. The association signal
resides in a region of LD that harbors no genes; however, PSMG1
represents the nearest gene geographically.
[0018] FIG. 3. Colonic TNFRSF6B Expression. Colon biopsies were
obtained from healthy controls (n=11, CDHIS:0), and affected
segments for CD patients with ileo-colonic (n=18, mean(SEM) CDHIS:
4.1 ± 0.7) or colon-only (n=14, mean(SEM) CDHIS: 4.9 ± 1)
location and UC patients (n=10, mean(SEM) CDHIS: 7.2 ± 0.6,
p<0.05 vs. CD groups). RNA was prepared and the global pattern
of gene expression was determined using the Affymetrix GeneChip
Human Genome HG-U133 Plus 2.0 array. Results for the genes within
the telomeric region of LD on 20q13 including A) TNFRSF6B, and B)
ARFRP1, LIME1, RTEL1, and ZGPAT are shown. *p=0.01, **p=0.005 vs.
control.
[0019] FIG. 4. Scatter plots of -log(P) against genomic location
for our three main genome scans. Figures were generated using
Haploview (49).
[0020] FIG. 5. Allelic effects of SNPs on lymphoblastoid cell line
gene expression of IL27. The A allele of rs1968752 confers risk in
our CD cohort (OR=1.23 [1.12-1.40]). rs1968752 lies in an LD-block
containing the IL27 gene. Individuals with the A/A genotype at
rs1968752 have a 15-fold decrease in IL27 gene expression compared to
those with the C/C genotype. Reduced IL27 expression is likely to
promote inflammation through activation of the Th-17 lineage.
[0021] FIG. 6. Colonic expression of IL27 in CD cases vs controls.
We compared colonic gene expression between 13 normal (NL) and 37
CD samples, using a Student's t-test with a significance threshold of
P<0.05. We found that IL27 expression is significantly reduced
in the CD cases in comparison with normal tissue (P=0.028).
[0022] FIG. 7. Colonic expression of TLR genes. Expression of the
Toll Like Receptor genes, TLR1, TLR6 and TLR10, located in the LD
block containing rs4833103, which associates with very early onset
(age ≤8) UC (P=1.81×10⁻⁸, OR=0.56 [0.46-0.69]).
Student's t-test showed a statistically significant difference in
means for TLR1 (P=0.002), TLR6 (P=0.005) and TLR10 (P=0.02) gene
expression between 13 normal (NL) and 10 UC samples.
[0023] FIG. 8. Cumulative risk modeling of genetic variants
associated with IBD. 54 genetic variants (including 6 novel loci
discovered in this study) were analyzed in 2134 pediatric IBD cases
and 6197 controls to determine their cumulative effects on CD, UC,
and IBD risk. Panels a-c represent distributions of genotypic
scores for 30 CD loci, 17 UC loci, and 37 IBD loci, respectively.
Panels d-f represent estimates of cumulative risk as a function of
genotypic score for CD, UC, and IBD, respectively.
DETAILED DESCRIPTION OF THE INVENTION
[0024] Inflammatory bowel disease (IBD) constitutes two related
clinical entities, Crohn's disease (CD) and ulcerative colitis
(UC), both of which cause abdominal pain, diarrhea and growth
disturbances. Family and twin studies have indicated that genetic
factors play a large role in an individual's risk of developing IBD
and recently, genome-wide association (GWA) studies have associated
several variants in the caspase recruitment domain 15 (CARD15),
interleukin 23 receptor (IL23R) and autophagy related 16-like 1
(ATG16L1) genes with IBD, notably to the CD subphenotype. However,
these genetic variations account for only a small portion of the
overall genetic susceptibility of CD and their contribution to UC
pathogenesis is even less. We hypothesized that an alternative
strategy such as stratifying cases by age of onset may be needed to
identify new IBD genes. We have performed a GWA analysis in a
cohort of 1,011 pediatric onset IBD cases, and 4,250 age matched
controls. We observed and replicated significantly associated novel
loci on several chromosomes. Example I describes loci residing on
chromosome 20q13 and 21q22 which are close to the tumor necrosis
factor receptor superfamily member 6B (TNFRSF6B) and Down syndrome
critical region protein 2 isoform (PSMG1) genes, respectively.
Colonic biopsies also demonstrate expression differences in
TNFRSF6B mRNA message between IBD patients and disease-free
controls, driven most obviously by local mucosal inflammation. When
addressing the individual subcomponents of IBD, we identified an
additional novel locus on 21q21 associated specifically with the
colonic form of CD. In addition, when analyzing UC separately, we
detected strong association with four single nucleotide
polymorphisms (SNPs) within the major histocompatibility complex
(MHC) on chromosome 6p21. Finally, we show that CARD15 is only
associated with CD in patients with ileal disease and that the
signal is absent in CD patients with colon-only disease. In
conclusion, we have discovered novel susceptibility loci in
pediatric onset IBD on 20q13 and 21q22, and identified TNFRSF6B and
PSMG1 respectively as IBD susceptibility genes. Example II provides
additional loci that provide new targets for the development of
agents useful for the treatment of IBD.
[0025] In Example III, additional novel IBD associated loci are
provided: IL27 on 16p11 and LNPEP-LRAP on 5q15 as CD loci, SMAD3 on
15q22 and HORMAD2 on 21q22. The fifth locus is a Toll-like receptor
gene cluster on 4p14 for UC with onset prior to 8 years of age
(P=1.81×10⁻⁸); in a replication cohort of limited size, we
nevertheless detected evidence of association. Our results also revealed
that 21 of 32 previously implicated adult-onset CD loci and 8 of 15
previously implicated adult-onset UC loci contribute to the
pathogenesis of the childhood-onset form of the disease. Using
these data, we modeled the cumulative effect of the most
significant risk alleles detected, demonstrating, for instance,
that children carrying 34 or more of the common CD risk alleles
have an approximately 13-fold increased risk of developing CD, while children
carrying 20 or more of the common UC risk alleles have an
approximately 7-fold increased risk of developing UC.
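As a rough illustration of the allele counting that underlies such cumulative modeling, the sketch below tallies risk alleles per individual. It assumes the genotypic score is a simple unweighted count (0, 1 or 2 risk alleles per locus); the scoring scheme is not spelled out in this passage, so that is an assumption. The three risk alleles are those named in the claims; the function name, the alternative alleles and the example genotypes are hypothetical.

```python
def genotypic_score(genotypes, risk_alleles):
    """Count the risk alleles an individual carries across a set of loci.

    genotypes:    dict mapping SNP id -> two-letter genotype, e.g. "AG"
    risk_alleles: dict mapping SNP id -> risk allele, e.g. "A"
    Loci missing from `genotypes` contribute 0 (an assumption of this sketch).
    """
    score = 0
    for snp, risk in risk_alleles.items():
        score += genotypes.get(snp, "").count(risk)
    return score


# Risk alleles as recited in the claims.
RISK_ALLELES = {"rs2315008": "T", "rs4809330": "A", "rs2836878": "A"}

# Hypothetical individual; the non-risk alleles shown are placeholders.
child = {"rs2315008": "TT", "rs4809330": "AG", "rs2836878": "GG"}

print(genotypic_score(child, RISK_ALLELES))  # 3
```

In a fuller model the same count would be taken over all loci in the relevant panel (for example the 30 CD loci of FIG. 8) and compared against a threshold such as the 34-allele cutoff mentioned above.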
[0026] The results presented herein advance the current
understanding of pediatric-onset IBD by highlighting key
pathogenetic mechanisms, most notably Th17 signaling and innate
immunity based on the discovery of the IL27 and TLR loci in CD and
UC, respectively. These observations clarify the relationship with
adult-onset disease and quantify the cumulative IBD risk conferred
by multiple risk alleles in pediatric-onset disease, an important
contribution to the future development of a molecular diagnostic
for IBD.
Definitions
[0027] For purposes of the present invention, "a" or "an" entity
refers to one or more of that entity; for example, "a cDNA" refers
to one or more cDNA or at least one cDNA. As such, the terms "a" or
"an," "one or more" and "at least one" can be used interchangeably
herein. It is also noted that the terms "comprising," "including,"
and "having" can be used interchangeably. Furthermore, a compound
"selected from the group consisting of" refers to one or more of
the compounds in the list that follows, including mixtures (i.e.
combinations) of two or more of the compounds. According to the
present invention, an isolated, or biologically pure molecule is a
compound that has been removed from its natural milieu. As such,
"isolated" and "biologically pure" do not necessarily reflect the
extent to which the compound has been purified. An isolated
compound of the present invention can be obtained from its natural
source, can be produced using laboratory synthetic techniques or
can be produced by any such chemical synthetic route.
[0028] "IBD-associated SNP or specific marker" is a SNP or marker
which is associated with an increased or decreased risk of
developing IBD and is not found in normal patients who do not have this
disease. Such markers may include but are not limited to nucleic
acids, proteins encoded thereby, or other small molecules.
[0029] A "single nucleotide polymorphism (SNP)" refers to a change
in which a single base in the DNA differs from the usual base at
that position. These single base changes are called SNPs or
"snips." Millions of SNPs have been cataloged in the human genome.
Some SNPs, such as the one that causes sickle cell disease, are
responsible for disease. Other SNPs are normal variations in the genome. These are
to be distinguished from those associated with the disease
phenotype.
[0030] The term "genetic alteration" as used herein refers to a
change from the wild-type or reference sequence of one or more
nucleic acid molecules. Genetic alterations include without
limitation, base pair substitutions, additions and deletions of at
least one nucleotide from a nucleic acid molecule of known
sequence.
[0031] The term "solid matrix" as used herein refers to any format,
such as beads, microparticles, a microarray, the surface of a
microtitration well or a test tube, a biacore chip, a dipstick or a
filter. The material of the matrix may be polystyrene, cellulose,
latex, nitrocellulose, nylon, polyacrylamide, dextran or
agarose.
[0032] The phrase "consisting essentially of" when referring to a
particular nucleotide or amino acid means a sequence having the
properties of a given SEQ ID NO:. For example, when used in
reference to an amino acid sequence, the phrase includes the
sequence per se and molecular modifications that would not affect
the functional and novel characteristics of the sequence.
[0033] "Target nucleic acid" as used herein refers to a previously
defined region of a nucleic acid present in a complex nucleic acid
mixture wherein the defined wild-type region contains at least one
known nucleotide variation which may or may not be associated with
IBD. The nucleic acid molecule may be isolated from a natural
source by cDNA cloning or subtractive hybridization, or it may be
synthesized, either manually by
the triester synthetic method or by using an automated DNA
synthesizer.
[0034] With regard to nucleic acids used in the invention, the term
"isolated nucleic acid" is sometimes employed. This term, when
applied to DNA, refers to a DNA molecule that is separated from
sequences with which it is immediately contiguous (in the 5' and 3'
directions) in the naturally occurring genome of the organism from
which it was derived. For example, the "isolated nucleic acid" may
comprise a DNA molecule inserted into a vector, such as a plasmid
or virus vector, or integrated into the genomic DNA of a prokaryote
or eukaryote. An "isolated nucleic acid molecule" may also comprise
a cDNA molecule. An isolated nucleic acid molecule inserted into a
vector is also sometimes referred to herein as a recombinant
nucleic acid molecule.
[0035] With respect to RNA molecules, the term "isolated nucleic
acid" primarily refers to an RNA molecule encoded by an isolated
DNA molecule as defined above. Alternatively, the term may refer to
an RNA molecule that has been sufficiently separated from RNA
molecules with which it would be associated in its natural state
(i.e., in cells or tissues), such that it exists in a
"substantially pure" form.
[0036] By the use of the term "enriched" in reference to nucleic
acid it is meant that the specific DNA or RNA sequence constitutes
a significantly higher fraction (2-5 fold) of the total DNA or RNA
present in the cells or solution of interest than in normal cells
or in the cells from which the sequence was taken. This could be
caused by a person by preferential reduction in the amount of other
DNA or RNA present, or by a preferential increase in the amount of
the specific DNA or RNA sequence, or by a combination of the two.
However, it should be noted that "enriched" does not imply that
there are no other DNA or RNA sequences present, just that the
relative amount of the sequence of interest has been significantly
increased.
[0037] It is also advantageous for some purposes that a nucleotide
sequence be in purified form. The term "purified" in reference to
nucleic acid does not require absolute purity (such as a
homogeneous preparation); instead, it represents an indication that
the sequence is relatively purer than in the natural environment
(compared to the natural level, this level should be at least 2-5
fold greater, e.g., in terms of mg/ml). Individual clones isolated
from a cDNA library may be purified to electrophoretic homogeneity.
The claimed DNA molecules obtained from these clones can be
obtained directly from total DNA or from total RNA. The cDNA clones
are not naturally occurring, but rather are preferably obtained via
manipulation of a partially purified naturally occurring substance
(messenger RNA). The construction of a cDNA library from mRNA
involves the creation of a synthetic substance (cDNA) and pure
individual cDNA clones can be isolated from the synthetic library
by clonal selection of the cells carrying the cDNA library. Thus,
the process which includes the construction of a cDNA library from
mRNA and isolation of distinct cDNA clones yields an approximately
10⁶-fold purification of the native message. Thus,
purification of at least one order of magnitude, preferably two or
three orders, and more preferably four or five orders of magnitude
is expressly contemplated. Thus the term "substantially pure"
refers to a preparation comprising at least 50-60% by weight the
compound of interest (e.g., nucleic acid, oligonucleotide, etc.).
More preferably, the preparation comprises at least 75% by weight,
and most preferably 90-99% by weight, the compound of interest.
Purity is measured by methods appropriate for the compound of
interest.
[0038] The term "complementary" describes two nucleotides that can
form multiple favorable interactions with one another. For example,
adenine is complementary to thymine as they can form two hydrogen
bonds. Similarly, guanine and cytosine are complementary since they
can form three hydrogen bonds. Thus if a nucleic acid sequence
contains the following sequence of bases, thymine, adenine, guanine
and cytosine, a "complement" of this nucleic acid molecule would be
a molecule containing adenine in the place of thymine, thymine in
the place of adenine, cytosine in the place of guanine, and guanine
in the place of cytosine. Because the complement can contain a
nucleic acid sequence that forms optimal interactions with the
parent nucleic acid molecule, such a complement can bind with high
affinity to its parent molecule.
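The base-pairing rule in the preceding paragraph can be written as a small lookup. This is a minimal sketch for DNA only; the function names are illustrative and not part of the source. The reverse complement is included because the two strands of a duplex pair in antiparallel orientation.

```python
# Watson-Crick pairs for DNA, as described above:
# A pairs with T (two hydrogen bonds), G pairs with C (three).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the base-by-base complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq):
    """Return the complement read in the opposite (antiparallel) direction."""
    return complement(seq)[::-1]


# The example from the text: thymine, adenine, guanine, cytosine.
print(complement("TAGC"))          # ATCG
print(reverse_complement("TAGC"))  # GCTA
```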
[0039] With respect to single stranded nucleic acids, particularly
oligonucleotides, the term "specifically hybridizing" refers to the
association between two single-stranded nucleotide molecules of
sufficiently complementary sequence to permit such hybridization
under pre-determined conditions generally used in the art
(sometimes termed "substantially complementary"). In particular,
the term refers to hybridization of an oligonucleotide with a
substantially complementary sequence contained within a
single-stranded DNA or RNA molecule of the invention, to the
substantial exclusion of hybridization of the oligonucleotide with
single-stranded nucleic acids of non-complementary sequence. For
example, specific hybridization can refer to a sequence which
hybridizes to any IBD specific marker gene or nucleic acid, but
does not hybridize to other nucleotides. Appropriate conditions
enabling specific hybridization of single stranded nucleic acid
molecules of varying complementarity are well known in the art.
[0040] For instance, one common formula for calculating the
stringency conditions required to achieve hybridization between
nucleic acid molecules of a specified sequence homology is set
forth below (Sambrook et al., Molecular Cloning, Cold Spring Harbor
Laboratory (1989)):
Tm = 81.5 °C + 16.6 log[Na+] + 0.41(% G+C) - 0.63(% formamide) - 600/(#bp in duplex)
[0041] As an illustration of the above formula, using [Na+] = 0.368
and 50% formamide, with a GC content of 42% and an average probe size
of 200 bases, the Tm is 57 °C. The Tm of a DNA
duplex decreases by 1-1.5 °C with every 1% decrease in
homology. Thus, targets with greater than about 75% sequence
identity would be observed using a hybridization temperature of
42 °C.
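The illustration above can be checked numerically. The following Python sketch evaluates the formula with the stated inputs. It assumes that "Log" denotes the base-10 logarithm and that [Na+] is a molar concentration, which is consistent with the 57 degree result given in the text. The function and parameter names are illustrative only.

```python
import math


def melting_temperature(na_molar, gc_percent, formamide_percent, duplex_length_bp):
    """Estimate the melting temperature (degrees C) of a DNA duplex.

    Assumes a base-10 logarithm and a sodium concentration in mol/L.
    """
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / duplex_length_bp
    )


# Inputs from the illustration: 0.368 Na+, 42% GC, 50% formamide, 200 bases
tm = melting_temperature(0.368, 42, 50, 200)
print(round(tm))  # 57
```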
[0042] The stringency of the hybridization and wash depends
primarily on the salt concentration and temperature of the
solutions. In general, to maximize the rate of annealing of the
probe with its target, the hybridization is usually carried out at
salt and temperature conditions that are 20-25 °C below the
calculated Tm of the hybrid. Wash conditions should be as
stringent as possible for the degree of identity of the probe for
the target. In general, wash conditions are selected to be
approximately 12-20 °C below the Tm of the hybrid. With
regard to the nucleic acids of the current invention, a moderate
stringency hybridization is defined as hybridization in
6×SSC, 5× Denhardt's solution, 0.5% SDS and 100
µg/ml denatured salmon sperm DNA at 42 °C, and washed in
2×SSC and 0.5% SDS at 55 °C for 15 minutes. A high
stringency hybridization is defined as hybridization in
6×SSC, 5× Denhardt's solution, 0.5% SDS and 100
µg/ml denatured salmon sperm DNA at 42 °C, and washed in
1×SSC and 0.5% SDS at 65 °C for 15 minutes. A very
high stringency hybridization is defined as hybridization in
6×SSC, 5× Denhardt's solution, 0.5% SDS and 100
µg/ml denatured salmon sperm DNA at 42 °C, and washed in
0.1×SSC and 0.5% SDS at 65 °C for 15 minutes.
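The temperature offsets and wash conditions described above can be summarized as a small lookup, sketched here in Python. The names `WASH_CONDITIONS` and `stringency_windows` are illustrative assumptions. The three stringency levels share the same hybridization step and differ only in the wash, so only the wash parameters are tabulated.

```python
# Wash step for each stringency level: (SSC concentration factor, temperature in C).
# Every wash also contains 0.5% SDS and runs for 15 minutes.
WASH_CONDITIONS = {
    "moderate": (2.0, 55),
    "high": (1.0, 65),
    "very high": (0.1, 65),
}


def stringency_windows(tm_celsius):
    """Return (low, high) temperature ranges in degrees C relative to a calculated Tm.

    Hybridization is taken as 20-25 C below Tm and washing as 12-20 C below Tm.
    """
    return {
        "hybridization": (tm_celsius - 25.0, tm_celsius - 20.0),
        "wash": (tm_celsius - 20.0, tm_celsius - 12.0),
    }


# For a hybrid with a calculated Tm of 57 C
print(stringency_windows(57.0))
```

For a Tm of 57 °C this gives a hybridization range of 32-37 °C and a wash range of 37-45 °C.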
[0043] The term "oligonucleotide," as used herein, is defined as a
nucleic acid molecule comprised of two or more ribo- or
deoxyribonucleotides, preferably more than three. The exact size of
the oligonucleotide will depend on various factors and on the
particular application and use of the oligonucleotide.
Oligonucleotides, which include probes and primers, can be any
length from 3 nucleotides to the full length of the nucleic acid
molecule, and explicitly include every possible number of
contiguous nucleic acids from 3 through the full length of the
polynucleotide. Preferably, oligonucleotides are at least about 10
nucleotides in length, more preferably at least 15 nucleotides in
length, more preferably at least about 20 nucleotides in
length.
[0044] The term "probe" as used herein refers to an
oligonucleotide, polynucleotide or nucleic acid, either RNA or DNA,
whether occurring naturally as in a purified restriction enzyme
digest or produced synthetically, which is capable of annealing
with or specifically hybridizing to a nucleic acid with sequences
complementary to the probe. A probe may be either single-stranded
or double-stranded. The exact length of the probe will depend upon
many factors, including temperature, source of probe and use of the
method. For example, for diagnostic applications, depending on the
complexity of the target sequence, the oligonucleotide probe
typically contains 15-25 or more nucleotides, although it may
contain fewer nucleotides. The probes herein are selected to be
complementary to different strands of a particular target nucleic
acid sequence. This means that the probes must be sufficiently
complementary so as to be able to "specifically hybridize" or
anneal with their respective target strands under a set of
pre-determined conditions. Therefore, the probe sequence need not
reflect the exact complementary sequence of the target. For
example, a non-complementary nucleotide fragment may be attached to
the 5' or 3' end of the probe, with the remainder of the probe
sequence being complementary to the target strand. Alternatively,
non-complementary bases or longer sequences can be interspersed
into the probe, provided that the probe sequence has sufficient
complementarity with the sequence of the target nucleic acid to
anneal therewith specifically.
[0045] The term "primer" as used herein refers to an
oligonucleotide, either RNA or DNA, either single-stranded or
double-stranded, either derived from a biological system, generated
by restriction enzyme digestion, or produced synthetically which,
when placed in the proper environment, is able to functionally act
as an initiator of template-dependent nucleic acid synthesis. When
presented with an appropriate nucleic acid template, suitable
nucleoside triphosphate precursors of nucleic acids, a polymerase
enzyme, suitable cofactors and conditions such as a suitable
temperature and pH, the primer may be extended at its 3' terminus
by the addition of nucleotides by the action of a polymerase or
similar activity to yield a primer extension product. The primer
may vary in length depending on the particular conditions and
requirement of the application. For example, in diagnostic
applications, the oligonucleotide primer is typically 15-25 or more
nucleotides in length. The primer must be of sufficient
complementarity to the desired template to prime the synthesis of
the desired extension product, that is, to be able to anneal with the
desired template strand in a manner sufficient to provide the 3'
hydroxyl moiety of the primer in appropriate juxtaposition for use
in the initiation of synthesis by a polymerase or similar enzyme.
It is not required that the primer sequence represent an exact
complement of the desired template. For example, a
non-complementary nucleotide sequence may be attached to the 5' end
of an otherwise complementary primer. Alternatively,
non-complementary bases may be interspersed within the
oligonucleotide primer sequence, provided that the primer sequence
has sufficient complementarity with the sequence of the desired
template strand to functionally provide a template-primer complex
for the synthesis of the extension product.
[0046] Polymerase chain reaction (PCR) has been described in U.S.
Pat. Nos. 4,683,195, 4,800,195, and 4,965,188, the entire
disclosures of which are incorporated by reference herein.
[0047] The term "vector" relates to a single or double stranded
circular nucleic acid molecule that can be infected, transfected or
transformed into cells and replicate independently or within the
host cell genome. A circular double stranded nucleic acid molecule
can be cut and thereby linearized upon treatment with restriction
enzymes. An assortment of vectors, restriction enzymes, and the
knowledge of the nucleotide sequences that are targeted by
restriction enzymes are readily available to those skilled in the
art, and include any replicon, such as a plasmid, cosmid, bacmid,
phage or virus, to which another genetic sequence or element
(either DNA or RNA) may be attached so as to bring about the
replication of the attached sequence or element. A nucleic acid
molecule of the invention can be inserted into a vector by cutting
the vector with restriction enzymes and ligating the two pieces
together.
[0048] Many techniques are available to those skilled in the art to
facilitate transformation, transfection, or transduction of the
expression construct into a prokaryotic or eukaryotic organism. The
terms "transformation", "transfection", and "transduction" refer to
methods of inserting a nucleic acid and/or expression construct
into a cell or host organism. These methods involve a variety of
techniques, such as treating the cells with high concentrations of
salt, an electric field, or detergent, to render the host cell
outer membrane or wall permeable to nucleic acid molecules of
interest, microinjection, PEG-fusion, and the like.
[0049] The term "promoter element" describes a nucleotide sequence
that is incorporated into a vector that, once inside an appropriate
cell, can facilitate transcription factor and/or polymerase binding
and subsequent transcription of portions of the vector DNA into
mRNA. In one embodiment, the promoter element of the present
invention precedes the 5' end of the IBD specific marker nucleic
acid molecule such that the latter is transcribed into mRNA. Host
cell machinery then translates mRNA into a polypeptide.
[0050] Those skilled in the art will recognize that a nucleic acid
vector can contain nucleic acid elements other than the promoter
element and the IBD specific marker gene nucleic acid molecule.
These other nucleic acid elements include, but are not limited to,
origins of replication, ribosomal binding sites, nucleic acid
sequences encoding drug resistance enzymes or amino acid metabolic
enzymes, and nucleic acid sequences encoding secretion signals,
localization signals, or signals useful for polypeptide
purification.
[0051] A "replicon" is any genetic element, for example, a plasmid,
cosmid, bacmid, plastid, phage or virus, which is capable of
replication largely under its own control. A replicon may be either
RNA or DNA and may be single or double stranded.
[0052] An "expression operon" refers to a nucleic acid segment that
may possess transcriptional and translational control sequences,
such as promoters, enhancers, translational start signals (e.g.,
ATG or AUG codons), polyadenylation signals, terminators, and the
like, and which facilitate the expression of a polypeptide coding
sequence in a host cell or organism.
[0053] As used herein, the terms "reporter," "reporter system",
"reporter gene," or "reporter gene product" shall mean an operative
genetic system in which a nucleic acid comprises a gene that
encodes a product that when expressed produces a reporter signal
that is readily measurable, e.g., by biological assay,
immunoassay, radio immunoassay, or by colorimetric, fluorogenic,
chemiluminescent or other methods. The nucleic acid may be either
RNA or DNA, linear or circular, single or double stranded,
antisense or sense polarity, and is operatively linked to the
necessary control elements for the expression of the reporter gene
product. The required control elements will vary according to the
nature of the reporter system and whether the reporter gene is in
the form of DNA or RNA, but may include, but not be limited to,
such elements as promoters, enhancers, translational control
sequences, poly A addition signals, transcriptional termination
signals and the like.
[0054] The introduced nucleic acid may or may not be integrated
(covalently linked) into nucleic acid of the recipient cell or
organism. In bacterial, yeast, plant and mammalian cells, for
example, the introduced nucleic acid may be maintained as an
episomal element or independent replicon such as a plasmid.
Alternatively, the introduced nucleic acid may become integrated
into the nucleic acid of the recipient cell or organism and be
stably maintained in that cell or organism and further passed on to or
inherited by progeny cells or organisms of the recipient cell or
organism. Finally, the introduced nucleic acid may exist in the
recipient cell or host organism only transiently.
[0055] The term "selectable marker gene" refers to a gene that when
expressed confers a selectable phenotype, such as antibiotic
resistance, on a transformed cell. The term "operably linked" means
that the regulatory sequences necessary for expression of the
coding sequence are placed in the DNA molecule in the appropriate
positions relative to the coding sequence so as to effect
expression of the coding sequence. This same definition is
sometimes applied to the arrangement of transcription units and
other transcription control elements (e.g. enhancers) in an
expression vector.
[0056] The terms "recombinant organism," or "transgenic organism"
refer to organisms which have a new combination of genes or nucleic
acid molecules. A new combination of genes or nucleic acid
molecules can be introduced into an organism using a wide array of
nucleic acid manipulation techniques available to those skilled in
the art. The term "organism" relates to any living being comprised
of at least one cell. An organism can be as simple as one eukaryotic
cell or as complex as a mammal. Therefore, the phrase "a
recombinant organism" encompasses a recombinant cell, as well as
eukaryotic and prokaryotic organisms.
[0057] The term "isolated protein" or "isolated and purified
protein" is sometimes used herein. This term refers primarily to a
protein produced by expression of an isolated nucleic acid molecule
of the invention. Alternatively, this term may refer to a protein
that has been sufficiently separated from other proteins with which
it would naturally be associated, so as to exist in "substantially
pure" form. "Isolated" is not meant to exclude artificial or
synthetic mixtures with other compounds or materials, or the
presence of impurities that do not interfere with the fundamental
activity, and that may be present, for example, due to incomplete
purification, addition of stabilizers, or compounding into, for
example, immunogenic preparations or pharmaceutically acceptable
preparations.
[0058] A "specific binding pair" comprises a specific binding
member (sbm) and a binding partner (bp) which have a particular
specificity for each other and which in normal conditions bind to
each other in preference to other molecules. Examples of specific
binding pairs are antigens and antibodies, ligands and receptors
and complementary nucleotide sequences. The skilled person is aware
of many other examples. Further, the term "specific binding pair"
is also applicable where either or both of the specific binding
member and the binding partner comprise a part of a large molecule.
In embodiments in which the specific binding pair comprises nucleic
acid sequences, they will be of a length to hybridize to each other
under conditions of the assay, preferably greater than 10
nucleotides long, more preferably greater than 15 or 20 nucleotides
long.
[0059] "Sample" or "patient sample" or "biological sample"
generally refers to a sample which may be tested for a particular
molecule, preferably an IBD specific marker molecule, such as a
marker shown in the tables provided below. Samples may include but
are not limited to cells, body fluids, including blood, serum,
plasma, urine, saliva, tears, pleural fluid and the like.
[0060] The terms "agent" and "test compound" are used
interchangeably herein and denote a chemical compound, a mixture of
chemical compounds, a biological macromolecule, or an extract made
from biological materials such as bacteria, plants, fungi, or
animal (particularly mammalian) cells or tissues. Biological
macromolecules include siRNA, shRNA, antisense oligonucleotides,
peptides, peptide/DNA complexes, and any nucleic acid based
molecule which exhibits the capacity to modulate the activity of
the SNP containing nucleic acids described herein or their encoded
proteins. Agents are evaluated for potential biological activity by
inclusion in screening assays described hereinbelow.
Methods of Using Pediatric IBD-Associated SNPs For Diagnosing a
Propensity For the Development of Pediatric IBD
[0061] IBD-related-SNP containing nucleic acids, including but not
limited to those listed in the Tables provided below may be used
for a variety of purposes in accordance with the present invention.
IBD-associated SNP containing DNA, RNA, or fragments thereof may be
used as probes to detect the presence of and/or expression of IBD
specific markers. Methods in which IBD specific marker nucleic
acids may be utilized as probes for such assays include, but are
not limited to: (1) in situ hybridization; (2) Southern
hybridization; (3) northern hybridization; and (4) assorted
amplification reactions such as polymerase chain reactions
(PCR).
[0062] Further, assays for detecting IBD-associated SNPs may be
conducted on any type of biological sample, including but not
limited to body fluids (including blood, urine, serum, gastric
lavage), any type of cell (such as brain cells, white blood cells,
mononuclear cells) or body tissue.
[0063] From the foregoing discussion, it can be seen that
IBD-associated SNP containing nucleic acids, vectors expressing the
same, IBD SNP containing marker proteins and anti-IBD specific
marker antibodies of the invention can be used to detect IBD
associated SNPs in body tissue, cells, or fluid, and alter IBD SNP
containing marker protein expression for purposes of assessing the
genetic and protein interactions involved in the development of
IBD.
[0064] In most embodiments for screening for IBD-associated SNPs,
the IBD-associated SNP containing nucleic acid in the sample will
initially be amplified, e.g. using PCR, to increase the amount of
the templates as compared to other sequences present in the sample.
This allows the target sequences to be detected with a high degree
of sensitivity if they are present in the sample. This initial step
may be avoided by using highly sensitive array techniques that are
becoming increasingly important in the art.
[0065] Alternatively, new detection technologies can overcome this
limitation and enable analysis of small samples containing as
little as 1 µg of total RNA. Using Resonance Light Scattering
(RLS) technology, as opposed to traditional fluorescence
techniques, multiple reads can detect low quantities of mRNAs using
biotin labeled hybridized targets and anti-biotin antibodies.
Another alternative to PCR amplification involves planar wave guide
technology (PWG) to increase signal-to-noise ratios and reduce
background interference. Both techniques are commercially available
from Qiagen Inc. (USA).
[0066] Thus any of the aforementioned techniques may be used to
detect or quantify IBD-associated SNP marker expression and
accordingly, diagnose IBD.
Kits and Articles of Manufacture
[0067] Any of the aforementioned products can be incorporated into
a kit which may contain an IBD-associated SNP specific marker
polynucleotide or one or more such markers immobilized on a Gene
Chip, an oligonucleotide, a polypeptide, a peptide, an antibody, a
label, marker, or reporter, a pharmaceutically acceptable carrier,
a physiologically acceptable carrier, instructions for use, a
container, a vessel for administration, an assay substrate, or any
combination thereof.
Methods of Using IBD-Associated SNPs For Development of Therapeutic
Agents
[0068] Since the SNPs identified herein have been associated with
the etiology of IBD, methods for identifying agents that modulate
the activity of the genes and their encoded products containing
such SNPs should result in the generation of efficacious
therapeutic agents for the treatment of a variety of disorders
associated with this condition.
[0069] Chromosomes 20 and 21 contain regions which provide suitable
targets for the rational design of therapeutic agents which
modulate their activity. Small peptide molecules corresponding to
these regions may be used to advantage in the design of therapeutic
agents which effectively modulate the activity of the encoded
proteins.
[0070] Molecular modeling should facilitate the identification of
specific organic molecules with capacity to bind to the active site
of the proteins encoded by the SNP containing nucleic acids based
on conformation or key amino acid residues required for function. A
combinatorial chemistry approach will be used to identify molecules
with greatest activity and then iterations of these molecules will
be developed for further cycles of screening. In certain
embodiments, candidate agents can be screened from large libraries
of synthetic or natural compounds. Such compound libraries are
commercially available from a number of companies including but not
limited to Maybridge Chemical Co. (Trevillet, Cornwall, UK),
Comgenex (Princeton, N.J.), Microsource (New Milford, Conn.), Aldrich
(Milwaukee, Wis.), Akos Consulting and Solutions GmbH (Basel,
Switzerland), Ambinter (Paris, France), Asinex (Moscow, Russia),
Aurora (Graz, Austria), BioFocus DPI (Switzerland), Bionet
(Camelford, UK), Chembridge (San Diego, Calif.), and Chem Div (San
Diego, Calif.). The skilled person is aware of other sources and
can readily purchase the same. Once therapeutically efficacious
compounds are identified in the screening assays described herein,
they can be formulated into pharmaceutical compositions and
utilized for the treatment of inflammatory bowel disease.
[0071] The polypeptides or fragments employed in drug screening
assays may either be free in solution, affixed to a solid support
or within a cell. One method of drug screening utilizes eukaryotic
or prokaryotic host cells which are stably transformed with
recombinant polynucleotides expressing the polypeptide or fragment,
preferably in competitive binding assays. Such cells, either in
viable or fixed form, can be used for standard binding assays. One
may determine, for example, formation of complexes between the
polypeptide or fragment and the agent being tested, or examine the
degree to which the formation of a complex between the polypeptide
or fragment and a known substrate is interfered with by the agent
being tested.
[0072] Another technique for drug screening provides high
throughput screening for compounds having suitable binding affinity
for the encoded polypeptides and is described in detail in Geysen,
PCT published application WO 84/03564, published on Sep. 13, 1984.
Briefly stated, large numbers of different, small peptide test
compounds, such as those described above, are synthesized on a
solid substrate, such as plastic pins or some other surface. The
peptide test compounds are reacted with the target polypeptide and
washed. Bound polypeptide is then detected by methods well known in
the art.
[0073] A further technique for drug screening involves the use of
host eukaryotic cell lines or cells (such as described above) which
have a nonfunctional or altered IBD associated gene. These host
cell lines or cells are defective at the polypeptide level. The
host cell lines or cells are grown in the presence of drug
compound. The rate of cellular metabolism of the host cells is
measured to determine if the compound is capable of regulating the
cellular metabolism in the defective cells. Host cells contemplated
for use in the present invention include but are not limited to
bacterial cells, fungal cells, insect cells, mammalian cells, and
plant cells. The IBD-associated SNP encoding DNA molecules may be
introduced singly into such host cells or in combination to assess
the phenotype of cells conferred by such expression. Methods for
introducing DNA molecules are also well known to those of ordinary
skill in the art. Such methods are set forth in Ausubel et al.
eds., Current Protocols in Molecular Biology, John Wiley &
Sons, NY, N.Y. 1995, the disclosure of which is incorporated by
reference herein.
[0074] A wide variety of expression vectors are available that can
be modified to express the novel DNA sequences of this invention.
The specific vectors exemplified herein are merely illustrative,
and are not intended to limit the scope of the invention.
Expression methods are described by Sambrook et al. Molecular
Cloning: A Laboratory Manual or Current Protocols in Molecular
Biology 16.3-17.44 (1989). Expression methods in Saccharomyces are
also described in Current Protocols in Molecular Biology
(1989).
[0075] Suitable vectors for use in practicing the invention include
prokaryotic vectors such as the pNH vectors (Stratagene Inc., 11099
N. Torrey Pines Rd., La Jolla, Calif. 92037), pET vectors (Novagen
Inc., 565 Science Dr., Madison, Wis. 53711) and the pGEX vectors
(Pharmacia LKB Biotechnology Inc., Piscataway, N.J. 08854).
Examples of eukaryotic vectors useful in practicing the present
invention include the vectors pRc/CMV, pRc/RSV, and pREP
(Invitrogen, 11588 Sorrento Valley Rd., San Diego, Calif. 92121);
pcDNA3.1/V5 & His (Invitrogen); baculovirus vectors such as
pVL1392, pVL1393, or pAC360 (Invitrogen); yeast vectors such as
YRP17, YIP5, and YEP24 (New England Biolabs, Beverly, Mass.), as
well as pRS403 and pRS413 (Stratagene Inc.); Pichia vectors such as
pHIL-D1 (Phillips Petroleum Co., Bartlesville, Okla. 74004);
retroviral vectors such as pLNCX and pLPCX (Clontech); and
adenoviral and adeno-associated viral vectors.
[0076] Promoters for use in expression vectors of this invention
include promoters that are operable in prokaryotic or eukaryotic
cells. Promoters that are operable in prokaryotic cells include
lactose (lac) control elements, bacteriophage lambda (pL) control
elements, arabinose control elements, tryptophan (trp) control
elements, bacteriophage T7 control elements, and hybrids thereof.
Promoters that are operable in eukaryotic cells include Epstein
Barr virus promoters, adenovirus promoters, SV40 promoters, Rous
Sarcoma Virus promoters, cytomegalovirus (CMV) promoters,
baculovirus promoters such as AcMNPV polyhedrin promoter, Pichia
promoters such as the alcohol oxidase promoter, and Saccharomyces
promoters such as the gal4 inducible promoter and the PGK
constitutive promoter, as well as neuronal-specific
platelet-derived growth factor promoter (PDGF), and the Thy-1
promoter.
[0077] In addition, a vector of this invention may contain any one
of a number of various markers facilitating the selection of a
transformed host cell. Such markers include genes associated with
temperature sensitivity, drug resistance, or enzymes associated
with phenotypic characteristics of the host organisms.
[0078] Host cells expressing the IBD-associated SNPs of the present
invention or functional fragments thereof provide a system in which
to screen potential compounds or agents for the ability to modulate
the development of IBD. Thus, in one embodiment, the nucleic acid
molecules of the invention may be used to create recombinant cell
lines for use in assays to identify agents which modulate aspects
of metabolism associated with IBD, including without limitation,
aberrant bacterial clearance, altered mucosal barriers and
persistent dysregulation of the immune response to commensal
intestinal bacteria. Also provided herein are methods to screen for
compounds capable of modulating the function of proteins encoded by
SNP containing nucleic acids.
[0079] Another approach entails the use of phage display libraries
engineered to express fragments of the polypeptides encoded by the
SNP containing nucleic acids on the phage surface. Such libraries
are then contacted with a combinatorial chemical library under
conditions wherein binding affinity between the expressed peptide
and the components of the chemical library may be detected. U.S.
Pat. Nos. 6,057,098 and 5,965,456 provide methods and apparatus for
performing such assays.
[0080] The goal of rational drug design is to produce structural
analogs of biologically active polypeptides of interest or of small
molecules with which they interact (e.g., agonists, antagonists,
inhibitors) in order to fashion drugs which are, for example, more
active or stable forms of the polypeptide, or which, e.g., enhance
or interfere with the function of a polypeptide in vivo. See, e.g.,
Hodgson, (1991) Bio/Technology 9:19-21. In one approach, discussed
above, the three-dimensional structure of a protein of interest or,
for example, of the protein-substrate complex, is solved by x-ray
crystallography, by nuclear magnetic resonance, by computer
modeling or most typically, by a combination of approaches. Less
often, useful information regarding the structure of a polypeptide
may be gained by modeling based on the structure of homologous
proteins. An example of rational drug design is the development of
HIV protease inhibitors (Erickson et al., (1990) Science
249:527-533). In addition, peptides may be analyzed by an alanine
scan (Wells, (1991) Meth. Enzym. 202:390-411). In this technique,
an amino acid residue is replaced by Ala, and its effect on the
peptide's activity is determined. Each of the amino acid residues
of the peptide is analyzed in this manner to determine the
important regions of the peptide.
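The alanine scan described above can be sketched as a simple enumeration of single-residue variants. In the following Python snippet the function name and the one-letter amino acid representation are illustrative assumptions. Positions that are already alanine are skipped, which is a simplification made for this sketch and is not specified in the text.

```python
def alanine_scan(peptide):
    """List single-residue alanine substitutions of a peptide.

    Returns tuples of (1-based position, original residue, variant sequence).
    Residues that are already alanine are skipped (assumption of this sketch).
    """
    variants = []
    for index, residue in enumerate(peptide):
        if residue == "A":
            continue
        variant = peptide[:index] + "A" + peptide[index + 1:]
        variants.append((index + 1, residue, variant))
    return variants


# Each variant would then be assayed to see how the substitution affects activity.
for position, original, variant in alanine_scan("MKAT"):
    print(position, original, variant)
```

Comparing the measured activity of each variant with that of the parent peptide indicates which positions contribute most to function.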
[0081] It is also possible to isolate a target-specific antibody,
selected by a functional assay, and then to solve its crystal
structure. In principle, this approach yields a pharmacore upon
which subsequent drug design can be based.
[0082] One can bypass protein crystallography altogether by
generating anti-idiotypic antibodies (anti-ids) to a functional,
pharmacologically active antibody. As a mirror image of a mirror
image, the binding site of the anti-ids would be expected to be an
analog of the original molecule. The anti-id could then be used to
identify and isolate peptides from banks of chemically or
biologically produced peptides. Selected peptides would
then act as the pharmacore.
[0083] Thus, one may design drugs which have, e.g., improved
polypeptide activity or stability or which act as inhibitors,
agonists, antagonists, etc. of polypeptide activity. By virtue of
the availability of SNP containing nucleic acid sequences described
herein, sufficient amounts of the encoded polypeptide may be made
available to perform such analytical studies as x-ray
crystallography. In addition, the knowledge of the protein sequence
provided herein will guide those employing computer modeling
techniques in place of, or in addition to x-ray
crystallography.
[0084] In another embodiment, the availability of IBD-associated
SNP containing nucleic acids enables the production of strains of
laboratory mice carrying the IBD-associated SNPs of the invention.
Transgenic mice expressing the IBD-associated SNP of the invention
provide a model system in which to examine the role of the protein
encoded by the SNP containing nucleic acid in the development and
progression towards IBD. Methods of introducing transgenes in
laboratory mice are known to those of skill in the art. Three
common methods include: 1. integration of retroviral vectors
encoding the foreign gene of interest into an early embryo; 2.
injection of DNA into the pronucleus of a newly fertilized egg; and
3. the incorporation of genetically manipulated embryonic stem
cells into an early embryo. Production of the transgenic mice
described above will facilitate the molecular elucidation of the
role that a target protein plays in various cellular metabolic
processes, including: aberrant bacterial clearance, altered mucosal
barriers and persistent dysregulation of the immune response to
commensal intestinal bacteria. Such mice provide an in vivo
screening tool to study putative therapeutic drugs in a whole
animal model and are encompassed by the present invention.
[0085] The term "animal" is used herein to include all vertebrate
animals, except humans. It also includes an individual animal in
all stages of development, including embryonic and fetal stages. A
"transgenic animal" is any animal containing one or more cells
bearing genetic information altered or received, directly or
indirectly, by deliberate genetic manipulation at the subcellular
level, such as by targeted recombination or microinjection or
infection with recombinant virus. The term "transgenic animal" is
not meant to encompass classical cross-breeding or in vitro
fertilization, but rather is meant to encompass animals in which
one or more cells are altered by or receive a recombinant DNA
molecule. This molecule may be specifically targeted to a defined
genetic locus, be randomly integrated within a chromosome, or it
may be extrachromosomally replicating DNA. The term "germ cell line
transgenic animal" refers to a transgenic animal in which the
genetic alteration or genetic information was introduced into a
germ line cell, thereby conferring the ability to transfer the
genetic information to offspring. If such offspring, in fact,
possess some or all of that alteration or genetic information, then
they, too, are transgenic animals.
[0086] The alteration of genetic information may be foreign to the
species of animal to which the recipient belongs, or foreign only
to the particular individual recipient, or may be genetic
information already possessed by the recipient. In the last case,
the altered or introduced gene may be expressed differently than
the native gene. Such altered or foreign genetic information would
encompass the introduction of IBD-associated SNP containing
nucleotide sequences.
[0087] The DNA used for altering a target gene may be obtained by a
wide variety of techniques that include, but are not limited to,
isolation from genomic sources, preparation of cDNAs from isolated
mRNA templates, direct synthesis, or a combination thereof.
[0088] A preferred type of target cell for transgene introduction
is the embryonal stem cell (ES). ES cells may be obtained from
pre-implantation embryos cultured in vitro (Evans et al., (1981)
Nature 292:154-156; Bradley et al., (1984) Nature 309:255-258;
Gossler et al., (1986) Proc. Natl. Acad. Sci. 83:9065-9069).
Transgenes can be efficiently introduced into the ES cells by
standard techniques such as DNA transfection or by
retrovirus-mediated transduction. The resultant transformed ES
cells can thereafter be combined with blastocysts from a non-human
animal. The introduced ES cells thereafter colonize the embryo and
contribute to the germ line of the resulting chimeric animal.
[0089] One approach to the problem of determining the contributions
of individual genes and their expression products is to use
isolated IBD-associated SNP genes as insertional cassettes to
selectively inactivate a wild-type gene in totipotent ES cells
(such as those described above) and then generate transgenic mice.
The use of gene-targeted ES cells in the generation of
gene-targeted transgenic mice has been described and is reviewed
elsewhere (Frohman et al., (1989) Cell 56:145-147; Bradley et al.,
(1992) Bio/Technology 10:534-539).
[0090] Techniques are available to inactivate any genetic region, or
to alter it to any desired mutation, by using targeted homologous
recombination to insert specific changes into chromosomal alleles.
However, in comparison with homologous extra-chromosomal
recombination, which occurs at a frequency approaching 100%,
homologous plasmid-chromosome recombination was originally reported
to only be detected at frequencies between 10.sup.-6 and 10.sup.-3.
Nonhomologous plasmid-chromosome interactions are more frequent,
occurring at levels 10.sup.5-fold to 10.sup.2-fold greater than
comparable homologous insertion.
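As a rough worked illustration of the frequencies quoted above, the following Python sketch converts a fold excess of nonhomologous over homologous insertion into the fraction of integrants that are correctly targeted, and into the expected number of unselected clones that would be screened per targeted clone. The function names are hypothetical and the calculation assumes that every integration event is either homologous or nonhomologous.

```python
def targeted_fraction(nonhomologous_fold_excess: float) -> float:
    """Fraction of all integration events that are homologous (targeted),
    assuming nonhomologous events are `nonhomologous_fold_excess` times
    as frequent as homologous ones."""
    return 1.0 / (1.0 + nonhomologous_fold_excess)


def expected_clones_per_targeted_event(nonhomologous_fold_excess: float) -> float:
    """Expected number of unselected integrant clones per targeted clone."""
    return 1.0 / targeted_fraction(nonhomologous_fold_excess)


# The two ends of the range quoted in the text (10^2-fold and 10^5-fold).
for fold in (1e2, 1e5):
    print(
        f"{fold:.0e}-fold excess: targeted fraction = "
        f"{targeted_fraction(fold):.2e}, about "
        f"{expected_clones_per_targeted_event(fold):,.0f} clones per targeted event"
    )
```

Under these assumptions the targeted fraction falls between roughly 1 in 100 and 1 in 100,000 integrants, which is why the enrichment and selection strategies described next are needed.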
[0091] To overcome this low proportion of targeted recombination in
murine ES cells, various strategies have been developed to detect
or select rare homologous recombinants. One approach for detecting
homologous alteration events uses the polymerase chain reaction
(PCR) to screen pools of transformant cells for homologous
insertion, followed by screening of individual clones.
Alternatively, a positive genetic selection approach has been
developed in which a marker gene is constructed which will only be
active if homologous insertion occurs, allowing these recombinants
to be selected directly. One of the most powerful approaches
developed for selecting homologous recombinants is the
positive-negative selection (PNS) method developed for genes for
which no direct selection of the alteration exists. The PNS method
is more efficient for targeting genes which are not expressed at
high levels because the marker gene has its own promoter.
Non-homologous recombinants are selected against by using the
Herpes Simplex virus thymidine kinase (HSV-TK) gene and selecting
against its nonhomologous insertion with effective herpes drugs
such as gancyclovir (GANC) or
1-(2-deoxy-2-fluoro-.beta.-D-arabinofuranosyl)-5-iodouracil (FIAU).
[0092] By this counter
selection, the number of homologous recombinants in the surviving
transformants can be increased. Utilizing IBD-associated SNP
containing nucleic acid as a targeted insertional cassette provides
means to detect a successful insertion as visualized, for example,
by acquisition of immunoreactivity to an antibody immunologically
specific for the polypeptide encoded by IBD-associated SNP nucleic
acid and, therefore, facilitates screening/selection of ES cells
with the desired genotype.
[0093] As used herein, a knock-in animal is one in which the
endogenous murine gene, for example, has been replaced with a human
IBD-associated SNP containing gene of the invention. Such knock-in
animals provide an ideal model system for studying the development
of IBD.
[0094] As used herein, the expression of an IBD-associated SNP
containing nucleic acid, fragment thereof, or an IBD-associated SNP
fusion protein can be targeted in a "tissue specific manner" or
"cell type specific manner" using a vector in which nucleic acid
sequences encoding all or a portion of IBD-associated SNP are
operably linked to regulatory sequences (e.g., promoters and/or
enhancers) that direct expression of the encoded protein in a
particular tissue or cell type. Such regulatory elements may be
used to advantage for both in vitro and in vivo applications.
Promoters for directing tissue specific expression of proteins are well known in
the art and described herein.
[0095] The nucleic acid sequence encoding the IBD-associated SNP of
the invention may be operably linked to a variety of different
promoter sequences for expression in transgenic animals. Such
promoters include, but are not limited to, a prion gene promoter
such as hamster and mouse Prion promoter (MoPrP), described in U.S.
Pat. No. 5,877,399 and in Borchelt et al., Genet. Anal. 13(6)
(1996) pages 159-163; a rat neuronal specific enolase promoter,
described in U.S. Pat. Nos. 5,612,486, and 5,387,742; a
platelet-derived growth factor B gene promoter, described in U.S.
Pat. No. 5,811,633; a brain specific dystrophin promoter, described
in U.S. Pat. No. 5,849,999; a Thy-1 promoter; a PGK promoter; a CMV
promoter; a neuronal-specific platelet-derived growth factor B gene
promoter; and a glial fibrillary acidic protein (GFAP) promoter for
the expression of transgenes in glial cells.
[0096] Methods of use for the transgenic mice of the invention are
also provided herein. Transgenic mice into which a nucleic acid
containing the IBD-associated SNP or its encoded protein has been
introduced are useful, for example, to develop screening methods to
screen therapeutic agents to identify those capable of modulating
the development of IBD.
Pharmaceuticals and Peptide Therapies
[0097] The elucidation of the role played by the IBD associated
SNPs described herein facilitates the development of pharmaceutical
compositions useful for treatment and diagnosis of IBD. These
compositions may comprise, in addition to one of the above
substances, a pharmaceutically acceptable excipient, carrier,
buffer, stabilizer or other materials well known to those skilled
in the art. Such materials should be non-toxic and should not
interfere with the efficacy of the active ingredient. The precise
nature of the carrier or other material may depend on the route of
administration, e.g. oral, intravenous, cutaneous or subcutaneous,
nasal, intramuscular, intraperitoneal routes.
[0098] Whether it is a polypeptide, antibody, peptide, nucleic acid
molecule, small molecule or other pharmaceutically useful compound
according to the present invention that is to be given to an
individual, administration is preferably in a "prophylactically
effective amount" or a "therapeutically effective amount" (as the
case may be, although prophylaxis may be considered therapy), this
being sufficient to show benefit to the individual.
[0099] The following examples are provided to illustrate certain
embodiments of the invention. They are not intended to limit the
invention in any way.
EXAMPLE I
[0100] We report herein results of an on-going GWA study where we
genotyped 550,000 single nucleotide polymorphisms (SNPs) with the
Illumina Human Hap550 Genotyping BeadChip.sup.29 in our study
population of 1,011 IBD cases (including 647 CD and 317 UC, with
the remainder being indeterminate colitis) of European ancestry and
4,250 controls with matching ancestry (based on self report).
Self-reported Caucasian ethnicity proved to be accurate, as the
resulting genomic inflation factor for the IBD run was less than
1.1.
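The text does not state how the genomic inflation factor was computed. A minimal sketch of one common definition, assuming 1-degree-of-freedom chi-square statistics per marker, divides the median observed statistic by the median expected under the null hypothesis. The function name and the input values below are hypothetical and are not study data.

```python
from statistics import NormalDist, median

# Median of a 1-df chi-square distribution, obtained as the square of the
# 75th-percentile standard normal quantile (about 0.455).
EXPECTED_MEDIAN_CHI2_1DF = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation_factor(chi2_stats):
    """Median observed 1-df chi-square statistic divided by its null median."""
    return median(chi2_stats) / EXPECTED_MEDIAN_CHI2_1DF


# Hypothetical per-marker test statistics, for illustration only.
example_stats = [0.02, 0.15, 0.31, 0.47, 0.92, 1.60, 3.10]
lam = genomic_inflation_factor(example_stats)
print(f"lambda = {lam:.3f}; below 1.1: {lam < 1.1}")
```

A value close to 1 suggests little inflation of the test statistics from population stratification; values well above 1 would call for a correction step.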
[0101] The following materials and methods are provided to
facilitate the practice of the present invention.
Research Subjects
1. IBD Cohort:
Subject Ascertainment and Diagnostic Classification.
[0102] Affected individuals with pediatric onset IBD (both CD and
UC) were ascertained through the Children's Hospital of Wisconsin
and Medical College of Wisconsin, Children's Hospital of
Philadelphia, and Cincinnati Children's Hospital Medical Center.
Additional UC cases were recruited from Primary Children's Medical
Center and from the University of Utah and the Pediatric
Gastroenterology & Liver Unit at the Sapienza University of
Rome, Italy. In addition, colonic mucosal biopsies from affected
IBD patients were obtained from Cincinnati Children's Hospital Medical
Center and from Children's Hospital of Wisconsin during the
diagnostic endoscopic procedures. Only subjects of European
ancestry were used in the final analysis which consisted of 1,011
individuals with IBD (including 647 CD and 317 UC, with the
remainder being indeterminate colitis) where the age of onset for
IBD was before their 19.sup.th birthday. All subjects had genotypes
with call rates above 95%. Informed consent was obtained from all
participants, and protocols were approved by the local
institutional review board in all participating institutions. The
diagnosis of IBD was made after fulfilling standard criteria (ref)
across the participating centers that require (i) one or more of
the following symptoms: diarrhea, rectal bleeding, abdominal pain,
fever or complicated perianal disease; (ii) occurrence of symptoms
on two or more occasions separated by at least 8 weeks or ongoing
symptoms of at least 6 weeks' duration; and (iii) objective evidence
of inflammation from radiologic, endoscopic, or video capsule
endoscopy evaluation. Histological evidence of IBD.sup.33 was considered
mandatory for the diagnosis of CD or UC and inclusion in the
study.
[0103] Phenotypic classification was based on the Montreal
classification.sup.37. For CD, we defined disease location based on
all of each subject's available endoscopic and radiographic
evaluations. Based on macroscopic evidence of disease location, we
classified each subject as follows: Ileum only: disease of
the small bowel proximal to the cecum and distal to the 4.sup.th
portion of the duodenum; Colon only: any colonic location between
cecum and rectum with no small bowel disease; Ileocolonic: disease of
the small bowel and any location between cecum and rectum. In addition,
any of the above categories may have upper GI tract involvement
(disease involving the esophagus, stomach or duodenum) and perianal
disease, including perianal fistulae and perianal and anal lesions including
more than single skin tags and anal ulcers. For example, subjects
with ileal only, colonic only or ileocolonic disease may also have
concomitant upper tract and/or perianal disease.
2. Control Subjects from Philadelphia:
[0104] The control group included 4,250 children with self-reported
Caucasian status, mean age 9.5 years; 53.0% male and 47.0% female,
who did not have IBD (CD or UC). These individuals were recruited by
CHOP clinicians and nursing staff within the CHOP Health Care
Network, including four primary care clinics and several group
practices and outpatient practices that included well child visits.
The Research Ethics Board of CHOP approved the study, and written
informed consent was obtained from all subjects.
Genotyping
[0105] Illumina Infinium.TM. assay: We performed high throughput
genome-wide SNP genotyping, using the Illumina Infinium.TM. II
HumanHap550 BeadChip technology.sup.29,35 (Illumina, San Diego), at
the Center for Applied Genomics at CHOP. We used 750 ng of genomic
DNA to genotype each sample, according to the manufacturer's
guidelines. On day one, genomic DNA was amplified 1000-1500-fold.
On day two, amplified DNA was fragmented to .about.300-600 bp, then
precipitated and resuspended, followed by hybridization onto a
BeadChip. Single base extension (SBE) utilizes a single probe sequence
.about.50 bp long designed to hybridize immediately adjacent to the
SNP query site. Following targeted hybridization to the bead array,
the arrayed SNP locus-specific primers (attached to beads) were
extended with a single hapten-labeled dideoxynucleotide in the SBE
reaction. The haptens were subsequently detected by a multi-layer
immunohistochemical sandwich assay, as recently described. The
Illumina BeadArray Reader scanned each BeadChip at two wavelengths
and created an image file. As BeadChip images were collected,
intensity values were determined for all instances of each bead
type, and data files were created that summarized intensity values
for each bead type. These files consisted of intensity data that
was loaded directly into Illumina's genotype analysis software,
BeadStudio. A bead pool manifest created from the LIMS database
containing all the BeadChip data was loaded into BeadStudio along
with the intensity data for the samples. BeadStudio used a
normalization algorithm to minimize BeadChip to BeadChip
variability. Once the normalization was complete, the clustering
algorithm was run to evaluate cluster positions for each locus and
assign individual genotypes. Each locus was given an overall score
based on the quality of the clustering and each individual genotype
call was given a GenCall score. GenCall scores provided a quality
metric that ranges from 0 to 1 assigned to every genotype called.
GenCall scores were then calculated using information from the
clustering of the samples. The location of each genotype relative
to its assigned cluster determined its GenCall score.
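The per-genotype GenCall score and the per-subject call-rate requirement (above 95%) can be related with a short sketch. This is an illustration only: the no-call cutoff of 0.15 is an assumption, since the text does not state which GenCall cutoff was applied, and the function names and scores are hypothetical.

```python
from typing import Sequence

NO_CALL_CUTOFF = 0.15  # assumed GenCall cutoff; not stated in the text
MIN_CALL_RATE = 0.95   # subjects were required to have call rates above 95%


def call_rate(gencall_scores: Sequence[float], cutoff: float = NO_CALL_CUTOFF) -> float:
    """Fraction of a sample's genotypes whose GenCall score (0 to 1)
    is at or above the cutoff, i.e. genotypes treated as called."""
    if not gencall_scores:
        raise ValueError("no genotypes supplied")
    called = sum(1 for score in gencall_scores if score >= cutoff)
    return called / len(gencall_scores)


def passes_call_rate(gencall_scores: Sequence[float],
                     min_rate: float = MIN_CALL_RATE) -> bool:
    """True when the sample's call rate is strictly above the minimum."""
    return call_rate(gencall_scores) > min_rate


# Hypothetical sample: 97 confidently called genotypes and 3 low-scoring ones.
sample_scores = [0.90] * 97 + [0.05] * 3
print(call_rate(sample_scores), passes_call_rate(sample_scores))
```

Samples failing the call-rate check would be excluded before the association analysis.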
[0106] Gene Array Analysis.
[0107] The global pattern of gene expression in colon was
determined in the Microarray Core of the CCHMC Digestive Health
Center (PMID: 18069684). Following informed consent, colonic
biopsies were obtained from pediatric patients with CD and UC and
healthy controls. For CD and UC patients, biopsies were obtained
from an area of active disease in the ascending colon or the most
proximal area of active disease if the ascending colon was
endoscopically normal. Colon biopsies were immediately placed in
RNAlater stabilization reagent (Qiagen, Germany) at 4.degree. C.
Total RNA was isolated using the RNeasy Plus Mini Kit (Qiagen) and
stored at -80.degree. C. Samples were then submitted to the CCHMC
Digestive Health Center Microarray Core where the quality and
concentration of RNA were measured by the Agilent Bioanalyser 2100
(Hewlett Packard) using the RNA 6000 Nano Assay to confirm a
28S/18S ratio of 1.6-2.0. 100 ng of total RNA was amplified using
Target 1-round Aminoallyl-aRNA Amplification Kit 101 (Epicentre,
WI.). The biotinylated cRNA was hybridized to Affymetrix GeneChip
Human Genome HG-U133 Plus 2.0 arrays, containing probes for
approximately 22,634 genes. The images were captured using
Affymetrix Genechip Scanner 3000. The complete dataset is available
at the NCBI Gene Expression Omnibus on the world wide web at
.ncbi.nlm.nih.gov/geo accession number. GeneSpring.TM. software was
used in the CCHMC Digestive Health Center Bioinformatics core to
analyze fold changes in gene expression between patient groups and
healthy controls. Data were normalized to allow for array to array
comparisons, and differences between groups were detected in
GeneSpring.TM. with significance at the 0.05 level relative to
healthy control samples. In order to allow for comparison between
the IBD sub-groups, mucosal inflammation was quantified in colon
biopsies using the Crohn's Disease Histological Index of
Severity.
RESULTS
[0108] In the IBD case-control analysis, single-marker allele
frequencies were compared using chi-square statistics for all markers.
Twelve markers remained significant after Bonferroni correction
(Table 1), the majority of which were previously reported or in the
MHC (driven by UC); however, two markers on chromosome 20q13,
rs2315008 and rs4809330, and one marker on chromosome 21q22,
rs2836878, were novel. Thus, we have identified two non-coding
variants in strong linkage disequilibrium (LD) on 20q13 (rs2315008
allele T and rs4809330 allele A) yielding P-values of
6.30.times.10.sup.-8 (corrected P=0.032) and
6.95.times.10.sup.-8 (corrected P=0.036), respectively, and
protective odds ratios (OR) of 0.74 for both (Table 1). In addition,
we have identified one non-coding variant on 21q22 (rs2836878
allele A) yielding a P-value of 6.01.times.10.sup.-8 (corrected
P=0.031) and a protective OR=0.73. Since all previously discovered
IBD genes are primarily associated with CD, it is important to note
that the contribution to these novel signals comes from both UC and
CD (Table 2). In addition, these signals replicate in the Wellcome
Trust Case Control Consortium (WTCCC).sup.24 CD dataset as also
shown in Table 2. The LD structure for the 20q13 and 21q22 loci
pinpointing the associated SNPs and genes within these regions is
shown in FIGS. 1 and 2, respectively.
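The single-marker statistics reported for rs2315008 can be approximated from the rounded values in Table 1. The sketch below rebuilds allele counts from the minor allele frequencies, computes an allelic odds ratio and a 1-degree-of-freedom Pearson chi-square test, and applies a Bonferroni correction. It assumes that all 1,011 cases and 4,250 controls were genotyped at the marker and that no continuity correction was used. The number of tests is not stated in the text, so it is inferred here from the ratio of the corrected to the raw P-value. Because the inputs are rounded, the results only approximate the tabulated values.

```python
import math


def odds_ratio(maf_cases: float, maf_controls: float) -> float:
    """Allelic odds ratio for the minor allele, cases versus controls."""
    return (maf_cases / (1 - maf_cases)) / (maf_controls / (1 - maf_controls))


def allelic_chi2(maf_cases: float, n_cases: int,
                 maf_controls: float, n_controls: int) -> float:
    """Pearson chi-square (1 df, no continuity correction) on the 2x2 table
    of minor/major allele counts in cases and controls."""
    a = maf_cases * 2 * n_cases          # minor alleles in cases
    b = 2 * n_cases - a                  # major alleles in cases
    c = maf_controls * 2 * n_controls    # minor alleles in controls
    d = 2 * n_controls - c               # major alleles in controls
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_1df_pvalue(chi2: float) -> float:
    """Upper-tail probability of a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(chi2 / 2))


def bonferroni(p_value: float, n_tests: float) -> float:
    """Bonferroni-corrected P-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# rs2315008 values as printed in Table 1 (rounded).
maf_aff, maf_ctrl = 0.250, 0.311
chi2 = allelic_chi2(maf_aff, 1011, maf_ctrl, 4250)
p = chi2_1df_pvalue(chi2)
implied_tests = 0.032 / 6.30e-8  # inferred from Table 1, not stated in the text
print(f"OR = {odds_ratio(maf_aff, maf_ctrl):.3f}")
print(f"chi-square = {chi2:.2f}, P = {p:.2e}")
print(f"implied number of tests = {implied_tests:,.0f}")
print(f"Bonferroni P = {bonferroni(p, implied_tests):.3f}")
```

The same functions can be applied to any row of Tables 1, 3 or 5 to check that the reported odds ratios and corrected P-values are mutually consistent.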
[0109] As such, these significant SNPs confer protection from IBD.
As shown in FIG. 1, the 20q13 signal resides in a complex telomeric
region of LD that harbors the genes for regulator of telomere
elongation helicase 1 (RTEL1), tumor necrosis factor receptor
superfamily member 6B (TNFRSF6B), ADP-ribosylation factor related
protein 1 (ARFRP1), zinc finger CCCH-type with G patch domain
(ZGPAT) and Lck interacting transmembrane adaptor 1 (LIME1). The
TNFRSF6B gene provides the most compelling candidate based on what
is already known about the TNF-pathway in IBD. Indeed, the mRNA
expression of TNFRSF6B is markedly different in colonic biopsies
obtained from IBD patients compared to disease-free controls; this
appears to be associated in part with colon location and with the
degree of mucosal inflammation (FIG. 3A, r.sup.2=0.24, p=0.001 for
linear regression for the Crohn's Disease Histological Index of
Severity (CDHIS) and TNFRSF6B expression). While no allelic
difference was observed in mRNA expression of TNFRSF6B between IBD
subjects with the two identified SNPs, this may have been
confounded by a greater degree of mucosal inflammation in the colon
biopsies for the subjects who did not carry the associated alleles
(mean (SEM) CDHIS for SNP+: 3.7.+-.1 vs. SNP-: 7.+-.1.2, p=0.05). By
comparison, we observed no difference in the expression of RTEL1,
ARFRP1, ZGPAT, or LIME1 between IBD cases and controls (FIG. 3B).
The gene product for TNFRSF6B acts as a decoy receptor in
preventing FasL induced cell death, and a resistance to FasL
dependent apoptosis has previously been shown for T lymphocytes in
CD.sup.30.
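The r.sup.2 value quoted above for the regression of TNFRSF6B expression on CDHIS is the squared correlation between the two variables. A minimal sketch of that calculation follows; the function name is hypothetical and the paired values are invented for illustration, not taken from the study.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x,
    computed as the squared Pearson correlation."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical histological scores and relative expression values.
cdhis_scores = [1, 3, 4, 6, 8, 9]
expression = [1.1, 0.9, 1.6, 1.4, 2.3, 1.8]
print(f"r^2 = {r_squared(cdhis_scores, expression):.2f}")
```

An r.sup.2 of 0.24 would mean that about a quarter of the variance in expression is explained by the inflammation score, leaving most of it unexplained.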
[0110] The 21q22 signal resides in a small region of LD that
harbors no genes; the nearest gene is the Down syndrome critical
region protein 2 isoform (PSMG1). We observed a modest increase in
the colonic expression of PSMG1 in IBD cases relative to controls
(supplemental FIG. 1A). However, this did not vary with either the
degree of mucosal inflammation, or carriage of the PSMG1 SNP.
[0111] In the case-control analysis of CD alone, single-marker
allele frequencies were also compared using chi-square statistics for
all markers. Nine markers remained significant after Bonferroni
correction. As shown in Table 3, all of these loci have been
previously reported in GWA studies.sup.21. However, when
investigating the site specificity of CD in patients [colon only
(29%), ileum only (17%) or ileocolonic (54%)], a genome wide
significant signal was observed for colon-only CD (Table 4), also
on chromosome 21 but approximately 1.4 Mb away from the signal we
detected on chromosome 21 for the common form of IBD (Table 1).
This new signal resides in DSCAM, a gene that has not previously
been linked with CD. DSCAM colonic expression did not differ
between IBD cases and controls, within the IBD sub-groups, or as a
function of mucosal inflammation (supplemental FIG. 1B).
[0112] Thus, we have identified two non-coding variants on 21q22
(rs2837643 allele A and rs16999939 allele T) that are associated
with the colonic form of CD, yielding P-values ranging from
2.40.times.10.sup.-8 to 5.69.times.10.sup.-8 and at-risk ORs
ranging from 3.29 to 3.57 (Table 4).
[0113] Previous work addressing disease location suggests that both
ATG16L1 and CARD15 are involved specifically in inflammation of the
ileum.sup.31. Our results are in keeping with these reports,
demonstrating that the previously described CARD15 variants (and, to
a lesser extent, ATG16L1) do not appear to impact on colon-only
disease in CD patients; the effects of these variants in CD
therefore appear to be limited to the ileal/small intestine form of
the disease (Table 4).
[0114] In the case-control analysis of UC alone, single-marker
allele frequencies were also compared using chi-square statistics for
all markers. Seventeen markers remained significant after
Bonferroni correction (Table 5). However, the resulting genomic
inflation factor for the UC run was not as close to 1 (i.e., 1.3);
therefore we controlled for cryptic population structure using
principal components analysis as implemented in Eigenstrat. As a
consequence, four markers remained genome-wide significant, all of
which resided in the major histocompatibility complex (MHC) on
chromosome 6p21. This reinforces previously suggested MHC
associations based on linkage studies.sup.32, and this is the first GWA
study to associate UC with specific MHC alleles.
Taken together, we have identified novel susceptibility loci in
pediatric onset IBD at 20q13 and 21q22. We also show for the first
time a strong association of UC with the MHC on 6p21 and we have
refined the association of CARD15 with CD to only those subjects
who have ileal involvement.
TABLE-US-00001 TABLE 1 IBD case-control association study results for GWA significant markers.
CHR | SNP | Position (B36) | Minor Allele | MAF Aff | MAF Ctrl | P-value | Bonferroni P | OR | Relevant Gene
1 | rs11209026 | 67478546 | A | 0.024 | 0.061 | 7.47 .times. 10.sup.-11 | 3.84 .times. 10.sup.-5 | 0.385 | IL23R
16 | rs5743289 | 49314275 | T | 0.232 | 0.172 | 3.77 .times. 10.sup.-10 | 0.00019 | 1.455 | CARD15
1 | rs11465804 | 67475114 | G | 0.030 | 0.065 | 1.46 .times. 10.sup.-9 | 0.00075 | 0.442 | IL23R
6 | rs477515 | 32677669 | T | 0.248 | 0.313 | 1.02 .times. 10.sup.-8 | 0.0052 | 0.724 | MHC
6 | rs2516049 | 32678378 | G | 0.248 | 0.313 | 1.06 .times. 10.sup.-8 | 0.0054 | 0.724 | MHC
6 | rs9271568 | 32698441 | A | 0.238 | 0.301 | 2.95 .times. 10.sup.-8 | 0.015 | 0.724 | MHC
9 | rs6478109 | 116608587 | A | 0.251 | 0.314 | 3.20 .times. 10.sup.-8 | 0.016 | 0.733 | TNFSF1
21 | rs2836878 | 39387404 | A | 0.214 | 0.273 | 6.01 .times. 10.sup.-8 | 0.031 | 0.725 |
20 | rs2315008 | 61814400 | T | 0.250 | 0.311 | 6.30 .times. 10.sup.-8 | 0.032 | 0.737 |
20 | rs4809330 | 61820030 | A | 0.249 | 0.310 | 6.95 .times. 10.sup.-8 | 0.036 | 0.738 |
9 | rs6478108 | 116598524 | C | 0.262 | 0.324 | 8.36 .times. 10.sup.-8 | 0.043 | 0.743 | TNFSF1
16 | rs2076756 | 49314382 | G | 0.317 | 0.258 | 9.65 .times. 10.sup.-8 | 0.050 | 1.332 | CARD15
Novel signals are indicated in bold.
indicates data missing or illegible when filed
TABLE-US-00002 TABLE 2 Key signals in CD and UC separately and in the WTCCC CD cohort
CHR | SNP | Minor Allele | MAF Aff | MAF Ctrl | P-value | OR
CD
20 | rs2315008 | T | 0.252 | 0.311 | 1.84 .times. 10.sup.-5 | 0.747
20 | rs4809330 | A | 0.252 | 0.309 | 2.71 .times. 10.sup.-5 | 0.752
21 | rs2836878 | A | 0.224 | 0.272 | 0.00026 | 0.772
UC
20 | rs2315008 | T | 0.238 | 0.311 | 0.00013 | 0.694
20 | rs4809330 | A | 0.235 | 0.309 | 8.58 .times. 10.sup.-5 | 0.686
21 | rs2836878 | A | 0.194 | 0.272 | 1.71 .times. 10.sup.-5 | 0.643
CHR | SNP | Minor Allele | Location (B36) | P | r.sup.2 with signal
WTCCC CD
20 | rs6011040 | A | 61807850 | 6.52 .times. 10.sup.-5 | 0.96
21 | rs378108 | G | 39391390 | 0.032 | 0.34
TABLE-US-00003 TABLE 3 CD case-control association study results for GWA significant markers
CHR | SNP | Position (B36) | Minor Allele | MAF Aff | MAF Ctrl | P-value | Bonferroni P | OR | Relevant Gene
16 | rs5743289 | 49314275 | T | 0.257 | 0.172 | 1.21 .times. 10.sup.-13 | 6.22 .times. 10.sup.-8 | 1.671 | CARD15
1 | rs11209026 | 67478546 | A | 0.018 | 0.061 | 3.35 .times. 10.sup.-10 | 0.00017 | 0.281 | IL23R
2 | rs2241880 | 233848107 | T | 0.396 | 0.488 | 7.63 .times. 10.sup.-10 | 0.00039 | 0.687 | ATG16L1
2 | rs2289472 | 233846979 | A | 0.398 | 0.489 | 1.10 .times. 10.sup.-9 | 0.00056 | 0.691 | ATG16L1
2 | rs13391356 | 233835108 | T | 0.399 | 0.489 | 1.31 .times. 10.sup.-9 | 0.00067 | 0.693 | ATG16L1
16 | rs2076756 | 49314382 | G | 0.338 | 0.258 | 1.88 .times. 10.sup.-9 | 0.00097 | 1.465 | CARD15
2 | rs3792109 | 233849156 | T | 0.399 | 0.488 | 3.41 .times. 10.sup.-9 | 0.0018 | 0.699 | ATG16L1
16 | rs2066843 | 49302700 | T | 0.351 | 0.272 | 3.61 .times. 10.sup.-9 | 0.0019 | 1.449 | CARD15
1 | rs11465804 | 67475114 | G | 0.024 | 0.065 | 7.64 .times. 10.sup.-9 | 0.0039 | 0.355 | IL23R
TABLE-US-00004 TABLE 4 SNPs of interest with respect to site-specific CD
CHR | SNP | Position (B36) | Minor Allele | MAF Aff | MAF Ctrl | P-value | OR | Relevant Gene
Colon
1 | rs11465804 | 67475114 | G | 0.038 | 0.063 | 0.094 | 0.598 | IL23R
1 | rs11209026 | 67478546 | A | 0.025 | 0.059 | 0.015 | 0.405 | IL23R
2 | rs13391356 | 233835108 | T | 0.423 | 0.489 | 0.028 | 0.766 | ATG16L1
2 | rs2289472 | 233846979 | G | 0.420 | 0.489 | 0.021 | 0.757 | ATG16L1
2 | rs2241880 | 233848107 | T | 0.419 | 0.488 | 0.022 | 0.757 | ATG16L1
2 | rs3792109 | 233849156 | C | 0.426 | 0.488 | 0.041 | 0.780 | ATG16L1
3 | rs2245556 | 102098240 | T | 0.141 | 0.139 | 0.94 | 1.013 | ABI3BP
9 | rs6478108 | 116598524 | C | 0.234 | 0.320 | 0.0022 | 0.651 | TNFSF15
9 | rs6478109 | 116608587 | A | 0.238 | 0.310 | 0.0091 | 0.694 | TNFSF15
16 | rs2066843 | 49302700 | T | 0.325 | 0.273 | 0.052 | 1.283 | CARD15
16 | rs5743289 | 49314275 | T | 0.185 | 0.173 | 0.59 | 1.088 | CARD15
16 | rs2076756 | 49314382 | G | 0.294 | 0.260 | 0.20 | 1.186 | CARD15
20 | rs2315008 | 61814400 | T | 0.280 | 0.306 | 0.35 | 0.882 | TNFRSF6B
20 | rs4809330 | 61820030 | A | 0.280 | 0.304 | 0.37 | 0.888 | TNFRSF6B
21 | rs2836878 | 39387404 | A | 0.231 | 0.266 | 0.18 | 0.828 | PSMG1
21 | rs2837643 | 40761352 | A | 0.070 | 0.021 | 2.40 .times. 10.sup.-8 | 3.567 | DSCAM
21 | rs16999939 | 40828471 | T | 0.077 | 0.025 | 5.69 .times. 10.sup.-8 | 3.285 | DSCAM
Ileum
1 | rs11465804 | 67475114 | G | 0.012 | 0.063 | 0.0083 | 0.187 | IL23R
1 | rs11209026 | 67478546 | A | 0.006 | 0.059 | 0.0045 | 0.099 | IL23R
2 | rs13391356 | 233835108 | T | 0.377 | 0.489 | 0.0045 | 0.631 | ATG16L1
2 | rs2289472 | 233846979 | G | 0.377 | 0.489 | 0.0047 | 0.632 | ATG16L1
2 | rs2241880 | 233848107 | T | 0.375 | 0.488 | 0.0046 | 0.630 | ATG16L1
2 | rs3792109 | 233849156 | C | 0.377 | 0.488 | 0.0050 | 0.635 | ATG16L1
3 | rs2245556 | 102098240 | T | 0.173 | 0.139 | 0.22 | 1.291 | ABI3BP
9 | rs6478108 | 116598524 | C | 0.303 | 0.320 | 0.64 | 0.923 | TNFSF15
9 | rs6478109 | 116608587 | A | 0.296 | 0.310 | 0.71 | 0.937 | TNFSF15
16 | rs2066843 | 49302700 | T | 0.364 | 0.273 | 0.010 | 1.525 | CARD15
16 | rs5743289 | 49314275 | T | 0.315 | 0.173 | 2.50 .times. 10.sup.-6 | 2.198 | CARD15
16 | rs2076756 | 49314382 | G | 0.364 | 0.260 | 0.0027 | 1.634 | CARD15
20 | rs2315008 | 61814400 | T | 0.191 | 0.306 | 0.0017 | 0.538 | TNFRSF6B
20 | rs4809330 | 61820030 | A | 0.191 | 0.304 | 0.0019 | 0.541 | TNFRSF6B
21 | rs2836878 | 39387404 | A | 0.228 | 0.266 | 0.28 | 0.817 | PSMG1
21 | rs2837643 | 40761352 | A | 0.051 | 0.021 | 0.0094 | 2.530 | DSCAM
21 | rs16999939 | 40828471 | T | 0.062 | 0.025 | 0.0030 | 2.593 | DSCAM
Ileocolonic
1 | rs11465804 | 67475114 | G | 0.023 | 0.063 | 0.00029 | 0.345 | IL23R
1 | rs11209026 | 67478546 | A | 0.020 | 0.059 | 0.00037 | 0.335 | IL23R
2 | rs13391356 | 233835108 | T | 0.406 | 0.489 | 0.00033 | 0.713 | ATG16L1
2 | rs2289472 | 233846979 | G | 0.406 | 0.489 | 0.00036 | 0.715 | ATG16L1
2 | rs2241880 | 233848107 | T | 0.402 | 0.488 | 0.00025 | 0.706 | ATG16L1
2 | rs3792109 | 233849156 | C | 0.406 | 0.488 | 0.00042 | 0.717 | ATG16L1
3 | rs2245556 | 102098240 | T | 0.180 | 0.139 | 0.011 | 1.359 | ABI3BP
9 | rs6478108 | 116598524 | C | 0.254 | 0.320 | 0.0024 | 0.725 | TNFSF15
9 | rs6478109 | 116608587 | A | 0.238 | 0.310 | 0.00073 | 0.694 | TNFSF15
16 | rs2066843 | 49302700 | T | 0.355 | 0.273 | 9.01 .times. 10.sup.-5 | 1.462 | CARD15
16 | rs5743289 | 49314275 | T | 0.271 | 0.173 | 3.93 .times. 10.sup.-8 | 1.774 | CARD15
16 | rs2076756 | 49314382 | G | 0.344 | 0.260 | 3.53 .times. 10.sup.-5 | 1.497 | CARD15
20 | rs2315008 | 61814400 | T | 0.258 | 0.306 | 0.026 | 0.791 | TNFRSF6B
20 | rs4809330 | 61820030 | A | 0.258 | 0.304 | 0.031 | 0.796 | TNFRSF6B
21 | rs2836878 | 39387404 | A | 0.236 | 0.266 | 0.14 | 0.851 | PSMG1
21 | rs2837643 | 40761352 | A | 0.016 | 0.021 | 0.53 | 0.794 | DSCAM
21 | rs16999939 | 40828471 | T | 0.027 | 0.025 | 0.79 | 1.079 | DSCAM
TABLE-US-00005 TABLE 5 UC case-control association study results for GWA significant markers
CHR | SNP | Position (B36) | Minor Allele | MAF Aff | MAF Ctrl | P-value | Bonferroni P | OR | Relevant Gene | Eigenstr
6 | rs9271568 | 32698441 | A | 0.148 | 0.301 | 8.22 .times. 10.sup.-16 | 4.22 .times. 10.sup.-10 | 0.402 | MHC | 5.21 .times. 10
6 | rs2516049 | 32678378 | G | 0.167 | 0.313 | 1.17 .times. 10.sup.-14 | 6.02 .times. 10.sup.-9 | 0.440 | MHC | 4.20 .times. 10
6 | rs477515 | 32677669 | T | 0.167 | 0.313 | 1.24 .times. 10.sup.-14 | 6.36 .times. 10.sup.-9 | 0.440 | MHC | 4.45 .times. 10
6 | rs2395185 | 32541145 | T | 0.177 | 0.325 | 1.97 .times. 10.sup.-14 | 1.01 .times. 10.sup.-8 | 0.447 | MHC | 1.06 .times. 10
6 | rs3104404 | 32790152 | A | 0.353 | 0.230 | 3.10 .times. 10.sup.-12 | 1.59 .times. 10.sup.-6 | 1.823 | MHC |
6 | rs3129882 | 32517508 | G | 0.579 | 0.452 | 5.76 .times. 10.sup.-10 | 0.00030 | 1.670 | MHC |
6 | rs6903608 | 32536263 | C | 0.445 | 0.328 | 1.71 .times. 10.sup.-9 | 0.00088 | 1.644 | MHC |
6 | rs3129763 | 32698903 | A | 0.374 | 0.264 | 1.80 .times. 10.sup.-9 | 0.00093 | 1.667 | MHC |
6 | rs602875 | 32681607 | G | 0.377 | 0.268 | 3.75 .times. 10.sup.-9 | 0.0019 | 1.650 | MHC |
6 | rs382259 | 32317005 | G | 0.429 | 0.317 | 6.93 .times. 10.sup.-9 | 0.0036 | 1.617 | MHC |
3 | rs2245556 | 102098240 | T | 0.063 | 0.145 | 8.34 .times. 10.sup.-9 | 0.0043 | 0.396 | ABI3BP |
6 | rs660895 | 32685358 | G | 0.101 | 0.188 | 4.39 .times. 10.sup.-8 | 0.023 | 0.485 | MHC |
3 | rs2595893 | 102160532 | C | 0.066 | 0.144 | 4.44 .times. 10.sup.-8 | 0.023 | 0.421 | ABI3BP |
6 | rs1035798 | 32259200 | T | 0.375 | 0.274 | 4.57 .times. 10.sup.-8 | 0.023 | 1.591 | MHC |
3 | rs2245473 | 102098826 | G | 0.064 | 0.142 | 4.64 .times. 10.sup.-8 | 0.024 | 0.414 | ABI3BP |
4 | rs7663239 | 38462245 | G | 0.125 | 0.068 | 7.50 .times. 10.sup.-8 | 0.039 | 1.965 | TLR1 |
6 | rs3135363 | 32497626 | C | 0.391 | 0.290 | 8.32 .times. 10.sup.-8 | 0.043 | 1.571 | MHC |
indicates data missing or illegible when filed
EXAMPLE II
[0115] We report herein results of an on-going GWA study where we
genotyped 550,000 single nucleotide polymorphisms (SNPs) with the
Illumina Human Hap550 Genotyping BeadChip.sup.29 in our study
population of 2,161 IBD cases of European ancestry and 6,483
controls with matching ancestry (based on self report).
Self-reported Caucasian ethnicity proved to be accurate, as the
resulting genomic inflation factor for the IBD run was less than
1.07.
[0116] The following materials and methods are provided to
facilitate the practice of the present example.
Research Subjects
1. IBD Cohort:
Subject Ascertainment and Diagnostic Classification.
[0117] Affected individuals with pediatric onset IBD (both CD and
UC) were ascertained through the Children's Hospital of Wisconsin
and Medical College of Wisconsin, Children's Hospital of
Philadelphia, Cincinnati Children's Hospital Medical Center,
University of Edinburgh; Sapienza University of Rome, Italy; "Casa
Sollievo della Sofferenza" Hospital, San Giovanni Rotondo, Italy;
Mount Sinai Hospital, Toronto; Hospital for Sick Children, Toronto; and
Cedars-Sinai Medical Center in Los Angeles. In addition, colonic
mucosal biopsies from affected IBD patients were obtained from
Cincinnati Children's Hospital Medical Center and from Children's Hospital
of Wisconsin during the diagnostic endoscopic procedures. Only
subjects of European ancestry were used in the final analysis which
consisted of 2,161 individuals with IBD where the age of onset for
IBD was before their 19.sup.th birthday. All subjects had genotypes
with call rates above 95%. Informed consent was obtained from all
participants, and protocols were approved by the local
institutional review board in all participating institutions. The
diagnosis of IBD was made after fulfilling standard criteria (ref)
across the participating centers that requires (i) one or more of
the following symptoms: diarrhea, rectal bleeding, abdominal pain,
fever or complicated perianal disease; (ii) occurrence of symptoms
on two or more occasions separated by at least 8 weeks or ongoing
symptoms of at least 6 weeks' duration; and (iii) objective evidence
of inflammation from radiologic, endoscopic, or video capsule
endoscopy evaluation. Histological evidence of IBD.sup.33 was considered
mandatory for the diagnosis of CD or UC and inclusion in the
study.
[0118] Phenotypic classification was based on the Montreal
classification.sup.37. For CD, we defined disease location based on
all of each subject's available endoscopic and radiographic
evaluations. Based on macroscopic evidence of disease location, we
classified each subject by the following: Ileum only: disease of
the small bowel proximal to the cecum and distal 4.sup.th portion
of duodenum; Colon only: any colonic location between cecum and
rectum with no small bowel disease; Ileocolonic: disease of the
small bowel and any location between cecum and rectum. In addition,
any of the above categories may have upper GI tract involvement:
disease involving esophagus, stomach, duodenum and perianal disease
including: perianal fistulae, perianal and anal lesions including
more than single skin tags and anal ulcers. For example, subjects
with ileal only, colonic only or ileocolonic disease may also have
concomitant upper tract and/or perianal disease.
2. Control Subjects from Philadelphia:
[0119] The control group included 6,483 children with self reported
Caucasian status, mean age 9.5 years; 53.0% male and 47.0% female,
who did not have IBD (CD or UC). These individuals were recruited by
CHOP clinicians and nursing staff within the CHOP Health Care
Network, including four primary care clinics and several group
practices and outpatient practices that included well child visits.
The Research Ethics Board of CHOP approved the study, and written
informed consent was obtained from all subjects.
Genotyping
[0120] Illumina Infinium.TM. assay: We performed high throughput
genome-wide SNP genotyping, using the Illumina Infinium.TM. II
HumanHap550 BeadChip technology.sup.29,35 (Illumina, San Diego), at
the Center for Applied Genomics at CHOP. We used 750 ng of genomic
DNA to genotype each sample, according to the manufacturer's
guidelines. On day one, genomic DNA was amplified 1000-1500-fold.
On day two, amplified DNA was fragmented to .about.300-600 bp, then
precipitated and resuspended followed by hybridization on to a
BeadChip. Single base extension utilizes a single probe sequence
.about.50bp long designed to hybridize immediately adjacent to the
SNP query site. Following targeted hybridization to the bead array,
the arrayed SNP locus-specific primers (attached to beads) were
extended with a single hapten-labeled dideoxynucleotide in the SBE
reaction. The haptens were subsequently detected by a multi-layer
immunohistochemical sandwich assay, as recently described. The
Illumina BeadArray Reader scanned each BeadChip at two wavelengths
and created an image file. As BeadChip images were collected,
intensity values were determined for all instances of each bead
type, and data files were created that summarized intensity values
for each bead type. These files consisted of intensity data that
was loaded directly into Illumina's genotype analysis software,
BeadStudio. A bead pool manifest created from the LIMS database
containing all the BeadChip data was loaded into BeadStudio along
with the intensity data for the samples. BeadStudio used a
normalization algorithm to minimize BeadChip to BeadChip
variability. Once the normalization was complete, the clustering
algorithm was run to evaluate cluster positions for each locus and
assign individual genotypes. Each locus was given an overall score
based on the quality of the clustering and each individual genotype
call was given a GenCall score. GenCall scores provided a quality
metric that ranges from 0 to 1 assigned to every genotype called.
GenCall scores were then calculated using information from the
clustering of the samples. The location of each genotype relative
to its assigned cluster determined its GenCall score.
[0121] Gene Array Analysis.
[0122] The global pattern of gene expression in colon was
determined in the Microarray Core of the CCHMC Digestive Health
Center REF: PMID: 18069684. Following informed consent, colonic
biopsies were obtained from pediatric patients with CD and UC and
healthy controls. For CD and UC patients, biopsies were obtained
from an area of active disease in the ascending colon or the most
proximal area of active disease if the ascending colon was
endoscopically normal. Colon biopsies were immediately placed in
RNAlater stabilization reagent (Qiagen, Germany) at 4.degree. C.
Total RNA was isolated using the RNeasy Plus Mini Kit (Qiagen) and
stored at -80.degree. C. Samples were then submitted to the CCHMC
Digestive Health Center Microarray Core where the quality and
concentration of RNA was measured by the Agilent Bioanalyser 2100
(Hewlett Packard) using the RNA 6000 Nano Assay to confirm a
28S/18S ratio of 1.6-2.0. 100 ng of total RNA was amplified using
Target 1-round Aminoallyl-aRNA Amplification Kit 101 (Epicentre,
WI.). The biotinylated cRNA was hybridized to Affymetrix GeneChip
Human Genome HG-U133 Plus 2.0 arrays, containing probes for
approximately 22,634 genes. The images were captured using
Affymetrix Genechip Scanner 3000. The complete dataset is available
at the NCBI Gene Expression Omnibus on the world wide web at
ncbi.nlm.nih.gov/geo accession number.
[0123] GeneSpring.TM. software was used in the CCHMC Digestive Health
Center Bioinformatics core to analyze fold changes in gene
expression between patient groups and healthy controls. Data were
normalized to allow for array to array comparisons, and differences
between groups were detected in GeneSpring.TM. with significance at
the 0.05 level relative to healthy control samples. In order to
allow for comparison between the IBD sub-groups, mucosal
inflammation was quantified in colon biopsies using the Crohn's
Disease Histological Index of Severity.
RESULTS
[0124] Following a genome wide association analysis in an IBD
cohort, we observe a constellation of novel significant loci
associating with IBD (Table 6), CD (Table 7), and UC (Table 8).
This invention consists of the genetic factors listed in the tables
below. Regions highlighted in gray color in Tables 6-8 are
genes/loci that are genome-wide significant (P<10.sup.-8). Other
regions include genes/loci that are suggestive of causality of IBD
(P<10.sup.-5).
TABLE-US-00006 TABLE 6A Genetic Factors involved in IBD (all)
##STR00001##
TABLE-US-00007 TABLE 6B Genetic Factors involved in IBD (subset)
REGION | COORDS | NumSNP | TopSNP | TopP | F_A | F_U | OR | Genes
1 | chr6:90682173-90715742 | 2 | rs13219796 | 7.71E-24 | 0.01823 | 0.07632 | 0.2247 | BACH2, CASP8AP2, CX62, MDN1
2 | chr1:60475371-60663807 | 2 | R4529739 | 2.25E-22 | 0.0291 | 0.09168 | 0.2969 | C1orf87
3 | chr7:36949937-37046283 | 2 | rs17170842 | 4.71E-18 | 0.03079 | 0.08376 | 0.3475 | ELMO1
4 | chr7:55627351-55634120 | 2 | rs13232099 | 4.64E-16 | 0.02811 | 0.07495 | 0.357 | ECOP, FKBP9L, LANCL2, SEPT14
5 | chr2:167961916-168008207 | 2 | rs1159502 | 2.82E-14 | 0.0376 | 0.08476 | 0.4218 | XIRP2
7 | chr1:243688674-243819452 | 2 | rs11585347 | 5.16E-09 | 0.04142 | 0.07594 | 0.5258 | KIF26B
8 | chr2:227770223-227901446 | 2 | rs6722598 | 1.17E-08 | 0.01548 | 0.03929 | 0.3846 | C2orf33, COL4A3, COL4A4, HRB, TM4SF20
9 | chr9:116561013-116610587 | 4 | rs10759736 | 1.63E-08 | 0.0753 | 0.1155 | 0.6239 | ATP6V1G1, C9orf91, TNFSF15, TNFSF8
10 | chr20:865094-876945 | 2 | rs474816 | 2.76E-08 | 0.09732 | 0.1419 | 0.6521 | ANGPT4, C20orf54, FAM110A, PSMF1, RSPO4
11 | chr18:22546376-22715449 | 3 | rs1597317 | 4.12E-08 | 0.1893 | 0.247 | 0.7116 | AQP4, CHST9, KCTD1
12 | chr4:22776952-22855172 | 2 | rs7676830 | 9.86E-08 | 0.2053 | 0.2599 | 0.7355 |
15 | chr3:125487920-125642496 | 2 | rs13098182 | 3.90E-07 | 0.04705 | 0.07668 | 0.5945 | KALRN
16 | chr8:81852567-81966154 | 2 | rs17475446 | 8.47E-07 | 0.108 | 0.1476 | 0.6994 | PAG1, ZNF704
17 | chr7:45911451-46082359 | 2 | rs12671457 | 9.27E-07 | 0.1147 | 0.1546 | 0.7084 | ADCY1, IGFBP1, IGFBP3
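The odds ratios in Table 6B can be checked against the two frequency columns. The sketch below assumes that F_A and F_U denote the frequency of the tested allele in affected and unaffected subjects, respectively (the convention used in PLINK association output); that reading is an assumption, and the function name is hypothetical.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds of carrying the allele in cases divided by the odds in controls."""
    for f in (freq_cases, freq_controls):
        if not 0.0 < f < 1.0:
            raise ValueError("frequencies must lie strictly between 0 and 1")
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls

# Region 1 of Table 6B: F_A = 0.01823, F_U = 0.07632, reported OR = 0.2247.
or_region1 = allelic_odds_ratio(0.01823, 0.07632)
print(round(or_region1, 4))
```

Under this reading, an OR below 1 means the tested allele is less frequent in cases than in controls.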
TABLE-US-00008 TABLE 7 Genetic Factors involved in Crohn's Disease
##STR00002##
TABLE-US-00009 TABLE 8 Genetic Factors involved in Ulcerative Colitis
REGION | COORDS | SNP | P | F_A | F_U | OR | Genes
1 | chr18:32218133-32251233 | rs7228236 | 1.17E-06 | 0.1697 | 0.2284 | 0.6904 | FHOD3
2 | chr21:39385048-39389404 | rs2836878 | 3.89E-06 | 0.2042 | 0.2631 | 0.7188 |
[0125] IBD is a major health problem in children and an immense
economic burden on the health care systems both in the US and the
rest of the world. The GWA approach serves the critical need for a
more comprehensive and unbiased strategy to identify causal genes
related to IBD. The human genome and International HapMap projects
have enabled the development of unprecedented technology and tools
to investigate the genetic basis of complex disease. The HapMap
project, a large-scale effort aimed at understanding human sequence
variation, has yielded new insights into human genetic diversity
that is essential for the rigorous study design needed to maximize
the likelihood that a genetic association study will be successful.
Genome-wide genotyping of over 500,000 SNPs can now be readily
achieved in an efficient and highly accurate manner. Since much of
human diversity is due to single base pair variations together with
variations in copy number throughout the genome, current advances
in single-base extension (SBE) biochemistry and
hybridization/detection to synthetic oligonucleotides now make it
possible to accurately genotype and quantitate allelic copy number.
Accordingly, this project has applied the latest in high density
SNP-based genotyping technology in GWA studies aimed at identifying
genes and genetic variants that contribute to IBD in well-defined
pediatric study populations. Our invention is a discovery that
impacts on millions of children in the US and the rest of the world
with IBD.
REFERENCES FOR EXAMPLES I AND II
[0126] 1. Schreiber, S., Rosenstiel, P., Albrecht, M., Hampe, J.
& Krawczak, M. Genetics of Crohn disease, an archetypal
inflammatory barrier disease. Nat Rev Genet 6, 376-88 (2005).
[0127] 2. Bouma, G. & Strober, W. The immunological and genetic
basis of inflammatory bowel disease. Nat Rev Immunol 3, 521-33
(2003). [0128] 3. Sartor, R. B. Mechanisms of disease: pathogenesis
of Crohn's disease and ulcerative colitis. Nat Clin Pract
Gastroenterol Hepatol 3, 390-407 (2006). [0129] 4. Podolsky, D. K.
Inflammatory bowel disease. N Engl J Med 347, 417-29 (2002). [0130]
5. Halme, L. et al. Family and twin studies in inflammatory bowel
disease. World J Gastroenterol 12, 3668-72 (2006). [0131] 6.
Orholm, M. et al. Familial occurrence of inflammatory bowel
disease. N Engl J Med 324, 84-8 (1991). [0132] 7. Peeters, M. et
al. Familial aggregation in Crohn's disease: increased age-adjusted
risk and concordance in clinical characteristics. Gastroenterology
111, 597-603 (1996). [0133] 8. Yang, H. et al. Familial empirical
risks for inflammatory bowel disease: differences between Jews and
non-Jews. Gut 34, 517-24 (1993). [0134] 9. Orholm, M., Binder, V.,
Sorensen, T. I., Rasmussen, L. P. & Kyvik, K. O. Concordance of
inflammatory bowel disease among Danish twins. Results of a
nationwide study. Scand J Gastroenterol 35, 1075-81 (2000). [0135]
10. Annese, V. et al. Familial expression of anti-Saccharomyces
cerevisiae Mannan antibodies in Crohn's disease and ulcerative
colitis: a GISC study. Am J Gastroenterol 96, 2407-12 (2001).
[0136] 11. Bayless, T. M. Maintenance therapy for Crohn's disease.
Gastroenterology 110, 299-302 (1996). [0137] 12. Peeters, M.,
Cortot, A., Vermeire, S. & Colombel, J. F. Familial and
sporadic inflammatory bowel disease: different entities? Inflamm
Bowel Dis 6, 314-20 (2000). [0138] 13. Mathew, C. G. & Lewis,
C. M. Genetics of inflammatory bowel disease: progress and
prospects. Hum Mol Genet 13 Spec No 1, R161-8 (2004). [0139] 14.
Hugot, J. P. et al. Association of NOD2 leucine-rich repeat
variants with susceptibility to Crohn's disease. Nature 411,
599-603 (2001). [0140] 15. Ogura, Y. et al. A frameshift mutation
in NOD2 associated with susceptibility to Crohn's disease. Nature
411, 603-6 (2001). [0141] 16. Hampe, J. et al. Association between
insertion mutation in NOD2 gene and Crohn's disease in German and
British populations. Lancet 357, 1925-8 (2001). [0142] 17. Rioux,
J. D. et al. Genetic variation in the 5q31 cytokine gene cluster
confers susceptibility to Crohn disease. Nat Genet 29, 223-8
(2001). [0143] 18. Mirza, M. M. et al. Genetic evidence for
interaction of the 5q31 cytokine locus and the CARD15 gene in Crohn
disease. Am J Hum Genet 72, 1018-22 (2003). [0144] 19. Peltekova,
V. D. et al. Functional variants of OCTN cation transporter genes
are associated with Crohn disease. Nat Genet 36, 471-5 (2004).
[0145] 20. Duerr, R. H. et al. A genome-wide association study
identifies IL23R as an inflammatory bowel disease gene. Science
314, 1461-3 (2006). [0146] 21. Baldassano, R. N. et al. Association
of Variants of the Interleukin-23 Receptor Gene With Susceptibility
to Pediatric Crohn's Disease. Clin Gastroenterol Hepatol 5, 972-976
(2007). [0147] 22. Hampe, J. et al. A genome-wide association scan
of nonsynonymous SNPs identifies a susceptibility variant for Crohn
disease in ATG16L1. Nat Genet 39, 207-211 (2007). [0148] 23. Rioux,
J. D. et al. Genome-wide association study identifies new
susceptibility loci for Crohn disease and implicates autophagy in
disease pathogenesis. Nat Genet 39, 596-604 (2007). [0149] 24.
Wellcome Trust Case Control Consortium. Genome-wide association
study of 14,000 cases of seven common diseases and 3,000 shared
controls. Nature 447, 661-78 (2007). [0150] 25. Libioulle, C. et
al. Novel Crohn disease locus identified by genome-wide association
maps to a gene desert on 5p13.1 and modulates expression of PTGER4.
PLoS Genet 3, e58 (2007). [0151] 26. Singh, S. B., Davis, A. S.,
Taylor, G. A. & Deretic, V. Human IRGM induces autophagy to
eliminate intracellular mycobacteria. Science 313, 1438-41 (2006).
[0152] 27. Parkes, M. et al. Sequence variants in the autophagy
gene IRGM and multiple other replicating loci contribute to Crohn's
disease susceptibility. Nat Genet 39, 830-2 (2007). [0153] 28.
Baldassano, R. N. et al. Association of the T300A non-synonymous
variant of the ATG16L1 gene with susceptibility to paediatric
Crohn's disease. Gut 56, 1171-3 (2007). [0154] 29. Gunderson, K.
L., Steemers, F. J., Lee, G., Mendoza, L. G. & Chee, M. S. A
genome-wide scalable SNP genotyping assay using microarray
technology. Nat Genet 37, 549-54 (2005). [0155] 30. Ina, K. et al.
Resistance of Crohn's disease T cells to multiple apoptotic signals
is associated with a Bcl-2/Bax mucosal imbalance. J Immunol 163,
1081-90 (1999). [0156] 31. Prescott, N. J. et al. A nonsynonymous
SNP in ATG16L1 predisposes to ileal Crohn's disease and is
independent of CARD15 and IBD5. Gastroenterology 132, 1665-71
(2007). [0157] 32. Satsangi, J. et al. Contribution of genes of the
major histocompatibility complex to susceptibility and disease
phenotype in inflammatory bowel disease. Lancet 347, 1212-7 (1996).
[0158] 33. Bousvaros, A. et al. Differentiating ulcerative colitis
from Crohn disease in children and young adults: report of a
working group of the North American Society for Pediatric
Gastroenterology, Hepatology, and Nutrition and the Crohn's and
Colitis Foundation of America. J Pediatr Gastroenterol Nutr 44,
653-74 (2007). [0159] 34. Silverberg, M. S. et al. Toward an
integrated clinical, molecular and serological classification of
inflammatory bowel disease: Report of a Working Party of the 2005
Montreal World Congress of Gastroenterology. Can J Gastroenterol 19
Suppl A, 5-36 (2005). [0160] 35. Steemers, F. J. et al.
Whole-genome genotyping with the single-base extension assay. Nat
Methods 3, 31-3 (2006). [0161] 36. Hakonarson, H. et al. A
genome-wide association study identifies KIAA0350 as a type 1
diabetes gene. Nature 448, 591-594 (2007). [0162] 37. Satsangi, J.,
Silverberg, M. S., Vermeire, S. & Colombel, J. F. The Montreal
classification of inflammatory bowel disease: controversies,
consensus, and implications. Gut 55, 749-53 (2006).
EXAMPLE III
[0163] In the present example, we report results from a GWA study
conducted on a large cohort of pediatric onset IBD subjects
ascertained through international collaboration, which has led to
the identification of several additional novel IBD loci and to the
replication of previously reported loci, thereby allowing us to
develop a genetic risk model for pediatric-onset IBD aimed at
future prediction of disease susceptibility.
[0164] The following materials and methods are provided to
facilitate the practice of the present example.
Participants
[0165] The pediatric IBD discovery case cohort (Table 9) consisted
of 2413 Caucasian patients (1637 with CD, 723 with UC and 53 with
IBD-U) recruited from multiple centers from 4 geographically
discrete countries (Table 10) that met the study's quality control
criteria and were successfully matched with disease-free control
subjects from the United States (see details below).
[0166] All patients were diagnosed
prior to their 19th birthday and fulfilled standard IBD diagnostic
criteria. Family history of IBD was obtained with focus on first
degree relatives. A patient was considered to be of Jewish heritage
when at least 2 grandparents were known to be Jewish. Phenotypic
characterization was based on a modification of the Montreal
classification such that the definitions of L1 & L3 were both
extended to include disease within the small bowel proximal to the
terminal ileum and distal to the ligament of Treitz. Disease above
the ligament of Treitz was recorded separately; perianal disease
included only those patients with perianal abscess and/or fistula.
"Isolated Colonic IBD" included all patients with disease limited
to the colon (723 with UC, 53 with IBD-U, and 402 with Colonic CD).
The term `very early onset disease` was applied to cases where the
diagnosis was made at or prior to 8 years of age (Table 11). The
Research Ethics Board of the respective Hospitals and other
participating centers approved the study, and written informed
consent was obtained from all subjects. A sub-group of IBD patients
employed in this study (1011 patients, including 647 with CD, 317
with UC and 47 with inflammatory bowel disease type unclassified
(IBD-U)) was utilized in a previous IBD GWA analysis reporting on two
novel IBD loci on chromosome 20q13 and 21q22 (11); however, only novel and
non-overlapping loci are being described in this manuscript (Table
12).
[0167] The control group was recruited by CHOP clinicians, nursing
and medical assistant staff within the CHOP Health Care Network,
which includes primary care clinics and outpatient practices. The
control subjects did not have IBD or evidence of chronic disease
based on self-reported intake questionnaire or clinician-based
assessment. The Research Ethics Board of CHOP approved the study,
and written informed consent was obtained from all subjects.
Genotyping
[0168] We performed high throughput genome-wide SNP genotyping,
using the Illumina Infinium.TM. II HumanHap550 BeadChip technology
(Illumina, San Diego), at the Center for Applied Genomics at CHOP,
as previously described in Examples I and II. Following genotyping,
we excluded 251 IBD samples with greater than 2% missing genotypes.
We used the program STRUCTURE to exclude a further 316 patients
with less than 95% European ancestry based on ancestry informative
markers (14).
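The missing-genotype exclusion described above can be sketched as a per-sample filter. This is a minimal illustration, assuming genotype calls are held in memory with None marking a no-call; the data layout, function names and toy samples are hypothetical and do not reflect the BeadStudio export format.

```python
def missing_rate(genotypes):
    """Fraction of missing calls in one sample; None marks a no-call."""
    if not genotypes:
        raise ValueError("sample has no genotype calls")
    return sum(g is None for g in genotypes) / len(genotypes)

def filter_samples(samples, max_missing=0.02):
    """Keep samples whose missing-genotype rate does not exceed max_missing."""
    return {sample_id: calls for sample_id, calls in samples.items()
            if missing_rate(calls) <= max_missing}

# Hypothetical toy data: 100 calls per sample.
samples = {
    "S1": ["AA"] * 99 + [None],       # 1% missing, kept
    "S2": ["AB"] * 95 + [None] * 5,   # 5% missing, excluded
}
kept = filter_samples(samples)
print(sorted(kept))
```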
TABLE-US-00010 TABLE 9 Study recruitment, subsequent inclusion, and
ultimate demographic and phenotypic characteristics of Caucasian
subjects with matched controls who were included in the association
study (n = 2413)
Characteristic | IBD [n] | CD [n] | UC [n] | IBD-U [n] | Isolated Colonic IBD [n]
Recruited for Study
Total number of Subjects | 3370 | 2304 | 993 | 73 | n/a
Subjects meeting Quality Control Criteria (inc Caucasian Ethnicity)
Total number of Subjects | 2784 | 1887 | 835 | 62 | n/a
Subjects Ultimately Matched and included in Association Analysis
Total number of Subjects | 2413 | 1637 | 723 | 53 | 1178
Male | 1273 (52.7%) | 927 (56.6%) | 321 (44.3%) | 25 (47.2%) | 567 (48.1%)
Median Age at Diagnosis (IQR) | 12 yrs (9-14.2) | 12 yrs (10-14) | 12 yrs (8-15) | 10.25 yrs (7-13.5) | 12 yrs (8-14)
Patient Subgroups
Age at Dx </= 8 yrs | 489 | 265 | 205 | 19 | 321
1.degree. Familial Hx (Valid %).sup.1 | 289 (14%) | 215 (15.5%) | 63 (10.2%) | 11 (21%) | 130 (12.4%)
Known Jewish Heritage (Valid %).sup.2 | 223 (9.6%) | 161 (10.3%) | 57 (8.1%) | 5 (9.8%) | 98 (8.5%)
CD Anatomic Location.sup.3
Isolated Small Bowel Disease (Valid %) | 297 (20%)
Isolated Colonic Disease (Valid %) | 402 (27.2%)
Small Bowel Colon Disease (Valid %) | 769 (52%)
Any Perianal Disease.sup.5 (Valid %) | 312 (21.4%)
UC Disease Extent.sup.4
Extensive Disease (Valid %) | 394 (70%)
Left-Sided Disease (Valid %) | 168 (30%)
CD Disease Behaviour.sup.6
Fibrostenotic | 187 (15.7%)
Internally Penetrating | 190 (15.9%)
.sup.1Family Hx details not available in 14% of cases
.sup.2Jewish Heritage unknown in 4% of cases
.sup.3Seven cases had disease isolated to the upper tract, one case had disease isolated to the perianal region. Complete disease location data unavailable in 10% of CD cases
.sup.4Details of disease extent unavailable in 22% of UC cases
.sup.5Details of perianal disease unavailable in 11% of CD cases
.sup.6Details of disease behaviour at latest review unavailable in 27% of CD cases
TABLE-US-00011 TABLE 10 Geographic Distribution of Caucasian
Subjects with Matched Controls who were included in the Association
Study (n = 2413)
Country | Able to be Matched to Controls
Italy | 322
Scotland | 374
Canada | 528
United States | 1189
TOTAL | 2413
TABLE-US-00012 TABLE 11 Demographic and Phenotypic Characteristics
of the sub-group of matched Caucasian Subjects included in the
Association Study who were diagnosed with IBD at or before 8 years
of age (n = 489)
Characteristic | IBD | CD | UC | IBD-U | Isolated Colonic IBD
Total number of Subjects | 489 | 265 | 205 | 19 | 321
Male | 266 (54.5%) | 155 (58.7%) | 100 (48.8%) | 11 (57.9%) | 160 (49.8%)
Median Age at Diagnosis (IQR) | 6 yrs (4 to 8) | 6.5 yrs (4 to 7.5) | 6 yrs (4 to 7.5) | 6 yrs (3 to 7.5) | 6 yrs (4 to 7.4)
1.degree. Familial Hx (Valid %).sup.1 | 62 (14.9%) | 44 (19.4%) | 13 (7.6%) | 5 (26%) | 36 (13%)
Known Jewish Heritage.sup.2 | 59 (12.6%) | 32 (12.8%) | 23 (11.5%) | 4 (21%) | 37 (11.7%)
CD Anatomic Location.sup.3
Isolated Small Bowel Disease (Valid %) | 18 (7.5%)
Isolated Colonic Disease (Valid %) | 97 (40.4%)
Small Bowel Colon Disease (Valid %) | 124 (51.7%)
Any Perianal Disease.sup.5 (Valid %) | 56 (23.5%)
UC Disease Extent.sup.4
Extensive Disease (Valid %) | 113 (70%)
Left-Sided Disease (Valid %) | 47 (30%)
CD Disease Behaviour.sup.6
Fibrostenotic | 27 (13.5%)
Internally Penetrating | 25 (12.5%)
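TABLE-US-00013 TABLE 12 Discovery cohort sizes and filtering
Step | Kugathasan et al (11) CD | Kugathasan et al (11) UC | Kugathasan et al (11) IBD | Consortium CD | Consortium UC | Consortium IBD | All CD | All UC | All IBD | Controls
QC Filtered | 647 | 317 | 1011 | 1241 | 548 | 1677 | 1888 | 865 | 2688 | 7315
Eigenmatched | 606 | 308 | 903 | 966 | 470 | 1510 | 1689 | 778 | 2413 | 6197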
Genetic Matching
[0169] We performed eigen-matching to minimize population
stratification arising from differing geographic origins between
our Caucasian cases and controls. Eigen-matching uses singular
value decomposition of genotypic data to match cases to their
closest controls in the space of k principal components. This
approach is a variant of a method recently published by Luca et al
(15), however in contrast to the outlined method, we employ
matching as a criterion to filter patients for subsequent case
control analyses. Unlike EIGENSTRAT, a common approach to correct
for the effects of stratification by adjusting genotype values,
eigen-matching removes samples from both cases and controls that
are responsible for stratification.
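The matching step can be sketched in simplified form. The example below is not the authors' implementation: it assumes the principal-component coordinates have already been computed (for example by singular value decomposition of the genotype matrix), and it uses a plain Euclidean distance cutoff as the matching rule. The function name, the cutoff and the toy coordinates are all hypothetical.

```python
import math

def match_cases_to_controls(case_pcs, control_pcs, max_dist):
    """Distance-threshold matching in principal-component space.

    case_pcs and control_pcs map sample IDs to coordinate tuples on the
    first k principal components. A case is retained if at least one
    control lies within max_dist of it; a control is retained if it lies
    within max_dist of at least one case. Unmatched samples are dropped.
    """
    kept_cases, kept_controls = set(), set()
    for case_id, case_xy in case_pcs.items():
        for ctrl_id, ctrl_xy in control_pcs.items():
            if math.dist(case_xy, ctrl_xy) <= max_dist:
                kept_cases.add(case_id)
                kept_controls.add(ctrl_id)
    return kept_cases, kept_controls

# Hypothetical coordinates on k = 2 principal components.
cases = {"case1": (0.0, 0.0), "case2": (5.0, 5.0)}
controls = {"ctrl1": (0.1, 0.0), "ctrl2": (-3.0, 4.0)}
kept_cases, kept_controls = match_cases_to_controls(cases, controls, 0.5)
print(kept_cases, kept_controls)
```

Filtering in this way discards outlying samples from both groups instead of adjusting genotype values, which is the contrast with EIGENSTRAT drawn above.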
[0170] Our final discovery cohort following matching consisted of
2413 patients and 6197 controls, which included 1689 CD cases and
778 UC cases (each of which included 53 IBD-U cases). Contained in
this cohort were 205 very early-onset UC and 251 (16) very
early-onset CD cases (each including 15 IBD-U cases). A summary of
the number of recruited patients who met quality control and
genetic matching criteria for study inclusion is shown in Table 9.
Association Analysis
[0171] All tests of association were carried out using PLINK (17)
with standard criteria for SNP quality control filtering yielding
500,606 SNPs. Given a conservative estimate of 500,606 independent
hypotheses, we determined genome-wide significance with a
Bonferroni-corrected P-value threshold of 1.0.times.10.sup.-7. We
also examined nominal signals below a P-value threshold of
1.times.10.sup.-6. We excluded 73, 45, and 4 SNPs at or below the
suggestive P-value threshold due to genotyping error in the IBD,
CD, and UC analyses, respectively. We applied the same
quality-control criterion to filter results obtained for very-early
onset, familial, colon-only, and CD/UC without IBD-U analyses. All
resulting loci with P<0.0001 for CD, UC, IBD and their
sub-analyses are included as Supplementary Data.
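The genome-wide threshold follows from dividing a family-wise error rate by the number of tests. The sketch below assumes an error rate of 0.05, which the text does not state explicitly but which reproduces the reported threshold; the function names and tier labels are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test P-value threshold controlling the family-wise error rate."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests

def classify_signal(p_value, genome_wide=1.0e-7, suggestive=1.0e-6):
    """Label a P-value using the two thresholds given in the text."""
    if p_value < genome_wide:
        return "genome-wide significant"
    if p_value < suggestive:
        return "suggestive"
    return "not reported"

# 0.05 / 500,606 is approximately 1.0e-7.
threshold = bonferroni_threshold(0.05, 500_606)
print(f"{threshold:.2e}")
print(classify_signal(1.27e-8), "|", classify_signal(4.5e-7))
```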
Replication Experiments
[0172] We leveraged results from the previously reported CD
meta-analysis (1), which combined data from three scans, totaling
3,230 cases and 4,829 controls, in order to attempt to replicate
our observed signals from the association analyses. Since the
replication cohort we had access to did not include a separate
cohort of patients with UC, we have focused the replication
analysis on the CD and IBD-combined signals. However, an
independent cohort of 60 UC trios, recruited at the Boston
Children's Hospital, was available for replication analysis of the
UC signal observed in subjects with disease onset less than 8 years
of age. Details regarding replication cohort genotyping are
included in the supplementary methods.
Gene Expression Analysis
[0173] We examined allele specific effects on gene expression for
significantly associating loci by assaying total RNA in genotyped
lymphoblast cell lines. We also compared gene expression levels
between colonic biopsy specimens obtained from pediatric IBD cases
and normal controls to detect disease specific gene expression
differences.
[0174] To evaluate allele specific effects on gene expression at
the IL27 locus for the rs1968752 variant (A/A genotype: NA10835,
NA10854, NA10860, NA12006, NA12056 and the C/C genotype: NA12144,
NA12155, NA12760, NA06993, NA07029), RNA was isolated from
HapMap CEU population samples using Trizol (Invitrogen). Real-time
RT PCR was performed on a Bio-Rad iCycler System using SYBR Green
detection (Bio-Rad). cDNA template was made from 2 .mu.g of total
RNA using the Invitrogen cDNA Synthesis kit. Primer sequences were
designed using Integrated DNA Technologies (IDT). Beta-actin was
used as the control gene. Primer sequences and GenBank accession
numbers for the genes selected for PCR validation are as follows:
IL27 (NM_145659, 149 bp): Forward: 5'-TGATGTTTCCCTGACCTTCCAGG-3';
Reverse: 5'-ACAGCTGCATCCTCTCCATGTT-3'. Beta-actin
(NM_001101, 138 bp): Forward: 5'-TCAGAAGGATTCCTATGTGGGCGA-3';
Reverse: 5'-CACACGCAGCTCATTGTAGAAGGT-3'. Each reaction was carried
out in triplicate wells on one plate. Fold change between A/A and
C/C genotype was calculated with the comparative C.sub.T method.
Results were normalized to beta-actin for cDNA quantification
differences. Data were analyzed using ANOVA. We additionally
examined allele-specific effects on expression of the TLR locus
(TLR1, TLR6, and TLR10) in these same cell lines and in colonic
biopsy specimens from pediatric patients with CD and UC in
comparison with healthy controls. For the latter experiments,
biotinylated cRNA was hybridized to the Affymetrix GeneChip HG-U133
Plus 2.0 arrays, containing probes for approximately 22,634 genes
at the CCHMC Digestive Health Center Microarray Core. The images
were captured using Affymetrix GeneChip Scanner 3000. Data were
normalized to allow for array to array comparisons, and differences
between groups were detected in GeneSpring.TM. with significance at
the 0.05 level relative to healthy control samples using analysis
of variance and Newman-Keuls multiple comparison test.
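The comparative C.sub.T calculation referred to above is commonly written as fold change = 2^(-ddCt), where dCt is the target C.sub.T minus the reference-gene C.sub.T and ddCt is the difference in dCt between the two groups. The sketch below assumes roughly 100% amplification efficiency and averages replicate wells before subtraction; the function name and the C.sub.T values are hypothetical and are not data from this study.

```python
from statistics import mean

def fold_change_ddct(target_ct_test, ref_ct_test, target_ct_calib, ref_ct_calib):
    """Relative expression of the test group versus the calibrator group.

    Each argument is a list of replicate Ct values. The reference gene
    (here beta-actin) normalizes for differences in cDNA input.
    """
    d_ct_test = mean(target_ct_test) - mean(ref_ct_test)
    d_ct_calib = mean(target_ct_calib) - mean(ref_ct_calib)
    dd_ct = d_ct_test - d_ct_calib
    return 2.0 ** (-dd_ct)

# Hypothetical triplicate Ct values for illustration only.
fold = fold_change_ddct(
    target_ct_test=[28.0, 28.0, 28.0],   # target gene, test genotype
    ref_ct_test=[18.0, 18.0, 18.0],      # beta-actin, test genotype
    target_ct_calib=[26.0, 26.0, 26.0],  # target gene, calibrator genotype
    ref_ct_calib=[18.0, 18.0, 18.0],     # beta-actin, calibrator genotype
)
print(fold)
```

A result below 1 indicates lower expression in the test group than in the calibrator group.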
Risk Modeling
[0175] Cumulative risk models were constructed for CD, UC, and IBD
in a similar fashion to those recently reported in non-insulin
dependent diabetes (16, 18, 19). Each model was built using
previously described loci that were significant in our analysis as
well as for novel loci identified by our study. This corresponded
to 30 loci in CD, 17 loci in UC, and 37 loci in IBD. For each
locus, the risk allele was designated as the allele that yielded an
OR>1. At each locus, each individual could thus have 0, 1 or 2
risk alleles. A genotype score representing risk allele burden for
UC, CD, and IBD was computed for each individual in the study as
the total number of risk alleles across all loci in the respective
model.
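The genotype score described above is a simple count of risk alleles across the loci in a model. The sketch below assumes genotypes are stored as two-character strings per locus; the data layout, the function name and the second locus ("locusX" with risk allele T) are hypothetical, while the A risk allele at rs1968752 follows the text.

```python
def genotype_score(genotypes, risk_alleles):
    """Total number of risk alleles carried across all loci in the model.

    genotypes: dict mapping locus ID to a two-character genotype, e.g. "AG".
    risk_alleles: dict mapping locus ID to the allele with OR > 1.
    Each locus contributes 0, 1 or 2 to the score.
    """
    score = 0
    for locus, risk in risk_alleles.items():
        score += genotypes[locus].count(risk)
    return score

# Hypothetical two-locus model and individual.
risk_alleles = {"rs1968752": "A", "locusX": "T"}
individual = {"rs1968752": "AA", "locusX": "CT"}
score = genotype_score(individual, risk_alleles)
print(score)
```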
[0176] Given a distribution of genotype scores in our case and
control populations, we computed odds ratios for disease with
respect to a reference group for each model. In this regard, we set
a threshold score to yield a reference group comprising the lowest
7-10 percentile in the study population. This corresponded to
thresholds of 23, 13, and 28 risk alleles for the CD, UC, and IBD
models, respectively. Similarly, we defined a "high score" group as
comprising the upper 7-10 percentile of the risk allele burden
distribution for each diagnosis. This corresponded to thresholds of
34, 20, and 40 risk alleles for the CD, UC, and IBD models,
respectively.
[0177] For each model, we
assigned the remaining patients into risk groups defined by each
unique value of the genotype score between the "low score" and
"high score" group thresholds. For a given risk group
(corresponding to a genotypic score), the odds ratio and its
confidence interval were computed as a function of the number of
cases/controls in that group and the number of cases/controls in
the reference group. We also used logistic regression to quantify
the degree of additional risk conferred by each genotypic score
increment. We set up the regression employing the odds ratio as the
dependent and the genotypic score as the independent variable. The
slope of the resulting linear fit corresponds to an estimate of
marginal risk conferred by each risk allele burden increment.
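The per-group odds ratio relative to the reference group can be sketched from the four counts involved. The text does not state how the confidence interval was obtained; the example assumes the standard Wald interval on the log odds ratio. The function name and the counts are hypothetical.

```python
import math

def odds_ratio_vs_reference(cases, controls, ref_cases, ref_controls, z=1.96):
    """Odds ratio of a risk group against the reference group, with a
    Wald confidence interval computed on the log scale (z=1.96 for 95%)."""
    if min(cases, controls, ref_cases, ref_controls) <= 0:
        raise ValueError("all counts must be positive")
    odds_ratio = (cases * ref_controls) / (controls * ref_cases)
    std_err = math.sqrt(1 / cases + 1 / controls
                        + 1 / ref_cases + 1 / ref_controls)
    lower = math.exp(math.log(odds_ratio) - z * std_err)
    upper = math.exp(math.log(odds_ratio) + z * std_err)
    return odds_ratio, lower, upper

# Hypothetical counts: risk group 120 cases / 200 controls,
# reference group 60 cases / 300 controls.
odds_ratio, lower, upper = odds_ratio_vs_reference(120, 200, 60, 300)
print(f"OR = {odds_ratio:.2f} [{lower:.2f}-{upper:.2f}]")
```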
RESULTS
[0178] To detect significantly associated susceptibility alleles,
we compared single-marker allele frequencies using .chi..sup.2
statistics on SNPs with a minor allele frequency greater than 1%
and with Hardy-Weinberg equilibrium P<10.sup.-5. Plots of
association results are shown in FIG. 4.
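A single-marker allelic test of this kind can be sketched as a chi-square test on a 2x2 table of allele counts (minor and major alleles in cases and controls). This is an illustrative calculation, not the PLINK implementation; the function name and counts are hypothetical. The P-value uses the identity that the upper tail of a 1-degree-of-freedom chi-square at x equals erfc(sqrt(x/2)).

```python
import math

def allelic_chi_square(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square (1 df, no continuity correction) and its P-value
    for a 2x2 table of allele counts."""
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical allele counts for illustration only.
chi2, p_value = allelic_chi_square(30, 70, 20, 80)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```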
Crohn's Disease
[0179] Our CD analysis yielded one novel locus at the genome-wide
significant threshold (P<1.0.times.10.sup.-7) and three novel
loci at the suggestive significant level (P<1.times.10.sup.-6;
Table 13). Of these three signals, two were further corroborated by
in silico analysis of the independent CD-meta analysis data set
(P<0.05 after correcting for three independent tests). These
replicating CD loci reside on 16p11 and 5q15, respectively (Table
13).
TABLE-US-00014 TABLE 13 Novel genome wide significant (P < 1
.times. 10.sup.-7) and suggestive (P < 1 .times. 10.sup.-6)
putative CD loci identified in this GWA scan.
band | MB | Genes | CD Discovery (1689) SNP | CD Discovery P | Aff | Unaff | OR | CD meta analysis SNP | CD meta analysis P
6p21.33 | 31.60-31.67 | BAT1, LST1, LTA, LTB, NCR3, NFKBIL1, TNF | rs2844480 | 3.71E-07 | 0.24 | 0.20 | 1.27 [1.16-1.39] | rs2844482 | 1.02E-01
Loci highlighted in bold italics were independently replicated in a
large adult CD cohort. Z scores in the meta analysis cohort represent
directions of effect of the minor allele, with positive (negative)
Z-scores conferring risk (protection). Criteria for determining bounds
of region of association are described in the Methods.
indicates data missing or illegible when filed
[0180] The most significant SNP in the LD block harboring the 16p11
signal, rs1968752, yielded a P=1.27.times.10.sup.-8, with its minor
A allele conferring risk (OR=1.26 [1.16-1.36]). In the CD
meta-analysis dataset, an LD proxy for this SNP, rs4788084
(r.sup.2=0.83), was found to associate with CD (P=0.0035, OR=1.13). This
LD block contains multiple genes, including IL27, CCDC101, CLN3,
EIF3C, NUPR1 and SULT1A1, of which the most plausible candidate for
CD pathogenesis is IL27, an immunomodulatory cytokine that is
posited to regulate adaptive immune responses. To determine if
IL27 expression varied according to genotype, we compared IL27,
CCDC101, CLN3, and EIF3C expression levels in lymphoblastoid cell
lines obtained from 10 homozygous individuals with either the AA or
GG rs1968752 genotype. We detected a several-fold decrease in IL27
gene expression in individuals with the AA genotype relative to
those with GG (FIG. 5A), suggesting that this SNP may exert a
potent regulatory effect on IL27 gene expression (P=0.0031). Unlike
IL27, expression effects were not observed for the other genes at
this locus (FIG. 5B). Measuring IL27 colonic gene expression in 37
CD and 13 control samples, we detected significantly reduced
expression in CD when compared to normal tissue (P=0.028) (FIG.
6).
[0181] With respect to the 5q15 association signal, it resides in
an LD block harboring two genes: LNPEP and LRAP. The primary SNP in
this region, rs10044354, associated with CD at a P-value of
4.5.times.10.sup.-7 and OR=1.22 [1.13-1.31]. Since this SNP is not
contained in the meta-analysis dataset, we corroborated this result
with an LD proxy SNP (rs27302; r.sup.2=0.932), which associates with CD
in the discovery dataset with P=3.843.times.10.sup.-6 and OR=1.19
and replicates in the meta-analysis (P=0.0028, OR=1.09). We did not
observe allele specific changes in LNPEP/LRAP gene expression in
lymphoblastoid cell lines based on the genotype of these SNPs. We
also did not observe a difference in LNPEP/LRAP gene expression
between normal and Crohn's Disease colonic biopsies (data not
shown).
[0182] In addition to the discovery of IL27 and LNPEP/LRAP as novel
CD loci, we also sought evidence of association with previously
reported adult-onset CD signals (Table 14). Of the 32 CD loci
implicated by meta-analysis, 28 showed nominal evidence of
replication and 21 were significant at a Bonferroni-adjusted P value
of 0.05 (adjusting for 32 hypotheses). Eleven of these previously
reported loci, including IL23R, NOD2, IL12B, and ATG16L1, were
genome-wide significant (P<1.0.times.10.sup.-7) in our pediatric
IBD cohort. Of the eight CD loci shown to be nominally significant
in the previously reported CD meta-analysis, we observed
association for three (P value<0.00625) (1). These were the
IL18R1-IL18RAP locus on 2q12 (rs917997 P=2.23.times.10.sup.-6,
OR=1.23 [1.13-1.34]), the C-C motif chemokine (CCL) gene cluster on
17q12 (rs991804, P=1.05.times.10.sup.-4, OR=0.84 [0.77-0.92]) and
the CCDC139 locus on 2p16 (rs13003464, P=2.81.times.10.sup.-3,
OR=1.12 [1.04-1.22]). In addition, when examining previously
reported UC signals in our CD cohort, we detected association to
the recently identified UC gene, IL10 on 1q32.1, suggesting that
this locus may also play a role in CD susceptibility (rs3024505,
P=1.0.times.10.sup.-4, OR=1.22 [1.11-1.36]) (Table 15).
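The Bonferroni adjustments referred to in this paragraph amount to dividing the family-wise significance level by the number of hypotheses tested. The following minimal sketch uses hypothetical helper names; the P values are those quoted above for the three nominal CD loci, and the threshold of 0.05/8 = 0.00625 is the one stated in the text.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test P value threshold for a family-wise error rate alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def passes_bonferroni(p_value, alpha, n_tests):
    """True if p_value is below the Bonferroni-adjusted threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# P values quoted in the text for the three nominal CD loci (8 hypotheses).
nominal_loci = {
    "rs917997": 2.23e-6,
    "rs991804": 1.05e-4,
    "rs13003464": 2.81e-3,
}
for snp, p in nominal_loci.items():
    print(snp, passes_bonferroni(p, alpha=0.05, n_tests=8))
```

The same helper applied with 32 hypotheses gives the stricter per-locus threshold used for the previously reported CD loci.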
TABLE-US-00015 TABLE 14 48 previously identified IBD loci examined
by our study, including 8 loci having nominal evidence for
association with IBD/CD/UC in previous studies and 2 loci published
on a subset of the current cohort (asterisk). CD Small Small All
Bowel Colonic Bowel + (1689) (297) (402) Colon (769) (1) (2) (3)
(4) band MB Genes SNP P OR OR OR OR .cndot. 1p13.2 114.18 rs2476601
5.61E-06 0.71 0.66 0.80 0.72 [0.62-0.13] [0.47-0.92] [0.62-1.05]
[0.58-0.88] .cndot. .cndot. 1p31.3 67.48 IL23R rs11465804 2.10E-14
0.45 0.43 0.47 0.47 [0.36-0.55] [0.26-0.70] [0.31-0.70] [0.35-0.63]
.cndot. 1q21.2 148.75 rs13294 7.20E-01 1.01 0.89 1.06 1.07
[0.94-1.10] [0.76-1.06] [0.92-1.23] [0.96-1.19] .cndot. 1q23.3
159.12 OR10J1 rs2274910 3.87E-01 0.96 1.09 0.94 0.95 [0.89-1.05]
[0.92-1.30] [0.80-1.10] [0.84-1.06] .cndot. 1q24.3 171.13 rs9286879
3.81E-05 1.20 1.19 1.20 1.23 [1.10-1.30] [0.99-1.43] [1.03-1.41]
[1.10-1.39] .cndot. 1q32.1 199.25 rs12122721 1.48E-01 0.94 0.78
1.02 0.95 [0.86-1.02] [0.64-0.95] [0.87-1.20] [0.84-1.07] .cndot.
1q32.1 205.01 rs3024505 1.01E-04 1.22 1.05 1.36 1.22 [1.11-1.36]
[0.83-1.32] [1.13-1.64] [1.06-1.40] .cndot. 2p16.1 61.04 AHSA2,
CCDC139, rs13003464 2.81E-03 1.12 1.26 1.08 1.12 PEX13, USP34,
[1.04-1.22] [1.06-1.48] [0.93-1.25] [1.01-1.25] PUS10 .cndot.
2p23.3 27.59 rs780094 2.56E-01 1.05 1.22 0.99 1.06 [0.97-1.13]
[1.03-1.43] [0.86-1.15] [0.95-1.18] .cndot. 2q12.1 102.44 IL18R1,
rs917997 2.23E-06 1.23 1.23 1.27 1.19 IL18RAP, [1.13-1.34]
[1.02-1.48] [1.08-1.49] [1.05-1.34] .cndot. 2q35 218.77 rs6752254
7.43E-01 0.99 1.04 0.97 0.95 [0.91-1.07] [0.88-1.22] [0.84-1.12]
[0.86-1.06] .cndot. 2q37.1 233.85 DGKD rs2241880 1.57E-17 0.71 0.59
0.89 0.69 [0.66-0.77] [0.50-0.71] [0.77-1.03] [0.62-0.77] .cndot.
3p12.1 85.84 rs7611991 6.87E-01 0.98 0.99 0.98 0.99 [0.90-1.07]
[0.82-1.20] [0.84-1.16] [0.87-1.11] .cndot. .cndot. 3p21.31 49.70
MST1 rs3197999 3.48E-08 1.26 1.40 1.28 1.18 [1.16-1.36] [1.17-1.67]
[1.10-1.49] [1.05-1.32] .cndot. .cndot. 5p13.1 40.43 rs4613763
1.68E-05 1.28 1.42 0.96 1.34 [1.14-1.43] [1.13-1.78] [0.77-1.21]
[1.16-1.56] .cndot. 5q13.3 76.18 F2RL1, S100Z rs7724915 3.74E-01
0.94 0.98 0.75 1.04 [0.81-1.08] [0.72-1.35] [0.55-1.01] [0.85-1.27]
.cndot. .cndot. 5q31.1 131.80 rs2188962 2.72E-06 1.20 1.20 1.16
1.32 [1.11-1.30] [1.01-1.41] [1.01-1.34] [1.19-1.47] .cndot. 5q33.1
150.25 ZNF300 rs7714584 1.46E-03 1.22 1.45 1.09 1.26 [1.08-1.38]
[1.14-1.85] [0.86-1.38] [1.06-1.48] .cndot. .cndot. 5q33.3 158.75
rs10045431 6.55E-07 0.80 0.70 0.87 0.80 [0.73-0.87] [0.58-0.86]
[0.74-1.02] [0.71-0.90] .cndot. .cndot. 6p21.32 32.54 BTNL2,
SLC26A3, rs2395185 5.07E-02 0.92 1.08 0.85 0.95 HLA-DRB1,
[0.85-1.00] [0.91-1.29] [0.73-0.99] [0.84-1.06] HLA-DQA1 .cndot.
6p21.32 32.69 rs660895 2.38E-04 0.83 0.91 0.78 0.84 [0.75-0.92]
[0.74-1.13] [0.65-0.95] [0.73-0.97] .cndot. 6p22.3 20.84 CDKAL1
rs6908425 2.40E-02 0.90 0.84 0.94 0.87 [0.81-0.99] [0.68-1.04]
[0.79-1.13] [0.76-1.00] .cndot. 6p25.1 5.10 rs12529198 7.73E-01
0.98 1.11 0.99 0.85 [0.84-1.13] [0.82-1.51] [0.75-1.30] [0.68-1.06]
.cndot. 6p25.2 3.38 C6orf85 rs4959832 8.60E-01 0.99 1.04 1.03 0.92
[0.92-1.07] [0.88-1.24] [0.89-1.19] [0.83-1.03] .cndot. 6q21 106.58
rs6938089 2.56E-02 1.10 1.14 1.17 1.05 [1.01-1.19] [0.96-1.36]
[1.01-1.36] [0.94-1.18] .cndot. 6q25.1 149.62 rs7758080 9.21E-01
1.00 0.94 0.98 1.03 [0.92-1.09] [0.78-1.13] [0.83-1.14] [0.92-1.16]
.cndot. 6q27 167.36 rs2301436 3.36E-02 1.09 1.07 1.18 1.04
[1.01-1.17] [0.91-1.27] [1.02-1.36] [0.94-1.16] .cndot. 7p12.2
50.24 ZPBP rs1456893 5.10E-05 0.84 0.86 0.76 0.85 [0.77-0.91]
[0.71-1.03] [0.64-0.89] [0.75-0.95] .cndot. 8q24.13 126.61
rs1551398 1.26E-06 0.82 0.83 0.89 0.76 [0.76-0.89] [0.70-0.99]
[0.76-1.03] [0.68-0.85] .cndot. .cndot. 9p24.1 4.97 INSL6, JAK2
rs10758669 2.71E-04 1.16 1.29 1.26 1.11 [1.07-1.25] [1.09-1.52]
[1.09-1.46] [1.00-1.24] .cndot. 9q32 116.60 rs6478108 8.43E-08 0.79
0.88 0.80 0.74 [0.73-0.86] [0.74-1.05] [0.68-0.94] [0.66-0.84]
.cndot. .cndot. 10p11.21 35.43 CCNY, CREM, rs4934724 6.97E-05 1.17
1.25 1.17 1.16 CUL2 [1.08-1.27] [1.06-1.48] [1.01-1.35] [1.04-1.29]
.cndot. .cndot. 10q21.2 64.07 rs10995250 1.16E-06 1.21 1.02 1.12
1.34 [1.12-1.31] [0.86-1.21] [0.97-1.29] [1.20-1.49] .cndot.
.cndot. 10q24.2 101.28 NKX2-3 rs11190140 4.43E-09 1.26 1.31 1.15
1.28 [1.16-1.36] [1.11-1.55] [1.00-1.33] [1.15-1.42] .cndot.
11q13.5 75.95 rs7130588 4.90E-03 1.12 1.16 1.08 1.15 [1.03-1.21]
[0.98-1.37] [0.93-1.25] [1.03-1.28] .cndot. 12q12 38.67 LRRK2,
rs11174631 7.24E-05 1.43 1.51 1.20 1.53 SLC2A13 [1.20-1.70] [1.05-2.17]
[0.85-1.69] [1.21-1.93] .cndot. 13q14.11 43.36 rs3764147
1.10E-04 1.18 1.29 1.07 1.18 [1.09-1.29] [1.08-1.54] [0.91-1.26]
[1.05-1.33] .cndot. 15q13.1 26.20 HERC2, OCA2 rs1667394 4.66E-01
0.97 1.08 0.95 0.94 [0.88-1.06] [0.88-1.32] [0.80-1.14] [0.82-1.07]
.cndot. 17q12 29.61 rs991804 1.05E-04 0.84 0.75 0.94 0.83
[0.77-0.92] [0.61-0.91] [0.80-1.11] [0.73-0.94] .cndot. 17q12 35.29
ORMDL3 rs2872507 2.32E-03 1.13 1.08 1.16 1.12 [1.04-1.21]
[0.92-1.28] [1.01-1.34] [1.01-1.25] .cndot. .cndot. 17q21.2 37.77
rs744166 4.32E-02 0.92 0.96 0.91 0.88 [0.85-1.00] [0.81-1.13]
[0.79-1.06] [0.79-0.96] .cndot. .cndot. 18p11.21 12.80 PTPN2
rs1893217 4.86E-04 1.20 1.45 1.09 1.20 [1.08-1.32] [1.18-1.78]
[0.90-1.32] [1.04-1.38] .cndot. 18q11.2 17.93 rs8098673 5.62E-02
1.06 1.38 0.95 1.02 [1.00-1.17] [1.17-1.63] [0.82-1.10] [0.91-1.14]
.cndot. 19p13.3 1.08 SBNO2 rs2024092 2.46E-02 1.11 1.04 1.13 1.16
[1.01-1.22] [0.85-1.27] [0.95-1.34] [1.02-1.31] 20q13.33 1.68-61.8
rs2315008 5.13E-05 0.84 0.84 0.82 0.84 [0.77-0.91] [0.70-1.01]
[0.70-0.97] [0.75-0.95] .cndot. 21q21.1 15.74 rs1736148 1.62E-04
0.85 0.79 0.93 0.86 [0.80-0.93] [0.66-0.94] [0.81-1.08] [0.77-0.96]
21q22.2 9.39-39.3 rs2836878 2.28E-06 0.81 0.93 0.79 0.82
[0.74-0.88] [0.77-1.12] [0.66-0.93] [0.72-0.93] UC IBD All All
Colonic (777) (2413) (1178) (1) (2) (3) (4) band P OR P OR OR
.cndot. 1p13.2 6.44E-01 0.96 1.82E-04 0.79 0.90 [0.60-1.15]
[0.70-0.90] [0.77-1.06] .cndot. .cndot. 1p31.3 5.30E-04 0.64
7.33E-15 0.51 0.58 [0.49-0.82] [0.43-0.61] [0.46-0.72] .cndot.
1q21.2 9.27E-02 1.10 3.02E-01 1.04 1.08 [0.98-1.22] [0.97-1.11]
[0.99-1.19] .cndot. 1q23.3 4.89E-01 0.96 3.22E-01 0.96 0.95
[0.86-1.08] [0.90-1.04] [0.87-1.05] .cndot. 1q24.3 9.81E-01 1.00
5.84E-04 1.14 1.07 [0.69-1.13] [1.06-1.23] [0.97-1.18] .cndot.
1q32.1 3.45E-02 0.88 3.59E-02 0.92 0.93 [0.78-0.99] [0.86-0.99]
[0.84-1.02] .cndot. 1q32.1 1.12E-03 1.26 2.57E-06 1.24 1.29
[1.10-1.45] [1.13-1.35] [1.15-1.45] .cndot. 2p16.1 1.47E-01 1.08
1.50E-03 1.12 1.08 [0.97-1.21] [1.04-1.19] [0.99-1.18] .cndot.
2p23.3 1.09E-02 1.15 2.15E-02 1.08 1.09 [1.03-1.28] [1.01-1.16]
[1.00-1.19] .cndot. 2q12.1 1.56E-01 1.09 7.09E-06 1.19 1.15
[0.97-1.23] [1.10-1.29] [1.04-1.27] .cndot. 2q35 2.93E-02 0.89
1.64E-01 0.95 0.92 [0.80-0.99] [0.89-1.02] [0.84-1.00] .cndot.
2q37.1 4.97E-01 0.96 7.37E-12 0.79 0.94 [0.87-1.07] [0.74-0.84]
[0.86-1.03] .cndot. 3p12.1 2.32E-03 0.82 7.97E-02 0.93 0.88
[0.72-0.93] [0.86-1.01] [0.79-0.97] .cndot. .cndot. 3p21.31
9.43E-04 1.21 1.77E-09 1.25 1.23 [1.08-1.35] [1.16-1.34]
[1.12-1.35] .cndot. .cndot. 5p13.1 2.14E-02 1.20 1.46E-05 1.24 1.12
[1.03-1.40] [1.13-1.37] [0.98-1.28] .cndot. 5q13.3 7.21E-02 1.19
8.28E-01 1.01 1.03 [0.98-1.43] [0.89-1.15] [0.88-1.22] .cndot.
.cndot. 5q31.1 8.94E-01 0.99 3.36E-04 1.13 1.05 [0.89-1.10]
[1.06-1.21] [0.96-1.15] .cndot. 5q33.1 1.08E-01 1.15 8.02E-04 1.20
1.13 [0.97-1.36] [1.08-1.34] [0.98-1.31] .cndot. .cndot. 5q33.3
2.67E-05 0.77 2.93E-09 0.79 0.80 [0.68-0.87] [0.73-0.86]
[0.72-0.89] .cndot. .cndot. 6p21.32 1.59E-21 0.57 6.59E-09 0.81
0.66 [0.50-0.64] [0.75-0.87] [0.59-0.73] .cndot. 6p21.32 7.21E-13
0.57 2.60E-11 0.74 0.64 [0.48-0.66] [0.67-0.81] [0.56-0.72] .cndot.
6p22.3 9.62E-02 0.89 6.73E-03 0.89 0.91 [0.78-1.02] [0.82-0.97]
[0.82-1.02] .cndot. 6p25.1 7.29E-01 0.96 6.35E-01 0.97 0.97
[0.78-1.19] [0.85-1.10] [0.82-1.15] .cndot. 6p25.2 7.25E-01 0.98
6.45E-01 0.98 1.00 [0.88-1.09] [0.92-1.05] [0.91-1.09] .cndot. 6q21
8.19E-01 1.01 6.76E-02 1.07 1.06 [0.91-1.13] [1.00-1.15]
[0.97-1.17] .cndot. 6q25.1 4.47E-01 1.05 6.72E-01 1.02 1.02
[0.93-1.17] [0.94-1.09] [0.93-1.13] .cndot. 6q27 6.49E-01 1.02
5.30E-02 1.07 1.08 [0.92-1.14] [1.00-1.14] [0.99-1.17] .cndot.
7p12.2 7.94E-01 1.02 7.50E-04 0.88 0.92 [0.91-1.14] [0.82-0.95]
[0.84-1.02] .cndot. 8q24.13 8.77E-01 1.01 1.84E-05 0.88 0.97
[0.91-1.12] [0.82-0.84] [0.88-1.06] .cndot. .cndot. 9p24.1 1.70E-02
1.14 3.89E-05 1.16 1.18 [1.02-1.27] [1.08-1.24] [1.08-1.29] .cndot.
9q32 2.57E-04 0.80 6.61E-10 0.79 0.80 [0.71-0.90] [0.74-0.85]
[0.73-0.88] .cndot. .cndot. 10p11.21 3.42E-02 1.13 2.08E-05 1.16
1.14 [1.01-1.26] [1.08-1.24] [1.04-1.25] .cndot. .cndot. 10q21.2
1.71E-01 1.08 3.61E-06 1.17 1.09 [0.97-1.20] [1.10-1.26]
[1.00-1.19] .cndot. .cndot. 10q24.2 9.71E-07 1.30 1.93E-12 1.27
1.25 [1.17-1.45] [1.19-1.36] [1.14-1.36] .cndot. 11q13.5 4.07E-02
1.12 1.33E-03 1.12 1.11 [1.00-1.25] [1.04-1.20] [1.01-1.21] .cndot.
12q12 8.32E-01 0.97 1.35E-03 1.30 1.05 [0.73-1.28] [1.11-1.53]
[0.83-1.31] .cndot. 13q14.11 8.40E-01 1.01 1.27E-03 1.13 1.03
[0.90-1.14] [1.05-1.22] [0.93-1.14] .cndot. 15q13.1 2.39E-01 1.08
8.89E-01 1.01 1.04 [0.95-1.23] [0.93-1.09] [0.93-1.15] .cndot.
17q12 6.18E-03 0.84 8.11E-06 0.64 0.88 [0.75-0.95] [0.78-0.91]
[0.79-0.97] .cndot. 17q12 6.60E-04 1.20 5.01E-05 1.15 1.19
[1.08-1.34] [1.07-1.23] [1.09-1.30] .cndot. .cndot. 17q21.2
1.20E-01 0.92 1.53E-02 0.92 0.92 [0.82-1.02] [0.86-0.98]
[0.84-1.00] .cndot. .cndot. 18p11.21 3.12E-01 1.08 1.69E-03 1.15
1.08 [0.93-1.24] [1.06-1.26] [0.96-1.22] .cndot. 18q11.2 3.05E-01
1.06 3.70E-02 1.08 1.02 [0.95-1.18] [1.00-1.15] [0.93-1.12] .cndot.
19p13.3 5.90E-02 1.13 6.16E-03 1.12 1.13 [1.00-1.28] [1.03-1.21]
[1.02-1.25] 20q13.33 1.97E-03 0.82 3.03E-07 0.82 0.82 ***
[0.73-0.92] [0.77-0.89] [0.74-0.91] .cndot. 21q21.1 5.05E-02 0.90
2.08E-04 0.88 0.91 [0.81-1.00] [0.82-0.94] [0.83-1.00] 21q22.2
1.67E-09 0.67 3.23E-11 0.77 0.71 *** [0.59-0.76] [0.71-0.83]
[0.64-0.79] Filled circles in the first four columns of the table
specify whether the given row represents a (1) known CD locus, (2)
putative/nominal CD locus, (3) known UC locus, and/or (4)
putative/nominal UC locus, respectively. We replicate 21 of 32
known CD loci, 8 of 15 known UC loci, and overall 26 of 38 known
IBD loci. Loci replicating at a Bonferroni-corrected P < .05
are denoted in bold. Our data also implicate several previously
described CD loci as having association with UC (bold italics). We
also verify 3 nominally associating SNPs from the recent CD
meta-analysis (bold italics). indicates data missing or illegible
when filed
TABLE-US-00016 TABLE 15 8 previously identified IBD loci examined
by our study that were either (a) previously nominal signals that
are verified by our data or (b) signals previously shown to have an
effect on UC (CD) and found by our study to have an effect on CD
(UC). CD UC IBD All All All (1689) (777) (2413) (1) (2) (3) (4)
band MB Genes SNP P OR P OR P OR .cndot. .cndot. 1p31.3 67.48
rs11465804 2.10E-14 0.45 5.38E-04 0.64 1.33E-15 0.51 [0.36-0.55]
[0.49-0.82] [0.43-0.51] .cndot. 1q32.1 205.01 RBBP5, rs3024505
1.01E-04 1.22 1.12E-03 1.26 2.57E-06 1.24 RIPK5 [1.11-1.36]
[1.10-1.45] [1.13-1.35] .cndot. 2p16.1 61.34 rs13003464 1.12
1.47E-01 1.08 1.50E-03 1.12 [1.04-1.22] [0.97-1.21] [1.04-1.19]
.cndot. 2q12.1 102.44 IL16R1, rs917997 2.23E-06 1.23 1.56E-01 1.09
7.09E-06 1.19 IL15RAP, [1.13-1.34] [0.97-1.23] [1.10-1.29] .cndot.
9q32 116.00 rs6476108 6.43E-08 0.79 2.67E-04 0.80 5.61E-10 0.79
[0.73-0.86] [0.71-0.90] [0.74-0.85] .cndot. 17q12 29.61 CCL11,
rs991804 1.05E-04 0.84 6.18E-03 0.84 8.11E-06 0.84 CCL2,
[0.77-0.92] [0.75-0.95] [0.78-0.91] CCL7 .cndot. 17q12 35.29
rs2872507 2.32E-03 1.13 6.60E-04 1.20 5.91E-05 1.15 [1.04-1.21]
[1.08-1.34] [1.07-1.23] .cndot. 21q22.3 44.44 ICOSLG1 rs762421
1.78E-07 1.23 7.29E-05 1.24 1.83E-09 1.23 [1.14-1.33] [1.12-1.38]
[1.15-1.32] Filled circles in the first four columns of the table
specify whether the given row represents a (1) known CD locus, (2)
putative/nominal CD locus, (3) known UC locus, and/or (4)
putative/nominal UC locus, respectively. Overall, we replicate 21
of 32 known CD loci, 8 of 15 known UC loci, and 26 of 38 known IBD
loci. Loci replicating at a Bonferroni-corrected P < .05 are
denoted in bold, and novel significant effects are denoted in bold
italics. indicates data missing or illegible when filed
[0183] Taken together, our results are in keeping with our
hypothesis that genome wide analysis of early-onset cases is well
suited to detect novel CD loci, and the concordance of our results
with published CD analyses indicates that there may be many
commonalities in the genetic pathogenesis of adult and early onset
CD.
Ulcerative Colitis
[0184] In the UC analysis, we uncovered three loci with genome-wide
significant P-values (P<1.0.times.10.sup.-7) and five additional
loci attaining suggestive significance (P<1.times.10.sup.-6)
levels in the discovery cohort (Table 16). We detected association
to the previously reported 1 Mb stretch of the MHC region on 6p21
encompassing multiple HLA genes (HLA-DOB, -DQA1, -DQA2, -DRA,
-DRB1, -DRB5) as well as to the 10q24 locus containing the NKX2-3
gene. The third signal resides on 21q22 in an LD block containing
the genes BRWD1 and PSMG1, which we previously reported in IBD and
independently replicated in the publicly available CD dataset
from WTCCC (11). Here, we observe a robust association with UC
alone (rs2836878, P=1.67.times.10.sup.-9, OR=0.67 [0.59-0.76])
suggesting that this locus may have a more primary role in the
pathogenesis of UC.
TABLE-US-00017 TABLE 16 Novel genome wide significant (P < 1
.times. 10.sup.-7) and suggestive (P < 1 .times. 10.sup.-6)
putative UC loci identified in this GWA scan. UC Discovery (777)
band MB Genes SNP P Aff Unaff OR 18q12.2 32.22-32.25 FHCOO3,
rs7226236 9.72E-08 0.17 0.22 0. 6 MOCOS [0.59-0.79] 16q21
57.06-57.07 NDRG4 rs16960173 1.70E-07 0.34 0.28 1.35 [1.20-1.51] 1
q25.3 116.17-115.26 HABP2, NRAP rs12360212 2.15E-07 0.20 0.24 1.36
[1.21-1.53] 6p21.33 31.43-31.88 BAT1, LST1, rs3749946 4.56E-07 0.14
0.09 1.50 LTA, LTB, [1.28-1.75] NCR3, NFKBIL1, TNF 2q37.3
241.21-241.42 AQP12A rs4676410 5.60E-07 0.24 0.18 1.38 [1.22-1.56]
UC Replication CD meta analysis (60 trios) band SNP P Z SNP P T U
OR 18q12.2 rs732 236 3.31E-01 0.97 rs2 2 8.69E-01 18 19 0.95
[0.59-1.85] 16q21 rs16960173 6.18E-01 0.50 rs16960170 1.00E-00 2 2
1 [0.33-7.1] 1 q25.3 rs16885460 6.27E-01 -0.49 rs12360212 1.17E-01
21 12 1.75 [0.74-2.5 ] 6p21.33 rs3749946 2.07E-05 -4.26 rs3749946
8.66E-01 17 18 0.94 [0.68-1.89] 2q37.3 rs4676406 1.51E-01 1.44
rs4676410 2.28E-01 2 18 1.44 [0.7 -2.14] Criteria for determining
bounds of region of association are described in the Methods.
indicates data missing or illegible when filed
[0185] We also sought to follow up on all previously reported
adult-onset UC signals (Table 14). Of the 15 previously identified
UC loci, 11 showed nominal evidence of replication and 8 were
significant to a Bonferroni adjusted P value of 0.05 (adjusting for
15 hypotheses, nominal P<0.0033). These include loci already
well established in UC, such as IL23R on 1p31, as well as more
recently identified loci like IL10 on 1q32 and CADM2 on 3p12.
Examining known CD signals in our UC cohort uncovered three loci
that have not been previously associated with UC susceptibility:
ICOSLG on 21q22, TNFSF15 on 9q32, and ORMDL3 on 17q12 (Table
15).
Inflammatory Bowel Disease
[0186] We combined the CD and UC datasets to obtain a composite and
more highly powered IBD cohort. Although we did not identify any
new loci at the genome-wide significance threshold of
P<1.0.times.10.sup.-7, we uncovered 3 novel candidate loci at
the suggestive P-value threshold of <1.times.10.sup.-6. One of
these signals corresponds to the 16p11 CD locus already discussed
above. The second novel and replicating IBD locus resides on
chromosome 22q12. The risk-conferring minor allele for rs2412973
(P=9.99.times.10.sup.-7; OR=1.18 [1.10-1.26]) replicated in the
independent meta-analysis data (P=0.000953, OR=1.17). This SNP
resides inside the HORMA domain containing 2 (HORMAD2) gene, an ORF
with a Gene Ontology annotation for `mitosis`; the HORMA domain is
a common structural denominator in mitotic checkpoints, chromosome
synapsis and DNA repair. Other neighboring genes in the LD block
include myotubularin-related protein 3 (MTMR3), which is 50 kb
upstream of rs2412973 and encodes a protein phosphatase. Leukemia
inhibitory factor (LIF) resides 100 kb downstream of the LD block
and encodes a cytokine that stimulates
differentiation in leukocytes. The third novel and replicating IBD
locus at the suggestive significance level resides on 15q22. This
locus is highlighted by the SNP rs16950687 (P=6.67.times.10.sup.-7,
OR=1.20 [1.12-1.29]), which replicates in the meta-analysis data
set (P=0.0287, OR=1.10). This SNP lies in an LD block containing
the genes SMAD3, a TGF.beta. activated transcriptional modulator,
and IQCH, a protein thought to have a regulatory role in
spermatogenesis. We did not observe allele specific changes in
HORMAD2 or SMAD3 lymphoblastoid cell line gene expression based on
the genotype of these respective SNPs. We also did not observe a
difference in expression for these genes between normal and Crohn's
disease colonic biopsies (data not shown). The remaining IBD loci
did not replicate in the CD meta-analysis cohort. Our most
significant IBD signals are summarized in Table 17.
TABLE-US-00018 TABLE 17 Novel genome wide significant (P < 1
.times. 10.sup.-7) and suggestive (P < 1 .times. 10.sup.-6)
putative IBD loci identified in this GWA scan.
                                IBD Discovery (2413)                                 CD meta analysis
band      MB             Genes  SNP         P         Aff   Unaff  OR                SNP         P         Z
6q24.21   128.25-128.28         rs2456449   1.86E-07  0.30  0.34   0.83 [0.77-0.89]  rs2456449   2.33E-01  1.19
16p11.2   28.74-28.81    IL27   rs8049439   2.37E-07  0.41  0.37   1.20 [1.12-1.28]  rs8049439   4.96E-03  2.81
6p21.33   31.3 -31.67           rs2844482   5.76E-07  0.19  0.16   1.25 [1.14-1.36]  rs2844482   1.02E-01  1.63
15q22.33  65.25-65.26    SMAD3  rs16950687  6.67E-07  0.31  0.27   1.20 [1.12-1.29]  rs16950687  2.87E-02  2.19
22q12.2   28.75-28.86           rs2412973   9.99E-07  0.50  0.46   1.18 [1.10-1.26]  rs2412973   9.53E-04  3.30
Loci highlighted in bold italics were independently replicated in a
large adult CD cohort. Z scores in the meta analysis cohort represent
directions of effect of the minor allele, with positive (negative)
Z-scores conferring risk (protection). Criteria for determining bounds
of region of association are described in the Methods. indicates data
missing or illegible when filed
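The Z and P columns of the CD meta analysis portion of Table 17 are related through the standard normal distribution. The sketch below, offered under the assumption that the tabulated P values are two-sided, converts a Z score to a P value; the function name is hypothetical. For Z = 2.81 it yields a value of about 0.005, in line with the tabulated 4.96E-03, with small differences attributable to rounding of Z.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided P value for a standard normal Z score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# (SNP, Z, tabulated P) taken from the CD meta analysis columns of Table 17.
rows = [
    ("rs8049439", 2.81, 4.96e-3),
    ("rs16950687", 2.19, 2.87e-2),
    ("rs2412973", 3.30, 9.53e-4),
]
for snp, z, tabulated_p in rows:
    print(snp, f"computed={two_sided_p_from_z(z):.3g}", f"tabulated={tabulated_p:.3g}")
```

The sign of Z is discarded in the conversion; as the table footnote states, it carries the direction of effect of the minor allele.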
Very Early Onset IBD
[0187] Given the potential for a genetic enrichment of very
early-onset pediatric IBD cases (22), we re-analyzed the data
including only cases with age of onset of IBD prior to 8 years of
age. This analysis included 466 combined IBD, 266 CD only, and 205
UC only cases. In the UC analysis, we found a cluster of signals
encompassing three genes in the toll-like receptor gene family
(TLR1, TLR6, and TLR10) (Table 18). This interval contains two
independent sets of variants: SNPs with risk-conferring minor
alleles that associate with ORs between 1.49 and 1.59, and SNPs with
protective minor alleles that associate with ORs between 0.56 and
0.62. One SNP in this region, rs4833103, falls below the
Bonferroni-adjusted threshold for genome-wide significance
(P=1.805.times.10.sup.-8, OR=0.56 [0.46-0.69]), with other SNPs
being supportive. A chart of minor allele frequencies demonstrates
the age-dependence of the minor allele frequency of this SNP, which
averages 0.35 for patients with onset between ages one and eight
and rises to approximately 0.45 for older pediatric UC patients (Table
19).
[0188] Among SNPs with risk conferring minor alleles, the most
significant association was with rs10030125
(P=2.76.times.10.sup.-6, OR=1.589).
TABLE-US-00019 TABLE 18 Early onset UC loci
Early Onset UC (205 Cases, 6197 Controls)
REGION  Band     MB           Genes                  SNP         TopP      Aff   Unaff  OR
1       4p14     38.26-38.59  TLR1, TLR6, TLR10      rs4833103   1.81E-08  0.35  0.49   0.56 (0.46-0.68)
2       6p21.32  32.54-32.94  Multiple (MHC region)  rs9271568   1.12E-07  0.18  0.31   0.51 (0.39-0.65)
3       13q22.1  73.80-73.82                         rs10492494  2.21E-07  0.17  0.10   1.97 (1.52-2.55)
TABLE-US-00020 TABLE 19 rs4833103 MAF in UC by age
Age (yo)       1-2   3-4   5-6   7-8   9-10  11-12  13-14  15-16  17-18
rs4833103 MAF  0.33  0.34  0.38  0.35  0.46  0.48   0.45   0.44   0.46
n              18    46    58    92    105   122    140    117    68
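For orientation, an allelic odds ratio can be approximated directly from the minor allele frequencies in affected (Aff) and unaffected (Unaff) individuals reported in Table 18. The sketch below is an assumption about how such a value may be derived and is not a statement of the method actually used; the function name is hypothetical. Because the tabulated frequencies are rounded to two decimals, the result agrees with the reported OR only approximately for some rows.

```python
def odds_ratio_from_frequencies(freq_affected, freq_unaffected):
    """Allelic odds ratio from minor allele frequencies in cases and controls."""
    for freq in (freq_affected, freq_unaffected):
        if not 0.0 < freq < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    odds_affected = freq_affected / (1.0 - freq_affected)
    odds_unaffected = freq_unaffected / (1.0 - freq_unaffected)
    return odds_affected / odds_unaffected


# rs4833103 in Table 18: Aff = 0.35, Unaff = 0.49, reported OR = 0.56.
print(round(odds_ratio_from_frequencies(0.35, 0.49), 2))
```

A value below 1 corresponds to a protective minor allele, consistent with the description of rs4833103 in the text.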
[0189] To replicate this result, we employed a
small family based cohort of 60 pediatric UC trios with a normal
age of onset distribution. We genotyped rs10030125 and an LD
surrogate, rs4240248, (r.sup.2=0.58) which in the discovery cohort
had shown nominal association with a risk conferring effect
(P=1.7.times.10.sup.-4, OR=1.45). While genotyping of rs10030125
failed, using the transmission disequilibrium test on this small
replication cohort, we found rs4240248 to associate with UC
(P=0.008 and OR=2.19) in this independent data set.
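The transmission disequilibrium test referred to above compares, among heterozygous parents, the number of times an allele is transmitted (T) to an affected child with the number of times it is not (U). A minimal sketch follows, assuming the standard McNemar-type statistic with one degree of freedom; the function name is hypothetical. As a check, the counts T=21 and U=12 from the T and U columns of the UC replication table above give an odds ratio of 1.75 and a P value of approximately 0.117, matching the tabulated entries for rs12360212.

```python
import math


def transmission_disequilibrium_test(transmitted, untransmitted):
    """TDT on transmissions from heterozygous parents.

    Returns the chi-square statistic (1 df), its P value, and the
    transmission odds ratio T/U.
    """
    if transmitted + untransmitted == 0:
        raise ValueError("at least one informative transmission is required")
    chi2 = (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = transmitted / untransmitted if untransmitted else math.inf
    return chi2, p_value, odds_ratio


chi2, p, odds_ratio = transmission_disequilibrium_test(21, 12)
print(f"chi2={chi2:.3f}, P={p:.3g}, OR={odds_ratio:.2f}")
```

With only 60 trios, few parents are informative at any one SNP, which is why the confidence intervals in the replication columns are wide.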
[0190] To further address the potential biological role of the TLR
locus in early onset UC, we examined the expression of the genes in
this locus, TLR1, TLR6 and TLR10, in the same cell lines as for the
IL27 locus as well as in colonic biopsy specimens obtained from
normal subjects and patients with UC. Unlike the allele-specific
effects observed on IL27 expression, we did not detect
allele-specific effects on the TLR gene expression in
lymphoblastoid cell lines (data not shown). However, gene
expression analysis in colonic biopsies demonstrated that the
transcription of TLR1, TLR6, and TLR10 genes is significantly
enhanced in UC samples relative to normal (Student's t-test
P<0.05) (FIG. 7). Taken together, our association findings, when
coupled with these expression data, suggest that functional
differences in pathways associated with this cluster of Toll-like
receptors may contribute to UC pathogenesis, in particular to the
very-early onset disease. Extended analysis of very early-onset UC,
CD, and IBD cohorts did not yield any further genome-wide
significant loci.
Risk Modeling
[0191] We evaluated IBD risk in individuals carrying different
numbers of risk variants. We conducted separate analyses for CD and
UC and for IBD combined. For the CD analysis, we examined risk
alleles from 30 replicating loci in our study. Individuals in this
cohort carried between 14 and 41 (out of 60 possible) risk alleles,
with a case/control frequency distribution as shown in FIG. 8a.
FIG. 8d demonstrates OR for disease as a function of genotypic
score. Analysis of this plot revealed that the OR for CD increases
on average by 28% with each increment in the genotypic score above
23. Furthermore, the group of children containing 34 or more risk
alleles (comprising the top 3rd percentile of genotypic score) had
more than 13 fold increased risk (OR=13.1 [9.4-18.2]) of developing
CD. We performed a similar analysis on the UC subcohort, using risk
alleles from 17 replicating loci in our study. Individuals in our
cohort carried between 7 and 24 (out of 34 possible) risk alleles,
with a frequency distribution as shown in FIG. 8b, yielding
estimates of cumulative risk as shown in FIG. 8e. In this model,
each increment in the genotypic score above 14 increased cumulative
UC risk by 36% (on average) to a maximum odds ratio of 7.4
[5.1-10.8]. Finally, we combined CD and UC risk variants to build an
IBD cumulative risk model employing 37 total loci and 74 total risk
alleles. FIG. 8c shows the frequency distribution of genotypic
score among our 2413 IBD patients relative to the cohort of
controls. According to this risk model, plotted in FIG. 8f, each
additional risk allele increases the odds ratio of IBD by an
average of 46%, with the top 3.sup.rd percentile of individuals
having over 12 fold risk of IBD (OR=12.6 [9.5-16.8]) with respect
to the reference group. These results demonstrate that common
variants that individually provide relatively small alteration of
disease susceptibility can combine to have a dramatic influence on
disease risk. This suggests that SNPs discovered in this study and
in previous studies have future potential to be incorporated into
high-dimensional molecular panels that can be used in clinical
diagnosis and management.
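The genotypic score used in the risk modeling above can be understood as a simple count of risk alleles across the selected loci, so that 30 loci allow at most 60 risk alleles, 17 loci at most 34, and 37 loci at most 74. The following sketch is illustrative only: the function names, the example genotypes, and the case and control counts are hypothetical, and the choice of reference group and the Woolf (log odds ratio) confidence interval are assumptions rather than the procedure described in the Methods.

```python
import math


def genotypic_score(risk_allele_counts):
    """Total number of risk alleles carried, given 0, 1 or 2 copies per locus."""
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each locus must carry 0, 1 or 2 risk alleles")
    return sum(risk_allele_counts)


def max_score(n_loci):
    """Largest possible score for a diploid genome: two alleles per locus."""
    return 2 * n_loci


def odds_ratio_with_ci(cases_group, controls_group, cases_ref, controls_ref, z=1.96):
    """Odds ratio of a score group versus a reference group, with a Woolf CI."""
    odds_ratio = (cases_group * controls_ref) / (controls_group * cases_ref)
    se_log_or = math.sqrt(
        1.0 / cases_group + 1.0 / controls_group + 1.0 / cases_ref + 1.0 / controls_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical individual genotyped at four loci.
print(genotypic_score([0, 1, 2, 2]), "of", max_score(4))

# Hypothetical counts for a high-score group versus a reference group.
print(odds_ratio_with_ci(40, 10, 20, 20))
```

Grouping individuals by score and repeating the odds ratio calculation for each group against a common reference would give a curve of the kind described for FIGS. 8d to 8f.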
Extended CD and UC Analyses
[0192] We performed a separate analysis on CD cases excluding
patients with the IBD-U diagnosis, yielding 1637 total cases. This
analysis uncovered one additional CD signal on 1q22. This signal,
highlighted by rs3180018 showed suggestive significance in our
discovery cohort (P=6.times.10.sup.-7, OR=1.24 [1.14-1.36]). An LD
surrogate for rs3180018, rs1052176, nominally replicated in the CD
meta analysis (P=0.02, OR=1.11). This SNP lies in the gene SCAMP-3,
a carrier protein that participates in post-Golgi recycling
pathways.
[0193] We also note that a comparable analysis of UC cases
excluding patients with the IBD-U diagnosis, yielding 723 total
cases, did not reveal any novel associations apart from those
listed in the manuscript.
Familial IBD
[0194] Given the significant environmental component of IBD,
enrichment of the cohort for individuals that have at least one
affected first-degree relative has the potential to reveal novel
genetic factors mediating IBD susceptibility. Alternatively, IBD
cases that cluster in families may represent a specific genetic
subtype characterized by a unique set of markers. Of the 2413 cases
in our discovery cohort, 289 (14%) have at least one first degree
relative (sibling or parent) with IBD. A genome wide analysis on
this subset of the cohort revealed only a single locus near
genome-wide significance on 16q21 (rs5743289,
P=3.31.times.10.sup.-7, OR=1.64 [1.35-1.98]), corresponding to the
well characterized IBD gene NOD2. The evidence for NOD2, one of the
earliest identified IBD susceptibility loci, was initially obtained
from the study of families with at least two affected siblings
(51). It is noteworthy that our analysis of rs5743289 revealed a
weaker association with IBD in the portion of our cohort with
sporadic (i.e. non-familial) disease (P=9.06.times.10.sup.-7,
OR=1.24 [1.14-1.35]). Furthermore, comparison of rs5743289 minor
allele frequencies between familial and sporadic IBD cases revealed
a significant difference between the two groups (P=0.006),
suggesting that NOD2 may be a marker for familial disease.
Colonic IBD Analysis
[0195] A separate analysis was performed employing 1178 Colonic IBD
cases (including 723 UC, 402 Crohn's, and 53 IBD-U cases) against
our control dataset. This analysis revealed several previously
identified UC loci at the genome-wide level of significance but did
not reveal any novel loci: an 800 KB region of association in the
MHC locus on 6p21, 21q22 (near the PSMG1 gene), and the IL23R locus
on 1p31. In addition, known IBD loci on 10q24 (NKX2-3) and 5q33
(IL12B) were found at the nominal significance level. We observed
several previously uncharacterized loci at the nominal level of
significance, including rs12360212 (P=3.7.times.10.sup.-7, OR=1.29
[1.17-1.42]) on 10q25 near HABP and NRAP, rs7228236
(P=4.5.times.10.sup.-7, OR=0.75 [0.67-0.84]) on 18q12 near FHOD and
MOCOS, and rs4676410 (P=6.6.times.10.sup.-7, OR=1.31 [1.18-1.46]) on
2q37 in the GPR35 gene. These loci were also detected in our
UC-only analysis, which contains a subset of these patients.
Replication in independent cohorts is difficult due to the
uniqueness of these phenotypes in pediatric cases.
DISCUSSION
[0196] We have assembled a unique cohort of patients with
early-onset IBD from centers in Europe and North America for
genome-wide association. In this population, we have identified 5
novel susceptibility loci for pediatric IBD on chromosomes 4p14,
5q15, 6p21, 16p11, and 22q12, and replicated 26 of 38 previously
reported IBD loci. For two of these loci, IL27 and the
TLR1/TLR6/TLR10 cluster, we provide additional expression data
demonstrating significantly altered gene expression that lend
further support to the role of these genes in pediatric onset
IBD.
[0197] The results of our current study add new insight into the
pathogenic mechanisms mediating early onset IBD and the interface
between early-onset and adult-onset disease. Our findings suggest
that molecular events in early-onset disease closely parallel
molecular mechanisms in adult IBD. Our discovery of the TLR locus
in very-early onset UC suggests that there may also be pathways
specific to childhood IBD. Multiple genes involved in innate
immunity have already been implicated in IBD, including NOD2, IRGM,
and ATG16L1. Loci discovered by our study further crystallize the
link between inflammation and the innate/adaptive immune system in
the pathogenesis of IBD. Examination of the immune physiology
underlying these loci provides intriguing links to genes discovered
by previous IBD genome scans and compelling directions for further
investigation.
[0198] Our discovery of IL27 on 16p11 as a CD susceptibility gene
strengthens connections between CD pathogenesis and the
dysregulation of the T.sub.H-17 cell lineage. Genetic variants within
IL-23R, IL-12B, STAT3, and JAK2 loci all affect the same lineage,
and have been associated with susceptibility to both CD and UC.
T.sub.H-17 cells are a recently characterized pro-inflammatory
lineage of effector T-cells that are implicated in the pathogenesis
of multiple auto-immune/inflammatory diseases, including rheumatoid
arthritis, multiple sclerosis, lupus, and asthma (23, 24). The IL27
gene has been the subject of several recent studies examining its
role as an in vivo inhibitor of innate and adaptive immunity. Mice
deficient in the IL27 receptor have heightened immune responses
that are associated with upregulation of multiple T-cell lineages.
Furthermore, IL27ra-/- mice demonstrate increased inflammation in
response to inoculation with helminthic and intracellular pathogens
and are more susceptible to experimental induction of auto-immune
colitis, hepatitis, encephalitis, and allergic asthma (25-32). A
recent study linked anti-inflammatory effects of IL27 in mice to
suppression of the T-helper (T.sub.H-17) cell response, mediated
through STAT-1 activation and antagonism of IL-6 (26, 33). IL27
mediated immune suppression has also been linked to the modulation
of regulatory T-cells. In a recent study, Awasthi et al.
demonstrated that IL27 mediates differentiation of CD4+ T-cells
into Tr1 regulatory T-cells (34). It is worth noting that our study
is not the first to link IL27 to auto-immune disease
susceptibility; variants at this locus have been linked to asthma
susceptibility in a recent study performed on a Korean population
(35). Our data demonstrate a profound effect of genotypic
variation at the IL27 locus on IL27 gene expression in
lymphoblastoid cell lines, thereby implicating this gene
in CD pathogenesis.
[0199] Our study revealed an interval on 5q15 that associates with
both early- and adult-onset CD, with the data in our discovery
cohort achieving genome-wide significance. Of the two genes in the LD
block containing this interval, LRAP presents a more obvious
candidate for CD immunopathogenesis: it encodes a leukocyte-derived
arginine aminopeptidase that cleaves MHC class I presented antigen
peptides and is upregulated by interferon gamma (36, 37).
[0200] The IBD susceptibility locus we have identified on 15q22
resides in the LD neighborhood (r.sup.2>0.2) of SMAD3, another
gene providing a link between T-cell dysregulation and CD
susceptibility. SMAD3 (along with other SMADs) mediates the signal
transduction of TGF.beta., a cytokine that pleiotropically affects
proliferation, differentiation, and survival in multiple cell types
(38). In the intestinal mucosa, TGF.beta. mediates epithelial wound
closure and cellular migration, a pathway that is inhibited in both
CD and UC. Smad3 null mice show impaired restitutive epithelial
cell migration and slowed mucosal healing in an intestinal ulcer
model (39). In the immune system, TGF.beta. prevents T-cell
hyper-reactivity through direct suppression of cytotoxic T-cell and
T.sub.H1 differentiation and maintenance of regulatory CD4+ T-cells
(T.sub.reg) (38). Of note, TGF.beta. also has a pro-inflammatory
role by stimulating the differentiation of T.sub.H-17 cells.
T.sub.H-17 differentiation is impacted not only by IL27 signaling
(as discussed above), but is also a downstream target of IL-23R and
STAT-3, two CD susceptibility loci that have been replicated by
multiple studies (including ours) (2).
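For reference, the r.sup.2 measure of linkage disequilibrium mentioned above can be illustrated with the following minimal Python sketch, which uses the standard definition for two biallelic loci. The function name and the example frequencies are hypothetical; this is not presented as the software or procedure used to define the LD neighborhood in this study.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Squared correlation (r^2) between alleles at two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at the first locus
          and allele B at the second locus.
    p_a, p_b: frequencies of allele A and allele B, respectively.
    Hypothetical helper for illustration only.
    """
    denominator = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denominator <= 0.0:
        raise ValueError("both loci must be polymorphic (0 < frequency < 1)")
    # D is the deviation of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    return (d * d) / denominator


# Invented frequencies: the two alleles always occur together.
complete_ld = ld_r_squared(0.5, 0.5, 0.5)
# Invented frequencies: the two alleles are statistically independent.
no_ld = ld_r_squared(0.25, 0.5, 0.5)
```

Under this definition, a threshold such as r.sup.2>0.2 admits markers that are only moderately correlated with the associated SNP, so a gene lying within that neighborhood is a candidate rather than a confirmed causal gene.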
[0201] We have also discovered a cluster of toll-like receptor
(TLR1, TLR6, TLR10) genes whose genetic variation modulates
very-early onset UC risk. Ours is the first GWAS to study patients
with this rare phenotype. For all three genes in this cluster
(TLR1, TLR6, and TLR10), we show significantly increased gene
expression in colonic specimens from UC patients indicating that
they are active players in the pathogenesis of UC (FIG. 7). TLRs
are pattern recognition receptors that recognize antigenic
structures broadly expressed across various species of
microorganisms. TLRs are known to synergize with another
IBD-susceptibility gene, NOD2, in pathways that trigger and
regulate innate immune responses to bacterial pathogens (2, 40).
Functionally, TLR1 and TLR6 are known to heterodimerize with
another TLR family member, TLR2, to mediate downstream signaling
events in innate immunity pathways, while TLR10 is a less well
studied "orphan" member of the TLR family. There are numerous
existing links suggesting an important role for TLR dysregulation
in IBD pathogenesis. Mice deficient in G-protein .alpha. inhibitory
subunit 2, which mediates intracellular TLR signaling, develop a
fatal auto-immune colitis (41). Though TLR1, TLR6 and TLR10 have
never been associated with IBD, other toll-like receptor genes
(TLR2 and TLR4) have been previously implicated in IBD pathogenesis
(42, 43). One study examining the role of TLR gene variation in IBD
suggested that variation in TLR1 and TLR6 may modulate the risk of
pancolitis and proctitis in UC patients; however, no significant
association was detected with UC (44). Variation in the TLR1, TLR6,
TLR10 gene cluster has been found by multiple previous studies to
modulate prostate-cancer and asthma susceptibility (45-48). UC
developing during early childhood differs substantially from adult
onset disease, where the colitis is often very limited in extent.
The identification of altered TLR gene expression as a risk factor
will need to be replicated in additional patients with this
phenotypic subtype of IBD.
[0202] The additive IBD risk in individuals carrying increasing
numbers of variants provides an opportunity to identify high-risk
individuals that may be more informative for future studies. The
fact that common variants that individually provide relatively
small alteration of disease susceptibility can combine to have a
dramatic influence on disease risk provides new insight and
strategies in pursuing functional studies, molecular diagnostic
development and targeted drug design, thereby laying the foundation
for the development of personalized treatment algorithms. Thus, the
molecular markers discovered in this and previous studies may have
future potential to be incorporated into high-dimensional molecular
panels that can be used in clinical diagnosis and management.
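The notion of cumulative risk described above can be illustrated by the following minimal Python sketch, which counts risk alleles across a panel of loci and, under an assumed multiplicative (log-additive) model, combines per-allele odds ratios into a single value. The SNP labels, genotypes and odds ratios are hypothetical, and the multiplicative model is an assumption made for illustration; it is not presented as the risk model used in this study.

```python
import math


def risk_allele_count(genotypes):
    """Total number of risk alleles carried across all loci in the panel.

    genotypes: mapping of SNP label -> number of risk alleles (0, 1 or 2).
    """
    for snp, count in genotypes.items():
        if count not in (0, 1, 2):
            raise ValueError(f"invalid risk-allele count for {snp}: {count}")
    return sum(genotypes.values())


def combined_odds_ratio(genotypes, per_allele_or):
    """Combined odds ratio relative to a carrier of no risk alleles.

    Assumes each risk allele multiplies the odds independently
    (log-additive model), which is an illustrative simplification.
    """
    risk_allele_count(genotypes)  # reuse the validation above
    log_or = sum(
        count * math.log(per_allele_or[snp])
        for snp, count in genotypes.items()
    )
    return math.exp(log_or)


# Hypothetical panel: labels and odds ratios are invented.
panel_or = {"snp_a": 1.2, "snp_b": 1.3, "snp_c": 1.5}
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
n_alleles = risk_allele_count(person)
total_or = combined_odds_ratio(person, panel_or)
```

The sketch shows how several variants of individually modest effect can yield a substantially larger combined odds ratio in individuals who carry many risk alleles, which is the property that makes multi-marker panels of potential interest.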
[0203] Though we have identified and replicated a number of novel
and previously reported loci in this study, there are likely many
more genetic loci to be discovered that modulate both early and
adult onset IBD risk. Our genotyping platform captures only a
subset of common Caucasian genetic variation; therefore, it is
quite plausible that numerous other common variants may be
discovered using a platform with more complete coverage of
Caucasian genetic diversity. Application of appropriate genotyping
platforms to examine genetic variation in non-Caucasian IBD
patients may also reveal novel loci not addressed by this or recent
genome-scans. Similarly, replication of early-onset IBD
susceptibility loci in non-Caucasian populations is warranted to
determine the ethnic heterogeneity of their effect. Loci discovered
by our study likely represent surrogates of causal variants.
Fine-mapping and resequencing of these regions may reveal
haplotypes that confer more profound risk or protection from
IBD.
[0204] Taken together, our results substantially advance the
current understanding of pediatric-onset IBD by highlighting key
pathogenetic mechanisms, most notably including Th17 signaling and
innate immunity based on the discovery of the IL27 and TLR loci in
CD and UC, respectively, quantifying the cumulative IBD risk
conferred by multiple risk alleles in pediatric-onset disease, and
allowing for the first time a comparison between genetic
susceptibility in an exclusively pediatric cohort and the
previously described populations with predominantly adult-onset
disease.
REFERENCES FOR EXAMPLE III
[0205] 1. Barrett J C, Hansoul S, Nicolae D L, Cho J H, Duerr R H,
Rioux J D, et al. Genome-wide association defines more than 30
distinct susceptibility loci for Crohn's disease. Nat Genet. 2008
Jun 29; 40(8):955-62. [0206] 2. Cho J H. The genetics and
immunopathogenesis of inflammatory bowel disease. Nat Rev Immunol.
2008 Jun 1; 8(6):458-66. [0207] 3. Podolsky D K. Inflammatory bowel
disease. N Engl J Med. 2002 Aug 8; 347(6):417-29. [0208] 4. Binder
V. Genetic epidemiology in inflammatory bowel disease. Digestive
diseases (Basel, Switzerland). 1998 Jan 1; 16(6):351-5. [0209] 5.
Duerr R H, Taylor K D, Brant S R, Rioux J D, Silverberg M S, Daly
M, et al. A genome-wide association study identifies IL23R as an
inflammatory bowel disease gene. Science. 2006 Dec 1;
314(5804):1461-3. [0210] 6. Consortium WTCC. Genome-wide
association study of 14,000 cases of seven common diseases and
3,000 shared controls. Nature. 2007 Jun 7; 447(7145):661-78. [0211]
7. Fisher S, Tremelling M, Anderson C A, Gwilliam R, Bumpstead S,
Prescott N, et al. Genetic determinants of ulcerative colitis
include the ECM1 locus and five loci implicated in Crohn's disease.
Nat Genet. 2008 Jun 1; 40(6):710-2. [0212] 8. Franke A, Balschun T,
Karlsen T H, Hedderich J, May S, Lu T, et al. Replication of
signals from recent studies of Crohn's disease identifies
previously unknown disease loci for ulcerative colitis. Nat Genet.
2008 Jun 1; 40(6):713-5. [0213] 9. Vernier-Massouille G, Mamadou B,
Julia S, Dominique T, Jean Louis D, Olivier M, et al. Natural
History of Pediatric Crohn's Disease: A Population-Based Cohort
Study. Gastroenterology. 2008; 135(4):1106-13. [0214] 10. Van
Limbergen J, Russell R K, Drummond H E, Aldhous M C, Round N K,
Nimmo E R, et al. Definition of phenotypic characteristics of
childhood-onset inflammatory bowel disease. Gastroenterology. 2008
Oct; 135(4):1114-22. [0215] 11. Kugathasan S, Baldassano R N,
Bradfield J P, Sleiman P M, Imielinski M, Guthery S L, et al. Loci
on 20q13 and 21q22 are associated with pediatric-onset inflammatory
bowel disease. Nature genetics. 2008 Oct; 40(10):1211-5. [0216] 12.
Kugathasan S, Baldassano R N, Bradfield J P, Sleiman P M A,
Imielinski M, Guthery S L, et al. A Genome Wide Association Study
Identifies Novel Inflammatory Bowel Disease Susceptibility Loci on
20q13 and 21q22 in Patients with Pediatric Onset IBD. Nat Genet.
2008; 40(10):1211-5. [0217] 13. Hakonarson H, Grant S, Bradfield J
P, Marchand L, Kim C E, Glessner J T, et al. A genome-wide
association study identifies KIAA0350 as a type 1 diabetes gene.
Nature. 2007 Aug 2; 448(7153):591-4. [0218] 14. Pritchard J K,
Stephens M, Donnelly P. Inference of population structure using
multilocus genotype data. Genetics. 2000 Jun 1; 155(2):945-59.
[0219] 15. Luca D, Ringquist S, Klei L, Lee A B, Gieger C, Wichmann
H E, et al. On the use of general control samples for genome-wide
association studies: genetic matching highlights causal variants.
Am J Hum Genet. 2008 Feb; 82(2):453-63. [0220] 16. Lyssenko V,
Jonsson A, Almgren P, Pulizzi N, Isomaa B, Tuomi T, et al. Clinical
risk factors, DNA variants, and the development of type 2 diabetes.
N Engl J Med. 2008 Nov 20; 359(21):2220-32. [0221] 17. Purcell S,
Neale B, Todd-Brown K, Thomas L, Ferreira M A, Bender D, et al.
PLINK: a tool set for whole-genome association and population-based
linkage analyses. Am J Hum Genet. 2007 Sep 1; 81(3):559-75. [0222]
18. Stranger B E, Nica A C, Forrest M S, Dimas A, Bird C P, Beazley
C, et al. Population genomics of human gene expression. Nat Genet.
2007 Oct; 39(10):1217-24. [0223] 19. Stranger B E, Forrest M S,
Dunning M, Ingle C E, Beazley C, Thorne N, et al. Relative impact
of nucleotide and copy number variation on gene expression
phenotypes. Science. 2007 Feb 9; 315(5813):848-53. [0224] 20.
Cauchi S, Meyre D, Durand E, Proenca C, Mane M, Hadjadj S, et al.
Post genome-wide association studies of novel genes associated with
type 2 diabetes show gene-gene interaction and high predictive
value. PLoS ONE. 2008; 3(5):e2031. [0225] 21. Meigs J B, Shrader P,
Sullivan L M, McAteer J B, Fox C S, Dupuis J, et al. Genotype score
in addition to common risk factors for prediction of type 2
diabetes. N Engl J Med. 2008 Nov 20; 359(21):2208-19. [0226] 22.
Heyman M B, Kirschner B S, Gold B D, Ferry G, Baldassano R, Cohen S
A, et al. Children with early-onset inflammatory bowel disease
(IBD): analysis of a pediatric IBD consortium registry. J Pediatr.
2005 Jan; 146(1):35-40. [0227] 23. Steinman L. A brief history of
T(H)17, the first major revision in the T(H)1/T(H)2 hypothesis of T
cell-mediated tissue damage. Nat Med. 2007 Feb; 13(2):139-45.
[0228] 24. Bettelli E, Oukka M, Kuchroo V K. T(H)-17 cells in the
circle of immunity and autoimmunity. Nat Immunol. 2007 Apr;
8(4):345-50. [0229] 25. Miyazaki Y, Inoue H, Matsumura M, Matsumoto
K, Nakano T, Tsuda M, et al. Exacerbation of experimental allergic
asthma by augmented Th2 responses in WSX-1-deficient mice. J
Immunol. 2005 Aug 15; 175(4):2401-7. [0230] 26. Batten M, Li J, Yi
S, Kljavin N M, Danilenko D M, Lucas S, et al. Interleukin 27
limits autoimmune encephalomyelitis by suppressing the development
of interleukin 17-producing T cells. Nat Immunol. 2006 Sep;
7(9):929-36. [0231] 27. Honda K, Nakamura K, Matsui N, Takahashi M,
Kitamura Y, Mizutani T, et al. T helper 1-inducing property of
IL-27/WSX-1 signaling is required for the induction of experimental
colitis. Inflamm Bowel Dis. 2005 Dec; 11(12):1044-52. [0232] 28.
Yamanaka A, Hamano S, Miyazaki Y, Ishii K, Takeda A, Mak T W, et
al. Hyperproduction of proinflammatory cytokines by WSX-1-deficient
NKT cells in concanavalin A-induced hepatitis. J Immunol. 2004 Mar
15; 172(6):3590-6. [0233] 29. Artis D, Villarino A, Silverman M, He
W, Thornton E M, Mu S, et al. The IL-27 receptor (WSX-1) is an
inhibitor of innate and adaptive elements of type 2 immunity. J
Immunol. 2004 Nov 1; 173(9):5626-34. [0234] 30. Holscher C,
Holscher A, Ruckerl D, Yoshimoto T, Yoshida H, Mak T, et al. The
IL-27 receptor chain WSX-1 differentially regulates antibacterial
immunity and survival during experimental tuberculosis. J Immunol.
2005 Mar 15; 174(6):3534-44. [0235] 31. Pearl J E, Khader S A,
Solache A, Gilmartin L, Ghilardi N, deSauvage F, et al. IL-27
signaling compromises control of bacterial growth in
mycobacteria-infected mice. J Immunol. 2004 Dec 15; 173(12):7490-6.
[0236] 32. Villarino A, Hibbert L, Lieberman L, Wilson E, Mak T,
Yoshida H, et al. The IL-27R (WSX-1) is required to suppress T cell
hyperactivity during infection. Immunity. 2003 Nov; 19(5):645-55.
[0237] 33. Dong C. TH17 cells in development: an updated view of
their molecular identity and genetic programming. Nat Rev Immunol.
2008 May; 8(5):337-48. [0238] 34. Awasthi A, Carrier Y, Peron J P,
Bettelli E, Kamanaka M, Flavell R A, et al. A dominant function for
interleukin 27 in generating interleukin 10-producing
anti-inflammatory T cells. Nat Immunol. 2007 Dec; 8(12):1380-9.
[0239] 35. Chae S C, Li C S, Kim K M, Yang J Y, Zhang Q, Lee Y C,
et al. Identification of polymorphisms in human interleukin-27 and
their association with asthma in a Korean population. J Hum Genet.
2007; 52(4):355-61. [0240] 36. Tanioka T, Hattori A, Masuda S,
Nomura Y, Nakayama H, Mizutani S, et al. Human leukocyte-derived
arginine aminopeptidase. The third member of the oxytocinase
subfamily of aminopeptidases. J Biol Chem. 2003 Aug 22;
278(34):32275-83. [0241] 37. Tanioka T, Hattori A, Mizutani S,
Tsujimoto M. Regulation of the human leukocyte-derived arginine
aminopeptidase/endoplasmic reticulum-aminopeptidase 2 gene by
interferon-gamma. FEBS J. 2005 Feb; 272(4):916-28. [0242] 38.
Rubtsov Y P, Rudensky A Y. TGFbeta signalling in control of
T-cell-mediated self-reactivity. Nat Rev Immunol. 2007 Jun;
7(6):443-53. [0243] 39. Owen C R, Yuan L, Basson M D. Smad3
knockout mice exhibit impaired intestinal mucosal healing. Lab
Invest. 2008 Oct; 88(10):1101-9. [0244] 40. Trinchieri G, Sher A.
Cooperation of Toll-like receptor signals in innate immune defense.
Nat Rev Immunol. 2007 Mar; 7(3):179-90. [0245] 41. Rudolph U,
Finegold M J, Rich S S, Harriman G R, Srinivasan Y, Brabet P, et
al. Ulcerative colitis and adenocarcinoma of the colon in G alpha
i2-deficient mice. Nat Genet. 1995 Jun; 10(2):143-50. [0246] 42. Le
Bourhis L, Benko S, Girardin S E. Nod1 and Nod2 in innate immunity
and human inflammatory disorders. Biochem Soc Trans. 2007 Dec;
35(Pt 6):1479-84. [0247] 43. De Jager P L, Franchimont D,
Waliszewska A, Bitton A, Cohen A, Langelier D, et al. The role of
the Toll receptor pathway in susceptibility to inflammatory bowel
diseases. Genes Immun. 2007 Jul; 8(5):387-97. [0248] 44. Pierik M,
Joossens S, Van Steen K, Van Schuerbeek N, Vlietinck R, Rutgeerts
P, et al. Toll-like receptor-1, -2, and -6 polymorphisms influence
disease extension in inflammatory bowel diseases. Inflamm Bowel
Dis. 2006 Jan; 12(1):1-8. [0249] 45. Lazarus R, Raby B A, Lange C,
Silverman E K, Kwiatkowski D J, Vercelli D, et al. TOLL-like
receptor 10 genetic variation is associated with asthma in two
independent samples. Am J Respir Crit Care Med. 2004 Sep 15;
170(6):594-600. [0250] 46. Tantisira K, Klimecki W T, Lazarus R,
Palmer L J, Raby B A, Kwiatkowski D J, et al. Toll-like receptor 6
gene (TLR6): single-nucleotide polymorphism frequencies and
preliminary association with the diagnosis of asthma. Genes Immun.
2004 Aug; 5(5):343-6. [0251] 47. Sun J, Wiklund F, Zheng S L, Chang
B, Balter K, Li L, et al. Sequence variants in Toll-like receptor
gene cluster (TLR6-TLR1-TLR10) and prostate cancer risk. J Natl
Cancer Inst. 2005 Apr 6; 97(7):525-32. [0253] 48. Kormann M S,
Depner M, Hartl D, Klopp N, Illig T, Adamski J, et al. Toll-like
receptor heterodimer variants protect from childhood asthma. J
Allergy Clin Immunol. 2008 Jul; 122(1):86-92, e1-8. [0254] 49.
Barrett J C, Fry B, Maller J, Daly M J. Haploview: analysis and
visualization of LD and haplotype maps. Bioinformatics. 2005 Jan
15; 21(2):263-5. [0255] 50. Patterson N, Price A L, Reich D.
Population structure and eigenanalysis. PLoS Genet. 2006 Dec;
2(12):e190. [0256] 51. Hugot J P, Laurent-Puig P, Gower-Rousseau C,
Olson J M, Lee J C, Beaugerie L, et al. Mapping of a susceptibility
locus for Crohn's disease on chromosome 16. Nature. 1996 Feb 29;
379(6568):821-3.
[0257] While certain of the preferred embodiments of the present
invention have been described and specifically exemplified above,
it is not intended that the invention be limited to such
embodiments. Various modifications may be made thereto without
departing from the scope and spirit of the present invention, as
set forth in the following claims.
* * * * *