U.S. patent application number 14/414674 was published on 2015-10-08 for methods for treating, preventing and predicting risk of developing breast cancer.
The applicant listed for this patent is Dana-Farber Cancer Institute, Inc. Invention is credited to Vanessa Almendro, Sibgat Choudhury, Kornelia Polyak.
Application Number | 20150285802 14/414674 |
Family ID | 49949159 |
Filed Date | 2015-10-08 |
United States Patent
Application |
20150285802 |
Kind Code |
A1 |
Polyak; Kornelia ; et
al. |
October 8, 2015 |
METHODS FOR TREATING, PREVENTING AND PREDICTING RISK OF DEVELOPING
BREAST CANCER
Abstract
Methods for treating, preventing and predicting a subject's risk
of developing breast cancer are provided. In one aspect, a method
of predicting a subject's risk of developing breast cancer is
provided, wherein the method includes: (a) determining the
frequency in a breast tissue sample of CD44+, CD24- breast
epithelial cells, and (b) predicting that the subject has a
relatively elevated risk of developing breast cancer if the
frequency of CD44+, CD24- breast epithelial cells is decreased
compared to a first control frequency of CD44+, CD24- breast
epithelial cells; or (c) predicting that the subject has a
relatively reduced risk of developing breast cancer if the
frequency of CD44+ breast epithelial cells is increased compared to
a second control frequency of CD44+, CD24- breast epithelial
cells.
Inventors: |
Polyak; Kornelia (Brookline, MA); Almendro; Vanessa (Brookline, MA); Choudhury; Sibgat (Chestnut Hill, MA) |
|
Applicant: |
Name | City | State | Country | Type |
Dana-Farber Cancer Institute, Inc. | Boston | MA | US | |
Family ID: |
49949159 |
Appl. No.: |
14/414674 |
Filed: |
March 15, 2013 |
PCT Filed: |
March 15, 2013 |
PCT NO: |
PCT/US13/32384 |
371 Date: |
January 13, 2015 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
61672973 | Jul 18, 2012 | |
Current U.S.
Class: |
435/6.12 ;
435/30; 435/7.1; 435/7.23; 435/7.92 |
Current CPC
Class: |
G01N 33/57415 20130101;
G01N 2333/70585 20130101; G01N 2333/70596 20130101; G01N 2800/50
20130101; G01N 2333/705 20130101; G01N 2333/47 20130101 |
International
Class: |
G01N 33/574 20060101
G01N033/574 |
Government Interests
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
[0002] The research described in this application was supported in
part by grants from the National Institutes of Health (Nos. T32
CA009382-26, P01 CA117969, P50 CA89383, P01 CA080111,
CA116235-0451, and CA087969), and from a grant from the U.S. Army
Congressionally Directed Research (No. W81XWH-07-1-0294). Thus, the
U.S. government has certain rights in the invention.
Claims
1. A method of predicting a subject's risk of developing breast
cancer, wherein the method comprises: (a) determining, in a breast
tissue sample from a subject, the frequency of CD44+, CD24- breast
epithelial cells or the frequency of CD44+ breast epithelial cells
with an assay that comprises the use of one or both of an antibody
that binds specifically to CD44 and an antibody that binds
specifically to CD24, and (b) predicting that the subject has a
relatively elevated risk of developing breast cancer if the
frequency of CD44+, CD24- breast epithelial cells is decreased
compared to a first control frequency of CD44+, CD24- breast
epithelial cells; or predicting that the subject has a relatively
reduced risk of developing breast cancer if the frequency of CD44+
breast epithelial cells is increased compared to a second control
frequency of CD44+, CD24- breast epithelial cells.
2. The method of claim 1, further comprising determining the
frequency of CD24+ breast epithelial cells.
3. The method of claim 2, wherein step (b) comprises: predicting
that the subject has a relatively elevated risk of developing
breast cancer if: (i) the frequency of CD44+, CD24- breast
epithelial cells is decreased compared to the first control
frequency of CD44+, CD24- breast epithelial cells, and (ii) the
frequency of CD24+ breast epithelial cells is increased compared to
a first control frequency of CD24+ breast epithelial cells; or
predicting that the subject has a relatively reduced risk of
developing breast cancer if: (i) the frequency of CD44+ breast
epithelial cells is increased compared to the second control
frequency of CD44+, CD24- breast epithelial cells, and (ii) the
frequency of CD24+ breast epithelial cells is decreased compared to
a second control frequency of CD24+ breast epithelial cells.
4. The method of claim 2, wherein step (b) comprises: predicting
that the subject has a relatively elevated risk of developing
breast cancer if the frequency of CD24+ breast epithelial cells is
greater than the frequency of CD44+, CD24- breast epithelial cells
in the sample; or predicting that the subject has a relatively
reduced risk of developing breast cancer if the frequency of CD24+
breast epithelial cells is equal to or less than the frequency of
CD44+, CD24- breast epithelial cells in the sample.
5. The method of claim 1, wherein the subject is in need of such
predicting.
6. A method of predicting a subject's risk of developing breast
cancer, wherein the method comprises: (a) determining the frequency
in a breast tissue sample of cells of one or more types selected
from the group consisting of p27+ breast epithelial cells, Sox17+
breast epithelial cells, Cox2+ breast epithelial cells, Ki67+
breast epithelial cells, ER+, p27+ breast epithelial cells, ER+,
Sox17+ breast epithelial cells, ER+, Cox2+ breast epithelial cells,
ER+, Ki67+ breast epithelial cells; androgen-receptor-positive
(AR+), p27+ breast epithelial cells, AR+, Sox17+ breast epithelial
cells, AR+, Cox2+ breast epithelial cells, and AR+, Ki67+ breast
epithelial cells with an assay that comprises the use of one or
more antibodies selected from the group consisting of: an antibody
that binds specifically to p27, an antibody that binds specifically
to Sox17, an antibody that binds specifically to Cox2, an antibody
that binds specifically to Ki67, an antibody that binds
specifically to ER, and an antibody that specifically binds to AR;
and (b) predicting that the subject has a relatively elevated risk
of developing breast cancer if the frequency of the cells of one or
more types is about the same or increased compared to a first
control frequency of cells of the one or more types, respectively;
or predicting that the subject has a relatively reduced risk of
developing breast cancer if the frequency of the cells of the one
or more types is decreased compared to a second control frequency
of the cells of the one or more types, respectively.
7. The method of claim 6, wherein the frequency of p27+ breast
epithelial cells is determined, and the first control frequency of
the p27+ breast epithelial cells is a level that represents 15%,
20%, or 25% of the breast epithelial cells in the sample, and the
second control frequency of p27+ breast epithelial cancer cells is
a level that represents 15%, 20%, or 25% of the breast epithelial
cells in the sample.
8.-9. (canceled)
10. The method of claim 6, wherein the frequency of Ki67+ breast
epithelial cells is determined, the first control frequency of the
Ki67+ breast epithelial cells is a level that represents 2% of the
breast epithelial cells in the sample, and the second control
frequency of Ki67+ breast epithelial cells is a level that
represents 2% of the breast epithelial cells in the sample.
11. (canceled)
12. A method of predicting a subject's risk of developing breast
cancer, wherein the method comprises: (a) determining the protein
or mRNA expression level in a breast tissue sample from a subject
of at least one marker selected from the group consisting of p27,
Sox17 and Cox2; and (b) predicting that the subject has a
relatively elevated risk of developing breast cancer if the protein
or mRNA expression level of the at least one marker is increased
compared to a first control level of the at least one marker; or
predicting that the subject has a relatively reduced risk of
developing breast cancer if the protein or mRNA expression level of
the at least one marker is decreased compared to a second control
level of the at least one marker.
13.-14. (canceled)
15. The method of claim 12, wherein step (a) further comprises
determining the protein or mRNA expression level of one or more
additional markers having an expression level that is modulated in
breast epithelial cells of parous women compared to the levels in
breast epithelial cells of nulliparous women.
16. The method of claim 12, wherein the sample is enriched for
CD44+, CD24- breast epithelial cells, Ki67+ breast epithelial
cells, CD44+Ki67+ breast epithelial cells, or CD24+ breast
epithelial cells prior to the determining.
17.-19. (canceled)
20. The method of claim 1, wherein the subject has a BRCA1
mutation.
21. The method of claim 1, wherein the subject has a BRCA2
mutation.
22. The method of claim 12, wherein step (a) comprises determining
the protein or mRNA expression level of at least two markers
selected from the group consisting of p27, Sox17 and Cox2.
23. The method of claim 22, wherein step (a) comprises determining
the protein or mRNA expression level of p27, Sox17, and Cox2.
24. A method of predicting a subject's risk of developing breast
cancer, the method comprising: determining a
parity/nulliparity-associated mRNA expression signature in a sample
comprising breast epithelial cells from a subject; and predicting a
subject's risk of developing breast cancer based on the determined
parity/nulliparity-associated mRNA expression profile in the
sample.
25. The method of claim 24, wherein the sample is enriched for
CD44+ cells, CD24+ cells, or CD10+ cells.
26.-39. (canceled)
40. The method of claim 1, wherein the breast cancer is an ER+
breast cancer.
41. The method of claim 1, wherein the breast cancer is an ER-
breast cancer.
Description
RELATED APPLICATIONS
[0001] The present application claims the benefit of U.S.
Provisional Patent Application Ser. No. 61/672,973, filed Jul. 18,
2012, which is herein incorporated by reference in its
entirety.
TECHNICAL FIELD
[0003] Methods for treating, preventing and predicting a subject's
risk of developing breast cancer are provided.
BACKGROUND
[0004] Breast cancer is the most common type of cancer among women
in the United States, accounting for more than a quarter of all
cancers in women. Approximately 2.5 million women in this country
are breast cancer survivors, and an estimated 192,370 new cases of
breast cancer were diagnosed in women in 2009. Further, estrogen
receptor positive (ER+) postmenopausal breast cancer is the most
common form of the disease. While advances in treatment have
enabled more women to live longer overall and to live longer
without disease progression, what is needed in the art are methods
for identifying subjects at risk of developing breast cancer before
they develop it, and for preventing the development of the disease
altogether. Presently, however, very few reliable predictive
markers for identifying subjects at high risk for developing breast
cancer, such as ER+ or ER- breast cancer, are known.
[0005] BRCA1 and BRCA2 mutations are examples of predictive markers
that have been correlated with an increased risk of developing
breast cancer; however, only 5-10% of breast cancers are thought to
be caused by inherited abnormalities in BRCA1 and BRCA2 (i.e.
hereditary breast cancer). The remaining approximately 90-95% of
all breast cancers are sporadic. Thus, what is needed in the art
are novel markers that are useful for identifying subjects having
an elevated risk of developing breast cancer, as well as novel
targets of breast cancer therapies.
SUMMARY OF THE INVENTION
[0006] As follows from the Background section above, there remains
a need in the art for methods for predicting a subject's risk of
developing breast cancer. Such methods, as well as other, related
benefits, are presently provided, as discussed in detail below.
[0007] In one aspect, a method of predicting a subject's risk of
developing breast cancer is provided, wherein the method includes:
(a) determining the frequency in a breast tissue sample of CD44+,
CD24- breast epithelial cells, and (b) predicting that the subject
has a relatively elevated risk of developing breast cancer if the
frequency of CD44+, CD24- breast epithelial cells is decreased
compared to a first control frequency of CD44+, CD24- breast
epithelial cells; or (c) predicting that the subject has a
relatively reduced risk of developing breast cancer if the
frequency of CD44+ breast epithelial cells is increased compared to
a second control frequency of CD44+, CD24- breast epithelial
cells.
[0008] In another aspect, the method further includes determining
the frequency of CD24+ breast epithelial cells. In one aspect, step
(b) includes predicting that the subject has a relatively elevated
risk of developing breast cancer if: (i) the frequency of CD44+,
CD24- breast epithelial cells is decreased compared to a first
control frequency of CD44+, CD24- breast epithelial cells, and (ii)
the frequency of CD24+ breast epithelial cells is increased
compared to a first control frequency of CD24+ breast epithelial
cells; and step (c) includes predicting that the subject has a
relatively reduced risk of developing breast cancer if: (i) the
frequency of CD44+ breast epithelial cells is increased compared to
a second control frequency of CD44+, CD24- breast epithelial cells,
and (ii) the frequency of CD24+ breast epithelial cells is
decreased compared to a second control frequency of CD24+ breast
epithelial cells. In another aspect, step (b) includes: predicting
that the subject has a relatively elevated risk of developing
breast cancer if the frequency of CD24+ breast epithelial cells is
greater than the frequency of CD44+, CD24- breast epithelial cells
in the sample; and step (c) includes predicting that the subject
has a relatively reduced risk of developing breast cancer if the
frequency of CD24+ breast epithelial cells is equal to or less than
the frequency of CD44+, CD24- breast epithelial cells in the
sample. In still another aspect, the subject is in need of such
predicting.
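The comparisons recited in the two preceding paragraphs can be read as simple decision rules. The following Python sketch is illustrative only: the function names, argument names, and return strings are assumptions made for the example and are not part of the described methods. For simplicity it uses a single CD44+, CD24- frequency for both comparisons, assumes the first control frequency does not exceed the second, and returns no prediction when neither comparison is satisfied.

```python
from typing import Optional

ELEVATED = "relatively elevated risk"
REDUCED = "relatively reduced risk"


def predict_from_controls(freq: float, first_control: float,
                          second_control: float) -> Optional[str]:
    """Compare a CD44+, CD24- cell frequency with two control frequencies.

    All three arguments must be on the same scale (for example, percent of
    total breast epithelial cells). Returns None when the frequency is
    neither below the first control nor above the second control.
    """
    if freq < first_control:
        return ELEVATED
    if freq > second_control:
        return REDUCED
    return None


def predict_from_cd24_comparison(freq_cd24_pos: float,
                                 freq_cd44_pos_cd24_neg: float) -> str:
    """Compare the CD24+ frequency with the CD44+, CD24- frequency in one sample."""
    if freq_cd24_pos > freq_cd44_pos_cd24_neg:
        return ELEVATED
    # Equal to or less than the CD44+, CD24- frequency.
    return REDUCED
```

With hypothetical control frequencies of 10 and 20, for instance, a sample frequency of 15 yields no prediction under this sketch.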
[0009] In another aspect, a method of predicting a subject's risk
of developing breast cancer is provided. The method includes: (a)
determining the frequency in a breast tissue sample of cells of one
or more types selected from the group consisting of p27+ breast
epithelial cells, Sox17+ breast epithelial cells, Cox2+ breast
epithelial cells, Ki67+ breast epithelial cells, ER+, p27+ breast
epithelial cells, ER+, Sox17+ breast epithelial cells, ER+, Cox2+
breast epithelial cells, ER+, Ki67+ breast epithelial cells;
androgen-receptor-positive (AR+), p27+ breast epithelial cells,
AR+, Sox17+ breast epithelial cells, AR+, Cox2+ breast epithelial
cells, and AR+, Ki67+ breast epithelial cells; and (b) predicting
that the subject has a relatively elevated risk of developing
breast cancer if the frequency of the cells of the type is
increased compared to a first control frequency of cells of the
type; or (c) predicting that the subject has a relatively reduced
risk of developing breast cancer if the frequency of the cells of
the type is decreased compared to a second control frequency of the
cells of the type.
[0010] In certain aspects, step (b) includes predicting that the
subject has a relatively elevated risk of developing breast cancer
if the frequency of p27+ breast epithelial cells is 15 percent (%)
or greater of the breast epithelial cells in the sample; and step
(c) includes predicting that the subject has a relatively reduced
risk of developing breast cancer if the frequency of p27+ breast
epithelial cells is less than 15% of the breast epithelial cells in
the sample. In other aspects, step (b) includes predicting that the
subject has a relatively elevated risk of developing breast cancer
if the frequency of p27+ breast epithelial cells is 20 percent (%)
or greater of the breast epithelial cells in the sample; and step
(c) includes predicting that the subject has a relatively reduced
risk of developing breast cancer if the frequency of p27+ breast
epithelial cells is less than 20% of the breast epithelial cells in
the sample. In still another aspect, step (b) includes predicting
that the subject has a relatively elevated risk of developing
breast cancer if the frequency of p27+ breast epithelial cells is
25 percent (%) or greater of the breast epithelial cells in the
sample; and step (c) includes predicting that the subject has a
relatively reduced risk of developing breast cancer if the
frequency of p27+ breast epithelial cells is less than 25% of the
breast epithelial cells in the sample. In certain aspects, step (b)
includes predicting that the subject has a relatively elevated risk
of developing breast cancer if the frequency of Ki67+ breast
epithelial cells is 2 percent (%) or greater of the breast
epithelial cells in the sample; and step (c) includes predicting
that the subject has a relatively reduced risk of developing breast
cancer if the frequency of Ki67+ breast epithelial cells is less
than 2% of the breast epithelial cells in the sample. In yet other
aspects, step (b) includes predicting that the subject has a
relatively elevated risk of developing breast cancer if: (i) the
frequency of p27+ breast epithelial cells is increased compared to
a first control frequency of p27+ breast epithelial cells, and (ii)
the frequency of Ki67+ breast epithelial cells is increased
compared to a first control frequency of Ki67+ breast epithelial
cells; and step (c) includes predicting that the subject has a
relatively reduced risk of developing breast cancer if: (i) the
frequency of p27+ breast epithelial cells is decreased compared to
a second control frequency of p27+ breast epithelial cells, and
(ii) the frequency of Ki67+ breast epithelial cells is decreased
compared to a second control frequency of Ki67+ breast epithelial
cells.
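The percentage cutoffs in the preceding paragraph amount to a threshold test on the fraction of marker-positive cells. A minimal Python sketch follows; the constant and function names are hypothetical, and the choice among the 15%, 20%, and 25% cutoffs for p27+ cells is left to the caller because the text presents them as alternatives.

```python
# Cutoffs taken from the text: 15%, 20%, or 25% for p27+ cells; 2% for Ki67+ cells.
P27_CUTOFFS_PERCENT = (15.0, 20.0, 25.0)
KI67_CUTOFF_PERCENT = 2.0


def classify_by_cutoff(frequency_percent: float, cutoff_percent: float) -> str:
    """Classify a marker-positive cell frequency against a percentage cutoff.

    A frequency at or above the cutoff maps to a relatively elevated risk;
    a frequency below the cutoff maps to a relatively reduced risk.
    """
    if not 0.0 <= frequency_percent <= 100.0:
        raise ValueError("frequency_percent must be between 0 and 100")
    if frequency_percent >= cutoff_percent:
        return "relatively elevated risk"
    return "relatively reduced risk"
```

A frequency exactly equal to the cutoff falls in the elevated group, matching the "or greater" wording above.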
[0011] In another aspect, a method of predicting a subject's risk
of developing breast cancer is provided. The method includes: (a)
determining the expression level in a breast tissue sample from a
subject of at least one marker selected from the group consisting
of p27, Sox17 and Cox2; and (b) predicting that the subject has a
relatively elevated risk of developing breast cancer if the
expression level of the at least one marker is increased compared
to a first control level of the at least one marker; or (c)
predicting that the subject has a relatively reduced risk of
developing breast cancer if the expression level of the at least
one marker is decreased compared to a second control level of the
at least one marker. In certain aspects, the expression level
determined is the mRNA expression level of the at least one marker.
In other aspects, the expression level determined is the protein
expression level of the at least one marker. In certain aspects,
step (a) includes determining the expression level of at least two
(2) markers or all 3 markers selected from the group consisting of
p27, Sox17 and Cox2.
[0012] In some aspects, step (a) further includes determining the
expression level of one or more additional markers having an
expression level that is modulated in breast epithelial cells of
parous women compared to the levels in breast epithelial cells of
nulliparous women. In certain aspects, the sample is enriched for
CD44+, CD24- breast epithelial cells or for CD24+ breast epithelial
cells prior to the determining. In still other aspects, the sample
is enriched for Ki67+ breast epithelial cells or CD44+Ki67+ breast
epithelial cells prior to the determining.
[0013] In certain aspects, the subject for whom the risk of
developing an estrogen-receptor-positive (ER+) breast cancer is
being predicted has a BRCA1 and/or a BRCA2 mutation.
[0014] In other aspects, a method of predicting a subject's risk of
developing breast cancer is provided, which includes determining a
parity/nulliparity-associated gene expression signature in a sample
containing breast epithelial cells. In certain aspects, the sample
is enriched for CD44+ cells, CD24+ cells, or CD10+ cells.
[0015] In one aspect, a method of predicting breast cancer disease
outcome is provided, including testing for a
parity/nulliparity-associated gene expression signature in breast
cancer cells.
[0016] In another aspect, a method of treating
estrogen-receptor-positive (ER+) breast cancer in a subject is
provided. The method includes administering to the subject a
composition that includes an inhibitor of a pathway that has
increased activity in CD44+, CD24- breast epithelial cells of
nulliparous women compared to the activity in CD44+, CD24- breast
epithelial cells of parous women. In certain aspects, the pathway
can be cytoskeleton remodeling, chemokines, androgen signaling,
cell adhesion, or Wnt signaling.
[0017] In yet another aspect, a method of preventing breast cancer
in a subject is provided. The method includes administering to a
subject at risk of developing breast cancer an inhibitor of a
pathway that has increased activity in breast epithelial cells of
nulliparous women compared to breast epithelial cells of parous
women. In some aspects, the pathway can be cytoskeleton remodeling,
chemokines, androgen signaling, cell adhesion, or Wnt signaling. In
certain aspects, the pathway includes a mediator molecule that can
be cAMP, EGFR, Cox2, hedgehog (Hh), TGFβ receptor (TGFBR) or
IGF receptor (IGFR). In still other aspects, the inhibitor
selectively targets CD44+, CD24- breast epithelial cells, CD24+
breast epithelial cells, p27+ breast epithelial cells, or Ki67+
breast epithelial cells. In certain aspects, the cells selectively
targeted by the inhibitor are also ER+. In certain aspects, the
subject has a BRCA1 or BRCA2 mutation.
[0018] In certain aspects, methods of treating or preventing breast
cancer in a subject are provided. The methods include administering
to a subject an agonist of a pathway that has decreased activity in
CD44+, CD24- breast epithelial cells of nulliparous women compared
to CD44+, CD24- breast epithelial cells of parous women. In certain
aspects, the pathway can be tumor suppression (Hakai/CBLL1, CASP8,
SCRIB, LLGL2), DNA repair, PI3K/AKT signaling, or apoptosis. In
certain aspects, the agonist selectively targets CD44+, CD24-
breast epithelial cells, CD24+ breast epithelial cells, p27+ breast
epithelial cells, or Ki67+ breast epithelial cells. In another
aspect, the cells selectively targeted by the agonist are also ER+.
In certain aspects, the subject has a BRCA1 or BRCA2 mutation.
[0019] In any of the above aspects, the breast cancer can be an ER+
or an ER- breast cancer.
[0020] Unless otherwise defined, all technical and scientific terms
used herein have the same meaning as commonly understood by one of
ordinary skill in the art to which this invention pertains. In case
of conflict, the present document, including definitions, will
control.
[0021] All publications, patent applications, patents, and other
references mentioned herein are incorporated by reference in their
entirety. The materials, methods, and examples disclosed herein are
illustrative only and not intended to be limiting.
[0022] The details of one or more embodiments of the invention are
set forth in the accompanying drawings and the description below.
Preferred methods and materials are described below, although
methods and materials similar or equivalent to those described
herein can also be used in the practice or testing of the present
invention. Other features, objects, and advantages of the invention
will be apparent from the description and drawings, and from the
claims.
DESCRIPTION OF DRAWINGS
[0023] FIG. 1 contains representative FACS plots for cells stained
with fluorescent antibodies specific for CD24 and CD44 from normal
breast tissue of nulliparous (upper plot) and parous (lower plot)
women.
[0024] FIG. 2 contains graphs plotting the frequency (%) of CD44+,
CD24+, and CD10+ human breast epithelial cells relative to total
human breast epithelial cells from nulliparous and parous women. 10
samples each from nulliparous and parous groups were analyzed, and
each dot represents an individual sample. Error bars represent
mean ± SEM.
[0025] FIG. 3 contains dot plots showing a genome-wide view of
genes differentially expressed between nulliparous (N) and parous
(P) samples in CD44+, CD24- breast epithelial cells (upper left
quadrant), CD10+ breast epithelial cells (upper right quadrant),
CD24+ breast epithelial cells (lower left quadrant), and stromal
fibroblasts (lower right quadrant). Each dot represents a gene.
Fold differences between averaged N and P samples and their
corresponding p-values are plotted on the y- and x-axes,
respectively. Vertical lines indicate p=0.05, numbers indicate the
number of genes differentially expressed at p<0.05.
[0026] FIG. 4A is a three-dimensional projection of the gene
expression data onto the first three principal components. Each
ball is a different sample; cell type and parity are indicated.
[0027] FIG. 4B is a box-and-whisker diagram of the paired Euclidean
distance for each of the indicated cell types: CD44+, CD24+, CD10+,
and stromal fibroblasts ("stroma"). The middle line within a box
represents the median value. The box is the IQR (interquartile
range, 25th and 75th percentile). The top and bottom line of each
box plot is the data range: the lowest data still within 1.5 IQR of
the lower quantile and the highest data still within 1.5 IQR of the
upper quantile. Data shown outside the range are plotted as
circles. The Kolmogorov-Smirnov (KS) test was used to determine the
significance of difference between CD44+ and other cell types.
Statistical significance (p) is indicated.
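The box-and-whisker convention described for FIG. 4B (box at the 25th and 75th percentiles, whiskers at the most extreme data within 1.5 IQR of the box, remaining points drawn as circles) can be sketched in Python as below. The function name is hypothetical, and the quartile interpolation method ("inclusive") is an assumption, since the text does not state which method was used.

```python
import statistics


def box_plot_summary(values):
    """Return median, quartiles, whisker ends, and outliers for one group.

    Whiskers extend to the lowest and highest data points still within
    1.5 * IQR of the lower and upper quartiles; points beyond the whiskers
    are reported as outliers.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }
```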
[0028] FIG. 5 is a box-and-whisker diagram of the paired Euclidean
distance for the following pair-wise comparisons (from left to
right on the x-axis): CD44+, CD24- nulliparous vs. CD10+
nulliparous; CD44+, CD24- nulliparous vs. CD24+ nulliparous; CD44+,
CD24- parous vs. CD10+ nulliparous, CD44+, CD24- parous vs. CD24+
nulliparous; (N: nulliparous. P: parous). The middle line within a
box represents the median value. The box is the IQR (interquartile
range, 25th and 75th percentile). The top and bottom line of each
box plot is the data range: the lowest data still within 1.5 IQR of
the lower quantile and the highest data still within 1.5 IQR of the
upper quantile. Data shown outside the range are plotted as
circles. The Kolmogorov-Smirnov (KS) test was used to determine
significance of differences, indicated on the plot (p).
[0029] FIG. 6 contains dot plots showing the relative DNA
methylation, as determined by qMSP analysis (left panel), and the
expression, as determined by qRT-PCR (right panel), of the
indicated genes (left panel: TTC9B, RRP15, and AOPKO5; right panel:
CDKN1B, PTGS2, COL1A1 and COL3A1) in CD44+, CD24- breast epithelial
cells and CD24- breast epithelial cells isolated from multiple
nulliparous and parous women, respectively. Relative methylation
and expression levels normalized to ACTB and RPL19, respectively,
are indicated on the y-axis. The bars mark the median and p-values
indicate the statistical significance of the observed
differences.
[0030] FIG. 7 is a dendrogram showing the hierarchical clustering
of Norwegian cohort (GSE18672) based on Pearson correlation using
genes differentially expressed in CD44+ cells. Individual patient
samples from the cohort are shown (MDG-110, MDG124, etc.); "N-pre"
means premenopausal. Clustering analysis using the differentially
expressed gene sets divided these samples into two groups, a mixed
parous/nulliparous (Nulliparous A) group, and a distinct,
nulliparous (Nulliparous B) group.
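Hierarchical clustering "based on Pearson correlation" needs a pairwise dissimilarity between sample expression profiles. A common choice, assumed here because the text does not specify one, is one minus the Pearson correlation coefficient. The Python sketch below uses hypothetical function names and computes only that dissimilarity; the linkage step that builds the dendrogram is not shown.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (norm_x * norm_y)


def correlation_distance(x, y):
    """Dissimilarity in [0, 2]: 0 for perfectly correlated profiles."""
    return 1.0 - pearson_r(x, y)
```

Under this assumption, samples whose expression of the differentially expressed genes rises and falls together are near each other, regardless of overall expression magnitude.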
[0031] FIG. 8 is a bar plot of the serum estradiol levels in
picograms per milliliter for the samples corresponding to FIG. 7
(Nulliparous A, Nulliparous B and Parous groups).
[0032] FIG. 9A is a dendrogram showing the hierarchical clustering
of CD44+ cells from parous and nulliparous control women and parous
BRCA1 mutation carriers.
[0033] FIG. 9B is a dot plot showing the relative frequency of
CD44+, CD24+, and CD10+ cells among all breast
epithelial cells in samples from nulliparous and parous groups from
control and BRCA1/2 mutation carriers. The error bars mark the
mean ± standard error of the mean (SEM).
[0034] FIG. 10 is a dendrogram depicting hierarchical clustering of
signaling pathways significantly high in parous or nulliparous
samples in any of the four cell types (stromal fibroblasts
("stroma"), CD10+, CD44+ and CD24+ breast epithelial cells)
analyzed.
[0035] FIG. 11 is a heat map depicting unsupervised clustering of
signaling pathways significantly down- or upregulated in parous
compared to nulliparous samples in any of the four cell types
(stromal fibroblasts ("stroma"), CD10+, CD44+ and CD24+ breast
epithelial cells) analyzed. Gray scale indicates -log p value of
enrichment. Rectangles highlight cell type-specific or common
altered pathways.
[0036] FIG. 12 contains graphs showing the relative enrichment
(left panel) and relative connectivity (right panel) of the
indicated protein classes in nulliparous and parous samples in each
of the four cell types (stromal fibroblasts ("stroma"), CD10+,
CD44+ and CD24+ breast epithelial cells) analyzed. X-axes indicate
-log10 p-values for enrichment (left panel) with the listed
protein classes and the number of overconnected objects, defined as
proteins with higher than expected number of interactions, in each
functional category within each group (right panel),
respectively.
[0037] FIG. 13 is an integrated map of statistically significant
(P-val<0.05) pathways enriched in genes highly expressed in
CD44+ nulliparous cells along with DNA methylation patterns.
Important pathways highly active in CD44+ nulliparous cells
potentially regulated by DNA methylation include PI3K signaling and
TCF/Lef signaling. Highly expressed genes, and promoter and gene
body hypo and hyper-methylation are indicated.
[0038] FIG. 14 is an integrated map of statistically significant
(P-val<0.05) pathways enriched in genes highly expressed in
CD44+ parous cells along with DNA methylation patterns. Active
pathways potentially regulated by DNA methylation in CD44+ parous
cells include TGFB2 signaling. Highly expressed genes, and promoter
and gene body hypo and hyper-methylation are indicated.
[0039] FIG. 15A is a Venn diagram depicting the number of unique
and common pathways high in CD44+ nulliparous cells and in mammary
glands of virgin rats, respectively.
[0040] FIG. 15B is a list of top common pathways downregulated in
CD44+ cells and mammary glands from parous women and rats,
respectively. Names of the pathways and p-values of enrichment are
indicated.
[0041] FIG. 16 contains dot plots showing a genome-wide view of
differentially methylated genes in CD44+ (upper panel) and CD24+
(lower panel) cells between nulliparous and parous samples. All
MSDK sites are plotted on the x-axis in the order of p-values of
the difference between nulliparous and parous samples in CD44+ or
CD24+ cells. Log ratios of averaged MSDK counts in three N and
three P samples are plotted on the y-axis. Vertical lines indicate
p=0.01 and the numbers of significant DMRs (p<0.01) are shown in
the upper and lower right corners of the plots.
[0042] FIG. 17 is a heat map showing the pathways enriched by genes
associated with gene body or promoter DMRs in CD44+ cells from
nulliparous and parous samples.
[0043] FIG. 18 contains graphs quantifying (in arbitrary units) the
expression of p27, Sox17 and Cox2 in CD44+ and CD24+ breast
epithelial cells in premenopausal nulliparous (NP) and parous (P)
women. Horizontal bars indicate the median, vertical bars indicate
SEM, and p-values indicate the statistical significance of the
observed differences.
[0044] FIG. 19 is a graph showing the frequencies (% of total
breast epithelial cells) of p27+ and Ki67+ cells in nulliparous
(NP) and parous (P) breast tissue samples. Horizontal bars indicate
the median, vertical bars indicate SEM, and p-values of differences
between nulliparous and parous groups are indicated.
[0045] FIG. 20 contains graphs quantifying the expression of p27
(in arbitrary units) and the frequencies (% of total breast
epithelial cells) of p27+ and Ki67+ cells in CD44+ and CD24+ breast
epithelial cells in postmenopausal nulliparous (NP) and parous (P)
women.
[0046] FIG. 21 contains graphs quantifying the expression of p27
(in arbitrary units) and the frequencies (% of total breast
epithelial cells) of p27+ and Ki67+ cells in high and low density
areas of the same breast from premenopausal parous women.
[0047] FIG. 22 contains bar graphs quantifying the frequencies (%
of total breast epithelial cells) of p27+ and ER+ cells in each
group of samples (nulliparous, parous, women in follicular or
luteal phase of menstrual cycle, oocyte donor, early pregnancy,
late pregnancy, BRCA1+ mutation carriers and BRCA2 mutation
carriers). Horizontal bars indicate the median, vertical bars mark
the SEM, and asterisks indicate significant (p<0.05, t-test or
Fisher exact test) differences between groups of 4-8 samples.
[0048] FIG. 23A is a bar graph quantifying frequencies (fraction
(%) of total breast epithelial cells) of p27+, androgen receptor
(AR)+, and p27+AR+ cells in each set of samples (nulliparous,
parous, and BRCA1+ mutation carriers).
[0049] FIG. 23B contains bar graphs quantifying frequencies (% of
total breast epithelial cells) of p27+, Ki67+, and p27+Ki67+ cells
in each set of samples (sample collected from women in the
follicular or luteal phase of the menstrual cycle, oocyte donor and
women in early pregnancy).
[0050] FIG. 23C contains bar graphs quantifying the frequency of
p27+, Ki67+, and p27+Ki67+ cells in the breast tissue of
premenopausal and postmenopausal nulliparous (NP) or parous (P)
women in different phases of the menstrual cycle (i.e., follicular
("Foll") and luteal ("Lut")) or with breast cancer (BC) or without
(cont); asterisks mark p<0.05.
[0051] FIG. 24 contains bar graphs quantifying the frequency (% of
total breast epithelial cells) of BrdU+, Ki67+, and p27+ cells in
each of the indicated conditions (control, inhibition of cAMP,
EGFR, Cox2, Hh, TGF.beta., Wnt, or IGFR in normal breast tissues
incubated in a tissue explant culture model with the relevant
inhibitor); * indicates p<0.05 and bars indicate SEM.
[0052] FIG. 25 contains bar graphs quantifying the frequency (% of
total breast epithelial cells) of pSMAD2+ cells, or the mean
fluorescence intensity of pEGFR and Axin 2 in breast epithelial
tissue treated with control (C) or inhibitor (I) (inhibitor of
TGFb, EGFR or Wnt, from top graph to bottom graph).
[0053] FIG. 26A contains line graphs plotting the RGB spectra
demonstrating overlap between the expression of p27 and the
indicated marker (in the top panels: circles mark the line for
pSMAD2, triangles mark the line for p27, and squares mark the line
for DAPI; in the middle panels: circles mark the line for pEGFR,
triangles mark the line for p27, squares mark the line for DAPI; in
the lower panel: circles mark the line for axin2 and squares mark
the line for DAPI); left graphs are control groups and right graphs
are treated with the indicated inhibitor. In all graphs, intensity
is plotted on the y-axis and distance (in pixels) is plotted on the
x-axis.
[0054] FIG. 26B contains a bar graph quantifying the frequency (%)
of p27+ cells in tissue slices from 3-4 independent cases treated
with hormones mimicking the indicated physiologic levels (control,
follicular phase, luteal phase, and pregnancy) in women. Asterisks
indicate significant (p.ltoreq.0.05) differences.
[0055] FIG. 26C contains bar graphs quantifying the frequency (% of
all breast epithelial cells) of p27+, Ki67+, and p27+Ki67+ cells in
tissue slice cultures treated with Shh or Tamoxifen; asterisks
indicate a statistical significance of p.ltoreq.0.05.
[0056] FIG. 27 contains Kaplan-Meier plots depicting the
probability of breast cancer-specific survival among women with
invasive ER+ (left panel) or ER- (right panel) breast cancer by
parity in the Nurses' Health Study (1976-2006). The p-value of the
difference between the two survival curves overall was calculated
with use of the log-rank test. Beneath each plot the number of
parous and nulliparous women alive at each of the time points shown
on the x-axes of the plots (beginning at 5 years) is shown.
[0057] FIGS. 28 and 29A-C contain heat maps (left panel) and
Kaplan-Meier plots with their corresponding log-rank test p-values
(right panel) showing a significant association of the presence of
a parity/nulliparity-related gene signature with overall survival
in the indicated cohorts of breast cancer patients with ER+ tumors.
In each figure, the top heat map shows the signature from genes
downregulated in parous subjects and the bottom heat map shows the
signature from upregulated genes. The bars above the heat maps
indicate the two distinct patient groups separated by the
co-expression of the signature
(light gray (left bar on heat map, upper line on Kaplan-Meier
plots): better survival group; dark gray (right bar on heat map,
lower line on Kaplan-Meier plots): worse survival group). The bar
at the right side of each heat map, divided into an upper and a lower
group, indicates the effect of parity on genes in breast cancer
progression. The upper group indicates that parity induces a change
in gene expression level in the same direction as breast cancer
progression. The lower group indicates that parity induces a change
in gene expression level in the opposite direction to breast cancer
progression. Black bars (beneath the heat maps) indicate death. The
genes shown in the heat maps (the parity/nulliparity-related gene
signature) are shown in Table 18, below, which shows the gene
symbol, gene description, gene expression pattern (i.e., high in
parous or nulliparous samples), and prognostic values (good or bad
prognosis) for each of the genes.
[0058] FIG. 30 contains a diagram showing the timeline for
simulations in a mathematical model of the dynamics of
proliferating mammary epithelial cells that can accumulate the
changes leading to cancer initiation, run from the time of menarche
at 12.6 years through cancer initiation or death at 80.9 years. The
earliest time of pregnancy is at menarche; the latest time is right
before menopause at 51.3 years.
[0059] FIGS. 31-33 are schematic representations of a mathematical
model of the dynamics of proliferating mammary epithelial cells
that can accumulate the changes leading to cancer initiation. In
FIG. 31, initially, there are N wild-type stem cells (top of
schematic), which give rise to a differentiation cascade of
2.sup.z+1-1 wild-type luminal progenitor cells (triangular, lower
region). Darkening gray gradations refer to successively more
differentiated cells and serve to clarify a single time step of the
stochastic process. In FIG. 32, "WT" means wild-type (non-mutated)
stem cell and "f.sub.mut" means mutant progenitor cell. Division
during pregnancy is indicated by "z.sub.preg"; z is the number of
cell divisions; K indicates the number of cell divisions from the
first progeny of the stem cell (k=0) to the terminally
differentiated cell (darkest gray).
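For illustration only, the cascade size stated for FIG. 31 can be
checked numerically. The sketch below assumes, as one reading of the
schematic and not as a statement of the model of Example 10, that a
single first progeny cell at k=0 doubles at each of z successive
divisions, so that level k holds 2.sup.k cells. The function name
cascade_size is hypothetical.

```python
def cascade_size(z: int) -> int:
    """Total cells in a doubling cascade with levels k = 0..z (assumed structure)."""
    if z < 0:
        raise ValueError("z must be non-negative")
    # Level k is assumed to hold 2**k cells; sum over all levels.
    return sum(2 ** k for k in range(z + 1))


if __name__ == "__main__":
    for z in range(6):
        # The level-by-level sum agrees with the closed form 2**(z+1) - 1.
        assert cascade_size(z) == 2 ** (z + 1) - 1
        print(z, cascade_size(z))
```

Under this assumption the level-by-level count and the closed-form
expression agree for every z tested.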
[0060] FIG. 34 is a bar graph quantifying the effect the indicated
parameters of the mathematical model described in Example 10 (N
value, Zpreg, and p) have on the relative probability of cancer
initiation (per duct) relative to nulliparous women. The default
values were: N=8, p=10.sup.-2, Z.sub.preg=2.
[0061] FIG. 35 is a line graph plotting the likelihood (relative
probability) of cancer initiation relative to nulliparous (y-axis)
against time of first pregnancy after menarche (years) on the
x-axis for the indicated starting number of stem cells (N=5, N=8,
and N=10).
[0062] FIG. 36 is a line graph plotting the likelihood (relative
probability) of cancer initiation relative to nulliparous (y-axis)
against time of first pregnancy after menarche (years) on the
x-axis for the indicated probabilities of stem cell differentiation
(p=0.1, p=0.01, and p=0.001).
[0063] FIG. 37 is a line graph plotting the likelihood (relative
probability) of cancer initiation relative to nulliparous (y-axis)
against time of first pregnancy after menarche (years) on the
x-axis for the indicated number of additional cell divisions during
pregnancy (3 and 2).
DETAILED DESCRIPTION
[0064] Various aspects of the invention are described below.
I. OVERVIEW
[0065] A single full-term pregnancy in early adulthood decreases
the risk of estrogen receptor (ER)-positive (+) postmenopausal
breast cancer, the most common form of the disease. Age at first
pregnancy is critical, as the protective effect decreases after the
mid 20s, and women aged >35 years at first birth have increased
risk of both ER+ and ER- breast cancer. Parity-associated risk is
also influenced by germline variants, as BRCA1 and BRCA2 mutation
carriers do not experience the same reduction in risk as
does the general population. These human epidemiological data
suggest that pregnancy induces long-lasting effects in the normal
breast epithelium and that ER+ and ER- tumors might have a
different cell of origin. The protective effect of parity is also
observed in animal models, where it can be
mimicked by hormonal factors in the absence of gestation.
[0066] The cellular and molecular mechanisms that underlie
pregnancy and hormone-induced refractoriness to carcinogens are
largely undefined. Several hypotheses have been proposed including
the induction of differentiation, decreased susceptibility to
carcinogens, a decrease in cell proliferation and in the number of
mammary epithelial stem cells, an altered systemic environment due
to a decrease in circulating growth hormone and other endocrine
factors, and permanent molecular changes leading to alterations in
cell fate. Almost all studies investigating pregnancy-induced
changes and the breast cancer preventative effects of pregnancy
have been conducted in rodent models and most of them have focused
only on the mammary gland. Global gene expression profiling of
mammary glands from virgin and parous rats identified changes in
TGF.beta. and IGF signaling, and in the expression of extracellular
matrix proteins.
[0067] Related studies conducted in humans also identified
consistent differences in gene expression profiles between
nulliparous and parous women (see Asztalos et al. (2010) Cancer
Prev Res (Phila) 3, 301-311; Belitskaya-Levy et al. (2011) Cancer
Prev Res (Phila) 4, 1457-1464; Russo et al. (2008) Cancer Epidemiol
Biomarkers Prev 17, 51-66; and Russo et al. (2011) Int J Cancer;
October 25; E-pub ahead of print). Because those studies used total
mammary gland or mammary organoids, which are composed of multiple
cell types, the cellular origin of these gene expression differences
remains unknown. Emerging data indicate that mammary epithelial
progenitor or stem cells are the cell of origin of breast
carcinomas. Studies assessing changes in mammary epithelial stem
cells following pregnancy, however, have been conducted only in
mice and thus far have been inconclusive. Thus, the effect of
pregnancy on the number and functional properties of murine mammary
epithelial progenitors is still elusive and it has not yet been
analyzed in humans.
[0068] It is presently discovered that parity has a pronounced
effect on CD44+ cells with progenitor features. As demonstrated in
the present Examples, most of the differences in CD44+ cells
between nulliparous and parous samples related to transcriptional
repression and downregulation of genes and pathways important for
stem cell function, many of which also play a role in
tumorigenesis, including EGF, IGF, Hh, and TGF.beta. signaling.
High circulating IGF-1 levels have been associated with increased
risk of ER+ breast cancer, and germline polymorphisms in members of
the TGF.beta. signaling pathway have also been described to
influence breast cancer susceptibility.
[0069] The present Examples also demonstrate that parity not only
influences the risk of developing breast cancer, but potentially
even the type of tumor and associated clinical outcome in breast
cancer patients. Moreover, based on the genomic profiling and
functional validation results in tissue explant cultures shown in
the present Examples, the pathways that were identified as less
active in parous women can be used for risk stratification and for
chemoprevention in high-risk women, as their inhibition will mimic
the cancer-reducing effects of parity.
[0070] The present Examples also demonstrate a significant decrease
in the number of p27+ cells in breast tissues of parous women,
which seems paradoxical as p27 (also known as CDKN1B/p27(kip1)) is
a bona fide tumor suppressor and potent inhibitor of cell cycle
progression. p27 has been shown to play an important role in stem
cells, best characterized in the hematopoietic system, where loss
of p27 increases the number of transit amplifying progenitors but
not that of stem cells. In the mouse mammary gland, p27 deficiency
leads to hypoplasia and impaired ductal branching and
lobulo-alveolar differentiation, a phenotype consistent with a
putative role in regulating the number and proliferation of mammary
epithelial progenitors, although this has not been
investigated.
[0071] While not intending to be bound by any one particular theory
or mechanism of action, based on the data in the present Examples,
it is thought that p27 regulates the proliferation and pool size of
hormone-responsive breast epithelial progenitors; thus, the lower
number of p27+ cells in parous women reflects a decrease in the
number of quiescent progenitors with proliferative potential, which
may contribute to their decrease in breast cancer risk. High p27
levels and quiescence are maintained in these cells by TGF.beta.
signaling, as implied by the co-expression of pSmad2 with p27 and
the increase in BrdU incorporation with concomitant decrease in p27
(Example 9).
[0072] It is also presently discovered that the frequency of p27+
cells was high in control nulliparous women and even higher in
BRCA1 and BRCA2 mutation carriers even though these different
groups of women are predisposed to different types of breast cancer
(Example 2). Nulliparous women have increased risk of
postmenopausal ER+ breast cancer, whereas BRCA1 mutation carriers
most commonly have ER- basal-like tumors. However, recently
published studies analyzing the potential cell-of-origin of
BRCA1-associated breast cancer in animal models and in humans have
found that even these basal-like tumors may initiate from luminal
progenitors. The present Examples demonstrate increased frequency
of hormone responsive p27+ cells in all high-risk women, supporting
these hypotheses.
[0073] Thus, the number of p27+ breast epithelial progenitor
(CD44+) cells in the normal breast and the activity of pathways
that regulate the number of p27+ cells can be used as markers for
predicting the risk of developing breast cancer (e.g., ER+ breast
cancer or ER- breast cancer), as novel targets for cancer
preventive and treatment strategies (e.g. therapeutic
intervention), and for monitoring the efficacy of such preventive
and treatment strategies. Furthermore, the pathways identified
herein, e.g., a TGF.beta. pathway, can be exploited for breast
cancer prevention, as they can be modulated to deplete p27+ cells
with progenitor features and consequently decrease breast cancer
risk.
II. DEFINITIONS
[0074] As used herein, the term "estrogen-receptor-positive (ER+)
breast cancer" means a cancer wherein at least one cancer cell
expresses the estrogen receptor. As used herein, the term
"estrogen-receptor-negative (ER-) breast cancer" means a cancer
wherein the cancer cells do not express the estrogen receptor.
[0075] As used herein, a "breast tissue sample" can include, but is
not limited to, histological sections of normal breast tissue,
e.g., healthy breast tissue, tumors or cancer cell-containing
tissue, whole or soluble fractions of tissue or cell (e.g., cancer
cell) lysates, cell subfractions (e.g., mitochondrial or nuclear
subfractions), and whole or soluble fractions of tissue or cell
(e.g., cancer cell) subfraction lysates.
[0076] As used herein, a cell that is "positive" for a marker, such
as, e.g., a CD44+, p27+, CD24+, or CD10+ cell, expresses the marker
at the mRNA and/or protein level.
[0077] As used herein, breast "stromal cells" are breast cells
other than epithelial cells.
[0078] As used herein, the term "subject" means any animal,
including any vertebrate or mammal, and, in particular, a human,
and can also be referred to, e.g., as an individual or patient.
Typically, but not necessarily, the subject is female. A subject in
"need of such predicting" i.e., a subject in need of predicting the
subject's risk of developing breast cancer, can be, e.g., a subject
with a family history of breast cancer, a subject who has not been
tested for and/or has not been diagnosed with breast cancer, a
subject who wishes to know their risk of developing breast cancer,
e.g., ER+ or ER- breast cancer, and/or a subject undergoing a
routine health screen by, e.g., their attending physician, and/or a
subject undergoing a therapy (e.g., raloxifene or tamoxifen) for the
treatment and/or prevention of cancer (e.g., breast cancer).
[0079] As used herein, a subject (e.g., patient) having a
characteristic (as described herein) that results in a "relatively
elevated risk of developing breast cancer," (e.g., ER+ or ER-
breast cancer) has a greater risk of developing breast cancer than
a subject not having that characteristic. Conversely, a subject
having a characteristic (as described herein) that results in a
"relatively reduced risk of developing breast cancer," has a lesser
risk of developing breast cancer than a subject not having that
characteristic.
[0080] As used herein, a "parous" subject is a woman who has
carried a pregnancy for at least 37 weeks of gestation, one or more
times. As used herein, a "nulliparous" subject is a woman who has
never carried a pregnancy for at least 37 weeks gestation.
[0081] As used herein, a "first control frequency" of a cell type
(e.g., CD44+ or CD24+ cells) is the frequency of the cell type in a
comparable sample from a patient or the average frequency in
comparable samples from a plurality of patients known to be at low
risk of developing breast cancer (e.g., parous women not carrying
BRCA1 or BRCA2 mutations). "Comparable sample" typically means the
same sample type (e.g., tumor biopsy or histological section) from
the same tissue (e.g., breast tissue). The first control frequency
can also be a "predetermined reference frequency" (i.e., standard)
to which the frequency of the cell type in a test sample is
compared. As used herein, a "second control frequency" of a cell
type (e.g., CD44+ or CD24+ cells) is the frequency of the cell type
in a comparable sample from a patient or the average frequency in
comparable samples from a plurality of patients known to be at high
risk of developing breast cancer (e.g., nulliparous women).
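By way of a non-limiting sketch, a control frequency defined as an
average over comparable samples can be computed and compared with a
test sample as follows. The function names and all numeric values are
hypothetical, and the sketch only reports the direction of the
difference; it does not assign a risk category.

```python
from statistics import mean


def control_frequency(frequencies: list[float]) -> float:
    """Average cell-type frequency (%) across comparable control samples."""
    if not frequencies:
        raise ValueError("at least one control sample is required")
    return mean(frequencies)


def compare_to_control(test_frequency: float, control_frequencies: list[float]) -> str:
    """Report whether the test frequency is above, below, or equal to the control."""
    control = control_frequency(control_frequencies)
    if test_frequency > control:
        return "increased"
    if test_frequency < control:
        return "decreased"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical frequencies (% of breast epithelial cells), for illustration only.
    controls = [5.0, 7.0, 9.0]
    print(control_frequency(controls))
    print(compare_to_control(12.0, controls))
```

A predetermined reference frequency could be substituted for the
computed average by passing a single-element list.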
[0082] As used herein, the "expression level" of a marker, such as,
e.g., CD44, CD24, CD10, p27, Ki67, Sox17, Cox2, cAMP, EGFR, TGFBR,
Hh, and IGFR, etc., means the mRNA and/or protein expression
level of the marker, or the measurable level of the marker in a
sample (e.g., the level of cAMP can be detected by immunoassay),
which can be determined by any suitable method known in the art,
such as, but not limited to Northern blot, polymerase chain
reaction (PCR), e.g., quantitative real-time PCR ("QPCR"), Western blot,
immunoassay (e.g., ELISA), immunohistochemistry, cell
immunostaining and fluorescence activated cell sorting (FACS),
etc.
[0083] As used herein, a "substantially altered" level of
expression of a gene in a first cell (or first tissue) compared to
a second cell (or second tissue) is an at least 2-fold (e.g., at
least: 2-; 3-; 4-; 5-; 6-; 7-; 8-; 9-; 10-; 15-; 20-; 30-; 40-;
50-; 75-; 100-; 200-; 500-; 1,000-; 2,000-; 5,000-; or 10,000-fold)
altered level of expression of the gene. It is understood that the
alteration can be an increase or a decrease.
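As a minimal sketch of the at-least-2-fold criterion above, the
following treats an increase and a decrease symmetrically by taking
the larger of the two ratios. The function names and the requirement
that both expression levels be positive are assumptions made for
illustration.

```python
def fold_change(level_first: float, level_second: float) -> float:
    """Magnitude of the fold difference between two expression levels (always >= 1)."""
    if level_first <= 0 or level_second <= 0:
        raise ValueError("expression levels must be positive")
    ratio = level_first / level_second
    # A decrease (ratio < 1) is expressed as the reciprocal fold value.
    return ratio if ratio >= 1 else 1 / ratio


def is_substantially_altered(level_first: float, level_second: float,
                             min_fold: float = 2.0) -> bool:
    """True if expression differs by at least min_fold in either direction."""
    return fold_change(level_first, level_second) >= min_fold


if __name__ == "__main__":
    print(is_substantially_altered(10.0, 5.0))  # 2-fold increase
    print(is_substantially_altered(5.0, 10.0))  # 2-fold decrease
    print(is_substantially_altered(10.0, 6.0))  # less than 2-fold
```

The min_fold parameter can be raised to any of the higher thresholds
listed above.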
[0084] As used herein, the term "selectively targets", e.g., in the
context of a specific cell type (e.g., CD44+, CD24- breast
epithelial cells, p27+ breast epithelial cells, etc.) means the
targeting agent (e.g., an inhibitor or agonist) mediates an effect
on the specific target cell, but not on other cells. Thus, for
example, an inhibitor that selectively targets CD44+ cells will
mediate an effect (e.g. inhibition, e.g., of proliferation) on
CD44+ cells, but not on CD44- cells. Such selective targeting can
be achieved, e.g., by conjugating the inhibitor to an antibody that
specifically binds to the target cell (e.g., an anti-CD44
antibody), as well as by other methods known in the art.
[0085] As used herein, "treating" or "treatment" of a state,
disorder or condition includes: (1) preventing or delaying the
appearance of clinical or sub-clinical symptoms of the state,
disorder or condition developing in a mammal that may be afflicted
with or predisposed to the state, disorder or condition but does
not yet experience or display clinical or subclinical symptoms of
the state, disorder or condition; and/or (2) inhibiting the state,
disorder or condition, i.e., arresting, reducing or delaying the
development of the disease or a relapse thereof (in case of
maintenance treatment) or at least one clinical or sub-clinical
symptom thereof; and/or (3) relieving the disease, i.e., causing
regression of the state, disorder or condition or at least one of
its clinical or sub-clinical symptoms; and/or (4) causing a
decrease in the severity of one or more symptoms of the disease.
The benefit to a subject to be treated is either statistically
significant or at least perceptible to the patient or to the
physician.
[0086] As used herein, the term "treating cancer" (e.g., treating
an ER+ or ER- breast cancer) means causing a partial or complete
decrease in the rate of growth of a tumor, and/or in the size of
the tumor and/or in the rate of local or distant tumor metastasis
in the presence of an inhibitor of the invention, and/or any
decrease in tumor survival.
[0087] As used herein, the term "preventing a disease" (e.g.,
preventing ER+ or ER- breast cancer) in a subject means, for
example, to stop the development of one or more symptoms of a
disease in a subject before they occur or are detectable, e.g., by
the patient or the patient's doctor. Preferably, the disease (e.g.,
cancer) does not develop at all, i.e., no symptoms of the disease
are detectable. However, it can also result in delaying or slowing
of the development of one or more symptoms of the disease.
Alternatively, or in addition, it can result in the decreasing of
the severity of one or more subsequently developed symptoms.
[0088] As used herein, a "pathway that has decreased activity",
e.g., in breast epithelial cells (e.g., CD44+, CD24- breast
epithelial cells) of parous or nulliparous women means a pathway
involving one or more genes or polypeptides mediating a function in
the pathway that have reduced level of expression and/or activity.
Non-limiting examples of such pathways are exemplified in Tables 10
and 11.
[0089] As used herein, the term "parity/nulliparity-related gene
signature" means the known expression level of a group of two or
more genes in breast epithelial cells of parous and nulliparous
women (as disclosed herein). For example, the group of genes that
were shown to be upregulated or downregulated in FIG. 28, or a
subgroup of the genes, are part of such parity/nulliparity-related
gene signature. The genes shown in FIG. 28 are summarized in Table
18. Of course, the skilled artisan will appreciate that a
parity/nulliparity-related gene signature can, but does not
necessarily, include all of the genes shown in Table 18.
Preferably, the signature includes 2 or more, 3 or more, 4 or more,
5 or more, 10 or more, 15 or more, 20 or more, 30 or more, 40 or
more, 50 or more, or 100 or more of the genes shown in Table
18.
[0090] As used herein "combination therapy" means the treatment of
a subject in need of treatment with a certain composition or drug
in which the subject is treated or given one or more other
compositions or drugs for the disease in conjunction with the first
and/or in conjunction with one or more other therapies, such as,
e.g., a cancer therapy such as chemotherapy, radiation therapy,
and/or surgery. Such combination therapy can be sequential therapy
wherein the patient is treated first with one treatment modality
(e.g., drug or therapy), and then the other (e.g., drug or
therapy), and so on, or all drugs and/or therapies can be
administered simultaneously. In either case, these drugs and/or
therapies are said to be "coadministered." It is to be understood
that "coadministered" does not necessarily mean that the drugs
and/or therapies are administered in a combined form (i.e., they
may be administered separately or together to the same or different
sites at the same or different times).
[0091] The term "pharmaceutically acceptable derivative" as used
herein means any pharmaceutically acceptable salt, solvate or
prodrug, e.g., ester, of a compound of the invention, which upon
administration to the recipient is capable of providing (directly
or indirectly) a compound of the invention, or an active metabolite
or residue thereof. Such derivatives are recognizable to those
skilled in the art, without undue experimentation. Nevertheless,
reference is made to the teaching of Burger's Medicinal Chemistry
and Drug Discovery, 5th Edition, Vol 1: Principles and Practice,
which is incorporated herein by reference to the extent of teaching
such derivatives. Pharmaceutically acceptable derivatives include
salts, solvates, esters, carbamates, and/or phosphate esters.
[0092] As used herein the terms "therapeutically effective" and
"effective amount", used interchangeably, applied to a dose or
amount refer to a quantity of a composition, compound or
pharmaceutical formulation that is sufficient to result in a
desired activity upon administration to an animal in need thereof.
Within the context of the present invention, the term
"therapeutically effective" refers to that quantity of a
composition, compound or pharmaceutical formulation that is
sufficient to reduce or eliminate at least one symptom of a disease
or condition specified herein, e.g., breast cancer such as ER+ or
ER- breast cancer. When a combination of active ingredients is
administered, the effective amount of the combination may or may
not include amounts of each ingredient that would have been
effective if administered individually. The dosage of the
therapeutic formulation will vary, depending upon the nature of the
disease or condition, the patient's medical history, the frequency
of administration, the manner of administration, the clearance of
the agent from the host, and the like. The initial dose may be
larger, followed by smaller maintenance doses. The dose may be
administered, e.g., weekly, biweekly, daily, semi-weekly, etc., to
maintain an effective dosage level.
[0093] Therapeutically effective dosages can be determined stepwise
by combinations of approaches such as (i) characterization of
effective doses of the composition or compound in in vitro cell
culture assays using tumor cell growth and/or survival as a readout
followed by (ii) characterization in animal studies using tumor
growth inhibition and/or animal survival as a readout, followed by
(iii) characterization in human trials using enhanced tumor growth
inhibition and/or enhanced cancer survival rates as a readout.
[0094] The term "nucleic acid hybridization" refers to the pairing
of complementary strands of nucleic acids. The mechanism of pairing
involves hydrogen bonding, which may be Watson-Crick, Hoogsteen or
reversed Hoogsteen hydrogen bonding, between complementary
nucleoside or nucleotide bases (nucleobases) of the strands of
nucleic acids. For example, adenine and thymine are complementary
nucleobases that pair through the formation of hydrogen bonds.
Hybridization can occur under varying circumstances. Nucleic acid
molecules are "hybridizable" to each other when at least one strand
of one nucleic acid molecule can form hydrogen bonds with the
complementary bases of another nucleic acid molecule under defined
stringency conditions. Stringency of hybridization is determined,
e.g., by (i) the temperature at which hybridization and/or washing
is performed, and (ii) the ionic strength and (iii) concentration
of denaturants such as formamide of the hybridization and washing
solutions, as well as other parameters. Hybridization requires that
the two strands contain substantially complementary sequences.
Depending on the stringency of hybridization, however, some degree
of mismatches may be tolerated. Under "low stringency" conditions,
a greater percentage of mismatches are tolerable (i.e., will not
prevent formation of an anti-parallel hybrid). See Molecular
Biology of the Cell, Alberts et al., 3rd ed., New York and London:
Garland Publ., 1994, Ch. 7.
[0095] Typically, hybridization of two strands at high stringency
requires that the sequences exhibit a high degree of
complementarity over an extended portion of their length. Examples
of high stringency conditions include: hybridization to
filter-bound DNA in 0.5 M NaHPO4, 7% SDS, 1 mM EDTA at 65.degree.
C., followed by washing in 0.1.times.SSC/0.1% SDS (where
1.times.SSC is 0.15 M NaCl, 0.015 M Na citrate) at 68.degree. C. or
for oligonucleotide (oligo) inhibitors washing in 6.times.SSC/0.5%
sodium pyrophosphate at about 37.degree. C. (for 14 nucleotide-long
oligos), at about 48.degree. C. (for about 17 nucleotide-long
oligos), at about 55.degree. C. (for 20 nucleotide-long oligos),
and at about 60.degree. C. (for 23 nucleotide-long oligos).
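The oligonucleotide wash temperatures listed above can be kept as a
simple lookup, sketched below for illustration. Only the four lengths
stated in the text are included; the sketch deliberately does not
interpolate for other lengths. The names used are hypothetical, and
the NaCl helper relies only on the stated 0.15 M NaCl content of
1.times.SSC.

```python
# Approximate wash temperatures (degrees C) for oligonucleotides, keyed by
# oligo length in nucleotides, as listed in the text above.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}

# NaCl concentration (mol/L) of 1x SSC, as stated in the text above.
SSC_1X_NACL_MOLAR = 0.15


def oligo_wash_temperature(length_nt: int) -> int:
    """Return the listed wash temperature for an oligo of the given length."""
    if length_nt not in OLIGO_WASH_TEMP_C:
        # No interpolation: lengths not listed in the text are rejected.
        raise KeyError(f"no wash temperature listed for {length_nt}-nt oligos")
    return OLIGO_WASH_TEMP_C[length_nt]


def ssc_nacl_molarity(fold: float) -> float:
    """NaCl concentration (mol/L) of an SSC solution at the given fold strength."""
    return fold * SSC_1X_NACL_MOLAR


if __name__ == "__main__":
    print(oligo_wash_temperature(20))
    print(round(ssc_nacl_molarity(0.1), 4))
```

Temperatures for unlisted lengths would, as the text notes for
hybridization conditions generally, be determined empirically.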
[0096] Conditions of intermediate or moderate stringency (such as,
for example, an aqueous solution of 2.times.SSC at 65.degree. C.;
alternatively, for example, hybridization to filter-bound DNA in
0.5 M NaHPO4, 7% SDS, 1 mM EDTA at 65.degree. C. followed by
washing in 0.2.times.SSC/0.1% SDS at 42.degree. C.) and low
stringency (such as, for example, an aqueous solution of
2.times.SSC at 55.degree. C.), require correspondingly less overall
complementarity for hybridization to occur between two sequences.
Specific temperature and salt conditions for any given stringency
hybridization reaction depend on the concentration of the target
DNA or RNA molecule and length and base composition of the probe,
and are normally determined empirically in preliminary experiments,
which are routine (see Southern, J. Mol. Biol. 1975; 98:503;
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed.,
vol. 2, ch. 9.50, CSH Laboratory Press, 1989; Ausubel et al.
(eds.), 1989, Current Protocols in Molecular Biology, Vol. I, Greene
Publishing Associates, Inc., and John Wiley & Sons, Inc., New
York, at p. 2.10.3). An extensive guide to the hybridization of
nucleic acids is found in, e.g., Tijssen (1993) Laboratory
Techniques in Biochemistry and Molecular Biology--Hybridization
with Nucleic Acid Probes part I, chapt 2, "Overview of principles
of hybridization and the strategy of nucleic acid probe assays,"
Elsevier, N.Y. ("Tijssen").
[0097] As used herein, the term "standard hybridization conditions"
refers to hybridization conditions that allow hybridization of two
nucleotide molecules having at least 50% sequence identity.
According to a specific embodiment, hybridization conditions of
higher stringency may be used to allow hybridization of only
sequences having at least 75% sequence identity, at least 80%
sequence identity, at least 90% sequence identity, at least 95%
sequence identity, or at least 99% sequence identity.
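By way of non-limiting illustration, the identity thresholds recited above can be checked computationally once two sequences have been aligned. The following is a minimal sketch only. It assumes the sequences are already aligned to equal length with "-" marking gaps, and it counts identity over all aligned columns; other definitions of percent identity exist, and the function names are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: identity is counted over every aligned column except
    columns in which both sequences carry a gap.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(seq_a: str, seq_b: str, threshold: float = 50.0) -> bool:
    """True if the pair reaches the given percent identity (e.g., 50, 75, 90)."""
    return percent_identity(seq_a, seq_b) >= threshold
```

For example, two 8-nucleotide sequences that differ at one position are 87.5% identical, so they satisfy a 75% threshold but not a 90% threshold.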
[0098] As used herein, the phrase "under hybridization conditions"
means under conditions that facilitate specific hybridization of a
subset of capture oligonucleotides to complementary sequences
present in the cDNA or cRNA. The terms "hybridizing specifically
to" and "specific hybridization" and "selectively hybridize to," as
used herein refer to the binding, duplexing, or hybridizing of a
nucleic acid molecule preferentially to a particular nucleotide
sequence under at least moderately stringent conditions, and
preferably, highly stringent conditions, as discussed above.
[0099] "Polypeptide" and "protein" are used interchangeably and
mean any peptide-linked chain of amino acids, regardless of length
or post-translational modification.
[0100] As used herein, the term "nucleic acid" or "oligonucleotide"
refers to a deoxyribonucleotide or ribonucleotide in either single-
or double-stranded form. The term also encompasses
nucleic-acid-like structures with synthetic backbones. DNA backbone
analogues provided by the invention include phosphodiester,
phosphorothioate, phosphorodithioate, methylphosphonate,
phosphoramidate, alkyl phosphotriester, sulfamate, 3'-thioacetal,
methylene(methylimino), 3'-N-carbamate, morpholino carbamate, and
peptide nucleic acids (PNAs); see Oligonucleotides and Analogues, a
Practical Approach, edited by F. Eckstein, IRL Press at Oxford
University Press (1991); Antisense Strategies, Annals of the New
York Academy of Sciences, Volume 600, Eds. Baserga and Denhardt
(NYAS 1992); Milligan (1993) J. Med. Chem. 36:1923-1937; Antisense
Research and Applications (1993, CRC Press). PNAs contain non-ionic
backbones, such as N-(2-aminoethyl) glycine units. Phosphorothioate
linkages are described in WO 97/03211; WO 96/39154; Mata (1997)
Toxicol. Appl. Pharmacol. 144:189-197. Other synthetic backbones
encompassed by the term include methyl-phosphonate linkages or
alternating methylphosphonate and phosphodiester linkages
(Strauss-Soukup (1997) Biochemistry 36:8692-8698), and
benzylphosphonate linkages (Samstag (1996) Antisense Nucleic Acid
Drug Dev 6:153-156). The term nucleic acid is used interchangeably
with cDNA, cRNA, mRNA, oligonucleotide, probe and amplification
product.
III. CELL MARKERS
[0101] In certain embodiments, it is desirable to detect the
presence and/or expression level of one or more cell markers (e.g.,
estrogen receptor (ER), p27, CD24, CD44, CD10, Ki67, BRCA1, BRCA2,
etc.) associated with breast epithelial cells and/or breast cancer
(e.g., ER+ or ER- breast cancer). Moreover, the present document
features methods in which the relative numbers of cells expressing
one or more of these markers are determined. The nucleic acid and
amino acid sequences for such markers are known and have been
described, and the GenBank® Accession Nos. of exemplary nucleic
acid and amino acid sequences for the human markers are provided in
Table 1, below.
TABLE 1 Exemplary GenBank® Accession Numbers, Breast Cancer-Associated Markers

Gene Name | Nucleic Acid GenBank® Accession No. | SEQ ID NO | Corresponding Polypeptide Name | Amino Acid GenBank® Accession No. | SEQ ID NO
CD24 | BG327863 | 1 | Sialoglycoprotein | ACI46150.1 | 2
CD10 | NM_007289.2 | 3 | Neprilysin | NP_009220.2 | 4
CD44 | BC004372 | 5 | CD44 | AAB30429.1 | 6
P27/CDKN1B | BC001971 | 7 | CDKN1B | CAG33680.1 | 8
Ki67 (MKI67) | AU152107 | 9 | KI67 antigen | CAD99007.1 | 10
Homo sapiens SRY (sex determining region Y)-box 17 (SOX17) | NM_022454.3 | 11 | transcription factor SOX-17 | NP_071899 | 12
Prostaglandin-endoperoxide synthase 2 (prostaglandin G/H synthase and cyclooxygenase) (PTGS2) | NM_000963 | 13 | prostaglandin G/H synthase 2 precursor | NP_000954 | 14
Epidermal Growth Factor Receptor P (EGFR) | NM_005228 | 15 | Epidermal growth factor receptor | NP_005219.2 | 19
 | NM_201282 | 16 | | NP_958439.1 | 20
 | NM_201283.1 | 17 | | NP_958440.1 | 21
 | NM_201284 | 18 | | NP_958441.1 | 22
sonic hedgehog protein (SHH) | NM_000193 | 23 | Sonic hedgehog protein | NP_000184 | 24
insulin-like growth factor 1 receptor (IGF1R) | NM_000875 | 25 | Insulin like Growth factor receptor | NP000866 | 26
transforming growth factor, beta receptor 1 (TGFBR1) | NM_004612 | 27 | Transforming Growth factor receptor beta receptor | NP_004603 | 29
 | NM_001130916 | 28 | | NP_001124388 | 30
estrogen receptor 1 (ESR1) | NM_000125.3 | 31 | Estrogen Receptor I | NP_000116 | 35
 | NM_001122740.1 | 32 | | NP_001116212 | 36
 | NM_001122741.1 | 33 | | NP_001116213 | 37
 | NM_001122742.1 | 34 | | NP_001116214 | 38
breast cancer type 1 susceptibility protein (BRCA1) | NM_007294.3 | 39 | breast cancer type 1 susceptibility protein (BRCA1) | NP_009225 | 44
 | NM_007300.3 | 40 | | NP_009231.2 | 45
 | NM_007297.3 | 41 | | NP_009228.2 | 46
 | NM_007298.3 | 42 | | NP_009229.2 | 47
 | NM_007299.3 | 43 | | NP_009230.2 | 48
Homo sapiens breast cancer 2, early onset (BRCA2) | NM_000059 | 49 | breast cancer type 2 susceptibility protein (BRCA2) | NP_000050 | 50
Androgen Receptor (AR) | NM_000044 | 51 | Androgen Receptor (AR) | NP_000035 | 53
 | NM_001011645 | 52 | | NP_001011645 | 54
[0102] In certain embodiments, it is desirable to determine (e.g.,
assay, measure, approximate) the level (e.g., expression or
activity) of, e.g., one of the above-identified markers. The
expression level of such markers may be determined according to any
suitable method known in the art. A non-limiting example of such a
method includes real-time PCR (RT-PCR), e.g., quantitative RT-PCR
(QPCR), which measures the expression level of the mRNA encoding
the polypeptide. Real-time PCR evaluates the level of PCR product
accumulation during amplification. RNA (or total genomic DNA for
detection of germline mutations) is isolated from a sample. RT-PCR
can be performed, for example, using a Perkin Elmer/Applied
Biosystems (Foster City, Calif.) 7700 Prism instrument. Matching
primers and fluorescent probes can be designed for genes of
interest, based on the genes' nucleic acid sequences (e.g., as
described above), using, for example, the Primer Express program
provided by Perkin Elmer/Applied Biosystems (Foster City, Calif.).
Optimal concentrations of primers and probes can be initially
determined by those of ordinary skill in the art, and control (for
example, beta-actin) primers and probes may be obtained
commercially from, for example, Perkin Elmer/Applied Biosystems
(Foster City, Calif.).
[0103] To quantitate the amount of the specific nucleic acid of
interest in a sample, a standard curve is generated using a
control. Standard curves may be generated using the Ct values
determined in the real-time PCR, which are related to the initial
concentration of the nucleic acid of interest used in the assay.
Standard dilutions ranging from 10 to 10^6 copies of the gene of
interest are generally sufficient. In addition, a standard curve is
generated for the control sequence. This permits standardization of
initial content of the nucleic acid of interest in a tissue sample
to the amount of control for comparison purposes. Methods of QPCR
using TaqMan probes are well known in the art. Detailed protocols
for QPCR are provided, for example, for RNA in: Gibson et al.,
1996, Genome Res., 6:995-1001; and for DNA in: Heid et al., 1996,
Genome Res., 6:986-994; and in Innis et al. (1990) Academic Press,
Inc. N.Y.
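The standard-curve quantitation described above can be sketched, for illustration only, as a least-squares fit of Ct against log10 of the starting copy number, followed by normalization of the nucleic acid of interest to the control. The sketch assumes a linear relationship between Ct and log10 copies across the dilution range; all function names and any numerical values supplied to them are hypothetical and are not taken from the Examples.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


def normalized_amount(target_ct, target_curve, control_ct, control_curve):
    """Amount of the nucleic acid of interest relative to the control
    (e.g., beta-actin), each read off its own standard curve."""
    target = copies_from_ct(target_ct, *target_curve)
    control = copies_from_ct(control_ct, *control_curve)
    return target / control
```

In use, one curve would be fitted from the dilution series of the gene of interest and a second from the control sequence, and each sample's Ct values would then be passed to `normalized_amount` together with the two fitted curves.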
[0104] Expression of mRNA, as well as expression of peptides and
other biological factors can also be determined using microarray,
methods for which are well known in the art [see, e.g., Watson et
al. Curr Opin Biotechnol (1998) 9: 609-14; "DNA microarray
technology: Devices, Systems, and Applications" Annual Review of
Biomedical Engineering; Vol. 4: 129-153 (2002); Chehab et al.
(1989) "Detection of specific DNA sequences by fluorescence
amplification: a color complementation assay" Proc. Natl. Acad.
Sci. USA, 86: 9178-9182; Lockhart et al. (1996) "Expression
monitoring by hybridization to high-density oligonucleotide arrays"
Nature Biotechnology, 14: 1675-1680; and M. Schena et al. (1996)
"Parallel human genome analysis: Microarray-based expression
monitoring of 1000 genes" Proc. Natl. Acad. Sci. USA,
93:10614-10619; Peptide Microarrays Methods and Protocols; Methods
in Molecular Biology; Volume 570, 2009, Humana Press; and Small
Molecule Microarrays Methods and Protocols; Series: Methods in
Molecular Biology, Vol. 669, Uttamchandani, Mahesh; Yao, Shao Q.
(Eds.) 2010, Humana Press]. For example, mRNA expression
profiling can be performed to identify differentially expressed
genes, wherein the raw intensities determined by microarray are
log_e-transformed and quantile normalized and gene set
enrichment analysis (GSEA) is performed according to, e.g.,
Subramanian et al. (2005) Proc Natl Acad Sci USA
102:15545-15550.
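For illustration, the natural-log transformation and quantile normalization steps mentioned above can be sketched as follows. This is a simplified sketch, not the procedure used in the Examples: it assumes a complete matrix of positive raw intensities (rows are genes, columns are arrays), breaks ties by row order rather than averaging them, and does not implement GSEA. The function names are hypothetical.

```python
import math


def log_transform(matrix):
    """Natural-log transform of positive raw intensities."""
    return [[math.log(v) for v in row] for row in matrix]


def quantile_normalize(matrix):
    """Give every column (array) the same distribution of values.

    The k-th smallest value in each column is replaced by the mean of
    the k-th smallest values across all columns. Ties are broken by
    row order, which is a simplifying assumption.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[k] for col in sorted_cols) / n_cols for k in range(n_rows)
    ]
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            result[i][j] = rank_means[rank]
    return result
```

A typical call would be `quantile_normalize(log_transform(raw))`, after which each array has an identical set of values and only the gene-to-rank assignment differs between arrays.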
[0105] Other suitable amplification methods include, but are not
limited to, ligase chain reaction (LCR) (see Wu and Wallace (1989)
Genomics 4:560, Landegren et al. (1988) Science 241:1077, and
Barringer et al. (1990) Gene 89:117), transcription amplification
(Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173),
self-sustained sequence replication (Guatelli et al. (1990) Proc.
Natl. Acad. Sci. USA 87:1874), dot PCR, and linker adapter PCR, etc.
In another embodiment, DNA sequencing may be used to determine the
presence of ER in a genome. Methods for DNA sequencing are known to
those of skill in the art.
[0106] Other methods for detecting gene expression (e.g., mRNA
levels) include Serial Analysis of Gene Expression applied to
high-throughput sequencing (SAGEseq), as described in the present
Examples and in Wu Z J et al. Genome Res. 2010
December;20(12):1730-9.
[0107] For the detection of germline mutations (e.g., in BRCA1,
BRCA2), Southern blotting can also be used. Methods for Southern
blotting are known to those of skill in the art (see, e.g., Current
Protocols in Molecular Biology, Chapter 19, Ausubel, et al., Eds.,
Greene Publishing and Wiley-Interscience, New York, 1995, or
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d Ed.
vol. 1-3, Cold Spring Harbor Press, NY, 1989). In such an assay,
the genomic DNA (typically fragmented and separated on an
electrophoretic gel) is hybridized to a probe specific for the
target region. Comparison of the intensity of the hybridization
signal from the probe for the target region with control probe
signal from analysis of normal genomic DNA (e.g., genomic DNA from
the same or related cell, tissue, organ, etc.) provides an estimate
of the relative copy number of the target nucleic acid. Arrays of
nucleic acid probes can also be employed to detect single or multiple
germline or somatic mutations by methods known in the art.
[0108] Other examples of suitable methods for detecting expression
levels of the cell markers described herein include, e.g., Western
blot, ELISA and/or immunohistochemistry, which can be used to
measure protein expression level. Such methods are well known in
the art.
[0109] The frequency of cells that are specific for one or more
particular markers (e.g., the frequency of CD44+ or CD24+ breast
epithelial cells) can be detected according to any suitable method
known in the art. For example, flow cytometry is widely used for
analyzing the expression of cell surface and intracellular
molecules (on a per cell basis), characterizing and defining
different cell types in heterogeneous populations, assessing the
purity of isolated subpopulations, and analyzing cell size and
volume. This technique is predominantly used to measure
fluorescence intensity produced by fluorescent-labeled antibodies
or ligands that bind to specific cell-associated molecules, and is
described in detail in, e.g., Holmes, K. et al. "Preparation of
Cells and Reagents for Flow Cytometry" Current Protocols in
Immunology, Unit 5.3.
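As a non-limiting illustration of how a marker-defined cell frequency might be computed from per-cell flow cytometry measurements, the sketch below counts events that fall above a CD44 gate and below a CD24 gate. It assumes one fluorescence intensity per marker per cell and fixed positivity thresholds; the gate values, the `Event` type, and the function name are hypothetical, and in practice gates are set from appropriate controls.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One cell: fluorescence intensity per marker (arbitrary units)."""
    cd44: float
    cd24: float


def frequency_cd44pos_cd24neg(events, cd44_gate, cd24_gate):
    """Fraction of events that are CD44+ (at or above the CD44 gate)
    and CD24- (below the CD24 gate). Gate values are assumptions
    supplied by the caller."""
    if not events:
        raise ValueError("no events supplied")
    hits = sum(
        1 for e in events if e.cd44 >= cd44_gate and e.cd24 < cd24_gate
    )
    return hits / len(events)
```

The resulting fraction could then be compared against a control frequency obtained in the same way from control samples.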
[0110] Non-limiting examples of primary antibodies that may be used
to identify the expression of certain markers by one or more
assays, e.g., by flow cytometry, immunohistochemistry (IHC), and/or
Western blot are listed in Table 2, below:
TABLE 2 Exemplary Cell Marker Primary Antibodies

Cell Marker | Primary Antibody | Application (e.g., Western blot, flow cytometry, IHC) | Commercial Source
CD24 | clone SN3b | IHC | Neomarkers
CD24 | clone ML5 | FACS | Biolegend
CD10 | 56C6 clone | IHC | Dako
CD10 | Clone HI10a | FACS | Biolegend
CD44 | clone 156-3C11 | IHC | Neomarkers
CD44 | Clone 515 | FACS | BD
P27 | clone 57/Kip1/p27 | IHC | BD Biosciences
Ki67 | N/A | IHC | Abcam
Sox17 | clone 245013 | IHC | R&D Systems
Cox2 | clone CX229 | IHC | Cayman Chemical
pEGFR | 53A5 (Tyr1173) | IHC | Cell Signaling Technology
Shh | Cat# 06-1106 | WB, IHC | Millipore
IGF-1R | Clone 24-31 | IHC (P) | Imgenex
pTGFBR | Phospho S165 | ICC/IF | Abcam
ER | Estrogen Receptor (clone SP1) | IHC | Thermo Scientific
AR | Androgen receptor (clone D6F11, #5153) | WB/IHC-P/IF/IC/F | Cell Signaling Technology
BRCA1 | MS110 clone | IF/IP/WB | Calbiochem
BRCA2 | Cst#CA1033 | WB/IP/IHC(P) | Millipore

Abbreviations: WB: Western blotting; IHC:
Immunohistochemistry; IHC-P: immunohistochemistry-paraffin; ICC:
immunocytochemistry; IF: immunofluorescence; F: flow cytometry
IV. GENES AND PATHWAYS DIFFERENTIALLY REGULATED BY PARITY
STATUS
[0111] In certain embodiments, it is desirable to decrease (e.g.,
inhibit) the expression and/or activity of genes and/or
polypeptides encoded by those genes that are discovered herein to
be upregulated in breast epithelial cells of nulliparous women
relative to parous women. For example, one or more of the genes
that are upregulated in CD44+, CD24+, CD10+ and stromal breast
epithelial cells of nulliparous women, in Tables 4, 5, 6 and 7,
respectively, can be targeted with an inhibitor as described herein
in order to treat or prevent breast cancer (e.g., ER+ or ER- breast
cancer). Further, for example, one or more of the genes that are
upregulated in CD44+ breast epithelial cells of BRCA1 and/or BRCA2
mutation carriers compared to control (normal) breast epithelial
cells, as shown, e.g., in Tables 8 and 9, can be targeted with an
inhibitor as described herein in order to treat or prevent breast
cancer (e.g., ER+ or ER- breast cancer). By way of non-limiting
example, p27 expression is higher in BRCA1 mutation carriers and
in BRCA2 mutation carriers compared to control (non-mutation
carriers, normal cells), and is an exemplary target for an
inhibitor as discussed above.
[0112] In other embodiments, it is desirable to increase the
expression and/or activity of genes and/or polypeptides encoded by
those genes that are discovered herein to be upregulated in breast
epithelial cells of parous women relative to nulliparous women. For
example, one or more of the genes that are upregulated in CD44+,
CD24+, CD10+ and stromal breast epithelial cells of parous women,
in Tables 4, 5, 6 and 7, respectively, can be targeted with an
agonist as described herein in order to treat or prevent breast
cancer (e.g., ER+ or ER- breast cancer). Further, for example, one
or more of the genes that are downregulated in CD44+ breast
epithelial cells of BRCA1 and/or BRCA2 mutation carriers compared
to control (normal) breast epithelial cells, as shown, e.g., in
Tables 8 and 9, can be targeted with an agonist as described herein
in order to treat or prevent breast cancer (e.g., ER+ or ER- breast
cancer).
[0113] In certain embodiments, methods for treating breast cancer
(e.g., ER+ or ER- breast cancer) involve targeting (e.g.,
inhibiting) one or more pathways that have increased activity in
breast epithelial cells (e.g., CD44+, CD24- breast epithelial
cells) of nulliparous women compared to the activity in the breast
epithelial cells of parous women (such pathways are also referred
to herein as "pathways active in nulliparous (NP) breast epithelial
cells"). The identification of such pathways is described in detail
in Example 3, below, and the pathways are listed in Tables 10 and
11, below. In a specific embodiment, the pathway is a member
selected from the group consisting of cytoskeleton remodeling,
chemokine, androgen signaling, cell adhesion, and Wnt signaling. In
another embodiment, the pathway includes a mediator molecule
selected from the group consisting of cyclic AMP (cAMP) (Signal
transduction_cAMP signaling pathway), EGFR (e.g., Development_EGFR
signaling via small GTPases pathway, EGFR signaling pathway), Cox2
(e.g., Role and regulation of Prostaglandin E2 in gastric cancer
pathway), Hh (e.g., hedgehog signaling pathways), and IGFR (IGFR-IGF
signaling pathways).
[0114] In other embodiments, methods for treating breast cancer
involve targeting (e.g., administering an agonist of) one or more
pathways that have decreased activity in breast epithelial cells
(e.g., CD44+, CD24- breast epithelial cells) of nulliparous women
compared to the breast epithelial cells of parous women (i.e.,
pathways that have increased activity in breast epithelial cells of
parous women, which are also referred to herein as "pathways active in
parous (P) breast epithelial cells"). Such pathways are identified
in Example 3 and Tables 10 and 12.
[0115] Exemplary pathways are pathways active in nulliparous CD44+,
CD24- breast epithelial cells, as shown in Table 11, although
pathways active in other nulliparous breast epithelial cell types
(e.g., CD24+, CD10+ and/or stromal breast epithelial cells) are
also encompassed herein, and include, but are not limited to,
Cytoskeleton remodeling_Role of PKA in cytoskeleton reorganisation,
Development_MAG-dependent inhibition of neurite outgrowth, Role of
DNA methylation in progression of multiple myeloma, Cell
adhesion_Histamine H1 receptor signaling in the interruption of
cell barrier integrity, Stem cells_Response to hypoxia in
glioblastoma stem cells, Development_WNT signaling pathway. Part 2,
Development_Slit-Robo signaling, Cytoskeleton
remodeling_Fibronectin-binding integrins in cell motility,
Oxidative phosphorylation, etc. The genes and the polypeptides
encoded by those genes that mediate one or more functions in these
pathways are known in the art and can be determined using, e.g.,
Metaminer software (GeneGo). Thus, the following genes are provided
as non-limiting examples of genes involved in the pathways active
in nulliparous CD44+, CD24- breast epithelial cells.
[0116] For example, genes involved in metabolic pathways active in
nulliparous CD44+, CD24- breast epithelial cells (e.g., the
pathways: Transcription_Transcription regulation of amino acid
metabolism, Regulation of lipid metabolism_Stimulation of
Arachidonic acid production by ACM receptors, Ubiquinone
metabolism, and Mitochondrial ketone bodies biosynthesis and
metabolism), include, but are not limited to, HSD17B11 (GenBank
Accession No. BC014327, CA775960), HSD17B12 (GenBank Accession No.
AF078850), and HSD17B14 (GenBank Accession No. AF126781), which are
involved in regulation of lipid metabolism pathways.
[0117] Genes involved in androgen signaling pathways active in
nulliparous CD44+, CD24- breast epithelial cells (e.g., the
pathways: "Putative role of Estrogen receptor and Androgen receptor
signaling in progression of lung cancer", "Androgen signaling in
HCC" (see Tables 10 and 11)) include, but are not limited to, PSA
(KLK3) (GenBank Accession Nos. AC011523, BC005307), which is
involved in androgen signaling.
[0118] Genes involved in developmental and thyroid signaling
pathways active in nulliparous CD44+, CD24- breast epithelial cells
(e.g., the pathways: Development_Glucocorticoid receptor signaling,
Development_Hedgehog and PTH signaling pathways in bone and
cartilage development) include, but are not limited to, NCOR1
(GenBank Accession No. AC002553), NCOR2 (GenBank Accession No.
AB209089, AC073916), NCOA4 (GenBank Accession No. AL162047), and
NCOA7 (GenBank Accession No. AJ420542).
[0119] Genes involved in Wnt signaling pathways active in
nulliparous CD44+, CD24- breast epithelial cells (e.g., the
pathways: Development_WNT signaling pathway, Cytoskeleton
remodeling_TGF, WNT and cytoskeletal remodeling, Stem
cells_WNT/Beta-catenin and NOTCH in induction of osteogenesis)
include, but are not limited to, SFRP2 (GenBank Accession No.
AA449032, AF311912), SFRP4 (GenBank Accession No. AC018634,
BT019679), VEGFA (GenBank Accession Nos. AF024710, BF700556), HIF1A
(GenBank Accession Nos. BC012527, CN264320), NOTCH1 (GenBank
Accession Nos. AB209873, AF308602, AL592301), and FN1 (GenBank
Accession Nos. AI033037, AJ535086).
[0120] Genes involved in chemokine pathways active in nulliparous
CD44+, CD24- breast epithelial cells (e.g., the pathways: Cell
adhesion_Chemokines and adhesion, Cell adhesion_Alpha-4 integrins
in cell migration and adhesion, Cell adhesion_Plasmin signaling,
Cell adhesion_ECM remodeling, Cell adhesion_Role of tetraspanins in
the integrin-mediated cell adhesion) include, but are not limited
to, ITGA4 (GenBank Accession No. AC020595), ITGB1 (GenBank
Accession No. AI261443), and TSPAN6 (GenBank Accession Nos.
AF043906, BC012389).
[0121] Genes involved in cytoskeleton remodeling pathways active in
nulliparous CD44+, CD24- breast epithelial cells (e.g., the
pathways: Cytoskeleton remodeling_Regulation of actin cytoskeleton
by Rho GTPases, Cytoskeleton remodeling_Fibronectin-binding
integrins in cell motility, Cytoskeleton remodeling_Reverse
signaling by ephrin B, Cytoskeleton remodeling_Role of PKA in
cytoskeleton reorganisation) include, but are not limited to, RhoA
(GenBank Accession Nos. AK130066, BC000946), RAC1 (GenBank
Accession No. AC009412), CDC42 (GenBank Accession No.
NM_001039802), and EPHB4 (GenBank Accession Nos. AY056048,
BC052804).
[0122] The pathways for DNA repair, PI3K/AKT signaling, and
apoptosis have been demonstrated herein to be active in parous
CD44+, CD24- breast epithelial cells. Other non-limiting examples
of pathways active in parous breast epithelial cells include, e.g.,
TTP metabolism, Resistance of pancreatic cancer cells to death
receptor signaling, Transcription_Assembly of RNA Polymerase II
preinitiation complex on TATA-less promoters, Development_PIP3
signaling in cardiac myocytes, HCV-dependent regulation of RNA
polymerases leading to HCC, Stem cells_H3K9 demethylases in
pluripotency maintenance of stem cells, Inhibition of apoptosis in
gastric cancer, Cell cycle_Start of DNA replication in early S
phase, Apoptosis and survival_Caspase cascade, Immune response_BCR
pathway, Immune response_ICOS pathway in T-helper cell, Cell
cycle_The metaphase checkpoint, Inhibitory action of Lipoxins on
neutrophil migration, Cytoskeleton remodeling_Alpha-1A adrenergic
receptor-dependent inhibition of PI3K, DNA damage_NHEJ mechanisms
of DSBs repair, Regulation of metabolism_Triiodothyronine and
Thyroxine signaling, Cell cycle_Chromosome condensation in
prometaphase, Development_IGF-1 receptor signaling, dCTP/dUTP
metabolism, dGTP metabolism, Inhibition of RUNX3 signaling in
gastric cancer, Apoptosis and survival_Beta-2 adrenergic receptor
anti-apoptotic action, Signal transduction_Activin A signaling
regulation, Stem cells_Fetal brown fat cell differentiation, Immune
response_CXCR4 signaling via second messenger, dATP/dITP
metabolism, Signal transduction_PTEN pathway, Microsatellite
instability in gastric cancer, Inhibition of TGF-beta signaling in
gastric cancer, Immune response_Regulation of T cell function by
CTLA-4, DNA damage_DNA-damage-induced responses, etc. (see Tables
10 and 12). The genes and proteins encoded by those genes that
mediate functions in these pathways are well known in the art.
Thus, the skilled artisan will know which specific genes and/or
polypeptides to target (e.g., with an agonist) as described herein
(e.g., for the treatment or prevention of breast cancer (e.g., ER+
or ER- breast cancer)).
[0123] By way of example, genes involved in apoptosis pathways
active in parous CD44+, CD24- breast epithelial cells (e.g., the
pathways, Apoptosis and survival_FAS signaling cascades, Apoptosis
and survival_Caspase cascade, Apoptosis and survival_HTR1A
signaling, Apoptosis and survival_Beta-2 adrenergic receptor
anti-apoptotic action, Apoptosis and survival_Granzyme A signaling,
Apoptosis and survival_Cytoplasmic/mitochondrial transport of
pro-apoptotic proteins Bid, Bmf and Bim) that are upregulated in parous
breast epithelial cells include, but are not limited to, BCL2L11
(GenBank Accession Nos. AC096670, AI268146, AK290377, AY428962),
TNFRSF4 (GenBank Accession Nos. AW290885, BC105070), BMPR2 (GenBank
Accession Nos. AC009960, BC035097), CASP8 (GenBank Accession Nos.
BF439983, AC007256, AF422927), and PP2A (GenBank Accession Nos.
AL158151, CD630703, DA052599, X73478).
[0124] Genes involved in PI3K/AKT signaling pathways active in
parous CD44+, CD24- breast epithelial cells (e.g., the pathways,
Cytoskeleton remodeling_Alpha-1A adrenergic receptor-dependent
inhibition of PI3K, Signal transduction_AKT signaling, PI3K
signaling in gastric cancer) that are upregulated in parous breast
epithelial cells include, but are not limited to, PIK3CG (GenBank
Accession No. X83368), p85 (GenBank Accession Nos. AC016564,
BC094795, CA427864, CT003423), ILK (GenBank Accession Nos. BC001554,
CB113885, U40282), and PDPK1 (GenBank Accession Nos. AC093525, AC141586,
BC012103).
[0125] Genes involved in tumor suppressor pathways active in parous
breast epithelial cells (e.g., the pathways: Apoptosis and
survival_Cytoplasmic/mitochondrial transport of pro-apoptotic
proteins Bid, Bmf and Bim, Apoptosis and survival_Caspase cascade,
Cytoskeleton remodeling_Alpha-1A adrenergic receptor-dependent
inhibition of PI3K, Cell cycle_The metaphase checkpoint) include,
but are not limited to, Hakai/CBLL1 (GenBank Accession Nos.
AC002467, AK026762, AK293352), CASP8 (GenBank Accession No.
BF439983), SCRIB (GenBank Accession No. AI469403), and LLGL2
(GenBank Accession Nos. AC100787, BC031842).
[0126] The skilled artisan will appreciate that the foregoing are
non-limiting examples of pathways, as well as genes and
polypeptides mediating functions in those pathways, that can be
targeted (e.g., by an inhibitor or agonist) for the treatment of
breast cancer, and other targets, such as those set forth in Tables
10, 11, and 12, below, are also encompassed by the present
invention.
V. INHIBITORS AND AGONISTS
[0127] Inhibitors and agonists may be used to treat or prevent
breast cancer in a subject, as described herein. One of skill in
the art will appreciate that the design of such inhibitors and
agonists will depend on the specific pathway (e.g., metabolic
pathways, androgen signaling pathways, tumor suppression, etc., as
described above) being targeted. The skilled artisan will
understand how to design such inhibitors and agonists, based on
methods well known in the art.
[0128] The following are thus provided as non-limiting examples
(e.g., antisense nucleic acids, RNAi, ribozymes, triple helix
forming oligonucleotides (TFOs), antibodies (including, but not
limited to, intrabodies), aptamers, and other small molecules).
Other inhibitors that target pathways (e.g., inhibit expression
and/or activity of specific genes and/or polypeptides encoded by
those genes that mediate a function in the pathway) that are active
in breast epithelial cells of nulliparous women, and agonists that
target pathways (e.g., increase expression and/or activity of
specific genes and/or polypeptides that mediate a function in the
pathway) that are active in parous women, are also encompassed by
the present disclosure.
[0129] Antisense Nucleic Acids
[0130] Antisense oligonucleotides can be used to inhibit the
expression of a target polypeptide of the invention (e.g.,
HSD17B11, HSD17B12, HSD17B14, etc.). Antisense oligonucleotides
typically are about 5 nucleotides to about 30 nucleotides in
length, about 10 to about 25 nucleotides in length, or about 20 to
about 25 nucleotides in length. For a general discussion of
antisense technology, see, e.g., Antisense DNA and RNA, (Cold
Spring Harbor Laboratory, D. Melton, ed., 1988).
[0131] Appropriate chemical modifications of the inhibitors are
made to ensure stability of the antisense oligonucleotide, as
described below. Changes in the nucleotide sequence and/or in the
length of the antisense oligonucleotide can be made to ensure
maximum efficiency and thermodynamic stability of the inhibitor.
Such sequence and/or length modifications are readily determined by
one of ordinary skill in the art.
[0132] The antisense oligonucleotides can be DNA or RNA or chimeric
mixtures, or derivatives or modified versions thereof, and can be
single-stranded or double-stranded. Thus, for example, in the
antisense oligonucleotides set forth herein, when a sequence
includes thymidine residues, one or more of the thymidine residues
may be replaced by uracil residues and, conversely, when a sequence
includes uracil residues, one or more of the uracil residues may be
replaced by thymidine residues.
[0133] Antisense oligonucleotides comprise sequences complementary
to at least a portion of the nucleic acid encoding the corresponding
target polypeptide.
However, 100% sequence complementarity is not required so long as
formation of a stable duplex (for single stranded antisense
oligonucleotides) or triplex (for double stranded antisense
oligonucleotides) can be achieved. The ability to hybridize will
depend on both the degree of complementarity and the length of the
antisense oligonucleotides. Generally, the longer the antisense
oligonucleotide, the more base mismatches with the corresponding
nucleic acid target can be tolerated. One skilled in the art can
ascertain a tolerable degree of mismatch by use of standard
procedures to determine the melting point of the hybridized
complex.
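The relationships described above (complementarity to the target, interchange of thymidine and uracil residues, and tolerance of a limited number of mismatches) can be illustrated with the following minimal sketch. It assumes unmodified A/C/G/T(U) bases, an ungapped comparison of equal-length sequences written 5' to 3', and simple Watson-Crick pairing; it does not model duplex stability or melting point, and the function names are hypothetical.

```python
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def to_dna(oligo: str) -> str:
    """Replace uracil with thymidine residues."""
    return oligo.upper().replace("U", "T")


def to_rna(oligo: str) -> str:
    """Replace thymidine with uracil residues."""
    return oligo.upper().replace("T", "U")


def antisense_dna(target: str) -> str:
    """Perfectly complementary DNA oligo (reverse complement, 5'->3')
    for a target region given as DNA or RNA."""
    return "".join(DNA_COMPLEMENT[b] for b in reversed(to_dna(target)))


def count_mismatches(oligo: str, target: str) -> int:
    """Positions at which the oligo does not pair with the target.
    Assumes equal lengths and no gaps."""
    perfect = antisense_dna(target)
    probe = to_dna(oligo)
    if len(probe) != len(perfect):
        raise ValueError("oligo and target region must be the same length")
    return sum(1 for a, b in zip(probe, perfect) if a != b)
```

For instance, `antisense_dna("ATGGCC")` gives `GGCCAT`, its RNA form is `GGCCAU`, and an oligo differing from the perfect complement at one position has a mismatch count of 1. Whether that mismatch is tolerable would still be determined empirically, as stated above.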
[0134] Antisense nucleic acid molecules can be encoded by a
recombinant gene for expression in a cell (see, e.g., U.S. Pat.
Nos. 5,814,500 and 5,811,234), or alternatively they can be
prepared synthetically (see, e.g., U.S. Pat. No. 5,780,607).
[0135] The antisense oligonucleotides can be modified at the base
moiety, sugar moiety, or phosphate backbone, or a combination
thereof. In one embodiment, the antisense oligonucleotide comprises
at least one modified sugar moiety, e.g., a sugar moiety such as
arabinose, 2-fluoroarabinose, xylulose, and hexose.
[0136] In another embodiment, the antisense oligonucleotide
comprises at least one modified phosphate backbone such as a
phosphorothioate, a phosphorodithioate, a phosphoramidothioate, a
phosphoramidate, a phosphorodiamidate, a methylphosphonate, an alkyl
phosphotriester, and a formacetal or analog thereof. Examples
include, without limitation, phosphorothioate antisense
oligonucleotides (e.g., an antisense oligonucleotide that is
phosphorothioate-modified at its 3' and 5' ends to increase its
stability) and chimeras
between methylphosphonate and phosphodiester oligonucleotides.
These oligonucleotides provide good in vivo activity due to
solubility, nuclease resistance, good cellular uptake, ability to
activate RNase H, and high sequence selectivity.
[0137] Other examples of synthetic antisense oligonucleotides
include oligonucleotides that contain phosphorothioates,
phosphotriesters, methyl phosphonates, short chain alkyl, or
cycloalkyl intersugar linkages or short chain heteroatomic or
heterocyclic intersugar linkages. Most preferred are those with
CH2-NH--O--CH2, CH2-N(CH3)-O--CH2, CH2-O--N(CH3)-CH2,
CH2-N(CH3)-N(CH3)-CH2 and O--N(CH3)-CH2-CH2 backbones (where
phosphodiester is O--PO2-O--CH2). U.S. Pat. No. 5,677,437 describes
heteroaromatic oligonucleoside linkages. Nitrogen linkers or groups
containing nitrogen can also be used to prepare oligonucleotide
mimics (U.S. Pat. Nos. 5,792,844 and 5,783,682). U.S. Pat. No.
5,637,684 describes phosphoramidate and phosphorothioamidate
oligomeric compounds.
[0138] In other embodiments, such as the peptide-nucleic acid (PNA)
backbone, the phosphodiester backbone of the oligonucleotide may be
replaced with a polyamide backbone, the bases being bound directly
or indirectly to the aza nitrogen atoms of the polyamide backbone
(Nielsen et al., Science 1991; 254:1497). Other synthetic
oligonucleotides may contain substituted sugar moieties comprising
one of the following at the 2' position: OH, SH, SCH3, F, OCN,
O(CH2)nNH2 or O(CH2)nCH3 where n is from 1 to about 10; C1 to C10
lower alkyl, substituted lower alkyl, alkaryl or aralkyl; Cl; Br;
CN; CF3; OCF3; O-, S-, or N-alkyl; O-, S-, or N-alkenyl; SOCH3;
SO2CH3; ONO2; NO2; N3; NH2; heterocycloalkyl; heterocycloalkaryl;
aminoalkylamino; polyalkylamino; substituted silyl; a fluorescein
moiety; an RNA cleaving group; a reporter group; an intercalator; a
group for improving the pharmacokinetic properties of an
oligonucleotide; or a group for improving the pharmacodynamic
properties of an oligonucleotide, and other substituents having
similar properties.
[0139] Oligonucleotides may also have sugar mimetics such as
cyclobutyls or other carbocyclics in place of the pentofuranosyl
group. Nucleotide units having nucleosides other than adenosine,
cytidine, guanosine, thymidine and uridine may be used, such as
inosine. In other embodiments, locked nucleic acids (LNA) can be
used (reviewed in, e.g., Jepsen and Wengel, Curr. Opin. Drug
Discov. Devel. 2004; 7:188-194; Crinelli et al., Curr. Drug Targets
2004; 5:745-752). LNA are nucleic acid analog(s) with a 2'-O, 4'-C
methylene bridge. This bridge restricts the flexibility of the
ribofuranose ring and locks the structure into a rigid C3-endo
conformation, conferring enhanced hybridization performance and
exceptional biostability. LNA allows the use of very short
oligonucleotides (less than 10 bp) for efficient hybridization in
vivo.
[0140] In one embodiment, an antisense oligonucleotide can comprise
at least one modified base moiety such as a group including but not
limited to 5-fluorouracil, 5-bromouracil, 5-chlorouracil,
5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine,
5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil,
beta-D-galactosylqueuosine, inosine, N6-isopentenyladenine,
1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine,
5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueuosine, 5-methoxycarboxymethyluracil,
5-methoxyuracil, 2-methylthio-N6-isopentenyladenine,
uracil-5-oxyacetic acid (v), pseudouracil, queuosine,
2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil,
5-methyluracil, uracil-5-oxyacetic acid methylester,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and
2,6-diaminopurine.
[0141] In another embodiment, the antisense oligonucleotide can
include α-anomeric oligonucleotides. An α-anomeric
oligonucleotide forms specific double-stranded hybrids with
complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gautier et al., Nucl. Acids
Res. 1987; 15:6625-6641).
[0142] Oligonucleotides may have morpholino backbone structures
(U.S. Pat. No. 5,034,506). Thus, in yet another embodiment, the
antisense oligonucleotide can be a morpholino antisense
oligonucleotide (i.e., an oligonucleotide in which the bases are
linked to 6-membered morpholine rings, which are connected to other
morpholine-linked bases via non-ionic phosphorodiamidate
intersubunit linkages). Morpholino oligonucleotides are highly
resistant to nucleases and have good targeting predictability, high
in-cell efficacy and high sequence specificity (U.S. Pat. No.
5,034,506; Summerton, Biochim. Biophys. Acta 1999; 1489:141-158;
Summerton and Weller, Antisense Nucleic Acid Drug Dev. 1997;
7:187-195; Arora et al., J. Pharmacol. Exp. Ther. 2000;
292:921-928; Qin et al., Antisense Nucleic Acid Drug Dev. 2000;
10:11-16; Heasman et al., Dev. Biol. 2000; 222:124-134; Nasevicius
and Ekker, Nat. Genet. 2000; 26:216-220).
[0143] Antisense oligonucleotides may be chemically synthesized,
for example using appropriately protected ribonucleoside
phosphoramidites and a conventional DNA/RNA synthesizer. Antisense
nucleic acid oligonucleotides can also be produced intracellularly
by transcription from an exogenous sequence. For example, a vector
can be introduced in vivo such that it is taken up by a cell within
which the vector or a portion thereof is transcribed to produce an
antisense RNA. Such a vector can remain episomal or become
chromosomally integrated, so long as it can be transcribed to
produce the desired antisense RNA. Such vectors can be constructed
by recombinant DNA technology methods standard in the art. Vectors
can be plasmid, viral, or others known in the art, used for
replication and expression in mammalian cells. In another
embodiment, "naked" antisense nucleic acids can be delivered to
adherent cells via "scrape delivery", whereby the antisense
oligonucleotide is added to a culture of adherent cells in a
culture vessel, the cells are scraped from the walls of the culture
vessel, and the scraped cells are transferred to another plate
where they are allowed to re-adhere. Scraping the cells from the
culture vessel walls serves to pull adhesion plaques from the cell
membrane, generating small holes that allow the antisense
oligonucleotides to enter the cytosol.
[0144] RNAi
[0145] Reversible short inhibition of a target polypeptide (e.g.,
Gfpt1, RPIA, RPE, etc.) of the invention may also be useful. Such
inhibition can be achieved by use of siRNAs. RNA interference
(RNAi) technology prevents the expression of genes by using small
RNA molecules such as small interfering RNAs (siRNAs). This
technology in turn takes advantage of the fact that RNAi is a
natural biological mechanism for silencing genes in most cells of
many living organisms, from plants to insects to mammals (McManus
et al., Nature Reviews Genetics, 2002, 3(10) p. 737). RNAi prevents
a gene from producing a functional protein by ensuring that the
molecular intermediate, the messenger RNA copy of the gene, is
destroyed. siRNAs can be used in a naked form and incorporated in a
vector, as described below.
[0146] RNA interference (RNAi) is a process of sequence-specific
post-transcriptional gene silencing by which double stranded RNA
(dsRNA) homologous to a target locus can specifically inactivate
gene function in plants, fungi, invertebrates, and vertebrates,
including mammals (Hammond et al., Nature Genet. 2001; 2:110-119;
Sharp, Genes Dev. 1999; 13:139-141). This dsRNA-induced gene
silencing is mediated by short double-stranded small interfering
RNAs (siRNAs) generated from longer dsRNAs by ribonuclease III
cleavage (Bernstein et al., Nature 2001; 409:363-366 and Elbashir
et al., Genes Dev. 2001; 15:188-200). RNAi-mediated gene silencing
is thought to occur via sequence-specific RNA degradation, where
sequence specificity is determined by the interaction of an siRNA
with its complementary sequence within a target RNA (see, e.g.,
Tuschl, Chem. Biochem. 2001; 2:239-245).
[0147] For mammalian systems, RNAi commonly involves the use of
dsRNAs that are greater than 500 bp; however, it can also be
activated by introduction of either siRNAs (Elbashir, et al.,
Nature 2001; 411: 494-498) or short hairpin RNAs (shRNAs) bearing a
fold back stem-loop structure (Paddison et al., Genes Dev. 2002;
16: 948-958; Sui et al., Proc. Natl. Acad. Sci. USA 2002;
99:5515-5520; Brummelkamp et al., Science 2002; 296:550-553; Paul
et al., Nature Biotechnol. 2002; 20:505-508).
[0148] The siRNAs are preferably short double stranded nucleic acid
duplexes comprising annealed complementary single stranded nucleic
acid molecules. Preferably, the siRNAs are short dsRNAs comprising
annealed complementary single strand RNAs. siRNAs may also comprise
an annealed RNA:DNA duplex, wherein the sense strand of the duplex
is a DNA molecule and the antisense strand of the duplex is a RNA
molecule.
[0149] Preferably, each single stranded nucleic acid molecule of
the siRNA duplex is of from about 19 nucleotides to about 27
nucleotides in length. In preferred embodiments, duplexed siRNAs
have a 2 or 3 nucleotide 3' overhang on each strand of the duplex.
In preferred embodiments, siRNAs have 5'-phosphate and 3'-hydroxyl
groups.
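By way of illustration only, the strand-length and overhang
preferences above can be sketched in Python. The function name
design_sirna, the default "UU" overhang, and the example target site
used with it are assumptions made for this sketch and are not taken
from the disclosure; only the range of about 19 to about 27
nucleotides per strand and the 2 or 3 nucleotide 3' overhang come
from the text.

```python
# Illustrative sketch only; names and defaults are hypothetical.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (A, C, G, U)."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def design_sirna(target_site, overhang="UU"):
    """Build (sense, antisense) strands, both written 5'->3'.

    target_site: mRNA-sense sequence that forms the duplex region.
    overhang: nucleotides appended to the 3' end of each strand.
    """
    core = target_site.upper()
    if not core or not set(core) <= set("ACGU"):
        raise ValueError("target_site must contain only A, C, G, U")
    if len(overhang) not in (2, 3):
        raise ValueError("3' overhang should be 2 or 3 nucleotides")
    sense = core + overhang
    antisense = reverse_complement(core) + overhang
    # Each strand, overhang included, should fall in the stated range.
    if not 19 <= len(sense) <= 27:
        raise ValueError("each strand should be about 19 to 27 nt long")
    return sense, antisense
```

Calling design_sirna on a 19-nucleotide target site yields two
21-nucleotide strands whose first 19 positions are mutually
complementary, leaving the two 3'-terminal nucleotides of each strand
unpaired.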
[0150] RNAi molecules may include one or more modifications, either
to the phosphate-sugar backbone or to the nucleoside. For example,
the phosphodiester linkages of natural RNA may be modified to
include at least one heteroatom other than oxygen, such as nitrogen
or sulfur. In this case, for example, the phosphodiester linkage
may be replaced by a phosphothioester linkage. Similarly, bases may
be modified to block the activity of adenosine deaminase. Where the
RNAi molecule is produced synthetically, or by in vitro
transcription, a modified ribonucleoside may be introduced during
synthesis or transcription. The skilled artisan will understand
that many of the modifications described above for antisense
oligonucleotides may also be made to RNAi molecules. Such
modifications are well known in the art.
[0151] siRNAs may be introduced to a target cell as an annealed
duplex siRNA, or as single stranded sense and antisense nucleic
acid sequences that, once within the target cell, anneal to form
the siRNA duplex. Alternatively, the sense and antisense strands of
the siRNA may be encoded on an expression construct that is
introduced to the target cell. Upon expression within the target
cell, the transcribed sense and antisense strands may anneal to
reconstitute the siRNA.
[0152] shRNAs typically comprise a single stranded "loop" region
connecting complementary inverted repeat sequences that anneal to
form a double stranded "stem" region. Structural considerations for
shRNA design are discussed, for example, in McManus et al., RNA
2002; 8:842-850. In certain embodiments the shRNA may be a portion
of a larger RNA molecule, e.g., as part of a larger RNA that also
contains U6 RNA sequences (Paul et al., supra).
[0153] In preferred embodiments, the loop of the shRNA is from
about 1 to about 9 nucleotides in length. In preferred embodiments
the double stranded stem of the shRNA is from about 19 to about 33
base pairs in length. In preferred embodiments, the 3' end of the
shRNA stem has a 3' overhang. In particularly preferred
embodiments, the 3' overhang of the shRNA stem is from 1 to about 4
nucleotides in length. In preferred embodiments, shRNAs have
5'-phosphate and 3'-hydroxyl groups.
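The preferred shRNA dimensions above can likewise be expressed as a
short Python sketch. The function name build_shrna, the default loop
and overhang sequences, and the example stem are hypothetical
placeholders chosen for illustration; the disclosure specifies only
the length ranges (loop of about 1 to about 9 nucleotides, stem of
about 19 to about 33 base pairs, 3' overhang of 1 to about 4
nucleotides).

```python
# Illustrative sketch only; names and default sequences are hypothetical.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def build_shrna(stem_sense, loop="UUCAAGAGA", overhang="UU"):
    """Return a single-stranded shRNA sequence written 5'->3'.

    Layout: stem (sense) + loop + stem (antisense) + 3' overhang.
    """
    stem_sense = stem_sense.upper()
    if not set(stem_sense) <= set("ACGU"):
        raise ValueError("stem must contain only A, C, G, U")
    if not 19 <= len(stem_sense) <= 33:
        raise ValueError("stem should be about 19 to 33 base pairs")
    if not 1 <= len(loop) <= 9:
        raise ValueError("loop should be about 1 to 9 nucleotides")
    if not 1 <= len(overhang) <= 4:
        raise ValueError("3' overhang should be 1 to about 4 nucleotides")
    # The inverted repeat is the reverse complement of the sense stem.
    stem_antisense = stem_sense.translate(COMPLEMENT)[::-1]
    return stem_sense + loop + stem_antisense + overhang
```

The two stem halves are inverted repeats, so the transcript can fold
back on itself to form the double stranded stem with the loop at one
end and the unpaired overhang at the 3' end.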
[0154] Although RNAi molecules preferably contain nucleotide
sequences that are fully complementary to a portion of the target
nucleic acid, 100% sequence complementarity between the RNAi probe
and the target nucleic acid is not required.
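As a hedged illustration of what less than 100% complementarity
means in practice, the following Python sketch scores an RNAi strand
against an equal-length target site. The function name
percent_complementarity is hypothetical, and the sketch assumes
Watson-Crick pairing only (G:U wobble pairs are counted as
mismatches), which is a simplification not stated in the disclosure.

```python
# Illustrative sketch only; Watson-Crick pairs are the sole pairs counted.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(guide, target_site):
    """Percent of antiparallel positions that form Watson-Crick pairs.

    Both sequences are written 5'->3' and must have the same length.
    """
    guide = guide.upper()
    target_site = target_site.upper()
    if not guide or len(guide) != len(target_site):
        raise ValueError("strands must be non-empty and of equal length")
    # Position i of the guide pairs with position -(i + 1) of the target.
    paired = sum((g, t) in WATSON_CRICK
                 for g, t in zip(guide, reversed(target_site)))
    return 100.0 * paired / len(guide)
```

A single mismatch in a 19-nucleotide strand, for example, lowers the
score from 100% to 18/19, or roughly 94.7%.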
[0155] Similar to the above-described antisense oligonucleotides,
RNAi molecules can be synthesized by standard methods known in the
art, e.g., by use of an automated synthesizer. RNAs produced by
such methodologies tend to be highly pure and to anneal efficiently
to form siRNA duplexes or shRNA hairpin stem-loop structures.
Following chemical synthesis, single stranded RNA molecules are
deprotected, annealed to form siRNAs or shRNAs, and purified (e.g.,
by gel electrophoresis or HPLC). Alternatively, standard procedures
may be used for in vitro transcription of RNA from DNA templates
carrying RNA polymerase promoter sequences (e.g., T7 or SP6 RNA
polymerase promoter sequences). Efficient in vitro protocols for
preparation of siRNAs using T7 RNA polymerase have been described
(Donze and Picard, Nucleic Acids Res. 2002; 30:e46; and Yu et al.,
Proc. Natl. Acad. Sci. USA 2002; 99:6047-6052). Similarly, an
efficient in vitro protocol for preparation of shRNAs using T7 RNA
polymerase has been described (Yu et al., supra). The sense and
antisense transcripts may be synthesized in two independent
reactions and annealed later, or may be synthesized simultaneously
in a single reaction.
[0156] RNAi molecules may be formed within a cell by transcription
of RNA from an expression construct introduced into the cell. For
example, both a protocol and an expression construct for in vivo
expression of siRNAs are described in Yu et al., supra. The
delivery of siRNA to tumors can potentially be achieved via any of
several gene delivery "vehicles" that are currently available.
These include viral vectors, such as adenovirus, lentivirus, herpes
simplex virus, vaccinia virus, and retrovirus, as well as
chemical-mediated gene delivery systems (for example, liposomes),
or mechanical DNA delivery systems (DNA guns). The oligonucleotides
to be expressed for such siRNA-mediated inhibition of gene
expression would be between 18 and 28 nucleotides in length.
Protocols and expression constructs for in vivo expression of
shRNAs have been described (Brummelkamp et al., Science 2002;
296:550-553; Sui et al., supra; Yu et al., supra; McManus et al.,
supra; Paul et al., supra).
[0157] The expression constructs for in vivo production of RNAi
molecules comprise RNAi encoding sequences operably linked to
elements necessary for the proper transcription of the RNAi
encoding sequence(s), including promoter elements and transcription
termination signals. Preferred promoters for use in such expression
constructs include the polymerase-III H1-RNA promoter (see, e.g.,
Brummelkamp et al., supra) and the U6 polymerase-III promoter (see,
e.g., Sui et al., supra; Paul, et al. supra; and Yu et al., supra).
The RNAi expression constructs can further comprise vector
sequences that facilitate the cloning of the expression constructs.
Standard vectors are known in the art (e.g., pSilencer 2.0-U6
vector, Ambion Inc., Austin, Tex.).
[0158] Ribozyme Inhibition
[0159] The level of expression of a target polypeptide of the
invention can also be inhibited by ribozymes designed based on the
nucleotide sequence thereof.
[0160] Ribozymes are enzymatic RNA molecules capable of catalyzing
the sequence-specific cleavage of RNA (for a review, see Rossi,
Current Biology 1994; 4:469-471). The mechanism of ribozyme action
involves sequence-specific hybridization of the ribozyme molecule
to complementary target RNA, followed by an endonucleolytic
cleavage event. The composition of ribozyme molecules must include:
(i) one or more sequences complementary to the target RNA; and (ii)
a catalytic sequence responsible for RNA cleavage (see, e.g., U.S.
Pat. No. 5,093,246).
[0161] The use of hammerhead ribozymes is preferred. Hammerhead
ribozymes cleave RNAs at locations dictated by flanking regions
that form complementary base pairs with the target RNA. The sole
requirement is that the target RNA has the following sequence of
two bases: 5'-UG-3'. The construction of hammerhead ribozymes is
known in the art, and described more fully in Myers, Molecular
Biology and Biotechnology: A Comprehensive Desk Reference, VCH
Publishers, New York, 1995 (see especially FIG. 4, page 833) and in
Haseloff and Gerlach, Nature 1988; 334:585-591.
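The site requirement stated above (a 5'-UG-3' dinucleotide in the
target RNA) can be illustrated by a simple scan. This Python sketch
is an assumption-laden example rather than a ribozyme design tool:
the function name find_ug_sites is hypothetical, and the sketch only
lists candidate positions without evaluating the flanking regions
that would form the complementary arms.

```python
# Illustrative sketch only; lists candidate sites, nothing more.
def find_ug_sites(rna):
    """Return 0-based positions of each 5'-UG-3' dinucleotide.

    DNA-style input is accepted; T is read as U.
    """
    rna = rna.upper().replace("T", "U")
    return [i for i in range(len(rna) - 1) if rna[i:i + 2] == "UG"]
```

Each returned index marks the U of a UG dinucleotide; the sequences
on either side of that position are what the flanking regions of a
hammerhead ribozyme would be designed to base pair with.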
[0162] As in the case of antisense oligonucleotides, ribozymes can
be composed of modified oligonucleotides (e.g., for improved
stability, targeting, etc.). These can be delivered to cells which
express the target polypeptide in vivo. A preferred method of
delivery involves using a DNA construct "encoding" the ribozyme
under the control of a strong constitutive pol III or pol II
promoter, so that transfected cells will produce sufficient
quantities of the ribozyme to catalyze cleavage of the target mRNA
encoding the target polypeptide. However, because ribozymes, unlike
antisense molecules, are catalytic, a lower intracellular
concentration may be required to achieve an adequate level of
efficacy.
[0163] Ribozymes can be prepared by any method known in the art for
the synthesis of DNA and RNA molecules, as discussed above.
Ribozyme technology is described further in Intracellular Ribozyme
Applications: Principles and Protocols, Rossi and Couture eds.,
Horizon Scientific Press, 1999.
[0164] Triple Helix Forming Oligonucleotides (TFOs)
[0165] Nucleic acid molecules useful to inhibit expression level of
a target polypeptide of the invention via triple helix formation
are preferably composed of deoxynucleotides. The base composition
of these oligonucleotides is typically designed to promote triple
helix formation via Hoogsteen base pairing rules, which generally
require sizeable stretches of either purines or pyrimidines to be
present on one strand of a duplex. Nucleotide sequences may be
pyrimidine-based, resulting in TAT and CGC triplets across the
three associated strands of the resulting triple helix. The
pyrimidine-rich molecules provide base complementarity to a
purine-rich region of a single strand of the duplex in a parallel
orientation to that strand. In addition, nucleic acid molecules may
be chosen that are purine-rich, e.g., those containing a stretch of
G residues. These molecules will form a triple helix with a DNA
duplex that is rich in GC pairs, in which the majority of the
purine residues are located on a single strand of the targeted
duplex, resulting in GGC triplets across the three strands in the
triplex.
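Because triple helix formation generally requires a sizeable purine
stretch on one strand of the duplex, candidate target regions can be
located by searching for purine runs. The Python sketch below is
illustrative only: the function name purine_tracts and the default
minimum length of 15 nucleotides are assumptions, since the
disclosure does not quantify "sizeable".

```python
# Illustrative sketch only; min_len=15 is an arbitrary assumption.
import re


def purine_tracts(dna, min_len=15):
    """Return (start, tract) for each run of A/G of at least min_len.

    start is the 0-based position of the run on the given strand.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    return [(m.start(), m.group()) for m in pattern.finditer(dna.upper())]
```

Running the same search on the complementary strand identifies
regions where the purines lie on the opposite strand of the duplex.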
[0166] Alternatively, sequences can be targeted for triple helix
formation by creating a so-called "switchback" nucleic acid
molecule. Switchback molecules are synthesized in an alternating
5'-3', 3'-5' manner, such that they base pair with first one strand
of a duplex and then the other, eliminating the necessity for a
sizeable stretch of either purines or pyrimidines to be present on
one strand of a duplex.
[0167] Similarly to RNAi molecules, antisense oligonucleotides, and
ribozymes, described above, triple helix molecules can be prepared
by any method known in the art. These include techniques for
chemically synthesizing oligodeoxyribonucleotides and
oligoribonucleotides such as, e.g., solid phase phosphoramidite
chemical synthesis. Alternatively, RNA molecules can be generated
by in vitro or in vivo transcription of DNA sequences "encoding"
the particular RNA molecule. Such DNA sequences can be incorporated
into a wide variety of vectors that incorporate suitable RNA
polymerase promoters such as the T7 or SP6 polymerase promoters.
See, Nielsen, P. E. "Triple Helix: Designing a New Molecule of
Life", Scientific American, December, 2008; Egholm, M., et al. "PNA
Hybridizes to Complementary Oligonucleotides Obeying the
Watson-Crick Hydrogen Bonding Rules." (1993) Nature, 365, 566-568;
Nielsen, P. E. `PNA Technology`. Mol Biotechnol. 2004;
26:233-48.
[0168] Antibodies and Aptamers
[0169] The polypeptide targets described herein (e.g., HSD17B11,
HSD17B12, HSD17B14, etc.) can be inhibited (e.g., the level can be
reduced) by the administration to or expression in a subject or a
cell or tissue thereof, of blocking antibodies or aptamers against
the polypeptide.
[0170] Antibodies, or their equivalents and derivatives, e.g.,
intrabodies, or other antagonists of the polypeptide, may be used
in accordance with the present methods. Methods for engineering
intrabodies (intracellular single chain antibodies) are well known.
Intrabodies are specifically targeted to a particular compartment
within the cell, providing control over where the inhibitory
activity of the treatment is focused. This technology has been
successfully applied in the art (for review, see Richardson and
Marasco, 1995, TIBTECH vol. 13; Lo et al. (2009) Handb Exp
Pharmacol. 181:343-73; Marasco, W. A. (1997) Gene Therapy 4:11-15;
see also, U.S. Pat. Appln. Pub. No. 2001/0024831 by Der Maur et al.
and U.S. Pat. No. 6,004,940 by Marasco et al.).
[0171] Administration of a suitable dose of the antibody or the
antagonist (e.g., aptamer) may serve to block the level (expression
or activity) of the polypeptide in order to treat or prevent
cancer, e.g., inhibit growth of a breast cancer cell or tumor
(e.g., ER+ or ER- breast cancer cell or tumor).
[0172] In addition to using antibodies and aptamers to inhibit the
levels and/or activity of a target polypeptide, it may also be
possible to use other forms of inhibitors. For example, it may be
possible to identify antagonists that functionally inhibit the
target polypeptide (e.g., HSD17B11, HSD17B12, HSD17B14, etc.). In
addition, it may also be possible to interfere with the interaction
of the polypeptide with its substrate. Other suitable inhibitors
will be apparent to the skilled person.
[0173] The antibody (or other inhibitors and antagonists) can be
administered by a number of methods. For example, for the
administration of intrabodies, one method is set forth by Marasco
and Haseltine in PCT WO 94/02610. This method discloses the
intracellular delivery of a gene encoding the intrabody. In one
embodiment, a gene encoding a single chain antibody is used. In
another embodiment, the antibody would contain a nuclear
localization sequence. By this method, one can intracellularly
express an antibody, which can block activity of the target
polypeptide in desired cells.
[0174] Aptamers are oligonucleic acid or peptide molecules that
bind to a specific target molecule. Aptamers can be used to inhibit
gene expression and to interfere with protein interactions and
activity. Nucleic acid aptamers are nucleic acid species that have
been engineered through repeated rounds of in vitro selection
(e.g., by SELEX (systematic evolution of ligands by exponential
enrichment)) to bind to various molecular targets such as small
molecules, proteins, nucleic acids, and even cells, tissues and
organisms. Peptide aptamers consist of a variable peptide loop
attached at both ends to a protein scaffold. Aptamers are
useful in biotechnological and therapeutic applications as they
offer molecular recognition properties that rival those of
antibodies. Aptamers can be engineered completely in a test tube,
are readily produced by chemical synthesis, possess desirable
storage properties, and elicit little or no immunogenicity in
therapeutic application. Aptamers can be produced using the
methodology disclosed in U.S. Pat. No. 5,270,163 and WO
91/19813.
[0175] Small Molecules
[0176] Chemical agents, referred to in the art as "small molecule"
compounds, are typically organic, non-peptide molecules, having a
molecular weight less than 10,000 Da, preferably less than 5,000
Da, more preferably less than 1,000 Da, and most preferably less
than 500 Da. This class of modulators includes chemically
synthesized molecules, for instance, compounds from combinatorial
chemical libraries. Synthetic compounds may be rationally designed
or identified utilizing the screening methods described below.
Methods for generating and obtaining small molecules are well known
in the art (Schreiber, Science 2000; 151:1964-1969; Radmann et al.,
Science 2000; 151:1947-1948).
[0177] Non-limiting examples of small molecule inhibitors (and
exemplary dosages for in vitro use in cell-based assays) include,
e.g., cyclopamine (e.g., 10 µM) (Selleck Chemicals, cat#S1146), an
inhibitor of the Smo receptor of Hh ligands; LY2109761 (e.g., 500 nM)
(Eli Lilly), an inhibitor of TGFBR kinases; celecoxib (e.g., 100
µM) (LKT laboratories, cat#C1644), an inhibitor of Cox2;
2',5'-dideoxyadenosine (e.g., 100 µM) (Enzo Life Sciences,
cat#BML-CN110-005), an inhibitor of adenylate cyclase; tyrphostin
AG1478 (e.g., 10 µM) (Cayman Chemicals, cat#10010244), an
inhibitor of EGFR; XAV939 (e.g., 1 µM) (Tocris Bioscience,
cat#3748), a Tankyrase (TNKS) inhibitor that antagonizes Wnt
signaling via stimulation of β-catenin degradation and
stabilization of axin; and picropodophyllotoxin (e.g., 0.5 µM)
(Tocris Bioscience, cat#2956), an IGFR inhibitor; stock
solutions (1,000×) are prepared in DMSO.
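The arithmetic relating the exemplary working concentrations to a
1,000-fold stock can be sketched as follows. This Python example
assumes, for illustration only, that the 1,000-fold convention is
applied to every listed compound; the helper names are hypothetical,
and the resulting stock concentrations are computed values rather
than figures recited in the disclosure.

```python
# Illustrative sketch only; helper names are hypothetical.
# Working concentrations from the text, in micromolar (uM).
WORKING_UM = {
    "cyclopamine": 10.0,
    "LY2109761": 0.5,  # 500 nM
    "celecoxib": 100.0,
    "2',5'-dideoxyadenosine": 100.0,
    "tyrphostin AG1478": 10.0,
    "XAV939": 1.0,
    "picropodophyllotoxin": 0.5,
}


def stock_concentration_mM(working_uM, fold=1000):
    """Stock concentration (mM) for a given working concentration (uM)."""
    return working_uM * fold / 1000.0


def stock_volume_uL(final_volume_mL, fold=1000):
    """Volume of stock (uL) to add to reach the final volume (mL)."""
    return final_volume_mL * 1000.0 / fold


def dmso_percent(fold=1000):
    """Final DMSO content (% v/v) after diluting a DMSO stock."""
    return 100.0 / fold
```

Under these assumptions a 10 µM working concentration corresponds to
a 10 mM stock, 10 mL of medium receives 10 µL of stock, and the final
DMSO content is 0.1% (v/v).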
[0178] Non-limiting examples of small molecule agonists include,
e.g., the TGFβ agonists described in detail in U.S. Pat. No.
8,097,645 to Wyss-Coray et al., the hedgehog (Hh) agonist
cyclopamine (see, King, W K. Journal of Biology 2002, 1:8); the Wnt
agonist from Calbiochem (EMD Millipore), and the cAMP agonist Alotaketal
A described in Huang et al. (J. Am. Chem. Soc., 2012, 134 (21), pp
8806-8809).
[0179] In certain embodiments, the above described inhibitors and
agonists can be directly targeted to a specific cell type (e.g.,
CD44+ or CD24+ breast epithelial cells, p27+ or Ki67+ breast
epithelial cells, AR+ cells (e.g., AR+ breast epithelial cells),
ER+ breast epithelial cells, ER- breast epithelial cells, and
combinations thereof, e.g., ER+p27+ cells (e.g., ER+p27+ breast
epithelial cells), or AR+p27+ cells (e.g., AR+p27+ breast
epithelial cells), etc.). The skilled artisan will appreciate that
methods for specific cell targeting are well known in the art. By
way of non-limiting example, antibodies, e.g., an anti-CD44,
anti-CD24, anti-AR, or anti-ER antibody, etc., may be conjugated to
an inhibitor or agonist described herein, in order to target the
inhibitor or agonist to, for example and without limitation, CD44+,
CD24+ or ER+ cells. Further, the site of administration (e.g.,
direct injection into breast tissue and/or breast tumor) can
increase the specificity of cell targeting.
VI. METHODS FOR PREDICTING A SUBJECT'S RISK OF DEVELOPING BREAST
CANCER
[0180] Provided herein are methods for predicting a subject's risk
of developing breast cancer (e.g., ER+ or ER- breast cancer).
[0181] In one embodiment, the method comprises (a) determining the
frequency in a breast tissue sample of CD44+, CD24- breast
epithelial cells and (b) predicting that the subject has a
relatively elevated risk of developing breast cancer if the
frequency of CD44+, CD24- breast epithelial cells is decreased
compared to a first control frequency of CD44+, CD24- breast
epithelial cells; or (c) predicting that the subject has a
relatively reduced risk of developing breast cancer if the
frequency of CD44+ breast epithelial cells is increased compared to
a second control frequency of CD44+, CD24- breast epithelial
cells.
[0182] In another embodiment, the method comprises: (a) determining
the frequency in a breast tissue sample of CD24+ breast epithelial
cells and (b) predicting that the subject has a relatively elevated
risk of developing breast cancer if the frequency of CD24+ breast
epithelial cells is increased compared to a first control frequency
of CD24+ breast epithelial cells; or (c) predicting that the
subject has a relatively reduced risk of developing breast cancer
if the frequency of CD24+ breast epithelial cells is decreased
compared to a second control frequency of CD24+ breast epithelial
cells.
[0183] As discussed in the Definitions section, above, a "first
control frequency" of a cell type (e.g., CD44+ or CD24+ cells, or
p27+ cells, Ki67+ cells, etc.) is the frequency of that cell type
in a comparable sample from a patient or the average frequency in
comparable samples from a plurality of patients known to be at low
risk of developing breast cancer (e.g., parous women not expressing
BRCA1 or BRCA2 mutations, where the women are premenopausal and/or
postmenopausal). In other words, the first control frequency is a
"negative" control for an elevated risk of developing breast
cancer. As also discussed above, a "second control frequency" of a
cell type is the frequency of that cell type in a comparable sample
from a patient or the average frequency in comparable samples from
a plurality of patients known to be at high risk of developing
breast cancer (e.g., pre- and/or postmenopausal nulliparous women).
In other words, the second control frequency is a "positive"
control for an elevated risk of developing breast cancer. The
first and second control frequencies can be simultaneously
determined or can be determined before or after the frequency of
the relevant cell is determined in the breast cells from the
subject for whom the risk prediction is being made.
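The comparison against first and second control frequencies can be
illustrated with a short Python sketch. The function names, the
marker labels, and the dictionary keys are hypothetical, and any
numbers supplied to the functions would be the user's own
measurements. The sketch reports the two comparisons separately
because a subject's frequency may lie between the two controls.

```python
# Illustrative sketch only; names and return keys are hypothetical.
from statistics import mean


def control_frequency(sample_frequencies):
    """Average frequency (% of breast epithelial cells) across samples."""
    if not sample_frequencies:
        raise ValueError("at least one control sample is required")
    return mean(sample_frequencies)


def compare_to_controls(subject_freq, first_control, second_control, marker):
    """Compare a subject's frequency with both control frequencies.

    marker: "CD44+CD24-" (a decrease versus the first control indicates
            elevated risk) or "CD24+" (an increase versus the first
            control indicates elevated risk).
    """
    if marker == "CD44+CD24-":
        elevated = subject_freq < first_control
        reduced = subject_freq > second_control
    elif marker == "CD24+":
        elevated = subject_freq > first_control
        reduced = subject_freq < second_control
    else:
        raise ValueError("unknown marker: %r" % marker)
    return {"elevated_vs_first_control": elevated,
            "reduced_vs_second_control": reduced}
```

Here the first control is the average from low-risk patients (the
"negative" control) and the second control is the average from
high-risk patients (the "positive" control), as defined above.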
[0184] In a particularly preferred embodiment, the frequency of
both CD44+ and CD24+ breast epithelial cells in the sample is
determined as described above, and the method comprises predicting
that the subject has a relatively elevated risk of developing
breast cancer if: (i) the frequency of CD44+, CD24- breast
epithelial cells is decreased compared to a first control frequency
of CD44+, CD24- breast epithelial cells, and (ii) the frequency of
CD24+ breast epithelial cells is increased compared to a first
control frequency of CD24+ breast epithelial cells; and step (c)
comprises predicting that the subject has a relatively reduced risk
of developing breast cancer if: (i) the frequency of CD44+ breast
epithelial cells is increased compared to a second control
frequency of CD44+, CD24- breast epithelial cells, and (ii) the
frequency of CD24+ breast epithelial cells is decreased compared to
a second control frequency of CD24+ breast epithelial cells.
[0185] In other embodiments, the first and second control
frequencies of CD44+ and CD24+ breast epithelial cells, described
above, can also be first and second predetermined reference
frequencies, respectively (i.e., standards) to which the frequency
of the cell type in a test sample is compared.
[0186] For example, the predetermined reference frequency for a
first control frequency of CD44+, CD24- breast epithelial cells is
preferably in the range of 15-30% or higher of the total breast
epithelial cells in the sample. Further, as disclosed herein, a
subject considered to have a relatively elevated risk of developing
breast cancer will have a decreased frequency of CD44+, CD24-
breast epithelial cells relative to that predetermined reference
frequency; thus, a subject determined to have a frequency of CD44+,
CD24- breast epithelial cells less than 15% would be predicted to
have a relatively elevated risk of developing breast cancer. More
preferably, a subject determined to have a frequency of CD44+,
CD24- breast epithelial cells less than 14%, less than 13%, less
than 12%, less than 11%, less than 10%, less than 9%, less than 8%,
less than 7%, less than 6%, or less than 5%, is predicted to have a
relatively elevated risk of developing breast cancer.
[0187] The predetermined reference frequency for a second control
frequency of CD44+, CD24- breast epithelial cells is preferably in
the range of 15% or less (e.g., less than 15%, less than 14%, less
than 13%, less than 12%, less than 11%, less than 10%, etc.) of the
total breast epithelial cells in the sample. As disclosed herein, a
subject considered to have a relatively reduced risk of developing
breast cancer will have an increased frequency of CD44+,
CD24- breast epithelial cells relative to the second predetermined
reference frequency; thus, a subject determined to have a frequency
of CD44+, CD24- breast epithelial cells greater than 15%,
preferably greater than 16%, greater than 17%, greater than 18%,
greater than 19%, greater than 20%, greater than 21%, greater than
22%, greater than 23%, greater than 24%, greater than 25%, greater
than 26%, greater than 27%, greater than 28%, greater than 29%, or
greater than 30% is predicted to have a relatively reduced risk of
developing breast cancer.
[0188] The first predetermined reference frequency of CD24+ breast
epithelial cells is preferably 20%, or less than 20%, less than
19%, less than 18%, less than 17%, less than 16%, less than 15%,
less than 14%, less than 13%, less than 12%, less than 11%, less
than 10%, less than 9%, less than 8%, less than 7%, less than 6%,
or less than 5% of the total breast epithelial cells in the sample.
As disclosed herein, a subject considered to have a relatively
elevated risk of developing breast cancer will have an increased
frequency of CD24+ breast epithelial cells relative to the first
predetermined reference frequency of CD24+ breast epithelial cells;
thus, a subject determined to have a frequency of CD24+ breast
epithelial cells greater than 20%, greater than 21%, greater than
22%, greater than 23%, greater than 24%, greater than 25%, greater
than 26%, greater than 27%, greater than 28%, greater than 29%,
greater than 30%, greater than 31%, greater than 32%, greater than
33%, greater than 34%, greater than 35%, greater than 36%, greater
than 37%, greater than 38%, greater than 39%, greater than 40%,
greater than 41%, greater than 42%, greater than 43%, greater than
44%, greater than 45%, greater than 46%, greater than 47%, greater
than 48%, greater than 49%, or greater than 50% of the total breast
epithelial cells in the sample, is predicted to have a relatively
elevated risk of developing breast cancer.
[0189] The second predetermined reference frequency of CD24+ breast
epithelial cells is preferably 20%, or greater than 20%, greater
than 21%, greater than 22%, greater than 23%, greater than 24%,
greater than 25%, greater than 26%, greater than 27%, greater than
28%, greater than 29%, greater than 30%, greater than 31%, greater
than 32%, greater than 33%, greater than 34%, greater than 35%,
greater than 36%, greater than 37%, greater than 38%, greater than
39%, greater than 40%, greater than 41%, greater than 42%, greater
than 43%, greater than 44%, greater than 45%, greater than 46%,
greater than 47%, greater than 48%, greater than 49%, or greater
than 50%, of the total breast epithelial cells in the sample. As
disclosed herein, a subject considered to have a relatively reduced
risk of developing breast cancer will have a decreased frequency of
CD24+ breast epithelial cells relative to the second predetermined
reference frequency; thus, a subject determined to have a frequency
of CD24+ breast epithelial cells less than 20% (e.g., less than
20%, less than 19%, less than 18%, less than 17%, less than 16%,
less than 15%, less than 14%, less than 13%, less than 12%, less
than 11%, less than 10%, less than 5%, etc.) would be predicted to
have a relatively reduced risk of developing breast cancer.
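By way of illustration only, the comparison of a measured CD24+ frequency to a predetermined reference frequency can be sketched as follows. The function name, the use of a single 20% value for both the first and the second reference frequency, and the "indeterminate" result for a frequency exactly equal to the reference are assumptions made for this sketch and are not part of the disclosed methods.

```python
def classify_cd24_frequency(cd24_positive_cells, total_epithelial_cells,
                            reference_pct=20.0):
    """Compare the CD24+ frequency (percent of total breast epithelial
    cells in the sample) to a reference frequency.

    reference_pct defaults to 20%, the preferred value stated above;
    using one value for both reference frequencies is an assumption.
    """
    if total_epithelial_cells <= 0:
        raise ValueError("total_epithelial_cells must be positive")
    if not 0 <= cd24_positive_cells <= total_epithelial_cells:
        raise ValueError("cd24_positive_cells must lie between 0 and the total")

    frequency_pct = 100.0 * cd24_positive_cells / total_epithelial_cells

    if frequency_pct > reference_pct:
        return "elevated"       # increased relative to the reference
    if frequency_pct < reference_pct:
        return "reduced"        # decreased relative to the reference
    return "indeterminate"      # equality is not addressed by the text
```

For example, a sample in which 30 of 100 counted breast epithelial cells are CD24+ would fall in the "elevated" branch of this sketch.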
[0190] In yet other embodiments, the method for predicting a
subject's risk of developing breast cancer comprises: predicting
that the subject has a relatively elevated risk of developing
breast cancer if the frequency of CD24+ breast epithelial cells is
greater than the frequency of CD44+, CD24- breast epithelial cells
in the sample; and step (c) comprises predicting that the subject
has a relatively reduced risk of developing breast cancer if the
frequency of CD24+ breast epithelial cells is equal to or less than
the frequency of CD44+, CD24- breast epithelial cells in the
sample. In still other embodiments, the method for predicting a
subject's risk of developing breast cancer comprises predicting
that the subject has a relatively elevated risk of developing
breast cancer if the ratio of CD24+ breast epithelial cells to
CD44+, CD24- breast epithelial cells in a breast epithelial
cell-containing sample from the subject is 2, or greater than 2,
greater than 3, greater than 4, greater than 5, greater than 6,
greater than 7, greater than 8, greater than 9, or greater than 10;
or, predicting that the subject has a relatively reduced risk of
developing breast cancer if the ratio of CD24+ breast epithelial
cells to CD44+, CD24- breast epithelial cells in a breast
epithelial cell-containing sample from the subject is less than 2,
preferably less than 1.5, less than 1, less than 0.9, less than
0.8, less than 0.7, less than 0.6, less than 0.5, less than 0.4,
less than 0.3, less than 0.2, less than 0.1, less than 0.05, or
less than 0.01.
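The ratio-based prediction of this paragraph can be illustrated with the following minimal sketch. The function name, the default cutoff of 2, and the handling of a zero denominator are assumptions introduced for illustration; the disclosure itself lists a range of alternative cutoffs.

```python
def classify_by_ratio(cd24_pos_freq, cd44_pos_cd24_neg_freq, cutoff=2.0):
    """Classify risk from the ratio of CD24+ cells to CD44+, CD24- cells.

    Both frequencies must be expressed in the same units (e.g., percent
    of total breast epithelial cells). A ratio equal to or greater than
    the cutoff is treated as elevated risk; a smaller ratio as reduced.
    """
    if cd24_pos_freq < 0 or cd44_pos_cd24_neg_freq < 0:
        raise ValueError("frequencies must be non-negative")
    if cd44_pos_cd24_neg_freq == 0:
        if cd24_pos_freq == 0:
            raise ValueError("ratio is undefined when both frequencies are 0")
        # Assumption: with no CD44+, CD24- cells the ratio is unbounded,
        # so it exceeds any finite cutoff.
        return "elevated"

    ratio = cd24_pos_freq / cd44_pos_cd24_neg_freq
    return "elevated" if ratio >= cutoff else "reduced"
```

Setting `cutoff=1.0` and treating equality as reduced risk would instead approximate the first embodiment of this paragraph, in which the two frequencies are compared directly; that variant is not implemented here.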
[0191] In other embodiments, a method of predicting a subject's
risk of developing an estrogen-receptor-positive (ER+) breast
cancer is provided, wherein the method comprises: (a) determining
the frequency in a breast tissue sample of cells of one or more
types of cells, such as, e.g., p27+ breast epithelial cells, Sox17+
breast epithelial cells, Cox2+ breast epithelial cells, Ki67+
breast epithelial cells, ER+, p27+ breast epithelial cells, ER+,
Sox17+ breast epithelial cells, ER+, Cox2+ breast epithelial cells,
ER+, Ki67+ breast epithelial cells, AR+, p27+ breast epithelial
cells, AR+, Sox17+ breast epithelial cells, AR+, Cox2+ breast
epithelial cells, and AR+, Ki67+ breast epithelial cells; and (b)
predicting that the subject has a relatively elevated risk of
developing breast cancer if the frequency of the cells of the type
is increased compared to a first control frequency of cells of the
type; or (c) predicting that the subject has a relatively reduced
risk of developing breast cancer if the frequency of the cells of
the type is decreased compared to a second control frequency of the
cells of the type. In a preferred embodiment, the frequencies of
two or more, three or more, or all of the cell types (e.g., p27+,
Ki67+, Sox17+ and/or Cox2+ breast epithelial cells and/or ER+, p27+
breast epithelial cells, ER+, Sox17+ breast epithelial cells, ER+,
Cox2+ breast epithelial cells, ER+, Ki67+ breast epithelial cells,
AR+, p27+ breast epithelial cells, AR+, Sox17+ breast epithelial
cells, AR+, Cox2+ breast epithelial cells, and/or AR+, Ki67+ breast
epithelial cells) are determined, as described above.
[0192] In one embodiment of the above method, the frequency of the
p27+ breast epithelial cells, Ki67+ breast epithelial cells, Sox17+
breast epithelial cells, Cox2+ breast epithelial cells, ER+, p27+
breast epithelial cells, ER+, Sox17+ breast epithelial cells, ER+,
Cox2+ breast epithelial cells, ER+, Ki67+ breast epithelial cells,
AR+, p27+ breast epithelial cells, AR+, Sox17+ breast epithelial
cells, AR+, Cox2+ breast epithelial cells, and/or AR+, Ki67+ breast
epithelial cells is increased relative to the first control
frequency by at least 2-fold, at least 3-fold, at least 4-fold, at
least 5-fold, at least 10-fold, or more. Also preferably, in the
above method, the frequency of the p27+, Ki67+, Sox17+ and/or Cox2+
breast epithelial cells is decreased relative to the second control
frequency by at least 2-fold, at least 3-fold, at least 4-fold, at
least 5-fold, at least 10-fold, or more.
[0193] In another embodiment, step (b) of the method described
above comprises predicting that the subject has a relatively
elevated risk of developing breast cancer if the frequency of p27+
breast epithelial cells is 15% or greater (e.g., 15%, 16%, 17%,
18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%,
31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40% or greater) of the
breast epithelial cells in the sample; and step (c) comprises
predicting that the subject has a relatively reduced risk of
developing breast cancer if the frequency of p27+ breast epithelial
cells is less than 15% (e.g., 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%,
6%, 5%, 4%, 3%, 2%, 1% or less) of the breast epithelial cells in
the sample.
[0194] In another embodiment, step (b) of the method described
above comprises predicting that the subject has a relatively
elevated risk of developing breast cancer if the frequency of Ki67+
breast epithelial cells is 2% or greater or 3% or greater of the
breast epithelial cells in the sample, and step (c) comprises
predicting that the subject has a relatively reduced risk of
developing breast cancer if the frequency of Ki67+ breast
epithelial cells is less than 2% (e.g., 1.9%, 1.8%, 1.7%, 1.6%,
1.5%, 1.0%, 0.5%, or 0%) of the breast epithelial cells in the
sample.
[0195] In another embodiment, step (b) of the method described
above comprises predicting that the subject has a relatively
elevated risk of developing breast cancer if the frequency of p27+
breast epithelial cells is 15% or greater (e.g., 15%, 16%, 17%,
18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%,
31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40% or greater) of the
breast epithelial cells in the sample; and step (c) comprises
predicting that the subject has a relatively reduced risk of
developing breast cancer if the frequency of p27+ breast epithelial
cells is less than 15% (e.g., 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%,
6%, 5%, 4%, 3%, 2%, 1% or less) of the breast epithelial cells in
the sample.
[0196] In another embodiment, step (b) of the method described
above comprises predicting that the subject has a relatively
elevated risk of developing breast cancer if the frequency of p27+,
AR+ breast epithelial cells is 10% or greater (e.g., 10%, 11%, 12%,
13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%,
26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%,
39%, 40% or greater) of the breast epithelial cells in the sample;
and step (c) comprises predicting that the subject has a relatively
reduced risk of developing breast cancer if the frequency of p27+,
AR+ breast epithelial cells is less than 10% (e.g., 9%, 8%, 7%, 6%, 5%,
4%, 3%, 2%, 1% or less) of the breast epithelial cells in the
sample.
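The per-marker cutoffs recited in the preceding paragraphs can be gathered into a simple lookup, sketched below for illustration only. The dictionary and function names are hypothetical, the 2% value is used for Ki67+ (the text also recites 3%), and the sketch assumes that the 10% cutoff applies to the p27+, AR+ population in both steps (b) and (c).

```python
# Cutoffs in percent of breast epithelial cells in the sample, taken from
# the embodiments above. The keys are labels chosen for this sketch.
CUTOFFS_PCT = {
    "p27+": 15.0,
    "Ki67+": 2.0,       # the text alternatively recites 3%
    "p27+,AR+": 10.0,
}


def classify_marker(cell_type, frequency_pct):
    """Return "elevated" if the frequency is at or above the cutoff for
    the given cell type, otherwise "reduced"."""
    if cell_type not in CUTOFFS_PCT:
        raise KeyError(f"no cutoff defined for cell type {cell_type!r}")
    if not 0.0 <= frequency_pct <= 100.0:
        raise ValueError("frequency_pct must be between 0 and 100")
    return "elevated" if frequency_pct >= CUTOFFS_PCT[cell_type] else "reduced"
```

Keeping the cutoffs in one table makes it straightforward to substitute any of the alternative values listed in the embodiments without changing the comparison logic.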
[0197] In yet other embodiments, a method of predicting a subject's
risk of developing breast cancer is provided, wherein the method
comprises: (a) determining the expression level in a breast tissue
sample from a subject of at least one marker, e.g., p27, Sox17 and
Cox2; and (b) predicting that the subject has a relatively elevated
risk of developing breast cancer if the expression level of the at
least one marker is increased compared to a first control level of
the at least one marker; or (c) predicting that the subject has a
relatively reduced risk of developing breast cancer if the
expression level of the at least one marker is decreased compared
to a second control level of the at least one marker. Methods for
determining the expression level of markers p27, Sox17 and Cox2
(e.g., QPCR, FACS, immunohistochemistry, Western blot, ELISA) are
described above.
[0198] In step (b) in the above method, preferably, the expression
level of p27, Sox17 and/or Cox2 (e.g., mRNA and/or polypeptide) is
increased by at least 2-fold, at least 3-fold, at least 4-fold, at
least 5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at
least 9-fold, at least 10-fold, at least 20-fold or greater,
compared to the first control level (i.e., a control level from a
subject known to be at low risk of developing breast cancer). In
step (c) in the above method, preferably, the expression level of
p27, Sox17 and/or Cox2 (e.g., mRNA and/or polypeptide) is decreased
by at least 2-fold, at least 3-fold, at least 4-fold, at least
5-fold, at least 6-fold, at least 7-fold, at least 8-fold, at least
9-fold, at least 10-fold, at least 20-fold or more, compared to the
second control level (i.e., a control level from a subject known to
be at high risk of developing breast cancer).
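As a non-limiting illustration, the fold-change comparisons of steps (b) and (c) can be sketched as follows. The function name, the default minimum fold change of 2, the requirement that all levels be positive and expressed in the same units, and the "indeterminate" result are assumptions of this sketch.

```python
def classify_expression(sample_level, low_risk_control, high_risk_control,
                        min_fold=2.0):
    """Classify risk from a marker expression level (e.g., p27, Sox17 or
    Cox2 mRNA or polypeptide).

    low_risk_control  : first control level (subject known to be at low risk)
    high_risk_control : second control level (subject known to be at high risk)
    """
    levels = (sample_level, low_risk_control, high_risk_control)
    if any(level <= 0 for level in levels):
        raise ValueError("all expression levels must be positive")

    # Step (b): increased by at least min_fold over the first control.
    if sample_level / low_risk_control >= min_fold:
        return "elevated"
    # Step (c): decreased by at least min_fold below the second control.
    if high_risk_control / sample_level >= min_fold:
        return "reduced"
    # Neither criterion is met (an assumption; the text does not name this case).
    return "indeterminate"
```

In this sketch step (b) is evaluated first, so a sample meeting both criteria is reported as "elevated"; that ordering is a design choice of the example rather than a requirement of the method.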
[0199] In still other embodiments, methods of predicting the risk
of developing breast cancer are provided, which comprise
determining a parity/nulliparity-associated gene expression
signature in a sample comprising breast epithelial cells. Also
provided are methods of predicting breast cancer disease outcome by
testing for a parity/nulliparity-associated gene expression
signature in breast cancer cells.
[0200] As described above and in Example 10, the genes that were
shown to be upregulated or downregulated in FIG. 28 make up a
parity/nulliparity-related gene signature. Further, the genes for
which the expression profile is shown in FIG. 28 are described in
detail in Table 18, below. Of course, the skilled artisan will
appreciate that a parity/nulliparity-related gene signature can,
but does not necessarily, comprise all of the genes shown in Table
18. Such gene signature comprises 2 or more, 3 or more, 4 or more,
5 or more, 10 or more, 15 or more, 20 or more, 30 or more, 40 or
more, 50 or more, or 100 or more of the genes shown in Table
18.
[0201] Further, for each of the genes shown in Table 18, the
disease outcome associated with the expression of that particular
gene is shown (i.e., a prognosis of "good" or "bad"). Thus,
the skilled artisan can select one or more genes from the list of
genes in Table 18 that are correlated with a "good" prognosis
and/or one or more genes associated with a "bad" prognosis, and
assemble the selected genes in a custom gene signature. A subject's
gene expression profile for the genes in the custom signature can
be determined, and for example, if the subject expresses more of
the genes associated with a "bad" prognosis than the genes
associated with a "good" prognosis, then the patient's disease
outcome is predicted to be "bad" or "poor", whereas a subject
expressing more of the "good" prognosis genes is predicted to have a
"good" prognosis (i.e., more likely to survive the disease).
[0202] The above described methods of predicting a subject's risk
of developing cancer and for determining a disease outcome (e.g.,
prognosis), can be used, e.g., by the subject's physician to
determine the best course of treatment or prophylaxis to administer
to the subject in need thereof, as well as other courses of action.
For example, such methods can further comprise administering to a
subject identified as having an increased risk of developing breast
cancer, or a subject diagnosed with breast cancer and determined
according to the above methods to have a bad prognosis, a therapy
or therapeutic agent for treating, reducing the risk of developing,
or preventing breast cancer (e.g., ER+ or ER- breast cancer). In
other embodiments, the methods can comprise performing additional
diagnostic assays to confirm the diagnosis (e.g., imaging, biopsy,
etc.), recording the diagnosis in a database or medical history
(e.g., medical records) of the subject, performing diagnostic tests
on a family member of the subject, selecting the subject for
increased monitoring or periodically monitoring the health of the
subject (e.g., for development of signs or symptoms of breast
cancer, such as tumor development or tumor size changes (e.g.,
increased or decreased size)), for example by clinical breast exam,
mammography, MRI, or other suitable imaging or other diagnostic
method(s) known in the art.
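The tally of "good" and "bad" prognosis genes described in paragraph [0201] can be sketched, by way of example only, as follows. The gene names GENE_A, GENE_B and GENE_C are hypothetical placeholders and do not correspond to entries of Table 18; the function name and the "indeterminate" result for a tie are likewise assumptions.

```python
def predict_outcome(custom_signature, expressed_genes):
    """Predict disease outcome from a custom gene signature.

    custom_signature : dict mapping gene name -> "good" or "bad" prognosis
    expressed_genes  : set of signature genes the subject expresses
    """
    good = sum(1 for gene, prognosis in custom_signature.items()
               if prognosis == "good" and gene in expressed_genes)
    bad = sum(1 for gene, prognosis in custom_signature.items()
              if prognosis == "bad" and gene in expressed_genes)

    if bad > good:
        return "bad"
    if good > bad:
        return "good"
    return "indeterminate"  # equal counts are not addressed by the text


# Hypothetical three-gene signature used only to exercise the function.
example_signature = {"GENE_A": "good", "GENE_B": "good", "GENE_C": "bad"}
```

How a gene is determined to be "expressed" (e.g., by a detection threshold in QPCR or microarray data) is left outside this sketch.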
VII. ADMINISTRATION
[0203] Compositions and formulations comprising an inhibitor or
agonist of the invention (e.g., an inhibitor or agonist of a gene
or polypeptide mediating a function in a pathway that is
upregulated or downregulated in breast epithelial cells of
nulliparous women), can be administered topically, parenterally,
orally, by inhalation, as a suppository, or by other methods known
in the art. The term "parenteral" includes injection (for example,
intravenous, intraperitoneal, epidural, intrathecal, intramuscular,
intraluminal, intratracheal or subcutaneous). Exemplary routes of
administration include, e.g., intravenous, intraductal, and
intratumoral.
[0204] While it is possible to use an inhibitor or agonist of the
invention for therapy as is, it may be preferable to administer an
inhibitor or agonist as a pharmaceutical formulation, e.g., in
admixture with a suitable pharmaceutical excipient, diluent, or
carrier selected with regard to the intended route of
administration and standard pharmaceutical practice. Pharmaceutical
formulations comprise at least one active compound, or a
pharmaceutically acceptable derivative thereof, in association with
a pharmaceutically acceptable excipient, diluent, and/or carrier.
The excipient, diluent and/or carrier must be "acceptable," as
defined above.
[0205] Administration of a composition or formulation of the
invention can be once a day, twice a day, or more often. Frequency
may be decreased during a treatment maintenance phase of the
disease or disorder, e.g., once every second or third day instead
of every day or twice a day. The dose and the administration
frequency will depend on the clinical signs, which confirm
maintenance of the remission phase, with the reduction or absence
of at least one, or more preferably more than one, of the clinical
signs of the acute phase known to the person skilled in the art. More
generally, dose and frequency will depend in part on recession of
pathological signs and clinical and subclinical symptoms of a
disease condition or disorder contemplated for treatment with the
present compounds.
[0206] It will be appreciated that the amount of an inhibitor
required for use in treatment will vary with the route of
administration, the nature of the condition for which treatment is
required, and the age, body weight and condition of the patient,
and will be ultimately at the discretion of the attendant physician
or veterinarian. Compositions will typically contain an effective
amount of the active agent(s), alone or in combination. Preliminary
doses can be determined according to animal tests, and the scaling
of dosages for human administration can be performed according to
art-accepted practices.
[0207] Length of treatment, i.e., number of days, will be readily
determined by a physician treating the subject; however the number
of days of treatment may range from 1 day to about 20 days. As
provided by the present methods, and discussed below, the efficacy
of treatment can be monitored during the course of treatment to
determine whether the treatment has been successful, or whether
additional (or modified) treatment is necessary.
VIII. METHODS OF TREATING AND PREVENTING BREAST CANCER
[0208] Provided herein are methods for treating and preventing
estrogen-receptor-positive (ER+) breast cancer in a subject.
Typically, a subject that can be administered an inhibitor or
agonist, or composition, e.g., pharmaceutical composition,
comprising one or more inhibitors or agonists described above is a
premenopausal or postmenopausal woman. In some embodiments, the
subject has a BRCA-1 or BRCA-2 germline mutation.
[0209] In certain embodiments, methods of treating breast cancer
(e.g., ER+ or ER- breast cancer) in a subject are provided that
comprise administering to the subject a composition comprising an
inhibitor of a pathway that has increased activity in breast
epithelial cells (e.g., CD44+, CD24- breast epithelial cells) of
nulliparous women compared to the activity in breast epithelial
cells of parous women (i.e. a pathway active in nulliparous breast
epithelial cells). In other embodiments an agonist of a pathway
that has decreased activity in breast epithelial cells (e.g.,
CD44+, CD24- breast epithelial cells) of nulliparous women compared
to the activity in breast epithelial cells of parous women (i.e. a
pathway active in parous breast epithelial cells) can be
administered. Such inhibitors and agonists and the target pathways
and genes in those pathways are described in detail above.
[0210] In other embodiments, methods of preventing breast cancer
(e.g., ER+ or ER- breast cancer) in a subject are provided that
comprise administering to a subject at risk of developing breast
cancer an inhibitor of a pathway active in nulliparous breast
epithelial cells (e.g., CD44+, CD24- breast epithelial cells). For
example, the pathway can include a mediator molecule such as cAMP,
EGFR, Cox2, Hh, TGFBR, and IGFR, as described above. In another
embodiment, the method of preventing breast cancer in a subject
comprises administering to the subject an agonist of a pathway
active in parous breast epithelial cells (e.g., CD44+, CD24- breast
epithelial cells) (e.g., an agonist of Hakai/CBLL1, CASP8, SCRIB,
LLGL2, PI3K/AKT signaling, and apoptosis).
[0211] In certain embodiments, an inhibitor or agonist or any
combination of 2 or more, 3 or more, 4 or more, or 5 or more
inhibitors and/or agonists of the above-described target genes
and/or polypeptides can be administered in a combination therapy to
a subject for the treatment or prevention of breast cancer (e.g.,
ER+ or ER- breast cancer).
[0212] The skilled artisan will appreciate that other combinations
of inhibitors and/or agonists are possible, so long as the
combination results in the treatment or prevention of breast
cancer.
[0213] The skilled artisan will also appreciate that the methods of
treating breast cancer described herein (e.g., administration of
one or more of the inhibitors and agonists described above) may
also be administered in a combination therapy with other
treatments, e.g. other cancer therapies. Non-limiting examples of
such cancer therapies include, e.g., chemotherapy, radiation
therapy, biological therapy (e.g., antibodies, biological modifiers
(cytokines, growth factors, lymphokines, chemokines, etc.), immune
cell therapies (LAK cells, tumor specific CTL, etc.)),
anti-angiogenic therapy, surgery, and combinations thereof.
[0214] Chemotherapeutic agents include, for example: taxanes such
as taxol, taxotere or their analogues; alkylating agents such as
cyclophosphamide, ifosfamide, melphalan, hexamethylmelamine,
thiotepa or dacarbazine; antimetabolites such as pyrimidine
analogues, for instance 5-fluorouracil, cytarabine, capecitabine,
and gemcitabine or its analogues such as 2-fluorodeoxycytidine;
folic acid analogues such as methotrexate, idatrexate or
trimetrexate; spindle poisons including vinca alkaloids such as
vinblastine, vincristine, vinorelbine and vindesine, or their
synthetic analogues such as navelbine, or estramustine and a
taxoid; platinum compounds such as cisplatin; epipodophyllotoxins
such as etoposide or teniposide; antibiotics such as daunorubicin,
doxorubicin, bleomycin or mitomycin; enzymes such as
L-asparaginase; topoisomerase inhibitors such as topotecan or
pyridobenzoindole derivatives; and various agents such as
procarbazine, mitoxantrone, and biological response modifiers or
growth factor inhibitors such as interferons or interleukins. Other
chemotherapeutic agents include, though are not limited to, a
p38/JAK kinase inhibitor, e.g., SB203580; a phosphatidylinositol-3
kinase (PI3K) inhibitor, e.g., LY294002; a MAPK inhibitor, e.g.,
PD98059; a JAK inhibitor, e.g., AG490; preferred chemotherapeutics
such as UCN-01, NCS, mitomycin C (MMC), and anisomycin; and
taxoids in addition to those described above (e.g., as disclosed in
U.S. Pat. Nos. 4,857,653; 4,814,470; 4,924,011; 5,290,957;
5,292,921; 5,438,072; 5,587,493; European Patent No. 0 253 738; and
PCT Publication Nos. WO 91/17976, WO 93/00928, WO 93/00929, and WO
96/01815). In other embodiments, a cancer therapy can include but is
not limited to administration of cytokines and growth factors such
as interferon (IFN)-gamma, tumor necrosis factor (TNF)-alpha,
TNF-beta, and/or similar cytokines, or an antagonist of a tumor
growth factor (e.g., TGF-β and IL-10). Antiangiogenic agents
include, e.g., endostatin, angiostatin, TNP-470, and Caplostatin
(Satchi-Fainaro et al., Cancer Cell 7(3), 251 (2005)). Drugs that
interfere with intracellular protein synthesis can also be used in
the methods of the present invention; such drugs are known to those
skilled in the art and include puromycin, cycloheximide, and
ribonuclease.
[0215] For radiation therapy, common sources of radiation used for
cancer treatment include, but are not limited to, high-energy
photons that come from radioactive sources such as cobalt, cesium,
iodine, palladium, or a linear accelerator; proton beams; neutron
beams (often used for cancers of the head, neck, and prostate and
for inoperable tumors); x or gamma radiation; electron beams;
etc.
[0216] It is well known that radioisotopes, drugs, and toxins can
be conjugated to antibodies or antibody fragments which
specifically bind to markers which are produced by or associated
with cancer cells, and that such antibody conjugates can be used to
target the radioisotopes, drugs or toxins to tumor sites to enhance
their therapeutic efficacy and minimize side effects. Examples of
these agents and methods are reviewed in Wawrzynczak and Thorpe (in
Introduction to the Cellular and Molecular Biology of Cancer, L. M.
Franks and N. M. Teich, eds, Chapter 18, pp. 378-410, Oxford
University Press, Oxford, 1986), in Immunoconjugates: Antibody
Conjugates in Radioimaging and Therapy of Cancer (C. W. Vogel, ed.,
3-300, Oxford University Press, N.Y., 1987), in Dillman, R. O. (CRC
Critical Reviews in Oncology/Hematology 1:357, CRC Press, Inc.,
1984), in Pastan et al. (Cell 47:641, 1986), in Vitetta et al.
(Science 238:1098-1104, 1987) and in Brady et al. (Int. J. Rad.
Oncol. Biol. Phys. 13:1535-1544, 1987). Other examples of the use
of immunoconjugates for cancer and other forms of therapy have been
disclosed, inter alia, in U.S. Pat. Nos. 4,331,647, 4,348,376,
4,361,544, 4,468,457, 4,444,744, 4,460,459, 4,460,561, 4,624,846,
4,818,709, 4,046,722, 4,671,958, 4,046,784, 5,332,567, 5,443,953,
5,541,297, 5,601,825, 5,637,288, 5,677,427, 5,686,578, 5,698,178,
5,789,554, 5,922,302, 6,187,287, and 6,319,500.
IX. METHODS FOR DETERMINING EFFICACY OF A BREAST CANCER THERAPY
[0217] In certain embodiments, methods for determining the efficacy
of a breast cancer therapy (including prophylactic therapy) are
provided. The therapy can be a therapy described herein or any
other conventional breast cancer therapy. In one embodiment, the
efficacy of a cancer therapy is determined by comparing a subject's
parity/nulliparity-related gene expression profile before treatment
for the breast cancer to the subject's parity/nulliparity-related
gene expression profile during or after the treatment. Typically, a
subject that is in need of breast cancer treatment (including
prophylactic therapy, e.g., for a subject determined to have an
elevated risk of developing breast cancer) will have a
parity/nulliparity-related gene expression profile that most
closely resembles (i.e., is the same or similar to) the gene
signature for nulliparous women. After a successful therapy, it is
expected that the subject's gene expression profile will more
closely resemble the parity/nulliparity-related gene expression
profile of parous women, as described herein (e.g., FIG. 28 and
Table 18). A gene signature not resembling the gene expression
profile of parous women is an indication that the treatment was not
successful, and further treatment or a different treatment is
needed.
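The disclosure does not prescribe a particular measure of how closely a subject's profile "resembles" the parous or nulliparous signature. Purely as an illustrative assumption, the sketch below uses the Pearson correlation between the subject's expression values and two reference profiles; the function names and the choice of correlation as the similarity measure are not taken from the disclosure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def closer_signature(profile, parous_ref, nulliparous_ref):
    """Report which reference profile the subject's profile correlates
    with more strongly. All three sequences list expression values for
    the same genes in the same order."""
    r_parous = pearson(profile, parous_ref)
    r_nulliparous = pearson(profile, nulliparous_ref)
    if r_parous > r_nulliparous:
        return "parous"
    if r_nulliparous > r_parous:
        return "nulliparous"
    return "indeterminate"


def treatment_appears_successful(before, after, parous_ref, nulliparous_ref):
    """True if the profile after treatment is closer to the parous reference."""
    return closer_signature(after, parous_ref, nulliparous_ref) == "parous"
```

Other similarity measures (e.g., rank correlation or distance-based clustering) could be substituted without changing the overall before-and-after comparison.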
[0218] In other embodiments, a method for determining efficacy of
a breast cancer therapy (including prophylactic therapy) comprises
measuring the level of a specific gene and/or polypeptide before
and after (or during) the therapy. For example, as described above,
in certain embodiments a method for treating or preventing breast
cancer comprises administering an inhibitor or agonist of a
specific gene or polypeptide. The level or activity of the target
gene or polypeptide can be measured before or at the beginning of
treatment, and then again during or after treatment; typically,
when an inhibitor is administered as a cancer therapy, the
inhibition and therapy are deemed effective if the level or activity
of the target gene or polypeptide is decreased by at least 2-fold,
at least 3-fold, at least 4-fold, at least 5-fold, at least
10-fold, or more, relative to the level of the target gene or
polypeptide at the beginning of or before commencement of the
cancer therapy. Typically, when an agonist is administered as a
cancer therapy, the activation and therapy are deemed effective if
the level or activity of the target gene or polypeptide is
increased by at least 2-fold, at least 3-fold, at least 4-fold, at
least 5-fold, at least 10-fold, or more, relative to the level of
the target gene or polypeptide at the beginning of or before
commencement of the cancer therapy.
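The efficacy criteria of this paragraph can be illustrated with the following minimal sketch. The function name, the string labels for the agent type, the default minimum fold change of 2, and the requirement that both measurements be positive and in the same units are assumptions made for illustration.

```python
def therapy_effective(agent_type, baseline_level, follow_up_level,
                      min_fold=2.0):
    """Apply the fold-change criterion to a target gene or polypeptide.

    agent_type      : "inhibitor" or "agonist"
    baseline_level  : level or activity before or at the start of therapy
    follow_up_level : level or activity during or after therapy
    """
    if baseline_level <= 0 or follow_up_level <= 0:
        raise ValueError("levels must be positive")

    if agent_type == "inhibitor":
        # Effective if the target is decreased by at least min_fold.
        return baseline_level / follow_up_level >= min_fold
    if agent_type == "agonist":
        # Effective if the target is increased by at least min_fold.
        return follow_up_level / baseline_level >= min_fold
    raise ValueError("agent_type must be 'inhibitor' or 'agonist'")
```

A result of `False` in this sketch corresponds to the situation, discussed in the following paragraph, in which an additional or different therapy may be considered.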
[0219] The above described methods can further comprise
administering to the subject (e.g., a subject in which the efficacy
of the breast cancer therapy was determined to be poor or not
optimal) an additional therapy or therapeutic agent for treating,
reducing the risk of developing, or preventing breast cancer (e.g.,
ER+ or ER- breast cancer). In other embodiments, the methods can
comprise recording the results in a database or medical history
(e.g., medical records) of the subject, selecting the subject for
increased monitoring or periodically monitoring the health of the
subject (e.g., for development or changes in the signs or symptoms
of the breast cancer, such as tumor development and/or changes in
tumor size (e.g., increased or decreased size)), for example by
clinical breast exam, mammography, MRI, or other suitable imaging
or other diagnostic method(s) known in the art.
[0220] Methods for determining the level of a target gene or
polypeptide are well known in the art, as described above.
[0221] As above, such methods can be conducted in parallel, or
before or after, conventional methods for determining success of a
treatment, such as, e.g. measuring tumor size or other symptoms of
breast cancer known in the art.
X. KITS
[0222] In certain embodiments, kits are provided for predicting a
subject's risk of developing breast cancer. In other embodiments,
kits are provided for predicting a subject's breast cancer disease
outcome (i.e., prognosis, e.g., likelihood of surviving the disease).
In other embodiments, kits are provided for treating breast cancer.
In still other embodiments, kits are provided for determining the
efficacy of a cancer therapy.
[0223] The above kits can comprise means (e.g., reagents, dishes,
solid substrates (e.g., microarray slides, ELISA plates, multiplex
beads), solutions, media, buffers, etc.) for determining the level
of expression or activity of one or more of the genes and/or
pathways described herein. Such kits can further comprise
instructions for use, e.g., guidelines for determining the efficacy
of a cancer therapy, or for predicting a subject's risk of
developing breast cancer, based on the level of expression or
activity of the one or more genes detected using the kit.
[0224] Other kits comprise means for determining (e.g., reagents,
dishes, solid substrates (e.g., microarray slides, ELISA plates,
multiplex beads), solutions, media, buffers, etc.) the frequency of
breast epithelial cell types (e.g., the frequency of CD44+, CD24-
breast epithelial cells, CD24+ breast epithelial cells, CD10+
breast epithelial cells, p27+ breast epithelial cells, Ki67+ breast
epithelial cells, Sox17+ breast epithelial cells and/or Cox2+
breast epithelial cells, and/or ER+, p27+ breast epithelial cells,
ER+, Sox17+ breast epithelial cells, ER+, Cox2+ breast epithelial
cells, ER+, Ki67+ breast epithelial cells, AR+, p27+ breast
epithelial cells, AR+, Sox17+ breast epithelial cells, AR+, Cox2+
breast epithelial cells, and/or AR+, Ki67+ breast epithelial
cells). Such kits can comprise means for detecting expression
(e.g., mRNA and/or protein) levels of one or more of the markers
(e.g., CD44, CD24, CD10, p27, Ki67, Sox17, and/or Cox2) of the cell
types described above. Such kits can also comprise instructions for
determining a subject's risk of developing breast cancer based on
the frequencies of those cell types determined. The frequencies
that indicate an elevated or reduced risk of developing breast
cancer are disclosed above and in the present Examples.
[0225] Other kits can comprise means for determining a
parity/nulliparity gene expression profile. For example, such kits
can comprise a microarray slide or slides comprising probes for two
or more genes making up the parity/nulliparity gene expression
profile, or means for performing PCR (e.g., QPCR), such as forward
and reverse primers, reverse transcriptase, plates, and/or other
PCR reagents. Such kits can further comprise instructions for
determining a subject's disease outcome based on the subject's
parity/nulliparity gene expression profile, as described above and
in the present Examples, and may also provide a standard or
reference gene expression profile for comparison.
[0226] Other kits comprise one or more inhibitors or agonists of
pathways active in nulliparous or parous breast epithelial cells
(e.g. CD44+, CD24- breast epithelial cells), as described herein,
for the treatment or prevention of breast cancer (e.g., ER+ or ER-
breast cancer), and, optionally instructions for use (e.g.
administration and/or dosage).
[0227] In other embodiments, a kit comprises an array containing a
substrate having at least 10, 25, 50, 100, 200, 500, or 1,000
addresses, wherein each address has disposed thereon a capture
probe that includes: (a) a nucleic acid sequence consisting of a
tag nucleotide sequence for the detection of a gene identified in
Tables 4, 5, 6, 7 and/or 18 (e.g., HSD17B11, HSD17B12, HSD17B14,
HSP90AB1 (GenBank Accession No. AAH09206), PSA (KLK3), NCOR1,
NCOR2, NCOA4, NCOA7, SFRP2, SFRP4, VEGFA, NOTCH1, FN1, ITGA4,
ITGB1, TSPAN6, RhoA, RAC1, CDC42, PHB4, BCL2L11, TNFRSF4, BMPR2,
CASP8, PP2A, PIK3CG, ILK, PDPK1, Hakai/CBLL1, SCRIB, LLGL2,
MAP2K4 (GenBank Accession No. NM_003010.2), PTP4A2 (GenBank
Accession No. NM_080391.3), EPHB4 (GenBank Accession No.
NM_004444), SPARC (GenBank Accession No. NM_003118.3),
RAB32 (GenBank Accession No. NM_006834.3), FIGF (GenBank
Accession No. NM_004469.4), SNX3 (GenBank Accession Nos.
NM_003795.4, NM_152827.2), GADD45A (GenBank Accession
Nos. NM_001924.3, NM_001199741.1,
NM_001199742.1), ANXA3 (GenBank Accession No.
NM_005139.2), and HSPA2 (GenBank Accession No.
NM_021979.3)); and (b) the complement of the nucleic acid
sequence.
[0228] Another kit provided herein contains at least 10 antibodies
each of which is specific for a different protein encoded by a gene
identified in Tables 4, 5, 6, 7 and/or 18. The antibodies can be,
for example, but not limited to, specific for a protein such as
HSD17B11, HSD17B12, HSD17B14, HSP90AB1 (GenBank Accession No.
AAH09206), PSA (KLK3), NCOR1, NCOR2, NCOA4, NCOA7, SFRP2, SFRP4,
VEGFA, NOTCH1, FN1, ITGA4, ITGB1, TSPAN6, RhoA, RAC1, CDC42, PHB4,
BCL2L11, TNFRSF4, BMPR2, CASP8, PP2A, PIK3CG, ILK, PDPK1,
Hakai/CBLL1, SCRIB, and LLGL2, MAP2K4 (GenBank Accession No.
NM.sub.--003010.2), PTP4A2 (GenBank Accession No.
NM.sub.--080391.3), EPHB4 (GenBank Accession No. NM.sub.--004444),
SPARC (GenBank Accession No. NM.sub.--003118.3), RAB32 (GenBank
Accession No. NM.sub.--006834.3), FIGF (GenBank Accession No.
NM.sub.--004469.4), SNX3 (GenBank Accession Nos. NM.sub.--003795.4,
NM.sub.--152827.2), GADD45A (GenBank Accession Nos.
NM.sub.--001924.3, NM.sub.--001199741.1, NM.sub.--001199742.1),
ANXA3 (GenBank Accession Nos. NM.sub.--005139.2), and HSPA2
(GenBank Accession No. NM.sub.--021979.3). The kit can contain at
least 5 antibodies, at least 10 antibodies, at least 15 antibodies,
at least 25 antibodies; at least 50 antibodies; at least 100
antibodies; at least 200 antibodies; or at least 500
antibodies.
[0229] The kits, regardless of type, will generally comprise one or
more containers into which the biological agents (e.g. inhibitors)
are placed and, preferably, suitably aliquotted. The components of
the kits may be packaged either in aqueous media or in lyophilized
form.
[0230] In accordance with the present invention, there may be
employed conventional molecular biology, microbiology, recombinant
DNA, immunology, cell biology and other related techniques within
the skill of the art. See, e.g., Sambrook et al., (2001) Molecular
Cloning: A Laboratory Manual. 3rd ed. Cold Spring Harbor Laboratory
Press: Cold Spring Harbor, N.Y.; Sambrook et al., (1989) Molecular
Cloning: A Laboratory Manual. 2nd ed. Cold Spring Harbor Laboratory
Press: Cold Spring Harbor, N.Y.; Ausubel et al., eds. (2005)
Current Protocols in Molecular Biology. John Wiley and Sons, Inc.:
Hoboken, N.J.; Bonifacino et al., eds. (2005) Current Protocols in
Cell Biology. John Wiley and Sons, Inc.: Hoboken, N.J.; Coligan et
al., eds. (2005) Current Protocols in Immunology, John Wiley and
Sons, Inc.: Hoboken, N.J.; Coico et al., eds. (2005) Current
Protocols in Microbiology, John Wiley and Sons, Inc.: Hoboken,
N.J.; Coligan et al., eds. (2005) Current Protocols in Protein
Science, John Wiley and Sons, Inc.: Hoboken, N.J.; Enna et al.,
eds. (2005) Current Protocols in Pharmacology John Wiley and Sons,
Inc.: Hoboken, N.J.; Hames et al., eds. (1999) Protein Expression:
A Practical Approach. Oxford University Press: Oxford; Freshney
(2000) Culture of Animal Cells: A Manual of Basic Technique. 4th
ed. Wiley-Liss; among others. The Current Protocols listed above
are updated several times every year.
[0231] The following examples are meant to illustrate, not limit,
the invention.
EXAMPLES
Example 1
Materials and Methods
[0232] The following are the materials and methods used in the
Examples set forth below.
[0233] FACS (Fluorescence Activated Cell Sorting)
[0234] A single-cell suspension of human mammary epithelial cells
was obtained from organoids after trypsinization (5 mins,
37.degree. C.) and filtration through 40 .mu.m cell strainers.
Leukocytes, fibroblasts, and endothelial cells were removed by
immuno-magnetic bead purification using cell-type-specific surface
markers essentially as previously described [Bloushtain-Qimron, et
al. (2008). Proc Natl Acad Sci USA 105, 14076-14081; Shipitsin, M.,
et al. (2007). Cancer Cell 11, 259-273]. Cells were re-suspended in
ice cold PBE (0.5% BSA and 2 mM EDTA in PBS) at 2.times.10.sup.6
cells/ml. 2.times.10.sup.5 cells from each sample were used for
multicolor FACS analysis. Cells were stained with propidium iodide
(PI, Sigma), FITC conjugated anti-human EpCAM (Dako, clone
Ber-Ep4), PE-conjugated anti-human CD49f (BD, clone GoH3),
PE/Cy7-conjugated anti-human CD10 (Biolegend, Clone HI10a),
APC-conjugated anti-human CD24 (Biolegend, clone ML5), and purified
anti-human CD44 (BD, Clone 515). CD44 antibody was pre-labeled with
Zenon Alexa 405 mouse IgG1 kit (Invitrogen). Only PI-negative
(viable) cells were used to calculate the relative fraction of each
cell population.
[0235] Multicolor Immunofluorescence and Immunohistochemical
Analyses
[0236] Multicolor immunofluorescence for CD44 (Neomarkers, clone
156-3C11, mouse monoclonal IgG2), CD24 (SWAII clone, generously
provided by Dr. Peter Altevogt (German Cancer Research Center,
Heidelberg, Germany), mouse monoclonal IgG2), p27 (BD Biosciences,
clone 57/Kip1/p27, mouse monoclonal IgG1), Sox17 (R&D Systems,
clone 245013, mouse monoclonal IgG3), COX2 (Cayman Chemical, clone
CX229, mouse monoclonal IgG1), Ki67 (DAKO, clone MIB-1, mouse
monoclonal IgG1), Ki67 (Abcam, #16667, rabbit monoclonal),
bromodeoxyuridine (BrdU, Roche, clone BMC9318, mouse monoclonal
IgG1), CD10 (DAKO M7308), p63 clone 4A4 (Santa Cruz SC-8431), SMA
clone 1A4 (DAKO M0851), Axin2 clone 354214 (R&D Systems
MAB6078), Phospho-EGF Receptor (Tyr1173) clone 53A5 (Cell
Signaling #4407), Phospho-Smad2 (Ser 465/467) (Cell Signaling
#3101), Gata3 (Santa Cruz SC-268), estrogen receptor (clone SP1,
Thermo Scientific RM-9101), and androgen receptor (clone D6F11, Cell
Signaling #5153) was performed using whole sections of
formalin-fixed, paraffin-embedded (FFPE) normal human breast tissue.
[0237] The tissues were deparaffinized in xylene and hydrated in a
series of 100%, 70%, 50% and 0% ethanol solutions. After
heat-induced antigen retrieval in citrate buffer (pH 6), the
samples were blocked with goat serum and sequentially stained with
the different primary and secondary antibodies. The sequential
staining was optimized to avoid cross-reaction between antibodies
and was performed as follows: monoclonal (IgG2a) antibody anti-CD44
(1:100 dilution) for one hour at room temperature; goat anti-mouse
IgG2a Alexa 555-conjugated (Invitrogen, 1:100 dilution) for 30
minutes at room temperature; monoclonal antibody anti-p27 (1:100
dilution) or monoclonal antibody anti-Sox17 (1:50 dilution) or
anti-COX2 (1:50 dilution), and monoclonal antibody anti-CD24 (1:25
dilution) biotin labeled (Zenon.RTM. Biotin-XX Rabbit IgG Labeling
Kit, Invitrogen), p63 (1:100 dilution), SMA (1:80 dilution), CD10
(1:100 dilution), Gata3 (1:50 dilution) for one hour at room
temperature; goat anti-mouse IgG1 Alexa 488-conjugated (Invitrogen,
1:100 dilution, for detection of p27 or COX2), goat anti-mouse
Alexa 488/555/647 (Invitrogen 1:100 dilution, for detection of p63,
SMA, CD10 and Gata3) or goat anti-mouse IgG3 Alexa 488-conjugated
(Invitrogen, 1:100 dilution, for detection of Sox17) and
streptavidin Alexa-647 conjugated for 30 minutes at room
temperature.
[0238] The multicolor immunofluorescence for p27 and Ki67 was
performed by incubating the samples with monoclonal antibody
anti-p27 (1:100 dilution) and polyclonal antibody anti-Ki67 (1:50
dilution) for one hour at room temperature followed by goat
anti-mouse IgG1 Alexa 555-conjugated (Invitrogen, 1:100 dilution,
for detection of p27) and goat anti-rabbit Alexa 488-conjugated
(Invitrogen, 1:100 dilution, for detection of Ki67) for 30 minutes
at room temperature. Multicolor immunofluorescence for pSMAD2 (1:50
dilution), pEGFR (1:50 dilution) and Axin2 (1:20 dilution) was
performed by incubation with the primary antibody for 2 h at room
temperature or overnight at 4.degree. C., followed by incubation for
30 minutes at room temperature with an anti-rabbit Alexa
488-conjugated secondary antibody (Invitrogen, 1:100 dilution, for
pSMAD2 and pEGFR) or an anti-mouse IgG1 Alexa 488-conjugated
secondary antibody (Invitrogen, 1:100 dilution, for Axin2).
[0239] The samples were washed twice with PBS-Tween 0.05% between
incubations and protected for long-term storage with VECTASHIELD
HardSet Mounting Medium with DAPI (Vector laboratories, cat
#H-1500). Before image analysis, the samples were stored at
-20.degree. C. for at least 48 hours. Different immunofluorescence
images from multiple areas of each sample were acquired with a
Nikon Ti microscope attached to a Yokogawa spinning-disk confocal
unit, 60.times. plan apo objective, and OrcaER camera controlled by
Andor iQ software. For the immunohistochemical detection of Sox17
and COX2 the samples were stained with antibodies against Sox17 and
COX2 as above, and then incubated with anti-mouse IgG biotinylated
antibody (1:100 dilution) for 30 minutes at room temperature
followed by the ABC peroxidase System (Vectastain.RTM., ABC System
Vector Laboratories). DAB (3,3'-diaminobenzidine) was used as
colorimetric substrate and the signal was enhanced by the addition
of 0.04% of nickel chloride. The slides were finally counterstained
with Methyl green.
[0240] Scoring for the expression of each marker was done as
follows: p27 fluorescence intensity was scored in the nuclei of 20
randomly selected cells using the ImageJ 1.43r software; Sox17 and
COX2 expression was inferred by the combination of two variables:
1) the percentage of cells expressing each marker, and 2) the
intensity of each marker transformed into a categorical variable
based on 0 no expression, 1 weak expression, 2 moderate expression
and 3 high expression; the percentage of p27+, Ki67+ and BrdU+
cells was estimated by counting an average of 1000 cells/sample in
the case of the mammary epithelium for premenopausal,
postmenopausal and high-low density cases, and an average of 2,000
cells in the case of the tissue slice cultures. The percentage of pSMAD2+ cells
was estimated by counting an average of 600 cells/sample. For pEGFR
and Axin2 fluorescence intensity measurement, mean fluorescence
intensity was measured using Image J 1.43r software by counting an
average of 600 cells/sample corrected by area and subtracting the
average of background fluorescence intensity. RGB profile was also
generated using Image J 1.43 software. For multicolor
immunofluorescence of p27 and ER, p27 (1:100 dilution) and ER
(1:500 dilution) antibodies were incubated overnight at 4.degree. C.
followed by incubation at RT for 1 h, with subsequent staining by
goat anti-mouse IgG1 Alexa 555-conjugated (Invitrogen, 1:100
dilution, for detection of p27), while detection of the ER antibody
was performed with a biotinylated anti-rabbit secondary antibody
(1:100 dilution) using the Perkin Elmer TSA.TM. Indirect tyramide
amplification kit (NEL700001KT) and streptavidin-conjugated Alexa
647 from Invitrogen (1:80 dilution). For p27 and AR staining, p27
(1:100 dilution) and AR (1:30 dilution) antibodies were incubated
overnight at 4.degree. C. followed by incubation at RT for 1 h, with
subsequent staining by goat anti-mouse IgG1 Alexa 555-conjugated
(Invitrogen, 1:100 dilution, for detection of p27) and anti-rabbit
IgG Alexa 488-conjugated (Invitrogen, 1:80 dilution). The percentage
of p27+, AR+, and ER+ cells was estimated by counting 500-1000
cells/sample. Nuclei were stained with DAPI, and multiple
fluorescence images from each section were acquired with a
40.times. plan apo objective, following the procedure described
above.
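As an illustration only, the area correction and background subtraction described above for the pEGFR and Axin2 measurements can be sketched as follows. The function name, the input values, and the reading of "corrected by area" as integrated intensity divided by cell area are assumptions made for this sketch; they are not taken from the Image J workflow used in the study.

```python
from statistics import mean

def corrected_mean_intensity(integrated_intensity, area, background_means):
    """Area-normalized intensity minus the average background intensity.

    All three arguments are hypothetical inputs: the summed pixel intensity
    of one cell, its area in pixels, and a list of background readings.
    """
    if area <= 0:
        raise ValueError("area must be positive")
    return integrated_intensity / area - mean(background_means)

# Hypothetical measurements for three cells: (integrated intensity, area).
cells = [(12000.0, 100.0), (9000.0, 90.0), (15000.0, 120.0)]
background = [20.0, 22.0, 18.0]  # hypothetical background readings

per_cell = [corrected_mean_intensity(i, a, background) for i, a in cells]
sample_mean = mean(per_cell)  # one value per sample, averaged over cells
```

In practice the per-sample value would be averaged over the roughly 600 cells counted per sample rather than the three shown here.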
[0241] Culture of Tissue Slices
[0242] Normal human breast tissues were collected from reduction
mammoplasties, transported in ice-cold DMEM-F12 medium, and
processed within 24 hrs. For organ cultures, thin (.about.1 mm
thick) slices of tissue were cut from epithelium-enriched areas and
cultured for 8 days in 6-well plates using co-culture inserts to
optimize the tissue/medium contact surface and changing medium (2
ml/well) every 24 hrs. The M87A medium previously optimized for
human primary mammary epithelial cultures was used [see,
Bloushtain-Qimron, et al. (2008) supra; Garbe, J. C., et al.
(2009). Cancer Res 69, 7557-7568]. Inhibitors used included
cyclopamine (Selleck Chemicals, cat#S1146)--inhibitor of Smo
receptor of Hh ligands, LY2109761 (Eli Lilly)--inhibitor of TGFBR
kinases, celecoxib (LKT laboratories, cat#C1644)--inhibitor of
Cox2, 2',5'-dideoxyadenosine (Enzo Life Sciences,
cat#BML-CN110-005)--adenylate cyclase inhibitor, tyrphostin AG1478
(Cayman Chemicals, cat#10010244)--EGFR inhibitor, XAV939 (Tocris
Bioscience, cat#3748)--Tankyrase (TNKS) inhibitor that antagonizes Wnt
signaling via stimulation of .beta.-catenin degradation and
stabilization of axin, and picropodophyllotoxin (Tocris Bioscience,
cat#2956)--IGFR inhibitor. Stock solutions (1,000.times.) were
prepared in DMSO. Final drug concentrations were as follows:
cyclopamine--10 .mu.M, LY2109761--500 nM, celecoxib--100 .mu.M,
2',5'-dideoxyadenosine--100 .mu.M, AG1478--10 .mu.M, XAV939--1 .mu.M
and picropodophyllotoxin--0.5 .mu.M. Following 8 days of culture,
tissue slices were pulse-labeled with bromo-deoxy-uridine (30 .mu.M
final concentration) for 5 hrs before fixing the tissue in buffered
formalin at room temperature for 24 hrs followed by embedding in
paraffin. Experiments were performed in triplicate using tissue
from different regions of the same breast, uncultured tissue and
tissue cultured without any drugs as controls. To experimentally
reproduce hormone levels in follicular and luteal phase of the
menstrual cycle and in mid-pregnancy, the following was used: 0.5
nM of estradiol for 8 days to mimic follicular phase; 1.2 nM of
estradiol for 2 days (representing ovulation) followed by 0.7 nM of
estradiol and 50 nM of progesterone for 6 days to mimic luteal
phase; and a combination of 250 nM estradiol, 600 nM progesterone,
600 ng/mL prolactin, and 10 IU/mL HCG for 8 days to mimic pregnancy
in the normal breast.
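The dilution arithmetic implied by the 1,000.times. stock solutions and the 2 ml/well medium volume can be sketched as follows. This is a worked calculation only; the helper names are hypothetical, and the stock concentrations it prints are derived from the stated final concentrations rather than reported in the text.

```python
# Final concentrations from the text, in micromolar (500 nM = 0.5 uM).
FINAL_UM = {
    "cyclopamine": 10.0,
    "LY2109761": 0.5,
    "celecoxib": 100.0,
    "2',5'-dideoxyadenosine": 100.0,
    "AG1478": 10.0,
    "XAV939": 1.0,
    "picropodophyllotoxin": 0.5,
}
STOCK_FOLD = 1000      # stock solutions were prepared at 1,000x in DMSO
WELL_VOLUME_ML = 2.0   # medium volume per well

def stock_concentration_mm(final_um, fold=STOCK_FOLD):
    """Stock concentration in mM for a given final concentration in uM."""
    return final_um * fold / 1000.0  # uM -> mM

def stock_volume_ul(well_volume_ml=WELL_VOLUME_ML, fold=STOCK_FOLD):
    """Approximate volume of stock (uL) to add per well for a 1x final dose."""
    return well_volume_ml * 1000.0 / fold  # mL -> uL, then divide by fold

for drug, final_um in FINAL_UM.items():
    print(f"{drug}: {stock_concentration_mm(final_um):g} mM stock, "
          f"{stock_volume_ul():g} uL per well")
```

Under these assumptions each well receives about 2 .mu.l of stock, which corresponds to a final DMSO concentration of roughly 0.1%.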
[0243] PCA Analysis and Plot
[0244] Unsupervised principal component analysis (PCA) was applied
using R package `pcurve` to gene expression profiles of different
cell types from parous and nulliparous tissues. The mean of each
sample was centered to zero before PCA analysis. Genes were the
feature variables and samples were projected onto the principal
components. OpenGL was used to plot PCA results by projecting each
sample onto the first three principal components. Using the projected
values on the three largest principal components as the Euclidean
coordinates for each individual, the paired Euclidean distance between
nulliparous and parous individuals for each cell type was
calculated. The distance is a global measurement of the difference
between individuals. It indicated, for example, that the gene
expression of CD44.sup.+ cells changed the most, as these cells had
the largest distance between nulliparous and parous samples.
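The distance step can be illustrated with a minimal sketch that assumes the projection onto the first three principal components has already been computed. The coordinates below are hypothetical and are not study data; averaging over every nulliparous/parous pair is one possible reading of "paired Euclidean distance" and is an assumption of this sketch.

```python
from math import dist
from statistics import mean

# Hypothetical (PC1, PC2, PC3) coordinates per cell type; not study data.
pc_coords = {
    "CD44+": {
        "nulliparous": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        "parous": [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0)],
    },
    "CD10+": {
        "nulliparous": [(0.0, 0.0, 0.0)],
        "parous": [(0.0, 1.0, 0.0)],
    },
}

def mean_pairwise_distance(group_a, group_b):
    """Mean Euclidean distance over every (a, b) pair of individuals."""
    return mean(dist(a, b) for a in group_a for b in group_b)

distances = {
    cell_type: mean_pairwise_distance(g["nulliparous"], g["parous"])
    for cell_type, g in pc_coords.items()
}
# The cell type with the largest nulliparous-to-parous distance.
most_changed = max(distances, key=distances.get)
```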
[0245] Rat Gene Expression Data Analysis and Comparison with
Human
[0246] Previously published gene expression data from virgin and
parous rats of four inbred strains (Wistar-Furth, Copenhagen,
Fischer 344, and Lewis) were reanalyzed [Blakely, C. M., et
al. (2006). Cancer Res 66, 6421-6431]. The raw data (generated
using the RG_U34A array) were obtained online and normalized by RMA
using default parameters, followed by the selection of
differentially expressed genes using the SAM (significance analysis
of microarrays) algorithm [Tusher, et al. (2001) Proc Natl Acad Sci
USA 98, 5116-5121]. Differentially expressed genes for each strain
were called using a p value cutoff of 0.05, and the union of these
was defined as the "rat differential gene list". Genes that appeared
in both the up and down union groups were excluded. Only genes that
had homologues in both species were used for comparisons.
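The set operations described above (union across strains, exclusion of genes found in both the up and down unions, and restriction to genes with homologues) can be sketched as follows. The function name and the single-letter gene identifiers are hypothetical placeholders; the per-strain gene calls themselves would come from the SAM analysis.

```python
def rat_differential_gene_list(up_by_strain, down_by_strain, homologues):
    """Build the up and down gene lists from per-strain differential calls.

    up_by_strain / down_by_strain map a strain name to a set of gene IDs;
    homologues is the set of rat genes that have a human homologue.
    """
    up_union = set().union(*up_by_strain.values())
    down_union = set().union(*down_by_strain.values())
    ambiguous = up_union & down_union  # called up in one strain, down in another
    up = (up_union - ambiguous) & homologues
    down = (down_union - ambiguous) & homologues
    return up, down

# Hypothetical calls for the four strains.
up_calls = {"WistarFurth": {"A", "B"}, "Copenhagen": {"B", "C"},
            "Fischer344": {"D"}, "Lewis": set()}
down_calls = {"WistarFurth": {"E"}, "Copenhagen": {"C"},
              "Fischer344": {"F"}, "Lewis": {"G"}}
with_homologue = {"A", "B", "C", "D", "E", "F"}

up_genes, down_genes = rat_differential_gene_list(
    up_calls, down_calls, with_homologue)
```

In this toy input, gene C is dropped because it appears in both unions, and gene G is dropped because it has no homologue.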
[0247] Supervised Principal Component Analysis with Randomized
Input
[0248] Supervised principal component analysis (SPCA) was used for
selection of a subset of genes with prognostic value from
differentially expressed genes [Tibshirani, R., et al. (2004).
Bioinformatics 20:3034-3044]. The training (Wang's) cohort [Wang,
Y., et al. (2005) Lancet 365, 671-679] was randomly split after
appropriate filtering of patients into training set and testing set
of the same size (the same number of individual patients).
Traditional PCA uses all genes to identify principal components in
an unsupervised way. However, the 1.sup.st principal component of
unsupervised PCA might not be the projection direction of
interest. SPCA in this study finds the principal components using
only genes correlated with survival (e.g., log rank test p value 0.05
as cutoff using univariate Cox regression). The 1.sup.st principal
component was used to predict the survival outcome. The correlation
between a gene and the predicted outcome was used as the importance
score to rank genes. Cross-validation was applied to
determine the cut-off for significance. Genes with an importance
score higher than this cut-off formed the gene signature. For each
random split configuration, a parity signature was obtained using
SPCA. To get a robust gene signature, Wang's data was randomly split
into training and testing sets 1,000 times and a signature for each
configuration was obtained. It was reasoned that genes that
significantly contribute to breast cancer progression should appear
in signatures more often than expected at random. Those
genes whose frequency of appearance in the signatures was 5 times
higher than the random background were chosen as the final parity
gene signature.
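The final selection step, in which genes are retained according to how often they appear across the random splits, can be sketched as follows. The SPCA step itself is not reproduced here. The function name is hypothetical, and the definition of the random background (mean signature size divided by the number of candidate genes, i.e., the frequency expected if signature genes were drawn uniformly at random) is an assumption, since the text does not specify how the background was computed.

```python
from collections import Counter

def robust_signature(signatures, candidate_genes, fold=5.0):
    """Keep genes whose appearance frequency exceeds fold x background.

    signatures: one set of gene IDs per random training/testing split.
    candidate_genes: all genes that could have entered a signature.
    """
    n_splits = len(signatures)
    counts = Counter(gene for sig in signatures for gene in sig)
    mean_size = sum(len(sig) for sig in signatures) / n_splits
    background = mean_size / len(candidate_genes)  # assumed null frequency
    return {gene for gene in candidate_genes
            if counts[gene] / n_splits > fold * background}

# Toy input: 20 splits, 100 hypothetical candidate genes g0..g99.
candidates = [f"g{i}" for i in range(100)]
# g0 enters every signature; g1..g20 each enter exactly one.
toy_signatures = [{"g0", f"g{i + 1}"} for i in range(20)]

final_signature = robust_signature(toy_signatures, candidates)
```

With the 1,000 splits described above, `signatures` would hold 1,000 sets, one per split configuration.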
[0249] Prognostic Signature
[0250] 3,515 genes were identified that were differentially
expressed after pregnancy in CD44+ cells at p value cut-off 0.05
using SageExpress pipeline [Wu, Z. J., et al. (2010). Genome Res
20, 1730-1739]. Pregnancy resulted in multifaceted alterations of
the mRNA expression levels in cells. Applying univariate Cox
regression, 1899 genes were identified to have significant (log
rank p value <0.05) correlation with survival in Wang's cohort,
among which 441 genes were shown to be differentially expressed
after pregnancy (p value <1.75e-10 using hypergeometric
distribution for significance test). These results suggested that
the pregnancy-induced alterations in cells are likely associated
with carcinogenesis and cancer progression.
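The hypergeometric significance test for the overlap between two gene lists can be sketched as follows. The total number of genes in the universe is not stated in the text, so the example uses small hypothetical numbers rather than the 3,515, 1899, and 441 reported above; the function and parameter names are also illustrative.

```python
from math import comb

def hypergeom_upper_tail(overlap, n_universe, n_set_a, n_set_b):
    """P(X >= overlap) for the overlap of two gene lists.

    n_set_b genes are drawn without replacement from n_universe genes,
    of which n_set_a belong to the first list.
    """
    max_overlap = min(n_set_a, n_set_b)
    total = comb(n_universe, n_set_b)
    tail = sum(
        comb(n_set_a, k) * comb(n_universe - n_set_a, n_set_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

# Hypothetical example: 20 genes in total, lists of 5 and 6 genes, 3 shared.
p_value = hypergeom_upper_tail(3, 20, 5, 6)
```

Exact integer arithmetic is used for the binomial coefficients, so the result is not affected by rounding until the final division.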
[0251] In order to elucidate the parity-induced differential genes
that were not only expressed together but also correlated with
survival (parity-induced breast cancer signature), supervised
principal component analysis described above was applied. Simply
using univariate Cox regression to identify genes correlated with
breast cancer as the parity-induced breast cancer signature has the
following drawbacks. First, univariate analysis excludes the
contributions of other covariates (genes). Thus significant genes
in univariate analysis might not be significant when considering
other covariates. Second, gene expression often changes in a
coherent way such that genes that are functionally related in one
or several pathways often show strong correlation in expression
levels, which is not captured by univariate analysis.
The parity-induced breast cancer signature was obtained using SPCA
on genes up- and downregulated after pregnancy separately. Wang's
cohort was used as the training set and the signatures were
validated in three other widely used breast cancer cohorts (NKI,
GSE7390 (Transbig), and GSE2990 (Tamoxifen)) [Desmedt, C., et al.
(2007). Clin Cancer Res 13, 3207-3214; Sotiriou, C., et al. (2006)
J Natl Cancer Inst 98, 262-272; van de Vijver, M. J., et al. (2002)
N Engl J Med 347, 1999-2009]. K-means clustering (k=2) of these
signatures separated patients into two groups with a significant
survival difference.
[0252] Norwegian Cohort
[0253] GSE18672 cohort [Haakensen, V. D., et al. (2011a) BMC Cancer
11, 332; Haakensen, V. D., et al. (2011b). BMC medical genomics 4,
77] was used to validate the expression patterns of parity-related
genes identified in this study. The following criteria were applied
for sample selection from this cohort in order to match the samples
used in this study: for nulliparous samples--pre-menopausal and
age<40; for parous samples--pre-menopausal, number of parity
with live birth=2, age<40, age at 1st birth<30. The following
procedures were taken to preprocess the public data cohort
GSE18672: 1--Missing value estimation using local least squares (R
package pcaMethods: llsImpute), 2--All genes were centered to zero
followed by a loess normalization (R package affy:
normalize.loess).
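The sample selection criteria listed above can be expressed as a simple filter, sketched below. The record layout and field names are hypothetical and do not reflect the actual annotation columns of the GSE18672 data set.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Sample:
    # Hypothetical clinical annotation for one biopsy.
    sample_id: str
    premenopausal: bool
    age: int
    live_births: int
    age_at_first_birth: Optional[int] = None

def parity_group(s: Sample) -> Optional[str]:
    """Return 'nulliparous', 'parous', or None if the sample is excluded."""
    if not s.premenopausal or s.age >= 40:
        return None  # both groups require pre-menopausal status and age < 40
    if s.live_births == 0:
        return "nulliparous"
    if (s.live_births == 2 and s.age_at_first_birth is not None
            and s.age_at_first_birth < 30):
        return "parous"
    return None  # other parity or later first birth: not matched

cohort = [
    Sample("a", True, 35, 0),
    Sample("b", True, 38, 2, 27),
    Sample("c", True, 38, 2, 31),
    Sample("d", False, 35, 0),
]
groups = {s.sample_id: parity_group(s) for s in cohort}
```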
[0254] Statistical Analyses
[0255] The differences between the percentage of p27+ and Ki67+
cells in the samples from nulliparous and parous women were
analyzed by Fisher exact test. The differences between high and
low-density samples were analyzed by binomial test. P value of
overlap between two groups was obtained by statistical test on
hypergeometric distribution. The differences between the
percentages of p27+ in the tissue slices experiments were analyzed
by t-test, and the differences in BrdU+ cells were analyzed by
Fisher exact test.
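For illustration, a two-sided Fisher exact test on a 2.times.2 table of positive and negative cell counts can be sketched with the standard library alone. The counts are hypothetical, and the two-sided rule used here (summing the probabilities of all tables that are no more likely than the observed one) is a common convention assumed for this sketch; the text does not state which variant was used.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    For example, rows could be nulliparous/parous and columns could be
    marker-positive/marker-negative cell counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x positives in row 1 with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # A small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lowest, highest + 1))
                if p <= p_observed * (1 + 1e-7))
    return min(1.0, total)

# Hypothetical counts: 3 of 4 positive in one group, 1 of 4 in the other.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```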
[0256] Kappa Statistics
[0257] Kappa statistics are a statistical measure of inter-rater
agreement [Cohen, J. (1960). Educat Psych Meas 20, 37-46]. The
input for kappa involves two raters or learners that
classify a set of objects into categories. Here, it was used to
compare lists of differentially expressed genes for their
congruency. Hierarchical clustering of signaling pathways
significantly down or upregulated in the four cell types was
performed. Distance between two enrichments was assessed using the
kappa statistics. Similar to the design in previous publications
[Bessarabova, M., et al. (2011) Cancer Res 71, 3471-3481; Huang da,
W., et al. (2007) Genome Biol 8, R183; Shi, W., et al. (2010)
Pharmacogenomics J 10, 310-323], the value of 1 was assigned to a
map if it was significant for an experiment and the value of 0 if
the significant enrichment was not observed. Pathways determined to
have significant enrichment are referred to herein as
"statistically significant pathways." Kappa value was calculated
as
$$\kappa = \frac{\Pr(a) - \Pr(e)}{1 - \Pr(e)},$$
where Pr(a) is the relative observed agreement between two
enrichments, and Pr(e) is the hypothetical probability of chance
agreement, using the observed data to calculate the probabilities
of randomly calling maps significant in each experiment. Because
higher values of kappa mean better agreement between enrichments
and the maximal possible value of kappa is 1, the value (1-.kappa.)
was used as a distance between two experiments. Average linkage was
used to construct the cluster dendrogram depicted in FIG. 10.
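The kappa calculation and the derived distance can be sketched as follows for two enrichment experiments encoded as 0/1 vectors (1 if a map was significant, 0 otherwise). The function names and the example vectors are hypothetical.

```python
def cohen_kappa(x, y):
    """Cohen's kappa for two equal-length binary (0/1) vectors."""
    if len(x) != len(y) or not x:
        raise ValueError("vectors must be non-empty and of equal length")
    n = len(x)
    pr_a = sum(1 for i, j in zip(x, y) if i == j) / n  # observed agreement
    px, py = sum(x) / n, sum(y) / n                    # rates of calling 1
    pr_e = px * py + (1 - px) * (1 - py)               # chance agreement
    if pr_e == 1.0:
        # Both vectors are constant and identical, so kappa is undefined;
        # this sketch treats the case as perfect agreement.
        return 1.0
    return (pr_a - pr_e) / (1 - pr_e)

def kappa_distance(x, y):
    """Distance between two enrichments, defined as 1 - kappa."""
    return 1.0 - cohen_kappa(x, y)

# Hypothetical significance calls for eight pathway maps in two experiments.
exp_1 = [1, 1, 1, 0, 0, 0, 0, 0]
exp_2 = [1, 1, 0, 0, 0, 0, 0, 1]
kappa = cohen_kappa(exp_1, exp_2)
distance = kappa_distance(exp_1, exp_2)
```

A matrix of such distances over all pairs of experiments could then be passed to an average-linkage hierarchical clustering routine.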
[0258] Generation of SAGEseq, MSDKseq, and ChIPseq Libraries
[0259] Detailed protocols for cell purification and the generation
of SAGEseq (Serial Analysis of Gene Expression applied to
high-throughput sequencing) [Genome Res. 2010 December;
20(12):1730-9. Epub 2010 Nov. 2; Proc Natl Acad Sci USA. 2012 Feb.
21; 109(8):2820-4. Epub 2010 Nov. 22],
MSDKseq (Methylation-Specific Digital Karyotyping) [Hu, M., et al.
(2005) Nat Genet 37, 899-905], and ChIPseq (Chromatin
Immunoprecipitation applied to high-throughput sequencing)
[Maruyama, R. et al. (2011) PLoS genetics 7, e1001369] libraries
are posted on the web-site
(http://research4.dfci.harvard.edu/polyaklab/protocols_linkpage.php).
Genomic data were analyzed as described before [Kowalczyk, A., et
al. (2011) J Comput Biol 18, 391-400; Maruyama, R., et al. (2011)
supra; Wu, Z. J., et al. (2010) Genome Res 20, 1730-1739].
[0260] Integrated View of ChIPseq, SAGEseq, and MSDKseq Data
[0261] Differentially Methylated Regions across parity groups were
identified using the Poisson margin test [Kowalczyk, A., et al.
(2011) supra]. Genes were ordered as a spectrum going from higher
in parous to higher in nulliparous, based on p-values. Fisher exact
tests were performed using sum of target gene numbers in 1,000-gene
window and total count of target genes outside of the window,
testing the enrichment of targets inside the windows.
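The windowed enrichment test can be sketched as follows. Genes are assumed to be already ordered along the parous-to-nulliparous spectrum and flagged as target (1) or non-target (0). The one-sided Fisher exact p-value for enrichment inside a window is computed as a hypergeometric upper tail. The function names, the step size of one gene, and the toy input are assumptions of this sketch.

```python
from math import comb

def upper_tail(k, n_total, n_targets, n_window):
    """One-sided Fisher exact p-value: P(at least k targets in the window)."""
    denom = comb(n_total, n_window)
    top = min(n_targets, n_window)
    return sum(
        comb(n_targets, i) * comb(n_total - n_targets, n_window - i)
        for i in range(k, top + 1)
    ) / denom

def window_enrichment(is_target, window=1000):
    """Slide a window over ranked genes; return (start, targets inside, p)."""
    n_total = len(is_target)
    n_targets = sum(is_target)
    inside = sum(is_target[:window])
    results = []
    for start in range(n_total - window + 1):
        if start > 0:
            # Update the count: one gene enters the window, one leaves.
            inside += is_target[start + window - 1] - is_target[start - 1]
        results.append(
            (start, inside, upper_tail(inside, n_total, n_targets, window)))
    return results

# Toy ranking of 10 genes with all 3 targets at the top; window of 3 genes.
toy_flags = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
toy_results = window_enrichment(toy_flags, window=3)
```

The exact integer arithmetic is slow for genome-scale input with 1,000-gene windows; it is used here only to keep the sketch free of external dependencies.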
[0262] Protein Interactome Analyses
[0263] In order to determine overall activation of specific
biological functions due to parity in the cell types analyzed,
pathway enrichment, network, and protein interactome analyses were
performed using the MetaCore platform as described [Bessarabova
et al., supra; Ekins, S., et al. (2006) in High
Content Screening (Humana Press), pp. 319-350; Nikolsky, Y., et al.
(2009) Methods Mol Biol 563, 177-196].
[0264] Nurses' Health Study Data
[0265] The Nurses' Health Study (NHS) is a prospective cohort study
established in 1976 when 121,700 female registered nurses from
across the United States, aged 30-55 years, completed a mailed
questionnaire on factors that influence women's health. Follow-up
questionnaires have since been sent out every two years to the NHS
participants to update exposure information and ascertain non-fatal
incident diseases. Incident breast cancer was ascertained by the
biennial questionnaire to study participants. For any report of
breast cancer, written permission was obtained from participants to
review their medical records to confirm the diagnosis and to
classify cancers as in situ or invasive, by histological type, size
and presence or absence of metastases. Overall, 99% of
self-reported breast cancers have been confirmed. To identify
breast cancer cases in non-respondents who died, death certificates
and medical records for all deceased participants were obtained to
ascertain cause of death. This study was approved by the Human
Subjects Committee at Brigham and Women's Hospital in Boston, Mass.
Breast cancer cases were followed from the date of diagnosis until
Jan. 1, 2008 or death, whichever came first. Ascertainment of
deaths included reporting by next of kin or postal authorities or
searching the National Death Index.
[0266] Approximately 98% of deaths in the NHS have been identified
by these methods. Cause of death was ascertained from death
certificates and physician review of medical records. Information
on estrogen receptor (ER) status was extracted from the medical
record and pathology reports. If data were missing for ER status,
scoring from immunohistochemical staining for ER on 5 .mu.m
paraffin sections cut from tissue microarray (TMA) blocks was used
[Tamimi, R. M., et al. (2008) Breast Cancer Res 10, R67]. There
were 8,055 women with invasive breast cancer diagnosed after return
of the 1976 baseline questionnaire through 2006 questionnaire. One
woman was excluded due to missing information on parity. Thus, our
final analysis included 8,054 women with invasive breast cancer and
information on parity. Survival curves were estimated by the
Kaplan-Meier method and statistical significance was assessed with
the log-rank test. Multivariate Cox proportional hazards regression
models were used to evaluate the relationship between parity and
breast cancer-specific mortality after adjusting for age at
diagnosis, aspirin use, date of diagnosis, disease stage, grade,
radiation treatment, chemotherapy and hormonal treatment. All
analyses were performed using SAS version 9.1. All statistical
tests were two sided and P<0.05 was considered statistically
significant.
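The Kaplan-Meier estimate used for the survival curves can be illustrated with a minimal standard-library sketch. It is not the SAS procedure used in the study, and the follow-up times and event indicators below are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: follow-up time for each subject.
    events: 1 if the event (e.g., breast cancer death) occurred at that
            time, 0 if the subject was censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored subjects leave the risk set
    return curve

# Hypothetical follow-up (years) for five subjects; 0 marks censoring.
km_curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
```

Curves for parous and nulliparous groups would be estimated separately in this way and then compared with a log-rank test.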
[0267] Accession Numbers
[0268] Raw data files and methodological details have been
submitted to GEO with accession number GSE32017.
Example 2
Parity-Related Differences in Gene Expression in Multiple Cell
Types
[0269] This example demonstrates the effect parity has on the
cellular composition of normal human breast.
[0270] To investigate if parity affects the cellular composition of
normal human breast, first breast epithelial cells from nulliparous
and parous women were analyzed by FACS (fluorescence-activated cell
sorting) for cell surface markers previously associated with
luminal epithelial (CD24), myoepithelial (CD10), and progenitor
features (lin-/CD44+) [Bloushtain-Qimron et al., supra; Mani et al.
(2008) Cell 16; 133(4):704-15; Shipitsin et al., (2007) Cancer Cell
11, 259-273]. It was found that CD24+, CD44+, and CD10+ cells
represent three distinct cell populations with minimal overlap both
in nulliparous and parous tissues. FIG. 1 shows the FACS plot for
CD24+ versus CD44+ cells; very few cells stained positive for both
markers.
Multicolor immunofluorescence analysis was also performed for these
three cell surface markers alone or in combinations, and additional
known markers for a subset of luminal (GATA3) and myoepithelial
(SMA) cells, which further confirmed the identity of the cells.
Subsequent FACS analysis of multiple tissue samples showed
significant differences in the relative frequency of CD44+ and
CD24+ cells between parous and nulliparous samples, whereas the
relative frequency of CD10+ cells was essentially the same (FIG.
2). The changes in the relative frequency of CD24+ and CD44+ cells
could potentially have been due to the increased number of
lobulo-alveolar structures observed in parous women.
[0271] To investigate parity-related differences in global gene
expression profiles, immuno-magnetic bead purified
(Bloushtain-Qimron et al., 2008 supra; Shipitsin et al., 2007,
supra) CD24+, CD10+, and CD44+ cells (captured sequentially, thus,
CD44+ fraction was CD24-CD10-CD44+, but the CD24+ fraction may have
contained some CD24+CD44+ cells), and fibroblast-enriched stroma
from multiple nulliparous and parous women were analyzed using
SAGEseq (Serial Analysis of Gene Expression applied to
high-throughput sequencing). To minimize variability among
individuals unrelated to parity status, women were closely matched
for age, the number of pregnancies, time at first and since last
pregnancy, and ethnicity. The analysis is summarized in Table 3,
below, which shows the tissue code, age, parity, ethnicity, and
menopausal status of the patient, type of surgery for tissue
acquisition, mammographic breast density, cell type analyzed, and
raw and aligned tag/read counts for SAGEseq, MSDKseq, and ChIPseq
data. An "x" in the qRT-PCR, qMSP, FACS, and IF/IHC
(immunofluorescence/immunohistochemistry) columns indicates the use
of that sample for the analysis.
[0272] The expression of known cell type-specific genes (e.g.,
luminal cell markers KRT8 and MUC1, myoepithelial cell markers
ACTG2 and CNN1, and progenitor cell markers ZEB2 and TWIST1) was
consistently observed in each of the three respective epithelial
cell types both from nulliparous and parous samples based on
SAGEseq confirming the purity and identity of the cells. Comparison
of each cell type between nulliparous and parous samples revealed
the most pronounced differences in CD44+ cells (FIG. 3 and Table 4,
below), where the numbers of significantly (p<0.05)
differentially expressed genes and the fold differences were the
largest between groups. Tables 4, 5, 6 and 7 list the
differentially expressed genes in CD44+, CD24+, CD10+, and stromal
breast epithelial cells, respectively, from normal human reduction
mammoplasty samples of nulliparous (NP) and parous (P) women. The
tables list gene symbols, log transformed normalized tag counts in
CD44+, CD24+, CD10+ or stromal breast epithelial cells from
nulliparous (columns 2-4) and parous (columns 5-7) samples, with fold change
between nulliparous and parous samples (based on average of actual
normalized tag count of the three tissues), p-value (<0.05) and
gene description.
[0273] The degrees of differences were smaller and similar in CD10+
and CD24+ cells, whereas stromal fibroblasts had the fewest
differentially expressed genes (Tables 5 and 6). Further
examination of parity-related differences in expression patterns
using principal component analysis (PCA) confirmed that CD24+ and
CD10+ cells and fibroblasts from nulliparous and parous women were
similar, whereas CD44+ cells formed very distinct nulliparous and
parous clusters (FIGS. 4A and 4B). Interestingly, CD44+ cells from
nulliparous women were more similar to CD10+ cells, whereas those
from parous women were more similar to CD24+ cells. This implied a
shift from a more basal to a more luminal gene expression pattern
in CD44+ cells after parity (FIG. 5).
TABLE-US-00003 Lengthy table referenced here
US20150285802A1-20151008-T00001 Please refer to the end of the
specification for access instructions.
TABLE-US-00004 Lengthy table referenced here
US20150285802A1-20151008-T00002 Please refer to the end of the
specification for access instructions.
TABLE-US-00005 Lengthy table referenced here
US20150285802A1-20151008-T00003 Please refer to the end of the
specification for access instructions.
TABLE-US-00006 Lengthy table referenced here
US20150285802A1-20151008-T00004 Please refer to the end of the
specification for access instructions.
TABLE-US-00007 Lengthy table referenced here
US20150285802A1-20151008-T00005 Please refer to the end of the
specification for access instructions.
[0274] To validate differences in gene expression in additional
samples and by other methods, quantitative RT-PCR (qRT-PCR)
analyses of selected genes were performed using CD44+ cells from
multiple nulliparous and parous cases. Despite some inter-individual
variability, statistically significant differences between
nulliparous and parous groups were detected that correlated overall
with the SAGEseq data (FIG. 6).
[0275] To validate the parity-related gene expression differences
in an independent cohort, the levels of the differentially
expressed genes (in all cell types or only in CD44+ cells) were
analyzed in gene expression data from breast biopsies of a cohort
of Norwegian women matched to the nulliparous and parous samples
for age (<40) and parity (P2). Clustering analysis using the
differentially expressed gene sets divided these samples into a
distinct nulliparous (Nulliparous B) and a mixed parous/nulliparous
(Nulliparous A) group (FIG. 7). Clustering with genes differentially
expressed in all four cell types combined (i.e., CD24+, CD10+, and
CD44+ cells and fibroblasts) or only in CD44+ cells gave identical
results, supporting the hypothesis that changes in CD44+ cells are
the most significant and physiologically relevant. Interestingly,
the nulliparous samples that formed a distinct cluster (Nulliparous
B), or were closer to parous cases (Nulliparous A), displayed
significant differences in serum estradiol levels (SEL), with the
samples more similar to parous cases having low SEL; all parous
samples also had low SEL (FIG. 8). Because these were all
premenopausal women and SEL is known to be higher in the luteal
phase of the menstrual cycle, when breast epithelial cell
proliferation is also higher, these findings implied that breast
tissues of nulliparous and parous women may be more distinct in the
luteal phase potentially due to differences in the activity of
signaling pathways driving cell proliferation or the number of
cells that respond to these stimuli.
[0276] To strengthen the hypothesis that the parity-associated
differences detected in CD44+ cells might be related to subsequent
breast cancer risk, the gene expression profiles of CD44+ cells
from parous BRCA1 and BRCA2 mutation carriers, whose risk is not
decreased by parity, were analyzed. CD44+ cells from parous BRCA1/2
mutation carriers clustered with CD44+ cells from nulliparous
controls (FIG. 9A), thereby demonstrating that parity-associated
changes observed in control parous women may not occur in these
high risk women. The gene expression data in CD10-, CD24-, CD44+
breast epithelial cells from BRCA1 and BRCA2 mutation carriers are
shown in Tables 8 and 9, below. Tables 8 and 9 show, from left
column to right column: the t-value (t-score); the q-value, which
is the smallest FDR (false discovery rate) at which a particular
gene would just stay on the list of positives; the p-value, which
is the smallest false positive rate (FPR) at which the gene appears
positive; the gene expression in P1, P2, and P3 (samples from
three control tissues (CD10-, CD24-, CD44+ breast epithelial cells
from parous subjects)); and the gene expression in BRCA1-N105,
BRCA1-N171 and BRCA1-N174 (samples from three BRCA1 mutation
carriers) in Table 8 or in BRCA2-N151, BRCA2-N161 and BRCA2-N172
(samples from three BRCA2 mutation carriers) in Table 9. The
statistical values t, p, and q are described at
http://discover.nci.nih.gov/microarrayAnalysis/Statistical.Tests.jsp.
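The relationship between p-values and FDR-based q-values described above can be sketched in Python. The example below uses the Benjamini-Hochberg adjustment, which is one common way to obtain such values; whether Tables 8 and 9 were computed with this exact procedure is not stated here, so the function name, the method and the p-values are illustrative assumptions only.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Assumption: this stands in for the q-value computation; the
    specification does not say which FDR procedure was used.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical p-values for four genes.
pvals = [0.01, 0.04, 0.03, 0.20]
qvals = bh_qvalues(pvals)
print(qvals)
```

Each returned value is the smallest FDR threshold at which the corresponding gene would still be called positive under this procedure, which matches the definition of the q-value given in the text.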
TABLE-US-00008 Lengthy table referenced here
US20150285802A1-20151008-T00006 Please refer to the end of the
specification for access instructions.
TABLE-US-00009 Lengthy table referenced here
US20150285802A1-20151008-T00007 Please refer to the end of the
specification for access instructions.
[0277] To determine if the lack of parity-associated changes in
CD44+ cells from BRCA1/2 mutation carriers could be due to
differences in the cell populations identified by the three cell
surface markers, FACS analysis of multiple tissue samples from
control women and BRCA1/2 mutation carriers was performed. The
relative frequency of CD44+ cells was slightly higher in control and
BRCA1/2 parous samples compared to nulliparous control samples,
which was associated with a slight decrease in the frequency of
CD24+ cells, whereas the relative frequency of CD10+ cells was about
the same in all groups (FIG. 9B). The increase in the relative
frequency of CD44+ to CD24+ cells in parous samples could
potentially be due to the increased number of lobulo-alveolar
relative to ductal structures observed in parous women (FIG. 1), or
due to the loss of CD24+ cells during involution, or may also
reflect the presence of parity-induced stem cells described in
murine mammary glands.
Example 3
Biological Pathways and Networks Affected by Parity-Related Gene
Expression Changes
[0278] This example identifies biological pathways that are
activated or repressed by parity.
[0279] The signaling pathways that might be affected by
parity-related molecular changes were investigated. Early pregnancy
specifically decreases the risk of ER+ breast tumors. Genes
differentially expressed in CD44+ cells (Table 4, supra) were
therefore explored for candidate mediators of this effect. Several genes
were identified that can change the response of breast tissue to
steroid hormones by altering metabolism (e.g., HSD17B11, HSD17B12,
and HSD17B14) or by modulating nuclear receptors (e.g., NCOR1,
NCOR2, NCOA4, and NCOA7). Interestingly, androgen receptor (AR) and
one of its key targets, PSA (KLK3), were highly expressed in
nulliparous CD44+ cells, implying an active androgen signaling
pathway whose activity decreases following pregnancy. Among genes highly expressed
in parous CD44+ cells were a number of known tumor suppressors,
such as Hakai/CBLL1, CASP8, SCRIB and LLGL2, and DNA repair-related
genes (e.g., PRKDC, FANCB), suggesting that these cells may be more
resistant to transformation in parous women.
[0280] In order to determine overall activation of specific
biological functions due to parity in the cell types analyzed,
pathway enrichment, network, and protein interactome analyses were
performed using the MetaCore platform. The analyses are summarized
in Table 10, below, which contains a full list of enriched GeneGo
pathway maps in four different cell types (CD24+, CD44+, CD10+ and
stromal fibroblasts) from human breast epithelium from nulliparous
and parous subjects. For each canonical pathway map, Table 10 gives
the p-value (<0.05) indicating the significance of enrichment for
differentially expressed genes upregulated in the individual cell
types (CD44+, CD24+, CD10+ and stroma) isolated from nulliparous or
parous breast tissue. Table 10 also includes pathways enriched in
genes highly expressed in virgin compared to parous rats, based on
publicly available datasets [Blakely et al., supra]. It was found
that parity had similar global effects on three of the four cell
types analyzed, as pathways built on expression patterns in CD10+
and CD44+ cells and stroma cluster together for parous and
nulliparous states (FIG. 10).
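To make the notion of an enrichment p-value concrete, the following Python sketch computes a one-sided hypergeometric (over-representation) p-value for the overlap between a differentially expressed gene set and a pathway. This is a commonly used test for this kind of analysis; the exact statistic used by the MetaCore platform is not described in this section, so the function, its parameter names and the numbers are hypothetical.

```python
from math import comb


def enrichment_pvalue(universe, pathway_size, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn from `universe`.

    universe: total number of genes considered (assumed background)
    pathway_size: number of genes annotated to the pathway
    selected: number of differentially expressed genes
    overlap: differentially expressed genes that fall in the pathway
    """
    upper = min(pathway_size, selected)
    total = comb(universe, selected)
    # Sum the hypergeometric probabilities for overlaps at least as large.
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy numbers: 20 genes, 5 in the pathway, 4 selected, 2 shared.
print(enrichment_pvalue(universe=20, pathway_size=5, selected=4, overlap=2))
```

A small p-value indicates that the pathway contains more differentially expressed genes than expected by chance, which is how the p-values listed for each cell type are meant to be read.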
TABLE-US-00010 TABLE 10 List of Enriched GeneGo Pathway Maps in Four Different Breast Epithelial Cell Types
p-values in Nulliparous; p-values in Parous
Pathway maps CD44+ CD24+ CD10+ Stroma rat CD44+ CD24+ CD10+ Stroma
Cytoskeleton remodeling_Cytoskeleton remodeling 1.05E-09 1.79E-04 3.27E-06 9.10E-05 3.77E-04 3.49E-03 0.0256 1.17E-04
Cytoskeleton remodeling_Regulation of actin cytoskeleton by Rho GTPases 1.34E-09 1.17E-02 2.73E-02 9.98E-07 0.00412 7.52E-04
Cytoskeleton remodeling_TGF, WNT and cytoskeletal remodeling 1.88E-09 5.71E-08 1.46E-07 8.12E-04 2.69E-03 6.92E-03 1.92E-02 7.29E-03 2.63E-04
Cell adhesion_Chemokines and adhesion 2.69E-07 3.55E-05 1.03E-05 3.54E-04 3.88E-03 0.0217 2.84E-02 4.53E-02
Cytoskeleton remodeling_Role of PKA in cytoskeleton reorganisation 6.44E-07 1.40E-04 9.01E-05 0.00934
Development_MAG-dependent inhibition of neurite outgrowth 1.54E-06 1.45E-02 3.82E-02 1.71E-02 1.12E-02 0.0318
Role of DNA methylation in progression of multiple myeloma 2.40E-06 7.26E-03 1.50E-03 6.35E-03 0.00478 4.82E-03
Cell adhesion_Histamine H1 receptor signaling in the interruption of cell barrier integrity 3.24E-06 7.62E-06 6.00E-03 0.0205 0.00325
Cell adhesion_Alpha-4 integrins in cell migration and adhesion 3.71E-06 1.02E-02 6.75E-03 7.85E-03 0.0221 0.0334
Stem cells_Response to hypoxia in glioblastoma stem cells 4.22E-06 3.68E-03
Development_WNT signaling pathway. Part 2 5.42E-06 4.58E-03 5.02E-03 1.38E-02 0.00283 6.24E-06
Development_Slit-Robo signaling 6.19E-06 1.32E-04 3.54E-03 8.20E-03 4.54E-03
Cytoskeleton remodeling_Fibronectin-binding integrins in cell motility 8.94E-06 1.17E-03 7.71E-04 8.39E-04
Oxidative phosphorylation 9.31E-06 1.25E-07 5.50E-03 2.34E-13
Cell adhesion_Role of tetraspanins in the integrin-mediated cell adhesion 1.02E-05 5.25E-04 4.99E-05
Cell cycle_Role of Nek in cell cycle regulation 1.27E-05 7.84E-03 9.44E-04 1.60E-05 5.46E-03 0.0196
Signal transduction_PKA signaling 1.64E-05 1.47E-02 2.59E-03 3.46E-02 0.0356
Blood coagulation_Blood coagulation 1.86E-05 6.50E-04 2.90E-03
Cell adhesion_ECM remodeling 2.09E-05 2.54E-08 1.01E-06 2.90E-03 0.0000897
Inhibitory action of Lipoxin A4 on PDGF, EGF and LTD4 signaling 2.45E-05 4.38E-02 6.75E-03 3.60E-02 0.00123
Stem cells_WNT/Beta-catenin and NOTCH in induction of osteogenesis 2.48E-05 4.20E-03 0.0118
HIF-1 in gastric cancer 3.00E-05 9.13E-03 1.60E-03 2.68E-02 0.0181
Cell adhesion_Plasmin signaling 3.33E-05 7.32E-07 1.41E-02 0.00805
Development_Lipoxin inhibitory action on PDGF, EGF and LTD4 signaling 3.33E-05 4.80E-02 7.80E-03 3.95E-02 0.00144
Cell adhesion_Integrin-mediated cell adhesion and migration 3.84E-05 1.11E-02 1.02E-02 9.18E-03 1.81E-03 0.000871
Cytoskeleton remodeling_Reverse signaling by ephrin B 5.92E-05 4.20E-03 5.25E-03
Immune response_IL-1 signaling pathway 7.06E-05 1.50E-03 6.35E-03
Cell adhesion_Endothelial cell contacts by junctional mechanisms 7.46E-05 4.30E-04 2.36E-03
Signal transduction_cAMP signaling 7.78E-05 1.87E-02 2.53E-03 0.00751
Regulation of CFTR activity (norm and CF) 7.82E-05 1.98E-02 2.57E-04 3.91E-04 1.08E-03 2.12E-02 1.13E-02
Development_TGF-beta-dependent induction of EMT via RhoA, PI3K and ILK. 1.13E-04 3.86E-04 4.37E-04 2.52E-04 1.40E-03 0.00597 6.19E-03
Role of stellate cells in progression of pancreatic cancer 1.16E-04 9.06E-03 7.55E-06 1.92E-04 1.57E-03 0.00135
Cell cycle_Influence of Ras and Rho proteins on G1/S Transition 1.18E-04 3.51E-05 1.73E-02 3.23E-03 4.07E-02 0.000894 2.90E-02
Stem cells_NOTCH1-induced self-renewal of glioblastoma stem cells 1.30E-04
Stem cells_Pancreatic cancer stem cells in tumor metastasis 1.30E-04 3.68E-03 1.36E-06 0.000276
Tumor-stroma interactions in pancreatic cancer 1.44E-04 5.38E-05 8.16E-04
Stem cells_Regulation of lung epithelial progenitor cell differentiation 1.66E-04 2.88E-05 2.41E-02
LKB1 signaling pathway in lung cancer cells 1.66E-04 9.23E-04 1.33E-02 6.90E-04 6.32E-04 0.000598
Immune response_CCR3 signaling in eosinophils 1.68E-04 3.21E-03 4.15E-02 1.76E-02 1.17E-04 0.000191
Non-genomic signaling of ESR2 (membrane) in lung cancer cells 1.76E-04 4.00E-02 1.81E-03 0.00451
Blood coagulation_GPCRs in platelet aggregation 2.20E-04 2.73E-02 1.18E-03 0.0283
Cytoskeleton remodeling_Role of PDGFs in cell migration 2.55E-04 1.10E-02 0.00146
Stem cells_Role of BMP signaling in embryonic stem cell neural differentiation 2.59E-04 3.54E-03
Development_Hedgehog and PTH signaling pathways in bone and cartilage development 3.07E-04 1.71E-02 4.70E-02 0.0316
Stem cells_Endothelial differentiation during embryonic development 3.25E-04 3.98E-05 3.46E-02 0.0365 3.56E-02
Stem cells_Hedgehog, BMP and Parathyroid hormone in osteogenesis 3.25E-04 5.00E-02 1.41E-02
Dual role of BMP signaling in gastric cancer 3.50E-04 1.57E-02 1.31E-03 2.99E-02 0.0306 4.69E-02
IGF signaling in HCC 3.94E-04 1.61E-02 1.11E-03 1.21E-02 1.08E-04 0.0269 0.0108
Development_EGFR signaling via small GTPases 4.43E-04 3.61E-02
Development_FGF2-dependent induction of EMT 4.46E-04 3.56E-04 5.64E-03 0.0139 0.034
Cell adhesion_Cadherin-mediated cell adhesion 4.72E-04 4.30E-04 4.09E-02 3.07E-04
Stem cells_Differentiation of white adipocytes 4.75E-04 6.82E-04 6.78E-06
Apoptosis and survival_Endoplasmic reticulum stress response pathway 4.75E-04 1.76E-02 0.0419
Development_BMP signaling 5.69E-04 2.45E-02 1.15E-02 0.0202
Development_TGF-beta-dependent induction of EMT via MAPK 6.02E-04 3.70E-02 2.33E-03 3.74E-02 7.44E-03 0.00698
Transcription_ChREBP regulation pathway 6.25E-04 6.76E-03 0.0165 6.22E-03 4.33E-03
Translation_Regulation of translation initiation 6.27E-04 2.05E-02 3.85E-02 0.00155
PGE2 pathways in cancer 6.80E-04 0.0333
Immune response_Antigen presentation by MHC class I 8.21E-04 2.32E-02 3.32E-03
Muscle contraction_Regulation of eNOS activity in endothelial cells 8.47E-04 1.36E-03 2.89E-02 2.39E-03 0.0343
HBV-dependent NF-kB and PI3K/AKT pathways leading to HCC 8.76E-04 1.70E-05 3.43E-02 8.47E-03 0.00814 2.99E-02
IL-6 signaling in multiple myeloma 8.76E-04 1.08E-04 3.71E-02 9.11E-03 0.0291 8.14E-03 5.00E-03
Development_Melanocyte development and pigmentation 8.76E-04
Stem cells_Extraembryonic differentiation of embryonic stem cells 9.09E-04 1.65E-03
Stem cells_Astrocyte differentiation from adult stem cells 9.09E-04 3.09E-02 3.95E-02
Apoptosis and survival_BAD phosphorylation 9.18E-04 5.76E-03 8.02E-04 3.55E-03 3.77E-03 7.83E-04 3.78E-04
Apoptosis and survival_Apoptotic TNF-family pathways 9.18E-04 2.61E-02 3.55E-03 0.00377 1.49E-02
Stem cells_Auditory hair cell differentiation in embryogenesis 1.06E-03
Effect of H. pylori infection on gastric epithelial cells motility 1.12E-03 2.38E-04 5.57E-03
Development_S1P3 receptor signaling pathway 1.12E-03 4.78E-03 1.89E-02 0.0126
Development_Role of IL-8 in angiogenesis 1.12E-03 1.88E-03 2.00E-02 0.0212
Immune response_IL-9 signaling pathway 1.13E-03 1.29E-02 3.44E-02 4.32E-02 0.0291
Transcription_CREB pathway 1.35E-03 2.88E-02 1.07E-03 0.00464 4.78E-03 5.07E-04
Apoptosis and survival_Granzyme A signaling 1.35E-03 2.92E-02 6.98E-04 1.33E-02 0.0136 2.64E-03
Cell adhesion_Gap junctions 1.35E-03 1.67E-02 4.63E-02 6.98E-04
DNA damage_Brca1 as a transcription regulator 1.35E-03 2.92E-02
Stem cells_Early embryonal hypaxial myogenesis 1.40E-03 1.71E-02
Immune response_Oncostatin M signaling via MAPK in human cells 1.40E-03 1.12E-02 4.37E-02 3.16E-02 0.00115
Stem cells_Beta adrenergic receptors in brown adipocyte differentiation 1.40E-03 2.20E-03 1.02E-02 0.0000202
ENaC regulation in airways (normal and CF) 1.48E-03 3.27E-03 4.28E-02
EGFR family signaling in pancreatic cancer 1.49E-03 7.40E-06 4.97E-03 1.91E-03 0.00101
Cell adhesion_Endothelial cell contacts by non-junctional mechanisms 1.52E-03 1.36E-02 2.59E-02 0.0423
Immune response_Inhibitory action of Lipoxins on pro-inflammatory TNF-alpha signaling 1.62E-03 3.14E-02 7.19E-03 4.18E-05 0.000182 5.47E-03
Neurophysiological process_Glutamate regulation of Dopamine D1A receptor signaling 1.62E-03 8.10E-03 6.00E-03
Neurophysiological process_Receptor-mediated axon growth repulsion 1.62E-03 8.10E-03 2.56E-02 2.15E-04
Role of cell adhesion molecules in progression of pancreatic cancer 1.62E-03 7.19E-03
Immune response_Fc gamma R-mediated phagocytosis in macrophages 1.62E-03 8.10E-03 2.48E-02
Neurophysiological process_ACM regulation of nerve impulse 1.93E-03 3.50E-02 2.52E-04 0.0226
Transcription_Transcription regulation of aminoacid metabolism 1.98E-03
G-protein signaling_Regulation of p38 and JNK signaling mediated by G-proteins 2.08E-03 4.65E-02 1.40E-02 0.0105 0.0377
Stem cells_Role of GSK3 beta in cardioprotection against myocardial infarction 2.12E-03 2.17E-02 6.03E-03 0.0196
Development_NOTCH-induced EMT 2.12E-03
HCV-dependent transcription regulation leading to HCC 2.12E-03 6.90E-04 3.16E-02
Regulation of lipid metabolism_Insulin signaling: generic cascades 2.29E-03 7.67E-05 2.94E-04 6.73E-03 0.00698 7.65E-04
Development_PDGF signaling via MAPK cascades 2.29E-03 3.70E-02 0.00664
Transport_Clathrin-coated vesicle cycle 2.30E-03 8.53E-04 1.21E-02 0.00213
Stem cells_Stimulation of differentiation of mouse embryonic fibroblasts into adipocytes by extracellular factors 2.30E-03 3.02E-03 4.60E-03 2.20E-04 0.0000954
Immune response_MIF in innate immunity response 2.50E-03 4.50E-03 0.0425
Development_S1P2 and S1P3 receptors in cell proliferation and differentiation 2.54E-03 1.80E-02 1.46E-02
Reproduction_GnRH signaling 2.61E-03 2.32E-02 0.0225
Regulation of lipid metabolism_Stimulation of Arachidonic acid production by ACM receptors 2.61E-03 2.94E-02 4.48E-02 3.00E-04
Regulation of lipid metabolism_Insulin regulation of glycogen metabolism 2.76E-03 2.25E-02 1.95E-04 1.72E-02 0.0178 2.20E-03
Immune response_Oncostatin M signaling via JAK-Stat in human cells 2.84E-03 3.62E-02
Development_WNT signaling pathway. Part 1. Degradation of beta-catenin in the absence WNT signaling 2.84E-03 3.70E-05 1.79E-03 3.62E-02 0.0006
Development_VEGF-family signaling 3.00E-03 2.88E-05 0.0441
Hypoxia-induced EMT in cancer and fibrosis 3.01E-03 6.83E-04 0.0398
Cell adhesion_Role of CDK5 in cell adhesion 3.01E-03
Immune response_IL-2 activation and signaling pathway 3.17E-03 1.22E-02 3.43E-02 2.91E-02 0.0314 2.99E-02
Mechanisms of drug resistance in multiple myeloma 3.17E-03 1.22E-02 4.27E-02 0.0299
Activation of TGF-beta signaling in pancreatic cancer 3.20E-03
Development_NOTCH1-mediated pathway for NF-KB activity modulation 3.20E-03 0.00103
Regulation of VEGF signaling in pancreatic cancer 3.20E-03 2.01E-03
Possible pathway of TGF-beta 1-dependent inhibition of CFTR expression 3.20E-03
Signal transduction_Erk Interactions: Inhibition of Erk 3.20E-03 1.02E-02 1.40E-03 0.0227
Muscle contraction_GPCRs in the regulation of smooth muscle tone 3.51E-03 2.31E-04
Stem cells_NOTCH in inhibition of WNT/Beta-catenin-induced osteogenesis 3.56E-03 1.16E-04
Apoptosis and survival_Inhibition of ROS-induced apoptosis by 17beta-estradiol 3.56E-03
Development_TGF-beta receptor signaling 3.70E-03 1.34E-02 4.55E-02
TGF-beta 1-induced transactivation of membrane receptors signaling in HCC 3.70E-03 3.28E-03 1.27E-02 2.30E-03 0.000388
Beta-2 adrenergic-dependent CFTR expression 3.87E-03
Immune response_Oncostatin M signaling via MAPK in mouse cells 3.88E-03 8.88E-03 3.66E-02 0.000851
Role of osteoblasts in bone lesions formation in multiple myeloma 3.88E-03 3.09E-02 2.31E-03
Mechanisms of CAM-DR in multiple myeloma 3.88E-03 4.80E-02 3.95E-02 0.0366
Development_TGF-beta-dependent induction of EMT via SMADs 3.88E-03 7.80E-03 0.000216 2.55E-02
Stem cells_WNT and Notch signaling in early cardiac myogenesis 3.88E-03 7.80E-03 2.55E-02
PI3K signaling in gastric cancer 4.30E-03 3.68E-03 9.62E-04 5.23E-04 6.36E-04 0.00226 2.49E-05
Some pathways of EMT in cancer cells 4.30E-03 7.92E-04 7.66E-05 3.56E-02 0.025
Membrane-bound ESR1: interaction with G-proteins signaling 4.30E-03 1.18E-02 1.10E-02
Cell adhesion_Tight junctions 4.66E-03 2.63E-03 1.00E-02
Cytoskeleton remodeling_Keratin filaments 4.66E-03 1.29E-02 1.90E-03 9.08E-03 5.23E-06 0.000138
IGF-1 signaling in pancreatic cancer 4.66E-03 2.61E-03 8.97E-03 4.32E-02 9.08E-03 0.0291
Stem cells_Dopamine-induced expression of CNTF in adult neurogenesis 4.79E-03 4.63E-02
Cell cycle_Role of 14-3-3 proteins in cell cycle regulation 4.79E-03 1.07E-03 0.00516
Development_Thrombopoetin signaling via JAK-STAT pathway 4.79E-03
Immune response_IL-17 signaling pathways 4.82E-03 3.05E-02 0.00571 7.94E-03
Suppression of TGF-beta signaling in pancreatic cancer 4.93E-03 7.26E-03
G-protein signaling_G-Protein alpha-12 signaling pathway 5.57E-03 1.12E-02 0.0067
Translation_Regulation of EIF4F activity 5.72E-03 1.03E-03 6.82E-04 1.81E-04 0.000894 1.59E-03
G-protein signaling_Regulation of cAMP levels by ACM 5.78E-03
Cell adhesion_Ephrin signaling 5.78E-03 2.52E-04 2.48E-02
G-protein signaling_Cross-talk between Ras-family GTPases 6.08E-03 9.44E-03
Proteolysis_Putative ubiquitin pathway 6.08E-03 8.14E-04
Stem cells_Aberrant Wnt signaling in medulloblastoma stem cells 6.08E-03 2.73E-02 3.07E-03 0.000622
Putative role of Estrogen receptor and Androgen receptor signaling in progression of lung cancer 6.56E-03 5.08E-03 0.00806
ERBB family and HGF signaling in gastric cancer 6.56E-03 1.91E-02 1.47E-03 3.60E-03 4.51E-02 4.53E-02 0.00806
Stem cells_Noncanonical WNT signaling in cardiac myogenesis 6.59E-03 9.53E-05 0.00921
K-RAS signaling in lung cancer 6.72E-03 9.01E-03 8.12E-03 2.20E-02 2.46E-02 2.26E-02 1.66E-02
G-protein signaling_Rap2A regulation pathway 7.03E-03
Transport_Macropinocytosis regulation by growth factors 7.05E-03 2.60E-02 0.000969
Development_EGFR signaling pathway 7.05E-03 7.64E-04 4.84E-04 0.0106
Dual role of TGF-beta 1 in HCC 7.59E-03 1.36E-02
Immune response_IFN alpha/beta signaling pathway 7.59E-03
Development_Glucocorticoid receptor signaling 7.59E-03 2.59E-02 0.00515
Cell adhesion_PLAU signaling 7.76E-03 3.17E-03 2.90E-03 0.0386 0.00839
Transcription_P53 signaling pathway 7.76E-03 7.33E-04 1.05E-02 1.40E-02 0.0377
Stem cells_BMP7 in brown adipocyte differentiation 7.76E-03 3.96E-03 0.0000304
Development_Beta-adrenergic receptors regulation of ERK 7.77E-03 2.93E-02
Role and regulation of Prostaglandin E2 in gastric cancer 7.77E-03 0.0249
Development_Leptin signaling via PI3K-dependent pathway 7.77E-03 3.70E-02 7.44E-03 0.0249
Transport_Alpha-2 adrenergic receptor regulation of ion channels 7.77E-03 3.10E-02 3.74E-02 2.93E-02
Influence of bone marrow cell environment on progression of multiple myeloma 7.77E-03 2.33E-03 1.60E-03 0.00664
Immune response_CD40 signaling 7.95E-03 4.01E-02 4.85E-03 1.61E-03 0.0278 3.47E-03
Muscle contraction_ACM regulation of smooth muscle contraction 8.52E-03 9.93E-04
Stem cells_H3K4 demethylases in stem cell maintenance 8.73E-03 2.17E-02
Development_PDGF signaling via STATs and NF-kB 8.73E-03 1.39E-03 2.96E-02 8.83E-04 0.00354
Muscle contraction_Relaxin signaling pathway 8.94E-03 4.00E-02 0.0265 2.90E-02 1.97E-02
Transition of HCC cells to invasive and migratory phenotype 9.07E-03 1.55E-02 0.0141 4.25E-02
WNT signaling in HCC 9.07E-03 4.50E-03 1.42E-04 4.20E-03 0.0141 1.18E-02
Development_Neurotrophin family signaling 9.07E-03 1.42E-04 0.00934
Ubiquinone metabolism 9.10E-03 8.55E-03 9.27E-08
Immune response_Oncostatin M signaling via JAK-Stat in mouse cells 9.13E-03 2.73E-02
Androgen signaling in HCC 9.13E-03 4.73E-03
Cell cycle_Initiation of mitosis 9.37E-03 2.99E-02 0.0306 4.69E-02
Development_Leptin signaling via JAK/STAT and MAPK cascades 9.37E-03 3.60E-02
Transport_Macropinocytosis 9.84E-03 0.0176
Transport_RAB1A regulation pathway 9.84E-03
Cytoskeleton remodeling_Integrin outside-in signaling 1.02E-02 1.22E-02 1.14E-02 3.14E-04
Influence of multiple myeloma cells on bone marrow stromal cells 1.04E-02 3.98E-02 3.27E-02 0.0196 0.00624
Role of metalloproteases and heparanase in progression of pancreatic cancer 1.04E-02 2.45E-02
Cytoskeleton remodeling_Thyroliberin in cytoskeleton remodeling 1.04E-02
Transport_ACM3 in salivary glands 1.06E-02 1.71E-02 0.0465
Transport_Intracellular cholesterol transport in norm 1.10E-02 2.85E-02
Muscle contraction_Delta-type opioid receptor in smooth muscle contraction 1.14E-02 2.36E-03
G-protein signaling_Ras family GTPases in kinase cascades (scheme) 1.14E-02 0.0348
Development_Alpha-1 adrenergic receptors signaling via cAMP 1.16E-02
HCV-mediated liver damage and predisposition to HCC progression via p53 1.16E-02 5.81E-03 0.0118
wtCFTR and delta508 traffic/Clathrin coated vesicles formation (norm and CF) 1.16E-02 3.71E-02 2.16E-03 0.0228
Apoptosis and survival_HTR1A signaling 1.17E-02 4.65E-02 1.00E-02 3.17E-02 0.0327 2.93E-05
Immune response_Histamine signaling in dendritic cells 1.17E-02 4.65E-02
Development_GM-CSF signaling 1.17E-02 6.92E-04 4.04E-02 3.39E-02 3.27E-02 0.00553
Development_A2B receptor: action via G-protein alpha s 1.17E-02 4.65E-02 4.55E-02 0.00897 3.27E-02
Angiogenesis in HCC 1.17E-02 8.29E-04
Pro-inflammatory action of Gastrin in gastric cancer 1.17E-02 3.28E-03 3.39E-02 2.54E-03 0.0231
Chemoresistance pathways mediated by constitutive activation of PI3K pathway and BCL-2 in small cell lung cancer 1.22E-02 2.41E-02 1.09E-03 8.02E-04 2.20E-05 1.72E-02 7.83E-04 1.15E-02
Oxidative stress_Role of ASK1 under oxidative stress 1.22E-02 0.00528
Stem cells_BMP signaling in cardiac myogenesis 1.22E-02 1.38E-03
Transcription_Role of VDR in regulation of genes involved in osteoporosis 1.23E-02
Stem cells_TNF-alpha, IL-1 alpha and WNT5A-dependent regulation of osteogenesis and adipogenesis in mesenchymal stem cells 1.33E-02 1.47E-02 4.83E-02 2.59E-03 0.0356 0.025
Transcription_Role of Akt in hypoxia induced HIF1 activation 1.38E-02 5.59E-03 2.81E-03 1.49E-03 0.00869
Mitochondrial ketone bodies biosynthesis and metabolism 1.38E-02 4.26E-05
Signal transduction_AKT signaling 1.40E-02 4.91E-06 1.54E-04 2.74E-05 0.00425 1.75E-04
Regulation of beta-adrenergic receptors signaling in pancreatic cancer 1.40E-02
Development_Notch Signaling Pathway 1.40E-02 0.00422
Development_A2A receptor signaling 1.40E-02 1.34E-03 2.07E-02 0.0000291
Development_VEGF signaling and activation 1.40E-02 2.64E-02 2.08E-02
Apoptosis and survival_Anti-apoptotic action of Gastrin 1.40E-02 1.34E-03 4.78E-03 0.000175
Neurophysiological process_Melatonin signaling 1.40E-02
Neurophysiological process_EphB receptors in dendritic spine morphogenesis and synaptogenesis 1.43E-02 3.95E-02
Stem cells_Putative pathways of telomerase regulation in glioblastoma stem cells 1.46E-02 6.24E-05 0.000261
Cytoskeleton remodeling_Role of Activin A in cytoskeleton remodeling 1.46E-02 8.91E-04
Stem cells_H3K36 demethylation in stem cell maintenance 1.46E-02 4.24E-02 0.0142
Development_Beta-adrenergic receptors signaling via cAMP 1.50E-02 1.08E-04 0.0117 6.71E-03
Effect of H. pylori infection on inflammation in gastric epithelial cells 1.54E-02 3.27E-02 0.000141
K-RAS signaling in pancreatic cancer 1.60E-02 0.0179
Development_S1P1 signaling pathway 1.60E-02 5.36E-03 0.0139
Development_Ligand-independent activation of ESR1 and ESR2 1.60E-02 2.88E-02 4.78E-03 1.85E-02 0.0139
CFTR-dependent regulation of ion channels in Airway Epithelium (norm and CF) 1.60E-02
Mechanisms of resistance to EGFR inhibitors in lung cancer 1.60E-02 2.81E-04 2.31E-02 2.11E-04 1.82E-04 0.0179 5.07E-04
Development_Regulation of CDK5 in CNS 1.64E-02
HGF signaling in pancreatic cancer 1.64E-02 6.21E-06 3.32E-03 0.003 4.42E-02
E-cadherin signaling and its regulation in gastric cancer 1.67E-02 4.41E-04 1.96E-03 1.90E-03 0.000034
HBV signaling via protein kinases leading to HCC 1.67E-02 2.61E-03 0.0285
Development_Endothelin-1/EDNRA signaling 1.69E-02 1.76E-02 1.32E-02 2.83E-03 1.34E-02 0.00159
Development_VEGF signaling via VEGFR2 - generic cascades 1.82E-02 3.14E-02 2.56E-02
Immune response_IL-13 signaling via JAK-STAT 1.82E-02
Signal transduction_Calcium signaling 1.82E-02 6.00E-03
Cytoskeleton remodeling_ACM3 and ACM4 in keratinocyte migration 1.92E-02 1.12E-02
Stem cells_Role of Neuregulin 1 and Thymosin beta-4 in myocardium regeneration after infarction 1.94E-02 7.11E-05 3.90E-03 0.0484
Cholesterol and Sphingolipids transport/Distribution to the intracellular membrane compartments (normal and CF) 1.94E-02 0.014
Stem cells_Notch signaling in medulloblastoma stem cells 1.94E-02
Proteolysis_Putative SUMO-1 pathway 1.94E-02 5.10E-03 0.0494
FGF signaling in pancreatic cancer 2.07E-02 2.01E-03 8.12E-03 2.70E-02 0.022 1.22E-03 1.66E-02
Cytoskeleton remodeling_CDC42 in cellular processes 2.18E-02 9.99E-03 0.0194
Transcription_Role of heterochromatin protein 1 (HP1) family in transcriptional silencing 2.18E-02 9.99E-03 4.30E-03 2.59E-03 4.36E-02 0.000604
Immune response_MIF-mediated glucocorticoid regulation 2.18E-02 9.99E-03
Apoptosis and survival_Ceramides signaling pathway 2.21E-02 1.61E-02 1.17E-02 1.96E-03 3.69E-04 9.21E-03 7.51E-03
Cell adhesion_Cell-matrix glycoconjugates 2.21E-02 9.29E-05 6.14E-06 0.0475
Role of histone modificators in progression of multiple myeloma 2.28E-02 5.92E-03 1.33E-02 0.00275
Cytoskeleton remodeling_RalA regulation pathway 2.28E-02 2.92E-02
Muscle contraction_S1P2 receptor-mediated smooth muscle contraction 2.28E-02 2.39E-02 0.0158
EGFR signaling pathway in Lung Cancer 2.33E-02 9.13E-03
Influence of smoking on activation of EGFR signaling in lung cancer cells 2.33E-02
Development_HGF signaling pathway 2.33E-02 3.70E-02 2.33E-03 2.93E-02 0.0268
Cardiac Hypertrophy_NF-AT signaling in Cardiac Hypertrophy 2.33E-02 4.65E-02 2.64E-03 2.98E-02 3.69E-02 0.00119
Immune response_TLR signaling pathways 2.36E-02 3.84E-03 0.00521
Chemotaxis_Leukocyte chemotaxis 2.47E-02 2.83E-02 1.36E-03 4.01E-03 4.21E-04
Cytokine production by Th17 cells in CF 2.52E-02
Development_PACAP signaling in neural cells 2.52E-02
Translation_Regulation of EIF2 activity 2.52E-02 7.33E-04 2.90E-03 0.0386 0.00153
Cytoskeleton remodeling_FAK signaling 2.62E-02 1.67E-03 4.89E-03 0.000356 0.0104
Inhibition of apoptosis in pancreatic cancer 2.62E-02 7.84E-03 4.89E-03 0.0381
Apoptosis and survival_Role of IAP-proteins in apoptosis 2.65E-02 2.66E-02 3.16E-03 0.0246 1.56E-02
Stem cells_Neovascularization of glioblastoma in response to hypoxia 2.65E-02 7.71E-04
Stem cells_Embryonal epaxial myogenesis 2.65E-02 1.32E-03
Inflammatory mechanisms of pancreatic cancerogenesis 2.82E-02 4.84E-02 1.94E-03 0.000647
Sorafenib-induced inhibition of cell proliferation and angiogenesis in HCC 2.84E-02 2.34E-02
IL-1 beta-dependent CFTR expression 2.84E-02
Role of IGH translocations in multiple myeloma 2.87E-02 4.50E-03 1.49E-02 0.0114 0.0141
Development_Role of Activin A in cell differentiation and proliferation 2.87E-02
Stem cells_H3K27 demethylases in differentiation of stem cells 2.87E-02 4.50E-03 4.20E-03
Reproduction_Progesterone-mediated oocyte maturation 2.87E-02 0.0425
Stem cells_Regulation of endothelial progenitor cell differentiation from adult stem cells 2.90E-02 6.15E-04 0.000215
Bacterial infections in CF airways 2.90E-02
Cytokine production by Th17 cells in CF (Mouse model) 2.93E-02 4.32E-02
Development_PEDF signaling 2.93E-02 3.71E-02
Immune response_Bacterial infections in normal airways 2.93E-02 4.27E-02
Apoptosis and survival_Granzyme B signaling 3.06E-02 7.84E-03 2.96E-02 0.00373 0.0274
Stem cells_Cooperation between Hedgehog, IGF-2 and HGF signaling pathways in medulloblastoma stem cells 3.06E-02 3.61E-02 1.52E-04 0.0274
Proteolysis_Role of Parkin in the Ubiquitin-Proteasomal Pathway 3.11E-02 1.10E-02 5.01E-03 0.0101 2.67E-02
Immune response_Immunological synapse formation 3.20E-02 2.83E-02 1.62E-04
Stem cells_Muscle progenitor cell migration in hypaxial myogenesis 3.24E-02 0.0104
Apoptosis and survival_Lymphotoxin-beta receptor signaling 3.24E-02 2.19E-02 4.68E-03 0.0465
Immune response_Gastrin in inflammatory response 3.38E-02 1.49E-03 4.23E-02 0.000434 0.00713
DNA damage_Role of SUMO in p53 regulation 3.50E-02 3.80E-03 4.55E-02 0.00781
Transcription_Transcription factor Tubby signaling pathways 3.50E-02 0.00781
Stem cells_EGF-induced proliferation of Type C cells in SVZ of adult brain 3.51E-02 5.80E-03 1.19E-03 0.0218
Normal and pathological TGF-beta-mediated regulation of cell proliferation 3.51E-02 8.95E-03 0.00624
Chemotaxis_Inhibitory action of lipoxins on IL-8- and Leukotriene B4-induced neutrophil migration 3.63E-02 1.10E-02 6.36E-04 0.0109 2.24E-04
Mucin expression in CF via TLRs, EGFR signaling pathways 3.63E-02 1.47E-02 0.0109
Translation_Insulin regulation of translation 3.64E-02 2.00E-04 4.24E-03 7.48E-04 0.000783 1.15E-02
Immune response_Neurotensin-induced activation of IL-8 in colonocytes 3.64E-02 2.41E-02 4.24E-03 0.0115
Signal transduction_JNK pathway 3.64E-02 2.41E-02 0.0000233
Immune response_IL-23 signaling pathway 3.66E-02 2.99E-02 0.0306 4.69E-02
Cytoskeleton remodeling_Neurofilaments 3.66E-02 2.89E-02 1.97E-03 0.00619 0.0469
Development_Thyroliberin signaling 3.87E-02
Transcription_PPAR Pathway 3.87E-02 0.000148
Apoptosis and survival_Cytoplasmic/mitochondrial transport of proapoptotic proteins Bid, Bmf and Bim 4.00E-02 1.61E-04 0.000178 2.27E-02
Stem cells_Role of PKR1 and ILK in cardiac progenitor cells 4.00E-02 0.0334 2.27E-02
Apoptosis and survival_Role of CDK5 in neuronal death and survival 4.00E-02 4.38E-02 6.75E-03 3.60E-02 3.34E-02 5.28E-03 0.0241
Development_CNTF receptor signaling 4.00E-02 3.60E-02 3.34E-02 2.27E-02 0.00463
wtCFTR and deltaF508 traffic/Membrane expression (norm and CF) 4.00E-02 1.28E-02
Chemotaxis_CXCR4 signaling pathway 4.00E-02 6.75E-03 3.60E-02 0.0241
G-protein signaling_Proinsulin C-peptide signaling 4.02E-02 4.45E-06 1.56E-02 1.21E-02 1.17E-02 2.53E-03 7.75E-04 1.42E-03
Apoptosis and survival_TNFR1 signaling pathway 4.08E-02 6.48E-03
1.61E-02 0.0189 1.66E-02 Immune response_IL-10 signaling pathway
4.26E-02 9.09E-03 2.36E-03 3.41E-02 0.0348 Neurophysiological
process_Dopamine D2 receptor transactivation of PDGFR in CNS
4.26E-02 1.80E-02 4.09E-02 1.46E-02 2.13E-03 3.48E-02 0.00136 Stem
cells_Insulin, IGF-1 and TNF-alpha in brown adipocyte
differentiation 4.43E-02 1.26E-04 0.0129 2.83E-03 9.70E-08
Development_Angiopoietin - Tie2 signaling 4.53E-02 1.15E-02
3.09E-02 3.95E-02 0.00118 0.00805 Anti-apoptotic action of Gastrin
in pancreatic cancer 4.53E-02 1.15E-02 8.88E-03 5.92E-03 3.66E-02
6.12E-03 2.65E-02 Development_Regulation of telomere length and
cellular immortalization 4.53E-02 8.88E-03 2.48E-02 0.0255
Development_Flt3 signaling 4.55E-02 2.88E-02 6.34E-03 2.27E-02
1.79E-02 2.07E-02 1.08E-03 2.89E-03 Pancreatic cancer cell
resistance to Tarceva (erlotinib) 4.91E-02 2.05E-02 4.61E-02
2.81E-03 0.0385 0.00254 Immune response_Signaling pathway mediated
by IL-6 and IL-1 4.91E-02 Apoptosis and survival_FAS signaling
cascades 2.64E-02 2.74E-05 0.000131 4.22E-03 TTP metabolism
0.0000608 Resistance of pancreatic cancer cells to death receptor
signaling 1.29E-04 0.00105 4.53E-03 Transcription_Assembly of RNA
Polymerase II preinitiation complex on TATA-less 0.000136 0.0257
promoters Development_PIP3 signaling in cardiac myocytes 2.29E-03
4.70E-05 3.38E-04 1.39E-03 6.60E-05 1.24E-04 HCV-dependent
regulation of RNA polymerases leading to HCC 0.00035 0.0387 Stem
cells_H3K9 demethylases in pluripotency maintenance of stem cells
4.62E-04 4.36E-02 1.98E-02 3.37E-02 Inhibition of apoptosis in
gastric cancer 6.32E-04 0.00333 6.61E-04 Cell cycle_Start of DNA
replication in early S phase 3.61E-02 0.00067 0.000883 Apoptosis
and survival_Caspase cascade 1.64E-03 0.000816 0.00105 Immune
response_BCR pathway 7.76E-04 9.79E-04 1.29E-02 4.15E-03 8.06E-03
Immune response_ICOS pathway in T-helper cell 9.01E-03 1.40E-03
1.40E-03 0.0246 6.19E-03 Cell cycle_The metaphase checkpoint
0.00141 Inhibitory action of Lipoxins on neutrophil migration
1.85E-02 1.46E-03 0.0194 4.90E-04 DNA damage_NHEJ mechanisms of
DSBs repair 3.16E-02 1.67E-03 0.0297 1.18E-02 Cytoskeleton
remodeling_Alpha-1A adrenergic receptor-dependent inhibition of
PI3K 5.17E-04 1.67E-03 2.97E-02 1.18E-02 2.89E-04 Regulation of
metabolism_Triiodothyronine and Thyroxine signaling 0.00186 Cell
cycle_Chromosome condensation in prometaphase 2.70E-03 0.00000331
Development_IGF-1 receptor signaling 2.47E-05 5.23E-04 2.77E-03
9.87E-03 6.69E-04 2.24E-04 dCTP/dUTP metabolism 0.003 dGTP
metabolism 0.00332 Inhibition of RUNX3 signaling in gastric cancer
4.63E-02 0.00336 0.00739 Apoptosis and survival_Beta-2 adrenergic
receptor anti-apoptotic action 0.00412 8.69E-03 6.09E-03 Signal
transduction_Activin A signaling regulation 1.15E-02 4.38E-03
0.00105 4.53E-03 Stem cells_Fetal brown fat cell differentiation
4.00E-03 0.00447 1.41E-02 8.81E-03 Immune response_CXCR4 signaling
via second messenger 4.38E-02 6.75E-03 3.60E-02 5.11E-03 0.00711
5.28E-03 dATP/dITP metabolism 0.00573 Signal transduction_PTEN
pathway 2.01E-03 6.69E-03 5.97E-03 0.0246 6.19E-03 Microsatellite
instability in gastric cancer 0.00601 0.00177 Inhibition of
TGF-beta signaling in gastric cancer 6.01E-03 0.0117 3.06E-02
Immune response_Regulation of T cell function by CTLA-4 3.44E-02
1.90E-03 6.82E-03 1.68E-03 2.85E-02 5.95E-03 DNA
damage_DNA-damage-induced responses 4.67E-02 0.00747 0.00337 Stem
cells_Self-renewal of adult neural stem cells 0.00756 0.029
Regulation of degradation of deltaF508 CFTR in CF 1.67E-02 8.44E-03
0.00869 Transcription_Sin3 and NuRD in transcription regulation
3.46E-03 0.00892 3.47E-02 Blood coagulation_GPIb-IX-V-dependent
platelet activation 0.00952 1.11E-02
Transcription_Receptor-mediated HIF regulation 3.96E-03 1.32E-02
5.03E-04 1.01E-02 0.00238 8.39E-03 Stem cells_Signaling pathways in
embryonic hepatocyte maturation 5.00E-02 1.41E-02 1.05E-02 0.0365
3.56E-02 Apoptosis and survival_nAChR in apoptosis inhibition and
cell cycle progression 2.61E-02 2.13E-02 1.15E-02 0.0118 Stem
cells_Role of growth factors in the maintenance of embryonic stem
cell 3.23E-03 0.0129 0.000583 pluripotency Apoptosis and
survival_Anti-apoptotic TNFs/NF-kB/Bcl-2 pathway 5.10E-03 1.09E-06
1.29E-02 0.0156 6.61E-04 DNA damage_Role of Brca1 and Brca2 in DNA
repair 2.92E-02 0.0133 Translation_IL-2 regulation of translation
4.24E-02 3.62E-02 0.0139 3.40E-02 3.60E-03 DNA damage_Mismatch
repair 0.0139 0.00518 Neurophysiological process_Olfactory
transduction 0.0139 DNA damage_Inhibition of telomerase activity
and cellular senescence 7.04E-03 3.62E-02 0.0139 Immune
response_Role of DAP12 receptors in NK cells 4.91E-02 0.0142 0.0451
Immune response_CD28 signaling 1.17E-03 1.44E-02 1.42E-02 0.0451
1.47E-02 Immune response_PIP3 signaling in B lymphocytes 0.0144
1.72E-02 1.15E-02 Immune response_ETV3 affect on CSF1-promoted
macrophage differentiation 0.0152 Blood coagulation_GPVI-dependent
platelet activation 1.57E-02 0.0157 0.0482 Inhibition of tumor
suppressive pathways in pancreatic cancer 8.43E-03 1.65E-02 0.0387
1.69E-02 Transcription_Ligand-Dependent Transcription of
Retinoid-Target genes 0.0196 Development_Thrombopoietin-regulated
cell processes 2.48E-02 1.99E-02 0.000252 Role of alpha-6/beta-4
integrins in carcinoma progression 1.77E-03 4.35E-06 0.0199
1.52E-02 Chemotaxis_Lipoxin inhibitory action on fMLP-induced
neutrophil chemotaxis 2.10E-04 6.69E-03 2.20E-02 0.0226 3.63E-03
Development_EGFR signaling via PIP3 1.17E-02 0.0226 Stem
cells_Differentiation of natural regulatory T cells 0.0248 0.00805
G-protein signaling_S1P2 receptor signaling 8.88E-03 0.0248
Translation_Opioid receptors in regulation of translation 2.61E-02
0.0267 9.24E-04 Transport_RAB3 regulation pathway 0.0271 G-protein
signaling_RAC1 in cellular process 2.61E-03 1.00E-02 0.0277 DNA
damage_Nucleotide excision repair 0.0277 Immune response_Inhibitory
action of lipoxins on superoxide production induced by IL- 3.43E-02
2.91E-02 0.0299 8 and Leukotriene B4 in neutrophils Inhibitory
action of Lipoxins on Superoxide production in neutrophils 3.43E-02
2.91E-02 0.0299 wtCFTR and delta508-CFTR traffic/Generic schema
(norm and CF) 0.0317 4.59E-07 Apoptosis and
survival_DNA-damage-induced apoptosis 2.03E-04 0.0327 0.0155
Apoptosis and survival_NGF signaling pathway 9.09E-03 0.0341 0.0135
Apoptosis and survival_APRIL and BAFF signaling 3.35E-03 3.42E-02
0.00921 Immune response_NFAT in Immune response 5.00E-02 1.10E-02
0.0346 0.00987 Apoptosis and survival_Anti-apoptotic TNFs/NF-kB/IAP
pathway 3.70E-03 5.59E-03 3.85E-02 0.0394 Immune response_TCR and
CD28 co-stimulation in activation of NF-kB 0.0414 Immune
response_Innate Immune response to RNA viral infection 6.39E-03
4.33E-02 0.0102 Immune response_IFN gamma signaling pathway
3.60E-03 0.044 1.77E-03 Immune response_CD16 signaling in NK cells
1.88E-02 2.43E-03 1.37E-02 0.0472 0.0121 Immune response_Delta-type
opioid receptor signaling in T-cells 2.13E-02 4.84E-02 0.000367
1.40E-02 Apoptosis and survival_p53-dependent apoptosis 1.14E-05
0.0484 0.00352 Effect of H. pylori infection on apoptosis in
gastric epithelial cells 7.92E-04 0.0365 Immune response_Histamine
H1 receptor signaling in Immune response 1.11E-02 3.18E-02 0.029
Immune response_IL-4 - antiapoptotic action 5.92E-03 0.0136 0.0158
Development_Angiotensin signaling via PYK2 6.48E-03 2.07E-02
0.00422 0.0126 Development_Alpha-2 adrenergic receptor activation
of ERK 3.91E-03 7.77E-03 1.68E-03 0.000168 Immune response_CCR5
signaling in macrophages and T lymphocytes 1.26E-03 0.0212 0.00269
Development_A3 receptor signaling 1.22E-02 0.00222 0.0214 G-protein
signaling_N-RAS regulation pathway 8.95E-03 Immune response_Murine
NKG2D signaling 4.87E-03 1.89E-02 EML4/ALK fusion protein in
nonsmoking-related lung cancer 1.39E-03 2.74E-02 1.78E-02 0.0196
Transcription_NF-kB signaling pathway 1.79E-02 3.76E-03 0.00238
Development_ERBB-family signaling 3.96E-03 0.0105 0.0377 Fructose
metabolism 7.89E-03 Apoptosis and survival_Apoptotic Activin A
signaling 0.00619 0.0469 Development_EPO-induced Jak-STAT pathway
0.00526 DNA damage_Role of NFBD1 in DNA damage response 1.30E-02
Mechanisms of K-RAS addiction in lung cancer cells 2.92E-02
Development_EDNRB signaling 3.70E-02 0.00979 0.00553 Immune
response_Role of the Membrane attack complex in cell survival
0.00528 0.0241 Regulation of lipid metabolism_Insulin regulation of
fatty acid methabolism 1.72E-02 5.72E-06 KLF6 and regulation of
KLF6 alternative splicing in HCC 1.33E-03 0.00424 0.0379
Development_S1P1 receptor signaling via beta-arrestin 3.03E-02
0.0000769 Cell cycle_Cell cycle (generic schema) 1.13E-03 0.00278
Development_Regulation of epithelial-to-mesenchymal transition
(EMT) 1.59E-05 0.000016 Development_S1P4 receptor signaling pathway
9.99E-03 8.03E-03 0.0337 Signal transduction_IP3 signaling 4.32E-02
4.27E-02 3.43E-02 1.78E-03 4.92E-04 0.000988
Development_Endothelin-1/EDNRA transactivation of EGFR 9.01E-03
1.40E-03 0.00619 0.0166 Cell cycle_Sister chromatid cohesion 0.0198
Glutathione metabolism/Rodent version 2.16E-02 1.00E-05
Development_Beta-adrenergic receptors transactivation of EGFR
2.32E-03 4.70E-02 3.10E-04 0.000166 Development_ACM2 and ACM4
activation of ERK 2.64E-02 4.78E-03 0.0126 Activation of
pro-oncogenic TGF-beta potential in gastric cancer 2.60E-03
2.89E-02 0.0306 Stem cells_FGF10 in development of subcutaneous
white adipose tissue in 2.34E-02 2.62E-04 0.0168 embryogenesis
G-protein signaling_RhoA regulation pathway 1.02E-02 1.38E-03
3.60E-02 0.0227 Immune response_IL-7 signaling in B lymphocytes
1.89E-02 0.0126 G-protein signaling_Rap2B regulation pathway
4.59E-02 Development_Activation of ERK by Alpha-1 adrenergic
receptors
0.0152 EGF- and HGF-dependent stimulation of metastasis in gastric
cancer 1.42E-03 4.63E-02 Cell cycle_Spindle assembly and chromosome
separation 8.95E-03 3.27E-02 Glycogen metabolism 0.0377
Neurophysiological process_Delta-type opioid receptor in the
nervous system 0.0408 Fructose metabolism/Rodent version 1.65E-02
Inhibitory action of Lipoxins and Resolvin E1 on neutrophil
functions 4.20E-03 0.0425 Immune response_PGE2 in immune and
neuroendocrine system interactions 2.81E-02 Development_Dopamine D2
receptor transactivation of EGFR 3.14E-02 1.10E-02 1.01E-02
2.67E-02 0.0000955 Autophagy_Autophagy 7.84E-03 4.95E-03 Regulation
of lipid metabolism_RXR-dependent regulation of lipid metabolism
via 0.00264 PPAR, RAR and VDR Development_A1 receptor signaling
4.21E-02 8.94E-04 0.00736 Cell cycle_Role of APC in cell cycle
regulation 4.95E-03 Plasminogen activators signaling in pancreatic
cancer 1.15E-02 3.95E-02 0.0265 NGF activation of NF-kB 5.10E-03
5.00E-04 9.65E-04 0.0197 2.29E-03 Immune response_IL-15 signaling
4.01E-02 4.33E-02 0.00347 0.00108 Cell cycle_Role of SCF complex in
cell cycle regulation 1.14E-05 5.00E-04 Development_Gastrin in
differentiation of the gastric mucosa 0.00921 Propionate metabolism
p.1 0.0441 Lysine metabolism 1.42E-02 0.00192 CFTR folding and
maturation (norm and CF) 5.69E-03 0.00369 Development_Keratinocyte
differentiation 1.49E-03 Tryptophan metabolism/Rodent version
4.11E-02 0.00734 G-protein signaling_H-RAS regulation pathway
1.03E-02 Normal wtCFTR traffic/Sorting endosome formation 1.86E-04
1.18E-02 Apoptosis and survival_Regulation of Apoptosis by
Mitochondrial Proteins 3.26E-02 0.0246 Immune response_IL-4
signaling pathway 2.88E-02 Development_Cross-talk between VEGF and
Angiopoietin 1 signaling pathways 1.80E-02 4.09E-02 Cell cycle_ESR1
regulation of G1/S transition 3.98E-02 1.92E-04
Development_Activation of ERK by Kappa-type opioid receptor
1.29E-02 4.32E-02 4.01E-02 1.46E-03 0.00595 HCV-dependent
regulation of membrane receptors signaling in HCC 0.000227
Delta508-CFTR traffic/Sorting endosome formation in CF 8.14E-04
2.31E-02 0.000752 Immune response_IL-13 signaling via PI3K-ERK
1.34E-02 G-protein signaling_G-Protein alpha-i signaling cascades
0.0109 Glycolysis and gluconeogenesis p. 1 9.01E-03 Muscle
contraction_Oxytocin signaling in uterus and mammary gland 3.05E-02
2.26E-02 2.33E-02 Development_Delta- and kappa-type opioid
receptors signaling via beta-arrestin 2.31E-02 0.000622 0.00609
Glutathione metabolism 1.38E-02 3.85E-06 Regulation of lipid
metabolism_PPAR regulation of lipid metabolism 1.30E-04 0.0115
Immune response_PGE2 common pathways 0.0269 Immune
response_HTR2A-induced activation of cPLA2 6.48E-03 2.81E-02
0.00257 Mitochondrial unsaturated fatty acid beta-oxidation
6.00E-03 0.0152 Development_Role of HDAC and
calcium/calmodulin-dependent kinase (CaMK) in 5.64E-03 3.60E-03
0.00177 control of skeletal myogenesis Development_Growth hormone
signaling via PI3K/AKT and MAPK cascades 1.72E-02 3.69E-03 0.0115
Neuropeptide signaling in pancreatic cancer 4.32E-02 Apoptosis and
survival_NO synthesis and signaling 0.0162 0.0333 Immune
response_IL-15 signaling via JAK-STAT cascade 2.73E-02 Regulation
of lipid metabolism_G-alpha(q) regulation of lipid metabolism
0.0432 Neurophysiological process_Long-term depression in
cerebellum 4.32E-02 Apoptosis and survival_Anti-apoptotic action of
membrane-bound ESR1 4.80E-02 0.00612 Development_Role of CDK5 in
neuronal development 1.93E-03 2.76E-02 3.60E-02 3.34E-02 0.00463
Cell cycle_Nucleocytoplasmic transport of CDK/Cyclins 1.75E-03
0.0276 Immune response_IL-5 signalling 6.34E-03 2.27E-02 0.00289
Development_Mu-type opioid receptor signaling 1.61E-02 0.00204
0.00751 Pentose phosphate pathway/Rodent version 0.025
Phenylalanine metabolism 3.99E-02 0.00161 Glycolysis and
gluconeogenesis (short map) 1.50E-02 6.87E-04 WNT signaling in
gastric cancer 1.96E-03 3.61E-04 0.00908 Stem cells_Transcription
factors in segregation of hepatocytic lineage 0.000615 4.59E-04
Development_G-Proteins mediated regulation MAPK-ERK signaling
2.70E-02 0.0166 Development_EPO-induced PI3K/AKT pathway and Ca(2+)
influx 2.07E-02 0.00425 Development_Angiotensin activation of Akt
3.41E-02 6.69E-03 0.00363 DNA damage_ATM/ATR regulation of G2/M
checkpoint 1.80E-02 Development_SSTR1 in regulation of cell
proliferation and migration 0.0494 Cytoskeleton remodeling_ESR1
action on cytoskeleton remodeling and cell migration 4.24E-02
Immune response_TREM1 signaling pathway 2.25E-02 0.00521 Stem
cells_FGF signaling in pancreatic and hepatic differentiation of
embryonic stem 0.0425 cells Tryptophan metabolism 3.92E-02 0.0069
Triacylglycerol metabolism p.1 2.16E-02 0.0123 G-protein
signaling_Rac3 regulation pathway 2.34E-02 Development_Growth
hormone signaling via STATs and PLC/IP3 3.09E-02 2.50E-04
Regulation of lipid metabolism_Regulation of fatty acid synthesis:
NLTP and EHHADH 0.000219 Oxidative stress_Angiotensin II-induced
production of ROS 4.80E-02 3.95E-02 Cholesterol and Sphingolipids
transport/Recycling to plasma membrane in lung 2.73E-02 (normal and
CF) Development_TGF-beta-induction of EMT via ROS 0.0228 Immune
response_IL-22 signaling pathway 5.80E-03 3.27E-02 Cell
cycle_Transition and termination of DNA replication 0.00189 Stem
cells_FGF2-induced self-renewal of adult neural stem cells 0.0118
0.0408 Regulation of metabolism_Bile acids regulation of glucose
and lipid metabolism via 0.0318 FXR Apoptosis and survival_NO
signaling in survival 0.00515 0.0423 Signal transduction_Activation
of PKC via G-Protein coupled receptor 9.05E-04 2.63E-03 1.08E-02
3.27E-03 0.0269 Development_Hedgehog signaling 3.41E-02 2.02E-03
2.10E-04 0.0246 Development_GDNF family signaling 2.01E-03 0.00619
0.0166 HBV-dependent transcription regulation leading to HCC 0.0469
Butanoate metabolism 3.29E-02 0.0192 Development_ERK5 in cell
proliferation and neuronal survival 2.73E-02 Development_FGFR
signaling pathway 1.76E-02 5.02E-03 3.23E-03 0.0134 0.029 Multiple
Myeloma (general scheme) 0.0297 Development_Angiotensin activation
of ERK 3.98E-02 1.19E-03 0.0202 0.00406 Leucune, isoleucine and
valine metabolism/Rodent version 0.000262 Development_Mu-type
opioid receptor signaling via Beta-arrestin 0.0000955 Immune
response_Alternative complement pathway 1.79E-05
Development_Angiotensin signaling via beta-Arrestin 2.89E-02
1.27E-02 0.00619 0.00826 Development_Transactivation of PDGFR in
non-neuronal cells by Dopamine D2 7.40E-04 0.00306 receptor
Development_Membrane-bound ESR1: interaction with growth factors
signaling 2.07E-02 0.00422 Transcription_Androgen Receptor nuclear
signaling 3.14E-02 6.99E-03 0.00535 2.05E-02 HBV regulation of DNA
repair and apoptosis leading to HCC 1.61E-02 Regulation of lipid
metabolism_Regulation of lipid metabolism via LXR, NF-Y and 0.0347
SREBP Immune response_IL-6 signaling pathway 4.25E-02 Immune
response_Lectin induced complement pathway 1.46E-04 Arachidonic
acid production 4.65E-02 1.27E-02 3.70E-02 0.0231 G-protein
signaling_Rap1A regulation pathway 1.98E-02 Stem
cells_Dopamine-induced transactivation of EGFR in SVZ neural stem
cells 3.26E-02 0.0156 0.0176 Immune response_Fc epsilon RI pathway
5.63E-03 2.08E-02 1.41E-02 4.66E-03 0.00881 FGF signaling in
gastric cancer 2.73E-02 0.0489 Development_FGF-family signaling
3.09E-02 0.00805 Fatty Acid Omega Oxidation 0.0241 FGFR3 signaling
in multiple myeloma 4.37E-02 0.00115 Development_MicroRNA-dependent
inhibition of EMT 0.00468 Cardiac Hypertrophy_Ca(2+)-dependent
NF-AT signaling in Cardiac Hypertrophy 1.85E-02 1.85E-02 0.0381
Immune response_Role of integrins in NK cells cytotoxicity 0.0347
Stem cells_MMP-14-induced COX-2 expression in glioblastoma stem
cells 2.31E-02 Hedgehog signaling in pancreatic cancer 2.52E-05
Neurophysiological process_GABA-A receptor life cycle 0.00162
HCV-dependent cytoplasmic signaling leading to HCC 4.61E-02
1.67E-02 0.000227 0.000193 Neurophysiological
process_NMDA-dependent postsynaptic long-term potentiation in
0.0148 0.0045 CA1 hippocampal neurons Immune response_IL-12
signaling pathway 2.31E-02 0.00424 Stem cells_Scheme: Histone H3
demethylases in stem cells 0.0156 Neurophysiological process_HTR1A
receptor signaling in neuronal cells 0.0149 0.0475
Atherosclerosis_Role of ZNF202 in regulation of expression of genes
involved in 4.81E-02 2.00E-02 1.97E-04 0.0387 1.69E-02
Atherosclerosis Translation_Non-genomic (rapid) action of Androgen
Receptor 4.50E-03 1.41E-02 1.18E-02 0.0408 Immune response_Lipoxins
and Resolvin E1 inhibitory action on neutrophil functions 2.31E-03
0.0255 Cell cycle_Regulation of G1/S transition (part 2) 4.30E-04
0.0348 Anti-apoptotic action of Gastrin in gastric cancer 1.15E-02
8.88E-03 0.00612 Development_Activation of astroglial cells
proliferation by ACM3 1.64E-03 5.80E-03 6.90E-03 0.0202 GTP
metabolism 0.0311 Neurophysiological process_Thyroliberin in cell
hyperpolarization and excitability 4.80E-02 Glutathione
metabolism/Human version 1.50E-02 4.55E-06 Stem cells_FGF2
signaling during embryonic stem cell differentiation 0.0227
Proliferative action of Gastrin in gastric cancer 4.58E-03 3.65E-05
4.59E-02 0.0421 Cell adhesion_Integrin inside-out signaling
7.04E-03 3.84E-03 1.71E-02 0.0356 Tissue factor signaling in Lung
Cancer 3.14E-02 Development_Prolactin receptor signaling 2.63E-02
8.70E-03 2.00E-02 0.0406 Phenylalanine metabolism/Rodent version
3.52E-02 0.00559 Development_SSTR2 in regulation of cell
proliferation 0.00705 0.00595 Immune response_CD137 signaling in
immune cell 7.26E-03 0.0118 Development_WNT5A signaling 9.01E-03
2.82E-02 0.00597 Translation_Translation regulation by Alpha-1
adrenergic receptors 1.03E-03 1.32E-02 0.0419 0.029
Development_Gastrin in cell growth and proliferation 2.89E-03
3.91E-03 2.69E-02 2.42E-02 2.79E-03 0.00394 Effect of H. pylori
infection on gastric epithelial cell proliferation 2.63E-02
2.71E-02 Chemotaxis_CCR4-induced leukocyte adhesion 4.63E-02
2.39E-02 GTP-XTP metabolism 8.97E-03 Transcription_Ligand-dependent
activation of the ESR1/SP pathway 2.92E-02 Immune response_TLR3 and
TLR4 induce TICAM1-specific signaling pathway 4.24E-02 1.58E-02
Development_Delta-type opioid receptor mediated cardioprotection
1.03E-02 4.70E-02 0.00808 0.000166 Development_Mu-type opioid
receptor regulation of proliferation 1.89E-02 0.00192 Immune
response_IL-12-induced IFN-gamma production 2.63E-03 0.000259
Proliferative action of Gastrin in pancreatic cancer 7.26E-03
2.27E-02 0.00108 Cell cycle_Regulation of G1/S transition (part 1)
3.46E-03 4.22E-02 0.0000516 3.50E-02 Protein folding_Membrane
trafficking and signal transduction of G-alpha (i) 1.46E-03
2.97E-02 0.00295 heterotrimeric G-protein Immune response_Classical
complement pathway 8.50E-06 Transport_Rab-9 regulation pathway
2.84E-02 5.01E-03 Development_Signaling of Beta-adrenergic
receptors via Beta-arrestins 0.00736 Lysine metabolism/Rodent
version 4.82E-03 0.00209 G-protein signaling_G-Protein beta/gamma
signaling cascades 0.00528 0.0241 Immune response_Sialic-acid
receptors (Siglecs) signaling 0.0179 Leucune, isoleucine and valine
metabolism. p.2 0.000212 Neurophysiological process_Kappa-type
opioid receptor in transmission of nerve 4.63E-02 impulses Stem
cells_Scheme: Adult neurogenesis in the Subventricular Zone
1.37E-02
Immune response_MIF-JAB1 signaling 2.14E-03 3.14E-02 Immune
response_Function of MEF2 in T lymphocytes 4.65E-02 1.27E-02
3.70E-02 0.00553 Immune response_Human NKG2D signaling 1.17E-02
Aflatoxin B1-dependent induction of HCC 3.98E-02 Neurophysiological
process_Role of CDK5 in presynaptic signaling 3.88E-02 0.0442 Stem
cells_mGluRS signaling in glioblastoma stem cells 4.28E-02 0.0386
0.0269 G-protein signaling_G-Protein alpha-q signaling cascades
1.02E-02 3.60E-02 0.00103 DNA damage_ATM/ATR regulation of G1/S
checkpoint 2.44E-07 0.0178 Pentose phosphate pathway 0.0269 Immune
response_MIF - the neuroendocrine-macrophage connector 3.41E-02
2.02E-03 3.50E-02 6.69E-03 0.00597 Immune response_Antiviral
actions of Interferons 1.21E-02 Glycolysis and gluconeogenesis p.
2/Human version 2.53E-03 Peroxisomal branched chain fatty acid
oxidation 3.41E-02 Regulation of lipid metabolism_Alpha-1
adrenergic receptors signaling via arachidonic 4.49E-02 acid
Development_Angiotensin signaling via STATs 1.58E-06
Triacylglycerol metabolism p.2 1.89E-02 Glycolysis and
gluconeogenesis p.3/Human version 1.10E-02 Immune response_T cell
receptor signaling pathway 2.90E-03 Glycolysis and gluconeogenesis
p.3 1.10E-02 2-Naphthylamine and 2-Nitronaphtalene metabolism
3.79E-04 Androstenedione and testosterone biosynthesis and
metabolism p.2/Rodent version 1.00E-02 Retinol metabolism/Rodent
version 1.47E-02 G-protein signaling_Regulation of CDC42 activity
3.27E-02 Mitochondrial long chain fatty acid beta-oxidation
3.41E-02 Pyruvate metabolism/Rodent version 2.91E-03
Neurophysiological process_Netrin-1 in regulation of axon guidance
6.90E-04 Regulation of lipid metabolism_Regulation of lipid
metabolism by niacin and 2.48E-02 isoprenaline Stem cells_Scheme:
Osteogenic and adipogenic differentiation of mesenchymal stem
4.11E-02 cells Pyruvate metabolism 9.11E-03 Naphthalene metabolism
2.50E-02 Transcription_Role of AP-1 in regulation of cellular
metabolism 1.26E-02 1-Naphthylamine and 1-Nitronaphtalene
metabolism 6.00E-03 Muscle contraction_Regulation of eNOS activity
in cardiomyocytes 4.91E-02 Retinol metabolism 1.95E-02
Androstenedione and testosterone biosynthesis and metabolism p.2
8.88E-03 Acetaminophen metabolism 3.90E-03 Propionate metabolism
p.2 1.70E-02
[0281] Furthermore, Table 10 lists only the pathways determined to
be upregulated in CD44+ cells from nulliparous women relative to
CD44+ cells from parous women, and Table 12 lists the pathways that
were significantly upregulated in CD44+, CD24- breast epithelial
cells of parous women relative to the same cell type in nulliparous
women.
[0282] The most significant pathways highly active in parous
samples in all three of these cell types included apoptosis,
survival, and immune response, whereas stem cell and
development-related pathways were enriched only in CD44+ cells from
nulliparous women (FIG. 11; Table 10, above; and Table 12,
below). Pathways highly active in parous stroma were enriched in
energy metabolism, fatty acid metabolism and adipocyte
differentiation from stem cells, which is consistent with adipose
tissue development and a decrease in breast density following
pregnancy. Table 13, below, shows a summary of GeneGo functional
enrichment analysis by protein class for differentially expressed
genes in CD44+, CD24+, CD10+ and stromal cell types isolated from
nulliparous and parous normal human breast. Table 13 indicates the
actual and expected number of network objects in the activated
dataset for a given protein class, and the ratio of the actual to the
expected number. In the Table, "r" is the actual number of genes in
the list showing the indicated protein class, "n" is the total number
of genes in the list, "R" is the number of genes showing the indicated
protein class in the background list, "N" is the total number of genes
in the background list, the mean value for the hypergeometric
distribution is calculated by the formula (n*R/N), the z-score is
calculated using the formula ((r-mean)/sqrt(variance)), and the p-value
represents the probability of observing the given value of r or higher
(or lower for a negative z-score). The functional categories of genes
affected by parity were similar in all four cell types with
receptors and enzymes representing the most enriched groups (FIG.
12 and Table 13).
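The enrichment statistics described for Table 13 can be illustrated with a minimal Python sketch. It is not the GeneGo implementation. The function name `enrichment_stats` and the example counts are hypothetical. The variance is not given in the text and is assumed here to be the standard hypergeometric variance, n*(R/N)*(1-R/N)*(N-n)/(N-1). Here "r" denotes the actual number of genes in the list that fall in the protein class.

```python
from math import comb, sqrt


def enrichment_stats(r, n, R, N):
    """Hypergeometric enrichment statistics for one protein class.

    r: genes in the list that belong to the class (actual count)
    n: total genes in the list
    R: genes in the background list that belong to the class
    N: total genes in the background list
    """
    mean = n * R / N  # expected count, as stated in the text
    # Assumption: standard hypergeometric variance (not given in the text).
    variance = n * (R / N) * (1 - R / N) * (N - n) / (N - 1)
    z = (r - mean) / sqrt(variance)

    total = comb(N, n)

    def pmf(k):
        # Probability of drawing exactly k class members in n draws.
        return comb(R, k) * comb(N - R, n - k) / total

    lo, hi = max(0, n - (N - R)), min(n, R)
    if z >= 0:
        # Probability of the observed r or higher.
        p = sum(pmf(k) for k in range(r, hi + 1))
    else:
        # Probability of the observed r or lower.
        p = sum(pmf(k) for k in range(lo, r + 1))
    return {"mean": mean, "ratio": r / mean, "z": z, "p": p}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(enrichment_stats(r=5, n=10, R=20, N=100))
```

The sketch assumes a non-degenerate case, with 0 < R < N and 0 < n < N, so that the mean and variance are both nonzero. The tail probability is summed exactly with integer binomial coefficients rather than approximated from the z-score.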
TABLE-US-00011 TABLE 11 Pathways Upregulated in Nulliparous CD44+ Cells Relative to Parous CD44+ Cells
Pathway maps | P-value in NP CD44+
Cytoskeleton remodeling_Role of PKA in cytoskeleton reorganisation | 6.44E-07
Development_MAG-dependent inhibition of neurite outgrowth | 1.54E-06
Role of DNA methylation in progression of multiple myeloma | 2.40E-06
Cell adhesion_Histamine H1 receptor signaling in the interruption of cell barrier integrity | 3.24E-06
Stem cells_Response to hypoxia in glioblastoma stem cells | 4.22E-06
Development_WNT signaling pathway. Part 2 | 5.42E-06
Development_Slit-Robo signaling | 6.19E-06
Cytoskeleton remodeling_Fibronectin-binding integrins in cell motility | 8.94E-06
Oxidative phosphorylation | 9.31E-06
Cell adhesion_Role of tetraspanins in the integrin-mediated cell adhesion | 1.02E-05
Cell cycle_Role of Nek in cell cycle regulation | 1.27E-05
Blood coagulation_Blood coagulation | 1.86E-05
Cell adhesion_ECM remodeling | 2.09E-05
Inhibitory action of Lipoxin A4 on PDGF, EGF and LTD4 signaling | 2.45E-05
Stem cells_WNT/Beta-catenin and NOTCH in induction of osteogenesis | 2.48E-05
HIF-1 in gastric cancer | 3.00E-05
Cell adhesion_Plasmin signaling | 3.33E-05
Development_Lipoxin inhibitory action on PDGF, EGF and LTD4 signaling | 3.33E-05
Cell adhesion_Integrin-mediated cell adhesion and migration | 3.84E-05
Cytoskeleton remodeling_Reverse signaling by ephrin B | 5.92E-05
Immune response_IL-1 signaling pathway | 7.06E-05
Cell adhesion_Endothelial cell contacts by junctional mechanisms | 7.46E-05
Signal transduction_cAMP signaling | 7.78E-05
Role of stellate cells in progression of pancreatic cancer | 1.16E-04
Stem cells_NOTCH1-induced self-renewal of glioblastoma stem cells | 1.30E-04
Stem cells_Pancreatic cancer stem cells in tumor metastasis | 1.30E-04
Tumor-stroma interactions in pancreatic cancer | 1.44E-04
Stem cells_Regulation of lung epithelial progenitor cell differentiation | 1.66E-04
LKB1 signaling pathway in lung cancer cells | 1.66E-04
Immune response_CCR3 signaling in eosinophils | 1.68E-04
Non-genomic signaling of ESR2 (membrane) in lung cancer cells | 1.76E-04
Blood coagulation_GPCRs in platelet aggregation | 2.20E-04
Cytoskeleton remodeling_Role of PDGFs in cell migration | 2.55E-04
Stem cells_Role of BMP signaling in embryonic stem cell neural differentiation | 2.59E-04
Development_Hedgehog and PTH signaling pathways in bone and cartilage development | 3.07E-04
Stem cells_Hedgehog, BMP and Parathyroid hormone in osteogenesis | 3.25E-04
IGF signaling in HCC | 3.94E-04
Development_EGFR signaling via small GTPases | 4.43E-04
Cell adhesion_Cadherin-mediated cell adhesion | 4.72E-04
Stem cells_Differentiation of white adipocytes | 4.75E-04
Apoptosis and survival_Endoplasmic reticulum stress response pathway | 4.75E-04
Development_BMP signaling | 5.69E-04
Development_TGF-beta-dependent induction of EMT via MAPK | 6.02E-04
PGE2 pathways in cancer | 6.80E-04
Immune response_Antigen presentation by MHC class I | 8.21E-04
Muscle contraction_Regulation of eNOS activity in endothelial cells | 8.47E-04
Development_Melanocyte development and pigmentation | 8.76E-04
Stem cells_Extraembryonic differentiation of embryonic stem cells | 9.09E-04
Stem cells_Astrocyte differentiation from adult stem cells | 9.09E-04
Stem cells_Auditory hair cell differentiation in embryogenesis | 1.06E-03
Effect of H. pylori infection on gastric epithelial cells motility | 1.12E-03
Development_S1P3 receptor signaling pathway | 1.12E-03
Development_Role of IL-8 in angiogenesis | 1.12E-03
Immune response_IL-9 signaling pathway | 1.13E-03
Cell adhesion_Gap junctions | 1.35E-03
DNA damage_Brca1 as a transcription regulator | 1.35E-03
Stem cells_Early embryonal hypaxial myogenesis | 1.40E-03
Immune response_Oncostatin M signaling via MAPK in human cells | 1.40E-03
Stem cells_Beta adrenergic receptors in brown adipocyte differentiation | 1.40E-03
ENaC regulation in airways (normal and CF) | 1.48E-03
EGFR family signaling in pancreatic cancer | 1.49E-03
Cell adhesion_Endothelial cell contacts by non-junctional mechanisms | 1.52E-03
Neurophysiological process_Glutamate regulation of Dopamine D1A receptor signaling | 1.62E-03
Neurophysiological process_Receptor-mediated axon growth repulsion | 1.62E-03
Role of cell adhesion molecules in progression of pancreatic cancer | 1.62E-03
Immune response_Fc gamma R-mediated phagocytosis in macrophages | 1.62E-03
Neurophysiological process_ACM regulation of nerve impulse | 1.93E-03
Transcription_Transcription regulation of aminoacid metabolism | 1.98E-03
G-protein signaling_Regulation of p38 and JNK signaling mediated by G-proteins | 2.08E-03
Stem cells_Role of GSK3 beta in cardioprotection against myocardial infarction | 2.12E-03
Development_NOTCH-induced EMT | 2.12E-03
HCV-dependent transcription regulation leading to HCC | 2.12E-03
Development_PDGF signaling via MAPK cascades | 2.29E-03
Transport_Clathrin-coated vesicle cycle | 2.30E-03
Stem cells_Stimulation of differentiation of mouse embryonic fibroblasts into adipocytes by extracellular factors | 2.30E-03
Immune response_MIF in innate immunity response | 2.50E-03
Development_S1P2 and S1P3 receptors in cell proliferation and differentiation | 2.54E-03
Reproduction_GnRH signaling | 2.61E-03
Regulation of lipid metabolism_Stimulation of Arachidonic acid production by ACM receptors | 2.61E-03
Immune response_Oncostatin M signaling via JAK-Stat in human cells | 2.84E-03
Development_WNT signaling pathway. Part 1. Degradation of beta-catenin in the absence WNT signaling | 2.84E-03
Development_VEGF-family signaling | 3.00E-03
Hypoxia-induced EMT in cancer and fibrosis | 3.01E-03
Cell adhesion_Role of CDK5 in cell adhesion | 3.01E-03
Mechanisms of drug resistance in multiple myeloma | 3.17E-03
Activation of TGF-beta signaling in pancreatic cancer | 3.20E-03
Development_NOTCH1-mediated pathway for NF-KB activity modulation | 3.20E-03
Regulation of VEGF signaling in pancreatic cancer | 3.20E-03
Possible pathway of TGF-beta 1-dependent inhibition of CFTR expression | 3.20E-03
Signal transduction_Erk Interactions: Inhibition of Erk | 3.20E-03
Muscle contraction_GPCRs in the regulation of smooth muscle tone | 3.51E-03
Stem cells_NOTCH in inhibition of WNT/Beta-catenin-induced osteogenesis | 3.56E-03
Apoptosis and survival_Inhibition of ROS-induced apoptosis by 17beta-estradiol | 3.56E-03
Development_TGF-beta receptor signaling | 3.70E-03
TGF-beta 1-induced transactivation of membrane receptors signaling in HCC | 3.70E-03
Beta-2 adrenergic-dependent CFTR expression | 3.87E-03
Immune response_Oncostatin M signaling via MAPK in mouse cells | 3.88E-03
Role of osteoblasts in bone lesions formation in multiple myeloma | 3.88E-03
Mechanisms of CAM-DR in multiple myeloma | 3.88E-03
Development_TGF-beta-dependent induction of EMT via SMADs | 3.88E-03
Stem cells_WNT and Notch signaling in early cardiac myogenesis | 3.88E-03
Some pathways of EMT in cancer cells | 4.30E-03
Membrane-bound ESR1: interaction with G-proteins signaling | 4.30E-03
Cell adhesion_Tight junctions | 4.66E-03
Cytoskeleton remodeling_Keratin filaments 4.66E-03 IGF-1 signaling
in pancreatic cancer 4.66E-03 Stem cells_Dopamine-induced
expression of CNTF in adult neurogenesis 4.79E-03 Cell cycle_Role
of 14-3-3 proteins in cell cycle regulation 4.79E-03
Development_Thrombopoetin signaling via JAK-STAT pathway 4.79E-03
Immune response_IL-17 signaling pathways 4.82E-03 Suppression of
TGF-beta signaling in pancreatic cancer 4.93E-03 G-protein
signaling_G-Protein alpha-12 signaling pathway 5.57E-03 G-protein
signaling_Regulation of cAMP levels by ACM 5.78E-03 Cell
adhesion_Ephrin signaling 5.78E-03 G-protein signaling_Cross-talk
between Ras-family GTPases 6.08E-03 Proteolysis_Putative ubiquitin
pathway 6.08E-03 Stem cells_Aberrant Wnt signaling in
medulloblastoma stem cells 6.08E-03 Putative role of Estrogen
receptor and Androgen receptor signaling in progression of lung
cancer 6.56E-03 ERBB family and HGF signaling in gastric cancer
6.56E-03 Stem cells_Noncanonical WNT signaling in cardiac
myogenesis 6.59E-03 G-protein signaling_Rap2A regulation pathway
7.03E-03 Transport_Macropinocytosis regulation by growth factors
7.05E-03 Development_EGFR signaling pathway 7.05E-03 Dual role of
TGF-beta 1 in HCC 7.59E-03 Immune response_IFN alpha/beta signaling
pathway 7.59E-03 Development_Glucocorticoid receptor signaling
7.59E-03 Cell adhesion_PLAU signaling 7.76E-03 Transcription_P53
signaling pathway 7.76E-03 Stem cells_BMP7 in brown adipocyte
differentiation 7.76E-03 Development_Beta-adrenergic receptors
regulation of ERK 7.77E-03 Role and regulation of Prostaglandin E2
in gastric cancer 7.77E-03 Development_Leptin signaling via
PI3K-dependent pathway 7.77E-03 Transport_Alpha-2 adrenergic
receptor regulation of ion channels 7.77E-03 Influence of bone
marrow cell environment on progression of multiple myeloma 7.77E-03
Immune response_CD40 signaling 7.95E-03 Muscle contraction_ACM
regulation of smooth muscle contraction 8.52E-03 Stem cells_H3K4
demethylases in stem cell maintenance 8.73E-03 Development_PDGF
signaling via STATs and NF-kB 8.73E-03 Transition of HCC cells to
invasive and migratory phenotype 9.07E-03 WNT signaling in HCC
9.07E-03 Development_Neurotrophin family signaling 9.07E-03
Ubiquinone metabolism 9.10E-03 Immune response_Oncostatin M
signaling via JAK-Stat in mouse cells 9.13E-03 Androgen signaling
in HCC 9.13E-03 Development_Leptin signaling via JAK/STAT and MAPK
cascades 9.37E-03 Transport_RAB1A regulation pathway 9.84E-03
Cytoskeleton remodeling_Integrin outside-in signaling 1.02E-02 Role
of metalloproteases and heparanase in progression of pancreatic
cancer 1.04E-02 Cytoskeleton remodeling_Thyroliberin in
Cytoskeleton remodeling 1.04E-02 Transport_ACM3 in salivary glands
1.06E-02 Transport_Intracellular cholesterol transport in norm
1.10E-02 Muscle contraction_Delta-type opioid receptor in smooth
muscle contraction 1.14E-02 G-protein signaling_Ras family GTPases
in kinase cascades (scheme) 1.14E-02 Development_Alpha-1 adrenergic
receptors signaling via cAMP 1.16E-02 HCV-mediated liver damage and
predisposition to HCC progression via p53 1.16E-02 wtCFTR and
delta508 traffic/Clathrin coated vesicles formation (norm and CF)
1.16E-02 Immune response_Histamine signaling in dendritic cells
1.17E-02 Development_GM-CSF signaling 1.17E-02 Development_A2B
receptor: action via G-protein alpha s 1.17E-02 Angiogenesis in HCC
1.17E-02 Pro-inflammatory action of Gastrin in gastric cancer
1.17E-02 Oxidative stress_Role of ASK1 under oxidative stress
1.22E-02 Stem cells_BMP signaling in cardiac myogenesis 1.22E-02
Transcription_Role of VDR in regulation of genes involved in
osteoporosis 1.23E-02 Stem cells_TNF-alpha, IL-1 alpha and
WNT5A-dependent regulation of osteogenesis and 1.33E-02
adipogenesis in mesenchymal stem cells Mitochondrial ketone bodies
biosynthesis and metabolism 1.38E-02 Regulation of beta-adrenergic
receptors signaling in pancreatic cancer 1.40E-02 Development_Notch
Signaling Pathway 1.40E-02 Development_A2A receptor signaling
1.40E-02 Development_VEGF signaling and activation 1.40E-02
Apoptosis and survival_Anti-apoptotic action of Gastrin 1.40E-02
Neurophysiological process_Melatonin signaling 1.40E-02
Neurophysiological process_EphB receptors in dendritic spine
morphogenesis and synaptogenesis 1.43E-02 Cytoskeleton
remodeling_Role of Activin A in cytoskeleton remodeling 1.46E-02
Stem cells_H3K36 demethylation in stem cell maintenance 1.46E-02
Effect of H. pylori infection on inflammation in gastric epithelial
cells 1.54E-02 Development_S1P1 signaling pathway 1.60E-02
Development_Ligand-independent activation of ESR1 and ESR2 1.60E-02
CFTR-dependent regulation of ion channels in Airway Epithelium
(norm and CF) 1.60E-02 Mechanisms of resistance to EGFR inhibitors
in lung cancer 1.60E-02 Development_Regulation of CDK5 in CNS
1.64E-02 HGF signaling in pancreatic cancer 1.64E-02 E-cadherin
signaling and its regulation in gastric cancer 1.67E-02 HBV
signaling via protein kinases leading to HCC 1.67E-02
Development_Endothelin-1/EDNRA signaling 1.69E-02 Development_VEGF
signaling via VEGFR2 - generic cascades 1.82E-02 Immune
response_IL-13 signaling via JAK-STAT 1.82E-02 Signal
transduction_Calcium signaling 1.82E-02 Cytoskeleton
remodeling_ACM3 and ACM4 in keratinocyte migration 1.92E-02
Cholesterol and Sphingolipids transport/Distribution to the
intracellular membrane compartments 1.94E-02
(normal and CF) Stem cells_Notch signaling in medulloblastoma stem
cells 1.94E-02 Proteolysis_Putative SUMO-1 pathway 1.94E-02
Transcription_Role of heterochromatin protein 1 (HP1) family in
transcriptional silencing 2.18E-02 Immune response_MIF-mediated
glucocorticoid regulation 2.18E-02 Cell adhesion_Cell-matrix
glycoconjugates 2.21E-02 Cytoskeleton remodeling_RalA regulation
pathway 2.28E-02 Muscle contraction_S1P2 receptor-mediated smooth
muscle contraction 2.28E-02 EGFR signaling pathway in Lung Cancer
2.33E-02 Influence of smoking on activation of EGFR signaling in
lung cancer cells 2.33E-02 Development_HGF signaling pathway
2.33E-02 Cardiac Hypertrophy_NF-AT signaling in Cardiac Hypertrophy
2.33E-02 Immune response_TLR signaling pathways 2.36E-02
Chemotaxis_Leukocyte chemotaxis 2.47E-02 Cytokine production by
Th17 cells in CF 2.52E-02 Development_PACAP signaling in neural
cells 2.52E-02 Translation _Regulation of EIF2 activity 2.52E-02
Cytoskeleton remodeling_FAK signaling 2.62E-02 Inhibition of
apoptosis in pancreatic cancer 2.62E-02 Stem
cells_Neovascularization of glioblastoma in response to hypoxia
2.65E-02 Stem cells_Embryonal epaxial myogenesis 2.65E-02
Inflammatory mechanisms of pancreatic cancerogenesis 2.82E-02
Sorafenib-induced inhibition of cell proliferation and angiogenesis
in HCC 2.84E-02 IL-1 beta-dependent CFTR expression 2.84E-02
Development_Role of Activin A in cell differentiation and
proliferation 2.87E-02 Stem cells_H3K27 demethylases in
differentiation of stem cells 2.87E-02
Reproduction_Progesterone-mediated oocyte maturation 2.87E-02 Stem
cells_Regulation of endothelial progenitor cell differentiation
from adult stem cells 2.90E-02 Bacterial infections in CF airways
2.90E-02 Cytokine production by Th17 cells in CF (Mouse model)
2.93E-02 Development_PEDF signaling 2.93E-02 Immune
response_Bacterial infections in normal airways 2.93E-02 Stem
cells_Cooperation between Hedgehog, IGF-2 and HGF signaling
pathways in medulloblastoma 3.06E-02 stem cells Immune response
_Immunological synapse formation 3.20E-02 Stem cells_Muscle
progenitor cell migration in hypaxial myogenesis 3.24E-02 Apoptosis
and survival_Lymphotoxin-beta receptor signaling 3.24E-02 Immune
response_Gastrin in inflammatory response 3.38E-02
Transcription_Transcription factor Tubby signaling pathways
3.50E-02 Stem cells_EGF-induced proliferation of Type C cells in
SVZ of adult brain 3.51E-02 Normal and pathological
TGF-beta-mediated regulation of cell proliferation 3.51E-02 Mucin
expression in CF via TLRs, EGFR signaling pathways 3.63E-02 Immune
response_Neurotensin-induced activation of IL-8 in colonocytes
3.64E-02 Signal transduction_JNK pathway 3.64E-02 Cytoskeleton
remodeling_Neurofilaments 3.66E-02 Development_Thyroliberin
signaling 3.87E-02 Transcription_PPAR Pathway 3.87E-02 Stem
cells_Role of PKR1 and ILK in cardiac progenitor cells 4.00E-02
Apoptosis and survival_Role of CDK5 in neuronal death and survival
4.00E-02 Development_CNTF receptor signaling 4.00E-02 wtCFTR and
deltaF508 traffic/Membrane expression (norm and CF) 4.00E-02
Chemotaxis_CXCR4 signaling pathway 4.00E-02 Neurophysiological
process_Dopamine D2 receptor transactivation of PDGFR in CNS
4.26E-02 Immune response_Signaling pathway mediated by IL-6 and
IL-1 4.91E-02 Development_FGF2-dependent induction of EMT 4.46E-04
Transcription_ChREBP regulation pathway 6.25E-04 Regulation of
lipid metabolism_Insulin regulation of glycogen metabolism 2.76E-03
Transport_Macropinocytosis 9.84E-03 Regulation of CFTR activity
(norm and CF) 7.82E-05 Cell adhesion_Chemokines and adhesion
2.69E-07 Development_TGF-beta-dependent induction of EMT via RhoA,
PI3K and ILK. 1.13E-04 K-RAS signaling in lung cancer 6.72E-03 Cell
adhesion_Alpha-4 integrins in cell migration and adhesion 3.71E-06
Cytoskeleton remodeling_Cytoskeleton remodeling 1.05E-09 Muscle
contraction_Relaxin signaling pathway 8.94E-03 Apoptosis and
survival_BAD phosphorylation 9.18E-04 IL-6 signaling in multiple
myeloma 8.76E-04 Apoptosis and survival_Apoptotic TNF-family
pathways 9.18E-04 Immune response_IL-2 activation and signaling
pathway 3.17E-03 Dual role of BMP signaling in gastric cancer
3.50E-04 Cytoskeleton remodeling_Regulation of actin cytoskeleton
by Rho GTPases 1.34E-09 Cell cycle_Initiation of mitosis 9.37E-03
Transcription_CREB pathway 1.35E-03 Signal transduction_PKA
signaling 1.64E-05 Stem cells_Endothelial differentiation during
embryonic development 3.25E-04 Cytoskeleton remodeling_TGF, WNT and
cytoskeletal remodeling 1.88E-09 HBV-dependent NF-kB and PI3K/AKT
pathways leading to HCC 8.76E-04 Translation _Regulation of
translation initiation 6.27E-04 Cell cycle_Influence of Ras and Rho
proteins on G1/S Transition 1.18E-04 Apoptosis and
survival_Granzyme A signaling 1.35E-03
TABLE-US-00012 TABLE 12 Pathways Upregulated in Parous CD44+ Cells Relative to Nulliparous CD44+ Cells
Pathway maps | P-val in P CD44+
TTP metabolism | 0.0000608
Resistance of pancreatic cancer cells to death receptor signaling | 1.29E-04
Transcription_Assembly of RNA Polymerase II preinitiation complex on TATA-less promoters | 0.000136
Development_PIP3 signaling in cardiac myocytes | 3.38E-04
HCV-dependent regulation of RNA polymerases leading to HCC | 0.00035
Stem cells_H3K9 demethylases in pluripotency maintenance of stem cells | 4.62E-04
Inhibition of apoptosis in gastric cancer | 6.32E-04
Cell cycle_Start of DNA replication in early S phase | 0.00067
Apoptosis and survival_Caspase cascade | 0.000816
Immune response_BCR pathway | 9.79E-04
Immune response_ICOS pathway in T-helper cell | 1.40E-03
Cell cycle_The metaphase checkpoint | 0.00141
Inhibitory action of Lipoxins on neutrophil migration | 1.46E-03
Cytoskeleton remodeling_Alpha-1A adrenergic receptor-dependent inhibition of PI3K | 1.67E-03
DNA damage_NHEJ mechanisms of DSBs repair | 1.67E-03
Regulation of metabolism_Triiodothyronine and Thyroxine signaling | 0.00186
Cell cycle_Chromosome condensation in prometaphase | 2.70E-03
Development_IGF-1 receptor signaling | 2.77E-03
dCTP/dUTP metabolism | 0.003
dGTP metabolism | 0.00332
Inhibition of RUNX3 signaling in gastric cancer | 0.00336
Apoptosis and survival_Beta-2 adrenergic receptor anti-apoptotic action | 0.00412
Signal transduction_Activin A signaling regulation | 4.38E-03
Stem cells_Fetal brown fat cell differentiation | 0.00447
Immune response_CXCR4 signaling via second messenger | 5.11E-03
dATP/dITP metabolism | 0.00573
Signal transduction_PTEN pathway | 5.97E-03
Microsatellite instability in gastric cancer | 0.00601
Inhibition of TGF-beta signaling in gastric cancer | 6.01E-03
Immune response_Regulation of T cell function by CTLA-4 | 6.82E-03
DNA damage_DNA-damage-induced responses | 0.00747
Stem cells_Self-renewal of adult neural stem cells | 0.00756
Regulation of degradation of deltaF508 CFTR in CF | 8.44E-03
Transcription_Sin3 and NuRD in transcription regulation | 0.00892
Blood coagulation_GPIb-IX-V-dependent platelet activation | 0.00952
Transcription_Receptor-mediated HIF regulation | 1.01E-02
Stem cells_Signaling pathways in embryonic hepatocyte maturation | 1.05E-02
Apoptosis and survival_nAChR in apoptosis inhibition and cell cycle progression | 1.15E-02
Apoptosis and survival_Anti-apoptotic TNFs/NF-kB/Bcl-2 pathway | 1.29E-02
DNA damage_Role of Brca1 and Brca2 in DNA repair | 0.0133
Translation_IL-2 regulation of translation | 0.0139
DNA damage_Inhibition of telomerase activity and cellular senescence | 0.0139
DNA damage_Mismatch repair | 0.0139
Neurophysiological process_Olfactory transduction | 0.0139
Immune response_CD28 signaling | 1.42E-02
Immune response_Role of DAP12 receptors in NK cells | 0.0142
Immune response_PIP3 signaling in B lymphocytes | 0.0144
Immune response_ETV3 affect on CSF1-promoted macrophage differentiation | 0.0152
Blood coagulation_GPVI-dependent platelet activation | 0.0157
Inhibition of tumor suppressive pathways in pancreatic cancer | 1.65E-02
Transcription_Ligand-Dependent Transcription of Retinoid-Target genes | 0.0196
Role of alpha-6/beta-4 integrins in carcinoma progression | 0.0199
Development_Thrombopoietin-regulated cell processes | 1.99E-02
Chemotaxis_Lipoxin inhibitory action on fMLP-induced neutrophil chemotaxis | 2.20E-02
Development_EGFR signaling via PIP3 | 0.0226
G-protein signaling_S1P2 receptor signaling | 0.0248
Stem cells_Differentiation of natural regulatory T cells | 0.0248
Translation_Opioid receptors in regulation of translation | 2.61E-02
Transport_RAB3 regulation pathway | 0.0271
G-protein signaling_RAC1 in cellular process | 0.0277
DNA damage_Nucleotide excision repair | 0.0277
Immune response_Inhibitory action of lipoxins on superoxide production induced by IL-8 and Leukotriene B4 in neutrophils | 2.91E-02
Inhibitory action of Lipoxins on Superoxide production in neutrophils | 2.91E-02
wtCFTR and delta508-CFTR traffic/Generic schema (norm and CF) | 0.0317
Apoptosis and survival_DNA-damage-induced apoptosis | 0.0327
Apoptosis and survival_NGF signaling pathway | 0.0341
Apoptosis and survival_APRIL and BAFF signaling | 3.42E-02
Immune response_NFAT in immune response | 0.0346
Apoptosis and survival_Anti-apoptotic TNFs/NF-kB/IAP pathway | 3.85E-02
Immune response_TCR and CD28 co-stimulation in activation of NF-kB | 0.0414
Immune response_Innate immune response to RNA viral infection | 4.33E-02
Immune response_IFN gamma signaling pathway | 0.044
Immune response_CD16 signaling in NK cells | 0.0472
Immune response_Delta-type opioid receptor signaling in T-cells | 4.84E-02
Apoptosis and survival_p53-dependent apoptosis | 0.0484
Stem cells_Role of growth factors in the maintenance of embryonic stem cell pluripotency | 0.0129
Chemoresistance pathways mediated by constitutive activation of PI3K pathway and BCL-2 in small cell lung cancer | 2.20E-05
Signal transduction_AKT signaling | 2.74E-05
Immune response_Inhibitory action of Lipoxins on pro-inflammatory TNF-alpha signaling | 4.18E-05
Apoptosis and survival_Cytoplasmic/mitochondrial transport of proapoptotic proteins Bid, Bmf and Bim | 1.61E-04
Translation_Regulation of EIF4F activity | 1.81E-04
PI3K signaling in gastric cancer | 6.36E-04
Chemotaxis_Inhibitory action of lipoxins on IL-8- and Leukotriene B4-induced neutrophil migration | 6.36E-04
Translation_Insulin regulation of translation | 7.48E-04
Transcription_Role of Akt in hypoxia induced HIF1 activation | 1.49E-03
Apoptosis and survival_Ceramides signaling pathway | 1.96E-03
Apoptosis and survival_Role of IAP-proteins in apoptosis | 3.16E-03
Proteolysis_Role of Parkin in the Ubiquitin-Proteasomal Pathway | 5.01E-03
Anti-apoptotic action of Gastrin in pancreatic cancer | 5.92E-03
TABLE-US-00013 TABLE 13 GeneGo Functional Enrichment Analysis by Protein Class for Differentially Expressed Genes in CD44+, CD24+, CD10+ and Stromal Breast Epithelial Cell Types
Protein class | Actual | n | R | N | Expected | Ratio | p-value | z-score
Protein class enriched in nulliparous CD44+ cells
phosphatases | 33 | 2078 | 230 | 22651 | 21.1 | 1.564 | 6.690E-03 | 2.732
ligands | 67 | 2078 | 507 | 22651 | 46.51 | 1.44 | 1.524E-03 | 3.188
kinases | 71 | 2078 | 650 | 22651 | 59.63 | 1.191 | 6.960E-02 | 1.567
transcription factors | 101 | 2078 | 951 | 22651 | 87.24 | 1.158 | 6.627E-02 | 1.579
enzymes | 286 | 2078 | 2693 | 22651 | 247.1 | 1.158 | 3.576E-03 | 2.77
proteases | 57 | 2078 | 552 | 22651 | 50.64 | 1.126 | 1.896E-01 | 0.9493
receptors | 97 | 2078 | 1492 | 22651 | 136.9 | 0.7087 | 6.932E-05 | -3.7
other | 1374 | 2078 | 15628 | 22651 | 1434 | 0.9584 | 1.705E-03 | -2.972
Protein class enriched in nulliparous CD10+ cells
proteases | 59 | 1491 | 552 | 22651 | 36.34 | 1.624 | 1.665E-04 | 3.938
ligands | 53 | 1491 | 507 | 22651 | 33.37 | 1.588 | 5.912E-04 | 3.555
enzymes | 218 | 1491 | 2693 | 22651 | 177.3 | 1.23 | 5.826E-04 | 3.372
transcription factors | 68 | 1491 | 951 | 22651 | 62.6 | 1.086 | 2.531E-01 | 0.7215
phosphatases | 16 | 1491 | 230 | 22651 | 15.14 | 1.057 | 4.467E-01 | 0.2299
kinases | 43 | 1491 | 650 | 22651 | 42.79 | 1.005 | 5.096E-01 | 0.03431
receptors | 96 | 1491 | 1492 | 22651 | 98.21 | 0.9775 | 4.319E-01 | -0.2388
other | 946 | 1491 | 15628 | 22651 | 1029 | 0.9196 | 1.294E-06 | -4.792
Protein class enriched in nulliparous CD24+ cells
phosphatases | 23 | 1273 | 230 | 22651 | 12.93 | 1.779 | 5.428E-03 | 2.899
enzymes | 213 | 1273 | 2693 | 22651 | 151.3 | 1.407 | 9.672E-08 | 5.495
kinases | 45 | 1273 | 650 | 22651 | 36.53 | 1.232 | 8.715E-02 | 1.464
transcription factors | 51 | 1273 | 951 | 22651 | 53.45 | 0.9542 | 3.967E-01 | -0.352
ligands | 25 | 1273 | 507 | 22651 | 28.49 | 0.8774 | 2.859E-01 | -0.6814
proteases | 27 | 1273 | 552 | 22651 | 31.02 | 0.8703 | 2.598E-01 | -0.7526
receptors | 46 | 1273 | 1492 | 22651 | 83.85 | 0.5486 | 1.417E-06 | -4.402
other | 844 | 1273 | 15628 | 22651 | 878.3 | 0.9609 | 1.799E-02 | -2.14
Protein class enriched in nulliparous stromal cells
ligands | 35 | 770 | 507 | 22651 | 17.24 | 2.031 | 6.543E-05 | 4.403
proteases | 38 | 770 | 552 | 22651 | 18.76 | 2.025 | 3.424E-05 | 4.574
kinases | 36 | 770 | 650 | 22651 | 22.1 | 1.629 | 2.994E-03 | 3.054
transcription factors | 49 | 770 | 951 | 22651 | 32.33 | 1.516 | 2.625E-03 | 3.048
phosphatases | 11 | 770 | 230 | 22651 | 7.819 | 1.407 | 1.619E-01 | 1.163
receptors | 53 | 770 | 1492 | 22651 | 50.72 | 1.045 | 3.891E-01 | 0.3371
enzymes | 69 | 770 | 2693 | 22651 | 91.55 | 0.7537 | 4.980E-03 | -2.554
other | 482 | 770 | 15628 | 22651 | 531.3 | 0.9073 | 7.001E-05 | -3.905
Protein class enriched in parous CD44+ cells
phosphatases | 24 | 1820 | 230 | 22651 | 18.48 | 1.299 | 1.130E-01 | 1.346
enzymes | 280 | 1820 | 2693 | 22651 | 216.4 | 1.294 | 1.994E-06 | 4.804
kinases | 67 | 1820 | 650 | 22651 | 52.23 | 1.283 | 2.106E-02 | 2.163
transcription factors | 88 | 1820 | 951 | 22651 | 76.41 | 1.152 | 9.018E-02 | 1.412
proteases | 39 | 1820 | 552 | 22651 | 44.35 | 0.8793 | 2.234E-01 | -0.8485
ligands | 35 | 1820 | 507 | 22651 | 40.74 | 0.8592 | 1.949E-01 | -0.948
receptors | 76 | 1820 | 1492 | 22651 | 119.9 | 0.634 | 3.035E-06 | -4.324
other | 1215 | 1820 | 15628 | 22651 | 1256 | 0.9676 | 1.720E-02 | -2.151
Protein class enriched in parous CD10+ cells
enzymes | 241 | 1721 | 2693 | 22651 | 204.6 | 1.178 | 3.179E-03 | 2.819
kinases | 58 | 1721 | 650 | 22651 | 49.39 | 1.174 | 1.131E-01 | 1.294
ligands | 41 | 1721 | 507 | 22651 | 38.52 | 1.064 | 3.611E-01 | 0.4202
phosphatases | 17 | 1721 | 230 | 22651 | 17.48 | 0.9728 | 5.164E-01 | -0.1189
transcription factors | 65 | 1721 | 951 | 22651 | 72.26 | 0.8996 | 2.004E-01 | -0.9072
proteases | 33 | 1721 | 552 | 22651 | 41.94 | 0.7868 | 8.152E-02 | -1.454
receptors | 78 | 1721 | 1492 | 22651 | 113.4 | 0.6881 | 1.122E-04 | -3.575
other | 1193 | 1721 | 15628 | 22651 | 1187 | 1.005 | 3.921E-01 | 0.3036
Protein class enriched in parous CD24+ cells
phosphatases | 16 | 1173 | 230 | 22651 | 11.91 | 1.343 | 1.422E-01 | 1.223
kinases | 42 | 1173 | 650 | 22651 | 33.66 | 1.248 | 8.280E-02 | 1.498
enzymes | 170 | 1173 | 2693 | 22651 | 139.5 | 1.219 | 3.280E-03 | 2.829
transcription factors | 58 | 1173 | 951 | 22651 | 49.25 | 1.178 | 1.104E-01 | 1.308
ligands | 28 | 1173 | 507 | 22651 | 26.26 | 1.066 | 3.900E-01 | 0.3536
proteases | 28 | 1173 | 552 | 22651 | 28.59 | 0.9795 | 5.044E-01 | -0.1139
receptors | 54 | 1173 | 1492 | 22651 | 77.26 | 0.6989 | 2.041E-03 | -2.812
other | 780 | 1173 | 15628 | 22651 | 809.3 | 0.9638 | 3.152E-02 | -1.9
Protein class enriched in parous stromal cells
enzymes | 228 | 950 | 2693 | 22651 | 112.9 | 2.019 | 1.785E-26 | 11.78
kinases | 35 | 950 | 650 | 22651 | 27.26 | 1.284 | 7.908E-02 | 1.536
phosphatases | 9 | 950 | 230 | 22651 | 9.646 | 0.933 | 5.007E-01 | -0.2137
ligands | 12 | 950 | 507 | 22651 | 21.26 | 0.5643 | 1.865E-02 | -2.076
proteases | 13 | 950 | 552 | 22651 | 23.15 | 0.5615 | 1.370E-02 | -2.182
transcription factors | 22 | 950 | 951 | 22651 | 39.89 | 0.5516 | 1.014E-03 | -2.956
receptors | 29 | 950 | 1492 | 22651 | 62.58 | 0.4634 | 5.878E-07 | -4.487
other | 603 | 950 | 15628 | 22651 | 655.5 | 0.92 | 1.188E-04 | -3.759
Protein class enrichment for promoter hypermethylation in nulliparous CD44+ cells
kinases | 37 | 838 | 650 | 22651 | 24.05 | 1.539 | 6.593E-03 | 2.731
transcription factors | 54 | 838 | 951 | 22651 | 35.18 | 1.535 | 1.240E-03 | 3.303
enzymes | 134 | 838 | 2693 | 22651 | 99.63 | 1.345 | 1.970E-04 | 3.738
proteases | 25 | 838 | 552 | 22651 | 20.42 | 1.224 | 1.745E-01 | 1.045
ligands | 20 | 838 | 507 | 22651 | 18.76 | 1.066 | 4.165E-01 | 0.2958
phosphatases | 9 | 838 | 230 | 22651 | 8.509 | 1.058 | 4.798E-01 | 0.1724
receptors | 40 | 838 | 1492 | 22651 | 55.2 | 0.7247 | 1.541E-02 | -2.157
other | 523 | 838 | 15628 | 22651 | 578.2 | 0.9046 | 2.087E-05 | -4.199
Protein class enrichment for promoter hypermethylation in nulliparous CD44+ cells
transcription factors | 32 | 290 | 951 | 22651 | 12.18 | 2.628 | 6.665E-07 | 5.842
ligands | 10 | 290 | 507 | 22651 | 6.491 | 1.541 | 1.180E-01 | 1.402
proteases | 9 | 290 | 552 | 22651 | 7.067 | 1.273 | 2.774E-01 | 0.7408
kinases | 10 | 290 | 650 | 22651 | 8.322 | 1.202 | 3.222E-01 | 0.594
enzymes | 39 | 290 | 2693 | 22651 | 34.48 | 1.131 | 2.282E-01 | 0.8256
receptors | 20 | 290 | 1492 | 22651 | 19.1 | 1.047 | 4.490E-01 | 0.2139
phosphatases | 2 | 290 | 230 | 22651 | 2.945 | 0.6792 | 4.332E-01 | -0.5569
other | 170 | 290 | 15628 | 22651 | 200.1 | 0.8496 | 1.099E-04 | -3.844
Protein class enrichment for genebody hypermethylation in nulliparous CD44+ cells
transcription factors | 31 | 249 | 951 | 22651 | 10.45 | 2.965 | 6.726E-08 | 6.528
phosphatases | 4 | 249 | 230 | 22651 | 2.528 | 1.582 | 2.474E-01 | 0.9354
receptors | 18 | 249 | 1492 | 22651 | 16.4 | 1.097 | 3.762E-01 | 0.4107
ligands | 6 | 249 | 507 | 22651 | 5.573 | 1.077 | 4.852E-01 | 0.1838
kinases | 6 | 249 | 650 | 22651 | 7.145 | 0.8397 | 4.249E-01 | -0.4372
enzymes | 21 | 249 | 2693 | 22651 | 29.6 | 0.7094 | 5.047E-02 | -1.694
proteases | 4 | 249 | 552 | 22651 | 6.068 | 0.6592 | 2.712E-01 | -0.8547
other | 160 | 249 | 15628 | 22651 | 171.8 | 0.9313 | 6.111E-02 | -1.625
Protein class enrichment for genebody hypermethylation in parous CD44+ cells
transcription factors | 20 | 170 | 951 | 22651 | 7.137 | 2.802 | 3.207E-05 | 4.937
phosphatases | 4 | 170 | 230 | 22651 | 1.726 | 2.317 | 9.542E-02 | 1.746
kinases | 11 | 170 | 650 | 22651 | 4.878 | 2.255 | 1.018E-02 | 2.823
proteases | 5 | 170 | 552 | 22651 | 4.143 | 1.207 | 3.995E-01 | 0.4279
enzymes | 21 | 170 | 2693 | 22651 | 20.21 | 1.039 | 4.608E-01 | 0.1876
receptors | 9 | 170 | 1492 | 22651 | 11.2 | 0.8037 | 3.107E-01 | -0.6821
ligands | 3 | 170 | 507 | 22651 | 3.805 | 0.7884 | 4.700E-01 | -0.419
other | 97 | 170 | 15628 | 22651 | 117.3 | 0.827 | 6.559E-04 | -3.377
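The Expected, Ratio and z-score columns of Table 13 can be reproduced from the four count columns (Actual, n, R, N). The sketch below assumes a hypergeometric model (sampling n genes without replacement from a background of N genes, R of which belong to the protein class). The specification does not state this formula, so it is an assumption that happens to match the tabulated values, and the function name is hypothetical. The p-value column is not recomputed here.

```python
import math


def enrichment_stats(actual, n, R, N):
    """Return (expected, ratio, z_score) for one protein-class row.

    actual: genes of this class found in the gene list
    n:      size of the gene list
    R:      genes of this class in the background
    N:      size of the background
    """
    expected = n * R / N
    ratio = actual / expected
    # Variance of a hypergeometric count (assumed model, not stated in the text).
    variance = n * (R / N) * (1 - R / N) * (N - n) / (N - 1)
    z_score = (actual - expected) / math.sqrt(variance)
    return expected, ratio, z_score


# First row of Table 13: phosphatases in nulliparous CD44+ cells.
expected, ratio, z = enrichment_stats(actual=33, n=2078, R=230, N=22651)
print(f"{expected:.2f} {ratio:.3f} {z:.3f}")
```

For this row the computed values agree with the tabulated 21.1, 1.564 and 2.732 to the printed precision. The same holds for the enzymes row in parous stromal cells (z-score 11.78).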
[0283] The analysis was further focused on CD44+ cells, which
showed the most pronounced differences between parous and
nulliparous states. Pathways highly active in nulliparous samples
are related to major developmental and tumorigenic pathways
including cytoskeleton remodeling, chemokines and cell adhesion,
and WNT signaling (FIG. 13 and Table 10), whereas pathways more
active in parous samples include PI3K/AKT signaling and apoptosis
(FIG. 14 and Table 10). Importantly, the highest-scoring pathway for
genes highly expressed in nulliparous samples is four orders of
magnitude more statistically significant than the highest-scoring
pathway for genes highly expressed in parous samples, suggesting that
downregulation of protumorigenic developmental pathways is a prominent
feature of CD44+ cells from parous women. Interactome analysis also
demonstrated a much larger number of overconnected proteins in the
nulliparous than in the parous state in all four cell types, but
particularly in CD44+ cells (FIG. 12). As the relative number of
interactions (connectivity) is directly related to the functional
activity of the dataset [Nikolsky, Y., et al. (2008) Cancer Res 68,
9532-9540], this result suggested that parous cells are overall
substantially less active than nulliparous ones.
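The "four orders of magnitude" comparison can be checked with a short calculation. The sketch below takes the smallest p-values visible among the pathway rows in this section: 1.05E-09 in the rows preceding Table 12 and 2.20E-05 in Table 12. It is an assumption that these two rows are the top-scoring pathways the paragraph refers to, since the paragraph itself cites Table 10 and FIGS. 13 and 14. The helper name is hypothetical.

```python
import math

# Smallest p-values among the pathway rows listed in this section
# (assumed to be the top-scoring pathway of each group).
p_top_nulliparous = 1.05e-09  # Cytoskeleton remodeling_Cytoskeleton remodeling
p_top_parous = 2.20e-05       # Chemoresistance pathways ... small cell lung cancer


def orders_of_magnitude(p_small, p_large):
    """Number of powers of ten separating two p-values."""
    return math.log10(p_large / p_small)


gap = orders_of_magnitude(p_top_nulliparous, p_top_parous)
print(f"{gap:.1f}")
```

Under these assumptions the gap is a little over four powers of ten, in line with the statement in the text.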
[0284] Because pregnancy-induced protection against breast cancer
is also observed in rodents, it was investigated whether pathways
altered by parity are conserved across species. Pathways in CD44+
cells were compared to those generated from genes differentially
expressed between virgin and parous rats [Blakely et al., 2006,
supra; D'Cruz, C. M., and Chodosh, L. A. (2006) Cancer Res 66,
6421-6431]. Significant overlap was found between pathways highly
active in nulliparous and virgin samples (thus, downregulated in
parous), but almost nothing in common was found among those highly
active in parous tissues. The top ranked pathways were all related
to cytoskeleton remodeling and cell adhesion, known to be highly
relevant in stem cells (FIG. 15A and FIG. 15B). Thus, pregnancy
appears to induce similar alterations in the mammary epithelium
regardless of species. A network built of the common pathways
included a complete NOTCH pathway (including NOTCH1 (GenBank
Accession Nos. AB209873, AF308602, AL592301, BC013208),
NOTCH1-NICD, ADAM17 (GenBank Accession Nos. BM725368, BQ186514),
gamma-secretase complex (PSENEN, GenBank Accession Nos. AF220053,
BQ222622), APH1A (GenBank Accession Nos. BC020590, BI760743,
DC365601), and APH1B (GenBank Accession Nos. AC016207, AI693802)),
IGF1 (GenBank Accession Nos. AB209184, AC010202), EGF (GenBank
Accession Nos. AC004050, AC005509), CD44 (GenBank Accession No.
BC004372), CD9 (GenBank Accession Nos. AI003581, BG291377), and
ITGB1 (GenBank Accession Nos. AI261443, BM973433, BX537407) as
"triggers" (ligands and receptors), c-Src (GenBank Accession Nos.
AF272982, BC051270), PKC (GenBank Accession No. NM_212535),
and FAK (GenBank Accession Nos. AB209083, AK304356) as major
signaling kinases, and c-Jun (GenBank Accession Nos. BC002646,
BC009874), p53 (GenBank Accession Nos. AK223026, DA453049), SNAIL1
(GenBank Accession Nos. BC012910, DA972913), and LEF1 (GenBank
Accession Nos. AC097067, AC118062) as transcription factors.
Example 4
Cell Type-Specific Epigenetic Patterns Related to Parity and their
Functional Relevance
[0285] This example demonstrates that parity has a more pronounced
long-term effect on DNA methylation than on H3 lysine 27
trimethylation (K27) patterns.
[0286] Reduction of breast cancer risk in postmenopausal women
conferred by full-term pregnancy in early adulthood implies the
induction of long-lasting changes such as alterations in cell
type-specific epigenetic patterns. To investigate this hypothesis,
the comprehensive DNA methylation and K27 profiles of CD24+ and
CD44+ cells from nulliparous and parous women were analyzed using
MSDKseq applied to high-throughput sequencing and ChIPseq,
respectively. The data are summarized in Tables 14-17, below.
[0287] Comparison of MSDKseq libraries of nulliparous and parous
samples within each cell type showed a higher number of
significantly (p<0.05) differentially methylated regions (DMRs)
in CD44+ cells and, in both cell types, more DMRs were
hypermethylated in nulliparous than in parous cells (FIG. 16 and
Table 14, below).
[0288] To validate differences in DNA methylation in additional
samples and by other methods, quantitative methylation-specific PCR
(qMSP) analyses of selected genes were performed using CD44+ cells
from multiple nulliparous and parous cases. Despite some
interpersonal variability, statistically significant differences
were detected between nulliparous and parous groups that overall
correlated with MSDKseq data (FIG. 6).
[0289] Table 14 lists genes with a DMR (hypermethylated in parous or
nulliparous samples) in the promoter region or gene body in CD44+
cells. The DMR pattern (which region is hypermethylated in which
sample), gene symbol, RefSeq ID, gene description, chromosomal
location, log10 p-value (calculated by Poisson margin model), log
ratio of averaged nulliparous and parous MSDK-tag counts, scaled
MSDK-tag counts, chromosomal position of BssHII recognition sites,
and distance between BssHII sites and the TSS (plus and minus
indicate downstream and upstream of the TSS, respectively) are shown.
The log10 p-value and log ratio carry a positive or negative sign,
which indicates that the DMR is hypermethylated in parous or
nulliparous samples, respectively.
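The sign conventions described for Table 14 can be written out as a small sketch. The function names are hypothetical, and the handling of an exact zero is an assumption because the text does not cover that case.

```python
def hypermethylated_in(signed_value):
    """Map the sign of a Table 14 log10 p-value or log ratio to a group.

    Positive means the DMR is hypermethylated in parous samples,
    negative means hypermethylated in nulliparous samples.
    """
    if signed_value > 0:
        return "parous"
    if signed_value < 0:
        return "nulliparous"
    return "undetermined"  # zero is not covered by the text (assumption)


def tss_side(distance_bp):
    """Plus means downstream of the TSS, minus means upstream."""
    if distance_bp > 0:
        return "downstream"
    if distance_bp < 0:
        return "upstream"
    return "at TSS"  # zero distance is not covered by the text (assumption)


# Illustrative values only; they are not taken from Table 14.
print(hypermethylated_in(-3.2), tss_side(-1500))
```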
[0290] Global associations between differential gene expression and
presence of DMRs were analyzed in CD44+ and CD24+ cells, but
significant associations were not found, potentially due to the
complex relationship between DNA methylation and transcript levels,
as DNA methylation can both positively (e.g., in gene body) and
negatively (e.g., in promoters) regulate gene expression, depending
on the location relative to the transcription start site.
[0291] The data from the analyses are summarized in Table 15 and
Table 16, below, which list genes that are differentially
methylated between nulliparous and parous CD44+ and CD24+ cells,
respectively, along with SAGEseq, ChIPseq and MSDKseq data for the
listed genes. Significant differences in genes enriched for
H3K27me3 mark were not detected in CD44+ or CD24+ cells from
nulliparous and parous samples. However, genes highly expressed in
CD44+ or CD24+ cells from nulliparous women were not K27-enriched
in either parous or nulliparous cases, implying the potential lack
of their regulation by the PRC2 complex that establishes this
histone mark (see, Tables 15 and 16).
[0292] Overall it appears that parity may have a more pronounced
long-term effect on DNA methylation than on K27 patterns.
[0293] To investigate pathways affected by parity-related
epigenetic alterations, pathways enriched by genes associated with
gene body or promoter DMRs were analyzed in CD44+ cells from
nulliparous and parous samples. Very little overlap was found among
the four distinct categories (FIG. 17). Relatively few pathways
were significantly enriched in both expression and methylation data,
and most of these were related to development, TGF-beta and WNT
signaling.
[0294] The fraction of transcription factors (TFs) among
differentially methylated genes was 2-3 fold higher than expected
and than what was observed among differentially expressed genes,
implying that promoter methylation might be a preferred control
mechanism of their expression. Similar to the expression data, DMRs
in nulliparous samples had higher numbers of overconnected objects
than in parous ones. Gene body DMRs in CD44+ nulliparous cells had
the highest number of overconnected objects and transcription
factors represented a significant fraction of overconnected objects
in promoter hypermethylated DMRs in CD44+ nulliparous cells.
Further, Table 17 lists enriched GeneGo pathway maps for
differentially methylated regions (DMRs) in promoter (-5 to 2 kb)
and gene body (+2 kb to end) in CD44+ cells from human breast
epithelium. The table contains canonical pathway maps with p-values
(<0.05) indicating significance of enrichment for differentially
methylated genes (hypo/hyper methylated) in CD44+
progenitor-enriched cells from nulliparous or parous cases.
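The fold enrichment and enrichment p-values discussed above can be illustrated with a small calculation. The following is a minimal sketch, assuming a hypergeometric (one-sided over-representation) test; the source does not state which test produced the p-values in Table 17, and the counts used here are hypothetical, chosen only so that the fold enrichment falls in the 2-3 fold range mentioned in the text.

```python
from math import comb


def fold_enrichment(n_background, n_category, n_selected, n_overlap):
    """Observed fraction of category genes among selected genes,
    divided by the fraction expected from the background."""
    observed = n_overlap / n_selected
    expected = n_category / n_background
    return observed / expected


def hypergeom_upper_tail(n_background, n_category, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without
    replacement from n_background genes, n_category of which belong
    to the category (e.g., transcription factors or a pathway map)."""
    upper = min(n_category, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_category, i) * comb(n_background - n_category, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts (not taken from the source):
# 20,000 background genes, 1,500 of them TFs,
# 400 differentially methylated genes, 75 of which are TFs.
fold = fold_enrichment(20000, 1500, 400, 75)
p_value = hypergeom_upper_tail(20000, 1500, 400, 75)
print(f"fold enrichment = {fold:.2f}, p = {p_value:.3g}")
```

With these illustrative counts the TF fraction among the selected genes is 2.5 times the background fraction. A pathway map would be tested the same way, with the category defined as the genes on that map.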
TABLE-US-00014 Lengthy table referenced here
US20150285802A1-20151008-T00008 Please refer to the end of the
specification for access instructions.
TABLE-US-00015 Lengthy table referenced here
US20150285802A1-20151008-T00009 Please refer to the end of the
specification for access instructions.
TABLE-US-00016 Lengthy table referenced here
US20150285802A1-20151008-T00010 Please refer to the end of the
specification for access instructions.
TABLE-US-00017 Lengthy table referenced here
US20150285802A1-20151008-T00011 Please refer to the end of the
specification for access instructions.
Example 5
Persistent Parity-Related Decrease of p27+ Cells
[0295] This example demonstrates that the numbers of p27+ and Ki67+
cells are significantly lower in parous than in nulliparous breast
tissues.
[0296] As discussed in the Examples above, CDKN1B, which encodes
p27, was one of the most significantly differentially expressed
genes in CD44.sup.+ cells from nulliparous and parous tissues (high
in nulliparous) and also from control and BRCA1/2 parous tissues
(high in BRCA1/2).
[0297] The global profiling results were validated in intact breast
epithelium at the single cell level using multicolor
immunofluorescence assays for the combined detection of CD24, CD44,
and top differentially expressed genes. Genes were selected based
on significance of difference between nulliparous and parous groups
and antibody availability. A marked decrease was found in the
expression of p27, Sox17, and Cox2 in parous compared to
nulliparous samples. The levels of expression of these markers were
lower in breast epithelial cells of parous women compared to
nulliparous women (FIG. 18 and FIG. 19).
[0298] p27 has been reported to affect the number and proliferation
of stem cells and progenitors in several organs. Thus, the decrease
of p27+ cells in parous tissues may indicate that the number or
proliferative potential of breast epithelial progenitors is
decreased. To investigate this issue, immunofluorescence analysis
was performed for Ki67, a proliferation marker expressed in cycling
cells, alone and in combination with p27. Using this approach it
was observed that the number of Ki67+ cells was significantly lower
in parous samples and a small subset of cells was Ki67+p27+ (FIG.
19).
[0299] The tissue samples used for the global profiling studies
above (Example 3) were obtained from premenopausal women, since the
protective effects of pregnancy against breast cancer are likely to
be established early, even though they are manifested after
menopause. However, to confirm that the parity-related differences
detected in premenopausal women were maintained and could be
detected even after menopause, the expression of p27, Sox17, and
Cox2 was analyzed by immunofluorescence and immunohistochemistry in
breast tissue samples from postmenopausal women. Although the
observed differences between nulliparous and parous postmenopausal
samples were less pronounced, the numbers of p27+ and Ki67+ cells
were still significantly lower in parous than in nulliparous
tissues (FIG. 20). This observation also suggested that the
differences in the numbers of p27+ and Ki67+ cells between parous
and nulliparous tissues in premenopausal women were not likely due
to differences in the phase of the menstrual cycle between groups,
as postmenopausal tissues showed similar differences for these
markers.
Example 6
Link Between Parity-Related Differences and Mammographic
Density
[0300] This example demonstrates that p27+ cells are a marker of
both parity status and mammographic density, and a strong marker
for breast cancer risk prediction.
[0301] Mammographic density is one of the most significant risk
factors for breast cancer, yet its molecular basis is unknown.
Mammographic density is higher in nulliparous women and declines
after pregnancy; thus, some of the parity-related differences
detected may also be linked to differences in mammographic density.
To test this hypothesis, the expression levels of p27, Sox17, Cox2,
and Ki67 were analyzed in biopsy samples obtained from high and low
density areas of the same breast [Lin, et al. (2011) Breast Cancer
Res Treat 128, 505-516]. The overall expression levels of Sox17,
Cox2, p27, and Ki67 were not significantly different between low and
high-density areas, but the number of p27+ cells was higher in
high-density areas (FIG. 21). Thus, the number of p27+ cells is a
marker of both parity status and mammographic density, and because
both of these are linked to breast cancer risk, it can be used for
breast cancer risk prediction.
Example 7
p27.sup.+ Cells are Quiescent Hormone-Responsive Cells with
Progenitor Features
[0302] This example demonstrates that a subset of p27+ cells may
represent quiescent hormone-responsive progenitors that are the
potential cell-of-origin of breast cancer.
[0303] The mutually exclusive expression of Ki67 and p27 in breast
epithelial cells with their concomitant decrease in parous compared
to nulliparous women implied coordinated regulation and that they
may represent actively cycling and quiescent cells with
proliferative potential, respectively. Ovarian hormones are the
best-understood regulators of breast epithelial cell proliferation
and also breast cancer risk. Correlating with this, the gene
expression data (Example 2) indicated a decrease in androgen
receptor (AR) and AR targets in CD44.sup.+ cells from parous women
(Table 4) and prior studies implied a decrease in ER+ breast
epithelial cells in parous compared to nulliparous women. To
explore the potential hormonal regulation of p27+ breast epithelial
cells, the expression of ER, AR, and p27 was analyzed in breast
tissue samples from women with varying parity and hormonal status.
These included control nulliparous and parous women, BRCA1/2
mutation carriers, breast biopsy tissues from women in early (8-10
weeks) and late (22-26 weeks) stage of pregnancy, and premenopausal
women in the follicular and luteal phases of the menstrual cycle or
from women undergoing ovarian hyperstimulation prior to oocyte
collection for in vitro fertilization (samples were collected at the
time of oocyte collection). For each case, multiple different
regions of the same slide or breast tissue sample were analyzed in
order to minimize differences due to the known tissue heterogeneity
even in the same woman. Interestingly, it was found that nearly all
p27+ cells were also ER+, and their numbers were the highest in
BRCA1/2 mutation carriers and the lowest in biopsy samples from
pregnant women and after ovarian hyperstimulation, where both
ovarian hormone and hCG (human choriogonadotropin) levels are the
highest (FIG. 22). The frequencies of p27+ cells, ER+ cells, and
p27+ER+ cells were also higher in control nulliparous compared to
parous women and in follicular relative to luteal phase of the
menstrual cycle (FIG. 22). Overall similar observations were made
for AR (FIG. 23A), although the overlap between p27 and AR was less
pronounced compared to that between p27 and ER (FIG. 23B). The high
fraction of AR+ cells in BRCA1 mutation carriers is particularly
interesting since AR is a genetic modifier of BRCA1-associated
breast cancer risk.
[0304] To further investigate the relationship between the numbers
of p27.sup.+ cells and ovarian hormone-induced breast epithelial
cell proliferation, immunofluorescence analysis for p27 and Ki67
was performed in tissue samples with the highest differences in
hormone levels. Correlating with prior data, the frequency of
Ki67.sup.+ cells was the highest in the luteal phase of the
menstrual cycle when both estrogen and progesterone levels are high
(FIG. 23B). Samples from early pregnancy had a lower fraction of
proliferating Ki67.sup.+ cells and the number of these cells was
the lowest in the follicular phase. The frequency of p27.sup.+
cells displayed an inverse correlation with that of Ki67.sup.+
cells: it was the highest in the follicular phase and lowest in
biopsies from oocyte donors (breast tissue biopsies were taken at
the time of oocyte collection) (FIG. 23B). Interestingly, a low but
detectable fraction of p27.sup.+ cells was also Ki67.sup.+ in the
luteal phase and early pregnancy, potentially marking proliferating
progenitors in early G1 phase of the cell cycle when p27 and Ki67
can overlap. The differences in the frequency of p27.sup.+ and
Ki67.sup.+ cells between the follicular and luteal phases were less
significant in parous compared to nulliparous women in part due to
the lower overall fractions of these cells in parous cases (FIG.
23C).
[0305] These results show that a subset of p27.sup.+ cells represent
quiescent hormone-responsive luminal progenitors and that their
frequency relates to the risk of breast cancer.
Example 8
Functional Validation of Parity-Related Differences in Signaling
Pathways
[0306] This example demonstrates that the decreased activity of
stem cell-related pathways following pregnancy leads to decreased
Ki67+ and p27+ cells in parous women.
[0307] Several signaling pathways less active in CD44+ parous cells
were related to stem cell maintenance and cell proliferation (FIG.
11). To investigate if inhibition of these pathways affects the
number of proliferating cells, normal breast tissues were incubated
in a tissue explant culture model with inhibitors or agonists of
selected pathways (e.g., cAMP, EGFR, Cox2, Hh, TGF.beta., Wnt, and
IGFR) for 8-10 days. Inhibitors of irrelevant pathways (e.g., PARP
inhibitor) as additional negative controls were also tested. For
each case, three different pieces of breast tissue taken from
different regions of the same breast were cultured, to minimize
variability due to tissue heterogeneity. The number of p27+ cells
and cellular proliferation based on bromodeoxyuridine (BrdU)
incorporation (marks cells in S phase of the cell cycle) and Ki67
expression (marks cycling cells irrespective of cell cycle phase)
were then assessed.
[0308] Tissue architecture and cellular viability were maintained
and p27+, Ki67+, and BrdU+ cells were detected in all conditions.
It was found that inhibition of cAMP, EGFR, Cox2, Hh, and IGFR
signaling significantly (p<0.05) decreased the number of cells
incorporating BrdU whereas the TGFBR inhibitor had the opposite
effect (FIG. 24). Inhibition of EGFR and Cox2, and, to a lesser
degree, Wnt and IGFR, decreased the fraction of Ki67+ cells,
whereas the frequency of p27+ cells most pronouncedly decreased
following IGFR and TGFBR inhibitor treatment. It was also confirmed
that the compounds effectively inhibited the activity of the
intended pathways (FIG. 25 and FIG. 26) and that the selected
pathways were active in p27+ cells.
[0309] To determine whether the numbers and the proliferation of
p27+ cells are regulated by ER and estrogen signaling, the fractions
of p27+ and Ki67+ cells in tissue slices treated with varying
concentrations of ovarian hormones or tamoxifen were analyzed. To
correlate the tissue slice data with what was observed under
physiologic conditions (FIG. 22), estrogen, progesterone,
prolactin, and hCG hormone levels that mimic serum levels in the
follicular or luteal phases of the menstrual cycle or in
mid-pregnancy were used. It was observed that the numbers of p27+
cells were high in sections treated with concentrations of estrogen
present in follicular phase and also following tamoxifen treatment,
whereas they decreased following IGFR and TGFBR inhibitor treatment
(FIG. 24). Cultures incubated with luteal phase and pregnancy level
hormones (FIG. 26B and FIG. 26C). These data further demonstrated
that a subset of p27+ cells are hormone-responsive luminal
progenitors.
[0310] Most importantly, the expression of phosphoSmad2 (pSmad2), a
key mediator of TGF.beta. signaling, demonstrated a nearly complete
overlap with that of p27, implying that TGF.beta. is essential for
maintaining these cells in a quiescent state, possibly via
modulating p27 (FIG. 25). These results imply that the decreased
activity of
these stem cell-related pathways following pregnancy may lead to
decreased Ki67+ and p27+ cells in parous women. Furthermore, the
data also suggested a direct role for these signaling pathways in
regulating breast epithelial cell proliferation where TGF.beta.
acts as a growth inhibitor and the other pathways are
mitogenic.
Example 9
Relevance of Parity to Breast Cancer Risk and Prognosis
[0311] The present example demonstrates that parity influences both
the risk and prognosis of ER+ breast tumors.
[0312] Based on the profiling data above (Example 3), it is
presently demonstrated that breast epithelial cells with progenitor
features are different in nulliparous and parous women. If these
cells serve as cell-of-origin for breast cancer then breast tumors
developing in parous and nulliparous women might also be different,
and this might impact their gene expression profiles and clinical
outcome. To test these hypotheses, the effect of parity on breast
cancer-specific survival was investigated in the Nurses' Health
Study (NHS). Overall, Kaplan-Meier curves showed that there was no
significant association between parity and breast cancer-specific
survival (p=0.29). However, when the analysis was limited to ER+
tumors, it was found that nulliparous women had suggestively worse
survival compared with parous women (FIG. 27). In multivariate
analysis there was still a marginally significant association among
women with ER+ tumors, with nulliparous women having a nearly 30%
increased risk of death from their disease (HR: 1.29, 95% CI: 0.98,
1.70; p=0.06). Assessing associations between age at first
pregnancy and number of pregnancies gave similar results. In
contrast, among women with ER- tumors, parity was not associated
with breast cancer-specific survival (p=0.51). Thus, parity
influences both the risk and prognosis of ER+ breast tumors.
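As a plausibility check on the multivariate result quoted above, the reported hazard ratio and confidence interval can be related to the reported p-value. The following is a minimal sketch, assuming a two-sided Wald test on the log hazard ratio and a symmetric 95% interval on the log scale; the source does not state how its p-value was computed, and the function name is illustrative only.

```python
from math import erfc, log, sqrt


def p_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate two-sided Wald p-value from a hazard ratio and its
    95% confidence interval (assumed symmetric on the log scale)."""
    se = (log(upper) - log(lower)) / (2 * z_crit)  # SE of log(HR)
    z = log(hr) / se
    return erfc(abs(z) / sqrt(2))  # two-sided normal tail probability


hr, lower, upper = 1.29, 0.98, 1.70  # values reported in the text
excess_risk_pct = (hr - 1) * 100     # the "nearly 30%" increase
p_approx = p_from_hr_ci(hr, lower, upper)
print(f"excess risk = {excess_risk_pct:.0f}%, approximate p = {p_approx:.2f}")
```

The back-calculated value is close to 0.07, which is in line with the reported p=0.06 once rounding of the hazard ratio and interval bounds to two decimals is taken into account. Because the interval includes 1, the association is marginal rather than significant at the 0.05 level.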
[0313] Because pregnancy may not induce the same epigenetic and
gene expression changes in all women, due to germline variations,
it was next investigated if the parity-related gene expression
signature (PAGES) in CD44+ cells might be a more useful prognostic
marker than parity status alone. Thus, the expression of PAGES was
analyzed in public breast cancer gene expression data with clinical
outcome. Supervised principal component analysis (SPCA) was
applied to one of the cohorts (Wang) as a training set (FIG. 28) to
identify the subset of the PAGES with prognostic value, followed by
validation in three other cohorts (Desmedt et al., supra; Sotiriou
et al., supra; van de Vijver et al., supra), the data for which are
shown in FIGS. 29A-C. In each dataset ER+ tumors, the tumor
subtype affected by parity, and cases without systemic therapy were
selected in order to avoid differences due to treatment. All
patients in the training set had small (<2 cm), lymph node
negative tumors at the time of diagnosis. Using this approach,
parity/nulliparity-related gene signatures were identified that
split patients into two distinct groups with significant survival
difference. The genes included in the prognostic signature are
summarized in Table 18, which shows the gene symbol, gene
description, gene expression pattern (i.e., high in parous or
nulliparous samples), and prognostic value (good or bad prognosis)
for each of the genes. Interestingly, such prognostic signatures
were found among genes highly expressed in both nulliparous and
parous samples, and each set of genes could be further separated
into good and bad signatures. These results reflect the complex
relationship
between pregnancy and breast cancer that involves both protective
and tumor-promoting effects.
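The general idea of supervised principal component analysis can be sketched as follows: genes are first ranked by their univariate association with outcome, the top-ranked genes are retained, and the first principal component of the retained genes is used as a risk score that splits patients into two groups. This is a minimal sketch under stated assumptions, not the analysis performed in the study: it uses synthetic data, substitutes Pearson correlation with a continuous outcome for the survival-based scores normally used with censored data, and all names (supervised_pc_scores, n_keep) are hypothetical.

```python
import random
from statistics import mean, median


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den


def supervised_pc_scores(X, y, n_keep, n_iter=100):
    """Return (indices of retained genes, first-PC score per sample)."""
    n, p = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    # Step 1: rank genes by strength of univariate association with outcome.
    strength = [abs(pearson(col, y)) for col in cols]
    keep = sorted(range(p), key=lambda j: -strength[j])[:n_keep]
    # Step 2: center the retained genes.
    means = [mean(cols[j]) for j in keep]
    Z = [[X[i][j] - m for j, m in zip(keep, means)] for i in range(n)]
    # Step 3: leading principal component by power iteration on Z^T Z.
    w = [1.0] * n_keep
    for _ in range(n_iter):
        s = [sum(z * wk for z, wk in zip(row, w)) for row in Z]
        w = [sum(Z[i][k] * s[i] for i in range(n)) for k in range(n_keep)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
    scores = [sum(z * wk for z, wk in zip(row, w)) for row in Z]
    return keep, scores


# Synthetic data: 60 samples, 200 genes; the first 10 genes track a
# hidden factor that also drives the (continuous) outcome.
rng = random.Random(0)
n_samples, n_genes = 60, 200
latent = [rng.gauss(0, 1) for _ in range(n_samples)]
X = [[rng.gauss(0, 1) + (latent[i] if j < 10 else 0.0) for j in range(n_genes)]
     for i in range(n_samples)]
y = [latent[i] + 0.5 * rng.gauss(0, 1) for i in range(n_samples)]

keep, scores = supervised_pc_scores(X, y, n_keep=10)
cut = median(scores)
high_group = [s > cut for s in scores]  # two groups split at the median score
```

The sign of a principal component is arbitrary, so which of the two groups corresponds to the worse outcome has to be determined from the outcome data in the training set before the signature is applied to validation cohorts.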
TABLE-US-00018 TABLE 18 Genes Included In Prognostic Parity/Nulliparity Gene Signature
Gene Symbol | Description | Expression | Prognosis
A2M | alpha-2-macroglobulin | nulliparous | bad
ABLIM1 | actin binding LIM protein 1 | nulliparous | bad
ADNP | activity-dependent neuroprotector homeobox | parous | bad
APPBP2 | amyloid beta precursor protein (cytoplasmic tail) binding protein 2 | parous | bad
AQP1 | aquaporin 1 (Colton blood group) | nulliparous | bad
ARID5B | AT rich interactive domain 5B (MRF1-like) | nulliparous | bad
ASF1B | ASF1 anti-silencing function 1 homolog B (S. cerevisiae) | parous | bad
AZGP1 | alpha-2-glycoprotein 1, zinc-binding pseudogene 1; alpha-2-glycoprotein 1, zinc-binding | nulliparous | bad
B3GNT2 | UDP-GlcNAc: betaGal beta-1,3-N-acetylglucosaminyltransferase 1; UDP-GlcNAc: betaGal beta-1,3-N-acetylglucosaminyltransferase 2 | nulliparous | bad
BACE2 | beta-site APP-cleaving enzyme 2 | nulliparous | bad
BIRC5 | baculoviral IAP repeat-containing 5 | parous | bad
C11orf60 | chromosome 11 open reading frame 60 | nulliparous | bad
C12orf48 | chromosome 12 open reading frame 48 | parous | bad
C19orf56 | chromosome 19 open reading frame 56 | nulliparous | bad
CCDC101 | coiled-coil domain containing 101 | nulliparous | bad
CCL2 | chemokine (C-C motif) ligand 2 | nulliparous | bad
CCNI | cyclin I | nulliparous | bad
CCT2 | chaperonin containing TCP1, subunit 2 (beta) | parous | bad
CD44 | CD44 molecule (Indian blood group) | nulliparous | bad
CENPA | centromere protein A | parous | bad
CHEK1 | CHK1 checkpoint homolog (S. pombe) | parous | bad
CIR1 | corepressor interacting with RBPJ | nulliparous | bad
CLPB | ClpB caseinolytic peptidase B homolog (E. coli) | parous | bad
CNN3 | calponin 3, acidic | nulliparous | bad
CSTB | cystatin B (stefin B) | nulliparous | bad
CTDSP1 | CTD (carboxy-terminal domain, RNA polymerase II, polypeptide A) small phosphatase 1 | nulliparous | bad
CTDSPL | CTD (carboxy-terminal domain, RNA polymerase II, polypeptide A) small phosphatase-like | nulliparous | bad
CTPS | CTP synthase | parous | bad
CXCL12 | chemokine (C-X-C motif) ligand 12 (stromal cell-derived factor 1) | nulliparous | bad
DARC | Duffy blood group, chemokine receptor | nulliparous | bad
DDX39 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 39 | parous | bad
DEF6 | differentially expressed in FDCP 6 homolog (mouse) | nulliparous | bad
DULLARD | dullard homolog (Xenopus laevis) | nulliparous | bad
DUSP4 | dual specificity phosphatase 4 | nulliparous | bad
EEF1A2 | eukaryotic translation elongation factor 1 alpha 2 | parous | bad
EFNA4 | ephrin-A4 | nulliparous | bad
EIF3G | eukaryotic translation initiation factor 3, subunit G | nulliparous | bad
F3 | coagulation factor III (thromboplastin, tissue factor) | nulliparous | bad
FBLN1 | fibulin 1 | nulliparous | bad
FBXO7 | F-box protein 7 | nulliparous | bad
FBXW4 | F-box and WD repeat domain containing 4 | nulliparous | bad
FLOT1 | flotillin 1 | nulliparous | bad
FTO | fat mass and obesity associated | nulliparous | bad
GAPVD1 | GTPase activating protein and VPS9 domains 1 | parous | bad
GGT5 | gamma-glutamyltransferase 5 | nulliparous | bad
GINS1 | GINS complex subunit 1 (Psf1 homolog) | parous | bad
GNB2L1 | guanine nucleotide binding protein (G protein), beta polypeptide 2-like 1 | nulliparous | bad
GOLM1 | golgi membrane protein 1 | nulliparous | bad
GSTK1 | glutathione S-transferase kappa 1 | nulliparous | bad
GSTP1 | glutathione S-transferase pi 1 | nulliparous | bad
GYPC | glycophorin C (Gerbich blood group) | nulliparous | bad
HEATR2 | HEAT repeat containing 2 | parous | bad
HIGD2A | HIG1 hypoxia inducible domain family, member 2A | nulliparous | bad
HLA-DPA1 | major histocompatibility complex, class II, DP alpha 1 | nulliparous | bad
HNRNPA0 | heterogeneous nuclear ribonucleoprotein A0 | nulliparous | bad
IGFBP4 | insulin-like growth factor binding protein 4 | nulliparous | bad
IMP3 | IMP3, U3 small nucleolar ribonucleoprotein, homolog (yeast) | nulliparous | bad
INPP1 | inositol polyphosphate-1-phosphatase | nulliparous | bad
ITM2A | integral membrane protein 2A | nulliparous | bad
JOSD1 | Josephin domain containing 1 | nulliparous | bad
KIAA0101 | KIAA0101 | parous | bad
KIAA0406 | KIAA0406 | parous | bad
LITAF | lipopolysaccharide-induced TNF factor | nulliparous | bad
LRIG1 | leucine-rich repeats and immunoglobulin-like domains 1 | nulliparous | bad
LSM2 | LSM2 homolog, U6 small nuclear RNA associated (S. cerevisiae) | nulliparous | bad
MCF2L | MCF.2 cell line derived transforming sequence-like | parous | bad
MGMT | O-6-methylguanine-DNA methyltransferase | nulliparous | bad
MNAT1 | menage a trois homolog 1, cyclin H assembly factor (Xenopus laevis) | parous | bad
NAP1L1 | nucleosome assembly protein 1-like 1 | nulliparous | bad
NFYC | nuclear transcription factor Y, gamma | nulliparous | bad
NUPR1 | nuclear protein, transcriptional regulator, 1 | nulliparous | bad
PALM | paralemmin | nulliparous | bad
PIK3IP1 | phosphoinositide-3-kinase interacting protein 1 | nulliparous | bad
PNRC1 | proline-rich nuclear receptor coactivator 1 | nulliparous | bad
POP1 | processing of precursor 1, ribonuclease P/MRP subunit (S. cerevisiae) | parous | bad
PPM1D | protein phosphatase 1D magnesium-dependent, delta isoform | parous | bad
PRC1 | protein regulator of cytokinesis 1 | parous | bad
PSAP | prosaposin | nulliparous | bad
PYCRL | pyrroline-5-carboxylate reductase-like | parous | bad
RACGAP1 | Rac GTPase activating protein 1 pseudogene; Rac GTPase activating protein 1 | parous | bad
RCOR3 | REST corepressor 3 | nulliparous | bad
RECQL4 | RecQ protein-like 4 | parous | bad
RNF146 | ring finger protein 146 | nulliparous | bad
RPL15 | ribosomal protein L15 pseudogene 22; ribosomal protein L15 pseudogene 18; ribosomal protein L15 pseudogene 17; ribosomal protein L15 pseudogene 3; ribosomal protein L15 pseudogene 7; ribosomal protein L15 | nulliparous | bad
RPL22 | ribosomal protein L22 pseudogene 11; ribosomal protein L22 | nulliparous | bad
RPLP2 | ribosomal protein, large, P2 pseudogene 3; ribosomal protein, large, P2 | nulliparous | bad
RPS6KA1 | ribosomal protein S6 kinase, 90 kDa, polypeptide 1 | nulliparous | bad
RRP15 | ribosomal RNA processing 15 homolog (S. cerevisiae) | parous | bad
SCRIB | scribbled homolog (Drosophila) | parous | bad
SEPP1 | selenoprotein P, plasma, 1 | nulliparous | bad
SLC17A9 | solute carrier family 17, member 9 | parous | bad
SLC25A28 | solute carrier family 25, member 28 | nulliparous | bad
SLC25A6 | solute carrier family 25 (mitochondrial carrier; adenine nucleotide translocator), member 6 | nulliparous | bad
SLC35B1 | solute carrier family 35, member B1 | parous | bad
SPC25 | SPC25, NDC80 kinetochore complex component, homolog (S. cerevisiae) | parous | bad
SRGAP2 | SLIT-ROBO Rho GTPase activating protein 2 | parous | bad
STMN1 | stathmin 1 | parous | bad
SYNGR3 | synaptogyrin 3 | parous | bad
TIMM17A | translocase of inner mitochondrial membrane 17 homolog A (yeast) | parous | bad
TNFRSF11 | tumor necrosis factor receptor superfamily, member 11b | nulliparous | bad
TNNT3 | troponin T type 3 (skeletal, fast) | nulliparous | bad
TPT1 | similar to tumor protein, translationally-controlled 1; tumor protein, translationally-controlled 1 | nulliparous | bad
TRIP10 | thyroid hormone receptor interactor 10 | nulliparous | bad
TSPAN7 | tetraspanin 7 | nulliparous | bad
TXNIP | thioredoxin interacting protein | nulliparous | bad
UBE3C | ubiquitin protein ligase E3C | parous | bad
UCKL1 | uridine-cytidine kinase 1-like 1 | parous | bad
USP32 | similar to TBC1 domain family, member 3; ubiquitin specific peptidase 32 | parous | bad
YWHAH | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, eta polypeptide | nulliparous | bad
ZC3H3 | zinc finger CCCH-type containing 3 | parous | bad
ZFP36L1 | zinc finger protein 36, C3H type-like 1 | nulliparous | bad
ZFP36L2 | zinc finger protein 36, C3H type-like 2 | nulliparous | bad
ACY1 | aminoacylase 1 | parous | good
AGGF1 | angiogenic factor with G patch and FHA domains 1 | parous | good
AGK | acylglycerol kinase | nulliparous | good
AMIGO2 | adhesion molecule with Ig-like domain 2 | nulliparous | good
ANKRD46 | ankyrin repeat domain 46 | nulliparous | good
APOD | apolipoprotein D | parous | good
APOL1 | apolipoprotein L, 1 | parous | good
APOL3 | apolipoprotein L, 3 | parous | good
ARHGAP11 | Rho GTPase activating protein 11B; Rho GTPase activating protein 11A | parous | good
ATG4B | ATG4 autophagy related 4 homolog B (S. cerevisiae) | parous | good
AZIN1 | antizyme inhibitor 1 | nulliparous | good
B3GALNT1 | beta-1,3-N-acetylgalactosaminyltransferase 1 (globoside blood group) | nulliparous | good
C13orf34 | chromosome 13 open reading frame 34 | parous | good
CBX3 | similar to chromobox homolog 3; chromobox homolog 3 | nulliparous | good
CD79A | CD79a molecule, immunoglobulin-associated alpha | parous | good
CEACAM5 | carcinoembryonic antigen-related cell adhesion molecule 5 | parous | good
CHCHD3 | coiled-coil-helix-coiled-coil-helix domain containing 3 | nulliparous | good
CNBP | CCHC-type zinc finger, nucleic acid binding protein | parous | good
CNIH | cornichon homolog (Drosophila) | nulliparous | good
COBRA1 | cofactor of BRCA1 | nulliparous | good
COQ2 | coenzyme Q2 homolog, prenyltransferase (yeast) | nulliparous | good
COX6A1 | cytochrome c oxidase subunit VIa polypeptide 1 | nulliparous | good
CSTF1 | cleavage stimulation factor, 3' pre-RNA, subunit 1, 50 kDa | nulliparous | good
CYC1 | cytochrome c-1 | nulliparous | good
DCPS | decapping enzyme, scavenger | parous | good
DPM1 | dolichyl-phosphate mannosyltransferase polypeptide 1, catalytic subunit | nulliparous | good
DYNLL1 | dynein, light chain, LC8-type 1 | parous | good
E2F5 | E2F transcription factor 5, p130-binding | nulliparous | good
EFR3A | EFR3 homolog A (S. cerevisiae) | nulliparous | good
EIF3J | eukaryotic translation initiation factor 3, subunit J | parous | good
ERO1L | ERO1-like (S. cerevisiae) | nulliparous | good
FAM164A | family with sequence similarity 164, member A | nulliparous | good
FAM55C | family with sequence similarity 55, member C | parous | good
FEN1 | flap structure-specific endonuclease 1 | nulliparous | good
FLRT3 | fibronectin leucine rich transmembrane protein 3 | nulliparous | good
GLG1 | golgi apparatus protein 1 | parous | good
GUF1 | GUF1 GTPase homolog (S. cerevisiae) | parous | good
HAUS5 | HAUS augmin-like complex, subunit 5 | parous | good
HDGFRP3 | hepatoma-derived growth factor, related protein 3 | nulliparous | good
HLA-B | major histocompatibility complex, class I, C; major histocompatibility complex, class I, B | parous | good
HLA-DOB | major histocompatibility complex, class II, DO beta | parous | good
HMGB2 | high-mobility group box 2 | nulliparous | good
INPP5D | inositol polyphosphate-5-phosphatase, 145 kDa | parous | good
INVS | inversin | parous | good
ITCH | itchy E3 ubiquitin protein ligase homolog (mouse) | parous | good
KCNG2 | potassium voltage-gated channel, subfamily G, member 2 | parous | good
KDELR2 | KDEL (Lys-Asp-Glu-Leu) endoplasmic reticulum protein retention receptor 2 | nulliparous | good
KIAA0391 | KIAA0391 | nulliparous | good
LAPTM4B | lysosomal protein transmembrane 4 beta | nulliparous | good
LARP4 | La ribonucleoprotein domain family, member 4 | nulliparous | good
LILRB1 | leukocyte immunoglobulin-like receptor, subfamily B (with TM and ITIM domains), member 1 | parous | good
MAP3K7IP | mitogen-activated protein kinase kinase kinase 7 interacting protein 1 | parous | good
METT11D1 | methyltransferase 11 domain containing 1; similar to methyltransferase 11 domain containing 1 isoform 2 | parous | good
MLLT11 | myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila); translocated to, 11 | nulliparous | good
MLX | MAX-like protein X | parous | good
MTDH | metadherin | nulliparous | good
NDRG4 | NDRG family member 4 | nulliparous | good
NDUFA4 | NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 4, 9 kDa | nulliparous | good
NFS1 | NFS1 nitrogen fixation 1 homolog (S. cerevisiae) | nulliparous | good
NRAS | neuroblastoma RAS viral (v-ras) oncogene homolog | nulliparous | good
P4HA2 | prolyl 4-hydroxylase, alpha polypeptide II | nulliparous | good
PHF1 | PHD finger protein 1 | parous | good
PIK3CG | phosphoinositide-3-kinase, catalytic, gamma polypeptide | parous | good
PLEKHF2 | pleckstrin homology domain containing, family F (with FYVE domain) member 2 | nulliparous | good
PLOD3 | procollagen-lysine, 2-oxoglutarate 5-dioxygenase 3 | parous | good
PNP | nucleoside phosphorylase | nulliparous | good
PNPLA2 | patatin-like phospholipase domain containing 2 | parous | good
PPP1CC | protein phosphatase 1, catalytic subunit, gamma isoform | nulliparous | good
PPP3R1 | protein phosphatase 3 (formerly 2B), regulatory subunit B, alpha isoform | nulliparous | good
PRPF31 | PRP31 pre-mRNA processing factor 31 homolog (S. cerevisiae) | nulliparous | good
PSMA2 | proteasome (prosome, macropain) subunit, alpha type, 2 | nulliparous | good
PSMA3 | proteasome (prosome, macropain) subunit, alpha type, 3 | nulliparous | good
PSMA4 | proteasome (prosome, macropain) subunit, alpha type, 4 | nulliparous | good
PSMA6 | proteasome (prosome, macropain) subunit, alpha type, 6 | nulliparous | good
PSMD4 | proteasome (prosome, macropain) 26S subunit, non-ATPase, 4 | nulliparous | good
PUF60 | poly-U binding splicing factor 60 KDa | nulliparous | good
RALA | v-ral simian leukemia viral oncogene homolog A (ras related) | nulliparous | good
RBBP7 | retinoblastoma binding protein 7 | nulliparous | good
RFC3 | replication factor C (activator 1) 3, 38 kDa | nulliparous | good
RHBDL1 | rhomboid, veinlet-like 1 (Drosophila) | parous | good
RINT1 | RAD50 interactor 1 | parous | good
RNASEH1 | ribonuclease H1 | parous | good
RNF125 | ring finger protein 125 | parous | good
RPS11 | ribosomal protein S11 pseudogene 5; ribosomal protein S11 | parous | good
RPS6 | ribosomal protein S6 pseudogene 25; ribosomal protein S6; ribosomal protein S6 pseudogene 1 | parous | good
RRAGA | Ras-related GTP binding A | parous | good
SAPS3 | SAPS domain family, member 3 | parous | good
SCNN1B | sodium channel, nonvoltage-gated 1, beta | nulliparous | good
SHMT2 | serine hydroxymethyltransferase 2 (mitochondrial) | nulliparous | good
SKA1 | chromosome 18 open reading frame 24 | parous | good
SLC25A32 | solute carrier family 25, member 32 | nulliparous | good
SRP19 | signal recognition particle 19 kDa | nulliparous | good
ST20 | suppressor of tumorigenicity 20 | parous | good
STAU1 | staufen, RNA binding protein, homolog 1 (Drosophila) | nulliparous | good
STX3 | syntaxin 3 | nulliparous | good
THAP4 | THAP domain containing 4 | parous | good
TIMELESS | timeless homolog (Drosophila) | nulliparous | good
TMCO1 | transmembrane and coiled-coil domains 1 | nulliparous | good
TMED9 | transmembrane emp24 protein transport domain containing 9 | nulliparous | good
TMEM158 | transmembrane protein 158 | nulliparous | good
TMEM222 | transmembrane protein 222 | parous | good
TOB1 | transducer of ERBB2, 1 | nulliparous | good
TSPAN13 | tetraspanin 13 | nulliparous | good
TTC38 | tetratricopeptide repeat domain 38 | parous | good
TUBA1C | tubulin, alpha 1c | nulliparous | good
TXNDC9 | thioredoxin domain containing 9 | nulliparous | good
UBA2 | ubiquitin-like modifier activating enzyme 2 | nulliparous | good
UQCRB | similar to ubiquinol-cytochrome c reductase binding protein | nulliparous | good
WDR12 | WD repeat domain 12 | nulliparous | good
XPOT | exportin, tRNA (nuclear export receptor for tRNAs); similar to Exportin-T (tRNA exportin) (Exportin(tRNA)) | nulliparous | good
YEATS4 | YEATS domain containing 4 | nulliparous | good
YIF1A | Yip1 interacting factor homolog A (S. cerevisiae) | nulliparous | good
ZDHHC14 | zinc finger, DHHC-type containing 14 | parous | good
ZFAND1 | zinc finger, AN1-type domain 1 | nulliparous | good
ZNF217 | zinc finger protein 217 | nulliparous | good
ZNF264 | zinc finger protein 264 | nulliparous | good
ZNF304 | zinc finger protein 304 | nulliparous | good
ZNF706 | zinc finger protein 706 | nulliparous | good
ZWINT | ZW10 interactor | nulliparous | good
Example 10
Parity-Associated Decrease in Mammary Epithelial Progenitors and
Breast Tumor Initiation
[0314] The data described in the Examples above support the
hypothesis that a decrease in the number and proliferative
potential of luminal progenitors in parous women directly relates
to a decrease in breast cancer risk for both ER+ and ER- breast
cancers, and that this effect is dependent on the age at first
full-term pregnancy. A mathematical model was designed to describe
the dynamics of proliferating mammary epithelial cells that can
accumulate the changes leading to cancer initiation. In the model,
described in detail below, two types of cells were considered: (1)
a self-renewing population of stem cells, and (2) a population of
proliferating hormone-responsive luminal progenitors that result
from the differentiation of these stem cells.
Mathematical Modeling:
[0315] Simulations were initiated at menarche and continued until
cancer initiation or death, as depicted in the timeline in FIG. 30.
The effect of pregnancy at varying times from menarche through
right before menopause on cancer initiation was tested and compared
against the nulliparous cancer initiation risk. The robustness of
the simulation over varying numbers of stem cells per terminal end
duct, additional proliferative capacities resulting from pregnancy,
and rates of asymmetric stem cell division were then tested.
[0316] The dynamics of stem cells in the breast ductal system was
first studied. Given the population structure inherent to breast
ducts, it was assumed that the stem cells in each duct act
independently. As such, the dynamics of a single duct within the
breast was investigated since the total probability of cancer
initiation is given by the probability per niche times the number
of niches. Thus, the relative likelihood of cancer initiation is
not altered by considering only one niche. The overall number of
stem cells in the breast is on the order of 5 to 10 cells per duct,
and this number was denoted by N. The fundamental time step of this
system was defined as the division time of stem cells,
t.sub.step, which varies during pregnancy. In
previously published in vivo experiments, the mean cell cycle
length of benign breast hyperplasia cells was approximately 162
hours per cell. It was assumed that even benign breast hyperplasia
cells divide faster than stem cells; thus, using t.sub.step=162
hours as the average stem cell cycle length when not pregnant may
be an overestimation of the number of stem cell divisions that
occur in the normal breast. Within a duct, a single stem cell is
randomly chosen to divide during each time step proportional to the
fitness of the cell, following a stochastic process known as the
Moran model (see, Moran, P. A. P. (1962). The statistical processes
of evolutionary theory (Oxford: Clarendon Press). National Center
for Health Statistics (US) (2012). Health, United States, 2011:
With Special Feature on Socioeconomic Status and Health
(Hyattsville, Md.)). According to this model, the divided cell is
replaced by one of the daughter cells of the division, while the
other daughter replaces another stem cell that was randomly
selected from the population. Use of this model ensured
preservation of homeostasis in the normal breast cell population.
For each cell division, a single mutation was allowed to arise in
one of the two daughter cells of the division.
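The single-duct update described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the simulation code used for the Example: the list-of-mutation-counts representation, the function name moran_step, and the choice to draw a new mutation with probability mu times the number of not-yet-mutated genes are all hypothetical, and the default values of f_mut, mu and n_mut are taken from the parameter tables of this Example.

```python
import random


def moran_step(stem_cells, f_mut=1.1, mu=2e-5, n_mut=2, rng=random):
    """Advance one duct by one time step (hypothetical helper).

    stem_cells[i] holds the number of mutations carried by stem cell i.
    """
    # Pick the dividing cell with probability proportional to its fitness f_mut**n.
    weights = [f_mut ** n for n in stem_cells]
    parent = rng.choices(range(len(stem_cells)), weights=weights, k=1)[0]
    daughters = [stem_cells[parent], stem_cells[parent]]

    # At most one new mutation arises, in one of the two daughters.
    # Assumption: the chance is mu per gene still unmutated.
    remaining = n_mut - daughters[0]
    if remaining > 0 and rng.random() < mu * remaining:
        daughters[rng.randrange(2)] += 1

    # One daughter replaces the parent; the other replaces another,
    # uniformly chosen stem cell, so the population size N is preserved.
    others = [i for i in range(len(stem_cells)) if i != parent]
    stem_cells[parent] = daughters[0]
    if others:
        stem_cells[rng.choice(others)] = daughters[1]
    return stem_cells
```

Because every division removes exactly one stem cell and adds one, the number of stem cells per duct stays at N after any number of calls, which is the homeostasis property noted above.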
[0317] In the mature breast, stem cells divide primarily to
maintain cellular integrity. However, differentiating events do
occur, although rarely. In this model, with probability p, cell
division in the current time step was allowed to be asymmetric,
producing one stem daughter cell to maintain the stem cell
population and one progenitor daughter. Since the exact rate of
differentiation is unknown, p=10.sup.-1 to 10.sup.-3 was tested.
With the remaining probability 1-p, the stem cell division was
symmetric and followed the usual Moran division dynamics. In each
time step thereafter, all of the cells resulting from the
progenitor daughter divided and differentiated further until a
total of z cell divisions were accumulated. The value z=10 was used to fit
data from mouse fat pad depletion experiments (see, Kordon, E. C.,
and Smith, G. H. (1998). An entire functional mammary gland may
comprise the progeny from a single cell. Development 125,
1921-1930). After z divisions, the cells were considered
differentiated and, at this point, they were no longer included in
the cells considered in the mathematical model. Thus, in the
wild-type system, there were N stem cells per duct and 2.sup.(z+1)-1
progenitor cells per differentiation cascade. FIG. 34 describes the
temporal dynamics of the system.
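The progenitor count quoted above follows from summing the cells at each differentiation level. The short Python sketch below illustrates this under the assumption that level i of a cascade (i = 0 to z) holds 2.sup.i cells; the function names cascade_size and division_type are hypothetical and are used only for illustration.

```python
import random


def cascade_size(z):
    """Cells in one differentiation cascade, assuming level i holds 2**i cells."""
    return sum(2 ** level for level in range(z + 1))


def division_type(p, rng=random):
    """Decide whether the current stem cell division is asymmetric (probability p)."""
    return "asymmetric" if rng.random() < p else "symmetric"


# With z = 10, the sum over levels 0..10 equals the closed form 2**(z+1) - 1.
assert cascade_size(10) == 2 ** 11 - 1 == 2047
```

With z=10, one cascade therefore holds 2047 progenitor cells, far more than the 5 to 10 stem cells per duct.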
[0318] During each cell division, genetic alterations contributing
to cancer initiation may arise. A number n.sub.mut of mutations
were considered that, when combined, result in a single cell
leading to cancer initiation. These mutations could be any of the
many mutations commonly found in breast cancer with initiation
potential; however, it was assumed that only a single mutational
hit was necessary to (in)activate the gene. The simulation was
tested with mutation rates on the order of 10.sup.-5 mutations per
gene per cell division to limit the required number of simulations
for detection to a reasonable number; however, results remained
consistent even at lower mutation rates. The following mutational
effects were assumed for each mutation: in stem cells, mutant cells
had a relative fitness of f.sub.mut=1.1, i.e. a fitness increase of
10%, resulting in an increased probability of dividing, while
mutant progenitor cells divided an additional z.sub.mut=1 times
(FIG. 34). Since the number of stem cells per duct is small, the
fitness of mutant alleles has little effect on cancer initiation
probabilities, as the fixation time of mutations is much smaller
than the mutation accumulation time (see, Hambardzumyan, D., Cheng,
Y. K., Haeno, H., Holland, E. C., and Michor, F. (2011). The
probable cell of origin of NF1- and PDGF-driven glioblastomas. PLoS
One 6, e24454). Thus, ignoring the specific value of f.sub.mut is
justified. These assumptions presume that the mutations primarily
act to increase the proliferation rate of cells. Mutant fitness
values were considered to be multiplicative while mutant progenitor
division capacity was considered to be additive. Thus, the relative
fitness of a stem cell with n mutations was f.sub.mut.sup.n and the
number of divisions of a mutant progenitor with n mutations was
z+n*z.sub.mut. Additionally, progenitor cells must accumulate some
propensity towards self-renewal: a parameter
.gamma.=.gamma..sub.base-i*.gamma..sub.step was defined as the
probability of a progenitor cell at differentiation level
0.ltoreq.i.ltoreq.z+n*z.sub.mut acquiring self-renewal. Cancer
initiation was defined as a single cell that accumulated all
required mutations and either retained or acquired the ability to
self-renew, either through being a stem cell or through acquiring a
genetic or epigenetic self-renewal event.
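The mutational effects defined above reduce to three small formulas, sketched here in Python. The function names are hypothetical. Treating the tabulated .gamma. value of 0.1 as .gamma..sub.base and clamping the self-renewal probability at zero are assumptions made only for this illustration.

```python
def stem_fitness(n, f_mut=1.1):
    """Relative fitness of a stem cell with n mutations (multiplicative)."""
    return f_mut ** n


def progenitor_divisions(n, z=10, z_mut=1):
    """Number of divisions of a progenitor with n mutations (additive)."""
    return z + n * z_mut


def self_renewal_probability(i, n, gamma_base=0.1, gamma_step=0.005, z=10, z_mut=1):
    """Probability that a progenitor at differentiation level i acquires self-renewal."""
    if not 0 <= i <= progenitor_divisions(n, z, z_mut):
        raise ValueError("differentiation level outside 0..z+n*z_mut")
    # Assumption: the probability is not allowed to drop below zero.
    return max(0.0, gamma_base - i * gamma_step)
```

For example, a stem cell with two mutations has a relative fitness of 1.1 squared, or about 1.21, and a progenitor with two mutations divides 12 times instead of 10.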
[0319] The phenotypic alterations that occur in the breast during
pregnancy and as a result of pregnancy were considered. For the
purposes of this simulation, the 280 day period of time for the
pregnancy itself was considered as the time period during which
parameters are altered by pregnancy. It has been previously
published that pregnancy results in terminal differentiation of
progenitor cells into milk producing cells as well as increased
proliferation of cells. To model these effects, further
differentiation of progenitor cells during pregnancy by an
additional z.sub.preg differentiation levels and a decrease in the
cell cycle length of stem cells were allowed (FIG. 34). According to
several groups, there is a 4.5 to 8.5-fold increase in Ki67+ cells
during pregnancy. Thus, a 4-fold to 8-fold increase in progenitor
cells during pregnancy was allowed, corresponding to z.sub.preg=2
to 3. The remaining .about.1.1 fold increase in proliferation was
modeled as a decrease in stem cell cycle length to
t.sub.step,preg=147 hours. Additionally, as described in the
Examples, above, there was also a decrease in the number of
proliferative progenitors after pregnancy: this change was
simulated in population structure by decreasing the number of
differentiation levels in the progenitor hierarchy by z.sub.post.
The experiments showed a 2-3 fold drop in p27.sup.+ expressing
progenitor cells, which would correspond to z.sub.post=1.
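The arithmetic behind these pregnancy parameters can be checked with a short Python sketch. It assumes that each additional differentiation level doubles the progenitor pool, so that z.sub.preg levels give a fold increase of 2 raised to z.sub.preg, and that the residual increase in proliferation divides the stem cell cycle length. The function names are hypothetical.

```python
T_STEP_HOURS = 162      # stem cell cycle length when not pregnant (from the text)
PREGNANCY_DAYS = 280    # period during which the pregnancy parameters apply


def progenitor_fold_increase(levels):
    """Fold change in progenitors for a given number of differentiation levels."""
    return 2 ** levels


def pregnant_cycle_length(t_step_hours, residual_fold):
    """Cycle length after a residual fold increase in proliferation."""
    return t_step_hours / residual_fold


def steps_during_pregnancy(t_step_preg_hours, days=PREGNANCY_DAYS):
    """Whole stem cell time steps that fit into the pregnancy period."""
    return int(days * 24 // t_step_preg_hours)
```

Under these assumptions, z.sub.preg=2 and 3 give 4-fold and 8-fold increases, 162 hours divided by 1.1 rounds to the 147 hours used for t.sub.step,preg, and z.sub.post=1 corresponds to a 2-fold drop.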
[0320] The simulation spanned from menarche to death or initiation
of cancer within the duct. As such, the total simulation time was
calculated from the average women's life expectancy in the United
States, which was 80.9 years in 2009, and the average age of
menarche, which ranged between 12.4-12.7 years of age for differing
age groups in 2002 (FIG. 34). The mean age of menarche between the
groups was used, which was 12.6 years, and thus resulted in a total
of 68.3 years of simulation time. The effect of pregnancy
occurring at four roughly equidistant time points, t.sub.preg, was
tested: immediately following menarche, time of first pregnancy at
the average age of 25.4 in 2010, immediately before menopause at
the average age of 51.3 in 1998, and halfway between average first
pregnancy and menopause at the age of 38.3. All time points were
tabulated from the most recent government-provided data. The
effects of varying the simulation parameters independently for each
pregnancy age t.sub.preg were tested. All fixed value parameters
and the values of all other parameters are listed in the tables
below.
TABLE-US-00019 TABLE 19 Fixed parameter values
t.sub.total (years) | f.sub.mut | .gamma. | .gamma..sub.step | .mu. | t.sub.step (h) | t.sub.step, preg (h) | z | z.sub.mut | z.sub.post
68.3 | 1.1 | 0.1 | 0.005 | 2 .times. 10.sup.-5 | 162 | 147 | 10 | 1 | -1
Legend: Parameters that remained unchanged throughout all simulations are shown.
TABLE-US-00020 TABLE 20 Range of parameter values investigated
t.sub.preg | N | n.sub.mut | p | z.sub.preg
0 | 5 | 1 | 10.sup.-3 | 2
12.8 | 8 | 2 | 10.sup.-2 | 3
25.7 | 10 | | 10.sup.-1 |
38.7 | | | |
Legend: For each parameter of interest, multiple values were tested. Values defaulted to the
numbers in bold.
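The tabulated values can be cross-checked against the ages given in the text with the following Python sketch. The class and function names are hypothetical. The 365.25-day year is an assumption, as is reading the .gamma. column as the base self-renewal probability.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 365.25 * 24  # assumption: average calendar year


@dataclass(frozen=True)
class FixedParameters:
    """Fixed values as listed in Table 19."""
    t_total_years: float = 68.3
    f_mut: float = 1.1
    gamma_base: float = 0.1      # assumption: the tabulated gamma is the base value
    gamma_step: float = 0.005
    mu: float = 2e-5
    t_step_hours: float = 162.0
    t_step_preg_hours: float = 147.0
    z: int = 10
    z_mut: int = 1
    z_post: int = -1


LIFE_EXPECTANCY = 80.9                   # years, from the text
MENARCHE_AGE = 12.6                      # years, from the text
PREGNANCY_AGES = (12.6, 25.4, 38.3, 51.3)


def years_since_menarche(age):
    """Convert an age in years to simulation time (time 0 is menarche)."""
    return round(age - MENARCHE_AGE, 1)


def total_time_steps(params):
    """Whole non-pregnant time steps in the simulated lifetime."""
    return int(params.t_total_years * HOURS_PER_YEAR // params.t_step_hours)
```

Subtracting the age at menarche from the four pregnancy ages reproduces the t.sub.preg column of Table 20 (0, 12.8, 25.7 and 38.7 years), and the life expectancy minus the age at menarche gives the 68.3 years of Table 19. At 162 hours per step, the simulated lifetime spans roughly 3,700 stem cell time steps.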
[0321] In the schematic depicted in FIG. 31, initially, there are N
wild-type stem cells (top of schematic), which give rise to a
differentiation cascade of 2.sup.(z+1)-1 wild-type luminal progenitor
cells (triangular, lower region). At each time step, all progenitor
cells as well as one randomly selected stem cell divide. With
probability .alpha., the stem cell divides symmetrically and one daughter
cell replaces another randomly chosen stem cell. With probability
1-.alpha., the stem cell divides asymmetrically and one daughter
cell remains a stem cell while the other daughter cell becomes
committed to the progenitor population. Regardless of the dividing
stem cell's fate, all existing progenitor cells divide
symmetrically for a total of z times to give rise to successively
more differentiated cells (progressively darker shades of gray)
before becoming terminally differentiated. Darkening gray
gradations refer to successively more differentiated cells and
serve to clarify a single time step of the stochastic process.
[0322] In FIG. 32, the acquisition of mutations leading to breast
cancer initiation results in an increased relative fitness
(i.e., growth rate) f.sub.mut in stem cells ("SC") as compared to
wild-type cells ("WT") and in an additional number of divisions,
z.sub.mut, that progenitor cells can undergo before terminally
differentiating.
[0323] In FIG. 33, during pregnancy, progenitor cells experience an
expansion in proliferative capacity through an additional number of
divisions, z.sub.preg, in order to form terminally differentiated
milk-producing cells (dotted triangle), as well as a decrease in cell
cycle length.
[0324] The effect of pregnancy at varying times after menarche on
breast cancer initiation per duct was tested and expressed as the
probability of cancer initiation relative to that in nulliparous
simulations. Default values were N=8, p=10.sup.-2,
z.sub.preg=2 (FIG. 34). It was observed that the relative
likelihood of initiation increased with later pregnancy. The
robustness of the simulation over varying numbers of stem cells per
terminal end duct, additional proliferative capacities resulting
from pregnancy, and rates of asymmetric stem cell division were
tested (FIGS. 35-37). The relative likelihood of cancer initiation
was then compared with pregnancy occurring at four different time
points during childbearing years as compared to nulliparous
simulations. It was found that the probability of cancer initiation
in a duct increases as the age at first pregnancy increases.
Furthermore, these simulations showed that differences in the
numbers of luminal epithelial progenitors with proliferative
potential are the most probable explanation for differences in
breast cancer risk due to reproductive (e.g., parity) and genetic
(e.g., BRCA1/2 germline mutation) factors.
[0325] In summary, it was found that both increasing numbers of
stem cells per duct and increasing rates of asymmetric stem cell
division increase the rate of cancer initiation per duct. Also, as
expected, changes in the proliferative capacity of progenitor cells
during pregnancy had no effect in the nulliparous state. The
relative likelihood of cancer initiation was then compared with
pregnancy occurring at four different time points during a woman's
childbearing years as compared to the nulliparous simulations. It
was found that the probability of cancer initiation in a duct
increases as the age of first pregnancy increases within the range
of all simulated parameters. Additionally, the probability of
cancer initiation is greater in nulliparous situations than in all
pregnancy simulations. Interestingly, cancer initiation from the
stem cell population decreases with age of first pregnancy while
initiation from progenitors increases. Some of the cancers that
were considered as initiated from the progenitor population may
potentially have had a stem initiation event occur afterwards, and
simulations where progenitor initiation occurred are also those
where fixation of the first mutation in the stem population was
likely.
[0326] A number of embodiments of the invention have been
described. Nevertheless, it will be understood that various
modifications may be made without departing from the spirit and
scope of the invention. It is further to be understood that all
values are approximate, and are provided for description.
Accordingly, other embodiments are within the scope of the
following claims.
TABLE-US-LTS-00001 LENGTHY TABLES The patent application contains a
lengthy table section. A copy of the table is available in
electronic form from the USPTO web site
(http://seqdata.uspto.gov/?pageRequest=docDetail&DocID=US20150285802A1).
An electronic copy of the table will also be available from the
USPTO upon request and payment of the fee set forth in 37 CFR
1.19(b)(3).
Sequence CWU 1
1
5411151DNAHomo sapiens 1gccggagcca gcggttctcc aagcacccag catcctgcta
gacgcgccgc gcaccgacgg 60aggggacatg ggcagagcaa tggtggccag gctcgggctg
gggctgctgc tgctggcact 120gctcctaccc acgcagattt attccagtga
aacaacaact ggaacttcaa gtaactcctc 180ccagagtact tccaactctg
ggttggcccc aaatccaact aatgccacca ccaaggcggc 240tggtggtgcc
ctgcagtcaa cagccagtct cttcgtggtc tcactctctc ttctgcatct
300ctactcttaa gagactcagg ccaagaaacg tcttctaaat ttccccatct
tctaaaccca 360atccaaatgg cgtctggaag tccaatgtgg caaggaaaaa
caggtcttca tcgaatctac 420taattccaca ccttttattg acacagaaaa
tgttgagaat cccaaatttg attgatttga 480agaacatgtg agaggtttga
ctagatgatg gatgccaata ttaaatctgc tggagtttca 540tgtacaagat
gaaggagagg cagcatccaa aaaatagtta agacatgatt tccttgaatg
600tggttgagaa atatggacac ttaatactac cttgaaaata agaatagaaa
taaggatggg 660atcgtggact ggagatcagg tttccattgg ggttcattaa
ttctataagg cctaaaacag 720gtcatcataa aaggtcccat gaattctatc
catatgtcca tgagaaggaa cttccaggtg 780ttactgtaat tctcaaggta
ttgtttcgac agcactagtt caatggcgaa taatcaaatg 840cgttcccatg
gtgaaccgag gggggcgaca tgggaaacgg aaccaacttc cttccgtgaa
900ggcctcggga ttgacattgg attcgaacat tccggtgtaa tggcaagtgc
caggacagaa 960gtaatgaagt tgtccccaca aaaatttgaa cagtgcattc
tctgagaaag ggaaacaaca 1020acacgtgctt atcacgagaa ttatatagcg
cacaatatct agccactacg tgatctaaac 1080aaactcgaca ggagtgtcca
tagtctctct tgctcattct acactattgt tccttctctt 1140ctctgtgctg t
1151280PRTHomo sapiens 2Met Gly Arg Ala Met Val Ala Arg Leu Gly Leu
Gly Leu Leu Leu Leu 1 5 10 15 Ala Leu Leu Leu Pro Thr Gln Ile Tyr
Ser Ser Glu Thr Thr Thr Gly 20 25 30 Thr Ser Ser Asn Ser Ser Gln
Ser Thr Ser Asn Ser Gly Leu Ala Pro 35 40 45 Asn Pro Thr Asn Ala
Thr Thr Lys Ala Ala Gly Gly Ala Leu Gln Ser 50 55 60 Thr Ala Ser
Leu Phe Val Val Ser Leu Ser Leu Leu His Leu Tyr Ser 65 70 75 80
35710DNAHomo sapiens 3cgtcaggttg gctcttcagg ttcatttcca tagttccctg
cggcctctgc cttggggagt 60tatgttttgt taccgagatc cgcgctacca gattgcaccg
gggctgattt gggggctggg 120aatttgccat tctgctgtac agacactgat
ttttttttct tctttttaaa aagcaagatt 180ttaggtgatg ggcaagtcag
aaagtcagat ggatataact gatatcaaca ctccaaagcc 240aaagaagaaa
cagcgatgga ctccactgga gatcagcctc tcggtccttg tcctgctcct
300caccatcata gctgtgacaa tgatcgcact ctatgcaacc tacgatgatg
gtatttgcaa 360gtcatcagac tgcataaaat cagctgctcg actgatccaa
aacatggatg ccaccactga 420gccttgtaca gactttttca aatatgcttg
cggaggctgg ttgaaacgta atgtcattcc 480cgagaccagc tcccgttacg
gcaactttga cattttaaga gatgaactag aagtcgtttt 540gaaagatgtc
cttcaagaac ccaaaactga agatatagta gcagtgcaga aagcaaaagc
600attgtacagg tcttgtataa atgaatctgc tattgatagc agaggtggag
aacctctact 660caaactgtta ccagacatat atgggtggcc agtagcaaca
gaaaactggg agcaaaaata 720tggtgcttct tggacagctg aaaaagctat
tgcacaactg aattctaaat atgggaaaaa 780agtccttatt aatttgtttg
ttggcactga tgataagaat tctgtgaatc atgtaattca 840tattgaccaa
cctcgacttg gcctcccttc tagagattac tatgaatgca ctggaatcta
900taaagaggct tgtacagcat atgtggattt tatgatttct gtggccagat
tgattcgtca 960ggaagaaaga ttgcccatcg atgaaaacca gcttgctttg
gaaatgaata aagttatgga 1020attggaaaaa gaaattgcca atgctacggc
taaacctgaa gatcgaaatg atccaatgct 1080tctgtataac aagatgacat
tggcccagat ccaaaataac ttttcactag agatcaatgg 1140gaagccattc
agctggttga atttcacaaa tgaaatcatg tcaactgtga atattagtat
1200tacaaatgag gaagatgtgg ttgtttatgc tccagaatat ttaaccaaac
ttaagcccat 1260tcttaccaaa tattctgcca gagatcttca aaatttaatg
tcctggagat tcataatgga 1320tcttgtaagc agcctcagcc gaacctacaa
ggagtccaga aatgctttcc gcaaggccct 1380ttatggtaca acctcagaaa
cagcaacttg gagacgttgt gcaaactatg tcaatgggaa 1440tatggaaaat
gctgtgggga ggctttatgt ggaagcagca tttgctggag agagtaaaca
1500tgtggtcgag gatttgattg cacagatccg agaagttttt attcagactt
tagatgacct 1560cacttggatg gatgccgaga caaaaaagag agctgaagaa
aaggccttag caattaaaga 1620aaggatcggc tatcctgatg acattgtttc
aaatgataac aaactgaata atgagtacct 1680cgagttgaac tacaaagaag
atgaatactt cgagaacata attcaaaatt tgaaattcag 1740ccaaagtaaa
caactgaaga agctccgaga aaaggtggac aaagatgagt ggataagtgg
1800agcagctgta gtcaatgcat tttactcttc aggaagaaat cagatagtct
tcccagccgg 1860cattctgcag ccccccttct ttagtgccca gcagtccaac
tcattgaact atgggggcat 1920cggcatggtc ataggacacg aaatcaccca
tggcttcgat gacaatggca gaaactttaa 1980caaagatgga gacctcgttg
actggtggac tcaacagtct gcaagtaact ttaaggagca 2040atcccagtgc
atggtgtatc agtatggaaa cttttcctgg gacctggcag gtggacagca
2100ccttaatgga attaatacac tgggagaaaa cattgctgat aatggaggtc
ttggtcaagc 2160atacagagcc tatcagaatt atattaaaaa gaatggcgaa
gaaaaattac ttcctggact 2220tgacctaaat cacaaacaac tatttttctt
gaactttgca caggtgtggt gtggaaccta 2280taggccagag tatgcggtta
actccattaa aacagatgtg cacagtccag gcaatttcag 2340gattattggg
actttgcaga actctgcaga gttttcagaa gcctttcact gccgcaagaa
2400ttcatacatg aatccagaaa agaagtgccg ggtttggtga tcttcaaaag
aagcattgca 2460gcccttggct agacttgcca acaccacaga aatggggaat
tctctaatcg aaagaaaatg 2520ggccctaggg gtcactgtac tgacttgagg
gtgattaaca gagagggcac catcacaata 2580cagataacat taggttgtcc
tagaaagggt gtggagggag gaagggggtc taaggtctat 2640caagtcaatc
atttctcact gtgtacataa tgcttaattt ctaaagataa tattactgtt
2700tatttctgtt tctcatatgg tctaccagtt tgctgatgtc cctagaaaac
aatgcaaaac 2760ctttgaggta gaccaggatt tctaatcaaa agggaaaaga
agatgttgaa gaatacagtt 2820aggcaccaga agaacagtag gtgacactat
agtttaaaac acattgccta actactagtt 2880tttactttta tttgcaacat
ttacagtcct tcaaaatcct tccaaagaat tcttatacac 2940attggggcct
tggagcttac atagttttaa actcattttt gccatacatc agttattcat
3000tctgtgatca tttattttaa gcactcttaa agcaaaaaat gaatgtctaa
aattgttttt 3060tgttgtacct gctttgactg atgctgagat tcttcaggct
tcctgcaatt ttctaagcaa 3120tttcttgctc tatctctcaa aacttggtat
ttttcagaga tttatataaa tgtaaaaata 3180ataattttta tatttaatta
ttaactacat ttatgagtaa ctattattat aggtaatcaa 3240tgaatattga
agtttcagct taaaataaac agttgtgaac caagatctat aaagcgatat
3300acagatgaaa atttgagact atttaaactt ataaatcata ttgatgaaaa
gatttaagca 3360caaactttag ggtaaaaatt gccattggac agttgtctag
agatatatat acttgtggtt 3420ttcaaattgg actttcaaaa ttaaatctgt
ccctgagagt gtctctgata aaagggcaaa 3480tctgcaccta tgtagctctg
catctcctgt cttttcaggt ttgtcatcag atggaaatat 3540tttgataata
aattgaaatt gtgaactcat tgctccctaa gactgtgaca actgtctaac
3600tttagaagtg catttctgaa tagaaatggg aggcctctga tggaccttct
agaattataa 3660gtcacaaaga gttctggaaa agaactgttt actgcttgat
aggaattcat cttttgaggc 3720ttctgttcct ctcttttcct gttgtattga
ctattttcgt tcattacttg attaagattt 3780tacaaaagag gagcacttcc
aaaattctta tttttcctaa caaaagatga aagcagggaa 3840tttctatcta
aatgatgagt attagttccc tgtctcttga aaaatgccca tttgccttta
3900aaaaaaaaag ttacagaaat actataacat atgtacataa attgcataaa
gcataagtat 3960acagttcaat aaacttaact ttaactgaac aatggccctg
tagccagcac ctgtaagaaa 4020cagagcagta ccagcgctct aaaagcacct
ccttgtcact ttattactcc cagaacaaca 4080actatcctga cttctaatat
cattcactag ctttgcctgg ttttgtcttt tatgcagata 4140gaatcaatca
gtatgtattc ttttgtgcct ggcttctttc tctcagcctt acatttgtga
4200gattcctctg tattgtgctg attgtggatc ttttcattct cattgcagaa
taatgttcta 4260ttgtgggact tattacaatt tgttcatcct attgttgatg
ggcacttgag aactttccat 4320tttggcgcta ttacaaatag tgcaactatg
aatgtactgc atgttaccat cttacttgag 4380cctttaatgg acttatttct
tcaaatcctt ccaaaaatta ttataagcat tgaaattata 4440gtttcaagcc
aactgtggat acccttaccc tttcctcctt tatcacaacc accgttacaa
4500gtatacttat atttccctaa aatacattta aaacttacct aagtgacatt
tgtagttgga 4560gtaataggag cttccagctc taataaaaca gctgtctcta
acttatttta tttccatcat 4620gtcagagcag gtgaagagcc agaagtgaag
agtgactagt acaaattata aaaagccact 4680agactcttca ctgttagctt
tttaaaacat taggctccca tccctatgga ggaacaactc 4740tccagtgcct
ggatcccctc tgtctacaaa tataagattt tctgggccta aaggatagat
4800caaagtcaaa aatagcaatg cctccctatc cctcacacat ccagacatca
tgaattttac 4860atggtactct tgttgagttc tgtagagcct tctgatgtct
ctaaagcact accgattctt 4920tggagttgtc acatcagata agacatatct
ctaattccat ccataaatcc agttctacta 4980tggctgagtt ctggtcaaag
aaagaaagtt tagaagctga gacacaaagg gttgggagct 5040gatgaaactc
acaaatgatg gtaggaagaa gctctcgaca atacccgttg gcaaggagtc
5100tgcctccatg ctgcagtgtt cgagtggatt gtaggtgcaa gatggaaagg
attgtaggtg 5160caagctgtcc agagaaaaga gtccttgttc cagccctatt
ctgccactcc tgacagggtg 5220accttgggta tttgcaatat tcctttgggc
ctctgcttct ctcacctaaa aaaagagaat 5280tagattatat tggtggttct
cagcaagaga aggagtatgt gtccaatgct gccttcccat 5340gaatctgtct
cccagttatg aatcagtggg caggataaac tgaaaactcc catttacgtg
5400tctgaatcga gtgagacaaa attttagtcc aaataacaag taccaaagtt
ttatcaagtt 5460tgggtctgtg ctgctgttac tgttaaccat ttaagtgggg
caaaaccttg ctaattttct 5520caaaagcatt tatcattctt gttgccacag
ctggagctct caaactaaaa gacatttgtt 5580attttggaaa gaagaaagac
tctattctca aagtttccta atcagaaatt tttatcagtt 5640tccagtctca
aaaatacaaa ataaaaacaa acgtttttaa tactattgct tttatgccta
5700gtcaactctg 57104750PRTHomo sapiens 4Met Gly Lys Ser Glu Ser Gln
Met Asp Ile Thr Asp Ile Asn Thr Pro 1 5 10 15 Lys Pro Lys Lys Lys
Gln Arg Trp Thr Pro Leu Glu Ile Ser Leu Ser 20 25 30 Val Leu Val
Leu Leu Leu Thr Ile Ile Ala Val Thr Met Ile Ala Leu 35 40 45 Tyr
Ala Thr Tyr Asp Asp Gly Ile Cys Lys Ser Ser Asp Cys Ile Lys 50 55
60 Ser Ala Ala Arg Leu Ile Gln Asn Met Asp Ala Thr Thr Glu Pro Cys
65 70 75 80 Thr Asp Phe Phe Lys Tyr Ala Cys Gly Gly Trp Leu Lys Arg
Asn Val 85 90 95 Ile Pro Glu Thr Ser Ser Arg Tyr Gly Asn Phe Asp
Ile Leu Arg Asp 100 105 110 Glu Leu Glu Val Val Leu Lys Asp Val Leu
Gln Glu Pro Lys Thr Glu 115 120 125 Asp Ile Val Ala Val Gln Lys Ala
Lys Ala Leu Tyr Arg Ser Cys Ile 130 135 140 Asn Glu Ser Ala Ile Asp
Ser Arg Gly Gly Glu Pro Leu Leu Lys Leu 145 150 155 160 Leu Pro Asp
Ile Tyr Gly Trp Pro Val Ala Thr Glu Asn Trp Glu Gln 165 170 175 Lys
Tyr Gly Ala Ser Trp Thr Ala Glu Lys Ala Ile Ala Gln Leu Asn 180 185
190 Ser Lys Tyr Gly Lys Lys Val Leu Ile Asn Leu Phe Val Gly Thr Asp
195 200 205 Asp Lys Asn Ser Val Asn His Val Ile His Ile Asp Gln Pro
Arg Leu 210 215 220 Gly Leu Pro Ser Arg Asp Tyr Tyr Glu Cys Thr Gly
Ile Tyr Lys Glu 225 230 235 240 Ala Cys Thr Ala Tyr Val Asp Phe Met
Ile Ser Val Ala Arg Leu Ile 245 250 255 Arg Gln Glu Glu Arg Leu Pro
Ile Asp Glu Asn Gln Leu Ala Leu Glu 260 265 270 Met Asn Lys Val Met
Glu Leu Glu Lys Glu Ile Ala Asn Ala Thr Ala 275 280 285 Lys Pro Glu
Asp Arg Asn Asp Pro Met Leu Leu Tyr Asn Lys Met Thr 290 295 300 Leu
Ala Gln Ile Gln Asn Asn Phe Ser Leu Glu Ile Asn Gly Lys Pro 305 310
315 320 Phe Ser Trp Leu Asn Phe Thr Asn Glu Ile Met Ser Thr Val Asn
Ile 325 330 335 Ser Ile Thr Asn Glu Glu Asp Val Val Val Tyr Ala Pro
Glu Tyr Leu 340 345 350 Thr Lys Leu Lys Pro Ile Leu Thr Lys Tyr Ser
Ala Arg Asp Leu Gln 355 360 365 Asn Leu Met Ser Trp Arg Phe Ile Met
Asp Leu Val Ser Ser Leu Ser 370 375 380 Arg Thr Tyr Lys Glu Ser Arg
Asn Ala Phe Arg Lys Ala Leu Tyr Gly 385 390 395 400 Thr Thr Ser Glu
Thr Ala Thr Trp Arg Arg Cys Ala Asn Tyr Val Asn 405 410 415 Gly Asn
Met Glu Asn Ala Val Gly Arg Leu Tyr Val Glu Ala Ala Phe 420 425 430
Ala Gly Glu Ser Lys His Val Val Glu Asp Leu Ile Ala Gln Ile Arg 435
440 445 Glu Val Phe Ile Gln Thr Leu Asp Asp Leu Thr Trp Met Asp Ala
Glu 450 455 460 Thr Lys Lys Arg Ala Glu Glu Lys Ala Leu Ala Ile Lys
Glu Arg Ile 465 470 475 480 Gly Tyr Pro Asp Asp Ile Val Ser Asn Asp
Asn Lys Leu Asn Asn Glu 485 490 495 Tyr Leu Glu Leu Asn Tyr Lys Glu
Asp Glu Tyr Phe Glu Asn Ile Ile 500 505 510 Gln Asn Leu Lys Phe Ser
Gln Ser Lys Gln Leu Lys Lys Leu Arg Glu 515 520 525 Lys Val Asp Lys
Asp Glu Trp Ile Ser Gly Ala Ala Val Val Asn Ala 530 535 540 Phe Tyr
Ser Ser Gly Arg Asn Gln Ile Val Phe Pro Ala Gly Ile Leu 545 550 555
560 Gln Pro Pro Phe Phe Ser Ala Gln Gln Ser Asn Ser Leu Asn Tyr Gly
565 570 575 Gly Ile Gly Met Val Ile Gly His Glu Ile Thr His Gly Phe
Asp Asp 580 585 590 Asn Gly Arg Asn Phe Asn Lys Asp Gly Asp Leu Val
Asp Trp Trp Thr 595 600 605 Gln Gln Ser Ala Ser Asn Phe Lys Glu Gln
Ser Gln Cys Met Val Tyr 610 615 620 Gln Tyr Gly Asn Phe Ser Trp Asp
Leu Ala Gly Gly Gln His Leu Asn 625 630 635 640 Gly Ile Asn Thr Leu
Gly Glu Asn Ile Ala Asp Asn Gly Gly Leu Gly 645 650 655 Gln Ala Tyr
Arg Ala Tyr Gln Asn Tyr Ile Lys Lys Asn Gly Glu Glu 660 665 670 Lys
Leu Leu Pro Gly Leu Asp Leu Asn His Lys Gln Leu Phe Phe Leu 675 680
685 Asn Phe Ala Gln Val Trp Cys Gly Thr Tyr Arg Pro Glu Tyr Ala Val
690 695 700 Asn Ser Ile Lys Thr Asp Val His Ser Pro Gly Asn Phe Arg
Ile Ile 705 710 715 720 Gly Thr Leu Gln Asn Ser Ala Glu Phe Ser Glu
Ala Phe His Cys Arg 725 730 735 Lys Asn Ser Tyr Met Asn Pro Glu Lys
Lys Cys Arg Val Trp 740 745 750 5 2387DNAHomo sapiens 5cctcgtgccg
cggaccccag cctctgccag gttcggtccg ccatcctcgt cccgtcctcc 60gccggcccct
gccccgcgcc cagggatcct ccagctcctt tcgcccgcgc cctccgttcg
120ctccggacac catggacaag ttttggtggc acgcagcctg gggactctgc
ctcgtgccgc 180tgagcctggc gcagatcgat ttgaatataa cctgccgctt
tgcaggtgta ttccacgtgg 240agaaaaatgg tcgctacagc atctctcgga
cggaggccgc tgacctctgc aaggctttca 300atagcacctt gcccacaatg
gcccagatgg agaaagctct gagcatcgga tttgagacct 360gcaggtatgg
gttcatagaa gggcatgtgg tgattccccg gatccacccc aactccatct
420gtgcagcaaa caacacaggg gtgtacatcc tcacatccaa cacctcccag
tatgacacat 480attgcttcaa tgcttcagct ccacctgaag aagattgtac
atcagtcaca gacctgccca 540atgcctttga tggaccaatt accataacta
ttgttaaccg tgatggcacc cgctatgtcc 600agaaaggaga atacagaacg
aatcctgaag acatctaccc cagcaaccct actgatgatg 660acgtgagcag
cggctcctcc agtgaaagga gcagcacttc aggaggttac atcttttaca
720ccttttctac tgtacacccc atcccagacg aagacagtcc ctggatcacc
gacagcacag 780acagaatccc tgctaccagt acgtcttcaa ataccatctc
agcaggctgg gagccaaatg 840aagaaaatga agatgaaaga gacagacacc
tcagtttttc tggatcaggc attgatgatg 900atgaagattt tatctccagc
accatttcaa ccacaccacg ggcttttgac cacacaaaac 960agaaccagga
ctggacccag tggaacccaa gccattcaaa tccggaagtg ctacttcaga
1020caaccacaag gatgactgat gtagacagaa atggcaccac tgcttatgaa
ggaaactgga 1080acccagaagc acaccctccc ctcattcacc atgagcatca
tgaggaagaa gagaccccac 1140attctacaag cacaatccag gcaactccta
gtagtacaac ggaagaaaca gctacccaga 1200aggaacagtg gtttggcaac
agatggcatg agggatatcg ccaaacaccc agagaagact 1260cccattcgac
aacagggaca gctgcagcct cagctcatac cagccatcca atgcaaggaa
1320ggacaacacc aagcccagag gacagttcct ggactgattt cttcaaccca
atctcacacc 1380ccatgggacg aggtcatcaa gcaggaagaa ggatggatat
ggactccagt catagtacaa 1440cgcttcagcc tactgcaaat ccaaacacag
gtttggtgga agatttggac aggacaggac 1500ctctttcaat gacaacgcag
cagagtaatt ctcagagctt ctctacatca catgaaggct 1560tggaagaaga
taaagaccat ccaacaactt ctactctgac atcaagcaat aggaatgatg
1620tcacaggtgg aagaagagac ccaaatcatt ctgaaggctc aactacttta
ctggaaggtt 1680atacctctca ttacccacac acgaaggaaa gcaggacctt
catcccagtg acctcagcta 1740agactgggtc ctttggagtt actgcagtta
ctgttggaga ttccaactct aatgtcaatc 1800gttccttatc aggagaccaa
gacacattcc accccagtgg ggggtcccat accactcatg 1860gatctgaatc
agatggacac tcacatggga gtcaagaagg tggagcaaac acaacctctg
1920gtcctataag gacaccccaa attccagaat ggctgatcat cttggcatcc
ctcttggcct 1980tggctttgat tcttgcagtt tgcattgcag tcaacagtcg
aagaaggtgt gggcagaaga 2040aaaagctagt gatcaacagt ggcaatggag
ctgtggagga cagaaagcca agtggactca 2100acggagaggc cagcaagtct
caggaaatgg tgcatttggt gaacaaggag tcgtcagaaa 2160ctccagacca
gtttatgaca gctgatgaga caaggaacct gcagaatgtg gacatgaaga
2220ttggggtgta acacctacac cattatcttg gaaagaaaca accgttggaa
acataaccat 2280tacagggagc tgggacactt aacagatgca atgtgctact
gattgtttca ttgcgaatct 2340tttttagcat aaaattttct actcttaaaa
aaaaaaaaaa aaaaaaa 2387643PRTHomo sapiens 6Thr Leu Met Ser Thr Ser
Ala Thr Ala Thr Glu Thr Ala Thr Lys Arg 1 5 10 15 Gln Glu Thr Trp
Asp Trp Phe Ser Trp Leu Phe Leu Pro Ser
Glu Ser 20 25 30 Lys Asn His Leu His Thr Thr Thr Gln Met Ala 35 40
72334DNAHomo sapiens 7cttgctcacg gctctgcgac tccgacgccg gcaaggtttg
gagagcggct gggttcgcgg 60gacccgcggg cttgcacccg cccagactcg gacgggcttt
gccaccctct ccgcttgcct 120ggtcccctct cctctccgcc ctcccgctcg
ccagtccatt tgatcagcgg agactcggcg 180gccgggccgg ggcttccccg
cagcccctgc gcgctcctag agctcgggcc gtggctcgtc 240ggggtctgtg
tcttttggct ccgagggcag tcgctgggct tccgagaggg gttcgggccg
300cgtaggggcg ctttgttttg ttcggttttg tttttttgag agtgcgagag
aggcggtcgt 360gcagacccgg gagaaagatg tcaaacgtgc gagtgtctaa
cgggagccct agcctggagc 420ggatggacgc caggcaggcg gagcacccca
agccctcggc ctgcaggaac ctcttcggcc 480cggtggacca cgaagagtta
acccgggact tggagaagca ctgcagagac atggaagagg 540cgagccagcg
caagtggaat ttcgattttc agaatcacaa acccctagag ggcaagtacg
600agtggcaaga ggtggagaag ggcagcttgc ccgagttcta ctacagaccc
ccgcggcccc 660ccaaaggtgc ctgcaaggtg ccggcgcagg agagccagga
tggcagcggg agccgcccgg 720cggcgccttt aattggggct ccggctaact
ctgaggacac gcatttggtg gacccaaaga 780ctgatccgtc ggacagccag
acggggttag cggagcaatg cgcaggaata aggaagcgac 840ctgcaaccga
cgattcttct actcaaaaca aaagagccaa cagaacagaa gaaaatgttt
900cagacggttc cccaaatgcc ggttctgtgg agcagacgcc caagaagcct
ggcctcagaa 960gacgtcaaac gtaaacagct cgaattaaga atatgtttcc
ttgtttatca gatacatcac 1020tgcttgatga agcaaggaag atatacatga
aaattttaaa aatacatatc gctgacttca 1080tggaatggac atcctgtata
agcactgaaa aacaacaaca caataacact aaaattttag 1140gcactcttaa
atgatctgcc tctaaaagcg ttggatgtag cattatgcaa ttaggttttt
1200ccttatttgc ttcattgtac tacctgtgta tatagttttt accttttatg
tagcacataa 1260actttgggga agggagggca gggtggggct gaggaactga
cgtggagcgg ggtatgaaga 1320gcttgctttg atttacagca agtagataaa
tatttgactt gcatgaagag aagcaatttt 1380ggggaagggt ttgaattgtt
ttctttaaag atgtaatgtc cctttcagag acagctgata 1440cttcatttaa
aaaaatcaca aaaatttgaa cactggctaa agataattgc tatttatttt
1500tacaagaagt ttattctcat ttgggagatc tggtgatctc ccaagctatc
taaagtttgt 1560tagatagctg catgtggctt ttttaaaaaa gcaacagaaa
cctatcctca ctgccctccc 1620cagtctctct taaagttgga atttaccagt
taattactca gcagaatggt gatcactcca 1680ggtagtttgg ggcaaaaatc
cgaggtgctt gggagttttg aatgttaaga attgaccatc 1740tgcttttatt
aaatttgttg acaaaatttt ctcattttct tttcacttcg ggctgtgtaa
1800acacagtcaa aataattcta aatccctcga tatttttaaa gatctgtaag
taacttcaca 1860ttaaaaaatg aaatattttt taatttaaag cttactctgt
ccatttatcc acaggaaagt 1920gttattttta aaggaaggtt catgtagaga
aaagcacact tgtaggataa gtgaaatgga 1980tactacatct ttaaacagta
tttcattgcc tgtgtatgga aaaaccattt gaagtgtacc 2040tgtgtacata
actctgtaaa aacactgaaa aattatacta acttatttat gttaaaagat
2100tttttttaat ctagacaata tacaagccaa agtggcatgt tttgtgcatt
tgtaaatgct 2160gtgttgggta gaataggttt tcccctcttt tgttaaataa
tatggctatg cttaaaaggt 2220tgcatactga gccaagtata attttttgta
atgtgtgaaa aagatgccaa ttattgttac 2280acattaagta atcaataaag
aaaacttcca tagctaaaaa aaaaaaaaaa aaaa 23348198PRTHomo sapiens 8Met
Ser Asn Val Arg Val Ser Asn Gly Ser Pro Ser Leu Glu Arg Met 1 5 10
15 Asp Ala Arg Gln Ala Glu His Pro Lys Pro Ser Ala Cys Arg Asn Leu
20 25 30 Phe Gly Pro Val Asp His Glu Glu Leu Thr Arg Asp Leu Glu
Lys His 35 40 45 Cys Arg Asp Met Glu Glu Ala Ser Gln Arg Lys Trp
Asn Phe Asp Phe 50 55 60 Gln Asn His Lys Pro Leu Glu Gly Lys Tyr
Glu Trp Gln Glu Val Glu 65 70 75 80 Lys Gly Ser Leu Pro Glu Phe Tyr
Tyr Arg Pro Pro Arg Pro Pro Lys 85 90 95 Gly Ala Cys Lys Val Pro
Ala Gln Glu Ser Gln Asp Val Ser Gly Ser 100 105 110 Arg Pro Ala Ala
Pro Leu Ile Gly Ala Pro Ala Asn Ser Glu Asp Thr 115 120 125 His Leu
Val Asp Pro Lys Thr Asp Pro Ser Asp Ser Gln Thr Gly Leu 130 135 140
Ala Glu Gln Cys Ala Gly Ile Arg Lys Arg Pro Ala Thr Asp Asp Ser 145
150 155 160 Ser Thr Gln Asn Lys Arg Ala Asn Arg Thr Glu Glu Asn Val
Ser Asp 165 170 175 Gly Ser Pro Asn Ala Gly Ser Val Glu Gln Thr Pro
Lys Lys Pro Gly 180 185 190 Leu Arg Arg Arg Gln Thr 195
9 541 DNA Homo sapiens modified_base (381)..(381) a, c, t, g, unknown or other
9 gcattcattc agcaagtatt tatgatgtca cctggctccc cgccagacac
tggggaaaca 60aacgtggaga cggggcactg cccgcacagg gcacattttg ggggacagct
accctgtctg 120gtgcaccatc ctcacctgca cctggcaagg tgggtgaatg
gggaggaatc cagacaggtg 180acctggggga tggcgggcta ttctctgatt
tggggaacac agagaggaca gggggcaaag 240tggaaagtaa atgaagatga
acgagttata tctacaagtc tgaagctgga gaaagatgtc 300tgggcttcaa
aggagatttc ggaggcagcg gctcacaagg aaacgatcct gaaatcgtgg
360cttaaagtgc tgagtgctca ntctnttcaa ggatgatgat gctttacaga
gtactggtgt 420cactttctgt acttgggtgg cattnggctt gccagataga
gtcagaaaga aaacaggccg 480gaaccaagcc cgtaacttcc naactttaat
gaccggaaac tgganatttg gacccatttg 540g 541
10 46 PRT Homo sapiens
10 Met
Trp Pro Thr Arg Arg Leu Val Thr Ile Lys Arg Ser Gly Val Asp 1 5 10
15 Gly Pro His Phe Pro Leu Ser Leu Ser Thr Cys Leu Phe Gly Arg Arg
20 25 30 Lys Cys Val Leu Gln Cys Thr Glu Cys Ser Lys Thr Ala Ile 35
40 45
11 2350 DNA Homo sapiens
11 gcagtgtcac taggccggct gggggccctg
ggtacgctgt agaccagacc gcgacaggcc 60agaacacggg cggcggcttc gggccgggag
acccgcgcag ccctcggggc atctcagtgc 120ctcactcccc accccctccc
ccgggtcggg ggaggcggcg cgtccggcgg agggttgagg 180ggagcggggc
aggcctggag cgccatgagc agcccggatg cgggatacgc cagtgacgac
240cagagccaga cccagagcgc gctgcccgcg gtgatggccg ggctgggccc
ctgcccctgg 300gccgagtcgc tgagccccat cggggacatg aaggtgaagg
gcgaggcgcc ggcgaacagc 360ggagcaccgg ccggggccgc gggccgagcc
aagggcgagt cccgtatccg gcggccgatg 420aacgctttca tggtgtgggc
taaggacgag cgcaagcggc tggcgcagca gaatccagac 480ctgcacaacg
ccgagttgag caagatgctg ggcaagtcgt ggaaggcgct gacgctggcg
540gagaagcggc ccttcgtgga ggaggcagag cggctgcgcg tgcagcacat
gcaggaccac 600cccaactaca agtaccggcc gcggcggcgc aagcaggtga
agcggctgaa gcgggtggag 660ggcggcttcc tgcacggcct ggctgagccg
caggcggccg cgctgggccc cgagggcggc 720cgcgtggcca tggacggcct
gggcctccag ttccccgagc agggcttccc cgccggcccg 780ccgctgctgc
ctccgcacat gggcggccac taccgcgact gccagagtct gggcgcgcct
840ccgctcgacg gctacccgtt gcccacgccc gacacgtccc cgctggacgg
cgtggacccc 900gacccggctt tcttcgccgc cccgatgccc ggggactgcc
cggcggccgg cacctacagc 960tacgcgcagg tctcggacta cgctggcccc
ccggagcctc ccgccggtcc catgcacccc 1020cgactcggcc cagagcccgc
gggtccctcg attccgggcc tcctggcgcc acccagcgcc 1080cttcacgtgt
actacggcgc gatgggctcg cccggggcgg gcggcgggcg cggcttccag
1140atgcagccgc aacaccagca ccagcaccag caccagcacc accccccggg
ccccggacag 1200ccgtcgcccc ctccggaggc actgccctgc cgggacggca
cggaccccag tcagcccgcc 1260gagctcctcg gggaggtgga ccgcacggaa
tttgaacagt atctgcactt cgtgtgcaag 1320cctgagatgg gcctccccta
ccaggggcat gactccggtg tgaatctccc cgacagccac 1380ggggccattt
cctcggtggt gtccgacgcc agctccgcgg tatattactg caactatcct
1440gacgtgtgac aggtccctga tccgccccag cctgcaggcc agaagcagtg
ttacacactt 1500cctggaggag ctaaggaaat cctcagactc ctgggttttt
gttgttgctg ttgttgtttt 1560ttaaaaggtg tgttggcata taatttatgg
taatttattt tgtctgccac ttgaacagtt 1620tgggggggtg aggtttcatt
taaaatttgt tcagagattt gtttcccata gttggattgt 1680caaaacccta
tttccaagtt caagttaact agctttgaat gtgtcccaaa acagcttcct
1740ccatttcctg aaagtttatt gatcaaagaa atgttgtcct gggtgtgttt
tttcaatctt 1800ctaaaaaata aaatctggaa tcctgctttt ttgctctact
agtacctctg tcacactagt 1860cttatcaaaa accagttctt aagatcaatg
ttaagtttat tagttaatgt aaatttctca 1920tcctcgaaaa gggtgaacat
aaatgccttt aaggagtata tctaaaaata aacattagga 1980tatctaagtt
tgatgtaatt gtttcaggaa ggaaaaaaga aaagcattct ggaatgagcc
2040tacttcaagt aatcttagtt tctaaaacta acagttaata ttttcaattc
cagtatatca 2100ctttaagtag aaggggatgt ccaagtaatt ttggttttct
aactgttgaa tcataagctt 2160gacctgcccc cagaggcttt ttggatgttt
ttatctgtgt tttgccatct ctttacactc 2220ctcgacattc agtttacctt
aatcttcaca tttttacacc ttgggaagtg gcaagcatcg 2280ctgggtttaa
gataaaggag tcacaaaaac taatcaaaat aaaatttgca ttatgacaac
2340ttttaataca 2350
12 414 PRT Homo sapiens
12 Met Ser Ser Pro Asp Ala
Gly Tyr Ala Ser Asp Asp Gln Ser Gln Thr 1 5 10 15 Gln Ser Ala Leu
Pro Ala Val Met Ala Gly Leu Gly Pro Cys Pro Trp 20 25 30 Ala Glu
Ser Leu Ser Pro Ile Gly Asp Met Lys Val Lys Gly Glu Ala 35 40 45
Pro Ala Asn Ser Gly Ala Pro Ala Gly Ala Ala Gly Arg Ala Lys Gly 50
55 60 Glu Ser Arg Ile Arg Arg Pro Met Asn Ala Phe Met Val Trp Ala
Lys 65 70 75 80 Asp Glu Arg Lys Arg Leu Ala Gln Gln Asn Pro Asp Leu
His Asn Ala 85 90 95 Glu Leu Ser Lys Met Leu Gly Lys Ser Trp Lys
Ala Leu Thr Leu Ala 100 105 110 Glu Lys Arg Pro Phe Val Glu Glu Ala
Glu Arg Leu Arg Val Gln His 115 120 125 Met Gln Asp His Pro Asn Tyr
Lys Tyr Arg Pro Arg Arg Arg Lys Gln 130 135 140 Val Lys Arg Leu Lys
Arg Val Glu Gly Gly Phe Leu His Gly Leu Ala 145 150 155 160 Glu Pro
Gln Ala Ala Ala Leu Gly Pro Glu Gly Gly Arg Val Ala Met 165 170 175
Asp Gly Leu Gly Leu Gln Phe Pro Glu Gln Gly Phe Pro Ala Gly Pro 180
185 190 Pro Leu Leu Pro Pro His Met Gly Gly His Tyr Arg Asp Cys Gln
Ser 195 200 205 Leu Gly Ala Pro Pro Leu Asp Gly Tyr Pro Leu Pro Thr
Pro Asp Thr 210 215 220 Ser Pro Leu Asp Gly Val Asp Pro Asp Pro Ala
Phe Phe Ala Ala Pro 225 230 235 240 Met Pro Gly Asp Cys Pro Ala Ala
Gly Thr Tyr Ser Tyr Ala Gln Val 245 250 255 Ser Asp Tyr Ala Gly Pro
Pro Glu Pro Pro Ala Gly Pro Met His Pro 260 265 270 Arg Leu Gly Pro
Glu Pro Ala Gly Pro Ser Ile Pro Gly Leu Leu Ala 275 280 285 Pro Pro
Ser Ala Leu His Val Tyr Tyr Gly Ala Met Gly Ser Pro Gly 290 295 300
Ala Gly Gly Gly Arg Gly Phe Gln Met Gln Pro Gln His Gln His Gln 305
310 315 320 His Gln His Gln His His Pro Pro Gly Pro Gly Gln Pro Ser
Pro Pro 325 330 335 Pro Glu Ala Leu Pro Cys Arg Asp Gly Thr Asp Pro
Ser Gln Pro Ala 340 345 350 Glu Leu Leu Gly Glu Val Asp Arg Thr Glu
Phe Glu Gln Tyr Leu His 355 360 365 Phe Val Cys Lys Pro Glu Met Gly
Leu Pro Tyr Gln Gly His Asp Ser 370 375 380 Gly Val Asn Leu Pro Asp
Ser His Gly Ala Ile Ser Ser Val Val Ser 385 390 395 400 Asp Ala Ser
Ser Ala Val Tyr Tyr Cys Asn Tyr Pro Asp Val 405 410
13 4507 DNA Homo sapiens
13 gaccaattgt catacgactt gcagtgagcg tcaggagcac gtccaggaac
tcctcagcag 60cgcctccttc agctccacag ccagacgccc tcagacagca aagcctaccc
ccgcgccgcg 120ccctgcccgc cgctgcgatg ctcgcccgcg ccctgctgct
gtgcgcggtc ctggcgctca 180gccatacagc aaatccttgc tgttcccacc
catgtcaaaa ccgaggtgta tgtatgagtg 240tgggatttga ccagtataag
tgcgattgta cccggacagg attctatgga gaaaactgct 300caacaccgga
atttttgaca agaataaaat tatttctgaa acccactcca aacacagtgc
360actacatact tacccacttc aagggatttt ggaacgttgt gaataacatt
cccttccttc 420gaaatgcaat tatgagttat gtgttgacat ccagatcaca
tttgattgac agtccaccaa 480cttacaatgc tgactatggc tacaaaagct
gggaagcctt ctctaacctc tcctattata 540ctagagccct tcctcctgtg
cctgatgatt gcccgactcc cttgggtgtc aaaggtaaaa 600agcagcttcc
tgattcaaat gagattgtgg aaaaattgct tctaagaaga aagttcatcc
660ctgatcccca gggctcaaac atgatgtttg cattctttgc ccagcacttc
acgcatcagt 720ttttcaagac agatcataag cgagggccag ctttcaccaa
cgggctgggc catggggtgg 780acttaaatca tatttacggt gaaactctgg
ctagacagcg taaactgcgc cttttcaagg 840atggaaaaat gaaatatcag
ataattgatg gagagatgta tcctcccaca gtcaaagata 900ctcaggcaga
gatgatctac cctcctcaag tccctgagca tctacggttt gctgtggggc
960aggaggtctt tggtctggtg cctggtctga tgatgtatgc cacaatctgg
ctgcgggaac 1020acaacagagt atgcgatgtg cttaaacagg agcatcctga
atggggtgat gagcagttgt 1080tccagacaag caggctaata ctgataggag
agactattaa gattgtgatt gaagattatg 1140tgcaacactt gagtggctat
cacttcaaac tgaaatttga cccagaacta cttttcaaca 1200aacaattcca
gtaccaaaat cgtattgctg ctgaatttaa caccctctat cactggcatc
1260cccttctgcc tgacaccttt caaattcatg accagaaata caactatcaa
cagtttatct 1320acaacaactc tatattgctg gaacatggaa ttacccagtt
tgttgaatca ttcaccaggc 1380aaattgctgg cagggttgct ggtggtagga
atgttccacc cgcagtacag aaagtatcac 1440aggcttccat tgaccagagc
aggcagatga aataccagtc ttttaatgag taccgcaaac 1500gctttatgct
gaagccctat gaatcatttg aagaacttac aggagaaaag gaaatgtctg
1560cagagttgga agcactctat ggtgacatcg atgctgtgga gctgtatcct
gcccttctgg 1620tagaaaagcc tcggccagat gccatctttg gtgaaaccat
ggtagaagtt ggagcaccat 1680tctccttgaa aggacttatg ggtaatgtta
tatgttctcc tgcctactgg aagccaagca 1740cttttggtgg agaagtgggt
tttcaaatca tcaacactgc ctcaattcag tctctcatct 1800gcaataacgt
gaagggctgt ccctttactt cattcagtgt tccagatcca gagctcatta
1860aaacagtcac catcaatgca agttcttccc gctccggact agatgatatc
aatcccacag 1920tactactaaa agaacgttcg actgaactgt agaagtctaa
tgatcatatt tatttattta 1980tatgaaccat gtctattaat ttaattattt
aataatattt atattaaact ccttatgtta 2040cttaacatct tctgtaacag
aagtcagtac tcctgttgcg gagaaaggag tcatacttgt 2100gaagactttt
atgtcactac tctaaagatt ttgctgttgc tgttaagttt ggaaaacagt
2160ttttattctg ttttataaac cagagagaaa tgagttttga cgtcttttta
cttgaatttc 2220aacttatatt ataagaacga aagtaaagat gtttgaatac
ttaaacactg tcacaagatg 2280gcaaaatgct gaaagttttt acactgtcga
tgtttccaat gcatcttcca tgatgcatta 2340gaagtaacta atgtttgaaa
ttttaaagta cttttggtta tttttctgtc atcaaacaaa 2400aacaggtatc
agtgcattat taaatgaata tttaaattag acattaccag taatttcatg
2460tctacttttt aaaatcagca atgaaacaat aatttgaaat ttctaaattc
atagggtaga 2520atcacctgta aaagcttgtt tgatttctta aagttattaa
acttgtacat ataccaaaaa 2580gaagctgtct tggatttaaa tctgtaaaat
cagtagaaat tttactacaa ttgcttgtta 2640aaatatttta taagtgatgt
tcctttttca ccaagagtat aaaccttttt agtgtgactg 2700ttaaaacttc
cttttaaatc aaaatgccaa atttattaag gtggtggagc cactgcagtg
2760ttatcttaaa ataagaatat tttgttgaga tattccagaa tttgtttata
tggctggtaa 2820catgtaaaat ctatatcagc aaaagggtct acctttaaaa
taagcaataa caaagaagaa 2880aaccaaatta ttgttcaaat ttaggtttaa
acttttgaag caaacttttt tttatccttg 2940tgcactgcag gcctggtact
cagattttgc tatgaggtta atgaagtacc aagctgtgct 3000tgaataatga
tatgttttct cagattttct gttgtacagt ttaatttagc agtccatatc
3060acattgcaaa agtagcaatg acctcataaa atacctcttc aaaatgctta
aattcatttc 3120acacattaat tttatctcag tcttgaagcc aattcagtag
gtgcattgga atcaagcctg 3180gctacctgca tgctgttcct tttcttttct
tcttttagcc attttgctaa gagacacagt 3240cttctcatca cttcgtttct
cctattttgt tttactagtt ttaagatcag agttcacttt 3300ctttggactc
tgcctatatt ttcttacctg aacttttgca agttttcagg taaacctcag
3360ctcaggactg ctatttagct cctcttaaga agattaaaag agaaaaaaaa
aggccctttt 3420aaaaatagta tacacttatt ttaagtgaaa agcagagaat
tttatttata gctaatttta 3480gctatctgta accaagatgg atgcaaagag
gctagtgcct cagagagaac tgtacggggt 3540ttgtgactgg aaaaagttac
gttcccattc taattaatgc cctttcttat ttaaaaacaa 3600aaccaaatga
tatctaagta gttctcagca ataataataa tgacgataat acttcttttc
3660cacatctcat tgtcactgac atttaatggt actgtatatt acttaattta
ttgaagatta 3720ttatttatgt cttattagga cactatggtt ataaactgtg
tttaagccta caatcattga 3780tttttttttg ttatgtcaca atcagtatat
cttctttggg gttacctctc tgaatattat 3840gtaaacaatc caaagaaatg
attgtattaa gatttgtgaa taaattttta gaaatctgat 3900tggcatattg
agatatttaa ggttgaatgt ttgtccttag gataggccta tgtgctagcc
3960cacaaagaat attgtctcat tagcctgaat gtgccataag actgaccttt
taaaatgttt 4020tgagggatct gtggatgctt cgttaatttg ttcagccaca
atttattgag aaaatattct 4080gtgtcaagca ctgtgggttt taatattttt
aaatcaaacg ctgattacag ataatagtat 4140ttatataaat aattgaaaaa
aattttcttt tgggaagagg gagaaaatga aataaatatc 4200attaaagata
actcaggaga atcttcttta caattttacg tttagaatgt ttaaggttaa
4260gaaagaaata gtcaatatgc ttgtataaaa cactgttcac tgtttttttt
aaaaaaaaaa 4320cttgatttgt tattaacatt gatctgctga caaaacctgg
gaatttgggt tgtgtatgcg 4380aatgtttcag tgcctcagac aaatgtgtat
ttaacttatg taaaagataa gtctggaaat 4440aaatgtctgt ttatttttgt
actatttaaa aattgacaga tcttttctga agaaaaaaaa 4500aaaaaaa 4507
14 604 PRT Homo sapiens
14 Met Leu Ala Arg Ala Leu Leu Leu Cys Ala
Val Leu Ala Leu Ser His 1 5 10 15 Thr Ala Asn Pro Cys Cys Ser His
Pro Cys Gln Asn Arg Gly Val Cys 20 25 30 Met Ser Val Gly Phe Asp
Gln Tyr Lys Cys Asp Cys Thr Arg Thr Gly 35 40 45 Phe Tyr Gly Glu
Asn Cys Ser Thr Pro Glu Phe Leu Thr Arg Ile Lys 50 55 60 Leu Phe
Leu Lys Pro Thr Pro Asn Thr Val His Tyr Ile Leu Thr His 65 70
75 80 Phe Lys Gly Phe Trp Asn Val Val Asn Asn Ile Pro Phe Leu Arg
Asn 85 90 95 Ala Ile Met Ser Tyr Val Leu Thr Ser Arg Ser His Leu
Ile Asp Ser 100 105 110 Pro Pro Thr Tyr Asn Ala Asp Tyr Gly Tyr Lys
Ser Trp Glu Ala Phe 115 120 125 Ser Asn Leu Ser Tyr Tyr Thr Arg Ala
Leu Pro Pro Val Pro Asp Asp 130 135 140 Cys Pro Thr Pro Leu Gly Val
Lys Gly Lys Lys Gln Leu Pro Asp Ser 145 150 155 160 Asn Glu Ile Val
Glu Lys Leu Leu Leu Arg Arg Lys Phe Ile Pro Asp 165 170 175 Pro Gln
Gly Ser Asn Met Met Phe Ala Phe Phe Ala Gln His Phe Thr 180 185 190
His Gln Phe Phe Lys Thr Asp His Lys Arg Gly Pro Ala Phe Thr Asn 195
200 205 Gly Leu Gly His Gly Val Asp Leu Asn His Ile Tyr Gly Glu Thr
Leu 210 215 220 Ala Arg Gln Arg Lys Leu Arg Leu Phe Lys Asp Gly Lys
Met Lys Tyr 225 230 235 240 Gln Ile Ile Asp Gly Glu Met Tyr Pro Pro
Thr Val Lys Asp Thr Gln 245 250 255 Ala Glu Met Ile Tyr Pro Pro Gln
Val Pro Glu His Leu Arg Phe Ala 260 265 270 Val Gly Gln Glu Val Phe
Gly Leu Val Pro Gly Leu Met Met Tyr Ala 275 280 285 Thr Ile Trp Leu
Arg Glu His Asn Arg Val Cys Asp Val Leu Lys Gln 290 295 300 Glu His
Pro Glu Trp Gly Asp Glu Gln Leu Phe Gln Thr Ser Arg Leu 305 310 315
320 Ile Leu Ile Gly Glu Thr Ile Lys Ile Val Ile Glu Asp Tyr Val Gln
325 330 335 His Leu Ser Gly Tyr His Phe Lys Leu Lys Phe Asp Pro Glu
Leu Leu 340 345 350 Phe Asn Lys Gln Phe Gln Tyr Gln Asn Arg Ile Ala
Ala Glu Phe Asn 355 360 365 Thr Leu Tyr His Trp His Pro Leu Leu Pro
Asp Thr Phe Gln Ile His 370 375 380 Asp Gln Lys Tyr Asn Tyr Gln Gln
Phe Ile Tyr Asn Asn Ser Ile Leu 385 390 395 400 Leu Glu His Gly Ile
Thr Gln Phe Val Glu Ser Phe Thr Arg Gln Ile 405 410 415 Ala Gly Arg
Val Ala Gly Gly Arg Asn Val Pro Pro Ala Val Gln Lys 420 425 430 Val
Ser Gln Ala Ser Ile Asp Gln Ser Arg Gln Met Lys Tyr Gln Ser 435 440
445 Phe Asn Glu Tyr Arg Lys Arg Phe Met Leu Lys Pro Tyr Glu Ser Phe
450 455 460 Glu Glu Leu Thr Gly Glu Lys Glu Met Ser Ala Glu Leu Glu
Ala Leu 465 470 475 480 Tyr Gly Asp Ile Asp Ala Val Glu Leu Tyr Pro
Ala Leu Leu Val Glu 485 490 495 Lys Pro Arg Pro Asp Ala Ile Phe Gly
Glu Thr Met Val Glu Val Gly 500 505 510 Ala Pro Phe Ser Leu Lys Gly
Leu Met Gly Asn Val Ile Cys Ser Pro 515 520 525 Ala Tyr Trp Lys Pro
Ser Thr Phe Gly Gly Glu Val Gly Phe Gln Ile 530 535 540 Ile Asn Thr
Ala Ser Ile Gln Ser Leu Ile Cys Asn Asn Val Lys Gly 545 550 555 560
Cys Pro Phe Thr Ser Phe Ser Val Pro Asp Pro Glu Leu Ile Lys Thr 565
570 575 Val Thr Ile Asn Ala Ser Ser Ser Arg Ser Gly Leu Asp Asp Ile
Asn 580 585 590 Pro Thr Val Leu Leu Lys Glu Arg Ser Thr Glu Leu 595 600
15 5616 DNA Homo sapiens
15 ccccggcgca gcgcggccgc agcagcctcc
gccccccgca cggtgtgagc gcccgacgcg 60gccgaggcgg ccggagtccc gagctagccc
cggcggccgc cgccgcccag accggacgac 120aggccacctc gtcggcgtcc
gcccgagtcc ccgcctcgcc gccaacgcca caaccaccgc 180gcacggcccc
ctgactccgt ccagtattga tcgggagagc cggagcgagc tcttcgggga
240gcagcgatgc gaccctccgg gacggccggg gcagcgctcc tggcgctgct
ggctgcgctc 300tgcccggcga gtcgggctct ggaggaaaag aaagtttgcc
aaggcacgag taacaagctc 360acgcagttgg gcacttttga agatcatttt
ctcagcctcc agaggatgtt caataactgt 420gaggtggtcc ttgggaattt
ggaaattacc tatgtgcaga ggaattatga tctttccttc 480ttaaagacca
tccaggaggt ggctggttat gtcctcattg ccctcaacac agtggagcga
540attcctttgg aaaacctgca gatcatcaga ggaaatatgt actacgaaaa
ttcctatgcc 600ttagcagtct tatctaacta tgatgcaaat aaaaccggac
tgaaggagct gcccatgaga 660aatttacagg aaatcctgca tggcgccgtg
cggttcagca acaaccctgc cctgtgcaac 720gtggagagca tccagtggcg
ggacatagtc agcagtgact ttctcagcaa catgtcgatg 780gacttccaga
accacctggg cagctgccaa aagtgtgatc caagctgtcc caatgggagc
840tgctggggtg caggagagga gaactgccag aaactgacca aaatcatctg
tgcccagcag 900tgctccgggc gctgccgtgg caagtccccc agtgactgct
gccacaacca gtgtgctgca 960ggctgcacag gcccccggga gagcgactgc
ctggtctgcc gcaaattccg agacgaagcc 1020acgtgcaagg acacctgccc
cccactcatg ctctacaacc ccaccacgta ccagatggat 1080gtgaaccccg
agggcaaata cagctttggt gccacctgcg tgaagaagtg tccccgtaat
1140tatgtggtga cagatcacgg ctcgtgcgtc cgagcctgtg gggccgacag
ctatgagatg 1200gaggaagacg gcgtccgcaa gtgtaagaag tgcgaagggc
cttgccgcaa agtgtgtaac 1260ggaataggta ttggtgaatt taaagactca
ctctccataa atgctacgaa tattaaacac 1320ttcaaaaact gcacctccat
cagtggcgat ctccacatcc tgccggtggc atttaggggt 1380gactccttca
cacatactcc tcctctggat ccacaggaac tggatattct gaaaaccgta
1440aaggaaatca cagggttttt gctgattcag gcttggcctg aaaacaggac
ggacctccat 1500gcctttgaga acctagaaat catacgcggc aggaccaagc
aacatggtca gttttctctt 1560gcagtcgtca gcctgaacat aacatccttg
ggattacgct ccctcaagga gataagtgat 1620ggagatgtga taatttcagg
aaacaaaaat ttgtgctatg caaatacaat aaactggaaa 1680aaactgtttg
ggacctccgg tcagaaaacc aaaattataa gcaacagagg tgaaaacagc
1740tgcaaggcca caggccaggt ctgccatgcc ttgtgctccc ccgagggctg
ctggggcccg 1800gagcccaggg actgcgtctc ttgccggaat gtcagccgag
gcagggaatg cgtggacaag 1860tgcaaccttc tggagggtga gccaagggag
tttgtggaga actctgagtg catacagtgc 1920cacccagagt gcctgcctca
ggccatgaac atcacctgca caggacgggg accagacaac 1980tgtatccagt
gtgcccacta cattgacggc ccccactgcg tcaagacctg cccggcagga
2040gtcatgggag aaaacaacac cctggtctgg aagtacgcag acgccggcca
tgtgtgccac 2100ctgtgccatc caaactgcac ctacggatgc actgggccag
gtcttgaagg ctgtccaacg 2160aatgggccta agatcccgtc catcgccact
gggatggtgg gggccctcct cttgctgctg 2220gtggtggccc tggggatcgg
cctcttcatg cgaaggcgcc acatcgttcg gaagcgcacg 2280ctgcggaggc
tgctgcagga gagggagctt gtggagcctc ttacacccag tggagaagct
2340cccaaccaag ctctcttgag gatcttgaag gaaactgaat tcaaaaagat
caaagtgctg 2400ggctccggtg cgttcggcac ggtgtataag ggactctgga
tcccagaagg tgagaaagtt 2460aaaattcccg tcgctatcaa ggaattaaga
gaagcaacat ctccgaaagc caacaaggaa 2520atcctcgatg aagcctacgt
gatggccagc gtggacaacc cccacgtgtg ccgcctgctg 2580ggcatctgcc
tcacctccac cgtgcagctc atcacgcagc tcatgccctt cggctgcctc
2640ctggactatg tccgggaaca caaagacaat attggctccc agtacctgct
caactggtgt 2700gtgcagatcg caaagggcat gaactacttg gaggaccgtc
gcttggtgca ccgcgacctg 2760gcagccagga acgtactggt gaaaacaccg
cagcatgtca agatcacaga ttttgggctg 2820gccaaactgc tgggtgcgga
agagaaagaa taccatgcag aaggaggcaa agtgcctatc 2880aagtggatgg
cattggaatc aattttacac agaatctata cccaccagag tgatgtctgg
2940agctacgggg tgaccgtttg ggagttgatg acctttggat ccaagccata
tgacggaatc 3000cctgccagcg agatctcctc catcctggag aaaggagaac
gcctccctca gccacccata 3060tgtaccatcg atgtctacat gatcatggtc
aagtgctgga tgatagacgc agatagtcgc 3120ccaaagttcc gtgagttgat
catcgaattc tccaaaatgg cccgagaccc ccagcgctac 3180cttgtcattc
agggggatga aagaatgcat ttgccaagtc ctacagactc caacttctac
3240cgtgccctga tggatgaaga agacatggac gacgtggtgg atgccgacga
gtacctcatc 3300ccacagcagg gcttcttcag cagcccctcc acgtcacgga
ctcccctcct gagctctctg 3360agtgcaacca gcaacaattc caccgtggct
tgcattgata gaaatgggct gcaaagctgt 3420cccatcaagg aagacagctt
cttgcagcga tacagctcag accccacagg cgccttgact 3480gaggacagca
tagacgacac cttcctccca gtgcctgaat acataaacca gtccgttccc
3540aaaaggcccg ctggctctgt gcagaatcct gtctatcaca atcagcctct
gaaccccgcg 3600cccagcagag acccacacta ccaggacccc cacagcactg
cagtgggcaa ccccgagtat 3660ctcaacactg tccagcccac ctgtgtcaac
agcacattcg acagccctgc ccactgggcc 3720cagaaaggca gccaccaaat
tagcctggac aaccctgact accagcagga cttctttccc 3780aaggaagcca
agccaaatgg catctttaag ggctccacag ctgaaaatgc agaataccta
3840agggtcgcgc cacaaagcag tgaatttatt ggagcatgac cacggaggat
agtatgagcc 3900ctaaaaatcc agactctttc gatacccagg accaagccac
agcaggtcct ccatcccaac 3960agccatgccc gcattagctc ttagacccac
agactggttt tgcaacgttt acaccgacta 4020gccaggaagt acttccacct
cgggcacatt ttgggaagtt gcattccttt gtcttcaaac 4080tgtgaagcat
ttacagaaac gcatccagca agaatattgt ccctttgagc agaaatttat
4140ctttcaaaga ggtatatttg aaaaaaaaaa aaagtatatg tgaggatttt
tattgattgg 4200ggatcttgga gtttttcatt gtcgctattg atttttactt
caatgggctc ttccaacaag 4260gaagaagctt gctggtagca cttgctaccc
tgagttcatc caggcccaac tgtgagcaag 4320gagcacaagc cacaagtctt
ccagaggatg cttgattcca gtggttctgc ttcaaggctt 4380ccactgcaaa
acactaaaga tccaagaagg ccttcatggc cccagcaggc cggatcggta
4440ctgtatcaag tcatggcagg tacagtagga taagccactc tgtcccttcc
tgggcaaaga 4500agaaacggag gggatggaat tcttccttag acttactttt
gtaaaaatgt ccccacggta 4560cttactcccc actgatggac cagtggtttc
cagtcatgag cgttagactg acttgtttgt 4620cttccattcc attgttttga
aactcagtat gctgcccctg tcttgctgtc atgaaatcag 4680caagagagga
tgacacatca aataataact cggattccag cccacattgg attcatcagc
4740atttggacca atagcccaca gctgagaatg tggaatacct aaggatagca
ccgcttttgt 4800tctcgcaaaa acgtatctcc taatttgagg ctcagatgaa
atgcatcagg tcctttgggg 4860catagatcag aagactacaa aaatgaagct
gctctgaaat ctcctttagc catcacccca 4920accccccaaa attagtttgt
gttacttatg gaagatagtt ttctcctttt acttcacttc 4980aaaagctttt
tactcaaaga gtatatgttc cctccaggtc agctgccccc aaaccccctc
5040cttacgcttt gtcacacaaa aagtgtctct gccttgagtc atctattcaa
gcacttacag 5100ctctggccac aacagggcat tttacaggtg cgaatgacag
tagcattatg agtagtgtgg 5160aattcaggta gtaaatatga aactagggtt
tgaaattgat aatgctttca caacatttgc 5220agatgtttta gaaggaaaaa
agttccttcc taaaataatt tctctacaat tggaagattg 5280gaagattcag
ctagttagga gcccaccttt tttcctaatc tgtgtgtgcc ctgtaacctg
5340actggttaac agcagtcctt tgtaaacagt gttttaaact ctcctagtca
atatccaccc 5400catccaattt atcaaggaag aaatggttca gaaaatattt
tcagcctaca gttatgttca 5460gtcacacaca catacaaaat gttccttttg
cttttaaagt aatttttgac tcccagatca 5520gtcagagccc ctacagcatt
gttaagaaag tatttgattt ttgtctcaat gaaaataaaa 5580ctatattcat
ttccactcta aaaaaaaaaa aaaaaa 5616
16 2239 DNA Homo sapiens
16 ccccggcgca
gcgcggccgc agcagcctcc gccccccgca cggtgtgagc gcccgacgcg 60gccgaggcgg
ccggagtccc gagctagccc cggcggccgc cgccgcccag accggacgac
120aggccacctc gtcggcgtcc gcccgagtcc ccgcctcgcc gccaacgcca
caaccaccgc 180gcacggcccc ctgactccgt ccagtattga tcgggagagc
cggagcgagc tcttcgggga 240gcagcgatgc gaccctccgg gacggccggg
gcagcgctcc tggcgctgct ggctgcgctc 300tgcccggcga gtcgggctct
ggaggaaaag aaagtttgcc aaggcacgag taacaagctc 360acgcagttgg
gcacttttga agatcatttt ctcagcctcc agaggatgtt caataactgt
420gaggtggtcc ttgggaattt ggaaattacc tatgtgcaga ggaattatga
tctttccttc 480ttaaagacca tccaggaggt ggctggttat gtcctcattg
ccctcaacac agtggagcga 540attcctttgg aaaacctgca gatcatcaga
ggaaatatgt actacgaaaa ttcctatgcc 600ttagcagtct tatctaacta
tgatgcaaat aaaaccggac tgaaggagct gcccatgaga 660aatttacagg
aaatcctgca tggcgccgtg cggttcagca acaaccctgc cctgtgcaac
720gtggagagca tccagtggcg ggacatagtc agcagtgact ttctcagcaa
catgtcgatg 780gacttccaga accacctggg cagctgccaa aagtgtgatc
caagctgtcc caatgggagc 840tgctggggtg caggagagga gaactgccag
aaactgacca aaatcatctg tgcccagcag 900tgctccgggc gctgccgtgg
caagtccccc agtgactgct gccacaacca gtgtgctgca 960ggctgcacag
gcccccggga gagcgactgc ctggtctgcc gcaaattccg agacgaagcc
1020acgtgcaagg acacctgccc cccactcatg ctctacaacc ccaccacgta
ccagatggat 1080gtgaaccccg agggcaaata cagctttggt gccacctgcg
tgaagaagtg tccccgtaat 1140tatgtggtga cagatcacgg ctcgtgcgtc
cgagcctgtg gggccgacag ctatgagatg 1200gaggaagacg gcgtccgcaa
gtgtaagaag tgcgaagggc cttgccgcaa agtgtgtaac 1260ggaataggta
ttggtgaatt taaagactca ctctccataa atgctacgaa tattaaacac
1320ttcaaaaact gcacctccat cagtggcgat ctccacatcc tgccggtggc
atttaggggt 1380gactccttca cacatactcc tcctctggat ccacaggaac
tggatattct gaaaaccgta 1440aaggaaatca cagggttttt gctgattcag
gcttggcctg aaaacaggac ggacctccat 1500gcctttgaga acctagaaat
catacgcggc aggaccaagc aacatggtca gttttctctt 1560gcagtcgtca
gcctgaacat aacatccttg ggattacgct ccctcaagga gataagtgat
1620ggagatgtga taatttcagg aaacaaaaat ttgtgctatg caaatacaat
aaactggaaa 1680aaactgtttg ggacctccgg tcagaaaacc aaaattataa
gcaacagagg tgaaaacagc 1740tgcaaggcca caggccaggt ctgccatgcc
ttgtgctccc ccgagggctg ctggggcccg 1800gagcccaggg actgcgtctc
ttgccggaat gtcagccgag gcagggaatg cgtggacaag 1860tgcaaccttc
tggagggtga gccaagggag tttgtggaga actctgagtg catacagtgc
1920cacccagagt gcctgcctca ggccatgaac atcacctgca caggacgggg
accagacaac 1980tgtatccagt gtgcccacta cattgacggc ccccactgcg
tcaagacctg cccggcagga 2040gtcatgggag aaaacaacac cctggtctgg
aagtacgcag acgccggcca tgtgtgccac 2100ctgtgccatc caaactgcac
ctacgggtcc taataaatct tcactgtctg actttagtct 2160cccactaaaa
ctgcatttcc tttctacaat ttcaatttct ccctttgctt caaataaagt
cctgacacta ttcatttga 2239
17 1595 DNA Homo sapiens
17 ccccggcgca
gcgcggccgc agcagcctcc gccccccgca cggtgtgagc gcccgacgcg 60gccgaggcgg
ccggagtccc gagctagccc cggcggccgc cgccgcccag accggacgac
120aggccacctc gtcggcgtcc gcccgagtcc ccgcctcgcc gccaacgcca
caaccaccgc 180gcacggcccc ctgactccgt ccagtattga tcgggagagc
cggagcgagc tcttcgggga 240gcagcgatgc gaccctccgg gacggccggg
gcagcgctcc tggcgctgct ggctgcgctc 300tgcccggcga gtcgggctct
ggaggaaaag aaagtttgcc aaggcacgag taacaagctc 360acgcagttgg
gcacttttga agatcatttt ctcagcctcc agaggatgtt caataactgt
420gaggtggtcc ttgggaattt ggaaattacc tatgtgcaga ggaattatga
tctttccttc 480ttaaagacca tccaggaggt ggctggttat gtcctcattg
ccctcaacac agtggagcga 540attcctttgg aaaacctgca gatcatcaga
ggaaatatgt actacgaaaa ttcctatgcc 600ttagcagtct tatctaacta
tgatgcaaat aaaaccggac tgaaggagct gcccatgaga 660aatttacagg
aaatcctgca tggcgccgtg cggttcagca acaaccctgc cctgtgcaac
720gtggagagca tccagtggcg ggacatagtc agcagtgact ttctcagcaa
catgtcgatg 780gacttccaga accacctggg cagctgccaa aagtgtgatc
caagctgtcc caatgggagc 840tgctggggtg caggagagga gaactgccag
aaactgacca aaatcatctg tgcccagcag 900tgctccgggc gctgccgtgg
caagtccccc agtgactgct gccacaacca gtgtgctgca 960ggctgcacag
gcccccggga gagcgactgc ctggtctgcc gcaaattccg agacgaagcc
1020acgtgcaagg acacctgccc cccactcatg ctctacaacc ccaccacgta
ccagatggat 1080gtgaaccccg agggcaaata cagctttggt gccacctgcg
tgaagaagtg tccccgtaat 1140tatgtggtga cagatcacgg ctcgtgcgtc
cgagcctgtg gggccgacag ctatgagatg 1200gaggaagacg gcgtccgcaa
gtgtaagaag tgcgaagggc cttgccgcaa agtgtgtaac 1260ggaataggta
ttggtgaatt taaagactca ctctccataa atgctacgaa tattaaacac
1320ttcaaaaact gcacctccat cagtggcgat ctccacatcc tgccggtggc
atttaggggt 1380gactccttca cacatactcc tcctctggat ccacaggaac
tggatattct gaaaaccgta 1440aaggaaatca caggtttgag ctgaattatc
acatgaatat aaatgggaaa tcagtgtttt 1500agagagagaa cttttcgaca
tatttcctgt tcccttggaa taaaaacatt tcttctgaaa 1560ttttaccgtt
aaaaaaaaaa aaaaaaaaaa aaaaa 1595
18 2865 DNA Homo sapiens
18 ccccggcgca
gcgcggccgc agcagcctcc gccccccgca cggtgtgagc gcccgacgcg 60gccgaggcgg
ccggagtccc gagctagccc cggcggccgc cgccgcccag accggacgac
120aggccacctc gtcggcgtcc gcccgagtcc ccgcctcgcc gccaacgcca
caaccaccgc 180gcacggcccc ctgactccgt ccagtattga tcgggagagc
cggagcgagc tcttcgggga 240gcagcgatgc gaccctccgg gacggccggg
gcagcgctcc tggcgctgct ggctgcgctc 300tgcccggcga gtcgggctct
ggaggaaaag aaagtttgcc aaggcacgag taacaagctc 360acgcagttgg
gcacttttga agatcatttt ctcagcctcc agaggatgtt caataactgt
420gaggtggtcc ttgggaattt ggaaattacc tatgtgcaga ggaattatga
tctttccttc 480ttaaagacca tccaggaggt ggctggttat gtcctcattg
ccctcaacac agtggagcga 540attcctttgg aaaacctgca gatcatcaga
ggaaatatgt actacgaaaa ttcctatgcc 600ttagcagtct tatctaacta
tgatgcaaat aaaaccggac tgaaggagct gcccatgaga 660aatttacagg
aaatcctgca tggcgccgtg cggttcagca acaaccctgc cctgtgcaac
720gtggagagca tccagtggcg ggacatagtc agcagtgact ttctcagcaa
catgtcgatg 780gacttccaga accacctggg cagctgccaa aagtgtgatc
caagctgtcc caatgggagc 840tgctggggtg caggagagga gaactgccag
aaactgacca aaatcatctg tgcccagcag 900tgctccgggc gctgccgtgg
caagtccccc agtgactgct gccacaacca gtgtgctgca 960ggctgcacag
gcccccggga gagcgactgc ctggtctgcc gcaaattccg agacgaagcc
1020acgtgcaagg acacctgccc cccactcatg ctctacaacc ccaccacgta
ccagatggat 1080gtgaaccccg agggcaaata cagctttggt gccacctgcg
tgaagaagtg tccccgtaat 1140tatgtggtga cagatcacgg ctcgtgcgtc
cgagcctgtg gggccgacag ctatgagatg 1200gaggaagacg gcgtccgcaa
gtgtaagaag tgcgaagggc cttgccgcaa agtgtgtaac 1260ggaataggta
ttggtgaatt taaagactca ctctccataa atgctacgaa tattaaacac
1320ttcaaaaact gcacctccat cagtggcgat ctccacatcc tgccggtggc
atttaggggt 1380gactccttca cacatactcc tcctctggat ccacaggaac
tggatattct gaaaaccgta 1440aaggaaatca cagggttttt gctgattcag
gcttggcctg aaaacaggac ggacctccat 1500gcctttgaga acctagaaat
catacgcggc aggaccaagc aacatggtca gttttctctt 1560gcagtcgtca
gcctgaacat aacatccttg ggattacgct ccctcaagga gataagtgat
1620ggagatgtga taatttcagg aaacaaaaat ttgtgctatg caaatacaat
aaactggaaa 1680aaactgtttg ggacctccgg tcagaaaacc aaaattataa
gcaacagagg tgaaaacagc 1740tgcaaggcca caggccaggt ctgccatgcc
ttgtgctccc ccgagggctg ctggggcccg 1800gagcccaggg actgcgtctc
ttgccggaat gtcagccgag gcagggaatg cgtggacaag 1860tgcaaccttc
tggagggtga gccaagggag tttgtggaga actctgagtg catacagtgc
1920cacccagagt gcctgcctca ggccatgaac atcacctgca caggacgggg
accagacaac 1980tgtatccagt gtgcccacta
cattgacggc ccccactgcg tcaagacctg cccggcagga 2040gtcatgggag
aaaacaacac cctggtctgg aagtacgcag acgccggcca tgtgtgccac
2100ctgtgccatc caaactgcac ctacgggcca ggaaatgaga gtctcaaagc
catgttattc 2160tgccttttta aactatcatc ctgtaatcaa agtaatgatg
gcagcgtgtc ccaccagagc 2220gggagcccag ctgctcagga gtcatgctta
ggatggatcc cttctcttct gccgtcagag 2280tttcagctgg gttggggtgg
atgcagccac ctccatgcct ggccttctgc atctgtgatc 2340atcacggcct
cctcctgcca ctgagcctca tgccttcacg tgtctgttcc ccccgctttt
2400cctttctgcc acccctgcac gtgggccgcc aggttcccaa gagtatccta
cccatttcct 2460tccttccact ccctttgcca gtgcctctca ccccaactag
tagctaacca tcacccccag 2520gactgacctc ttcctcctcg ctgccagatg
attgttcaaa gcacagaatt tgtcagaaac 2580ctgcagggac tccatgctgc
cagccttctc cgtaattagc atggccccag tccatgcttc 2640tagccttggt
tccttctgcc cctctgtttg aaattctaga gccagctgtg ggacaattat
2700ctgtgtcaaa agccagatgt gaaaacatct caataacaaa ctggctgctt
tgttcaatgc 2760tagaacaacg cctgtcacag agtagaaact caaaaatatt
tgctgagtga atgaacaaat 2820gaataaatgc ataataaata attaaccacc
aatccaacat ccaga 2865
19 1210 PRT Homo sapiens
19 Met Arg Pro Ser Gly
Thr Ala Gly Ala Ala Leu Leu Ala Leu Leu Ala 1 5 10 15 Ala Leu Cys
Pro Ala Ser Arg Ala Leu Glu Glu Lys Lys Val Cys Gln 20 25 30 Gly
Thr Ser Asn Lys Leu Thr Gln Leu Gly Thr Phe Glu Asp His Phe 35 40
45 Leu Ser Leu Gln Arg Met Phe Asn Asn Cys Glu Val Val Leu Gly Asn
50 55 60 Leu Glu Ile Thr Tyr Val Gln Arg Asn Tyr Asp Leu Ser Phe
Leu Lys 65 70 75 80 Thr Ile Gln Glu Val Ala Gly Tyr Val Leu Ile Ala
Leu Asn Thr Val 85 90 95 Glu Arg Ile Pro Leu Glu Asn Leu Gln Ile
Ile Arg Gly Asn Met Tyr 100 105 110 Tyr Glu Asn Ser Tyr Ala Leu Ala
Val Leu Ser Asn Tyr Asp Ala Asn 115 120 125 Lys Thr Gly Leu Lys Glu
Leu Pro Met Arg Asn Leu Gln Glu Ile Leu 130 135 140 His Gly Ala Val
Arg Phe Ser Asn Asn Pro Ala Leu Cys Asn Val Glu 145 150 155 160 Ser
Ile Gln Trp Arg Asp Ile Val Ser Ser Asp Phe Leu Ser Asn Met 165 170
175 Ser Met Asp Phe Gln Asn His Leu Gly Ser Cys Gln Lys Cys Asp Pro
180 185 190 Ser Cys Pro Asn Gly Ser Cys Trp Gly Ala Gly Glu Glu Asn
Cys Gln 195 200 205 Lys Leu Thr Lys Ile Ile Cys Ala Gln Gln Cys Ser
Gly Arg Cys Arg 210 215 220 Gly Lys Ser Pro Ser Asp Cys Cys His Asn
Gln Cys Ala Ala Gly Cys 225 230 235 240 Thr Gly Pro Arg Glu Ser Asp
Cys Leu Val Cys Arg Lys Phe Arg Asp 245 250 255 Glu Ala Thr Cys Lys
Asp Thr Cys Pro Pro Leu Met Leu Tyr Asn Pro 260 265 270 Thr Thr Tyr
Gln Met Asp Val Asn Pro Glu Gly Lys Tyr Ser Phe Gly 275 280 285 Ala
Thr Cys Val Lys Lys Cys Pro Arg Asn Tyr Val Val Thr Asp His 290 295
300 Gly Ser Cys Val Arg Ala Cys Gly Ala Asp Ser Tyr Glu Met Glu Glu
305 310 315 320 Asp Gly Val Arg Lys Cys Lys Lys Cys Glu Gly Pro Cys
Arg Lys Val 325 330 335 Cys Asn Gly Ile Gly Ile Gly Glu Phe Lys Asp
Ser Leu Ser Ile Asn 340 345 350 Ala Thr Asn Ile Lys His Phe Lys Asn
Cys Thr Ser Ile Ser Gly Asp 355 360 365 Leu His Ile Leu Pro Val Ala
Phe Arg Gly Asp Ser Phe Thr His Thr 370 375 380 Pro Pro Leu Asp Pro
Gln Glu Leu Asp Ile Leu Lys Thr Val Lys Glu 385 390 395 400 Ile Thr
Gly Phe Leu Leu Ile Gln Ala Trp Pro Glu Asn Arg Thr Asp 405 410 415
Leu His Ala Phe Glu Asn Leu Glu Ile Ile Arg Gly Arg Thr Lys Gln 420
425 430 His Gly Gln Phe Ser Leu Ala Val Val Ser Leu Asn Ile Thr Ser
Leu 435 440 445 Gly Leu Arg Ser Leu Lys Glu Ile Ser Asp Gly Asp Val
Ile Ile Ser 450 455 460 Gly Asn Lys Asn Leu Cys Tyr Ala Asn Thr Ile
Asn Trp Lys Lys Leu 465 470 475 480 Phe Gly Thr Ser Gly Gln Lys Thr
Lys Ile Ile Ser Asn Arg Gly Glu 485 490 495 Asn Ser Cys Lys Ala Thr
Gly Gln Val Cys His Ala Leu Cys Ser Pro 500 505 510 Glu Gly Cys Trp
Gly Pro Glu Pro Arg Asp Cys Val Ser Cys Arg Asn 515 520 525 Val Ser
Arg Gly Arg Glu Cys Val Asp Lys Cys Asn Leu Leu Glu Gly 530 535 540
Glu Pro Arg Glu Phe Val Glu Asn Ser Glu Cys Ile Gln Cys His Pro 545
550 555 560 Glu Cys Leu Pro Gln Ala Met Asn Ile Thr Cys Thr Gly Arg
Gly Pro 565 570 575 Asp Asn Cys Ile Gln Cys Ala His Tyr Ile Asp Gly
Pro His Cys Val 580 585 590 Lys Thr Cys Pro Ala Gly Val Met Gly Glu
Asn Asn Thr Leu Val Trp 595 600 605 Lys Tyr Ala Asp Ala Gly His Val
Cys His Leu Cys His Pro Asn Cys 610 615 620 Thr Tyr Gly Cys Thr Gly
Pro Gly Leu Glu Gly Cys Pro Thr Asn Gly 625 630 635 640 Pro Lys Ile
Pro Ser Ile Ala Thr Gly Met Val Gly Ala Leu Leu Leu 645 650 655 Leu
Leu Val Val Ala Leu Gly Ile Gly Leu Phe Met Arg Arg Arg His 660 665
670 Ile Val Arg Lys Arg Thr Leu Arg Arg Leu Leu Gln Glu Arg Glu Leu
675 680 685 Val Glu Pro Leu Thr Pro Ser Gly Glu Ala Pro Asn Gln Ala
Leu Leu 690 695 700 Arg Ile Leu Lys Glu Thr Glu Phe Lys Lys Ile Lys
Val Leu Gly Ser 705 710 715 720 Gly Ala Phe Gly Thr Val Tyr Lys Gly
Leu Trp Ile Pro Glu Gly Glu 725 730 735 Lys Val Lys Ile Pro Val Ala
Ile Lys Glu Leu Arg Glu Ala Thr Ser 740 745 750 Pro Lys Ala Asn Lys
Glu Ile Leu Asp Glu Ala Tyr Val Met Ala Ser 755 760 765 Val Asp Asn
Pro His Val Cys Arg Leu Leu Gly Ile Cys Leu Thr Ser 770 775 780 Thr
Val Gln Leu Ile Thr Gln Leu Met Pro Phe Gly Cys Leu Leu Asp 785 790
795 800 Tyr Val Arg Glu His Lys Asp Asn Ile Gly Ser Gln Tyr Leu Leu
Asn 805 810 815 Trp Cys Val Gln Ile Ala Lys Gly Met Asn Tyr Leu Glu
Asp Arg Arg 820 825 830 Leu Val His Arg Asp Leu Ala Ala Arg Asn Val
Leu Val Lys Thr Pro 835 840 845 Gln His Val Lys Ile Thr Asp Phe Gly
Leu Ala Lys Leu Leu Gly Ala 850 855 860 Glu Glu Lys Glu Tyr His Ala
Glu Gly Gly Lys Val Pro Ile Lys Trp 865 870 875 880 Met Ala Leu Glu
Ser Ile Leu His Arg Ile Tyr Thr His Gln Ser Asp 885 890 895 Val Trp
Ser Tyr Gly Val Thr Val Trp Glu Leu Met Thr Phe Gly Ser 900 905 910
Lys Pro Tyr Asp Gly Ile Pro Ala Ser Glu Ile Ser Ser Ile Leu Glu 915
920 925 Lys Gly Glu Arg Leu Pro Gln Pro Pro Ile Cys Thr Ile Asp Val
Tyr 930 935 940 Met Ile Met Val Lys Cys Trp Met Ile Asp Ala Asp Ser
Arg Pro Lys 945 950 955 960 Phe Arg Glu Leu Ile Ile Glu Phe Ser Lys
Met Ala Arg Asp Pro Gln 965 970 975 Arg Tyr Leu Val Ile Gln Gly Asp
Glu Arg Met His Leu Pro Ser Pro 980 985 990 Thr Asp Ser Asn Phe Tyr
Arg Ala Leu Met Asp Glu Glu Asp Met Asp 995 1000 1005 Asp Val Val
Asp Ala Asp Glu Tyr Leu Ile Pro Gln Gln Gly Phe 1010 1015 1020 Phe
Ser Ser Pro Ser Thr Ser Arg Thr Pro Leu Leu Ser Ser Leu 1025 1030
1035 Ser Ala Thr Ser Asn Asn Ser Thr Val Ala Cys Ile Asp Arg Asn
1040 1045 1050 Gly Leu Gln Ser Cys Pro Ile Lys Glu Asp Ser Phe Leu
Gln Arg 1055 1060 1065 Tyr Ser Ser Asp Pro Thr Gly Ala Leu Thr Glu
Asp Ser Ile Asp 1070 1075 1080 Asp Thr Phe Leu Pro Val Pro Glu Tyr
Ile Asn Gln Ser Val Pro 1085 1090 1095 Lys Arg Pro Ala Gly Ser Val
Gln Asn Pro Val Tyr His Asn Gln 1100 1105 1110 Pro Leu Asn Pro Ala
Pro Ser Arg Asp Pro His Tyr Gln Asp Pro 1115 1120 1125 His Ser Thr
Ala Val Gly Asn Pro Glu Tyr Leu Asn Thr Val Gln 1130 1135 1140 Pro
Thr Cys Val Asn Ser Thr Phe Asp Ser Pro Ala His Trp Ala 1145 1150
1155 Gln Lys Gly Ser His Gln Ile Ser Leu Asp Asn Pro Asp Tyr Gln
1160 1165 1170 Gln Asp Phe Phe Pro Lys Glu Ala Lys Pro Asn Gly Ile
Phe Lys 1175 1180 1185 Gly Ser Thr Ala Glu Asn Ala Glu Tyr Leu Arg
Val Ala Pro Gln 1190 1195 1200 Ser Ser Glu Phe Ile Gly Ala 1205
1210
<210> 20 <211> 628 <212> PRT <213> Homo sapiens <400> 20
Met Arg Pro Ser Gly Thr Ala Gly Ala Ala
Leu Leu Ala Leu Leu Ala 1 5 10 15 Ala Leu Cys Pro Ala Ser Arg Ala
Leu Glu Glu Lys Lys Val Cys Gln 20 25 30 Gly Thr Ser Asn Lys Leu
Thr Gln Leu Gly Thr Phe Glu Asp His Phe 35 40 45 Leu Ser Leu Gln
Arg Met Phe Asn Asn Cys Glu Val Val Leu Gly Asn 50 55 60 Leu Glu
Ile Thr Tyr Val Gln Arg Asn Tyr Asp Leu Ser Phe Leu Lys 65 70 75 80
Thr Ile Gln Glu Val Ala Gly Tyr Val Leu Ile Ala Leu Asn Thr Val 85
90 95 Glu Arg Ile Pro Leu Glu Asn Leu Gln Ile Ile Arg Gly Asn Met
Tyr 100 105 110 Tyr Glu Asn Ser Tyr Ala Leu Ala Val Leu Ser Asn Tyr
Asp Ala Asn 115 120 125 Lys Thr Gly Leu Lys Glu Leu Pro Met Arg Asn
Leu Gln Glu Ile Leu 130 135 140 His Gly Ala Val Arg Phe Ser Asn Asn
Pro Ala Leu Cys Asn Val Glu 145 150 155 160 Ser Ile Gln Trp Arg Asp
Ile Val Ser Ser Asp Phe Leu Ser Asn Met 165 170 175 Ser Met Asp Phe
Gln Asn His Leu Gly Ser Cys Gln Lys Cys Asp Pro 180 185 190 Ser Cys
Pro Asn Gly Ser Cys Trp Gly Ala Gly Glu Glu Asn Cys Gln 195 200 205
Lys Leu Thr Lys Ile Ile Cys Ala Gln Gln Cys Ser Gly Arg Cys Arg 210
215 220 Gly Lys Ser Pro Ser Asp Cys Cys His Asn Gln Cys Ala Ala Gly
Cys 225 230 235 240 Thr Gly Pro Arg Glu Ser Asp Cys Leu Val Cys Arg
Lys Phe Arg Asp 245 250 255 Glu Ala Thr Cys Lys Asp Thr Cys Pro Pro
Leu Met Leu Tyr Asn Pro 260 265 270 Thr Thr Tyr Gln Met Asp Val Asn
Pro Glu Gly Lys Tyr Ser Phe Gly 275 280 285 Ala Thr Cys Val Lys Lys
Cys Pro Arg Asn Tyr Val Val Thr Asp His 290 295 300 Gly Ser Cys Val
Arg Ala Cys Gly Ala Asp Ser Tyr Glu Met Glu Glu 305 310 315 320 Asp
Gly Val Arg Lys Cys Lys Lys Cys Glu Gly Pro Cys Arg Lys Val 325 330
335 Cys Asn Gly Ile Gly Ile Gly Glu Phe Lys Asp Ser Leu Ser Ile Asn
340 345 350 Ala Thr Asn Ile Lys His Phe Lys Asn Cys Thr Ser Ile Ser
Gly Asp 355 360 365 Leu His Ile Leu Pro Val Ala Phe Arg Gly Asp Ser
Phe Thr His Thr 370 375 380 Pro Pro Leu Asp Pro Gln Glu Leu Asp Ile
Leu Lys Thr Val Lys Glu 385 390 395 400 Ile Thr Gly Phe Leu Leu Ile
Gln Ala Trp Pro Glu Asn Arg Thr Asp 405 410 415 Leu His Ala Phe Glu
Asn Leu Glu Ile Ile Arg Gly Arg Thr Lys Gln 420 425 430 His Gly Gln
Phe Ser Leu Ala Val Val Ser Leu Asn Ile Thr Ser Leu 435 440 445 Gly
Leu Arg Ser Leu Lys Glu Ile Ser Asp Gly Asp Val Ile Ile Ser 450 455
460 Gly Asn Lys Asn Leu Cys Tyr Ala Asn Thr Ile Asn Trp Lys Lys Leu
465 470 475 480 Phe Gly Thr Ser Gly Gln Lys Thr Lys Ile Ile Ser Asn
Arg Gly Glu 485 490 495 Asn Ser Cys Lys Ala Thr Gly Gln Val Cys His
Ala Leu Cys Ser Pro 500 505 510 Glu Gly Cys Trp Gly Pro Glu Pro Arg
Asp Cys Val Ser Cys Arg Asn 515 520 525 Val Ser Arg Gly Arg Glu Cys
Val Asp Lys Cys Asn Leu Leu Glu Gly 530 535 540 Glu Pro Arg Glu Phe
Val Glu Asn Ser Glu Cys Ile Gln Cys His Pro 545 550 555 560 Glu Cys
Leu Pro Gln Ala Met Asn Ile Thr Cys Thr Gly Arg Gly Pro 565 570 575
Asp Asn Cys Ile Gln Cys Ala His Tyr Ile Asp Gly Pro His Cys Val 580
585 590 Lys Thr Cys Pro Ala Gly Val Met Gly Glu Asn Asn Thr Leu Val
Trp 595 600 605 Lys Tyr Ala Asp Ala Gly His Val Cys His Leu Cys His
Pro Asn Cys 610 615 620 Thr Tyr Gly Ser 625
<210> 21 <211> 405 <212> PRT <213> Homo sapiens <400> 21
Met Arg Pro Ser Gly Thr Ala Gly Ala Ala Leu Leu Ala Leu Leu Ala 1
5 10 15 Ala Leu Cys Pro Ala Ser Arg Ala Leu Glu Glu Lys Lys Val Cys
Gln 20 25 30 Gly Thr Ser Asn Lys Leu Thr Gln Leu Gly Thr Phe Glu
Asp His Phe 35 40 45 Leu Ser Leu Gln Arg Met Phe Asn Asn Cys Glu
Val Val Leu Gly Asn 50 55 60 Leu Glu Ile Thr Tyr Val Gln Arg Asn
Tyr Asp Leu Ser Phe Leu Lys 65 70 75 80 Thr Ile Gln Glu Val Ala Gly
Tyr Val Leu Ile Ala Leu Asn Thr Val 85 90 95 Glu Arg Ile Pro Leu
Glu Asn Leu Gln Ile Ile Arg Gly Asn Met Tyr 100 105 110 Tyr Glu Asn
Ser Tyr Ala Leu Ala Val Leu Ser Asn Tyr Asp Ala Asn 115 120 125 Lys
Thr Gly Leu Lys Glu Leu Pro Met Arg Asn Leu Gln Glu Ile Leu 130 135
140 His Gly Ala Val Arg Phe Ser Asn Asn Pro Ala Leu Cys Asn Val Glu
145 150 155 160 Ser Ile Gln Trp Arg Asp Ile Val Ser Ser Asp Phe Leu
Ser Asn Met 165 170 175 Ser Met Asp Phe Gln Asn His Leu Gly Ser Cys
Gln Lys Cys Asp Pro 180 185 190 Ser Cys Pro Asn Gly Ser Cys Trp Gly
Ala Gly Glu Glu Asn Cys Gln 195 200 205 Lys Leu Thr Lys Ile Ile Cys
Ala Gln Gln Cys Ser Gly Arg Cys Arg 210 215 220 Gly Lys Ser Pro Ser
Asp Cys Cys His Asn Gln Cys Ala Ala Gly Cys 225 230 235 240 Thr Gly
Pro Arg Glu Ser Asp Cys Leu Val Cys Arg Lys Phe Arg Asp 245 250 255
Glu Ala Thr Cys Lys Asp Thr Cys Pro Pro Leu Met Leu Tyr Asn Pro 260
265 270 Thr Thr Tyr Gln Met Asp Val Asn Pro Glu Gly Lys Tyr Ser Phe
Gly 275 280 285 Ala Thr Cys Val Lys Lys Cys Pro Arg Asn Tyr Val Val
Thr Asp His 290 295 300 Gly Ser Cys Val Arg Ala Cys Gly Ala Asp Ser
Tyr Glu Met Glu Glu 305 310
315 320 Asp Gly Val Arg Lys Cys Lys Lys Cys Glu Gly Pro Cys Arg Lys
Val 325 330 335 Cys Asn Gly Ile Gly Ile Gly Glu Phe Lys Asp Ser Leu
Ser Ile Asn 340 345 350 Ala Thr Asn Ile Lys His Phe Lys Asn Cys Thr
Ser Ile Ser Gly Asp 355 360 365 Leu His Ile Leu Pro Val Ala Phe Arg
Gly Asp Ser Phe Thr His Thr 370 375 380 Pro Pro Leu Asp Pro Gln Glu
Leu Asp Ile Leu Lys Thr Val Lys Glu 385 390 395 400 Ile Thr Gly Leu
Ser 405
<210> 22 <211> 705 <212> PRT <213> Homo sapiens <400> 22
Met Arg Pro Ser Gly Thr Ala Gly Ala
Ala Leu Leu Ala Leu Leu Ala 1 5 10 15 Ala Leu Cys Pro Ala Ser Arg
Ala Leu Glu Glu Lys Lys Val Cys Gln 20 25 30 Gly Thr Ser Asn Lys
Leu Thr Gln Leu Gly Thr Phe Glu Asp His Phe 35 40 45 Leu Ser Leu
Gln Arg Met Phe Asn Asn Cys Glu Val Val Leu Gly Asn 50 55 60 Leu
Glu Ile Thr Tyr Val Gln Arg Asn Tyr Asp Leu Ser Phe Leu Lys 65 70
75 80 Thr Ile Gln Glu Val Ala Gly Tyr Val Leu Ile Ala Leu Asn Thr
Val 85 90 95 Glu Arg Ile Pro Leu Glu Asn Leu Gln Ile Ile Arg Gly
Asn Met Tyr 100 105 110 Tyr Glu Asn Ser Tyr Ala Leu Ala Val Leu Ser
Asn Tyr Asp Ala Asn 115 120 125 Lys Thr Gly Leu Lys Glu Leu Pro Met
Arg Asn Leu Gln Glu Ile Leu 130 135 140 His Gly Ala Val Arg Phe Ser
Asn Asn Pro Ala Leu Cys Asn Val Glu 145 150 155 160 Ser Ile Gln Trp
Arg Asp Ile Val Ser Ser Asp Phe Leu Ser Asn Met 165 170 175 Ser Met
Asp Phe Gln Asn His Leu Gly Ser Cys Gln Lys Cys Asp Pro 180 185 190
Ser Cys Pro Asn Gly Ser Cys Trp Gly Ala Gly Glu Glu Asn Cys Gln 195
200 205 Lys Leu Thr Lys Ile Ile Cys Ala Gln Gln Cys Ser Gly Arg Cys
Arg 210 215 220 Gly Lys Ser Pro Ser Asp Cys Cys His Asn Gln Cys Ala
Ala Gly Cys 225 230 235 240 Thr Gly Pro Arg Glu Ser Asp Cys Leu Val
Cys Arg Lys Phe Arg Asp 245 250 255 Glu Ala Thr Cys Lys Asp Thr Cys
Pro Pro Leu Met Leu Tyr Asn Pro 260 265 270 Thr Thr Tyr Gln Met Asp
Val Asn Pro Glu Gly Lys Tyr Ser Phe Gly 275 280 285 Ala Thr Cys Val
Lys Lys Cys Pro Arg Asn Tyr Val Val Thr Asp His 290 295 300 Gly Ser
Cys Val Arg Ala Cys Gly Ala Asp Ser Tyr Glu Met Glu Glu 305 310 315
320 Asp Gly Val Arg Lys Cys Lys Lys Cys Glu Gly Pro Cys Arg Lys Val
325 330 335 Cys Asn Gly Ile Gly Ile Gly Glu Phe Lys Asp Ser Leu Ser
Ile Asn 340 345 350 Ala Thr Asn Ile Lys His Phe Lys Asn Cys Thr Ser
Ile Ser Gly Asp 355 360 365 Leu His Ile Leu Pro Val Ala Phe Arg Gly
Asp Ser Phe Thr His Thr 370 375 380 Pro Pro Leu Asp Pro Gln Glu Leu
Asp Ile Leu Lys Thr Val Lys Glu 385 390 395 400 Ile Thr Gly Phe Leu
Leu Ile Gln Ala Trp Pro Glu Asn Arg Thr Asp 405 410 415 Leu His Ala
Phe Glu Asn Leu Glu Ile Ile Arg Gly Arg Thr Lys Gln 420 425 430 His
Gly Gln Phe Ser Leu Ala Val Val Ser Leu Asn Ile Thr Ser Leu 435 440
445 Gly Leu Arg Ser Leu Lys Glu Ile Ser Asp Gly Asp Val Ile Ile Ser
450 455 460 Gly Asn Lys Asn Leu Cys Tyr Ala Asn Thr Ile Asn Trp Lys
Lys Leu 465 470 475 480 Phe Gly Thr Ser Gly Gln Lys Thr Lys Ile Ile
Ser Asn Arg Gly Glu 485 490 495 Asn Ser Cys Lys Ala Thr Gly Gln Val
Cys His Ala Leu Cys Ser Pro 500 505 510 Glu Gly Cys Trp Gly Pro Glu
Pro Arg Asp Cys Val Ser Cys Arg Asn 515 520 525 Val Ser Arg Gly Arg
Glu Cys Val Asp Lys Cys Asn Leu Leu Glu Gly 530 535 540 Glu Pro Arg
Glu Phe Val Glu Asn Ser Glu Cys Ile Gln Cys His Pro 545 550 555 560
Glu Cys Leu Pro Gln Ala Met Asn Ile Thr Cys Thr Gly Arg Gly Pro 565
570 575 Asp Asn Cys Ile Gln Cys Ala His Tyr Ile Asp Gly Pro His Cys
Val 580 585 590 Lys Thr Cys Pro Ala Gly Val Met Gly Glu Asn Asn Thr
Leu Val Trp 595 600 605 Lys Tyr Ala Asp Ala Gly His Val Cys His Leu
Cys His Pro Asn Cys 610 615 620 Thr Tyr Gly Pro Gly Asn Glu Ser Leu
Lys Ala Met Leu Phe Cys Leu 625 630 635 640 Phe Lys Leu Ser Ser Cys
Asn Gln Ser Asn Asp Gly Ser Val Ser His 645 650 655 Gln Ser Gly Ser
Pro Ala Ala Gln Glu Ser Cys Leu Gly Trp Ile Pro 660 665 670 Ser Leu
Leu Pro Ser Glu Phe Gln Leu Gly Trp Gly Gly Cys Ser His 675 680 685
Leu His Ala Trp Pro Ser Ala Ser Val Ile Ile Thr Ala Ser Ser Cys 690
695 700 His 705
<210> 23 <211> 1576 <212> DNA <213> Homo sapiens <400> 23
gcgaggcagc cagcgaggga
gagagcgagc gggcgagccg gagcgaggaa gggaaagcgc 60aagagagagc gcacacgcac
acacccgccg cgcgcactcg cgcacggacc cgcacgggga 120cagctcggaa
gtcatcagtt ccatgggcga gatgctgctg ctggcgagat gtctgctgct
180agtcctcgtc tcctcgctgc tggtatgctc gggactggcg tgcggaccgg
gcagggggtt 240cgggaagagg aggcacccca aaaagctgac ccctttagcc
tacaagcagt ttatccccaa 300tgtggccgag aagaccctag gcgccagcgg
aaggtatgaa gggaagatct ccagaaactc 360cgagcgattt aaggaactca
cccccaatta caaccccgac atcatattta aggatgaaga 420aaacaccgga
gcggacaggc tgatgactca gaggtgtaag gacaagttga acgctttggc
480catctcggtg atgaaccagt ggccaggagt gaaactgcgg gtgaccgagg
gctgggacga 540agatggccac cactcagagg agtctctgca ctacgagggc
cgcgcagtgg acatcaccac 600gtctgaccgc gaccgcagca agtacggcat
gctggcccgc ctggcggtgg aggccggctt 660cgactgggtg tactacgagt
ccaaggcaca tatccactgc tcggtgaaag cagagaactc 720ggtggcggcc
aaatcgggag gctgcttccc gggctcggcc acggtgcacc tggagcaggg
780cggcaccaag ctggtgaagg acctgagccc cggggaccgc gtgctggcgg
cggacgacca 840gggccggctg ctctacagcg acttcctcac tttcctggac
cgcgacgacg gcgccaagaa 900ggtcttctac gtgatcgaga cgcgggagcc
gcgcgagcgc ctgctgctca ccgccgcgca 960cctgctcttt gtggcgccgc
acaacgactc ggccaccggg gagcccgagg cgtcctcggg 1020ctcggggccg
ccttccgggg gcgcactggg gcctcgggcg ctgttcgcca gccgcgtgcg
1080cccgggccag cgcgtgtacg tggtggccga gcgtgacggg gaccgccggc
tcctgcccgc 1140cgctgtgcac agcgtgaccc taagcgagga ggccgcgggc
gcctacgcgc cgctcacggc 1200ccagggcacc attctcatca accgggtgct
ggcctcgtgc tacgcggtca tcgaggagca 1260cagctgggcg caccgggcct
tcgcgccctt ccgcctggcg cacgcgctcc tggctgcact 1320ggcgcccgcg
cgcacggacc gcggcgggga cagcggcggc ggggaccgcg ggggcggcgg
1380cggcagagta gccctaaccg ctccaggtgc tgccgacgct ccgggtgcgg
gggccaccgc 1440gggcatccac tggtactcgc agctgctcta ccaaataggc
acctggctcc tggacagcga 1500ggccctgcac ccgctgggca tggcggtcaa
gtccagctga agccgggggg ccgggggagg 1560ggcgcgggag ggggcg
1576
<210> 24 <211> 462 <212> PRT <213> Homo sapiens <400> 24
Met Leu Leu Leu Ala Arg Cys Leu Leu Leu
Val Leu Val Ser Ser Leu 1 5 10 15 Leu Val Cys Ser Gly Leu Ala Cys
Gly Pro Gly Arg Gly Phe Gly Lys 20 25 30 Arg Arg His Pro Lys Lys
Leu Thr Pro Leu Ala Tyr Lys Gln Phe Ile 35 40 45 Pro Asn Val Ala
Glu Lys Thr Leu Gly Ala Ser Gly Arg Tyr Glu Gly 50 55 60 Lys Ile
Ser Arg Asn Ser Glu Arg Phe Lys Glu Leu Thr Pro Asn Tyr 65 70 75 80
Asn Pro Asp Ile Ile Phe Lys Asp Glu Glu Asn Thr Gly Ala Asp Arg 85
90 95 Leu Met Thr Gln Arg Cys Lys Asp Lys Leu Asn Ala Leu Ala Ile
Ser 100 105 110 Val Met Asn Gln Trp Pro Gly Val Lys Leu Arg Val Thr
Glu Gly Trp 115 120 125 Asp Glu Asp Gly His His Ser Glu Glu Ser Leu
His Tyr Glu Gly Arg 130 135 140 Ala Val Asp Ile Thr Thr Ser Asp Arg
Asp Arg Ser Lys Tyr Gly Met 145 150 155 160 Leu Ala Arg Leu Ala Val
Glu Ala Gly Phe Asp Trp Val Tyr Tyr Glu 165 170 175 Ser Lys Ala His
Ile His Cys Ser Val Lys Ala Glu Asn Ser Val Ala 180 185 190 Ala Lys
Ser Gly Gly Cys Phe Pro Gly Ser Ala Thr Val His Leu Glu 195 200 205
Gln Gly Gly Thr Lys Leu Val Lys Asp Leu Ser Pro Gly Asp Arg Val 210
215 220 Leu Ala Ala Asp Asp Gln Gly Arg Leu Leu Tyr Ser Asp Phe Leu
Thr 225 230 235 240 Phe Leu Asp Arg Asp Asp Gly Ala Lys Lys Val Phe
Tyr Val Ile Glu 245 250 255 Thr Arg Glu Pro Arg Glu Arg Leu Leu Leu
Thr Ala Ala His Leu Leu 260 265 270 Phe Val Ala Pro His Asn Asp Ser
Ala Thr Gly Glu Pro Glu Ala Ser 275 280 285 Ser Gly Ser Gly Pro Pro
Ser Gly Gly Ala Leu Gly Pro Arg Ala Leu 290 295 300 Phe Ala Ser Arg
Val Arg Pro Gly Gln Arg Val Tyr Val Val Ala Glu 305 310 315 320 Arg
Asp Gly Asp Arg Arg Leu Leu Pro Ala Ala Val His Ser Val Thr 325 330
335 Leu Ser Glu Glu Ala Ala Gly Ala Tyr Ala Pro Leu Thr Ala Gln Gly
340 345 350 Thr Ile Leu Ile Asn Arg Val Leu Ala Ser Cys Tyr Ala Val
Ile Glu 355 360 365 Glu His Ser Trp Ala His Arg Ala Phe Ala Pro Phe
Arg Leu Ala His 370 375 380 Ala Leu Leu Ala Ala Leu Ala Pro Ala Arg
Thr Asp Arg Gly Gly Asp 385 390 395 400 Ser Gly Gly Gly Asp Arg Gly
Gly Gly Gly Gly Arg Val Ala Leu Thr 405 410 415 Ala Pro Gly Ala Ala
Asp Ala Pro Gly Ala Gly Ala Thr Ala Gly Ile 420 425 430 His Trp Tyr
Ser Gln Leu Leu Tyr Gln Ile Gly Thr Trp Leu Leu Asp 435 440 445 Ser
Glu Ala Leu His Pro Leu Gly Met Ala Val Lys Ser Ser 450 455 460
<210> 25 <211> 11242 <212> DNA <213> Homo sapiens <400> 25
tttttttttt ttttttttga gaaaggggaa
tttcatccca aataaaagga atgaagtctg 60gctccggagg agggtccccg acctcgctgt
gggggctcct gtttctctcc gccgcgctct 120cgctctggcc gacgagtgga
gaaatctgcg ggccaggcat cgacatccgc aacgactatc 180agcagctgaa
gcgcctggag aactgcacgg tgatcgaggg ctacctccac atcctgctca
240tctccaaggc cgaggactac cgcagctacc gcttccccaa gctcacggtc
attaccgagt 300acttgctgct gttccgagtg gctggcctcg agagcctcgg
agacctcttc cccaacctca 360cggtcatccg cggctggaaa ctcttctaca
actacgccct ggtcatcttc gagatgacca 420atctcaagga tattgggctt
tacaacctga ggaacattac tcggggggcc atcaggattg 480agaaaaatgc
tgacctctgt tacctctcca ctgtggactg gtccctgatc ctggatgcgg
540tgtccaataa ctacattgtg gggaataagc ccccaaagga atgtggggac
ctgtgtccag 600ggaccatgga ggagaagccg atgtgtgaga agaccaccat
caacaatgag tacaactacc 660gctgctggac cacaaaccgc tgccagaaaa
tgtgcccaag cacgtgtggg aagcgggcgt 720gcaccgagaa caatgagtgc
tgccaccccg agtgcctggg cagctgcagc gcgcctgaca 780acgacacggc
ctgtgtagct tgccgccact actactatgc cggtgtctgt gtgcctgcct
840gcccgcccaa cacctacagg tttgagggct ggcgctgtgt ggaccgtgac
ttctgcgcca 900acatcctcag cgccgagagc agcgactccg aggggtttgt
gatccacgac ggcgagtgca 960tgcaggagtg cccctcgggc ttcatccgca
acggcagcca gagcatgtac tgcatccctt 1020gtgaaggtcc ttgcccgaag
gtctgtgagg aagaaaagaa aacaaagacc attgattctg 1080ttacttctgc
tcagatgctc caaggatgca ccatcttcaa gggcaatttg ctcattaaca
1140tccgacgggg gaataacatt gcttcagagc tggagaactt catggggctc
atcgaggtgg 1200tgacgggcta cgtgaagatc cgccattctc atgccttggt
ctccttgtcc ttcctaaaaa 1260accttcgcct catcctagga gaggagcagc
tagaagggaa ttactccttc tacgtcctcg 1320acaaccagaa cttgcagcaa
ctgtgggact gggaccaccg caacctgacc atcaaagcag 1380ggaaaatgta
ctttgctttc aatcccaaat tatgtgtttc cgaaatttac cgcatggagg
1440aagtgacggg gactaaaggg cgccaaagca aaggggacat aaacaccagg
aacaacgggg 1500agagagcctc ctgtgaaagt gacgtcctgc atttcacctc
caccaccacg tcgaagaatc 1560gcatcatcat aacctggcac cggtaccggc
cccctgacta cagggatctc atcagcttca 1620ccgtttacta caaggaagca
ccctttaaga atgtcacaga gtatgatggg caggatgcct 1680gcggctccaa
cagctggaac atggtggacg tggacctccc gcccaacaag gacgtggagc
1740ccggcatctt actacatggg ctgaagccct ggactcagta cgccgtttac
gtcaaggctg 1800tgaccctcac catggtggag aacgaccata tccgtggggc
caagagtgag atcttgtaca 1860ttcgcaccaa tgcttcagtt ccttccattc
ccttggacgt tctttcagca tcgaactcct 1920cttctcagtt aatcgtgaag
tggaaccctc cctctctgcc caacggcaac ctgagttact 1980acattgtgcg
ctggcagcgg cagcctcagg acggctacct ttaccggcac aattactgct
2040ccaaagacaa aatccccatc aggaagtatg ccgacggcac catcgacatt
gaggaggtca 2100cagagaaccc caagactgag gtgtgtggtg gggagaaagg
gccttgctgc gcctgcccca 2160aaactgaagc cgagaagcag gccgagaagg
aggaggctga ataccgcaaa gtctttgaga 2220atttcctgca caactccatc
ttcgtgccca gacctgaaag gaagcggaga gatgtcatgc 2280aagtggccaa
caccaccatg tccagccgaa gcaggaacac cacggccgca gacacctaca
2340acatcaccga cccggaagag ctggagacag agtacccttt ctttgagagc
agagtggata 2400acaaggagag aactgtcatt tctaaccttc ggcctttcac
attgtaccgc atcgatatcc 2460acagctgcaa ccacgaggct gagaagctgg
gctgcagcgc ctccaacttc gtctttgcaa 2520ggactatgcc cgcagaagga
gcagatgaca ttcctgggcc agtgacctgg gagccaaggc 2580ctgaaaactc
catcttttta aagtggccgg aacctgagaa tcccaatgga ttgattctaa
2640tgtatgaaat aaaatacgga tcacaagttg aggatcagcg agaatgtgtg
tccagacagg 2700aatacaggaa gtatggaggg gccaagctaa accggctaaa
cccggggaac tacacagccc 2760ggattcaggc cacatctctc tctgggaatg
ggtcgtggac agatcctgtg ttcttctatg 2820tccaggccaa aacaggatat
gaaaacttca tccatctgat catcgctctg cccgtcgctg 2880tcctgttgat
cgtgggaggg ttggtgatta tgctgtacgt cttccataga aagagaaata
2940acagcaggct ggggaatgga gtgctgtatg cctctgtgaa cccggagtac
ttcagcgctg 3000ctgatgtgta cgttcctgat gagtgggagg tggctcggga
gaagatcacc atgagccggg 3060aacttgggca ggggtcgttt gggatggtct
atgaaggagt tgccaagggt gtggtgaaag 3120atgaacctga aaccagagtg
gccattaaaa cagtgaacga ggccgcaagc atgcgtgaga 3180ggattgagtt
tctcaacgaa gcttctgtga tgaaggagtt caattgtcac catgtggtgc
3240gattgctggg tgtggtgtcc caaggccagc caacactggt catcatggaa
ctgatgacac 3300ggggcgatct caaaagttat ctccggtctc tgaggccaga
aatggagaat aatccagtcc 3360tagcacctcc aagcctgagc aagatgattc
agatggccgg agagattgca gacggcatgg 3420catacctcaa cgccaataag
ttcgtccaca gagaccttgc tgcccggaat tgcatggtag 3480ccgaagattt
cacagtcaaa atcggagatt ttggtatgac gcgagatatc tatgagacag
3540actattaccg gaaaggaggg aaagggctgc tgcccgtgcg ctggatgtct
cctgagtccc 3600tcaaggatgg agtcttcacc acttactcgg acgtctggtc
cttcggggtc gtcctctggg 3660agatcgccac actggccgag cagccctacc
agggcttgtc caacgagcaa gtccttcgct 3720tcgtcatgga gggcggcctt
ctggacaagc cagacaactg tcctgacatg ctgtttgaac 3780tgatgcgcat
gtgctggcag tataacccca agatgaggcc ttccttcctg gagatcatca
3840gcagcatcaa agaggagatg gagcctggct tccgggaggt ctccttctac
tacagcgagg 3900agaacaagct gcccgagccg gaggagctgg acctggagcc
agagaacatg gagagcgtcc 3960ccctggaccc ctcggcctcc tcgtcctccc
tgccactgcc cgacagacac tcaggacaca 4020aggccgagaa cggccccggc
cctggggtgc tggtcctccg cgccagcttc gacgagagac 4080agccttacgc
ccacatgaac gggggccgca agaacgagcg ggccttgccg ctgccccagt
4140cttcgacctg ctgatccttg gatcctgaat ctgtgcaaac agtaacgtgt
gcgcacgcgc 4200agcggggtgg ggggggagag agagttttaa caatccattc
acaagcctcc tgtacctcag 4260tggatcttca gaactgccct tgctgcccgc
gggagacagc ttctctgcag taaaacacat 4320ttgggatgtt ccttttttca
atatgcaagc agctttttat tccctgccca aacccttaac 4380tgacatgggc
ctttaagaac cttaatgaca acacttaata gcaacagagc acttgagaac
4440cagtctcctc actctgtccc tgtccttccc tgttctccct ttctctctcc
tctctgcttc 4500ataacggaaa aataattgcc acaagtccag ctgggaagcc
ctttttatca gtttgaggaa 4560gtggctgtcc ctgtggcccc atccaaccac
tgtacacacc cgcctgacac cgtgggtcat 4620tacaaaaaaa cacgtggaga
tggaaatttt tacctttatc tttcaccttt ctagggacat 4680gaaatttaca
aagggccatc gttcatccaa ggctgttacc attttaacgc tgcctaattt
4740tgccaaaatc ctgaactttc tccctcatcg gcccggcgct gattcctcgt
gtccggaggc 4800atgggtgagc atggcagctg gttgctccat ttgagagaca
cgctggcgac acactccgtc 4860catccgactg cccctgctgt gctgctcaag
gccacaggca cacaggtctc attgcttctg 4920actagattat tatttggggg
aactggacac aataggtctt tctctcagtg aaggtgggga 4980gaagctgaac
cggcttccct gccctgcctc cccagccccc tgcccaaccc ccaagaatct
5040ggtggccatg ggccccgaag cagcctggcg gacaggcttg gagtcaaggg
gccccatgcc 5100tgcttctctc ccagccccag ctcccccgcc cgcccccaag
gacacagatg ggaaggggtt 5160tccagggact cagccccact gttgatgcag
gtttgcaagg aaagaaattc aaacaccaca 5220acagcagtaa gaagaaaagc
agtcaatgga ttcaagcatt ctaagctttg ttgacatttt 5280ctctgttcct
aggacttctt catgggtctt acagttctat gttagaccat gaaacatttg
5340catacacatc gtctttaatg tcacttttat aactttttta cggttcagat
attcatctat 5400acgtctgtac agaaaaaaaa aagctgctat tttttttgtt
cttgatcttt gtggatttaa 5460tctatgaaaa ccttcaggtc caccctctcc
cctttctgct cactccaaga aacttcttat 5520gctttgtact agagtgcgtg
actttcttcc tcttttcccg gtaatggata cttctatcac 5580ataatttgcc
atgaactgtt ggatgccttt ttataaatac atcccccatc cctgctccca
5640cctgcccctt tagttgtttt ctaacccgta ggctctctgg gcacgaggca
gaaagcaggc 5700cgggcaccca tcctgagagg gccgcgctcc tctccccagc
ctgccctcac agcattggag 5760cctgttacag tgcaagacat gatacaaact
caggtcagaa aaacaaaggt taaatatttc 5820acacgtcttt gttcagtgtt
tccactcacc gtggttgaga agcctcaccc tctctttccc 5880ttgcctttgc
ttaggttgtg acacacatat atatatattt ttttaattct tgggtacaac
5940agcagtgtta accgcagaca ctaggcattt ggattactat ttttcttaat
ggctatttaa 6000tccttccatc ccacgaaaaa cagctgctga gtccaaggga
gcagcagagc gtggtccggc 6060agggcctgtt gtggccctcg ccacccccct
caccggaccg actgacctgt ctttggaacc 6120agaacatccc aagggaactc
cttcgcactg gcgttgagtg ggaccccggg atccaggctg 6180gcccagggcg
gcaccctcag ggctgtgccc gctggagtgc taggtggagg cagcacagac
6240gccacggtgg cccaagagcc cctttgcttc ttgctggggg accagggctg
tggtgctggc 6300ccactttccc tcggccagga atccaggtcc ttggggccca
ggggtcttgt cttgtttcat 6360ttttagcact tctcaccaga gagatgacag
cacaagagtt gcttctggga tagaaatgtt 6420taggagtaag aacaaagctg
ggatacggtg attgctagtt gtgactgaag attcaacaca 6480gaaaagaaag
tttatacggc ttttttgctg gtcagcagtt tgtcccactg ctttctctag
6540tctctatccc atagcgtgtt ccctttaaaa aaaaaaaaaa ggtattatat
gtaggagttt 6600tcttttaatt tattttgtga taaattacca gtttcaatca
ctgtagaaaa gccccattat 6660gaatttaaat ttcaaggaaa gggtgtgtgt
gtgtgtatgt gtggggtgtg tgtgtgtgag 6720agtgatggga cagttcttga
ttttttgggt tttttttccc ccaaacattt atctacctca 6780ctcttatttt
ttatatgtgt atatagacaa aagaatacat ctcacctttc tcagcacctg
6840acaataggcc gttgatactg gtaacctcat ccacgccaca ggcgccacac
ccaggtgatg 6900cagggggaag ccaggctgta ttccggggtc aaagcaacac
taactcacct ctctgctcat 6960ttcagacagc ttgccttttt ctgagatgtc
ctgttttgtg ttgctttttt tgttttgttt 7020tctatcttgg tttccaccaa
ggtgttagat ttctcctcct cctagccagg tggccctgtg 7080aggccaacga
gggcaccaga gcacacctgg gggagccacc aggctgtccc tggctggttg
7140tctttggaac aaactgcttc tgtgcagatg gaatgaccaa cacatttcgt
ccttaagaga 7200gcagtggttc ctcaggttct gaggagagga aggtgtccag
gcagcaccat ctctgtgcga 7260atccccaggg taaaggcgtg gggcattggg
tttgctcccc ttgctgctgc tccatccctg 7320caggaggctc gcgctgaggc
aggaccgtgc ggccatggct gctgcattca ttgagcacaa 7380aggtgcagct
gcagcagcag ctggagagca agagtcaccc agcctgtgcg ccagaatgca
7440gaggctcctg acctcacagc cagtccctga tagaacacac gcaggagcag
agtcccctcc 7500ccctccaggc tgccctctca acttctccct cacctccttc
cctaggggta gacagagatg 7560taccaaacct tccggctgga aagcccagtg
gccggcgccg aggctcgtgg cgtcacgccc 7620cccccgccag ggctgtacct
ccgtctccct ggtcctgctg ctcacaggac agacggctcg 7680ctcccctctt
ccagcagctg ctcttacagg cactgatgat ttcgctggga agtgtggcgg
7740gcagctttgc ctaagcgtgg atggctcctc ggcaattcca gcctaagtga
aggcgctcag 7800gagcctcctg ctggaacgcg acccatctct cccaggaccc
cggggatctt aaggtcattg 7860agaaatactg ttggatcagg gttttgttct
tccacactgt aggtgacccc ttggaataac 7920ggcctctcct ctcgtgcaca
tacctaccgg tttccacaac tggatttcta cagatcattc 7980agctggttat
aagggttttg tttaaactgt ccgagttact gatgtcattt tgtttttgtt
8040ttatgtaggt agcttttaag tagaaaacac taacagtgta gtgcccatca
tagcaaatgc 8100ttcagaaaca cctcaataaa agagaaaact tggcttgtgt
gatggtgcag tcactttact 8160ggaccaaccc acccaccttg actataccaa
ggcatcatct atccacagtt ctagcctaac 8220ttcatgctga tttctctgcc
tcttgatttt tctctgtgtg ttccaaataa tcttaagctg 8280agttgtggca
ttttccatgc aacctccttc tgccagcagc tcacactgct tgaagtcata
8340tgaaccactg aggcacatca tggaattgat gtgagcatta agacgttctc
ccacacagcc 8400cttccctgag gcagcaggag ctggtgtgta ctggagacac
tgttgaactt gatcaagacc 8460cagaccaccc caggtctcct tcgtgggatg
tcatgacgtt tgacatacct ttggaacgag 8520cctcctcctt ggaagatgga
agaccgtgtt cgtggccgac ctggcctctc ctggcctgtt 8580tcttaagatg
cggagtcaca tttcaatggt acgaaaagtg gcttcgtaaa atagaagagc
8640agtcactgtg gaactaccaa atggcgagat gctcggtgca cattggggtg
ctttgggata 8700aaagatttat gagccaacta ttctctggca ccagattcta
ggccagtttg ttccactgaa 8760gcttttccca cagcagtcca cctctgcagg
ctggcagccg aatggcttgc cagtggctct 8820gtggcaagat cacactgaga
tcgatgggtg agaaggctag gatgcttgtc tagtgttctt 8880agctgtcacg
ttggctcctt ccagggtggc cagacggtgt tggccactcc cttctaaaac
8940acaggcgccc tcctggtgac agtgacccgc cgtggtatgc cttggcccat
tccagcagtc 9000ccagttatgc atttcaagtt tggggtttgt tcttttcgtt
aatgttcctc tgtgttgtca 9060gctgtcttca tttcctgggc taagcagcat
tgggagatgt ggaccagaga tccactcctt 9120aagaaccagt ggcgaaagac
actttctttc ttcactctga agtagctggt ggtacaaatg 9180agaacttcaa
gagaggatgt tatttagact gaacctctgt tgccagagat gctgaagata
9240cagaccttgg acaggtcaga gggtttcatt tttggccttc atcttagatg
actggttgcg 9300tcatttggag aagtgagtgc tccttgatgg tggaatgacc
gggtggtggg tacagaacca 9360ttgtcacagg gatcctggca cagagaagag
ttacgagcag cagggtgcag ggcttggaag 9420gaatgtgggc aaggttttga
acttgattgt tcttgaagct atcagaccac atcgaggctc 9480agcagtcatc
cgtgggcatt tggtttcaac aaagaaacct aacatcctac tctggaaact
9540gatctcggag ttaaggcgaa ttgttcaaga acacaaacta catcgcactc
gtcagttgtc 9600agttctgggg catgacttta gcgttttgtt tctgcgagaa
cataacgatc actcattttt 9660atgtcccacg tgtgtgtgtc cgcatctttc
tggtcaacat tgttttaact agtcactcat 9720tagcgttttc aatagggctc
ttaagtccag tagattacgg gtagtcagtt gacgaagatc 9780tggtttacaa
gaactaatta aatgtttcat tgcatttttg taagaacaga ataattttat
9840aaaatgtttg tagtttataa ttgccgaaaa taatttaaag acactttttt
tttctctgtg 9900tgtgcaaatg tgtgtttgtg atccattttt tttttttttt
tttaggacac ctgtttacta 9960gctagcttta caatatgcca aaaaaggatt
tctccctgac cccatccgtg gttcaccctc 10020ttttcccccc atgctttttg
ccctagttta taacaaagga atgatgatga tttaaaaagt 10080agttctgtat
cttcagtatc ttggtcttcc agaaccctct ggttgggaag gggatcattt
10140tttactggtc atttcccttt ggagtgtagc tactttaaca gatggaaaga
acctcattgg 10200ccatggaaac agccgaggtg ttggagccca gcagtgcatg
gcaccgttcg gcatctggct 10260tgattggtct ggctgccgtc attgtcagca
cagtgccatg gacatgggaa gacttgactg 10320cacagccaat ggttttcatg
atgattacag catacacagt gatcacataa acgatgacag 10380ctatggggca
cacaggccat ttgcttacat gcctcgtatc atgactgatt actgctttgt
10440tagaacacag aagagaccct attttattta aggcagaacc ccgaagatac
gtatttccaa 10500tacagaaaag aatttttaat aaaaactata acatacacaa
aaattggttt taaagttgac 10560tccacttcct ctaactccag tggattgttg
gccatgtctc cccaactcca caatatctct 10620atcatgggaa acacctgggg
tttttgcgct acataggaga aagatctgga aactatttgg 10680gttttgtttt
caacttttca tttggatgtt tggcgttgca cacacacatc caccggtgga
10740agagacgccc ggtgaaaaca cctgtctgct ttctaagcca gtgaggttga
ggtgagaggt 10800ttgccagagt ttgtctacct ctgggtatcc ctttgtctgg
gataaaaaaa atcaaaccag 10860aaggcgggat ggaatggatg caccgcaaat
aatgcatttt ctgagttttc ttgttaaaaa 10920aaaatttttt taagtaagaa
aaaaaaaggt aataacatgg ccaatttgtt acataaaatg 10980actttctgtg
tataaattat tcctaaaaaa tcctgtttat ataaaaaatc agtagatgaa
11040aaaaatttca aaatgttttt gtatattctg ttgtaagaat ttattcctgt
tattgcgata 11100tactctggat tctttacata atggaaaaaa gaaactgtct
attttgaatg gctgaagcta 11160aggcaacgtt agtttctctt actctgcttt
tttctagtaa agtactacat ggtttaagtt 11220aaataaaata attctgtatg ca
11242
<210> 26 <211> 1367 <212> PRT <213> Homo sapiens <400> 26
Met Lys Ser Gly Ser Gly Gly Gly Ser
Pro Thr Ser Leu Trp Gly Leu 1 5 10 15 Leu Phe Leu Ser Ala Ala Leu
Ser Leu Trp Pro Thr Ser Gly Glu Ile 20 25 30 Cys Gly Pro Gly Ile
Asp Ile Arg Asn Asp Tyr Gln Gln Leu Lys Arg 35 40 45 Leu Glu Asn
Cys Thr Val Ile Glu Gly Tyr Leu His Ile Leu Leu Ile 50 55 60 Ser
Lys Ala Glu Asp Tyr Arg Ser Tyr Arg Phe Pro Lys Leu Thr Val 65 70
75 80 Ile Thr Glu Tyr Leu Leu Leu Phe Arg Val Ala Gly Leu Glu Ser
Leu 85 90 95 Gly Asp Leu Phe Pro Asn Leu Thr Val Ile Arg Gly Trp
Lys Leu Phe 100 105 110 Tyr Asn Tyr Ala Leu Val Ile Phe Glu Met Thr
Asn Leu Lys Asp Ile 115 120 125 Gly Leu Tyr Asn Leu Arg Asn Ile Thr
Arg Gly Ala Ile Arg Ile Glu 130 135 140 Lys Asn Ala Asp Leu Cys Tyr
Leu Ser Thr Val Asp Trp Ser Leu Ile 145 150 155 160 Leu Asp Ala Val
Ser Asn Asn Tyr Ile Val Gly Asn Lys Pro Pro Lys 165 170 175 Glu Cys
Gly Asp Leu Cys Pro Gly Thr Met Glu Glu Lys Pro Met Cys 180 185 190
Glu Lys Thr Thr Ile Asn Asn Glu Tyr Asn Tyr Arg Cys Trp Thr Thr 195
200 205 Asn Arg Cys Gln Lys Met Cys Pro Ser Thr Cys Gly Lys Arg Ala
Cys 210 215 220 Thr Glu Asn Asn Glu Cys Cys His Pro Glu Cys Leu Gly
Ser Cys Ser 225 230 235 240 Ala Pro Asp Asn Asp Thr Ala Cys Val Ala
Cys Arg His Tyr Tyr Tyr 245 250 255 Ala Gly Val Cys Val Pro Ala Cys
Pro Pro Asn Thr Tyr Arg Phe Glu 260 265 270 Gly Trp Arg Cys Val Asp
Arg Asp Phe Cys Ala Asn Ile Leu Ser Ala 275 280 285 Glu Ser Ser Asp
Ser Glu Gly Phe Val Ile His Asp Gly Glu Cys Met 290 295 300 Gln Glu
Cys Pro Ser Gly Phe Ile Arg Asn Gly Ser Gln Ser Met Tyr 305 310 315
320 Cys Ile Pro Cys Glu Gly Pro Cys Pro Lys Val Cys Glu Glu Glu Lys
325 330 335 Lys Thr Lys Thr Ile Asp Ser Val Thr Ser Ala Gln Met Leu
Gln Gly 340 345 350 Cys Thr Ile Phe Lys Gly Asn Leu Leu Ile Asn Ile
Arg Arg Gly Asn 355 360 365 Asn Ile Ala Ser Glu Leu Glu Asn Phe Met
Gly Leu Ile Glu Val Val 370 375 380 Thr Gly Tyr Val Lys Ile Arg His
Ser His Ala Leu Val Ser Leu Ser 385 390 395 400 Phe Leu Lys Asn Leu
Arg Leu Ile Leu Gly Glu Glu Gln Leu Glu Gly 405 410 415 Asn Tyr Ser
Phe Tyr Val Leu Asp Asn Gln Asn Leu Gln Gln Leu Trp 420 425 430 Asp
Trp Asp His Arg Asn Leu Thr Ile Lys Ala Gly Lys Met Tyr Phe 435 440
445 Ala Phe Asn Pro Lys Leu Cys Val Ser Glu Ile Tyr Arg Met Glu Glu
450 455 460 Val Thr Gly Thr Lys Gly Arg Gln Ser Lys Gly Asp Ile Asn
Thr Arg 465 470 475 480 Asn Asn Gly Glu Arg Ala Ser Cys Glu Ser Asp
Val Leu His Phe Thr 485 490 495 Ser Thr Thr Thr Ser Lys Asn Arg Ile
Ile Ile Thr Trp His Arg Tyr 500 505 510 Arg Pro Pro Asp Tyr Arg Asp
Leu Ile Ser Phe Thr Val Tyr Tyr Lys 515 520 525 Glu Ala Pro Phe Lys
Asn Val Thr Glu Tyr Asp Gly Gln Asp Ala Cys 530 535 540 Gly Ser Asn
Ser Trp Asn Met Val Asp Val Asp Leu Pro Pro Asn Lys 545 550 555 560
Asp Val Glu Pro Gly Ile Leu Leu His Gly Leu Lys Pro Trp Thr Gln 565
570 575 Tyr Ala Val Tyr Val Lys Ala Val Thr Leu Thr Met Val Glu Asn
Asp 580 585 590 His Ile Arg Gly Ala Lys Ser Glu Ile Leu Tyr Ile Arg
Thr Asn Ala 595 600 605 Ser Val Pro Ser Ile Pro Leu Asp Val Leu Ser
Ala Ser Asn Ser Ser 610 615 620 Ser Gln Leu Ile Val Lys Trp Asn Pro
Pro Ser Leu Pro Asn Gly Asn 625 630 635 640 Leu Ser Tyr Tyr Ile Val
Arg Trp Gln Arg Gln Pro Gln Asp Gly Tyr 645 650 655 Leu Tyr Arg His
Asn Tyr Cys Ser Lys Asp Lys Ile Pro Ile Arg Lys 660 665 670 Tyr Ala
Asp Gly Thr Ile Asp Ile Glu Glu Val Thr Glu Asn Pro Lys 675 680 685
Thr Glu Val Cys Gly Gly Glu Lys Gly Pro Cys Cys Ala Cys Pro Lys 690
695 700 Thr Glu Ala Glu Lys Gln Ala Glu Lys Glu Glu Ala Glu Tyr Arg
Lys 705 710 715 720 Val Phe Glu Asn Phe Leu His Asn Ser Ile Phe Val
Pro Arg Pro Glu 725 730 735 Arg Lys Arg Arg Asp Val Met Gln Val Ala
Asn Thr Thr Met Ser Ser 740 745 750 Arg Ser Arg Asn Thr Thr Ala Ala
Asp Thr Tyr Asn Ile Thr Asp Pro 755 760 765 Glu Glu Leu Glu Thr Glu
Tyr Pro Phe Phe Glu Ser Arg Val Asp Asn 770 775 780 Lys Glu Arg Thr
Val Ile Ser Asn Leu Arg Pro Phe Thr Leu Tyr Arg 785 790 795 800 Ile
Asp Ile His Ser Cys Asn His Glu Ala Glu Lys Leu Gly Cys Ser 805 810
815 Ala Ser Asn Phe Val Phe Ala Arg Thr Met Pro Ala Glu Gly Ala Asp
820 825 830 Asp Ile Pro Gly Pro Val Thr Trp Glu Pro Arg Pro Glu Asn
Ser Ile 835 840 845 Phe Leu Lys Trp Pro Glu Pro Glu Asn Pro Asn Gly
Leu Ile Leu Met 850 855 860 Tyr Glu Ile Lys Tyr Gly Ser Gln Val Glu
Asp Gln Arg Glu Cys Val 865 870 875 880 Ser Arg Gln Glu Tyr Arg Lys
Tyr Gly Gly Ala Lys Leu Asn Arg Leu 885 890 895 Asn Pro Gly Asn Tyr
Thr Ala Arg Ile Gln Ala Thr Ser Leu Ser Gly 900 905 910 Asn Gly Ser
Trp Thr Asp Pro Val Phe Phe Tyr Val Gln Ala Lys Thr 915 920 925 Gly
Tyr Glu Asn Phe Ile His Leu Ile Ile Ala Leu Pro Val Ala Val 930 935
940 Leu Leu Ile Val Gly Gly Leu Val Ile Met Leu Tyr Val Phe His Arg
945 950 955 960 Lys Arg Asn Asn Ser Arg Leu Gly Asn Gly Val Leu Tyr
Ala Ser Val 965 970 975 Asn Pro Glu Tyr Phe Ser Ala Ala Asp Val Tyr
Val Pro Asp Glu Trp 980 985 990 Glu Val Ala Arg Glu Lys Ile Thr Met
Ser Arg Glu Leu Gly Gln Gly 995 1000 1005 Ser Phe Gly Met Val Tyr
Glu Gly Val Ala Lys Gly Val Val Lys 1010 1015 1020 Asp Glu Pro Glu
Thr Arg Val Ala Ile Lys Thr Val Asn Glu Ala 1025 1030 1035 Ala Ser
Met Arg Glu Arg Ile Glu Phe Leu Asn Glu Ala Ser Val 1040 1045 1050
Met Lys Glu Phe Asn Cys His His Val Val Arg Leu Leu Gly Val 1055
1060 1065 Val Ser Gln Gly Gln Pro Thr Leu Val Ile Met Glu Leu Met
Thr 1070 1075 1080 Arg Gly Asp Leu Lys Ser Tyr Leu Arg Ser Leu Arg
Pro Glu Met 1085 1090 1095 Glu Asn Asn Pro Val Leu Ala Pro Pro Ser
Leu Ser Lys Met Ile 1100 1105 1110 Gln Met Ala Gly Glu Ile Ala Asp
Gly Met Ala Tyr Leu Asn Ala 1115 1120 1125 Asn Lys Phe Val His Arg
Asp Leu Ala Ala Arg Asn Cys Met Val 1130 1135 1140 Ala Glu Asp Phe
Thr Val Lys Ile Gly Asp Phe Gly Met Thr Arg 1145 1150 1155 Asp Ile
Tyr Glu Thr Asp Tyr Tyr Arg Lys Gly Gly Lys Gly Leu 1160 1165 1170
Leu Pro Val Arg Trp Met Ser Pro Glu Ser Leu Lys Asp Gly Val 1175
1180 1185 Phe Thr Thr Tyr Ser Asp Val Trp Ser Phe Gly Val Val Leu
Trp 1190 1195 1200 Glu Ile Ala Thr Leu Ala Glu Gln Pro Tyr Gln Gly
Leu Ser Asn 1205 1210 1215 Glu Gln Val Leu Arg Phe Val Met Glu Gly
Gly Leu Leu Asp Lys 1220 1225 1230 Pro Asp Asn Cys Pro Asp Met Leu
Phe Glu Leu Met Arg Met Cys 1235 1240 1245 Trp Gln Tyr Asn Pro Lys
Met Arg Pro Ser Phe Leu Glu Ile Ile 1250 1255 1260 Ser Ser Ile Lys
Glu Glu Met Glu Pro Gly Phe Arg Glu Val Ser 1265 1270 1275 Phe Tyr
Tyr Ser Glu Glu Asn Lys Leu Pro Glu Pro Glu Glu Leu 1280 1285 1290
Asp Leu Glu Pro Glu Asn Met Glu Ser Val Pro Leu Asp Pro Ser 1295
1300 1305 Ala Ser Ser Ser Ser Leu Pro Leu Pro Asp Arg His Ser Gly
His 1310 1315 1320 Lys Ala Glu Asn Gly Pro Gly Pro Gly Val Leu Val
Leu Arg Ala 1325 1330 1335 Ser Phe Asp Glu Arg Gln Pro Tyr Ala His
Met Asn Gly Gly Arg
1340 1345 1350 Lys Asn Glu Arg Ala Leu Pro Leu Pro Gln Ser Ser Thr
Cys 1355 1360 1365
<210> 27 <211> 6475 <212> DNA <213> Homo sapiens
<400> 27
ggcgaggcga ggtttgctgg
ggtgaggcag cggcgcggcc gggccgggcc gggccacagg 60cggtggcggc gggaccatgg
aggcggcggt cgctgctccg cgtccccggc tgctcctcct 120cgtgctggcg
gcggcggcgg cggcggcggc ggcgctgctc ccgggggcga cggcgttaca
180gtgtttctgc cacctctgta caaaagacaa ttttacttgt gtgacagatg
ggctctgctt 240tgtctctgtc acagagacca cagacaaagt tatacacaac
agcatgtgta tagctgaaat 300tgacttaatt cctcgagata ggccgtttgt
atgtgcaccc tcttcaaaaa ctgggtctgt 360gactacaaca tattgctgca
atcaggacca ttgcaataaa atagaacttc caactactgt 420aaagtcatca
cctggccttg gtcctgtgga actggcagct gtcattgctg gaccagtgtg
480cttcgtctgc atctcactca tgttgatggt ctatatctgc cacaaccgca
ctgtcattca 540ccatcgagtg ccaaatgaag aggacccttc attagatcgc
ccttttattt cagagggtac 600tacgttgaaa gacttaattt atgatatgac
aacgtcaggt tctggctcag gtttaccatt 660gcttgttcag agaacaattg
cgagaactat tgtgttacaa gaaagcattg gcaaaggtcg 720atttggagaa
gtttggagag gaaagtggcg gggagaagaa gttgctgtta agatattctc
780ctctagagaa gaacgttcgt ggttccgtga ggcagagatt tatcaaactg
taatgttacg 840tcatgaaaac atcctgggat ttatagcagc agacaataaa
gacaatggta cttggactca 900gctctggttg gtgtcagatt atcatgagca
tggatccctt tttgattact taaacagata 960cacagttact gtggaaggaa
tgataaaact tgctctgtcc acggcgagcg gtcttgccca 1020tcttcacatg
gagattgttg gtacccaagg aaagccagcc attgctcata gagatttgaa
1080atcaaagaat atcttggtaa agaagaatgg aacttgctgt attgcagact
taggactggc 1140agtaagacat gattcagcca cagataccat tgatattgct
ccaaaccaca gagtgggaac 1200aaaaaggtac atggcccctg aagttctcga
tgattccata aatatgaaac attttgaatc 1260cttcaaacgt gctgacatct
atgcaatggg cttagtattc tgggaaattg ctcgacgatg 1320ttccattggt
ggaattcatg aagattacca actgccttat tatgatcttg taccttctga
1380cccatcagtt gaagaaatga gaaaagttgt ttgtgaacag aagttaaggc
caaatatccc 1440aaacagatgg cagagctgtg aagccttgag agtaatggct
aaaattatga gagaatgttg 1500gtatgccaat ggagcagcta ggcttacagc
attgcggatt aagaaaacat tatcgcaact 1560cagtcaacag gaaggcatca
aaatgtaatt ctacagcttt gcctgaactc tccttttttc 1620ttcagatctg
ctcctgggtt ttaatttggg aggtcaattg ttctacctca ctgagaggga
1680acagaaggat attgcttcct tttgcagcag tgtaataaag tcaattaaaa
acttcccagg 1740atttctttgg acccaggaaa cagccatgtg ggtcctttct
gtgcactatg aacgcttctt 1800tcccaggaca gaaaatgtgt agtctacctt
tattttttat taacaaaact tgttttttaa 1860aaagatgatt gctggtctta
actttaggta actctgctgt gctggagatc atctttaagg 1920gcaaaggagt
tggattgctg aattacaatg aaacatgtct tattactaaa gaaagtgatt
1980tactcctggt tagtacattc tcagaggatt ctgaaccact agagtttcct
tgattcagac 2040tttgaatgta ctgttctata gtttttcagg atcttaaaac
taacacttat aaaactctta 2100tcttgagtct aaaaatgacc tcatatagta
gtgaggaaca taattcatgc aattgtattt 2160tgtatactat tattgttctt
tcacttattc agaacattac atgccttcaa aatgggattg 2220tactatacca
gtaagtgcca cttctgtgtc tttctaatgg aaatgagtag aattgctgaa
2280agtctctatg ttaaaaccta tagtgtttga attcaaaaag cttatttatc
tgggtaaccc 2340aaactttttc tgttttgttt ttggaagggt ttttgtggta
tgtcatttgg tattctattc 2400tgaaaatgcc tttctcctac caaaatgtgc
ttaagccact aaagaaatga agtggcatta 2460attagtaaat tattagcatg
gtcatgtttg aatattctca catcaagctt ttgcatttta 2520attgtgttgt
ctaagtatac ttttaaaaaa tcaagtggca ctctagatgc ttatagtact
2580ttaatatttg tagcatacag actaattttt ctaaaaggga aagtctgtct
agctgcttgt 2640gaaaagttat gtggtattct gtaagccatt tttttcttta
tctgttcaaa gacttatttt 2700ttaagacatg aattacattt aaaattagaa
tatggttaat attaaataat aggccttttt 2760ctaggaaggc gaaggtagtt
aataatttga atagataaca gatgtgcaag aaagtcacat 2820ttgttatgta
tgtaggagta aacgttcggt ggatcctctg tctttgtaac tgaggttaga
2880gctagtgtgg ttttgaggtc tcactacact ttgaggaagg cagcttttaa
ttcagtgttt 2940ccttatgtgt gcgtacattg caactgctta catgtaattt
atgtaatgca ttcagtgcac 3000ccttgttact tgggagaggt ggtagctaaa
gaacattctg agtataggtt tttctccatt 3060tacagatgtc tttggtcaaa
tattgaaagc aaacttgtca tggtcttctt acattaagtt 3120gaaactagct
tataataact ggtttttact tccaatgcta tgaagtctct gcagggcttt
3180tacagttttc gaagtccttt tatcactgtg atcttattct gaggggagaa
aaaactatca 3240tagctctgag gcaagacttc gactttatag tgctatcagt
tccccgatac agggtcagag 3300taacccatac agtattttgg tcaggaagag
aaagtggcca tttacactga atgagttgca 3360ttctgataat gtcttatctc
ttatacgtag aataaatttg aaagactatt tgatcttaaa 3420accaaagtaa
ttttagaatg agtgacatat tacataggaa tttagtgtca atttcatgtg
3480tttaaaaaca tcatgggaaa aatgcttaga ggttactatt ttgactacaa
agttgagttt 3540ttttctgtag ttaccataat ttcattgaag caaatgaatg
agtttgagag gtttgttttt 3600atagttgtgt tgtattactt gtttaataat
aatctctaat tctgtgatca ggtacttttt 3660ttgtgggggt tttttttttg
tttttttttt tttgttgttg tttttgggcc atttctaagc 3720ctaccagatc
tgctttatga aatccagggg accaatgcat tttatcacta aaactatttt
3780tatataattt taagaatata ccaaaagttg tctgatttaa agttgtaata
catgatttct 3840cactttcatg taaggttatc cacttttgct gaagatattt
tttattgaat caaagattga 3900gttacaatta tacttttctt acctaagtgg
ataaaatgta cttttgatga atcagggaat 3960ttttttaaag ttggagttta
gttctaaatt gactttacgt attactgcag ttaattcctt 4020ttttggctag
ggatggtttg ataaaccaca attggctgat attgaaaatg aaagaaactt
4080aaaaggtggg atggatcatg attactgtcg ataactgcag ataaatttga
ttagagtaat 4140aattttgtca tttaaaaaca cagttgttta tactgcccat
cctaggatgc tcaccttcca 4200agattcaacg tggctaaaac atcttctggt
aaattgtgcg tccatattca ttttgtcagt 4260agccaggaga aatggggatg
ggggaaatac gacttagtga ggcatagaca tccctggtcc 4320atcctttctg
tctccagctg tttcttggaa cctgctctcc tgcttgctgg tccctgacgc
4380agagaccgtt gcctccccca cagccgtttg actgaaggct gctctggaga
cctagagtaa 4440aacggctgat ggaagttgtg ggacccactt ccatttcctt
cagtcattag aggtggaagg 4500gaggggtctc caagtttgga gattgagcag
atgaggcttg ggatgcccct gctttgactt 4560cagccatgga tgaggagtgg
gatggcagca aggtggctcc tgtggcagtg gagttgtgcc 4620agaaacagtg
gccagttgta tcgcctataa gacagggtaa ggtctgaaga gctgagcctg
4680taattctgct gtaataatga tagtgctcaa gaagtgcctt gagttggtgt
acagtgccat 4740ggccatcaag aatcccagat ttcaggtttt attacaaaat
gtaagtggtc acttggcgat 4800tttgtagtac atgcatgagt tacctttttt
ctctatgtct gagaactgtc agattaaaac 4860aagatggcaa agagatcgtt
agagtgcaca acaaaatcac tatcccatta gacacatcat 4920caaaagctta
tttttattct tgcactggaa gaatcgtaag tcaactgttt cttgaccatg
4980gcagtgttct ggctccaaat ggtagtgatt ccaaataatg gttctgttaa
cactttggca 5040gaaaatgcca gctcagatat tttgagatac taaggattat
ctttggacat gtactgcagc 5100ttcttgtctc tgttttggat tactggaata
cccatgggcc ctctcaagag tgctggactt 5160ctaggacatt aagatgattg
tcagtacatt aaacttttca atcccattat gcaatcttgt 5220ttgtaaatgt
aaacttctaa aaatatggtt aataacattc aacctgttta ttacaactta
5280aaaggaactt cagtgaattt gtttttattt tttaacaaga tttgtgaact
gaatatcatg 5340aaccatgttt tgatacccct ttttcacgtt gtgccaacgg
aatagggtgt ttgatatttc 5400ttcatatgtt aaggagatgc ttcaaaatgt
caattgcttt aaacttaaat tacctctcaa 5460gagaccaagg tacatttacc
tcattgtgta tataatgttt aatatttgtc agagcattct 5520ccaggtttgc
agttttattt ctataaagta tgggtattat gttgctcagt tactcaaatg
5580gtactgtatt gtttatattt gtaccccaaa taacatcgtc tgtactttct
gttttctgta 5640ttgtatttgt gcaggattct ttaggcttta tcagtgtaat
ctctgccttt taagatatgt 5700acagaaaatg tccatataaa tttccattga
agtcgaatga tactgagaag cctgtaaaga 5760ggagaaaaaa acataagctg
tgtttcccca taagtttttt taaattgtat attgtatttg 5820tagtaatatt
ccaaaagaat gtaaatagga aatagaagag tgatgcttat gttaagtcct
5880aacactacag tagaagaatg gaagcagtgc aaataaatta catttttccc
aagtgccagt 5940ggcatatttt aaaataaagt gtatacgttg gaatgagtca
tgccatatgt agttgctgta 6000gatggcaact agaacctttg agttacaaga
gtctttagaa gttttctaac cctgcctagt 6060gcaagttaca atattatagc
gtgttcgggg agtgccctcc tgtctgcagg tgtgtctctg 6120tgcctggggg
cttttctcca catgcttagg ggtgtgggtc ttccattggg gcatgatgga
6180cctgtctaca ggtgatctct gttgcctttg ggtcagcaca tttgttagtc
tcctgggggt 6240gaaaacttgg cttacaagag aactggaaaa atgatgagat
gtggtcccca aacccttgat 6300tgactctggg gaggggcttt gtgaatagga
ttgctctcac attaaagata gttacttcaa 6360tttgaaggct ggatttaggg
attttttttt ttccttataa caaagacatc accaggatat 6420gaagcttttg
ttgaaagttg gaaaaaaagt gaaattaaag acattcccag acaaa 6475
<210> 28 <211> 6244 <212> DNA <213> Homo sapiens
<400> 28
ggcgaggcga ggtttgctgg ggtgaggcag cggcgcggcc gggccgggcc
gggccacagg 60cggtggcggc gggaccatgg aggcggcggt cgctgctccg cgtccccggc
tgctcctcct 120cgtgctggcg gcggcggcgg cggcggcggc ggcgctgctc
ccgggggcga cggcgttaca 180gtgtttctgc cacctctgta caaaagacaa
ttttacttgt gtgacagatg ggctctgctt 240tgtctctgtc acagagacca
cagacaaagt tatacacaac agcatgtgta tagctgaaat 300tgacttaatt
cctcgagata ggccgtttgt atgtgcaccc tcttcaaaaa ctgggtctgt
360gactacaaca tattgctgca atcaggacca ttgcaataaa atagaacttc
caactactgg 420tttaccattg cttgttcaga gaacaattgc gagaactatt
gtgttacaag aaagcattgg 480caaaggtcga tttggagaag tttggagagg
aaagtggcgg ggagaagaag ttgctgttaa 540gatattctcc tctagagaag
aacgttcgtg gttccgtgag gcagagattt atcaaactgt 600aatgttacgt
catgaaaaca tcctgggatt tatagcagca gacaataaag acaatggtac
660ttggactcag ctctggttgg tgtcagatta tcatgagcat ggatcccttt
ttgattactt 720aaacagatac acagttactg tggaaggaat gataaaactt
gctctgtcca cggcgagcgg 780tcttgcccat cttcacatgg agattgttgg
tacccaagga aagccagcca ttgctcatag 840agatttgaaa tcaaagaata
tcttggtaaa gaagaatgga acttgctgta ttgcagactt 900aggactggca
gtaagacatg attcagccac agataccatt gatattgctc caaaccacag
960agtgggaaca aaaaggtaca tggcccctga agttctcgat gattccataa
atatgaaaca 1020ttttgaatcc ttcaaacgtg ctgacatcta tgcaatgggc
ttagtattct gggaaattgc 1080tcgacgatgt tccattggtg gaattcatga
agattaccaa ctgccttatt atgatcttgt 1140accttctgac ccatcagttg
aagaaatgag aaaagttgtt tgtgaacaga agttaaggcc 1200aaatatccca
aacagatggc agagctgtga agccttgaga gtaatggcta aaattatgag
1260agaatgttgg tatgccaatg gagcagctag gcttacagca ttgcggatta
agaaaacatt 1320atcgcaactc agtcaacagg aaggcatcaa aatgtaattc
tacagctttg cctgaactct 1380ccttttttct tcagatctgc tcctgggttt
taatttggga ggtcaattgt tctacctcac 1440tgagagggaa cagaaggata
ttgcttcctt ttgcagcagt gtaataaagt caattaaaaa 1500cttcccagga
tttctttgga cccaggaaac agccatgtgg gtcctttctg tgcactatga
1560acgcttcttt cccaggacag aaaatgtgta gtctaccttt attttttatt
aacaaaactt 1620gttttttaaa aagatgattg ctggtcttaa ctttaggtaa
ctctgctgtg ctggagatca 1680tctttaaggg caaaggagtt ggattgctga
attacaatga aacatgtctt attactaaag 1740aaagtgattt actcctggtt
agtacattct cagaggattc tgaaccacta gagtttcctt 1800gattcagact
ttgaatgtac tgttctatag tttttcagga tcttaaaact aacacttata
1860aaactcttat cttgagtcta aaaatgacct catatagtag tgaggaacat
aattcatgca 1920attgtatttt gtatactatt attgttcttt cacttattca
gaacattaca tgccttcaaa 1980atgggattgt actataccag taagtgccac
ttctgtgtct ttctaatgga aatgagtaga 2040attgctgaaa gtctctatgt
taaaacctat agtgtttgaa ttcaaaaagc ttatttatct 2100gggtaaccca
aactttttct gttttgtttt tggaagggtt tttgtggtat gtcatttggt
2160attctattct gaaaatgcct ttctcctacc aaaatgtgct taagccacta
aagaaatgaa 2220gtggcattaa ttagtaaatt attagcatgg tcatgtttga
atattctcac atcaagcttt 2280tgcattttaa ttgtgttgtc taagtatact
tttaaaaaat caagtggcac tctagatgct 2340tatagtactt taatatttgt
agcatacaga ctaatttttc taaaagggaa agtctgtcta 2400gctgcttgtg
aaaagttatg tggtattctg taagccattt ttttctttat ctgttcaaag
2460acttattttt taagacatga attacattta aaattagaat atggttaata
ttaaataata 2520ggcctttttc taggaaggcg aaggtagtta ataatttgaa
tagataacag atgtgcaaga 2580aagtcacatt tgttatgtat gtaggagtaa
acgttcggtg gatcctctgt ctttgtaact 2640gaggttagag ctagtgtggt
tttgaggtct cactacactt tgaggaaggc agcttttaat 2700tcagtgtttc
cttatgtgtg cgtacattgc aactgcttac atgtaattta tgtaatgcat
2760tcagtgcacc cttgttactt gggagaggtg gtagctaaag aacattctga
gtataggttt 2820ttctccattt acagatgtct ttggtcaaat attgaaagca
aacttgtcat ggtcttctta 2880cattaagttg aaactagctt ataataactg
gtttttactt ccaatgctat gaagtctctg 2940cagggctttt acagttttcg
aagtcctttt atcactgtga tcttattctg aggggagaaa 3000aaactatcat
agctctgagg caagacttcg actttatagt gctatcagtt ccccgataca
3060gggtcagagt aacccataca gtattttggt caggaagaga aagtggccat
ttacactgaa 3120tgagttgcat tctgataatg tcttatctct tatacgtaga
ataaatttga aagactattt 3180gatcttaaaa ccaaagtaat tttagaatga
gtgacatatt acataggaat ttagtgtcaa 3240tttcatgtgt ttaaaaacat
catgggaaaa atgcttagag gttactattt tgactacaaa 3300gttgagtttt
tttctgtagt taccataatt tcattgaagc aaatgaatga gtttgagagg
3360tttgttttta tagttgtgtt gtattacttg tttaataata atctctaatt
ctgtgatcag 3420gtactttttt tgtgggggtt ttttttttgt tttttttttt
ttgttgttgt ttttgggcca 3480tttctaagcc taccagatct gctttatgaa
atccagggga ccaatgcatt ttatcactaa 3540aactattttt atataatttt
aagaatatac caaaagttgt ctgatttaaa gttgtaatac 3600atgatttctc
actttcatgt aaggttatcc acttttgctg aagatatttt ttattgaatc
3660aaagattgag ttacaattat acttttctta cctaagtgga taaaatgtac
ttttgatgaa 3720tcagggaatt tttttaaagt tggagtttag ttctaaattg
actttacgta ttactgcagt 3780taattccttt tttggctagg gatggtttga
taaaccacaa ttggctgata ttgaaaatga 3840aagaaactta aaaggtggga
tggatcatga ttactgtcga taactgcaga taaatttgat 3900tagagtaata
attttgtcat ttaaaaacac agttgtttat actgcccatc ctaggatgct
3960caccttccaa gattcaacgt ggctaaaaca tcttctggta aattgtgcgt
ccatattcat 4020tttgtcagta gccaggagaa atggggatgg gggaaatacg
acttagtgag gcatagacat 4080ccctggtcca tcctttctgt ctccagctgt
ttcttggaac ctgctctcct gcttgctggt 4140ccctgacgca gagaccgttg
cctcccccac agccgtttga ctgaaggctg ctctggagac 4200ctagagtaaa
acggctgatg gaagttgtgg gacccacttc catttccttc agtcattaga
4260ggtggaaggg aggggtctcc aagtttggag attgagcaga tgaggcttgg
gatgcccctg 4320ctttgacttc agccatggat gaggagtggg atggcagcaa
ggtggctcct gtggcagtgg 4380agttgtgcca gaaacagtgg ccagttgtat
cgcctataag acagggtaag gtctgaagag 4440ctgagcctgt aattctgctg
taataatgat agtgctcaag aagtgccttg agttggtgta 4500cagtgccatg
gccatcaaga atcccagatt tcaggtttta ttacaaaatg taagtggtca
4560cttggcgatt ttgtagtaca tgcatgagtt accttttttc tctatgtctg
agaactgtca 4620gattaaaaca agatggcaaa gagatcgtta gagtgcacaa
caaaatcact atcccattag 4680acacatcatc aaaagcttat ttttattctt
gcactggaag aatcgtaagt caactgtttc 4740ttgaccatgg cagtgttctg
gctccaaatg gtagtgattc caaataatgg ttctgttaac 4800actttggcag
aaaatgccag ctcagatatt ttgagatact aaggattatc tttggacatg
4860tactgcagct tcttgtctct gttttggatt actggaatac ccatgggccc
tctcaagagt 4920gctggacttc taggacatta agatgattgt cagtacatta
aacttttcaa tcccattatg 4980caatcttgtt tgtaaatgta aacttctaaa
aatatggtta ataacattca acctgtttat 5040tacaacttaa aaggaacttc
agtgaatttg tttttatttt ttaacaagat ttgtgaactg 5100aatatcatga
accatgtttt gatacccctt tttcacgttg tgccaacgga atagggtgtt
5160tgatatttct tcatatgtta aggagatgct tcaaaatgtc aattgcttta
aacttaaatt 5220acctctcaag agaccaaggt acatttacct cattgtgtat
ataatgttta atatttgtca 5280gagcattctc caggtttgca gttttatttc
tataaagtat gggtattatg ttgctcagtt 5340actcaaatgg tactgtattg
tttatatttg taccccaaat aacatcgtct gtactttctg 5400ttttctgtat
tgtatttgtg caggattctt taggctttat cagtgtaatc tctgcctttt
5460aagatatgta cagaaaatgt ccatataaat ttccattgaa gtcgaatgat
actgagaagc 5520ctgtaaagag gagaaaaaaa cataagctgt gtttccccat
aagttttttt aaattgtata 5580ttgtatttgt agtaatattc caaaagaatg
taaataggaa atagaagagt gatgcttatg 5640ttaagtccta acactacagt
agaagaatgg aagcagtgca aataaattac atttttccca 5700agtgccagtg
gcatatttta aaataaagtg tatacgttgg aatgagtcat gccatatgta
5760gttgctgtag atggcaacta gaacctttga gttacaagag tctttagaag
ttttctaacc 5820ctgcctagtg caagttacaa tattatagcg tgttcgggga
gtgccctcct gtctgcaggt 5880gtgtctctgt gcctgggggc ttttctccac
atgcttaggg gtgtgggtct tccattgggg 5940catgatggac ctgtctacag
gtgatctctg ttgcctttgg gtcagcacat ttgttagtct 6000cctgggggtg
aaaacttggc ttacaagaga actggaaaaa tgatgagatg tggtccccaa
6060acccttgatt gactctgggg aggggctttg tgaataggat tgctctcaca
ttaaagatag 6120ttacttcaat ttgaaggctg gatttaggga tttttttttt
tccttataac aaagacatca 6180ccaggatatg aagcttttgt tgaaagttgg
aaaaaaagtg aaattaaaga cattcccaga 6240caaa 624429503PRTHomo sapiens
29Met Glu Ala Ala Val Ala Ala Pro Arg Pro Arg Leu Leu Leu Leu Val 1
5 10 15 Leu Ala Ala Ala Ala Ala Ala Ala Ala Ala Leu Leu Pro Gly Ala
Thr 20 25 30 Ala Leu Gln Cys Phe Cys His Leu Cys Thr Lys Asp Asn
Phe Thr Cys 35 40 45 Val Thr Asp Gly Leu Cys Phe Val Ser Val Thr
Glu Thr Thr Asp Lys 50 55 60 Val Ile His Asn Ser Met Cys Ile Ala
Glu Ile Asp Leu Ile Pro Arg 65 70 75 80 Asp Arg Pro Phe Val Cys Ala
Pro Ser Ser Lys Thr Gly Ser Val Thr 85 90 95 Thr Thr Tyr Cys Cys
Asn Gln Asp His Cys Asn Lys Ile Glu Leu Pro 100 105 110 Thr Thr Val
Lys Ser Ser Pro Gly Leu Gly Pro Val Glu Leu Ala Ala 115 120 125 Val
Ile Ala Gly Pro Val Cys Phe Val Cys Ile Ser Leu Met Leu Met 130 135
140 Val Tyr Ile Cys His Asn Arg Thr Val Ile His His Arg Val Pro Asn
145 150 155 160 Glu Glu Asp Pro Ser Leu Asp Arg Pro Phe Ile Ser Glu
Gly Thr Thr 165 170 175 Leu Lys Asp Leu Ile Tyr Asp Met Thr Thr Ser
Gly Ser Gly Ser Gly 180 185 190 Leu Pro Leu Leu Val Gln Arg Thr Ile
Ala Arg Thr Ile Val Leu Gln 195 200 205 Glu Ser Ile Gly Lys Gly Arg
Phe Gly Glu Val Trp Arg Gly Lys Trp 210 215 220 Arg Gly Glu Glu Val
Ala Val Lys Ile Phe Ser Ser Arg Glu Glu Arg 225 230 235 240 Ser Trp
Phe Arg Glu Ala Glu Ile Tyr Gln Thr Val Met Leu Arg His 245 250 255
Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp Asn Lys Asp Asn Gly Thr 260
265 270 Trp Thr Gln Leu Trp Leu Val Ser Asp Tyr His Glu His Gly Ser
Leu 275 280 285 Phe Asp Tyr Leu Asn Arg Tyr Thr Val Thr Val Glu Gly
Met Ile Lys 290 295 300 Leu Ala Leu Ser Thr Ala Ser Gly Leu Ala His
Leu His Met Glu Ile 305 310
315 320 Val Gly Thr Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys
Ser 325 330 335 Lys Asn Ile Leu Val Lys Lys Asn Gly Thr Cys Cys Ile
Ala Asp Leu 340 345 350 Gly Leu Ala Val Arg His Asp Ser Ala Thr Asp
Thr Ile Asp Ile Ala 355 360 365 Pro Asn His Arg Val Gly Thr Lys Arg
Tyr Met Ala Pro Glu Val Leu 370 375 380 Asp Asp Ser Ile Asn Met Lys
His Phe Glu Ser Phe Lys Arg Ala Asp 385 390 395 400 Ile Tyr Ala Met
Gly Leu Val Phe Trp Glu Ile Ala Arg Arg Cys Ser 405 410 415 Ile Gly
Gly Ile His Glu Asp Tyr Gln Leu Pro Tyr Tyr Asp Leu Val 420 425 430
Pro Ser Asp Pro Ser Val Glu Glu Met Arg Lys Val Val Cys Glu Gln 435
440 445 Lys Leu Arg Pro Asn Ile Pro Asn Arg Trp Gln Ser Cys Glu Ala
Leu 450 455 460 Arg Val Met Ala Lys Ile Met Arg Glu Cys Trp Tyr Ala
Asn Gly Ala 465 470 475 480 Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys
Thr Leu Ser Gln Leu Ser 485 490 495 Gln Gln Glu Gly Ile Lys Met 500
<210> 30 <211> 426 <212> PRT <213> Homo sapiens
<400> 30
Met Glu Ala Ala Val Ala Ala Pro Arg Pro Arg
Leu Leu Leu Leu Val 1 5 10 15 Leu Ala Ala Ala Ala Ala Ala Ala Ala
Ala Leu Leu Pro Gly Ala Thr 20 25 30 Ala Leu Gln Cys Phe Cys His
Leu Cys Thr Lys Asp Asn Phe Thr Cys 35 40 45 Val Thr Asp Gly Leu
Cys Phe Val Ser Val Thr Glu Thr Thr Asp Lys 50 55 60 Val Ile His
Asn Ser Met Cys Ile Ala Glu Ile Asp Leu Ile Pro Arg 65 70 75 80 Asp
Arg Pro Phe Val Cys Ala Pro Ser Ser Lys Thr Gly Ser Val Thr 85 90
95 Thr Thr Tyr Cys Cys Asn Gln Asp His Cys Asn Lys Ile Glu Leu Pro
100 105 110 Thr Thr Gly Leu Pro Leu Leu Val Gln Arg Thr Ile Ala Arg
Thr Ile 115 120 125 Val Leu Gln Glu Ser Ile Gly Lys Gly Arg Phe Gly
Glu Val Trp Arg 130 135 140 Gly Lys Trp Arg Gly Glu Glu Val Ala Val
Lys Ile Phe Ser Ser Arg 145 150 155 160 Glu Glu Arg Ser Trp Phe Arg
Glu Ala Glu Ile Tyr Gln Thr Val Met 165 170 175 Leu Arg His Glu Asn
Ile Leu Gly Phe Ile Ala Ala Asp Asn Lys Asp 180 185 190 Asn Gly Thr
Trp Thr Gln Leu Trp Leu Val Ser Asp Tyr His Glu His 195 200 205 Gly
Ser Leu Phe Asp Tyr Leu Asn Arg Tyr Thr Val Thr Val Glu Gly 210 215
220 Met Ile Lys Leu Ala Leu Ser Thr Ala Ser Gly Leu Ala His Leu His
225 230 235 240 Met Glu Ile Val Gly Thr Gln Gly Lys Pro Ala Ile Ala
His Arg Asp 245 250 255 Leu Lys Ser Lys Asn Ile Leu Val Lys Lys Asn
Gly Thr Cys Cys Ile 260 265 270 Ala Asp Leu Gly Leu Ala Val Arg His
Asp Ser Ala Thr Asp Thr Ile 275 280 285 Asp Ile Ala Pro Asn His Arg
Val Gly Thr Lys Arg Tyr Met Ala Pro 290 295 300 Glu Val Leu Asp Asp
Ser Ile Asn Met Lys His Phe Glu Ser Phe Lys 305 310 315 320 Arg Ala
Asp Ile Tyr Ala Met Gly Leu Val Phe Trp Glu Ile Ala Arg 325 330 335
Arg Cys Ser Ile Gly Gly Ile His Glu Asp Tyr Gln Leu Pro Tyr Tyr 340
345 350 Asp Leu Val Pro Ser Asp Pro Ser Val Glu Glu Met Arg Lys Val
Val 355 360 365 Cys Glu Gln Lys Leu Arg Pro Asn Ile Pro Asn Arg Trp
Gln Ser Cys 370 375 380 Glu Ala Leu Arg Val Met Ala Lys Ile Met Arg
Glu Cys Trp Tyr Ala 385 390 395 400 Asn Gly Ala Ala Arg Leu Thr Ala
Leu Arg Ile Lys Lys Thr Leu Ser 405 410 415 Gln Leu Ser Gln Gln Glu
Gly Ile Lys Met 420 425
<210> 31 <211> 6330 <212> DNA <213> Homo sapiens
<400> 31
aggagctggc
ggagggcgtt cgtcctggga ctgcacttgc tcccgtcggg tcgcccggct 60tcaccggacc
cgcaggctcc cggggcaggg ccggggccag agctcgcgtg tcggcgggac
120atgcgctgcg tcgcctctaa cctcgggctg tgctcttttt ccaggtggcc
cgccggtttc 180tgagccttct gccctgcggg gacacggtct gcaccctgcc
cgcggccacg gaccatgacc 240atgaccctcc acaccaaagc atctgggatg
gccctactgc atcagatcca agggaacgag 300ctggagcccc tgaaccgtcc
gcagctcaag atccccctgg agcggcccct gggcgaggtg 360tacctggaca
gcagcaagcc cgccgtgtac aactaccccg agggcgccgc ctacgagttc
420aacgccgcgg ccgccgccaa cgcgcaggtc tacggtcaga ccggcctccc
ctacggcccc 480gggtctgagg ctgcggcgtt cggctccaac ggcctggggg
gtttcccccc actcaacagc 540gtgtctccga gcccgctgat gctactgcac
ccgccgccgc agctgtcgcc tttcctgcag 600ccccacggcc agcaggtgcc
ctactacctg gagaacgagc ccagcggcta cacggtgcgc 660gaggccggcc
cgccggcatt ctacaggcca aattcagata atcgacgcca gggtggcaga
720gaaagattgg ccagtaccaa tgacaaggga agtatggcta tggaatctgc
caaggagact 780cgctactgtg cagtgtgcaa tgactatgct tcaggctacc
attatggagt ctggtcctgt 840gagggctgca aggccttctt caagagaagt
attcaaggac ataacgacta tatgtgtcca 900gccaccaacc agtgcaccat
tgataaaaac aggaggaaga gctgccaggc ctgccggctc 960cgcaaatgct
acgaagtggg aatgatgaaa ggtgggatac gaaaagaccg aagaggaggg
1020agaatgttga aacacaagcg ccagagagat gatggggagg gcaggggtga
agtggggtct 1080gctggagaca tgagagctgc caacctttgg ccaagcccgc
tcatgatcaa acgctctaag 1140aagaacagcc tggccttgtc cctgacggcc
gaccagatgg tcagtgcctt gttggatgct 1200gagcccccca tactctattc
cgagtatgat cctaccagac ccttcagtga agcttcgatg 1260atgggcttac
tgaccaacct ggcagacagg gagctggttc acatgatcaa ctgggcgaag
1320agggtgccag gctttgtgga tttgaccctc catgatcagg tccaccttct
agaatgtgcc 1380tggctagaga tcctgatgat tggtctcgtc tggcgctcca
tggagcaccc agggaagcta 1440ctgtttgctc ctaacttgct cttggacagg
aaccagggaa aatgtgtaga gggcatggtg 1500gagatcttcg acatgctgct
ggctacatca tctcggttcc gcatgatgaa tctgcaggga 1560gaggagtttg
tgtgcctcaa atctattatt ttgcttaatt ctggagtgta cacatttctg
1620tccagcaccc tgaagtctct ggaagagaag gaccatatcc accgagtcct
ggacaagatc 1680acagacactt tgatccacct gatggccaag gcaggcctga
ccctgcagca gcagcaccag 1740cggctggccc agctcctcct catcctctcc
cacatcaggc acatgagtaa caaaggcatg 1800gagcatctgt acagcatgaa
gtgcaagaac gtggtgcccc tctatgacct gctgctggag 1860atgctggacg
cccaccgcct acatgcgccc actagccgtg gaggggcatc cgtggaggag
1920acggaccaaa gccacttggc cactgcgggc tctacttcat cgcattcctt
gcaaaagtat 1980tacatcacgg gggaggcaga gggtttccct gccacggtct
gagagctccc tggctcccac 2040acggttcaga taatccctgc tgcattttac
cctcatcatg caccacttta gccaaattct 2100gtctcctgca tacactccgg
catgcatcca acaccaatgg ctttctagat gagtggccat 2160tcatttgctt
gctcagttct tagtggcaca tcttctgtct tctgttggga acagccaaag
2220ggattccaag gctaaatctt tgtaacagct ctctttcccc cttgctatgt
tactaagcgt 2280gaggattccc gtagctcttc acagctgaac tcagtctatg
ggttggggct cagataactc 2340tgtgcattta agctacttgt agagacccag
gcctggagag tagacatttt gcctctgata 2400agcacttttt aaatggctct
aagaataagc cacagcaaag aatttaaagt ggctccttta 2460attggtgact
tggagaaagc taggtcaagg gtttattata gcaccctctt gtattcctat
2520ggcaatgcat ccttttatga aagtggtaca ccttaaagct tttatatgac
tgtagcagag 2580tatctggtga ttgtcaattc attcccccta taggaataca
aggggcacac agggaaggca 2640gatcccctag ttggcaagac tattttaact
tgatacactg cagattcaga tgtgctgaaa 2700gctctgcctc tggctttccg
gtcatgggtt ccagttaatt catgcctccc atggacctat 2760ggagagcagc
aagttgatct tagttaagtc tccctatatg agggataagt tcctgatttt
2820tgtttttatt tttgtgttac aaaagaaagc cctccctccc tgaacttgca
gtaaggtcag 2880cttcaggacc tgttccagtg ggcactgtac ttggatcttc
ccggcgtgtg tgtgccttac 2940acaggggtga actgttcact gtggtgatgc
atgatgaggg taaatggtag ttgaaaggag 3000caggggccct ggtgttgcat
ttagccctgg ggcatggagc tgaacagtac ttgtgcagga 3060ttgttgtggc
tactagagaa caagagggaa agtagggcag aaactggata cagttctgag
3120gcacagccag acttgctcag ggtggccctg ccacaggctg cagctaccta
ggaacattcc 3180ttgcagaccc cgcattgccc tttgggggtg ccctgggatc
cctggggtag tccagctctt 3240cttcatttcc cagcgtggcc ctggttggaa
gaagcagctg tcacagctgc tgtagacagc 3300tgtgttccta caattggccc
agcaccctgg ggcacgggag aagggtgggg accgttgctg 3360tcactactca
ggctgactgg ggcctggtca gattacgtat gcccttggtg gtttagagat
3420aatccaaaat cagggtttgg tttggggaag aaaatcctcc cccttcctcc
cccgccccgt 3480tccctaccgc ctccactcct gccagctcat ttccttcaat
ttcctttgac ctataggcta 3540aaaaagaaag gctcattcca gccacagggc
agccttccct gggcctttgc ttctctagca 3600caattatggg ttacttcctt
tttcttaaca aaaaagaatg tttgatttcc tctgggtgac 3660cttattgtct
gtaattgaaa ccctattgag aggtgatgtc tgtgttagcc aatgacccag
3720gtgagctgct cgggcttctc ttggtatgtc ttgtttggaa aagtggattt
cattcatttc 3780tgattgtcca gttaagtgat caccaaagga ctgagaatct
gggagggcaa aaaaaaaaaa 3840aaagttttta tgtgcactta aatttgggga
caattttatg tatctgtgtt aaggatatgt 3900ttaagaacat aattcttttg
ttgctgtttg tttaagaagc accttagttt gtttaagaag 3960caccttatat
agtataatat atattttttt gaaattacat tgcttgttta tcagacaatt
4020gaatgtagta attctgttct ggatttaatt tgactgggtt aacatgcaaa
aaccaaggaa 4080aaatatttag tttttttttt tttttttgta tacttttcaa
gctaccttgt catgtataca 4140gtcatttatg cctaaagcct ggtgattatt
catttaaatg aagatcacat ttcatatcaa 4200cttttgtatc cacagtagac
aaaatagcac taatccagat gcctattgtt ggatactgaa 4260tgacagacaa
tcttatgtag caaagattat gcctgaaaag gaaaattatt cagggcagct
4320aattttgctt ttaccaaaat atcagtagta atatttttgg acagtagcta
atgggtcagt 4380gggttctttt taatgtttat acttagattt tcttttaaaa
aaattaaaat aaaacaaaaa 4440aaaatttcta ggactagacg atgtaatacc
agctaaagcc aaacaattat acagtggaag 4500gttttacatt attcatccaa
tgtgtttcta ttcatgttaa gatactacta catttgaagt 4560gggcagagaa
catcagatga ttgaaatgtt cgcccagggg tctccagcaa ctttggaaat
4620ctctttgtat ttttacttga agtgccacta atggacagca gatattttct
ggctgatgtt 4680ggtattgggt gtaggaacat gatttaaaaa aaaactcttg
cctctgcttt cccccactct 4740gaggcaagtt aaaatgtaaa agatgtgatt
tatctggggg gctcaggtat ggtggggaag 4800tggattcagg aatctgggga
atggcaaata tattaagaag agtattgaaa gtatttggag 4860gaaaatggtt
aattctgggt gtgcaccagg gttcagtaga gtccacttct gccctggaga
4920ccacaaatca actagctcca tttacagcca tttctaaaat ggcagcttca
gttctagaga 4980agaaagaaca acatcagcag taaagtccat ggaatagcta
gtggtctgtg tttcttttcg 5040ccattgccta gcttgccgta atgattctat
aatgccatca tgcagcaatt atgagaggct 5100aggtcatcca aagagaagac
cctatcaatg taggttgcaa aatctaaccc ctaaggaagt 5160gcagtctttg
atttgatttc cctagtaacc ttgcagatat gtttaaccaa gccatagccc
5220atgccttttg agggctgaac aaataaggga cttactgata atttactttt
gatcacatta 5280aggtgttctc accttgaaat cttatacact gaaatggcca
ttgatttagg ccactggctt 5340agagtactcc ttcccctgca tgacactgat
tacaaatact ttcctattca tactttccaa 5400ttatgagatg gactgtgggt
actgggagtg atcactaaca ccatagtaat gtctaatatt 5460cacaggcaga
tctgcttggg gaagctagtt atgtgaaagg caaatagagt catacagtag
5520ctcaaaaggc aaccataatt ctctttggtg caggtcttgg gagcgtgatc
tagattacac 5580tgcaccattc ccaagttaat cccctgaaaa cttactctca
actggagcaa atgaactttg 5640gtcccaaata tccatctttt cagtagcgtt
aattatgctc tgtttccaac tgcatttcct 5700ttccaattga attaaagtgt
ggcctcgttt ttagtcattt aaaattgttt tctaagtaat 5760tgctgcctct
attatggcac ttcaattttg cactgtcttt tgagattcaa gaaaaatttc
5820tattcttttt tttgcatcca attgtgcctg aacttttaaa atatgtaaat
gctgccatgt 5880tccaaaccca tcgtcagtgt gtgtgtttag agctgtgcac
cctagaaaca acatattgtc 5940ccatgagcag gtgcctgaga cacagacccc
tttgcattca cagagaggtc attggttata 6000gagacttgaa ttaataagtg
acattatgcc agtttctgtt ctctcacagg tgataaacaa 6060tgctttttgt
gcactacata ctcttcagtg tagagctctt gttttatggg aaaaggctca
6120aatgccaaat tgtgtttgat ggattaatat gcccttttgc cgatgcatac
tattactgat 6180gtgactcggt tttgtcgcag ctttgctttg tttaatgaaa
cacacttgta aacctctttt 6240gcactttgaa aaagaatcca gcgggatgct
cgagcacctg taaacaattt tctcaaccta 6300tttgatgttc aaataaagaa
ttaaactaaa 6330
<210> 32 <211> 6357 <212> DNA <213> Homo sapiens
<400> 32
gacaaacaga gatatatcgg
agtctggcac ggggcacata aggcagcaca ttagagaaag 60ccggcccctg gatccgtctt
tcgcgtttat tttaagccca gtcttccctg ggccaccttt 120agcagatcct
cgtgcgcccc cgccccctgg ccgtgaaact cagcctctat ccagcagcga
180cgacaagtaa agtggcccgc cggtttctga gccttctgcc ctgcggggac
acggtctgca 240ccctgcccgc ggccacggac catgaccatg accctccaca
ccaaagcatc tgggatggcc 300ctactgcatc agatccaagg gaacgagctg
gagcccctga accgtccgca gctcaagatc 360cccctggagc ggcccctggg
cgaggtgtac ctggacagca gcaagcccgc cgtgtacaac 420taccccgagg
gcgccgccta cgagttcaac gccgcggccg ccgccaacgc gcaggtctac
480ggtcagaccg gcctccccta cggccccggg tctgaggctg cggcgttcgg
ctccaacggc 540ctggggggtt tccccccact caacagcgtg tctccgagcc
cgctgatgct actgcacccg 600ccgccgcagc tgtcgccttt cctgcagccc
cacggccagc aggtgcccta ctacctggag 660aacgagccca gcggctacac
ggtgcgcgag gccggcccgc cggcattcta caggccaaat 720tcagataatc
gacgccaggg tggcagagaa agattggcca gtaccaatga caagggaagt
780atggctatgg aatctgccaa ggagactcgc tactgtgcag tgtgcaatga
ctatgcttca 840ggctaccatt atggagtctg gtcctgtgag ggctgcaagg
ccttcttcaa gagaagtatt 900caaggacata acgactatat gtgtccagcc
accaaccagt gcaccattga taaaaacagg 960aggaagagct gccaggcctg
ccggctccgc aaatgctacg aagtgggaat gatgaaaggt 1020gggatacgaa
aagaccgaag aggagggaga atgttgaaac acaagcgcca gagagatgat
1080ggggagggca ggggtgaagt ggggtctgct ggagacatga gagctgccaa
cctttggcca 1140agcccgctca tgatcaaacg ctctaagaag aacagcctgg
ccttgtccct gacggccgac 1200cagatggtca gtgccttgtt ggatgctgag
ccccccatac tctattccga gtatgatcct 1260accagaccct tcagtgaagc
ttcgatgatg ggcttactga ccaacctggc agacagggag 1320ctggttcaca
tgatcaactg ggcgaagagg gtgccaggct ttgtggattt gaccctccat
1380gatcaggtcc accttctaga atgtgcctgg ctagagatcc tgatgattgg
tctcgtctgg 1440cgctccatgg agcacccagg gaagctactg tttgctccta
acttgctctt ggacaggaac 1500cagggaaaat gtgtagaggg catggtggag
atcttcgaca tgctgctggc tacatcatct 1560cggttccgca tgatgaatct
gcagggagag gagtttgtgt gcctcaaatc tattattttg 1620cttaattctg
gagtgtacac atttctgtcc agcaccctga agtctctgga agagaaggac
1680catatccacc gagtcctgga caagatcaca gacactttga tccacctgat
ggccaaggca 1740ggcctgaccc tgcagcagca gcaccagcgg ctggcccagc
tcctcctcat cctctcccac 1800atcaggcaca tgagtaacaa aggcatggag
catctgtaca gcatgaagtg caagaacgtg 1860gtgcccctct atgacctgct
gctggagatg ctggacgccc accgcctaca tgcgcccact 1920agccgtggag
gggcatccgt ggaggagacg gaccaaagcc acttggccac tgcgggctct
1980acttcatcgc attccttgca aaagtattac atcacggggg aggcagaggg
tttccctgcc 2040acggtctgag agctccctgg ctcccacacg gttcagataa
tccctgctgc attttaccct 2100catcatgcac cactttagcc aaattctgtc
tcctgcatac actccggcat gcatccaaca 2160ccaatggctt tctagatgag
tggccattca tttgcttgct cagttcttag tggcacatct 2220tctgtcttct
gttgggaaca gccaaaggga ttccaaggct aaatctttgt aacagctctc
2280tttccccctt gctatgttac taagcgtgag gattcccgta gctcttcaca
gctgaactca 2340gtctatgggt tggggctcag ataactctgt gcatttaagc
tacttgtaga gacccaggcc 2400tggagagtag acattttgcc tctgataagc
actttttaaa tggctctaag aataagccac 2460agcaaagaat ttaaagtggc
tcctttaatt ggtgacttgg agaaagctag gtcaagggtt 2520tattatagca
ccctcttgta ttcctatggc aatgcatcct tttatgaaag tggtacacct
2580taaagctttt atatgactgt agcagagtat ctggtgattg tcaattcatt
ccccctatag 2640gaatacaagg ggcacacagg gaaggcagat cccctagttg
gcaagactat tttaacttga 2700tacactgcag attcagatgt gctgaaagct
ctgcctctgg ctttccggtc atgggttcca 2760gttaattcat gcctcccatg
gacctatgga gagcagcaag ttgatcttag ttaagtctcc 2820ctatatgagg
gataagttcc tgatttttgt ttttattttt gtgttacaaa agaaagccct
2880ccctccctga acttgcagta aggtcagctt caggacctgt tccagtgggc
actgtacttg 2940gatcttcccg gcgtgtgtgt gccttacaca ggggtgaact
gttcactgtg gtgatgcatg 3000atgagggtaa atggtagttg aaaggagcag
gggccctggt gttgcattta gccctggggc 3060atggagctga acagtacttg
tgcaggattg ttgtggctac tagagaacaa gagggaaagt 3120agggcagaaa
ctggatacag ttctgaggca cagccagact tgctcagggt ggccctgcca
3180caggctgcag ctacctagga acattccttg cagaccccgc attgcccttt
gggggtgccc 3240tgggatccct ggggtagtcc agctcttctt catttcccag
cgtggccctg gttggaagaa 3300gcagctgtca cagctgctgt agacagctgt
gttcctacaa ttggcccagc accctggggc 3360acgggagaag ggtggggacc
gttgctgtca ctactcaggc tgactggggc ctggtcagat 3420tacgtatgcc
cttggtggtt tagagataat ccaaaatcag ggtttggttt ggggaagaaa
3480atcctccccc ttcctccccc gccccgttcc ctaccgcctc cactcctgcc
agctcatttc 3540cttcaatttc ctttgaccta taggctaaaa aagaaaggct
cattccagcc acagggcagc 3600cttccctggg cctttgcttc tctagcacaa
ttatgggtta cttccttttt cttaacaaaa 3660aagaatgttt gatttcctct
gggtgacctt attgtctgta attgaaaccc tattgagagg 3720tgatgtctgt
gttagccaat gacccaggtg agctgctcgg gcttctcttg gtatgtcttg
3780tttggaaaag tggatttcat tcatttctga ttgtccagtt aagtgatcac
caaaggactg 3840agaatctggg agggcaaaaa aaaaaaaaaa gtttttatgt
gcacttaaat ttggggacaa 3900ttttatgtat ctgtgttaag gatatgttta
agaacataat tcttttgttg ctgtttgttt 3960aagaagcacc ttagtttgtt
taagaagcac cttatatagt ataatatata tttttttgaa 4020attacattgc
ttgtttatca gacaattgaa tgtagtaatt ctgttctgga tttaatttga
4080ctgggttaac atgcaaaaac caaggaaaaa tatttagttt tttttttttt
ttttgtatac 4140ttttcaagct accttgtcat gtatacagtc atttatgcct
aaagcctggt gattattcat 4200ttaaatgaag atcacatttc atatcaactt
ttgtatccac agtagacaaa atagcactaa 4260tccagatgcc tattgttgga
tactgaatga cagacaatct tatgtagcaa agattatgcc 4320tgaaaaggaa
aattattcag ggcagctaat tttgctttta ccaaaatatc agtagtaata
4380tttttggaca gtagctaatg ggtcagtggg ttctttttaa tgtttatact
tagattttct 4440tttaaaaaaa ttaaaataaa acaaaaaaaa atttctagga
ctagacgatg taataccagc 4500taaagccaaa caattataca gtggaaggtt
ttacattatt catccaatgt gtttctattc 4560atgttaagat actactacat
ttgaagtggg cagagaacat cagatgattg aaatgttcgc 4620ccaggggtct
ccagcaactt tggaaatctc tttgtatttt tacttgaagt gccactaatg 4680gacagcagat
attttctggc tgatgttggt attgggtgta ggaacatgat ttaaaaaaaa
4740actcttgcct ctgctttccc ccactctgag gcaagttaaa atgtaaaaga
tgtgatttat 4800ctggggggct caggtatggt ggggaagtgg attcaggaat
ctggggaatg gcaaatatat 4860taagaagagt attgaaagta tttggaggaa
aatggttaat tctgggtgtg caccagggtt 4920cagtagagtc cacttctgcc
ctggagacca caaatcaact agctccattt acagccattt 4980ctaaaatggc
agcttcagtt ctagagaaga aagaacaaca tcagcagtaa agtccatgga
5040atagctagtg gtctgtgttt cttttcgcca ttgcctagct tgccgtaatg
attctataat 5100gccatcatgc agcaattatg agaggctagg tcatccaaag
agaagaccct atcaatgtag 5160gttgcaaaat ctaaccccta aggaagtgca
gtctttgatt tgatttccct agtaaccttg 5220cagatatgtt taaccaagcc
atagcccatg ccttttgagg gctgaacaaa taagggactt 5280actgataatt
tacttttgat cacattaagg tgttctcacc ttgaaatctt atacactgaa
5340atggccattg atttaggcca ctggcttaga gtactccttc ccctgcatga
cactgattac 5400aaatactttc ctattcatac tttccaatta tgagatggac
tgtgggtact gggagtgatc 5460actaacacca tagtaatgtc taatattcac
aggcagatct gcttggggaa gctagttatg 5520tgaaaggcaa atagagtcat
acagtagctc aaaaggcaac cataattctc tttggtgcag 5580gtcttgggag
cgtgatctag attacactgc accattccca agttaatccc ctgaaaactt
5640actctcaact ggagcaaatg aactttggtc ccaaatatcc atcttttcag
tagcgttaat 5700tatgctctgt ttccaactgc atttcctttc caattgaatt
aaagtgtggc ctcgttttta 5760gtcatttaaa attgttttct aagtaattgc
tgcctctatt atggcacttc aattttgcac 5820tgtcttttga gattcaagaa
aaatttctat tctttttttt gcatccaatt gtgcctgaac 5880ttttaaaata
tgtaaatgct gccatgttcc aaacccatcg tcagtgtgtg tgtttagagc
5940tgtgcaccct agaaacaaca tattgtccca tgagcaggtg cctgagacac
agaccccttt 6000gcattcacag agaggtcatt ggttatagag acttgaatta
ataagtgaca ttatgccagt 6060ttctgttctc tcacaggtga taaacaatgc
tttttgtgca ctacatactc ttcagtgtag 6120agctcttgtt ttatgggaaa
aggctcaaat gccaaattgt gtttgatgga ttaatatgcc 6180cttttgccga
tgcatactat tactgatgtg actcggtttt gtcgcagctt tgctttgttt
6240aatgaaacac acttgtaaac ctcttttgca ctttgaaaaa gaatccagcg
ggatgctcga 6300gcacctgtaa acaattttct caacctattt gatgttcaaa
taaagaatta aactaaa 6357
33 6314 DNA Homo sapiens
33 aaacacatcc
acacactctc tctgcctagt tcacacactg agccactcgc acatgcgagc 60acattccttc
cttccttctc actctctcgg cccttgactt ctacaagccc atggaacatt
120tctggaaaga cgttcttgat ccagcagggt ggcccgccgg tttctgagcc
ttctgccctg 180cggggacacg gtctgcaccc tgcccgcggc cacggaccat
gaccatgacc ctccacacca 240aagcatctgg gatggcccta ctgcatcaga
tccaagggaa cgagctggag cccctgaacc 300gtccgcagct caagatcccc
ctggagcggc ccctgggcga ggtgtacctg gacagcagca 360agcccgccgt
gtacaactac cccgagggcg ccgcctacga gttcaacgcc gcggccgccg
420ccaacgcgca ggtctacggt cagaccggcc tcccctacgg ccccgggtct
gaggctgcgg 480cgttcggctc caacggcctg gggggtttcc ccccactcaa
cagcgtgtct ccgagcccgc 540tgatgctact gcacccgccg ccgcagctgt
cgcctttcct gcagccccac ggccagcagg 600tgccctacta cctggagaac
gagcccagcg gctacacggt gcgcgaggcc ggcccgccgg 660cattctacag
gccaaattca gataatcgac gccagggtgg cagagaaaga ttggccagta
720ccaatgacaa gggaagtatg gctatggaat ctgccaagga gactcgctac
tgtgcagtgt 780gcaatgacta tgcttcaggc taccattatg gagtctggtc
ctgtgagggc tgcaaggcct 840tcttcaagag aagtattcaa ggacataacg
actatatgtg tccagccacc aaccagtgca 900ccattgataa aaacaggagg
aagagctgcc aggcctgccg gctccgcaaa tgctacgaag 960tgggaatgat
gaaaggtggg atacgaaaag accgaagagg agggagaatg ttgaaacaca
1020agcgccagag agatgatggg gagggcaggg gtgaagtggg gtctgctgga
gacatgagag 1080ctgccaacct ttggccaagc ccgctcatga tcaaacgctc
taagaagaac agcctggcct 1140tgtccctgac ggccgaccag atggtcagtg
ccttgttgga tgctgagccc cccatactct 1200attccgagta tgatcctacc
agacccttca gtgaagcttc gatgatgggc ttactgacca 1260acctggcaga
cagggagctg gttcacatga tcaactgggc gaagagggtg ccaggctttg
1320tggatttgac cctccatgat caggtccacc ttctagaatg tgcctggcta
gagatcctga 1380tgattggtct cgtctggcgc tccatggagc acccagggaa
gctactgttt gctcctaact 1440tgctcttgga caggaaccag ggaaaatgtg
tagagggcat ggtggagatc ttcgacatgc 1500tgctggctac atcatctcgg
ttccgcatga tgaatctgca gggagaggag tttgtgtgcc 1560tcaaatctat
tattttgctt aattctggag tgtacacatt tctgtccagc accctgaagt
1620ctctggaaga gaaggaccat atccaccgag tcctggacaa gatcacagac
actttgatcc 1680acctgatggc caaggcaggc ctgaccctgc agcagcagca
ccagcggctg gcccagctcc 1740tcctcatcct ctcccacatc aggcacatga
gtaacaaagg catggagcat ctgtacagca 1800tgaagtgcaa gaacgtggtg
cccctctatg acctgctgct ggagatgctg gacgcccacc 1860gcctacatgc
gcccactagc cgtggagggg catccgtgga ggagacggac caaagccact
1920tggccactgc gggctctact tcatcgcatt ccttgcaaaa gtattacatc
acgggggagg 1980cagagggttt ccctgccacg gtctgagagc tccctggctc
ccacacggtt cagataatcc 2040ctgctgcatt ttaccctcat catgcaccac
tttagccaaa ttctgtctcc tgcatacact 2100ccggcatgca tccaacacca
atggctttct agatgagtgg ccattcattt gcttgctcag 2160ttcttagtgg
cacatcttct gtcttctgtt gggaacagcc aaagggattc caaggctaaa
2220tctttgtaac agctctcttt cccccttgct atgttactaa gcgtgaggat
tcccgtagct 2280cttcacagct gaactcagtc tatgggttgg ggctcagata
actctgtgca tttaagctac 2340ttgtagagac ccaggcctgg agagtagaca
ttttgcctct gataagcact ttttaaatgg 2400ctctaagaat aagccacagc
aaagaattta aagtggctcc tttaattggt gacttggaga 2460aagctaggtc
aagggtttat tatagcaccc tcttgtattc ctatggcaat gcatcctttt
2520atgaaagtgg tacaccttaa agcttttata tgactgtagc agagtatctg
gtgattgtca 2580attcattccc cctataggaa tacaaggggc acacagggaa
ggcagatccc ctagttggca 2640agactatttt aacttgatac actgcagatt
cagatgtgct gaaagctctg cctctggctt 2700tccggtcatg ggttccagtt
aattcatgcc tcccatggac ctatggagag cagcaagttg 2760atcttagtta
agtctcccta tatgagggat aagttcctga tttttgtttt tatttttgtg
2820ttacaaaaga aagccctccc tccctgaact tgcagtaagg tcagcttcag
gacctgttcc 2880agtgggcact gtacttggat cttcccggcg tgtgtgtgcc
ttacacaggg gtgaactgtt 2940cactgtggtg atgcatgatg agggtaaatg
gtagttgaaa ggagcagggg ccctggtgtt 3000gcatttagcc ctggggcatg
gagctgaaca gtacttgtgc aggattgttg tggctactag 3060agaacaagag
ggaaagtagg gcagaaactg gatacagttc tgaggcacag ccagacttgc
3120tcagggtggc cctgccacag gctgcagcta cctaggaaca ttccttgcag
accccgcatt 3180gccctttggg ggtgccctgg gatccctggg gtagtccagc
tcttcttcat ttcccagcgt 3240ggccctggtt ggaagaagca gctgtcacag
ctgctgtaga cagctgtgtt cctacaattg 3300gcccagcacc ctggggcacg
ggagaagggt ggggaccgtt gctgtcacta ctcaggctga 3360ctggggcctg
gtcagattac gtatgccctt ggtggtttag agataatcca aaatcagggt
3420ttggtttggg gaagaaaatc ctcccccttc ctcccccgcc ccgttcccta
ccgcctccac 3480tcctgccagc tcatttcctt caatttcctt tgacctatag
gctaaaaaag aaaggctcat 3540tccagccaca gggcagcctt ccctgggcct
ttgcttctct agcacaatta tgggttactt 3600cctttttctt aacaaaaaag
aatgtttgat ttcctctggg tgaccttatt gtctgtaatt 3660gaaaccctat
tgagaggtga tgtctgtgtt agccaatgac ccaggtgagc tgctcgggct
3720tctcttggta tgtcttgttt ggaaaagtgg atttcattca tttctgattg
tccagttaag 3780tgatcaccaa aggactgaga atctgggagg gcaaaaaaaa
aaaaaaagtt tttatgtgca 3840cttaaatttg gggacaattt tatgtatctg
tgttaaggat atgtttaaga acataattct 3900tttgttgctg tttgtttaag
aagcacctta gtttgtttaa gaagcacctt atatagtata 3960atatatattt
ttttgaaatt acattgcttg tttatcagac aattgaatgt agtaattctg
4020ttctggattt aatttgactg ggttaacatg caaaaaccaa ggaaaaatat
ttagtttttt 4080tttttttttt tgtatacttt tcaagctacc ttgtcatgta
tacagtcatt tatgcctaaa 4140gcctggtgat tattcattta aatgaagatc
acatttcata tcaacttttg tatccacagt 4200agacaaaata gcactaatcc
agatgcctat tgttggatac tgaatgacag acaatcttat 4260gtagcaaaga
ttatgcctga aaaggaaaat tattcagggc agctaatttt gcttttacca
4320aaatatcagt agtaatattt ttggacagta gctaatgggt cagtgggttc
tttttaatgt 4380ttatacttag attttctttt aaaaaaatta aaataaaaca
aaaaaaaatt tctaggacta 4440gacgatgtaa taccagctaa agccaaacaa
ttatacagtg gaaggtttta cattattcat 4500ccaatgtgtt tctattcatg
ttaagatact actacatttg aagtgggcag agaacatcag 4560atgattgaaa
tgttcgccca ggggtctcca gcaactttgg aaatctcttt gtatttttac
4620ttgaagtgcc actaatggac agcagatatt ttctggctga tgttggtatt
gggtgtagga 4680acatgattta aaaaaaaact cttgcctctg ctttccccca
ctctgaggca agttaaaatg 4740taaaagatgt gatttatctg gggggctcag
gtatggtggg gaagtggatt caggaatctg 4800gggaatggca aatatattaa
gaagagtatt gaaagtattt ggaggaaaat ggttaattct 4860gggtgtgcac
cagggttcag tagagtccac ttctgccctg gagaccacaa atcaactagc
4920tccatttaca gccatttcta aaatggcagc ttcagttcta gagaagaaag
aacaacatca 4980gcagtaaagt ccatggaata gctagtggtc tgtgtttctt
ttcgccattg cctagcttgc 5040cgtaatgatt ctataatgcc atcatgcagc
aattatgaga ggctaggtca tccaaagaga 5100agaccctatc aatgtaggtt
gcaaaatcta acccctaagg aagtgcagtc tttgatttga 5160tttccctagt
aaccttgcag atatgtttaa ccaagccata gcccatgcct tttgagggct
5220gaacaaataa gggacttact gataatttac ttttgatcac attaaggtgt
tctcaccttg 5280aaatcttata cactgaaatg gccattgatt taggccactg
gcttagagta ctccttcccc 5340tgcatgacac tgattacaaa tactttccta
ttcatacttt ccaattatga gatggactgt 5400gggtactggg agtgatcact
aacaccatag taatgtctaa tattcacagg cagatctgct 5460tggggaagct
agttatgtga aaggcaaata gagtcataca gtagctcaaa aggcaaccat
5520aattctcttt ggtgcaggtc ttgggagcgt gatctagatt acactgcacc
attcccaagt 5580taatcccctg aaaacttact ctcaactgga gcaaatgaac
tttggtccca aatatccatc 5640ttttcagtag cgttaattat gctctgtttc
caactgcatt tcctttccaa ttgaattaaa 5700gtgtggcctc gtttttagtc
atttaaaatt gttttctaag taattgctgc ctctattatg 5760gcacttcaat
tttgcactgt cttttgagat tcaagaaaaa tttctattct tttttttgca
5820tccaattgtg cctgaacttt taaaatatgt aaatgctgcc atgttccaaa
cccatcgtca 5880gtgtgtgtgt ttagagctgt gcaccctaga aacaacatat
tgtcccatga gcaggtgcct 5940gagacacaga cccctttgca ttcacagaga
ggtcattggt tatagagact tgaattaata 6000agtgacatta tgccagtttc
tgttctctca caggtgataa acaatgcttt ttgtgcacta 6060catactcttc
agtgtagagc tcttgtttta tgggaaaagg ctcaaatgcc aaattgtgtt
6120tgatggatta atatgccctt ttgccgatgc atactattac tgatgtgact
cggttttgtc 6180gcagctttgc tttgtttaat gaaacacact tgtaaacctc
ttttgcactt tgaaaaagaa 6240tccagcggga tgctcgagca cctgtaaaca
attttctcaa cctatttgat gttcaaataa 6300agaattaaac taaa
6314
34 6466 DNA Homo sapiens
34 atggtcataa cagcctcctg tctaccgact
cagaacggat tttaccaaaa ctgaaaatgc 60aggctccatg ctcagaagct ctttaacagg
ctcgaaaggt ccatgctcct ttctcctgcc 120cattctatag cataagaaga
cagtctctga gtgataatct tctcttcaag aagaagaaaa 180ctaggaagga
gtaagcacaa agatctcttc acattctccg ggactgcggt accaaatatc
240agcacagcac ttcttgaaaa aggatgtaga ttttaatctg aactttgaac
catcactgag 300gtggcccgcc ggtttctgag ccttctgccc tgcggggaca
cggtctgcac cctgcccgcg 360gccacggacc atgaccatga ccctccacac
caaagcatct gggatggccc tactgcatca 420gatccaaggg aacgagctgg
agcccctgaa ccgtccgcag ctcaagatcc ccctggagcg 480gcccctgggc
gaggtgtacc tggacagcag caagcccgcc gtgtacaact accccgaggg
540cgccgcctac gagttcaacg ccgcggccgc cgccaacgcg caggtctacg
gtcagaccgg 600cctcccctac ggccccgggt ctgaggctgc ggcgttcggc
tccaacggcc tggggggttt 660ccccccactc aacagcgtgt ctccgagccc
gctgatgcta ctgcacccgc cgccgcagct 720gtcgcctttc ctgcagcccc
acggccagca ggtgccctac tacctggaga acgagcccag 780cggctacacg
gtgcgcgagg ccggcccgcc ggcattctac aggccaaatt cagataatcg
840acgccagggt ggcagagaaa gattggccag taccaatgac aagggaagta
tggctatgga 900atctgccaag gagactcgct actgtgcagt gtgcaatgac
tatgcttcag gctaccatta 960tggagtctgg tcctgtgagg gctgcaaggc
cttcttcaag agaagtattc aaggacataa 1020cgactatatg tgtccagcca
ccaaccagtg caccattgat aaaaacagga ggaagagctg 1080ccaggcctgc
cggctccgca aatgctacga agtgggaatg atgaaaggtg ggatacgaaa
1140agaccgaaga ggagggagaa tgttgaaaca caagcgccag agagatgatg
gggagggcag 1200gggtgaagtg gggtctgctg gagacatgag agctgccaac
ctttggccaa gcccgctcat 1260gatcaaacgc tctaagaaga acagcctggc
cttgtccctg acggccgacc agatggtcag 1320tgccttgttg gatgctgagc
cccccatact ctattccgag tatgatccta ccagaccctt 1380cagtgaagct
tcgatgatgg gcttactgac caacctggca gacagggagc tggttcacat
1440gatcaactgg gcgaagaggg tgccaggctt tgtggatttg accctccatg
atcaggtcca 1500ccttctagaa tgtgcctggc tagagatcct gatgattggt
ctcgtctggc gctccatgga 1560gcacccaggg aagctactgt ttgctcctaa
cttgctcttg gacaggaacc agggaaaatg 1620tgtagagggc atggtggaga
tcttcgacat gctgctggct acatcatctc ggttccgcat 1680gatgaatctg
cagggagagg agtttgtgtg cctcaaatct attattttgc ttaattctgg
1740agtgtacaca tttctgtcca gcaccctgaa gtctctggaa gagaaggacc
atatccaccg 1800agtcctggac aagatcacag acactttgat ccacctgatg
gccaaggcag gcctgaccct 1860gcagcagcag caccagcggc tggcccagct
cctcctcatc ctctcccaca tcaggcacat 1920gagtaacaaa ggcatggagc
atctgtacag catgaagtgc aagaacgtgg tgcccctcta 1980tgacctgctg
ctggagatgc tggacgccca ccgcctacat gcgcccacta gccgtggagg
2040ggcatccgtg gaggagacgg accaaagcca cttggccact gcgggctcta
cttcatcgca 2100ttccttgcaa aagtattaca tcacggggga ggcagagggt
ttccctgcca cggtctgaga 2160gctccctggc tcccacacgg ttcagataat
ccctgctgca ttttaccctc atcatgcacc 2220actttagcca aattctgtct
cctgcataca ctccggcatg catccaacac caatggcttt 2280ctagatgagt
ggccattcat ttgcttgctc agttcttagt ggcacatctt ctgtcttctg
2340ttgggaacag ccaaagggat tccaaggcta aatctttgta acagctctct
ttcccccttg 2400ctatgttact aagcgtgagg attcccgtag ctcttcacag
ctgaactcag tctatgggtt 2460ggggctcaga taactctgtg catttaagct
acttgtagag acccaggcct ggagagtaga 2520cattttgcct ctgataagca
ctttttaaat ggctctaaga ataagccaca gcaaagaatt 2580taaagtggct
cctttaattg gtgacttgga gaaagctagg tcaagggttt attatagcac
2640cctcttgtat tcctatggca atgcatcctt ttatgaaagt ggtacacctt
aaagctttta 2700tatgactgta gcagagtatc tggtgattgt caattcattc
cccctatagg aatacaaggg 2760gcacacaggg aaggcagatc ccctagttgg
caagactatt ttaacttgat acactgcaga 2820ttcagatgtg ctgaaagctc
tgcctctggc tttccggtca tgggttccag ttaattcatg 2880cctcccatgg
acctatggag agcagcaagt tgatcttagt taagtctccc tatatgaggg
2940ataagttcct gatttttgtt tttatttttg tgttacaaaa gaaagccctc
cctccctgaa 3000cttgcagtaa ggtcagcttc aggacctgtt ccagtgggca
ctgtacttgg atcttcccgg 3060cgtgtgtgtg ccttacacag gggtgaactg
ttcactgtgg tgatgcatga tgagggtaaa 3120tggtagttga aaggagcagg
ggccctggtg ttgcatttag ccctggggca tggagctgaa 3180cagtacttgt
gcaggattgt tgtggctact agagaacaag agggaaagta gggcagaaac
3240tggatacagt tctgaggcac agccagactt gctcagggtg gccctgccac
aggctgcagc 3300tacctaggaa cattccttgc agaccccgca ttgccctttg
ggggtgccct gggatccctg 3360gggtagtcca gctcttcttc atttcccagc
gtggccctgg ttggaagaag cagctgtcac 3420agctgctgta gacagctgtg
ttcctacaat tggcccagca ccctggggca cgggagaagg 3480gtggggaccg
ttgctgtcac tactcaggct gactggggcc tggtcagatt acgtatgccc
3540ttggtggttt agagataatc caaaatcagg gtttggtttg gggaagaaaa
tcctccccct 3600tcctcccccg ccccgttccc taccgcctcc actcctgcca
gctcatttcc ttcaatttcc 3660tttgacctat aggctaaaaa agaaaggctc
attccagcca cagggcagcc ttccctgggc 3720ctttgcttct ctagcacaat
tatgggttac ttcctttttc ttaacaaaaa agaatgtttg 3780atttcctctg
ggtgacctta ttgtctgtaa ttgaaaccct attgagaggt gatgtctgtg
3840ttagccaatg acccaggtga gctgctcggg cttctcttgg tatgtcttgt
ttggaaaagt 3900ggatttcatt catttctgat tgtccagtta agtgatcacc
aaaggactga gaatctggga 3960gggcaaaaaa aaaaaaaaag tttttatgtg
cacttaaatt tggggacaat tttatgtatc 4020tgtgttaagg atatgtttaa
gaacataatt cttttgttgc tgtttgttta agaagcacct 4080tagtttgttt
aagaagcacc ttatatagta taatatatat ttttttgaaa ttacattgct
4140tgtttatcag acaattgaat gtagtaattc tgttctggat ttaatttgac
tgggttaaca 4200tgcaaaaacc aaggaaaaat atttagtttt tttttttttt
tttgtatact tttcaagcta 4260ccttgtcatg tatacagtca tttatgccta
aagcctggtg attattcatt taaatgaaga 4320tcacatttca tatcaacttt
tgtatccaca gtagacaaaa tagcactaat ccagatgcct 4380attgttggat
actgaatgac agacaatctt atgtagcaaa gattatgcct gaaaaggaaa
4440attattcagg gcagctaatt ttgcttttac caaaatatca gtagtaatat
ttttggacag 4500tagctaatgg gtcagtgggt tctttttaat gtttatactt
agattttctt ttaaaaaaat 4560taaaataaaa caaaaaaaaa tttctaggac
tagacgatgt aataccagct aaagccaaac 4620aattatacag tggaaggttt
tacattattc atccaatgtg tttctattca tgttaagata 4680ctactacatt
tgaagtgggc agagaacatc agatgattga aatgttcgcc caggggtctc
4740cagcaacttt ggaaatctct ttgtattttt acttgaagtg ccactaatgg
acagcagata 4800ttttctggct gatgttggta ttgggtgtag gaacatgatt
taaaaaaaaa ctcttgcctc 4860tgctttcccc cactctgagg caagttaaaa
tgtaaaagat gtgatttatc tggggggctc 4920aggtatggtg gggaagtgga
ttcaggaatc tggggaatgg caaatatatt aagaagagta 4980ttgaaagtat
ttggaggaaa atggttaatt ctgggtgtgc accagggttc agtagagtcc
5040acttctgccc tggagaccac aaatcaacta gctccattta cagccatttc
taaaatggca 5100gcttcagttc tagagaagaa agaacaacat cagcagtaaa
gtccatggaa tagctagtgg 5160tctgtgtttc ttttcgccat tgcctagctt
gccgtaatga ttctataatg ccatcatgca 5220gcaattatga gaggctaggt
catccaaaga gaagacccta tcaatgtagg ttgcaaaatc 5280taacccctaa
ggaagtgcag tctttgattt gatttcccta gtaaccttgc agatatgttt
5340aaccaagcca tagcccatgc cttttgaggg ctgaacaaat aagggactta
ctgataattt 5400acttttgatc acattaaggt gttctcacct tgaaatctta
tacactgaaa tggccattga 5460tttaggccac tggcttagag tactccttcc
cctgcatgac actgattaca aatactttcc 5520tattcatact ttccaattat
gagatggact gtgggtactg ggagtgatca ctaacaccat 5580agtaatgtct
aatattcaca ggcagatctg cttggggaag ctagttatgt gaaaggcaaa
5640tagagtcata cagtagctca aaaggcaacc ataattctct ttggtgcagg
tcttgggagc 5700gtgatctaga ttacactgca ccattcccaa gttaatcccc
tgaaaactta ctctcaactg 5760gagcaaatga actttggtcc caaatatcca
tcttttcagt agcgttaatt atgctctgtt 5820tccaactgca tttcctttcc
aattgaatta aagtgtggcc tcgtttttag tcatttaaaa 5880ttgttttcta
agtaattgct gcctctatta tggcacttca attttgcact gtcttttgag
5940attcaagaaa aatttctatt cttttttttg catccaattg tgcctgaact
tttaaaatat 6000gtaaatgctg ccatgttcca aacccatcgt cagtgtgtgt
gtttagagct gtgcacccta 6060gaaacaacat attgtcccat gagcaggtgc
ctgagacaca gacccctttg cattcacaga 6120gaggtcattg gttatagaga
cttgaattaa taagtgacat tatgccagtt tctgttctct 6180cacaggtgat
aaacaatgct ttttgtgcac tacatactct tcagtgtaga gctcttgttt
6240tatgggaaaa ggctcaaatg ccaaattgtg tttgatggat taatatgccc
ttttgccgat 6300gcatactatt actgatgtga ctcggttttg tcgcagcttt
gctttgttta atgaaacaca 6360cttgtaaacc tcttttgcac tttgaaaaag
aatccagcgg gatgctcgag cacctgtaaa 6420caattttctc aacctatttg
atgttcaaat aaagaattaa actaaa 6466
35 595 PRT Homo sapiens
35 Met Thr Met
Thr Leu His Thr Lys Ala Ser Gly Met Ala Leu Leu His 1 5 10 15 Gln
Ile Gln Gly Asn Glu Leu Glu Pro Leu Asn Arg Pro Gln Leu Lys 20 25
30 Ile Pro Leu Glu Arg Pro Leu Gly Glu Val Tyr Leu Asp Ser Ser Lys
35 40 45 Pro Ala Val Tyr Asn Tyr Pro Glu Gly Ala Ala Tyr Glu Phe
Asn Ala 50 55 60 Ala Ala Ala Ala Asn Ala Gln Val Tyr Gly Gln Thr
Gly Leu Pro Tyr 65 70 75 80 Gly Pro Gly Ser
Glu Ala Ala Ala Phe Gly Ser Asn Gly Leu Gly Gly 85 90 95 Phe Pro
Pro Leu Asn Ser Val Ser Pro Ser Pro Leu Met Leu Leu His 100 105 110
Pro Pro Pro Gln Leu Ser Pro Phe Leu Gln Pro His Gly Gln Gln Val 115
120 125 Pro Tyr Tyr Leu Glu Asn Glu Pro Ser Gly Tyr Thr Val Arg Glu
Ala 130 135 140 Gly Pro Pro Ala Phe Tyr Arg Pro Asn Ser Asp Asn Arg
Arg Gln Gly 145 150 155 160 Gly Arg Glu Arg Leu Ala Ser Thr Asn Asp
Lys Gly Ser Met Ala Met 165 170 175 Glu Ser Ala Lys Glu Thr Arg Tyr
Cys Ala Val Cys Asn Asp Tyr Ala 180 185 190 Ser Gly Tyr His Tyr Gly
Val Trp Ser Cys Glu Gly Cys Lys Ala Phe 195 200 205 Phe Lys Arg Ser
Ile Gln Gly His Asn Asp Tyr Met Cys Pro Ala Thr 210 215 220 Asn Gln
Cys Thr Ile Asp Lys Asn Arg Arg Lys Ser Cys Gln Ala Cys 225 230 235
240 Arg Leu Arg Lys Cys Tyr Glu Val Gly Met Met Lys Gly Gly Ile Arg
245 250 255 Lys Asp Arg Arg Gly Gly Arg Met Leu Lys His Lys Arg Gln
Arg Asp 260 265 270 Asp Gly Glu Gly Arg Gly Glu Val Gly Ser Ala Gly
Asp Met Arg Ala 275 280 285 Ala Asn Leu Trp Pro Ser Pro Leu Met Ile
Lys Arg Ser Lys Lys Asn 290 295 300 Ser Leu Ala Leu Ser Leu Thr Ala
Asp Gln Met Val Ser Ala Leu Leu 305 310 315 320 Asp Ala Glu Pro Pro
Ile Leu Tyr Ser Glu Tyr Asp Pro Thr Arg Pro 325 330 335 Phe Ser Glu
Ala Ser Met Met Gly Leu Leu Thr Asn Leu Ala Asp Arg 340 345 350 Glu
Leu Val His Met Ile Asn Trp Ala Lys Arg Val Pro Gly Phe Val 355 360
365 Asp Leu Thr Leu His Asp Gln Val His Leu Leu Glu Cys Ala Trp Leu
370 375 380 Glu Ile Leu Met Ile Gly Leu Val Trp Arg Ser Met Glu His
Pro Gly 385 390 395 400 Lys Leu Leu Phe Ala Pro Asn Leu Leu Leu Asp
Arg Asn Gln Gly Lys 405 410 415 Cys Val Glu Gly Met Val Glu Ile Phe
Asp Met Leu Leu Ala Thr Ser 420 425 430 Ser Arg Phe Arg Met Met Asn
Leu Gln Gly Glu Glu Phe Val Cys Leu 435 440 445 Lys Ser Ile Ile Leu
Leu Asn Ser Gly Val Tyr Thr Phe Leu Ser Ser 450 455 460 Thr Leu Lys
Ser Leu Glu Glu Lys Asp His Ile His Arg Val Leu Asp 465 470 475 480
Lys Ile Thr Asp Thr Leu Ile His Leu Met Ala Lys Ala Gly Leu Thr 485
490 495 Leu Gln Gln Gln His Gln Arg Leu Ala Gln Leu Leu Leu Ile Leu
Ser 500 505 510 His Ile Arg His Met Ser Asn Lys Gly Met Glu His Leu
Tyr Ser Met 515 520 525 Lys Cys Lys Asn Val Val Pro Leu Tyr Asp Leu
Leu Leu Glu Met Leu 530 535 540 Asp Ala His Arg Leu His Ala Pro Thr
Ser Arg Gly Gly Ala Ser Val 545 550 555 560 Glu Glu Thr Asp Gln Ser
His Leu Ala Thr Ala Gly Ser Thr Ser Ser 565 570 575 His Ser Leu Gln
Lys Tyr Tyr Ile Thr Gly Glu Ala Glu Gly Phe Pro 580 585 590 Ala Thr
Val 595
36 595 PRT Homo sapiens
36 Met Thr Met Thr Leu His Thr Lys Ala
Ser Gly Met Ala Leu Leu His 1 5 10 15 Gln Ile Gln Gly Asn Glu Leu
Glu Pro Leu Asn Arg Pro Gln Leu Lys 20 25 30 Ile Pro Leu Glu Arg
Pro Leu Gly Glu Val Tyr Leu Asp Ser Ser Lys 35 40 45 Pro Ala Val
Tyr Asn Tyr Pro Glu Gly Ala Ala Tyr Glu Phe Asn Ala 50 55 60 Ala
Ala Ala Ala Asn Ala Gln Val Tyr Gly Gln Thr Gly Leu Pro Tyr 65 70
75 80 Gly Pro Gly Ser Glu Ala Ala Ala Phe Gly Ser Asn Gly Leu Gly
Gly 85 90 95 Phe Pro Pro Leu Asn Ser Val Ser Pro Ser Pro Leu Met
Leu Leu His 100 105 110 Pro Pro Pro Gln Leu Ser Pro Phe Leu Gln Pro
His Gly Gln Gln Val 115 120 125 Pro Tyr Tyr Leu Glu Asn Glu Pro Ser
Gly Tyr Thr Val Arg Glu Ala 130 135 140 Gly Pro Pro Ala Phe Tyr Arg
Pro Asn Ser Asp Asn Arg Arg Gln Gly 145 150 155 160 Gly Arg Glu Arg
Leu Ala Ser Thr Asn Asp Lys Gly Ser Met Ala Met 165 170 175 Glu Ser
Ala Lys Glu Thr Arg Tyr Cys Ala Val Cys Asn Asp Tyr Ala 180 185 190
Ser Gly Tyr His Tyr Gly Val Trp Ser Cys Glu Gly Cys Lys Ala Phe 195
200 205 Phe Lys Arg Ser Ile Gln Gly His Asn Asp Tyr Met Cys Pro Ala
Thr 210 215 220 Asn Gln Cys Thr Ile Asp Lys Asn Arg Arg Lys Ser Cys
Gln Ala Cys 225 230 235 240 Arg Leu Arg Lys Cys Tyr Glu Val Gly Met
Met Lys Gly Gly Ile Arg 245 250 255 Lys Asp Arg Arg Gly Gly Arg Met
Leu Lys His Lys Arg Gln Arg Asp 260 265 270 Asp Gly Glu Gly Arg Gly
Glu Val Gly Ser Ala Gly Asp Met Arg Ala 275 280 285 Ala Asn Leu Trp
Pro Ser Pro Leu Met Ile Lys Arg Ser Lys Lys Asn 290 295 300 Ser Leu
Ala Leu Ser Leu Thr Ala Asp Gln Met Val Ser Ala Leu Leu 305 310 315
320 Asp Ala Glu Pro Pro Ile Leu Tyr Ser Glu Tyr Asp Pro Thr Arg Pro
325 330 335 Phe Ser Glu Ala Ser Met Met Gly Leu Leu Thr Asn Leu Ala
Asp Arg 340 345 350 Glu Leu Val His Met Ile Asn Trp Ala Lys Arg Val
Pro Gly Phe Val 355 360 365 Asp Leu Thr Leu His Asp Gln Val His Leu
Leu Glu Cys Ala Trp Leu 370 375 380 Glu Ile Leu Met Ile Gly Leu Val
Trp Arg Ser Met Glu His Pro Gly 385 390 395 400 Lys Leu Leu Phe Ala
Pro Asn Leu Leu Leu Asp Arg Asn Gln Gly Lys 405 410 415 Cys Val Glu
Gly Met Val Glu Ile Phe Asp Met Leu Leu Ala Thr Ser 420 425 430 Ser
Arg Phe Arg Met Met Asn Leu Gln Gly Glu Glu Phe Val Cys Leu 435 440
445 Lys Ser Ile Ile Leu Leu Asn Ser Gly Val Tyr Thr Phe Leu Ser Ser
450 455 460 Thr Leu Lys Ser Leu Glu Glu Lys Asp His Ile His Arg Val
Leu Asp 465 470 475 480 Lys Ile Thr Asp Thr Leu Ile His Leu Met Ala
Lys Ala Gly Leu Thr 485 490 495 Leu Gln Gln Gln His Gln Arg Leu Ala
Gln Leu Leu Leu Ile Leu Ser 500 505 510 His Ile Arg His Met Ser Asn
Lys Gly Met Glu His Leu Tyr Ser Met 515 520 525 Lys Cys Lys Asn Val
Val Pro Leu Tyr Asp Leu Leu Leu Glu Met Leu 530 535 540 Asp Ala His
Arg Leu His Ala Pro Thr Ser Arg Gly Gly Ala Ser Val 545 550 555 560
Glu Glu Thr Asp Gln Ser His Leu Ala Thr Ala Gly Ser Thr Ser Ser 565
570 575 His Ser Leu Gln Lys Tyr Tyr Ile Thr Gly Glu Ala Glu Gly Phe
Pro 580 585 590 Ala Thr Val 595
37 595 PRT Homo sapiens
37 Met Thr Met
Thr Leu His Thr Lys Ala Ser Gly Met Ala Leu Leu His 1 5 10 15 Gln
Ile Gln Gly Asn Glu Leu Glu Pro Leu Asn Arg Pro Gln Leu Lys 20 25
30 Ile Pro Leu Glu Arg Pro Leu Gly Glu Val Tyr Leu Asp Ser Ser Lys
35 40 45 Pro Ala Val Tyr Asn Tyr Pro Glu Gly Ala Ala Tyr Glu Phe
Asn Ala 50 55 60 Ala Ala Ala Ala Asn Ala Gln Val Tyr Gly Gln Thr
Gly Leu Pro Tyr 65 70 75 80 Gly Pro Gly Ser Glu Ala Ala Ala Phe Gly
Ser Asn Gly Leu Gly Gly 85 90 95 Phe Pro Pro Leu Asn Ser Val Ser
Pro Ser Pro Leu Met Leu Leu His 100 105 110 Pro Pro Pro Gln Leu Ser
Pro Phe Leu Gln Pro His Gly Gln Gln Val 115 120 125 Pro Tyr Tyr Leu
Glu Asn Glu Pro Ser Gly Tyr Thr Val Arg Glu Ala 130 135 140 Gly Pro
Pro Ala Phe Tyr Arg Pro Asn Ser Asp Asn Arg Arg Gln Gly 145 150 155
160 Gly Arg Glu Arg Leu Ala Ser Thr Asn Asp Lys Gly Ser Met Ala Met
165 170 175 Glu Ser Ala Lys Glu Thr Arg Tyr Cys Ala Val Cys Asn Asp
Tyr Ala 180 185 190 Ser Gly Tyr His Tyr Gly Val Trp Ser Cys Glu Gly
Cys Lys Ala Phe 195 200 205 Phe Lys Arg Ser Ile Gln Gly His Asn Asp
Tyr Met Cys Pro Ala Thr 210 215 220 Asn Gln Cys Thr Ile Asp Lys Asn
Arg Arg Lys Ser Cys Gln Ala Cys 225 230 235 240 Arg Leu Arg Lys Cys
Tyr Glu Val Gly Met Met Lys Gly Gly Ile Arg 245 250 255 Lys Asp Arg
Arg Gly Gly Arg Met Leu Lys His Lys Arg Gln Arg Asp 260 265 270 Asp
Gly Glu Gly Arg Gly Glu Val Gly Ser Ala Gly Asp Met Arg Ala 275 280
285 Ala Asn Leu Trp Pro Ser Pro Leu Met Ile Lys Arg Ser Lys Lys Asn
290 295 300 Ser Leu Ala Leu Ser Leu Thr Ala Asp Gln Met Val Ser Ala
Leu Leu 305 310 315 320 Asp Ala Glu Pro Pro Ile Leu Tyr Ser Glu Tyr
Asp Pro Thr Arg Pro 325 330 335 Phe Ser Glu Ala Ser Met Met Gly Leu
Leu Thr Asn Leu Ala Asp Arg 340 345 350 Glu Leu Val His Met Ile Asn
Trp Ala Lys Arg Val Pro Gly Phe Val 355 360 365 Asp Leu Thr Leu His
Asp Gln Val His Leu Leu Glu Cys Ala Trp Leu 370 375 380 Glu Ile Leu
Met Ile Gly Leu Val Trp Arg Ser Met Glu His Pro Gly 385 390 395 400
Lys Leu Leu Phe Ala Pro Asn Leu Leu Leu Asp Arg Asn Gln Gly Lys 405
410 415 Cys Val Glu Gly Met Val Glu Ile Phe Asp Met Leu Leu Ala Thr
Ser 420 425 430 Ser Arg Phe Arg Met Met Asn Leu Gln Gly Glu Glu Phe
Val Cys Leu 435 440 445 Lys Ser Ile Ile Leu Leu Asn Ser Gly Val Tyr
Thr Phe Leu Ser Ser 450 455 460 Thr Leu Lys Ser Leu Glu Glu Lys Asp
His Ile His Arg Val Leu Asp 465 470 475 480 Lys Ile Thr Asp Thr Leu
Ile His Leu Met Ala Lys Ala Gly Leu Thr 485 490 495 Leu Gln Gln Gln
His Gln Arg Leu Ala Gln Leu Leu Leu Ile Leu Ser 500 505 510 His Ile
Arg His Met Ser Asn Lys Gly Met Glu His Leu Tyr Ser Met 515 520 525
Lys Cys Lys Asn Val Val Pro Leu Tyr Asp Leu Leu Leu Glu Met Leu 530
535 540 Asp Ala His Arg Leu His Ala Pro Thr Ser Arg Gly Gly Ala Ser
Val 545 550 555 560 Glu Glu Thr Asp Gln Ser His Leu Ala Thr Ala Gly
Ser Thr Ser Ser 565 570 575 His Ser Leu Gln Lys Tyr Tyr Ile Thr Gly
Glu Ala Glu Gly Phe Pro 580 585 590 Ala Thr Val 595
38 595 PRT Homo sapiens
38 Met Thr Met Thr Leu His Thr Lys Ala Ser Gly Met Ala Leu
Leu His 1 5 10 15 Gln Ile Gln Gly Asn Glu Leu Glu Pro Leu Asn Arg
Pro Gln Leu Lys 20 25 30 Ile Pro Leu Glu Arg Pro Leu Gly Glu Val
Tyr Leu Asp Ser Ser Lys 35 40 45 Pro Ala Val Tyr Asn Tyr Pro Glu
Gly Ala Ala Tyr Glu Phe Asn Ala 50 55 60 Ala Ala Ala Ala Asn Ala
Gln Val Tyr Gly Gln Thr Gly Leu Pro Tyr 65 70 75 80 Gly Pro Gly Ser
Glu Ala Ala Ala Phe Gly Ser Asn Gly Leu Gly Gly 85 90 95 Phe Pro
Pro Leu Asn Ser Val Ser Pro Ser Pro Leu Met Leu Leu His 100 105 110
Pro Pro Pro Gln Leu Ser Pro Phe Leu Gln Pro His Gly Gln Gln Val 115
120 125 Pro Tyr Tyr Leu Glu Asn Glu Pro Ser Gly Tyr Thr Val Arg Glu
Ala 130 135 140 Gly Pro Pro Ala Phe Tyr Arg Pro Asn Ser Asp Asn Arg
Arg Gln Gly 145 150 155 160 Gly Arg Glu Arg Leu Ala Ser Thr Asn Asp
Lys Gly Ser Met Ala Met 165 170 175 Glu Ser Ala Lys Glu Thr Arg Tyr
Cys Ala Val Cys Asn Asp Tyr Ala 180 185 190 Ser Gly Tyr His Tyr Gly
Val Trp Ser Cys Glu Gly Cys Lys Ala Phe 195 200 205 Phe Lys Arg Ser
Ile Gln Gly His Asn Asp Tyr Met Cys Pro Ala Thr 210 215 220 Asn Gln
Cys Thr Ile Asp Lys Asn Arg Arg Lys Ser Cys Gln Ala Cys 225 230 235
240 Arg Leu Arg Lys Cys Tyr Glu Val Gly Met Met Lys Gly Gly Ile Arg
245 250 255 Lys Asp Arg Arg Gly Gly Arg Met Leu Lys His Lys Arg Gln
Arg Asp 260 265 270 Asp Gly Glu Gly Arg Gly Glu Val Gly Ser Ala Gly
Asp Met Arg Ala 275 280 285 Ala Asn Leu Trp Pro Ser Pro Leu Met Ile
Lys Arg Ser Lys Lys Asn 290 295 300 Ser Leu Ala Leu Ser Leu Thr Ala
Asp Gln Met Val Ser Ala Leu Leu 305 310 315 320 Asp Ala Glu Pro Pro
Ile Leu Tyr Ser Glu Tyr Asp Pro Thr Arg Pro 325 330 335 Phe Ser Glu
Ala Ser Met Met Gly Leu Leu Thr Asn Leu Ala Asp Arg 340 345 350 Glu
Leu Val His Met Ile Asn Trp Ala Lys Arg Val Pro Gly Phe Val 355 360
365 Asp Leu Thr Leu His Asp Gln Val His Leu Leu Glu Cys Ala Trp Leu
370 375 380 Glu Ile Leu Met Ile Gly Leu Val Trp Arg Ser Met Glu His
Pro Gly 385 390 395 400 Lys Leu Leu Phe Ala Pro Asn Leu Leu Leu Asp
Arg Asn Gln Gly Lys 405 410 415 Cys Val Glu Gly Met Val Glu Ile Phe
Asp Met Leu Leu Ala Thr Ser 420 425 430 Ser Arg Phe Arg Met Met Asn
Leu Gln Gly Glu Glu Phe Val Cys Leu 435 440 445 Lys Ser Ile Ile Leu
Leu Asn Ser Gly Val Tyr Thr Phe Leu Ser Ser 450 455 460 Thr Leu Lys
Ser Leu Glu Glu Lys Asp His Ile His Arg Val Leu Asp 465 470 475 480
Lys Ile Thr Asp Thr Leu Ile His Leu Met Ala Lys Ala Gly Leu Thr 485
490 495 Leu Gln Gln Gln His Gln Arg Leu Ala Gln Leu Leu Leu Ile Leu
Ser 500 505 510 His Ile Arg His Met Ser Asn Lys Gly Met Glu His Leu
Tyr Ser Met 515 520 525 Lys Cys Lys Asn Val Val Pro Leu Tyr Asp Leu
Leu Leu Glu Met Leu 530 535 540 Asp Ala His Arg Leu His Ala Pro Thr
Ser Arg Gly Gly Ala Ser Val 545 550 555 560 Glu Glu Thr Asp Gln Ser
His Leu Ala Thr Ala Gly Ser Thr Ser Ser 565 570 575 His Ser Leu Gln
Lys Tyr Tyr Ile Thr Gly Glu Ala Glu Gly
Phe Pro 580 585 590 Ala Thr Val 595
39 7224 DNA Homo sapiens
39 gtaccttgat ttcgtattct gagaggctgc tgcttagcgg tagccccttg gtttccgtgg
60caacggaaaa gcgcgggaat tacagataaa ttaaaactgc gactgcgcgg cgtgagctcg
120ctgagacttc ctggacgggg gacaggctgt ggggtttctc agataactgg
gcccctgcgc 180tcaggaggcc ttcaccctct gctctgggta aagttcattg
gaacagaaag aaatggattt 240atctgctctt cgcgttgaag aagtacaaaa
tgtcattaat gctatgcaga aaatcttaga 300gtgtcccatc tgtctggagt
tgatcaagga acctgtctcc acaaagtgtg accacatatt 360ttgcaaattt
tgcatgctga aacttctcaa ccagaagaaa gggccttcac agtgtccttt
420atgtaagaat gatataacca aaaggagcct acaagaaagt acgagattta
gtcaacttgt 480tgaagagcta ttgaaaatca tttgtgcttt tcagcttgac
acaggtttgg agtatgcaaa 540cagctataat tttgcaaaaa aggaaaataa
ctctcctgaa catctaaaag atgaagtttc 600tatcatccaa agtatgggct
acagaaaccg tgccaaaaga cttctacaga gtgaacccga 660aaatccttcc
ttgcaggaaa ccagtctcag tgtccaactc tctaaccttg gaactgtgag
720aactctgagg acaaagcagc ggatacaacc tcaaaagacg tctgtctaca
ttgaattggg 780atctgattct tctgaagata ccgttaataa ggcaacttat
tgcagtgtgg gagatcaaga 840attgttacaa atcacccctc aaggaaccag
ggatgaaatc agtttggatt ctgcaaaaaa 900ggctgcttgt gaattttctg
agacggatgt aacaaatact gaacatcatc aacccagtaa 960taatgatttg
aacaccactg agaagcgtgc agctgagagg catccagaaa agtatcaggg
1020tagttctgtt tcaaacttgc atgtggagcc atgtggcaca aatactcatg
ccagctcatt 1080acagcatgag aacagcagtt tattactcac taaagacaga
atgaatgtag aaaaggctga 1140attctgtaat aaaagcaaac agcctggctt
agcaaggagc caacataaca gatgggctgg 1200aagtaaggaa acatgtaatg
ataggcggac tcccagcaca gaaaaaaagg tagatctgaa 1260tgctgatccc
ctgtgtgaga gaaaagaatg gaataagcag aaactgccat gctcagagaa
1320tcctagagat actgaagatg ttccttggat aacactaaat agcagcattc
agaaagttaa 1380tgagtggttt tccagaagtg atgaactgtt aggttctgat
gactcacatg atggggagtc 1440tgaatcaaat gccaaagtag ctgatgtatt
ggacgttcta aatgaggtag atgaatattc 1500tggttcttca gagaaaatag
acttactggc cagtgatcct catgaggctt taatatgtaa 1560aagtgaaaga
gttcactcca aatcagtaga gagtaatatt gaagacaaaa tatttgggaa
1620aacctatcgg aagaaggcaa gcctccccaa cttaagccat gtaactgaaa
atctaattat 1680aggagcattt gttactgagc cacagataat acaagagcgt
cccctcacaa ataaattaaa 1740gcgtaaaagg agacctacat caggccttca
tcctgaggat tttatcaaga aagcagattt 1800ggcagttcaa aagactcctg
aaatgataaa tcagggaact aaccaaacgg agcagaatgg 1860tcaagtgatg
aatattacta atagtggtca tgagaataaa acaaaaggtg attctattca
1920gaatgagaaa aatcctaacc caatagaatc actcgaaaaa gaatctgctt
tcaaaacgaa 1980agctgaacct ataagcagca gtataagcaa tatggaactc
gaattaaata tccacaattc 2040aaaagcacct aaaaagaata ggctgaggag
gaagtcttct accaggcata ttcatgcgct 2100tgaactagta gtcagtagaa
atctaagccc acctaattgt actgaattgc aaattgatag 2160ttgttctagc
agtgaagaga taaagaaaaa aaagtacaac caaatgccag tcaggcacag
2220cagaaaccta caactcatgg aaggtaaaga acctgcaact ggagccaaga
agagtaacaa 2280gccaaatgaa cagacaagta aaagacatga cagcgatact
ttcccagagc tgaagttaac 2340aaatgcacct ggttctttta ctaagtgttc
aaataccagt gaacttaaag aatttgtcaa 2400tcctagcctt ccaagagaag
aaaaagaaga gaaactagaa acagttaaag tgtctaataa 2460tgctgaagac
cccaaagatc tcatgttaag tggagaaagg gttttgcaaa ctgaaagatc
2520tgtagagagt agcagtattt cattggtacc tggtactgat tatggcactc
aggaaagtat 2580ctcgttactg gaagttagca ctctagggaa ggcaaaaaca
gaaccaaata aatgtgtgag 2640tcagtgtgca gcatttgaaa accccaaggg
actaattcat ggttgttcca aagataatag 2700aaatgacaca gaaggcttta
agtatccatt gggacatgaa gttaaccaca gtcgggaaac 2760aagcatagaa
atggaagaaa gtgaacttga tgctcagtat ttgcagaata cattcaaggt
2820ttcaaagcgc cagtcatttg ctccgttttc aaatccagga aatgcagaag
aggaatgtgc 2880aacattctct gcccactctg ggtccttaaa gaaacaaagt
ccaaaagtca cttttgaatg 2940tgaacaaaag gaagaaaatc aaggaaagaa
tgagtctaat atcaagcctg tacagacagt 3000taatatcact gcaggctttc
ctgtggttgg tcagaaagat aagccagttg ataatgccaa 3060atgtagtatc
aaaggaggct ctaggttttg tctatcatct cagttcagag gcaacgaaac
3120tggactcatt actccaaata aacatggact tttacaaaac ccatatcgta
taccaccact 3180ttttcccatc aagtcatttg ttaaaactaa atgtaagaaa
aatctgctag aggaaaactt 3240tgaggaacat tcaatgtcac ctgaaagaga
aatgggaaat gagaacattc caagtacagt 3300gagcacaatt agccgtaata
acattagaga aaatgttttt aaagaagcca gctcaagcaa 3360tattaatgaa
gtaggttcca gtactaatga agtgggctcc agtattaatg aaataggttc
3420cagtgatgaa aacattcaag cagaactagg tagaaacaga gggccaaaat
tgaatgctat 3480gcttagatta ggggttttgc aacctgaggt ctataaacaa
agtcttcctg gaagtaattg 3540taagcatcct gaaataaaaa agcaagaata
tgaagaagta gttcagactg ttaatacaga 3600tttctctcca tatctgattt
cagataactt agaacagcct atgggaagta gtcatgcatc 3660tcaggtttgt
tctgagacac ctgatgacct gttagatgat ggtgaaataa aggaagatac
3720tagttttgct gaaaatgaca ttaaggaaag ttctgctgtt tttagcaaaa
gcgtccagaa 3780aggagagctt agcaggagtc ctagcccttt cacccataca
catttggctc agggttaccg 3840aagaggggcc aagaaattag agtcctcaga
agagaactta tctagtgagg atgaagagct 3900tccctgcttc caacacttgt
tatttggtaa agtaaacaat ataccttctc agtctactag 3960gcatagcacc
gttgctaccg agtgtctgtc taagaacaca gaggagaatt tattatcatt
4020gaagaatagc ttaaatgact gcagtaacca ggtaatattg gcaaaggcat
ctcaggaaca 4080tcaccttagt gaggaaacaa aatgttctgc tagcttgttt
tcttcacagt gcagtgaatt 4140ggaagacttg actgcaaata caaacaccca
ggatcctttc ttgattggtt cttccaaaca 4200aatgaggcat cagtctgaaa
gccagggagt tggtctgagt gacaaggaat tggtttcaga 4260tgatgaagaa
agaggaacgg gcttggaaga aaataatcaa gaagagcaaa gcatggattc
4320aaacttaggt gaagcagcat ctgggtgtga gagtgaaaca agcgtctctg
aagactgctc 4380agggctatcc tctcagagtg acattttaac cactcagcag
agggatacca tgcaacataa 4440cctgataaag ctccagcagg aaatggctga
actagaagct gtgttagaac agcatgggag 4500ccagccttct aacagctacc
cttccatcat aagtgactct tctgcccttg aggacctgcg 4560aaatccagaa
caaagcacat cagaaaaagc agtattaact tcacagaaaa gtagtgaata
4620ccctataagc cagaatccag aaggcctttc tgctgacaag tttgaggtgt
ctgcagatag 4680ttctaccagt aaaaataaag aaccaggagt ggaaaggtca
tccccttcta aatgcccatc 4740attagatgat aggtggtaca tgcacagttg
ctctgggagt cttcagaata gaaactaccc 4800atctcaagag gagctcatta
aggttgttga tgtggaggag caacagctgg aagagtctgg 4860gccacacgat
ttgacggaaa catcttactt gccaaggcaa gatctagagg gaacccctta
4920cctggaatct ggaatcagcc tcttctctga tgaccctgaa tctgatcctt
ctgaagacag 4980agccccagag tcagctcgtg ttggcaacat accatcttca
acctctgcat tgaaagttcc 5040ccaattgaaa gttgcagaat ctgcccagag
tccagctgct gctcatacta ctgatactgc 5100tgggtataat gcaatggaag
aaagtgtgag cagggagaag ccagaattga cagcttcaac 5160agaaagggtc
aacaaaagaa tgtccatggt ggtgtctggc ctgaccccag aagaatttat
5220gctcgtgtac aagtttgcca gaaaacacca catcacttta actaatctaa
ttactgaaga 5280gactactcat gttgttatga aaacagatgc tgagtttgtg
tgtgaacgga cactgaaata 5340ttttctagga attgcgggag gaaaatgggt
agttagctat ttctgggtga cccagtctat 5400taaagaaaga aaaatgctga
atgagcatga ttttgaagtc agaggagatg tggtcaatgg 5460aagaaaccac
caaggtccaa agcgagcaag agaatcccag gacagaaaga tcttcagggg
5520gctagaaatc tgttgctatg ggcccttcac caacatgccc acagatcaac
tggaatggat 5580ggtacagctg tgtggtgctt ctgtggtgaa ggagctttca
tcattcaccc ttggcacagg 5640tgtccaccca attgtggttg tgcagccaga
tgcctggaca gaggacaatg gcttccatgc 5700aattgggcag atgtgtgagg
cacctgtggt gacccgagag tgggtgttgg acagtgtagc 5760actctaccag
tgccaggagc tggacaccta cctgataccc cagatccccc acagccacta
5820ctgactgcag ccagccacag gtacagagcc acaggacccc aagaatgagc
ttacaaagtg 5880gcctttccag gccctgggag ctcctctcac tcttcagtcc
ttctactgtc ctggctacta 5940aatattttat gtacatcagc ctgaaaagga
cttctggcta tgcaagggtc ccttaaagat 6000tttctgcttg aagtctccct
tggaaatctg ccatgagcac aaaattatgg taatttttca 6060cctgagaaga
ttttaaaacc atttaaacgc caccaattga gcaagatgct gattcattat
6120ttatcagccc tattctttct attcaggctg ttgttggctt agggctggaa
gcacagagtg 6180gcttggcctc aagagaatag ctggtttccc taagtttact
tctctaaaac cctgtgttca 6240caaaggcaga gagtcagacc cttcaatgga
aggagagtgc ttgggatcga ttatgtgact 6300taaagtcaga atagtccttg
ggcagttctc aaatgttgga gtggaacatt ggggaggaaa 6360ttctgaggca
ggtattagaa atgaaaagga aacttgaaac ctgggcatgg tggctcacgc
6420ctgtaatccc agcactttgg gaggccaagg tgggcagatc actggaggtc
aggagttcga 6480aaccagcctg gccaacatgg tgaaacccca tctctactaa
aaatacagaa attagccggt 6540catggtggtg gacacctgta atcccagcta
ctcaggtggc taaggcagga gaatcacttc 6600agcccgggag gtggaggttg
cagtgagcca agatcatacc acggcactcc agcctgggtg 6660acagtgagac
tgtggctcaa aaaaaaaaaa aaaaaaagga aaatgaaact agaagagatt
6720tctaaaagtc tgagatatat ttgctagatt tctaaagaat gtgttctaaa
acagcagaag 6780attttcaaga accggtttcc aaagacagtc ttctaattcc
tcattagtaa taagtaaaat 6840gtttattgtt gtagctctgg tatataatcc
attcctctta aaatataaga cctctggcat 6900gaatatttca tatctataaa
atgacagatc ccaccaggaa ggaagctgtt gctttctttg 6960aggtgatttt
tttcctttgc tccctgttgc tgaaaccata cagcttcata aataattttg
7020cttgctgaag gaagaaaaag tgtttttcat aaacccatta tccaggactg
tttatagctg 7080ttggaaggac taggtcttcc ctagcccccc cagtgtgcaa
gggcagtgaa gacttgattg 7140tacaaaatac gttttgtaaa tgttgtgctg
ttaacactgc aaataaactt ggtagcaaac 7200acttccaaaa aaaaaaaaaa aaaa
7224 40 7287 DNA Homo sapiens 40gtaccttgat ttcgtattct gagaggctgc
tgcttagcgg tagccccttg gtttccgtgg 60caacggaaaa gcgcgggaat tacagataaa
ttaaaactgc gactgcgcgg cgtgagctcg 120ctgagacttc ctggacgggg
gacaggctgt ggggtttctc agataactgg gcccctgcgc 180tcaggaggcc
ttcaccctct gctctgggta aagttcattg gaacagaaag aaatggattt
240atctgctctt cgcgttgaag aagtacaaaa tgtcattaat gctatgcaga
aaatcttaga 300gtgtcccatc tgtctggagt tgatcaagga acctgtctcc
acaaagtgtg accacatatt 360ttgcaaattt tgcatgctga aacttctcaa
ccagaagaaa gggccttcac agtgtccttt 420atgtaagaat gatataacca
aaaggagcct acaagaaagt acgagattta gtcaacttgt 480tgaagagcta
ttgaaaatca tttgtgcttt tcagcttgac acaggtttgg agtatgcaaa
540cagctataat tttgcaaaaa aggaaaataa ctctcctgaa catctaaaag
atgaagtttc 600tatcatccaa agtatgggct acagaaaccg tgccaaaaga
cttctacaga gtgaacccga 660aaatccttcc ttgcaggaaa ccagtctcag
tgtccaactc tctaaccttg gaactgtgag 720aactctgagg acaaagcagc
ggatacaacc tcaaaagacg tctgtctaca ttgaattggg 780atctgattct
tctgaagata ccgttaataa ggcaacttat tgcagtgtgg gagatcaaga
840attgttacaa atcacccctc aaggaaccag ggatgaaatc agtttggatt
ctgcaaaaaa 900ggctgcttgt gaattttctg agacggatgt aacaaatact
gaacatcatc aacccagtaa 960taatgatttg aacaccactg agaagcgtgc
agctgagagg catccagaaa agtatcaggg 1020tagttctgtt tcaaacttgc
atgtggagcc atgtggcaca aatactcatg ccagctcatt 1080acagcatgag
aacagcagtt tattactcac taaagacaga atgaatgtag aaaaggctga
1140attctgtaat aaaagcaaac agcctggctt agcaaggagc caacataaca
gatgggctgg 1200aagtaaggaa acatgtaatg ataggcggac tcccagcaca
gaaaaaaagg tagatctgaa 1260tgctgatccc ctgtgtgaga gaaaagaatg
gaataagcag aaactgccat gctcagagaa 1320tcctagagat actgaagatg
ttccttggat aacactaaat agcagcattc agaaagttaa 1380tgagtggttt
tccagaagtg atgaactgtt aggttctgat gactcacatg atggggagtc
1440tgaatcaaat gccaaagtag ctgatgtatt ggacgttcta aatgaggtag
atgaatattc 1500tggttcttca gagaaaatag acttactggc cagtgatcct
catgaggctt taatatgtaa 1560aagtgaaaga gttcactcca aatcagtaga
gagtaatatt gaagacaaaa tatttgggaa 1620aacctatcgg aagaaggcaa
gcctccccaa cttaagccat gtaactgaaa atctaattat 1680aggagcattt
gttactgagc cacagataat acaagagcgt cccctcacaa ataaattaaa
1740gcgtaaaagg agacctacat caggccttca tcctgaggat tttatcaaga
aagcagattt 1800ggcagttcaa aagactcctg aaatgataaa tcagggaact
aaccaaacgg agcagaatgg 1860tcaagtgatg aatattacta atagtggtca
tgagaataaa acaaaaggtg attctattca 1920gaatgagaaa aatcctaacc
caatagaatc actcgaaaaa gaatctgctt tcaaaacgaa 1980agctgaacct
ataagcagca gtataagcaa tatggaactc gaattaaata tccacaattc
2040aaaagcacct aaaaagaata ggctgaggag gaagtcttct accaggcata
ttcatgcgct 2100tgaactagta gtcagtagaa atctaagccc acctaattgt
actgaattgc aaattgatag 2160ttgttctagc agtgaagaga taaagaaaaa
aaagtacaac caaatgccag tcaggcacag 2220cagaaaccta caactcatgg
aaggtaaaga acctgcaact ggagccaaga agagtaacaa 2280gccaaatgaa
cagacaagta aaagacatga cagcgatact ttcccagagc tgaagttaac
2340aaatgcacct ggttctttta ctaagtgttc aaataccagt gaacttaaag
aatttgtcaa 2400tcctagcctt ccaagagaag aaaaagaaga gaaactagaa
acagttaaag tgtctaataa 2460tgctgaagac cccaaagatc tcatgttaag
tggagaaagg gttttgcaaa ctgaaagatc 2520tgtagagagt agcagtattt
cattggtacc tggtactgat tatggcactc aggaaagtat 2580ctcgttactg
gaagttagca ctctagggaa ggcaaaaaca gaaccaaata aatgtgtgag
2640tcagtgtgca gcatttgaaa accccaaggg actaattcat ggttgttcca
aagataatag 2700aaatgacaca gaaggcttta agtatccatt gggacatgaa
gttaaccaca gtcgggaaac 2760aagcatagaa atggaagaaa gtgaacttga
tgctcagtat ttgcagaata cattcaaggt 2820ttcaaagcgc cagtcatttg
ctccgttttc aaatccagga aatgcagaag aggaatgtgc 2880aacattctct
gcccactctg ggtccttaaa gaaacaaagt ccaaaagtca cttttgaatg
2940tgaacaaaag gaagaaaatc aaggaaagaa tgagtctaat atcaagcctg
tacagacagt 3000taatatcact gcaggctttc ctgtggttgg tcagaaagat
aagccagttg ataatgccaa 3060atgtagtatc aaaggaggct ctaggttttg
tctatcatct cagttcagag gcaacgaaac 3120tggactcatt actccaaata
aacatggact tttacaaaac ccatatcgta taccaccact 3180ttttcccatc
aagtcatttg ttaaaactaa atgtaagaaa aatctgctag aggaaaactt
3240tgaggaacat tcaatgtcac ctgaaagaga aatgggaaat gagaacattc
caagtacagt 3300gagcacaatt agccgtaata acattagaga aaatgttttt
aaagaagcca gctcaagcaa 3360tattaatgaa gtaggttcca gtactaatga
agtgggctcc agtattaatg aaataggttc 3420cagtgatgaa aacattcaag
cagaactagg tagaaacaga gggccaaaat tgaatgctat 3480gcttagatta
ggggttttgc aacctgaggt ctataaacaa agtcttcctg gaagtaattg
3540taagcatcct gaaataaaaa agcaagaata tgaagaagta gttcagactg
ttaatacaga 3600tttctctcca tatctgattt cagataactt agaacagcct
atgggaagta gtcatgcatc 3660tcaggtttgt tctgagacac ctgatgacct
gttagatgat ggtgaaataa aggaagatac 3720tagttttgct gaaaatgaca
ttaaggaaag ttctgctgtt tttagcaaaa gcgtccagaa 3780aggagagctt
agcaggagtc ctagcccttt cacccataca catttggctc agggttaccg
3840aagaggggcc aagaaattag agtcctcaga agagaactta tctagtgagg
atgaagagct 3900tccctgcttc caacacttgt tatttggtaa agtaaacaat
ataccttctc agtctactag 3960gcatagcacc gttgctaccg agtgtctgtc
taagaacaca gaggagaatt tattatcatt 4020gaagaatagc ttaaatgact
gcagtaacca ggtaatattg gcaaaggcat ctcaggaaca 4080tcaccttagt
gaggaaacaa aatgttctgc tagcttgttt tcttcacagt gcagtgaatt
4140ggaagacttg actgcaaata caaacaccca ggatcctttc ttgattggtt
cttccaaaca 4200aatgaggcat cagtctgaaa gccagggagt tggtctgagt
gacaaggaat tggtttcaga 4260tgatgaagaa agaggaacgg gcttggaaga
aaataatcaa gaagagcaaa gcatggattc 4320aaacttaggt gaagcagcat
ctgggtgtga gagtgaaaca agcgtctctg aagactgctc 4380agggctatcc
tctcagagtg acattttaac cactcagcag agggatacca tgcaacataa
4440cctgataaag ctccagcagg aaatggctga actagaagct gtgttagaac
agcatgggag 4500ccagccttct aacagctacc cttccatcat aagtgactct
tctgcccttg aggacctgcg 4560aaatccagaa caaagcacat cagaaaaaga
ttcgcatata catggccaaa ggaacaactc 4620catgttttct aaaaggccta
gagaacatat atcagtatta acttcacaga aaagtagtga 4680ataccctata
agccagaatc cagaaggcct ttctgctgac aagtttgagg tgtctgcaga
4740tagttctacc agtaaaaata aagaaccagg agtggaaagg tcatcccctt
ctaaatgccc 4800atcattagat gataggtggt acatgcacag ttgctctggg
agtcttcaga atagaaacta 4860cccatctcaa gaggagctca ttaaggttgt
tgatgtggag gagcaacagc tggaagagtc 4920tgggccacac gatttgacgg
aaacatctta cttgccaagg caagatctag agggaacccc 4980ttacctggaa
tctggaatca gcctcttctc tgatgaccct gaatctgatc cttctgaaga
5040cagagcccca gagtcagctc gtgttggcaa cataccatct tcaacctctg
cattgaaagt 5100tccccaattg aaagttgcag aatctgccca gagtccagct
gctgctcata ctactgatac 5160tgctgggtat aatgcaatgg aagaaagtgt
gagcagggag aagccagaat tgacagcttc 5220aacagaaagg gtcaacaaaa
gaatgtccat ggtggtgtct ggcctgaccc cagaagaatt 5280tatgctcgtg
tacaagtttg ccagaaaaca ccacatcact ttaactaatc taattactga
5340agagactact catgttgtta tgaaaacaga tgctgagttt gtgtgtgaac
ggacactgaa 5400atattttcta ggaattgcgg gaggaaaatg ggtagttagc
tatttctggg tgacccagtc 5460tattaaagaa agaaaaatgc tgaatgagca
tgattttgaa gtcagaggag atgtggtcaa 5520tggaagaaac caccaaggtc
caaagcgagc aagagaatcc caggacagaa agatcttcag 5580ggggctagaa
atctgttgct atgggccctt caccaacatg cccacagatc aactggaatg
5640gatggtacag ctgtgtggtg cttctgtggt gaaggagctt tcatcattca
cccttggcac 5700aggtgtccac ccaattgtgg ttgtgcagcc agatgcctgg
acagaggaca atggcttcca 5760tgcaattggg cagatgtgtg aggcacctgt
ggtgacccga gagtgggtgt tggacagtgt 5820agcactctac cagtgccagg
agctggacac ctacctgata ccccagatcc cccacagcca 5880ctactgactg
cagccagcca caggtacaga gccacaggac cccaagaatg agcttacaaa
5940gtggcctttc caggccctgg gagctcctct cactcttcag tccttctact
gtcctggcta 6000ctaaatattt tatgtacatc agcctgaaaa ggacttctgg
ctatgcaagg gtcccttaaa 6060gattttctgc ttgaagtctc ccttggaaat
ctgccatgag cacaaaatta tggtaatttt 6120tcacctgaga agattttaaa
accatttaaa cgccaccaat tgagcaagat gctgattcat 6180tatttatcag
ccctattctt tctattcagg ctgttgttgg cttagggctg gaagcacaga
6240gtggcttggc ctcaagagaa tagctggttt ccctaagttt acttctctaa
aaccctgtgt 6300tcacaaaggc agagagtcag acccttcaat ggaaggagag
tgcttgggat cgattatgtg 6360acttaaagtc agaatagtcc ttgggcagtt
ctcaaatgtt ggagtggaac attggggagg 6420aaattctgag gcaggtatta
gaaatgaaaa ggaaacttga aacctgggca tggtggctca 6480cgcctgtaat
cccagcactt tgggaggcca aggtgggcag atcactggag gtcaggagtt
6540cgaaaccagc ctggccaaca tggtgaaacc ccatctctac taaaaataca
gaaattagcc 6600ggtcatggtg gtggacacct gtaatcccag ctactcaggt
ggctaaggca ggagaatcac 6660ttcagcccgg gaggtggagg ttgcagtgag
ccaagatcat accacggcac tccagcctgg 6720gtgacagtga gactgtggct
caaaaaaaaa aaaaaaaaaa ggaaaatgaa actagaagag 6780atttctaaaa
gtctgagata tatttgctag atttctaaag aatgtgttct aaaacagcag
6840aagattttca agaaccggtt tccaaagaca gtcttctaat tcctcattag
taataagtaa 6900aatgtttatt gttgtagctc tggtatataa tccattcctc
ttaaaatata agacctctgg 6960catgaatatt tcatatctat aaaatgacag
atcccaccag gaaggaagct gttgctttct 7020ttgaggtgat ttttttcctt
tgctccctgt tgctgaaacc atacagcttc ataaataatt 7080ttgcttgctg
aaggaagaaa aagtgttttt cataaaccca ttatccagga ctgtttatag
7140ctgttggaag gactaggtct tccctagccc ccccagtgtg caagggcagt
gaagacttga 7200ttgtacaaaa tacgttttgt aaatgttgtg ctgttaacac
tgcaaataaa cttggtagca 7260aacacttcca aaaaaaaaaa aaaaaaa
7287 41 7132 DNA Homo sapiens 41cttagcggta gccccttggt ttccgtggca
acggaaaagc gcgggaatta cagataaatt 60aaaactgcga ctgcgcggcg tgagctcgct
gagacttcct ggacggggga caggctgtgg 120ggtttctcag ataactgggc
ccctgcgctc aggaggcctt caccctctgc tctggttcat 180tggaacagaa
agaaatggat ttatctgctc ttcgcgttga agaagtacaa aatgtcatta
240atgctatgca gaaaatctta gagtgtccca tctgattttg catgctgaaa
cttctcaacc 300agaagaaagg gccttcacag tgtcctttat gtaagaatga
tataaccaaa aggagcctac 360aagaaagtac gagatttagt caacttgttg
aagagctatt gaaaatcatt tgtgcttttc 420agcttgacac aggtttggag
tatgcaaaca gctataattt tgcaaaaaag gaaaataact 480ctcctgaaca
tctaaaagat gaagtttcta tcatccaaag tatgggctac agaaaccgtg
540ccaaaagact tctacagagt gaacccgaaa atccttcctt gcaggaaacc
agtctcagtg 600tccaactctc taaccttgga actgtgagaa ctctgaggac
aaagcagcgg atacaacctc 660aaaagacgtc tgtctacatt gaattgggat
ctgattcttc tgaagatacc gttaataagg 720caacttattg cagtgtggga
gatcaagaat tgttacaaat cacccctcaa ggaaccaggg 780atgaaatcag
tttggattct gcaaaaaagg ctgcttgtga attttctgag acggatgtaa
840caaatactga acatcatcaa cccagtaata atgatttgaa caccactgag
aagcgtgcag 900ctgagaggca tccagaaaag tatcagggta gttctgtttc
aaacttgcat gtggagccat 960gtggcacaaa tactcatgcc agctcattac
agcatgagaa cagcagttta ttactcacta 1020aagacagaat gaatgtagaa
aaggctgaat tctgtaataa aagcaaacag cctggcttag 1080caaggagcca
acataacaga tgggctggaa gtaaggaaac atgtaatgat aggcggactc
1140ccagcacaga aaaaaaggta gatctgaatg ctgatcccct gtgtgagaga
aaagaatgga 1200ataagcagaa actgccatgc tcagagaatc ctagagatac
tgaagatgtt ccttggataa 1260cactaaatag cagcattcag aaagttaatg
agtggttttc cagaagtgat gaactgttag 1320gttctgatga ctcacatgat
ggggagtctg aatcaaatgc caaagtagct gatgtattgg 1380acgttctaaa
tgaggtagat gaatattctg gttcttcaga gaaaatagac ttactggcca
1440gtgatcctca tgaggcttta atatgtaaaa gtgaaagagt tcactccaaa
tcagtagaga 1500gtaatattga agacaaaata tttgggaaaa cctatcggaa
gaaggcaagc ctccccaact 1560taagccatgt aactgaaaat ctaattatag
gagcatttgt tactgagcca cagataatac 1620aagagcgtcc cctcacaaat
aaattaaagc gtaaaaggag acctacatca ggccttcatc 1680ctgaggattt
tatcaagaaa gcagatttgg cagttcaaaa gactcctgaa atgataaatc
1740agggaactaa ccaaacggag cagaatggtc aagtgatgaa tattactaat
agtggtcatg 1800agaataaaac aaaaggtgat tctattcaga atgagaaaaa
tcctaaccca atagaatcac 1860tcgaaaaaga atctgctttc aaaacgaaag
ctgaacctat aagcagcagt ataagcaata 1920tggaactcga attaaatatc
cacaattcaa aagcacctaa aaagaatagg ctgaggagga 1980agtcttctac
caggcatatt catgcgcttg aactagtagt cagtagaaat ctaagcccac
2040ctaattgtac tgaattgcaa attgatagtt gttctagcag tgaagagata
aagaaaaaaa 2100agtacaacca aatgccagtc aggcacagca gaaacctaca
actcatggaa ggtaaagaac 2160ctgcaactgg agccaagaag agtaacaagc
caaatgaaca gacaagtaaa agacatgaca 2220gcgatacttt cccagagctg
aagttaacaa atgcacctgg ttcttttact aagtgttcaa 2280ataccagtga
acttaaagaa tttgtcaatc ctagccttcc aagagaagaa aaagaagaga
2340aactagaaac agttaaagtg tctaataatg ctgaagaccc caaagatctc
atgttaagtg 2400gagaaagggt tttgcaaact gaaagatctg tagagagtag
cagtatttca ttggtacctg 2460gtactgatta tggcactcag gaaagtatct
cgttactgga agttagcact ctagggaagg 2520caaaaacaga accaaataaa
tgtgtgagtc agtgtgcagc atttgaaaac cccaagggac 2580taattcatgg
ttgttccaaa gataatagaa atgacacaga aggctttaag tatccattgg
2640gacatgaagt taaccacagt cgggaaacaa gcatagaaat ggaagaaagt
gaacttgatg 2700ctcagtattt gcagaataca ttcaaggttt caaagcgcca
gtcatttgct ccgttttcaa 2760atccaggaaa tgcagaagag gaatgtgcaa
cattctctgc ccactctggg tccttaaaga 2820aacaaagtcc aaaagtcact
tttgaatgtg aacaaaagga agaaaatcaa ggaaagaatg 2880agtctaatat
caagcctgta cagacagtta atatcactgc aggctttcct gtggttggtc
2940agaaagataa gccagttgat aatgccaaat gtagtatcaa aggaggctct
aggttttgtc 3000tatcatctca gttcagaggc aacgaaactg gactcattac
tccaaataaa catggacttt 3060tacaaaaccc atatcgtata ccaccacttt
ttcccatcaa gtcatttgtt aaaactaaat 3120gtaagaaaaa tctgctagag
gaaaactttg aggaacattc aatgtcacct gaaagagaaa 3180tgggaaatga
gaacattcca agtacagtga gcacaattag ccgtaataac attagagaaa
3240atgtttttaa agaagccagc tcaagcaata ttaatgaagt aggttccagt
actaatgaag 3300tgggctccag tattaatgaa ataggttcca gtgatgaaaa
cattcaagca gaactaggta 3360gaaacagagg gccaaaattg aatgctatgc
ttagattagg ggttttgcaa cctgaggtct 3420ataaacaaag tcttcctgga
agtaattgta agcatcctga aataaaaaag caagaatatg 3480aagaagtagt
tcagactgtt aatacagatt tctctccata tctgatttca gataacttag
3540aacagcctat gggaagtagt catgcatctc aggtttgttc tgagacacct
gatgacctgt 3600tagatgatgg tgaaataaag gaagatacta gttttgctga
aaatgacatt aaggaaagtt 3660ctgctgtttt tagcaaaagc gtccagaaag
gagagcttag caggagtcct agccctttca 3720cccatacaca tttggctcag
ggttaccgaa gaggggccaa gaaattagag tcctcagaag 3780agaacttatc
tagtgaggat gaagagcttc cctgcttcca acacttgtta tttggtaaag
3840taaacaatat accttctcag tctactaggc atagcaccgt tgctaccgag
tgtctgtcta 3900agaacacaga ggagaattta ttatcattga agaatagctt
aaatgactgc agtaaccagg 3960taatattggc aaaggcatct caggaacatc
accttagtga ggaaacaaaa tgttctgcta 4020gcttgttttc ttcacagtgc
agtgaattgg aagacttgac tgcaaataca aacacccagg 4080atcctttctt
gattggttct tccaaacaaa tgaggcatca gtctgaaagc cagggagttg
4140gtctgagtga caaggaattg gtttcagatg atgaagaaag aggaacgggc
ttggaagaaa 4200ataatcaaga agagcaaagc atggattcaa acttaggtga
agcagcatct gggtgtgaga 4260gtgaaacaag cgtctctgaa gactgctcag
ggctatcctc tcagagtgac attttaacca 4320ctcagcagag ggataccatg
caacataacc tgataaagct ccagcaggaa atggctgaac 4380tagaagctgt
gttagaacag catgggagcc agccttctaa cagctaccct tccatcataa
4440gtgactcttc tgcccttgag gacctgcgaa atccagaaca aagcacatca
gaaaaagcag 4500tattaacttc acagaaaagt agtgaatacc ctataagcca
gaatccagaa ggcctttctg 4560ctgacaagtt tgaggtgtct gcagatagtt
ctaccagtaa aaataaagaa ccaggagtgg 4620aaaggtcatc cccttctaaa
tgcccatcat tagatgatag gtggtacatg cacagttgct 4680ctgggagtct
tcagaataga aactacccat ctcaagagga gctcattaag gttgttgatg
4740tggaggagca acagctggaa gagtctgggc cacacgattt gacggaaaca
tcttacttgc 4800caaggcaaga tctagaggga accccttacc tggaatctgg
aatcagcctc ttctctgatg 4860accctgaatc tgatccttct gaagacagag
ccccagagtc agctcgtgtt ggcaacatac 4920catcttcaac ctctgcattg
aaagttcccc aattgaaagt tgcagaatct gcccagagtc 4980cagctgctgc
tcatactact gatactgctg ggtataatgc aatggaagaa agtgtgagca
5040gggagaagcc agaattgaca gcttcaacag aaagggtcaa caaaagaatg
tccatggtgg 5100tgtctggcct gaccccagaa gaatttatgc tcgtgtacaa
gtttgccaga aaacaccaca 5160tcactttaac taatctaatt actgaagaga
ctactcatgt tgttatgaaa acagatgctg 5220agtttgtgtg tgaacggaca
ctgaaatatt ttctaggaat tgcgggagga aaatgggtag 5280ttagctattt
ctgggtgacc cagtctatta aagaaagaaa aatgctgaat gagcatgatt
5340ttgaagtcag aggagatgtg gtcaatggaa gaaaccacca aggtccaaag
cgagcaagag 5400aatcccagga cagaaagatc ttcagggggc tagaaatctg
ttgctatggg cccttcacca 5460acatgcccac agatcaactg gaatggatgg
tacagctgtg tggtgcttct gtggtgaagg 5520agctttcatc attcaccctt
ggcacaggtg tccacccaat tgtggttgtg cagccagatg 5580cctggacaga
ggacaatggc ttccatgcaa ttgggcagat gtgtgaggca cctgtggtga
5640cccgagagtg ggtgttggac agtgtagcac tctaccagtg ccaggagctg
gacacctacc 5700tgatacccca gatcccccac agccactact gactgcagcc
agccacaggt acagagccac 5760aggaccccaa gaatgagctt acaaagtggc
ctttccaggc cctgggagct cctctcactc 5820ttcagtcctt ctactgtcct
ggctactaaa tattttatgt acatcagcct gaaaaggact 5880tctggctatg
caagggtccc ttaaagattt tctgcttgaa gtctcccttg gaaatctgcc
5940atgagcacaa aattatggta atttttcacc tgagaagatt ttaaaaccat
ttaaacgcca 6000ccaattgagc aagatgctga ttcattattt atcagcccta
ttctttctat tcaggctgtt 6060gttggcttag ggctggaagc acagagtggc
ttggcctcaa gagaatagct ggtttcccta 6120agtttacttc tctaaaaccc
tgtgttcaca aaggcagaga gtcagaccct tcaatggaag 6180gagagtgctt
gggatcgatt atgtgactta aagtcagaat agtccttggg cagttctcaa
6240atgttggagt ggaacattgg ggaggaaatt ctgaggcagg tattagaaat
gaaaaggaaa 6300cttgaaacct gggcatggtg gctcacgcct gtaatcccag
cactttggga ggccaaggtg 6360ggcagatcac tggaggtcag gagttcgaaa
ccagcctggc caacatggtg aaaccccatc 6420tctactaaaa atacagaaat
tagccggtca tggtggtgga cacctgtaat cccagctact 6480caggtggcta
aggcaggaga atcacttcag cccgggaggt ggaggttgca gtgagccaag
6540atcataccac ggcactccag cctgggtgac agtgagactg tggctcaaaa
aaaaaaaaaa 6600aaaaaggaaa atgaaactag aagagatttc taaaagtctg
agatatattt gctagatttc 6660taaagaatgt gttctaaaac agcagaagat
tttcaagaac cggtttccaa agacagtctt 6720ctaattcctc attagtaata
agtaaaatgt ttattgttgt agctctggta tataatccat 6780tcctcttaaa
atataagacc tctggcatga atatttcata tctataaaat gacagatccc
6840accaggaagg aagctgttgc tttctttgag gtgatttttt tcctttgctc
cctgttgctg 6900aaaccataca gcttcataaa taattttgct tgctgaagga
agaaaaagtg tttttcataa 6960acccattatc caggactgtt tatagctgtt
ggaaggacta ggtcttccct agccccccca 7020gtgtgcaagg gcagtgaaga
cttgattgta caaaatacgt tttgtaaatg ttgtgctgtt 7080aacactgcaa
ataaacttgg tagcaaacac ttccaaaaaa aaaaaaaaaa aa 7132 42 3699 DNA Homo
sapiens 42ttcattggaa cagaaagaaa tggatttatc tgctcttcgc gttgaagaag
tacaaaatgt 60cattaatgct atgcagaaaa tcttagagtg tcccatctgt ctggagttga
tcaaggaacc 120tgtctccaca aagtgtgacc acatattttg caaattttgc
atgctgaaac ttctcaacca 180gaagaaaggg ccttcacagt gtcctttatg
taagaatgat ataaccaaaa ggagcctaca 240agaaagtacg agatttagtc
aacttgttga agagctattg aaaatcattt gtgcttttca 300gcttgacaca
ggtttggagt atgcaaacag ctataatttt gcaaaaaagg aaaataactc
360tcctgaacat ctaaaagatg aagtttctat catccaaagt atgggctaca
gaaaccgtgc 420caaaagactt ctacagagtg aacccgaaaa tccttccttg
caggaaacca gtctcagtgt 480ccaactctct aaccttggaa ctgtgagaac
tctgaggaca aagcagcgga tacaacctca 540aaagacgtct gtctacattg
aattgggatc tgattcttct gaagataccg ttaataaggc 600aacttattgc
agtgtgggag atcaagaatt gttacaaatc acccctcaag gaaccaggga
660tgaaatcagt ttggattctg caaaaaaggc tgcttgtgaa ttttctgaga
cggatgtaac 720aaatactgaa catcatcaac ccagtaataa tgatttgaac
accactgaga agcgtgcagc 780tgagaggcat ccagaaaagt atcagggtga
agcagcatct gggtgtgaga gtgaaacaag 840cgtctctgaa gactgctcag
ggctatcctc tcagagtgac attttaacca ctcagcagag 900ggataccatg
caacataacc tgataaagct ccagcaggaa atggctgaac tagaagctgt
960gttagaacag catgggagcc agccttctaa cagctaccct tccatcataa
gtgactcttc 1020tgcccttgag gacctgcgaa atccagaaca aagcacatca
gaaaaagtat taacttcaca 1080gaaaagtagt gaatacccta taagccagaa
tccagaaggc ctttctgctg acaagtttga 1140ggtgtctgca gatagttcta
ccagtaaaaa taaagaacca ggagtggaaa ggtcatcccc 1200ttctaaatgc
ccatcattag atgataggtg gtacatgcac agttgctctg ggagtcttca
1260gaatagaaac tacccatctc aagaggagct cattaaggtt gttgatgtgg
aggagcaaca 1320gctggaagag tctgggccac acgatttgac ggaaacatct
tacttgccaa ggcaagatct 1380agagggaacc ccttacctgg aatctggaat
cagcctcttc tctgatgacc ctgaatctga 1440tccttctgaa gacagagccc
cagagtcagc tcgtgttggc aacataccat cttcaacctc 1500tgcattgaaa
gttccccaat tgaaagttgc agaatctgcc cagagtccag ctgctgctca
1560tactactgat actgctgggt ataatgcaat ggaagaaagt gtgagcaggg
agaagccaga 1620attgacagct tcaacagaaa gggtcaacaa aagaatgtcc
atggtggtgt ctggcctgac 1680cccagaagaa tttatgctcg tgtacaagtt
tgccagaaaa caccacatca ctttaactaa 1740tctaattact gaagagacta
ctcatgttgt tatgaaaaca gatgctgagt ttgtgtgtga 1800acggacactg
aaatattttc taggaattgc gggaggaaaa tgggtagtta gctatttctg
1860ggtgacccag tctattaaag aaagaaaaat gctgaatgag catgattttg
aagtcagagg 1920agatgtggtc aatggaagaa accaccaagg tccaaagcga
gcaagagaat cccaggacag 1980aaagatcttc agggggctag aaatctgttg
ctatgggccc ttcaccaaca tgcccacaga 2040tcaactggaa tggatggtac
agctgtgtgg tgcttctgtg gtgaaggagc tttcatcatt 2100cacccttggc
acaggtgtcc acccaattgt ggttgtgcag ccagatgcct ggacagagga
2160caatggcttc catgcaattg ggcagatgtg tgaggcacct gtggtgaccc
gagagtgggt 2220gttggacagt gtagcactct accagtgcca ggagctggac
acctacctga taccccagat 2280cccccacagc cactactgac tgcagccagc
cacaggtaca gagccacagg accccaagaa 2340tgagcttaca aagtggcctt
tccaggccct gggagctcct ctcactcttc agtccttcta 2400ctgtcctggc
tactaaatat tttatgtaca tcagcctgaa aaggacttct ggctatgcaa
2460gggtccctta aagattttct gcttgaagtc tcccttggaa atctgccatg
agcacaaaat 2520tatggtaatt tttcacctga gaagatttta aaaccattta
aacgccacca attgagcaag 2580atgctgattc attatttatc agccctattc
tttctattca ggctgttgtt ggcttagggc 2640tggaagcaca gagtggcttg
gcctcaagag aatagctggt ttccctaagt ttacttctct 2700aaaaccctgt
gttcacaaag gcagagagtc agacccttca atggaaggag agtgcttggg
2760atcgattatg tgacttaaag tcagaatagt ccttgggcag ttctcaaatg
ttggagtgga 2820acattgggga ggaaattctg aggcaggtat tagaaatgaa
aaggaaactt gaaacctggg 2880catggtggct cacgcctgta atcccagcac
tttgggaggc caaggtgggc agatcactgg 2940aggtcaggag ttcgaaacca
gcctggccaa catggtgaaa ccccatctct actaaaaata 3000cagaaattag
ccggtcatgg tggtggacac ctgtaatccc agctactcag gtggctaagg
3060caggagaatc acttcagccc gggaggtgga ggttgcagtg agccaagatc
ataccacggc 3120actccagcct gggtgacagt gagactgtgg ctcaaaaaaa
aaaaaaaaaa aaggaaaatg 3180aaactagaag agatttctaa aagtctgaga
tatatttgct agatttctaa agaatgtgtt 3240ctaaaacagc agaagatttt
caagaaccgg tttccaaaga cagtcttcta attcctcatt 3300agtaataagt
aaaatgttta ttgttgtagc tctggtatat aatccattcc tcttaaaata
3360taagacctct ggcatgaata tttcatatct ataaaatgac agatcccacc
aggaaggaag 3420ctgttgcttt ctttgaggtg atttttttcc tttgctccct
gttgctgaaa ccatacagct 3480tcataaataa ttttgcttgc tgaaggaaga
aaaagtgttt ttcataaacc cattatccag 3540gactgtttat agctgttgga
aggactaggt cttccctagc ccccccagtg tgcaagggca 3600gtgaagactt
gattgtacaa aatacgtttt gtaaatgttg tgctgttaac actgcaaata
aacttggtag caaacacttc caaaaaaaaa aaaaaaaaa 3699 43 3800 DNA Homo
sapiens 43cttagcggta gccccttggt ttccgtggca acggaaaagc gcgggaatta
cagataaatt 60aaaactgcga ctgcgcggcg tgagctcgct gagacttcct ggacggggga
caggctgtgg 120ggtttctcag ataactgggc ccctgcgctc aggaggcctt
caccctctgc tctggttcat 180tggaacagaa agaaatggat ttatctgctc
ttcgcgttga agaagtacaa aatgtcatta 240atgctatgca gaaaatctta
gagtgtccca tctgtctgga gttgatcaag gaacctgtct 300ccacaaagtg
tgaccacata ttttgcaaat tttgcatgct gaaacttctc aaccagaaga
360aagggccttc acagtgtcct ttatgtaaga atgatataac caaaaggagc
ctacaagaaa 420gtacgagatt tagtcaactt gttgaagagc tattgaaaat
catttgtgct tttcagcttg 480acacaggttt ggagtatgca aacagctata
attttgcaaa aaaggaaaat aactctcctg 540aacatctaaa agatgaagtt
tctatcatcc aaagtatggg ctacagaaac cgtgccaaaa 600gacttctaca
gagtgaaccc gaaaatcctt ccttgcagga aaccagtctc agtgtccaac
660tctctaacct tggaactgtg agaactctga ggacaaagca gcggatacaa
cctcaaaaga 720cgtctgtcta cattgaattg ggatctgatt cttctgaaga
taccgttaat aaggcaactt 780attgcagtgt gggagatcaa gaattgttac
aaatcacccc tcaaggaacc agggatgaaa 840tcagtttgga ttctgcaaaa
aaggctgctt gtgaattttc tgagacggat gtaacaaata 900ctgaacatca
tcaacccagt aataatgatt tgaacaccac tgagaagcgt gcagctgaga
960ggcatccaga aaagtatcag ggtgaagcag catctgggtg tgagagtgaa
acaagcgtct 1020ctgaagactg ctcagggcta tcctctcaga gtgacatttt
aaccactcag cagagggata 1080ccatgcaaca taacctgata aagctccagc
aggaaatggc tgaactagaa gctgtgttag 1140aacagcatgg gagccagcct
tctaacagct acccttccat cataagtgac tcttctgccc 1200ttgaggacct
gcgaaatcca gaacaaagca catcagaaaa agtattaact tcacagaaaa
1260gtagtgaata ccctataagc cagaatccag aaggcctttc tgctgacaag
tttgaggtgt 1320ctgcagatag ttctaccagt aaaaataaag aaccaggagt
ggaaaggtca tccccttcta 1380aatgcccatc attagatgat aggtggtaca
tgcacagttg ctctgggagt cttcagaata 1440gaaactaccc atctcaagag
gagctcatta aggttgttga tgtggaggag caacagctgg 1500aagagtctgg
gccacacgat ttgacggaaa catcttactt gccaaggcaa gatctagagg
1560gaacccctta cctggaatct ggaatcagcc tcttctctga tgaccctgaa
tctgatcctt 1620ctgaagacag agccccagag tcagctcgtg ttggcaacat
accatcttca acctctgcat 1680tgaaagttcc ccaattgaaa gttgcagaat
ctgcccagag tccagctgct gctcatacta 1740ctgatactgc tgggtataat
gcaatggaag aaagtgtgag cagggagaag ccagaattga 1800cagcttcaac
agaaagggtc aacaaaagaa tgtccatggt ggtgtctggc ctgaccccag
1860aagaatttat gctcgtgtac aagtttgcca gaaaacacca catcacttta
actaatctaa 1920ttactgaaga gactactcat gttgttatga aaacagatgc
tgagtttgtg tgtgaacgga 1980cactgaaata ttttctagga attgcgggag
gaaaatgggt agttagctat ttctgggtga 2040cccagtctat taaagaaaga
aaaatgctga atgagcatga ttttgaagtc agaggagatg 2100tggtcaatgg
aagaaaccac caaggtccaa agcgagcaag agaatcccag gacagaaaga
2160tcttcagggg gctagaaatc tgttgctatg ggcccttcac caacatgccc
acagggtgtc 2220cacccaattg tggttgtgca gccagatgcc tggacagagg
acaatggctt ccatgcaatt 2280gggcagatgt gtgaggcacc tgtggtgacc
cgagagtggg tgttggacag tgtagcactc 2340taccagtgcc aggagctgga
cacctacctg ataccccaga tcccccacag ccactactga 2400ctgcagccag
ccacaggtac agagccacag gaccccaaga atgagcttac aaagtggcct
2460ttccaggccc tgggagctcc tctcactctt cagtccttct actgtcctgg
ctactaaata 2520ttttatgtac atcagcctga aaaggacttc tggctatgca
agggtccctt aaagattttc 2580tgcttgaagt ctcccttgga aatctgccat
gagcacaaaa ttatggtaat ttttcacctg 2640agaagatttt aaaaccattt
aaacgccacc aattgagcaa gatgctgatt cattatttat 2700cagccctatt
ctttctattc aggctgttgt tggcttaggg ctggaagcac agagtggctt
2760ggcctcaaga gaatagctgg tttccctaag tttacttctc taaaaccctg
tgttcacaaa 2820ggcagagagt cagacccttc aatggaagga gagtgcttgg
gatcgattat gtgacttaaa 2880gtcagaatag tccttgggca gttctcaaat
gttggagtgg aacattgggg aggaaattct 2940gaggcaggta ttagaaatga
aaaggaaact tgaaacctgg gcatggtggc tcacgcctgt 3000aatcccagca
ctttgggagg ccaaggtggg cagatcactg gaggtcagga gttcgaaacc
3060agcctggcca acatggtgaa accccatctc tactaaaaat acagaaatta
gccggtcatg 3120gtggtggaca cctgtaatcc cagctactca ggtggctaag
gcaggagaat cacttcagcc 3180cgggaggtgg aggttgcagt gagccaagat
cataccacgg cactccagcc tgggtgacag 3240tgagactgtg gctcaaaaaa
aaaaaaaaaa aaaggaaaat gaaactagaa gagatttcta 3300aaagtctgag
atatatttgc tagatttcta aagaatgtgt tctaaaacag cagaagattt
3360tcaagaaccg gtttccaaag acagtcttct aattcctcat tagtaataag
taaaatgttt 3420attgttgtag ctctggtata taatccattc ctcttaaaat
ataagacctc tggcatgaat 3480atttcatatc tataaaatga cagatcccac
caggaaggaa gctgttgctt tctttgaggt 3540gatttttttc ctttgctccc
tgttgctgaa accatacagc ttcataaata attttgcttg 3600ctgaaggaag
aaaaagtgtt tttcataaac ccattatcca ggactgttta tagctgttgg
3660aaggactagg tcttccctag cccccccagt gtgcaagggc agtgaagact
tgattgtaca 3720aaatacgttt tgtaaatgtt gtgctgttaa cactgcaaat
aaacttggta gcaaacactt 3780ccaaaaaaaa aaaaaaaaaa 3800
44 1863 PRT Homo sapiens
44 Met Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gln Asn Val
Ile Asn 1 5 10 15 Ala Met Gln Lys Ile Leu Glu Cys Pro Ile Cys Leu
Glu Leu Ile Lys 20 25 30 Glu Pro Val Ser Thr Lys Cys Asp His Ile
Phe Cys Lys Phe Cys Met 35 40 45 Leu Lys Leu Leu Asn Gln Lys Lys
Gly Pro Ser Gln Cys Pro Leu Cys 50 55 60 Lys Asn Asp Ile Thr Lys
Arg Ser Leu Gln Glu Ser Thr Arg Phe Ser 65 70 75 80 Gln Leu Val Glu
Glu Leu Leu Lys Ile Ile Cys Ala Phe Gln Leu Asp 85 90 95 Thr Gly Leu
Glu Tyr Ala Asn Ser Tyr Asn Phe Ala Lys Lys Glu Asn 100 105 110 Asn
Ser Pro Glu His Leu Lys Asp Glu Val Ser Ile Ile Gln Ser Met 115 120
125 Gly Tyr Arg Asn Arg Ala Lys Arg Leu Leu Gln Ser Glu Pro Glu Asn
130 135 140 Pro Ser Leu Gln Glu Thr Ser Leu Ser Val Gln Leu Ser Asn
Leu Gly 145 150 155 160 Thr Val Arg Thr Leu Arg Thr Lys Gln Arg Ile
Gln Pro Gln Lys Thr 165 170 175 Ser Val Tyr Ile Glu Leu Gly Ser Asp
Ser Ser Glu Asp Thr Val Asn 180 185 190 Lys Ala Thr Tyr Cys Ser Val
Gly Asp Gln Glu Leu Leu Gln Ile Thr 195 200 205 Pro Gln Gly Thr Arg
Asp Glu Ile Ser Leu Asp Ser Ala Lys Lys Ala 210 215 220 Ala Cys Glu
Phe Ser Glu Thr Asp Val Thr Asn Thr Glu His His Gln 225 230 235 240
Pro Ser Asn Asn Asp Leu Asn Thr Thr Glu Lys Arg Ala Ala Glu Arg 245
250 255 His Pro Glu Lys Tyr Gln Gly Ser Ser Val Ser Asn Leu His Val
Glu 260 265 270 Pro Cys Gly Thr Asn Thr His Ala Ser Ser Leu Gln His
Glu Asn Ser 275 280 285 Ser Leu Leu Leu Thr Lys Asp Arg Met Asn Val
Glu Lys Ala Glu Phe 290 295 300 Cys Asn Lys Ser Lys Gln Pro Gly Leu
Ala Arg Ser Gln His Asn Arg 305 310 315 320 Trp Ala Gly Ser Lys Glu
Thr Cys Asn Asp Arg Arg Thr Pro Ser Thr 325 330 335 Glu Lys Lys Val
Asp Leu Asn Ala Asp Pro Leu Cys Glu Arg Lys Glu 340 345 350 Trp Asn
Lys Gln Lys Leu Pro Cys Ser Glu Asn Pro Arg Asp Thr Glu 355 360 365
Asp Val Pro Trp Ile Thr Leu Asn Ser Ser Ile Gln Lys Val Asn Glu 370
375 380 Trp Phe Ser Arg Ser Asp Glu Leu Leu Gly Ser Asp Asp Ser His
Asp 385 390 395 400 Gly Glu Ser Glu Ser Asn Ala Lys Val Ala Asp Val
Leu Asp Val Leu 405 410 415 Asn Glu Val Asp Glu Tyr Ser Gly Ser Ser
Glu Lys Ile Asp Leu Leu 420 425 430 Ala Ser Asp Pro His Glu Ala Leu
Ile Cys Lys Ser Glu Arg Val His 435 440 445 Ser Lys Ser Val Glu Ser
Asn Ile Glu Asp Lys Ile Phe Gly Lys Thr 450 455 460 Tyr Arg Lys Lys
Ala Ser Leu Pro Asn Leu Ser His Val Thr Glu Asn 465 470 475 480 Leu
Ile Ile Gly Ala Phe Val Thr Glu Pro Gln Ile Ile Gln Glu Arg 485 490
495 Pro Leu Thr Asn Lys Leu Lys Arg Lys Arg Arg Pro Thr Ser Gly Leu
500 505 510 His Pro Glu Asp Phe Ile Lys Lys Ala Asp Leu Ala Val Gln
Lys Thr 515 520 525 Pro Glu Met Ile Asn Gln Gly Thr Asn Gln Thr Glu
Gln Asn Gly Gln 530 535 540 Val Met Asn Ile Thr Asn Ser Gly His Glu
Asn Lys Thr Lys Gly Asp 545 550 555 560 Ser Ile Gln Asn Glu Lys Asn
Pro Asn Pro Ile Glu Ser Leu Glu Lys 565 570 575 Glu Ser Ala Phe Lys
Thr Lys Ala Glu Pro Ile Ser Ser Ser Ile Ser 580 585 590 Asn Met Glu
Leu Glu Leu Asn Ile His Asn Ser Lys Ala Pro Lys Lys 595 600 605 Asn
Arg Leu Arg Arg Lys Ser Ser Thr Arg His Ile His Ala Leu Glu 610 615
620 Leu Val Val Ser Arg Asn Leu Ser Pro Pro Asn Cys Thr Glu Leu Gln
625 630 635 640 Ile Asp Ser Cys Ser Ser Ser Glu Glu Ile Lys Lys Lys
Lys Tyr Asn 645 650 655 Gln Met Pro Val Arg His Ser Arg Asn Leu Gln
Leu Met Glu Gly Lys 660 665 670 Glu Pro Ala Thr Gly Ala Lys Lys Ser
Asn Lys Pro Asn Glu Gln Thr 675 680 685 Ser Lys Arg His Asp Ser Asp
Thr Phe Pro Glu Leu Lys Leu Thr Asn 690 695 700 Ala Pro Gly Ser Phe
Thr Lys Cys Ser Asn Thr Ser Glu Leu Lys Glu 705 710 715 720 Phe Val
Asn Pro Ser Leu Pro Arg Glu Glu Lys Glu Glu Lys Leu Glu 725 730 735
Thr Val Lys Val Ser Asn Asn Ala Glu Asp Pro Lys Asp Leu Met Leu 740
745 750 Ser Gly Glu Arg Val Leu Gln Thr Glu Arg Ser Val Glu Ser Ser
Ser 755 760 765 Ile Ser Leu Val Pro Gly Thr Asp Tyr Gly Thr Gln Glu
Ser Ile Ser 770 775 780 Leu Leu Glu Val Ser Thr Leu Gly Lys Ala Lys
Thr Glu Pro Asn Lys 785 790 795 800 Cys Val Ser Gln Cys Ala Ala Phe
Glu Asn Pro Lys Gly Leu Ile His 805 810 815 Gly Cys Ser Lys Asp Asn
Arg Asn Asp Thr Glu Gly Phe Lys Tyr Pro 820 825 830 Leu Gly His Glu
Val Asn His Ser Arg Glu Thr Ser Ile Glu Met Glu 835 840 845 Glu Ser
Glu Leu Asp Ala Gln Tyr Leu Gln Asn Thr Phe Lys Val Ser 850 855 860
Lys Arg Gln Ser Phe Ala Pro Phe Ser Asn Pro Gly Asn Ala Glu Glu 865
870 875 880 Glu Cys Ala Thr Phe Ser Ala His Ser Gly Ser Leu Lys Lys
Gln Ser 885 890 895 Pro Lys Val Thr Phe Glu Cys Glu Gln Lys Glu Glu
Asn Gln Gly Lys 900 905 910 Asn Glu Ser Asn Ile Lys Pro Val Gln Thr
Val Asn Ile Thr Ala Gly 915 920 925 Phe Pro Val Val Gly Gln Lys Asp
Lys Pro Val Asp Asn Ala Lys Cys 930 935 940 Ser Ile Lys Gly Gly Ser
Arg Phe Cys Leu Ser Ser Gln Phe Arg Gly 945 950 955 960 Asn Glu Thr
Gly Leu Ile Thr Pro Asn Lys His Gly Leu Leu Gln Asn 965 970 975 Pro
Tyr Arg Ile Pro Pro Leu Phe Pro Ile Lys Ser Phe Val Lys Thr 980 985
990 Lys Cys Lys Lys Asn Leu Leu Glu Glu Asn Phe Glu Glu His Ser Met
995 1000 1005 Ser Pro Glu Arg Glu Met Gly Asn Glu Asn Ile Pro Ser
Thr Val 1010 1015 1020 Ser Thr Ile Ser Arg Asn Asn Ile Arg Glu Asn
Val Phe Lys Glu 1025 1030 1035 Ala Ser Ser Ser Asn Ile Asn Glu Val
Gly Ser Ser Thr Asn Glu 1040 1045 1050 Val Gly Ser Ser Ile Asn Glu
Ile Gly Ser Ser Asp Glu Asn Ile 1055 1060 1065 Gln Ala Glu Leu Gly
Arg Asn Arg Gly Pro Lys Leu Asn Ala Met 1070 1075 1080 Leu Arg Leu
Gly Val Leu Gln Pro Glu Val Tyr Lys Gln Ser Leu 1085 1090 1095 Pro
Gly Ser Asn Cys Lys His Pro Glu Ile Lys Lys Gln Glu Tyr 1100 1105
1110 Glu Glu Val Val Gln Thr Val Asn Thr Asp Phe Ser Pro Tyr Leu
1115 1120 1125 Ile Ser Asp Asn Leu Glu Gln Pro Met Gly Ser Ser His
Ala Ser 1130 1135 1140 Gln Val Cys Ser Glu Thr Pro Asp Asp Leu Leu
Asp Asp Gly Glu 1145 1150 1155 Ile Lys Glu Asp Thr Ser Phe Ala Glu
Asn Asp Ile Lys Glu Ser 1160 1165 1170 Ser Ala Val Phe Ser Lys Ser
Val Gln Lys Gly Glu Leu Ser Arg 1175 1180 1185 Ser Pro Ser Pro Phe
Thr His Thr His Leu Ala Gln Gly Tyr Arg 1190 1195 1200 Arg Gly Ala
Lys Lys Leu Glu Ser Ser Glu Glu Asn Leu Ser Ser 1205 1210 1215 Glu
Asp Glu Glu Leu Pro Cys Phe Gln His Leu Leu Phe Gly Lys 1220 1225
1230 Val Asn Asn Ile Pro Ser Gln Ser Thr Arg His Ser Thr Val Ala
1235 1240 1245 Thr Glu Cys Leu Ser Lys Asn Thr Glu Glu Asn Leu Leu
Ser Leu 1250 1255 1260 Lys Asn Ser Leu Asn Asp Cys Ser Asn Gln Val
Ile Leu Ala Lys 1265 1270 1275 Ala Ser Gln Glu His His Leu Ser Glu
Glu Thr Lys Cys Ser Ala 1280 1285 1290 Ser Leu Phe Ser Ser Gln Cys
Ser Glu Leu Glu Asp Leu Thr Ala 1295 1300 1305 Asn Thr Asn Thr Gln
Asp Pro Phe Leu Ile Gly Ser Ser Lys Gln 1310 1315 1320 Met Arg His
Gln Ser Glu Ser Gln Gly Val Gly Leu Ser Asp Lys 1325 1330 1335 Glu
Leu Val Ser Asp Asp Glu Glu Arg Gly Thr Gly Leu Glu Glu 1340 1345
1350 Asn Asn Gln Glu Glu Gln Ser Met Asp Ser Asn Leu Gly Glu Ala
1355 1360 1365 Ala Ser Gly Cys Glu Ser Glu Thr Ser Val Ser Glu Asp
Cys Ser 1370 1375 1380 Gly Leu Ser Ser Gln Ser Asp Ile Leu Thr Thr
Gln Gln Arg Asp 1385 1390 1395 Thr Met Gln His Asn Leu Ile Lys Leu
Gln Gln Glu Met Ala Glu 1400 1405 1410 Leu Glu Ala Val Leu Glu Gln
His Gly Ser Gln Pro Ser Asn Ser 1415 1420 1425 Tyr Pro Ser Ile Ile
Ser Asp Ser Ser Ala Leu Glu Asp Leu Arg 1430 1435 1440 Asn Pro Glu
Gln Ser Thr Ser Glu Lys Ala Val Leu Thr Ser Gln 1445 1450 1455 Lys
Ser Ser Glu Tyr Pro Ile Ser Gln Asn Pro Glu Gly Leu Ser 1460 1465
1470 Ala Asp Lys Phe Glu Val Ser Ala Asp Ser Ser Thr Ser Lys Asn
1475 1480 1485 Lys Glu Pro Gly Val Glu Arg Ser Ser Pro Ser Lys Cys
Pro Ser 1490 1495 1500 Leu Asp Asp Arg Trp Tyr Met His Ser Cys Ser
Gly Ser Leu Gln 1505 1510 1515 Asn Arg Asn Tyr Pro Ser Gln Glu Glu
Leu Ile Lys Val Val Asp 1520 1525 1530 Val Glu Glu Gln Gln Leu Glu
Glu Ser Gly Pro His Asp Leu Thr 1535 1540 1545 Glu Thr Ser Tyr Leu
Pro Arg Gln Asp Leu Glu Gly Thr Pro Tyr 1550 1555 1560 Leu Glu Ser
Gly Ile Ser Leu Phe Ser Asp Asp Pro Glu Ser Asp 1565 1570 1575 Pro
Ser Glu Asp Arg Ala Pro Glu Ser Ala Arg Val Gly Asn Ile 1580 1585
1590 Pro Ser Ser Thr Ser Ala Leu Lys Val Pro Gln Leu Lys Val Ala
1595 1600 1605 Glu Ser Ala Gln Ser Pro Ala Ala Ala His Thr Thr Asp
Thr Ala 1610 1615 1620 Gly Tyr Asn Ala Met Glu Glu Ser Val Ser Arg
Glu Lys Pro Glu 1625 1630 1635 Leu Thr Ala Ser Thr Glu Arg Val Asn
Lys Arg Met Ser Met Val 1640 1645 1650 Val Ser Gly Leu Thr Pro Glu
Glu Phe Met Leu Val Tyr Lys Phe 1655 1660 1665 Ala Arg Lys His His
Ile Thr Leu Thr Asn Leu Ile Thr Glu Glu 1670 1675 1680 Thr Thr His
Val Val Met Lys Thr Asp Ala Glu Phe Val Cys Glu 1685 1690 1695 Arg
Thr Leu Lys Tyr Phe Leu Gly Ile Ala Gly Gly Lys Trp Val 1700 1705
1710 Val Ser Tyr Phe Trp Val Thr Gln Ser Ile Lys Glu Arg Lys Met
1715 1720 1725 Leu Asn Glu His Asp Phe Glu Val Arg Gly Asp Val Val
Asn Gly 1730 1735 1740 Arg Asn His Gln Gly Pro Lys Arg Ala Arg Glu
Ser Gln Asp Arg 1745 1750 1755 Lys Ile Phe Arg Gly Leu Glu Ile Cys
Cys Tyr Gly Pro Phe Thr 1760 1765 1770 Asn Met Pro Thr Asp Gln Leu
Glu Trp Met Val Gln Leu Cys Gly 1775 1780 1785 Ala Ser Val Val Lys
Glu Leu Ser Ser Phe Thr Leu Gly Thr Gly 1790 1795 1800 Val His Pro
Ile Val Val Val Gln Pro Asp Ala Trp Thr Glu Asp 1805 1810 1815 Asn
Gly Phe His Ala Ile Gly Gln Met Cys Glu Ala Pro Val Val 1820 1825
1830 Thr Arg Glu Trp Val Leu Asp Ser Val Ala Leu Tyr Gln Cys Gln
1835 1840 1845 Glu Leu Asp Thr Tyr Leu Ile Pro Gln Ile Pro His Ser
His Tyr 1850 1855 1860
45 1884 PRT Homo sapiens
45 Met Asp Leu Ser Ala
Leu Arg Val Glu Glu Val Gln Asn Val Ile Asn 1 5 10 15 Ala Met Gln
Lys Ile Leu Glu Cys Pro Ile Cys Leu Glu Leu Ile Lys 20 25 30 Glu
Pro Val Ser Thr Lys Cys Asp His Ile Phe Cys Lys Phe Cys Met 35 40
45 Leu Lys Leu Leu Asn Gln Lys Lys Gly Pro Ser Gln Cys Pro Leu Cys
50 55 60 Lys Asn Asp Ile Thr Lys Arg Ser Leu Gln Glu Ser Thr Arg
Phe Ser 65 70 75 80 Gln Leu Val Glu Glu Leu Leu Lys Ile Ile Cys Ala
Phe Gln Leu Asp 85 90 95 Thr Gly Leu Glu Tyr Ala Asn Ser Tyr Asn
Phe Ala Lys Lys Glu Asn 100 105 110 Asn Ser Pro Glu His Leu Lys Asp
Glu Val Ser Ile Ile Gln Ser Met 115 120 125 Gly Tyr Arg Asn Arg Ala
Lys Arg Leu Leu Gln Ser Glu Pro Glu Asn 130 135 140 Pro Ser Leu Gln
Glu Thr Ser Leu Ser Val Gln Leu Ser Asn Leu Gly 145 150 155 160 Thr
Val Arg Thr Leu Arg Thr Lys Gln Arg Ile Gln Pro Gln Lys Thr 165 170
175 Ser Val Tyr Ile Glu Leu Gly Ser Asp Ser Ser Glu Asp Thr Val Asn
180 185 190 Lys Ala Thr Tyr Cys Ser Val Gly Asp Gln Glu Leu Leu Gln
Ile Thr 195 200 205 Pro Gln Gly Thr Arg Asp Glu Ile Ser Leu Asp Ser
Ala Lys Lys Ala 210 215 220 Ala Cys Glu Phe Ser Glu Thr Asp Val Thr
Asn Thr Glu His His Gln 225 230 235 240 Pro Ser Asn Asn Asp Leu Asn
Thr Thr Glu Lys Arg Ala Ala Glu Arg 245 250 255 His Pro Glu Lys Tyr
Gln Gly Ser Ser Val Ser Asn Leu His Val Glu 260 265 270 Pro Cys Gly
Thr Asn Thr His Ala Ser Ser Leu Gln His Glu Asn Ser 275 280 285 Ser
Leu Leu Leu Thr Lys Asp Arg Met Asn Val Glu Lys Ala Glu Phe 290 295
300 Cys Asn Lys Ser Lys Gln Pro Gly Leu Ala Arg Ser Gln His Asn Arg
305 310 315 320 Trp Ala Gly Ser Lys Glu Thr Cys Asn Asp Arg Arg Thr
Pro Ser Thr 325 330 335 Glu Lys Lys Val Asp Leu Asn Ala Asp Pro Leu
Cys Glu Arg Lys Glu 340 345 350 Trp Asn Lys Gln Lys Leu Pro Cys Ser
Glu Asn Pro Arg Asp Thr Glu 355 360 365 Asp Val Pro Trp Ile Thr Leu
Asn Ser Ser Ile Gln Lys Val Asn Glu 370 375 380 Trp Phe Ser Arg Ser
Asp Glu Leu Leu Gly Ser Asp Asp Ser His Asp 385 390 395 400 Gly Glu
Ser Glu Ser Asn Ala Lys Val Ala Asp Val Leu Asp Val Leu 405 410 415
Asn Glu Val Asp Glu Tyr Ser Gly Ser Ser Glu Lys Ile Asp Leu Leu 420
425 430 Ala Ser Asp Pro His Glu Ala Leu Ile Cys Lys Ser Glu Arg Val
His 435 440 445 Ser Lys Ser Val Glu Ser Asn Ile Glu Asp Lys Ile Phe
Gly Lys Thr 450 455 460 Tyr Arg Lys Lys Ala Ser Leu Pro Asn Leu Ser
His Val Thr Glu Asn 465 470 475 480 Leu Ile Ile Gly Ala Phe Val Thr
Glu Pro Gln Ile Ile Gln Glu Arg 485 490 495
Pro Leu Thr Asn Lys Leu Lys Arg Lys Arg Arg Pro Thr Ser Gly Leu
500 505 510 His Pro Glu Asp Phe Ile Lys Lys Ala Asp Leu Ala Val Gln
Lys Thr 515 520 525 Pro Glu Met Ile Asn Gln Gly Thr Asn Gln Thr Glu
Gln Asn Gly Gln 530 535 540 Val Met Asn Ile Thr Asn Ser Gly His Glu
Asn Lys Thr Lys Gly Asp 545 550 555 560 Ser Ile Gln Asn Glu Lys Asn
Pro Asn Pro Ile Glu Ser Leu Glu Lys 565 570 575 Glu Ser Ala Phe Lys
Thr Lys Ala Glu Pro Ile Ser Ser Ser Ile Ser 580 585 590 Asn Met Glu
Leu Glu Leu Asn Ile His Asn Ser Lys Ala Pro Lys Lys 595 600 605 Asn
Arg Leu Arg Arg Lys Ser Ser Thr Arg His Ile His Ala Leu Glu 610 615
620 Leu Val Val Ser Arg Asn Leu Ser Pro Pro Asn Cys Thr Glu Leu Gln
625 630 635 640 Ile Asp Ser Cys Ser Ser Ser Glu Glu Ile Lys Lys Lys
Lys Tyr Asn 645 650 655 Gln Met Pro Val Arg His Ser Arg Asn Leu Gln
Leu Met Glu Gly Lys 660 665 670 Glu Pro Ala Thr Gly Ala Lys Lys Ser
Asn Lys Pro Asn Glu Gln Thr 675 680 685 Ser Lys Arg His Asp Ser Asp
Thr Phe Pro Glu Leu Lys Leu Thr Asn 690 695 700 Ala Pro Gly Ser Phe
Thr Lys Cys Ser Asn Thr Ser Glu Leu Lys Glu 705 710 715 720 Phe Val
Asn Pro Ser Leu Pro Arg Glu Glu Lys Glu Glu Lys Leu Glu 725 730 735
Thr Val Lys Val Ser Asn Asn Ala Glu Asp Pro Lys Asp Leu Met Leu 740
745 750 Ser Gly Glu Arg Val Leu Gln Thr Glu Arg Ser Val Glu Ser Ser
Ser 755 760 765 Ile Ser Leu Val Pro Gly Thr Asp Tyr Gly Thr Gln Glu
Ser Ile Ser 770 775 780 Leu Leu Glu Val Ser Thr Leu Gly Lys Ala Lys
Thr Glu Pro Asn Lys 785 790 795 800 Cys Val Ser Gln Cys Ala Ala Phe
Glu Asn Pro Lys Gly Leu Ile His 805 810 815 Gly Cys Ser Lys Asp Asn
Arg Asn Asp Thr Glu Gly Phe Lys Tyr Pro 820 825 830 Leu Gly His Glu
Val Asn His Ser Arg Glu Thr Ser Ile Glu Met Glu 835 840 845 Glu Ser
Glu Leu Asp Ala Gln Tyr Leu Gln Asn Thr Phe Lys Val Ser 850 855 860
Lys Arg Gln Ser Phe Ala Pro Phe Ser Asn Pro Gly Asn Ala Glu Glu 865
870 875 880 Glu Cys Ala Thr Phe Ser Ala His Ser Gly Ser Leu Lys Lys
Gln Ser 885 890 895 Pro Lys Val Thr Phe Glu Cys Glu Gln Lys Glu Glu
Asn Gln Gly Lys 900 905 910 Asn Glu Ser Asn Ile Lys Pro Val Gln Thr
Val Asn Ile Thr Ala Gly 915 920 925 Phe Pro Val Val Gly Gln Lys Asp
Lys Pro Val Asp Asn Ala Lys Cys 930 935 940 Ser Ile Lys Gly Gly Ser
Arg Phe Cys Leu Ser Ser Gln Phe Arg Gly 945 950 955 960 Asn Glu Thr
Gly Leu Ile Thr Pro Asn Lys His Gly Leu Leu Gln Asn 965 970 975 Pro
Tyr Arg Ile Pro Pro Leu Phe Pro Ile Lys Ser Phe Val Lys Thr 980 985
990 Lys Cys Lys Lys Asn Leu Leu Glu Glu Asn Phe Glu Glu His Ser Met
995 1000 1005 Ser Pro Glu Arg Glu Met Gly Asn Glu Asn Ile Pro Ser
Thr Val 1010 1015 1020 Ser Thr Ile Ser Arg Asn Asn Ile Arg Glu Asn
Val Phe Lys Glu 1025 1030 1035 Ala Ser Ser Ser Asn Ile Asn Glu Val
Gly Ser Ser Thr Asn Glu 1040 1045 1050 Val Gly Ser Ser Ile Asn Glu
Ile Gly Ser Ser Asp Glu Asn Ile 1055 1060 1065 Gln Ala Glu Leu Gly
Arg Asn Arg Gly Pro Lys Leu Asn Ala Met 1070 1075 1080 Leu Arg Leu
Gly Val Leu Gln Pro Glu Val Tyr Lys Gln Ser Leu 1085 1090 1095 Pro
Gly Ser Asn Cys Lys His Pro Glu Ile Lys Lys Gln Glu Tyr 1100 1105
1110 Glu Glu Val Val Gln Thr Val Asn Thr Asp Phe Ser Pro Tyr Leu
1115 1120 1125 Ile Ser Asp Asn Leu Glu Gln Pro Met Gly Ser Ser His
Ala Ser 1130 1135 1140 Gln Val Cys Ser Glu Thr Pro Asp Asp Leu Leu
Asp Asp Gly Glu 1145 1150 1155 Ile Lys Glu Asp Thr Ser Phe Ala Glu
Asn Asp Ile Lys Glu Ser 1160 1165 1170 Ser Ala Val Phe Ser Lys Ser
Val Gln Lys Gly Glu Leu Ser Arg 1175 1180 1185 Ser Pro Ser Pro Phe
Thr His Thr His Leu Ala Gln Gly Tyr Arg 1190 1195 1200 Arg Gly Ala
Lys Lys Leu Glu Ser Ser Glu Glu Asn Leu Ser Ser 1205 1210 1215 Glu
Asp Glu Glu Leu Pro Cys Phe Gln His Leu Leu Phe Gly Lys 1220 1225
1230 Val Asn Asn Ile Pro Ser Gln Ser Thr Arg His Ser Thr Val Ala
1235 1240 1245 Thr Glu Cys Leu Ser Lys Asn Thr Glu Glu Asn Leu Leu
Ser Leu 1250 1255 1260 Lys Asn Ser Leu Asn Asp Cys Ser Asn Gln Val
Ile Leu Ala Lys 1265 1270 1275 Ala Ser Gln Glu His His Leu Ser Glu
Glu Thr Lys Cys Ser Ala 1280 1285 1290 Ser Leu Phe Ser Ser Gln Cys
Ser Glu Leu Glu Asp Leu Thr Ala 1295 1300 1305 Asn Thr Asn Thr Gln
Asp Pro Phe Leu Ile Gly Ser Ser Lys Gln 1310 1315 1320 Met Arg His
Gln Ser Glu Ser Gln Gly Val Gly Leu Ser Asp Lys 1325 1330 1335 Glu
Leu Val Ser Asp Asp Glu Glu Arg Gly Thr Gly Leu Glu Glu 1340 1345
1350 Asn Asn Gln Glu Glu Gln Ser Met Asp Ser Asn Leu Gly Glu Ala
1355 1360 1365 Ala Ser Gly Cys Glu Ser Glu Thr Ser Val Ser Glu Asp
Cys Ser 1370 1375 1380 Gly Leu Ser Ser Gln Ser Asp Ile Leu Thr Thr
Gln Gln Arg Asp 1385 1390 1395 Thr Met Gln His Asn Leu Ile Lys Leu
Gln Gln Glu Met Ala Glu 1400 1405 1410 Leu Glu Ala Val Leu Glu Gln
His Gly Ser Gln Pro Ser Asn Ser 1415 1420 1425 Tyr Pro Ser Ile Ile
Ser Asp Ser Ser Ala Leu Glu Asp Leu Arg 1430 1435 1440 Asn Pro Glu
Gln Ser Thr Ser Glu Lys Asp Ser His Ile His Gly 1445 1450 1455 Gln
Arg Asn Asn Ser Met Phe Ser Lys Arg Pro Arg Glu His Ile 1460 1465
1470 Ser Val Leu Thr Ser Gln Lys Ser Ser Glu Tyr Pro Ile Ser Gln
1475 1480 1485 Asn Pro Glu Gly Leu Ser Ala Asp Lys Phe Glu Val Ser
Ala Asp 1490 1495 1500 Ser Ser Thr Ser Lys Asn Lys Glu Pro Gly Val
Glu Arg Ser Ser 1505 1510 1515 Pro Ser Lys Cys Pro Ser Leu Asp Asp
Arg Trp Tyr Met His Ser 1520 1525 1530 Cys Ser Gly Ser Leu Gln Asn
Arg Asn Tyr Pro Ser Gln Glu Glu 1535 1540 1545 Leu Ile Lys Val Val
Asp Val Glu Glu Gln Gln Leu Glu Glu Ser 1550 1555 1560 Gly Pro His
Asp Leu Thr Glu Thr Ser Tyr Leu Pro Arg Gln Asp 1565 1570 1575 Leu
Glu Gly Thr Pro Tyr Leu Glu Ser Gly Ile Ser Leu Phe Ser 1580 1585
1590 Asp Asp Pro Glu Ser Asp Pro Ser Glu Asp Arg Ala Pro Glu Ser
1595 1600 1605 Ala Arg Val Gly Asn Ile Pro Ser Ser Thr Ser Ala Leu
Lys Val 1610 1615 1620 Pro Gln Leu Lys Val Ala Glu Ser Ala Gln Ser
Pro Ala Ala Ala 1625 1630 1635 His Thr Thr Asp Thr Ala Gly Tyr Asn
Ala Met Glu Glu Ser Val 1640 1645 1650 Ser Arg Glu Lys Pro Glu Leu
Thr Ala Ser Thr Glu Arg Val Asn 1655 1660 1665 Lys Arg Met Ser Met
Val Val Ser Gly Leu Thr Pro Glu Glu Phe 1670 1675 1680 Met Leu Val
Tyr Lys Phe Ala Arg Lys His His Ile Thr Leu Thr 1685 1690 1695 Asn
Leu Ile Thr Glu Glu Thr Thr His Val Val Met Lys Thr Asp 1700 1705
1710 Ala Glu Phe Val Cys Glu Arg Thr Leu Lys Tyr Phe Leu Gly Ile
1715 1720 1725 Ala Gly Gly Lys Trp Val Val Ser Tyr Phe Trp Val Thr
Gln Ser 1730 1735 1740 Ile Lys Glu Arg Lys Met Leu Asn Glu His Asp
Phe Glu Val Arg 1745 1750 1755 Gly Asp Val Val Asn Gly Arg Asn His
Gln Gly Pro Lys Arg Ala 1760 1765 1770 Arg Glu Ser Gln Asp Arg Lys
Ile Phe Arg Gly Leu Glu Ile Cys 1775 1780 1785 Cys Tyr Gly Pro Phe
Thr Asn Met Pro Thr Asp Gln Leu Glu Trp 1790 1795 1800 Met Val Gln
Leu Cys Gly Ala Ser Val Val Lys Glu Leu Ser Ser 1805 1810 1815 Phe
Thr Leu Gly Thr Gly Val His Pro Ile Val Val Val Gln Pro 1820 1825
1830 Asp Ala Trp Thr Glu Asp Asn Gly Phe His Ala Ile Gly Gln Met
1835 1840 1845 Cys Glu Ala Pro Val Val Thr Arg Glu Trp Val Leu Asp
Ser Val 1850 1855 1860 Ala Leu Tyr Gln Cys Gln Glu Leu Asp Thr Tyr
Leu Ile Pro Gln 1865 1870 1875 Ile Pro His Ser His Tyr 1880
46 1816 PRT Homo sapiens
46 Met Leu Lys Leu Leu Asn Gln Lys Lys Gly Pro
Ser Gln Cys Pro Leu 1 5 10 15 Cys Lys Asn Asp Ile Thr Lys Arg Ser
Leu Gln Glu Ser Thr Arg Phe 20 25 30 Ser Gln Leu Val Glu Glu Leu
Leu Lys Ile Ile Cys Ala Phe Gln Leu 35 40 45 Asp Thr Gly Leu Glu
Tyr Ala Asn Ser Tyr Asn Phe Ala Lys Lys Glu 50 55 60 Asn Asn Ser
Pro Glu His Leu Lys Asp Glu Val Ser Ile Ile Gln Ser 65 70 75 80 Met
Gly Tyr Arg Asn Arg Ala Lys Arg Leu Leu Gln Ser Glu Pro Glu 85 90
95 Asn Pro Ser Leu Gln Glu Thr Ser Leu Ser Val Gln Leu Ser Asn Leu
100 105 110 Gly Thr Val Arg Thr Leu Arg Thr Lys Gln Arg Ile Gln Pro
Gln Lys 115 120 125 Thr Ser Val Tyr Ile Glu Leu Gly Ser Asp Ser Ser
Glu Asp Thr Val 130 135 140 Asn Lys Ala Thr Tyr Cys Ser Val Gly Asp
Gln Glu Leu Leu Gln Ile 145 150 155 160 Thr Pro Gln Gly Thr Arg Asp
Glu Ile Ser Leu Asp Ser Ala Lys Lys 165 170 175 Ala Ala Cys Glu Phe
Ser Glu Thr Asp Val Thr Asn Thr Glu His His 180 185 190 Gln Pro Ser
Asn Asn Asp Leu Asn Thr Thr Glu Lys Arg Ala Ala Glu 195 200 205 Arg
His Pro Glu Lys Tyr Gln Gly Ser Ser Val Ser Asn Leu His Val 210 215
220 Glu Pro Cys Gly Thr Asn Thr His Ala Ser Ser Leu Gln His Glu Asn
225 230 235 240 Ser Ser Leu Leu Leu Thr Lys Asp Arg Met Asn Val Glu
Lys Ala Glu 245 250 255 Phe Cys Asn Lys Ser Lys Gln Pro Gly Leu Ala
Arg Ser Gln His Asn 260 265 270 Arg Trp Ala Gly Ser Lys Glu Thr Cys
Asn Asp Arg Arg Thr Pro Ser 275 280 285 Thr Glu Lys Lys Val Asp Leu
Asn Ala Asp Pro Leu Cys Glu Arg Lys 290 295 300 Glu Trp Asn Lys Gln
Lys Leu Pro Cys Ser Glu Asn Pro Arg Asp Thr 305 310 315 320 Glu Asp
Val Pro Trp Ile Thr Leu Asn Ser Ser Ile Gln Lys Val Asn 325 330 335
Glu Trp Phe Ser Arg Ser Asp Glu Leu Leu Gly Ser Asp Asp Ser His 340
345 350 Asp Gly Glu Ser Glu Ser Asn Ala Lys Val Ala Asp Val Leu Asp
Val 355 360 365 Leu Asn Glu Val Asp Glu Tyr Ser Gly Ser Ser Glu Lys
Ile Asp Leu 370 375 380 Leu Ala Ser Asp Pro His Glu Ala Leu Ile Cys
Lys Ser Glu Arg Val 385 390 395 400 His Ser Lys Ser Val Glu Ser Asn
Ile Glu Asp Lys Ile Phe Gly Lys 405 410 415 Thr Tyr Arg Lys Lys Ala
Ser Leu Pro Asn Leu Ser His Val Thr Glu 420 425 430 Asn Leu Ile Ile
Gly Ala Phe Val Thr Glu Pro Gln Ile Ile Gln Glu 435 440 445 Arg Pro
Leu Thr Asn Lys Leu Lys Arg Lys Arg Arg Pro Thr Ser Gly 450 455 460
Leu His Pro Glu Asp Phe Ile Lys Lys Ala Asp Leu Ala Val Gln Lys 465
470 475 480 Thr Pro Glu Met Ile Asn Gln Gly Thr Asn Gln Thr Glu Gln
Asn Gly 485 490 495 Gln Val Met Asn Ile Thr Asn Ser Gly His Glu Asn
Lys Thr Lys Gly 500 505 510 Asp Ser Ile Gln Asn Glu Lys Asn Pro Asn
Pro Ile Glu Ser Leu Glu 515 520 525 Lys Glu Ser Ala Phe Lys Thr Lys
Ala Glu Pro Ile Ser Ser Ser Ile 530 535 540 Ser Asn Met Glu Leu Glu
Leu Asn Ile His Asn Ser Lys Ala Pro Lys 545 550 555 560 Lys Asn Arg
Leu Arg Arg Lys Ser Ser Thr Arg His Ile His Ala Leu 565 570 575 Glu
Leu Val Val Ser Arg Asn Leu Ser Pro Pro Asn Cys Thr Glu Leu 580 585
590 Gln Ile Asp Ser Cys Ser Ser Ser Glu Glu Ile Lys Lys Lys Lys Tyr
595 600 605 Asn Gln Met Pro Val Arg His Ser Arg Asn Leu Gln Leu Met
Glu Gly 610 615 620 Lys Glu Pro Ala Thr Gly Ala Lys Lys Ser Asn Lys
Pro Asn Glu Gln 625 630 635 640 Thr Ser Lys Arg His Asp Ser Asp Thr
Phe Pro Glu Leu Lys Leu Thr 645 650 655 Asn Ala Pro Gly Ser Phe Thr
Lys Cys Ser Asn Thr Ser Glu Leu Lys 660 665 670 Glu Phe Val Asn Pro
Ser Leu Pro Arg Glu Glu Lys Glu Glu Lys Leu 675 680 685 Glu Thr Val
Lys Val Ser Asn Asn Ala Glu Asp Pro Lys Asp Leu Met 690 695 700 Leu
Ser Gly Glu Arg Val Leu Gln Thr Glu Arg Ser Val Glu Ser Ser 705 710
715 720 Ser Ile Ser Leu Val Pro Gly Thr Asp Tyr Gly Thr Gln Glu Ser
Ile 725 730 735 Ser Leu Leu Glu Val Ser Thr Leu Gly Lys Ala Lys Thr
Glu Pro Asn 740 745 750 Lys Cys Val Ser Gln Cys Ala Ala Phe Glu Asn
Pro Lys Gly Leu Ile 755 760 765 His Gly Cys Ser Lys Asp Asn Arg Asn
Asp Thr Glu Gly Phe Lys Tyr 770 775 780 Pro Leu Gly His Glu Val Asn
His Ser Arg Glu Thr Ser Ile Glu Met 785 790 795 800 Glu Glu Ser Glu
Leu Asp Ala Gln Tyr Leu Gln Asn Thr Phe Lys Val 805 810 815 Ser Lys
Arg Gln Ser Phe Ala Pro Phe Ser Asn Pro Gly Asn Ala Glu 820 825 830
Glu Glu Cys Ala Thr Phe Ser Ala His Ser Gly Ser Leu Lys Lys Gln 835
840 845 Ser Pro Lys Val Thr Phe Glu Cys Glu Gln Lys Glu Glu Asn Gln
Gly 850 855 860 Lys Asn Glu Ser Asn Ile Lys Pro Val Gln Thr Val Asn
Ile Thr Ala 865 870 875 880 Gly Phe
Pro Val Val Gly Gln Lys Asp Lys Pro Val Asp Asn Ala Lys 885 890 895
Cys Ser Ile Lys Gly Gly Ser Arg Phe Cys Leu Ser Ser Gln Phe Arg 900
905 910 Gly Asn Glu Thr Gly Leu Ile Thr Pro Asn Lys His Gly Leu Leu
Gln 915 920 925 Asn Pro Tyr Arg Ile Pro Pro Leu Phe Pro Ile Lys Ser
Phe Val Lys 930 935 940 Thr Lys Cys Lys Lys Asn Leu Leu Glu Glu Asn
Phe Glu Glu His Ser 945 950 955 960 Met Ser Pro Glu Arg Glu Met Gly
Asn Glu Asn Ile Pro Ser Thr Val 965 970 975 Ser Thr Ile Ser Arg Asn
Asn Ile Arg Glu Asn Val Phe Lys Glu Ala 980 985 990 Ser Ser Ser Asn
Ile Asn Glu Val Gly Ser Ser Thr Asn Glu Val Gly 995 1000 1005 Ser
Ser Ile Asn Glu Ile Gly Ser Ser Asp Glu Asn Ile Gln Ala 1010 1015
1020 Glu Leu Gly Arg Asn Arg Gly Pro Lys Leu Asn Ala Met Leu Arg
1025 1030 1035 Leu Gly Val Leu Gln Pro Glu Val Tyr Lys Gln Ser Leu
Pro Gly 1040 1045 1050 Ser Asn Cys Lys His Pro Glu Ile Lys Lys Gln
Glu Tyr Glu Glu 1055 1060 1065 Val Val Gln Thr Val Asn Thr Asp Phe
Ser Pro Tyr Leu Ile Ser 1070 1075 1080 Asp Asn Leu Glu Gln Pro Met
Gly Ser Ser His Ala Ser Gln Val 1085 1090 1095 Cys Ser Glu Thr Pro
Asp Asp Leu Leu Asp Asp Gly Glu Ile Lys 1100 1105 1110 Glu Asp Thr
Ser Phe Ala Glu Asn Asp Ile Lys Glu Ser Ser Ala 1115 1120 1125 Val
Phe Ser Lys Ser Val Gln Lys Gly Glu Leu Ser Arg Ser Pro 1130 1135
1140 Ser Pro Phe Thr His Thr His Leu Ala Gln Gly Tyr Arg Arg Gly
1145 1150 1155 Ala Lys Lys Leu Glu Ser Ser Glu Glu Asn Leu Ser Ser
Glu Asp 1160 1165 1170 Glu Glu Leu Pro Cys Phe Gln His Leu Leu Phe
Gly Lys Val Asn 1175 1180 1185 Asn Ile Pro Ser Gln Ser Thr Arg His
Ser Thr Val Ala Thr Glu 1190 1195 1200 Cys Leu Ser Lys Asn Thr Glu
Glu Asn Leu Leu Ser Leu Lys Asn 1205 1210 1215 Ser Leu Asn Asp Cys
Ser Asn Gln Val Ile Leu Ala Lys Ala Ser 1220 1225 1230 Gln Glu His
His Leu Ser Glu Glu Thr Lys Cys Ser Ala Ser Leu 1235 1240 1245 Phe
Ser Ser Gln Cys Ser Glu Leu Glu Asp Leu Thr Ala Asn Thr 1250 1255
1260 Asn Thr Gln Asp Pro Phe Leu Ile Gly Ser Ser Lys Gln Met Arg
1265 1270 1275 His Gln Ser Glu Ser Gln Gly Val Gly Leu Ser Asp Lys
Glu Leu 1280 1285 1290 Val Ser Asp Asp Glu Glu Arg Gly Thr Gly Leu
Glu Glu Asn Asn 1295 1300 1305 Gln Glu Glu Gln Ser Met Asp Ser Asn
Leu Gly Glu Ala Ala Ser 1310 1315 1320 Gly Cys Glu Ser Glu Thr Ser
Val Ser Glu Asp Cys Ser Gly Leu 1325 1330 1335 Ser Ser Gln Ser Asp
Ile Leu Thr Thr Gln Gln Arg Asp Thr Met 1340 1345 1350 Gln His Asn
Leu Ile Lys Leu Gln Gln Glu Met Ala Glu Leu Glu 1355 1360 1365 Ala
Val Leu Glu Gln His Gly Ser Gln Pro Ser Asn Ser Tyr Pro 1370 1375
1380 Ser Ile Ile Ser Asp Ser Ser Ala Leu Glu Asp Leu Arg Asn Pro
1385 1390 1395 Glu Gln Ser Thr Ser Glu Lys Ala Val Leu Thr Ser Gln
Lys Ser 1400 1405 1410 Ser Glu Tyr Pro Ile Ser Gln Asn Pro Glu Gly
Leu Ser Ala Asp 1415 1420 1425 Lys Phe Glu Val Ser Ala Asp Ser Ser
Thr Ser Lys Asn Lys Glu 1430 1435 1440 Pro Gly Val Glu Arg Ser Ser
Pro Ser Lys Cys Pro Ser Leu Asp 1445 1450 1455 Asp Arg Trp Tyr Met
His Ser Cys Ser Gly Ser Leu Gln Asn Arg 1460 1465 1470 Asn Tyr Pro
Ser Gln Glu Glu Leu Ile Lys Val Val Asp Val Glu 1475 1480 1485 Glu
Gln Gln Leu Glu Glu Ser Gly Pro His Asp Leu Thr Glu Thr 1490 1495
1500 Ser Tyr Leu Pro Arg Gln Asp Leu Glu Gly Thr Pro Tyr Leu Glu
1505 1510 1515 Ser Gly Ile Ser Leu Phe Ser Asp Asp Pro Glu Ser Asp
Pro Ser 1520 1525 1530 Glu Asp Arg Ala Pro Glu Ser Ala Arg Val Gly
Asn Ile Pro Ser 1535 1540 1545 Ser Thr Ser Ala Leu Lys Val Pro Gln
Leu Lys Val Ala Glu Ser 1550 1555 1560 Ala Gln Ser Pro Ala Ala Ala
His Thr Thr Asp Thr Ala Gly Tyr 1565 1570 1575 Asn Ala Met Glu Glu
Ser Val Ser Arg Glu Lys Pro Glu Leu Thr 1580 1585 1590 Ala Ser Thr
Glu Arg Val Asn Lys Arg Met Ser Met Val Val Ser 1595 1600 1605 Gly
Leu Thr Pro Glu Glu Phe Met Leu Val Tyr Lys Phe Ala Arg 1610 1615
1620 Lys His His Ile Thr Leu Thr Asn Leu Ile Thr Glu Glu Thr Thr
1625 1630 1635 His Val Val Met Lys Thr Asp Ala Glu Phe Val Cys Glu
Arg Thr 1640 1645 1650 Leu Lys Tyr Phe Leu Gly Ile Ala Gly Gly Lys
Trp Val Val Ser 1655 1660 1665 Tyr Phe Trp Val Thr Gln Ser Ile Lys
Glu Arg Lys Met Leu Asn 1670 1675 1680 Glu His Asp Phe Glu Val Arg
Gly Asp Val Val Asn Gly Arg Asn 1685 1690 1695 His Gln Gly Pro Lys
Arg Ala Arg Glu Ser Gln Asp Arg Lys Ile 1700 1705 1710 Phe Arg Gly
Leu Glu Ile Cys Cys Tyr Gly Pro Phe Thr Asn Met 1715 1720 1725 Pro
Thr Asp Gln Leu Glu Trp Met Val Gln Leu Cys Gly Ala Ser 1730 1735
1740 Val Val Lys Glu Leu Ser Ser Phe Thr Leu Gly Thr Gly Val His
1745 1750 1755 Pro Ile Val Val Val Gln Pro Asp Ala Trp Thr Glu Asp
Asn Gly 1760 1765 1770 Phe His Ala Ile Gly Gln Met Cys Glu Ala Pro
Val Val Thr Arg 1775 1780 1785 Glu Trp Val Leu Asp Ser Val Ala Leu
Tyr Gln Cys Gln Glu Leu 1790 1795 1800 Asp Thr Tyr Leu Ile Pro Gln
Ile Pro His Ser His Tyr 1805 1810 1815
47 759 PRT Homo sapiens
47 Met
Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gln Asn Val Ile Asn 1 5 10
15 Ala Met Gln Lys Ile Leu Glu Cys Pro Ile Cys Leu Glu Leu Ile Lys
20 25 30 Glu Pro Val Ser Thr Lys Cys Asp His Ile Phe Cys Lys Phe
Cys Met 35 40 45 Leu Lys Leu Leu Asn Gln Lys Lys Gly Pro Ser Gln
Cys Pro Leu Cys 50 55 60 Lys Asn Asp Ile Thr Lys Arg Ser Leu Gln
Glu Ser Thr Arg Phe Ser 65 70 75 80 Gln Leu Val Glu Glu Leu Leu Lys
Ile Ile Cys Ala Phe Gln Leu Asp 85 90 95 Thr Gly Leu Glu Tyr Ala
Asn Ser Tyr Asn Phe Ala Lys Lys Glu Asn 100 105 110 Asn Ser Pro Glu
His Leu Lys Asp Glu Val Ser Ile Ile Gln Ser Met 115 120 125 Gly Tyr
Arg Asn Arg Ala Lys Arg Leu Leu Gln Ser Glu Pro Glu Asn 130 135 140
Pro Ser Leu Gln Glu Thr Ser Leu Ser Val Gln Leu Ser Asn Leu Gly 145
150 155 160 Thr Val Arg Thr Leu Arg Thr Lys Gln Arg Ile Gln Pro Gln
Lys Thr 165 170 175 Ser Val Tyr Ile Glu Leu Gly Ser Asp Ser Ser Glu
Asp Thr Val Asn 180 185 190 Lys Ala Thr Tyr Cys Ser Val Gly Asp Gln
Glu Leu Leu Gln Ile Thr 195 200 205 Pro Gln Gly Thr Arg Asp Glu Ile
Ser Leu Asp Ser Ala Lys Lys Ala 210 215 220 Ala Cys Glu Phe Ser Glu
Thr Asp Val Thr Asn Thr Glu His His Gln 225 230 235 240 Pro Ser Asn
Asn Asp Leu Asn Thr Thr Glu Lys Arg Ala Ala Glu Arg 245 250 255 His
Pro Glu Lys Tyr Gln Gly Glu Ala Ala Ser Gly Cys Glu Ser Glu 260 265
270 Thr Ser Val Ser Glu Asp Cys Ser Gly Leu Ser Ser Gln Ser Asp Ile
275 280 285 Leu Thr Thr Gln Gln Arg Asp Thr Met Gln His Asn Leu Ile
Lys Leu 290 295 300 Gln Gln Glu Met Ala Glu Leu Glu Ala Val Leu Glu
Gln His Gly Ser 305 310 315 320 Gln Pro Ser Asn Ser Tyr Pro Ser Ile
Ile Ser Asp Ser Ser Ala Leu 325 330 335 Glu Asp Leu Arg Asn Pro Glu
Gln Ser Thr Ser Glu Lys Val Leu Thr 340 345 350 Ser Gln Lys Ser Ser
Glu Tyr Pro Ile Ser Gln Asn Pro Glu Gly Leu 355 360 365 Ser Ala Asp
Lys Phe Glu Val Ser Ala Asp Ser Ser Thr Ser Lys Asn 370 375 380 Lys
Glu Pro Gly Val Glu Arg Ser Ser Pro Ser Lys Cys Pro Ser Leu 385 390
395 400 Asp Asp Arg Trp Tyr Met His Ser Cys Ser Gly Ser Leu Gln Asn
Arg 405 410 415 Asn Tyr Pro Ser Gln Glu Glu Leu Ile Lys Val Val Asp
Val Glu Glu 420 425 430 Gln Gln Leu Glu Glu Ser Gly Pro His Asp Leu
Thr Glu Thr Ser Tyr 435 440 445 Leu Pro Arg Gln Asp Leu Glu Gly Thr
Pro Tyr Leu Glu Ser Gly Ile 450 455 460 Ser Leu Phe Ser Asp Asp Pro
Glu Ser Asp Pro Ser Glu Asp Arg Ala 465 470 475 480 Pro Glu Ser Ala
Arg Val Gly Asn Ile Pro Ser Ser Thr Ser Ala Leu 485 490 495 Lys Val
Pro Gln Leu Lys Val Ala Glu Ser Ala Gln Ser Pro Ala Ala 500 505 510
Ala His Thr Thr Asp Thr Ala Gly Tyr Asn Ala Met Glu Glu Ser Val 515
520 525 Ser Arg Glu Lys Pro Glu Leu Thr Ala Ser Thr Glu Arg Val Asn
Lys 530 535 540 Arg Met Ser Met Val Val Ser Gly Leu Thr Pro Glu Glu
Phe Met Leu 545 550 555 560 Val Tyr Lys Phe Ala Arg Lys His His Ile
Thr Leu Thr Asn Leu Ile 565 570 575 Thr Glu Glu Thr Thr His Val Val
Met Lys Thr Asp Ala Glu Phe Val 580 585 590 Cys Glu Arg Thr Leu Lys
Tyr Phe Leu Gly Ile Ala Gly Gly Lys Trp 595 600 605 Val Val Ser Tyr
Phe Trp Val Thr Gln Ser Ile Lys Glu Arg Lys Met 610 615 620 Leu Asn
Glu His Asp Phe Glu Val Arg Gly Asp Val Val Asn Gly Arg 625 630 635
640 Asn His Gln Gly Pro Lys Arg Ala Arg Glu Ser Gln Asp Arg Lys Ile
645 650 655 Phe Arg Gly Leu Glu Ile Cys Cys Tyr Gly Pro Phe Thr Asn
Met Pro 660 665 670 Thr Asp Gln Leu Glu Trp Met Val Gln Leu Cys Gly
Ala Ser Val Val 675 680 685 Lys Glu Leu Ser Ser Phe Thr Leu Gly Thr
Gly Val His Pro Ile Val 690 695 700 Val Val Gln Pro Asp Ala Trp Thr
Glu Asp Asn Gly Phe His Ala Ile 705 710 715 720 Gly Gln Met Cys Glu
Ala Pro Val Val Thr Arg Glu Trp Val Leu Asp 725 730 735 Ser Val Ala
Leu Tyr Gln Cys Gln Glu Leu Asp Thr Tyr Leu Ile Pro 740 745 750 Gln
Ile Pro His Ser His Tyr 755
48 699 PRT Homo sapiens
48 Met Asp Leu Ser
Ala Leu Arg Val Glu Glu Val Gln Asn Val Ile Asn 1 5 10 15 Ala Met
Gln Lys Ile Leu Glu Cys Pro Ile Cys Leu Glu Leu Ile Lys 20 25 30
Glu Pro Val Ser Thr Lys Cys Asp His Ile Phe Cys Lys Phe Cys Met 35
40 45 Leu Lys Leu Leu Asn Gln Lys Lys Gly Pro Ser Gln Cys Pro Leu
Cys 50 55 60 Lys Asn Asp Ile Thr Lys Arg Ser Leu Gln Glu Ser Thr
Arg Phe Ser 65 70 75 80 Gln Leu Val Glu Glu Leu Leu Lys Ile Ile Cys
Ala Phe Gln Leu Asp 85 90 95 Thr Gly Leu Glu Tyr Ala Asn Ser Tyr
Asn Phe Ala Lys Lys Glu Asn 100 105 110 Asn Ser Pro Glu His Leu Lys
Asp Glu Val Ser Ile Ile Gln Ser Met 115 120 125 Gly Tyr Arg Asn Arg
Ala Lys Arg Leu Leu Gln Ser Glu Pro Glu Asn 130 135 140 Pro Ser Leu
Gln Glu Thr Ser Leu Ser Val Gln Leu Ser Asn Leu Gly 145 150 155 160
Thr Val Arg Thr Leu Arg Thr Lys Gln Arg Ile Gln Pro Gln Lys Thr 165
170 175 Ser Val Tyr Ile Glu Leu Gly Ser Asp Ser Ser Glu Asp Thr Val
Asn 180 185 190 Lys Ala Thr Tyr Cys Ser Val Gly Asp Gln Glu Leu Leu
Gln Ile Thr 195 200 205 Pro Gln Gly Thr Arg Asp Glu Ile Ser Leu Asp
Ser Ala Lys Lys Ala 210 215 220 Ala Cys Glu Phe Ser Glu Thr Asp Val
Thr Asn Thr Glu His His Gln 225 230 235 240 Pro Ser Asn Asn Asp Leu
Asn Thr Thr Glu Lys Arg Ala Ala Glu Arg 245 250 255 His Pro Glu Lys
Tyr Gln Gly Glu Ala Ala Ser Gly Cys Glu Ser Glu 260 265 270 Thr Ser
Val Ser Glu Asp Cys Ser Gly Leu Ser Ser Gln Ser Asp Ile 275 280 285
Leu Thr Thr Gln Gln Arg Asp Thr Met Gln His Asn Leu Ile Lys Leu 290
295 300 Gln Gln Glu Met Ala Glu Leu Glu Ala Val Leu Glu Gln His Gly
Ser 305 310 315 320 Gln Pro Ser Asn Ser Tyr Pro Ser Ile Ile Ser Asp
Ser Ser Ala Leu 325 330 335 Glu Asp Leu Arg Asn Pro Glu Gln Ser Thr
Ser Glu Lys Val Leu Thr 340 345 350 Ser Gln Lys Ser Ser Glu Tyr Pro
Ile Ser Gln Asn Pro Glu Gly Leu 355 360 365 Ser Ala Asp Lys Phe Glu
Val Ser Ala Asp Ser Ser Thr Ser Lys Asn 370 375 380 Lys Glu Pro Gly
Val Glu Arg Ser Ser Pro Ser Lys Cys Pro Ser Leu 385 390 395 400 Asp
Asp Arg Trp Tyr Met His Ser Cys Ser Gly Ser Leu Gln Asn Arg 405 410
415 Asn Tyr Pro Ser Gln Glu Glu Leu Ile Lys Val Val Asp Val Glu Glu
420 425 430 Gln Gln Leu Glu Glu Ser Gly Pro His Asp Leu Thr Glu Thr
Ser Tyr 435 440 445 Leu Pro Arg Gln Asp Leu Glu Gly Thr Pro Tyr Leu
Glu Ser Gly Ile 450 455 460 Ser Leu Phe Ser Asp Asp Pro Glu Ser Asp
Pro Ser Glu Asp Arg Ala 465 470 475 480 Pro Glu Ser Ala Arg Val Gly
Asn Ile Pro Ser Ser Thr Ser Ala Leu 485 490 495 Lys Val Pro Gln Leu
Lys Val Ala Glu Ser Ala Gln Ser Pro Ala Ala 500 505 510 Ala His Thr
Thr Asp Thr Ala Gly Tyr Asn Ala Met Glu Glu Ser Val 515 520 525 Ser
Arg Glu Lys Pro Glu Leu Thr Ala Ser Thr Glu Arg Val Asn Lys 530 535
540 Arg Met Ser Met Val Val Ser Gly Leu Thr Pro Glu Glu Phe Met Leu
545 550 555 560 Val Tyr Lys Phe Ala Arg Lys His His Ile Thr Leu Thr Asn
Leu Ile 565 570 575 Thr Glu Glu Thr Thr His Val Val Met Lys Thr Asp
Ala Glu Phe Val 580 585 590 Cys Glu Arg Thr Leu Lys Tyr Phe Leu Gly
Ile Ala Gly Gly Lys Trp 595 600 605 Val Val Ser Tyr Phe Trp Val Thr
Gln Ser Ile Lys Glu Arg Lys Met 610 615 620 Leu Asn Glu His Asp Phe
Glu Val Arg Gly Asp Val Val Asn Gly Arg 625 630 635 640 Asn His Gln
Gly Pro Lys Arg Ala Arg Glu Ser Gln Asp Arg Lys Ile 645 650 655 Phe
Arg Gly Leu Glu Ile Cys Cys Tyr Gly Pro Phe Thr Asn Met Pro 660 665
670 Thr Gly Cys Pro Pro Asn Cys Gly Cys Ala Ala Arg Cys Leu Asp Arg
675 680 685 Gly Gln Trp Leu Pro Cys Asn Trp Ala Asp Val 690 695
49 11386 DNA Homo sapiens
49 gtggcgcgag cttctgaaac taggcggcag
aggcggagcc gctgtggcac tgctgcgcct 60ctgctgcgcc tcgggtgtct tttgcggcgg
tgggtcgccg ccgggagaag cgtgagggga 120cagatttgtg accggcgcgg
tttttgtcag cttactccgg ccaaaaaaga actgcacctc 180tggagcggac
ttatttacca agcattggag gaatatcgta ggtaaaaatg cctattggat
240ccaaagagag gccaacattt tttgaaattt ttaagacacg ctgcaacaaa
gcagatttag 300gaccaataag tcttaattgg tttgaagaac tttcttcaga
agctccaccc tataattctg 360aacctgcaga agaatctgaa cataaaaaca
acaattacga accaaaccta tttaaaactc 420cacaaaggaa accatcttat
aatcagctgg cttcaactcc aataatattc aaagagcaag 480ggctgactct
gccgctgtac caatctcctg taaaagaatt agataaattc aaattagact
540taggaaggaa tgttcccaat agtagacata aaagtcttcg cacagtgaaa
actaaaatgg 600atcaagcaga tgatgtttcc tgtccacttc taaattcttg
tcttagtgaa agtcctgttg 660ttctacaatg tacacatgta acaccacaaa
gagataagtc agtggtatgt gggagtttgt 720ttcatacacc aaagtttgtg
aagggtcgtc agacaccaaa acatatttct gaaagtctag 780gagctgaggt
ggatcctgat atgtcttggt caagttcttt agctacacca cccaccctta
840gttctactgt gctcatagtc agaaatgaag aagcatctga aactgtattt
cctcatgata 900ctactgctaa tgtgaaaagc tatttttcca atcatgatga
aagtctgaag aaaaatgata 960gatttatcgc ttctgtgaca gacagtgaaa
acacaaatca aagagaagct gcaagtcatg 1020gatttggaaa aacatcaggg
aattcattta aagtaaatag ctgcaaagac cacattggaa 1080agtcaatgcc
aaatgtccta gaagatgaag tatatgaaac agttgtagat acctctgaag
1140aagatagttt ttcattatgt ttttctaaat gtagaacaaa aaatctacaa
aaagtaagaa 1200ctagcaagac taggaaaaaa attttccatg aagcaaacgc
tgatgaatgt gaaaaatcta 1260aaaaccaagt gaaagaaaaa tactcatttg
tatctgaagt ggaaccaaat gatactgatc 1320cattagattc aaatgtagca
aatcagaagc cctttgagag tggaagtgac aaaatctcca 1380aggaagttgt
accgtctttg gcctgtgaat ggtctcaact aaccctttca ggtctaaatg
1440gagcccagat ggagaaaata cccctattgc atatttcttc atgtgaccaa
aatatttcag 1500aaaaagacct attagacaca gagaacaaaa gaaagaaaga
ttttcttact tcagagaatt 1560ctttgccacg tatttctagc ctaccaaaat
cagagaagcc attaaatgag gaaacagtgg 1620taaataagag agatgaagag
cagcatcttg aatctcatac agactgcatt cttgcagtaa 1680agcaggcaat
atctggaact tctccagtgg cttcttcatt tcagggtatc aaaaagtcta
1740tattcagaat aagagaatca cctaaagaga ctttcaatgc aagtttttca
ggtcatatga 1800ctgatccaaa ctttaaaaaa gaaactgaag cctctgaaag
tggactggaa atacatactg 1860tttgctcaca gaaggaggac tccttatgtc
caaatttaat tgataatgga agctggccag 1920ccaccaccac acagaattct
gtagctttga agaatgcagg tttaatatcc actttgaaaa 1980agaaaacaaa
taagtttatt tatgctatac atgatgaaac atcttataaa ggaaaaaaaa
2040taccgaaaga ccaaaaatca gaactaatta actgttcagc ccagtttgaa
gcaaatgctt 2100ttgaagcacc acttacattt gcaaatgctg attcaggttt
attgcattct tctgtgaaaa 2160gaagctgttc acagaatgat tctgaagaac
caactttgtc cttaactagc tcttttggga 2220caattctgag gaaatgttct
agaaatgaaa catgttctaa taatacagta atctctcagg 2280atcttgatta
taaagaagca aaatgtaata aggaaaaact acagttattt attaccccag
2340aagctgattc tctgtcatgc ctgcaggaag gacagtgtga aaatgatcca
aaaagcaaaa 2400aagtttcaga tataaaagaa gaggtcttgg ctgcagcatg
tcacccagta caacattcaa 2460aagtggaata cagtgatact gactttcaat
cccagaaaag tcttttatat gatcatgaaa 2520atgccagcac tcttatttta
actcctactt ccaaggatgt tctgtcaaac ctagtcatga 2580tttctagagg
caaagaatca tacaaaatgt cagacaagct caaaggtaac aattatgaat
2640ctgatgttga attaaccaaa aatattccca tggaaaagaa tcaagatgta
tgtgctttaa 2700atgaaaatta taaaaacgtt gagctgttgc cacctgaaaa
atacatgaga gtagcatcac 2760cttcaagaaa ggtacaattc aaccaaaaca
caaatctaag agtaatccaa aaaaatcaag 2820aagaaactac ttcaatttca
aaaataactg tcaatccaga ctctgaagaa cttttctcag 2880acaatgagaa
taattttgtc ttccaagtag ctaatgaaag gaataatctt gctttaggaa
2940atactaagga acttcatgaa acagacttga cttgtgtaaa cgaacccatt
ttcaagaact 3000ctaccatggt tttatatgga gacacaggtg ataaacaagc
aacccaagtg tcaattaaaa 3060aagatttggt ttatgttctt gcagaggaga
acaaaaatag tgtaaagcag catataaaaa 3120tgactctagg tcaagattta
aaatcggaca tctccttgaa tatagataaa ataccagaaa 3180aaaataatga
ttacatgaac aaatgggcag gactcttagg tccaatttca aatcacagtt
3240ttggaggtag cttcagaaca gcttcaaata aggaaatcaa gctctctgaa
cataacatta 3300agaagagcaa aatgttcttc aaagatattg aagaacaata
tcctactagt ttagcttgtg 3360ttgaaattgt aaataccttg gcattagata
atcaaaagaa actgagcaag cctcagtcaa 3420ttaatactgt atctgcacat
ttacagagta gtgtagttgt ttctgattgt aaaaatagtc 3480atataacccc
tcagatgtta ttttccaagc aggattttaa ttcaaaccat aatttaacac
3540ctagccaaaa ggcagaaatt acagaacttt ctactatatt agaagaatca
ggaagtcagt 3600ttgaatttac tcagtttaga aaaccaagct acatattgca
gaagagtaca tttgaagtgc 3660ctgaaaacca gatgactatc ttaaagacca
cttctgagga atgcagagat gctgatcttc 3720atgtcataat gaatgcccca
tcgattggtc aggtagacag cagcaagcaa tttgaaggta 3780cagttgaaat
taaacggaag tttgctggcc tgttgaaaaa tgactgtaac aaaagtgctt
3840ctggttattt aacagatgaa aatgaagtgg ggtttagggg cttttattct
gctcatggca 3900caaaactgaa tgtttctact gaagctctgc aaaaagctgt
gaaactgttt agtgatattg 3960agaatattag tgaggaaact tctgcagagg
tacatccaat aagtttatct tcaagtaaat 4020gtcatgattc tgttgtttca
atgtttaaga tagaaaatca taatgataaa actgtaagtg 4080aaaaaaataa
taaatgccaa ctgatattac aaaataatat tgaaatgact actggcactt
4140ttgttgaaga aattactgaa aattacaaga gaaatactga aaatgaagat
aacaaatata 4200ctgctgccag tagaaattct cataacttag aatttgatgg
cagtgattca agtaaaaatg 4260atactgtttg tattcataaa gatgaaacgg
acttgctatt tactgatcag cacaacatat 4320gtcttaaatt atctggccag
tttatgaagg agggaaacac tcagattaaa gaagatttgt 4380cagatttaac
ttttttggaa gttgcgaaag ctcaagaagc atgtcatggt aatacttcaa
4440ataaagaaca gttaactgct actaaaacgg agcaaaatat aaaagatttt
gagacttctg 4500atacattttt tcagactgca agtgggaaaa atattagtgt
cgccaaagag tcatttaata 4560aaattgtaaa tttctttgat cagaaaccag
aagaattgca taacttttcc ttaaattctg 4620aattacattc tgacataaga
aagaacaaaa tggacattct aagttatgag gaaacagaca 4680tagttaaaca
caaaatactg aaagaaagtg tcccagttgg tactggaaat caactagtga
4740ccttccaggg acaacccgaa cgtgatgaaa agatcaaaga acctactcta
ttgggttttc 4800atacagctag cgggaaaaaa gttaaaattg caaaggaatc
tttggacaaa gtgaaaaacc 4860tttttgatga aaaagagcaa ggtactagtg
aaatcaccag ttttagccat caatgggcaa 4920agaccctaaa gtacagagag
gcctgtaaag accttgaatt agcatgtgag accattgaga 4980tcacagctgc
cccaaagtgt aaagaaatgc agaattctct caataatgat aaaaaccttg
5040tttctattga gactgtggtg ccacctaagc tcttaagtga taatttatgt
agacaaactg 5100aaaatctcaa aacatcaaaa agtatctttt tgaaagttaa
agtacatgaa aatgtagaaa 5160aagaaacagc aaaaagtcct gcaacttgtt
acacaaatca gtccccttat tcagtcattg 5220aaaattcagc cttagctttt
tacacaagtt gtagtagaaa aacttctgtg agtcagactt 5280cattacttga
agcaaaaaaa tggcttagag aaggaatatt tgatggtcaa ccagaaagaa
5340taaatactgc agattatgta ggaaattatt tgtatgaaaa taattcaaac
agtactatag 5400ctgaaaatga caaaaatcat ctctccgaaa aacaagatac
ttatttaagt aacagtagca 5460tgtctaacag ctattcctac cattctgatg
aggtatataa tgattcagga tatctctcaa 5520aaaataaact tgattctggt
attgagccag tattgaagaa tgttgaagat caaaaaaaca 5580ctagtttttc
caaagtaata tccaatgtaa aagatgcaaa tgcataccca caaactgtaa
5640atgaagatat ttgcgttgag gaacttgtga ctagctcttc accctgcaaa
aataaaaatg 5700cagccattaa attgtccata tctaatagta ataattttga
ggtagggcca cctgcattta 5760ggatagccag tggtaaaatc gtttgtgttt
cacatgaaac aattaaaaaa gtgaaagaca 5820tatttacaga cagtttcagt
aaagtaatta aggaaaacaa cgagaataaa tcaaaaattt 5880gccaaacgaa
aattatggca ggttgttacg aggcattgga tgattcagag gatattcttc
5940ataactctct agataatgat gaatgtagca cgcattcaca taaggttttt
gctgacattc 6000agagtgaaga aattttacaa cataaccaaa atatgtctgg
attggagaaa gtttctaaaa 6060tatcaccttg tgatgttagt ttggaaactt
cagatatatg taaatgtagt atagggaagc 6120ttcataagtc agtctcatct
gcaaatactt gtgggatttt tagcacagca agtggaaaat 6180ctgtccaggt
atcagatgct tcattacaaa acgcaagaca agtgttttct gaaatagaag
6240atagtaccaa gcaagtcttt tccaaagtat tgtttaaaag taacgaacat
tcagaccagc 6300tcacaagaga agaaaatact gctatacgta ctccagaaca
tttaatatcc caaaaaggct 6360tttcatataa tgtggtaaat tcatctgctt
tctctggatt tagtacagca agtggaaagc 6420aagtttccat tttagaaagt
tccttacaca aagttaaggg agtgttagag gaatttgatt 6480taatcagaac
tgagcatagt cttcactatt cacctacgtc tagacaaaat gtatcaaaaa
6540tacttcctcg tgttgataag agaaacccag agcactgtgt aaactcagaa
atggaaaaaa 6600cctgcagtaa agaatttaaa ttatcaaata acttaaatgt
tgaaggtggt tcttcagaaa 6660ataatcactc tattaaagtt tctccatatc
tctctcaatt tcaacaagac aaacaacagt 6720tggtattagg aaccaaagtg
tcacttgttg agaacattca tgttttggga aaagaacagg 6780cttcacctaa
aaacgtaaaa atggaaattg gtaaaactga aactttttct gatgttcctg
6840tgaaaacaaa tatagaagtt tgttctactt actccaaaga ttcagaaaac
tactttgaaa 6900cagaagcagt agaaattgct aaagctttta tggaagatga
tgaactgaca gattctaaac 6960tgccaagtca tgccacacat tctcttttta
catgtcccga aaatgaggaa atggttttgt 7020caaattcaag aattggaaaa
agaagaggag agccccttat cttagtggga gaaccctcaa 7080tcaaaagaaa
cttattaaat gaatttgaca ggataataga aaatcaagaa aaatccttaa
7140aggcttcaaa aagcactcca gatggcacaa taaaagatcg aagattgttt
atgcatcatg 7200tttctttaga gccgattacc tgtgtaccct ttcgcacaac
taaggaacgt caagagatac 7260agaatccaaa ttttaccgca cctggtcaag
aatttctgtc taaatctcat ttgtatgaac 7320atctgacttt ggaaaaatct
tcaagcaatt tagcagtttc aggacatcca ttttatcaag 7380tttctgctac
aagaaatgaa aaaatgagac acttgattac tacaggcaga ccaaccaaag
7440tctttgttcc accttttaaa actaaatcac attttcacag agttgaacag
tgtgttagga 7500atattaactt ggaggaaaac agacaaaagc aaaacattga
tggacatggc tctgatgata 7560gtaaaaataa gattaatgac aatgagattc
atcagtttaa caaaaacaac tccaatcaag 7620cagcagctgt aactttcaca
aagtgtgaag aagaaccttt agatttaatt acaagtcttc 7680agaatgccag
agatatacag gatatgcgaa ttaagaagaa acaaaggcaa cgcgtctttc
7740cacagccagg cagtctgtat cttgcaaaaa catccactct gcctcgaatc
tctctgaaag 7800cagcagtagg aggccaagtt ccctctgcgt gttctcataa
acagctgtat acgtatggcg 7860tttctaaaca ttgcataaaa attaacagca
aaaatgcaga gtcttttcag tttcacactg 7920aagattattt tggtaaggaa
agtttatgga ctggaaaagg aatacagttg gctgatggtg 7980gatggctcat
accctccaat gatggaaagg ctggaaaaga agaattttat agggctctgt
8040gtgacactcc aggtgtggat ccaaagctta tttctagaat ttgggtttat
aatcactata 8100gatggatcat atggaaactg gcagctatgg aatgtgcctt
tcctaaggaa tttgctaata 8160gatgcctaag cccagaaagg gtgcttcttc
aactaaaata cagatatgat acggaaattg 8220atagaagcag aagatcggct
ataaaaaaga taatggaaag ggatgacaca gctgcaaaaa 8280cacttgttct
ctgtgtttct gacataattt cattgagcgc aaatatatct gaaacttcta
8340gcaataaaac tagtagtgca gatacccaaa aagtggccat tattgaactt
acagatgggt 8400ggtatgctgt taaggcccag ttagatcctc ccctcttagc
tgtcttaaag aatggcagac 8460tgacagttgg tcagaagatt attcttcatg
gagcagaact ggtgggctct cctgatgcct 8520gtacacctct tgaagcccca
gaatctctta tgttaaagat ttctgctaac agtactcggc 8580ctgctcgctg
gtataccaaa cttggattct ttcctgaccc tagacctttt cctctgccct
8640tatcatcgct tttcagtgat ggaggaaatg ttggttgtgt tgatgtaatt
attcaaagag 8700cataccctat acagtggatg gagaagacat catctggatt
atacatattt cgcaatgaaa 8760gagaggaaga aaaggaagca gcaaaatatg
tggaggccca acaaaagaga ctagaagcct 8820tattcactaa aattcaggag
gaatttgaag aacatgaaga aaacacaaca aaaccatatt 8880taccatcacg
tgcactaaca agacagcaag ttcgtgcttt gcaagatggt gcagagcttt
8940atgaagcagt gaagaatgca gcagacccag cttaccttga gggttatttc
agtgaagagc 9000agttaagagc cttgaataat cacaggcaaa tgttgaatga
taagaaacaa gctcagatcc 9060agttggaaat taggaaggcc atggaatctg
ctgaacaaaa ggaacaaggt ttatcaaggg 9120atgtcacaac cgtgtggaag
ttgcgtattg taagctattc aaaaaaagaa aaagattcag 9180ttatactgag
tatttggcgt ccatcatcag atttatattc tctgttaaca gaaggaaaga
9240gatacagaat ttatcatctt gcaacttcaa aatctaaaag taaatctgaa
agagctaaca 9300tacagttagc agcgacaaaa aaaactcagt atcaacaact
accggtttca gatgaaattt 9360tatttcagat ttaccagcca cgggagcccc
ttcacttcag caaattttta gatccagact 9420ttcagccatc ttgttctgag
gtggacctaa taggatttgt cgtttctgtt gtgaaaaaaa 9480caggacttgc
ccctttcgtc tatttgtcag acgaatgtta caatttactg gcaataaagt
9540tttggataga ccttaatgag gacattatta agcctcatat gttaattgct
gcaagcaacc 9600tccagtggcg accagaatcc aaatcaggcc ttcttacttt
atttgctgga gatttttctg 9660tgttttctgc tagtccaaaa gagggccact
ttcaagagac attcaacaaa atgaaaaata 9720ctgttgagaa tattgacata
ctttgcaatg aagcagaaaa caagcttatg catatactgc 9780atgcaaatga
tcccaagtgg tccaccccaa ctaaagactg tacttcaggg ccgtacactg
9840ctcaaatcat tcctggtaca ggaaacaagc ttctgatgtc ttctcctaat
tgtgagatat 9900attatcaaag tcctttatca ctttgtatgg ccaaaaggaa
gtctgtttcc acacctgtct 9960cagcccagat gacttcaaag tcttgtaaag
gggagaaaga gattgatgac caaaagaact 10020gcaaaaagag aagagccttg
gatttcttga gtagactgcc tttacctcca cctgttagtc 10080ccatttgtac
atttgtttct ccggctgcac agaaggcatt tcagccacca aggagttgtg
10140gcaccaaata cgaaacaccc ataaagaaaa aagaactgaa ttctcctcag
atgactccat 10200ttaaaaaatt caatgaaatt tctcttttgg aaagtaattc
aatagctgac gaagaacttg 10260cattgataaa tacccaagct cttttgtctg
gttcaacagg agaaaaacaa tttatatctg 10320tcagtgaatc cactaggact
gctcccacca gttcagaaga ttatctcaga ctgaaacgac 10380gttgtactac
atctctgatc aaagaacagg agagttccca ggccagtacg gaagaatgtg
10440agaaaaataa gcaggacaca attacaacta aaaaatatat ctaagcattt
gcaaaggcga 10500caataaatta ttgacgctta acctttccag tttataagac
tggaatataa tttcaaacca 10560cacattagta cttatgttgc acaatgagaa
aagaaattag tttcaaattt acctcagcgt 10620ttgtgtatcg ggcaaaaatc
gttttgcccg attccgtatt ggtatacttt tgcttcagtt 10680gcatatctta
aaactaaatg taatttatta actaatcaag aaaaacatct ttggctgagc
10740tcggtggctc atgcctgtaa tcccaacact ttgagaagct gaggtgggag
gagtgcttga 10800ggccaggagt tcaagaccag cctgggcaac atagggagac
ccccatcttt acaaagaaaa 10860aaaaaagggg aaaagaaaat cttttaaatc
tttggatttg atcactacaa gtattatttt 10920acaagtgaaa taaacatacc
attttctttt agattgtgtc attaaatgga atgaggtctc 10980ttagtacagt
tattttgatg cagataattc cttttagttt agctactatt ttaggggatt
11040ttttttagag gtaactcact atgaaatagt tctccttaat gcaaatatgt
tggttctgct 11100atagttccat cctgttcaaa agtcaggatg aatatgaaga
gtggtgtttc cttttgagca 11160attcttcatc cttaagtcag catgattata
agaaaaatag aaccctcagt gtaactctaa 11220ttccttttta ctattccagt
gtgatctctg aaattaaatt acttcaacta aaaattcaaa 11280tactttaaat
cagaagattt catagttaat ttattttttt tttcaacaaa atggtcatcc
11340aaactcaaac ttgagaaaat atcttgcttt caaattggca ctgatt
11386
50 3418 PRT Homo sapiens
50 Met Pro Ile Gly Ser Lys Glu Arg Pro
Thr Phe Phe Glu Ile Phe Lys 1 5 10 15 Thr Arg Cys Asn Lys Ala Asp
Leu Gly Pro Ile Ser Leu Asn Trp Phe 20 25 30 Glu Glu Leu Ser Ser
Glu Ala Pro Pro Tyr Asn Ser Glu Pro Ala Glu 35 40 45 Glu Ser Glu
His Lys Asn Asn Asn Tyr Glu Pro Asn Leu Phe Lys Thr 50 55 60 Pro
Gln Arg Lys Pro Ser Tyr Asn Gln Leu Ala Ser Thr Pro Ile Ile 65 70
75 80 Phe Lys Glu Gln Gly Leu Thr Leu Pro Leu Tyr Gln Ser Pro Val
Lys 85 90 95 Glu Leu Asp Lys Phe Lys Leu Asp Leu Gly Arg Asn Val
Pro Asn Ser 100 105 110 Arg His Lys Ser Leu Arg Thr Val Lys Thr Lys
Met Asp Gln Ala Asp 115 120 125 Asp Val Ser Cys Pro Leu Leu Asn Ser
Cys Leu Ser Glu Ser Pro Val 130 135 140 Val Leu Gln Cys Thr His Val
Thr Pro Gln Arg Asp Lys Ser Val Val 145 150 155 160 Cys Gly Ser Leu
Phe His Thr Pro Lys Phe Val Lys Gly Arg Gln Thr 165 170 175 Pro Lys
His Ile Ser Glu Ser Leu Gly Ala Glu Val Asp Pro Asp Met 180 185 190
Ser Trp Ser Ser Ser Leu Ala Thr Pro Pro Thr Leu Ser Ser Thr Val 195
200 205 Leu Ile Val Arg Asn Glu Glu Ala Ser Glu Thr Val Phe Pro His
Asp 210 215 220 Thr Thr Ala Asn Val Lys Ser Tyr Phe Ser Asn His Asp
Glu Ser Leu 225 230 235 240 Lys Lys Asn Asp Arg Phe Ile Ala Ser Val
Thr Asp Ser Glu Asn Thr 245 250 255 Asn Gln Arg Glu Ala Ala Ser His
Gly Phe Gly Lys Thr Ser Gly Asn 260 265 270 Ser Phe Lys Val Asn Ser
Cys Lys Asp His Ile Gly Lys Ser Met Pro 275 280 285 Asn Val Leu Glu
Asp Glu Val Tyr Glu Thr Val Val Asp Thr Ser Glu 290 295 300 Glu Asp
Ser Phe Ser Leu Cys Phe Ser Lys Cys Arg Thr Lys Asn Leu 305 310 315
320 Gln Lys Val Arg Thr Ser Lys Thr Arg Lys Lys Ile Phe His Glu Ala
325 330 335 Asn Ala Asp Glu Cys Glu Lys Ser Lys Asn Gln Val Lys Glu
Lys Tyr 340 345 350 Ser Phe Val Ser Glu Val Glu Pro Asn Asp Thr Asp
Pro Leu Asp Ser 355 360 365 Asn Val Ala Asn Gln Lys Pro Phe Glu Ser
Gly Ser Asp Lys Ile Ser 370 375 380 Lys Glu Val Val Pro Ser Leu Ala
Cys Glu Trp Ser Gln Leu Thr Leu 385 390 395 400 Ser Gly Leu Asn Gly
Ala Gln Met Glu Lys Ile Pro Leu Leu His Ile
405 410 415 Ser Ser Cys Asp Gln Asn Ile Ser Glu Lys Asp Leu Leu Asp
Thr Glu 420 425 430 Asn Lys Arg Lys Lys Asp Phe Leu Thr Ser Glu Asn
Ser Leu Pro Arg 435 440 445 Ile Ser Ser Leu Pro Lys Ser Glu Lys Pro
Leu Asn Glu Glu Thr Val 450 455 460 Val Asn Lys Arg Asp Glu Glu Gln
His Leu Glu Ser His Thr Asp Cys 465 470 475 480 Ile Leu Ala Val Lys
Gln Ala Ile Ser Gly Thr Ser Pro Val Ala Ser 485 490 495 Ser Phe Gln
Gly Ile Lys Lys Ser Ile Phe Arg Ile Arg Glu Ser Pro 500 505 510 Lys
Glu Thr Phe Asn Ala Ser Phe Ser Gly His Met Thr Asp Pro Asn 515 520
525 Phe Lys Lys Glu Thr Glu Ala Ser Glu Ser Gly Leu Glu Ile His Thr
530 535 540 Val Cys Ser Gln Lys Glu Asp Ser Leu Cys Pro Asn Leu Ile
Asp Asn 545 550 555 560 Gly Ser Trp Pro Ala Thr Thr Thr Gln Asn Ser
Val Ala Leu Lys Asn 565 570 575 Ala Gly Leu Ile Ser Thr Leu Lys Lys
Lys Thr Asn Lys Phe Ile Tyr 580 585 590 Ala Ile His Asp Glu Thr Ser
Tyr Lys Gly Lys Lys Ile Pro Lys Asp 595 600 605 Gln Lys Ser Glu Leu
Ile Asn Cys Ser Ala Gln Phe Glu Ala Asn Ala 610 615 620 Phe Glu Ala
Pro Leu Thr Phe Ala Asn Ala Asp Ser Gly Leu Leu His 625 630 635 640
Ser Ser Val Lys Arg Ser Cys Ser Gln Asn Asp Ser Glu Glu Pro Thr 645
650 655 Leu Ser Leu Thr Ser Ser Phe Gly Thr Ile Leu Arg Lys Cys Ser
Arg 660 665 670 Asn Glu Thr Cys Ser Asn Asn Thr Val Ile Ser Gln Asp
Leu Asp Tyr 675 680 685 Lys Glu Ala Lys Cys Asn Lys Glu Lys Leu Gln
Leu Phe Ile Thr Pro 690 695 700 Glu Ala Asp Ser Leu Ser Cys Leu Gln
Glu Gly Gln Cys Glu Asn Asp 705 710 715 720 Pro Lys Ser Lys Lys Val
Ser Asp Ile Lys Glu Glu Val Leu Ala Ala 725 730 735 Ala Cys His Pro
Val Gln His Ser Lys Val Glu Tyr Ser Asp Thr Asp 740 745 750 Phe Gln
Ser Gln Lys Ser Leu Leu Tyr Asp His Glu Asn Ala Ser Thr 755 760 765
Leu Ile Leu Thr Pro Thr Ser Lys Asp Val Leu Ser Asn Leu Val Met 770
775 780 Ile Ser Arg Gly Lys Glu Ser Tyr Lys Met Ser Asp Lys Leu Lys
Gly 785 790 795 800 Asn Asn Tyr Glu Ser Asp Val Glu Leu Thr Lys Asn
Ile Pro Met Glu 805 810 815 Lys Asn Gln Asp Val Cys Ala Leu Asn Glu
Asn Tyr Lys Asn Val Glu 820 825 830 Leu Leu Pro Pro Glu Lys Tyr Met
Arg Val Ala Ser Pro Ser Arg Lys 835 840 845 Val Gln Phe Asn Gln Asn
Thr Asn Leu Arg Val Ile Gln Lys Asn Gln 850 855 860 Glu Glu Thr Thr
Ser Ile Ser Lys Ile Thr Val Asn Pro Asp Ser Glu 865 870 875 880 Glu
Leu Phe Ser Asp Asn Glu Asn Asn Phe Val Phe Gln Val Ala Asn 885 890
895 Glu Arg Asn Asn Leu Ala Leu Gly Asn Thr Lys Glu Leu His Glu Thr
900 905 910 Asp Leu Thr Cys Val Asn Glu Pro Ile Phe Lys Asn Ser Thr
Met Val 915 920 925 Leu Tyr Gly Asp Thr Gly Asp Lys Gln Ala Thr Gln
Val Ser Ile Lys 930 935 940 Lys Asp Leu Val Tyr Val Leu Ala Glu Glu
Asn Lys Asn Ser Val Lys 945 950 955 960 Gln His Ile Lys Met Thr Leu
Gly Gln Asp Leu Lys Ser Asp Ile Ser 965 970 975 Leu Asn Ile Asp Lys
Ile Pro Glu Lys Asn Asn Asp Tyr Met Asn Lys 980 985 990 Trp Ala Gly
Leu Leu Gly Pro Ile Ser Asn His Ser Phe Gly Gly Ser 995 1000 1005
Phe Arg Thr Ala Ser Asn Lys Glu Ile Lys Leu Ser Glu His Asn 1010
1015 1020 Ile Lys Lys Ser Lys Met Phe Phe Lys Asp Ile Glu Glu Gln
Tyr 1025 1030 1035 Pro Thr Ser Leu Ala Cys Val Glu Ile Val Asn Thr
Leu Ala Leu 1040 1045 1050 Asp Asn Gln Lys Lys Leu Ser Lys Pro Gln
Ser Ile Asn Thr Val 1055 1060 1065 Ser Ala His Leu Gln Ser Ser Val
Val Val Ser Asp Cys Lys Asn 1070 1075 1080 Ser His Ile Thr Pro Gln
Met Leu Phe Ser Lys Gln Asp Phe Asn 1085 1090 1095 Ser Asn His Asn
Leu Thr Pro Ser Gln Lys Ala Glu Ile Thr Glu 1100 1105 1110 Leu Ser
Thr Ile Leu Glu Glu Ser Gly Ser Gln Phe Glu Phe Thr 1115 1120 1125
Gln Phe Arg Lys Pro Ser Tyr Ile Leu Gln Lys Ser Thr Phe Glu 1130
1135 1140 Val Pro Glu Asn Gln Met Thr Ile Leu Lys Thr Thr Ser Glu
Glu 1145 1150 1155 Cys Arg Asp Ala Asp Leu His Val Ile Met Asn Ala
Pro Ser Ile 1160 1165 1170 Gly Gln Val Asp Ser Ser Lys Gln Phe Glu
Gly Thr Val Glu Ile 1175 1180 1185 Lys Arg Lys Phe Ala Gly Leu Leu
Lys Asn Asp Cys Asn Lys Ser 1190 1195 1200 Ala Ser Gly Tyr Leu Thr
Asp Glu Asn Glu Val Gly Phe Arg Gly 1205 1210 1215 Phe Tyr Ser Ala
His Gly Thr Lys Leu Asn Val Ser Thr Glu Ala 1220 1225 1230 Leu Gln
Lys Ala Val Lys Leu Phe Ser Asp Ile Glu Asn Ile Ser 1235 1240 1245
Glu Glu Thr Ser Ala Glu Val His Pro Ile Ser Leu Ser Ser Ser 1250
1255 1260 Lys Cys His Asp Ser Val Val Ser Met Phe Lys Ile Glu Asn
His 1265 1270 1275 Asn Asp Lys Thr Val Ser Glu Lys Asn Asn Lys Cys
Gln Leu Ile 1280 1285 1290 Leu Gln Asn Asn Ile Glu Met Thr Thr Gly
Thr Phe Val Glu Glu 1295 1300 1305 Ile Thr Glu Asn Tyr Lys Arg Asn
Thr Glu Asn Glu Asp Asn Lys 1310 1315 1320 Tyr Thr Ala Ala Ser Arg
Asn Ser His Asn Leu Glu Phe Asp Gly 1325 1330 1335 Ser Asp Ser Ser
Lys Asn Asp Thr Val Cys Ile His Lys Asp Glu 1340 1345 1350 Thr Asp
Leu Leu Phe Thr Asp Gln His Asn Ile Cys Leu Lys Leu 1355 1360 1365
Ser Gly Gln Phe Met Lys Glu Gly Asn Thr Gln Ile Lys Glu Asp 1370
1375 1380 Leu Ser Asp Leu Thr Phe Leu Glu Val Ala Lys Ala Gln Glu
Ala 1385 1390 1395 Cys His Gly Asn Thr Ser Asn Lys Glu Gln Leu Thr
Ala Thr Lys 1400 1405 1410 Thr Glu Gln Asn Ile Lys Asp Phe Glu Thr
Ser Asp Thr Phe Phe 1415 1420 1425 Gln Thr Ala Ser Gly Lys Asn Ile
Ser Val Ala Lys Glu Ser Phe 1430 1435 1440 Asn Lys Ile Val Asn Phe
Phe Asp Gln Lys Pro Glu Glu Leu His 1445 1450 1455 Asn Phe Ser Leu
Asn Ser Glu Leu His Ser Asp Ile Arg Lys Asn 1460 1465 1470 Lys Met
Asp Ile Leu Ser Tyr Glu Glu Thr Asp Ile Val Lys His 1475 1480 1485
Lys Ile Leu Lys Glu Ser Val Pro Val Gly Thr Gly Asn Gln Leu 1490
1495 1500 Val Thr Phe Gln Gly Gln Pro Glu Arg Asp Glu Lys Ile Lys
Glu 1505 1510 1515 Pro Thr Leu Leu Gly Phe His Thr Ala Ser Gly Lys
Lys Val Lys 1520 1525 1530 Ile Ala Lys Glu Ser Leu Asp Lys Val Lys
Asn Leu Phe Asp Glu 1535 1540 1545 Lys Glu Gln Gly Thr Ser Glu Ile
Thr Ser Phe Ser His Gln Trp 1550 1555 1560 Ala Lys Thr Leu Lys Tyr
Arg Glu Ala Cys Lys Asp Leu Glu Leu 1565 1570 1575 Ala Cys Glu Thr
Ile Glu Ile Thr Ala Ala Pro Lys Cys Lys Glu 1580 1585 1590 Met Gln
Asn Ser Leu Asn Asn Asp Lys Asn Leu Val Ser Ile Glu 1595 1600 1605
Thr Val Val Pro Pro Lys Leu Leu Ser Asp Asn Leu Cys Arg Gln 1610
1615 1620 Thr Glu Asn Leu Lys Thr Ser Lys Ser Ile Phe Leu Lys Val
Lys 1625 1630 1635 Val His Glu Asn Val Glu Lys Glu Thr Ala Lys Ser
Pro Ala Thr 1640 1645 1650 Cys Tyr Thr Asn Gln Ser Pro Tyr Ser Val
Ile Glu Asn Ser Ala 1655 1660 1665 Leu Ala Phe Tyr Thr Ser Cys Ser
Arg Lys Thr Ser Val Ser Gln 1670 1675 1680 Thr Ser Leu Leu Glu Ala
Lys Lys Trp Leu Arg Glu Gly Ile Phe 1685 1690 1695 Asp Gly Gln Pro
Glu Arg Ile Asn Thr Ala Asp Tyr Val Gly Asn 1700 1705 1710 Tyr Leu
Tyr Glu Asn Asn Ser Asn Ser Thr Ile Ala Glu Asn Asp 1715 1720 1725
Lys Asn His Leu Ser Glu Lys Gln Asp Thr Tyr Leu Ser Asn Ser 1730
1735 1740 Ser Met Ser Asn Ser Tyr Ser Tyr His Ser Asp Glu Val Tyr
Asn 1745 1750 1755 Asp Ser Gly Tyr Leu Ser Lys Asn Lys Leu Asp Ser
Gly Ile Glu 1760 1765 1770 Pro Val Leu Lys Asn Val Glu Asp Gln Lys
Asn Thr Ser Phe Ser 1775 1780 1785 Lys Val Ile Ser Asn Val Lys Asp
Ala Asn Ala Tyr Pro Gln Thr 1790 1795 1800 Val Asn Glu Asp Ile Cys
Val Glu Glu Leu Val Thr Ser Ser Ser 1805 1810 1815 Pro Cys Lys Asn
Lys Asn Ala Ala Ile Lys Leu Ser Ile Ser Asn 1820 1825 1830 Ser Asn
Asn Phe Glu Val Gly Pro Pro Ala Phe Arg Ile Ala Ser 1835 1840 1845
Gly Lys Ile Val Cys Val Ser His Glu Thr Ile Lys Lys Val Lys 1850
1855 1860 Asp Ile Phe Thr Asp Ser Phe Ser Lys Val Ile Lys Glu Asn
Asn 1865 1870 1875 Glu Asn Lys Ser Lys Ile Cys Gln Thr Lys Ile Met
Ala Gly Cys 1880 1885 1890 Tyr Glu Ala Leu Asp Asp Ser Glu Asp Ile
Leu His Asn Ser Leu 1895 1900 1905 Asp Asn Asp Glu Cys Ser Thr His
Ser His Lys Val Phe Ala Asp 1910 1915 1920 Ile Gln Ser Glu Glu Ile
Leu Gln His Asn Gln Asn Met Ser Gly 1925 1930 1935 Leu Glu Lys Val
Ser Lys Ile Ser Pro Cys Asp Val Ser Leu Glu 1940 1945 1950 Thr Ser
Asp Ile Cys Lys Cys Ser Ile Gly Lys Leu His Lys Ser 1955 1960 1965
Val Ser Ser Ala Asn Thr Cys Gly Ile Phe Ser Thr Ala Ser Gly 1970
1975 1980 Lys Ser Val Gln Val Ser Asp Ala Ser Leu Gln Asn Ala Arg
Gln 1985 1990 1995 Val Phe Ser Glu Ile Glu Asp Ser Thr Lys Gln Val
Phe Ser Lys 2000 2005 2010 Val Leu Phe Lys Ser Asn Glu His Ser Asp
Gln Leu Thr Arg Glu 2015 2020 2025 Glu Asn Thr Ala Ile Arg Thr Pro
Glu His Leu Ile Ser Gln Lys 2030 2035 2040 Gly Phe Ser Tyr Asn Val
Val Asn Ser Ser Ala Phe Ser Gly Phe 2045 2050 2055 Ser Thr Ala Ser
Gly Lys Gln Val Ser Ile Leu Glu Ser Ser Leu 2060 2065 2070 His Lys
Val Lys Gly Val Leu Glu Glu Phe Asp Leu Ile Arg Thr 2075 2080 2085
Glu His Ser Leu His Tyr Ser Pro Thr Ser Arg Gln Asn Val Ser 2090
2095 2100 Lys Ile Leu Pro Arg Val Asp Lys Arg Asn Pro Glu His Cys
Val 2105 2110 2115 Asn Ser Glu Met Glu Lys Thr Cys Ser Lys Glu Phe
Lys Leu Ser 2120 2125 2130 Asn Asn Leu Asn Val Glu Gly Gly Ser Ser
Glu Asn Asn His Ser 2135 2140 2145 Ile Lys Val Ser Pro Tyr Leu Ser
Gln Phe Gln Gln Asp Lys Gln 2150 2155 2160 Gln Leu Val Leu Gly Thr
Lys Val Ser Leu Val Glu Asn Ile His 2165 2170 2175 Val Leu Gly Lys
Glu Gln Ala Ser Pro Lys Asn Val Lys Met Glu 2180 2185 2190 Ile Gly
Lys Thr Glu Thr Phe Ser Asp Val Pro Val Lys Thr Asn 2195 2200 2205
Ile Glu Val Cys Ser Thr Tyr Ser Lys Asp Ser Glu Asn Tyr Phe 2210
2215 2220 Glu Thr Glu Ala Val Glu Ile Ala Lys Ala Phe Met Glu Asp
Asp 2225 2230 2235 Glu Leu Thr Asp Ser Lys Leu Pro Ser His Ala Thr
His Ser Leu 2240 2245 2250 Phe Thr Cys Pro Glu Asn Glu Glu Met Val
Leu Ser Asn Ser Arg 2255 2260 2265 Ile Gly Lys Arg Arg Gly Glu Pro
Leu Ile Leu Val Gly Glu Pro 2270 2275 2280 Ser Ile Lys Arg Asn Leu
Leu Asn Glu Phe Asp Arg Ile Ile Glu 2285 2290 2295 Asn Gln Glu Lys
Ser Leu Lys Ala Ser Lys Ser Thr Pro Asp Gly 2300 2305 2310 Thr Ile
Lys Asp Arg Arg Leu Phe Met His His Val Ser Leu Glu 2315 2320 2325
Pro Ile Thr Cys Val Pro Phe Arg Thr Thr Lys Glu Arg Gln Glu 2330
2335 2340 Ile Gln Asn Pro Asn Phe Thr Ala Pro Gly Gln Glu Phe Leu
Ser 2345 2350 2355 Lys Ser His Leu Tyr Glu His Leu Thr Leu Glu Lys
Ser Ser Ser 2360 2365 2370 Asn Leu Ala Val Ser Gly His Pro Phe Tyr
Gln Val Ser Ala Thr 2375 2380 2385 Arg Asn Glu Lys Met Arg His Leu
Ile Thr Thr Gly Arg Pro Thr 2390 2395 2400 Lys Val Phe Val Pro Pro
Phe Lys Thr Lys Ser His Phe His Arg 2405 2410 2415 Val Glu Gln Cys
Val Arg Asn Ile Asn Leu Glu Glu Asn Arg Gln 2420 2425 2430 Lys Gln
Asn Ile Asp Gly His Gly Ser Asp Asp Ser Lys Asn Lys 2435 2440 2445
Ile Asn Asp Asn Glu Ile His Gln Phe Asn Lys Asn Asn Ser Asn 2450
2455 2460 Gln Ala Ala Ala Val Thr Phe Thr Lys Cys Glu Glu Glu Pro
Leu 2465 2470 2475 Asp Leu Ile Thr Ser Leu Gln Asn Ala Arg Asp Ile
Gln Asp Met 2480 2485 2490 Arg Ile Lys Lys Lys Gln Arg Gln Arg Val
Phe Pro Gln Pro Gly 2495 2500 2505 Ser Leu Tyr Leu Ala Lys Thr Ser
Thr Leu Pro Arg Ile Ser Leu 2510 2515 2520 Lys Ala Ala Val Gly Gly
Gln Val Pro Ser Ala Cys Ser His Lys 2525 2530 2535 Gln Leu Tyr Thr
Tyr Gly Val Ser Lys His Cys Ile Lys Ile Asn 2540 2545 2550 Ser Lys
Asn Ala Glu Ser Phe Gln Phe His Thr Glu Asp Tyr Phe 2555 2560 2565
Gly Lys Glu Ser Leu Trp Thr Gly Lys Gly Ile Gln Leu Ala Asp 2570
2575 2580 Gly Gly Trp Leu Ile Pro Ser Asn Asp Gly Lys Ala Gly Lys
Glu 2585 2590 2595 Glu Phe Tyr Arg Ala Leu Cys Asp Thr Pro Gly Val
Asp Pro Lys 2600 2605 2610 Leu Ile Ser Arg Ile Trp Val Tyr Asn His
Tyr Arg Trp Ile Ile 2615 2620 2625 Trp Lys Leu Ala Ala Met Glu
Cys Ala Phe Pro Lys Glu Phe Ala 2630 2635 2640 Asn Arg Cys Leu Ser
Pro Glu Arg Val Leu Leu Gln Leu Lys Tyr 2645 2650 2655 Arg Tyr Asp
Thr Glu Ile Asp Arg Ser Arg Arg Ser Ala Ile Lys 2660 2665 2670 Lys
Ile Met Glu Arg Asp Asp Thr Ala Ala Lys Thr Leu Val Leu 2675 2680
2685 Cys Val Ser Asp Ile Ile Ser Leu Ser Ala Asn Ile Ser Glu Thr
2690 2695 2700 Ser Ser Asn Lys Thr Ser Ser Ala Asp Thr Gln Lys Val
Ala Ile 2705 2710 2715 Ile Glu Leu Thr Asp Gly Trp Tyr Ala Val Lys
Ala Gln Leu Asp 2720 2725 2730 Pro Pro Leu Leu Ala Val Leu Lys Asn
Gly Arg Leu Thr Val Gly 2735 2740 2745 Gln Lys Ile Ile Leu His Gly
Ala Glu Leu Val Gly Ser Pro Asp 2750 2755 2760 Ala Cys Thr Pro Leu
Glu Ala Pro Glu Ser Leu Met Leu Lys Ile 2765 2770 2775 Ser Ala Asn
Ser Thr Arg Pro Ala Arg Trp Tyr Thr Lys Leu Gly 2780 2785 2790 Phe
Phe Pro Asp Pro Arg Pro Phe Pro Leu Pro Leu Ser Ser Leu 2795 2800
2805 Phe Ser Asp Gly Gly Asn Val Gly Cys Val Asp Val Ile Ile Gln
2810 2815 2820 Arg Ala Tyr Pro Ile Gln Trp Met Glu Lys Thr Ser Ser
Gly Leu 2825 2830 2835 Tyr Ile Phe Arg Asn Glu Arg Glu Glu Glu Lys
Glu Ala Ala Lys 2840 2845 2850 Tyr Val Glu Ala Gln Gln Lys Arg Leu
Glu Ala Leu Phe Thr Lys 2855 2860 2865 Ile Gln Glu Glu Phe Glu Glu
His Glu Glu Asn Thr Thr Lys Pro 2870 2875 2880 Tyr Leu Pro Ser Arg
Ala Leu Thr Arg Gln Gln Val Arg Ala Leu 2885 2890 2895 Gln Asp Gly
Ala Glu Leu Tyr Glu Ala Val Lys Asn Ala Ala Asp 2900 2905 2910 Pro
Ala Tyr Leu Glu Gly Tyr Phe Ser Glu Glu Gln Leu Arg Ala 2915 2920
2925 Leu Asn Asn His Arg Gln Met Leu Asn Asp Lys Lys Gln Ala Gln
2930 2935 2940 Ile Gln Leu Glu Ile Arg Lys Ala Met Glu Ser Ala Glu
Gln Lys 2945 2950 2955 Glu Gln Gly Leu Ser Arg Asp Val Thr Thr Val
Trp Lys Leu Arg 2960 2965 2970 Ile Val Ser Tyr Ser Lys Lys Glu Lys
Asp Ser Val Ile Leu Ser 2975 2980 2985 Ile Trp Arg Pro Ser Ser Asp
Leu Tyr Ser Leu Leu Thr Glu Gly 2990 2995 3000 Lys Arg Tyr Arg Ile
Tyr His Leu Ala Thr Ser Lys Ser Lys Ser 3005 3010 3015 Lys Ser Glu
Arg Ala Asn Ile Gln Leu Ala Ala Thr Lys Lys Thr 3020 3025 3030 Gln
Tyr Gln Gln Leu Pro Val Ser Asp Glu Ile Leu Phe Gln Ile 3035 3040
3045 Tyr Gln Pro Arg Glu Pro Leu His Phe Ser Lys Phe Leu Asp Pro
3050 3055 3060 Asp Phe Gln Pro Ser Cys Ser Glu Val Asp Leu Ile Gly
Phe Val 3065 3070 3075 Val Ser Val Val Lys Lys Thr Gly Leu Ala Pro
Phe Val Tyr Leu 3080 3085 3090 Ser Asp Glu Cys Tyr Asn Leu Leu Ala
Ile Lys Phe Trp Ile Asp 3095 3100 3105 Leu Asn Glu Asp Ile Ile Lys
Pro His Met Leu Ile Ala Ala Ser 3110 3115 3120 Asn Leu Gln Trp Arg
Pro Glu Ser Lys Ser Gly Leu Leu Thr Leu 3125 3130 3135 Phe Ala Gly
Asp Phe Ser Val Phe Ser Ala Ser Pro Lys Glu Gly 3140 3145 3150 His
Phe Gln Glu Thr Phe Asn Lys Met Lys Asn Thr Val Glu Asn 3155 3160
3165 Ile Asp Ile Leu Cys Asn Glu Ala Glu Asn Lys Leu Met His Ile
3170 3175 3180 Leu His Ala Asn Asp Pro Lys Trp Ser Thr Pro Thr Lys
Asp Cys 3185 3190 3195 Thr Ser Gly Pro Tyr Thr Ala Gln Ile Ile Pro
Gly Thr Gly Asn 3200 3205 3210 Lys Leu Leu Met Ser Ser Pro Asn Cys
Glu Ile Tyr Tyr Gln Ser 3215 3220 3225 Pro Leu Ser Leu Cys Met Ala
Lys Arg Lys Ser Val Ser Thr Pro 3230 3235 3240 Val Ser Ala Gln Met
Thr Ser Lys Ser Cys Lys Gly Glu Lys Glu 3245 3250 3255 Ile Asp Asp
Gln Lys Asn Cys Lys Lys Arg Arg Ala Leu Asp Phe 3260 3265 3270 Leu
Ser Arg Leu Pro Leu Pro Pro Pro Val Ser Pro Ile Cys Thr 3275 3280
3285 Phe Val Ser Pro Ala Ala Gln Lys Ala Phe Gln Pro Pro Arg Ser
3290 3295 3300 Cys Gly Thr Lys Tyr Glu Thr Pro Ile Lys Lys Lys Glu
Leu Asn 3305 3310 3315 Ser Pro Gln Met Thr Pro Phe Lys Lys Phe Asn
Glu Ile Ser Leu 3320 3325 3330 Leu Glu Ser Asn Ser Ile Ala Asp Glu
Glu Leu Ala Leu Ile Asn 3335 3340 3345 Thr Gln Ala Leu Leu Ser Gly
Ser Thr Gly Glu Lys Gln Phe Ile 3350 3355 3360 Ser Val Ser Glu Ser
Thr Arg Thr Ala Pro Thr Ser Ser Glu Asp 3365 3370 3375 Tyr Leu Arg
Leu Lys Arg Arg Cys Thr Thr Ser Leu Ile Lys Glu 3380 3385 3390 Gln
Glu Ser Ser Gln Ala Ser Thr Glu Glu Cys Glu Lys Asn Lys 3395 3400
3405 Gln Asp Thr Ile Thr Thr Lys Lys Tyr Ile 3410 3415
51 10661 DNA Homo sapiens 51 cgagatcccg gggagccagc ttgctgggag
agcgggacgg tccggagcaa gcccagaggc 60agaggaggcg acagagggaa aaagggccga
gctagccgct ccagtgctgt acaggagccg 120aagggacgca ccacgccagc
cccagcccgg ctccagcgac agccaacgcc tcttgcagcg 180cggcggcttc
gaagccgccg cccggagctg ccctttcctc ttcggtgaag tttttaaaag
240ctgctaaaga ctcggaggaa gcaaggaaag tgcctggtag gactgacggc
tgcctttgtc 300ctcctcctct ccaccccgcc tccccccacc ctgccttccc
cccctccccc gtcttctctc 360ccgcagctgc ctcagtcggc tactctcagc
caacccccct caccaccctt ctccccaccc 420gcccccccgc ccccgtcggc
ccagcgctgc cagcccgagt ttgcagagag gtaactccct 480ttggctgcga
gcgggcgagc tagctgcaca ttgcaaagaa ggctcttagg agccaggcga
540ctggggagcg gcttcagcac tgcagccacg acccgcctgg ttaggctgca
cgcggagaga 600accctctgtt ttcccccact ctctctccac ctcctcctgc
cttccccacc ccgagtgcgg 660agccagagat caaaagatga aaaggcagtc
aggtcttcag tagccaaaaa acaaaacaaa 720caaaaacaaa aaagccgaaa
taaaagaaaa agataataac tcagttctta tttgcaccta 780cttcagtgga
cactgaattt ggaaggtgga ggattttgtt tttttctttt aagatctggg
840catcttttga atctaccctt caagtattaa gagacagact gtgagcctag
cagggcagat 900cttgtccacc gtgtgtcttc ttctgcacga gactttgagg
ctgtcagagc gctttttgcg 960tggttgctcc cgcaagtttc cttctctgga
gcttcccgca ggtgggcagc tagctgcagc 1020gactaccgca tcatcacagc
ctgttgaact cttctgagca agagaagggg aggcggggta 1080agggaagtag
gtggaagatt cagccaagct caaggatgga agtgcagtta gggctgggaa
1140gggtctaccc tcggccgccg tccaagacct accgaggagc tttccagaat
ctgttccaga 1200gcgtgcgcga agtgatccag aacccgggcc ccaggcaccc
agaggccgcg agcgcagcac 1260ctcccggcgc cagtttgctg ctgctgcagc
agcagcagca gcagcagcag cagcagcagc 1320agcagcagca gcagcagcag
cagcagcagc agcaagagac tagccccagg cagcagcagc 1380agcagcaggg
tgaggatggt tctccccaag cccatcgtag aggccccaca ggctacctgg
1440tcctggatga ggaacagcaa ccttcacagc cgcagtcggc cctggagtgc
caccccgaga 1500gaggttgcgt cccagagcct ggagccgccg tggccgccag
caaggggctg ccgcagcagc 1560tgccagcacc tccggacgag gatgactcag
ctgccccatc cacgttgtcc ctgctgggcc 1620ccactttccc cggcttaagc
agctgctccg ctgaccttaa agacatcctg agcgaggcca 1680gcaccatgca
actccttcag caacagcagc aggaagcagt atccgaaggc agcagcagcg
1740ggagagcgag ggaggcctcg ggggctccca cttcctccaa ggacaattac
ttagggggca 1800cttcgaccat ttctgacaac gccaaggagt tgtgtaaggc
agtgtcggtg tccatgggcc 1860tgggtgtgga ggcgttggag catctgagtc
caggggaaca gcttcggggg gattgcatgt 1920acgccccact tttgggagtt
ccacccgctg tgcgtcccac tccttgtgcc ccattggccg 1980aatgcaaagg
ttctctgcta gacgacagcg caggcaagag cactgaagat actgctgagt
2040attccccttt caagggaggt tacaccaaag ggctagaagg cgagagccta
ggctgctctg 2100gcagcgctgc agcagggagc tccgggacac ttgaactgcc
gtctaccctg tctctctaca 2160agtccggagc actggacgag gcagctgcgt
accagagtcg cgactactac aactttccac 2220tggctctggc cggaccgccg
ccccctccgc cgcctcccca tccccacgct cgcatcaagc 2280tggagaaccc
gctggactac ggcagcgcct gggcggctgc ggcggcgcag tgccgctatg
2340gggacctggc gagcctgcat ggcgcgggtg cagcgggacc cggttctggg
tcaccctcag 2400ccgccgcttc ctcatcctgg cacactctct tcacagccga
agaaggccag ttgtatggac 2460cgtgtggtgg tggtgggggt ggtggcggcg
gcggcggcgg cggcggcggc ggcggcggcg 2520gcggcggcgg cggcgaggcg
ggagctgtag ccccctacgg ctacactcgg ccccctcagg 2580ggctggcggg
ccaggaaagc gacttcaccg cacctgatgt gtggtaccct ggcggcatgg
2640tgagcagagt gccctatccc agtcccactt gtgtcaaaag cgaaatgggc
ccctggatgg 2700atagctactc cggaccttac ggggacatgc gtttggagac
tgccagggac catgttttgc 2760ccattgacta ttactttcca ccccagaaga
cctgcctgat ctgtggagat gaagcttctg 2820ggtgtcacta tggagctctc
acatgtggaa gctgcaaggt cttcttcaaa agagccgctg 2880aagggaaaca
gaagtacctg tgcgccagca gaaatgattg cactattgat aaattccgaa
2940ggaaaaattg tccatcttgt cgtcttcgga aatgttatga agcagggatg
actctgggag 3000cccggaagct gaagaaactt ggtaatctga aactacagga
ggaaggagag gcttccagca 3060ccaccagccc cactgaggag acaacccaga
agctgacagt gtcacacatt gaaggctatg 3120aatgtcagcc catctttctg
aatgtcctgg aagccattga gccaggtgta gtgtgtgctg 3180gacacgacaa
caaccagccc gactcctttg cagccttgct ctctagcctc aatgaactgg
3240gagagagaca gcttgtacac gtggtcaagt gggccaaggc cttgcctggc
ttccgcaact 3300tacacgtgga cgaccagatg gctgtcattc agtactcctg
gatggggctc atggtgtttg 3360ccatgggctg gcgatccttc accaatgtca
actccaggat gctctacttc gcccctgatc 3420tggttttcaa tgagtaccgc
atgcacaagt cccggatgta cagccagtgt gtccgaatga 3480ggcacctctc
tcaagagttt ggatggctcc aaatcacccc ccaggaattc ctgtgcatga
3540aagcactgct actcttcagc attattccag tggatgggct gaaaaatcaa
aaattctttg 3600atgaacttcg aatgaactac atcaaggaac tcgatcgtat
cattgcatgc aaaagaaaaa 3660atcccacatc ctgctcaaga cgcttctacc
agctcaccaa gctcctggac tccgtgcagc 3720ctattgcgag agagctgcat
cagttcactt ttgacctgct aatcaagtca cacatggtga 3780gcgtggactt
tccggaaatg atggcagaga tcatctctgt gcaagtgccc aagatccttt
3840ctgggaaagt caagcccatc tatttccaca cccagtgaag cattggaaac
cctatttccc 3900caccccagct catgccccct ttcagatgtc ttctgcctgt
tataactctg cactactcct 3960ctgcagtgcc ttggggaatt tcctctattg
atgtacagtc tgtcatgaac atgttcctga 4020attctatttg ctgggctttt
tttttctctt tctctccttt ctttttcttc ttccctccct 4080atctaaccct
cccatggcac cttcagactt tgcttcccat tgtggctcct atctgtgttt
4140tgaatggtgt tgtatgcctt taaatctgtg atgatcctca tatggcccag
tgtcaagttg 4200tgcttgttta cagcactact ctgtgccagc cacacaaacg
tttacttatc ttatgccacg 4260ggaagtttag agagctaaga ttatctgggg
aaatcaaaac aaaaacaagc aaacaaaaaa 4320aaaaagcaaa aacaaaacaa
aaaataagcc aaaaaacctt gctagtgttt tttcctcaaa 4380aataaataaa
taaataaata aatacgtaca tacatacaca catacataca aacatataga
4440aatccccaaa gaggccaata gtgacgagaa ggtgaaaatt gcaggcccat
ggggagttac 4500tgattttttc atctcctccc tccacgggag actttatttt
ctgccaatgg ctattgccat 4560tagagggcag agtgacccca gagctgagtt
gggcaggggg gtggacagag aggagaggac 4620aaggagggca atggagcatc
agtacctgcc cacagccttg gtccctgggg gctagactgc 4680tcaactgtgg
agcaattcat tatactgaaa atgtgcttgt tgttgaaaat ttgtctgcat
4740gttaatgcct cacccccaaa cccttttctc tctcactctc tgcctccaac
ttcagattga 4800ctttcaatag tttttctaag acctttgaac tgaatgttct
cttcagccaa aacttggcga 4860cttccacaga aaagtctgac cactgagaag
aaggagagca gagatttaac cctttgtaag 4920gccccatttg gatccaggtc
tgctttctca tgtgtgagtc agggaggagc tggagccaga 4980ggagaagaaa
atgatagctt ggctgttctc ctgcttagga cactgactga atagttaaac
5040tctcactgcc actacctttt ccccaccttt aaaagacctg aatgaagttt
tctgccaaac 5100tccgtgaagc cacaagcacc ttatgtcctc ccttcagtgt
tttgtgggcc tgaatttcat 5160cacactgcat ttcagccatg gtcatcaagc
ctgtttgctt cttttgggca tgttcacaga 5220ttctctgtta agagccccca
ccaccaagaa ggttagcagg ccaacagctc tgacatctat 5280ctgtagatgc
cagtagtcac aaagatttct taccaactct cagatcgctg gagcccttag
5340acaaactgga aagaaggcat caaagggatc aggcaagctg ggcgtcttgc
ccttgtcccc 5400cagagatgat accctcccag caagtggaga agttctcact
tccttcttta gagcagctaa 5460aggggctacc cagatcaggg ttgaagagaa
aactcaatta ccagggtggg aagaatgaag 5520gcactagaac cagaaaccct
gcaaatgctc ttcttgtcac ccagcatatc cacctgcaga 5580agtcatgaga
agagagaagg aacaaagagg agactctgac tactgaatta aaatcttcag
5640cggcaaagcc taaagccaga tggacaccat ctggtgagtt tactcatcat
cctcctctgc 5700tgctgattct gggctctgac attgcccata ctcactcaga
ttccccacct ttgttgctgc 5760ctcttagtca gagggaggcc aaaccattga
gactttctac agaaccatgg cttctttcgg 5820aaaggtctgg ttggtgtggc
tccaatactt tgccacccat gaactcaggg tgtgccctgg 5880gacactggtt
ttatatagtc ttttggcaca cctgtgttct gttgacttcg ttcttcaagc
5940ccaagtgcaa gggaaaatgt ccacctactt tctcatcttg gcctctgcct
ccttacttag 6000ctcttaatct catctgttga actcaagaaa tcaagggcca
gtcatcaagc tgcccatttt 6060aattgattca ctctgtttgt tgagaggata
gtttctgagt gacatgatat gatccacaag 6120ggtttccttc cctgatttct
gcattgatat taatagccaa acgaacttca aaacagcttt 6180aaataacaag
ggagagggga acctaagatg agtaatatgc caatccaaga ctgctggaga
6240aaactaaagc tgacaggttc cctttttggg gtgggataga catgttctgg
ttttctttat 6300tattacacaa tctggctcat gtacaggatc acttttagct
gttttaaaca gaaaaaaata 6360tccaccactc ttttcagtta cactaggtta
cattttaata ggtcctttac atctgttttg 6420gaatgatttt catcttttgt
gatacacaga ttgaattata tcattttcat atctctcctt 6480gtaaatacta
gaagctctcc tttacatttc tctatcaaat ttttcatctt tatgggtttc
6540ccaattgtga ctcttgtctt catgaatata tgtttttcat ttgcaaaagc
caaaaatcag 6600tgaaacagca gtgtaattaa aagcaacaac tggattactc
caaatttcca aatgacaaaa 6660ctagggaaaa atagcctaca caagccttta
ggcctactct ttctgtgctt gggtttgagt 6720gaacaaagga gattttagct
tggctctgtt ctcccatgga tgaaaggagg aggatttttt 6780ttttcttttg
gccattgatg ttctagccaa tgtaattgac agaagtctca ttttgcatgc
6840gctctgctct acaaacagag ttggtatggt tggtatactg tactcacctg
tgagggactg 6900gccactcaga cccacttagc tggtgagcta gaagatgagg
atcactcact ggaaaagtca 6960caaggaccat ctccaaacaa gttggcagtg
ctcgatgtgg acgaagagtg aggaagagaa 7020aaagaaggag caccagggag
aaggctccgt ctgtgctggg cagcagacag ctgccaggat 7080cacgaactct
gtagtcaaag aaaagagtcg tgtggcagtt tcagctctcg ttcattgggc
7140agctcgccta ggcccagcct ctgagctgac atgggagttg ttggattctt
tgtttcatag 7200ctttttctat gccataggca atattgttgt tcttggaaag
tttattattt ttttaactcc 7260cttactctga gaaagggata ttttgaagga
ctgtcatata tctttgaaaa aagaaaatct 7320gtaatacata tatttttatg
tatgttcact ggcactaaaa aatatagaga gcttcattct 7380gtcctttggg
tagttgctga ggtaattgtc caggttgaaa aataatgtgc tgatgctaga
7440gtccctctct gtccatactc tacttctaaa tacatatagg catacatagc
aagttttatt 7500tgacttgtac tttaagagaa aatatgtcca ccatccacat
gatgcacaaa tgagctaaca 7560ttgagcttca agtagcttct aagtgtttgt
ttcattaggc acagcacaga tgtggccttt 7620ccccccttct ctcccttgat
atctggcagg gcataaaggc ccaggccact tcctctgccc 7680cttcccagcc
ctgcaccaaa gctgcatttc aggagactct ctccagacag cccagtaact
7740acccgagcat ggcccctgca tagccctgga aaaataagag gctgactgtc
tacgaattat 7800cttgtgccag ttgcccaggt gagagggcac tgggccaagg
gagtggtttt catgtttgac 7860ccactacaag gggtcatggg aatcaggaat
gccaaagcac cagatcaaat ccaaaactta 7920aagtcaaaat aagccattca
gcatgttcag tttcttggaa aaggaagttt ctacccctga 7980tgcctttgta
ggcagatctg ttctcaccat taatcttttt gaaaatcttt taaagcagtt
8040tttaaaaaga gagatgaaag catcacatta tataaccaaa gattacattg
tacctgctaa 8100gataccaaaa ttcataaggg caggggggga gcaagcatta
gtgcctcttt gataagctgt 8160ccaaagacag actaaaggac tctgctggtg
actgacttat aagagctttg tgggtttttt 8220tttccctaat aatatacatg
tttagaagaa ttgaaaataa tttcgggaaa atgggattat 8280gggtccttca
ctaagtgatt ttataagcag aactggcttt ccttttctct agtagttgct
8340gagcaaattg ttgaagctcc atcattgcat ggttggaaat ggagctgttc
ttagccactg 8400tgtttgctag tgcccatgtt agcttatctg aagatgtgaa
acccttgctg ataagggagc 8460atttaaagta ctagattttg cactagaggg
acagcaggca gaaatcctta tttctgccca 8520ctttggatgg cacaaaaagt
tatctgcagt tgaaggcaga aagttgaaat acattgtaaa 8580tgaatatttg
tatccatgtt tcaaaattga aatatatata tatatatata tatatatata
8640tatatatata tagtgtgtgt gtgtgttctg atagctttaa ctttctctgc
atctttatat 8700ttggttccag atcacacctg atgccatgta cttgtgagag
aggatgcagt tttgttttgg 8760aagctctctc agaacaaaca agacacctgg
attgatcagt taactaaaag ttttctcccc 8820tattgggttt gacccacagg
tcctgtgaag gagcagaggg ataaaaagag tagaggacat 8880gatacattgt
actttactag ttcaagacag atgaatgtgg aaagcataaa aactcaatgg
8940aactgactga gatttaccac agggaaggcc caaacttggg gccaaaagcc
tacccaagtg 9000attgaccagt ggccccctaa tgggacctga gctgttggaa
gaagagaact gttccttggt 9060cttcaccatc cttgtgagag aagggcagtt
tcctgcattg gaacctggag caagcgctct 9120atctttcaca caaattccct
cacctgagat tgaggtgctc ttgttactgg gtgtctgtgt 9180gctgtaattc
tggttttgga tatgttctgt aaagattttg acaaatgaaa atgtgttttt
9240ctctgttaaa acttgtcaga gtactagaag ttgtatctct gtaggtgcag
gtccatttct 9300gcccacaggt agggtgtttt tctttgatta agagattgac
acttctgttg cctaggacct 9360cccaactcaa ccatttctag gtgaaggcag
aaaaatccac attagttact cctcttcaga 9420catttcagct gagataacaa
atcttttgga attttttcac ccatagaaag agtggtagat 9480atttgaattt
agcaggtgga gtttcatagt aaaaacagct tttgactcag ctttgattta
9540tcctcatttg atttggccag aaagtaggta atatgcattg attggcttct
gattccaatt 9600cagtatagca aggtgctagg ttttttcctt tccccacctg
tctcttagcc tggggaatta 9660aatgagaagc cttagaatgg gtggcccttg
tgacctgaaa cacttcccac ataagctact 9720taacaagatt gtcatggagc
tgcagattcc attgcccacc aaagactaga acacacacat 9780atccatacac
caaaggaaag acaattctga aatgctgttt ctctggtggt tccctctctg
9840gctgctgcct cacagtatgg gaacctgtac tctgcagagg tgacaggcca
gatttgcatt 9900atctcacaac cttagccctt ggtgctaact gtcctacagt
gaagtgcctg gggggttgtc 9960ctatcccata agccacttgg atgctgacag
cagccaccat cagaatgacc cacgcaaaaa 10020aaagaaaaaa aaaattaaaa
agtcccctca caacccagtg acacctttct gctttcctct 10080agactggaac
attgattagg gagtgcctca gacatgacat tcttgtgctg tccttggaat
10140taatctggca gcaggaggga gcagactatg taaacagaga taaaaattaa
ttttcaatat 10200tgaaggaaaa aagaaataag aagagagaga gaaagaaagc
atcacacaaa gattttctta 10260aaagaaacaa ttttgcttga aatctcttta
gatggggctc atttctcacg gtggcacttg 10320gcctccactg ggcagcagga
ccagctccaa gcgctagtgt tctgttctct ttttgtaatc 10380ttggaatctt
ttgttgctct aaatacaatt aaaaatggca gaaacttgtt tgttggacta
10440catgtgtgac tttgggtctg tctctgcctc tgctttcaga aatgtcatcc
attgtgtaaa 10500atattggctt actggtctgc cagctaaaac ttggccacat
cccctgttat ggctgcagga 10560tcgagttatt gttaacaaag agacccaaga
aaagctgcta atgtcctctt atcattgttg 10620ttaatttgtt aaaacataaa
gaaatctaaa atttcaaaaa a 10661
52 8112 DNA Homo sapiens 52 gctgcgagca
gagagggatt cctcggaggt catctgttcc atcttcttgc ctatgcaaat 60gcctgcctga
agctgctgga ggctggcttt gtaccggact ttgtacaggg aaccagggaa
120acgaatgcag agtgctcctg acattgcctg tcactttttc ccatgatact
ctggcttcac 180agtttggaga ctgccaggga ccatgttttg cccattgact
attactttcc accccagaag 240acctgcctga tctgtggaga tgaagcttct
gggtgtcact atggagctct cacatgtgga 300agctgcaagg tcttcttcaa
aagagccgct gaagggaaac agaagtacct gtgcgccagc 360agaaatgatt
gcactattga taaattccga aggaaaaatt gtccatcttg tcgtcttcgg
420aaatgttatg aagcagggat gactctggga gcccggaagc tgaagaaact
tggtaatctg 480aaactacagg aggaaggaga ggcttccagc accaccagcc
ccactgagga gacaacccag 540aagctgacag tgtcacacat tgaaggctat
gaatgtcagc ccatctttct gaatgtcctg 600gaagccattg agccaggtgt
agtgtgtgct ggacacgaca acaaccagcc cgactccttt 660gcagccttgc
tctctagcct caatgaactg ggagagagac agcttgtaca cgtggtcaag
720tgggccaagg ccttgcctgg cttccgcaac ttacacgtgg acgaccagat
ggctgtcatt 780cagtactcct ggatggggct catggtgttt gccatgggct
ggcgatcctt caccaatgtc 840aactccagga tgctctactt cgcccctgat
ctggttttca atgagtaccg catgcacaag 900tcccggatgt acagccagtg
tgtccgaatg aggcacctct ctcaagagtt tggatggctc 960caaatcaccc
cccaggaatt cctgtgcatg aaagcactgc tactcttcag cattattcca
1020gtggatgggc tgaaaaatca aaaattcttt gatgaacttc gaatgaacta
catcaaggaa 1080ctcgatcgta tcattgcatg caaaagaaaa aatcccacat
cctgctcaag acgcttctac 1140cagctcacca agctcctgga ctccgtgcag
cctattgcga gagagctgca tcagttcact 1200tttgacctgc taatcaagtc
acacatggtg agcgtggact ttccggaaat gatggcagag 1260atcatctctg
tgcaagtgcc caagatcctt tctgggaaag tcaagcccat ctatttccac
1320acccagtgaa gcattggaaa ccctatttcc ccaccccagc tcatgccccc
tttcagatgt 1380cttctgcctg ttataactct gcactactcc tctgcagtgc
cttggggaat ttcctctatt 1440gatgtacagt ctgtcatgaa catgttcctg
aattctattt gctgggcttt ttttttctct 1500ttctctcctt tctttttctt
cttccctccc tatctaaccc tcccatggca ccttcagact 1560ttgcttccca
ttgtggctcc tatctgtgtt ttgaatggtg ttgtatgcct ttaaatctgt
1620gatgatcctc atatggccca gtgtcaagtt gtgcttgttt acagcactac
tctgtgccag 1680ccacacaaac gtttacttat cttatgccac gggaagttta
gagagctaag attatctggg 1740gaaatcaaaa caaaaacaag caaacaaaaa
aaaaaagcaa aaacaaaaca aaaaataagc 1800caaaaaacct tgctagtgtt
ttttcctcaa aaataaataa ataaataaat aaatacgtac 1860atacatacac
acatacatac aaacatatag aaatccccaa agaggccaat agtgacgaga
1920aggtgaaaat tgcaggccca tggggagtta ctgatttttt catctcctcc
ctccacggga 1980gactttattt tctgccaatg gctattgcca ttagagggca
gagtgacccc agagctgagt 2040tgggcagggg ggtggacaga gaggagagga
caaggagggc aatggagcat cagtacctgc 2100ccacagcctt ggtccctggg
ggctagactg ctcaactgtg gagcaattca ttatactgaa 2160aatgtgcttg
ttgttgaaaa tttgtctgca tgttaatgcc tcacccccaa acccttttct
2220ctctcactct ctgcctccaa cttcagattg actttcaata gtttttctaa
gacctttgaa 2280ctgaatgttc tcttcagcca aaacttggcg acttccacag
aaaagtctga ccactgagaa 2340gaaggagagc agagatttaa ccctttgtaa
ggccccattt ggatccaggt ctgctttctc 2400atgtgtgagt cagggaggag
ctggagccag aggagaagaa aatgatagct tggctgttct 2460cctgcttagg
acactgactg aatagttaaa ctctcactgc cactaccttt tccccacctt
2520taaaagacct gaatgaagtt ttctgccaaa ctccgtgaag ccacaagcac
cttatgtcct 2580cccttcagtg ttttgtgggc ctgaatttca tcacactgca
tttcagccat ggtcatcaag 2640cctgtttgct tcttttgggc atgttcacag
attctctgtt aagagccccc accaccaaga 2700aggttagcag gccaacagct
ctgacatcta tctgtagatg ccagtagtca caaagatttc 2760ttaccaactc
tcagatcgct ggagccctta gacaaactgg aaagaaggca tcaaagggat
2820caggcaagct gggcgtcttg cccttgtccc ccagagatga taccctccca
gcaagtggag 2880aagttctcac ttccttcttt agagcagcta aaggggctac
ccagatcagg gttgaagaga 2940aaactcaatt accagggtgg gaagaatgaa
ggcactagaa ccagaaaccc tgcaaatgct 3000cttcttgtca cccagcatat
ccacctgcag aagtcatgag aagagagaag gaacaaagag 3060gagactctga
ctactgaatt aaaatcttca gcggcaaagc ctaaagccag atggacacca
3120tctggtgagt ttactcatca tcctcctctg ctgctgattc tgggctctga
cattgcccat 3180actcactcag attccccacc tttgttgctg cctcttagtc
agagggaggc caaaccattg 3240agactttcta cagaaccatg gcttctttcg
gaaaggtctg gttggtgtgg ctccaatact 3300ttgccaccca tgaactcagg
gtgtgccctg ggacactggt tttatatagt cttttggcac 3360acctgtgttc
tgttgacttc gttcttcaag cccaagtgca agggaaaatg tccacctact
3420ttctcatctt ggcctctgcc tccttactta gctcttaatc tcatctgttg
aactcaagaa 3480atcaagggcc agtcatcaag ctgcccattt taattgattc
actctgtttg ttgagaggat 3540agtttctgag tgacatgata tgatccacaa
gggtttcctt ccctgatttc tgcattgata 3600ttaatagcca aacgaacttc
aaaacagctt taaataacaa gggagagggg aacctaagat 3660gagtaatatg
ccaatccaag actgctggag aaaactaaag ctgacaggtt ccctttttgg
3720ggtgggatag acatgttctg gttttcttta ttattacaca atctggctca
tgtacaggat 3780cacttttagc tgttttaaac agaaaaaaat atccaccact
cttttcagtt acactaggtt 3840acattttaat aggtccttta catctgtttt
ggaatgattt tcatcttttg tgatacacag 3900attgaattat atcattttca
tatctctcct tgtaaatact agaagctctc ctttacattt 3960ctctatcaaa
tttttcatct ttatgggttt cccaattgtg actcttgtct tcatgaatat
4020atgtttttca tttgcaaaag ccaaaaatca gtgaaacagc agtgtaatta
aaagcaacaa 4080ctggattact ccaaatttcc aaatgacaaa actagggaaa
aatagcctac acaagccttt 4140aggcctactc tttctgtgct tgggtttgag
tgaacaaagg agattttagc ttggctctgt 4200tctcccatgg atgaaaggag
gaggattttt tttttctttt ggccattgat gttctagcca 4260atgtaattga
cagaagtctc attttgcatg cgctctgctc tacaaacaga gttggtatgg
4320ttggtatact gtactcacct gtgagggact ggccactcag acccacttag
ctggtgagct 4380agaagatgag gatcactcac tggaaaagtc acaaggacca
tctccaaaca agttggcagt 4440gctcgatgtg gacgaagagt gaggaagaga
aaaagaagga gcaccaggga gaaggctccg 4500tctgtgctgg gcagcagaca
gctgccagga tcacgaactc tgtagtcaaa gaaaagagtc 4560gtgtggcagt
ttcagctctc gttcattggg cagctcgcct aggcccagcc tctgagctga
4620catgggagtt gttggattct ttgtttcata gctttttcta tgccataggc
aatattgttg 4680ttcttggaaa gtttattatt tttttaactc ccttactctg
agaaagggat attttgaagg 4740actgtcatat atctttgaaa aaagaaaatc
tgtaatacat atatttttat gtatgttcac 4800tggcactaaa aaatatagag
agcttcattc tgtcctttgg gtagttgctg aggtaattgt 4860ccaggttgaa
aaataatgtg ctgatgctag agtccctctc tgtccatact ctacttctaa
4920atacatatag gcatacatag caagttttat ttgacttgta ctttaagaga
aaatatgtcc 4980accatccaca tgatgcacaa atgagctaac attgagcttc
aagtagcttc taagtgtttg 5040tttcattagg cacagcacag atgtggcctt
tccccccttc tctcccttga tatctggcag 5100ggcataaagg cccaggccac
ttcctctgcc ccttcccagc cctgcaccaa agctgcattt 5160caggagactc
tctccagaca gcccagtaac tacccgagca tggcccctgc atagccctgg
5220aaaaataaga ggctgactgt ctacgaatta tcttgtgcca gttgcccagg
tgagagggca 5280ctgggccaag ggagtggttt tcatgtttga cccactacaa
ggggtcatgg gaatcaggaa 5340tgccaaagca ccagatcaaa tccaaaactt
aaagtcaaaa taagccattc agcatgttca 5400gtttcttgga aaaggaagtt
tctacccctg atgcctttgt aggcagatct gttctcacca 5460ttaatctttt
tgaaaatctt ttaaagcagt ttttaaaaag agagatgaaa gcatcacatt
5520atataaccaa agattacatt gtacctgcta agataccaaa attcataagg
gcaggggggg 5580agcaagcatt agtgcctctt tgataagctg tccaaagaca
gactaaagga ctctgctggt 5640gactgactta taagagcttt gtgggttttt
ttttccctaa taatatacat gtttagaaga 5700attgaaaata atttcgggaa
aatgggatta tgggtccttc actaagtgat tttataagca 5760gaactggctt
tccttttctc tagtagttgc tgagcaaatt gttgaagctc catcattgca
5820tggttggaaa tggagctgtt cttagccact gtgtttgcta gtgcccatgt
tagcttatct 5880gaagatgtga aacccttgct gataagggag catttaaagt
actagatttt gcactagagg 5940gacagcaggc agaaatcctt atttctgccc
actttggatg gcacaaaaag ttatctgcag 6000ttgaaggcag aaagttgaaa
tacattgtaa atgaatattt gtatccatgt ttcaaaattg 6060aaatatatat
atatatatat atatatatat atatatatat atagtgtgtg tgtgtgttct
6120gatagcttta actttctctg catctttata tttggttcca gatcacacct
gatgccatgt 6180acttgtgaga gaggatgcag ttttgttttg gaagctctct
cagaacaaac aagacacctg 6240gattgatcag ttaactaaaa gttttctccc
ctattgggtt tgacccacag gtcctgtgaa 6300ggagcagagg gataaaaaga
gtagaggaca tgatacattg tactttacta gttcaagaca 6360gatgaatgtg
gaaagcataa aaactcaatg gaactgactg agatttacca cagggaaggc
6420ccaaacttgg ggccaaaagc ctacccaagt gattgaccag tggcccccta
atgggacctg 6480agctgttgga agaagagaac tgttccttgg tcttcaccat
ccttgtgaga gaagggcagt 6540ttcctgcatt ggaacctgga gcaagcgctc
tatctttcac acaaattccc tcacctgaga 6600ttgaggtgct cttgttactg
ggtgtctgtg tgctgtaatt ctggttttgg atatgttctg 6660taaagatttt
gacaaatgaa aatgtgtttt tctctgttaa aacttgtcag agtactagaa
6720gttgtatctc tgtaggtgca ggtccatttc tgcccacagg tagggtgttt
ttctttgatt 6780aagagattga cacttctgtt gcctaggacc tcccaactca
accatttcta ggtgaaggca 6840gaaaaatcca cattagttac tcctcttcag
acatttcagc tgagataaca aatcttttgg 6900aattttttca cccatagaaa
gagtggtaga tatttgaatt tagcaggtgg agtttcatag 6960taaaaacagc
ttttgactca gctttgattt atcctcattt gatttggcca gaaagtaggt
7020aatatgcatt gattggcttc tgattccaat tcagtatagc aaggtgctag
gttttttcct 7080ttccccacct gtctcttagc ctggggaatt aaatgagaag
ccttagaatg ggtggccctt 7140gtgacctgaa acacttccca cataagctac
ttaacaagat tgtcatggag ctgcagattc 7200cattgcccac caaagactag
aacacacaca tatccataca ccaaaggaaa gacaattctg 7260aaatgctgtt
tctctggtgg ttccctctct ggctgctgcc tcacagtatg ggaacctgta
7320ctctgcagag gtgacaggcc agatttgcat tatctcacaa ccttagccct
tggtgctaac 7380tgtcctacag tgaagtgcct ggggggttgt cctatcccat
aagccacttg gatgctgaca 7440gcagccacca tcagaatgac ccacgcaaaa
aaaagaaaaa aaaaattaaa aagtcccctc 7500acaacccagt gacacctttc
tgctttcctc tagactggaa cattgattag ggagtgcctc 7560agacatgaca
ttcttgtgct gtccttggaa ttaatctggc agcaggaggg agcagactat
7620gtaaacagag ataaaaatta attttcaata ttgaaggaaa aaagaaataa
gaagagagag 7680agaaagaaag catcacacaa agattttctt aaaagaaaca
attttgcttg aaatctcttt 7740agatggggct catttctcac ggtggcactt
ggcctccact gggcagcagg accagctcca 7800agcgctagtg ttctgttctc
tttttgtaat cttggaatct tttgttgctc taaatacaat 7860taaaaatggc
agaaacttgt ttgttggact acatgtgtga ctttgggtct gtctctgcct
7920ctgctttcag aaatgtcatc cattgtgtaa aatattggct tactggtctg
ccagctaaaa 7980cttggccaca tcccctgtta tggctgcagg atcgagttat
tgttaacaaa gagacccaag 8040aaaagctgct aatgtcctct tatcattgtt
gttaatttgt taaaacataa agaaatctaa 8100aatttcaaaa aa 811253920PRTHomo
sapiens 53Met Glu Val Gln Leu Gly Leu Gly Arg Val Tyr Pro Arg Pro
Pro Ser 1 5 10 15 Lys Thr Tyr Arg Gly Ala Phe Gln Asn Leu Phe Gln
Ser Val Arg Glu 20 25 30 Val Ile Gln Asn Pro Gly Pro Arg His Pro
Glu Ala Ala Ser Ala Ala 35 40 45 Pro Pro Gly Ala Ser Leu Leu Leu
Leu Gln Gln Gln Gln Gln Gln Gln 50 55 60 Gln Gln Gln Gln Gln Gln
Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln 65 70 75 80 Glu Thr Ser Pro
Arg Gln Gln Gln Gln Gln Gln Gly Glu Asp Gly Ser 85 90 95 Pro Gln
Ala His Arg Arg Gly Pro Thr Gly Tyr Leu Val Leu Asp Glu 100 105 110
Glu Gln Gln Pro Ser Gln Pro Gln Ser Ala Leu Glu Cys His Pro Glu 115
120 125 Arg Gly Cys Val Pro Glu Pro Gly Ala Ala Val Ala Ala Ser Lys
Gly 130 135 140 Leu Pro Gln Gln Leu Pro Ala Pro Pro Asp Glu Asp Asp
Ser Ala Ala 145 150 155 160 Pro Ser Thr Leu Ser Leu Leu Gly Pro Thr
Phe Pro Gly Leu Ser Ser 165 170 175 Cys Ser Ala Asp Leu Lys Asp Ile
Leu Ser Glu Ala Ser Thr Met Gln 180 185 190 Leu Leu Gln Gln Gln Gln
Gln Glu Ala Val Ser Glu Gly Ser Ser Ser 195 200 205 Gly Arg Ala Arg
Glu Ala Ser Gly Ala Pro Thr Ser Ser Lys Asp Asn 210 215 220 Tyr Leu
Gly Gly Thr Ser Thr Ile Ser Asp Asn Ala Lys Glu Leu Cys 225 230 235
240 Lys Ala Val Ser Val Ser Met Gly Leu Gly Val Glu Ala Leu Glu His
245 250 255 Leu Ser Pro Gly Glu Gln Leu Arg Gly Asp Cys Met Tyr Ala
Pro Leu 260 265 270 Leu Gly Val Pro Pro Ala Val Arg Pro Thr Pro Cys
Ala Pro Leu Ala 275 280 285 Glu Cys Lys Gly Ser Leu Leu Asp Asp Ser
Ala Gly Lys Ser Thr Glu 290 295 300 Asp Thr Ala Glu Tyr Ser Pro Phe
Lys Gly Gly Tyr Thr Lys Gly Leu 305 310 315 320 Glu Gly Glu Ser Leu
Gly Cys Ser Gly Ser Ala Ala Ala Gly Ser Ser 325 330 335 Gly Thr Leu
Glu Leu Pro Ser Thr Leu Ser Leu Tyr Lys Ser Gly Ala 340 345 350 Leu
Asp Glu Ala Ala Ala Tyr Gln Ser Arg Asp Tyr Tyr Asn Phe Pro 355 360
365 Leu Ala Leu Ala Gly Pro Pro Pro Pro Pro Pro Pro Pro His Pro His
370 375 380 Ala Arg Ile Lys Leu Glu Asn Pro Leu Asp Tyr Gly Ser Ala
Trp Ala 385 390 395 400 Ala Ala Ala Ala Gln Cys Arg Tyr Gly Asp Leu
Ala Ser Leu His Gly 405 410 415 Ala Gly Ala Ala Gly Pro Gly Ser Gly
Ser Pro Ser Ala Ala Ala Ser 420 425 430 Ser Ser Trp His Thr Leu Phe
Thr Ala Glu Glu Gly Gln Leu Tyr Gly 435 440 445 Pro Cys Gly Gly Gly
Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly 450 455 460 Gly Gly Gly
Gly Gly Gly Gly Gly Gly Glu Ala Gly Ala Val Ala Pro 465 470 475 480
Tyr Gly Tyr Thr Arg Pro Pro Gln Gly Leu Ala Gly Gln Glu Ser Asp 485
490 495 Phe Thr Ala Pro Asp Val Trp Tyr Pro Gly Gly Met Val Ser Arg
Val 500 505 510 Pro Tyr Pro Ser Pro Thr Cys Val Lys Ser Glu Met Gly
Pro Trp Met 515 520 525 Asp Ser Tyr Ser Gly Pro Tyr Gly Asp Met Arg
Leu Glu Thr Ala Arg 530 535 540 Asp His Val Leu Pro Ile Asp Tyr Tyr
Phe Pro Pro Gln Lys Thr Cys 545 550 555 560 Leu Ile Cys Gly Asp Glu
Ala Ser Gly Cys His Tyr Gly Ala Leu Thr 565 570 575 Cys Gly Ser Cys
Lys Val Phe Phe Lys Arg Ala Ala Glu Gly Lys Gln 580 585 590 Lys Tyr
Leu Cys Ala Ser Arg Asn Asp Cys Thr Ile Asp Lys Phe Arg 595 600 605
Arg Lys Asn Cys Pro Ser Cys Arg Leu Arg Lys Cys Tyr Glu Ala Gly 610
615 620 Met Thr Leu Gly Ala Arg Lys Leu Lys Lys Leu Gly Asn Leu Lys
Leu 625 630 635 640 Gln Glu Glu Gly Glu Ala Ser Ser Thr Thr Ser Pro
Thr Glu Glu Thr 645 650 655 Thr Gln Lys Leu Thr Val Ser His Ile Glu
Gly Tyr Glu Cys Gln Pro 660 665 670 Ile Phe Leu Asn Val Leu Glu Ala
Ile Glu Pro Gly Val Val Cys Ala 675 680 685 Gly His Asp Asn Asn Gln
Pro Asp Ser Phe Ala Ala Leu Leu Ser Ser 690 695 700 Leu Asn Glu Leu
Gly Glu Arg Gln Leu Val His Val Val Lys Trp Ala 705 710 715 720 Lys
Ala Leu Pro Gly Phe Arg Asn Leu His Val Asp Asp Gln Met Ala 725 730
735 Val Ile Gln Tyr Ser Trp Met Gly Leu Met Val Phe Ala Met Gly Trp
740 745 750 Arg Ser Phe Thr Asn Val Asn Ser Arg Met Leu Tyr Phe Ala
Pro Asp 755 760 765 Leu Val Phe Asn Glu Tyr Arg Met His Lys Ser Arg
Met Tyr Ser Gln 770 775 780 Cys Val Arg Met Arg His Leu Ser Gln Glu
Phe Gly Trp Leu Gln Ile 785 790 795 800 Thr Pro Gln Glu Phe Leu Cys
Met Lys Ala Leu Leu Leu Phe Ser Ile 805 810 815 Ile Pro Val Asp Gly
Leu Lys Asn Gln Lys Phe Phe Asp Glu Leu Arg 820 825 830 Met Asn Tyr
Ile Lys Glu Leu Asp Arg Ile Ile Ala Cys Lys Arg Lys 835 840 845 Asn
Pro Thr Ser Cys Ser Arg Arg Phe Tyr Gln Leu Thr Lys Leu Leu 850 855
860 Asp Ser Val Gln Pro Ile Ala Arg Glu Leu His Gln Phe Thr Phe Asp
865 870 875 880 Leu Leu Ile Lys Ser His Met Val Ser Val Asp Phe Pro
Glu Met Met 885 890 895 Ala Glu Ile Ile Ser Val Gln Val Pro Lys Ile Leu Ser Gly Lys
Val 900 905 910 Lys Pro Ile Tyr Phe His Thr Gln 915 920
54 388 PRT Homo sapiens 54 Met Ile Leu Trp Leu His Ser Leu Glu Thr Ala
Arg Asp His Val Leu 1 5 10 15 Pro Ile Asp Tyr Tyr Phe Pro Pro Gln
Lys Thr Cys Leu Ile Cys Gly 20 25 30 Asp Glu Ala Ser Gly Cys His
Tyr Gly Ala Leu Thr Cys Gly Ser Cys 35 40 45 Lys Val Phe Phe Lys
Arg Ala Ala Glu Gly Lys Gln Lys Tyr Leu Cys 50 55 60 Ala Ser Arg
Asn Asp Cys Thr Ile Asp Lys Phe Arg Arg Lys Asn Cys 65 70 75 80 Pro
Ser Cys Arg Leu Arg Lys Cys Tyr Glu Ala Gly Met Thr Leu Gly 85 90
95 Ala Arg Lys Leu Lys Lys Leu Gly Asn Leu Lys Leu Gln Glu Glu Gly
100 105 110 Glu Ala Ser Ser Thr Thr Ser Pro Thr Glu Glu Thr Thr Gln
Lys Leu 115 120 125 Thr Val Ser His Ile Glu Gly Tyr Glu Cys Gln Pro
Ile Phe Leu Asn 130 135 140 Val Leu Glu Ala Ile Glu Pro Gly Val Val
Cys Ala Gly His Asp Asn 145 150 155 160 Asn Gln Pro Asp Ser Phe Ala
Ala Leu Leu Ser Ser Leu Asn Glu Leu 165 170 175 Gly Glu Arg Gln Leu
Val His Val Val Lys Trp Ala Lys Ala Leu Pro 180 185 190 Gly Phe Arg
Asn Leu His Val Asp Asp Gln Met Ala Val Ile Gln Tyr 195 200 205 Ser
Trp Met Gly Leu Met Val Phe Ala Met Gly Trp Arg Ser Phe Thr 210 215
220 Asn Val Asn Ser Arg Met Leu Tyr Phe Ala Pro Asp Leu Val Phe Asn
225 230 235 240 Glu Tyr Arg Met His Lys Ser Arg Met Tyr Ser Gln Cys
Val Arg Met 245 250 255 Arg His Leu Ser Gln Glu Phe Gly Trp Leu Gln
Ile Thr Pro Gln Glu 260 265 270 Phe Leu Cys Met Lys Ala Leu Leu Leu
Phe Ser Ile Ile Pro Val Asp 275 280 285 Gly Leu Lys Asn Gln Lys Phe
Phe Asp Glu Leu Arg Met Asn Tyr Ile 290 295 300 Lys Glu Leu Asp Arg
Ile Ile Ala Cys Lys Arg Lys Asn Pro Thr Ser 305 310 315 320 Cys Ser
Arg Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Val Gln 325 330 335
Pro Ile Ala Arg Glu Leu His Gln Phe Thr Phe Asp Leu Leu Ile Lys 340
345 350 Ser His Met Val Ser Val Asp Phe Pro Glu Met Met Ala Glu Ile
Ile 355 360 365 Ser Val Gln Val Pro Lys Ile Leu Ser Gly Lys Val Lys
Pro Ile Tyr 370 375 380 Phe His Thr Gln 385
* * * * *