U.S. patent application number 10/274583 was filed with the patent office on 2002-10-21 and published on 2003-07-24 for LRRCAPs as modifiers of the p53 pathway and methods of use.
Invention is credited to Belvin, Marcia, Francis-Lang, Helen, Funke, Roel P., Li, Danxi, Lioubin, Mario N., Plowman, Gregory D., Schleithoff, Lothar.
Application Number | 20030138431 10/274583 |
Family ID | 27407290 |
Filed Date | 2002-10-21 |
United States Patent Application | 20030138431 |
Kind Code | A1 |
Belvin, Marcia; et al. | July 24, 2003 |
LRRCAPs as modifiers of the p53 pathway and methods of use
Abstract
Human LRRCAPS genes are identified as modulators of the p53
pathway, and thus are therapeutic targets for disorders associated
with defective p53 function. Methods for identifying modulators of
p53, comprising screening for agents that modulate the activity of
LRRCAPS are provided.
Inventors: |
Belvin, Marcia (Albany, CA); Schleithoff, Lothar (Tuebingen, DE);
Plowman, Gregory D. (San Carlos, CA); Funke, Roel P. (South San, CA);
Lioubin, Mario N. (San Mateo, CA); Li, Danxi (San Francisco, CA);
Francis-Lang, Helen (San Francisco, CA) |
Correspondence
Address: |
JAN P. BRUNELLE
EXELIXIS, INC.
170 HARBOR WAY
P.O. BOX 511
SOUTH SAN FRANCISCO
CA
94083-0511
US
|
Family ID: | 27407290 |
Appl. No.: | 10/274583 |
Filed: | October 21, 2002 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
60338733 | Oct 22, 2001 | |
60357600 | Feb 15, 2002 | |
60361196 | Mar 1, 2002 | |
Current U.S.
Class: |
424/155.1 ;
435/6.16; 435/7.23; 514/44A |
Current CPC
Class: |
G01N 33/5011 20130101;
G01N 33/573 20130101; C12Q 1/485 20130101; G01N 33/574 20130101;
C12Q 1/527 20130101; G01N 33/57419 20130101; G01N 33/57449
20130101; G01N 2333/988 20130101; G01N 2500/10 20130101; G01N
33/57496 20130101; G01N 2333/82 20130101; G01N 33/5308 20130101;
G01N 2333/4739 20130101; G01N 33/57484 20130101; G01N 2500/00
20130101; G01N 2510/00 20130101; G01N 2333/90212 20130101; G01N
33/57415 20130101; G01N 33/6872 20130101; G01N 2333/912 20130101;
C12Q 1/42 20130101; G01N 33/57423 20130101 |
Class at
Publication: |
424/155.1 ;
514/44; 435/6; 435/7.23 |
International
Class: |
A61K 039/395; C12Q
001/68; G01N 033/574; A61K 048/00 |
Claims
What is claimed is:
1. A method of identifying a candidate p53 pathway modulating
agent, said method comprising the steps of: a. providing an assay
system comprising a purified LRRCAPS polypeptide or nucleic acid or
a functionally active fragment or derivative thereof; b. contacting
the assay system with a test agent under conditions whereby, but
for the presence of the test agent, the system provides a reference
activity; and c. detecting a test agent-biased activity of the
assay system, wherein a difference between the test agent-biased
activity and the reference activity identifies the test agent as a
candidate p53 pathway modulating agent.
2. The method of claim 1 wherein the assay system comprises
cultured cells that express the LRRCAPS polypeptide.
3. The method of claim 2 wherein the cultured cells additionally
have defective p53 function.
4. The method of claim 1 wherein the assay system includes a
screening assay comprising a LRRCAPS polypeptide, and the candidate
test agent is a small molecule modulator.
5. The method of claim 4 wherein the assay is a binding assay.
6. The method of claim 1 wherein the assay system is selected from
the group consisting of an apoptosis assay system, a cell
proliferation assay system, an angiogenesis assay system, and a
hypoxic induction assay system.
7. The method of claim 1 wherein the assay system includes a
binding assay comprising a LRRCAPS polypeptide and the candidate
test agent is an antibody.
8. The method of claim 1 wherein the assay system includes an
expression assay comprising a LRRCAPS nucleic acid and the
candidate test agent is a nucleic acid modulator.
9. The method of claim 8 wherein the nucleic acid modulator is an
antisense oligomer.
10. The method of claim 8 wherein the nucleic acid modulator is a
PMO.
11. The method of claim 1 additionally comprising: d. administering
the candidate p53 pathway modulating agent identified in (c) to a
model system comprising cells defective in p53 function and,
detecting a phenotypic change in the model system that indicates
that the p53 function is restored.
12. The method of claim 11 wherein the model system is a mouse
model with defective p53 function.
13. A method for modulating a p53 pathway of a cell comprising
contacting a cell defective in p53 function with a candidate
modulator that specifically binds to a LRRCAPS polypeptide
comprising an amino acid sequence selected from the group consisting of
SEQ ID NOs: 19, 20, 21, 22, 23, and 24, whereby p53 function is
restored.
14. The method of claim 13 wherein the candidate modulator is
administered to a vertebrate animal predetermined to have a disease
or disorder resulting from a defect in p53 function.
15. The method of claim 13 wherein the candidate modulator is
selected from the group consisting of an antibody and a small
molecule.
16. The method of claim 1, comprising the additional steps of: e.
providing a secondary assay system comprising cultured cells or a
non-human animal expressing LRRCAPS, f. contacting the secondary
assay system with the test agent of (b) or an agent derived
therefrom under conditions whereby, but for the presence of the
test agent or agent derived therefrom, the system provides a
reference activity; and g. detecting an agent-biased activity of
the second assay system, h. wherein a difference between the
agent-biased activity and the reference activity of the second
assay system confirms the test agent or agent derived therefrom as
a candidate p53 pathway modulating agent, i. and wherein the second
assay detects an agent-biased change in the p53 pathway.
17. The method of claim 16 wherein the secondary assay system
comprises cultured cells.
18. The method of claim 16 wherein the secondary assay system
comprises a non-human animal.
19. The method of claim 18 wherein the non-human animal
mis-expresses a p53 pathway gene.
20. A method of modulating p53 pathway in a mammalian cell
comprising contacting the cell with an agent that specifically
binds a LRRCAPS polypeptide or nucleic acid.
21. The method of claim 20 wherein the agent is administered to a
mammalian animal predetermined to have a pathology associated with
the p53 pathway.
22. The method of claim 20 wherein the agent is a small molecule
modulator, a nucleic acid modulator, or an antibody.
23. A method for diagnosing a disease in a patient comprising: a.
obtaining a biological sample from the patient; b. contacting the
sample with a probe for LRRCAPS expression; c. comparing results
from step (b) with a control; d. determining whether step (c)
indicates a likelihood of disease.
24. The method of claim 23 wherein said disease is cancer.
25. The method according to claim 24, wherein said cancer is a
cancer as shown in Table 2 as having >25% expression level.
Description
REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. provisional patent
applications 60/338,733 filed Oct. 22, 2001, 60/357,600 filed Feb.
15, 2002, and 60/361,196 filed Mar. 1, 2002. The contents of the
prior applications are hereby incorporated in their entirety.
BACKGROUND OF THE INVENTION
[0002] The p53 gene is mutated in over 50 different types of human
cancers, including familial and spontaneous cancers, and is
believed to be the most commonly mutated gene in human cancer
(Zambetti and Levine, FASEB (1993) 7:855-865; Hollstein, et al.,
Nucleic Acids Res. (1994) 22:3551-3555). Greater than 90% of
mutations in the p53 gene are missense mutations that alter a
single amino acid and inactivate p53 function. Aberrant forms of
human p53 are associated with poor prognosis, more aggressive
tumors, metastasis, and short survival rates (Mitsudomi et al.,
Clin Cancer Res 2000 October; 6(10):4055-63; Koshland, Science
(1993) 262:1953).
[0003] The human p53 protein normally functions as a central
integrator of signals including DNA damage, hypoxia, nucleotide
deprivation, and oncogene activation (Prives, Cell (1998) 95:5-8).
In response to these signals, p53 protein levels are greatly
increased with the result that the accumulated p53 activates cell
cycle arrest or apoptosis depending on the nature and strength of
these signals. Indeed, multiple lines of experimental evidence have
pointed to a key role for p53 as a tumor suppressor (Levine, Cell
(1997) 88:323-331). For example, homozygous p53 "knockout" mice are
developmentally normal but exhibit nearly 100% incidence of
neoplasia in the first year of life (Donehower et al., Nature
(1992) 356:215-221).
[0004] The biochemical mechanisms and pathways through which p53
functions in normal and cancerous cells are not fully understood,
but one clearly important aspect of p53 function is its activity as
a gene-specific transcriptional activator. Among the genes with
known p53-response elements are several with well-characterized
roles in either regulation of the cell cycle or apoptosis,
including GADD45, p21/Waf1/Cip1, cyclin G, Bax, IGF-BP3, and MDM2
(Levine, Cell (1997) 88:323-331).
[0005] Leucine-rich repeats (LRRs) are short motifs of 22-28
residues in length and are found in various cytoplasmic, membrane,
and extracellular proteins (Rothberg, J. et al. (1990) Genes Dev
(12A): 2169-87). These proteins play diverse roles, with
protein-protein interactions being the most common property. In
vitro studies of a synthetic LRR from Drosophila Toll protein have
implied that the peptides form gels by adopting beta-sheet
structures that form extended filaments. These results support the
idea that LRRs mediate protein-protein interactions and cellular
adhesion (Gay, N. (1991) FEBS Lett; 291(1): 87-91). Other functions
of LRR-containing proteins include the binding of enzymes (Tan, F.
et al. (1990) J Biol Chem; 265(1): 13-9), vascular repair (Hickey,
M. (1989) Proc Natl Acad Sci USA; 86(17): 6773-7), and neuronal
pathfinding and synapse formation (Taniguchi H et al (2000) J
Neurobiol 42:104-106). The 3-D structure of ribonuclease inhibitor,
a protein containing 15 LRRs, has been determined (Kobe, B. and
Deisenhofer, J. (1993) Nature; 366(6457): 751-6) demonstrating LRRs
to be a new class of alpha/beta fold. LRRs form elongated
non-globular structures and are often flanked by cysteine rich
domains.
[0006] D2S448 is a melanoma associated gene, a tumor antigen, and
possibly a peroxidase, which may be involved in p53-dependent
apoptosis and immune responses. It also shows promise as a
potential immunogenic peptide for cancer vaccination (Horikoshi, N.
et al. (1999) Biochem Biophys Res Commun; 261(3): 864-9).
[0007] Glioma amplified on chromosome 1 protein (GAC1) is a member
of the leucine rich repeat (LRR) superfamily and may play a role in
signal transduction or cell adhesion. Gene amplification of this
gene is seen in glioma and retinoblastoma tumors (Almeida, A. et
al. (1998) Oncogene 16: 2997-3002). GAC1 contains 12 full-length
LRR motifs, and its LRR block is flanked by cysteine-rich
sequences. GAC1 is expressed in adult brain and at much lower
levels in adult heart and kidney (Almeida, A. et al. (1998)
supra).
[0008] Trophoblast glycoprotein (TPBG) is a protein that is
expressed by all types of trophoblasts as early as 9 weeks of
development, and was originally identified as a cell surface
antigen defined by monoclonal antibody 5T4. TPBG plays a role in
cell adhesion and motility and may be involved in metastasis,
placentation, and trophoblast invasion. Expression of TPBG in
gastric and colon cancers is associated with tumor metastasis and
poor prognosis (Boyle, J. et al. (1990) Hum. Genet 84: 455-458).
TPBG is expressed in several tumor cell lines (Myers, K. et al.
(1994) Biol. Chem. 269: 9319-9324).
[0009] The ability to manipulate the genomes of model organisms
such as Drosophila provides a powerful means to analyze biochemical
processes that, due to significant evolutionary conservation, have
direct relevance to more complex vertebrate organisms. Due to a
high level of gene and pathway conservation, the strong similarity
of cellular processes, and the functional conservation of genes
between these model organisms and mammals, identification of the
involvement of novel genes in particular pathways and their
functions in such model organisms can directly contribute to the
understanding of the correlative pathways and methods of modulating
them in mammals (see, for example, Mechler B M et al., 1985 EMBO J
4:1551-1557; Gateff E. 1982 Adv. Cancer Res. 37: 33-74; Watson K
L., et al., 1994 J Cell Sci. 18: 19-33; Miklos G L, and Rubin G M.
1996 Cell 86:521-529; Wassarman D A, et al., 1995 Curr Opin Gen Dev
5: 44-50; and Booth D R. 1999 Cancer Metastasis Rev. 18: 261-284).
For example, a genetic screen can be carried out in an invertebrate
model organism having underexpression (e.g. knockout) or
overexpression of a gene (referred to as a "genetic entry point")
that yields a visible phenotype. Additional genes are mutated in a
random or targeted manner. When a gene mutation changes the
original phenotype caused by the mutation in the genetic entry
point, the gene is identified as a "modifier" involved in the same
or overlapping pathway as the genetic entry point. When the genetic
entry point is an ortholog of a human gene implicated in a disease
pathway, such as p53, modifier genes can be identified that may be
attractive candidate targets for novel therapeutics.
[0010] All references cited herein, including patents, patent
applications, publications, and sequence information in referenced
Genbank identifier numbers, are incorporated herein in their
entireties.
SUMMARY OF THE INVENTION
[0011] We have discovered genes that modify the p53 pathway in
Drosophila, and identified their human orthologs, hereinafter
referred to as LRRCAPS (Leucine rich repeat, capricious related).
The invention provides methods for utilizing these p53 modifier
genes and polypeptides to identify LRRCAPS-modulating agents that
are candidate therapeutic agents that can be used in the treatment
of disorders associated with defective or impaired p53 function
and/or LRRCAPS function. Preferred LRRCAPS-modulating agents
specifically bind to LRRCAPS polypeptides and restore p53 function.
Other preferred LRRCAPS-modulating agents are nucleic acid
modulators such as antisense oligomers and RNAi that repress
LRRCAPS gene expression or product activity by, for example,
binding to and inhibiting the respective nucleic acid (i.e. DNA or
mRNA).
[0012] LRRCAPS modulating agents may be evaluated by any convenient
in vitro or in vivo assay for molecular interaction with an LRRCAPS
polypeptide or nucleic acid. In one embodiment, candidate LRRCAPS
modulating agents are tested with an assay system comprising a
LRRCAPS polypeptide or nucleic acid. Agents that produce a change
in the activity of the assay system relative to controls are
identified as candidate p53 modulating agents. The assay system may
be cell-based or cell-free. LRRCAPS-modulating agents include
LRRCAPS related proteins (e.g. dominant negative mutants, and
biotherapeutics); LRRCAPS-specific antibodies; LRRCAPS-specific
antisense oligomers and other nucleic acid modulators; and chemical
agents that specifically bind to or interact with LRRCAPS or
compete with LRRCAPS binding partner (e.g. by binding to an LRRCAPS
binding partner). In one specific embodiment, a small molecule
modulator is identified using a binding assay. In specific
embodiments, the screening assay system is selected from an
apoptosis assay, a cell proliferation assay, an angiogenesis assay,
and a hypoxic induction assay.
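The comparison described in this paragraph, in which an agent is flagged when the assay activity it produces differs from the control activity, can be sketched as follows. This is a minimal illustration only: the function name flag_candidates, the 25% relative-change cutoff, and the example readings are hypothetical and are not taken from the application, which requires only "a change" relative to controls.

```python
from statistics import mean


def flag_candidates(reference_readings, test_readings_by_agent,
                    min_relative_change=0.25):
    """Return {agent: relative change} for agents that shift assay activity.

    min_relative_change is an assumed, illustrative cutoff; the text only
    requires a difference between test and reference activity.
    """
    reference = mean(reference_readings)
    if reference == 0:
        raise ValueError("reference activity must be non-zero")
    hits = {}
    for agent, readings in test_readings_by_agent.items():
        # Signed change: negative means inhibition, positive means enhancement.
        change = (mean(readings) - reference) / reference
        if abs(change) >= min_relative_change:
            hits[agent] = change
    return hits


# Hypothetical readings from control wells and three test agents.
example_hits = flag_candidates(
    [100, 102, 98],
    {"agent_A": [50, 55, 45], "agent_B": [101, 99, 100], "agent_C": [140, 150, 160]},
)
```

Keeping the sign of the change lets the same routine report both inhibitors and enhancers of the assay activity, which matches the use of "modulating" in the text.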
[0013] In another embodiment, candidate p53 pathway modulating
agents are further tested using a second assay system that detects
changes in the p53 pathway, such as angiogenic, apoptotic, or cell
proliferation changes produced by the originally identified
candidate agent or an agent derived from the original agent. The
second assay system may use cultured cells or non-human animals. In
specific embodiments, the secondary assay system uses non-human
animals, including animals predetermined to have a disease or
disorder implicating the p53 pathway, such as an angiogenic,
apoptotic, or cell proliferation disorder (e.g. cancer).
[0014] The invention further provides methods for modulating the
LRRCAPS function and/or the p53 pathway in a mammalian cell by
contacting the mammalian cell with an agent that specifically binds
a LRRCAPS polypeptide or nucleic acid. The agent may be a small
molecule modulator, a nucleic acid modulator, or an antibody and
may be administered to a mammalian animal predetermined to have a
pathology associated with the p53 pathway.
DETAILED DESCRIPTION OF THE INVENTION
[0015] Genetic screens were designed to identify modifiers of the
p53 pathway in Drosophila, where a genetic modifier screen was
carried out in which p53 was overexpressed in the wing (Ollmann M,
et al., Cell 2000 101: 91-101). The CAPS gene was identified as a
modifier of the p53 pathway. Accordingly, vertebrate orthologs of
these modifiers, and preferably the human orthologs, LRRCAPS
(Leucine rich repeat, capricious related) genes (i.e., nucleic
acids and polypeptides) are attractive drug targets for the
treatment of pathologies associated with a defective p53 signaling
pathway, such as cancer.
[0016] In vitro and in vivo methods of assessing LRRCAPS function
are provided herein. Modulation of the LRRCAPS or their respective
binding partners is useful for understanding the association of the
p53 pathway and its members in normal and disease conditions and
for developing diagnostics and therapeutic modalities for p53
related pathologies. LRRCAPS-modulating agents that act by
inhibiting or enhancing LRRCAPS expression, directly or indirectly,
for example, by affecting an LRRCAPS function such as binding
activity, can be identified using methods provided herein. LRRCAPS
modulating agents are useful in diagnosis, therapy and
pharmaceutical development.
[0017] Nucleic Acids and Polypeptides of the Invention
[0018] Sequences related to LRRCAPS nucleic acids and polypeptides
that can be used in the invention are disclosed in Genbank
(referenced by Genbank identifier (GI) number) as GI#s 18073097
(SEQ ID NO: 1), 16157510 (SEQ ID NO: 3), 14758125 (SEQ ID NO: 4),
14149931 (SEQ ID NO: 5), 20547335 (SEQ ID NO: 7), 5453655 (SEQ ID
NO: 9), 3253212 (SEQ ID NO: 10), 21734210 (SEQ ID NO: 12), 21706505
(SEQ ID NO: 13), 14764197 (SEQ ID NO: 14), 5729717 (SEQ ID NO: 15),
and 435654 (SEQ ID NO: 18) for nucleic acid, and GI#s 11877257 (SEQ
ID NO: 19), 16157511 (SEQ ID NO: 20), 14758126 (SEQ ID NO: 21),
5453656 (SEQ ID NO: 22), 14764198 (SEQ ID NO: 23), and 5729718 (SEQ
ID NO: 24) for polypeptides. Further, nucleic acid sequences of SEQ
ID NOs: 2, 6, 8, 11, 16, and 17 can also be used in the methods of
the invention.
[0019] The term "LRRCAPS polypeptide" refers to a full-length
LRRCAPS protein or a functionally active fragment or derivative
thereof. A "functionally active" LRRCAPS fragment or derivative
exhibits one or more functional activities associated with a
full-length, wild-type LRRCAPS protein, such as antigenic or
immunogenic activity, ability to bind natural cellular substrates,
etc. The functional activity of LRRCAPS proteins, derivatives and
fragments can be assayed by various methods known to one skilled in
the art (Current Protocols in Protein Science (1998) Coligan et
al., eds., John Wiley & Sons, Inc., Somerset, N.J.) and as
further discussed below. In one embodiment, a functionally active
LRRCAPS polypeptide is a LRRCAPS derivative capable of rescuing
defective endogenous LRRCAPS activity, such as in cell based or
animal assays; the rescuing derivative may be from the same or a
different species. For purposes herein, functionally active
fragments also include those fragments that comprise one or more
structural domains of an LRRCAPS, such as a binding domain. Protein
domains can be identified using the PFAM program (Bateman A., et
al., Nucleic Acids Res, 1999, 27:260-2). For example, the
approximate amino acid locations of LRR (leucine rich repeat) and
IG domains of LRRCAPS from GI#s 11877257, 16157511, 14758126,
5453656, 14764198, and 5729718 (SEQ ID NOs: 19, 20, 21, 22, 23, and
24, respectively) are listed in Table 1 further below in Example 1.
Methods for obtaining LRRCAPS polypeptides are also further
described below. In some embodiments, preferred fragments are
functionally active, domain-containing fragments comprising at
least 25 contiguous amino acids, preferably at least 50, more
preferably 75, and most preferably at least 100 contiguous amino
acids of any one of SEQ ID NOs: 19, 20, 21, 22, 23, and 24 (an
LRRCAPS). In further preferred embodiments, the fragment comprises
the entire functionally active domain.
[0020] The term "LRRCAPS nucleic acid" refers to a DNA or RNA
molecule that encodes a LRRCAPS polypeptide. Preferably, the
LRRCAPS polypeptide or nucleic acid or fragment thereof is from a
human, but can also be an ortholog, or derivative thereof with at
least 70% sequence identity, preferably at least 80%, more
preferably 85%, still more preferably 90%, and most preferably at
least 95% sequence identity with human LRRCAPS. Methods of
identifying orthologs are known in the art. Normally, orthologs in
different species retain the same function, due to presence of one
or more protein motifs and/or 3-dimensional structures. Orthologs
are generally identified by sequence homology analysis, such as
BLAST analysis, usually using protein bait sequences. Sequences are
assigned as a potential ortholog if the best hit sequence from the
forward BLAST result retrieves the original query sequence in the
reverse BLAST (Huynen M A and Bork P, Proc Natl Acad Sci (1998)
95:5849-5856; Huynen M A et al., Genome Research (2000)
10:1204-1210). Programs for multiple sequence alignment, such as
CLUSTAL (Thompson J D et al, 1994, Nucleic Acids Res 22:4673-4680)
may be used to highlight conserved regions and/or residues of
orthologous proteins and to generate phylogenetic trees. In a
phylogenetic tree representing multiple homologous sequences from
diverse species (e.g., retrieved through BLAST analysis),
orthologous sequences from two species generally appear closest on
the tree with respect to all other sequences from these two
species. Structural threading or other analysis of protein folding
(e.g., using software by ProCeryon Biosciences, Salzburg, Austria)
may also identify potential orthologs. In evolution, when a gene
duplication event follows speciation, a single gene in one species,
such as Drosophila, may correspond to multiple genes (paralogs) in
another, such as human. As used herein, the term "orthologs"
encompasses paralogs. As used herein, "percent (%) sequence
identity" with respect to a subject sequence, or a specified
portion of a subject sequence, is defined as the percentage of
nucleotides or amino acids in the candidate derivative sequence
identical with the nucleotides or amino acids in the subject
sequence (or specified portion thereof), after aligning the
sequences and introducing gaps, if necessary to achieve the maximum
percent sequence identity, as generated by the program
WU-BLAST-2.0a19 (Altschul et al., J. Mol. Biol. (1997) 215:403-410)
with all the search parameters set to default values. The HSP S and
HSP S2 parameters are dynamic values and are established by the
program itself depending upon the composition of the particular
sequence and composition of the particular database against which
the sequence of interest is being searched. A % identity value is
determined by the number of matching identical nucleotides or amino
acids divided by the sequence length for which the percent identity
is being reported. "Percent (%) amino acid sequence similarity" is
determined by doing the same calculation as for determining % amino
acid sequence identity, but including conservative amino acid
substitutions in addition to identical amino acids in the
computation.
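The reciprocal BLAST criterion described above (a sequence is a potential ortholog if the best forward hit retrieves the original query as its own best hit in the reverse search) can be sketched as follows. This is a minimal sketch that assumes the BLAST searches have already been run and their hits parsed into plain dictionaries; the function names, the sequence identifiers, and the scores are hypothetical. Note that this strict one-to-one rule is narrower than the usage in the text, where "orthologs" also encompasses paralogs.

```python
def best_hit(hits):
    """Return the subject ID with the highest score, or None if no hits."""
    if not hits:
        return None
    return max(hits, key=lambda pair: pair[1])[0]


def reciprocal_best_hits(forward, reverse):
    """Return (query, subject) pairs that are each other's best hit.

    forward maps each query ID to a list of (subject ID, score) tuples;
    reverse maps each subject ID to a list of (query ID, score) tuples.
    """
    pairs = []
    for query, hits in forward.items():
        subject = best_hit(hits)
        if subject is not None and best_hit(reverse.get(subject, [])) == query:
            pairs.append((query, subject))
    return pairs


# Hypothetical parsed results: fly queries searched against human proteins,
# then the human hits searched back against fly proteins.
forward_hits = {
    "fly_caps": [("human_A", 250.0), ("human_B", 180.0)],
    "fly_x": [("human_B", 90.0)],
}
reverse_hits = {
    "human_A": [("fly_caps", 240.0)],
    "human_B": [("fly_caps", 170.0), ("fly_x", 85.0)],
}
ortholog_pairs = reciprocal_best_hits(forward_hits, reverse_hits)
```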
[0021] A conservative amino acid substitution is one in which an
amino acid is substituted for another amino acid having similar
properties such that the folding or activity of the protein is not
significantly affected. Aromatic amino acids that can be
substituted for each other are phenylalanine, tryptophan, and
tyrosine; interchangeable hydrophobic amino acids are leucine,
isoleucine, methionine, and valine; interchangeable polar amino
acids are glutamine and asparagine; interchangeable basic amino
acids are arginine, lysine and histidine; interchangeable acidic
amino acids are aspartic acid and glutamic acid; and
interchangeable small amino acids are alanine, serine, threonine,
cysteine and glycine.
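The percent identity and percent similarity calculations defined above can be illustrated with a short sketch that uses the interchangeable amino acid groups just listed. Two points are assumptions rather than statements from the application: the input is taken to be a pair of already aligned sequences of equal length with "-" marking gaps, and the denominator is taken to be the alignment length. The function names and the example sequences are hypothetical, and the sketch does not reproduce WU-BLAST output.

```python
# Interchangeable groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("FWY"),    # aromatic: phenylalanine, tryptophan, tyrosine
    set("LIMV"),   # hydrophobic: leucine, isoleucine, methionine, valine
    set("QN"),     # polar: glutamine, asparagine
    set("RKH"),    # basic: arginine, lysine, histidine
    set("DE"),     # acidic: aspartic acid, glutamic acid
    set("ASTCG"),  # small: alanine, serine, threonine, cysteine, glycine
]


def is_conservative(a, b):
    """True if residues a and b belong to the same interchangeable group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def percent_identity_and_similarity(aligned_a, aligned_b):
    """Return (% identity, % similarity) for two pre-aligned sequences.

    Assumption: the denominator is the alignment length, gaps included.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned, equal length, non-empty")
    identical = 0
    similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # a gap column counts toward length but never matches
        if a == b:
            identical += 1
            similar += 1
        elif is_conservative(a, b):
            similar += 1
    length = len(aligned_a)
    return 100.0 * identical / length, 100.0 * similar / length


# Hypothetical aligned pair: 3 identities and 2 conservative changes in 7 columns.
identity, similarity = percent_identity_and_similarity("MKLV-DE", "MRIVADQ")
```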
[0022] Alternatively, an alignment for nucleic acid sequences is
provided by the local homology algorithm of Smith and Waterman
(Smith and Waterman, 1981, Advances in Applied Mathematics
2:482-489; database: European Bioinformatics Institute; Smith and
Waterman, 1981, J. of Molec. Biol., 147:195-197; Nicholas et al.,
1998, "A Tutorial on Searching Sequence Databases and Sequence
Scoring Methods" (www.psc.edu) and references cited therein.; W. R.
Pearson, 1991, Genomics 11:635-650). This algorithm can be applied
to amino acid sequences by using the scoring matrix developed by
Dayhoff (Dayhoff: Atlas of Protein Sequences and Structure, M. O.
Dayhoff ed., 5 suppl. 3:353-358, National Biomedical Research
Foundation, Washington, D.C., U.S.A.), and normalized by Gribskov
(Gribskov 1986 Nucl. Acids Res. 14(6):6745-6763). The
Smith-Waterman algorithm may be employed where default parameters
are used for scoring (for example, gap open penalty of 12, gap
extension penalty of two). From the data generated, the "Match"
value reflects "sequence identity."
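For illustration, a local alignment score with the gap penalties mentioned above (gap open 12, gap extension 2) can be computed with the standard affine-gap form of the Smith-Waterman recurrence. This is a minimal sketch, not the implementation referenced in the application: the match and mismatch scores (+5 and -4) are assumed values, the convention that the first gapped position costs the open penalty and each further position costs the extension penalty is an assumption, and the function name is hypothetical. It returns only the best score, not the alignment or the "Match" value.

```python
def smith_waterman_score(seq_a, seq_b, match=5, mismatch=-4,
                         gap_open=12, gap_extend=2):
    """Best local alignment score with affine gap penalties.

    A gap of length k costs gap_open + (k - 1) * gap_extend (assumed
    convention). Match and mismatch scores are illustrative values.
    """
    neg_inf = float("-inf")
    cols = len(seq_b) + 1
    prev_h = [0] * cols        # best scores for the previous row
    prev_f = [neg_inf] * cols  # previous row, alignments ending in a gap in seq_b
    best = 0
    for i in range(1, len(seq_a) + 1):
        cur_h = [0] * cols
        cur_f = [neg_inf] * cols
        e = neg_inf  # current row, alignments ending in a gap in seq_a
        for j in range(1, cols):
            # Open a new gap from the best score or extend an existing one.
            e = max(cur_h[j - 1] - gap_open, e - gap_extend)
            cur_f[j] = max(prev_h[j] - gap_open, prev_f[j] - gap_extend)
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            # Local alignment: scores never drop below zero.
            cur_h[j] = max(0, prev_h[j - 1] + sub, e, cur_f[j])
            best = max(best, cur_h[j])
        prev_h, prev_f = cur_h, cur_f
    return best


# Hypothetical sequences: the second carries a three-base insertion.
gapped_score = smith_waterman_score("AAAAAAAACCCCCCCC", "AAAAAAAAGGGCCCCCCCC")
```

With penalties this high relative to the match score, a gap is opened only when the flanking matched regions are long enough to pay for it, which is why short insertions are often scored as mismatches instead.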
[0023] Derivative nucleic acid molecules of the subject nucleic
acid molecules include sequences that hybridize to the nucleic acid
sequence of any of SEQ ID NOs: 1-18. The stringency of
hybridization can be controlled by temperature, ionic strength, pH,
and the presence of denaturing agents such as formamide during
hybridization and washing. Conditions routinely used are set out in
readily available procedure texts (e.g., Current Protocols in
Molecular Biology, Vol. 1, Chap. 2.10, John Wiley & Sons,
Publishers (1994); Sambrook et al., Molecular Cloning, Cold Spring
Harbor (1989)). In some embodiments, a nucleic acid molecule of the
invention is capable of hybridizing to a nucleic acid molecule
containing the nucleotide sequence of any one of SEQ ID NOs: 1-18
under high stringency hybridization conditions that comprise:
prehybridization of filters containing nucleic acid for 8 hours to
overnight at 65 °C in a solution comprising 6× single
strength citrate (SSC) (1× SSC is 0.15 M NaCl, 0.015 M Na
citrate; pH 7.0), 5× Denhardt's solution, 0.05% sodium
pyrophosphate and 100 µg/ml herring sperm DNA; hybridization for
18-20 hours at 65 °C in a solution containing 6× SSC,
1× Denhardt's solution, 100 µg/ml yeast tRNA and 0.05%
sodium pyrophosphate; and washing of filters at 65 °C for
1 h in a solution containing 0.1× SSC and 0.1% SDS (sodium
dodecyl sulfate).
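Because the SSC concentrations above are given as multiples of the 1-fold stock (0.15 M NaCl, 0.015 M Na citrate), the salt concentrations in the 6-fold hybridization solution and the 0.1-fold wash solution follow by simple scaling. The sketch below shows that arithmetic; the function and constant names are hypothetical, and only the 1-fold composition comes from the text.

```python
# Composition of 1x SSC in mol/L, as defined in the text.
SSC_1X = {"NaCl": 0.15, "Na citrate": 0.015}


def ssc_molarity(fold):
    """Return the molar concentration of each solute at the given SSC strength."""
    if fold < 0:
        raise ValueError("fold must be non-negative")
    # Rounding removes floating-point noise from the scaled values.
    return {solute: round(fold * conc, 6) for solute, conc in SSC_1X.items()}


hybridization_salts = ssc_molarity(6)   # 6x SSC used for hybridization
wash_salts = ssc_molarity(0.1)          # 0.1x SSC used for the final wash
```

The 60-fold drop in salt between hybridization and the final wash is what makes the wash the more stringent step at the same temperature.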
[0024] In other embodiments, moderately stringent hybridization
conditions are used that comprise: pretreatment of filters
containing nucleic acid for 6 h at 40 °C in a solution
containing 35% formamide, 5× SSC, 50 mM Tris-HCl (pH 7.5), 5 mM
EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 µg/ml denatured
salmon sperm DNA; hybridization for 18-20 h at 40 °C in a
solution containing 35% formamide, 5× SSC, 50 mM Tris-HCl
(pH 7.5), 5 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 µg/ml
salmon sperm DNA, and 10% (wt/vol) dextran sulfate; followed by
washing twice for 1 hour at 55 °C in a solution containing
2× SSC and 0.1% SDS.
[0025] Alternatively, low stringency conditions can be used that
comprise: incubation for 8 hours to overnight at 37 °C in a
solution comprising 20% formamide, 5× SSC, 50 mM sodium
phosphate (pH 7.6), 5× Denhardt's solution, 10% dextran
sulfate, and 20 µg/ml denatured sheared salmon sperm DNA;
hybridization in the same buffer for 18 to 20 hours; and washing of
filters in 1× SSC at about 37 °C for 1 hour.
[0026] Isolation, Production, Expression, and Mis-expression of
LRRCAPS Nucleic Acids and Polypeptides
[0027] LRRCAPS nucleic acids and polypeptides are useful for
identifying and testing agents that modulate LRRCAPS function and
for other applications related to the involvement of LRRCAPS in the
p53 pathway. LRRCAPS nucleic acids and derivatives and orthologs
thereof may be obtained using any available method. For instance,
techniques for isolating cDNA or genomic DNA sequences of interest
by screening DNA libraries or by using polymerase chain reaction
(PCR) are well known in the art. In general, the particular use for
the protein will dictate the particulars of expression, production,
and purification methods. For instance, production of proteins for
use in screening for modulating agents may require methods that
preserve specific biological activities of these proteins, whereas
production of proteins for antibody generation may require
structural integrity of particular epitopes. Expression of proteins
to be purified for screening or antibody production may require the
addition of specific tags (e.g., generation of fusion proteins).
Overexpression of an LRRCAPS protein for assays used to assess
LRRCAPS function, such as involvement in cell cycle regulation or
hypoxic response, may require expression in eukaryotic cell lines
capable of these cellular activities. Techniques for the
expression, production, and purification of proteins are well known
in the art; any suitable means therefor may be used (e.g., Higgins
S J and Hames B D (eds.) Protein Expression: A Practical Approach,
Oxford University Press Inc., New York 1999; Stanbury P F et al.,
Principles of Fermentation Technology, 2nd edition, Elsevier
Science, New York, 1995; Doonan S (ed.) Protein Purification
Protocols, Humana Press, New Jersey, 1996; Coligan J E et al,
Current Protocols in Protein Science (eds.), 1999, John Wiley &
Sons, New York). In particular embodiments, recombinant LRRCAPS is
expressed in a cell line known to have defective p53 function (e.g.
SAOS-2 osteoblasts, H1299 lung cancer cells, C33A and HT3 cervical
cancer cells, HT-29 and DLD-1 colon cancer cells, among others,
available from American Type Culture Collection (ATCC), Manassas,
Va.). The recombinant cells are used in cell-based screening assay
systems of the invention, as described further below.
[0028] The nucleotide sequence encoding an LRRCAPS polypeptide can
be inserted into any appropriate expression vector. The necessary
transcriptional and translational signals, including
promoter/enhancer element, can derive from the native LRRCAPS gene
and/or its flanking regions or can be heterologous. A variety of
host-vector expression systems may be utilized, such as mammalian
cell systems infected with virus (e.g. vaccinia virus, adenovirus,
etc.); insect cell systems infected with virus (e.g. baculovirus);
microorganisms such as yeast containing yeast vectors, or bacteria
transformed with bacteriophage, plasmid, or cosmid DNA. An isolated
host cell strain that modulates the expression of, modifies, and/or
specifically processes the gene product may be used.
[0029] To detect expression of the LRRCAPS gene product, the
expression vector can comprise a promoter operably linked to an
LRRCAPS gene nucleic acid, one or more origins of replication, and
one or more selectable markers (e.g. thymidine kinase activity,
resistance to antibiotics, etc.). Alternatively, recombinant
expression vectors can be identified by assaying for the expression
of the LRRCAPS gene product based on the physical or functional
properties of the LRRCAPS protein in in vitro assay systems (e.g.
immunoassays).
[0030] The LRRCAPS protein, fragment, or derivative may be
optionally expressed as a fusion, or chimeric protein product (i.e.
it is joined via a peptide bond to a heterologous protein sequence
of a different protein), for example to facilitate purification or
detection. A chimeric product can be made by ligating the
appropriate nucleic acid sequences encoding the desired amino acid
sequences to each other using standard methods and expressing the
chimeric product. A chimeric product may also be made by protein
synthetic techniques, e.g. by use of a peptide synthesizer
(Hunkapiller et al., Nature (1984) 310:105-111).
[0031] Once a recombinant cell that expresses the LRRCAPS gene
sequence is identified, the gene product can be isolated and
purified using standard methods (e.g. ion exchange, affinity, and
gel exclusion chromatography; centrifugation; differential
solubility; electrophoresis). Alternatively, native LRRCAPS
proteins can be purified from natural sources, by standard methods
(e.g. immunoaffinity purification). Once a protein is obtained, it
may be quantified and its activity measured by appropriate methods,
such as immunoassay, bioassay, or other measurements of physical
properties, such as crystallography.
[0032] The methods of this invention may also use cells that have
been engineered for altered expression (mis-expression) of LRRCAPS
or other genes associated with the p53 pathway. As used herein,
mis-expression encompasses ectopic expression, over-expression,
under-expression, and non-expression (e.g. by gene knock-out or
blocking expression that would otherwise normally occur).
[0033] Genetically Modified Animals
[0034] Animal models that have been genetically modified to alter
LRRCAPS expression may be used in in vivo assays to test for
activity of a candidate p53 modulating agent, or to further assess
the role of LRRCAPS in a p53 pathway process such as apoptosis or
cell proliferation. Preferably, the altered LRRCAPS expression
results in a detectable phenotype, such as decreased or increased
levels of cell proliferation, angiogenesis, or apoptosis compared
to control animals having normal LRRCAPS expression. The
genetically modified animal may additionally have altered p53
expression (e.g. p53 knockout). Preferred genetically modified
animals are mammals such as primates, rodents (preferably mice or
rats), among others. Preferred non-mammalian species include
zebrafish, C. elegans, and Drosophila. Preferred genetically
modified animals are transgenic animals having a heterologous
nucleic acid sequence present as an extrachromosomal element in a
portion of their cells, i.e. mosaic animals (see, for example,
techniques described by Jakobovits, 1994, Curr. Biol. 4:761-763),
or stably integrated into their germ line DNA (i.e., in the genomic
sequence of most or all of their cells). Heterologous nucleic acid is
introduced into the germ line of such transgenic animals by genetic
manipulation of, for example, embryos or embryonic stem cells of
the host animal.
[0035] Methods of making transgenic animals are well-known in the
art (for transgenic mice see Brinster et al., Proc. Nat. Acad. Sci.
USA 82: 4438-4442 (1985), U.S. Pat. Nos. 4,736,866 and 4,870,009,
both by Leder et al., U.S. Pat. No. 4,873,191 by Wagner et al., and
Hogan, B., Manipulating the Mouse Embryo, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, N.Y., (1986); for particle
bombardment see U.S. Pat. No. 4,945,050 by Sanford et al.; for
transgenic Drosophila see Rubin and Spradling, Science (1982)
218:348-53 and U.S. Pat. No. 4,670,388; for transgenic insects see
Berghammer A. J. et al., A Universal Marker for Transgenic Insects
(1999) Nature 402:370-371; for transgenic Zebrafish see Lin S.,
Transgenic Zebrafish, Methods Mol Biol. (2000) 136:375-383; for
microinjection procedures for fish, amphibian eggs and birds see
Houdebine and Chourrout, Experientia (1991) 47:897-905; for
transgenic rats see Hammer et al., Cell (1990) 63:1099-1112; and
for culturing of embryonic stem (ES) cells and the subsequent
production of transgenic animals by the introduction of DNA into ES
cells using methods such as electroporation, calcium phosphate/DNA
precipitation and direct injection see, e.g., Teratocarcinomas and
Embryonic Stem Cells, A Practical Approach, E. J. Robertson, ed.,
IRL Press (1987)). Clones of the nonhuman transgenic animals can be
produced according to available methods (see Wilmut, I. et al.
(1997) Nature 385:810-813; and PCT International Publication Nos.
WO 97/07668 and WO 97/07669).
[0036] In one embodiment, the transgenic animal is a "knock-out"
animal having a heterozygous or homozygous alteration in the
sequence of an endogenous LRRCAPS gene that results in a decrease
of LRRCAPS function, preferably such that LRRCAPS expression is
undetectable or insignificant. Knock-out animals are typically
generated by homologous recombination with a vector comprising a
transgene having at least a portion of the gene to be knocked out.
Typically a deletion, addition or substitution has been introduced
into the transgene to functionally disrupt it. The transgene can be
a human gene (e.g., from a human genomic clone) but more preferably
is an ortholog of the human gene derived from the transgenic host
species. For example, a mouse LRRCAPS gene is used to construct a
homologous recombination vector suitable for altering an endogenous
LRRCAPS gene in the mouse genome. Detailed methodologies for
homologous recombination in mice are available (see Capecchi,
Science (1989) 244:1288-1292; Joyner et al., Nature (1989)
338:153-156). Procedures for the production of non-rodent
transgenic mammals and other animals are also available (Houdebine
and Chourrout, supra; Pursel et al., Science (1989) 244:1281-1288;
Simms et al., Bio/Technology (1988) 6:179-183). In a preferred
embodiment, knock-out animals, such as mice harboring a knockout of
a specific gene, may be used to produce antibodies against the
human counterpart of the gene that has been knocked out (Claesson M
H et al., (1994) Scand J Immunol 40:257-264; Declerck P J et al.,
(1995) J Biol Chem. 270:8397-400).
[0037] In another embodiment, the transgenic animal is a "knock-in"
animal having an alteration in its genome that results in altered
expression (e.g., increased (including ectopic) or decreased
expression) of the LRRCAPS gene, e.g., by introduction of
additional copies of LRRCAPS, or by operatively inserting a
regulatory sequence that provides for altered expression of an
endogenous copy of the LRRCAPS gene. Such regulatory sequences
include inducible, tissue-specific, and constitutive promoters and
enhancer elements. The knock-in can be homozygous or
heterozygous.
[0038] Transgenic nonhuman animals can also be produced that
contain selected systems allowing for regulated expression of the
transgene. One example of such a system that may be produced is the
cre/loxP recombinase system of bacteriophage P1 (Lakso et al., PNAS
(1992) 89:6232-6236; U.S. Pat. No. 4,959,317). If a cre/loxP
recombinase system is used to regulate expression of the transgene,
animals containing transgenes encoding both the Cre recombinase and
a selected protein are required. Such animals can be provided
through the construction of "double" transgenic animals, e.g., by
mating two transgenic animals, one containing a transgene encoding
a selected protein and the other containing a transgene encoding a
recombinase. Another example of a recombinase system is the FLP
recombinase system of Saccharomyces cerevisiae (O'Gorman et al.
(1991) Science 251:1351-1355; U.S. Pat. No. 5,654,182). In a
preferred embodiment, both Cre-LoxP and Flp-Frt are used in the
same system to regulate expression of the transgene, and for
sequential deletion of vector sequences in the same cell (Sun X et
al (2000) Nat Genet 25:83-6).
[0039] The genetically modified animals can be used in genetic
studies to further elucidate the p53 pathway, as animal models of
disease and disorders implicating defective p53 function, and for
in vivo testing of candidate therapeutic agents, such as those
identified in screens described below. The candidate therapeutic
agents are administered to a genetically modified animal having
altered LRRCAPS function and phenotypic changes are compared with
appropriate control animals such as genetically modified animals
that receive placebo treatment, and/or animals with unaltered
LRRCAPS expression that receive candidate therapeutic agent.
[0040] In addition to the above-described genetically modified
animals having altered LRRCAPS function, animal models having
defective p53 function (and otherwise normal LRRCAPS function), can
be used in the methods of the present invention. For example, a p53
knockout mouse can be used to assess, in vivo, the activity of a
candidate p53 modulating agent identified in one of the in vitro
assays described below. p53 knockout mice are described in the
literature (Jacks et al., Nature 2001;410:1111-1116, 1043-1044;
Donehower et al., supra). Preferably, the candidate p53 modulating
agent when administered to a model system with cells defective in
p53 function, produces a detectable phenotypic change in the model
system indicating that the p53 function is restored, i.e., the
cells exhibit normal cell cycle progression.
[0041] Modulating Agents
[0042] The invention provides methods to identify agents that
interact with and/or modulate the function of LRRCAPS and/or the
p53 pathway. Modulating agents identified by the methods are also
part of the invention. Such agents are useful in a variety of
diagnostic and therapeutic applications associated with the p53
pathway, as well as in further analysis of the LRRCAPS protein and
its contribution to the p53 pathway. Accordingly, the invention
also provides methods for modulating the p53 pathway comprising the
step of specifically modulating LRRCAPS activity by administering an
LRRCAPS-interacting or -modulating agent.
[0043] As used herein, an "LRRCAPS-modulating agent" is any agent
that modulates LRRCAPS function, for example, an agent that
interacts with LRRCAPS to inhibit or enhance LRRCAPS activity or
otherwise affect normal LRRCAPS function. LRRCAPS function can be
affected at any level, including transcription, protein expression,
protein localization, and cellular or extra-cellular activity. In a
preferred embodiment, the LRRCAPS-modulating agent specifically
modulates the function of the LRRCAPS. The phrases "specific
modulating agent", "specifically modulates", etc., are used herein
to refer to modulating agents that directly bind to the LRRCAPS
polypeptide or nucleic acid, and preferably inhibit, enhance, or
otherwise alter, the function of the LRRCAPS. These phrases also
encompass modulating agents that alter the interaction of the
LRRCAPS with a binding partner, substrate, or cofactor (e.g. by
binding to a binding partner of an LRRCAPS, or to a protein/binding
partner complex, and altering LRRCAPS function). In a further
preferred embodiment, the LRRCAPS-modulating agent is a modulator
of the p53 pathway (e.g. it restores and/or upregulates p53
function) and thus is also a p53-modulating agent.
[0044] Preferred LRRCAPS-modulating agents include small molecule
compounds; LRRCAPS-interacting proteins, including antibodies and
other biotherapeutics; and nucleic acid modulators such as
antisense and RNA inhibitors. The modulating agents may be
formulated in pharmaceutical compositions, for example, as
compositions that may comprise other active ingredients, as in
combination therapy, and/or suitable carriers or excipients.
Techniques for formulation and administration of the compounds may
be found in "Remington's Pharmaceutical Sciences" Mack Publishing
Co., Easton, Pa., 19th edition.
[0045] Small Molecule Modulators
[0046] Small molecules are often preferred to modulate function of
proteins with enzymatic function, and/or containing protein
interaction domains. Chemical agents, referred to in the art as
"small molecule" compounds are typically organic, non-peptide
molecules, having a molecular weight less than 10,000, preferably
less than 5,000, more preferably less than 1,000, and most
preferably less than 500. This class of modulators includes
chemically synthesized molecules, for instance, compounds from
combinatorial chemical libraries. Synthetic compounds may be
rationally designed or identified based on known or inferred
properties of the LRRCAPS protein or may be identified by screening
compound libraries. Alternative appropriate modulators of this
class are natural products, particularly secondary metabolites from
organisms such as plants or fungi, which can also be identified by
screening compound libraries for LRRCAPS-modulating activity.
Methods for generating and obtaining compounds are well known in
the art (Schreiber S L, Science (2000) 151: 1964-1969; Radmann J
and Gunther J, Science (2000) 151:1947-1948).
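By way of illustration only, the molecular weight preferences stated above can be expressed as a simple classification. The following Python sketch is not part of the disclosed methods; the function name and tier labels are hypothetical, and the weights are assumed to be expressed in daltons.

```python
def molecular_weight_tier(mw: float) -> str:
    """Classify a compound by the molecular weight preferences in the text.

    The tier labels are hypothetical names chosen for this sketch.
    """
    if mw <= 0:
        raise ValueError("molecular weight must be positive")
    if mw < 500:
        return "most preferred"
    if mw < 1_000:
        return "more preferred"
    if mw < 5_000:
        return "preferred"
    if mw < 10_000:
        return "typical small molecule"
    return "outside small-molecule range"


if __name__ == "__main__":
    # Hypothetical library entries: (name, molecular weight)
    for name, mw in [("compound A", 320.4), ("compound B", 1450.0)]:
        print(name, molecular_weight_tier(mw))
```

Such a filter might be applied to a compound library before screening, so that only compounds in the desired weight range are carried forward.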
[0047] Small molecule modulators identified from screening assays,
as described below, can be used as lead compounds from which
candidate clinical compounds may be designed, optimized, and
synthesized. Such clinical compounds may have utility in treating
pathologies associated with the p53 pathway. The activity of
candidate small molecule modulating agents may be improved
several-fold through iterative secondary functional validation, as
further described below, structure determination, and candidate
modulator modification and testing. Additionally, candidate
clinical compounds are generated with specific regard to clinical
and pharmacological properties. For example, the reagents may be
derivatized and re-screened using in vitro and in vivo assays to
optimize activity and minimize toxicity for pharmaceutical
development.
[0048] Protein Modulators
[0049] Specific LRRCAPS-interacting proteins are useful in a
variety of diagnostic and therapeutic applications related to the
p53 pathway and related disorders, as well as in validation assays
for other LRRCAPS-modulating agents. In a preferred embodiment,
LRRCAPS-interacting proteins affect normal LRRCAPS function,
including transcription, protein expression, protein localization,
and cellular or extra-cellular activity. In another embodiment,
LRRCAPS-interacting proteins are useful in detecting and providing
information about the function of LRRCAPS proteins, as is relevant
to p53 related disorders, such as cancer (e.g., for diagnostic
means).
[0050] An LRRCAPS-interacting protein may be endogenous, i.e. one
that naturally interacts genetically or biochemically with an
LRRCAPS, such as a member of the LRRCAPS pathway that modulates
LRRCAPS expression, localization, and/or activity.
LRRCAPS-modulators include dominant negative forms of
LRRCAPS-interacting proteins and of LRRCAPS proteins themselves.
Yeast two-hybrid and variant screens offer preferred methods for
identifying endogenous LRRCAPS-interacting proteins (Finley, R. L.
et al. (1996) in DNA Cloning-Expression Systems: A Practical
Approach, eds. Glover D. & Hames B. D (Oxford University Press,
Oxford, England), pp. 169-203; Fashema S F et al., Gene (2000)
250:1-14; Drees B L Curr Opin Chem Biol (1999) 3:64-70; Vidal M and
Legrain P Nucleic Acids Res (1999) 27:919-29; and U.S. Pat. No.
5,928,868). Mass spectrometry is an alternative preferred method
for the elucidation of protein complexes (reviewed in, e.g.,
Pandey A and Mann M, Nature (2000) 405:837-846; Yates J R
3rd, Trends Genet (2000) 16:5-8).
[0051] An LRRCAPS-interacting protein may be an exogenous protein,
such as an LRRCAPS-specific antibody or a T-cell antigen receptor
(see, e.g., Harlow and Lane (1988) Antibodies, A Laboratory Manual,
Cold Spring Harbor Laboratory; Harlow and Lane (1999) Using
antibodies: a laboratory manual. Cold Spring Harbor, N.Y.: Cold
Spring Harbor Laboratory Press). LRRCAPS antibodies are further
discussed below.
[0052] In preferred embodiments, an LRRCAPS-interacting protein
specifically binds an LRRCAPS protein. In alternative preferred
embodiments, an LRRCAPS-modulating agent binds an LRRCAPS
substrate, binding partner, or cofactor.
[0053] Antibodies
[0054] In another embodiment, the protein modulator is an LRRCAPS
specific antibody agonist or antagonist.
[0055] The antibodies have therapeutic and diagnostic utilities,
and can be used in screening assays to identify LRRCAPS modulators.
The antibodies can also be used in dissecting the portions of the
LRRCAPS pathway responsible for various cellular responses and in
the general processing and maturation of the LRRCAPS.
[0056] Antibodies that specifically bind LRRCAPS polypeptides can
be generated using known methods. Preferably the antibody is
specific to a mammalian ortholog of LRRCAPS polypeptide, and more
preferably, to human LRRCAPS. Antibodies may be polyclonal,
monoclonal (mAbs), humanized or chimeric antibodies, single chain
antibodies, Fab fragments, F(ab')2 fragments, fragments
produced by a Fab expression library, anti-idiotypic (anti-Id)
antibodies, and epitope-binding fragments of any of the above.
Epitopes of LRRCAPS that are particularly antigenic can be
selected, for example, by routine screening of LRRCAPS polypeptides
for antigenicity or by applying a theoretical method for selecting
antigenic regions of a protein (Hopp and Woods (1981), Proc. Natl.
Acad. Sci. U.S.A. 78:3824-28; Hopp and Woods, (1983) Mol. Immunol.
20:483-89, Sutcliffe et al., (1983) Science 219:660-66) to the
amino acid sequence shown in any of SEQ ID NOs: 19-24. Monoclonal
antibodies with affinities of 10^8 M^-1, preferably
10^9 M^-1 to 10^10 M^-1, or stronger, can be made by
standard procedures as described (Harlow and Lane, supra; Goding
(1986) Monoclonal Antibodies: Principles and Practice (2d ed)
Academic Press, New York; and U.S. Pat. Nos. 4,381,292, 4,451,570,
and 4,618,577). Antibodies may be generated against crude cell
extracts of LRRCAPS or substantially purified fragments thereof. If
LRRCAPS fragments are used, they preferably comprise at least 10,
and more preferably, at least 20 contiguous amino acids of an
LRRCAPS protein. In a particular embodiment, LRRCAPS-specific
antigens and/or immunogens are coupled to carrier proteins that
stimulate the immune response. For example, the subject
polypeptides are covalently coupled to the keyhole limpet
hemocyanin (KLH) carrier, and the conjugate is emulsified in
Freund's complete adjuvant, which enhances the immune response. An
appropriate immune system such as a laboratory rabbit or mouse is
immunized according to conventional protocols.
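By way of illustration only, a theoretical selection of antigenic regions of the kind referred to above can be sketched as a sliding-window average of per-residue hydrophilicity values, with the highest-scoring window taken as a candidate epitope. The following Python sketch makes several assumptions that are not in the text: the six-residue window, the function name, the partial scale, and the toy sequence are illustrative only, and the sequence is not taken from any SEQ ID NO. A real analysis would supply a complete published scale.

```python
def best_hydrophilic_window(sequence, scale, window=6):
    """Return (start, fragment, mean score) of the most hydrophilic window.

    `scale` maps one-letter amino acid codes to hydrophilicity values.
    Ties are resolved in favor of the earliest window.
    """
    seq = sequence.upper()
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")
    best_start, best_score = 0, float("-inf")
    for start in range(len(seq) - window + 1):
        fragment = seq[start:start + window]
        score = sum(scale[aa] for aa in fragment) / window
        if score > best_score:
            best_start, best_score = start, score
    return best_start, seq[best_start:best_start + window], best_score


# Illustrative values for a handful of residues only (assumption).
TOY_SCALE = {"K": 3.0, "D": 3.0, "E": 3.0, "R": 3.0,
             "G": 0.0, "A": -0.5, "V": -1.5, "L": -1.8}

if __name__ == "__main__":
    # Hypothetical 14-residue sequence.
    print(best_hydrophilic_window("ALVGAKDERKLAVL", TOY_SCALE))
```

A window identified in this way would then be extended to a fragment of at least 10, and preferably at least 20, contiguous amino acids before being used as an immunogen.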
[0057] The presence of LRRCAPS-specific antibodies is assayed by an
appropriate assay such as a solid phase enzyme-linked immunosorbent
assay (ELISA) using immobilized corresponding LRRCAPS polypeptides.
Other assays, such as radioimmunoassays or fluorescent assays might
also be used.
[0058] Chimeric antibodies specific to LRRCAPS polypeptides can be
made that contain different portions from different animal species.
For instance, a human immunoglobulin constant region may be linked
to a variable region of a murine mAb, such that the antibody
derives its biological activity from the human antibody, and its
binding specificity from the murine fragment. Chimeric antibodies
are produced by splicing together genes that encode the appropriate
regions from each species (Morrison et al., Proc. Natl. Acad. Sci.
(1984) 81:6851-6855; Neuberger et al., Nature (1984) 312:604-608;
Takeda et al., Nature (1985) 314:452-454). Humanized antibodies,
which are a form of chimeric antibodies, can be generated by
grafting complementarity-determining regions (CDRs) (Carlos, T. M.,
J. M. Harlan. 1994. Blood 84:2068-2101) of mouse antibodies into a
background of human framework regions and constant regions by
recombinant DNA technology (Riechmann L M, et al., 1988 Nature 332:
323-327). Humanized antibodies contain about 10% murine sequences
and about 90% human sequences, and thus further reduce or
eliminate immunogenicity, while retaining the antibody
specificities (Co M S, and Queen C. 1991 Nature 351: 501-502;
Morrison S L. 1992 Ann. Rev. Immun. 10:239-265). Humanized
antibodies and methods of their production are well-known in the
art (U.S. Pat. Nos. 5,530,101, 5,585,089, 5,693,762, and
6,180,370).
[0059] LRRCAPS-specific single chain antibodies, which are
recombinant, single chain polypeptides formed by linking the heavy
and light chain fragments of the Fv regions via an amino acid
bridge, can be produced by methods known in the art (U.S. Pat. No.
4,946,778; Bird, Science (1988) 242:423-426; Huston et al., Proc.
Natl. Acad. Sci. USA (1988) 85:5879-5883; and Ward et al., Nature
(1989) 334:544-546).
[0060] Other suitable techniques for antibody production involve in
vitro exposure of lymphocytes to the antigenic polypeptides or,
alternatively, selection of libraries of antibodies in phage or
similar vectors (Huse et al., Science (1989) 246:1275-1281). As
used herein, T-cell antigen receptors are included within the scope
of antibody modulators (Harlow and Lane, 1988, supra).
[0061] The polypeptides and antibodies of the present invention may
be used with or without modification. Frequently, antibodies will
be labeled by joining, either covalently or non-covalently, a
substance that provides for a detectable signal, or that is toxic
to cells that express the targeted protein (Menard S, et al., Int
J. Biol Markers (1989) 4:131-134). A wide variety of labels and
conjugation techniques are known and are reported extensively in
both the scientific and patent literature. Suitable labels include
radionuclides, enzymes, substrates, cofactors, inhibitors,
fluorescent moieties, fluorescent emitting lanthanide metals,
chemiluminescent moieties, bioluminescent moieties, magnetic
particles, and the like (U.S. Pat. Nos. 3,817,837, 3,850,752,
3,939,350, 3,996,345, 4,277,437, 4,275,149, and 4,366,241). Also,
recombinant immunoglobulins may be produced (U.S. Pat. No.
4,816,567). Antibodies to cytoplasmic polypeptides may be delivered
and reach their targets by conjugation with membrane-penetrating
toxin proteins (U.S. Pat. No. 6,086,900).
[0062] When used therapeutically in a patient, the antibodies of
the subject invention are typically administered parenterally, when
possible at the target site, or intravenously. The therapeutically
effective dose and dosage regimen is determined by clinical
studies. Typically, the amount of antibody administered is in the
range of about 0.1 mg/kg to about 10 mg/kg of patient weight. For
parenteral administration, the antibodies are formulated in a unit
dosage injectable form (e.g., solution, suspension, emulsion) in
association with a pharmaceutically acceptable vehicle. Such
vehicles are inherently nontoxic and non-therapeutic. Examples are
water, saline, Ringer's solution, dextrose solution, and 5% human
serum albumin. Nonaqueous vehicles such as fixed oils, ethyl
oleate, or liposome carriers may also be used. The vehicle may
contain minor amounts of additives, such as buffers and
preservatives, which enhance isotonicity and chemical stability or
otherwise enhance therapeutic potential. The antibodies'
concentrations in such vehicles are typically in the range of about
1 mg/ml to about 10 mg/ml. Immunotherapeutic methods are further
described in the literature (U.S. Pat. No. 5,859,206;
WO0073469).
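By way of illustration only, the arithmetic implied by the dose and concentration ranges above can be sketched as follows. The Python functions below are hypothetical helpers, not part of the disclosed methods; they simply multiply patient weight by the per-kilogram dose and divide by the formulation concentration, and report whether the inputs fall within the typical ranges stated in the text. The therapeutically effective dose itself remains a matter for clinical studies.

```python
def antibody_dose(weight_kg, dose_mg_per_kg, conc_mg_per_ml):
    """Return (total antibody in mg, injection volume in ml)."""
    if weight_kg <= 0 or dose_mg_per_kg <= 0 or conc_mg_per_ml <= 0:
        raise ValueError("all inputs must be positive")
    total_mg = weight_kg * dose_mg_per_kg
    volume_ml = total_mg / conc_mg_per_ml
    return total_mg, volume_ml


def within_typical_ranges(dose_mg_per_kg, conc_mg_per_ml):
    """True if both values lie in the typical ranges given in the text:
    about 0.1 to 10 mg/kg, and about 1 to 10 mg/ml."""
    return 0.1 <= dose_mg_per_kg <= 10.0 and 1.0 <= conc_mg_per_ml <= 10.0


if __name__ == "__main__":
    # Hypothetical patient and formulation values.
    print(antibody_dose(70, 1.0, 5.0))
    print(within_typical_ranges(1.0, 5.0))
```

For a hypothetical 70 kg patient dosed at 1 mg/kg from a 5 mg/ml formulation, the sketch gives 70 mg of antibody in a volume of 14 ml.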
[0063] Specific Biotherapeutics
[0064] In a preferred embodiment, an LRRCAPS-interacting protein
may have biotherapeutic applications. Biotherapeutic agents
formulated in pharmaceutically acceptable carriers and dosages may
be used to activate or inhibit signal transduction pathways. This
modulation may be accomplished by binding a ligand, thus inhibiting
the activity of the pathway; or by binding a receptor, either to
inhibit activation of, or to activate, the receptor. Alternatively,
the biotherapeutic may itself be a ligand capable of activating or
inhibiting a receptor. Biotherapeutic agents and methods of
producing them are described in detail in U.S. Pat. No.
6,146,628.
[0065] LRRCAPS, its ligand(s), antibodies to the ligand(s) or the
LRRCAPS itself may be used as biotherapeutics to modulate the
activity of LRRCAPS in the p53 pathway.
[0066] Nucleic Acid Modulators
[0067] Other preferred LRRCAPS-modulating agents comprise nucleic
acid molecules, such as antisense oligomers or double stranded RNA
(dsRNA), which generally inhibit LRRCAPS activity. Preferred
nucleic acid modulators interfere with the function of the LRRCAPS
nucleic acid such as DNA replication, transcription, translocation
of the LRRCAPS RNA to the site of protein translation, translation
of protein from the LRRCAPS RNA, splicing of the LRRCAPS RNA to
yield one or more mRNA species, or catalytic activity which may be
engaged in or facilitated by the LRRCAPS RNA.
[0068] In one embodiment, the antisense oligomer is an
oligonucleotide that is sufficiently complementary to an LRRCAPS
mRNA to bind to and prevent translation, preferably by binding to
the 5' untranslated region. LRRCAPS-specific antisense
oligonucleotides preferably range from at least 6 to about 200
nucleotides. In some embodiments the oligonucleotide is preferably
at least 10, 15, or 20 nucleotides in length. In other embodiments,
the oligonucleotide is preferably less than 50, 40, or 30
nucleotides in length. The oligonucleotide can be DNA or RNA or a
chimeric mixture or derivatives or modified versions thereof,
single-stranded or double-stranded. The oligonucleotide can be
modified at the base moiety, sugar moiety, or phosphate backbone.
The oligonucleotide may include other appending groups such as
peptides, agents that facilitate transport across the cell
membrane, hybridization-triggered cleavage agents, and
intercalating agents.
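By way of illustration only, the complementarity and length requirements described above can be sketched as a reverse-complement calculation. The following Python sketch is not part of the disclosed methods; the function name is hypothetical, the oligonucleotide is assumed to be unmodified DNA, and the target sequence is a made-up example rather than an LRRCAPS sequence.

```python
# Watson-Crick partners: mRNA base -> DNA base on the antisense strand.
_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_oligo(mrna_target, min_len=6, max_len=200):
    """Return the DNA antisense oligonucleotide (5'->3') for an mRNA
    target region given 5'->3'.

    The default length bounds follow the 6 to about 200 nucleotide
    range stated in the text.
    """
    seq = mrna_target.upper()
    if not min_len <= len(seq) <= max_len:
        raise ValueError("target length outside the allowed range")
    try:
        # Reverse the target, then complement each base.
        return "".join(_COMPLEMENT[base] for base in reversed(seq))
    except KeyError as exc:
        raise ValueError(f"unexpected base in mRNA: {exc.args[0]}") from exc


if __name__ == "__main__":
    # Hypothetical 12-nucleotide target region.
    print(antisense_oligo("AUGGCCAAGCUU"))
```

Narrower bounds, such as at least 15 and less than 30 nucleotides, can be imposed by passing different values for `min_len` and `max_len`.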
[0069] In another embodiment, the antisense oligomer is a
phosphorodiamidate morpholino oligomer (PMO). PMOs are assembled from
four different morpholino subunits, each of which contains one of
four genetic bases (A, C, G, or T) linked to a six-membered
morpholine ring. Polymers of these subunits are joined by non-ionic
phosphorodiamidate intersubunit linkages. Details of how to make and
use PMOs and other antisense oligomers are well known in the art
(e.g. see WO99/18193; Probst J C, Antisense Oligodeoxynucleotide
and Ribozyme Design, Methods. (2000) 22(3):271-281; Summerton J,
and Weller D. 1997 Antisense Nucleic Acid Drug Dev. 7:187-95; U.S.
Pat. No. 5,235,033 and U.S. Pat. No. 5,378,841).
[0070] Alternative preferred LRRCAPS nucleic acid modulators are
double-stranded RNA species mediating RNA interference (RNAi). RNAi
is the process of sequence-specific, post-transcriptional gene
silencing in animals and plants, initiated by double-stranded RNA
(dsRNA) that is homologous in sequence to the silenced gene.
Methods relating to the use of RNAi to silence genes in C. elegans,
Drosophila, plants, and humans are known in the art (Fire A, et
al., 1998 Nature 391:806-811; Fire, A. Trends Genet. 15, 358-363
(1999); Sharp, P. A. RNA interference 2001. Genes Dev. 15, 485-490
(2001); Hammond, S. M., et al., Nature Rev. Genet. 2, 110-119
(2001); Tuschl, T. ChemBioChem 2, 239-245 (2001); Hamilton, A.
et al., Science 286, 950-952 (1999); Hammond, S. M., et al., Nature
404, 293-296 (2000); Zamore, P. D., et al., Cell 101, 25-33 (2000);
Bernstein, E., et al., Nature 409, 363-366 (2001); Elbashir, S. M.,
et al., Genes Dev. 15, 188-200 (2001); WO0129058; WO9932619;
Elbashir S M, et al., 2001 Nature 411:494-498).
[0071] Nucleic acid modulators are commonly used as research
reagents, diagnostics, and therapeutics. For example, antisense
oligonucleotides, which are able to inhibit gene expression with
exquisite specificity, are often used to elucidate the function of
particular genes (see, for example, U.S. Pat. No. 6,165,790).
Nucleic acid modulators are also used, for example, to distinguish
between functions of various members of a biological pathway. For
example, antisense oligomers have been employed as therapeutic
moieties in the treatment of disease states in animals and man and
have been demonstrated in numerous clinical trials to be safe and
effective (Milligan J F, et al, Current Concepts in Antisense Drug
Design, J Med Chem. (1993) 36:1923-1937; Tonkinson J L et al.,
Antisense Oligodeoxynucleotides as Clinical Therapeutic Agents,
Cancer Invest. (1996) 14:54-65). Accordingly, in one aspect of the
invention, an LRRCAPS-specific nucleic acid modulator is used in an
assay to further elucidate the role of the LRRCAPS in the p53
pathway, and/or its relationship to other members of the pathway.
In another aspect of the invention, an LRRCAPS-specific antisense
oligomer is used as a therapeutic agent for treatment of
p53-related disease states.
[0072] Assay Systems
[0073] The invention provides assay systems and screening methods
for identifying specific modulators of LRRCAPS activity. As used
herein, an "assay system" encompasses all the components required
for performing and analyzing results of an assay that detects
and/or measures a particular event. In general, primary assays are
used to identify or confirm a modulator's specific biochemical or
molecular effect with respect to the LRRCAPS nucleic acid or
protein. In general, secondary assays further assess the activity
of an LRRCAPS-modulating agent identified by a primary assay and may
confirm that the modulating agent affects LRRCAPS in a manner
relevant to the p53 pathway. In some cases, LRRCAPS modulators will
be directly tested in a secondary assay.
[0074] In a preferred embodiment, the screening method comprises
contacting a suitable assay system comprising an LRRCAPS
polypeptide or nucleic acid with a candidate agent under conditions
whereby, but for the presence of the agent, the system provides a
reference activity (e.g. binding activity), which is based on the
particular molecular event the screening method detects. A
statistically significant difference between the agent-biased
activity and the reference activity indicates that the candidate
agent modulates LRRCAPS activity, and hence the p53 pathway. The
LRRCAPS polypeptide or nucleic acid used in the assay may comprise
any of the nucleic acids or polypeptides described above.
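By way of illustration only, the comparison between the agent-biased activity and the reference activity can be sketched as a significance test on replicate measurements. The text does not specify a statistical test; the following Python sketch assumes an exact two-sided permutation test on the difference in group means, and the function name and the activity values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(reference, treated):
    """Exact two-sided permutation p-value for a difference in means.

    Every way of relabeling the pooled measurements into groups of the
    original sizes is enumerated, so this suits small replicate counts.
    """
    pooled = list(reference) + list(treated)
    n_ref, n_trt = len(reference), len(treated)
    if n_ref == 0 or n_trt == 0:
        raise ValueError("both groups need at least one measurement")
    observed = abs(mean(treated) - mean(reference))
    total = sum(pooled)
    extreme = 0
    n_perm = 0
    for idx in combinations(range(len(pooled)), n_ref):
        ref_sum = sum(pooled[i] for i in idx)
        diff = (total - ref_sum) / n_trt - ref_sum / n_ref
        n_perm += 1
        # Small tolerance guards against floating-point rounding.
        if abs(diff) >= observed - 1e-12:
            extreme += 1
    return extreme / n_perm


if __name__ == "__main__":
    # Hypothetical normalized binding activities from four replicates each.
    reference = [1.00, 0.98, 1.02, 1.01]
    with_agent = [0.50, 0.55, 0.48, 0.52]
    print(permutation_p_value(reference, with_agent))
```

The threshold below which a p-value is treated as significant, and the number of replicates needed to reach it, are choices left to the experimenter.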
[0075] Primary Assays
[0076] The type of modulator tested generally determines the type
of primary assay.
[0077] Primary Assays for Small Molecule Modulators
[0078] For small molecule modulators, screening assays are used to
identify candidate modulators. Screening assays may be cell-based
or may use a cell-free system that recreates or retains the
relevant biochemical reaction of the target protein (reviewed in
Sittampalam G S et al., Curr Opin Chem Biol (1997) 1:384-91 and
accompanying references). As used herein the term "cell-based"
refers to assays using live cells, dead cells, or a particular
cellular fraction, such as a membrane, endoplasmic reticulum, or
mitochondrial fraction. The term "cell free" encompasses assays
using substantially purified protein (either endogenous or
recombinantly produced), partially purified or crude cellular
extracts. Screening assays may detect a variety of molecular
events, including protein-DNA interactions, protein-protein
interactions (e.g., receptor-ligand binding), transcriptional
activity (e.g., using a reporter gene), enzymatic activity (e.g.,
via a property of the substrate), activity of second messengers,
immunogenicity, and changes in cellular morphology or other cellular
characteristics. Appropriate screening assays may use a wide range
of detection methods including fluorescent, radioactive,
colorimetric, spectrophotometric, and amperometric methods, to
provide a read-out for the particular molecular event detected.
[0079] Cell-based screening assays usually require systems for
recombinant expression of LRRCAPS and any auxiliary proteins
demanded by the particular assay. Appropriate methods for
generating recombinant proteins produce sufficient quantities of
proteins that retain their relevant biological activities and are
of sufficient purity to optimize activity and assure assay
reproducibility. Yeast two-hybrid and variant screens, and mass
spectrometry provide preferred methods for determining
protein-protein interactions and elucidation of protein complexes.
In certain applications, when LRRCAPS-interacting proteins are used
in screens to identify small molecule modulators, the binding
specificity of the interacting protein to the LRRCAPS protein may
be assayed by various known methods such as substrate processing
(e.g. ability of the candidate LRRCAPS-specific binding agents to
function as negative effectors in LRRCAPS-expressing cells),
binding equilibrium constants (usually at least about 10.sup.7
M.sup.-1, preferably at least about 10.sup.8 M.sup.-1, more
preferably at least about 10.sup.9 M.sup.-1), and immunogenicity
(e.g. ability to elicit LRRCAPS specific antibody in a heterologous
host such as a mouse, rat, goat or rabbit). For enzymes and
receptors, binding may be assayed by, respectively, substrate and
ligand processing.
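For orientation, the association constants recited above can be restated as dissociation constants. The following sketch assumes simple 1:1 equilibrium binding, which is an assumption of this example rather than a requirement of the assays; the function names are hypothetical.

```python
def dissociation_constant_molar(ka_per_molar):
    """Kd (in M) is the reciprocal of the association constant Ka (in M^-1)."""
    return 1.0 / ka_per_molar


def fraction_bound(ka_per_molar, free_ligand_molar):
    """Fraction of binding sites occupied under an assumed 1:1 binding model."""
    x = ka_per_molar * free_ligand_molar
    return x / (1.0 + x)
```

Under this model, association constants of 10.sup.7, 10.sup.8, and 10.sup.9 M.sup.-1 correspond to dissociation constants of 100 nM, 10 nM, and 1 nM, respectively, and half of the sites are occupied when the free ligand concentration equals the dissociation constant.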
[0080] The screening assay may measure a candidate agent's ability
to specifically bind to or modulate activity of an LRRCAPS
polypeptide, a fusion protein thereof, or to cells or membranes
bearing the polypeptide or fusion protein. The LRRCAPS polypeptide
can be full length or a fragment thereof that retains functional
LRRCAPS activity. The LRRCAPS polypeptide may be fused to another
polypeptide, such as a peptide tag for detection or anchoring, or
to another tag. The LRRCAPS polypeptide is preferably human
LRRCAPS, or is an ortholog or derivative thereof as described
above. In a preferred embodiment, the screening assay detects
candidate agent-based modulation of LRRCAPS interaction with a
binding target, such as an endogenous or exogenous protein or other
substrate that has LRRCAPS-specific binding activity, and can be
used to assess normal LRRCAPS gene function.
[0081] Suitable assay formats that may be adapted to screen for
LRRCAPS modulators are known in the art. Preferred screening assays
are high throughput or ultra high throughput and thus provide
automated, cost-effective means of screening compound libraries for
lead compounds (Fernandes P B, Curr Opin Chem Biol (1998)
2:597-603; Sundberg S A, Curr Opin Biotechnol 2000, 11:47-53). In
one preferred embodiment, screening assays use fluorescence
technologies, including fluorescence polarization, time-resolved
fluorescence, and fluorescence resonance energy transfer. These
systems offer means to monitor protein-protein or DNA-protein
interactions in which the intensity of the signal emitted from
dye-labeled molecules depends upon their interactions with partner
molecules (e.g., Selvin P R, Nat Struct Biol (2000) 7:730-4;
Fernandes P B, supra; Hertzberg R P and Pope A J, Curr Opin Chem
Biol (2000) 4:445-451).
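As a non-limiting illustration of how such fluorescence read-outs are commonly reduced to numbers, the sketch below computes fluorescence polarization in millipolarization units from parallel and perpendicular emission intensities, and the energy transfer efficiency for a donor-acceptor pair at a given separation. The function names and the instrument correction factor `g_factor` are assumptions of this example.

```python
def polarization_mP(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence polarization in mP units.

    P = (I_par - G * I_perp) / (I_par + G * I_perp), scaled by 1000.
    A small labeled ligand that binds a large partner tumbles more slowly,
    which raises P.
    """
    numerator = i_parallel - g_factor * i_perpendicular
    denominator = i_parallel + g_factor * i_perpendicular
    return 1000.0 * numerator / denominator


def fret_efficiency(distance, forster_radius):
    """Energy transfer efficiency E = 1 / (1 + (r / R0)^6).

    `distance` and `forster_radius` must share the same length unit.
    """
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)
```

Because the efficiency falls off with the sixth power of distance, a candidate agent that disrupts a labeled protein-protein complex would be expected to produce a sharp drop in transfer efficiency.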
[0082] A variety of suitable assay systems may be used to identify
candidate LRRCAPS and p53 pathway modulators (e.g. U.S. Pat. Nos.
5,550,019 and 6,133,437 (apoptosis assays), U.S. Pat. No. 6,020,135
(p53 modulation), U.S. Pat. Nos. 5,976,782, 6,225,118 and 6,444,434
(angiogenesis assays), among others). Specific preferred assays are
described in more detail below.
[0083] Apoptosis Assays
[0084] Assays for apoptosis may be performed by terminal
deoxynucleotidyl transferase-mediated digoxigenin-11-dUTP nick end
labeling (TUNEL) assay. The TUNEL assay is used to measure nuclear
DNA fragmentation characteristic of apoptosis (Lazebnik et al.,
1994, Nature 371, 346), by following the incorporation of
fluorescein-dUTP (Yonehara et al., 1989, J. Exp. Med. 169, 1747).
Apoptosis may further be assayed by acridine orange staining of
tissue culture cells (Lucas, R., et al., 1998, Blood 15:4730-41).
An apoptosis assay system may comprise a cell that expresses an
LRRCAPS, and that optionally has defective p53 function (e.g. p53
is over-expressed or under-expressed relative to wild-type cells).
A test agent can be added to the apoptosis assay system, and changes
in induction of apoptosis relative to controls where no test agent
is added identify candidate p53 modulating agents. In some
embodiments of the invention, an apoptosis assay may be used as a
secondary assay to test a candidate p53 modulating agent that is
initially identified using a cell-free assay system. An apoptosis
assay may also be used to test whether LRRCAPS function plays a
direct role in apoptosis. For example, an apoptosis assay may be
performed on cells that over- or under-express LRRCAPS relative to
wild type cells. Differences in apoptotic response compared to wild
type cells suggest that the LRRCAPS plays a direct role in the
apoptotic response. Apoptosis assays are described further in U.S.
Pat. No. 6,133,437.
[0085] Cell Proliferation and Cell Cycle Assays
[0086] Cell proliferation may be assayed via bromodeoxyuridine
(BRDU) incorporation. This assay identifies a cell population
undergoing DNA synthesis by incorporation of BRDU into
newly-synthesized DNA. Newly-synthesized DNA may then be detected
using an anti-BRDU antibody (Hoshino et al., 1986, Int. J. Cancer
38, 369; Campana et al., 1988, J. Immunol. Meth. 107, 79), or by
other means. Cell proliferation may also be examined using
[.sup.3H]-thymidine incorporation (Chen, J., 1996, Oncogene
13:1395-403; Jeoung, J., 1995, J. Biol. Chem. 270:18367-73). This
assay allows for quantitative characterization of S-phase DNA
synthesis. In this assay, cells synthesizing DNA will incorporate
[.sup.3H]-thymidine into newly synthesized DNA. Incorporation can
then be measured by standard techniques such as by counting of
radioisotope in a scintillation counter (e.g., Beckman LS 3800
Liquid Scintillation Counter). Another proliferation assay uses the
dye Alamar Blue (available from Biosource International), which
fluoresces when reduced in living cells and provides an indirect
measurement of cell number (Voytik-Harbin S L et al., 1998, In
Vitro Cell Dev Biol Anim 34:239-46).
[0087] Cell proliferation may also be assayed by colony formation
in soft agar (Sambrook et al., Molecular Cloning, Cold Spring
Harbor (1989)). For example, cells transformed with LRRCAPS are
seeded in soft agar plates, and colonies are measured and counted
after two weeks of incubation.
[0088] Involvement of a gene in the cell cycle may be assayed by
flow cytometry (Gray J W et al. (1986) Int J Radiat Biol Relat Stud
Phys Chem Med 49:237-55). Cells transfected with an LRRCAPS may be
stained with propidium iodide and evaluated in a flow cytometer
(available from Becton Dickinson), which indicates accumulation of
cells in different stages of the cell cycle.
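A simplified sketch of how propidium iodide intensities might be binned into cell cycle stages is given below. It assumes that the position of the G0/G1 peak has already been determined and that a fixed tolerance around 1x and 2x DNA content is acceptable; both the gating thresholds and the function names are hypothetical, and dedicated cell cycle fitting models are generally used in practice.

```python
from collections import Counter


def classify_phase(dna_content, g1_peak, tolerance=0.15):
    """Assign one event to a cell cycle compartment by relative DNA content."""
    ratio = dna_content / g1_peak
    if ratio < 1.0 - tolerance:
        return "sub-G1"   # less than 2N DNA, often apoptotic debris
    if ratio <= 1.0 + tolerance:
        return "G0/G1"    # about 2N DNA
    if ratio < 2.0 - tolerance:
        return "S"        # between 2N and 4N DNA
    if ratio <= 2.0 + tolerance:
        return "G2/M"     # about 4N DNA
    return ">4N"          # aggregates or polyploid cells


def phase_fractions(intensities, g1_peak, tolerance=0.15):
    """Fraction of events falling in each compartment."""
    counts = Counter(classify_phase(x, g1_peak, tolerance) for x in intensities)
    total = len(intensities)
    return {phase: n / total for phase, n in counts.items()}
```

Comparing the resulting fractions between cells transfected with an LRRCAPS and control cells would indicate accumulation in a particular stage.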
[0089] Accordingly, a cell proliferation or cell cycle assay system
may comprise a cell that expresses an LRRCAPS, and that optionally
has defective p53 function (e.g. p53 is over-expressed or
under-expressed relative to wild-type cells). A test agent can be
added to the assay system, and changes in cell proliferation or cell
cycle relative to controls where no test agent is added identify
candidate p53 modulating agents. In some embodiments of the
invention, the cell proliferation or cell cycle assay may be used
as a secondary assay to test a candidate p53 modulating agent that
is initially identified using another assay system such as a
cell-free assay system. A cell proliferation assay may also be used
to test whether LRRCAPS function plays a direct role in cell
proliferation or cell cycle. For example, a cell proliferation or
cell cycle assay may be performed on cells that over- or
under-express LRRCAPS relative to wild type cells. Differences in
proliferation or cell cycle compared to wild type cells suggest
that the LRRCAPS plays a direct role in cell proliferation or cell
cycle.
[0090] Angiogenesis
[0091] Angiogenesis may be assayed using various human endothelial
cell systems, such as umbilical vein, coronary artery, or dermal
cells. Suitable assays include Alamar Blue based assays (available
from Biosource International) to measure proliferation; migration
assays using fluorescent molecules, such as the use of Becton
Dickinson Falcon HTS FluoroBlok cell culture inserts to measure
migration of cells through membranes in the presence or absence of
angiogenesis enhancers or suppressors; and tubule formation assays
based on the formation of tubular structures by endothelial cells
on Matrigel.RTM. (Becton Dickinson). Accordingly, an angiogenesis
assay system may comprise a cell that expresses an LRRCAPS, and
that optionally has defective p53 function (e.g. p53 is
over-expressed or under-expressed relative to wild-type cells). A
test agent can be added to the angiogenesis assay system, and
changes in angiogenesis relative to controls where no test agent is
added identify candidate p53 modulating agents. In some
embodiments of the invention, the angiogenesis assay may be used as
a secondary assay to test a candidate p53 modulating agent that is
initially identified using another assay system. An angiogenesis
assay may also be used to test whether LRRCAPS function plays a
direct role in angiogenesis. For example, an angiogenesis
assay may be performed on cells that over- or under-express LRRCAPS
relative to wild type cells. Differences in angiogenesis compared
to wild type cells suggest that the LRRCAPS plays a direct role in
angiogenesis. Angiogenesis assays are described further in U.S.
Pat. Nos. 5,976,782, 6,225,118 and 6,444,434, among others.
[0092] Hypoxic Induction
[0093] The alpha subunit of the transcription factor, hypoxia
inducible factor-1 (HIF-1), is upregulated in tumor cells following
exposure to hypoxia in vitro. Under hypoxic conditions, HIF-1
stimulates the expression of genes known to be important in tumor
cell survival, such as those encoding glycolytic enzymes and VEGF.
Induction of such genes by hypoxic conditions may be assayed by
growing cells transfected with LRRCAPS in hypoxic conditions (such
as with 0.1% O2, 5% CO2, and balance N2, generated in a Napco 7001
incubator (Precision Scientific)) and normoxic conditions, followed
by assessment of gene activity or expression by TaqMan.RTM.. For
example, a hypoxic induction assay system may comprise a cell that
expresses an LRRCAPS, and that optionally has a mutated p53 (e.g.
p53 is over-expressed or under-expressed relative to wild-type
cells). A test agent can be added to the hypoxic induction assay
system, and changes in hypoxic response relative to controls where
no test agent is added identify candidate p53 modulating agents.
In some embodiments of the invention, the hypoxic induction assay
may be used as a secondary assay to test a candidate p53 modulating
agent that is initially identified using another assay system. A
hypoxic induction assay may also be used to test whether LRRCAPS
function plays a direct role in the hypoxic response. For example,
a hypoxic induction assay may be performed on cells that over- or
under-express LRRCAPS relative to wild type cells. Differences in
hypoxic response compared to wild type cells suggest that the
LRRCAPS plays a direct role in hypoxic induction.
[0094] Cell Adhesion
[0095] Cell adhesion assays measure adhesion of cells to purified
adhesion proteins, or adhesion of cells to each other, in the
presence or absence of candidate modulating agents. Cell-protein adhesion
assays measure the ability of agents to modulate the adhesion of
cells to purified proteins. For example, recombinant proteins are
produced, diluted to 2.5g/mL in PBS, and used to coat the wells of
a microtiter plate. The wells used for negative control are not
coated. Coated wells are then washed, blocked with 1% BSA, and
washed again. Compounds are diluted to 2.times. final test
concentration and added to the blocked, coated wells. Cells are
then added to the wells, and the unbound cells are washed off.
Retained cells are labeled directly on the plate by adding a
membrane-permeable fluorescent dye, such as calcein-AM, and the
signal is quantified in a fluorescent microplate reader.
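The read-out of the cell-protein adhesion assay described above may, for example, be reduced to a percentage of vehicle-treated adhesion. The sketch below assumes that the uncoated negative-control wells are used as background and that the cell suspension is added in a volume equal to that of the compound, so that a 2.times. working concentration yields the final test concentration; these assumptions and the function names are specific to this example.

```python
from statistics import mean


def working_concentration(final_concentration, fold=2.0):
    """Concentration to prepare so that adding an equal volume of cell
    suspension (assumed here) gives the intended final concentration."""
    return final_concentration * fold


def percent_adhesion(test_wells, vehicle_wells, uncoated_wells):
    """Background-subtracted fluorescence of compound-treated wells,
    expressed as a percentage of vehicle-treated coated wells."""
    background = mean(uncoated_wells)
    vehicle_signal = mean(vehicle_wells) - background
    if vehicle_signal <= 0:
        raise ValueError("vehicle signal does not exceed background")
    return 100.0 * (mean(test_wells) - background) / vehicle_signal
```

A value well below 100 would indicate that the compound reduces adhesion of the cells to the coated protein, while a value above 100 would indicate enhancement.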
[0096] Cell-cell adhesion assays measure the ability of agents to
modulate binding of cell adhesion proteins with their native
ligands. These assays use cells that naturally or recombinantly
express the adhesion protein of choice. In an exemplary assay,
cells expressing the cell adhesion protein are plated in wells of a
multiwell plate. Cells expressing the ligand are labeled with a
membrane-permeable fluorescent dye, such as BCECF, and allowed to
adhere to the monolayers in the presence of candidate agents.
Unbound cells are washed off, and bound cells are detected using a
fluorescence plate reader.
[0097] High-throughput cell adhesion assays have also been
described. In one such assay, small molecule ligands and peptides
are bound to the surface of microscope slides using a microarray
spotter, intact cells are then contacted with the slides, and
unbound cells are washed off. This assay determines not only the
binding specificity of the peptides and modulators against cell
lines, but also the functional cell signaling of attached cells,
which is measured in situ on the microchip using immunofluorescence
techniques (Falsey J R et al., Bioconjug Chem. 2001 May-Jun,
12(3):346-53).
[0098] Cell Migration
[0099] An invasion/migration assay (also called a migration assay)
tests the ability of cells to overcome a physical barrier and to
migrate towards pro-angiogenic signals. Migration assays are known
in the art (e.g., Paik J H et al., 2001, J Biol Chem
276:11830-11837). In a typical experimental set-up, cultured
endothelial cells are seeded onto a matrix-coated porous lamina,
with pore sizes generally smaller than typical cell size. The
matrix generally simulates the environment of the extracellular
matrix, as described above. The lamina is typically a membrane,
such as the transwell polycarbonate membrane (Corning Costar
Corporation, Cambridge, Mass.), and is generally part of an upper
chamber that is in fluid contact with a lower chamber containing
pro-angiogenic stimuli. Migration is generally assayed after an
overnight incubation with stimuli, but longer or shorter time
frames may also be used. Migration is assessed as the number of
cells that crossed the lamina, and may be detected by staining
cells with hematoxylin solution (VWR Scientific, South San
Francisco, Calif.), or by any other method for determining cell
number. In another exemplary set-up, cells are fluorescently
labeled and migration is detected using fluorescent readings, for
instance using the Falcon HTS FluoroBlok (Becton Dickinson). While
some migration is observed in the absence of stimulus, migration is
greatly increased in response to pro-angiogenic factors. As
described above, a preferred assay system for migration/invasion
assays comprises testing an LRRCAPS's response to a variety of
pro-angiogenic factors, including tumor angiogenic and inflammatory
angiogenic agents, and culturing the cells in serum free
medium.
[0100] Sprouting Assay
[0101] A sprouting assay is a three-dimensional in vitro
angiogenesis assay that uses a cell-number defined spheroid
aggregation of endothelial cells ("spheroid"), embedded in a
collagen gel-based matrix. The spheroid can serve as a starting
point for the sprouting of capillary-like structures by invasion
into the extracellular matrix (termed "cell sprouting") and the
subsequent formation of complex anastomosing networks (Korff and
Augustin, 1999, J Cell Sci 112:3249-58). In an exemplary
experimental set-up, spheroids are prepared by pipetting 400 human
umbilical vein endothelial cells into individual wells of
nonadhesive 96-well plates to allow overnight spheroidal
aggregation (Korff and Augustin: J Cell Biol 143: 1341-52, 1998).
Spheroids are harvested and seeded in 900 .mu.l of
methocel-collagen solution and pipetted into individual wells of a
24 well plate to allow collagen gel polymerization. Test agents are
added after 30 min by pipetting 100 .mu.l of 10-fold concentrated
working dilution of the test substances on top of the gel. Plates
are incubated at 37.degree. C. for 24 h. Dishes are fixed at the end
of the experimental incubation period by addition of
paraformaldehyde. Sprouting intensity of endothelial cells can be
quantitated by an automated image analysis system to determine the
cumulative sprout length per spheroid.
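The dilution arithmetic and the read-out of the sprouting assay can be sketched as follows. The sketch assumes that the test substance equilibrates throughout the combined gel and overlay volume, and that individual sprout lengths per spheroid are already available from image analysis; the function names and data layout are hypothetical.

```python
def final_fold(stock_fold, stock_volume_ul, gel_volume_ul):
    """Fold-concentration of the test substance after the overlay mixes
    with the gel volume (complete equilibration is assumed)."""
    return stock_fold * stock_volume_ul / (stock_volume_ul + gel_volume_ul)


def cumulative_sprout_length(sprouts_by_spheroid):
    """Sum of individual sprout lengths for each spheroid.

    `sprouts_by_spheroid` maps a spheroid identifier to a list of sprout
    lengths, all in the same unit (e.g. micrometers).
    """
    return {sid: sum(lengths) for sid, lengths in sprouts_by_spheroid.items()}
```

Under the stated assumption, 100 .mu.l of a 10-fold concentrated working dilution added to 900 .mu.l of gel gives the 1-fold working concentration in a total of 1000 .mu.l.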
[0102] Primary Assays for Antibody Modulators
[0103] For antibody modulators, the appropriate primary assay is
a binding assay that tests the antibody's affinity to and
specificity for the LRRCAPS protein. Methods for testing antibody
affinity and specificity are well known in the art (Harlow and
Lane, 1988, 1999, supra). The enzyme-linked immunosorbent assay
(ELISA) is a preferred method for detecting LRRCAPS-specific
antibodies; others include FACS assays, radioimmunoassays, and
fluorescent assays.
[0104] In some cases, screening assays described for small molecule
modulators may also be used to test antibody modulators.
[0105] Primary Assays for Nucleic Acid Modulators
[0106] For nucleic acid modulators, primary assays may test the
ability of the nucleic acid modulator to inhibit or enhance LRRCAPS
gene expression, preferably mRNA expression. In general, expression
analysis comprises comparing LRRCAPS expression in like populations
of cells (e.g., two pools of cells that endogenously or
recombinantly express LRRCAPS) in the presence and absence of the
nucleic acid modulator. Methods for analyzing mRNA and protein
expression are well known in the art. For instance, Northern
blotting, slot blotting, ribonuclease protection, quantitative
RT-PCR (e.g., using the TaqMan.RTM., PE Applied Biosystems), or
microarray analysis may be used to confirm that LRRCAPS mRNA
expression is reduced in cells treated with the nucleic acid
modulator (e.g., Current Protocols in Molecular Biology (1994)
Ausubel F M et al., eds., John Wiley & Sons, Inc., chapter 4;
Freeman W M et al., Biotechniques (1999) 26:112-125; Kallioniemi O
P, Ann Med 2001, 33:142-147; Blohm D H and Guiseppi-Elie A, Curr
Opin Biotechnol 2001, 12:41-47). Protein expression may also be
monitored. Proteins are most commonly detected with specific
antibodies or antisera directed against either the LRRCAPS protein
or specific peptides. A variety of means including Western
blotting, ELISA, or in situ detection, are available (Harlow E and
Lane D, 1988 and 1999, supra).
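Where quantitative RT-PCR is used, the change in LRRCAPS mRNA expression may be estimated from threshold cycle (Ct) values. The sketch below uses the comparative Ct calculation, which is an assumption of this example and is not recited above; it presumes approximately 100% amplification efficiency for both amplicons and normalization to a reference transcript chosen by the user.

```python
def relative_expression(ct_target_treated, ct_reference_treated,
                        ct_target_control, ct_reference_control):
    """Fold change of the target mRNA in modulator-treated cells relative
    to untreated cells, by the comparative Ct calculation 2^(-ddCt)."""
    delta_treated = ct_target_treated - ct_reference_treated
    delta_control = ct_target_control - ct_reference_control
    return 2.0 ** -(delta_treated - delta_control)
```

For instance, if the target Ct rises from 24 to 26 after treatment while the reference Ct stays at 18, the calculation returns 0.25, which would be consistent with a four-fold reduction in target mRNA.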
[0107] In some cases, screening assays described for small molecule
modulators, particularly in assay systems that involve LRRCAPS mRNA
expression, may also be used to test nucleic acid modulators.
[0108] Secondary Assays
[0109] Secondary assays may be used to further assess the activity
of an LRRCAPS-modulating agent identified by any of the above methods
to confirm that the modulating agent affects LRRCAPS in a manner
relevant to the p53 pathway. As used herein, LRRCAPS-modulating
agents encompass candidate clinical compounds or other agents
derived from a previously identified modulating agent. Secondary
assays can also be used to test the activity of a modulating agent
on a particular genetic or biochemical pathway or to test the
specificity of the modulating agent's interaction with LRRCAPS.
[0110] Secondary assays generally compare like populations of cells
or animals (e.g., two pools of cells or animals that endogenously
or recombinantly express LRRCAPS) in the presence and absence of
the candidate modulator. In general, such assays test whether
treatment of cells or animals with a candidate LRRCAPS-modulating
agent results in changes in the p53 pathway in comparison to
untreated (or mock- or placebo-treated) cells or animals. Certain
assays use "sensitized genetic backgrounds", which, as used herein,
describe cells or animals engineered for altered expression of
genes in the p53 or interacting pathways.
[0111] Cell-Based Assays
[0112] Cell-based assays may use a variety of mammalian cell lines
known to have defective p53 function (e.g. SAOS-2 osteoblasts,
H1299 lung cancer cells, C33A and HT3 cervical cancer cells, HT-29
and DLD-1 colon cancer cells, among others, available from American
Type Culture Collection (ATCC), Manassas, Va.). Cell-based assays
may detect endogenous p53 pathway activity or may rely on
recombinant expression of p53 pathway components. Any of the
aforementioned assays may be used in this cell-based format.
Candidate modulators are typically added to the cell media but may
also be injected into cells or delivered by any other efficacious
means.
[0113] Animal Assays
[0114] A variety of non-human animal models of normal or defective
p53 pathway may be used to test candidate LRRCAPS modulators.
Models for defective p53 pathway typically use genetically modified
animals that have been engineered to mis-express (e.g.,
over-express or lack expression in) genes involved in the p53
pathway. Assays generally require systemic delivery of the
candidate modulators, such as by oral administration, injection,
etc.
[0115] In a preferred embodiment, p53 pathway activity is assessed
by monitoring neovascularization and angiogenesis. Animal models
with defective and normal p53 are used to test the candidate
modulator's effect on LRRCAPS in Matrigel.RTM. assays.
Matrigel.RTM. is an extract of basement membrane proteins, and is
composed primarily of laminin, collagen IV, and heparan sulfate
proteoglycan. It is provided as a sterile liquid at 4.degree. C.,
but rapidly forms a solid gel at 37.degree. C. Liquid Matrigel.RTM.
is mixed with various angiogenic agents, such as bFGF and VEGF, or
with human tumor cells which over-express the LRRCAPS. The mixture
is then injected subcutaneously (SC) into female athymic nude mice
(Taconic, Germantown, N.Y.) to support an intense vascular
response. Mice with Matrigel.RTM. pellets may be dosed via oral
(PO), intraperitoneal (IP), or intravenous (IV) routes with the
candidate modulator. Mice are euthanized 5-12 days post-injection,
and the Matrigel.RTM. pellet is harvested for hemoglobin analysis
(Sigma plasma hemoglobin kit). Hemoglobin content of the gel is
found to correlate with the degree of neovascularization in the gel.
[0116] In another preferred embodiment, the effect of the candidate
modulator on LRRCAPS is assessed via tumorigenicity assays. In one
example, a xenograft comprising human cells from a pre-existing
tumor or a tumor cell line is used. Tumor xenograft assays are
known in the art (see, e.g., Ogawa K et al., 2000, Oncogene
19:6043-6052). Xenografts are typically implanted SC into female
athymic mice, 6-7 weeks old, as single cell suspensions either from
a pre-existing tumor or from in vitro culture. The tumors which
express the LRRCAPS endogenously are injected in the flank,
1.times.10.sup.5 to 1.times.10.sup.7 cells per mouse in a volume of
100 .mu.L using a 27-gauge needle. Mice are then ear tagged and
tumors are measured twice weekly. Candidate modulator treatment is
initiated on the day the mean tumor weight reaches 100 mg.
Candidate modulator is delivered IV, SC, IP, or PO by bolus
administration. Depending upon the pharmacokinetics of each unique
candidate modulator, dosing can be performed multiple times per
day. The tumor weight is assessed by measuring perpendicular
diameters with a caliper and calculated by multiplying the
measurements of diameters in two dimensions. At the end of the
experiment, the excised tumors may be utilized for biomarker
identification or further analyses. For immunohistochemistry
staining, xenograft tumors are fixed in 4% paraformaldehyde, 0.1M
phosphate, pH 7.2, for 6 hours at 4.degree. C., immersed in 30%
sucrose in PBS, and rapidly frozen in isopentane cooled with liquid
nitrogen.
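The bookkeeping for the xenograft protocol above may be sketched as follows. The sketch covers the cell density needed for a 100 .mu.L injection, the product of two perpendicular caliper diameters, and the first measurement day on which the mean tumor weight reaches the treatment threshold. The conversion from the diameter product to a weight in milligrams is not specified above, so weight estimates are taken here as user-supplied inputs; all function names are hypothetical.

```python
from statistics import mean


def injection_density_per_ml(cells_per_mouse, volume_ul=100.0):
    """Cell suspension density (cells/mL) needed to deliver the stated
    number of cells in the stated injection volume."""
    return cells_per_mouse * 1000.0 / volume_ul


def bidimensional_product(diameter_a_mm, diameter_b_mm):
    """Product of two perpendicular caliper diameters (mm^2)."""
    return diameter_a_mm * diameter_b_mm


def treatment_start_day(weights_mg_by_day, threshold_mg=100.0):
    """First measurement day on which the mean estimated tumor weight
    reaches the threshold, or None if it is never reached.

    `weights_mg_by_day` maps a study day to per-mouse weight estimates.
    """
    for day in sorted(weights_mg_by_day):
        if mean(weights_mg_by_day[day]) >= threshold_mg:
            return day
    return None
```

On this arithmetic, delivering 1.times.10.sup.5 to 1.times.10.sup.7 cells in 100 .mu.L corresponds to suspensions of 1.times.10.sup.6 to 1.times.10.sup.8 cells per mL.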
[0117] In another preferred embodiment, tumorigenicity is monitored
using a hollow fiber assay, which is described in U.S. Pat. No.
5,698,413. Briefly, the method comprises implanting into a
laboratory animal a biocompatible, semi-permeable encapsulation
device containing target cells, treating the laboratory animal with
a candidate modulating agent, and evaluating the target cells for
reaction to the candidate modulator. Implanted cells are generally
human cells from a pre-existing tumor or a tumor cell line. After
an appropriate period of time, generally around six days, the
implanted samples are harvested for evaluation of the candidate
modulator. Tumorigenicity and modulator efficacy may be evaluated
by assaying the quantity of viable cells present in the
macrocapsule, which can be determined by tests known in the art,
for example, MTT dye conversion assay, neutral red dye uptake,
trypan blue staining, viable cell counts, the number of colonies
formed in soft agar, the capacity of the cells to recover and
replicate in vitro, etc.
[0118] In another preferred embodiment, a tumorigenicity assay uses
a transgenic animal, usually a mouse, carrying a dominant oncogene
or tumor suppressor gene knockout under the control of tissue
specific regulatory sequences; these assays are generally referred
to as transgenic tumor assays. In a preferred application, tumor
development in the transgenic model is well characterized or is
controlled. In an exemplary model, the "RIP1-Tag2" transgene,
comprising the SV40 large T-antigen oncogene under control of the
insulin gene regulatory regions is expressed in pancreatic beta
cells and results in islet cell carcinomas (Hanahan D, 1985, Nature
315:115-122; Parangi S et al, 1996, Proc Natl Acad Sci USA 93:
2002-2007; Bergers G et al, 1999, Science 284:808-812). An
"angiogenic switch" occurs at approximately five weeks, as
normally quiescent capillaries in a subset of hyperproliferative
islets become angiogenic. The RIP1-Tag2 mice die by age 14 weeks.
Candidate modulators may be administered at a variety of stages,
including just prior to the angiogenic switch (e.g., for a model of
tumor prevention), during the growth of small tumors (e.g., for a
model of intervention), or during the growth of large and/or
invasive tumors (e.g., for a model of regression). Tumorigenicity
and modulator efficacy can be evaluated by life-span extension and/or
tumor characteristics, including number of tumors, tumor size,
tumor morphology, vessel density, apoptotic index, etc.
[0119] Diagnostic and Therapeutic Uses
[0120] Specific LRRCAPS-modulating agents are useful in a variety
of diagnostic and therapeutic applications where disease or disease
prognosis is related to defects in the p53 pathway, such as
angiogenic, apoptotic, or cell proliferation disorders.
Accordingly, the invention also provides methods for modulating the
p53 pathway in a cell, preferably a cell pre-determined to have
defective or impaired p53 function (e.g. due to overexpression,
underexpression, or misexpression of p53, or due to gene
mutations), comprising the step of administering an agent to the
cell that specifically modulates LRRCAPS activity. Preferably, the
modulating agent produces a detectable phenotypic change in the
cell indicating that the p53 function is restored. The phrase
"function is restored", and equivalents, as used herein, means that
the desired phenotype is achieved, or is brought closer to normal
compared to untreated cells. For example, with restored p53
function, cell proliferation and/or progression through cell cycle
may normalize, or be brought closer to normal relative to untreated
cells. The invention also provides methods for treating disorders
or disease associated with impaired p53 function by administering a
therapeutically effective amount of an LRRCAPS-modulating agent
that modulates the p53 pathway. The invention further provides
methods for modulating LRRCAPS function in a cell, preferably a
cell pre-determined to have defective or impaired LRRCAPS function,
by administering an LRRCAPS-modulating agent. Additionally, the
invention provides a method for treating disorders or disease
associated with impaired LRRCAPS function by administering a
therapeutically effective amount of an LRRCAPS-modulating
agent.
[0121] The discovery that LRRCAPS is implicated in the p53 pathway
provides for a variety of methods that can be employed for the
diagnostic and prognostic evaluation of diseases and disorders
involving defects in the p53 pathway and for the identification of
subjects having a predisposition to such diseases and
disorders.
[0122] Various expression analysis methods can be used to diagnose
whether LRRCAPS expression occurs in a particular sample, including
Northern blotting, slot blotting, ribonuclease protection,
quantitative RT-PCR, and microarray analysis (e.g., Current
Protocols in Molecular Biology (1994) Ausubel F M et al., eds.,
John Wiley & Sons, Inc., chapter 4; Freeman W M et al.,
Biotechniques (1999) 26:112-125; Kallioniemi O P, Ann Med 2001,
33:142-147; Blohm and Guiseppi-Elie, Curr Opin Biotechnol 2001,
12:41-47). Tissues having a disease or disorder implicating
defective p53 signaling that express an LRRCAPS are identified as
amenable to treatment with an LRRCAPS modulating agent. In a
preferred application, the p53 defective tissue overexpresses an
LRRCAPS relative to normal tissue. For example, a Northern blot
analysis of mRNA from tumor and normal cell lines, or from tumor
and matching normal tissue samples from the same patient, using
full or partial LRRCAPS cDNA sequences as probes, can determine
whether particular tumors express or overexpress LRRCAPS.
Alternatively, the TaqMan.RTM. is used for quantitative RT-PCR
analysis of LRRCAPS expression in cell lines, normal tissues and
tumor samples (PE Applied Biosystems).
[0123] Various other diagnostic methods may be performed, for
example, utilizing reagents such as the LRRCAPS oligonucleotides,
and antibodies directed against an LRRCAPS, as described above for:
(1) the detection of the presence of LRRCAPS gene mutations, or the
detection of either over- or under-expression of LRRCAPS mRNA
relative to the non-disorder state; (2) the detection of either an
over- or an under-abundance of LRRCAPS gene product relative to the
non-disorder state; and (3) the detection of perturbations or
abnormalities in the signal transduction pathway mediated by
LRRCAPS.
[0124] Thus, in a specific embodiment, the invention is drawn to a
method for diagnosing a disease or disorder in a patient that is
associated with alterations in LRRCAPS expression, the method
comprising: a) obtaining a biological sample from the patient; b)
contacting the sample with a probe for LRRCAPS expression; c)
comparing results from step (b) with a control; and d) determining
whether step (c) indicates a likelihood of the disease or disorder.
Preferably, the disease is cancer, most preferably a cancer as
shown in TABLE 2. The probe may be either DNA or protein, including
an antibody.
EXAMPLES
[0125] The following experimental section and examples are offered
by way of illustration and not by way of limitation.
[0126] I. Drosophila p53 Screen
[0127] The Drosophila p53 gene was overexpressed specifically in
the wing using the vestigial margin quadrant enhancer. Increasing
quantities of Drosophila p53 (titrated using different strength
transgenic inserts in 1 or 2 copies) caused deterioration of normal
wing morphology from mild to strong, with phenotypes including
disruption of pattern and polarity of wing hairs, shortening and
thickening of wing veins, progressive crumpling of the wing and
appearance of dark "death" inclusions in wing blade. In a screen
designed to identify enhancers and suppressors of Drosophila p53,
homozygous females carrying two copies of p53 were crossed to 5663
males carrying random insertions of a piggyBac transposon (Fraser M
et al., Virology (1985) 145:356-361).
[0128] Progeny containing insertions were compared to
non-insertion-bearing sibling progeny for enhancement or
suppression of the p53 phenotypes. Sequence information surrounding
the piggyBac insertion site was used to identify the modifier
genes. Modifiers of the wing phenotype were identified as members
of the p53 pathway. CAPS was an enhancer of the wing phenotype.
Human orthologs of the modifiers are referred to herein as
LRRCAPS.
[0129] BLAST analysis (Altschul et al., supra) was employed to
identify Targets from Drosophila modifiers. Various domains,
signals, and functional subunits in proteins were analyzed using
the PSORT (Nakai K., and Horton P., Trends Biochem Sci, 1999,
24:34-6; Kenta Nakai, Protein sorting signals and prediction of
subcellular localization, Adv. Protein Chem. 54, 277-344 (2000)),
PFAM (Bateman A., et al., Nucleic Acids Res, 1999, 27:260-2), SMART
(Ponting C P, et al., SMART: identification and annotation of
domains from signaling and extracellular protein sequences. Nucleic
Acids Res. Jan. 1, 1999; 27(1):229-32), TM-HMM (Erik L. L.
Sonnhammer, Gunnar von Heijne, and Anders Krogh: A hidden Markov
model for predicting transmembrane helices in protein sequences. In
Proc. of Sixth Int. Conf. on Intelligent Systems for Molecular
Biology, p 175-182 Ed J. Glasgow, T. Littlejohn, F. Major, R.
Lathrop, D. Sankoff, and C. Sensen Menlo Park, Calif.: AAAI Press,
1998), and dust (Remm M, and Sonnhammer E. Classification of
transmembrane protein families in the Caenorhabditis elegans genome
and identification of human orthologs. Genome Res. 2000 Nov;
10(11):1679-89) programs. For example, PFAM was employed to
determine approximate amino acid locations for the LRR and IG
domains of GI#s 11877257, 16157511, 14758126, 5453656, 14764198,
and 5729718 (SEQ ID NOs: 19, 20, 21, 22, 23, and 24, respectively),
as shown in Table 1.
TABLE 1. Approximate amino acid locations for various domains of LRRCAPS polypeptides

LRRCAPS GI # | LRRCAPS SEQ ID NO: | LRR domain (PFAM01462, PFAM00560, PFAM01463) | IG domain (PFAM00047)
11877257 | 19 | 28 to 59, 339 to 372, 63 to 86, 87 to 110, 111 to 134, 135 to 158, 159 to 181, 182 to 203, 376 to 399, 400 to 423, 424 to 447, 448 to 471, 472 to 494, 495 to 519, 216 to 264, 529 to 579 |
16157511 | 20 | 63 to 86, 87 to 110, 111 to 134, 135 to 158, 159 to 182, 119 to 244 | 260 to 319, 356 to 414, 447 to 504, 539 to 596
14758126 | 21 | 22 to 45, 46 to 69, 70 to 93, 94 to 117, 118 to 140, 141 to 162, 321 to 341, 345 to 368, 369 to 392, 393 to 416, 417 to 440, 175 to 225, 474 to 524 |
5453656 | 22 | 70 to 93, 94 to 117, 118 to 141, 142 to 165, 166 to 189, 190 to 213, 214 to 237, 238 to 261, 262 to 285, 286 to 310, 311 to 335, 336 to 359, 360 to 383, 369 to 421 | 438 to 499
14764198 | 23 | 3 to 26, 27 to 50, 51 to 74, 76 to 99, 100 to 123, 124 to 144, 165 to 210 | 227 to 285
5729718 | 24 | 61 to 90, 92 to 115, 119 to 142, 143 to 166, 211 to 234, 235 to 258, 259 to 282, 294 to 345 |
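By way of illustration only, the approximate coordinates of Table 1 can be held in a simple lookup structure. The following minimal Python sketch uses the ranges read from Table 1 for GI# 14764198 (SEQ ID NO: 23) and reports which annotated domain, if any, contains a given residue. The dictionary layout and the names `DOMAINS`, `domain_at`, and `domain_lengths` are assumptions made for this example and form no part of the disclosure.

```python
# Approximate domain coordinates for GI# 14764198 (SEQ ID NO: 23), as read
# from Table 1. The data layout is a hypothetical choice for illustration.
DOMAINS = {
    "LRR": [(3, 26), (27, 50), (51, 74), (76, 99),
            (100, 123), (124, 144), (165, 210)],
    "IG": [(227, 285)],
}


def domain_at(position, domains=DOMAINS):
    """Return the name of the domain containing a 1-based residue, or None."""
    for name, ranges in domains.items():
        for start, end in ranges:
            if start <= position <= end:
                return name
    return None


def domain_lengths(domains=DOMAINS):
    """Return the length in residues of each annotated range, by domain name."""
    return {name: [end - start + 1 for start, end in ranges]
            for name, ranges in domains.items()}
```

For instance, `domain_at(30)` falls within the second listed LRR range, while a residue between annotated ranges returns `None`.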
[0130] II. High-Throughput In Vitro Fluorescence Polarization
Assay
[0131] Fluorescently-labeled LRRCAPS peptide/substrate are added to
each well of a 96-well microtiter plate, along with a test agent in
a test buffer (10 mM HEPES, 10 mM NaCl, 6 mM magnesium chloride, pH
7.6). Changes in fluorescence polarization, determined by using a
Fluorolite FPM-2 Fluorescence Polarization Microtiter System
(Dynatech Laboratories, Inc), relative to control values indicate
that the test compound is a candidate modifier of LRRCAPS activity.
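For illustration, the comparison of polarization values against control may be sketched as follows. The sketch uses the standard definition of fluorescence polarization from parallel and perpendicular emission intensities, expressed in millipolarization units (mP). The instrument correction factor `g_factor`, the function names, and the 20 mP cutoff are assumptions for this example only; the disclosure does not specify a threshold.

```python
def polarization_mp(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence polarization in millipolarization units (mP).

    P = (I_par - G * I_perp) / (I_par + G * I_perp), scaled by 1000.
    g_factor is an instrument correction, assumed to be 1.0 here.
    """
    denom = i_parallel + g_factor * i_perpendicular
    if denom == 0:
        raise ValueError("total intensity is zero")
    return 1000.0 * (i_parallel - g_factor * i_perpendicular) / denom


def is_candidate_modifier(test_mp, control_mp, min_shift_mp=20.0):
    """Flag a well whose polarization differs from control by at least
    min_shift_mp (a hypothetical cutoff, not taken from the disclosure)."""
    return abs(test_mp - control_mp) >= min_shift_mp
```

In practice, a cutoff of this kind would be chosen from the well-to-well variability of the control wells on each plate.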
[0132] III. High-Throughput In Vitro Binding Assay
[0133] ³³P-labeled LRRCAPS peptide is added in an assay buffer
(100 mM KCl, 20 mM HEPES pH 7.6, 1 mM MgCl₂, 1% glycerol, 0.5%
NP-40, 50 mM beta-mercaptoethanol, 1 mg/ml BSA, cocktail of
protease inhibitors) along with a test agent to the wells of a
Neutralite-avidin coated assay plate and incubated at 25° C.
for 1 hour. Biotinylated substrate is then added to each well and
incubated for 1 hour. Reactions are stopped by washing with PBS,
and counted in a scintillation counter. Test agents that cause a
difference in activity relative to control without test agent are
identified as candidate p53 modulating agents.
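As a non-limiting sketch, the comparison of scintillation counts against the control without test agent may be expressed as a percentage of control. The background subtraction, the function names, and the 50% change cutoff below are assumptions for illustration; the disclosure requires only a difference in activity relative to control.

```python
def percent_of_control(test_cpm, control_cpm, background_cpm=0.0):
    """Background-corrected signal of a test well as a percentage of control."""
    corrected_control = control_cpm - background_cpm
    if corrected_control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (test_cpm - background_cpm) / corrected_control


def candidate_agents(wells, control_cpm, background_cpm=0.0,
                     min_change_pct=50.0):
    """Return agents whose signal differs from control by at least
    min_change_pct percent (a hypothetical cutoff).

    wells maps an agent identifier to its counts per minute.
    """
    hits = []
    for agent, cpm in wells.items():
        pct = percent_of_control(cpm, control_cpm, background_cpm)
        if abs(pct - 100.0) >= min_change_pct:
            hits.append(agent)
    return hits
```

Because the absolute change is tested, both agents that reduce and agents that increase the retained signal are reported.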
[0134] IV. Immunoprecipitations and Immunoblotting
[0135] For coprecipitation of transfected proteins,
3×10⁶ appropriate recombinant cells containing the
LRRCAPS proteins are plated on 10-cm dishes and transfected on the
following day with expression constructs. The total amount of DNA
is kept constant in each transfection by adding empty vector. After
24 h, cells are collected, washed once with phosphate-buffered
saline and lysed for 20 min on ice in 1 ml of lysis buffer
containing 50 mM HEPES, pH 7.9, 250 mM NaCl, 20 mM
beta-glycerophosphate, 1 mM sodium orthovanadate, 5 mM p-nitrophenyl
phosphate, 2 mM dithiothreitol, protease inhibitors (complete,
Roche Molecular Biochemicals), and 1% Nonidet P-40. Cellular debris
is removed by centrifugation twice at 15,000×g for 15 min.
The cell lysate is incubated with 25 μl of M2 beads (Sigma) for
2 h at 4° C. with gentle rocking.
[0136] After extensive washing with lysis buffer, proteins bound to
the beads are solubilized by boiling in SDS sample buffer,
fractionated by SDS-polyacrylamide gel electrophoresis, transferred
to polyvinylidene difluoride membrane and blotted with the
indicated antibodies. The reactive bands are visualized with
horseradish peroxidase coupled to the appropriate secondary
antibodies and the enhanced chemiluminescence (ECL) Western
blotting detection system (Amersham Pharmacia Biotech).
[0137] V. Expression Analysis
[0138] All cell lines used in the following experiments are NCI
(National Cancer Institute) lines, and are available from ATCC
(American Type Culture Collection, Manassas, Va. 20110-2209).
Normal and tumor tissues were obtained from Impath, UC Davis,
Clontech, Stratagene, and Ambion.
[0139] TaqMan analysis was used to assess expression levels of the
disclosed genes in various samples.
[0140] RNA was extracted from each tissue sample using Qiagen
(Valencia, Calif.) RNeasy kits, following manufacturer's protocols,
to a final concentration of 50 ng/μl. Single stranded cDNA was
then synthesized by reverse transcribing the RNA samples using
random hexamers and 500 ng of total RNA per reaction, following
protocol 4304965 of Applied Biosystems (Foster City, Calif.).
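As a worked example of the quantities stated above, the RNA concentration and the input amount per reaction together imply the volume of RNA added to each reverse transcription reaction. The sketch below performs that arithmetic; the function name and default arguments are illustrative assumptions, and it presumes the RNA is used at the stated final concentration.

```python
def rna_volume_ul(input_ng=500.0, concentration_ng_per_ul=50.0):
    """Volume of RNA (in microliters) that supplies input_ng of total RNA."""
    if concentration_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return input_ng / concentration_ng_per_ul
```

With the stated values of 500 ng at 50 ng per microliter, the function returns 10.0 microliters per reaction.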
[0141] Primers for expression analysis using TaqMan assay (Applied
Biosystems, Foster City, Calif.) were prepared according to the
TaqMan protocols, and the following criteria: a) primer pairs were
designed to span introns to eliminate genomic contamination, and b)
each primer pair produced only one product.
[0142] TaqMan reactions were carried out following manufacturer's
protocols, in 25 μl total volume for 96-well plates and 10 μl
total volume for 384-well plates, using 300 nM primer and 250 nM
probe, and approximately 25 ng of cDNA. The standard curve for
result analysis was prepared using a universal pool of human cDNA
samples, which is a mixture of cDNAs from a wide variety of tissues
so that the chance that a target will be present in appreciable
amounts is good. The raw data were normalized using 18S rRNA
(universally expressed in all tissues and cells).
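The standard curve and normalization steps can be sketched as follows, assuming a relative standard curve approach in which threshold cycle (Ct) is fitted as a linear function of the log10 of input quantity, and in which a separate curve is available for 18S rRNA. The disclosure does not state the fitting procedure, so the least-squares fit and all function names here are assumptions for illustration only.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Interpolate a relative quantity from a Ct value using the fitted curve."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_ct, target_curve, rrna_ct, rrna_curve):
    """Target quantity divided by the 18S rRNA quantity of the same sample."""
    target_qty = quantity_from_ct(target_ct, *target_curve)
    rrna_qty = quantity_from_ct(rrna_ct, *rrna_curve)
    return target_qty / rrna_qty
```

Dividing by the 18S rRNA quantity compensates for differences in the amount of cDNA loaded into each reaction.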
[0143] For each expression analysis, tumor tissue samples were
compared with matched normal tissues from the same patient. A gene
was considered overexpressed in a tumor when the level of
expression of the gene was 2 fold or higher in the tumor compared
with its matched normal sample. In cases where normal tissue was
not available, a universal pool of cDNA samples was used instead.
In these cases, a gene was considered overexpressed in a tumor
sample when the difference of expression levels between a tumor
sample and the average of all normal samples from the same tissue
type was greater than 2 times the standard deviation of all normal
samples (i.e., Tumor - average(all normal samples) > 2 × STDEV(all
normal samples)).
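The two overexpression criteria described in this paragraph may be sketched in Python as follows. The function names are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) for STDEV is an assumption, since the disclosure does not say whether the sample or population form was used.

```python
from statistics import mean, stdev


def overexpressed_vs_matched(tumor, matched_normal, fold=2.0):
    """Criterion for matched pairs: tumor expression is at least `fold`
    times the expression in the matched normal sample."""
    return tumor >= fold * matched_normal


def overexpressed_vs_normals(tumor, normals, n_sd=2.0):
    """Criterion without a matched normal:
    tumor - average(normals) > n_sd * STDEV(normals).
    Requires at least two normal samples."""
    return (tumor - mean(normals)) > n_sd * stdev(normals)
```

Note that the first criterion is inclusive ("2 fold or higher") while the second is a strict inequality, following the wording above.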
[0144] Results are shown in Table 2. The number of pairs of tumor
samples and matched normal tissue from the same patient is shown
for each tumor type. The percentage of samples with at least
two-fold overexpression for each tumor type is provided. "ND" means
not done. A modulator identified by an assay described herein can
be further validated for therapeutic effect by administration to a
tumor in which the gene is overexpressed. A decrease in tumor
growth confirms therapeutic utility of the modulator. Prior to
treating a patient with the modulator, the likelihood that the
patient will respond to treatment can be diagnosed by obtaining a
tumor sample from the patient, and assaying for expression of the
gene targeted by the modulator. The expression data for the gene(s)
can also be used as a diagnostic marker for disease progression.
The assay can be performed by expression analysis as described
above, by antibody directed to the gene target, or by any other
available detection method.
TABLE 2

SEQ ID NO | Breast | # of Pairs | Colon | # of Pairs | Head and Neck | # of Pairs | Kidney | # of Pairs | Lung | # of Pairs
1 | 5.3% | 19 | 12.1% | 33 | 12.5% | 8 | 26.1% | 23 | 10.0% | 20
3 | 21.2% | 33 | 30.3% | 33 | ND | 0 | 70.0% | 20 | 25.0% | 32
4 | 36.8% | 19 | 9.1% | 33 | 37.5% | 8 | 45.8% | 24 | 45.0% | 20
10 | 42.1% | 19 | 9.1% | 33 | 12.5% | 8 | 21.7% | 23 | 28.6% | 21
14 | 31.6% | 19 | 21.2% | 33 | 50.0% | 8 | 36.4% | 22 | 30.0% | 20
16 | 33.3% | 18 | 32.3% | 31 | 100.0% | 8 | 26.1% | 23 | 57.9% | 19
15 | 50.0% | 12 | 33.3% | 30 | ND | 0 | ND | 0 | 71.4% | 14

SEQ ID NO | Ovary | # of Pairs | Uterus | # of Pairs | Prostate | # of Pairs | Skin | # of Pairs
1 | 18.2% | 11 | 10.5% | 19 | 33.3% | 12 | 66.7% | 3
3 | 30.0% | 6 | 15.8% | 19 | 0.0% | 12 | ND | 0
4 | 27.3% | 11 | 26.3% | 19 | 33.3% | 12 | 0.0% | 3
10 | 66.7% | 12 | 35.3% | 17 | 0.0% | 12 | 0.0% | 3
14 | 27.3% | 11 | 10.5% | 19 | 16.7% | 12 | 33.3% | 3
16 | 36.4% | 11 | 15.8% | 19 | 8.3% | 12 | 0.0% | 3
15 | 42.9% | 7 | ND | 0 | ND | 0 | ND | 0
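For illustration, an entry of Table 2 can be derived from per-pair expression values as sketched below. The input format (a list of tumor and matched-normal values) and the function name are assumptions for this example; the underlying per-sample data are not part of the disclosure.

```python
def percent_overexpressed(pairs, fold=2.0):
    """Summarize matched tumor/normal pairs for one gene and tumor type.

    pairs is a list of (tumor, matched_normal) expression values.
    Returns (percentage rounded to one decimal, number of pairs),
    or ("ND", 0) when no pairs were analyzed.
    """
    if not pairs:
        return "ND", 0
    hits = sum(1 for tumor, normal in pairs if tumor >= fold * normal)
    return round(100.0 * hits / len(pairs), 1), len(pairs)
```

As a check on the arithmetic, one overexpressing pair out of 19 gives 5.3%, and an empty list of pairs is reported as "ND" with 0 pairs.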
[0145]
Sequence CWU 1
1
24 1 8580 DNA Homo sapiens 1 ggtgtccagc tcggatggct gactgctctt
tcacttgtcg ttaacttcca cccagtagtc 60 ttagtctccc cttggacatg
gtaccgaccg tgtgccattc ttatttcact tgaccgagcc 120 cgagctgcag
cccagggtgg aagccgcaag ggctgggcag agttccaggc aagagggacc 180
gagaggcgtg agcagtgctc cgcagcgctt gtcagagaaa tggagcagcg gctcctttgt
240 gggctcttac atccccgtcc tgggtgcaca gcccggatat ttgatctgtg
gggtgattga 300 taagtgaggg agagggggga cgcgatccct ccctccctcc
tccctccctc ctccctcctc 360 actccctcct ccctcctccc tccctcctcc
ctcctccctc ctccctccct tcttcccctc 420 tcctccctcc cgctttactt
ccctcctccc tctcgcctcc ctccctcctt cctgcaagaa 480 gcgttgcccg
ttggctagct gctcggtggg gatctgcctg ccctgggggc gccgcccgcg 540
ctccccgcgg tgctctcgct cctgggctgc gccagtccga ggcggtgccg gctcctttgc
600 ctccccgagt cgcagatgct gcgggcgcct ccgggaaaag atctgggcgg
cgcgctcgct 660 cggtaagttc tgagcactca gggacgcggt ggcgacgcgg
ccagtgagcc ggctttcctc 720 agtccgttgc ctttcccggc cacctctcct
tgcgaggggc accagcgtgg ggaggctggg 780 cgccatccgc ggagggcagc
tcgctggcgg ccgccctcta ccctcaatcc ccactggaga 840 ttcctcaccc
cgggaccgtc cgcgcgggcg tggtcgggct ccgcgcctcg cgcagcgggg 900
tggcacaggc ggccagggag ggcccacgca cccggcgcga gctagaagcc tccggtcggc
960 ctgcagtgcc caagtcccat ggcgagggca gcccgagtgg ccgtcgcggc
tgtaggtccg 1020 catgccgggc accgcaccag gcgtctagca ggtaggggca
gggaaggtag ggctgcgctg 1080 gcggccggtg cccagttacc ggcgatcggg
gacgctcgga gacacctggt ctcccggaag 1140 cgcccttcgg aaatgggatt
cgacccggct tgcgggcggc gggtgtttag aagaagaggc 1200 tgcgggcaag
cagtgccccc tctctggttc cccggactcc tcttagcccc ctcgtggcct 1260
gatgggcggc cgggacggag gtgggctgta agcccgccgg cacccaccgt gtcctgtgga
1320 aggcttggag acctgagcca ggctctactc ggttagatgc gagtgagaca
ggcgacggag 1380 ttcgtcttta agcctccctt gcctcagcaa ggagagggag
cgttttcctt attttaatga 1440 ccgcctttcc ctcccttggg gtcccagttc
accctgaacc tttccacagc cttaaaagcc 1500 cgctctccct agcggagtgc
tgcggcagtt cttcagaaga cacggggcca aatgaaactt 1560 acttaaagtg
gttatcgcgt ttcaggctga tttgttccta gtaactacgt tttggaaagc 1620
agctgtggac tctcaaggac aggcaggaac gaagacctcc ttagggtccg agtgtcgctt
1680 cccacagtta gattacccat gaatttcctt gattctgtag gtctcaagaa
tctcatgagc 1740 ccccaaccac ctccaccttt ctctctgaat ctctctcctg
tctctgaaat tcttcgcaaa 1800 aataatctgt cctcaaggaa tattgaaagc
gtcatcaatg tcttgaagga aattactgta 1860 tctgagaacc gaaactagta
tttgatgttt cactttcaaa tacatttttt tttaataaaa 1920 aggatacctt
taagtaaaac actaccactg ctatttacgt ttaagtagat ttttaattca 1980
tattaagcag tagtgtgctt tttggaaaga tgatggactg ggactcataa tccctgggtt
2040 ttgtttctat ttgtagtttc ttgagtaagt cagcactttc cctggttaag
aaatgacctc 2100 atctgtaaaa tgaacctgac ttctaaggtc ttgttcggct
ctcacgtctt tacttcagtt 2160 aaatattgac ataatacatg tttgttgaat
gaatgactga atgaataaat cttattgctc 2220 taggaactat gtgcttttaa
cctttggaaa tacatagaaa caaatgcctc ctttgctagg 2280 aaagggagct
taattgtgct cccactgcat cagactgctt ccatctaatg atgcaattgc 2340
aatacagggt gggaggccat ccacagggct gtccctctgc ctcagagctc atctcaagtt
2400 tgcccttctc tatggagaag gaaaatttga gtctccagaa ggaaagtatt
ttgctaaacg 2460 ccaagcaaga attagagggt aagggggctt ctgatgcctg
gtccaaggct cttagaagaa 2520 gaagaaggag aaagggagaa agctcaagaa
aataatgcac agatcaccag ctacaggtgg 2580 ccctagtgcc taagtttata
aactaccccc gcccttcccg gagagaaagg agcttgtata 2640 aagggaacaa
tcctagaacc taggcttaac aaggaaggag aaggggagaa gcaagggtca 2700
ttctgttctc aagtggtttg ttgtatctgt gtgtttatct attccacatt tggccaggca
2760 gtgggactcc gggaaagcaa gctgaagtgt tttgtgggac aggcacatca
tttggtcaac 2820 tccttttccc tggttgattc cctccttttc ctgacccatc
tcccccttcc taccagaggc 2880 aaccaaggcc tttccaggca acagcagaca
ttatttaccc cttgcgaatt gattccacaa 2940 tggaagactc ttgggtccaa
ggagctgtaa gataattaag gagaaaacag ataatgaagg 3000 caaaaatcca
acacctggcc attagcaatg ctctggaaag gctatttaaa cccgcattgg 3060
atatcagagc tgggagggcc catacagtct acctacctgc cttttgcaga tggacacagg
3120 aagatccaga agctagtggc acatctagca acagagccag atcagaaccc
aggtaagctc 3180 ggtctcaggc caggattcct tttccatctc ctctctttct
caggagccag tcttcctgca 3240 ccagcttcct cttttctcct agctcccctg
cccctgcagc ctggagggct caaccaccct 3300 tcctttggct cccactccca
gctgaggctc agcctggcag tgcttttctg acacccactt 3360 cttttctcct
tcctccaggc aagaagtgca cgtttaaacc tcattatctg gctaagtata 3420
aacctcaggg agaaaagggg tttgtttttg tttttctcag tctataagct aacattgagc
3480 tgactttcag agtccaggca aatatgtttg gtgggtcacc acccagaaag
gaattctctt 3540 agcactgaat cagggctctg tatgtaaagt ataaatcccc
taagagaact cctcaccctc 3600 ctacacagac acattggtgc atgcacacac
accccactct ctgcacagag agcatctgag 3660 ctaggagctg gcaagtgggg
caccagtcct gtagcaaaga ggcacaccac acacacacac 3720 acacacacac
acacacactc tctctctctc tctctctcac tcacacacac acacacacac 3780
acacacacac acacacacgc ttctcccttg cttgggaatc tgggagaaga accccccacc
3840 cccacccctg ccctccatag gcattgtgta ggtgagagaa agagggagga
gtgagagaga 3900 acacacagag gggcccaagg aggtgccagg ccattagcag
ggcccctcct tgagaaaccc 3960 ctctgcagga gcttctcctg ccgccagcca
ggttggaggt ggagtagttc agaatcaact 4020 gacgcagccg ggaattgagc
tttgcaaagc cacttgcaag gaagggaagc atctgcccaa 4080 ccctccccca
ccgcgcgccc tggatcctct gcctgccccc tcccccgtga cgtcacccta 4140
gtcctgtccc ggggagcctg caaagcctct cagattcaaa ctgctagacg cactgctgcc
4200 accgccaccg aattggaaac gcgcgcccag gctccgtcgt cgccttcgcc
cgccgaccgg 4260 gccagccggc tctccgacct ccctacagaa tcgcacccca
gtccctccct ggcagctcgg 4320 cttccctcag ctccaactct tctcttccgc
tcctgcctcc tgtcggattt ttaatttctg 4380 cgcaccccca gtcaaattaa
atcaaccaac aaaaagcagg gcatcccccc tggaagcagc 4440 gtcttatttt
accttgttct cccacttcct gaagatgcta aactcctggt ggactgcaga 4500
ggagagggat tcagtcttct cctgatgtgt gagtaacccc cacctcgcac tgtctttccc
4560 atctctatct ctcctctaca cctgccgcca gccccctgat tcctgatttt
cccaccccct 4620 ttttgcgctt tttttttttt tcctaaagcg attgcgattt
ctgctgggag ctcaagacgg 4680 gcgagctgcc cgagatctct tcgagatacc
ccaggggagg aggagatggg caggatttag 4740 taggacaact cggttactaa
tgacttggcg gctggctgcg accccccggg aaatcaggtg 4800 caagcatgtg
tgttcccggg gcgcgtgtgt gggtggcctc gggatggggg aatagggagc 4860
ggagaaaaga aaccgctttg gaaaatgcaa tgatttcatt ctgccgtgtt gctaaccccc
4920 tcattctccc tcgctcccac tccgcctcct tgctttacca tttttaatcg
ggcttccttg 4980 tttctttctt ccctgcaccc gcttcttccc cctgccccca
cctaaggttt gcctgtaggt 5040 acctgagttg acaccgaagg tgcctaaaga
tgctgagcgg cgtttggttc ctcagtgtgt 5100 taaccgtggc cgggatctta
cagacagaga gtcgcaaaac tgccaaagac atttgcaaga 5160 tccgctgtct
gtgcgaagaa aaggaaaacg tactgaatat caactgtgag aacaaaggat 5220
ttacaacagt tagcctgctc cagccccccc agtatcgaat ctatcagctt tttctcaatg
5280 gaaacctctt gacaagactg tatccaaacg aatttgtcaa ttactccaac
gcggtgactc 5340 ttcacctagg taacaacggg ttacaggaga tccgaacggg
ggcattcagt ggcctgaaaa 5400 ctctcaaaag actgcatctc aacaacaaca
agcttgagat attgagggag gacaccttcc 5460 taggcctgga gagcctggag
tatctccagg ccgactacaa ttacatcagt gccatcgagg 5520 ctggggcatt
cagcaaactt aacaagctca aagtgctcat cctgaatgac aaccttctgc 5580
tttcactgcc cagcaatgtg ttccgctttg tcctgctgac ccacttagac ctcaggggga
5640 ataggctaaa agtaatgcct tttgctggcg tccttgaaca tattggaggg
atcatggaga 5700 ttcagctgga ggaaaatcca tggaattgca cttgtgactt
acttcctctc aaggcctggc 5760 tagacaccat aactgttttt gtgggagaga
ttgtctgtga gactcccttt aggttgcatg 5820 ggaaagacgt gacccagctg
accaggcaag acctctgtcc cagaaaaagt gccagtgatt 5880 ccagtcagag
gggcagccat gctgacaccc acgtccaaag gctgtcacct acaatgaatc 5940
ctgctctcaa cccaaccagg gctccgaaag ccagccggcc gcccaaaatg agaaatcgtc
6000 caactccccg agtgactgtg tcaaaggaca ggcaaagttt tggacccatc
atggtgtacc 6060 agaccaagtc tcctgtgcct ctcacctgtc ccagcagctg
tgtctgcacc tctcagagct 6120 cagacaatgg tctgaatgta aactgccaag
aaaggaagtt cactaatatc tctgacctgc 6180 agcccaaacc gaccagtcca
aagaaactct acctaacagg gaactatctt caaactgtct 6240 ataagaatga
cctcttagaa tacagttctt tggacttact gcacttagga aacaacagga 6300
ttgcagtcat tcaggaaggt gcctttacaa acctgaccag tttacgcaga ctttatctga
6360 atggcaatta ccttgaagtg ctgtaccctt ctatgtttga tggactgcag
agcttgcaat 6420 atctctattt agagtataat gtcattaagg aaattaagcc
tctgaccttt gatgctttga 6480 ttaacctaca gctactgttt ctgaacaaca
accttcttcg gtccttacct gataatatat 6540 ttggggggac ggccctaacc
aggctgaatc tgagaaacaa ccatttttct cacctgcccg 6600 tgaaaggggt
tctggatcag ctcccggctt tcatccagat agatctgcag gagaacccct 6660
gggactgtac ctgtgacatc atggggctga aagactggac agaacatgcc aattcccctg
6720 tcatcattaa tgaggtgact tgcgaatctc ctgctaagca tgcaggggag
atactaaaat 6780 ttctggggag ggaggctatc tgtccagaca gcccaaactt
gtcagatgga accgtcttgt 6840 caatgaatca caatacagac acacctcggt
cgcttagtgt gtctcctagt tcctatcctg 6900 aactacacac tgaagttcca
ctgtctgtct taattctggg attgcttgtt gttttcatct 6960 tatctgtctg
ttttggggct ggtttattcg tctttgtctt gaaacgccga aagggagtgc 7020
cgagcgttcc caggaatacc aacaacttag acgtaagctc ctttcaatta cagtatgggt
7080 cttacaacac tgagactcac gataaaacag acggccatgt ctacaactat
atccccccac 7140 ctgtgggtca gatgtgccaa aaccccatct acatgcagaa
ggaaggagac ccagtagcct 7200 attaccgaaa cctgcaagag ttcagctata
gcaacctgga ggagaaaaaa gaagagccag 7260 ccacacctgc ttacacaata
agtgccactg agctgctaga aaagcaggcc acaccaagag 7320 agcctgagct
gctgtatcaa aatattgctg agcgagtcaa ggaacttccc agcgcaggcc 7380
tagtccacta taacttttgt accttaccta aaaggcagtt tgccccttcc tatgaatctc
7440 gacgccaaaa ccaagacaga atcaataaaa ccgttttata tggaactccc
aggaaatgct 7500 ttgtggggca gtcaaaaccc aaccaccctt tactgcaagc
taagccgcaa tcagaaccgg 7560 actacctcga agttctggaa aaacaaactg
caatcagtca gctgtgaagg gaaatcattt 7620 acaaccctaa ggcatcagag
gatgctgctc cgaactgttg gaaacaagga cattagcttt 7680 tgtgtttgtt
tttgttctcc ctttcccagt gttaatgggg gactttgaaa atgtttggga 7740
gataggatga agtcatgatt ttgcttttgc aagttttcct ttaaattatt tctctctcgc
7800 tctcctcccc tccttttttt tttttttttt ttctttttcc cttctcttct
taggaaccat 7860 cagtggacat gaatgtttct acaatgcatt tcttcataga
ttttgtttat ggttttgttt 7920 cttttttctt ctttgttttt cagtgtggga
gtgggaagag gagattatag tgactgaaga 7980 aagaataggc aaacttttca
aatgaaaatg gatatttagt gtattttgta gaagatctcc 8040 aaagatcttt
tgtgactaca acttcttttg taaataatga tatatggtat ttccatcgtc 8100
agttaccgag tatagccact gggtatcact actttgtgtt aaagtgcctt cgcactttaa
8160 gtacattact taaatgttgc ttttagcttt gataaattga aaatatttta
atgtgttgta 8220 tttttgaaat tgaaaacact gtaaaataga ttgatgtgtc
agctatatta agtcaacgta 8280 cagtttgctt gagttataga aaccagcctg
tcatcaaatg attctagttc taggactttg 8340 taggcttaac tataaaatat
ttcctttcct ctgggtttaa gtgattttat ttaagtcaac 8400 taaggggatt
taacagtgga ctagaggtaa taagccacct cagtcaggat taataattca 8460
ttaataaaat atatttaacc caatatcaga gtgaattgag caattaatgc ccttccgtaa
8520 atcattattt tacactaaca tggtgagtgt tttagattat tttcctaatt
aaaagaacgt 8580 2 2667 DNA Homo sapiens 2 ttcttccctg cacccgcttc
ttccccctgc ccccacctaa ggtttgcctg taggtacctg 60 agttgacacc
gaaggtgcct aaagatgctg agcggcgttt ggttcctcag tgtgttaacc 120
gtggccggga tcttacagac agagagtcgc aaaactgcca aagacatttg caagatccgc
180 tgtctgtgcg aagaaaagga aaacgtactg aatatcaact gtgagaacaa
aggatttaca 240 acagttagcc tgctccagcc cccccagtat cgaatctatc
agctttttct caatggaaac 300 ctcttgacaa gactgtatcc aaacgaattt
gtcaattact ccaacgcggt gactcttcac 360 ctaggtaaca acgggttaca
ggagatccga acgggggcat tcagtggcct gaaaactctc 420 aaaagactgc
atctcaacaa caacaagctt gagatattga gggaggacac cttcctaggc 480
ctggagagcc tggagtatct ccaggccgac tacaattaca tcagtgccat cgaggctggg
540 gcattcagca aacttaacaa gctcaaagtg ctcatcctga atgacaacct
tctgctttca 600 ctgcccagca atgtgttccg ctttgtcctg ctgacccact
tagacctcag ggggaatagg 660 ctaaaagtaa tgccttttgc tggcgtcctt
gaacatattg gagggatcat ggagattcag 720 ctggaggaaa atccatggaa
ttgcacttgt gacttacttc ctctcaaggc ctggctagac 780 accataactg
tttttgtggg agagattgtc tgtgagactc cctttaggtt gcatgggaaa 840
gacgtgaccc agctgaccag gcaagacctc tgtcccagaa aaagtgccag tgattccagt
900 cagaggggca gccatgctga cacccacgtc caaaggctgt cacctacaat
gaatcctgct 960 ctcaacccaa ccagggctcc gaaagccagc cggccgccca
aaatgagaaa tcgtccaact 1020 ccccgagtga ctgtgtcaaa ggacaggcaa
agttttggac ccatcatggt gtaccagacc 1080 aagtctcctg tgcctctcac
ctgtcccagc agctgtgtct gcacctctca gagctcagac 1140 aatggtctga
atgtaaactg ccaagaaagg aagttcacta atatctctga cctgcagccc 1200
aaaccgacca gtccaaagaa actctaccta acagggaact atcttcaaac tgtctataag
1260 aatgacctct tagaatacag ttctttggac ttactgcact taggaaacaa
caggattgca 1320 gtcattcagg aaggtgcctt tacaaacctg accagtttac
gcagacttta tctgaatggc 1380 aattaccttg aagtgctgta cccttctatg
tttgatggac tgcagagctt gcaatatctc 1440 tatttagagt ataatgtcat
taaggaaatt aagcctctga cctttgatgc tttgattaac 1500 ctacagctac
tgtttctgaa caacaacctt cttcggtcct tacctgataa tatatttggg 1560
gggacggccc taaccaggct gaatctgaga aacaaccatt tttctcacct gcccgtgaaa
1620 ggggttctgg atcagctccc ggctttcatc cagatagatc tgcaggagaa
cccctgggac 1680 tgtacctgtg acatcatggg gctgaaagac tggacagaac
atgccaattc ccctgtcatc 1740 attaatgagg tgacttgcga atctcctgct
aagcatgcag gggagatact aaaatttctg 1800 gggagggagg ctatctgtcc
agacagccca aacttgtcag atggaaccgt cttgtcaatg 1860 aatcacaata
cagacacacc tcggtcgctt agtgtgtctc ctagttccta tcctgaacta 1920
cacactgaag ttccactgtc tgtcttaatt ctgggattgc ttgttgtttt catcttatct
1980 gtctgttttg gggctggttt attcgtcttt gtcttgaaac gccgaaaggg
agtgccgagc 2040 gttcccagga ataccaacaa cttagacgta agctcctttc
aattacagta tgggtcttac 2100 aacactgaga ctcacgataa aacagacggc
catgtctaca actatatccc cccacctgtg 2160 ggtcagatgt gccaaaaccc
catctacatg cagaaggaag gagacccagt agcctattac 2220 cgaaacctgc
aagagttcag ctatagcaac ctggaggaga aaaaagaaga gccagccaca 2280
cctgcttaca caataagtgc cactgagctg ctagaaaagc aggccacacc aagagagcct
2340 gagctgctgt atcaaaatat tgctgagcga gtcaaggaac ttcccagcgc
aggcctagtc 2400 cactataact tttgtacctt acctaaaagg cagtttgccc
cttcctatga atctcgacgc 2460 caaaaccaag acagaatcaa taaaaccgtt
ttatatggaa ctcccaggaa atgctttgtg 2520 gggcagtcaa aacccaacca
ccctttactg caagctaagc cgcaatcaga accggactac 2580 ctcgaagttc
tggaaaaaca aactgcaatc agtcagctgt gaagggaaat catttacaac 2640
cctaaggcat cagaggatgc tgctccg 2667 3 6801 DNA Homo sapiens 3
agccggccgt ggtggctccg tgcgtccgag cgtccgtccg cgccgtcggc catggccaag
60 cgctccaggg gccccgggcg ccgctgcctg ttggcgctcg tgctgttctg
cgcctggggg 120 acgctggccg tggtggccca gaagccgggc gcagggtgtc
cgagccgctg cctgtgcttc 180 cgcaccaccg tgcgctgcat gcatctgctg
ctggaggccg tgcccgccgt ggcgccgcag 240 acctccatcc tagatcttcg
ctttaacaga atcagagaga tccaacctgg ggcattcagg 300 cggctgagga
acttgaacac attgcttctc aataataatc agatcaagag gatacctagt 360
ggagcatttg aagacttgga aaatttaaaa tatctctatc tgtacaagaa tgagatccag
420 tcaattgaca ggcaagcatt taagggactt gcctctctag agcaactata
cctgcacttt 480 aatcagatag aaactttgga cccagattcg ttccagcatc
tcccgaagct cgagaggcta 540 tttttgcata acaaccggat tacacattta
gttccaggga catttaatca cttggaatct 600 atgaagagat tgcgactgga
ctcaaacaca cttcactgcg actgtgaaat cctgtggttg 660 gcggatttgc
tgaaaaccta cgcggagtcg gggaacgcgc aggcagcggc catctgtgaa 720
tatcccagac gcatccaggg acgctcagtg gcaaccatca ccccggaaga gctgaactgt
780 gaaaggcccc ggatcacctc cgagccccag gacgcagatg tgacctcggg
gaacaccgtg 840 tacttcacct gcagagccga aggcaacccc aagcctgaga
tcatctggct gcgaaacaat 900 aatgagctga gcatgaagac agattcccgc
ctaaacttgc tggacgatgg gaccctgatg 960 atccagaaca cacaggagac
agaccagggt atctaccagt gcatggcaaa gaacgtggcc 1020 ggagaggtga
agacgcaaga ggtgaccctc aggtacttcg ggtctccagc tcgacccact 1080
tttgtaatcc agccacagaa tacagaggtg ctggttgggg agagcgtcac gctggagtgc
1140 agcgccacag gccacccccc gccgcggatc tcctggacga gaggtgaccg
cacacccttg 1200 ccagttgacc cgcgggtgaa catcacgcct tctggcgggc
tttacataca gaacgtcgta 1260 cagggggaca gcggagagta tgcgtgctct
gcgaccaaca acattgacag cgtccatgcc 1320 accgctttca tcatcgtcca
ggctcttcct cagttcactg tgacgcctca ggacagagtc 1380 gttattgagg
gccagaccgt ggatttccag tgtgaagcca agggcaaccc gccgcccgtc 1440
atcgcctgga ccaagggagg gagccagctc tccgtggacc ggcggcacct ggtcctgtca
1500 tcgggaacac ttagaatctc tggtgttgcc ctccacgacc agggccagta
cgaatgccag 1560 gctgtcaaca tcatcggctc ccagaaggtc gtggcccacc
tgactgtgca gcccagagtc 1620 accccagtgt ttgccagcat tcccagcgac
acaacagtgg aggtgggcgc caatgtgcag 1680 ctcccgtgca gctcccaggg
cgagcccgag ccagccatca cctggaacaa ggatggggtt 1740 caggtgacag
aaagtggaaa atttcacatc agccctgaag gattcttgac catcaatgac 1800
gttggccctg cagacgcagg tcgctatgag tgtgtggccc ggaacaccat tgggtcggcc
1860 tcggtgagca tggtgctcag tgtgaatgac gtcagtcgaa atggagatcc
gtttgtagct 1920 acctccatcg tggaagcgat tgcgactgtt gacagagcta
taaactcaac ccgaacacat 1980 ttgtttgaca gccgtcctcg ttctccaaat
gatttgctgg ccttgttccg gtatccgagg 2040 gatccttaca cagttgaaca
ggcacgggcg ggagaaatct ttgaacggac attgcagctc 2100 attcaggagc
atgtacagca tggcttgatg gtcgacctca acggaacaag ttaccactac 2160
aacgacctgg tgtctccaca gtacctgaac ctcatcgcaa acctgtcggg ctgtaccgcc
2220 caccggcgcg tgaacaactg ctcggacatg tgcttccacc agaagtaccg
gacgcacgac 2280 ggcacctgta acaacctgca gcaccccatg tggggcgcct
cgctgaccgc cttcgagcgc 2340 ctgctgaaat ccgtgtacga gaatggcttc
aacacccctc ggggcatcaa cccccaccga 2400 ctgtacaacg ggcacgccct
tcccatgccg cgcctggtgt ccaccaccct gatcgggacg 2460 gagaccgtca
cacccgacga gcagttcacc cacatgctga tgcagtgggg ccagttcctg 2520
gaccacgacc tcgactccac ggtggtggcc ctgagccagg cacgcttctc cgacggacag
2580 cactgcagca acgtgtgcag caacgacccc ccctgcttct ctgtcatgat
cccccccaat 2640 gactcccggg ccaggagcgg ggcccgctgc atgttcttcg
tgcgctccag ccctgtgtgc 2700 ggcagcggca tgacttcgct gctcatgaac
tccgtgtacc cgcgggagca gatcaaccag 2760 ctcacctcct acatagacgc
atccaacgtg tacgggagca cggagcatga ggcccgcagc 2820 atccgcgacc
tggccagcca ccgcggcctg ctgcggcagg gcatcgtgca gcggtccggg 2880
aagccgctgc tccccttcgc caccgggccg cccacggagt gcatgcggga cgagaacgag
2940 agccccatcc cctgcttcct ggccggggac caccgcgcca acgagcagct
gggcctgacc 3000 agcatgcaca cgctgtggtt ccgcgagcac aaccgcattg
ccacggagct gctcaagctg 3060 aacccgcact gggacggcga caccatctac
tatgagacca ggaagatcgt gggtgcggag 3120 atccagcaca tcacctacca
gcactggctc ccgaagatcc tgggggaggt gggcatgagg 3180 acgctgggag
agtaccacgg ctacgacccc ggcatcaatg ctggcatctt caacgccttc 3240
gccaccgcgg ccttcaggtt tggccacacg cttgtcaacc cactgcttta ccggctggac
3300 gagaacttcc agcccattgc acaagatcac ctcccccttc acaaagcttt
cttctctccc 3360 ttccggattg tgaatgaggg cggcatcgat ccgcttctca
gggggctgtt cggggtggcg 3420 gggaaaatgc gtgtgccctc gcagctgctg
aacacggagc tcacggagcg gctgttctcc 3480 atggcacaca cggtggctct
ggacctggcg gccatcaaca tccagcgggg ccgggaccac 3540 gggatcccac
cctaccacga ctacagggtc tactgcaatc tatcggcggc acacacgttc 3600
gaggacctga aaaatgagat taaaaaccct gagatccggg agaaactgaa aaggttgtat
3660 ggctcgacac tcaacatcga cctgtttccg gcgctcgtgg tggaggacct
ggtgcctggc 3720 agccggctgg gccccaccct gatgtgtctt ctcagcacac
agttcaagcg cctgcgagat 3780 ggggacaggt tgtggtatga gaaccctggg
gtgttctccc cggcccagct gactcagatc 3840 aagcagacgt cgctggccag
gatcctatgc gacaacgcgg acaacatcac ccgggtgcag 3900 agcgacgtgt
tcagggtggc ggagttccct cacggctacg gcagctgtga cgagatcccc 3960
agggtagacc tccgggtgtg gcaggactgc tgtgaagact gtaggaccag ggggcagttc
4020 aatgcctttt cctatcattt ccgaggcaga cggtctcttg agttcagcta
ccaggaggac 4080 aagccgacca agaaaacaag accacggaaa atacccagtg
ttgggagaca gggggaacat 4140 ctcagcaaca gcacctcagc cttcagcaca
cgctcagatg catctgggac aaatgacttc 4200 agagagtttg ttctggaaat
gcagaagacc atcacagacc tcagaacaca gataaagaaa 4260 cttgaatcac
ggctcagtac cacagagtgc gtggatgccg ggggcgaatc tcacgccaac 4320
aacaccaagt ggaaaaaaga tgcatgcacc atttgtgaat gcaaagacgg gcaggtcacc
4380 tgcttcgtgg aagcttgccc ccctgccacc tgtgctgtcc ccgtgaacat
cccaggggcc 4440 tgctgtccag tctgcttaca gaagagggcg gaggaaaagc
cctaggctcc tgggaggctc 4500 ctcagagttt gtctgctgtg ccatcgtgag
atcgggtggc cgatggcagg gagctgcgga 4560 ctgcagacca ggaaacaccc
agaactcgtg acatttcatg acaacgtcca gctggtgctg 4620 ttacagaagg
cagtgcagga ggcttccaac cagagcatct gcggagaagg aggcacagca 4680
ggtgcctgaa gggaagcagg caggagtcct agcttcacgt tagacttctc aggtttttat
4740 ttaattcttt taaaatgaaa aattggtgct actattaaat tgcacagttg
aatcatttag 4800 gcgcctaaat tgattttgcc tcccaacacc atttcttttt
aaataaagca ggatacctct 4860 atatgtcagc cttgccttgt tcagatgcca
ggagccggca gacctgtcac ccgcaggtgg 4920 ggtgagtctt ggagctgcca
gaggggctca ccgaaatcgg ggttccatca caagctatgt 4980 ttaaaaagaa
aattggtgtt tggcaaacgg aacagaacct ttgatgagag cgttcacagg 5040
gacactgtct gggggtgcag tgcaagcccc cggcctcttc cctgggaacc tctgaactcc
5100 tccttcctct gggctctctg taacatttca ccacacgtca gcatctaatc
ccaagacaaa 5160 cattcccgct gctcgaagca gctgtatagc ctgtgactct
ccgtgtgtca gctccttcca 5220 cacctgatta gaacattcat aagccacatt
tagaaacagg tttgctttca gctgtcactt 5280 gcacacatac tgcctagttg
tgaaccaaat gtgaaaaaac ctccttcatc ccattgtgta 5340 tctgatacct
gccgagggcc aagggtgtgt gttgacaacg ccgctcccag ccggccctgg 5400
ttgcgtccac gtcctgaaca agagccgctt ccggatggct cttcccaagg gaggaggagc
5460 tcaagtgtcg ggaactgtct aacttcaggt tgtgtgagtg cgttaaaaaa
aaaaaaaaaa 5520 aagaatccct atacctcatt tgtattttta aaatgcgtga
tgttttatga aattgtgtcc 5580 attttttagg tattagatat ggcagaaaaa
ccatttccac tatgcaaagt tcttttagac 5640 gtcagtgaaa atcaactctc
atacctcatg gtctctcttt aattgaccaa aaccttccat 5700 ttttctctaa
atacaaagcg atctgtgttc tgagcaacct ttccccgaac acacagcttc 5760
agtgcagcac gctgacctga gtatccacca tgtgccaggc acagtgctgg gcacacgagg
5820 caccaaggtc cgggccacct gcccgcagca aggcccagct gaggtggtgg
agggagcccc 5880 tgaggtcagg ggccgtttcg gttcagggtg gcaggtgtcc
agcactgggg tatggcgtcg 5940 aggcttccat ggggtggggg aggccagctt
ccttctgaca ggatgggcgc atacagtgcc 6000 tggtgtgatt tgtgcacaac
ccgtgttcca ggtgcacatc ctcccaagga gacacccaga 6060 cccttccagc
acgggccggc caagttgctg cggcggaggc agcatttcag ctgtgaggaa 6120
ggtcattgga ttcatgtgtt ttatctgtaa aaatggttgt cttaacttct taacctcata
6180 ttggtaagtg attgataaaa attggttggt gtttcatgac atgtggactt
cttttgaaat 6240 agcaagtcaa atgtagtgac caaattgtgg aagagatttc
tgtcaaatag gaaatgtgta 6300 agttcgtcta aaagctgatg gttatgtaag
ttgctcaggc actcagatga cagcagattc 6360 tgggttctgg gagtgttctg
tgcctcttac atgccctgga ggcctcatgg tctcagtgct 6420 gaggcggcac
acctgtagca cacctgcgta atgtgcggtc tgggccagtc acaaggaatt 6480
gtgttgtcta agccaaaggg ggaagctgac tgtgatttac caaaaaaaat tctgtaattc
6540 aaaccaaaat gtctgcggaa tcaccagttt gatactctct gtaatcagaa
cagtgggcag 6600 tgcctgggtg aacgtgtcta gcagccactg tgcgggatcg
ctgtaacagg agtggaatgt 6660 acatatttat ttacttttct aactgctcca
acagccaaat gcctttttta tgaccattgt 6720 attcagttca ttaccaaaga
aatgtttgca ctttgtaatg atgcctttca gttcaaataa 6780 atgggtcaca
ttttcaaatg g 6801 4 2593 DNA Homo sapiens 4 actcccaaac tccagtgctc
tcatccagag gctcttgtga ttctctttgc aattgtgagg 60 aaaaagatgg
cacaatgcta ataaattgtg aagcaaaagg tatcaagatg gtatctgaaa 120
taagtgtgcc accatcacga cctttccaac taagcttatt aaataacggc ttgacgatgc
180 ttcacacaaa tgacttttct gggcttacca atgctatttc aatacacctt
ggatttaaca 240 atattgcaga tattgagata ggtgcattta atggccttgg
cctcctgaaa caacttcata 300 tcaatcacaa ttctttagaa attcttaaag
aggatacttt ccatggactg gaaaacctgg 360 aattcctgca agcagataac
aattttatca cagtgattga accaagtgcc tttagcaagc 420 tcaacagact
caaagtgtta attttaaatg acaatgctat tgagagtctt cctccaaaca 480
tcttccgatt tgttccttta acccatctag atcttcgtgg aaatcaatta caaacattgc
540 cttatgttgg ttttctcgaa cacattggcc gaatattgga tcttcagttg
gaggacaaca 600 aatgggcctg caattgtgac ttattgcagt taaaaacttg
gttggagaac atgcctccac 660 agtctataat tggtgatgtt gtctgcaaca
gccctccatt ttttaaagga agtatactca 720 gtagactaaa gaaggaatct
atttgcccta ctccaccagt gtatgaagaa catgaggatc 780 cttcaggatc
attacatctg gcagcaacat cttcaataaa tgatagtcgc atgtcaacta 840
agaccacgtc cattctaaaa ctacccacca aagcaccagg tttgatacct tatattacaa
900 agccatccac tcaacttcca ggaccttact gccctattcc ttgtaactgc
aaagtcctat 960 ccccatcagg acttctaata cattgtcagg agcgcaacat
tgaaagctta tcagatctga 1020 gacctcctcc gcaaaatcct agaaagctca
ttctagcggg aaatattatt cacagtttaa 1080 tgaagtctga tctagtggaa
tatttcactt tggaaatgct tcacttggga aacaatcgta 1140 ttgaagttct
tgaagaagga tcgtttatga acctaacgag attacaaaaa ctctatctaa 1200
atggtaacca cctgaccaaa ttaagtaaag gcatgttcct tggtctccat aatcttgaat
1260 acttatatct tgaatacaat gccattaagg aaatactgcc aggaaccttt
aatccaatgc 1320 ctaaacttaa agtcctgtat ttaaataaca acctcctcca
agttttacca ccacatattt 1380 tttcaggggt tcctctaact aaggtaaatc
ttaaaacaaa ccagtttacc catctacctg 1440 taagtaatat tttggatgat
cttgatttgc taacccagat tgaccttgag gataacccct 1500 gggactgctc
ctgtgacctg gttggactgc agcaatggat acaaaagtta agcaagaaca 1560
cagtgacaga tgacatcctc tgcacttccc ccgggcatct cgacaaaaag gaattgaaag
1620 ccctaaatag tgaaattctc tgtccaggtt tagtaaataa cccatccatg
ccaacacaga 1680 ctagttacct tatggtcacc actcctgcaa caacaacaaa
tacggctgat actattttac 1740 gatctcttac ggacgctgtg ccactgtctg
ttctaatatt gggacttctg attatgttca 1800 tcactattgt tttctgtgct
gcagggatag tggttcttgt tcttcaccgc aggagaagat 1860 acaaaaagaa
acaagtagat gagcaaatga gagacaacag tcctgtgcat cttcagtaca 1920
gcatgtatgg ccataaaacc actcatcaca ctactgaaag accctctgcc tcactctatg
1980 aacagcacat ggtgagcccc atggttcatg tctatagaag tccatccttt
ggtccaaagc 2040 atctggaaga ggaagaagag aggaatgaga aagaaggaag
tgatgcaaaa catctccaaa 2100 gaagtctttt ggaacaggaa aatcattcac
cactcacagg gtcaaatatg aaatacaaaa 2160 ccacgaacca atcaacagaa
tttttatcct tccaagatgc cagctcattg tacagaaaca 2220 ttttagaaaa
agaaagggaa cttcagcaac tgggaatcac agaataccta aggaaaaaca 2280
ttgctcagct ccagcctgat atggaggcac attatcctgg agcccacgaa gagctgaagt
2340 taatggaaac attaatgtac tcacgtccaa ggaaggtatt agtggaacag
acaaaaaatg 2400 agtattttga acttaaagct aatttacatg ctgaacctga
ctatttagaa gtcctggagc 2460 agcaaacata gatggagagt ttgagggctt
tcgcagaaat gctgtgattc tgttttaagt 2520 ccataccttg taaataagtg
ccttacgtga gtgtgtcatc aatcagaacc taagcacagc 2580 agtaaactat ggg
2593
5 2606 DNA Homo sapiens 5
actcccaaac tccagtgctc tcatccagag
gctcttgtga ttctctttgc aattgtgagg 60 aaaaagatgg cacaatgcta
ataaattgtg aagcaaaagg tatcaagatg gtatctgaaa 120 taagtgtgcc
accatcacga cctttccaac taagcttatt aaataacggc ttgacgatgc 180
ttcacacaaa tgacttttct gggcttacca atgctatttc aatacacctt ggatttaaca
240 atattgcaga tattgagata ggtgcattta atggccttgg cctcctgaaa
caacttcata 300 tcaatcacaa ttctttagaa attcttaaag aggatacttt
ccatggactg gaaaacctgg 360 aattcctgca agcagataac aattttatca
cagtgattga accaagtgcc tttagcaagc 420 tcaacagact caaagtgtta
attttaaatg acaatgctat tgagagtctt cctccaaaca 480 tcttccgatt
tgttccttta acccatctag atcttcgtgg aaatcaatta caaacattgc 540
cttatgttgg ttttctcgaa cacattggcc gaatattgga tcttcagttg gaggacaaca
600 aatgggcctg caattgtgac ttattgcagt taaaaacttg gttggagaac
atgcctccac 660 agtctataat tggtgatgtt gtctgcaaca gccctccatt
ttttaaagga agtatactca 720 gtagactaaa gaaggaatct atttgcccta
ctccaccagt gtatgaagaa catgaggatc 780 cttcaggatc attacatctg
gcagcaacat cttcaataaa tgatagtcgc atgtcaacta 840 agaccacgtc
cattctaaaa ctacccacca aagcaccagg tttgatacct tatattacaa 900
agccatccac tcaacttcca ggaccttact gccctattcc ttgtaactgc aaagtcctat
960 ccccatcagg acttctaata cattgtcagg agcgcaacat tgaaagctta
tcagatctga 1020 gacctcctcc gcaaaatcct agaaagctca ttctagcggg
aaatattatt cacagtttaa 1080 tgaagtctga tctagtggaa tatttcactt
tggaaatgct tcacttggga aacaatcgta 1140 ttgaagttct tgaagaagga
tcgtttatga acctaacgag attacaaaaa ctctatctaa 1200 atggtaacca
cctgaccaaa ttaagtaaag gcatgttcct tggtctccat aatcttgaat 1260
acttatatct tgaatacaat gccattaagg aaatactgcc aggaaccttt aatccaatgc
1320 ctaaacttaa agtcctgtat ttaaataaca cctcctccaa gttttaccac
cacatatttt 1380 ttcaggggtt cctctaacta aggtaaatct taaacaaacc
agtttaccca tctacctgta 1440 agtaatattt ggatgatctt gatttactaa
cccagattga ccttgaggat aacccctggg 1500 ctgctcctgt gacctggttg
gactgcagca atggatacaa aagttaagca agaacacagt 1560 gacagatgac
atcctctgca cttcccccgg gcatctcgac aaaaaggaat tgaaagccct 1620
aaatagtgaa attctctgtc caggtttagt aaataaccca tccatgccaa cacagactag
1680 ttaccttatg gtcaccactc ctgcaacaac aacaaatacg gctgatacta
ttttacgatc 1740 tcttacggac gctgtgccac tgtctgttct aatattggga
cttctgatta tgttcatcac 1800 tattgttttc tgtgctgcag ggatagtggt
tcttgttctt caccgcagga gaagatacaa 1860 aaagaaacaa gtagatgagc
aaatgagaga caacagtcct gtgcatcttc agtacagcat 1920 gtatggccat
aaaaccactc atcacactac tgaaagaccc tctgcctcac tctatgaaca 1980
gcacatggtg agccccatgg ttcatgtcta tagaagtcca tcctttggtc caaagcatct
2040 ggaagaggaa gaagagagga atgagaaaga aggaagtgat gcaaaacatc
tccaaagaag 2100 tcttttggaa caggaaaatc attcaccact cacagggtca
aatatgaaat acaaaaccac 2160 gaaccaatca acagaatttt tatccttcca
agatgccagc tcattgtaca gaaacatttt 2220 agaaaaagaa agggaacttc
agcaactggg aatcacagaa tacctaagga aaaacattgc 2280 tcagctccag
cctgatatgg aggcacatta tcctggagcc cacgaagagc tgaagttaat 2340
ggaaacatta atgtactcac gtccaaggaa ggtattagtg gaacagacaa aaaatgagta
2400 ttttgaactt aaagctaatt tacatgctga acctgactat ttagaagtcc
tggagcagca 2460 aacatagatg gagagtttga gggctttcgc agaaatgctg
tgattctgtt ttaagtccat 2520 accttgtaaa taagtgcctt acgtgagtgt
gtcatcaatc agaacctaag cacagcagtt 2580 aacttgggaa aaaaaaaaaa aaaaaa
2606
6 2574 DNA Homo sapiens 6
tcatcacatg acaacatgaa gctgtggatt
catctctttt attcatctct ccttgcctgt 60 atatctttac actcccaaac
tccagtgctc tcatccagag gctcttgtga ttctctttgc 120 aattgtgagg
aaaaagatgg cacaatgcta ataaattgtg aagcaaaagg tatcaagatg 180
gtatctgaaa taagtgtgcc accatcacga cctttccaac taagcttatt aaataacggc
240 ttgacgatgc ttcacacaaa tgacttttct gggcttacca atgctatttc
aatacacctt 300 ggatttaaca atattgcaga tattgagata ggtgcattta
atggccttgg cctcctgaaa 360 caacttcata tcaatcacaa ttctttagaa
attcttaaag aggatacttt ccatggactg 420 gaaaacctgg aattcctgca
agcagataac aattttatca cagtgattga accaagtgcc 480 tttagcaagc
tcaacagact caaagtgtta attttaaatg acaatgctat tgagagtctt 540
cctccaaaca tcttccgatt tgttccttta acccatctag atcttcgtgg aaatcaatta
600 caaacattgc cttatgttgg ttttctcgaa cacattggcc gaatattgga
tcttcagttg 660 gaggacaaca aatgggcctg caattgtgac ttattgcagt
taaaaacttg gttggagaac 720 atgcctccac agtctataat tggtgatgtt
gtctgcaaca gccctccatt ttttaaagga 780 agtatactca gtagactaaa
gaaggaatct atttgcccta ctccaccagt gtatgaagaa 840 catgaggatc
cttcaggatc attacatctg gcagcaacat cttcaataaa tgatagtcgc 900
atgtcaacta agaccacgtc cattctaaaa ctacccacca aagcaccagg tttgatacct
960 tatattacaa agccatccac tcaacttcca ggaccttact gccctattcc
ttgtaactgc 1020 aaagtcctat ccccatcagg acttctaata cattgtcagg
agcgcaacat tgaaagctta 1080 tcagatctga gacctcctcc gcaaaatcct
agaaagctca ttctagcggg aaatattatt 1140 cacagtttaa tgaagtctga
tctagtggaa tatttcactt tggaaatgct tcacttggga 1200 aacaatcgta
ttgaagttct tgaagaagga tcgtttatga acctaacgag attacaaaaa 1260
ctctatctaa atggtaacca cctgaccaaa ttaagtaaag gcatgttcct tggtctccat
1320 aatcttgaat acttatatct tgaatacaat gccattaagg aaatactgcc
aggaaccttt 1380 aatccaatgc ctaaacttaa agtcctgtat ttaaataaca
acctcctcca agttttacca 1440 ccacatattt tttcaggggt tcctctaact
aaggtaaatc ttaaaacaaa ccagtttacc 1500 catctacctg taagtaatat
tttggatgat cttgatttac taacccagat tgaccttgag 1560 gataacccct
gggactgctc ctgtgacctg gttggactgc agcaatggat acaaaagtta 1620
agcaagaaca cagtgacaga tgacatcctc tgcacttccc ccgggcatct cgacaaaaag
1680 gaattgaaag ccctaaatag tgaaattctc tgtccaggtt tagtaaataa
cccatccatg 1740 ccaacacaga ctagttacct tatggtcacc actcctgcaa
caacaacaaa tacggctgat 1800 actattttac gatctcttac ggacgctgtg
ccactgtctg ttctaatatt gggacttctg 1860 attatgttca tcactattgt
tttctgtgct gcagggatag tggttcttgt tcttcaccgc 1920 aggagaagat
acaaaaagaa acaagtagat gagcaaatga gagacaacag tcctgtgcat 1980
cttcagtaca gcatgtatgg ccataaaacc actcatcaca ctactgaaag accctctgcc
2040 tcactctatg aacagcacat ggtgagcccc atggttcatg tctatagaag
tccatccttt 2100 ggtccaaagc atctggaaga ggaagaagag aggaatgaga
aagaaggaag tgatgcaaaa 2160 catctccaaa gaagtctttt ggaacaggaa
aatcattcac cactcacagg gtcaaatatg 2220 aaatacaaaa ccacgaacca
atcaacagaa tttttatcct tccaagatgc cagctcattg 2280 tacagaaaca
ttttagaaaa agaaagggaa cttcagcaac tgggaatcac agaataccta 2340
aggaaaaaca ttgctcagct ccagcctgat atggaggcac attatcctgg agcccacgaa
2400 gagctgaagt taatggaaac attaatgtac tcacgtccaa ggaaggtatt
agtggaacag 2460 acaaaaaatg agtattttga acttaaagct aatttacatg
ctgaacctga ctatttagaa 2520 gtcctggagc agcaaacata gatggagagt
ttgagggctt tcgcagaaat gctg 2574
7 1504 DNA Homo sapiens 7
attgccttat gttggttttc tcgaacacat tggccgaata ttggatcttc agttggagga
60 caacaaatgg gcctgcaatt gtgacttatt gcagttaaaa acttggttgg
agaacatgcc 120 tccacagtct ataattggtg atgttgtctg caacagccct
ccatttttta aaggaagtat 180 actcagtaga ctaaagaagg aatctatttg
ccctactcca ccagtgtatg aagaacatga 240 ggatccttca ggatcattac
atctggcagc aacatcttca ataaatgata gtcgcatgtc 300 aactaagacc
acgtccattc taaaactacc caccaaagca ccaggtttga taccttatat 360
tacaaagcca tccactcaac ttccaggacc ttactgccct attccttgta actgcaaagt
420 cctatcccca tcaggacttc taatacattg tcaggagcgc aacattgaaa
gcttatcaga 480 tctgagacct cctccgcaaa atcctagaaa gctcattcta
gcgggaaata ttattcacag 540 tttaatgaat ccatcctttg gtccaaagca
tctggaagag gaagaagaga ggaatgagaa 600 agaaggaagt gatgcaaaac
atctccaaag aagtcttttg gaacaggaaa atcattcacc 660 actcacaggg
tcaaatatga aatacaaaac cacgaaccaa tcaacagaat ttttatcctt 720
ccaagatgcc agctcattgt acagaaacat tttagaaaaa gaaagggaac ttcagcaact
780 gggaatcaca gaatacctaa ggaaaaacat tgctcagctc cagcctgata
tggaggcaca 840 ttatcctgga gcccacgaag agctgaagtt aatggaaaca
ttaatgtact cacgtccaag 900 gaaggtatta gtggaacaga caaaaaatga
gtattttgaa cttaaagcta atttacatgc 960 tgaacctgac tatttagaag
tcctggagca gcaaacatag atggagagtt tgagggcttt 1020 cgcagaaatg
ctgtgattct gttttaagtc cataccttgt aaataagtgc cttacgtgag 1080
tgtgtcatca atcagaacct aagcacagca gtaaactatg gggaaaaaaa aagaagaaga
1140 aaaagaaact cagggatcac tgggagaagc catggcatta tcttcaggca
atttagtctg 1200 tcccaaataa aataaatcct tgcatgtaaa tcattcaagg
attatagtaa tatttcatat 1260 actgaaaagt gtctcatagg agtcctcttg
cacatctaaa aaggctgaac atttaagtat 1320 cccgaatttt cttgaattgc
tttccctata gattaattac aattggattt catcatttaa 1380 aaaccatact
tgtatatgta gttataatat gtaaggaata cattgtttat aaccagtatg 1440
tacttcaaaa atgtgtattg tcaaacatac ctaactttct tgcaataaat gcaaaagaaa
1500 ctgg 1504
8 3131 DNA Homo sapiens misc_feature (1)..(3131) "n" is A, C, G, or T 8
agcgtcgaca acaagaaata ctagaaaagg aggaaggaga
acattgctgc agcttggatc 60 tacaacctaa gaaagcaaga gtgatcaatc
tcagctctgt taaacatctt gtttacttac 120 tgcattcagc agcttgcaaa
tggttaacta tatgcaaaaa agtcagcata gctgtgaagt 180 atgccgtgaa
ttttaattga gggaaaaagg gacaattgct tcaggatgct ctagtatgca 240
ctctgcttga aatattttca atgaaatgct cagtattcta tctttgacca gaggttttaa
300 ctttatgaag ctatgggact tgacaaaaag tgatatttga gaagaaagta
cgcagtggtt 360 ggtgttttct tttttttaat aaaggaattg aattactttg
aacacctctt ccagctgtgc 420 attacagata acgtcaggaa gagtctctgc
tttacaggta atcggatttc atcacatgac 480 aacatgaagc tgtggattca
tctcttttat tcatctctcc tngcctgtat atctttacac 540 tcccaaactc
cagtgctctc atccagaggc tcttgtgatt ctctttgcaa ttgtgaggaa 600
aaagatggca caatgctaat aaattgtgaa gcaaaaggta tcaagatggt atctgaaata
660 agtgtgccac catcacgacc tttccaacta agcttattaa ataacggctt
gacgatgctt 720 cacacaaatg acttttctgg gcttaccaat gctatttcaa
tacaccttgg atttaacaat 780 attgcagata ttgagatagg tgcatttaat
ggccttggcc tcctgaaaca acttcatatc 840 aatcacaatt ctttagaaat
tcttaaagag gatactttcc atggactgga aaacctggaa 900 ttcctgcaag
cagataacaa ttttatcaca gtgattgaac caagtgcctt tagcaagctc 960
aacagactca aagtgttaat tttaaatgac aatgctattg agagtcttcc tccaaacatc
1020 ttccgatttg ttcctttaac ccatctagat cttcgtggaa atcaattaca
aacattgcct 1080 tatgttggtt ttctcgaaca cattggccga atattggatc
ttcagttgga ggacaacaaa 1140 tgggcctgca attgtgactt attgcagtta
aaaacttggt tggagaacat gcctccacag 1200 tctataattg gtgatgttgt
ctgcaacagc cctccatttt ttaaaggaag tatactcagt 1260 agactaaaga
aggaatctat ttgccctact ccaccagtgt atgaagaaca tgaggatcct 1320
tcaggatcat tacatctggc agcaacatct tcaataaatg atagtcgcat gtcaactaag
1380 accacgtcca ttctaaaact acccaccaaa gcaccaggtt tgatacctta
tattacaaag 1440 ccatccactc aacttccagg accttactgc cctattcctt
gtaactgcaa agtcctatcc 1500 ccatcaggac ttctaataca ttgtcaggag
cgcaacattg aaagcttatc agatctgaga 1560 cctcctccgc aaaatcctag
aaagctcatt ctagcgggaa atattattca cagtttaatg 1620 aagtctgatc
tagtggaata tttcactttg gaaatgcttc acttgggaaa caatcgtatt 1680
gaagttcttg aagaaggatc gtttatgaac ctaacgagat tacaaaaact ctatctaaat
1740 ggtaaccacc tgaccaaatt aagtaaaggc atgttccttg gtctccataa
tcttgaatac 1800 ttatatcttg aatacaatgc cattaaggaa atactgccag
gaacctttaa tccaatgcct 1860 aaacttaaag tcctgtattt aaataacaac
ctcctccaag ttttaccacc acatattttt 1920 tcaggggttc ctctaactaa
ggtaaatctt aaaacaaacc agtttaccca tctacctgta 1980 agtaatattt
tggatgatct tgatttgcta acccagattg accttgagga taacccctgg 2040
gactgctcct gtgacctggt tggactgcag caatggatac aaaagttaag caagaacaca
2100 gtgacagatg acatcctctg cacttccccc gggcatctcg acaaaaagga
attgaaagcc 2160 ctaaatagtg aaattctctg tccaggttta gtaaataacc
catccatgcc aacacagact 2220 agttacctta tggtcaccac tcctgcaaca
acaacaaata cggctgatac tattttacga 2280 tctcttacgg acgctgtgcc
actgtctgtt ctaatattgg gacttctgat tatgttcatc 2340
actattgttt tctgtgctgc agggatagtg gttcttgttc ttcaccgcag gagaagatac
2400 aaaaagaaac aagtagatga gcaaatgaga gacaacagtc ctgtgcatct
tcagtacagc 2460 atgtatggcc ataaaaccac tcatcacact actgaaagac
cctctgcctc actctatgaa 2520 cagcacatgg tgagccccat ggttcatgtc
tatagaagtc catcctttgg tccaaagcat 2580 ctggaagagg aagaagagag
gaatgagaaa gaaggaagtg atgcaaaaca tctccaaaga 2640 agtcttttgg
aacaggaaaa tcattcacca ctcacagggt caaatatgaa atacaaaacc 2700
acgaaccaat caacagaatt tttatccttc caagatgcca gctcattgta cagaaacatt
2760 ttagaaaaag aaagggaact tcagcaactg ggaatcacag aatacctaag
gaaaaacatt 2820 gctcagctcc agcctgatat ggaggcacat tatcctggag
cccacgaaga gctgaagtta 2880 atggaaacat taatgtactc acgtccaagg
aaggtattag tggaacagac aaaaaatgag 2940 tattttgaac ttaaagctaa
tttacatgct gaacctgact atttagaagt cctggagcag 3000 caaacataga
tggagagttt gagggctttc gcagaaatgc tgtgattctg ttttaagtcc 3060
ataccttgta aataagtgcc ttacgtgagt gtgtcatcaa tcagaaccta agcacagcag
3120 taaactatgg g 3131
9 3227 DNA Homo sapiens 9
cagcagacgg
cagacggtgc ccggcgccca cgggaagctg aagatacagc ggtatacaaa 60
cctccatgtc tcaagctgac cacacttgta tctggacttg ccagctgatt agctctgttc
120 ccaccaggct cgttcaaaga cccacagctt gagggggcag agggctgccc
ctgatggggc 180 ctggcaatga ctgagcaggc ccagccccag aggacaagga
agagaaggca tattgaggag 240 ggcaagaagt gacgcccggt gtagaatgac
tgccctggga gggtggttcc ttgggccctg 300 gcagggttgc tgacccttac
cctgcaaaac acaaagagca ggactccaga ctcttcttgt 360 gaatggtccc
ctgccctgca gctccaccat gaggcttctc gtggccccac tcttgctagc 420
ttgggtggct ggtgccactg ccgctgtgcc cgtggtaccc tggcatgttc cctgcccccc
480 tcagtgtgcc tgccagatcc ggccctggta tacgccccgc tcgtcctacc
gcgaggctac 540 cactgtggac tgcaatgacc tattcctgac ggcagtcccc
ccggcactcc ccgcaggcac 600 acagaccctg ctcctgcaga gcaacagcat
tgtccgtgtg gaccagagtg agctgggcta 660 cctggccaat ctcacagagc
tggacctgtc ccagaacagc ttttcggatg cccgagactg 720 tgatttccat
gcccttcccc agctgctgag cctgcaccta gaggagaacc agctgacccg 780
gctggaggac cacagctttg cagggctggc cagcctacag gaactctatc tcaaccacaa
840 ccagctctac cgcatcgccc ccagggcctt ttctggcctc agcaacttgc
tgcggctgca 900 cctcaactcc aacctcctga gggccattga cagccgctgg
tttgaaatgc tgcccaactt 960 ggagatactc atgattggcg gcaacaaggt
agatgccatc ctggacatga acttccggcc 1020 cctggccaac ctgcgtagcc
tggtgctagc aggcatgaac ctgcgggaga tctccgacta 1080 tgccctggag
gggctgcaaa gcctggagag cctctccttc tatgacaacc agctggcccg 1140
ggtgcccagg cgggcactgg aacaggtgcc cgggctcaag ttcctagacc tcaacaagaa
1200 cccgctccag cgggtagggc cgggggactt tgccaacatg ctgcacctta
aggagctggg 1260 actgaacaac atggaggagc tggtctccat cgacaagttt
gccctggtga acctccccga 1320 gctgaccaag ctggacatca ccaataaccc
acggctgtcc ttcatccacc cccgcgcctt 1380 ccaccacctg ccccagatgg
agaccctcat gctcaacaac aacgctctca gtgccttgca 1440 ccagcagacg
gtggagtccc tgcccaacct gcaggaggta ggtctccacg gcaaccccat 1500
ccgctgtgac tgtgtcatcc gctgggccaa tgccacgggc acccgtgtcc gcttcatcga
1560 gccgcaatcc accctgtgtg cggagcctcc ggacctccag cgcctcccgg
tccgtgaggt 1620 gcccttccgg gagatgacgg accactgttt gcccctcatc
tccccacgaa gcttcccccc 1680 aagcctccag gtagccagtg gagagagcat
ggtgctgcat tgccgggcac tggccgaacc 1740 cgaacccgag atctactggg
tcactccagc tgggcttcga ctgacacctg cccatgcagg 1800 caggaggtgc
cgggtgtacc ccgaggggac cctggagctg cggagggtga cagcagaaga 1860
ggcagggcta tacacctgtg tggcccagaa cctggtgggg gctgacacta agacggttag
1920 tgtggttgtg ggccgtgctc tcctccagcc aggcagggac gaaggacagg
ggctggagct 1980 ccgggtgcag gagacccacc cctatcacat cctgctatct
tgggtcaccc cacccaacac 2040 agtgtccacc aacctcacct ggtccagtgc
ctcctccctc cggggccagg gggccacagc 2100 tctggcccgc ctgcctcggg
gaacccacag ctacaacatt acccgcctcc ttcaggccac 2160 ggagtactgg
gcctgcctgc aagtggcctt tgctgatgcc cacacccagt tggcttgtgt 2220
atgggccagg accaaagagg ccacttcttg ccacagagcc ttaggggatc gtcctgggct
2280 cattgccatc ctggctctcg ctgtccttct cctggcagct gggctagcgg
cccaccttgg 2340 cacaggccaa cccaggaagg gtgtgggtgg gaggcggcct
ctccctccag cctgggcttt 2400 ctggggctgg agtgcccctt ctgtccgggt
tgtgtctgct cccctcgtcc tgccctggaa 2460 tccagggagg aagctgccca
gatcctcaga aggggagaca ctgttgccac cattgtctca 2520 aaattcttga
agctcagcct gttctcagca gtagagaaat cactaggact actttttacc 2580
aaaagagaag cagtctgggc cagatgccct gccaggaaag ggacatggac ccacgtgctt
2640 gaggcctggc agctgggcca agacagatgg ggctttgtgg ccctgggggt
gcttctgcag 2700 ccttgaaaaa gttgccctta cctcctaggg tcacctctgc
tgccattctg aggaacatct 2760 ccaaggaacg ggagggactt tggctagagc
ctcctgcctc cccatcttct ctctgcccag 2820 aggctcctgg gcctggcttg
gctgtcccct acctgtgtcc ccgggctgca ccccttcctc 2880 ttctctttct
ctgtacagtc tcagttgctt gctcttgtgc ctcctgggca agggctgaag 2940
gaggccactc catctcacct cggggggctg ccctcaatgt gggagtgacc ccagccagat
3000 ctgaaggaca tttgggagag ggatgcccag gaacgcctca tctcagcagc
ctgggctcgg 3060 cattccgaag ctgactttct ataggcaatt ttgtaccttt
gtggagaaat gtgtcacctc 3120 ccccaacccg attcactctt ttctcctgtt
ttgtaaaaaa taaaaataaa taataacaat 3180 aatacggggg aaaggaacga
aaggaactaa aaaaaaaaaa aaaaaaa 3227
10 3227 DNA Homo sapiens 10
cagcagacgg cagacggtgc ccggcgccca cgggaagctg aagatacagc ggtatacaaa
60 cctccatgtc tcaagctgac cacacttgta tctggacttg ccagctgatt
agctctgttc 120 ccaccaggct cgttcaaaga cccacagctt gagggggcag
agggctgccc ctgatggggc 180 ctggcaatga ctgagcaggc ccagccccag
aggacaagga agagaaggca tattgaggag 240 ggcaagaagt gacgcccggt
gtagaatgac tgccctggga gggtggttcc ttgggccctg 300 gcagggttgc
tgacccttac cctgcaaaac acaaagagca ggactccaga ctcttcttgt 360
gaatggtccc ctgccctgca gctccaccat gaggcttctc gtggccccac tcttgctagc
420 ttgggtggct ggtgccactg ccgctgtgcc cgtggtaccc tggcatgttc
cctgcccccc 480 tcagtgtgcc tgccagatcc ggccctggta tacgccccgc
tcgtcctacc gcgaggctac 540 cactgtggac tgcaatgacc tattcctgac
ggcagtcccc ccggcactcc ccgcaggcac 600 acagaccctg ctcctgcaga
gcaacagcat tgtccgtgtg gaccagagtg agctgggcta 660 cctggccaat
ctcacagagc tggacctgtc ccagaacagc ttttcggatg cccgagactg 720
tgatttccat gcccttcccc agctgctgag cctgcaccta gaggagaacc agctgacccg
780 gctggaggac cacagctttg cagggctggc cagcctacag gaactctatc
tcaaccacaa 840 ccagctctac cgcatcgccc ccagggcctt ttctggcctc
agcaacttgc tgcggctgca 900 cctcaactcc aacctcctga gggccattga
cagccgctgg tttgaaatgc tgcccaactt 960 ggagatactc atgattggcg
gcaacaaggt agatgccatc ctggacatga acttccggcc 1020 cctggccaac
ctgcgtagcc tggtgctagc aggcatgaac ctgcgggaga tctccgacta 1080
tgccctggag gggctgcaaa gcctggagag cctctccttc tatgacaacc agctggcccg
1140 ggtgcccagg cgggcactgg aacaggtgcc cgggctcaag ttcctagacc
tcaacaagaa 1200 cccgctccag cgggtagggc cgggggactt tgccaacatg
ctgcacctta aggagctggg 1260 actgaacaac atggaggagc tggtctccat
cgacaagttt gccctggtga acctccccga 1320 gctgaccaag ctggacatca
ccaataaccc acggctgtcc ttcatccacc cccgcgcctt 1380 ccaccacctg
ccccagatgg agaccctcat gctcaacaac aacgctctca gtgccttgca 1440
ccagcagacg gtggagtccc tgcccaacct gcaggaggta ggtctccacg gcaaccccat
1500 ccgctgtgac tgtgtcatcc gctgggccaa tgccacgggc acccgtgtcc
gcttcatcga 1560 gccgcaatcc accctgtgtg cggagcctcc ggacctccag
cgcctcccgg tccgtgaggt 1620 gcccttccgg gagatgacgg accactgttt
gcccctcatc tccccacgaa gcttcccccc 1680 aagcctccag gtagccagtg
gagagagcat ggtgctgcat tgccgggcac tggccgaacc 1740 cgaacccgag
atctactggg tcactccagc tgggcttcga ctgacacctg cccatgcagg 1800
caggaggtgc cgggtgtacc ccgaggggac cctggagctg cggagggtga cagcagaaga
1860 ggcagggcta tacacctgtg tggcccagaa cctggtgggg gctgacacta
agacggttag 1920 tgtggttgtg ggccgtgctc tcctccagcc aggcagggac
gaaggacagg ggctggagct 1980 ccgggtgcag gagacccacc cctatcacat
cctgctatct tgggtcaccc cacccaacac 2040 agtgtccacc aacctcacct
ggtccagtgc ctcctccctc cggggccagg gggccacagc 2100 tctggcccgc
ctgcctcggg gaacccacag ctacaacatt acccgcctcc ttcaggccac 2160
ggagtactgg gcctgcctgc aagtggcctt tgctgatgcc cacacccagt tggcttgtgt
2220 atgggccagg accaaagagg ccacttcttg ccacagagcc ttaggggatc
gtcctgggct 2280 cattgccatc ctggctctcg ctgtccttct cctggcagct
gggctagcgg cccaccttgg 2340 cacaggccaa cccaggaagg gtgtgggtgg
gaggcggcct ctccctccag cctgggcttt 2400 ctggggctgg agtgcccctt
ctgtccgggt tgtgtctgct cccctcgtcc tgccctggaa 2460 tccagggagg
aagctgccca gatcctcaga aggggagaca ctgttgccac cattgtctca 2520
aaattcttga agctcagcct gttctcagca gtagagaaat cactaggact actttttacc
2580 aaaagagaag cagtctgggc cagatgccct gccaggaaag ggacatggac
ccacgtgctt 2640 gaggcctggc agctgggcca agacagatgg ggctttgtgg
ccctgggggt gcttctgcag 2700 ccttgaaaaa gttgccctta cctcctaggg
tcacctctgc tgccattctg aggaacatct 2760 ccaaggaacg ggagggactt
tggctagagc ctcctgcctc cccatcttct ctctgcccag 2820 aggctcctgg
gcctggcttg gctgtcccct acctgtgtcc ccgggctgca ccccttcctc 2880
ttctctttct ctgtacagtc tcagttgctt gctcttgtgc ctcctgggca agggctgaag
2940 gaggccactc catctcacct cggggggctg ccctcaatgt gggagtgacc
ccagccagat 3000 ctgaaggaca tttgggagag ggatgcccag gaacgcctca
tctcagcagc ctgggctcgg 3060 cattccgaag ctgactttct ataggcaatt
ttgtaccttt gtggagaaat gtgtcacctc 3120 ccccaacccg attcactctt
ttctcctgtt ttgtaaaaaa taaaaataaa taataacaat 3180 aatacggggg
aaaggaacga aaggaactaa aaaaaaaaaa aaaaaaa 3227
11 592 DNA Homo sapiens misc_feature (1)..(592) "n" is A, C, G, or T 11
ttanntntcc
tagcagtatt tagcaccttt ttgccacctt ggtgaacaga aaattgtatt 60
ttcctgtctt tcatggctga aaacaaaagt aatgggaatt ttaaatacgt ttgcnganac
120 tgcccctccc ctcattgagg gtcactgctc aagagtgcag gagtggactc
tccactgatg 180 ggtctccctc cccatcctgg tttccacccc gggctggcta
gctctgttgg tttgnagact 240 ganagccagc ctggctcatt ctcattattg
gctagttagc tttctttatc aacctgctca 300 ctcacaaatg tgtgccctca
gccagagagt aagaaagccc aaatctgtta cagcttctaa 360 aaaaatagat
ttctaatttg tcctactcat gttaggagca ttatctttga aggtaaaaca 420
tagtgtatca ttgtgtaaac tcccaggctt gatgtagcag aagagatcat ttctggaggc
480 ttcagcaatg ggaatttagc attataagag agatttggac naaccagtcc
aaagtggtcc 540 gagttcttaa aatcccaggg tagggnaact ccactccttc
ctttcttctn tg 592
12 5036 DNA Homo sapiens 12
gaacaggcac agggcctgca
gctgcagcat tcagtgcatc ccagaccctc cagaggccca 60 gggcagggtc
agatcagaca gacctggcag cacagacgtg aagtgtaaac ggcacacagc 120
cccctggact tgggttgatt cttgtatctt gctgctcacc agctatgtga cttgagccaa
180 gttctttaac ctcttcagga cctcagtctc ttaatctgca aactggaggt
attagtgctt 240 atctcatgta cctgttatga gagtcatatg aaaaaagaca
ttcagagtct tcaaagagtc 300 ctgagcaatg gtaggtgctc agtaagtgtt
tgtgcatttc cttgacactt gtagctaaaa 360 acttacctgg aatcagactt
ccactcagga gtcagaccag gctcaagaag gcaacacagt 420 ctggcccctt
tttgtttcca tctttggttc taggccaagg ctggacttgc ctcatggccc 480
tggcatctga gagagaaact ggataggttc ctcatgctca ggatggagga aacagcagtg
540 gcttgcattg ttgattgcag gggtgatgtc taatctcctg aagaaaggaa
ctctggagtc 600 acagagattg attcatacaa tcaacacaca ctgggcaccc
atcccgtact acaggcacaa 660 gaggcaagac aggtctctag ttagagacgt
tttcccacgg tgtggtcaga actgtgacag 720 taggaagcat gagtactgca
agagcataaa gccagcccct tagtcagcca ggtggggagg 780 ttttggagaa
gacttctgca ggtggtgact gaacttaaag ttgagcgttg aagaacaagt 840
aggagttgtc tggaagacaa tgaggaggag acatcagaag gggtggcatg tcacggtaac
900 ctcaacccaa aaggtttgtg tgcagcgggt aagtggggtg gtgacaagag
aagccaggtg 960 aaggggccgg gccaggtcac agaaggtctt acctgcacag
ctgaggagct cagacccagg 1020 cttcttaact ccactgcctc ctcctgggca
acactgccag gctgttggga aaggagaaac 1080 actcctgcct ccagggtggg
gaagttggca aacaccaaag caatatcccc acccagaaga 1140 tcctgggcag
cagaacaccc cagcctgcct ggatgtgtca ggtctgctgc ctttcacctc 1200
tagggtcctg gggagcccat cccagatttt agaagctgta cttatttcca ccaaccctca
1260 cccacctccc acctacactg ccagaggaat ttctggaaac ttatcatttt
atcttccatt 1320 tacccagtga gccaaggagg ctgggaggaa agaggtaaga
aaggttagag aacctacctc 1380 acatctctct gggctcagaa ggactctgaa
gataacaata atttcagccc atccactctc 1440 cttccctccc aaacacacat
gtgcatgtac acacacacat acacacacat acaccttcct 1500 ctccttcact
gaagactcac agtcactcac tctgtgagca ggtcatagaa aaggacacta 1560
aagccttaag gacgggcctg gccattacct ctgcagctcc tttggcttgt tgagtcaaaa
1620 aacatgggag gggccaggca cggtgactca cacctgtaat cccagcattt
tgggagaccg 1680 aggtgagcag atcacttgag gtcaggagtt cgagaccagc
ctggccaaca tggagaaacc 1740 cccatctcta ctaaaaatac aaaaattagc
caggagtggt ggcaggtgcc tgtaatccca 1800 gctactcagg tggctgagcc
aggagaatcg cttgaatcca ggaggcggag gatgcagtca 1860 gctgagtgca
ccgctgcact ccagcctggg tgacagaatg agactctgtc tcaaacaaac 1920
aaacacggga ggaggggtag atactgcttc tctgcaacct ccttaactct gcatcctctt
1980 cttccagggc tgcccctgat ggggcctggc aatgactgag caggcccagc
cccagaggac 2040 aaggaagaga aggcatattg aggagggcaa gaagtgacgc
ccggtgtaga atgactgccc 2100 tgggagggtg gttccttggg ccctggcagg
gttgctgacc cttaccctgc aaaacacaaa 2160 gagcaggact ccagactctt
cttgtgaatg gtcccctgcc ctgcagctcc accatgaggc 2220 ttctcgtggc
cccactcttg ctagcttggg tggctggtgc cactgccgct gtgcccgtgg 2280
taccctggcg tgttccctgc ccccctcagt gtgcctgcca gatccggccc tggtatacgc
2340 cccgctcgtc ctaccgcgag gctaccactg tggactgcaa tgacctattc
ctgacggcag 2400 tccccccggc actccccgca ggcacacaga ccctgctcct
gcagagcaac agcattgtcc 2460 gtgtggacca gagtgagctg ggctaccagg
ccaatctcac agagctggac ctgtcccaga 2520 acagcttttc ggatgcccga
gactgtgatt tccatgccct gccccagctg ctgagcctgc 2580 acctagagga
gaaccagctg acccggctgg aggaccacag ctttgcaggg ctggccagcc 2640
tacaggaact ctatctcaac cacaaccagc tctaccgcat cgcccccagg gccttttctg
2700 gcctcagcaa cttgctgcgg ctgcacctca actccaacct cctgagggcc
attgacagcc 2760 gctggtttga aatgctgccc aacttggaga tactcatgat
tggcggcaac aaggtagatg 2820 ccatcctgga catgaacttc cggcccctgg
ccaacctgcg tagcctggtg ctagcaggca 2880 tgaacctgcg ggagatctcc
gactatgccc tggaggggct gcaaagcctg gagagcctct 2940 ccttctatga
caaccagctg gcccgggtgc ccaggcgggc actggaacag gtgcccgggc 3000
tcaagttcct agacctcaac aagaacccgc tccagcgggt agggccgggg gactttgcca
3060 acatgctgca ccttaaggag ctgggactga acaacatgga ggagctggtc
tccatcgaca 3120 agtttgccct ggtgaacctc cccgagctga ccaagctgga
catcaccaat aacccacggc 3180 tgtccttcat ccacccccgc gccttccacc
acctgcccca gatggagacc ctcatgctca 3240 acaacaacgc tctcagtgcc
ttgcaccagc agacggtgga gtccctgccc aacctgcagg 3300 aggtaggtct
ccacggcaac cccatccgct gtgactgtgt catccgctgg gccaatgcca 3360
cgggcacccg tgtccgcttc atcgagccgc aatccaccct gtgtgcggag cctccggacc
3420 tccagcgcct cccggtccgt gaggtgccct tccgggagat gacggaccac
tgtttgcccc 3480 tcatctcccc acgaagcttc cccccaagcc tccaggtagc
cagtggagag agcatggtgc 3540 tgcattgccg ggcactggcc gaacccgaac
ccgagatcta ctgggtcact ccagctgggc 3600 ttcgactgac acctgcccat
gcaggcagga ggtaccgggt gtaccccgag gggaccctgg 3660 agctgcggag
ggtgacagca gaagaggcag ggctatacac ctgtgtggcc cagaacctgg 3720
tgggggctga cactaagacg gttagtgtgg ttgtgggccg tgctctcctc cagccaggca
3780 gggacgaagg acaggggctg gagctccggg tgcaggagac ccacccctat
cacatcctgc 3840 tatcttgggt caccccaccc aacacagtgt ccaccaacct
cacctggtcc agtgcctcct 3900 ccctccgggg ccagggggcc acagctctgg
cccgcctgcc tcggggaacc cacagctaca 3960 acattacccg cctccttcag
gccacggagt actgggcctg cctgcaagtg gcctttgctg 4020 atgcccacac
ccagttggct tgtgtatggg ccaggaccaa agaggccact tcttgccaca 4080
gagccttagg ggatcgtcct gggctcattg ccatcctggc tctcgctgtc cttctcctgg
4140 cagctgggct agcggcccac cttggcacag gccaacccag gaagggtgtg
ggtgggaggc 4200 ggcctctccc tccagcctgg gctttctggg gctggagtgc
cccttctgtc cgggttgtgt 4260 ctgctcccct cgtcctgccc tggaatccag
ggaggaagct gcccagatcc tcagaagggg 4320 agacactgtt gccaccattg
tctcaaaatt cttgaagctc agcctgttct cagcagtaga 4380 gaaatcacta
ggactacttt ttaccaaaag agaagcagtc tgggccagat gccctgccag 4440
gaaagggaca tggacccacg tgcttgaggc ctggcagctg ggccaagaca gatggggctt
4500 tgtggccctg ggggtgcttc tgcagccttg aaaaagttgc ccttacctcc
tagggtcacc 4560 tctgctgcca ttctgaggaa catctccaag gaacaggagg
gactttggct agagcctcct 4620 gcctccccat cttctctctg cccagaggct
cctgggcctg gcttggctgt cccctacctg 4680 tgtccccggg ctgcacccct
tcctcttctc tttctctgta cagtctcagt tgcttgctct 4740 tgtgcctcct
gggcaagggc tgaaggaggc cactccatct cacctcgggg ggctgccctc 4800
aatgtgggag tgaccccagc cagatctgaa ggacatttgg gagagggatg cccaggaacg
4860 cctcatctca gcagcctggg ctcggcattc cgaagctgac tttctatagg
caattttgta 4920 cctttgtgga gaaatgtgtc acctccccca acccgattca
ctcttttctc ctgttttgta 4980 aaaaataaaa ataaataata acaataaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaa 5036
13 3207 DNA Homo sapiens 13
gagggcgccc gccgagcctc cccggcctgt gcagacggcg cgcgcggcgg gagggcgcgg
60 accagcgtcc ccagcccggc cccgggcgga aggcgagcgg agcgcggccg
cgcgggcagc 120 agacggcaga cggtgcccgg cgcccacggg ggctgcccct
gatggggcct ggcaatgact 180 gagcaggccc agccccagag gacaaggaag
agaaggcata ttgaggaggg caagaagtga 240 cgcccggtgt agaatgactg
ccctgggagg gtggttcctt gggccctggc agggttgctg 300 acccttaccc
tgcaaaacac aaagagcagg actccagact cttcttgtga atggtcccct 360
gccctgcagc tccaccatga ggcttctcgt ggccccactc ttgctagctt gggtggctgg
420 tgccactgcc gctgtgcccg tggtaccctg gcatgttccc tgcccccctc
agtgtgcctg 480 ccagatccgg ccctggtata cgccccgctc gtcctaccgc
gaggctacca ctgtggactg 540 caatgaccta ttcctgacgg cagtcccccc
ggcactcccc gcaggcacac agaccctgct 600 cctgcagagc aacagcattg
tccgtgtgga ccagagtgag ctgggctacc tggccaatct 660 cacagagctg
gacctgtccc agaacagctt ttcggatgcc cgagactgtg atttccatgc 720
cctgccccag ctgctgagcc tgcacctaga ggagaaccag ctgacccggc tggaggacca
780 cagctttgca gggctggcca gcctacagga actctatctc aaccacaacc
agctctaccg 840 catcgccccc agggcctttt ctggcctcag caacttgctg
cggctgcacc tcaactccaa 900 cctcctgagg gccattgaca gccgctggtt
tgaaatgctg cccaacttgg agatactcat 960 gattggcggc aacaaggtag
atgccatcct ggacatgaac ttccggcccc tggccaacct 1020 gcgtagcctg
gtgctagcag gcatgaacct gcgggagatc tccgactatg ccctggaggg 1080
gctgcaaagc ctggagagcc tctccttcta tgacaaccag ctggcccggg tgcccaggcg
1140 ggcactggaa caggtgcccg ggctcaagtt cctagacctc aacaagaacc
cgctccagcg 1200 ggtagggccg ggggactttg ccaacatgct gcaccttaag
gagctgggac tgaacaacat 1260 ggaggagctg gtctccatcg acaagtttgc
cctggtgaac ctccccgagc tgaccaagct 1320 ggacatcacc aataacccac
ggctgtcctt catccacccc cgcgccttcc accacctgcc 1380 ccagatggag
accctcatgc tcaacaacaa cgctctcagt gccttgcacc agcagacggc 1440
ggagtccctg cccaacctgc aggaggtagg tctccacggc aaccccatcc gctgtgactg
1500 tgtcatccgc tgggccaatg ccacgggcac ccgtgtccgc ttcatcgagc
cgcaatccac 1560 cctgtgtgcg gagcctccgg acctccagcg cctcccggtc
cgtgaggtgc ccttccggga 1620 gatgacggac cactgtttgc ccctcatctc
cccacgaagc ttccccccaa gcctccaggt 1680 agccagtgga gagagcatgg
tgctgcattg ccgggcactg gccgaacccg aacccgagat 1740 ctactgggtc
actccagctg ggcttcgact gacacctgcc catgcaggca ggaggtaccg 1800
ggtgtacccc gaggggaccc tggagctgcg gagggtgaca gcagaagagg cggggctata
1860 cacctgtgtg gcccagaacc tggtgggggc tgacactaag acggttagtg
tggttgtggg 1920 ccgtgctctc ctccagccag
gcagggacga aggacagggg ctggagctcc gggtgcagga 1980 gacccacccc
tatcacatcc tgctatcttg ggtcacccca cccaacacag tgtccaccaa 2040
cctcacctgg tccagtgcct cctccctccg gggccagggg gccacagctc tggcccgcct
2100 gcctcgggga acccacagct acaacattac ccgcctcctt caggccacgg
agtactgggc 2160 ctgcctgcaa gtggcctttg ctgatgccca cacccagttg
gcttgtgtat gggccaggac 2220 caaagaggcc acttcttgcc acagagcctt
aggggaccgt cctgggctca ttgccatcct 2280 ggctctcgct gtccttctcc
tggcagctgg gctagcggcc caccttggca caggccaacc 2340 caggaagggt
gtgggtggga ggcggcctct ccctccagcc tgggctttct ggggctggag 2400
tcccccttct gtccgggttg tgtctgctcc cctcgtcctg ccctggaatc cagggaggaa
2460 gctgcccaga tcctcagaag gggagacact gttgccacca ttgtctcaaa
attcttgaag 2520 ctcagcctgt tctcagcagt agagaaatca ctaggactac
tttttaccaa aagagaagca 2580 gtctgggcca gatgccctgc caggaaaggg
acatggaccc acgtgcttga ggcctggcag 2640 ctgggccaag acagatgggg
ctttgtggcc ctgggggtgc ttctgcagcc tcgaaaaagt 2700 tgcccttacc
tcctagggtc acctctgctg ccattctgag gaacatctcc aaggaacagg 2760
agggactttg gctagagcct cctgcctccc catcttctct ctgcccagag gctcctgggc
2820 ctggcttggc tgtcccctac ctgtgtcccc gggctgcacc ccttcctctt
ctctttctct 2880 gtacagtctc agttgcttgc tcttgtgcct cctgggcaag
ggctgaagga ggccactcca 2940 tctcacctcg gggggctgcc ctcaatgtgg
gagtgacccc agccagatct gaaggacatt 3000 tgggagaggg atgcccagga
acgcctcatc tcagcagcct gggctcggca ttccgaagct 3060 gactttctat
aggcaatttt gtacctttgt ggagaaatgt gtcacctccc ccaacccgat 3120
tcactctttt ctcctgtttt gtaaaaaata aaaataaata ataacaataa tacgggggaa
aggaacgaaa ggaaaaaaaa aaaaaaa 3207
14 3170 DNA Homo sapiens 14
ggctcaccga caacttcatc gccgccgtgc gccgccgaga cttcgccaac atgaccagcc
60 tggtgcacct cactctctcc cggaacacca tcggccaggt ggcagctggc
gccttcgccg 120 acctgcgtgc cctccgggcc ctgcacctgg acagcaaccg
cctggcggag gtgcgcggcg 180 accagctccg cggcctgggc aacctccgcc
acctgatcct tggaaacaac cagatccgcc 240 gggtggagtc ggcggccttt
gacgccttcc tgtccaccgt ggaggacctg gatctgtcct 300 acaacaacct
ggaggccctg ccgtgggagg cggtgggcca gatggtgaac ctaaacaccc 360
tcacgctgga ccacaacctc atcgaccaca tcgcggaggg gaccttcgtg cagcttcaca
420 agctggtccg tctggacatg acctccaacc gcctgcataa actcccgccc
gacgggctct 480 tcctgaggtc gcagggcacc gggcccaagc cgcccacccc
gctgaccgtc agcttcggcg 540 gcaaccccct gcactgcaac tgcgagctgc
tctggctgcg gcggctgacc cgcgaggacg 600 acttagagac ctgcgccacg
cccgaacacc tcaccgaccg ctacttctgg tccatccccg 660 aggaggagtt
cctgtgtgag cccccgctga tcacacggca ggcggggggc cgggccctgg 720
tggtggaagg ccaggcggtg agcctgcgct gccgagcggt gggtgacccc gagccggtgg
780 tgcactgggt ggcacctgat gggcggctgc tggggaactc cagccggacc
cgggtccggg 840 gggacgggac gctggatgtg accatcacca ccttgaggga
cagtggcacc ttcacttgta 900 tcgcctccaa tgctgctggg gaagcgacgg
cgcccgtgga ggtgtgcgtg gtacctctgc 960 ctctgatggc acccccgccg
gctgccccgc cgcctctcac cgagcccggc tcctctgaca 1020 tcgccacgcc
gggcagacca ggtgccaacg attctgcggc tgagcgtcgg ctcgtggcag 1080
ccgagctcac ctcgaactcc gtgctcatcc gctggccagc ccagaggcct gtgcccggaa
1140 tacgcatgta ccaggttcag tacaacagtt ccgttgatga ctccctcgtc
tacaggatga 1200 tcccgtccac cagtcagacc ttcctggtga atgacctggc
ggcgggccgt gcctacgact 1260 tgtgcgtgct ggcggtctac gacgacgggg
ccacagcgct gccggcaacg cgagtggtgg 1320 gctgtgtaca gttcaccacc
gctggggatc cggcgccctg ccgcccgctg agggcccatt 1380 tcttgggcgg
caccatgatc atcgccatcg ggggcgtcat cgtcgcctcg gtcctcgtct 1440
tcatcgttct gctcatgatc cgctataagg tgtatggcga cggggacagc cgccgcgtca
1500 agggctccag gtcgctcccg cgggtcagcc acgtgtgctc gcagaccaac
ggcgcaggca 1560 caggcgcggc acaggccccg gccctgccgg cccaggacca
ctacgaggcg ctgcgcgagg 1620 tggagtccca ggctgccccc gccgtcgccg
tcgaggccaa ggccatggag gccgagacgg 1680 catccgcgga gccggaggtg
gtccttggac gttctctggg cggctcggcc acctcgctgt 1740 gcctgctgcc
atccgaggaa acttccgggg aggagtctcg ggccgcggtg ggccctcgaa 1800
ggagccgatc cggcgccctg gagccaccaa cctcggcgcc ccctactcta gctctagttc
1860 ctgggggagc cgcggcccgg ccgaggccgc agcagcgcta ttcgttcgac
ggggactacg 1920 gggcactatt ccagagccac agttacccgc gccgcgcccg
gcggacaaag cgccaccggt 1980 ccacgccgca cctggacggg gctggagggg
gcgcggccgg ggaggatgga gacctggggc 2040 tgggctccgc cagggcgtgc
ctggctttca ccagcaccga gtggatgctg gagagtaccg 2100 tgtgagcggc
gggcgggcgc cgggacgcct gggtgccgca gaccaaacgc ccagccgcac 2160
ggacgctggg gcgggactgg gagaaagcgc agcgccaaga cattggacca gagtggagac
2220 gcgcccttgt ccccgggagg gggcggggca gcctcgggct gcggctcgag
gccacgcccc 2280 cgtgcccagg gcggggttcg gggaccggct gccggcctcc
cttcccctat ggactcctcg 2340 acccccctcc tacccctccc ctcgcgcgct
cgcggacctc gctggagccg gtgccttaca 2400 cagcgaagcg cggggagggg
cagggccccc tgacactgca gcactgagac acgagccccc 2460 tcccccagcc
cgtcacccgg ggccggggcg aggggcccat ttcttgtatc tggctggact 2520
agatcctatt ctgtcccgcg gcggcctcca aagcctccca ccccacccca cgcacattcc
2580 tggtccggtc gggtctggct tggggtcccc ctttctctgt ttccctcgtt
tgtctctatc 2640 ccgccctctt gtcgtctctc tgtagtgcct gtctttccct
atttgcctct cctttctctc 2700 tgtcctgtcg tctcttgtcc ctcggccctc
cctggttttg tctagtctcc ctgtctctcc 2760 tgatttcttc tctttactca
ttctcccggg caggtcccac tggaaggacc agactctccc 2820 aaataaatcc
ccacacgaac aaaatccaaa accaaatccc cctccctacc ggagccggga 2880
ccctccgccg cagcagaatt aaactttttt ctgtgtctga ggccctgctg acctgtgtgt
2940 gtgtctgtat gtgtgtccgc gtgtagtgtg tgtgtgtgtg tgtgtgtgtg
tgtgtgtgtg 3000 tgtgtgttgg gggagggtga cctagattgc agcataagga
ctctaagtga gactgaagga 3060 agatgggaag atgactaact ggggccggag
gagactggca gacaggcttt tatcctctga 3120 gagacttaga ggtggggaat
aatcacaaaa ataaaatgat cataatagct 3170
15 2053 DNA Homo sapiens 15
ccggctcgcg ccctccgggc ccagcctccc gagccttcgg agcgggcgcc gtcccagccc
60 agctccgggg aaacgcgagc cgcgatgcct ggggggtgct cccggggccc
cgccgccggg 120 gacgggcgtc tgcggctggc gcgactagcg ctggtactcc
tgggctgggt ctcctcgtct 180 tctcccacct cctcggcatc ctccttctcc
tcctcggcgc cgttcctggc ttccgccgtg 240 tccgcccagc ccccgctgcc
ggaccagtgc cccgcgctgt gcgagtgctc cgaggcagcg 300 cgcacagtca
agtgcgttaa ccgcaatctg accgaggtgc ccacggacct gcccgcctac 360
gtgcgcaacc tcttccttac cggcaaccag ctggccgtgc tccctgccgg cgccttcgcc
420 cgccggccgc cgctggcgga gctggccgcg ctcaacctca gcggcagccg
cctggacgag 480 gtgcgcgcgg gcgccttcga gcatctgccc agcctgcgcc
agctcgacct cagccacaac 540 ccactggccg acctcagtcc cttcgctttc
tcgggcagca atgccagcgt ctcggccccc 600 agtccccttg tggaactgat
cctgaaccac atcgtgcccc ctgaagatga gcggcagaac 660 cggagcttcg
agggcatggt ggtggcggcc ctgctggcgg gccgtgcact gcaggggctc 720
cgccgcttgg agctggccag caaccacttc ctttacctgc cgcgggatgt gctggcccaa
780 ctgcccagcc tcaggcacct ggacttaagt aataattcgc tggtgagcct
gacctacgtg 840 tccttccgca acctgacaca tctagaaagc ctccacctgg
aggacaatgc cctcaaggtc 900 cttcacaatg gcaccctggc tgagttgcaa
ggtctacccc acattagggt tttcctggac 960 aacaatccct gggtctgcga
ctgccacatg gcagacatgg tgacctggct caaggaaaca 1020 gaggtagtgc
agggcaaaga ccggctcacc tgtgcatatc cggaaaaaat gaggaatcgg 1080
gtcctcttgg aactcaacag tgctgacctg gactgtgacc cgattcttcc cccatccctg
1140 caaacctctt atgtcttcct gggtattgtt ttagccctga taggcgctat
tttcctcctg 1200 gttttgtatt tgaaccgcaa ggggataaaa aagtggatgc
ataacatcag agatgcctgc 1260 agggatcaca tggaagggta tcattacaga
tatgaaatca atgcggaccc cagattaaca 1320 aacctcagtt ctaactcgga
tgtctgagaa atattagagg acagaccaag gacaactctg 1380 catgagatgt
agacttaagc tttatcccta ctaggcttgc tccactttca tcctccacta 1440
tagatacaac ggactttgac taaaagcagt gaaggggatt tgcttccttg ttatgtaaag
1500 tttctcggtg tgttctgtta atgtaagacg atgaacagtt gtgtatagtg
ttttaccctc 1560 ttctttttct tggaactcct caacacgtat ggagggattt
ttcaggtttc agcatgaaca 1620 tgggcttctt gctgtctgtc tctctctcag
tacagttcaa ggtgtagcaa gtgtacccac 1680 acagatagca ttcaacaaaa
gctgcctcaa ctttttcgag aaaaatactt tattcataaa 1740 tatcagtttt
attctcatgt acctaagttg tggagaaaat aattgcatcc tataaactgc 1800
ctgcagacgt tagcaggctc ttcaaaataa ctccatggtg cacaggagca cctgcatcca
1860 agagcatgct tacattttac tgttctgcat attacaaaaa ataacttgca
acttcataac 1920 ttctttgaca aagtaaatta cttttttgat tgcagtttat
atgaaaatgt actgattttt 1980 ttttaataaa ctgcatcgag atccaaccga
ctgaattgtt aaaaaaaaaa aaaaataaag 2040 attcttaaaa gaa 2053 16 973
DNA Homo sapiens 16 cagcccagct ccggggaaac gcgagccgcg atgcctgggg
ggtgctcccg gggccccgcc 60 gccggggacg ggcgtctgcg gctggcgcga
ctagcgctgg tactcctggg ctgggtctcc 120 tcgtcttctc ccacctcctc
ggcatcctcc ttctcctcct cggcgccgtt cctggcttcc 180 gccgtgtccg
cccagccccc gctgccggac cagtgccccg cgctgtgcga gtgctccgag 240
gcagcgcgca cagtcaagtg cgttaaccgc aatctgaccg aggtgcccac ggacctgccc
300 gcctacgtgc gcaacctctt ccttaccagc aaccacttcc tttacctgcc
gcgggatgtg 360 ctggcccaac tgcccagcct caggcacctg gacttaagta
ataattcgct ggtgagcctg 420 acctacgtgt ccttccgcaa cctgacacat
ctagaaagcc tccacctgga ggacaatgcc 480 ctcaaggtcc ttcacaatgg
caccctggct gagttgcaag gtctacccca cattagggtt 540 ttcctggaca
acaatccctg ggtctgcgac tgccacatgg cagacatggt gacctggctc 600
aaggaaacag aggtagtgca gggcaaagac cggctcacct gtgcatatcc ggaaaaaatg
660 aggaatcggg tcctcttgga actcaacagt gctgacctgg actgtgaccc
gattcttccc 720 ccatccctgc aaacctctta tgtcttcctg ggtattgttt
tagccctgat aggcgctatt 780 ttcctcctgg ttttgtattt gaaccgcaag
gggataaaaa agtggatgca taacatcaga 840 gatgcctgca gggatcacat
ggaagggtat cattacagat atgaaatcaa tgcggacccc 900 agattaacga
acctcagttc taactcggat gtctgagaaa tattagagga cagaccaagg 960
acaactctgc atg 973
17 1331 DNA Homo sapiens 17
cagcccagct
ccggggaaac gcgagccgcg atgcctgggg ggtgctcccg gggccccgcc 60
gccggggacg ggcgtctgcg gctggcgcga ctagcgctgg tactcctggg ctgggtctcc
120 tcgtcttctc ccacctcctc ggcatcctcc ttctcctcct cggcgccgtt
cctggcttcc 180 gccgtgtccg cccagccccc gctgccggac cagtgccccg
cgctgtgcga gtgctccgag 240 gcagcgcgca cagtcaagtg cgttaaccgc
aatctgaccg aggtgcccac ggacctgccc 300 gcctacgtgc gcaacctctt
ccttaccggc aaccagctgg ccgtgctccc tgccggcgcc 360 ttcgcccgcc
ggccgccgct ggcggagctg gccgcgctca acctcagcgg cagccgcctg 420
gacgaggtgc gcgcgggcgc cttcgagcat ctgcccagcc tgcgccagct cgacctcagc
480 cacaacccac tggccgacct cagtcccttc gctttctcgg gcagcaatgc
cagcgtctcg 540 gcccccagtc cccttgtgga actgatcctg aaccacatcg
tgccccctga agatgagcgg 600 cagaaccgga gcttcgaggg catggtggtg
gcggccctgc tggcgggccg tgcactgcag 660 gggctccgcc gcttggagct
ggccagcaac cacttccttt acctgccgcg ggatgtgctg 720 gcccaactgc
ccagcctcag gcacctggac ttaagtaata attcgctggt gagcctgacc 780
tacgtgtcct tccgcaacct gacacatcta gaaagcctcc acctggagga caatgccctc
840 aaggtccttc acaatggcac cctggctgag ttgcaaggtc taccccacat
tagggttttc 900 ctggacaaca atccctgggt ctgcgactgc cacatggcag
acatggtgac ctggctcaag 960 gaaacagagg tagtgcaggg caaagaccgg
ctcacctgtg catatccgga aaaaatgagg 1020 aatcgggtcc tcttggaact
caacagtgct gacctggact gtgacccgat tcttccccca 1080 tccctgcaaa
cctcttatgt cttcctgggt attgttttag ccctgatagg cgctattttc 1140
ctcctggttt tgtatttgaa ccgcaagggg ataaaaaagt ggatgcataa catcagagat
1200 gcctgcaggg atcacatgga agggtatcat tacagatatg aaatcaatgc
ggaccccaga 1260 ttaacaaacc tcagttctaa ctcggatgtc tgagaaatat
tagaggacag accaaggaca 1320 actctgcatg a 1331
18 2053 DNA Homo sapiens 18
ccggctcgcg ccctccgggc ccagcctccc gagccttcgg agcgggcgcc
gtcccagccc 60 agctccgggg aaacgcgagc cgcgatgcct ggggggtgct
cccggggccc cgccgccggg 120 gacgggcgtc tgcggctggc gcgactagcg
ctggtactcc tgggctgggt ctcctcgtct 180 tctcccacct cctcggcatc
ctccttctcc tcctcggcgc cgttcctggc ttccgccgtg 240 tccgcccagc
ccccgctgcc ggaccagtgc cccgcgctgt gcgagtgctc cgaggcagcg 300
cgcacagtca agtgcgttaa ccgcaatctg accgaggtgc ccacggacct gcccgcctac
360 gtgcgcaacc tcttccttac cggcaaccag ctggccgtgc tccctgccgg
cgccttcgcc 420 cgccggccgc cgctggcgga gctggccgcg ctcaacctca
gcggcagccg cctggacgag 480 gtgcgcgcgg gcgccttcga gcatctgccc
agcctgcgcc agctcgacct cagccacaac 540 ccactggccg acctcagtcc
cttcgctttc tcgggcagca atgccagcgt ctcggccccc 600 agtccccttg
tggaactgat cctgaaccac atcgtgcccc ctgaagatga gcggcagaac 660
cggagcttcg agggcatggt ggtggcggcc ctgctggcgg gccgtgcact gcaggggctc
720 cgccgcttgg agctggccag caaccacttc ctttacctgc cgcgggatgt
gctggcccaa 780 ctgcccagcc tcaggcacct ggacttaagt aataattcgc
tggtgagcct gacctacgtg 840 tccttccgca acctgacaca tctagaaagc
ctccacctgg aggacaatgc cctcaaggtc 900 cttcacaatg gcaccctggc
tgagttgcaa ggtctacccc acattagggt tttcctggac 960 aacaatccct
gggtctgcga ctgccacatg gcagacatgg tgacctggct caaggaaaca 1020
gaggtagtgc agggcaaaga ccggctcacc tgtgcatatc cggaaaaaat gaggaatcgg
1080 gtcctcttgg aactcaacag tgctgacctg gactgtgacc cgattcttcc
cccatccctg 1140 caaacctctt atgtcttcct gggtattgtt ttagccctga
taggcgctat tttcctcctg 1200 gttttgtatt tgaaccgcaa ggggataaaa
aagtggatgc ataacatcag agatgcctgc 1260 agggatcaca tggaagggta
tcattacaga tatgaaatca atgcggaccc cagattaaca 1320 aacctcagtt
ctaactcgga tgtctgagaa atattagagg acagaccaag gacaactctg 1380
catgagatgt agacttaagc tttatcccta ctaggcttgc tccactttca tcctccacta
1440 tagatacaac ggactttgac taaaagcagt gaaggggatt tgcttccttg
ttatgtaaag 1500 tttctcggtg tgttctgtta atgtaagacg atgaacagtt
gtgtatagtg ttttaccctc 1560 ttctttttct tggaactcct caacacgtat
ggagggattt ttcaggtttc agcatgaaca 1620 tgggcttctt gctgtctgtc
tctctctcag tacagttcaa ggtgtagcaa gtgtacccac 1680 acagatagca
ttcaacaaaa gctgcctcaa ctttttcgag aaaaatactt tattcataaa 1740
tatcagtttt attctcatgt acctaagttg tggagaaaat aattgcatcc tataaactgc
1800 ctgcagacgt tagcaggctc ttcaaaataa ctccatggtg cacaggagca
cctgcatcca 1860 agagcatgct tacattttac tgttctgcat attacaaaaa
ataacttgca acttcataac 1920 ttctttgaca aagtaaatta cttttttgat
tgcagtttat atgaaaatgt actgattttt 1980 ttttaataaa ctgcatcgag
atccaaccga ctgaattgtt aaaaaaaaaa aaaaataaag 2040 attcttaaaa gaa
2053
19 845 PRT Homo sapiens 19
Met Leu Ser Gly Val Trp Phe Leu Ser
Val Leu Thr Val Ala Gly Ile 1 5 10 15 Leu Gln Thr Glu Ser Arg Lys
Thr Ala Lys Asp Ile Cys Lys Ile Arg 20 25 30 Cys Leu Cys Glu Glu
Lys Glu Asn Val Leu Asn Ile Asn Cys Glu Asn 35 40 45 Lys Gly Phe
Thr Thr Val Ser Leu Leu Gln Pro Pro Gln Tyr Arg Ile 50 55 60 Tyr
Gln Leu Phe Leu Asn Gly Asn Leu Leu Thr Arg Leu Tyr Pro Asn 65 70
75 80 Glu Phe Val Asn Tyr Ser Asn Ala Val Thr Leu His Leu Gly Asn
Asn 85 90 95 Gly Leu Gln Glu Ile Arg Thr Gly Ala Phe Ser Gly Leu
Lys Thr Leu 100 105 110 Lys Arg Leu His Leu Asn Asn Asn Lys Leu Glu
Ile Leu Arg Glu Asp 115 120 125 Thr Phe Leu Gly Leu Glu Ser Leu Glu
Tyr Leu Gln Ala Asp Tyr Asn 130 135 140 Tyr Ile Ser Ala Ile Glu Ala
Gly Ala Phe Ser Lys Leu Asn Lys Leu 145 150 155 160 Lys Val Leu Ile
Leu Asn Asp Asn Leu Leu Leu Ser Leu Pro Ser Asn 165 170 175 Val Phe
Arg Phe Val Leu Leu Thr His Leu Asp Leu Arg Gly Asn Arg 180 185 190
Leu Lys Val Met Pro Phe Ala Gly Val Leu Glu His Ile Gly Gly Ile 195
200 205 Met Glu Ile Gln Leu Glu Glu Asn Pro Trp Asn Cys Thr Cys Asp
Leu 210 215 220 Leu Pro Leu Lys Ala Trp Leu Asp Thr Ile Thr Val Phe
Val Gly Glu 225 230 235 240 Ile Val Cys Glu Thr Pro Phe Arg Leu His
Gly Lys Asp Val Thr Gln 245 250 255 Leu Thr Arg Gln Asp Leu Cys Pro
Arg Lys Ser Ala Ser Asp Ser Ser 260 265 270 Gln Arg Gly Ser His Ala
Asp Thr His Val Gln Arg Leu Ser Pro Thr 275 280 285 Met Asn Pro Ala
Leu Asn Pro Thr Arg Ala Pro Lys Ala Ser Arg Pro 290 295 300 Pro Lys
Met Arg Asn Arg Pro Thr Pro Arg Val Thr Val Ser Lys Asp 305 310 315
320 Arg Gln Ser Phe Gly Pro Ile Met Val Tyr Gln Thr Lys Ser Pro Val
325 330 335 Pro Leu Thr Cys Pro Ser Ser Cys Val Cys Thr Ser Gln Ser
Ser Asp 340 345 350 Asn Gly Leu Asn Val Asn Cys Gln Glu Arg Lys Phe
Thr Asn Ile Ser 355 360 365 Asp Leu Gln Pro Lys Pro Thr Ser Pro Lys
Lys Leu Tyr Leu Thr Gly 370 375 380 Asn Tyr Leu Gln Thr Val Tyr Lys
Asn Asp Leu Leu Glu Tyr Ser Ser 385 390 395 400 Leu Asp Leu Leu His
Leu Gly Asn Asn Arg Ile Ala Val Ile Gln Glu 405 410 415 Gly Ala Phe
Thr Asn Leu Thr Ser Leu Arg Arg Leu Tyr Leu Asn Gly 420 425 430 Asn
Tyr Leu Glu Val Leu Tyr Pro Ser Met Phe Asp Gly Leu Gln Ser 435 440
445 Leu Gln Tyr Leu Tyr Leu Glu Tyr Asn Val Ile Lys Glu Ile Lys Pro
450 455 460 Leu Thr Phe Asp Ala Leu Ile Asn Leu Gln Leu Leu Phe Leu
Asn Asn 465 470 475 480 Asn Leu Leu Arg Ser Leu Pro Asp Asn Ile Phe
Gly Gly Thr Ala Leu 485 490 495 Thr Arg Leu Asn Leu Arg Asn Asn His
Phe Ser His Leu Pro Val Lys 500 505 510 Gly Val Leu Asp Gln Leu Pro
Ala Phe Ile Gln Ile Asp Leu Gln Glu 515 520 525 Asn Pro Trp Asp Cys
Thr Cys Asp Ile Met Gly Leu Lys Asp Trp Thr 530 535 540 Glu His Ala
Asn Ser Pro Val Ile Ile Asn Glu Val Thr Cys Glu Ser 545 550 555 560
Pro Ala Lys His Ala Gly Glu Ile Leu Lys Phe Leu Gly Arg Glu Ala 565
570 575 Ile Cys Pro Asp Ser Pro Asn Leu Ser Asp Gly Thr Val Leu Ser
Met 580 585 590 Asn His Asn Thr Asp Thr Pro Arg Ser Leu Ser Val Ser
Pro Ser Ser 595 600 605 Tyr Pro Glu Leu His Thr Glu Val Pro Leu Ser
Val Leu Ile Leu Gly 610 615
620 Leu Leu Val Val Phe Ile Leu Ser Val Cys Phe Gly Ala Gly Leu Phe
625 630 635 640 Val Phe Val Leu Lys Arg Arg Lys Gly Val Pro Ser Val
Pro Arg Asn 645 650 655 Thr Asn Asn Leu Asp Val Ser Ser Phe Gln Leu
Gln Tyr Gly Ser Tyr 660 665 670 Asn Thr Glu Thr His Asp Lys Thr Asp
Gly His Val Tyr Asn Tyr Ile 675 680 685 Pro Pro Pro Val Gly Gln Met
Cys Gln Asn Pro Ile Tyr Met Gln Lys 690 695 700 Glu Gly Asp Pro Val
Ala Tyr Tyr Arg Asn Leu Gln Glu Phe Ser Tyr 705 710 715 720 Ser Asn
Leu Glu Glu Lys Lys Glu Glu Pro Ala Thr Pro Ala Tyr Thr 725 730 735
Ile Ser Ala Thr Glu Leu Leu Glu Lys Gln Ala Thr Pro Arg Glu Pro 740
745 750 Glu Leu Leu Tyr Gln Asn Ile Ala Glu Arg Val Lys Glu Leu Pro
Ser 755 760 765 Ala Gly Leu Val His Tyr Asn Phe Cys Thr Leu Pro Lys
Arg Gln Phe 770 775 780 Ala Pro Ser Tyr Glu Ser Arg Arg Gln Asn Gln
Asp Arg Ile Asn Lys 785 790 795 800 Thr Val Leu Tyr Gly Thr Pro Arg
Lys Cys Phe Val Gly Gln Ser Lys 805 810 815 Pro Asn His Pro Leu Leu
Gln Ala Lys Pro Gln Ser Glu Pro Asp Tyr 820 825 830 Leu Glu Val Leu
Glu Lys Gln Thr Ala Ile Ser Gln Leu 835 840 845
20 1477 PRT Homo sapiens 20
Met Ala Lys Arg Ser Arg Gly Pro Gly Arg Arg Cys Leu Leu
Ala Leu 1 5 10 15 Val Leu Phe Cys Ala Trp Gly Thr Leu Ala Val Val
Ala Gln Lys Pro 20 25 30 Gly Ala Gly Cys Pro Ser Arg Cys Leu Cys
Phe Arg Thr Thr Val Arg 35 40 45 Cys Met His Leu Leu Leu Glu Ala
Val Pro Ala Val Ala Pro Gln Thr 50 55 60 Ser Ile Leu Asp Leu Arg
Phe Asn Arg Ile Arg Glu Ile Gln Pro Gly 65 70 75 80 Ala Phe Arg Arg
Leu Arg Asn Leu Asn Thr Leu Leu Leu Asn Asn Asn 85 90 95 Gln Ile
Lys Arg Ile Pro Ser Gly Ala Phe Glu Asp Leu Glu Asn Leu 100 105 110
Lys Tyr Leu Tyr Leu Tyr Lys Asn Glu Ile Gln Ser Ile Asp Arg Gln 115
120 125 Ala Phe Lys Gly Leu Ala Ser Leu Glu Gln Leu Tyr Leu His Phe
Asn 130 135 140 Gln Ile Glu Thr Leu Asp Pro Asp Ser Phe Gln His Leu
Pro Lys Leu 145 150 155 160 Glu Arg Leu Phe Leu His Asn Asn Arg Ile
Thr His Leu Val Pro Gly 165 170 175 Thr Phe Asn His Leu Glu Ser Met
Lys Arg Leu Arg Leu Asp Ser Asn 180 185 190 Thr Leu His Cys Asp Cys
Glu Ile Leu Trp Leu Ala Asp Leu Leu Lys 195 200 205 Thr Tyr Ala Glu
Ser Gly Asn Ala Gln Ala Ala Ala Ile Cys Glu Tyr 210 215 220 Pro Arg
Arg Ile Gln Gly Arg Ser Val Ala Thr Ile Thr Pro Glu Glu 225 230 235
240 Leu Asn Cys Glu Arg Pro Arg Ile Thr Ser Glu Pro Gln Asp Ala Asp
245 250 255 Val Thr Ser Gly Asn Thr Val Tyr Phe Thr Cys Arg Ala Glu
Gly Asn 260 265 270 Pro Lys Pro Glu Ile Ile Trp Leu Arg Asn Asn Asn
Glu Leu Ser Met 275 280 285 Lys Thr Asp Ser Arg Leu Asn Leu Leu Asp
Asp Gly Thr Leu Met Ile 290 295 300 Gln Asn Thr Gln Glu Thr Asp Gln
Gly Ile Tyr Gln Cys Met Ala Lys 305 310 315 320 Asn Val Ala Gly Glu
Val Lys Thr Gln Glu Val Thr Leu Arg Tyr Phe 325 330 335 Gly Ser Pro
Ala Arg Pro Thr Phe Val Ile Gln Pro Gln Asn Thr Glu 340 345 350 Val
Leu Val Gly Glu Ser Val Thr Leu Glu Cys Ser Ala Thr Gly His 355 360
365 Pro Pro Pro Arg Ile Ser Trp Thr Arg Gly Asp Arg Thr Pro Leu Pro
370 375 380 Val Asp Pro Arg Val Asn Ile Thr Pro Ser Gly Gly Leu Tyr
Ile Gln 385 390 395 400 Asn Val Val Gln Gly Asp Ser Gly Glu Tyr Ala
Cys Ser Ala Thr Asn 405 410 415 Asn Ile Asp Ser Val His Ala Thr Ala
Phe Ile Ile Val Gln Ala Leu 420 425 430 Pro Gln Phe Thr Val Thr Pro
Gln Asp Arg Val Val Ile Glu Gly Gln 435 440 445 Thr Val Asp Phe Gln
Cys Glu Ala Lys Gly Asn Pro Pro Pro Val Ile 450 455 460 Ala Trp Thr
Lys Gly Gly Ser Gln Leu Ser Val Asp Arg Arg His Leu 465 470 475 480
Val Leu Ser Ser Gly Thr Leu Arg Ile Ser Gly Val Ala Leu His Asp 485
490 495 Gln Gly Gln Tyr Glu Cys Gln Ala Val Asn Ile Ile Gly Ser Gln
Lys 500 505 510 Val Val Ala His Leu Thr Val Gln Pro Arg Val Thr Pro
Val Phe Ala 515 520 525 Ser Ile Pro Ser Asp Thr Thr Val Glu Val Gly
Ala Asn Val Gln Leu 530 535 540 Pro Cys Ser Ser Gln Gly Glu Pro Glu
Pro Ala Ile Thr Trp Asn Lys 545 550 555 560 Asp Gly Val Gln Val Thr
Glu Ser Gly Lys Phe His Ile Ser Pro Glu 565 570 575 Gly Phe Leu Thr
Ile Asn Asp Val Gly Pro Ala Asp Ala Gly Arg Tyr 580 585 590 Glu Cys
Val Ala Arg Asn Thr Ile Gly Ser Ala Ser Val Ser Met Val 595 600 605
Leu Ser Val Asn Asp Val Ser Arg Asn Gly Asp Pro Phe Val Ala Thr 610
615 620 Ser Ile Val Glu Ala Ile Ala Thr Val Asp Arg Ala Ile Asn Ser
Thr 625 630 635 640 Arg Thr His Leu Phe Asp Ser Arg Pro Arg Ser Pro
Asn Asp Leu Leu 645 650 655 Ala Leu Phe Arg Tyr Pro Arg Asp Pro Tyr
Thr Val Glu Gln Ala Arg 660 665 670 Ala Gly Glu Ile Phe Glu Arg Thr
Leu Gln Leu Ile Gln Glu His Val 675 680 685 Gln His Gly Leu Met Val
Asp Leu Asn Gly Thr Ser Tyr His Tyr Asn 690 695 700 Asp Leu Val Ser
Pro Gln Tyr Leu Asn Leu Ile Ala Asn Leu Ser Gly 705 710 715 720 Cys
Thr Ala His Arg Arg Val Asn Asn Cys Ser Asp Met Cys Phe His 725 730
735 Gln Lys Tyr Arg Thr His Asp Gly Thr Cys Asn Asn Leu Gln His Pro
740 745 750 Met Trp Gly Ala Ser Leu Thr Ala Phe Glu Arg Leu Leu Lys
Ser Val 755 760 765 Tyr Glu Asn Gly Phe Asn Thr Pro Arg Gly Ile Asn
Pro His Arg Leu 770 775 780 Tyr Asn Gly His Ala Leu Pro Met Pro Arg
Leu Val Ser Thr Thr Leu 785 790 795 800 Ile Gly Thr Glu Thr Val Thr
Pro Asp Glu Gln Phe Thr His Met Leu 805 810 815 Met Gln Trp Gly Gln
Phe Leu Asp His Asp Leu Asp Ser Thr Val Val 820 825 830 Ala Leu Ser
Gln Ala Arg Phe Ser Asp Gly Gln His Cys Ser Asn Val 835 840 845 Cys
Ser Asn Asp Pro Pro Cys Phe Ser Val Met Ile Pro Pro Asn Asp 850 855
860 Ser Arg Ala Arg Ser Gly Ala Arg Cys Met Phe Phe Val Arg Ser Ser
865 870 875 880 Pro Val Cys Gly Ser Gly Met Thr Ser Leu Leu Met Asn
Ser Val Tyr 885 890 895 Pro Arg Glu Gln Ile Asn Gln Leu Thr Ser Tyr
Ile Asp Ala Ser Asn 900 905 910 Val Tyr Gly Ser Thr Glu His Glu Ala
Arg Ser Ile Arg Asp Leu Ala 915 920 925 Ser His Arg Gly Leu Leu Arg
Gln Gly Ile Val Gln Arg Ser Gly Lys 930 935 940 Pro Leu Leu Pro Phe
Ala Thr Gly Pro Pro Thr Glu Cys Met Arg Asp 945 950 955 960 Glu Asn
Glu Ser Pro Ile Pro Cys Phe Leu Ala Gly Asp His Arg Ala 965 970 975
Asn Glu Gln Leu Gly Leu Thr Ser Met His Thr Leu Trp Phe Arg Glu 980
985 990 His Asn Arg Ile Ala Thr Glu Leu Leu Lys Leu Asn Pro His Trp
Asp 995 1000 1005 Gly Asp Thr Ile Tyr Tyr Glu Thr Arg Lys Ile Val
Gly Ala Glu 1010 1015 1020 Ile Gln His Ile Thr Tyr Gln His Trp Leu
Pro Lys Ile Leu Gly 1025 1030 1035 Glu Val Gly Met Arg Thr Leu Gly
Glu Tyr His Gly Tyr Asp Pro 1040 1045 1050 Gly Ile Asn Ala Gly Ile
Phe Asn Ala Phe Ala Thr Ala Ala Phe 1055 1060 1065 Arg Phe Gly His
Thr Leu Val Asn Pro Leu Leu Tyr Arg Leu Asp 1070 1075 1080 Glu Asn
Phe Gln Pro Ile Ala Gln Asp His Leu Pro Leu His Lys 1085 1090 1095
Ala Phe Phe Ser Pro Phe Arg Ile Val Asn Glu Gly Gly Ile Asp 1100
1105 1110 Pro Leu Leu Arg Gly Leu Phe Gly Val Ala Gly Lys Met Arg
Val 1115 1120 1125 Pro Ser Gln Leu Leu Asn Thr Glu Leu Thr Glu Arg
Leu Phe Ser 1130 1135 1140 Met Ala His Thr Val Ala Leu Asp Leu Ala
Ala Ile Asn Ile Gln 1145 1150 1155 Arg Gly Arg Asp His Gly Ile Pro
Pro Tyr His Asp Tyr Arg Val 1160 1165 1170 Tyr Cys Asn Leu Ser Ala
Ala His Thr Phe Glu Asp Leu Lys Asn 1175 1180 1185 Glu Ile Lys Asn
Pro Glu Ile Arg Glu Lys Leu Lys Arg Leu Tyr 1190 1195 1200 Gly Ser
Thr Leu Asn Ile Asp Leu Phe Pro Ala Leu Val Val Glu 1205 1210 1215
Asp Leu Val Pro Gly Ser Arg Leu Gly Pro Thr Leu Met Cys Leu 1220
1225 1230 Leu Ser Thr Gln Phe Lys Arg Leu Arg Asp Gly Asp Arg Leu
Trp 1235 1240 1245 Tyr Glu Asn Pro Gly Val Phe Ser Pro Ala Gln Leu
Thr Gln Ile 1250 1255 1260 Lys Gln Thr Ser Leu Ala Arg Ile Leu Cys
Asp Asn Ala Asp Asn 1265 1270 1275 Ile Thr Arg Val Gln Ser Asp Val
Phe Arg Val Ala Glu Phe Pro 1280 1285 1290 His Gly Tyr Gly Ser Cys
Asp Glu Ile Pro Arg Val Asp Leu Arg 1295 1300 1305 Val Trp Gln Asp
Cys Cys Glu Asp Cys Arg Thr Arg Gly Gln Phe 1310 1315 1320 Asn Ala
Phe Ser Tyr His Phe Arg Gly Arg Arg Ser Leu Glu Phe 1325 1330 1335
Ser Tyr Gln Glu Asp Lys Pro Thr Lys Lys Thr Arg Pro Arg Lys 1340
1345 1350 Ile Pro Ser Val Gly Arg Gln Gly Glu His Leu Ser Asn Ser
Thr 1355 1360 1365 Ser Ala Phe Ser Thr Arg Ser Asp Ala Ser Gly Thr
Asn Asp Phe 1370 1375 1380 Arg Glu Phe Val Leu Glu Met Gln Lys Thr
Ile Thr Asp Leu Arg 1385 1390 1395 Thr Gln Ile Lys Lys Leu Glu Ser
Arg Leu Ser Thr Thr Glu Cys 1400 1405 1410 Val Asp Ala Gly Gly Glu
Ser His Ala Asn Asn Thr Lys Trp Lys 1415 1420 1425 Lys Asp Ala Cys
Thr Ile Cys Glu Cys Lys Asp Gly Gln Val Thr 1430 1435 1440 Cys Phe
Val Glu Ala Cys Pro Pro Ala Thr Cys Ala Val Pro Val 1445 1450 1455
Asn Ile Pro Gly Ala Cys Cys Pro Val Cys Leu Gln Lys Arg Ala 1460
1465 1470 Glu Glu Lys Pro 1475
21 798 PRT Homo sapiens 21
Met Leu
Ile Asn Cys Glu Ala Lys Gly Ile Lys Met Val Ser Glu Ile 1 5 10 15
Ser Val Pro Pro Ser Arg Pro Phe Gln Leu Ser Leu Leu Asn Asn Gly 20
25 30 Leu Thr Met Leu His Thr Asn Asp Phe Ser Gly Leu Thr Asn Ala
Ile 35 40 45 Ser Ile His Leu Gly Phe Asn Asn Ile Ala Asp Ile Glu
Ile Gly Ala 50 55 60 Phe Asn Gly Leu Gly Leu Leu Lys Gln Leu His
Ile Asn His Asn Ser 65 70 75 80 Leu Glu Ile Leu Lys Glu Asp Thr Phe
His Gly Leu Glu Asn Leu Glu 85 90 95 Phe Leu Gln Ala Asp Asn Asn
Phe Ile Thr Val Ile Glu Pro Ser Ala 100 105 110 Phe Ser Lys Leu Asn
Arg Leu Lys Val Leu Ile Leu Asn Asp Asn Ala 115 120 125 Ile Glu Ser
Leu Pro Pro Asn Ile Phe Arg Phe Val Pro Leu Thr His 130 135 140 Leu
Asp Leu Arg Gly Asn Gln Leu Gln Thr Leu Pro Tyr Val Gly Phe 145 150
155 160 Leu Glu His Ile Gly Arg Ile Leu Asp Leu Gln Leu Glu Asp Asn
Lys 165 170 175 Trp Ala Cys Asn Cys Asp Leu Leu Gln Leu Lys Thr Trp
Leu Glu Asn 180 185 190 Met Pro Pro Gln Ser Ile Ile Gly Asp Val Val
Cys Asn Ser Pro Pro 195 200 205 Phe Phe Lys Gly Ser Ile Leu Ser Arg
Leu Lys Lys Glu Ser Ile Cys 210 215 220 Pro Thr Pro Pro Val Tyr Glu
Glu His Glu Asp Pro Ser Gly Ser Leu 225 230 235 240 His Leu Ala Ala
Thr Ser Ser Ile Asn Asp Ser Arg Met Ser Thr Lys 245 250 255 Thr Thr
Ser Ile Leu Lys Leu Pro Thr Lys Ala Pro Gly Leu Ile Pro 260 265 270
Tyr Ile Thr Lys Pro Ser Thr Gln Leu Pro Gly Pro Tyr Cys Pro Ile 275
280 285 Pro Cys Asn Cys Lys Val Leu Ser Pro Ser Gly Leu Leu Ile His
Cys 290 295 300 Gln Glu Arg Asn Ile Glu Ser Leu Ser Asp Leu Arg Pro
Pro Pro Gln 305 310 315 320 Asn Pro Arg Lys Leu Ile Leu Ala Gly Asn
Ile Ile His Ser Leu Met 325 330 335 Lys Ser Asp Leu Val Glu Tyr Phe
Thr Leu Glu Met Leu His Leu Gly 340 345 350 Asn Asn Arg Ile Glu Val
Leu Glu Glu Gly Ser Phe Met Asn Leu Thr 355 360 365 Arg Leu Gln Lys
Leu Tyr Leu Asn Gly Asn His Leu Thr Lys Leu Ser 370 375 380 Lys Gly
Met Phe Leu Gly Leu His Asn Leu Glu Tyr Leu Tyr Leu Glu 385 390 395
400 Tyr Asn Ala Ile Lys Glu Ile Leu Pro Gly Thr Phe Asn Pro Met Pro
405 410 415 Lys Leu Lys Val Leu Tyr Leu Asn Asn Asn Leu Leu Gln Val
Leu Pro 420 425 430 Pro His Ile Phe Ser Gly Val Pro Leu Thr Lys Val
Asn Leu Lys Thr 435 440 445 Asn Gln Phe Thr His Leu Pro Val Ser Asn
Ile Leu Asp Asp Leu Asp 450 455 460 Leu Leu Thr Gln Ile Asp Leu Glu
Asp Asn Pro Trp Asp Cys Ser Cys 465 470 475 480 Asp Leu Val Gly Leu
Gln Gln Trp Ile Gln Lys Leu Ser Lys Asn Thr 485 490 495 Val Thr Asp
Asp Ile Leu Cys Thr Ser Pro Gly His Leu Asp Lys Lys 500 505 510 Glu
Leu Lys Ala Leu Asn Ser Glu Ile Leu Cys Pro Gly Leu Val Asn 515 520
525 Asn Pro Ser Met Pro Thr Gln Thr Ser Tyr Leu Met Val Thr Thr Pro
530 535 540 Ala Thr Thr Thr Asn Thr Ala Asp Thr Ile Leu Arg Ser Leu
Thr Asp 545 550 555 560 Ala Val Pro Leu Ser Val Leu Ile Leu Gly Leu
Leu Ile Met Phe Ile 565 570 575 Thr Ile Val Phe Cys Ala Ala Gly Ile
Val Val Leu Val Leu His Arg 580 585 590 Arg Arg Arg Tyr Lys Lys Lys
Gln Val Asp Glu Gln Met Arg Asp Asn 595 600 605 Ser Pro Val His Leu
Gln Tyr Ser Met Tyr Gly His Lys Thr Thr His 610 615 620 His Thr Thr
Glu Arg Pro Ser Ala Ser Leu Tyr Glu Gln His Met Val 625 630 635 640
Ser Pro Met Val His Val Tyr Arg Ser Pro Ser Phe Gly Pro Lys His 645
650 655 Leu Glu Glu Glu Glu Glu Arg Asn Glu Lys Glu Gly Ser Asp Ala
Lys 660 665 670 His Leu Gln Arg Ser Leu Leu Glu Gln Glu Asn His Ser
Pro Leu Thr 675 680 685 Gly Ser Asn Met Lys Tyr Lys Thr Thr Asn Gln
Ser Thr Glu Phe Leu 690 695 700 Ser Phe Gln Asp Ala Ser Ser Leu Tyr
Arg Asn Ile Leu Glu Lys Glu 705 710 715 720 Arg Glu Leu Gln Gln Leu
Gly Ile Thr Glu Tyr Leu Arg Lys Asn Ile
725 730 735 Ala Gln Leu Gln Pro Asp Met Glu Ala His Tyr Pro Gly Ala
His Glu 740 745 750 Glu Leu Lys Leu Met Glu Thr Leu Met Tyr Ser Arg
Pro Arg Lys Val 755 760 765 Leu Val Glu Gln Thr Lys Asn Glu Tyr Phe
Glu Leu Lys Ala Asn Leu 770 775 780 His Ala Glu Pro Asp Tyr Leu Glu
Val Leu Glu Gln Gln Thr 785 790 795
22 713 PRT Homo sapiens 22
Met
Arg Leu Leu Val Ala Pro Leu Leu Leu Ala Trp Val Ala Gly Ala 1 5 10
15 Thr Ala Ala Val Pro Val Val Pro Trp His Val Pro Cys Pro Pro Gln
20 25 30 Cys Ala Cys Gln Ile Arg Pro Trp Tyr Thr Pro Arg Ser Ser
Tyr Arg 35 40 45 Glu Ala Thr Thr Val Asp Cys Asn Asp Leu Phe Leu
Thr Ala Val Pro 50 55 60 Pro Ala Leu Pro Ala Gly Thr Gln Thr Leu
Leu Leu Gln Ser Asn Ser 65 70 75 80 Ile Val Arg Val Asp Gln Ser Glu
Leu Gly Tyr Leu Ala Asn Leu Thr 85 90 95 Glu Leu Asp Leu Ser Gln
Asn Ser Phe Ser Asp Ala Arg Asp Cys Asp 100 105 110 Phe His Ala Leu
Pro Gln Leu Leu Ser Leu His Leu Glu Glu Asn Gln 115 120 125 Leu Thr
Arg Leu Glu Asp His Ser Phe Ala Gly Leu Ala Ser Leu Gln 130 135 140
Glu Leu Tyr Leu Asn His Asn Gln Leu Tyr Arg Ile Ala Pro Arg Ala 145
150 155 160 Phe Ser Gly Leu Ser Asn Leu Leu Arg Leu His Leu Asn Ser
Asn Leu 165 170 175 Leu Arg Ala Ile Asp Ser Arg Trp Phe Glu Met Leu
Pro Asn Leu Glu 180 185 190 Ile Leu Met Ile Gly Gly Asn Lys Val Asp
Ala Ile Leu Asp Met Asn 195 200 205 Phe Arg Pro Leu Ala Asn Leu Arg
Ser Leu Val Leu Ala Gly Met Asn 210 215 220 Leu Arg Glu Ile Ser Asp
Tyr Ala Leu Glu Gly Leu Gln Ser Leu Glu 225 230 235 240 Ser Leu Ser
Phe Tyr Asp Asn Gln Leu Ala Arg Val Pro Arg Arg Ala 245 250 255 Leu
Glu Gln Val Pro Gly Leu Lys Phe Leu Asp Leu Asn Lys Asn Pro 260 265
270 Leu Gln Arg Val Gly Pro Gly Asp Phe Ala Asn Met Leu His Leu Lys
275 280 285 Glu Leu Gly Leu Asn Asn Met Glu Glu Leu Val Ser Ile Asp
Lys Phe 290 295 300 Ala Leu Val Asn Leu Pro Glu Leu Thr Lys Leu Asp
Ile Thr Asn Asn 305 310 315 320 Pro Arg Leu Ser Phe Ile His Pro Arg
Ala Phe His His Leu Pro Gln 325 330 335 Met Glu Thr Leu Met Leu Asn
Asn Asn Ala Leu Ser Ala Leu His Gln 340 345 350 Gln Thr Val Glu Ser
Leu Pro Asn Leu Gln Glu Val Gly Leu His Gly 355 360 365 Asn Pro Ile
Arg Cys Asp Cys Val Ile Arg Trp Ala Asn Ala Thr Gly 370 375 380 Thr
Arg Val Arg Phe Ile Glu Pro Gln Ser Thr Leu Cys Ala Glu Pro 385 390
395 400 Pro Asp Leu Gln Arg Leu Pro Val Arg Glu Val Pro Phe Arg Glu
Met 405 410 415 Thr Asp His Cys Leu Pro Leu Ile Ser Pro Arg Ser Phe
Pro Pro Ser 420 425 430 Leu Gln Val Ala Ser Gly Glu Ser Met Val Leu
His Cys Arg Ala Leu 435 440 445 Ala Glu Pro Glu Pro Glu Ile Tyr Trp
Val Thr Pro Ala Gly Leu Arg 450 455 460 Leu Thr Pro Ala His Ala Gly
Arg Arg Cys Arg Val Tyr Pro Glu Gly 465 470 475 480 Thr Leu Glu Leu
Arg Arg Val Thr Ala Glu Glu Ala Gly Leu Tyr Thr 485 490 495 Cys Val
Ala Gln Asn Leu Val Gly Ala Asp Thr Lys Thr Val Ser Val 500 505 510
Val Val Gly Arg Ala Leu Leu Gln Pro Gly Arg Asp Glu Gly Gln Gly 515
520 525 Leu Glu Leu Arg Val Gln Glu Thr His Pro Tyr His Ile Leu Leu
Ser 530 535 540 Trp Val Thr Pro Pro Asn Thr Val Ser Thr Asn Leu Thr
Trp Ser Ser 545 550 555 560 Ala Ser Ser Leu Arg Gly Gln Gly Ala Thr
Ala Leu Ala Arg Leu Pro 565 570 575 Arg Gly Thr His Ser Tyr Asn Ile
Thr Arg Leu Leu Gln Ala Thr Glu 580 585 590 Tyr Trp Ala Cys Leu Gln
Val Ala Phe Ala Asp Ala His Thr Gln Leu 595 600 605 Ala Cys Val Trp
Ala Arg Thr Lys Glu Ala Thr Ser Cys His Arg Ala 610 615 620 Leu Gly
Asp Arg Pro Gly Leu Ile Ala Ile Leu Ala Leu Ala Val Leu 625 630 635
640 Leu Leu Ala Ala Gly Leu Ala Ala His Leu Gly Thr Gly Gln Pro Arg
645 650 655 Lys Gly Val Gly Gly Arg Arg Pro Leu Pro Pro Ala Trp Ala
Phe Trp 660 665 670 Gly Trp Ser Ala Pro Ser Val Arg Val Val Ser Ala
Pro Leu Val Leu 675 680 685 Pro Trp Asn Pro Gly Arg Lys Leu Pro Arg
Ser Ser Glu Gly Glu Thr 690 695 700 Leu Leu Pro Pro Leu Ser Gln Asn
Ser 705 710
23 684 PRT Homo sapiens 23
Met Thr Ser Leu Val His Leu
Thr Leu Ser Arg Asn Thr Ile Gly Gln 1 5 10 15 Val Ala Ala Gly Ala
Phe Ala Asp Leu Arg Ala Leu Arg Ala Leu His 20 25 30 Leu Asp Ser
Asn Arg Leu Ala Glu Val Arg Gly Asp Gln Leu Arg Gly 35 40 45 Leu
Gly Asn Leu Arg His Leu Ile Leu Gly Asn Asn Gln Ile Arg Arg 50 55
60 Val Glu Ser Ala Ala Phe Asp Ala Phe Leu Ser Thr Val Glu Asp Leu
65 70 75 80 Asp Leu Ser Tyr Asn Asn Leu Glu Ala Leu Pro Trp Glu Ala
Val Gly 85 90 95 Gln Met Val Asn Leu Asn Thr Leu Thr Leu Asp His
Asn Leu Ile Asp 100 105 110 His Ile Ala Glu Gly Thr Phe Val Gln Leu
His Lys Leu Val Arg Leu 115 120 125 Asp Met Thr Ser Asn Arg Leu His
Lys Leu Pro Pro Asp Gly Leu Phe 130 135 140 Leu Arg Ser Gln Gly Thr
Gly Pro Lys Pro Pro Thr Pro Leu Thr Val 145 150 155 160 Ser Phe Gly
Gly Asn Pro Leu His Cys Asn Cys Glu Leu Leu Trp Leu 165 170 175 Arg
Arg Leu Thr Arg Glu Asp Asp Leu Glu Thr Cys Ala Thr Pro Glu 180 185
190 His Leu Thr Asp Arg Tyr Phe Trp Ser Ile Pro Glu Glu Glu Phe Leu
195 200 205 Cys Glu Pro Pro Leu Ile Thr Arg Gln Ala Gly Gly Arg Ala
Leu Val 210 215 220 Val Glu Gly Gln Ala Val Ser Leu Arg Cys Arg Ala
Val Gly Asp Pro 225 230 235 240 Glu Pro Val Val His Trp Val Ala Pro
Asp Gly Arg Leu Leu Gly Asn 245 250 255 Ser Ser Arg Thr Arg Val Arg
Gly Asp Gly Thr Leu Asp Val Thr Ile 260 265 270 Thr Thr Leu Arg Asp
Ser Gly Thr Phe Thr Cys Ile Ala Ser Asn Ala 275 280 285 Ala Gly Glu
Ala Thr Ala Pro Val Glu Val Cys Val Val Pro Leu Pro 290 295 300 Leu
Met Ala Pro Pro Pro Ala Ala Pro Pro Pro Leu Thr Glu Pro Gly 305 310
315 320 Ser Ser Asp Ile Ala Thr Pro Gly Arg Pro Gly Ala Asn Asp Ser
Ala 325 330 335 Ala Glu Arg Arg Leu Val Ala Ala Glu Leu Thr Ser Asn
Ser Val Leu 340 345 350 Ile Arg Trp Pro Ala Gln Arg Pro Val Pro Gly
Ile Arg Met Tyr Gln 355 360 365 Val Gln Tyr Asn Ser Ser Val Asp Asp
Ser Leu Val Tyr Arg Met Ile 370 375 380 Pro Ser Thr Ser Gln Thr Phe
Leu Val Asn Asp Leu Ala Ala Gly Arg 385 390 395 400 Ala Tyr Asp Leu
Cys Val Leu Ala Val Tyr Asp Asp Gly Ala Thr Ala 405 410 415 Leu Pro
Ala Thr Arg Val Val Gly Cys Val Gln Phe Thr Thr Ala Gly 420 425 430
Asp Pro Ala Pro Cys Arg Pro Leu Arg Ala His Phe Leu Gly Gly Thr 435
440 445 Met Ile Ile Ala Ile Gly Gly Val Ile Val Ala Ser Val Leu Val
Phe 450 455 460 Ile Val Leu Leu Met Ile Arg Tyr Lys Val Tyr Gly Asp
Gly Asp Ser 465 470 475 480 Arg Arg Val Lys Gly Ser Arg Ser Leu Pro
Arg Val Ser His Val Cys 485 490 495 Ser Gln Thr Asn Gly Ala Gly Thr
Gly Ala Ala Gln Ala Pro Ala Leu 500 505 510 Pro Ala Gln Asp His Tyr
Glu Ala Leu Arg Glu Val Glu Ser Gln Ala 515 520 525 Ala Pro Ala Val
Ala Val Glu Ala Lys Ala Met Glu Ala Glu Thr Ala 530 535 540 Ser Ala
Glu Pro Glu Val Val Leu Gly Arg Ser Leu Gly Gly Ser Ala 545 550 555
560 Thr Ser Leu Cys Leu Leu Pro Ser Glu Glu Thr Ser Gly Glu Glu Ser
565 570 575 Arg Ala Ala Val Gly Pro Arg Arg Ser Arg Ser Gly Ala Leu
Glu Pro 580 585 590 Pro Thr Ser Ala Pro Pro Thr Leu Ala Leu Val Pro
Gly Gly Ala Ala 595 600 605 Ala Arg Pro Arg Pro Gln Gln Arg Tyr Ser
Phe Asp Gly Asp Tyr Gly 610 615 620 Ala Leu Phe Gln Ser His Ser Tyr
Pro Arg Arg Ala Arg Arg Thr Lys 625 630 635 640 Arg His Arg Ser Thr
Pro His Leu Asp Gly Ala Gly Gly Gly Ala Ala 645 650 655 Gly Glu Asp
Gly Asp Leu Gly Leu Gly Ser Ala Arg Ala Cys Leu Ala 660 665 670 Phe
Thr Ser Thr Glu Trp Met Leu Glu Ser Thr Val 675 680
24 420 PRT Homo sapiens 24
Met Pro Gly Gly Cys Ser Arg Gly Pro Ala Ala Gly Asp Gly
Arg Leu 1 5 10 15 Arg Leu Ala Arg Leu Ala Leu Val Leu Leu Gly Trp
Val Ser Ser Ser 20 25 30 Ser Pro Thr Ser Ser Ala Ser Ser Phe Ser
Ser Ser Ala Pro Phe Leu 35 40 45 Ala Ser Ala Val Ser Ala Gln Pro
Pro Leu Pro Asp Gln Cys Pro Ala 50 55 60 Leu Cys Glu Cys Ser Glu
Ala Ala Arg Thr Val Lys Cys Val Asn Arg 65 70 75 80 Asn Leu Thr Glu
Val Pro Thr Asp Leu Pro Ala Tyr Val Arg Asn Leu 85 90 95 Phe Leu
Thr Gly Asn Gln Leu Ala Val Leu Pro Ala Gly Ala Phe Ala 100 105 110
Arg Arg Pro Pro Leu Ala Glu Leu Ala Ala Leu Asn Leu Ser Gly Ser 115
120 125 Arg Leu Asp Glu Val Arg Ala Gly Ala Phe Glu His Leu Pro Ser
Leu 130 135 140 Arg Gln Leu Asp Leu Ser His Asn Pro Leu Ala Asp Leu
Ser Pro Phe 145 150 155 160 Ala Phe Ser Gly Ser Asn Ala Ser Val Ser
Ala Pro Ser Pro Leu Val 165 170 175 Glu Leu Ile Leu Asn His Ile Val
Pro Pro Glu Asp Glu Arg Gln Asn 180 185 190 Arg Ser Phe Glu Gly Met
Val Val Ala Ala Leu Leu Ala Gly Arg Ala 195 200 205 Leu Gln Gly Leu
Arg Arg Leu Glu Leu Ala Ser Asn His Phe Leu Tyr 210 215 220 Leu Pro
Arg Asp Val Leu Ala Gln Leu Pro Ser Leu Arg His Leu Asp 225 230 235
240 Leu Ser Asn Asn Ser Leu Val Ser Leu Thr Tyr Val Ser Phe Arg Asn
245 250 255 Leu Thr His Leu Glu Ser Leu His Leu Glu Asp Asn Ala Leu
Lys Val 260 265 270 Leu His Asn Gly Thr Leu Ala Glu Leu Gln Gly Leu
Pro His Ile Arg 275 280 285 Val Phe Leu Asp Asn Asn Pro Trp Val Cys
Asp Cys His Met Ala Asp 290 295 300 Met Val Thr Trp Leu Lys Glu Thr
Glu Val Val Gln Gly Lys Asp Arg 305 310 315 320 Leu Thr Cys Ala Tyr
Pro Glu Lys Met Arg Asn Arg Val Leu Leu Glu 325 330 335 Leu Asn Ser
Ala Asp Leu Asp Cys Asp Pro Ile Leu Pro Pro Ser Leu 340 345 350 Gln
Thr Ser Tyr Val Phe Leu Gly Ile Val Leu Ala Leu Ile Gly Ala 355 360
365 Ile Phe Leu Leu Val Leu Tyr Leu Asn Arg Lys Gly Ile Lys Lys Trp
370 375 380 Met His Asn Ile Arg Asp Ala Cys Arg Asp His Met Glu Gly
Tyr His 385 390 395 400 Tyr Arg Tyr Glu Ile Asn Ala Asp Pro Arg Leu
Thr Asn Leu Ser Ser 405 410 415 Asn Ser Asp Val 420