U.S. patent application number 12/952967 was filed with the patent office on 2010-11-23 and published on 2011-06-23 for polynucleotides and polypeptides encoding receptors.
This patent application is currently assigned to Human Genome Sciences, Inc. Invention is credited to Reiner L. Gentz, Jian Ni, Craig A. Rosen.
Application Number | 12/952967 |
Publication Number | 20110151473 |
Family ID | 26710688 |
Filed Date | 2010-11-23 |
United States Patent Application | 20110151473 |
Kind Code | A1 |
Ni; Jian ; et al. | June 23, 2011 |
Polynucleotides and Polypeptides Encoding Receptors
Abstract
Receptor polypeptides and polynucleotides and methods for
producing such polypeptides by recombinant techniques are
disclosed. Also disclosed are methods for utilizing receptor
polypeptides and polynucleotides in the design of protocols for the
treatment of diseases and diagnostic assays for such
conditions.
Inventors: | Ni; Jian; (Germantown, MD) ; Rosen; Craig A.; (Pasadena, MD) ; Gentz; Reiner L.; (Belo Horizonte - Mg, BR) |
Assignee: | Human Genome Sciences, Inc., Rockville, MD |
Family ID: | 26710688 |
Appl. No.: | 12/952967 |
Filed: | November 23, 2010 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number | |
12431986 | Apr 29, 2009 | | 12952967 |
11832019 | Aug 1, 2007 | | 12431986 |
11041419 | Jan 25, 2005 | | 11832019 |
10156136 | May 29, 2002 | | 11041419 |
09764452 | Jan 19, 2001 | | 10156136 |
09010146 | Jan 21, 1998 | | 09764452 |
60034204 | Jan 21, 1997 | | |
60034205 | Jan 21, 1997 | | |
Current U.S. Class: | 435/6.17; 530/350; 536/23.5 |
Current CPC Class: | C07K 14/70503 20130101; A61P 43/00 20180101; C07K 14/70535 20130101; C07K 14/705 20130101; C12N 2799/026 20130101; C12N 9/6491 20130101; C12N 9/6489 20130101; A61P 35/00 20180101; C12N 9/6421 20130101; C07K 14/47 20130101; C12N 9/6475 20130101; A61K 38/00 20130101 |
Class at Publication: | 435/6.17; 536/23.5; 530/350 |
International Class: | C12Q 1/68 20060101 C12Q001/68; C07H 21/00 20060101 C07H021/00; C07K 14/705 20060101 C07K014/705 |
Claims
1. An isolated polynucleotide comprising a nucleotide sequence that
has at least 80% identity over its entire length to a nucleotide
sequence selected from the group consisting of: (a) a nucleotide
sequence encoding the polypeptide of SEQ ID NO:Y; (b) a nucleotide
sequence having at least 80% identity to a nucleotide sequence
encoding the polypeptide expressed by the cDNA insert deposited at
the ATCC; or (c) a nucleotide sequence complementary to said
isolated polynucleotide.
2. The polynucleotide of claim 1, wherein said polynucleotide is
(a).
3. The polynucleotide of claim 1, wherein said polynucleotide is
(b).
4. The polynucleotide of claim 1, wherein said polynucleotide is
(c).
5. A method of diagnosing a disease or a susceptibility to a
disease in a subject that is related to the presence of mutations
in, or the production of, a nucleotide sequence, comprising
collecting a sample from a subject and: (a) determining the
presence or absence of a mutation in a nucleotide sequence encoding
the polypeptide of SEQ ID NO:Y in the genome of said subject;
and/or (b) analyzing for the presence or amount of the nucleotide
in said sample derived from said subject.
6. The method of claim 5, wherein the polynucleotide is a fragment
of the nucleotide sequence encoding the polypeptide of SEQ ID
NO:Y.
7. The method of claim 6, wherein the polynucleotide fragment
comprises at least 30 consecutive nucleotides of the nucleotide
sequence encoding the polypeptide of SEQ ID NO:Y.
8. The method of claim 6, wherein the polynucleotide fragment
comprises at least 50 consecutive nucleotides of the nucleotide
sequence encoding the polypeptide of SEQ ID NO:Y.
9. An isolated polypeptide comprising an amino acid sequence that
has at least 80% identity over its entire length to an amino acid
sequence encoding the polypeptide of SEQ ID NO:Y.
10. The polypeptide of claim 9, wherein the polypeptide comprises
at least 30 consecutive amino acids of SEQ ID NO:Y.
11. The polypeptide of claim 9, wherein the polypeptide comprises
at least 50 consecutive amino acids of SEQ ID NO:Y.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of application Ser. No.
12/431,986, filed Apr. 29, 2009, which is a continuation of
application Ser. No. 11/832,019, filed Aug. 1, 2007 (now
abandoned), which is a continuation of application Ser. No.
11/041,419, filed Jan. 25, 2005 (now abandoned), which is a
continuation of application Ser. No. 10/156,136, filed May 29, 2002
(now abandoned), which is a continuation of application Ser. No.
09/764,452, filed Jan. 19, 2001 (now abandoned), which is a
continuation of application Ser. No. 09/010,146, filed Jan. 21,
1998 (now abandoned), which claims the benefit under 35 U.S.C.
§119(e) of provisional Application No. 60/034,204, filed Jan.
21, 1997 and provisional Application No. 60/034,205, filed Jan. 21,
1997; each of which is hereby incorporated by reference in its
entirety.
STATEMENT UNDER 37 C.F.R. §1.77(b)(5)
[0002] This application refers to a "Sequence Listing" listed
below, which was provided as a text document in U.S. application
Ser. No. 12/431,986, filed Apr. 29, 2009, entitled
"PF354C5_SequenceList.txt". Applicants request the use of the
computer readable "Sequence Listing" filed in connection with U.S.
application Ser. No. 12/431,986 on Apr. 29, 2009, as the computer
readable form for the instant application. Applicants hereby state
that the paper copy of the "Sequence Listing" filed in the instant
application on Nov. 23, 2010, is identical to the computer readable
copy filed on Apr. 29, 2009, in U.S. application Ser. No.
12/431,986, and is hereby incorporated by reference in its
entirety.
BACKGROUND OF THE INVENTION
[0003] Receptor proteins are found on the membrane of the cells and
are generally involved in signal transduction. There are many types
of receptor proteins, and for convenience, these proteins are
grouped in families based on similarity in structure and
function.
[0005] For example, the TM4SF superfamily of cell surface proteins,
also known as the tetraspan receptor superfamily, is comprised of
at least seventeen individual gene products (these include CD9,
CD20, CD37, CD53, CD63, CD81, CD82, A15, CO-029, Sm23, RDS, Uro B,
Uro A, SAS, Rom-1, PETA3, and YKK8). The TM4SF superfamily is the
second largest group in the CD antigen superfamily. Each member of
the TM4SF superfamily can be characterized by several putative
physical features including four highly conserved transmembrane
domains, two divergent extracellular loops, and two short and
highly divergent cytoplasmic tails. Expression patterns for members
of the TM4SF superfamily tend to be rather broad and can vary
widely between members. The functional roles of TM4SF superfamily
members are primarily associated with signal transduction events
and pathways, but also include cell adhesion in platelets and other
lymphocytic and non-lymphocytic cell lines, as well as cell
motility, proliferation, and metastasis. In addition, recent
evidence suggests that a subset of the members of the TM4SF
superfamily may function as potassium channel molecules.
[0006] One member of the TM4SF family, CD20, is a four membrane
spanning domain cell surface phosphoprotein expressed exclusively
on B lymphocytes. Although the precise functional role of CD20 has
yet to be determined, it is thought to function primarily as a
receptor during B-cell activation. Furthermore, a large number of
experimental observations suggest several additional speculative
roles for the CD20 molecule. For example, CD20-specific
immunoprecipitation of biochemically cross-linked plasma membrane
proteins suggests that CD20 assumes a multimeric structural
conformation characteristic of other previously described membrane
channel proteins. Further experimentation has revealed that
expression of exogenous CD20 on the cell surface specifically
increases Ca²⁺ conductance across the plasma membrane.
Together, these results suggest that CD20 complexes may function as
B-cell specific Ca²⁺ ion channels. In addition, monoclonal
antibodies raised against CD20 have been used to stimulate resting
B-cells to transition out of the G0/G1 segment of the cell cycle.
It has also been demonstrated that CD20 is associated with both
serine and tyrosine kinases and, more specifically, that CD20 is
associated, although not directly, with the Src family of tyrosine
kinases including p56/53lyn, p56lck, and p59fyn.
[0007] A second example of a receptor subfamily, called
sialoadhesin molecules, belongs to the Ig superfamily of
receptor-like molecules. The more than 100 members of the Ig
superfamily are generally considered to engage in specific
cell-cell interactions through which intercellular communication
may occur. In addition to classical protein-protein interactions,
intercellular communication may also be mediated through
protein-carbohydrate interactions. In fact, all members of the
sialoadhesin family of the Ig superfamily are capable of mediating
protein-sialic acid binding interactions. To date, only a small
number of proteins have been assigned to the sialoadhesin family
including sialoadhesin, CD33, CD22, the myelin-associated
glycoprotein (MAG), and the Schwann cell myelin protein (SMP). Each
of these proteins is expressed in a restricted subset of cell
types. For example, CD22 and CD33 are expressed exclusively by
B-lymphocytes and cells of the myelomonocytic lineage,
respectively.
[0008] Similarly, galectins are a family of the lectin superfamily
of carbohydrate-binding proteins which have a high affinity for
β-galactoside sugars. Although a large number of glycoproteins
containing β-galactoside sugars are produced by the cell, only a
few will bind to known galectins in vitro. Such apparent binding
specificity suggests a highly specific functional role for the
galectins. Galectin 1 (conventionally termed LGALS1 for lectin,
galactoside-binding, soluble-1) is thought to specifically bind
laminin, a highly polylactosaminated cellular glycoprotein, as well
as the highly polylactosaminated lysosome-associated membrane
proteins (LAMPs). Galectin 1 has also been shown to bind
specifically to a lactosamine-containing glycolipid found on
olfactory neurons and to integrin α₇β₁ on skeletal muscle
cells. Galectin 3 has also been observed to bind specifically to
laminin, immunoglobulin E and its receptor, and bacterial
lipopolysaccharides.
[0009] Various galectins have been shown to function in the
mechanisms of intercellular communication. For example, depending
on cell type, galectin 1 has been observed to modulate cell
adhesion either positively or negatively. More specifically,
galectin 1 appears to inhibit cell adhesion of skeletal muscle
presumably by galectin 1-mediated disruption of laminin-integrin
α₇β₁ interactions. Alternatively, galectin 1 appears to
promote cell adhesion in several non-skeletal muscle cell types
examined presumably by a glycoconjugate cross-linking mechanism.
Galectin 3 has also been observed to function in modulating
cell-adhesion, as well as in the activation of certain immune cells
by cross-linking IgE and IgE receptors. In addition, galectins have
been observed to be involved in the regulation of immune cell
activity, as well as in such diverse processes as cell adhesion,
proliferation, inflammation, autoimmunity, and metastasis of tumor
cells. Furthermore, a galectin-like antigen designated HOM-HD-21
was recently found to be highly expressed in a Hodgkin's Disease
cDNA library. Very recently, a novel galectin, termed PCTA-1, was
identified as a specific cell surface marker on human prostate
cancer cell lines and patient-derived carcinomas. Galectins have
also been found to function intracellularly as a component of
ribonucleoprotein complexes. Finally, galectins 1 and 3 have each
been found to modulate T-cell growth and apoptosis by interaction
with CD45 and possibly Bcl2, respectively.
[0010] A relatively new family of cell-surface proteins has been
identified and termed the Ly6 superfamily. The members of this
family include murine and human SCA-2, rat Ly-6 (also termed ThB),
human CD59 [also known as protectin or membrane attack complex
inhibition factor (MACIF)], and E48 antigen. The determination of
an initial functional role for SCA-2 may lie in an analysis of its
expression profile with regard to the complex process of
hematopoiesis. SCA-2 is highly expressed in early thymic precursor
cells. In turn, progeny of the intrathymic precursor population
continue to express SCA-2, but only until the point of transition
occurs from blast cell to small cell. Further experimental evidence
demonstrates that mature thymocytes and peripheral T-cells do not
express detectable levels of SCA-2, whereas mature, peripheral
B-cells do continue to express SCA-2. As a result, it seems very
likely that SCA-2 plays an important role in thymocyte maturation
and differentiation. A plausible explanation for this functional
hypothesis is that SCA-2 may act as a receptor for an unknown
cytokine which regulates thymocyte maturation and
differentiation.
[0011] In addition, CD59 is a recently identified integral membrane
protein which appears to be involved in the regulation of
complement. Recent studies show that the CD59 antigen may prevent
damage from complement C5b-9 and protect astrocytes during
inflammatory and infectious disorders of the nervous system.
Expression of recombinant human CD59 on porcine donor organs has
been shown to prevent complement-mediated lysis and activation of
endothelial cells that leads to hyperacute rejection. Recently,
researchers at Alexion Pharmaceuticals (New Haven, Conn.) reported
on the production of transgenic pigs which expressed human CD59. In
these animals, xenogeneic organs were resistant to hyperacute
rejection. (Fodor, et al., "Expression of a functional human
complement inhibitor in a transgenic pig as a model for the
prevention of xenogeneic hyperacute organ rejection," Proc. Natl.
Acad. Sci., 91:11153-11157 (1994).) The same company also reported
that expression of recombinant transmembrane CD59 in paroxysmal
nocturnal hemoglobinuria (PNH) B-cells confers resistance to human
complement. (Rother et al., "Expression of recombinant
transmembrane CD59 in paroxysmal nocturnal hemoglobinuria B-cells
confers resistance to human complement," Blood, 84:2604-2611
(1994).) PNH is an acquired hematopoietic disorder characterized by
complement-mediated hemolytic anemia, pancytopenia, and venous
thrombosis. It is thought that retroviral gene therapy with this
molecule could provide a treatment for PNH patients.
[0012] A final Ly6 superfamily member, the E48 antigen, is involved
in intercellular adhesion between keratinocyte cells of the
squamous epithelium. Such keratinocytes are attached to adjoining
cells by large numbers of desmosomes, which are thought to play a
role in the transition of transformed keratinocytes to metastatic
tumor cells. Treatment with a monoclonal antibody raised against
the E48 antigen has been successful in the eradication of residual,
postoperative squamous cell carcinoma cells of the upper
aerodigestive tract in several in vivo models and, to some degree,
in humans. (van Dongen, et al., "Progress in radioimmunotherapy of
head and neck cancer," Oncol. Rep. 1:259-264 (1994).) The gene
encoding the E48 antigen has been mapped to the q24-qter region of
human chromosome 8. Interestingly, a number of human diseases have
been mapped to this region of chromosome 8 including Langer-Giedion
syndrome, brachio-otorhinolaryngeal syndrome, trichorhinolaryngeal
syndrome, and epidermolysis bullosa simplex.
[0013] A further example of a receptor family includes the
prohibitin receptors. The prohibitin gene product is expressed in a
wide variety of tissues and has been implicated as a component of a
number of anti-proliferative mechanisms. The prohibitin gene
encodes a 30 kD postsynthetically modified polypeptide located
primarily in the mitochondria, but also may be associated with the
IgM receptor on the B-cell plasma membrane. The protein
functionally inhibits DNA synthesis and entry into S phase of the
cell cycle by an unknown mechanism. Interestingly, although the
prohibitin gene product is hypothesized to be involved in the
maintenance of senescence and the prevention of cancer, one study
found that, although somatic mutations in the prohibitin gene were
present in a small number of breast cancers, no mutations were
identified in any other breast, ovary, liver, and lung cancers
examined. (Sato et al., Genomics 17:762-764 (1993).) However, the
prohibitin gene has been mapped to human chromosome 17q12-21, the
same region thought to contain the gene involved in sporadic breast
cancer. Furthermore, DNA sequence analysis of the prohibitin gene
identified somatic mutation in 4 of 23 cases of sporadic breast
cancer examined. Thus, prohibitin family members may be involved in
the development of cancer.
[0014] Moreover, the EGFR family of plasma membrane proteins is an
integral component of normal cellular proliferation and of the
pathogenesis of the cancerous state. The family is relatively small
and includes the EGFR, c-erbB-2, c-erbB-3, and others. Various
cancers are correlated with aberrant expression of one or more of
these genes. A number of ligands have been identified which bind to
the EGFR-like receptors listed above including TGF-α,
heparin-binding EGF, amphiregulin, criptoregulin, heregulin, and
others. A large fraction of adenocarcinomas examined to date,
especially those of the breast, colon, and pancreas, are typified
by the amplification or overexpression of the c-erbB-2 gene. EGF,
or an analogous ligand, initiates the cellular growth factor
response by binding to the EGFR, or EGFR-related, receptor.
Following the binding event, the receptor molecule dimerizes,
activating its intracellular tyrosine kinase domain. This event
results in the phosphorylation of specific tyrosine residues near
the carboxy terminus of the receptor. The diversity of signals able
to be transduced through the relatively small number of
EGFR-related receptor molecules is amplified considerably by the
recent finding that EGFR-like receptor molecules can function when
dimerized with other EGFR family members forming heterodimers.
[0015] Members of the EGFR-related family of integral membrane
proteins have been implicated in the pathogenesis of a number of
human disease-states. For example, a mutation in the EGFR itself
appears to play an important role in the development of
glioblastomas. (Sang et al., J. Neurosurg 82:841-846 (1995).) The
EGFR gene is amplified or overexpressed in the majority of primary
human glioblastomas. Although not conferring a distinct advantage
on cell growth, an increase in EGFR expression was found to confer
an increase in the ability of glioma cells to maintain
anchorage-independent growth in soft agar especially in response to
EGF and retinoic acid. Anchorage-independent growth in vitro
correlates highly with tumorigenicity in vivo; therefore, it is
likely that abnormally high levels of EGFR expression in human
glioblastoma cells contribute to the high potential for
these cells to cause tumors in vivo.
[0016] Moreover, overexpression or amplification of c-erbB-2 has
been reported to be involved in a high number of adenocarcinomas,
particularly of the breast, colon, and pancreas, and in a small
proportion of ovarian carcinomas.
[0017] Thus, there is a clear need for identifying and exploiting
novel members of the receptor families, such as those described
above. Although structurally related, these receptors will likely
possess diverse and multifaceted functions in a variety of cell and
tissue types. Receptor type molecules should prove useful in target
based screens for small molecules and other such pharmacologically
valuable factors. Monoclonal antibodies raised against such
receptors may prove useful as therapeutics in an anti-tumor,
diagnostic, or other capacity. Furthermore, receptors described
here may prove useful in an active or passive immunotherapeutical
role in patients with cancer or other immunocompromised disease
states.
BRIEF SUMMARY OF THE INVENTION
[0018] This invention relates to newly identified polynucleotides
and the polypeptides encoded by them, the use of such
polynucleotides and polypeptides, and their production. More
particularly, the polynucleotides and polypeptides of the present
invention relate to specific receptor families described in the
specification and known in the art. The invention also relates to
inhibiting or activating the action of such polynucleotides and
polypeptides.
[0019] In one aspect, the invention relates to receptor
polypeptides and polynucleotides, as well as the methods for their
production. Another aspect of the invention relates to methods for
using such receptor polypeptides and polynucleotides. Such uses
include the treatment of the specified diseases, among others. In
still another aspect, the invention relates to methods to identify
agonists and antagonists using the materials provided by the
invention, and treating conditions associated with receptor
imbalance with the identified compounds. Yet another aspect of the
invention relates to diagnostic assays for detecting diseases
associated with inappropriate receptor activity or levels.
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] FIGS. 1A-1B show an amino acid sequence alignment of Clone
ID HMACR70 (SEQ ID NO:18) versus OB-1 (SEQ ID NO:33) (shaded boxes
indicate identical amino acid residues, non-shaded boxes indicate
conservative substitutions).
[0021] FIG. 2 shows an amino acid sequence alignment of Clone ID
HTEDK48 (SEQ ID NO:19) versus MRC-OX44 (SEQ ID NO:34) and PETA-3
(SEQ ID NO:35) (shaded boxes indicate identical amino acid
residues, non-shaded boxes indicate conservative
substitutions).
[0022] FIG. 3 shows an amino acid sequence alignment of Clone ID
HPWAE25 (SEQ ID NO:20) versus NAG-2 (SEQ ID NO:36) and TALLA-1 (SEQ
ID NO:37) (shaded boxes indicate identical amino acid residues,
non-shaded boxes indicate conservative substitutions).
[0023] FIG. 4 shows an amino acid sequence alignment of Clone ID
HTPEF86 (SEQ ID NO:21) versus B1 (SEQ ID NO:38) (shaded boxes
indicate identical amino acid residues, non-shaded boxes indicate
conservative substitutions).
[0024] FIG. 5 shows an amino acid sequence alignment of Clone ID
HSBBF02 (SEQ ID NO:22) versus TALLA-1 (SEQ ID NO:37) (shaded boxes
indicate identical amino acid residues, non-shaded boxes indicate
conservative substitutions).
[0025] FIG. 6 shows an amino acid sequence alignment of Clone ID
HLTAH80 (SEQ ID NO:23) versus TALLA-1 (SEQ ID NO:37) (shaded boxes
indicate identical amino acid residues, non-shaded boxes indicate
conservative substitutions).
[0026] FIG. 7 shows an amino acid sequence alignment of Clone ID
HTPBA27 (SEQ ID NO:24) versus NAG-2 (SEQ ID NO:36) (shaded boxes
indicate identical amino acid residues, non-shaded boxes indicate
conservative substitutions).
[0027] FIG. 8 shows an amino acid sequence alignment of Clone ID
HAIDQ59 (SEQ ID NO:25) versus CD9 (SEQ ID NO:39) (shaded boxes
indicate identical amino acid residues, non-shaded boxes indicate
conservative substitutions).
[0028] FIG. 9 shows an amino acid sequence alignment of Clone ID
HHFEK40 (SEQ ID NO:26) versus PETA-3 (SEQ ID NO:35) (shaded boxes
indicate identical amino acid residues, non-shaded boxes indicate
conservative substitutions).
[0029] FIG. 10 shows an amino acid sequence alignment of Clone ID
HGBGV89 (SEQ ID NO:27) versus L6H (SEQ ID NO:40) (shaded boxes
indicate identical amino acid residues, non-shaded boxes indicate
conservative substitutions).
[0030] FIG. 11 shows an amino acid sequence alignment of Clone ID
HUVBB80 (SEQ ID NO:28) versus L6 (SEQ ID NO:41) (shaded boxes
indicate identical amino acid residues, non-shaded boxes indicate
conservative substitutions).
[0031] FIG. 12 shows an amino acid sequence alignment of Clone ID
HJACE54 (SEQ ID NO:29) versus rGALECTIN-5 (SEQ ID NO:42) and
hGALECTIN-8 (SEQ ID NO:43) (shaded boxes indicate identical amino
acid residues, non-shaded boxes indicate conservative
substitutions).
[0032] FIG. 13 shows an amino acid sequence alignment of Clone ID
HROAD63 (SEQ ID NO:30) versus E48 (SEQ ID NO:44) (shaded boxes
indicate identical amino acid residues, non-shaded boxes indicate
conservative substitutions).
[0033] FIG. 14 shows an amino acid sequence alignment of Clone ID
HMWGS46 (SEQ ID NO:31) versus B-cell Receptor Associated Protein
(shaded boxes indicate identical amino acid residues, non-shaded
boxes indicate conservative substitutions).
[0034] FIG. 15 shows an amino acid sequence alignment of Clone ID
HNFGW06 (SEQ ID NO:32) versus EGFR (SEQ ID NO:46) (shaded boxes
indicate identical amino acid residues, non-shaded boxes indicate
conservative substitutions).
DETAILED DESCRIPTION OF THE INVENTION
Definitions
[0035] The following definitions are provided to facilitate
understanding of certain terms used frequently herein.
[0036] "Receptor" refers, among others, to a polypeptide comprising
the amino acid sequence set forth in SEQ ID NO:Y, or an allelic
variant thereof.
[0037] "Receptor Activity" or "Biological Activity of the Receptor"
refers to the metabolic or physiologic function of said receptor
including similar activities or improved activities or these
activities with decreased undesirable side-effects. Also included
are antigenic and immunogenic activities of said receptor.
[0038] "Receptor gene" refers to a polynucleotide comprising the
nucleotide sequence set forth in SEQ ID NO:X or allelic variants
thereof and/or their complements.
[0039] "SEQ ID NO:X" comprises all or a substantial portion of the
polynucleotide encoding each receptor of the invention. The value X
for the nucleotide sequence is an integer specified in Table 1.
This nucleotide sequence was translated into the receptor
polypeptide identified in Table 1 as "SEQ ID NO:Y," where the value
of Y for each receptor polypeptide is an integer defined in Table
1.
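The translation step described above, in which a nucleotide sequence (SEQ ID NO:X) is read into a polypeptide (SEQ ID NO:Y), can be sketched as follows. This is a minimal illustration only, assuming the standard genetic code and a reading frame that begins at the first ATG; the function name `translate` and the short example sequences are hypothetical and are not taken from Table 1 or the Sequence Listing.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
# ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate from the first ATG to the first in-frame stop codon.

    Assumption for this sketch: the open reading frame starts at the
    first ATG. Codons containing ambiguous bases are reported as 'X'.
    """
    seq = nucleotides.upper().replace("U", "T")
    start = seq.find("ATG")
    if start < 0:
        return ""  # no start codon found
    residues = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break  # stop codon ends the polypeptide
        residues.append(aa)
    return "".join(residues)


if __name__ == "__main__":
    # Hypothetical sequence, not one of the deposited clones.
    print(translate("ccATGGCTTGGTAAgg"))
```

A real open reading frame would normally be chosen by comparing all candidate frames rather than by taking the first ATG; the shortcut here only keeps the example brief.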
[0040] The invention further provides a composition of matter
comprising a nucleic acid molecule which comprises a human cDNA
clone identified by a cDNA Clone ID (Identifier) in Table 1, which
DNA molecule is contained in the material deposited with the
American Type Culture Collection ("ATCC™") and given the
ATCC™ Deposit Number shown in Table 1 for that cDNA clone. The
ATCC™ is located at American Type Culture Collection (ATCC™),
10801 University Boulevard, Manassas, Va. 20110-2209, USA. The
deposit has been made under the terms of the Budapest Treaty on the
international recognition of the deposit of micro-organisms for
purposes of patent procedure. The strain will be irrevocably and
without restriction or condition released to the public upon the
issuance of a patent. The deposit is provided merely as a convenience
to those of skill in the art and is not an admission that a deposit
is required for enablement, such as that required under 35 U.S.C.
§112. The nucleotide sequence of the polynucleotides contained
in the deposited material, as well as the amino acid sequence of
the polypeptide encoded thereby, are controlling in the event of
any conflict with any description of sequences herein.
[0041] "Antibodies" as used herein includes polyclonal and
monoclonal antibodies, chimeric, single chain, and humanized
antibodies, as well as Fab fragments, including the products of an
Fab or other immunoglobulin expression library.
[0042] "Isolated" means altered "by the hand of man" from the
natural state. If an "isolated" composition or substance occurs in
nature, it has been changed or removed from its original
environment, or both. For example, a polynucleotide or a
polypeptide naturally present in a living animal is not "isolated,"
but the same polynucleotide or polypeptide separated from the
coexisting materials of its natural state is "isolated", as the
term is employed herein.
[0043] "Polynucleotide" generally refers to any polyribonucleotide
or polydeoxyribonucleotide, which may be unmodified RNA or DNA or
modified RNA or DNA. "Polynucleotides" include, without limitation,
single- and double-stranded DNA, DNA that is a mixture of single-
and double-stranded regions, single- and double-stranded RNA, and
RNA that is a mixture of single- and double-stranded regions, hybrid
molecules comprising DNA and RNA that may be single-stranded or,
more typically, double-stranded or a mixture of single- and
double-stranded regions. In addition, "polynucleotide" refers to
triple-stranded regions comprising RNA or DNA or both RNA and DNA.
The term polynucleotide also includes DNAs or RNAs containing one
or more modified bases and DNAs or RNAs with backbones modified for
stability or for other reasons. "Modified" bases include, for
example, tritylated bases and unusual bases such as inosine. A
variety of modifications has been made to DNA and RNA; thus,
"polynucleotide" embraces chemically, enzymatically or
metabolically modified forms of polynucleotides as typically found
in nature, as well as the chemical forms of DNA and RNA
characteristic of viruses and cells. "Polynucleotide" also embraces
relatively short polynucleotides, often referred to as
oligonucleotides.
[0044] "Polypeptide" refers to any peptide or protein comprising
two or more amino acids joined to each other by peptide bonds or
modified peptide bonds, i.e., peptide isosteres. "Polypeptide"
refers to both short chains, commonly referred to as peptides,
oligopeptides or oligomers, and to longer chains, generally
referred to as proteins. Polypeptides may contain amino acids other
than the 20 gene-encoded amino acids. "Polypeptides" include amino
acid sequences modified either by natural processes, such as
posttranslational processing, or by chemical modification
techniques which are well known in the art. Such modifications are
well described in basic texts and in more detailed monographs, as
well as in a voluminous research literature. Modifications can
occur anywhere in a polypeptide, including the peptide backbone,
the amino acid side-chains and the amino or carboxyl termini. It
will be appreciated that the same type of modification may be
present in the same or varying degrees at several sites in a given
polypeptide. Also, a given polypeptide may contain many types of
modifications. Polypeptides may be branched as a result of
ubiquitination, and they may be cyclic, with or without branching.
Cyclic, branched and branched cyclic polypeptides may result from
posttranslational natural processes or may be made by synthetic
methods. Modifications include acetylation, acylation,
ADP-ribosylation, amidation, covalent attachment of flavin,
covalent attachment of a heme moiety, covalent attachment of a
nucleotide or nucleotide derivative, covalent attachment of a lipid
or lipid derivative, covalent attachment of phosphatidylinositol,
cross-linking, cyclization, disulfide bond formation,
demethylation, formation of covalent cross-links, formation of
cystine, formation of pyroglutamate, formylation,
gamma-carboxylation, glycosylation, GPI anchor formation,
hydroxylation, iodination, methylation, myristoylation, oxidation,
proteolytic processing, phosphorylation, prenylation, racemization,
selenoylation, sulfation, transfer-RNA mediated addition of amino
acids to proteins such as arginylation, and ubiquitination. (See,
for instance, PROTEINS--STRUCTURE AND MOLECULAR PROPERTIES, 2nd
Ed., T. E. Creighton, W.H. Freeman and Company, New York, 1993 and
Wold, F., Posttranslational Protein Modifications: Perspectives and
Prospects, pgs. 1-12 in POSTTRANSLATIONAL COVALENT MODIFICATION OF
PROTEINS, B. C. Johnson, Ed., Academic Press, New York, 1983;
Seifter et al., "Analysis for protein modifications and nonprotein
cofactors", Meth Enzymol (1990) 182:626-646 and Rattan et al.,
"Protein Synthesis: Posttranslational Modifications and Aging", Ann
NY Acad Sci (1992) 663:48-62.)
[0045] "Variant" as the term is used herein, is a polynucleotide or
polypeptide that differs from a reference polynucleotide or
polypeptide respectively, but retains essential properties. A
typical variant of a polynucleotide differs in nucleotide sequence
from another, reference polynucleotide. Changes in the nucleotide
sequence of the variant may or may not alter the amino acid
sequence of a polypeptide encoded by the reference polynucleotide.
Nucleotide changes may result in amino acid substitutions,
additions, deletions, fusions and truncations in the polypeptide
encoded by the reference sequence, as discussed below. A typical
variant of a polypeptide differs in amino acid sequence from
another, reference polypeptide. Generally, differences are limited
so that the sequences of the reference polypeptide and the variant
are closely similar overall and, in many regions, identical. A
variant and reference polypeptide may differ in amino acid sequence
by one or more substitutions, additions, or deletions in any
combination. A substituted or inserted amino acid residue may or
may not be one encoded by the genetic code. A variant of a
polynucleotide or polypeptide may be naturally occurring, such as
an allelic variant, or it may be a variant that is not known to
occur naturally. Non-naturally occurring variants of
polynucleotides and polypeptides may be made by mutagenesis
techniques or by direct synthesis.
[0046] "Identity" is a measure of the identity of nucleotide
sequences or amino acid sequences. In general, the sequences are
aligned so that the highest order match is obtained. "Identity" per
se has an art-recognized meaning and can be calculated using
published techniques. (See, e.g.: COMPUTATIONAL MOLECULAR BIOLOGY,
Lesk, A. M., ed., Oxford University Press, New York, 1988;
BIOCOMPUTING: INFORMATICS AND GENOME PROJECTS, Smith, D. W., ed.,
Academic Press, New York, 1993; COMPUTER ANALYSIS OF SEQUENCE DATA,
PART I, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New
Jersey, 1994; SEQUENCE ANALYSIS IN MOLECULAR BIOLOGY, von Heijne,
G., Academic Press, 1987; and SEQUENCE ANALYSIS PRIMER, Gribskov,
M. and Devereux, J., eds., Stockton Press, New York, 1991.) While
there exist a number of methods to measure identity between two
polynucleotide or polypeptide sequences, the term "identity" is
well known to skilled artisans. (Carrillo, H., and Lipman, D., SIAM
J Applied Math (1988) 48:1073.) Methods commonly employed to
determine identity or similarity between two sequences include, but
are not limited to, those disclosed in Guide to Human Genome
Computing, Martin J. Bishop, ed., Academic Press, San Diego, 1994, and
Carrillo, H., and Lipman, D., SIAM J Applied Math (1988) 48:1073.
Methods to determine identity and similarity are codified in
computer programs. Preferred computer program methods to determine
identity and similarity between two sequences include, but are not
limited to, GCG program package (Devereux, J., et al., Nucleic
Acids Research (1984) 12(1):387), BLASTP, BLASTN, and FASTA (Altschul,
S. F. et al., J Molec Biol (1990) 215:403).
[0047] As an illustration, by a polynucleotide having a nucleotide
sequence having at least, for example, 95% "identity" to a
reference nucleotide sequence of SEQ ID NO:X is intended that the
nucleotide sequence of the polynucleotide is identical to the
reference sequence except that the polynucleotide sequence may
include up to five point mutations per each 100 nucleotides of the
reference nucleotide sequence of SEQ ID NO: X. In other words, to
obtain a polynucleotide having a nucleotide sequence at least 95%
identical to a reference nucleotide sequence, up to 5% of the
nucleotides in the reference sequence may be deleted or substituted
with another nucleotide, or a number of nucleotides up to 5% of the
total nucleotides in the reference sequence may be inserted into
the reference sequence. These mutations of the reference sequence
may occur at the 5' or 3' terminal positions of the reference
nucleotide sequence or anywhere between those terminal positions,
interspersed either individually among nucleotides in the reference
sequence or in one or more contiguous groups within the reference
sequence.
[0048] Similarly, by a polypeptide having an amino acid sequence
having at least, for example, 95% "identity" to a reference amino
acid sequence of SEQ ID NO:Y is intended that the amino acid
sequence of the polypeptide is identical to the reference sequence
except that the polypeptide sequence may include up to five amino
acid alterations per each 100 amino acids of the reference amino
acid of SEQ ID NO:Y. In other words, to obtain a polypeptide having
an amino acid sequence at least 95% identical to a reference amino
acid sequence, up to 5% of the amino acid residues in the reference
sequence may be deleted or substituted with another amino acid, or
a number of amino acids up to 5% of the total amino acid residues
in the reference sequence may be inserted into the reference
sequence. These alterations of the reference sequence may occur at
the amino or carboxy terminal positions of the reference amino acid
sequence or anywhere between those terminal positions, interspersed
either individually among residues in the reference sequence or in
one or more contiguous groups within the reference sequence.
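
The 95% illustration in the two preceding paragraphs reduces to simple arithmetic. The following Python sketch is offered only as an illustration and is not part of the disclosed methods. The function names `max_alterations` and `percent_identity` are hypothetical. `percent_identity` assumes the two sequences have already been aligned to equal length, with '-' marking a gap, by a program such as those cited above; counting identical columns over the total number of alignment columns is one common convention and is an assumption here.

```python
def max_alterations(reference_length: int, min_identity_pct: int) -> int:
    """Largest number of altered positions allowed at an identity threshold.

    Integer arithmetic avoids floating-point rounding at the threshold.
    """
    return reference_length * (100 - min_identity_pct) // 100


def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must be pre-aligned to the same length; '-' marks a gap.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_ref:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for r, q in zip(aligned_ref, aligned_query) if r == q and r != "-"
    )
    return 100.0 * matches / len(aligned_ref)
```

For a 100-residue reference, `max_alterations(100, 95)` gives 5, matching the "up to five ... per each 100" wording above.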
Polypeptides of the Invention
[0049] In one aspect, the present invention relates to receptor
polypeptides (or receptor proteins). The receptor polypeptides
include the polypeptide of SEQ ID NO:Y; as well as polypeptides
comprising the amino acid sequence of SEQ ID NO:Y; and polypeptides
comprising the amino acid sequence which have at least 80% identity
to that of SEQ ID NO:Y over its entire length, and still more
preferably at least 90% identity, and even still more preferably at
least 95% identity to SEQ ID NO:Y. Furthermore, those with at least
97-99% identity to SEQ ID NO:Y are highly preferred. Also included
within receptor polypeptides are polypeptides having the amino acid
sequence which have at least 80% identity to the polypeptide having
the amino acid sequence of SEQ ID NO:Y over its entire length, and
still more preferably at least 90% identity, and even still more
preferably at least 95% identity to SEQ ID NO:Y. Furthermore, those
with at least 97-99% are highly preferred. Preferably receptor
polypeptides exhibit at least one biological activity of the
receptor.
[0050] The receptor polypeptides may be in the form of the "mature"
protein or may be a part of a larger protein such as a fusion
protein. It is often advantageous to include an additional amino
acid sequence which contains secretory or leader sequences,
pro-sequences, sequences which aid in purification such as multiple
histidine residues, or an additional sequence for stability during
recombinant production.
[0051] Fragments of the receptor polypeptides are also included in
the invention. A "fragment" is a polypeptide having an amino acid
sequence that entirely is the same as part, but not all, of the
amino acid sequence of the aforementioned receptor polypeptides. As
with receptor polypeptides, fragments may be "free-standing," or
comprised within a larger polypeptide of which they form a part or
region, most preferably as a single continuous region.
Representative examples of polypeptide fragments of the invention,
include, for example, fragments from about amino acid number 1-20,
21-40, 41-60, 61-80, 81-100, and 101 to the end of receptor
polypeptide. In this context "about" includes the particularly
recited ranges larger or smaller by several, 5, 4, 3, 2 or 1 amino
acid at either extreme or at both extremes.
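
As a purely illustrative sketch of the representative ranges recited above (1-20, 21-40, 41-60, 61-80, 81-100, and 101 to the end), the following Python function splits a one-letter amino acid string into those windows. The function name `representative_fragments` and its parameters are hypothetical, residue numbering is assumed to be 1-based and inclusive, and the "about" tolerance of a few residues at either extreme is not modeled.

```python
def representative_fragments(sequence: str, window: int = 20, full_windows: int = 5):
    """Return (start, end, fragment) tuples using 1-based inclusive numbering.

    The first `full_windows` fragments are `window` residues long; the last
    fragment runs from the next residue to the end of the sequence.
    """
    fragments = []
    for i in range(full_windows):
        start = i * window + 1
        if start > len(sequence):
            break  # sequence is shorter than the recited ranges
        end = min(start + window - 1, len(sequence))
        fragments.append((start, end, sequence[start - 1:end]))
    tail_start = full_windows * window + 1
    if tail_start <= len(sequence):
        fragments.append((tail_start, len(sequence), sequence[tail_start - 1:]))
    return fragments
```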
[0052] Preferred fragments include, for example, truncation
polypeptides having the amino acid sequence of receptor
polypeptides, except for deletion of a continuous series of
residues that includes the amino terminus, or a continuous series
of residues that includes the carboxyl terminus or deletion of two
continuous series of residues, one including the amino terminus and
one including the carboxyl terminus.
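
The three kinds of truncation polypeptide described in this paragraph (amino-terminal deletion, carboxyl-terminal deletion, or both) can be sketched with simple string slicing. This Python example is a hypothetical illustration only; the function name `truncate` and the short test sequence in the usage note are assumptions and do not correspond to any SEQ ID NO.

```python
def truncate(sequence: str, n_deleted: int = 0, c_deleted: int = 0) -> str:
    """Delete a continuous run of residues from either or both termini.

    n_deleted: residues removed starting at the amino terminus.
    c_deleted: residues removed ending at the carboxyl terminus.
    """
    if n_deleted < 0 or c_deleted < 0:
        raise ValueError("deletion lengths must be non-negative")
    if n_deleted + c_deleted >= len(sequence):
        raise ValueError("deletions would remove the entire sequence")
    return sequence[n_deleted:len(sequence) - c_deleted]
```

For example, `truncate("MKTLLVAG", 2, 3)` removes two residues from the amino terminus and three from the carboxyl terminus.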
[0053] Also preferred are fragments characterized by structural or
functional domains, such as fragments that comprise alpha-helix and
alpha-helix forming regions, beta-sheet and beta-sheet-forming
regions, turn and turn-forming regions, coil and coil-forming
regions, hydrophilic regions, hydrophobic regions, alpha
amphipathic regions, beta amphipathic regions, flexible regions,
surface-forming regions, substrate binding region, and high
antigenic index regions. The "domains" of each receptor polypeptide
are illustrated in the Figures. The Figures compare SEQ ID NO:Y to
the closest known homologue. Identical amino acids shared between
the two polypeptides are shaded, while conservative amino acid
changes are boxed. By examining the regions or amino acids shaded
and/or boxed, the skilled artisan can readily identify conserved
domains between the two polypeptides. The amino acid sequences of
SEQ ID NO:Y falling within these conserved domains are "fragments"
and are specifically contemplated by the present invention.
Especially preferred are the extracellular domains of a receptor of
the invention. Soluble extracellular domains have antagonist
activity mediated by competition with a receptor ligand.
[0054] Other preferred fragments are biologically active fragments.
Biologically active fragments are those that mediate receptor
activity, including those with a similar activity or an improved
activity, or with a decreased undesirable activity. Also included
are those that are antigenic or immunogenic in an animal,
especially in a human.
[0055] Preferably, all of these polypeptide fragments retain a
biological activity of the receptor, including antigenic activity.
Variants of the defined sequence and fragments also form part of
the present invention. Preferred variants are those that vary from
the referents by conservative amino acid substitutions--i.e., those
that substitute a residue with another of like characteristics.
Typical such substitutions are among Ala, Val, Leu and Ile; among
Ser and Thr; among the acidic residues Asp and Glu; among Asn and
Gln; and among the basic residues Lys and Arg; or aromatic residues
Phe and Tyr. Particularly preferred are variants in which several,
5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in
any combination.
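
The conservative substitution groups listed above can be expressed as a small lookup. The following Python sketch is illustrative only; the names `CONSERVATIVE_GROUPS`, `is_conservative`, and `classify_substitutions` are hypothetical, residues are given in one-letter code, and the two sequences are assumed to be of equal length with no insertions or deletions.

```python
# Groups taken from the paragraph above, in one-letter code:
# Ala/Val/Leu/Ile; Ser/Thr; Asp/Glu; Asn/Gln; Lys/Arg; Phe/Tyr.
CONSERVATIVE_GROUPS = [
    set("AVLI"), set("ST"), set("DE"), set("NQ"), set("KR"), set("FY"),
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues differ and fall within one listed group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)


def classify_substitutions(reference: str, variant: str):
    """List (position, ref_residue, var_residue, conservative) per difference.

    Positions are 1-based; sequences must be the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (i, r, v, is_conservative(r, v))
        for i, (r, v) in enumerate(zip(reference, variant), start=1)
        if r != v
    ]
```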
[0056] The receptor polypeptides of the invention can be prepared
in any suitable manner. Such polypeptides include isolated
naturally occurring polypeptides, recombinantly produced
polypeptides, synthetically produced polypeptides, or polypeptides
produced by a combination of these methods. Means for preparing
such polypeptides are well understood in the art.
Polynucleotides of the Invention
[0057] Another aspect of the invention relates to receptor
polynucleotides. Receptor polynucleotides include isolated
polynucleotides which encode the receptor polypeptides and
fragments, and polynucleotides closely related thereto. More
specifically, a receptor polynucleotide of the invention includes a
polynucleotide comprising the nucleotide sequence contained in SEQ
ID NO:X encoding a receptor polypeptide of SEQ ID NO:Y, and
polynucleotide having the particular sequence of SEQ ID NO:X.
[0058] Receptor polynucleotides further include a polynucleotide
comprising a nucleotide sequence that has at least 80% identity
over its entire length to a nucleotide sequence encoding the
receptor polypeptide of SEQ ID NO:Y, and a polynucleotide
comprising a nucleotide sequence that is at least 80% identical to
that of SEQ ID NO:X over its entire length. In this regard,
polynucleotides at least 90% identical are particularly preferred,
and those with at least 95% are especially preferred. Furthermore,
those with at least 97% are highly preferred and those with at
least 98-99% are most highly preferred, with at least 99% being the
most preferred. Also included under receptor polynucleotides is a
nucleotide sequence which has sufficient identity to a nucleotide
sequence contained in SEQ ID NO:X, or contained in the cDNA insert
in the plasmid deposited with ATCC™, to hybridize under
conditions useable for amplification or for use as a probe or
marker. Moreover, the receptor polynucleotide includes a nucleotide
sequence having at least 80% identity to a nucleotide sequence
encoding the receptor polypeptide expressed by the cDNA insert
deposited at the ATCC™, and a nucleotide sequence comprising at
least 15 contiguous nucleotides of such cDNA insert. In this
regard, polynucleotides at least 90% identical are particularly
preferred, and those with at least 95% are especially preferred.
Furthermore, those with at least 97% are highly preferred and those
with at least 98-99% are most highly preferred, with at least 99%
being the most preferred. The invention also provides
polynucleotides which are complementary to all the above receptor
polynucleotides.
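
Two of the relationships named in this paragraph, complementarity and the presence of at least 15 contiguous nucleotides of a reference insert, are easy to check computationally. The Python sketch below is an illustration under stated assumptions, not a method taken from this disclosure: the function names `reverse_complement` and `shares_contiguous_run` are hypothetical, only the unambiguous bases A, C, G and T are handled, and the complement is reported in the conventional 5' to 3' orientation (that is, as the reverse complement).

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def shares_contiguous_run(candidate: str, reference: str, min_length: int = 15) -> bool:
    """True if candidate contains at least min_length contiguous
    nucleotides that also occur contiguously in reference."""
    if len(candidate) < min_length or len(reference) < min_length:
        return False
    # Collect every window of min_length nucleotides from the reference.
    ref_windows = {
        reference[i:i + min_length]
        for i in range(len(reference) - min_length + 1)
    }
    return any(
        candidate[i:i + min_length] in ref_windows
        for i in range(len(candidate) - min_length + 1)
    )
```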
[0059] The receptors of the invention are structurally related to
other proteins of specified receptor families, as shown by the
results in the Figures. The cDNA sequence of SEQ ID NO:X encodes a
polypeptide as described in Table 1 as SEQ ID NO:Y. Because the
receptor polypeptides contain domains similar in structure to other
receptor family members, the receptors of the present invention are
expected to have, inter alia, similar biological
functions/properties to their homologous polypeptides and
polynucleotides, and their utility is obvious to anyone skilled in
the art.
TABLE 1
Clone ID Name | SEQ ID NO:X | SEQ ID NO:Y | ATCC™ Deposit No. | ATCC™ Deposit Date | Receptor Family | Homology
HMACR70 | 1 | 18 | 209054 | May 16, 1997 | Ig | Sialoadhesin, OB-1
 | | | ##### | Jan. 21, 1998 | |
HTEDK48 | | | 209054 | May 16, 1997 | TM4SF | MRC-OX44, PETA-3
1-1849 bp | 2 | | | | |
160-900 bp | 3 | 19 | | | |
HTPED39 | 4 | 20 | 209054 | May 16, 1997 | TM4SF | NAG-2, TALLA-1
HPWAE25 | | | ##### | Jan. 21, 1998 | |
HTPEF86 | 5 | 21 | 209053 | May 16, 1997 | TM4SF | CD20 B1 Antigen
HSBBF02 | 6 | 22 | 209054 | May 16, 1997 | TM4SF | TALLA-1
HLTAH80 | 7 | 23 | 97242 | Aug. 02, 1995 | TM4SF | TALLA-1
 | | | 209054 | May 16, 1997 | |
HTPBA27 | 8 | 24 | 97242 | Aug. 02, 1995 | TM4SF | NAG-2
 | | | 209054 | May 16, 1997 | |
HAIDQ59 | | | 209054 | May 16, 1997 | TM4SF | CD9 Antigen
5' Sequence | 9 | 25 | | | |
3' Sequence | 10 | - | | | |
HHFEK40 | 11 | 26 | 209054 | May 16, 1997 | TM4SF | PETA-3
HGBGV89 | 12 | 27 | 209125 | Jun. 09, 1997 | TM4SF | L6H
 | | | 209054 | May 16, 1997 | |
HUVBB80 | 13 | 28 | 209054 | May 16, 1997 | TM4SF | L6
HJACE54 | 14 | 29 | 209053 | May 16, 1997 | Lectin | Galectin-3, Galectin-5, Galectin-8
HROAD63 | 15 | 30 | 209053 | May 16, 1997 | Ly6 | E48 splice variant
HMWGS46 | 16 | 31 | 209053 | May 16, 1997 | Prohibitin | BAP-37
HNFGW06 | 17 | 32 | 209053 | May 16, 1997 | EGFR | EGFR
[0060] The novel full-length cDNA clone designated HMACR70 (SEQ ID
NO:1) may be a member of the sialoadhesin family of the Ig
superfamily of receptor-like molecules and a CD33 homologue.
HMACR70 contains a 1497 nucleotide cDNA insert (SEQ ID NO:1)
encoding a 315 amino acid ORF (SEQ ID NO:18) and was cloned from a
GM-CSF-treated human macrophage cDNA library. The only additional
cDNA libraries in the HGS database which include this clone are
human eosinophils and possibly human gall bladder. A BLAST analysis
of the amino acid sequence of HMACR70 (SEQ ID NO:18) demonstrates
that this clone exhibits approximately 50% identity and 69%
similarity over a 300 amino acid stretch of a gene termed human
differentiation antigen, and 38% identity and 62% similarity to the
human myelin-associated glycoprotein precursor CD33 gene.
[0061] A more recent BLAST analysis confirms HMACR70's (SEQ ID
NO:18) designation as a sialoadhesin family member. HMACR70 (SEQ ID
NO:18) is homologous to two recently identified sialoadhesin family
members, human OB binding protein (OB) 1 (SEQ ID NO:33) and 2.
(See, Genbank Accession No. U71382; see FIG. 1.) It is thought that
OB-1 (SEQ ID NO:33) and OB-2 may bind leptin. Thus, HMACR70 (SEQ ID
NO:18), as a sialoadhesin family member, may act to attenuate or
even amplify intercellular routes of communication, including
binding to leptin or modulating the activity of immune cells, such
as macrophages. Clearly, any diseases affected by these processes
could be treated by the polypeptide or fragment of HMACR70 (SEQ ID
NO:18).
[0062] The full-length nucleotide sequences of ten novel human cDNA
clones which potentially belong to the TM4SF superfamily are
disclosed in the table above and will be addressed
sequentially.
[0063] The cDNA clone HTEDK48 contains a 1849 nucleotide cDNA
insert (SEQ ID NO:2) encoding a 245 amino acid ORF that was cloned
from a human testes cDNA library. The coding sequence of HTEDK48
(SEQ ID NO: 3) may be fused to other human proteins, such as
3-hydroxyacyl-CoA dehydrogenase. BLAST analysis of the amino acid
sequence of HTEDK48 (SEQ ID NO:19) demonstrates that this clone
exhibits approximately 30% identity and 51% similarity over a 245
amino acid stretch of the CD82 molecule. Recent studies have shown
that CD82 can associate with CD4 or CD8 and deliver costimulatory
signals for the TCR/CD3 pathway. CD82 has also been found to be
involved in syncytium formation in HTLV-1-infected T-cells. And
finally, in a recently published study in which the expression of
the CD82 gene by tumors of the lung was examined retrospectively,
it was reported that CD82 may be linked to the suppression of tumor
metastasis of prostate cancer. The study also reported that
decreased CD82 expression may be involved in malignant progression
of such cancers. Thus, HTEDK48 (SEQ ID NO:2, NO:3, NO:19) may also
be involved in the development of cancer.
[0064] A more recent BLAST analysis shows that HTEDK48 (SEQ ID
NO:19) is homologous to the rat leukocyte antigen, MRC OX-44 (SEQ ID
NO:34), and the platelet endothelial tetraspan antigen-3 (PETA-3)
(SEQ ID NO:35). (See FIG. 2X.) MRC OX-44 (SEQ ID NO:34), a member
of a new family of cell surface proteins, appears to be involved in
growth regulation. (See, Bellacosa, A., et al., "The Rat Leukocyte
antigen MRC OX-44 is a Member of a New Family of Cell Surface
Proteins which Appear to be Involved in Growth Regulation," Mol.
Cell. Biol. 11: 2864-2872 (1991).) Similarly, PETA-3 (SEQ ID NO:35)
has been localized to platelet endothelial cells, and an anti-PETA-3
antigen monoclonal antibody can stimulate platelet aggregation and
mediator release. (See, Fitter, S., "Molecular Cloning of cDNA
Encoding a Novel Platelet-Endothelial Cell Tetra-Span Antigen,
PETA-3," Blood, 86(4):1348-1355 (1995).) Thus, HTEDK48 (SEQ ID
NO:19) may function similarly to MRC OX-44 (SEQ ID NO:34) or PETA-3
(SEQ ID NO:35) to affect growth of blood cells. Administering
polypeptides or fragments of HTEDK48 (SEQ ID NO:19) may be an
effective treatment of blood disorders.
[0065] The cDNA clone HPWAE25 contains a 1288 nucleotide cDNA
insert (SEQ ID NO:4) encoding a 273 amino acid ORF (SEQ ID NO:20)
that was cloned from a human pancreas tumor cDNA library, while
clone HTPED39 represents a truncated cDNA sequence. This clone also
appears in a number of other cDNA libraries constructed from a
variety of human cell and tissue types including keratinocytes,
ulcerative colitis, striatum depression, lymph node breast cancer,
ovarian cancer, stage B2 prostate cancer, kidney medulla, and
others. Northern blot analysis of HLTAH80 (SEQ ID NO:23) also shows
expression in a variety of human cell lines including U937, MM96,
WM115, and MDAMB231. A BLAST analysis of the amino acid sequence of
HTPED39 demonstrates that this clone exhibits approximately 35%
identity and 50% similarity over the entire length of the CD37
molecule. The CD37 antigen is expressed on B cells and on a
subpopulation of T cells, but not on pre-B or plasma cells. It has
been reported that CD37 expression is downregulated in conjunction
with B-cell activation, suggesting that CD37 may be involved in the
processes which dictate the activation state of the B-cell.
[0066] Moreover, HPWAE25 (SEQ ID NO:20) is also homologous to
recently identified TM4SF members, NAG-2 (SEQ ID NO:36) and TALLA-1
(SEQ ID NO:37). (See FIG. 3.) NAG-2 (SEQ ID NO:36) is thought to
complex with integrins and other TM4SF proteins, while TALLA-1 (SEQ
ID NO:37) is a highly specific marker of T-cell acute lymphoblastic
leukemia and neuroblastoma. (See, Tachibana, I., et al., "NAG-2, A
Novel Transmembrane-4 Superfamily (TM4SF) Protein that Complexes
with Integrins and Other TM4SF Proteins," J. Biol. Chem.,
272:29181-29189 (1997); Takagi, S., "Identification of a Highly
Specific Surface Marker of T-cell Acute Lymphoblastic Leukemia and
Neuroblastoma as a New Member of the Transmembrane 4 Superfamily,"
Int. J. Cancer 61(5):706-715 (1995).) Thus, HPWAE25 (SEQ ID NO:20)
may be involved in the development of cancer, particularly leukemia,
lymphoma, and neuroblastoma. HPWAE25 (SEQ ID NO:4 and NO:20) may be
used as an effective treatment of these cancers, as well as a
diagnostic marker.
[0067] A subfamily of TM4SF receptors includes CD20 proteins. A
CD20-like cDNA clone was obtained from a human pancreas tumor cDNA
library and contains a 1236 nucleotide insert which encodes a 250
amino acid ORF. A BLAST analysis of the deduced amino acid sequence
of HTPEF86 (SEQ ID NO:21) exhibits approximately 41% identity and
61% similarity to the CD20 gene, also known as B1 antigen (SEQ ID
NO:38). (See FIG. 4.) Expression of this gene is detected in only
two additional HGS human cDNA libraries; amygdala depression and 9
week early stage human. Although the precise functional role of
CD20 has yet to be determined, it is clear that CD20 plays a key
role in the regulation of B-cell activation. Based primarily on
sequence identity, the novel CD20-like molecule presented herein
may also be involved in cell cycle activation. Potential
therapeutic and/or diagnostic applications for HTPEF86 (SEQ ID
NO:21) may include such clinical presentations as juvenile
rheumatoid arthritis, Graves' Disease, and a number of B-cell
lymphomas or other lymphoid tumors.
[0068] The clone HSBBF02 contains a 1115 nucleotide cDNA insert
(SEQ ID NO:6) encoding a 245 amino acid ORF (SEQ ID NO:22) and was
cloned from an HSC 172 cell line cDNA library. This clone also
appears in a number of other cDNA libraries constructed from a
variety of human cell and tissue types including brain amygdala
depression, endothelial cells, fetal liver and heart, osteoblasts,
testes, and others. A BLAST analysis of the amino acid sequence of
HSBBF02 (SEQ ID NO:22) demonstrates that this clone exhibits
approximately 64% identity and 80% similarity with the A15 molecule
over a 131 amino acid stretch (A15 is composed of 244 amino acids).
A more recent BLAST search shows that HSBBF02 (SEQ ID NO:22) is
similar to the TALLA-1 protein (SEQ ID NO:37) and may in fact be a
closely related family member. (See FIG. 5.)
[0069] In addition, a second cDNA clone, designated HLTAH80 (SEQ ID
NO:23), exhibits sequence similarity to the A15 molecule and
TALLA-1 (SEQ ID NO:37). (See FIG. 6.) This clone contains a 1662
nucleotide cDNA insert encoding a 253 amino acid ORF and was cloned
from a human T-cell lymphoma cDNA library. This clone also appears
in a number of other cDNA libraries constructed from a variety of
human cell and tissue types including B-cell lymphoma, corpus
callosum, endometrial tumor, osteosarcoma, testes, and others.
Northern blot analysis of HLTAH80 (SEQ ID NO:7) also shows
expression in a variety of human tissues including spleen, lymph
node, thymus, PBLs, heart, and a particularly strong signal in
skeletal muscle and pancreas. A BLAST analysis of the amino acid
sequence of HLTAH80 (SEQ ID NO:23) demonstrates that this clone
exhibits approximately 35% identity and 55% similarity over the
entire length of the A15 molecule.
[0070] Since expression of A15 drops to undetectable levels when
comparing immature T-cells to peripheral blood lymphocytes, it is
thought that A15 may play a role in the development of T-cells.
Furthermore, the MXS1(CCG-B7) gene which codes for A15 contains a
number of triplet nucleotide repeats which have been associated
with neuropsychiatric diseases such as Huntington's chorea, fragile
X syndrome, and myotonic dystrophy. In addition, A15 appears to be
expressed exclusively on T-cell acute lymphoblastic leukemia cell
lines, including several derived from adult T-cell leukemia and
those established by immortalization with human T-cell leukemia
virus type 1 or Herpesvirus saimiri. Thus, clones HLTAH80 (SEQ ID
NO:7 and NO:23) and/or HSBBF02 (SEQ ID NO:6 and NO:22) may also be
involved in diseases caused by the expansion of repeats or
chromosomal instability.
[0071] The cDNA clone HTPBA27 contains a 1345 nucleotide cDNA
insert (SEQ ID NO:8) encoding a 238 amino acid ORF (SEQ ID NO:24)
and was cloned from a human tumor pancreas cDNA library. This clone
also appears in a number of other cDNA libraries constructed from a
variety of human cell and tissue types including cerebellum, breast
lymph node, osteosarcoma, adult testes, RS4; 11 bone marrow cell
line, microvascular endothelial cells, and others. A BLAST analysis
of the amino acid sequence of HTPBA27 (SEQ ID NO:24) demonstrates
that this clone exhibits approximately 40% identity and 64%
similarity with a glycoprotein termed CD53 over its entire length.
CD53 is thought to be involved in thymopoiesis, since rat CD53 can
be detected on immature CD4-8-thymocytes and the functionally
mature single-positive subset, but cannot be detected on the
intermediate CD4+8+ thymocytic subset of cells. The CD53 molecule
has also been implicated as a component of signal transduction
pathways in B cells, monocytes and granulocytes, rat macrophages,
NK, and T cells. Moreover, as illustrated in FIG. 7, HTPBA27 (SEQ
ID NO:24) was recently confirmed as a TM4SF receptor. (See,
Tachibana, I., et al., "NAG-2, A Novel Transmembrane-4 Superfamily
(TM4SF) Protein that Complexes with Integrins and Other TM4SF Proteins," J.
Biol. Chem., 272:29181-29189 (1997).) Calling the HTPBA27
polypeptide (SEQ ID NO:24) NAG-2 (SEQ ID NO:36), this group
confirmed HTPBA27's status as a TM4SF receptor by showing that
NAG-2 (SEQ ID NO:36) complexes with integrin and other TM4SF
receptors. Thus, diseases caused by the failure of HTPBA27 (SEQ ID
NO:24) to complex with integrin and other TM4SF receptors can be
treated by administering HTPBA27 (SEQ ID NO:24). HTPBA27 (SEQ ID
NO:8 and NO:24) can also be used to diagnose these diseases.
[0072] The cDNA clone HAIDQ59 contains a cDNA insert encoding a 221
amino acid ORF (SEQ ID NO:25) that was cloned from a human
epithelial cell induced with TNFa and INF cDNA library. The 5' end
of HAIDQ59 is represented by the SEQ ID NO: 9, while the 3' end is
represented by SEQ ID NO: 10. This clone appears in only two
additional cDNA libraries in the HGS database. These two libraries
were constructed from the human Jurkat T-cell line and human
microvascular endothelial cells. A BLAST analysis of the amino acid
sequence of HAIDQ59 (SEQ ID NO:25) demonstrates that this clone
exhibits approximately 53% identity and 69% similarity over 226
amino acids of the CD9 TM4SF molecule (SEQ ID NO:39). (See FIG. 8.)
It has been demonstrated that the CD9 molecule (SEQ ID NO:39) is
involved in signal transduction pathways in platelets, as well as
in cell adhesion in both platelets and pre-B-cell lines.
Intriguingly, a monoclonal antibody (vpg15), which recognizes the
feline homologue of CD9, has been shown to block infection by
feline immunodeficiency virus (FIV). Furthermore, a recent study
shows that cells expressing high levels of CD9 (SEQ ID NO:39)
exhibited suppressed cell motility. Thus, HAIDQ59 (SEQ ID NO:25)
may also be involved in signal transduction of blood cells.
[0073] The cDNA clone HHFEK40 contains a 936 nucleotide cDNA insert
(SEQ ID NO:11) encoding a 252 amino acid ORF (SEQ ID NO:26) and was
cloned from a human fetal heart cDNA library. This clone appears
once in the human fetal heart cDNA library and possibly in a
hemangiopericytoma cDNA library. A BLAST analysis of the amino acid
sequence of HHFEK40 (SEQ ID NO:26) demonstrated that this clone
exhibits approximately 60% identity and 75% similarity over the
entire length of a molecule designated PETA-3 (SEQ ID NO:35). (See
FIG. 9.) PETA-3 (SEQ ID NO:35) was originally identified as a novel
human platelet surface glycoprotein termed gp27. Although PETA-3
(SEQ ID NO:35) is present in low abundance on the platelet surface,
an anti-PETA-3 monoclonal antibody can stimulate platelet
aggregation and mediator release. Thus, HHFEK40 (SEQ ID NO:26) may
function similarly to PETA-3 (SEQ ID NO:35) to affect growth of blood
cells. Administering polypeptides or fragments of HHFEK40 (SEQ ID
NO:26) may be an effective treatment of blood disorders.
[0074] The cDNA clone HGBGV89 contains a 738 nucleotide cDNA insert
(SEQ ID NO:12) encoding a 197 amino acid ORF (SEQ ID NO:27) and was
cloned from a human gall bladder cDNA library. The only two
additional appearances of this clone in the HGS database are in a
normalized fetal liver cDNA library and in a fetal liver/spleen
cDNA library. The cDNA clone HUVBB80 contains a 1071 nucleotide
cDNA insert (SEQ ID NO:13) encoding a 201 amino acid ORF (SEQ ID
NO:28) and was cloned from a human umbilical vein cDNA library.
This clone appears in several additional cDNA libraries in the HGS
database including prostate BPH, thyroid, and fetal liver/spleen.
BLAST analyses of the amino acid sequences of HGBGV89 (SEQ ID
NO:27) and HUVBB80 (SEQ ID NO:28) demonstrate that these clones
exhibit approximately 49% identity and 65% similarity and 47%
identity and 68% similarity, respectively, over the entire length
of a molecule designated L6 surface protein (SEQ ID NO:41) or human
tumor-associated antigen L6 (SEQ ID NO:41). (See FIGS. 10 & 11.)
Moreover, another group has confirmed the TM4SF receptor homology
of HGBGV89 (SEQ ID NO:27) by describing the protein as a putative
transmembrane protein L6H (SEQ ID NO:40). (See Genbank Accession No.
2587054; see FIG. 10.) The L6 cell surface antigen (SEQ ID NO:41)
is highly expressed on lung, breast, colon, and ovarian carcinomas.
Promising results of phase 1 clinical studies have been reported
with an anti-L6 monoclonal antibody, or its humanized counterpart,
suggesting that the L6 antigen (SEQ ID NO:41) may be an attractive
target for monoclonal antibody-based cancer therapy.
[0075] In summary, there is a clear need for identifying and
exploiting novel members of the TM4SF superfamily such as those
described herein. Although structurally related, these factors will
likely possess diverse and multifaceted functions in a variety of
cell and tissue types. Receptor type molecules, such as the novel
potential members of the TM4SF superfamily detailed here, should
prove useful in target based screens for small molecules and other
such pharmacologically valuable factors. Monoclonal antibodies
raised against such factors may prove useful as therapeutics in an
anti-tumor, diagnostic, or other capacity. Furthermore, factors
such as the nine novel TM4SF superfamily-like molecules described
here may prove useful in an active or passive immunotherapeutical
role in patients with cancer or other immunocompromised disease
states.
[0076] Besides TM4SF receptors, receptors from other families are
also described. For example, clone HJACE54 (SEQ ID NO:14 and
NO:29), also called galectin 11, exhibits significant sequence
identity to the rat galectin 5 (SEQ ID NO:42), the chicken galectin
3 gene, and the human galectin 8 (SEQ ID NO:43) genes. (See FIG.
12.) The galectin 11 cDNA clone contains an 865 nucleotide insert
(SEQ ID NO:14) which encodes a 133 amino acid ORF (SEQ ID NO:29).
The clone was obtained from a Jurkat T-cell G1 phase cDNA library.
A BLAST analysis of the deduced amino acid sequence of HJACE54 (SEQ
ID NO:29) demonstrates approximately 35% identity and 57%
similarity to the amino acid sequence of the rat galectin 5 (SEQ ID
NO:42) gene. Expression of galectin 11 (SEQ ID NO:14) is quite
limited in the HGS database. In fact, the only two additional ESTs
in the HGS database which contain the HJACE54 sequence (SEQ ID
NO:14) were found in human neutrophil and human infant adrenal
gland cDNA libraries. Northern blot analyses have not been
performed to examine expression patterns of the galectin 11 gene
(SEQ ID NO:14).
[0077] Various galectins have been shown to function in the
mechanisms of intercellular communication. For example, depending
on cell type, galectin 1 has been observed to modulate cell
adhesion either positively or negatively. More specifically,
galectin 1 appears to inhibit cell adhesion of skeletal muscle
presumably by galectin 1-mediated disruption of laminin-integrin
α7β1 interactions. Alternatively, galectin 1 appears to
promote cell adhesion in several non-skeletal muscle cell types
examined presumably by a glycoconjugate cross-linking mechanism.
Galectin 3 has also been observed to function in modulating
cell-adhesion, as well as in the activation of certain immune cells
by cross-linking IgE and IgE receptors. In addition, galectins have
been observed to be involved in the regulation of immune cell
activity, as well as in such diverse processes as cell adhesion,
proliferation, inflammation, autoimmunity, and metastasis of tumor
cells. Furthermore, a galectin-like antigen designated HOM-HD-21
was recently found to be highly expressed in a Hodgkin's Disease
cDNA library. Very recently, a novel galectin, termed PCTA-1, was
identified as a specific cell surface marker on human prostate
cancer cell lines and patient-derived carcinomas. Galectins have
also been found to function intracellularly as a component of
ribonucleoprotein complexes. Finally, galectins 1 and 3 have each
been found to modulate T-cell growth and apoptosis by interaction
with CD45 and possibly Bcl2, respectively. As a result, the
discovery of a novel galectin (SEQ ID NO:29), such as that encoded
by HJACE54 (SEQ ID NO:14), is likely to be a valuable asset both
diagnostically and therapeutically.
[0078] Additionally, a full-length nucleotide sequence of a novel
human cDNA clone which encodes an apparent splice variant of the
previously described human E48 antigen has recently been
determined. (See FIG. 13.) Clone HROAD63 contains a 441 nucleotide
cDNA (SEQ ID NO:15) which encodes a 70 amino acid polypeptide (SEQ
ID NO:30). This novel clone exhibits significant sequence identity
to several members of a relatively new family of cell-surface
proteins termed the Ly6 superfamily. These members include murine
and human SCA-2, rat Ly-6 (also termed ThB), and human CD59 [also
known as protectin or membrane attack complex inhibition factor
(MACIF)]. The novel E48 splice variant (SEQ ID NO:15) was obtained
from the HGS human stomach cDNA library. The clone (SEQ ID NO:30)
is present in only a limited number of other HGS cDNA libraries
including kidney cancer, keratinocyte, and tongue. An alignment of
the nucleotide sequences of the human E48 and HROAD63 (SEQ ID
NO:15) cDNAs demonstrates that the initial 168 and 178 nucleotides
of E48 and HROAD63, respectively, are identical, with the exception
of an additional 10 nucleotides of sequence at the extreme 5' end
of the HROAD63 sequence. The sequence of the two clones is also
identical for an additional 229 nucleotides including the 3' end of
the coding sequences and the entire 3' untranslated regions. The
only divergence of nucleotide sequence in this region of the clones
is the deletion of a single thymidine residue in the 3' UTR of the
E48 cDNA. The major difference between the two nucleotide sequences
is a 329 nucleotide deletion from the HROAD63 sequence. This
deletion causes a shift in the HROAD63 reading frame and
encompasses the translational stop signal used in the E48 clone. As
a result, the carboxy terminal sequence of HROAD63 (SEQ ID NO:30)
is radically altered with regard to that of E48 (SEQ ID NO:44) (as
illustrated in FIG. 13 by the obvious differences between amino
acids 56-128 of E48 and 56-70 of HROAD63 in the amino acid
alignment). The clinical presentation of disorders, including
abnormal skin and hair phenotypes, may be attributed, at least in
part, to a non-functional Ly6 superfamily member such as E48 (SEQ
ID NO:44) or HROAD63 (SEQ ID NO:30). HROAD63 (SEQ ID NO:30) may
also be involved in blood disorders, as seen with its homologues
SCA-2 and CD59.
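The reading-frame shift attributed above to the 329 nucleotide deletion follows from simple arithmetic: a deletion leaves the downstream frame intact only when its length is a multiple of three. The following is a minimal Python sketch of that check; the function name `deletion_effect` is illustrative only and is not part of the disclosure.

```python
def deletion_effect(deletion_length: int) -> str:
    """Classify a deletion by its effect on the downstream reading frame."""
    if deletion_length <= 0:
        raise ValueError("deletion_length must be a positive integer")
    # Codons are three nucleotides long, so only deletions that remove
    # whole codons (length divisible by 3) keep the original frame.
    return "in-frame" if deletion_length % 3 == 0 else "frameshift"


# 329 = 3 * 109 + 2, so the remaining sequence is read in a shifted frame.
print(deletion_effect(329))
```

A 330 nucleotide deletion, by contrast, would be reported as in-frame by the same check.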
[0079] A novel prohibitin cDNA clone (SEQ ID NO:16) presented
herein was originally identified in a human bone marrow cell line
(RS4;11) cDNA library. The clone contains a 1066 nucleotide insert
(SEQ ID NO:16) which encodes a 299 amino acid polypeptide (SEQ ID
NO:31). BLAST and BestFit analyses of the predicted amino acid
sequence of HMWGS46 (SEQ ID NO:31) demonstrate a highly significant
sequence identity to a murine protein termed IgM B-cell receptor
associated protein (BAP)-37 (SEQ ID NO:45) (Genbank accession
number X78683). The HMWGS46 amino acid sequence (SEQ ID NO:31)
exhibits nearly perfect identity and similarity over the entire
length of the murine BAP-37 sequence (SEQ ID NO:45). (See FIG. 14.)
In addition, the full-length nucleotide sequences of HMWGS46 (SEQ
ID NO:16) and BAP-37 (SEQ ID NO:45) are at least 87% identical.
The HMWGS46 clone (SEQ ID NO:16) also exhibits approximately 49%
sequence identity and 85% sequence similarity to a human gene
designated prohibitin. Finally, the HMWGS46 cDNA (SEQ ID NO:16)
appears in a substantial number of HGS human cDNA libraries in
addition to the bone marrow cell line cDNA library from which it
was cloned. Some of the cDNA libraries in which this clone appears
include keratinocytes, induced endothelial cells, activated
neutrophils, synovial sarcoma, colon carcinoma cell line, Jurkat
cell line membrane bound polysomes, epileptic frontal cortex,
primary dendritic cells, and a number of others. The novel gene
related to prohibitin and BAP-37 (SEQ ID NO:45) may prove quite
useful as a diagnostic for tumorigenesis, as well as a target for
therapeutic intervention of such an event. Thus, although the
precise functional roles of the prohibitin family members are less
than clear, it is quite likely that such homologues are involved in
such complex processes as development, senescence, and tumor
suppression. Therefore a novel gene, such as HMWGS46 (SEQ ID NO:16
and NO:31), may prove quite useful as a diagnostic for
tumorigenesis, as well as a target for therapeutic intervention of
such an event.
[0080] A human cDNA clone encoding a novel epidermal growth factor
receptor (EGFR)-like molecule (SEQ ID NO:32) is also disclosed. The
novel EGFR-like cDNA clone (SEQ ID NO:17) presented herein was
originally identified in an activated human neutrophil cDNA
library. The clone contains a 704 nucleotide insert (SEQ ID NO:17)
which encodes a 168 amino acid polypeptide (SEQ ID NO:32). A BLAST
analysis of the predicted amino acid sequence of HNFGW06 (SEQ ID
NO:32) demonstrates that this novel clone exhibits approximately
85% identity and 90% similarity to a protein designated epidermal
growth factor receptor-related protein [Homo sapiens] (SEQ ID
NO:46). (See FIG. 15.) The expression profile of the HNFGW06 clone
(SEQ ID NO:17) in the HGS database indicates the existence of a
fairly highly restricted expression pattern. In addition to the
activated neutrophil library from which this clone (SEQ ID NO:17)
was obtained, it also appears in the following HGS human cDNA
libraries: synovial sarcoma, smooth muscle, placenta, and possibly
primary dendritic cells.
[0081] The novel EGFR-like cDNA clone HNFGW06 (SEQ ID NO:17) may
lead to a number of exciting possibilities for therapeutic and/or
diagnostic treatments or reagents. For example, HNFGW06 (SEQ ID
NO:17 and NO:32) may be involved in the onset of human breast
cancers as well. In addition, because TGF-α acts
through binding to the EGFR (SEQ ID NO:46), it is possible that
HNFGW06 (SEQ ID NO:17 and NO:32) may also play a role in a variety
of gastric processes including regulation of acid secretion,
regulation of mucous cell growth, and protection against ethanol-
and aspirin-induced injury to gastric tissues.
Generating Polynucleotides
[0082] Polynucleotides of the present invention encoding a receptor
may be obtained using standard cloning and screening, from a cDNA
library derived from mRNA in cells specified in Table 1 using the
expressed sequence tag (EST) analysis (Adams, M. D., et al. Science
(1991) 252:1651-1656; Adams, M. D. et al., Nature, (1992)
355:632-634; Adams, M. D., et al., Nature (1995) 377 Supp:3-174.)
Polynucleotides of the invention can also be obtained from natural
sources such as genomic DNA libraries or can be synthesized using
well known and commercially available techniques.
[0083] The nucleotide sequence encoding a receptor polypeptide of
SEQ ID NO:Y may be identical to the polynucleotide encoding SEQ ID
NO:Y, or it may be a sequence, which as a result of the redundancy
(degeneracy) of the genetic code, also encodes the polypeptide of
SEQ ID NO:Y.
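The degeneracy referred to above can be illustrated with a short Python sketch in which two different nucleotide sequences translate to the same peptide. The standard genetic code table is used; the two example sequences and the function name `translate` are illustrative assumptions and are not taken from any SEQ ID NO of the disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence, read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical sequences that differ at four positions yet encode Met-Leu-Ser.
seq_a = "ATGCTGAGC"
seq_b = "ATGTTATCT"
print(translate(seq_a), translate(seq_b))
```

Both calls return the same three-residue peptide, because leucine and serine are each encoded by six codons.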
[0084] When the polynucleotides of the invention are used for the
recombinant production of a receptor polypeptide, the
polynucleotide may include the coding sequence for the mature
polypeptide or a fragment thereof, by itself; the coding sequence
for the mature polypeptide or fragment in reading frame with other
coding sequences, such as those encoding a leader or secretory
sequence, a pre-, or pro- or prepro-protein sequence, or other
fusion peptide portions. For example, a marker sequence which
facilitates purification of the fused polypeptide can be encoded.
In certain preferred embodiments of this aspect of the invention,
the marker sequence is a hexa-histidine peptide, as provided in the
pQE vector (Qiagen, Inc.) and described in Gentz et al., Proc Natl
Acad Sci USA (1989) 86:821-824, or is an HA tag. The polynucleotide
may also contain non-coding 5' and 3' sequences, such as
transcribed, non-translated sequences, splicing and polyadenylation
signals, ribosome binding sites and sequences that stabilize
mRNA.
[0085] Further preferred embodiments are polynucleotides encoding
receptor variants comprising the amino acid sequence of receptor
polypeptide of Table 1 (SEQ ID NO:Y) in which several, 5-10, 1-5,
1-3, 1-2 or 1 amino acid residues are substituted, deleted or
added, in any combination.
[0086] The present invention further relates to polynucleotides
that hybridize to the herein above-described sequences. In this
regard, the present invention especially relates to polynucleotides
which hybridize under stringent conditions to the herein
above-described polynucleotides. As herein used, the term
"stringent conditions" means hybridization will occur only if there
is at least 80%, and preferably at least 90%, and more preferably
at least 95%, yet even more preferably 97-99% identity between the
sequences.
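The identity thresholds recited above can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the two sequences are taken to be already aligned, ungapped and of equal length, and the function names are hypothetical. In practice, stringency is set experimentally by temperature and salt concentration, not computed from sequence.

```python
# Identity thresholds recited in the text, most stringent first.
IDENTITY_TIERS = (97.0, 95.0, 90.0, 80.0)


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned, ungapped sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def highest_tier_met(identity: float):
    """Return the highest recited threshold that the identity satisfies, or None."""
    for tier in IDENTITY_TIERS:
        if identity >= tier:
            return tier
    return None


# Hypothetical 10-mers differing at one position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, highest_tier_met(identity))
```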
[0087] Polynucleotides of the invention, which are identical or
sufficiently identical to a nucleotide sequence contained in SEQ ID
NO:X or a fragment thereof, or to the cDNA insert in the plasmid
deposited at the ATCC™, or a fragment thereof, may be used as
hybridization probes for cDNA and genomic DNA, to isolate
full-length cDNAs and genomic clones encoding the receptor and to
isolate cDNA and genomic clones of other genes (including genes
encoding homologs and orthologs) that have a high sequence
similarity to the receptor gene. Such hybridization techniques are
known to those of skill in the art. Typically these nucleotide
sequences are 80% identical, preferably 90% identical, more
preferably 95% identical to that of the referent. The probes
generally will comprise at least 15 nucleotides. Preferably, such
probes will have at least 30 nucleotides and may have at least 50
nucleotides. Particularly preferred probes will range between 30
and 50 nucleotides.
[0088] In one embodiment, a method for obtaining a polynucleotide
encoding the receptor polypeptide, including homologs and orthologs
from other species, comprises the steps of screening an appropriate
library
under stringent hybridization conditions with a labeled probe
having the SEQ ID NO:X or a fragment thereof; and isolating
full-length cDNA and genomic clones containing said polynucleotide
sequence. Such hybridization techniques are well known to those of
skill in the art. Stringent hybridization conditions are as defined
above or, alternatively, overnight incubation at 42°C in a solution
comprising 50% formamide, 5×SSC (1×SSC is 150 mM NaCl, 15 mM
trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5×Denhardt's
solution, 10% dextran sulfate, and 20 micrograms/ml denatured,
sheared salmon sperm DNA, followed by washing the filters in
0.1×SSC at about 65°C.
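The salt concentrations implied by the hybridization and wash solutions can be worked out with a short Python sketch. It assumes that the 150 mM NaCl and 15 mM trisodium citrate figures describe 1× SSC, which is the usual definition; the function name `ssc_composition` is illustrative only.

```python
# Assumed composition of 1x SSC, in millimolar.
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}


def ssc_composition(fold: float) -> dict:
    """Return solute concentrations (mM) for SSC at the given fold strength."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    # Rounding suppresses floating-point noise such as 15.000000000000002.
    return {solute: round(mm * fold, 3) for solute, mm in SSC_1X_MM.items()}


print(ssc_composition(5))    # hybridization solution
print(ssc_composition(0.1))  # wash solution
```

Under that assumption, the hybridization step is carried out at 750 mM NaCl and the wash at 15 mM NaCl, so the wash is the lower-salt, more stringent step.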
[0089] The polynucleotides and polypeptides of the present
invention may be employed as research reagents and materials for
discovery of treatments and diagnostics for animal and human
disease.
Vectors, Host Cells, Expression
[0090] The present invention also relates to vectors which comprise
a polynucleotide or polynucleotides of the present invention, and
host cells which are genetically engineered with vectors of the
invention and to the production of polypeptides of the invention by
recombinant techniques. Cell-free translation systems can also be
employed to produce such proteins using RNAs derived from the DNA
constructs of the present invention.
[0091] For recombinant production, host cells can be genetically
engineered to incorporate expression systems or portions thereof
for polynucleotides of the present invention. Introduction of
polynucleotides into host cells can be effected by methods
described in many standard laboratory manuals, such as Davis et
al., BASIC METHODS IN MOLECULAR BIOLOGY (1986) and Sambrook et al.,
MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Ed., Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, N.Y. (1989), including calcium
phosphate transfection, DEAE-dextran mediated transfection,
transvection, microinjection, cationic lipid-mediated transfection,
electroporation, transduction, scrape loading, ballistic
introduction or infection.
[0092] Representative examples of appropriate hosts include
bacterial cells, such as streptococci, staphylococci, E. coli,
Streptomyces and Bacillus subtilis cells; fungal cells, such as
yeast cells and Aspergillus cells; insect cells such as Drosophila
S2 and Spodoptera Sf9 cells; animal cells such as CHO, COS, HeLa,
C127, 3T3, BHK, HEK 293 and Bowes melanoma cells; and plant
cells.
[0093] A great variety of expression systems can be used. Such
systems include, among others, chromosomal, episomal and
virus-derived systems, e.g., vectors derived from bacterial
plasmids, from bacteriophage, from transposons, from yeast
episomes, from insertion elements, from yeast chromosomal elements,
from viruses such as baculoviruses, papova viruses, such as SV40,
vaccinia viruses, adenoviruses, fowl pox viruses, pseudorabies
viruses and retroviruses, and vectors derived from combinations
thereof, such as those derived from plasmid and bacteriophage
genetic elements, such as cosmids and phagemids. The expression
systems may contain control regions that regulate as well as
engender expression. Generally, any system or vector suitable to
maintain, propagate or express polynucleotides to produce a
polypeptide in a host may be used. The appropriate nucleotide
sequence may be inserted into an expression system by any of a
variety of well-known and routine techniques, such as, for example,
those set forth in Sambrook et al., MOLECULAR CLONING, A LABORATORY
MANUAL (supra).
[0094] For secretion of the translated protein into the lumen of
the endoplasmic reticulum, into the periplasmic space or into the
extracellular environment, appropriate secretion signals may be
incorporated into the desired polypeptide. These signals may be
endogenous to the polypeptide or they may be heterologous
signals.
[0095] If the receptor polypeptide is to be expressed for use in
screening assays, generally, it is preferred that the polypeptide
be produced at the surface of the cell. In this event, the cells
may be harvested prior to use in the screening assay. If the
receptor polypeptide is secreted into the medium, the medium can be
recovered in order to recover and purify the polypeptide; if
produced intracellularly, the cells must first be lysed before the
polypeptide is recovered.
[0096] Receptor polypeptides can be recovered and purified from
recombinant cell cultures by well-known methods including ammonium
sulfate or ethanol precipitation, acid extraction, anion or cation
exchange chromatography, phosphocellulose chromatography,
hydrophobic interaction chromatography, affinity chromatography,
hydroxylapatite chromatography and lectin chromatography. Most
preferably, high performance liquid chromatography is employed for
purification. Well known techniques for refolding proteins may be
employed to regenerate active conformation when the polypeptide is
denatured during isolation and/or purification.
Diagnostic Assays
[0097] This invention also relates to the use of receptor
polynucleotides or polypeptides as diagnostic reagents.
Detection of a mutated form of the receptor gene associated with a
dysfunction will provide a diagnostic tool that can add to or
define a diagnosis of a disease or susceptibility to a disease
which results from under-expression, over-expression or altered
expression of the receptor. Individuals carrying mutations in the
receptor gene may be detected at the DNA level by a variety of
techniques.
[0098] Nucleic acids for diagnosis may be obtained from a subject's
cells, such as from blood, urine, saliva, tissue biopsy or autopsy
material. The genomic DNA may be used directly for detection or may
be amplified enzymatically by using PCR or other amplification
techniques prior to analysis. RNA or cDNA may also be used in
similar fashion. Deletions and insertions can be detected by a
change in size of the amplified product in comparison to the normal
genotype. Point mutations can be identified by hybridizing
amplified DNA to labeled receptor nucleotide sequences. Perfectly
matched sequences can be distinguished from mismatched duplexes by
RNase digestion or by differences in melting temperatures. DNA
sequence differences may also be detected by alterations in
electrophoretic mobility of DNA fragments in gels, with or without
denaturing agents, or by direct DNA sequencing. (See, e.g., Myers
et al., Science (1985) 230:1242.) Sequence changes at specific
locations may also be revealed by nuclease protection assays, such
as RNase and S1 protection or the chemical cleavage method. (See
Cotton et al., Proc Natl Acad Sci USA (1988) 85:4397-4401.) In
another embodiment, an array of oligonucleotide probes comprising
receptor nucleotide sequence or fragments thereof can be
constructed to conduct efficient screening of, e.g., genetic
mutations. Array technology methods are well known and have general
applicability and can be used to address a variety of questions in
molecular genetics including gene expression, genetic linkage, and
genetic variability. (See for example: M. Chee et al., Science, Vol
274, pp 610-613 (1996).)
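The size-based detection of deletions and insertions described above can be sketched in Python as a comparison of an observed amplicon length with the length expected from the normal genotype. The function name, the tolerance parameter and the example lengths are hypothetical; a point mutation leaves the size unchanged and would need one of the other methods mentioned.

```python
def classify_amplicon(observed_bp: int, reference_bp: int, tolerance_bp: int = 0) -> str:
    """Compare an amplified product's size with the size from the normal genotype."""
    if observed_bp <= 0 or reference_bp <= 0:
        raise ValueError("fragment sizes must be positive")
    difference = observed_bp - reference_bp
    # tolerance_bp stands in for the sizing resolution of the gel or instrument.
    if abs(difference) <= tolerance_bp:
        return "unchanged"
    return "insertion" if difference > 0 else "deletion"


# Hypothetical example: a 500 bp reference product and a 471 bp patient product.
print(classify_amplicon(471, 500))
```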
[0099] The diagnostic assays offer a process for diagnosing or
determining a susceptibility to specific diseases through detection
of mutation in the receptor gene by the methods described.
[0100] In addition, specific diseases can be diagnosed by methods
comprising determining from a sample derived from a subject an
abnormally decreased or increased level of receptor polypeptide or
receptor mRNA. Decreased or increased expression can be measured at
the RNA level using any of the methods well known in the art for
the quantitation of polynucleotides, such as, for example, PCR,
RT-PCR, RNase protection, Northern blotting and other hybridization
methods. Assay techniques that can be used to determine levels of a
protein in a sample derived from a host are well-known to those of
skill in the art. Such assay methods include radioimmunoassays,
competitive-binding assays, Western Blot analysis and ELISA
assays.
[0101] Thus in another aspect, the present invention relates to a
diagnostic kit for a disease or susceptibility to a disease which
comprises:
[0102] (a) a receptor polynucleotide, preferably the nucleotide
sequence of SEQ ID NO:X, or a fragment thereof;
[0103] (b) a nucleotide sequence complementary to that of (a);
[0104] (c) a receptor polypeptide, preferably the polypeptide of
SEQ ID NO:Y, or a fragment thereof; or
[0105] (d) an antibody to a receptor polypeptide, preferably to the
polypeptide of SEQ ID NO:Y.
[0106] It will be appreciated that in any such kit, (a), (b), (c)
or (d) may comprise a substantial component.
Chromosome Assays
[0107] The nucleotide sequences of the present invention are also
valuable for chromosome identification. The sequence is
specifically targeted to and can hybridize with a particular
location on an individual human chromosome. The mapping of relevant
sequences to chromosomes according to the present invention is an
important first step in correlating those sequences with gene
associated disease. Once a sequence has been mapped to a precise
chromosomal location, the physical position of the sequence on the
chromosome can be correlated with genetic map data. Such data are
found, for example, in V. McKusick, Mendelian Inheritance in Man
(available on line through Johns Hopkins University Welch Medical
Library). The relationship between genes and diseases that have
been mapped to the same chromosomal region is then identified
through linkage analysis (coinheritance of physically adjacent
genes).
[0108] The differences in the cDNA or genomic sequence between
affected and unaffected individuals can also be determined. If a
mutation is observed in some or all of the affected individuals but
not in any normal individuals, then the mutation is likely to be
the causative agent of the disease.
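The comparison described above reduces to a set difference: variants seen in at least one affected individual and in no unaffected individual. The following is a minimal Python sketch; the variant labels and the function name are hypothetical, and the result is only a list of candidates, consistent with the word "likely" above.

```python
def candidate_mutations(affected, unaffected):
    """Variants found in some affected individual and in no unaffected individual.

    Each argument is a list of sets, one set of variant labels per individual.
    """
    seen_in_affected = set().union(*affected)
    seen_in_unaffected = set().union(*unaffected)
    return seen_in_affected - seen_in_unaffected


# Hypothetical variant calls for two affected and two unaffected individuals.
affected = [{"c.101A>G", "c.250C>T"}, {"c.101A>G"}]
unaffected = [{"c.250C>T"}, set()]
print(candidate_mutations(affected, unaffected))
```

Here only the variant absent from every unaffected individual is retained, while the variant shared with an unaffected individual is excluded.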
Antibodies
[0109] The polypeptides of the invention, their fragments or
analogs, or cells expressing them can also be used as
immunogens to produce antibodies immunospecific for the receptor
polypeptides. The term "immunospecific" means that the antibodies
have substantially greater affinity for the polypeptides of the
invention than their affinity for other related polypeptides in the
prior art.
[0110] Antibodies generated against the receptor polypeptides can
be obtained by administering the polypeptides or epitope-bearing
fragments, analogs or cells to an animal, preferably a nonhuman,
using routine protocols. For preparation of monoclonal antibodies,
any technique which provides antibodies produced by continuous cell
line cultures can be used. Examples include the hybridoma technique
(Kohler, G. and Milstein, C., Nature (1975) 256:495-497), the
trioma technique, the human B-cell hybridoma technique (Kozbor et
al., Immunology Today (1983) 4:72) and the EBV-hybridoma technique
(Cole et al., MONOCLONAL ANTIBODIES AND CANCER THERAPY, pp. 77-96,
Alan R. Liss, Inc., 1985).
[0111] Techniques for the production of single chain antibodies
(U.S. Pat. No. 4,946,778) can also be adapted to produce single
chain antibodies to polypeptides of this invention. Also,
transgenic mice, or other organisms including other mammals, may be
used to express humanized antibodies.
[0112] The above-described antibodies may be employed to isolate or
to identify clones expressing the polypeptide or to purify the
polypeptides by affinity chromatography.
[0113] Antibodies against receptor polypeptides may also be
employed to treat diseases.
Vaccines
[0114] Another aspect of the invention relates to a method for
inducing an immunological response in a mammal which comprises
inoculating the mammal with a receptor polypeptide, or a fragment
thereof, adequate to produce antibody and/or T cell immune response
to protect said animal from a disease. Yet another aspect of the
invention relates to a method of inducing an immunological response
in a mammal which comprises delivering a receptor polypeptide via a
vector directing expression of the receptor polynucleotide in vivo
in order to induce such an immunological response to produce
antibody to protect said animal from diseases.
[0115] A further aspect of the invention relates to an
immunological/vaccine formulation (composition) which, when
introduced into a mammalian host, induces an immunological response
in that mammal to a receptor polypeptide wherein the composition
comprises a receptor polypeptide or receptor gene. The vaccine
formulation may further comprise a suitable carrier. Since a
receptor polypeptide may be broken down in the stomach, it is
preferably administered parenterally (including subcutaneous,
intramuscular, intravenous, intradermal etc. injection).
Formulations suitable for parenteral administration include aqueous
and non-aqueous sterile injection solutions which may contain
anti-oxidants, buffers, bacteriostats and solutes which render the
formulation isotonic with the blood of the recipient; and aqueous
and non-aqueous sterile suspensions which may include suspending
agents or thickening agents. The formulations may be presented in
unit-dose or multi-dose containers, for example, sealed ampoules
and vials and may be stored in a freeze-dried condition requiring
only the addition of the sterile liquid carrier immediately prior
to use. The vaccine formulation may also include adjuvant systems
for enhancing the immunogenicity of the formulation, such as
oil-in-water systems and other systems known in the art. The dosage will
depend on the specific activity of the vaccine and can be readily
determined by routine experimentation.
Screening Assays
[0116] The receptor polypeptide of the present invention may be
employed in a screening process for compounds which bind the
receptor and which activate (agonists) or inhibit activation of
(antagonists) the receptor polypeptide of the present invention.
Thus, polypeptides of the invention may also be used to assess the
binding of small molecule substrates and ligands in, for example,
cells, cell-free preparations, chemical libraries, and natural
product mixtures. These substrates and ligands may be natural
substrates and ligands or may be structural or functional mimetics.
See Coligan et al., Current Protocols in Immunology 1(2):Chapter 5
(1991).
[0117] The receptor polypeptides are responsible for many
biological functions, including many pathologies. Accordingly, it
is desirable to find compounds and drugs which stimulate the
receptor on the one hand and which can inhibit the function of the
receptor on the other hand. In general, agonists are employed for
therapeutic and prophylactic purposes for such conditions and
diseases. Antagonists may be employed for a variety of therapeutic
and prophylactic purposes for such conditions and diseases.
[0118] In general, such screening procedures involve producing
appropriate cells which express the receptor polypeptide of the
present invention on the surface thereof. Such cells include cells
from mammals, yeast, Drosophila or E. coli. Cells expressing the
receptor (or cell membrane containing the expressed receptor) are
then contacted with a test compound to observe binding, or
stimulation or inhibition of a functional response.
[0119] The assays may simply test binding of a candidate compound
wherein adherence to the cells bearing the receptor is detected by
means of a label directly or indirectly associated with the
candidate compound or in an assay involving competition with a
labeled competitor. Further, these assays may test whether the
candidate compound results in a signal generated by activation of
the receptor, using detection systems appropriate to the cells
bearing the receptor at their surfaces. Inhibitors of activation
are generally assayed in the presence of a known agonist, and the
effect of the candidate compound on activation by the agonist is
observed.
[0120] Further, the assays may simply comprise the steps of mixing
a candidate compound with a solution containing a receptor
polypeptide to form a mixture, measuring receptor activity in the
mixture, and comparing the receptor activity of the mixture to a
standard.
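The comparison step of such an assay can be sketched in Python as a ratio of the activity measured in the mixture to the activity of the standard. The 20% threshold, the labels and the function name are assumptions made for illustration only; the text does not specify how large a difference is meaningful.

```python
def classify_compound(mixture_activity: float, standard_activity: float,
                      threshold: float = 0.2) -> str:
    """Label a candidate compound by receptor activity relative to a standard."""
    if standard_activity <= 0:
        raise ValueError("standard_activity must be positive")
    ratio = mixture_activity / standard_activity
    # The threshold is a hypothetical cut-off for a meaningful change.
    if ratio >= 1 + threshold:
        return "candidate agonist"
    if ratio <= 1 - threshold:
        return "candidate antagonist"
    return "no clear effect"


# Hypothetical readings in arbitrary activity units.
print(classify_compound(150.0, 100.0))
print(classify_compound(50.0, 100.0))
```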
[0121] The receptor cDNA, protein and antibodies to the protein may
also be used to configure assays for detecting the effect of added
compounds on the production of receptor mRNA and protein in cells.
For example, an ELISA may be constructed for measuring secreted or
cell associated levels of receptor protein using monoclonal and
polyclonal antibodies by standard methods known in the art, and
this can be used to discover agents which may inhibit or enhance
the production of the receptor (also called antagonist or agonist,
respectively) from suitably manipulated cells or tissues. Standard
methods for conducting screening assays are well understood in the
art.
[0122] Examples of potential receptor antagonists include
antibodies or, in some cases, oligonucleotides or proteins which
are closely related to the ligand of the receptor, e.g., a fragment
of the ligand, or small molecules which bind to the receptor but do
not elicit a response, so that the activity of the receptor is
prevented.
[0123] Thus in another aspect, the present invention relates to a
screening kit for identifying agonists, antagonists, ligands,
receptors, substrates, enzymes, etc. for receptor polypeptides; or
compounds which decrease or enhance the production of receptor,
which comprises:
[0124] (a) a receptor polypeptide, preferably that of SEQ ID
NO:Y;
[0125] (b) a recombinant cell expressing a receptor polypeptide,
preferably that of SEQ ID NO:Y;
[0126] (c) a cell membrane expressing a receptor polypeptide,
preferably that of SEQ ID NO:Y; or
[0127] (d) antibody to a receptor polypeptide, preferably that of
SEQ ID NO:Y.
[0128] It will be appreciated that in any such kit, (a), (b), (c)
or (d) may comprise a substantial component.
Prophylactic and Therapeutic Methods
[0129] This invention provides methods of treating abnormal
conditions related to both an excess of and insufficient amounts of
receptor activity.
[0130] If the activity of the receptor is in excess, several
approaches are available. One approach comprises administering to a
subject an inhibitor compound (antagonist) as described along with
a pharmaceutically acceptable carrier in an amount effective to
inhibit activation by blocking the binding of ligands to the
receptor or by inhibiting a second signal, and thereby alleviating
the abnormal condition.
[0131] In another approach, soluble forms of the receptor
polypeptides still capable of binding the ligand in competition
with endogenous receptor may be administered. Typical embodiments
of such competitors comprise fragments of the receptor
polypeptide.
[0132] In still another approach, expression of the gene encoding
endogenous receptor can be inhibited using expression blocking
techniques. Such known techniques involve the use of antisense
sequences, either internally generated or separately administered.
(See, for example, O'Connor, J Neurochem (1991) 56:560 in
Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression,
CRC Press, Boca Raton, Fla. (1988).) Alternatively,
oligonucleotides which form triple helices with the gene can be
supplied. (See, for example, Lee et al., Nucleic Acids Res (1979)
6:3073; Cooney et al., Science (1988) 241:456; Dervan et al.,
Science (1991) 251:1360.) These oligomers can be administered per
se or the relevant oligomers can be expressed in vivo.
[0133] For treating abnormal conditions related to an
under-expression of the receptor and its activity, several
approaches are also available. One approach comprises administering
to a subject a therapeutically effective amount of a compound which
activates the receptor, i.e., an agonist as described above, in
combination with a pharmaceutically acceptable carrier, to thereby
alleviate the abnormal condition. Alternatively, gene therapy may
be employed to effect the endogenous production of the receptor by
the relevant cells in the subject. For example, a polynucleotide of
the invention may be engineered for expression in a replication
defective retroviral vector, as discussed above. The retroviral
expression construct may then be isolated and introduced into a
packaging cell transduced with a retroviral plasmid vector
containing RNA encoding a polypeptide of the present invention such
that the packaging cell now produces infectious viral particles
containing the gene of interest. These producer cells may be
administered to a subject for engineering cells in vivo and
expression of the polypeptide in vivo. For an overview of gene
therapy, see Chapter 20, Gene Therapy and other Molecular
Genetic-based Therapeutic Approaches, (and references cited
therein) in Human Molecular Genetics, T Strachan and A P Read, BIOS
Scientific Publishers Ltd (1996).
Formulation and Administration
[0134] Peptides, such as the soluble form of receptor polypeptides,
and agonist and antagonist peptides or small molecules, may be
formulated in combination with a suitable pharmaceutical carrier.
Such formulations comprise a therapeutically effective amount of
the polypeptide or compound, and a pharmaceutically acceptable
carrier or excipient. Such carriers include, but are not limited to,
saline, buffered saline, dextrose, water, glycerol, ethanol, and
combinations thereof. Formulation should suit the mode of
administration, and is well within the skill of the art. The
invention further relates to pharmaceutical packs and kits
comprising one or more containers filled with one or more of the
ingredients of the aforementioned compositions of the
invention.
[0135] Polypeptides and other compounds of the present invention
may be employed alone or in conjunction with other compounds, such
as therapeutic compounds. Preferred forms of systemic
administration of the pharmaceutical compositions include
injection, typically by intravenous injection. Other injection
routes, such as subcutaneous, intramuscular, or intraperitoneal,
can be used. Alternative means for systemic administration include
transmucosal and transdermal administration using penetrants such
as bile salts or fusidic acids or other detergents. In addition, if
properly formulated in enteric or encapsulated formulations, oral
administration may also be possible. Administration of these
compounds may also be topical and/or localized, in the form of
salves, pastes, gels and the like.
[0136] The dosage range required depends on the choice of peptide,
the route of administration, the nature of the formulation, the
nature of the subject's condition, and the judgment of the
attending practitioner. Suitable dosages, however, are in the range
of 0.1-100 μg/kg of subject. Wide variations in the needed
dosage, however, are to be expected in view of the variety of
compounds available and the differing efficiencies of various
routes of administration. For example, oral administration would be
expected to require higher dosages than administration by
intravenous injection. Variations in these dosage levels can be
adjusted using standard empirical routines for optimization, as is
well understood in the art.
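The per-kilogram range stated above converts to a total amount by multiplying by the subject's body weight. The following Python sketch only illustrates that arithmetic and is not part of the original disclosure. The function name `dose_range_micrograms`, its parameter names and the 70 kg example weight are assumptions; the 0.1 and 100 μg/kg bounds are the values given in the text.

```python
def dose_range_micrograms(weight_kg, low_per_kg=0.1, high_per_kg=100.0):
    """Return (low, high) total amounts in micrograms for a body weight.

    The default bounds are the 0.1-100 microgram/kg range stated in the
    text; the weight is supplied by the caller.
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    return (low_per_kg * weight_kg, high_per_kg * weight_kg)


# Hypothetical example: a 70 kg subject.
low, high = dose_range_micrograms(70.0)
print(f"{low:.1f} to {high:.1f} micrograms")
```

As the paragraph notes, the amount actually needed would be expected to vary with the compound and the route of administration, so the computed bounds are a starting range rather than a fixed value.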
[0137] Polypeptides used in treatment can also be generated
endogenously in the subject, in treatment modalities often referred
to as "gene therapy" as described above. Thus, for example, cells
from a subject may be engineered with a polynucleotide, such as a
DNA or RNA, to encode a polypeptide ex vivo, for example by
the use of a retroviral plasmid vector. The cells are then
introduced into the subject.
[0138] All publications, including but not limited to patents and
patent applications, cited in this specification are herein
incorporated by reference as if each individual publication were
specifically and individually indicated to be incorporated by
reference herein as though fully set forth.
Sequence CWU 1
1
6311497DNAHomo sapiens 1gcagttcctg agagaagaac cctgaggaac agacgttccc
tcgcggccct ggcacctcca 60accccagata tgctgctgct gctgctgctg cccctgctct
gggggaggga gagggtggaa 120ggacagaaga gtaaccggaa ggattactcg
ctgacgatgc agagttccgt gaccgtgcaa 180gagggcatgt gtgtccatgt
gcgctgctcc ttctcctacc cagtggacag ccagactgac 240tctgacccag
ttcatggcta ctggttccgg gcagggaatg atataagctg gaaggctcca
300gtggccacaa acaacccagc ttgggcagtg caggaggaaa ctcgggaccg
attccacctc 360cttggggacc cacagaccaa aaattgcacc ctgagcatca
gagatgccag aatgagtgat 420gcggggagat acttctttcg tatggagaaa
ggaaatataa aatggaatta taaatatgac 480cagctctctg tgaacgtgac
ataccctcct cagaacttga ctgtgactgt cttccaagga 540gaaggcacag
catccacagc tctggggaac agctcatctc tttcagtcct agagggccag
600tctctgcgct tggtctgtgc tgttgacagc aatccccctg ccaggctgag
ctggacctgg 660aggagtctga ccctgtaccc ctcacagccc tcaaaccctc
tggtactgga gctgcaagtg 720cacctggggg atgaagggga attcacctgt
cgagctcaga actctctggg ttcccagcac 780gtttccctga acctctccct
gcaacaggag tacacaggca aaatgaagcc tgtatcagga 840gtgttgctgg
gggcggtcgg gggaactgga gccacagccc tggtcttcct ctccttctgt
900gtcatcttca ttgtagtgag gtcctgcagg aagaaatcgg caagaccagc
agcggacgtg 960ggagacatag gcatgaagga tgcaaacacc attcaggggc
tcagcctctc agggtaactg 1020gatgagtcct gggcagatga taacccccga
caccatggcc tggctgccca ctccctcagg 1080ggaggaaaga gagatcccag
tatgcacccc tcagctttca taagggggag cctcaggacc 1140tatccaggtc
aagaagccac caacaatgag tactcagaga tcaagatccc caagtaagaa
1200aatgcagagg ctcgggcttg tttgagggtt cacgacccct ccagcaaagg
agtctgaggc 1260tgattccagt agaattagca gccctcaatg ctgtgcaaca
agacatcaga acttattcct 1320cttgtctaac tgaaaatgca tgcctgatga
ccaaactctc cctttcccca tccaatcggt 1380ccacactccc cgccctggcc
tctgtaccca ccattctcct ctgtacttct ctaaggatga 1440ctactttaga
ttccgaatat agtgagattg taacgtgaaa aaaaaaaaaa aaaaaaa
149721849DNAHomo sapiens 2cggcacgagt ggacaaccat cagggagcca
ggacacagag gggcagagca agtcagcatt 60ggcgcccctt cctcagatcc ctatcatctt
gggaaacagt agcccagagg ttcaggaaga 120tgttaactta aatgttcggg
gtgccccagt ctgttcagca tggctgaaat ccacactccg 180tattcttcct
tgaagaaact gttatcttta ctcaatggct tcgtggctgt gtctggcatc
240atcctagttg gcctgggcat tggtggtaaa tgtggagggg cctctctgac
gaatgtcctc 300gggctgtcct ccgcatacct ccttcacgtt ggcaacctgt
gcctggtgat gggatgcatc 360acggtactgc ttggctgtgc cgggtggtat
ggagcgacta aagagagcag aggcacgctc 420ttgttttgca tcctgtcaat
ggttattgtc ctcatcatgg aagttacagc tgccacagtg 480gtccttcttt
tctttccaat tgttggagat gtggccttgg aacacacctt cgtgaccctg
540aggaagaatt acagaggtta caacgagcca gacgactatt ctacacagtg
gaacttggtc 600atggagaagc taaagtgctg tggggtgaat aactacacag
atttttctgg ctcttccttc 660gaaatgacaa cgggccacac ctaccccagg
agttgctgta aatccatcgg aagtgtgtcc 720tgtgacggac gcgatgtgtc
tccaaacgtc atccaccaga agggctgttt ccataaactc 780ctaaaaatca
ccaagactca gagcttcacc ctgagtggga gctctctggg agctgcagtg
840atacagaggt gggggtctcg ctatgttgcg caggctggtc ttgaactgct
ggcctaaagc 900gatccccccg cctaggcctc ccaaagtgct gggtttacca
gcgtgagcca ccacgctggg 960cttcctgcat ccttttaagg ttcctgaggg
tctgcctgag aggagctgtc cctgaatctc 1020catgcagccc cacctgccac
atcaccaaga catacaatct ttgccagcaa cacttcctcc 1080ttgcagatta
caagcatagc taatgccacc accagacaag accgattcgc tggcctccat
1140ttcttcaacc cagtgcctgt catgaaactt gtggaggtca ttaaaacacc
aatgaccagc 1200cagaagacat ttgaatcttt ggtagacttt agcaaaaccc
taggaaagca tcctgtttct 1260tgcaaggaca ctcctgggtt tattgtgaac
cgcctcctgg ttccatacct catggaagca 1320atcaggctgt atgaacgagg
gcctcctggc tttccctgtg ggcttctgag aaaggtttct 1380ggaactccca
ccacccccac tacagtccca gccagagcaa ttgcatggcc ggcccagatt
1440gatatcctgg atctctgctt ttgattaaaa ggtgacgcat ccaaagaaga
cattgacact 1500gctatgaaat taggagccgg ttaccccatg ggcccatttg
agcttctaga ttatgtcgga 1560ctggatacta cgaagttcat cgtggatggg
tggcatgaaa tggatgcaga gaacccatta 1620catcagccca gcccatcctt
aaataagctg gtagcagaga acaagttcgg caagaagact 1680ggagaaggat
tttacaaata caagtgatgt gcagcttctc cggttctgag aagaacacct
1740gagagcgctt tccagccagt gccccgagtg cctgtgggaa tgctctttgg
tcagacattc 1800cctcacacag tacagtttaa taaatgtgca ttttgattgt
aaaaaaaaa 18493741DNAHomo sapiens 3atggctgaaa tccacactcc gtattcttcc
ttgaagaaac tgttatcttt actcaatggc 60ttcgtggctg tgtctggcat catcctagtt
ggcctgggca ttggtggtaa atgtggaggg 120gcctctctga cgaatgtcct
cgggctgtcc tccgcatacc tccttcacgt tggcaacctg 180tgcctggtga
tgggatgcat cacggtactg cttggctgtg ccgggtggta tggagcgact
240aaagagagca gaggcacgct cttgttttgc atcctgtcaa tggttattgt
cctcatcatg 300gaagttacag ctgccacagt ggtccttctt ttctttccaa
ttgttggaga tgtggccttg 360gaacacacct tcgtgaccct gaggaagaat
tacagaggtt acaacgagcc agacgactat 420tctacacagt ggaacttggt
catggagaag ctaaagtgct gtggggtgaa taactacaca 480gatttttctg
gctcttcctt cgaaatgaca acgggccaca cctaccccag gagttgctgt
540aaatccatcg gaagtgtgtc ctgtgacgga cgcgatgtgt ctccaaacgt
catccaccag 600aagggctgtt tccataaact cctaaaaatc accaagactc
agagcttcac cctgagtggg 660agctctctgg gagctgcagt gatacagagg
tgggggtctc gctatgttgc gcaggctggt 720cttgaactgc tggcctaaag c
74141288DNAHomo sapiens 4ggcgtccctc tgcctgccca ctcagtggca
acacccggga gctgttttgt cctttgtgga 60gcctcagcag ttccctcttt cagaactcac
tgccaagagc cctgaacagg agccaccatg 120cagtgcttca gcttcattaa
gaccatgatg atcctcttca atttgctcat ctttctgtgt 180ggtgcagccc
tgttggcagt gggcatctgg gtgtcaatcg atggggcatc ctttctgaag
240atcttcgggc cactgtcgtc cagtgccatg cagtttgtca acgtgggcta
cttcctcatc 300gcagccggcg ttgtggtctt tgctcttggt ttcctgggct
gctatggtgc taagactgag 360agcaagtgtg ccctcgtgac gttcttcttc
atcctcctcc tcatcttcat tgctgaggtt 420gcagctgctg tggtcgcctt
ggtgtacacc acaatggctg agcacttcct gacgttgctg 480gtagtgcctg
ccatcaagaa agattatggt tcccaggaag acttcactca agtgtggaac
540accaccatga aagggctcaa gtgctgtggc ttcaccaact atacggattt
tgaggactca 600ccctacttca aagagaacag tgcctttccc ccattctgtt
gcaatgacaa cgtcaccaac 660acagccaatg aaacctgcac caagcaaaag
gctcacgacc aaaaagtaga gggttgcttc 720aatcagcttt tgtatgacat
ccgaactaat gcagtcaccg tgggtggtgt ggcagctgga 780attgggggcc
tcgagctggc tgccatgaat tgtgtccatg tatctgtact gcaatctaca
840ataagtccac ttctgcctct gccactactg ctgccacatg ggaactgtga
agaggcaccc 900tggcaagcag cagtgattgg gggaggggac aggatctaac
aatgtcactt gggccagaat 960ggacctgccc tttctgctcc agacttgggg
ctagataggg accactcctt ttaggcgatg 1020cctgactttc cttccattgg
tgggtggatg ggtggggggc attccagagc ctctaaggta 1080gccagttctg
ttgcccattc ccccagtcta ttaaaccctt gatatgcccc ctaggcctag
1140tggtgatccc agtgctctac tgggggatga gagaaaggca ttttatagcc
tgggcataag 1200tgaaatcagc agagcctctg ggtggatgtg tagaaggcac
ttcaaaatgc ataaacctgt 1260tacaatgtta aaaaaaaaaa aaaaaaaa
128851236DNAHomo sapiens 5aaaaaaaaca aggtccccac agcaaagaaa
aggaatagga tcaagagata cgtggctgct 60ggcagagcaa gcatgaattc gatgacttca
gcagttccgg tggccaattc tgtgttggtg 120gtggcacccc acaatggtta
tcctgtgacc ccaggaatta tgtctcacgt gcccctgtat 180ccaaacagcc
agccgcaagt ccacctagtt cctgggaacc cacctagttt ggtgtcgaat
240gtgaatgggc agcctgtgca gaaagctctg aaagaaggca aaaccttggg
ggccatccag 300atcatcattg gcctggctca catcggcctc ggctccatca
tggcgacggt tctcgtaggg 360gaatacctgt ctatttcatt ctacggaggc
tttcccttct ggggaggctt gtggtttatc 420atttcaggat ctctctccgt
ggcagcagaa aatcagccat attcttattg cctgctgtct 480ggcagtttgg
gcttgaacat cgtcagtgca atctgctctg cagttggagt catactcttc
540atcacagatc taagtattcc ccacccatat gcctaccccg actattatcc
ttacgcctgg 600ggtgtgaacc ctggaatggc gatttctggc gtgctgctgg
tcttctgcct cctggagttt 660ggcatcgcat gcgcatcttc ccactttggc
tgccagttgg tctgctgtca atcaagcaat 720gtgagtgtca tctatccaaa
catctatgca gcaaacccag tgatcacccc agaaccggtg 780acctcaccac
caagttattc cagtgagatc caagcaaata agtaaggcta cagattctgg
840aagcatcttt cactgggacc aaaagaagtc ctcctccctt tctgggcttc
cataacccag 900gtcgttcctg ttctgacagc tgaggaaacg tctctcccac
tgtttgtact ctcaccttca 960ttcttcaatt cagtctagga aaccatgctg
tttctctatc aagaagaaga cagagatttt 1020aaacagatgt taaccaagag
ggactcccta gggcacatgc atcagcacat atgtgggcat 1080ccagcctctg
gggccttggc acacccattc gtgtgctctg ctgcatgtga gcttgtgggt
1140tagaggaaca aatatctaga cattcaatct tcactctttc aattgtgcat
tcatttaata 1200aatagatact gagcattcaa aaaaaaaaaa aaaaaa
123661115DNAHomo sapiens 6cacgagcagg gtctcgggct agtcatggcg
tccccgtctc ggagactgca gactaaacca 60gtcattactt gtttcaagag cgttctgcta
atctacactt ttattttctg gatcactggc 120gttatccttc ttgcagttgg
catttggggc aaggtgagcc tggagaatta cttttctctt 180ttaaatgaga
aggccaccaa tgtccccttc gtgctcattg ctactggtac cgtcattatt
240cttttgggca cctttggttg ttttgctacc tgccgagctt ctgcatggat
gctaaaactg 300tatgcaatgt ttctgactct cgtttttttg gtcgaactgg
tcgctgccat cgtaggattt 360gttttcagac atgagattaa gaacagcttt
aagaataatt atgagaaggc tttgaagcag 420tataactcta caggagatta
tagaagccat gcagtagaca agatccaaaa tacgttgcat 480tgttgtggtg
tcaccgatta tagagattgg acagatacta attattactc agaaaaagga
540tttcctaaga gttgctgtaa acttgaagat tgtactccac agagagatgc
agacaaagta 600aacaatgaag gttgttttat aaaggtgatg accattatag
agtcagaaat gggagtcgtt 660gcaggaattt cctttggagt tgcttgcttc
caactgattg gaatctttct cgcctactgc 720ctctctcgtg ccataacaaa
taaccagtat gagatagtgt aacccaatgt atctgtgggc 780ctattcctct
ctacctttaa ggacatttag ggtcccccct gtgaattaga aagttgcttg
840gctggagaac tgacaacact acttactgat agaccaaaaa actacaccag
taggttgatt 900caatcaagat gtatgtagac ctaaaactac accaataggc
tgattcaatc aagatccgtg 960ctcgcagtgg gctgattcaa tcaagatgta
tgtttgctat gttctaagtc caccttctat 1020cccattcatg ttagatcgtt
gaaaccctgt atccctctga aacactggaa gagctagtaa 1080attgtaaatg
aagtaaaaaa aaaaaaaaaa aaaaa 111571662DNAHomo sapiens 7cacgagcatt
gccgctctct cggtgagcgc agccccgctc tccgggccgg gccttcgcgg 60gccaccggcg
ccatgggcca gtgcggcatc acctcctcca agaccgtgct ggtctttctc
120aacctcatct tctggggggc agctggcatt ttatgctatg tgggagccta
tgtcttcatc 180acttatgatg actatgacca cttctttgaa gatgtgtaca
cgctcatccc tgctgtagtg 240atcatagctg taggagccct gcttttcatc
attgggctaa ttggctgctg tgccacaatc 300cgggaaagtc gctgtggact
tgccacgttt gtcatcatcc tgctcttggt ttttgtcaca 360gaagttgttg
tagtggtttt gggatatgtt tacagagcaa aggtggaaaa tgaggttgat
420cgcagcattc agaaagtgta taagacctac aatggaacca accctgatgc
tgctagccgg 480gctattgatt atgtacagag acagctgcat tgttgtggaa
ttcacaacta ctcagactgg 540gaaaatacag attggttcaa agaaaccaaa
aaccagagtg tccctcttag ctgctgcaga 600gagactgcca gcaattgtaa
tggcagcctg gcccaccctt ccgacctcta tgctgagggg 660tgtgaggctc
tagttgtgaa gaagctacaa gaaatcatga tgcatgtgat ctgggccgca
720ctggcatttg cagctattca gctgctgggc atgctgtgtg cttgcatcgt
gttgtgcaga 780aggagtagag atcctgctta cgagctcttc atcactggcg
gaacctatgc atagttgaca 840atctcaagcc tgagcttttt ggtcttgttc
tgatttggaa ggtgaattga gcaggtctgc 900tgctgttggc ctctggagtt
catttagtta aagcacatgt acactggtgt tggacagagc 960agcttggctt
ttcatgtgcc cacctactta cctactacct gcgactttct ttttccttgt
1020tctagctgac tcttcatgcc cctaagattt taagtacgat ggtgaacgtt
ctaatttcag 1080aaccaattgc gagtcatgta gtgtggtaga attaaaggag
gacacgagcc tgcttctgtt 1140acctccaagt ggtaacagga ctgatgccga
aatgtcacca ggtcctttca gtcttcacag 1200tggagaactc ttggccaaag
gtttttgggg ggaggaggag gaaaccagct ttctggttaa 1260ggttaacacc
agatggtgcc cctcattggt gtccttttaa aaaatattta ctgtagtcca
1320ataagatagc agctgtacaa aatgactaaa atagattgta ggatcatatg
gcgtatatct 1380tggttcatct tcaaaatcag agactgagct ttgaaactag
tggtttttaa tcaaagttgg 1440ctttatagga ggagtataat gtatgcacta
ctgttttaaa agaattagtg tgagtgtgtt 1500tttgtatgaa tgagcccatt
catggtaagt cttaagcttg ttggaaataa tgtacccatg 1560tagactagca
aaatagtatg tagatgtgat ctcagttgta aatagaaaaa tctaattcaa
1620taaactctgt atcagccccc aacaaaaaaa aaaaaaaaaa aa 166281345DNAHomo
sapiens 8cacgagcgca gagcttgggg cttccttggt cgcacccacc acctgcctgc
ccactggtca 60gccttcaggg accctgagca ccgcctggtc tctttcctgt ggccagccca
gaactgaagc 120gctgcggcat ggcgcgcgcc tgcctccagg ccgtcaagta
cctcatgttc gccttcaacc 180tgttcttctg gctgggaggc tgtggcgtgc
tgggtgtcgg catctggctg gccgccacac 240aggggagctt cgccacgctg
tcttcttcct tcccgtccct gtcggctgcc aacctgctca 300tcatcaccgg
cgcctttgtc atggccatcg gcttcgtggg ctgcctgggt gccatcaagg
360agaacaagtg cctcctgctc actttcttcc tgctgctgct gctggtgttc
ctgctggagg 420ccaccatcgc catcctcttc ttcgcctaca cggacaagat
tgacaggtat gcccagcaag 480acctgaagaa aggcttgcac ctgtacggca
cgcagggcaa cgtgggcctc accaacgcct 540ggagcatcat ccagaccgac
ttccgctgct gtggcgtctc caactacact gactggttcg 600aggtgtacaa
cgccacgcgg gtacctgact cctgctgctt ggagttcagt gagagctgtg
660ggctgcacgc ccccggcacc tggtggaagg cgccgtgcta cgagacggtg
aaggtgtggc 720ttcaggagaa cctgctggct gtgggcatct ttgggctgtg
cacggcgctg gtgcagatcc 780tgggcctgac cttcgccatg accatgtact
gccaagtggt caaggcagac acctactgcg 840cgtaggccgc ccaccgccgg
cttctctgcc aaaaggacgc ccacggggag atggccgcac 900ccacagctgc
ttttcccacc accagcttcg gtgttctgcc ccatgctggg aggagggagg
960gagggacagg tgcctggagc ccccggaacc ctgtttctgg aaggccctag
ctcaggtggc 1020ttcagggcct ccggaccccc cctgggaggg gtggccacgt
gctggctgcg gaacccaggg 1080caggggtggg aggggcctcc agcacttttt
atatttacgt attctccaaa gcagtgttca 1140cacgggagcc agcctgtggc
ccccagcttc ctggaaaaca ggttggcgct ggaggagccg 1200ggtcttggca
tcctggaggt ggccccactg gtcctggtgc tccaggcggg gccgtggacc
1260cctcacctac attccatagt gggcccgtgg ggctcctggt gcatcttaat
aaagtgtgag 1320cagcaaaaaa aaaaaaaaaa aaaaa 13459734DNAHomo sapiens
9gcgccgccgg gccgcagcat ggggcgcttc cgcgggggcc tgcggtgcat caagtacctg
60ctgcttggct tcaacctgct cttctggctg gctggatcgg ccgtcattgc ttttggacta
120tggtttcggt tcggaggtgc cataaaggag ttatcatcag aggacaagtc
cccagagtat 180ttctatgtgg ggctgtatgt tctggttgga gccggggccc
tgatgatggc cgtggggttc 240ttcggatgct gcggagccat gcgggagtcg
caatgtgtgc ttggatcatt ttttacctgc 300ctcctggtga tatttgctgc
tgaagtaacc actggagtat ttgcttttat aggcaagggg 360gtagctatcc
gacatgttca gaccatgtat gaagaggctt acaatgatta ccttaaagac
420aggggaaaag gcaatgggac actcatcacc ttccactcaa catttcagtg
ctgtggaaaa 480gaaagctccg aacaggtcca acctacatgc ccaaaggagc
ttctaggaca caagaattgc 540atcgatgaaa ttgagaccat aatcagtgtt
aagctccagc tcattggaat tgtcggtatt 600ggaattgcag gtctgacgat
ctttggcatg atattcagca tggtcctctg ctgtgcgata 660cgaaactcac
gagatgtgat atgaagctac ttctacatga aaattgcaat ctaaagcttt
720cataccaaat gttc 73410577DNAHomo sapiens 10agtgtttatg ggactaaaaa
acttttaaca cctttttagg ggaaatattt tggtcctata 60caaaacatgt aaatatgctt
tattactttc attttctgac cctgctgtaa actactgcaa 120ccctcacatc
cctcaaaggg acttttatgt caaactcttc tgtttctcca aatataagga
180aaaaagacta aagcaagaga tctggcagtt gaaaattgtg ggaaagagaa
tttgtatggg 240cactgtatct atgaaatacc tcatacttac gtttacatgt
tttcctaact ttttgtattt 300ttcttgtata gccacctaga gaattcttca
tagattaaga actacagttt tcaccactta 360acataagtaa aacaaagtcc
ttcataattt aaccattagc atctttggcc aaaccaaaat 420aaagaaaagc
atcttctcct agttgtgtgt gggcaacaga aacaagttaa ggaaacaaaa
480atacttatat atacacagaa caaaaataat gttcttttta tgcaaatccc
ctgtgaaaat 540aaaattttca atgtttaaaa aaaaaaaaaa aaaaaaa
57711936DNAHomo sapiens 11ttcggcacga gctgcgggcg gtgggcggct
gggcggcccc gggagccgcg ctctcagtct 60ctctaggcgc agtcccttcg ccgcttccgg
agcccctggc agggcccaga agccatggcc 120cactataaga ctgagcagga
cgactggctg atcatctact tgaagtattt actctttgtc 180ttcaacttct
tcttctgggt cgggggagca gccgtcctgg ctgtgggcat ctggaccctg
240gtggagaaga gtggctacct cagcgtcctg gcctccagca cctttgccgc
ctccgcctac 300atcctcatct ttgcgggcgt acttgtcatg gtgaccggct
tcctgggctt cggtgccatc 360ctctgggagc ggaagggctg cctctccacg
tatttctgcc tgttgctcgt catcttcctg 420gttgagctgg tggcgggagt
cctggcccat gtgtattacc agaggctgag tgatgaactg 480aagcagcact
tgaaccggac tctggctgag aactacgggc agccggagca cgcagatcac
540gcctcagtgg accgactcca gcaggatttc aagtgctgcg gaagcaacag
ctcagccgac 600tggcagcaca gcacgtacat cctgttgcgg gaggccgagg
gccgccaggt gcccgacagc 660tgctgcaaga cagtggtggc gcgctgcggc
cagcgggccc acccctccaa catctataag 720gtggagggag gctgcctcac
caagctggag cagttcctgg ccgaccacct gctgcttatg 780ggggcagtgg
gcatcggggt ggcctgcctg cagatctgcg ggatggttct cacctgctgc
840ttgcaccaga ggctccagcg gcatttttac taatggcaac cacctcctct
tccaactgcc 900cctcaagaca acatgtggca catgccatct gcaagg
93612738DNAHomo sapiens 12agcttacttt cactcaccgc ctgtccttcc
tgacacctca ccatgtgtac gggaaaatgt 60gcccgctgtg tggggctctc cctcattacc
ctctgcctcg tctgcattgt ggccaacgcc 120ctcctgctgg tacctaatgg
ggagacctcc tggaccaaca ccaaccatct cagcttgcaa 180gtctggctca
tgggcggctt cattggcggg ggcctaatgg tactgtgtcc agggattgca
240gccgttcggg cagggggcaa gggctgctgt ggtgctgggt gctgtggaaa
ccgctgcagg 300atgctgcgct cggtcttctc ctcggcgttc ggggtgcttg
gtgccatcta ctgcctctcg 360gtgtctggag ctgggctccg aaatggaccc
agatgcttaa tgaacggcga gtggggctac 420cacttcgaag acaccgcggg
agcttacttg ctcaaccgca ctctatggga tcggtgcgag 480gcgccccctc
gcgtggtccc ctggaatgtg acgctcttct cgctgctggt ggccgcctcc
540tgcctggaga tagtactgtg tgggatccag ctggtgaacg cgaccattgg
tgtcttctgc 600ggcgattgca ggaaaaaaca ggacacacct cactgaggct
ccactgaccg ccgggttaca 660cctgctcctt cctggacgct cactcccttg
ctcgctagaa taaactgctt tgcgctctca 720aaaaaaaaaa aaaaaaac
738131071DNAHomo sapiens 13ggcacgagag attgtcggct gcgggtatat
tccaattccc cgtctcctca tgaatatgaa 60gtgaagggct ctgaccctgg aagtggttct
aagcagggca aaatggggtc tcggaagtgt 120ggaggctgcc taagttgttt
gctgattccg cttgcacttt ggagtataat cgtgaacata 180ttattgtatt
tcccgaatgg gcaaacttcc tatgcatcca gcaataaact caccaactac
240gtgtggtatt ttgaaggaat ctgtttctca ggcatcatga tgcttatagt
aacaacagtt 300cttctggtac tggagaataa taacaactat aaatgttgcc
agagtgaaaa ctgcagcaaa 360aaatatgtga cactgctgtc aattatcttt
tcttccctcg gaattgcttt ttctggatac 420tgcctggtca tctctgcctt
gggtcttgtc caagggccat attgccgcac ccttgatggc 480tgggagtatg
cttttgaagg cactgctgga cgtttcctta cagattctag catatggatt
540cagtgcctgg aacctgcaca tgttgtggag tggaacatca ttttattttc
cattctcata 600accctcagtg ggcttcaagt gatcatctgc ctcatcagag
tagtcatgca actatccaag 660atactgtgtg gaagctattc agtgatcttc
cagcctggaa tcatttgaat aaggacaaaa 720tgttttccat tatcaagaca
tggccatcta tctaaatatt
atatcaactg tgttagactt 780gagggcaata ttgaaaatga tggtgctttc
tgcatttggt gtttatttgt aaaaaatttg 840cagtcctcac tgcacatgca
agtataccac ccttccattt agtatgtttt ttaagtaata 900tgcatcagaa
acttcagaaa tacttctgcc ctttgatcaa acaaatccat ttccaagaat
960ctgtactagg gaagtaaata agaatatgag agaaaccttt atgcaatatg
tatattgcaa 1020cattatttaa tattctggaa aattggaaac accccaaaat
tctaactcaa a 107114865DNAHomo sapiens 14tttgtggagg gcagcagaga
gtacccagct ggacatcctt tcctgctgat gagccccagg 60ctggaggtgc cctgctcaca
tgctcttccc cagggtctct cgcctgggca ggtcatcata 120gtacggggac
tggtcttgca agagccgaag cattttactg tgagcctgag ggaccaggct
180gcccatgctc ctgtgacact cagggcctcc ttcgcagaca gaactctggc
ctggatctcc 240cgctgggggc agaagaaact gatctcagcc cccttcctct
tttaccccca gagattcttt 300gaggtgctgc tcctgttcca ggagggaggg
ctgaagctgg cgctcaatgg gcaggggctg 360ggggccacca gcatgaacca
gcaggccctg gagcagctgc gggagctccg gatcagtgga 420agtgtccagc
tctactgtgt ccactcctga aggatggttc caggaaatac cgcagaaaac
480aagagtcagc cactccccag ggccccactc tcctcccctc attaaaccat
ccacctgaac 540accagcacat cagggcctgg ttcacctctg gggtcacgag
actgagtcta caggagcttt 600gggcctgagg gaaggcacaa gagtgcaaag
gttcctcgaa ctctgcacct tcctccacca 660ggagcctggg atatggctcc
atctgccttc agggcctgga ctgcactcac agaggcaagt 720gttgtagact
aacaaagata ctccaaaata caatggctta aagaatgtgg tcatttattc
780tttattattt atttatttgt ggtcaaataa ataaataagg ttatttattt
aaaaaaaaaa 840aaaaaaaaaa aaaaaaaaaa aaaaa 86515441DNAHomo sapiens
15gcacgagaga cgacatcaga gatgaggaca gcattgctgc tccttgcagc cctggctgtg
60gctacagggc cagcccttac cctgcgctgc cacgtgtgca ccagctccag caactgcaag
120cattctgtgg tctgcccggc cagctctcgc ttctgcaaga ccacgaacac
agtggagcct 180ctgagggctt ccccgaaagt ctgggaccag gtccaggtgg
gcatggaatg ctgatgactt 240ggagcaggcc ccacagaccc cacagaggat
gaagccaccc cacagaggat gcagccccca 300gctgcatgga aggtggagga
cagaagccct gtggatcccc ggatttcaca ctccttctgt 360tttgttgccg
tttatttttg tactcaaatc tctacatgga gataaatgat ttaaaccagt
420aaaaaaaaaa aaaaaaaaaa a 441161066DNAHomo sapiens 16agcgggcccg
aaccctcgtg tgaagggtgc agtacctaag ccggagcggg gtagaggcgg 60gccggcaccc
ccttctgacc tccagtgccg ccggcctcaa gatcagacat ggcccagaac
120ttgaaggact tggcgggacg gctgcccgcc gggccccggg gcatgggcac
ggccctgaag 180ctgttgctgg gggccggcgc cgtggcctac ggtgtgcgcg
aatctgtgtt caccgtggaa 240ggcgggcaca gagccatctt cttcaatcgg
atcggtggag tgcagcagga cactatcctg 300gccgagggcc ttcacttcag
gatcccttgg ttccagtacc ccattatcta tgacattcgg 360gccagacctc
gaaaaatctc ctcccctaca ggctccaaag acctacagat ggtgaatatc
420tccctgcgag tgttgtctcg acccaatgct caggagcttc ctagcatgta
ccagcgccta 480gggctggact acgaggaacg agtgttgccg tccattgtca
acgaggtgct caagagtgtg 540gtggccaagt tcaatgcctc acagctgatc
acccagcggg cccaggtatc cctgttgatc 600cgccgggagc tgacagagag
ggccaaggac ttcagcctca tcctggatga tgtggccatc 660acagagctga
gctttagccg agagtacaca gctgctgtag aagccaaaca agtggcccag
720caggaggccc agcgggccca attcttggta gaaaaagcaa agcaggaaca
gcggcagaaa 780attgtgcagg ccgagggtga ggccgaggct gccaagatgc
ttggagaagc actgagcaag 840aaccctggct acatcaaact tcgcaagatt
cgagcagccc agaatatctc caagacgatc 900gccacatcac agaatcgtat
ctatctcaca gctgacaacc ttgtgctgaa cctacaggat 960gaaagtttca
ccaggggaag tgacagcctc atcaagggta agaaatgagc ctagtcacca
1020agaactccac ccccacaaga agtggatctg cttctccagt ttttga
106617704DNAHomo sapiens 17ggcacgagat gacatcacta agtggccgat
ctgcacagag caggccagga gcaaccacac 60aggcttcctg cacatggact gcgagatcaa
gggccgcccc tgctgcatcg gcaccaaggg 120cagctgtgag atcaccaccc
gggaatactg tgagttcatg cacggctatt tccatgagga 180agcaacactc
tgctcccagg tgaggcgagg caggcctgga gtagtggagg agaggacgct
240gggcatggca gcctgctggg gccggggctc acgcactccc tcccatgtcg
gagcctcaga 300ctcaggctgc ttctggggcg ctgagcacca tatgcccatt
cccaggtgca ctgttttgga 360caaggtgtgt tgggctgctg ccttcctcaa
ccctgaggtc ccagatcagt tttacaggtc 420tggctgtctc ttttcctaca
tgttgggtaa gaggtcctca atgcccccga acccgacccc 480tgtgatggac
acccaggcgg acccctgggg aaaggttcct gggccagggt atggtcggtc
540caacctgccg aagactactg ctcctgaagt gtctggatga aggccgctgc
ctggtgtgtc 600cctcccccag tgtgggtgca ctgccctcgg tgtcttgtgg
gtcttttcaa atgacatccc 660tgaaggggac ctggaggaag gtggtccggc
tggcacccta tcgc 70418315PRTHomo sapiens 18Met Leu Leu Leu Leu Leu
Leu Pro Leu Leu Trp Gly Arg Glu Arg Val1 5 10 15Glu Gly Gln Lys Ser
Asn Arg Lys Asp Tyr Ser Leu Thr Met Gln Ser 20 25 30Ser Val Thr Val
Gln Glu Gly Met Cys Val His Val Arg Cys Ser Phe 35 40 45Ser Tyr Pro
Val Asp Ser Gln Thr Asp Ser Asp Pro Val His Gly Tyr 50 55 60Trp Phe
Arg Ala Gly Asn Asp Ile Ser Trp Lys Ala Pro Val Ala Thr65 70 75
80Asn Asn Pro Ala Trp Ala Val Gln Glu Glu Thr Arg Asp Arg Phe His
85 90 95Leu Leu Gly Asp Pro Gln Thr Lys Asn Cys Thr Leu Ser Ile Arg
Asp 100 105 110Ala Arg Met Ser Asp Ala Gly Arg Tyr Phe Phe Arg Met
Glu Lys Gly 115 120 125Asn Ile Lys Trp Asn Tyr Lys Tyr Asp Gln Leu
Ser Val Asn Val Thr 130 135 140Tyr Pro Pro Gln Asn Leu Thr Val Thr
Val Phe Gln Gly Glu Gly Thr145 150 155 160Ala Ser Thr Ala Leu Gly
Asn Ser Ser Ser Leu Ser Val Leu Glu Gly 165 170 175Gln Ser Leu Arg
Leu Val Cys Ala Val Asp Ser Asn Pro Pro Ala Arg 180 185 190Leu Ser
Trp Thr Trp Arg Ser Leu Thr Leu Tyr Pro Ser Gln Pro Ser 195 200
205Asn Pro Leu Val Leu Glu Leu Gln Val His Leu Gly Asp Glu Gly Glu
210 215 220Phe Thr Cys Arg Ala Gln Asn Ser Leu Gly Ser Gln His Val
Ser Leu225 230 235 240Asn Leu Ser Leu Gln Gln Glu Tyr Thr Gly Lys
Met Lys Pro Val Ser 245 250 255Gly Val Leu Leu Gly Ala Val Gly Gly
Thr Gly Ala Thr Ala Leu Val 260 265 270Phe Leu Ser Phe Cys Val Ile
Phe Ile Val Val Arg Ser Cys Arg Lys 275 280 285Lys Ser Ala Arg Pro
Ala Ala Asp Val Gly Asp Ile Gly Met Lys Asp 290 295 300Ala Asn Thr
Ile Gln Gly Leu Ser Leu Ser Gly305 310 31519245PRTHomo sapiens
19Met Ala Glu Ile His Thr Pro Tyr Ser Ser Leu Lys Lys Leu Leu Ser1
5 10 15Leu Leu Asn Gly Phe Val Ala Val Ser Gly Ile Ile Leu Val Gly
Leu 20 25 30Gly Ile Gly Gly Lys Cys Gly Gly Ala Ser Leu Thr Asn Val
Leu Gly 35 40 45Leu Ser Ser Ala Tyr Leu Leu His Val Gly Asn Leu Cys
Leu Val Met 50 55 60Gly Cys Ile Thr Val Leu Leu Gly Cys Ala Gly Trp
Tyr Gly Ala Thr65 70 75 80Lys Glu Ser Arg Gly Thr Leu Leu Phe Cys
Ile Leu Ser Met Val Ile 85 90 95Val Leu Ile Met Glu Val Thr Ala Ala
Thr Val Val Leu Leu Phe Phe 100 105 110Pro Ile Val Gly Asp Val Ala
Leu Glu His Thr Phe Val Thr Leu Arg 115 120 125Lys Asn Tyr Arg Gly
Tyr Asn Glu Pro Asp Asp Tyr Ser Thr Gln Trp 130 135 140Asn Leu Val
Met Glu Lys Leu Lys Cys Cys Gly Val Asn Asn Tyr Thr145 150 155
160Asp Phe Ser Gly Ser Ser Phe Glu Met Thr Thr Gly His Thr Tyr Pro
165 170 175Arg Ser Cys Cys Lys Ser Ile Gly Ser Val Ser Cys Asp Gly
Arg Asp 180 185 190Val Ser Pro Asn Val Ile His Gln Lys Gly Cys Phe
His Lys Leu Leu 195 200 205Lys Ile Thr Lys Thr Gln Ser Phe Thr Leu
Ser Gly Ser Ser Leu Gly 210 215 220Ala Ala Val Ile Gln Arg Trp Gly
Ser Arg Tyr Val Ala Gln Ala Gly225 230 235 240Leu Glu Leu Leu Ala
24520273PRTHomo sapiens 20Met Gln Cys Phe Ser Phe Ile Lys Thr Met
Met Ile Leu Phe Asn Leu1 5 10 15Leu Ile Phe Leu Cys Gly Ala Ala Leu
Leu Ala Val Gly Ile Trp Val 20 25 30Ser Ile Asp Gly Ala Ser Phe Leu
Lys Ile Phe Gly Pro Leu Ser Ser 35 40 45Ser Ala Met Gln Phe Val Asn
Val Gly Tyr Phe Leu Ile Ala Ala Gly 50 55 60Val Val Val Phe Ala Leu
Gly Phe Leu Gly Cys Tyr Gly Ala Lys Thr65 70 75 80Glu Ser Lys Cys
Ala Leu Val Thr Phe Phe Phe Ile Leu Leu Leu Ile 85 90 95Phe Ile Ala
Glu Val Ala Ala Ala Val Val Ala Leu Val Tyr Thr Thr 100 105 110Met
Ala Glu His Phe Leu Thr Leu Leu Val Val Pro Ala Ile Lys Lys 115 120
125Asp Tyr Gly Ser Gln Glu Asp Phe Thr Gln Val Trp Asn Thr Thr Met
130 135 140Lys Gly Leu Lys Cys Cys Gly Phe Thr Asn Tyr Thr Asp Phe
Glu Asp145 150 155 160Ser Pro Tyr Phe Lys Glu Asn Ser Ala Phe Pro
Pro Phe Cys Cys Asn 165 170 175Asp Asn Val Thr Asn Thr Ala Asn Glu
Thr Cys Thr Lys Gln Lys Ala 180 185 190His Asp Gln Lys Val Glu Gly
Cys Phe Asn Gln Leu Leu Tyr Asp Ile 195 200 205Arg Thr Asn Ala Val
Thr Val Gly Gly Val Ala Ala Gly Ile Gly Gly 210 215 220Leu Glu Leu
Ala Ala Met Asn Cys Val His Val Ser Val Leu Gln Ser225 230 235
240Thr Ile Ser Pro Leu Leu Pro Leu Pro Leu Leu Leu Pro His Gly Asn
245 250 255Cys Glu Glu Ala Pro Trp Gln Ala Ala Val Ile Gly Gly Gly
Asp Arg 260 265 270Ile21250PRTHomo sapiens 21Met Asn Ser Met Thr
Ser Ala Val Pro Val Ala Asn Ser Val Leu Val1 5 10 15Val Ala Pro His
Asn Gly Tyr Pro Val Thr Pro Gly Ile Met Ser His 20 25 30Val Pro Leu
Tyr Pro Asn Ser Gln Pro Gln Val His Leu Val Pro Gly 35 40 45Asn Pro
Pro Ser Leu Val Ser Asn Val Asn Gly Gln Pro Val Gln Lys 50 55 60Ala
Leu Lys Glu Gly Lys Thr Leu Gly Ala Ile Gln Ile Ile Ile Gly65 70 75
80Leu Ala His Ile Gly Leu Gly Ser Ile Met Ala Thr Val Leu Val Gly
85 90 95Glu Tyr Leu Ser Ile Ser Phe Tyr Gly Gly Phe Pro Phe Trp Gly
Gly 100 105 110Leu Trp Phe Ile Ile Ser Gly Ser Leu Ser Val Ala Ala
Glu Asn Gln 115 120 125Pro Tyr Ser Tyr Cys Leu Leu Ser Gly Ser Leu
Gly Leu Asn Ile Val 130 135 140Ser Ala Ile Cys Ser Ala Val Gly Val
Ile Leu Phe Ile Thr Asp Leu145 150 155 160Ser Ile Pro His Pro Tyr
Ala Tyr Pro Asp Tyr Tyr Pro Tyr Ala Trp 165 170 175Gly Val Asn Pro
Gly Met Ala Ile Ser Gly Val Leu Leu Val Phe Cys 180 185 190Leu Leu
Glu Phe Gly Ile Ala Cys Ala Ser Ser His Phe Gly Cys Gln 195 200
205Leu Val Cys Cys Gln Ser Ser Asn Val Ser Val Ile Tyr Pro Asn Ile
210 215 220Tyr Ala Ala Asn Pro Val Ile Thr Pro Glu Pro Val Thr Ser
Pro Pro225 230 235 240Ser Tyr Ser Ser Glu Ile Gln Ala Asn Lys 245
25022245PRTHomo sapiens 22Met Ala Ser Pro Ser Arg Arg Leu Gln Thr
Lys Pro Val Ile Thr Cys1 5 10 15Phe Lys Ser Val Leu Leu Ile Tyr Thr
Phe Ile Phe Trp Ile Thr Gly 20 25 30Val Ile Leu Leu Ala Val Gly Ile
Trp Gly Lys Val Ser Leu Glu Asn 35 40 45Tyr Phe Ser Leu Leu Asn Glu
Lys Ala Thr Asn Val Pro Phe Val Leu 50 55 60Ile Ala Thr Gly Thr Val
Ile Ile Leu Leu Gly Thr Phe Gly Cys Phe65 70 75 80Ala Thr Cys Arg
Ala Ser Ala Trp Met Leu Lys Leu Tyr Ala Met Phe 85 90 95Leu Thr Leu
Val Phe Leu Val Glu Leu Val Ala Ala Ile Val Gly Phe 100 105 110Val
Phe Arg His Glu Ile Lys Asn Ser Phe Lys Asn Asn Tyr Glu Lys 115 120
125Ala Leu Lys Gln Tyr Asn Ser Thr Gly Asp Tyr Arg Ser His Ala Val
130 135 140Asp Lys Ile Gln Asn Thr Leu His Cys Cys Gly Val Thr Asp
Tyr Arg145 150 155 160Asp Trp Thr Asp Thr Asn Tyr Tyr Ser Glu Lys
Gly Phe Pro Lys Ser 165 170 175Cys Cys Lys Leu Glu Asp Cys Thr Pro
Gln Arg Asp Ala Asp Lys Val 180 185 190Asn Asn Glu Gly Cys Phe Ile
Lys Val Met Thr Ile Ile Glu Ser Glu 195 200 205Met Gly Val Val Ala
Gly Ile Ser Phe Gly Val Ala Cys Phe Gln Leu 210 215 220Ile Gly Ile
Phe Leu Ala Tyr Cys Leu Ser Arg Ala Ile Thr Asn Asn225 230 235
240Gln Tyr Glu Ile Val 24523253PRTHomo sapiens 23Met Gly Gln Cys
Gly Ile Thr Ser Ser Lys Thr Val Leu Val Phe Leu1 5 10 15Asn Leu Ile
Phe Trp Gly Ala Ala Gly Ile Leu Cys Tyr Val Gly Ala 20 25 30Tyr Val
Phe Ile Thr Tyr Asp Asp Tyr Asp His Phe Phe Glu Asp Val 35 40 45Tyr
Thr Leu Ile Pro Ala Val Val Ile Ile Ala Val Gly Ala Leu Leu 50 55
60Phe Ile Ile Gly Leu Ile Gly Cys Cys Ala Thr Ile Arg Glu Ser Arg65
70 75 80Cys Gly Leu Ala Thr Phe Val Ile Ile Leu Leu Leu Val Phe Val
Thr 85 90 95Glu Val Val Val Val Val Leu Gly Tyr Val Tyr Arg Ala Lys
Val Glu 100 105 110Asn Glu Val Asp Arg Ser Ile Gln Lys Val Tyr Lys
Thr Tyr Asn Gly 115 120 125Thr Asn Pro Asp Ala Ala Ser Arg Ala Ile
Asp Tyr Val Gln Arg Gln 130 135 140Leu His Cys Cys Gly Ile His Asn
Tyr Ser Asp Trp Glu Asn Thr Asp145 150 155 160Trp Phe Lys Glu Thr
Lys Asn Gln Ser Val Pro Leu Ser Cys Cys Arg 165 170 175Glu Thr Ala
Ser Asn Cys Asn Gly Ser Leu Ala His Pro Ser Asp Leu 180 185 190Tyr
Ala Glu Gly Cys Glu Ala Leu Val Val Lys Lys Leu Gln Glu Ile 195 200
205Met Met His Val Ile Trp Ala Ala Leu Ala Phe Ala Ala Ile Gln Leu
210 215 220Leu Gly Met Leu Cys Ala Cys Ile Val Leu Cys Arg Arg Ser
Arg Asp225 230 235 240Pro Ala Tyr Glu Leu Phe Ile Thr Gly Gly Thr
Tyr Ala 245 25024238PRTHomo sapiens 24Met Ala Arg Ala Cys Leu Gln
Ala Val Lys Tyr Leu Met Phe Ala Phe1 5 10 15Asn Leu Phe Phe Trp Leu
Gly Gly Cys Gly Val Leu Gly Val Gly Ile 20 25 30Trp Leu Ala Ala Thr
Gln Gly Ser Phe Ala Thr Leu Ser Ser Ser Phe 35 40 45Pro Ser Leu Ser
Ala Ala Asn Leu Leu Ile Ile Thr Gly Ala Phe Val 50 55 60Met Ala Ile
Gly Phe Val Gly Cys Leu Gly Ala Ile Lys Glu Asn Lys65 70 75 80Cys
Leu Leu Leu Thr Phe Phe Leu Leu Leu Leu Leu Val Phe Leu Leu 85 90
95Glu Ala Thr Ile Ala Ile Leu Phe Phe Ala Tyr Thr Asp Lys Ile Asp
100 105 110Arg Tyr Ala Gln Gln Asp Leu Lys Lys Gly Leu His Leu Tyr
Gly Thr 115 120 125Gln Gly Asn Val Gly Leu Thr Asn Ala Trp Ser Ile
Ile Gln Thr Asp 130 135 140Phe Arg Cys Cys Gly Val Ser Asn Tyr Thr
Asp Trp Phe Glu Val Tyr145 150 155 160Asn Ala Thr Arg Val Pro Asp
Ser Cys Cys Leu Glu Phe Ser Glu Ser 165 170 175Cys Gly Leu His Ala
Pro Gly Thr Trp Trp Lys Ala Pro Cys Tyr Glu 180 185 190Thr Val Lys
Val Trp Leu Gln Glu Asn Leu Leu Ala Val Gly Ile Phe 195 200 205Gly
Leu Cys Thr Ala Leu Val Gln Ile Leu Gly Leu Thr Phe Ala Met 210 215
220Thr Met Tyr Cys Gln Val Val Lys Ala Asp Thr Tyr Cys Ala225 230
23525221PRTHomo sapiens 25Met Gly Arg Phe Arg Gly Gly Leu Arg Cys
Ile Lys Tyr Leu Leu Leu1 5 10 15Gly Phe Asn Leu Leu Phe Trp Leu Ala
Gly Ser Ala Val Ile Ala Phe 20 25 30Gly Leu Trp Phe Arg Phe Gly Gly
Ala Ile Lys Glu Leu Ser Ser Glu 35 40 45Asp Lys Ser Pro Glu Tyr Phe
Tyr Val Gly Leu Tyr Val Leu Val Gly 50 55 60Ala Gly
Ala Leu Met Met Ala Val Gly Phe Phe Gly Cys Cys Gly Ala65 70 75
80Met Arg Glu Ser Gln Cys Val Leu Gly Ser Phe Phe Thr Cys Leu Leu
85 90 95Val Ile Phe Ala Ala Glu Val Thr Thr Gly Val Phe Ala Phe Ile
Gly 100 105 110Lys Gly Val Ala Ile Arg His Val Gln Thr Met Tyr Glu
Glu Ala Tyr 115 120 125Asn Asp Tyr Leu Lys Asp Arg Gly Lys Gly Asn
Gly Thr Leu Ile Thr 130 135 140Phe His Ser Thr Phe Gln Cys Cys Gly
Lys Glu Ser Ser Glu Gln Val145 150 155 160Gln Pro Thr Cys Pro Lys
Glu Leu Leu Gly His Lys Asn Cys Ile Asp 165 170 175Glu Ile Glu Thr
Ile Ile Ser Val Lys Leu Gln Leu Ile Gly Ile Val 180 185 190Gly Ile
Gly Ile Ala Gly Leu Thr Ile Phe Gly Met Ile Phe Ser Met 195 200
205Val Leu Cys Cys Ala Ile Arg Asn Ser Arg Asp Val Ile 210 215
22026252PRTHomo sapiens 26Met Ala His Tyr Lys Thr Glu Gln Asp Asp
Trp Leu Ile Ile Tyr Leu1 5 10 15Lys Tyr Leu Leu Phe Val Phe Asn Phe
Phe Phe Trp Val Gly Gly Ala 20 25 30Ala Val Leu Ala Val Gly Ile Trp
Thr Leu Val Glu Lys Ser Gly Tyr 35 40 45Leu Ser Val Leu Ala Ser Ser
Thr Phe Ala Ala Ser Ala Tyr Ile Leu 50 55 60Ile Phe Ala Gly Val Leu
Val Met Val Thr Gly Phe Leu Gly Phe Gly65 70 75 80Ala Ile Leu Trp
Glu Arg Lys Gly Cys Leu Ser Thr Tyr Phe Cys Leu 85 90 95Leu Leu Val
Ile Phe Leu Val Glu Leu Val Ala Gly Val Leu Ala His 100 105 110Val
Tyr Tyr Gln Arg Leu Ser Asp Glu Leu Lys Gln His Leu Asn Arg 115 120
125Thr Leu Ala Glu Asn Tyr Gly Gln Pro Glu His Ala Asp His Ala Ser
130 135 140Val Asp Arg Leu Gln Gln Asp Phe Lys Cys Cys Gly Ser Asn
Ser Ser145 150 155 160Ala Asp Trp Gln His Ser Thr Tyr Ile Leu Leu
Arg Glu Ala Glu Gly 165 170 175Arg Gln Val Pro Asp Ser Cys Cys Lys
Thr Val Val Ala Arg Cys Gly 180 185 190Gln Arg Ala His Pro Ser Asn
Ile Tyr Lys Val Glu Gly Gly Cys Leu 195 200 205Thr Lys Leu Glu Gln
Phe Leu Ala Asp His Leu Leu Leu Met Gly Ala 210 215 220Val Gly Ile
Gly Val Ala Cys Leu Gln Ile Cys Gly Met Val Leu Thr225 230 235
240Cys Cys Leu His Gln Arg Leu Gln Arg His Phe Tyr 245
25027197PRTHomo sapiens 27Met Cys Thr Gly Lys Cys Ala Arg Cys Val
Gly Leu Ser Leu Ile Thr1 5 10 15Leu Cys Leu Val Cys Ile Val Ala Asn
Ala Leu Leu Leu Val Pro Asn 20 25 30Gly Glu Thr Ser Trp Thr Asn Thr
Asn His Leu Ser Leu Gln Val Trp 35 40 45Leu Met Gly Gly Phe Ile Gly
Gly Gly Leu Met Val Leu Cys Pro Gly 50 55 60Ile Ala Ala Val Arg Ala
Gly Gly Lys Gly Cys Cys Gly Ala Gly Cys65 70 75 80Cys Gly Asn Arg
Cys Arg Met Leu Arg Ser Val Phe Ser Ser Ala Phe 85 90 95Gly Val Leu
Gly Ala Ile Tyr Cys Leu Ser Val Ser Gly Ala Gly Leu 100 105 110Arg
Asn Gly Pro Arg Cys Leu Met Asn Gly Glu Trp Gly Tyr His Phe 115 120
125Glu Asp Thr Ala Gly Ala Tyr Leu Leu Asn Arg Thr Leu Trp Asp Arg
130 135 140Cys Glu Ala Pro Pro Arg Val Val Pro Trp Asn Val Thr Leu
Phe Ser145 150 155 160Leu Leu Val Ala Ala Ser Cys Leu Glu Ile Val
Leu Cys Gly Ile Gln 165 170 175Leu Val Asn Ala Thr Ile Gly Val Phe
Cys Gly Asp Cys Arg Lys Lys 180 185 190Gln Asp Thr Pro His
19528201PRTHomo sapiens 28Met Gly Ser Arg Lys Cys Gly Gly Cys Leu
Ser Cys Leu Leu Ile Pro1 5 10 15Leu Ala Leu Trp Ser Ile Ile Val Asn
Ile Leu Leu Tyr Phe Pro Asn 20 25 30Gly Gln Thr Ser Tyr Ala Ser Ser
Asn Lys Leu Thr Asn Tyr Val Trp 35 40 45Tyr Phe Glu Gly Ile Cys Phe
Ser Gly Ile Met Met Leu Ile Val Thr 50 55 60Thr Val Leu Leu Val Leu
Glu Asn Asn Asn Asn Tyr Lys Cys Cys Gln65 70 75 80Ser Glu Asn Cys
Ser Lys Lys Tyr Val Thr Leu Leu Ser Ile Ile Phe 85 90 95Ser Ser Leu
Gly Ile Ala Phe Ser Gly Tyr Cys Leu Val Ile Ser Ala 100 105 110Leu
Gly Leu Val Gln Gly Pro Tyr Cys Arg Thr Leu Asp Gly Trp Glu 115 120
125Tyr Ala Phe Glu Gly Thr Ala Gly Arg Phe Leu Thr Asp Ser Ser Ile
130 135 140Trp Ile Gln Cys Leu Glu Pro Ala His Val Val Glu Trp Asn
Ile Ile145 150 155 160Leu Phe Ser Ile Leu Ile Thr Leu Ser Gly Leu
Gln Val Ile Ile Cys 165 170 175Leu Ile Arg Val Val Met Gln Leu Ser
Lys Ile Leu Cys Gly Ser Tyr 180 185 190Ser Val Ile Phe Gln Pro Gly
Ile Ile 195 20029133PRTHomo sapiens 29Met Ser Pro Arg Leu Glu Val
Pro Cys Ser His Ala Leu Pro Gln Gly1 5 10 15Leu Ser Pro Gly Gln Val
Ile Ile Val Arg Gly Leu Val Leu Gln Glu 20 25 30Pro Lys His Phe Thr
Val Ser Leu Arg Asp Gln Ala Ala His Ala Pro 35 40 45Val Thr Leu Arg
Ala Ser Phe Ala Asp Arg Thr Leu Ala Trp Ile Ser 50 55 60Arg Trp Gly
Gln Lys Lys Leu Ile Ser Ala Pro Phe Leu Phe Tyr Pro65 70 75 80Gln
Arg Phe Phe Glu Val Leu Leu Leu Phe Gln Glu Gly Gly Leu Lys 85 90
95Leu Ala Leu Asn Gly Gln Gly Leu Gly Ala Thr Ser Met Asn Gln Gln
100 105 110Ala Leu Glu Gln Leu Arg Glu Leu Arg Ile Ser Gly Ser Val
Gln Leu 115 120 125Tyr Cys Val His Ser 1303070PRTHomo sapiens 30Met
Arg Thr Ala Leu Leu Leu Leu Ala Ala Leu Ala Val Ala Thr Gly1 5 10
15Pro Ala Leu Thr Leu Arg Cys His Val Cys Thr Ser Ser Ser Asn Cys
20 25 30Lys His Ser Val Val Cys Pro Ala Ser Ser Arg Phe Cys Lys Thr
Thr 35 40 45Asn Thr Val Glu Pro Leu Arg Ala Ser Pro Lys Val Trp Asp
Gln Val 50 55 60Gln Val Gly Met Glu Cys65 7031299PRTHomo sapiens
31Met Ala Gln Asn Leu Lys Asp Leu Ala Gly Arg Leu Pro Ala Gly Pro1
5 10 15Arg Gly Met Gly Thr Ala Leu Lys Leu Leu Leu Gly Ala Gly Ala
Val 20 25 30Ala Tyr Gly Val Arg Glu Ser Val Phe Thr Val Glu Gly Gly
His Arg 35 40 45Ala Ile Phe Phe Asn Arg Ile Gly Gly Val Gln Gln Asp
Thr Ile Leu 50 55 60Ala Glu Gly Leu His Phe Arg Ile Pro Trp Phe Gln
Tyr Pro Ile Ile65 70 75 80Tyr Asp Ile Arg Ala Arg Pro Arg Lys Ile
Ser Ser Pro Thr Gly Ser 85 90 95Lys Asp Leu Gln Met Val Asn Ile Ser
Leu Arg Val Leu Ser Arg Pro 100 105 110Asn Ala Gln Glu Leu Pro Ser
Met Tyr Gln Arg Leu Gly Leu Asp Tyr 115 120 125Glu Glu Arg Val Leu
Pro Ser Ile Val Asn Glu Val Leu Lys Ser Val 130 135 140Val Ala Lys
Phe Asn Ala Ser Gln Leu Ile Thr Gln Arg Ala Gln Val145 150 155
160Ser Leu Leu Ile Arg Arg Glu Leu Thr Glu Arg Ala Lys Asp Phe Ser
165 170 175Leu Ile Leu Asp Asp Val Ala Ile Thr Glu Leu Ser Phe Ser
Arg Glu 180 185 190Tyr Thr Ala Ala Val Glu Ala Lys Gln Val Ala Gln
Gln Glu Ala Gln 195 200 205Arg Ala Gln Phe Leu Val Glu Lys Ala Lys
Gln Glu Gln Arg Gln Lys 210 215 220Ile Val Gln Ala Glu Gly Glu Ala
Glu Ala Ala Lys Met Leu Gly Glu225 230 235 240Ala Leu Ser Lys Asn
Pro Gly Tyr Ile Lys Leu Arg Lys Ile Arg Ala 245 250 255Ala Gln Asn
Ile Ser Lys Thr Ile Ala Thr Ser Gln Asn Arg Ile Tyr 260 265 270Leu
Thr Ala Asp Asn Leu Val Leu Asn Leu Gln Asp Glu Ser Phe Thr 275 280
285Arg Gly Ser Asp Ser Leu Ile Lys Gly Lys Lys 290 29532168PRTHomo
sapiens 32Met Asp Cys Glu Ile Lys Gly Arg Pro Cys Cys Ile Gly Thr
Lys Gly1 5 10 15Ser Cys Glu Ile Thr Thr Arg Glu Tyr Cys Glu Phe Met
His Gly Tyr 20 25 30Phe His Glu Glu Ala Thr Leu Cys Ser Gln Val Arg
Arg Gly Arg Pro 35 40 45Gly Val Val Glu Glu Arg Thr Leu Gly Met Ala
Ala Cys Trp Gly Arg 50 55 60Gly Ser Arg Thr Pro Ser His Val Gly Ala
Ser Asp Ser Gly Cys Phe65 70 75 80Trp Gly Ala Glu His His Met Pro
Ile Pro Arg Cys Thr Val Leu Asp 85 90 95Lys Val Cys Trp Ala Ala Ala
Phe Leu Asn Pro Glu Val Pro Asp Gln 100 105 110Phe Tyr Arg Ser Gly
Cys Leu Phe Ser Tyr Met Leu Gly Lys Arg Ser 115 120 125Ser Met Pro
Pro Asn Pro Thr Pro Val Met Asp Thr Gln Ala Asp Pro 130 135 140Trp
Gly Lys Val Pro Gly Pro Gly Tyr Gly Arg Ser Asn Leu Pro Lys145 150
155 160Thr Thr Ala Pro Glu Val Ser Gly 16533551PRTHomo sapiens
33Met Leu Pro Leu Leu Leu Leu Pro Leu Leu Trp Gly Gly Ser Leu Gln1
5 10 15Glu Lys Pro Val Tyr Glu Leu Gln Val Gln Lys Ser Val Thr Val
Gln 20 25 30Glu Gly Leu Cys Val Leu Val Pro Cys Ser Phe Ser Tyr Pro
Trp Arg 35 40 45Ser Trp Tyr Ser Ser Pro Pro Leu Tyr Val Tyr Trp Phe
Arg Asp Gly 50 55 60Glu Ile Pro Tyr Tyr Ala Glu Val Val Ala Thr Asn
Asn Pro Asp Arg65 70 75 80Arg Val Lys Pro Glu Thr Gln Gly Arg Phe
Arg Leu Leu Gly Asp Val 85 90 95Gln Lys Lys Asn Cys Ser Leu Ser Ile
Gly Asp Ala Arg Met Glu Asp 100 105 110Thr Gly Ser Tyr Phe Phe Arg
Val Glu Arg Gly Arg Asp Val Lys Tyr 115 120 125Ser Tyr Gln Gln Asn
Lys Leu Asn Leu Glu Val Thr Ala Leu Ile Glu 130 135 140Lys Pro Asp
Ile His Phe Leu Glu Pro Leu Glu Ser Gly Arg Pro Thr145 150 155
160Arg Leu Ser Cys Ser Leu Pro Gly Ser Cys Glu Ala Gly Pro Pro Leu
165 170 175Thr Phe Ser Trp Thr Gly Asn Ala Leu Ser Pro Leu Asp Pro
Glu Thr 180 185 190Thr Arg Ser Ser Glu Leu Thr Leu Thr Pro Arg Pro
Glu Asp His Gly 195 200 205Thr Asn Leu Thr Cys Gln Met Lys Arg Gln
Gly Ala Gln Val Thr Thr 210 215 220Glu Arg Thr Val Gln Leu Asn Val
Ser Tyr Ala Pro Gln Thr Ile Thr225 230 235 240Ile Phe Arg Asn Gly
Ile Ala Leu Glu Ile Leu Gln Asn Thr Ser Tyr 245 250 255Leu Pro Val
Leu Glu Gly Gln Ala Leu Arg Leu Leu Cys Asp Ala Pro 260 265 270Ser
Asn Pro Pro Ala His Leu Ser Trp Phe Gln Gly Ser Pro Ala Leu 275 280
285Asn Ala Thr Pro Ile Ser Asn Thr Gly Ile Leu Glu Leu Arg Arg Val
290 295 300Arg Ser Ala Glu Glu Gly Gly Phe Thr Cys Arg Ala Gln His
Pro Leu305 310 315 320Gly Phe Leu Gln Ile Phe Leu Asn Leu Ser Val
Tyr Ser Leu Pro Gln 325 330 335Leu Leu Gly Pro Ser Cys Ser Trp Glu
Ala Glu Gly Leu His Cys Arg 340 345 350Cys Ser Phe Arg Ala Arg Pro
Ala Pro Ser Leu Cys Trp Arg Leu Glu 355 360 365Glu Lys Pro Leu Glu
Gly Asn Ser Ser Gln Gly Ser Phe Lys Val Asn 370 375 380Ser Ser Ser
Ala Gly Pro Trp Ala Asn Ser Ser Leu Ile Leu His Gly385 390 395
400Gly Leu Ser Ser Asp Leu Lys Val Ser Cys Lys Ala Trp Asn Ile Tyr
405 410 415Gly Ser Gln Ser Gly Ser Val Leu Leu Leu Gln Gly Arg Ser
Asn Leu 420 425 430Gly Thr Gly Val Val Pro Ala Ala Leu Gly Gly Ala
Gly Val Met Ala 435 440 445Leu Leu Cys Ile Cys Leu Cys Leu Ile Phe
Phe Leu Ile Val Lys Ala 450 455 460Arg Arg Lys Gln Ala Ala Gly Arg
Pro Glu Lys Met Asp Asp Glu Asp465 470 475 480Pro Ile Met Gly Thr
Ile Thr Ser Gly Ser Arg Lys Lys Pro Trp Pro 485 490 495Asp Ser Pro
Gly Asp Gln Ala Ser Pro Pro Gly Asp Ala Pro Pro Leu 500 505 510Glu
Glu Gln Lys Glu Leu His Tyr Ala Ser Leu Ser Phe Ser Glu Met 515 520
525Lys Ser Arg Glu Pro Lys Asp Gln Glu Ala Pro Ser Thr Thr Glu Tyr
530 535 540Ser Glu Ile Lys Thr Ser Lys545 55034219PRTRattus sp.
34Met Gly Met Ser Ser Leu Lys Leu Leu Lys Tyr Val Leu Phe Phe Phe1
5 10 15Asn Phe Leu Phe Trp Val Cys Gly Cys Cys Ile Leu Gly Phe Gly
Ile 20 25 30His Leu Leu Val Gln Asn Thr Tyr Gly Ile Leu Phe Arg Asn
Leu Pro 35 40 45Phe Leu Thr Leu Gly Asn Val Leu Val Ile Val Gly Ser
Ile Ile Met 50 55 60Val Val Ala Phe Leu Gly Cys Met Gly Ser Ile Lys
Glu Asn Lys Cys65 70 75 80Leu Leu Met Ser Phe Phe Val Leu Leu Leu
Leu Ile Leu Leu Ala Glu 85 90 95Val Thr Leu Ala Ile Leu Leu Phe Val
Tyr Glu Lys Lys Ile Asn Thr 100 105 110Leu Val Ala Glu Gly Leu Asn
Asp Ser Ile Gln His Tyr His Ser Asp 115 120 125Asn Ser Thr Arg Met
Ala Trp Asp Phe Ile Gln Ser Gln Leu Gln Cys 130 135 140Cys Gly Val
Asn Gly Ser Ser Asp Trp Ile Ser Gly Pro Pro Ser Ser145 150 155
160Cys Pro Ser Gly Ala Asp Val Gln Gly Cys Tyr Lys Lys Gly Gln Ala
165 170 175Trp Phe His Ser Asn Phe Leu Tyr Ile Gly Ile Val Thr Ile
Cys Val 180 185 190Cys Val Ile Gln Val Leu Gly Met Ser Phe Ala Leu
Thr Leu Asn Cys 195 200 205Gln Ile Asp Lys Thr Ser Gln Ala Leu Gly
Leu 210 21535253PRTHomo sapiens 35Met Gly Glu Phe Asn Glu Lys Lys
Thr Thr Cys Gly Thr Val Cys Leu1 5 10 15Lys Tyr Leu Leu Phe Thr Tyr
Asn Cys Cys Phe Trp Leu Ala Gly Leu 20 25 30Ala Val Met Ala Val Gly
Ile Trp Thr Leu Ala Leu Lys Ser Asp Tyr 35 40 45Ile Ser Leu Leu Ala
Ser Gly Thr Tyr Leu Ala Thr Ala Tyr Ile Leu 50 55 60Val Val Ala Gly
Thr Val Val Met Val Thr Gly Val Leu Gly Cys Cys65 70 75 80Ala Thr
Phe Lys Glu Arg Arg Asn Leu Leu Arg Leu Tyr Phe Ile Leu 85 90 95Leu
Leu Ile Ile Phe Leu Leu Glu Ile Ile Ala Gly Ile Leu Ala Tyr 100 105
110Ala Tyr Tyr Gln Gln Leu Asn Thr Glu Leu Lys Glu Asn Leu Lys Asp
115 120 125Thr Met Thr Lys Arg Tyr His Gln Pro Gly His Glu Ala Val
Thr Ser 130 135 140Ala Val Asp Gln Leu Gln Gln Glu Phe His Cys Cys
Gly Ser Asn Asn145 150 155 160Ser Gln Asp Trp Arg Asp Ser Glu Trp
Ile Arg Ser Gln Glu Ala Gly 165 170 175Gly Arg Val Val Pro Asp Ser
Cys Cys Lys Thr Val Val Ala Leu Cys 180 185 190Gly Gln Arg Asp His
Ala Ser Asn Ile Tyr Lys Val Glu Gly Gly Cys 195 200 205Ile Thr Lys Leu
Glu Thr Phe Ile Gln Glu His Leu Arg Val Ile Gly 210 215 220Ala Val
Gly Ile Gly Ile Ala Cys Val Gln Val Phe Gly Met Ile Phe225 230 235
240Thr Cys Cys Leu Tyr Arg Ser Leu Lys Leu Glu His Tyr 245
25036238PRTHomo sapiens 36Met Ala Arg Ala Cys Leu Gln Ala Val Lys
Tyr Leu Met Phe Ala Phe1 5 10 15Asn Leu Leu Phe Trp Leu Gly Gly Cys
Gly Val Leu Gly Val Gly Ile 20 25 30Trp Leu Ala Ala Thr Gln Gly Ser
Phe Ala Thr Leu Ser Ser Ser Phe 35 40 45Pro Ser Leu Ser Ala Ala Asn
Leu Leu Ile Ile Thr Gly Ala Phe Val 50 55 60Met Ala Ile Gly Phe Val
Gly Cys Leu Gly Ala Ile Lys Glu Asn Lys65 70 75 80Cys Leu Leu Leu
Thr Phe Phe Leu Leu Leu Leu Leu Val Phe Leu Leu 85 90 95Glu Ala Thr
Ile Ala Ile Leu Phe Phe Ala Tyr Thr Asp Lys Ile Asp 100 105 110Arg
Tyr Ala Gln Gln Asp Leu Lys Lys Gly Leu His Leu Tyr Gly Thr 115 120
125Gln Gly Asn Val Gly Leu Thr Asn Ala Trp Ser Ile Ile Gln Thr Asp
130 135 140Phe Arg Cys Cys Gly Val Ser Asn Tyr Thr Asp Trp Phe Glu
Val Tyr145 150 155 160Asn Ala Thr Arg Val Pro Asp Ser Cys Cys Leu
Glu Phe Ser Glu Ser 165 170 175Cys Gly Leu His Ala Pro Gly Thr Trp
Trp Lys Ala Pro Cys Tyr Glu 180 185 190Thr Val Lys Val Trp Leu Gln
Glu Asn Leu Leu Ala Val Gly Ile Phe 195 200 205Gly Leu Cys Thr Ala
Leu Val Gln Ile Leu Gly Leu Thr Phe Ala Met 210 215 220Thr Met Tyr
Cys Gln Val Val Lys Ala Asp Thr Tyr Cys Ala225 230 23537244PRTHomo
sapiens 37Met Glu Thr Lys Pro Val Ile Thr Cys Leu Lys Thr Leu Leu
Ile Ile1 5 10 15Tyr Ser Phe Val Phe Trp Ile Thr Gly Val Ile Leu Leu
Ala Val Gly 20 25 30Val Trp Gly Lys Leu Thr Leu Gly Thr Tyr Ile Ser
Leu Ile Ala Glu 35 40 45Asn Ser Thr Asn Ala Pro Tyr Val Leu Ile Gly
Thr Gly Thr Thr Ile 50 55 60Val Val Phe Gly Leu Phe Gly Cys Phe Ala
Thr Cys Arg Gly Ser Pro65 70 75 80Trp Met Leu Lys Leu Tyr Ala Met
Phe Leu Ser Leu Val Phe Leu Ala 85 90 95Glu Leu Val Ala Gly Ile Ser
Gly Phe Val Phe Arg His Glu Ile Lys 100 105 110Asp Thr Phe Leu Arg
Thr Tyr Thr Asp Ala Met Gln Thr Tyr Asn Gly 115 120 125Asn Asp Glu
Arg Ser Arg Ala Val Asp His Val Gln Arg Ser Leu Ser 130 135 140Cys
Cys Gly Val Gln Asn Tyr Thr Asn Trp Ser Thr Ser Pro Tyr Phe145 150
155 160Leu Glu His Gly Ile Pro Pro Ser Cys Cys Met Asn Glu Thr Asp
Cys 165 170 175Asn Pro Gln Asp Leu His Asn Leu Thr Val Ala Ala Thr
Lys Val Asn 180 185 190Gln Lys Gly Cys Tyr Asp Leu Val Thr Ser Phe
Met Glu Thr Asn Met 195 200 205Gly Ile Ile Ala Gly Val Ala Phe Gly
Ile Ala Phe Ser Gln Leu Ile 210 215 220Gly Met Leu Leu Ala Cys Cys
Leu Ser Arg Phe Ile Thr Ala Asn Gln225 230 235 240Tyr Glu Met
Val38297PRTHomo sapiens 38Met Thr Thr Pro Arg Asn Ser Val Asn Gly
Thr Phe Pro Ala Glu Pro1 5 10 15Met Lys Gly Pro Ile Ala Met Gln Ser
Gly Pro Lys Pro Leu Phe Arg 20 25 30Arg Met Ser Ser Leu Val Gly Pro
Thr Gln Ser Phe Phe Met Arg Glu 35 40 45Ser Lys Thr Leu Gly Ala Val
Gln Ile Met Asn Gly Leu Phe His Ile 50 55 60Ala Leu Gly Gly Leu Leu
Met Ile Pro Ala Gly Ile Tyr Ala Pro Ile65 70 75 80Cys Val Thr Val
Trp Tyr Pro Leu Trp Gly Gly Ile Met Tyr Ile Ile 85 90 95Ser Gly Ser
Leu Leu Ala Ala Thr Glu Lys Asn Ser Arg Lys Cys Leu 100 105 110Val
Lys Gly Lys Met Ile Met Asn Ser Leu Ser Leu Phe Ala Ala Ile 115 120
125Ser Gly Met Ile Leu Ser Ile Met Asp Ile Leu Asn Ile Lys Ile Ser
130 135 140His Phe Leu Lys Met Glu Ser Leu Asn Phe Ile Arg Ala His
Thr Pro145 150 155 160Tyr Ile Asn Ile Tyr Asn Cys Glu Pro Ala Asn
Pro Ser Glu Lys Asn 165 170 175Ser Pro Ser Thr Gln Tyr Cys Tyr Ser
Ile Gln Ser Leu Phe Leu Gly 180 185 190Ile Leu Ser Val Met Leu Ile
Phe Ala Phe Phe Gln Glu Leu Val Ile 195 200 205Ala Gly Ile Val Glu
Asn Glu Trp Lys Arg Thr Cys Ser Arg Pro Lys 210 215 220Ser Asn Ile
Val Leu Leu Ser Ala Glu Glu Lys Lys Glu Gln Thr Ile225 230 235
240Glu Ile Lys Glu Glu Val Val Gly Leu Thr Glu Thr Ser Ser Gln Pro
245 250 255Lys Asn Glu Glu Asp Ile Glu Ile Ile Pro Ile Gln Glu Glu
Glu Glu 260 265 270Glu Glu Thr Glu Thr Asn Phe Pro Glu Pro Pro Gln
Asp Gln Glu Ser 275 280 285Ser Pro Ile Glu Asn Asp Ser Ser Pro 290
29539228PRTHomo sapiens 39Met Pro Val Lys Gly Gly Thr Lys Cys Ile
Lys Tyr Leu Leu Phe Gly1 5 10 15Phe Asn Phe Ile Phe Trp Leu Ala Gly
Ile Ala Val Leu Ala Ile Gly 20 25 30Leu Trp Leu Arg Phe Asp Ser Gln
Thr Lys Ser Ile Phe Glu Gln Glu 35 40 45Thr Asn Asn Asn Asn Ser Ser
Phe Tyr Thr Gly Val Tyr Ile Leu Ile 50 55 60Gly Ala Gly Ala Leu Met
Met Leu Val Gly Phe Leu Gly Cys Cys Gly65 70 75 80Ala Val Gln Glu
Ser Gln Cys Met Leu Gly Leu Phe Phe Gly Phe Leu 85 90 95Leu Val Ile
Phe Ala Ile Glu Ile Ala Ala Ala Ile Trp Gly Tyr Ser 100 105 110His
Lys Asp Glu Val Ile Lys Glu Val Gln Glu Phe Tyr Lys Asp Thr 115 120
125Tyr Asn Lys Leu Lys Thr Lys Asp Glu Pro Gln Arg Glu Thr Leu Lys
130 135 140Ala Ile His Tyr Ala Leu Asn Cys Cys Gly Leu Ala Gly Gly
Val Glu145 150 155 160Gln Phe Ile Ser Asp Ile Cys Pro Lys Lys Asp
Val Leu Glu Thr Phe 165 170 175Thr Val Lys Ser Cys Pro Asp Ala Ile
Lys Glu Val Phe Asp Asn Lys 180 185 190Phe His Ile Ile Gly Ala Val
Gly Ile Gly Ile Ala Val Val Met Ile 195 200 205Phe Gly Met Ile Phe
Ser Met Ile Leu Cys Cys Ala Ile Arg Arg Asn 210 215 220Arg Glu Met
Val22540197PRTHomo sapiens 40Met Cys Thr Gly Lys Cys Ala Arg Cys
Val Gly Leu Ser Leu Ile Thr1 5 10 15Leu Cys Phe Val Cys Ile Val Ala
Asn Ala Leu Leu Leu Val Pro Asn 20 25 30Gly Glu Thr Ser Trp Thr Asn
Thr Asn His Leu Ser Leu Gln Val Trp 35 40 45Leu Met Gly Gly Phe Ile
Gly Gly Gly Leu Met Val Leu Cys Pro Gly 50 55 60Ile Ala Ala Val Arg
Ala Gly Gly Lys Gly Cys Cys Gly Ala Gly Cys65 70 75 80Cys Gly Asn
Arg Cys Arg Met Leu Arg Ser Val Phe Ser Ser Ala Phe 85 90 95Gly Val
Leu Gly Ala Ile Tyr Cys Leu Ser Val Ser Gly Ala Gly Leu 100 105
110Arg Asn Gly Pro Arg Cys Leu Met Asn Gly Glu Trp Gly Tyr His Phe
115 120 125Glu Asp Thr Ala Gly Ala Tyr Leu Leu Asn Arg Thr Leu Trp
Asp Arg 130 135 140Cys Glu Ala Pro Pro Arg Val Val Pro Trp Asn Val
Thr Leu Phe Ser145 150 155 160Leu Leu Val Ala Ala Ser Cys Leu Glu
Ile Val Leu Cys Gly Ile Gln 165 170 175Leu Val Asn Ala Thr Ile Gly
Val Phe Cys Gly Asp Cys Arg Lys Lys 180 185 190Gln Asp Thr Pro His
19541202PRTHomo sapiens 41Met Cys Tyr Gly Lys Cys Ala Arg Cys Ile
Gly His Ser Leu Val Gly1 5 10 15Leu Ala Leu Leu Cys Ile Ala Ala Asn
Ile Leu Leu Tyr Phe Pro Asn 20 25 30Gly Glu Thr Lys Tyr Ala Ser Glu
Asn His Leu Ser Arg Phe Val Trp 35 40 45Phe Phe Ser Gly Ile Val Gly
Gly Gly Leu Leu Met Leu Leu Pro Ala 50 55 60Phe Val Phe Ile Gly Leu
Glu Gln Asp Asp Cys Cys Gly Cys Cys Gly65 70 75 80His Glu Asn Cys
Gly Lys Arg Cys Ala Met Leu Ser Ser Val Leu Ala 85 90 95Ala Leu Ile
Gly Ile Ala Gly Ser Gly Tyr Cys Val Ile Val Ala Ala 100 105 110Leu
Gly Leu Ala Glu Gly Pro Leu Cys Leu Asp Ser Leu Gly Gln Trp 115 120
125Asn Tyr Thr Phe Ala Ser Thr Glu Gly Gln Tyr Leu Leu Asp Thr Ser
130 135 140Thr Trp Ser Glu Cys Thr Glu Pro Lys His Ile Val Glu Trp
Asn Val145 150 155 160Ser Leu Phe Ser Ile Leu Leu Ala Leu Gly Gly
Ile Glu Phe Ile Leu 165 170 175Cys Leu Ile Gln Val Ile Asn Gly Val
Leu Gly Gly Ile Cys Gly Phe 180 185 190Cys Cys Ser His Gln Gln Gln
Tyr Asp Cys 195 20042145PRTRattus sp. 42Met Ser Ser Phe Ser Thr Gln
Thr Pro Tyr Pro Asn Leu Ala Val Pro1 5 10 15Phe Phe Thr Ser Ile Pro
Asn Gly Leu Tyr Pro Ser Lys Ser Ile Val 20 25 30Ile Ser Gly Val Val
Leu Ser Asp Ala Lys Arg Phe Gln Ile Asn Leu 35 40 45Arg Cys Gly Gly
Asp Ile Ala Phe His Leu Asn Pro Arg Phe Asp Glu 50 55 60Asn Ala Val
Val Arg Asn Thr Gln Ile Asn Asn Ser Trp Gly Pro Glu65 70 75 80Glu
Arg Ser Leu Pro Gly Ser Met Pro Phe Ser Arg Gly Gln Arg Phe 85 90
95Ser Val Trp Ile Leu Cys Glu Gly His Cys Phe Lys Val Ala Val Asp
100 105 110Gly Gln His Ile Cys Glu Tyr Ser His Arg Leu Met Asn Leu
Pro Asp 115 120 125Ile Asn Thr Leu Glu Val Ala Gly Asp Ile Gln Leu
Thr His Val Glu 130 135 140Thr14543318PRTHomo sapiens 43Met Met Leu
Ser Leu Asn Asn Leu Gln Asn Ile Ile Tyr Asn Pro Val1 5 10 15Ile Pro
Phe Val Gly Thr Ile Pro Asp Gln Leu Asp Pro Gly Thr Leu 20 25 30Ile
Val Ile Arg Gly His Val Pro Ser Asp Ala Asp Arg Phe Gln Val 35 40
45Asp Leu Gln Asn Gly Ser Ser Met Lys Pro Arg Ala Asp Val Ala Phe
50 55 60His Phe Asn Pro Arg Phe Lys Arg Ala Gly Cys Ile Val Cys Asn
Thr65 70 75 80Leu Ile Asn Glu Lys Trp Gly Arg Glu Glu Ile Thr Tyr
Asp Thr Pro 85 90 95Phe Gln Lys Glu Lys Lys Ser Phe Glu Ile Val Ile
Met Val Leu Lys 100 105 110Ala Lys Phe Gln Val Ala Val Asn Gly Lys
His Thr Leu Leu Tyr Gly 115 120 125His Arg Ile Gly Pro Glu Lys Ile
Asp Thr Leu Gly Ile Tyr Gly Lys 130 135 140Val Asn Ile His Ser Ile
Gly Phe Ser Phe Ser Ser Asp Leu Gln Ser145 150 155 160Thr Gln Ala
Ser Ser Leu Glu Leu Thr Glu Ile Ser Arg Glu Asn Val 165 170 175Pro
Lys Ser Gly Thr Pro Gln Leu Arg Leu Pro Phe Ala Ala Arg Leu 180 185
190Asn Thr Pro Met Gly Pro Gly Arg Thr Val Val Val Lys Gly Glu Val
195 200 205Asn Ala Asn Ala Lys Ser Phe Asn Val Asp Leu Leu Ala Gly
Lys Ser 210 215 220Lys Asp Ile Ala Leu His Leu Asn Pro Arg Leu Asn
Ile Lys Ala Phe225 230 235 240Val Arg Asn Ser Phe Leu Gln Glu Ser
Trp Gly Glu Glu Glu Arg Asn 245 250 255Ile Thr Ser Phe Pro Phe Ser
Pro Gly Met Tyr Phe Glu Met Ile Ile 260 265 270Tyr Cys Asp Val Arg
Glu Phe Lys Val Ala Val Asn Gly Val His Ser 275 280 285Leu Glu Tyr
Lys His Arg Phe Lys Glu Leu Ser Ser Ile Asp Thr Leu 290 295 300Glu
Ile Asn Gly Asp Ile His Leu Leu Glu Val Arg Ser Trp305 310
31544128PRTHomo sapiens 44Met Arg Thr Ala Leu Leu Leu Leu Ala Ala
Leu Ala Val Ala Thr Gly1 5 10 15Pro Ala Leu Thr Leu Arg Cys His Val
Cys Thr Ser Ser Ser Asn Cys 20 25 30Lys His Ser Val Val Cys Pro Ala
Ser Ser Arg Phe Cys Lys Thr Thr 35 40 45Asn Thr Val Glu Pro Leu Arg
Gly Asn Leu Val Lys Lys Asp Cys Ala 50 55 60Glu Ser Cys Thr Pro Ser
Tyr Thr Leu Gln Gly Gln Val Ser Ser Gly65 70 75 80Thr Ser Ser Thr
Gln Cys Cys Gln Glu Asp Leu Cys Asn Glu Lys Leu 85 90 95His Asn Ala
Ala Pro Thr Arg Thr Ala Leu Ala His Ser Ala Leu Ser 100 105 110Leu
Gly Leu Ala Leu Ser Leu Leu Ala Val Ile Leu Ala Pro Ser Leu 115 120
12545299PRTMus sp. 45Met Ala Gln Asn Leu Lys Asp Leu Ala Gly Arg
Leu Pro Ala Gly Pro1 5 10 15Arg Gly Met Gly Thr Ala Leu Lys Leu Leu
Leu Gly Ala Gly Ala Val 20 25 30Ala Tyr Gly Val Arg Glu Ser Val Phe
Thr Val Glu Gly Gly His Arg 35 40 45Ala Ile Phe Phe Asn Arg Ile Gly
Gly Val Gln Gln Asp Thr Ile Leu 50 55 60Ala Glu Gly Leu His Phe Arg
Ile Pro Trp Phe Gln Tyr Pro Ile Ile65 70 75 80Tyr Asp Ile Arg Ala
Arg Pro Arg Lys Ile Ser Ser Pro Thr Gly Ser 85 90 95Lys Asp Leu Gln
Met Val Asn Ile Ser Leu Arg Val Leu Ser Arg Pro 100 105 110Asn Ala
Gln Glu Leu Pro Ser Met Tyr Gln Arg Leu Gly Leu Asp Tyr 115 120
125Glu Glu Arg Val Leu Pro Ser Ile Val Asn Glu Val Leu Lys Ser Val
130 135 140Val Ala Lys Phe Asn Ala Ser Gln Leu Ile Thr Gln Arg Ala
Gln Val145 150 155 160Ser Leu Leu Ile Arg Arg Glu Leu Thr Glu Arg
Ala Lys Asp Phe Ser 165 170 175Leu Ile Leu Asp Asp Val Ala Ile Thr
Glu Leu Ser Phe Ser Arg Glu 180 185 190Tyr Thr Ala Ala Val Glu Ala
Lys Gln Val Ala Gln Gln Glu Ala Gln 195 200 205Arg Ala Gln Phe Leu
Val Glu Lys Ala Lys Gln Glu Gln Arg Gln Lys 210 215 220Ile Val Gln
Ala Glu Gly Glu Ala Glu Ala Ala Lys Met Leu Gly Glu225 230 235
240Ala Leu Ser Lys Asn Pro Gly Tyr Ile Lys Leu Arg Lys Ile Arg Ala
245 250 255Ala Gln Asn Ile Ser Lys Thr Ile Ala Thr Ser Gln Asn Arg
Ile Tyr 260 265 270Leu Thr Ala Asp Asn Leu Val Leu Asn Leu Gln Asp
Glu Ser Phe Thr 275 280 285Arg Gly Ser Asp Ser Leu Ile Lys Gly Lys
Lys 290 2954680PRTHomo sapiens 46Met Asp Cys Val Ile Thr Gly Arg
Pro Cys Cys Ile Gly Thr Lys Gly1 5 10 15Arg Cys Glu Ile Thr Ser Arg
Glu Tyr Cys Asp Phe Met Arg Gly Tyr 20 25 30Phe His Glu Glu Ala Thr
Leu Cys Ser Gln Val His Cys Met Asp Asp 35 40 45Val Cys Gly Leu Leu
Pro Phe Leu Asn Pro Glu Val Pro Asp Gln Phe 50 55 60Tyr Arg Leu Trp
Leu Ser Leu Phe Leu His Ala Gly Ile Leu His Cys65 70 75
80471497DNAHomo sapiens 47tttttttttt tttttttttt cacgttacaa
tctcactata ttcggaatct aaagtagtca 60tccttagaga agtacagagg agaatggtgg
gtacagaggc cagggcgggg agtgtggacc 120gattggatgg ggaaagggag
agtttggtca tcaggcatgc attttcagtt agacaagagg 180aataagttct
gatgtcttgt tgcacagcat
tgagggctgc taattctact ggaatcagcc 240tcagactcct ttgctggagg
ggtcgtgaac cctcaaacaa gcccgagcct ctgcattttc 300ttacttgggg
atcttgatct ctgagtactc attgttggtg gcttcttgac ctggataggt
360cctgaggctc ccccttatga aagctgaggg gtgcatactg ggatctctct
ttcctcccct 420gagggagtgg gcagccaggc catggtgtcg ggggttatca
tctgcccagg actcatccag 480ttaccctgag aggctgagcc cctgaatggt
gtttgcatcc ttcatgccta tgtctcccac 540gtccgctgct ggtcttgccg
atttcttcct gcaggacctc actacaatga agatgacaca 600gaaggagagg
aagaccaggg ctgtggctcc agttcccccg accgccccca gcaacactcc
660tgatacaggc ttcattttgc ctgtgtactc ctgttgcagg gagaggttca
gggaaacgtg 720ctgggaaccc agagagttct gagctcgaca ggtgaattcc
ccttcatccc ccaggtgcac 780ttgcagctcc agtaccagag ggtttgaggg
ctgtgagggg tacagggtca gactcctcca 840ggtccagctc agcctggcag
ggggattgct gtcaacagca cagaccaagc gcagagactg 900gccctctagg
actgaaagag atgagctgtt ccccagagct gtggatgctg tgccttctcc
960ttggaagaca gtcacagtca agttctgagg agggtatgtc acgttcacag
agagctggtc 1020atatttataa ttccatttta tatttccttt ctccatacga
aagaagtatc tccccgcatc 1080actcattctg gcatctctga tgctcagggt
gcaatttttg gtctgtgggt ccccaaggag 1140gtggaatcgg tcccgagttt
cctcctgcac tgcccaagct gggttgtttg tggccactgg 1200agccttccag
cttatatcat tccctgcccg gaaccagtag ccatgaactg ggtcagagtc
1260agtctggctg tccactgggt aggagaagga gcagcgcaca tggacacaca
tgccctcttg 1320cacggtcacg gaactctgca tcgtcagcga gtaatccttc
cggttactct tctgtccttc 1380caccctctcc ctcccccaga gcaggggcag
cagcagcagc agcagcatat ctggggttgg 1440aggtgccagg gccgcgaggg
aacgtctgtt cctcagggtt cttctctcag gaactgc 1497481849DNAHomo sapiens
48ttttttttta caatcaaaat gcacatttat taaactgtac tgtgtgaggg aatgtctgac
60caaagagcat tcccacaggc actcggggca ctggctggaa agcgctctca ggtgttcttc
120tcagaaccgg agaagctgca catcacttgt atttgtaaaa tccttctcca
gtcttcttgc 180cgaacttgtt ctctgctacc agcttattta aggatgggct
gggctgatgt aatgggttct 240ctgcatccat ttcatgccac ccatccacga
tgaacttcgt agtatccagt ccgacataat 300ctagaagctc aaatgggccc
atggggtaac cggctcctaa tttcatagca gtgtcaatgt 360cttctttgga
tgcgtcacct tttaatcaaa agcagagatc caggatatca atctgggccg
420gccatgcaat tgctctggct gggactgtag tgggggtggt gggagttcca
gaaacctttc 480tcagaagccc acagggaaag ccaggaggcc ctcgttcata
cagcctgatt gcttccatga 540ggtatggaac caggaggcgg ttcacaataa
acccaggagt gtccttgcaa gaaacaggat 600gctttcctag ggttttgcta
aagtctacca aagattcaaa tgtcttctgg ctggtcattg 660gtgttttaat
gacctccaca agtttcatga caggcactgg gttgaagaaa tggaggccag
720cgaatcggtc ttgtctggtg gtggcattag ctatgcttgt aatctgcaag
gaggaagtgt 780tgctggcaaa gattgtatgt cttggtgatg tggcaggtgg
ggctgcatgg agattcaggg 840acagctcctc tcaggcagac cctcaggaac
cttaaaagga tgcaggaagc ccagcgtggt 900ggctcacgct ggtaaaccca
gcactttggg aggcctaggc ggggggatcg ctttaggcca 960gcagttcaag
accagcctgc gcaacatagc gagaccccca cctctgtatc actgcagctc
1020ccagagagct cccactcagg gtgaagctct gagtcttggt gatttttagg
agtttatgga 1080aacagccctt ctggtggatg acgtttggag acacatcgcg
tccgtcacag gacacacttc 1140cgatggattt acagcaactc ctggggtagg
tgtggcccgt tgtcatttcg aaggaagagc 1200cagaaaaatc tgtgtagtta
ttcaccccac agcactttag cttctccatg accaagttcc 1260actgtgtaga
atagtcgtct ggctcgttgt aacctctgta attcttcctc agggtcacga
1320aggtgtgttc caaggccaca tctccaacaa ttggaaagaa aagaaggacc
actgtggcag 1380ctgtaacttc catgatgagg acaataacca ttgacaggat
gcaaaacaag agcgtgcctc 1440tgctctcttt agtcgctcca taccacccgg
cacagccaag cagtaccgtg atgcatccca 1500tcaccaggca caggttgcca
acgtgaagga ggtatgcgga ggacagcccg aggacattcg 1560tcagagaggc
ccctccacat ttaccaccaa tgcccaggcc aactaggatg atgccagaca
1620cagccacgaa gccattgagt aaagataaca gtttcttcaa ggaagaatac
ggagtgtgga 1680tttcagccat gctgaacaga ctggggcacc ccgaacattt
aagttaacat cttcctgaac 1740ctctgggcta ctgtttccca agatgatagg
gatctgagga aggggcgcca atgctgactt 1800gctctgcccc tctgtgtcct
ggctccctga tggttgtcca ctcgtgccg 184949741DNAHomo sapiens
49gctttaggcc agcagttcaa gaccagcctg cgcaacatag cgagaccccc acctctgtat
60cactgcagct cccagagagc tcccactcag ggtgaagctc tgagtcttgg tgatttttag
120gagtttatgg aaacagccct tctggtggat gacgtttgga gacacatcgc
gtccgtcaca 180ggacacactt ccgatggatt tacagcaact cctggggtag
gtgtggcccg ttgtcatttc 240gaaggaagag ccagaaaaat ctgtgtagtt
attcacccca cagcacttta gcttctccat 300gaccaagttc cactgtgtag
aatagtcgtc tggctcgttg taacctctgt aattcttcct 360cagggtcacg
aaggtgtgtt ccaaggccac atctccaaca attggaaaga aaagaaggac
420cactgtggca gctgtaactt ccatgatgag gacaataacc attgacagga
tgcaaaacaa 480gagcgtgcct ctgctctctt tagtcgctcc ataccacccg
gcacagccaa gcagtaccgt 540gatgcatccc atcaccaggc acaggttgcc
aacgtgaagg aggtatgcgg aggacagccc 600gaggacattc gtcagagagg
cccctccaca tttaccacca atgcccaggc caactaggat 660gatgccagac
acagccacga agccattgag taaagataac agtttcttca aggaagaata
720cggagtgtgg atttcagcca t 741501288DNAHomo sapiens 50tttttttttt
ttttttttta acattgtaac aggtttatgc attttgaagt gccttctaca 60catccaccca
gaggctctgc tgatttcact tatgcccagg ctataaaatg cctttctctc
120atcccccagt agagcactgg gatcaccact aggcctaggg ggcatatcaa
gggtttaata 180gactggggga atgggcaaca gaactggcta ccttagaggc
tctggaatgc cccccaccca 240tccacccacc aatggaagga aagtcaggca
tcgcctaaaa ggagtggtcc ctatctagcc 300ccaagtctgg agcagaaagg
gcaggtccat tctggcccaa gtgacattgt tagatcctgt 360cccctccccc
aatcactgct gcttgccagg gtgcctcttc acagttccca tgtggcagca
420gtagtggcag aggcagaagt ggacttattg tagattgcag tacagataca
tggacacaat 480tcatggcagc cagctcgagg cccccaattc cagctgccac
accacccacg gtgactgcat 540tagttcggat gtcatacaaa agctgattga
agcaaccctc tactttttgg tcgtgagcct 600tttgcttggt gcaggtttca
ttggctgtgt tggtgacgtt gtcattgcaa cagaatgggg 660gaaaggcact
gttctctttg aagtagggtg agtcctcaaa atccgtatag ttggtgaagc
720cacagcactt gagccctttc atggtggtgt tccacacttg agtgaagtct
tcctgggaac 780cataatcttt cttgatggca ggcactacca gcaacgtcag
gaagtgctca gccattgtgg 840tgtacaccaa ggcgaccaca gcagctgcaa
cctcagcaat gaagatgagg aggaggatga 900agaagaacgt cacgagggca
cacttgctct cagtcttagc accatagcag cccaggaaac 960caagagcaaa
gaccacaacg ccggctgcga tgaggaagta gcccacgttg acaaactgca
1020tggcactgga cgacagtggc ccgaagatct tcagaaagga tgccccatcg
attgacaccc 1080agatgcccac tgccaacagg gctgcaccac acagaaagat
gagcaaattg aagaggatca 1140tcatggtctt aatgaagctg aagcactgca
tggtggctcc tgttcagggc tcttggcagt 1200gagttctgaa agagggaact
gctgaggctc cacaaaggac aaaacagctc ccgggtgttg 1260ccactgagtg
ggcaggcaga gggacgcc 1288511236DNAHomo sapiens 51tttttttttt
ttttttttga atgctcagta tctatttatt aaatgaatgc acaattgaaa 60gagtgaagat
tgaatgtcta gatatttgtt cctctaaccc acaagctcac atgcagcaga
120gcacacgaat gggtgtgcca aggccccaga ggctggatgc ccacatatgt
gctgatgcat 180gtgccctagg gagtccctct tggttaacat ctgtttaaaa
tctctgtctt cttcttgata 240gagaaacagc atggtttcct agactgaatt
gaagaatgaa ggtgagagta caaacagtgg 300gagagacgtt tcctcagctg
tcagaacagg aacgacctgg gttatggaag cccagaaagg 360gaggaggact
tcttttggtc ccagtgaaag atgcttccag aatctgtagc cttacttatt
420tgcttggatc tcactggaat aacttggtgg tgaggtcacc ggttctgggg
tgatcactgg 480gtttgctgca tagatgtttg gatagatgac actcacattg
cttgattgac agcagaccaa 540ctggcagcca aagtgggaag atgcgcatgc
gatgccaaac tccaggaggc agaagaccag 600cagcacgcca gaaatcgcca
ttccagggtt cacaccccag gcgtaaggat aatagtcggg 660gtaggcatat
gggtggggaa tacttagatc tgtgatgaag agtatgactc caactgcaga
720gcagattgca ctgacgatgt tcaagcccaa actgccagac agcaggcaat
aagaatatgg 780ctgattttct gctgccacgg agagagatcc tgaaatgata
aaccacaagc ctccccagaa 840gggaaagcct ccgtagaatg aaatagacag
gtattcccct acgagaaccg tcgccatgat 900ggagccgagg ccgatgtgag
ccaggccaat gatgatctgg atggccccca aggttttgcc 960ttctttcaga
gctttctgca caggctgccc attcacattc gacaccaaac taggtgggtt
1020cccaggaact aggtggactt gcggctggct gtttggatac aggggcacgt
gagacataat 1080tcctggggtc acaggataac cattgtgggg tgccaccacc
aacacagaat tggccaccgg 1140aactgctgaa gtcatcgaat tcatgcttgc
tctgccagca gccacgtatc tcttgatcct 1200attccttttc tttgctgtgg
ggaccttgtt tttttt 1236
52 1115 DNA Homo sapiens 52
tttttttttt tttttttttt
tacttcattt acaatttact agctcttcca gtgtttcaga 60gggatacagg gtttcaacga
tctaacatga atgggataga aggtggactt agaacatagc 120aaacatacat
cttgattgaa tcagcccact gcgagcacgg atcttgattg aatcagccta
180ttggtgtagt tttaggtcta catacatctt gattgaatca acctactggt
gtagtttttt 240ggtctatcag taagtagtgt tgtcagttct ccagccaagc
aactttctaa ttcacagggg 300ggaccctaaa tgtccttaaa ggtagagagg
aataggccca cagatacatt gggttacact 360atctcatact ggttatttgt
tatggcacga gagaggcagt aggcgagaaa gattccaatc 420agttggaagc
aagcaactcc aaaggaaatt cctgcaacga ctcccatttc tgactctata
480atggtcatca cctttataaa acaaccttca ttgtttactt tgtctgcatc
tctctgtgga 540gtacaatctt caagtttaca gcaactctta ggaaatcctt
tttctgagta ataattagta 600tctgtccaat ctctataatc ggtgacacca
caacaatgca acgtattttg gatcttgtct 660actgcatggc ttctataatc
tcctgtagag ttatactgct tcaaagcctt ctcataatta 720ttcttaaagc
tgttcttaat ctcatgtctg aaaacaaatc ctacgatggc agcgaccagt
780tcgaccaaaa aaacgagagt cagaaacatt gcatacagtt ttagcatcca
tgcagaagct 840cggcaggtag caaaacaacc aaaggtgccc aaaagaataa
tgacggtacc agtagcaatg 900agcacgaagg ggacattggt ggccttctca
tttaaaagag aaaagtaatt ctccaggctc 960accttgcccc aaatgccaac
tgcaagaagg ataacgccag tgatccagaa aataaaagtg 1020tagattagca
gaacgctctt gaaacaagta atgactggtt tagtctgcag tctccgagac
1080ggggacgcca tgactagccc gagaccctgc tcgtg 1115
53 1662 DNA Homo sapiens 53
tttttttttt tttttttttg ttgggggctg atacagagtt tattgaatta
gatttttcta 60tttacaactg agatcacatc tacatactat tttgctagtc tacatgggta
cattatttcc 120aacaagctta agacttacca tgaatgggct cattcataca
aaaacacact cacactaatt 180cttttaaaac agtagtgcat acattatact
cctcctataa agccaacttt gattaaaaac 240cactagtttc aaagctcagt
ctctgatttt gaagatgaac caagatatac gccatatgat 300cctacaatct
attttagtca ttttgtacag ctgctatctt attggactac agtaaatatt
360ttttaaaagg acaccaatga ggggcaccat ctggtgttaa ccttaaccag
aaagctggtt 420tcctcctcct ccccccaaaa acctttggcc aagagttctc
cactgtgaag actgaaagga 480cctggtgaca tttcggcatc agtcctgtta
ccacttggag gtaacagaag caggctcgtg 540tcctccttta attctaccac
actacatgac tcgcaattgg ttctgaaatt agaacgttca 600ccatcgtact
taaaatctta ggggcatgaa gagtcagcta gaacaaggaa aaagaaagtc
660gcaggtagta ggtaagtagg tgggcacatg aaaagccaag ctgctctgtc
caacaccagt 720gtacatgtgc tttaactaaa tgaactccag aggccaacag
cagcagacct gctcaattca 780ccttccaaat cagaacaaga ccaaaaagct
caggcttgag attgtcaact atgcataggt 840tccgccagtg atgaagagct
cgtaagcagg atctctactc cttctgcaca acacgatgca 900agcacacagc
atgcccagca gctgaatagc tgcaaatgcc agtgcggccc agatcacatg
960catcatgatt tcttgtagct tcttcacaac tagagcctca cacccctcag
catagaggtc 1020ggaagggtgg gccaggctgc cattacaatt gctggcagtc
tctctgcagc agctaagagg 1080gacactctgg tttttggttt ctttgaacca
atctgtattt tcccagtctg agtagttgtg 1140aattccacaa caatgcagct
gtctctgtac ataatcaata gcccggctag cagcatcagg 1200gttggttcca
ttgtaggtct tatacacttt ctgaatgctg cgatcaacct cattttccac
1260ctttgctctg taaacatatc ccaaaaccac tacaacaact tctgtgacaa
aaaccaagag 1320caggatgatg acaaacgtgg caagtccaca gcgactttcc
cggattgtgg cacagcagcc 1380aattagccca atgatgaaaa gcagggctcc
tacagctatg atcactacag cagggatgag 1440cgtgtacaca tcttcaaaga
agtggtcata gtcatcataa gtgatgaaga cataggctcc 1500cacatagcat
aaaatgccag ctgcccccca gaagatgagg ttgagaaaga ccagcacggt
1560cttggaggag gtgatgccgc actggcccat ggcgccggtg gcccgcgaag
gcccggcccg 1620gagagcgggg ctgcgctcac cgagagagcg gcaatgctcg tg
1662
54 1345 DNA Homo sapiens 54
tttttttttt tttttttttt tgctgctcac
actttattaa gatgcaccag gagccccacg 60ggcccactat ggaatgtagg tgaggggtcc
acggccccgc ctggagcacc aggaccagtg 120gggccacctc caggatgcca
agacccggct cctccagcgc caacctgttt tccaggaagc 180tgggggccac
aggctggctc ccgtgtgaac actgctttgg agaatacgta aatataaaaa
240gtgctggagg cccctcccac ccctgccctg ggttccgcag ccagcacgtg
gccacccctc 300ccaggggggg tccggaggcc ctgaagccac ctgagctagg
gccttccaga aacagggttc 360cgggggctcc aggcacctgt ccctccctcc
ctcctcccag catggggcag aacaccgaag 420ctggtggtgg gaaaagcagc
tgtgggtgcg gccatctccc cgtgggcgtc cttttggcag 480agaagccggc
ggtgggcggc ctacgcgcag taggtgtctg ccttgaccac ttggcagtac
540atggtcatgg cgaaggtcag gcccaggatc tgcaccagcg ccgtgcacag
cccaaagatg 600cccacagcca gcaggttctc ctgaagccac accttcaccg
tctcgtagca cggcgccttc 660caccaggtgc cgggggcgtg cagcccacag
ctctcactga actccaagca gcaggagtca 720ggtacccgcg tggcgttgta
cacctcgaac cagtcagtgt agttggagac gccacagcag 780cggaagtcgg
tctggatgat gctccaggcg ttggtgaggc ccacgttgcc ctgcgtgccg
840tacaggtgca agcctttctt caggtcttgc tgggcatacc tgtcaatctt
gtccgtgtag 900gcgaagaaga ggatggcgat ggtggcctcc agcaggaaca
ccagcagcag cagcaggaag 960aaagtgagca ggaggcactt gttctccttg
atggcaccca ggcagcccac gaagccgatg 1020gccatgacaa aggcgccggt
gatgatgagc aggttggcag ccgacaggga cgggaaggaa 1080gaagacagcg
tggcgaagct cccctgtgtg gcggccagcc agatgccgac acccagcacg
1140ccacagcctc ccagccagaa gaacaggttg aaggcgaaca tgaggtactt
gacggcctgg 1200aggcaggcgc gcgccatgcc gcagcgcttc agttctgggc
tggccacagg aaagagacca 1260ggcggtgctc agggtccctg aaggctgacc
agtgggcagg caggtggtgg gtgcgaccaa 1320ggaagcccca agctctgcgc tcgtg
1345
55 734 DNA Homo sapiens 55
gaacatttgg tatgaaagct ttagattgca
attttcatgt agaagtagct tcatatcaca 60tctcgtgagt ttcgtatcgc acagcagagg
accatgctga atatcatgcc aaagatcgtc 120agacctgcaa ttccaatacc
gacaattcca atgagctgga gcttaacact gattatggtc 180tcaatttcat
cgatgcaatt cttgtgtcct agaagctcct ttgggcatgt aggttggacc
240tgttcggagc tttcttttcc acagcactga aatgttgagt ggaaggtgat
gagtgtccca 300ttgccttttc ccctgtcttt aaggtaatca ttgtaagcct
cttcatacat ggtctgaaca 360tgtcggatag ctaccccctt gcctataaaa
gcaaatactc cagtggttac ttcagcagca 420aatatcacca ggaggcaggt
aaaaaatgat ccaagcacac attgcgactc ccgcatggct 480ccgcagcatc
cgaagaaccc cacggccatc atcagggccc cggctccaac cagaacatac
540agccccacat agaaatactc tggggacttg tcctctgatg ataactcctt
tatggcacct 600ccgaaccgaa accatagtcc aaaagcaatg acggccgatc
cagccagcca gaagagcagg 660ttgaagccaa gcagcaggta cttgatgcac
cgcaggcccc cgcggaagcg ccccatgctg 720cggcccggcg gcgc 73456577DNAHomo
sapiens 56tttttttttt tttttttttt taaacattga aaattttatt ttcacagggg
atttgcataa 60aaagaacatt atttttgttc tgtgtatata taagtatttt tgtttcctta
acttgtttct 120gttgcccaca cacaactagg agaagatgct tttctttatt
ttggtttggc caaagatgct 180aatggttaaa ttatgaagga ctttgtttta
cttatgttaa gtggtgaaaa ctgtagttct 240taatctatga agaattctct
aggtggctat acaagaaaaa tacaaaaagt taggaaaaca 300tgtaaacgta
agtatgaggt atttcataga tacagtgccc atacaaattc tctttcccac
360aattttcaac tgccagatct cttgctttag tcttttttcc ttatatttgg
agaaacagaa 420gagtttgaca taaaagtccc tttgagggat gtgagggttg
cagtagttta cagcagggtc 480agaaaatgaa agtaataaag catatttaca
tgttttgtat aggaccaaaa tatttcccct 540aaaaaggtgt taaaagtttt
ttagtcccat aaacact 577
57 936 DNA Homo sapiens 57
ccttgcagat ggcatgtgcc
acatgttgtc ttgaggggca gttggaagag gaggtggttg 60ccattagtaa aaatgccgct
ggagcctctg gtgcaagcag caggtgagaa ccatcccgca 120gatctgcagg
caggccaccc cgatgcccac tgcccccata agcagcaggt ggtcggccag
180gaactgctcc agcttggtga ggcagcctcc ctccacctta tagatgttgg
aggggtgggc 240ccgctggccg cagcgcgcca ccactgtctt gcagcagctg
tcgggcacct ggcggccctc 300ggcctcccgc aacaggatgt acgtgctgtg
ctgccagtcg gctgagctgt tgcttccgca 360gcacttgaaa tcctgctgga
gtcggtccac tgaggcgtga tctgcgtgct ccggctgccc 420gtagttctca
gccagagtcc ggttcaagtg ctgcttcagt tcatcactca gcctctggta
480atacacatgg gccaggactc ccgccaccag ctcaaccagg aagatgacga
gcaacaggca 540gaaatacgtg gagaggcagc ccttccgctc ccagaggatg
gcaccgaagc ccaggaagcc 600ggtcaccatg acaagtacgc ccgcaaagat
gaggatgtag gcggaggcgg caaaggtgct 660ggaggccagg acgctgaggt
agccactctt ctccaccagg gtccagatgc ccacagccag 720gacggctgct
cccccgaccc agaagaagaa gttgaagaca aagagtaaat acttcaagta
780gatgatcagc cagtcgtcct gctcagtctt atagtgggcc atggcttctg
ggccctgcca 840ggggctccgg aagcggcgaa gggactgcgc ctagagagac
tgagagcgcg gctcccgggg 900ccgcccagcc gcccaccgcc cgcagctcgt gccgaa
936
58 738 DNA Homo sapiens 58
gttttttttt tttttttttg agagcgcaaa
gcagtttatt ctagcgagca agggagtgag 60cgtccaggaa ggagcaggtg taacccggcg
gtcagtggag cctcagtgag gtgtgtcctg 120ttttttcctg caatcgccgc
agaagacacc aatggtcgcg ttcaccagct ggatcccaca 180cagtactatc
tccaggcagg aggcggccac cagcagcgag aagagcgtca cattccaggg
240gaccacgcga gggggcgcct cgcaccgatc ccatagagtg cggttgagca
agtaagctcc 300cgcggtgtct tcgaagtggt agccccactc gccgttcatt
aagcatctgg gtccatttcg 360gagcccagct ccagacaccg agaggcagta
gatggcacca agcaccccga acgccgagga 420gaagaccgag cgcagcatcc
tgcagcggtt tccacagcac ccagcaccac agcagccctt 480gccccctgcc
cgaacggctg caatccctgg acacagtacc attaggcccc cgccaatgaa
540gccgcccatg agccagactt gcaagctgag atggttggtg ttggtccagg
aggtctcccc 600attaggtacc agcaggaggg cgttggccac aatgcagacg
aggcagaggg taatgaggga 660gagccccaca cagcgggcac attttcccgt
acacatggtg aggtgtcagg aaggacaggc 720ggtgagtgaa agtaagct
738
59 1071 DNA Homo sapiens 59
tttgagttag aattttgggg tgtttccaat
tttccagaat attaaataat gttgcaatat 60acatattgca taaaggtttc tctcatattc
ttatttactt ccctagtaca gattcttgga 120aatggatttg tttgatcaaa
gggcagaagt atttctgaag tttctgatgc atattactta 180aaaaacatac
taaatggaag ggtggtatac ttgcatgtgc agtgaggact gcaaattttt
240tacaaataaa caccaaatgc agaaagcacc atcattttca atattgccct
caagtctaac 300acagttgata taatatttag atagatggcc atgtcttgat
aatggaaaac attttgtcct 360tattcaaatg attccaggct ggaagatcac
tgaatagctt ccacacagta tcttggatag 420ttgcatgact actctgatga
ggcagatgat cacttgaagc ccactgaggg ttatgagaat 480ggaaaataaa
atgatgttcc actccacaac atgtgcaggt tccaggcact gaatccatat
540gctagaatct gtaaggaaac gtccagcagt gccttcaaaa gcatactccc
agccatcaag 600ggtgcggcaa tatggccctt ggacaagacc caaggcagag
atgaccaggc agtatccaga 660aaaagcaatt ccgagggaag aaaagataat
tgacagcagt gtcacatatt ttttgctgca 720gttttcactc tggcaacatt
tatagttgtt attattctcc agtaccagaa gaactgttgt 780tactataagc
atcatgatgc ctgagaaaca gattccttca aaataccaca cgtagttggt
840gagtttattg ctggatgcat aggaagtttg cccattcggg aaatacaata
atatgttcac 900gattatactc caaagtgcaa gcggaatcag caaacaactt
aggcagcctc cacacttccg 960agaccccatt ttgccctgct tagaaccact
tccagggtca gagcccttca cttcatattc 1020atgaggagac ggggaattgg
aatatacccg cagccgacaa tctctcgtgc c 1071
60 865 DNA Homo sapiens 60
tttttttttt tttttttttt tttttttttt tttttaaata aataacctta tttatttatt
60tgaccacaaa taaataaata ataaagaata aatgaccaca ttctttaagc cattgtattt
120tggagtatct ttgttagtct acaacacttg cctctgtgag tgcagtccag
gccctgaagg 180cagatggagc catatcccag gctcctggtg gaggaaggtg
cagagttcga ggaacctttg 240cactcttgtg ccttccctca ggcccaaagc
tcctgtagac tcagtctcgt gaccccagag 300gtgaaccagg ccctgatgtg
ctggtgttca ggtggatggt ttaatgaggg gaggagagtg 360gggccctggg
gagtggctga ctcttgtttt ctgcggtatt tcctggaacc atccttcagg
420agtggacaca gtagagctgg acacttccac tgatccggag ctcccgcagc
tgctccaggg 480cctgctggtt catgctggtg gcccccagcc cctgcccatt
gagcgccagc ttcagccctc 540cctcctggaa caggagcagc acctcaaaga
atctctgggg gtaaaagagg aagggggctg 600agatcagttt cttctgcccc
cagcgggaga tccaggccag agttctgtct gcgaaggagg 660ccctgagtgt
cacaggagca tgggcagcct ggtccctcag gctcacagta aaatgcttcg
720gctcttgcaa gaccagtccc cgtactatga tgacctgccc aggcgagaga
ccctggggaa 780gagcatgtga gcagggcacc tccagcctgg ggctcatcag
caggaaagga tgtccagctg 840ggtactctct gctgccctcc acaaa
865
61 441 DNA Homo sapiens 61
tttttttttt tttttttttt tactggttta
aatcatttat ctccatgtag agatttgagt 60acaaaaataa acggcaacaa aacagaagga
gtgtgaaatc cggggatcca cagggcttct 120gtcctccacc ttccatgcag
ctgggggctg catcctctgt ggggtggctt catcctctgt 180ggggtctgtg
gggcctgctc caagtcatca gcattccatg cccacctgga cctggtccca
240gactttcggg gaagccctca gaggctccac tgtgttcgtg gtcttgcaga
agcgagagct 300ggccgggcag accacagaat gcttgcagtt gctggagctg
gtgcacacgt ggcagcgcag 360ggtaagggct ggccctgtag ccacagccag
ggctgcaagg agcagcaatg ctgtcctcat 420ctctgatgtc gtctctcgtg c
441
62 1066 DNA Homo sapiens 62
tcaaaaactg gagaagcaga tccacttctt
gtgggggtgg agttcttggt gactaggctc 60atttcttacc cttgatgagg ctgtcacttc
ccctggtgaa actttcatcc tgtaggttca 120gcacaaggtt gtcagctgtg
agatagatac gattctgtga tgtggcgatc gtcttggaga 180tattctgggc
tgctcgaatc ttgcgaagtt tgatgtagcc agggttcttg ctcagtgctt
240ctccaagcat cttggcagcc tcggcctcac cctcggcctg cacaattttc
tgccgctgtt 300cctgctttgc tttttctacc aagaattggg cccgctgggc
ctcctgctgg gccacttgtt 360tggcttctac agcagctgtg tactctcggc
taaagctcag ctctgtgatg gccacatcat 420ccaggatgag gctgaagtcc
ttggccctct ctgtcagctc ccggcggatc aacagggata 480cctgggcccg
ctgggtgatc agctgtgagg cattgaactt ggccaccaca ctcttgagca
540cctcgttgac aatggacggc aacactcgtt cctcgtagtc cagccctagg
cgctggtaca 600tgctaggaag ctcctgagca ttgggtcgag acaacactcg
cagggagata ttcaccatct 660gtaggtcttt ggagcctgta ggggaggaga
tttttcgagg tctggcccga atgtcataga 720taatggggta ctggaaccaa
gggatcctga agtgaaggcc ctcggccagg atagtgtcct 780gctgcactcc
accgatccga ttgaagaaga tggctctgtg cccgccttcc acggtgaaca
840cagattcgcg cacaccgtag gccacggcgc cggcccccag caacagcttc
agggccgtgc 900ccatgccccg gggcccggcg ggcagccgtc ccgccaagtc
cttcaagttc tgggccatgt 960ctgatcttga ggccggcggc actggaggtc
agaagggggt gccggcccgc ctctaccccg 1020ctccggctta ggtactgcac
ccttcacacg agggttcggg cccgct 1066
63 704 DNA Homo sapiens 63
gcgatagggt
gccagccgga ccaccttcct ccaggtcccc ttcagggatg tcatttgaaa 60agacccacaa
gacaccgagg gcagtgcacc cacactgggg gagggacaca ccaggcagcg
120gccttcatcc agacacttca ggagcagtag tcttcggcag gttggaccga
ccataccctg 180gcccaggaac ctttccccag gggtccgcct gggtgtccat
cacaggggtc gggttcgggg 240gcattgagga cctcttaccc aacatgtagg
aaaagagaca gccagacctg taaaactgat 300ctgggacctc agggttgagg
aaggcagcag cccaacacac cttgtccaaa acagtgcacc 360tgggaatggg
catatggtgc tcagcgcccc agaagcagcc tgagtctgag gctccgacat
420gggagggagt gcgtgagccc cggccccagc aggctgccat gcccagcgtc
ctctcctcca 480ctactccagg cctgcctcgc ctcacctggg agcagagtgt
tgcttcctca tggaaatagc 540cgtgcatgaa ctcacagtat tcccgggtgg
tgatctcaca gctgcccttg gtgccgatgc 600agcaggggcg gcccttgatc
tcgcagtcca tgtgcaggaa gcctgtgtgg ttgctcctgg 660cctgctctgt
gcagatcggc cacttagtga tgtcatctcg tgcc 704
* * * * *