U.S. patent application number 09/972115 was filed with the patent office on 2001-10-05 and published on 2003-02-13 for second mammalian tankyrase.
Invention is credited to Funk, Walter D.; Morin, Gregg B.; Piatyszek, Mieczyslaw A.
Application Number | 20030032769 09/972115 |
Document ID | / |
Family ID | 26826720 |
Filed Date | 2001-10-05 |
United States Patent
Application |
20030032769 |
Kind Code |
A1 |
Morin, Gregg B. ; et
al. |
February 13, 2003 |
Second mammalian tankyrase
Abstract
A new protein named Tankyrase II is described in this
disclosure. Sequences for the human Tankyrase II cDNA and the
protein translation product are provided. Also provided are species
homologs, muteins, related nucleic acids, peptides, and drug
screening assays. Tankyrase II interacts with telomere-associated
proteins, thereby affecting telomerase activity and potentially
telomere length. The materials and techniques provided in this
disclosure allow Tankyrase II activity to be studied in vitro and
manipulated inside cells--to the potential benefit of clinical
conditions associated with a defect in telomerase activity, or the
replicative capacity of affected cells.
Inventors: |
Morin, Gregg B.; (Davis,
CA) ; Funk, Walter D.; (Hayward, CA) ;
Piatyszek, Mieczyslaw A.; (Morgan Hill, CA) |
Correspondence
Address: |
GERON CORPORATION
230 CONSTITUTION DRIVE
MENLO PARK
CA
94025
|
Family ID: |
26826720 |
Appl. No.: |
09/972115 |
Filed: |
October 5, 2001 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
09972115 | Oct 5, 2001 | |
PCT/US00/09558 | Apr 10, 2000 | |
60128577 | Apr 9, 1999 | |
60129123 | Apr 13, 1999 | |
Current U.S.
Class: |
530/300 ;
536/23.5; 536/24.31 |
Current CPC
Class: |
C12Q 2600/136 20130101;
C12Q 1/6886 20130101 |
Class at
Publication: |
530/300 ;
536/23.5; 536/24.31 |
International
Class: |
C07K 002/00; C07K
005/00; C07K 014/00; C07K 017/00; C07K 004/00; C07K 007/00; C07K
016/00; A61K 038/00; C07H 021/04 |
Claims
What is claimed as the invention is:
1. An isolated polynucleotide that hybridizes under stringent
conditions to a polynucleotide with the sequence in SEQ. ID NO:5,
but not to a polynucleotide with the sequence in SEQ. ID NO:7, and
which encodes a peptide that has ribosylation activity.
2. An isolated polynucleotide that comprises a sequence of at least
30 consecutive nucleotides contained in SEQ. ID NO:5, but not in
SEQ. ID NO:7, and which encodes a peptide that has ribosylation
activity.
3. An isolated polynucleotide encoding a protein that comprises
sequence of at least 25 consecutive amino acids that is at least
90% identical to a Tankyrase II protein sequence contained in SEQ.
ID NO:6, and that further comprises a PARP domain, a SAM domain,
and an ANK domain.
4. The isolated polynucleotide of claim 1, comprising at least 100
consecutive nucleotides contained in SEQ. ID NO:5.
5. The isolated polynucleotide of claim 1, which encodes a peptide
comprising at least 10 consecutive amino acids contained in SEQ. ID
NO:6.
6. The isolated polynucleotide of claim 1, which encodes a peptide
comprising at least 25 consecutive amino acids contained in SEQ. ID
NO:6.
7. The isolated polynucleotide of claim 1, which ADP-ribosylates a
target protein in the presence of nicotinamide adenine dinucleotide
(NAD+).
8. A method for ribosylating a target protein, comprising
incubating the target protein with a peptide according to claim 1
in the presence of NAD+.
9. A method of screening a test compound for an ability to affect
Tankyrase II activity, comprising incubating a reaction mixture
containing the peptide encoded by the polynucleotide of claim 1, a
target protein, a substrate, and the test compound, under
conditions where the target protein would be ribosylated by the
peptide in the absence of the test compound; and determining any
effect of the test compound on the amount or rate of
ribosylation.
10. The method of claim 8, wherein the substrate is nicotinamide
adenine dinucleotide (NAD+).
11. The method of claim 8, wherein the target protein is selected
from TRF1, TRF2, TIN2, Tankyrase I, and Tankyrase II, and fragments
thereof.
12. The method of claim 8, wherein the peptide is expressed by a
recombinant host cell in the reaction mixture.
13. The method of claim 8, wherein the peptide is isolated from a
host cell before being added to the reaction mixture.
14. The method of claim 8, further comprising determining whether
the compound affects the activity of ribosylation enzymes other
than Tankyrase II.
15. The method of claim 8, wherein the compound enhances the amount
or rate of ribosylation.
16. The method of claim 8, wherein the compound inhibits the amount
or rate of ribosylation.
17. A method of screening a test compound for an ability to affect
Tankyrase II activity, comprising expressing the polynucleotide of
claim 1 in a host cell, combining the cell with the test compound,
and determining any effect of the test compound on the cell, in
comparison with a cell expressing Tankyrase II in the absence of
the compound.
18. An isolated polynucleotide that has at least one of the
following properties: a) it hybridizes under stringent conditions
to a polynucleotide with the sequence in SEQ. ID NO:5, but not to a
polynucleotide with the sequence in SEQ. ID NO:7, and encodes a
peptide that has ribosylation activity; b) it comprises a sequence
of at least 30 consecutive nucleotides contained in SEQ. ID NO:5,
but not in SEQ. ID NO:7, and encodes a peptide that has
ribosylation activity; or c) it encodes a protein that comprises
sequence of at least 25 consecutive amino acids that is at least
90% identical to a Tankyrase II protein sequence contained in SEQ.
ID NO:6, and that further comprises a PARP domain, a SAM domain,
and an ANK domain.
Description
RELATED APPLICATIONS
[0001] This application is a continuation of PCT/US00/09558, an
International patent application designating the U.S., filed on
Apr. 10, 2000, and published on Oct. 19, 2000 as WO 00/61813. Both
the PCT application and this continuation claim the priority basis
of U.S. Provisional Patent Application Nos. 60/128,577, filed Apr.
9, 1999; and 60/129,123, filed Apr. 13, 1999. The aforelisted
priority documents are hereby incorporated herein by reference in
their entirety.
TECHNICAL FIELD
[0002] This invention relates generally to the field of molecular
biology of telomeres and telomere-associated proteins, and the
maintenance of telomere structure. More specifically, this
invention relates to a novel protein that shares three domains of
homology with the telomerase-associated protein Tankyrase I.
BACKGROUND
[0003] Recent research has described what may be a key switch in
the control of cellular aging. The telomeres at chromosome ends are
made up of multiple repeats of the DNA sequence TTAGGG, which are
thought to stabilize the chromosome during replication. Telomeres
shorten each time the cell divides, and cells become senescent when
the telomeres are too short to protect the chromosome. But in some
cells, including embryonic cells, an enzyme called telomerase
rebuilds the telomeres after each division, extending the
replicative capacity of the cell (Bodnar et al., Science 279:349,
1998; Harley et al., Curr. Opin. Genet. Dev. 5:249, 1995).
[0004] Regulation of telomerase activity is a complex process
involving several protein components. Two such proteins have DNA
binding activity, and are named telomeric repeat binding factors
(TRF1 and TRF2). It is thought that TRF1 is involved in regulating
telomere length, because overexpression of wild-type TRF1 makes
telomeres shorter, while overexpression of a dominant-negative form
of TRF1 makes telomeres longer--perhaps by affecting the access of
telomerase to the chromosome terminus (van Steensel et al., Nature,
385:740, 1997). TRF1 promotes parallel pairing of telomeric tracts,
apparently pairing in parallel homodimers that form filamentous
structures on longer telomeric repeat arrays (Griffith et al., J.
Mol. Biol. 278:79-88, 1998).
[0005] The role of TRF2 appears to be protection of the chromosome
terminus, since expression of a dominant-negative form of TRF2
leads to chromosome-chromosome fusions (Griffith et al., J. Mol.
Biol. 278:79, 1998; Broccoli et al., Nature Genetics 17:231, 1997;
van Steensel et al., Cell 92:401, 1998). This in turn leads to
p53- and ATM-dependent apoptosis of the cell (Karlseder et al.,
Science 283:1321, 1999). TRF1 and TRF2 have been implicated in
large duplex loops at the end of telomeres that may provide a
general mechanism for telomere protection and replication (Griffith
et al., Cell 97:503, 1999).
[0006] Smith et al. (Science 282:1484, 1998; Genomics 57:320, 1999;
J. Cell Sci. 112:3649, 1999) have reported a novel protein that
associates with TRF1, which they named "Tankyrase". A yeast
two-hybrid screen was used with human TRF1 as bait, and yielded two
overlapping cDNAs which provided the full-length sequence. Northern
blot analysis revealed that multiple mRNAs were ubiquitously
expressed in human tissues, with the highest amounts detectable in
testes. It has been proposed that tankyrase interferes with the
binding of TRF1 to telomeres, which in turn has an effect on
telomere length. Tankyrase co-localizes with TRF1 at the ends of
human chromosomes in metaphase and interphase, and also resides at
nuclear pore complexes and centrosomes. Smith et al. reported that
the gene for tankyrase is positioned at 17.6 cR10000 on human
chromosome 8 with a LOD of 8.2 on the G3 map.
[0007] The molecular events involved in managing chromosome
structure and regulating cell senescence are extremely complex.
Each new protein found to participate in this process provides new
opportunities for monitoring and intervening in some of the
fundamental events of cell biology.
SUMMARY OF THE INVENTION
[0008] This invention provides a new human protein which is hereby
designated Tankyrase II. This new protein shares three domains with
the Tankyrase protein of Smith et al.: the ANK domain comprising 24
repeats of the ankyrin motif, the SAM domain thought to be involved
in protein-protein interaction, and the PARP domain that is
responsible for the poly(ADP-ribose) polymerase activity. Tankyrase
II further comprises a new domain at the N-terminus, designated
the GC domain, which has no known homologs.
[0009] One of the embodiments of this invention is an isolated
polynucleotide having at least about 30 consecutive nucleotides
contained in a human Tankyrase II encoding sequence, or that is
contained in plasmids deposited under Accession No. 203919, or that
hybridizes under stringent conditions to a Tankyrase II encoding
sequence, but does not consist of the encoding sequence for human
Tankyrase I or other previously known structurally related
proteins, such as those having PARP activity. Another embodiment of
this invention is an isolated polynucleotide having at least 100
consecutive nucleotides that is at least 90% identical to a
Tankyrase II sequence, or contained in the deposited plasmids, but
not in λ-phage, Tankyrase I, or other previously known
sequences. Certain polynucleotides of this invention encode a
protein comprising a GC domain, a PARP domain, a SAM domain, or an
ANK domain, or a protein that binds other telomere-associated
proteins like TRF1, TRF2, TIN2, and Tankyrase I, or that
ADP-ribosylates a target protein in the presence of NAD+.
Polynucleotides of this invention can be used to obtain the encoded
polypeptide, or to determine other polynucleotides that encode
Tankyrase II-like protein.
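The criterion of a stretch of consecutive nucleotides that is contained in one sequence but not in another can be illustrated with a short Python sketch. The function name `unique_windows` and the toy sequences are hypothetical and are not taken from any SEQ. ID NO of this disclosure; the sketch also ignores the complementary strand, which a real comparison would need to consider.

```python
def unique_windows(query: str, other: str, k: int = 30) -> list[tuple[int, str]]:
    """Return (offset, window) for every k-length window of `query` absent from `other`."""
    query, other = query.upper(), other.upper()
    # Collect every k-length window of the comparison sequence for fast lookup.
    other_windows = {other[i:i + k] for i in range(len(other) - k + 1)}
    return [
        (i, query[i:i + k])
        for i in range(len(query) - k + 1)
        if query[i:i + k] not in other_windows
    ]


# Toy sequences (hypothetical), using k=4 so the result is easy to inspect.
seq_a = "ATGGCGTTAGGG"
seq_b = "CCATGGCGAA"
hits = unique_windows(seq_a, seq_b, k=4)
for offset, window in hits:
    print(offset, window)
```

With real data, `k` would be set to the claimed minimum length (for example 30), and a non-empty result would indicate that the query carries at least one stretch not shared with the comparison sequence.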
[0010] Another embodiment of this invention is an isolated
polypeptide comprising a sequence of at least 10 consecutive amino
acids that is contained in Tankyrase II, or is contained in the
deposited plasmids, but is not contained in any previously known
peptide sequence. Another embodiment of this invention is an
isolated polypeptide comprising a sequence of at least 25
consecutive amino acids that is at least 90% identical to a
Tankyrase II protein sequence, or a protein sequence encoded in the
deposited plasmids. Certain polypeptides of this invention comprise
a GC domain, a PARP domain, a SAM domain, or an ANK domain, or have
activity for binding other telomere-associated proteins like TRF1,
TRF2, TIN2, and Tankyrase I, or ADP-ribosylate a target protein in
the presence of NAD+.
[0011] A further embodiment of this invention is an isolated human
Tankyrase II protein or fragment thereof, at least 10-fold higher
in purity (or more) on a weight per weight basis than what occurs
in natural sources.
[0012] Also embodied in this invention are polynucleotides encoding
the polypeptides of this invention, and antibodies of any sort that
bind specifically to the polypeptides of this invention. Some of
the antibodies inhibit the catalytic activity of Tankyrase II;
inhibit the binding of Tankyrase II to other telomere-associated
proteins; or inhibit protein ribosylation mediated by Tankyrase II.
Peptides can be obtained by expressing a polynucleotide of the
invention in a suitable host cell. Also provided are means for
obtaining any antibody of this invention, comprising immunizing an
animal or contacting an immunocompetent particle with a polypeptide
of this invention. Peptides of this invention can be isolated from
a mixture by using an antibody as a specific adsorbent; conversely,
antibodies of this invention can be isolated using a peptide
epitope as a specific adsorbent.
[0013] A further embodiment of this invention is a method for
ribosylating a target protein, comprising incubating the target
protein with a peptide of this invention in the presence of
NAD+.
[0014] Assay methods of this invention include determining
Tankyrase II binding activity in a sample by incubating the sample
with a peptide of this invention under conditions where a protein
having such activity can bind the peptide specifically to form a
complex, and then correlating any complex formed with the presence
or amount of the protein in the sample.
The protein that has Tankyrase II binding activity can optionally
be TRF1, TRF2, TIN2, or Tankyrase I.
[0015] Another assay method of this invention is for screening a
test compound to determine an ability to affect Tankyrase II
activity, comprising incubating the compound with a reaction
mixture containing a peptide of this invention and a conjugate
binding ligand, and determining any effect of the test compound on
complex formation.
Another such method comprises incubating a test compound with a
peptide of this invention, a potential target protein, and
NAD+; then determining any effect of the test compound on the
amount or rate of ribosylation of the target.
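One simple way to express the effect of a test compound on the amount of ribosylation is as a signed percent change relative to a reaction run without the compound. The Python sketch below assumes a generic numeric readout (for example, incorporated label); the function names, the 20% cutoff, and the sample values are hypothetical and are not part of this disclosure.

```python
def percent_change(with_compound: float, without_compound: float) -> float:
    """Signed percent change in ribosylation signal relative to the no-compound control."""
    if without_compound <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (with_compound - without_compound) / without_compound


def classify(change: float, threshold: float = 20.0) -> str:
    """Label a compound by its effect; the threshold is an arbitrary illustrative cutoff."""
    if change <= -threshold:
        return "inhibitor"
    if change >= threshold:
        return "enhancer"
    return "no clear effect"


# Hypothetical readouts in arbitrary units: control reaction versus reaction with compound.
control_signal = 1000.0
treated_signal = 250.0
change = percent_change(treated_signal, control_signal)
print(change, classify(change))
```

A rate-based comparison would follow the same pattern, with the readout replaced by the slope of signal against time for each reaction.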
[0016] This invention also includes a method for modulating
Tankyrase II expression in a cell, comprising contacting the cell
with a polynucleotide of this invention, such as an antisense
polynucleotide, a ribozyme, or an inhibitory RNA, under conditions
where the polynucleotide can interfere with mRNA translation.
Modulating Tankyrase II expression in turn is believed to modulate
telomere length in the cell.
[0017] These and other embodiments of the invention will be
apparent from the description that follows.
BRIEF DESCRIPTION OF THE FIGURES
[0018] FIG. 1 is a schematic depiction of Tankyrase II protein. The
domains depicted are the GC domain (encoded by a gene segment rich
in GC); the ANK domain, containing 24 ankyrin repeats thought to be
involved in protein-protein interaction; the sterile alpha motif
(SAM) domain, thought to be involved in cellular signaling; and the
poly(ADP-ribose) polymerase (PARP) domain, with enzymatic activity
for ribosylating target proteins such as TRF1.
[0019] FIGS. 2, 3, and 4 are sequence listings showing cDNA and
amino acid sequence data for human Tankyrase II (SEQ. ID NOs: 1 to
6). The data from FIGS. 2-3 were obtained as described in Examples
1-4; the data from FIG. 4 were obtained as described in Examples
6-7.
[0020] FIG. 5 is a sequence listing comparing Tankyrase II (SEQ. ID
NO:6) with its closest known intraspecies homolog, Tankyrase I
(SEQ. ID NO:8), at the protein level.
[0021] FIG. 6 is a sequence listing comparing Tankyrase II (SEQ. ID
NO:5) with Tankyrase I (SEQ. ID NO:7), at the cDNA level.
DETAILED DESCRIPTION OF THE INVENTION
[0022] This disclosure describes the newly discovered protein
Tankyrase II. Polynucleotides, polypeptides, and antibodies related
to Tankyrase II are provided and exemplified. The protein has
enzymatic activity that causes ribosylation of proximal target
proteins using NAD as substrate. Tankyrase II is thought to have
binding activity for other telomere-associated proteins, which
could become ribosylated targets of the enzyme. This in turn could
play a role in the regulation of telomere length, thereby affecting
the replicative capacity of the cell. The techniques and materials
in this disclosure provide the means to model Tankyrase II activity
in vitro, and provide a way to monitor and modulate Tankyrase II
activity in vivo. Modulation of Tankyrase II activity may be used
to regulate telomerase activity or telomere length.
[0023] FIG. 1 shows the structurally distinct domains of Tankyrase
II, which provide different functional features of tankyrase
activity. There is a unique amino-terminal (GC) domain, followed by
an ankyrin (ANK) motif domain, a sterile alpha motif (SAM) domain,
and a carboxy-terminal poly(ADP-ribose) polymerase (PARP)
domain.
[0024] The ankyrin (ANK) domain of Tankyrase II contains 24 ankyrin
repeats--a motif of about 33 residues found in a number of
different proteins, and thought to act as modular adapters for
heterologous protein-protein interactions (reviewed by Bennett et
al., J. Biol. Chem. 267:8703, 1992; Michaely and Bennett, TICB
2:127, 1992; Bork et al., Proteins: Structure, Function, &
Genetics 17:363, 1993). A correlation has been observed between the
number of ankyrin repeats and the nature of the protein-protein
association. Ankyrin family members containing 24 ankyrin repeats
bind cytoskeletal proteins such as tubulin and spectrin.
[0025] The sterile alpha motif (SAM) domain of Tankyrase II lies
downstream from the ANK domain. SAM domains are found in signaling
proteins such as transcription factors, serine/threonine protein
kinases, and GTPases (Stapleton et al., 1999, Nature Struct. Biol.
6:44-9; Thanos et al., 1999, Science 283:833-36). SAM-containing
proteins form hetero- and homo-dimers with other SAM-containing
proteins that can regulate cellular signaling processes.
[0026] The carboxy-terminus of Tankyrase II is a domain homologous
to other proteins with poly(ADP-ribose) polymerase activity,
referred to as the PARP domain. Proteins that contain a PARP domain
catalyze the addition of long branched chains of ADP-ribose to
target proteins, using nicotinamide adenine dinucleotide
(NAD+) as a substrate (reviewed by Still et al., Genomics
62:533, 1999; de Murcia et al., Trends Biochem. Sci. 19:172, 1994;
Lindahl et al., Trends Biochem. Sci. 20:405, 1995). The first such
protein, PARP-1 (Adprt1) contains DNA-binding zinc fingers, a
nuclear localization sequence, and an automodification domain.
PARP-1 binds to nicked DNA, and is thought to play a role in
chromosomal damage repair (P. A. Jeggo, Current Biol. 8:R49, 1998).
PARP-2 (AdprtL2) is a homologous protein with ribosylation activity
that also binds damaged DNA (Amé et al., J. Biol. Chem. 274:17860,
1999). VPARP is a related protein that ribosylates major vault
protein in the mammalian ribonucleoprotein complex (Kickhoefer, J.
Cell Biol. 146:917, 1999). Other members of the PARP family include
Tankyrase I and AdprtL1 (Still et al., supra).
[0027] The presence of the ANK, SAM and PARP domains in Tankyrase
II suggests that Tankyrase II plays a role in intercellular or
intracellular communication (e.g., signal transduction), possibly
in conjunction with proteins involved in DNA repair pathways or
maintenance of telomeres. Ribosylation can also play an important
role in how Tankyrase II regulates other proteins involved in
telomere management. Ribosylation of telomere-associated proteins
may result in them leaving the telomere, potentially modulating the
activity of telomerase reverse transcriptase, and thereby affecting
telomere length and replicative capacity of the cell.
[0028] The SAM, PARP and ANK domains of Tankyrase II are homologous
to counterpart domains in the Tankyrase I protein. However, the
N-terminal domains appear to have no homology. Tankyrase I has a
180-residue HPS domain, so called for the abundance of histidine,
proline, and serine residues. In contrast, Tankyrase II has a
substantially different amino acid composition, encoded by a highly
GC-rich gene sequence. This N-terminal domain of Tankyrase II will
be referred to in this disclosure as the "Divergent" or "GC"
domain.
[0029] There is a relationship between the attainment of a critical
telomere length in dividing somatic cells and DNA damage, and both
processes lead to cell cycle arrest and the activation of gene
expression pathways. Thus, Tankyrase II may communicate with a
subset of the signaling molecules in DNA repair processes to
initiate the specific arrest and gene activation pathways of
cellular senescence. Notably, Tankyrase I has been demonstrated to
ribosylate both itself and TRF1 (Smith et al., 1998, supra),
resulting in a reduction of the ability of TRF1 to bind telomeric
DNA. The link between telomere structure and DNA repair is
supported by the observation that p53- and ATM (ataxia
telangiectasia mutated)-dependent apoptosis is induced by telomeres
with attenuated TRF2 function (Karlseder et al., 1999, Science
283:1321-1325).
[0030] The structural features of Tankyrase II indicate that it
binds to nuclear and cell proteins, and is involved in
intercellular or intracellular cell signaling that affects telomere
structure and metabolism. Compositions and treatments that modulate
Tankyrase II expression or function are likely to be of therapeutic
benefit for cancer, disorders associated with replicative
senescence, and other conditions associated with perturbations of
telomerase activity or telomere length.
[0031] Definitions
[0032] The term "polynucleotide" as used in this disclosure refers
to a polymeric form of nucleotides of any length. Included are
genes and gene fragments, mRNA, tRNA, rRNA, ribozymes, cDNA,
recombinant polynucleotides, branched polynucleotides, plasmids,
vectors, isolated DNA and RNA, nucleic acid probes, and primers.
Also included are nucleotide analogs, including but not limited to
thiol-derivatized nucleosides (U.S. Pat. No. 5,578,718),
oligonucleotides with modified backbones (U.S. Pat. Nos. 5,541,307
and 5,378,825), and peptide nucleic acids (U.S. Pat. No.
5,786,461). The term polynucleotide, as used in this disclosure,
refers interchangeably to double- and single-stranded molecules.
Unless otherwise specified or required, any embodiment of the
invention that is a polynucleotide encompasses both a
double-stranded form, and each of the two complementary
single-stranded forms known or predicted to make up the
double-stranded form.
[0033] The term "oligonucleotide" is reserved for polynucleotides
of no more than 100 bases in length, and may or may not be
accompanied with an antisense strand, depending on context.
Oligonucleotides are often used as probes in specific hybridization
reactions, or as primers in amplification reactions.
[0034] When comparison is made between polynucleotides for degree
of identity, it is implicitly understood that complementary strands
are easily generated, and the sense or antisense strand is selected
or predicted that maximizes the degree of identity between the
polynucleotides being compared. Percentage of sequence identity is
calculated by first aligning the polynucleotide being examined with
the reference counterpart, and then counting the number of residues
shared between the sequences being compared as a percentage of the
region under examination. No penalty is imposed for the presence of
insertions or deletions, but insertions or deletions are permitted
only where clearly required to readjust the alignment. The
percentage is given in terms of residues in the sequence being
examined that are identical to residues in the comparison or
reference sequence. Particularly desirable polynucleotide sequences
preserve at least one function of the prototype. By way of example
and depending on context, the function preserved may include an
ability to hybridize with a target sequence, the function of a
polypeptide it may encode, or (for certain gene targeting vectors)
the ability to facilitate homologous recombination or gene
inactivation. An example of an algorithm suitable for finding
homologous sequences and determining percent sequence identity is
the BLAST algorithm (Altschul et al., J. Mol. Biol. 215:403, 1990;
Karlin et al., Proc. Natl. Acad. Sci. USA 90:5873, 1993),
available through the National Center for Biotechnology Information
(www.ncbi.nlm.nih.gov/entrez).
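The identity calculation described in this paragraph can be sketched in Python for two sequences that have already been aligned. This is one reading of the definition, not a reference implementation: gapped columns are simply skipped so that insertions and deletions carry no penalty, and strand selection is shown only for ungapped sequences of equal length. All function names are hypothetical, and BLAST itself uses a different, score-based procedure.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def percent_identity(examined: str, reference: str) -> float:
    """Percent of compared positions that are identical in two pre-aligned sequences.

    Both strings must have the same length, with '-' marking an insertion or
    deletion. Columns containing a gap are skipped, so gaps carry no penalty.
    """
    if len(examined) != len(reference):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(examined.upper(), reference.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / compared


def best_strand_identity(examined: str, reference: str) -> float:
    """Identity against whichever strand of an ungapped reference scores higher."""
    return max(
        percent_identity(examined, reference),
        percent_identity(examined, reverse_complement(reference)),
    )


# Toy alignment (hypothetical): one gap column, two mismatches in eight compared columns.
print(percent_identity("ACGT-ACGT", "ACGTTACCA"))
# The examined strand matches the antisense strand of the reference exactly.
print(best_strand_identity("AACCGGTA", "TACCGGTT"))
```

The alignment step itself is assumed to have been done beforehand, for example by hand or with an alignment tool.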
[0035] Polynucleotide sequences are said to be in a "non-natural
arrangement" when they are joined together or interposed with
another sequence in an arrangement not found in nature.
[0036] "Hybridization" refers to a reaction in which one or more
polynucleotides react to form a complex that is stabilized via
hydrogen bonding between the bases of the nucleotide residues. The
hydrogen bonding can occur by Watson-Crick base pairing, Hoogsteen
binding, triplex formation, or complexing in any other
sequence-specific manner. A hybridization reaction will, on
occasion, be a step in a more extensive process, such as part of
PCR amplification. Hybridization reactions can be performed under
conditions of different "stringency". Conditions that increase the
stringency of a hybridization reaction are widely known (see e.g.,
Sambrook et al., infra). Examples of conditions in order of
increasing stringency: incubation temperatures of 25°C,
37°C, 50°C, and 68°C; concentrations of
10×SSC, 6×SSC, 1×SSC, 0.1×SSC (where SSC is
0.15 M NaCl and 15 mM citrate buffer, pH 7.2) and their equivalent
using other buffer systems; formamide concentrations of 0%, 25%,
50%, and 75%; incubation times from 5 min to 24 h; 1, 2, or more
washing steps; wash incubation times of 1, 5, or 15 min; and wash
solutions of 6×SSC, 1×SSC, 0.1×SSC, or deionized
water. Typical conditions of high stringency for the binding of a
probe of about 100 base pairs and above are a hybridization reaction
at 65°C in 2×SSC, followed by repeat washes at
0.1×SSC--or the equivalent combination of solvent and
temperature conditions for the particular nucleic acids being
studied.
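Because stringency rises as salt concentration falls, it can help to convert an SSC dilution into actual concentrations. The Python sketch below scales the 1×SSC definition given above (0.15 M NaCl, 15 mM citrate). The sodium estimate assumes the citrate is supplied as trisodium citrate, which is the usual formulation but is not stated in this disclosure; the function names are hypothetical.

```python
SSC_1X_NACL_M = 0.15      # NaCl in 1x SSC, from the definition in the text
SSC_1X_CITRATE_M = 0.015  # citrate in 1x SSC, from the definition in the text


def ssc_composition(fold: float) -> dict[str, float]:
    """Molar concentrations in an SSC solution of the given strength (e.g. 0.1 for 0.1x)."""
    if fold < 0:
        raise ValueError("SSC strength cannot be negative")
    nacl = fold * SSC_1X_NACL_M
    citrate = fold * SSC_1X_CITRATE_M
    # Assumption: trisodium citrate, contributing three sodium ions per citrate.
    return {"nacl_M": nacl, "citrate_M": citrate, "sodium_M": nacl + 3 * citrate}


def rank_by_stringency(folds: list[float]) -> list[float]:
    """Order SSC strengths from least to most stringent (highest salt first)."""
    return sorted(folds, reverse=True)


for fold in rank_by_stringency([1, 10, 0.1, 6]):
    print(fold, ssc_composition(fold))
```

Temperature and formamide also contribute to stringency; this sketch covers only the salt term.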
[0037] A "hybrid" of polynucleotides, or a "complex" formed between
any two or more components in a biochemical reaction (such as
antibody and antigen), refers to a duplex or higher-order complex
that is sufficiently long-lasting to persist between its formation
and subsequent detection.
[0038] A "control element" or "control sequence" is a nucleotide
sequence involved in an interaction of molecules that contributes
to the functional regulation of a polynucleotide, including
replication, duplication, transcription, splicing, translation, or
degradation of the polynucleotide. "Operatively linked" refers to
an operative relationship between genetic elements, in which the
function of one element influences the function of another element.
For example, an expressible encoding sequence may be operatively
linked to control elements that permit transcription and
translation.
[0039] The terms "polypeptide", "peptide" and "protein" are used
interchangeably in this disclosure to refer to polymers of amino
acids of any length. The polymer may comprise modified amino acids,
it may be linear or branched, and it may be interrupted by
non-amino acids. The terms also encompass an amino acid polymer
that has been modified naturally or by intervention; for example,
disulfide bond formation, glycosylation, lipidation, acetylation,
and/or phosphorylation.
[0040] Percentage of sequence identity is calculated for
polypeptides by first aligning the polypeptide being examined with
the reference counterpart or prototype, and then counting the
number of residues shared between the sequences being compared as a
percentage of the region under examination. No penalty is imposed
for the presence of insertions or deletions, but insertions or
deletions are permitted only where clearly required to readjust the
alignment. The percentage is given in terms of residues in the
sequence being examined that are identical to residues in the
comparison or reference sequence. Where substitutions are made,
conservative substitutions (in which one amino acid is substituted
by another with similar charge, size, hydrophobicity, or
aromaticity) are typically better tolerated. Desirable sequences
preserve the function of the prototype: for example, the enzymatic
activity, the binding of specific substrates, and the binding of
specific antibody as detectable in a standard competition
inhibition immunoassay. In certain embodiments, the identity may
exist over a region that is at least about 10, 20-25, or 50-100
amino acids in length.
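The distinction between identical residues, conservative substitutions, and other substitutions can be illustrated with a short Python sketch over two pre-aligned peptide sequences. The residue grouping used here is a hypothetical simplification chosen for illustration; this disclosure does not prescribe any particular grouping, and the function name and toy peptides are likewise assumptions.

```python
# Hypothetical grouping by broad side-chain character, for illustration only.
_GROUPS = ("AVLIM", "FWY", "KRH", "DE", "STNQ", "C", "G", "P")
_GROUP_OF = {aa: index for index, members in enumerate(_GROUPS) for aa in members}


def classify_substitutions(examined: str, reference: str) -> dict[str, int]:
    """Count identical, conservative, and non-conservative positions in an alignment."""
    if len(examined) != len(reference):
        raise ValueError("aligned sequences must have equal length")
    counts = {"identical": 0, "conservative": 0, "non_conservative": 0}
    for a, b in zip(examined.upper(), reference.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are not scored
        if a == b:
            counts["identical"] += 1
        elif a in _GROUP_OF and _GROUP_OF[a] == _GROUP_OF.get(b):
            counts["conservative"] += 1
        else:
            counts["non_conservative"] += 1
    return counts


# Toy peptides (hypothetical): K/R, D/E and F/Y fall within the same groups.
print(classify_substitutions("MKDLF", "MRELY"))
```

Percent identity as defined above counts only the identical positions; the conservative count is extra information about how well the remaining differences are likely to be tolerated.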
[0041] The term "antibody" as used in this disclosure refers to
both polyclonal and monoclonal antibodies of any species. The ambit
of the term deliberately encompasses not only intact immunoglobulin
molecules, but also such fragments and genetically engineered
derivatives of immunoglobulin molecules (including humanized forms)
that may be prepared by techniques known in the art and that retain
the binding specificity of the antigen binding site.
[0042] An "immunogenic" compound or composition is capable of
stimulating production of a specific immunological response when
administered to a suitable host, usually a mammal.
[0043] An "isolated" polynucleotide, polypeptide, protein,
antibody, or other substance refers to a preparation of the
substance devoid of at least some of the other components that may
also be present where the substance or a similar substance
naturally occurs or is initially obtained from. Thus, for example,
an isolated substance may be prepared by using a purification
technique to enrich it from a source mixture. Enrichment can be
measured on an absolute basis, such as weight per volume of
solution, or it can be measured in relation to a second,
potentially interfering substance present in the source mixture.
Enrichments by 2, 10, 100, and 1000 fold achieve improved degrees
of purification. A substance can also be provided in an isolated
state by a process of artificial assembly, such as by chemical
synthesis or recombinant expression. An "isolated" cell is a cell
that has been separated from the organism in which it was
grown.
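Fold enrichment on a weight per weight basis is the ratio of the target's mass fraction after purification to its mass fraction in the source mixture. The Python sketch below shows that arithmetic; the function name and the sample masses are hypothetical and do not represent measured data.

```python
def fold_enrichment(
    target_before: float,
    total_before: float,
    target_after: float,
    total_after: float,
) -> float:
    """Ratio of the target's weight fraction after purification to that before.

    All four masses must be in the same unit and greater than zero.
    """
    if min(target_before, total_before, target_after, total_after) <= 0:
        raise ValueError("all masses must be positive")
    # (target_after / total_after) / (target_before / total_before), rearranged
    # into a single division.
    return (target_after * total_before) / (total_after * target_before)


# Hypothetical example: 2 mg of target in 1000 mg of source protein,
# then 1 mg of target recovered in a 5 mg purified fraction.
fold = fold_enrichment(2.0, 1000.0, 1.0, 5.0)
print(fold, fold >= 10)
```

In this example the preparation is enriched 100-fold even though half of the target was lost, which shows that enrichment and yield are separate measures.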
[0044] A polynucleotide used in a reaction, such as a probe used in
a hybridization reaction or a vector used in gene targeting is
referred to as "specific" or "selective" if it hybridizes or reacts
with the intended target more frequently, more rapidly, or with
greater duration than it does with alternative substances.
Similarly, a polypeptide is referred to as "specific" or
"selective" if it binds an intended target, such as a ligand,
hapten, substrate, antibody, or other polypeptide more frequently,
more rapidly, or with greater duration than it does to alternative
substances. An antibody is referred to as "specific" or "selective"
if it binds via at least one antigen recognition site to the
intended target more frequently, more rapidly, or with greater
duration than it does to alternative substances.
[0045] General Techniques
[0046] Unless otherwise noted, the practice of this invention can
be carried out by employing standard techniques of genetic
engineering, protein manipulation, and cell culture. Textbooks that
describe standard laboratory techniques include "Molecular Cloning:
A Laboratory Manual", 2nd Ed. (Sambrook et al., 1989);
"Oligonucleotide Synthesis" (M. J. Gait, ed., 1984); "Animal Cell
Culture" (R. I. Freshney, ed., 1987); the series "Methods in
Enzymology" (Academic Press, Inc.); "Gene Transfer Vectors for
Mammalian Cells" (J. M. Miller & M. P. Calos, eds., 1987);
"Current Protocols in Molecular Biology" and "Short Protocols in
Molecular Biology, 3rd Edition" (F. M. Ausubel et al., eds., 1987
& 1995); and "Recombinant DNA Methodology II" (R. Wu ed.,
Academic Press 1995). Techniques used in raising, purifying and
modifying antibodies, and the design and execution of immunoassays,
are described in Handbook of Experimental Immunology (D. M. Weir
& C. C. Blackwell, eds.); The Immunoassay Handbook (Stockton
Press NY, 1994); and R. Masseyeff, W. H. Albert, and N. A. Staines,
eds., Methods of Immunological Analysis (Weinheim: VCH Verlags
GmbH, 1993).
[0047] Polynucleotides
[0048] The polynucleotides of this invention include those
containing nucleotide sequences which are found within the
Tankyrase II DNA sequence, shown in SEQ. ID NOs:1, 3, and 5.
Further sequence for Tankyrase II gene can be obtained by employing
standard sequencing techniques known in the art to the phage
plasmids deposited in support of this application.
[0049] Also included in this invention are polynucleotides that are
from naturally occurring allelic variants, synthetic variants, and
homologs of Tankyrase II with a percentage of residues identical to
the Tankyrase II cDNA or gene sequence, determined as described
above. It is understood that substitutions, insertions, and
deletions can be accommodated within a polynucleotide sequence
without departing from the spirit of this invention. In certain
embodiments, the polynucleotide sequences are at least about 80%,
90%, 95%, or 98% identical to a sequence or part of a sequence
exemplified in this disclosure, in order of increasing preference.
In other embodiments, the polynucleotide sequences are 100%
identical to a reference sequence or a fragment thereof. The length
of consecutive residues in the identical or homologous sequence
compared with the exemplary sequence can be at least about 15, 30,
50, 75, 100, 200 or 500 residues in order of increasing preference,
up to the length of the entire clone, gene, or sequence.
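As a non-limiting illustration of the identity thresholds and window lengths recited above, the following Python sketch computes the percentage of identical residues over ungapped windows of a chosen length. It is a simplification made for illustration: the identity calculation relied on in this disclosure is the one described elsewhere in it, and the function names here are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent identical residues in two equal-length, pre-aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def best_window_identity(query, reference, window):
    """Highest ungapped percent identity between any `window`-residue stretch
    of `query` and any equally long stretch of `reference`."""
    best = 0.0
    for i in range(len(query) - window + 1):
        for j in range(len(reference) - window + 1):
            score = percent_identity(query[i:i + window],
                                     reference[j:j + window])
            best = max(best, score)
    return best


# Toy sequences (hypothetical), compared over a window of 4 residues.
print(best_window_identity("TTACGTTT", "GGACGTGG", 4))
```

The exhaustive window scan is quadratic in sequence length and is suitable only for short sequences; it does not accommodate the insertions and deletions that a gapped alignment would.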
[0050] This invention includes polynucleotides that are uniquely
related to the prototype polynucleotide sequences, in comparison
with other sequences that may be present in a sample or reaction
mixture of interest. By way of example, probes of at least about
100 consecutive nucleotides that are at least 90% or 80% identical
to a reference sequence may be specific, and a probe of at least
about 500 or 2000 consecutive nucleotides may be specific if at
least 90%, 80%, 70%, or even 60% identical with the reference
sequence, depending on hybridization conditions, as explained
below. On occasion, such nucleotides can be divided into halves
(about nucleotides 1-2137 and 2138-4275), or quarters (about
1-1068; 1069-2137; 2138-3206; 3207-4275), and still retain their
specificity. Nucleic acid molecules comprising specified lengths of
consecutive nucleotides can be selected from any of these regions.
It will also be recognized that for some purposes such as
hybridization reactions, a specific polynucleotide sequence will
readily accommodate deletions from the 5' or 3' end of either
strand of say, 15, 25, or even 50 nucleotides without compromising
function. Internal deletions may also be tolerated.
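The approximate halves and quarters recited above can be reproduced with a short Python sketch that divides a sequence of a given length into nearly equal segments. The function name is hypothetical, and the length of 4275 nucleotides is taken from the ranges recited in the paragraph above.

```python
def split_into_segments(length, parts):
    """Return 1-based inclusive (start, end) ranges that divide `length`
    residues into `parts` nearly equal segments."""
    bounds = [(i * length) // parts for i in range(parts + 1)]
    return [(bounds[i] + 1, bounds[i + 1]) for i in range(parts)]


print(split_into_segments(4275, 2))
print(split_into_segments(4275, 4))
```

Boundaries are rounded down, so where the length is not evenly divisible the later segments may be one residue longer than the earlier ones.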
[0051] Of particular interest are polynucleotides that are distinct
from polynucleotides encoding Tankyrase I, and other proteins
containing ANK, PARP, or SAM domains. In certain embodiments of the
invention, polynucleotides are distinct from one or more previously
known EST sequences, such as those in GenBank Accession Nos.
R64714, AA244138 (SEQ. ID NO:9), A244137, AA307492, H11865, H17748,
N57467, R06946, AI247608, R06902, AI247608, H11505, H17635, N29528,
AA088990, AI426537, and AW157349 (SEQ. ID NO:10), and those listed
elsewhere in this disclosure. A polynucleotide of this invention
can be "distinct" from other polynucleotides because of an internal
sequence difference (a substitution, deletion, or insertion), or
because it is defined to encompass additional sequence at either
end. Also included in the invention are recombinant or synthetic
polynucleotides in which a Tankyrase II-like sequence is linked to
a heterologous sequence: for example, a heterologous
promoter in an expression vector, or a selectable marker such as
neo in a targeting vector.
[0052] The polynucleotides of this invention can be in the form of
an expression vector, in which the encoding sequence is operatively
linked to control elements for transcription and translation in a
prokaryotic or eukaryotic host cell of interest. A variety of
suitable vectors and their design and manufacture are known in the
art. Vector systems of interest include but are not limited to
those based on retroviruses, adenoviruses, adeno-associated viruses,
herpes viruses, SV40, papilloma virus, Epstein Barr virus, vaccinia
virus, lentivirus, and Semliki Forest virus.
[0053] Particular polynucleotides of this invention are useful for
producing polypeptides of interest, as nucleotide probes and
primers, and as targeting vectors for genetic knockouts. Further
description of the characteristics of such constructs is provided
elsewhere in this disclosure.
[0054] Preparation
[0055] The polynucleotides of this invention can be prepared by any
suitable technique in the art. Using the data provided in this
disclosure or deduced from the deposited plasmids, sequences of
less than about 50 base pairs are conveniently prepared by
chemical synthesis, either through a commercial service or by a
known synthetic method, such as the triester method or the
phosphite method. A suitable method is solid phase synthesis using
mononucleoside phosphoramidite coupling units (Hirose et al.,
Tetra. Lett. 19:2449-2452, 1978; U.S. Pat. No. 4,415,732).
[0056] For use in antisense therapy, polynucleotides can be
prepared synthetically that are more stable for the pharmaceutical
preparation for which they are intended. Non-limiting examples
include thiol-derivatized nucleosides (U.S. Pat. No. 5,578,718) and
oligonucleotides with modified backbones (U.S. Pat. Nos. 5,541,307
and 5,378,825). Also of interest in the context of antisense
constructs are peptide nucleic acids. Prototype PNA have an achiral
polyamide backbone consisting of N-(2-aminoethyl)glycine units, to
which purine and pyrimidine bases are linked, for example, by way
of a methylene carbonyl linker. PNAs are nuclease and protease
resistant, and the uncharged nature of the PNA oligomers enhances
the stability of PNA-nucleotide duplexes, thereby blocking
transcription or translation. Uptake into cells can be enhanced by
conjugating to lipophilic groups, incorporating into liposomes, and
introducing an amino acid side chain into the PNA backbone. See
Soomets et al., Front. Biosci. 4:D782, 1999; U.S. Pat. Nos.
5,539,082, 5,766,855, 5,786,461, and International Patent
Application WO 8/53801.
[0057] Polynucleotides of this invention can also be obtained by
PCR amplification of a template with the desired sequence.
Oligonucleotide primers spanning the desired sequence are annealed
to the template, elongated by a DNA polymerase, and then melted at
higher temperature so that the template and elongated
oligonucleotides dissociate. The cycle is repeated until the
desired amount of amplified polynucleotide is obtained (U.S. Pat.
Nos. 4,683,195 and 4,683,202). Exemplary primers are shown in Table
1. Suitable templates include the plasmids deposited in support of
this application, and cDNA libraries for cells expressing Tankyrase
II. Encoding sequences, intron sequences, and upstream or
downstream sequences for Tankyrase II can be obtained from a human
genomic DNA library.
TABLE 1 Exemplary primers for amplifying Tankyrase II sequences

Forward & Reverse Primers                    SEQ. ID NO   Function
UTANKII-32: 5'-TCCAGAGGCTGGTGACCCCTGA-3'     11           Amplifies entire
LTANKII-37: 5'-TTGAACTAACTACTGAAGA-3'        12           ANK domain
UTANKII-38: 5'-CTGTCTTCAGTAGTTAGTTCA-3'      13           Amplifies entire
LTANKII-39: 5'-GTTACAAACCTTCTGAATCT-3'       14           SAM domain
UTANKII-40: 5'-GAAAGATACACTCACCGGA-3'        15           Amplifies entire
LTANKII-41: 5'-TAGGGTTCAGTGGGAATTAG-3'       16           PARP domain
gt11-5':    5'-GACTCCTGGAGCCCGTCA-3'         17           Amplifies λ 11L-1-1
gt11-3':    5'-GGTAGCGACCGGGCGTCA-3'         18           cDNA insert
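For illustration, simple properties of primers such as those listed in Table 1 can be computed with the following Python sketch. It is not part of the disclosed method. The melting temperature uses the common rule of thumb of 2 degrees C per A or T and 4 degrees C per G or C, which is only a rough guide for short oligonucleotides, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of residues that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def rule_of_thumb_tm(seq):
    """Approximate melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Primer pair listed in Table 1 for the ANK domain.
PRIMERS = {
    "UTANKII-32": "TCCAGAGGCTGGTGACCCCTGA",
    "LTANKII-37": "TTGAACTAACTACTGAAGA",
}

for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2),
          rule_of_thumb_tm(seq), reverse_complement(seq))
```

In practice, annealing temperatures are determined empirically for each primer pair and template.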
[0058] Production scale amounts of large polynucleotides are most
conveniently obtained by inserting the desired sequence into a
suitable cloning vector and reproducing the clone. Techniques for
nucleotide cloning are given in Sambrook, Fritsch & Maniatis
(supra) and in U.S. Pat. No. 5,552,524. Exemplary cloning and
expression methods are illustrated in Examples 1 and 2, below.
Polynucleotides can be purified by standard techniques in nucleic
acid chemistry, such as phenol-chloroform extraction, agarose gel
electrophoresis, and other techniques known in the art, adapted
according to the source of the polynucleotide.
[0059] Assessment and use of the Polynucleotides
[0060] Polynucleotides of this invention can be used to identify
Tankyrase II nucleotide sequences in a sample of interest for
research, diagnostic evaluation, or any other purpose. Generally,
this will involve preparing a reaction mixture in which a sample
suspected of containing a Tankyrase II-related sequence is
contacted with a polynucleotide of this invention under conditions
that permit the polynucleotide to hybridize specifically with the
compound being tested for, detecting any stable hybrids that form,
and correlating the hybrids with the presence of a Tankyrase II
related sequence in the sample. The formation of stable hybrids can
be detected by any suitable method known in the art. For example,
the probe sequence can be provided with a detectable label such as a
radioisotope, a chromophore, or a hapten such as avidin to which a
signaling reagent can be attached. Alternatively, the reagent polynucleotide
can be a primer for an amplification reaction in which the amount
of product produced correlates with the formation of specific
hybrids.
[0061] The specificity of the probe or primer, and the stringency
of hybridization conditions are both chosen with a view to
facilitating detection of sequences of interest, while diminishing
false positive reactions. Thus, when it is important to distinguish
Tankyrase II sequences from Tankyrase I sequences,
particularly when using sequence outside the GC domain, then
stringency conditions should be high, and the reagent
polynucleotide should be nearly identical to the sequence being
tested for. Conditions can be determined empirically so that the
reagent polynucleotide will hybridize with the Tankyrase II
sequence being tested for but not with other sequences that might
be present in the sample of interest. In other instances, assays
for Tankyrase II are conducted on samples where Tankyrase I is not
present, or where it is desirable to test for Tankyrase I and
Tankyrase II together. In these instances, the capability of the
probe to cross-hybridize with Tankyrase I is not a hindrance, and
may provide certain advantages.
[0062] Polynucleotides of this invention can also be used to
inhibit the transcription or translation of Tankyrase II in target
cells. Such polynucleotides can be in the form of antisense
constructs, which in some embodiments bind to Tankyrase II mRNA
and prevent translation. Other polynucleotides of this invention
are ribozymes having a substrate (Tankyrase II mRNA) binding
portion, and an enzymatic portion with endonuclease activity that
cleaves the substrate. Design and use of ribozymes is described
generally in U.S. Pat. Nos. 4,987,071, 5,766,942, 5,998,193, and
6,025,167. The modulation of Tankyrase II expression using ribozyme
constructs is embodied in this invention.
[0063] This invention also includes interfering RNA (RNAi)
complexes. The structure and activity of RNAi is reviewed by Bosher
et al. (Nature Cell Biol. 2:E31, 2000) and C. P. Hunter (Curr.
Biol. 9:R440, 1999). The RNAi complexes of this invention comprise
double-stranded RNA comprising Tankyrase II sense and antisense
polynucleotides (optionally in a hairpin configuration) that
specifically inhibits translation of mRNA encoding Tankyrase II and
Tankyrase II-like proteins. Also contemplated are polynucleotides
that bind to duplex Tankyrase II sequences to form a triple
helix-containing nucleic acid, blocking expression at the
transcription level (Gee et al., in Huber and Carr, 1994, Molecular
and Immunologic Approaches, Futura Publishing Co.; Rininsland et
al., Proc. Natl. Acad. Sci. USA 94:5854, 1997).
[0064] This invention also encompasses polynucleotides that encode
polypeptides of interest. Characteristics of the polypeptides of
this invention are described in the section that follows. For
polypeptides that are fragments of naturally occurring Tankyrase
II, there will be a corresponding naturally occurring
polynucleotide encoding sequence. Those skilled in the art will
recognize that because of redundancy in the genetic code, any
polynucleotide that encodes a peptide of interest can be used in a
translation system to produce the peptide. Except where otherwise
required, all possible codon combinations that translate into the
peptide sequence of interest are included in the scope of the
invention.
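To illustrate the scale of the codon combinations referred to above, the following Python sketch counts the number of distinct coding sequences that translate into a given peptide under the standard genetic code. The function name is hypothetical, and the example peptide is the first ten residues shown in Table 2.

```python
from math import prod

# Number of codons assigned to each amino acid in the standard genetic code
# (one-letter amino acid codes; the 61 sense codons in total).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide):
    """Number of distinct nucleotide sequences encoding `peptide`."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())


print(count_encodings("ERYTHRRKEV"))
```

The count grows multiplicatively with peptide length, which is why the encoding polynucleotides are defined here by the peptide they encode and are not listed individually.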
[0065] Polypeptides
[0066] The polypeptides of this invention include those that
comprise amino acid sequences encoded within any of the
polynucleotides of this invention, exemplified by SEQ. ID NO:6 and
its subfragments. Also included in this invention are polypeptides
containing Tankyrase II-like sequence that is from naturally
occurring allelic variants, synthetic variants, and homologs of
Tankyrase II with a percentage of residues identical to the
Tankyrase II protein, calculated as described elsewhere in this
disclosure.
[0067] It is understood that substitutions, insertions, and
deletions can be accommodated within a protein sequence without
departing from the spirit of this invention. Conservative
substitutions are typically more tolerable, such as the
substitution of charged amino acids with amino acids having the
same charge, or substituting aromatic or lipophilic amino acids
with others having similar features. Certain peptides of this
invention are 60%, 80%, 90%, 95%, or 100% identical to one of the
sequences exemplified in this disclosure; in order of increasing
preference. The length of the identical or homologous sequence
compared with the prototype polypeptide can be about 7, 10, 15, 25,
50 or 100 residues in order of increasing preference, up to the
length of the entire protein.
[0068] This invention includes polypeptides that are uniquely
related to the prototype sequences, in comparison with other
sequences that may be present in a sample or reaction mixture of
interest. By way of example, peptides of at least about 10
consecutive amino acids that are at least 90% or 80% identical to a
reference sequence, or wherein the rest of the peptide contains
only conservative substitutions, may uniquely identify the peptide
in terms of functional or antigenic characteristics. A peptide of
at least about 25, 100, or 300 consecutive amino acids may be
specific if at least 90%, 70%, 60%, or even 50% identical with the
reference sequence. Longer peptides can be divided into halves
(amino acids 1-584 and 585-1166 of SEQ. ID NO:6), or quarters
(amino acids 1-292; 293-584; 585-876; 877-1166) and still retain
one or more of their functional activities--such as ribosylation of
target proteins, and the binding to conjugate peptides through the
Tankyrase II ANK and SAM domains. Peptides from the region encoded
by nucleotides 1-283 of SEQ. ID NO:5 are also of interest. It will
be recognized that for some purposes such as reactions with
antibody or the contact region on an opposing protein, a specific
polypeptide sequence will readily accommodate deletions from the N-
or C-end, say, of 3, 5, or even 10 amino acid residues.
[0069] Certain peptides of this invention are distinct from
peptides previously known: These include human Tankyrase I
proteins, and other proteins comprising PARP, SAM, and ANK domains.
In certain embodiments, polypeptides of the invention are distinct
from predicted amino acid sequences encoded in one or more
previously known polynucleotide sequences, such as those cited at
other places in this disclosure. A polypeptide of this invention
can be "distinct" from other polypeptides because of an internal
sequence difference (a substitution, deletion, or insertion), or
because it is defined to encompass additional sequence at either
end. Also included in the invention are artificially engineered
fusion proteins in which a Tankyrase II-like sequence is linked to
a heterologous sequence which modulates Tankyrase II activity,
provides a complementary function, acts as a tag for purposes of
labeling or affinity purification, or has any other desirable
purpose.
[0070] Preparation
[0071] Short polypeptides of this invention can be prepared by
solid-phase chemical synthesis. The principles of solid phase
chemical synthesis can be found in Dugas & Penney, Bioorganic
Chemistry, Springer-Verlag NY pp 54-92 (1981), and U.S. Pat. No.
4,493,795. Automated solid-phase peptide synthesis can be performed
using devices such as a PE-Applied Biosystems 430A peptide
synthesizer (commercially available from Applied Biosystems, Foster
City Calif.).
[0072] Longer polypeptides are conveniently obtained by translation
in an in vitro translation system, or by expression in a suitable
host cell. To produce an expression vector, a polynucleotide
encoding the desired polypeptide is operably linked to control
elements for transcription and translation, and then transfected
into a suitable host cell. Expression may be effected in
prokaryotes such as E. coli, eukaryotic microorganisms such as the
yeast Saccharomyces cerevisiae, or higher eukaryotes, such as
insect or mammalian cells. Control elements such as the promoter
are chosen to permit translation at an acceptable rate under
desired conditions. A number of expression systems suitable for
producing the peptides of this invention are described in U.S. Pat.
No. 5,552,524. Expression cloning is available from such commercial
services as Lark Technologies, Houston Tex.
[0073] Following production, the protein is typically purified from
the producing host cell by standard methods in protein chemistry in
an appropriate combination, which may include ion exchange
chromatography, affinity chromatography, and HPLC. Expression
products are optionally produced with a sequence tag to facilitate
affinity purification, which can subsequently be removed by
proteolytic cleavage.
[0074] Also contemplated is Tankyrase II protein isolated from
human biological samples, including tissue samples and cultured
cell lines, tracking activity on the basis of functional assays
and/or immunoassays provided below. Antibody to Tankyrase II
described in the following section can be used in immunoaffinity or
immunoprecipitation techniques to enrich Tankyrase II from
biological samples. If desired, fragments can be made from whole
Tankyrase II by chemical cleavage (e.g., using CNBr), or enzymatic
cleavage (using trypsin, pepsin, dispase, V8 protease, or any other
suitable endopeptidase or exopeptidase). Enrichment of peptides and
proteins of this invention from natural or synthetic sources
provides a purity of 10-fold, 100-fold, 1000-fold, or 10,000-fold
higher than what is found in nature, in terms of a weight to weight
ratio of Tankyrase II peptide to other proteins in the sample
mass.
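The fragments expected from chemical or enzymatic cleavage can be predicted in advance. The Python sketch below applies two commonly used rules of thumb: CNBr cleaves on the C-terminal side of methionine, and trypsin cleaves on the C-terminal side of lysine or arginine unless the next residue is proline. The function names are hypothetical, the rules are approximations, and the example sequence is the first thirty residues shown in Table 2.

```python
import re


def cnbr_fragments(protein):
    """Predicted CNBr fragments: cleavage after each methionine (M)."""
    return [f for f in re.split(r"(?<=M)", protein.upper()) if f]


def trypsin_fragments(protein):
    """Predicted tryptic fragments: cleavage after K or R, except before P."""
    return [f for f in re.split(r"(?<=[KR])(?!P)", protein.upper()) if f]


EXAMPLE = "ERYTHRRKEVSEENHNHANERMLFHGSPFV"
print(cnbr_fragments(EXAMPLE))
print(trypsin_fragments(EXAMPLE))
```

Actual digests may contain missed cleavages or additional products, so the predicted fragments are a starting point for analysis and not a substitute for it.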
[0075] Assessment and use of the Polypeptides
[0076] Polypeptides of this invention can be used for a number of
purposes, including but not limited to the characterization of
telomerase function and how it is regulated, assays for proteins
and nucleotide sequences to which Tankyrase II binds, the
identification of new proteins with Tankyrase II binding activity
that may play a role in maintaining telomere length, replicative
capacity, apoptosis, chromosome packing, or gene expression, and
the obtaining of antibody specific for Tankyrase II.
[0077] Subregions of Tankyrase II and homologs can be assessed for
function based on the known domain structure of Tankyrase II, and
employed according to the role they play in the activity of the
whole molecule.
[0078] A putative PARP domain in a Tankyrase II homolog can be
identified on the basis of sequence similarity, since a high degree
of conservation with Tankyrase II PARP and other proteins with PARP
domains (e.g., Tankyrase I, PARP-1 and PARP-2), especially over
critical conserved residues, correlates with ribosylation activity.
Residues of Tankyrase II thought to play an important role in the
enzymatic activity are shown in Table 2. Functional assays for
poly(ADP-ribose) polymerase can be conducted by incubating the
putative PARP-containing peptide with a target protein (such as the
Tankyrase II ANK and SAM domains, or TRF1), in the presence of
nicotinamide adenine dinucleotide (NAD+), or an analog labeled
with a radioisotope such as 32P or 33P, biotin, or a
fluorescent group. ADP-ribosylation can be monitored by
incorporation of the label into the protein phase, by a change of
size of the target protein (measurable, for example on a protein
gel), or by detection of ADP ribose polymers on the target (for
example, using commercially available antibody specific for
ADP-ribose polymers, by digestion with glycohydrolases, or by
physical-chemical mechanisms, such as mass spectrometry). Known
PARP inhibitors like 3-aminobenzamide (3AB) can be used to verify
the specificity of the assay. A number of other assays for rapid
detection of poly(ADP-ribose) polymerase activity have been
described. See Sallmann et al., Mol. Cell Biochem. 185:199, 1998;
Simonin et al., Anal. Biochem. 195:226, 1991; Shah et al., Anal.
Biochem. 232:251, 1995. Peptides with confirmed PARP activity can
then be used as a reagent to ADP-ribosylate protein targets of
interest.
[0079] A putative SAM domain can also be identified on the basis of
sequence similarity with the sterile alpha motif domain in other
proteins (Tankyrase II, Tankyrase I, EphB2 receptor, and others
reviewed by Stapleton et al., Nature Struct. Biol. 6:44, 1999). The
SAM domain has been implicated in forming homodimers and
heterodimers with other SAM-containing proteins (Stapleton et al.,
op. cit.; and Kyba et al., Dev. Genet. 22:74, 1998; Thanos et al.,
Science 283:833, 1999). Thus, putative SAM domains can be screened
functionally in dimerization reactions: either with themselves, or
with SAM domains from other proteins with known heterodimerization
activity. Dimerization can be detected in an equilibrium system
(e.g., using a biosensor), or in a separation system (e.g., by gel
filtration chromatography or in a gel shift experiment).
Dimerization can also be detected in a reporter gene assembly--for
example, where a conjugate binding site on another protein is fused
to a DNA-binding peptide, and the putative SAM domain is fused to a
trans-activator. These constructs are then transfected into a cell
comprising a reporter gene (such as Lac Z), which signals proximity
of the trans-activator, indicating binding between the two
peptides. Peptides with confirmed SAM activity according to any of
these assays can then be used in turn as reagents in a dimerization
assay to detect or quantitate Tankyrase II or other proteins with a
SAM domain. They can also be used to inhibit Tankyrase II activity
by competition at the SAM binding site.
[0080] Putative ANK and GC domains can also be identified on the
basis of sequence similarity to corresponding domains in Tankyrase
II. In addition, ANK domains characteristically have a number of
tandem repeats about 33 amino acids long. Dozens of proteins
containing anywhere from one to dozens of ANK repeats are known.
Michaely et al. (Trends Cell Biol. 2:127, 1992) report the
consensus sequence as
[0081] -XGXTPLHLAARXGHVEVVKLLLDXGADVNAXTK- (SEQ. ID NO:19), with
alternative residues at some positions: A I SQ NNLDIAEV K NPD D V
K T M R Q SI N E
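As a sketch of how a candidate sequence might be compared with the 33-residue consensus given above, the following Python code scores each window of a protein against the consensus line, treating X as matching any residue. It ignores the alternative residues listed with the consensus. The function names and the threshold of 0.5 are hypothetical choices made for illustration.

```python
# Consensus line of SEQ. ID NO:19; X denotes an unspecified residue.
ANK_CONSENSUS = "XGXTPLHLAARXGHVEVVKLLLDXGADVNAXTK"


def consensus_score(window, consensus=ANK_CONSENSUS):
    """Fraction of defined (non-X) consensus positions matched by `window`."""
    if len(window) != len(consensus):
        raise ValueError("window and consensus must have the same length")
    defined = [(c, w) for c, w in zip(consensus, window.upper()) if c != "X"]
    return sum(c == w for c, w in defined) / len(defined)


def scan_for_repeats(protein, threshold=0.5):
    """1-based start positions and scores of windows at or above threshold."""
    n = len(ANK_CONSENSUS)
    hits = []
    for start in range(len(protein) - n + 1):
        score = consensus_score(protein[start:start + n])
        if score >= threshold:
            hits.append((start + 1, round(score, 2)))
    return hits


# Toy sequence (hypothetical): the consensus with X replaced by serine,
# flanked by two methionines on each side.
print(scan_for_repeats("MM" + ANK_CONSENSUS.replace("X", "S") + "MM"))
```

A profile-based or alignment-based search would tolerate the insertions and substitutions commonly seen in natural repeats better than this position-by-position comparison.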
[0082] ANK repeats generally are implicated in protein-protein
binding, and the ANK domain in Tankyrase I is responsible for the
binding of TRF1. Tankyrase II is believed to have binding activity
for several proteins involved in telomere regulation and other
aspects of chromosome management. Such proteins include TRF1, TRF2,
TIN2, and Tankyrase I, which are all proteins known to interact in
the management of telomeres. Peptide fragments and homologs of
Tankyrase II can be tested for binding to such proteins, and those
showing activity can in turn be used to assay TRF1, TRF2, TIN2, or
Tankyrase I. The general format of such an assay comprises
incubating a sample suspected of containing the protein with
Tankyrase II binding activity with a peptide of this invention
under conditions where the protein can bind the peptide to form a
complex, and correlating any complex formed with the presence or
amount of Tankyrase II binding activity in the sample. In a similar
fashion, fragments and homologs of Tankyrase II can be tested for
binding to polynucleotides having particular sequences, such as the
tandem repeats that are characteristic of telomeres. Since
Tankyrase II binds and ribosylates telomere-associated proteins,
fragments and homologs of Tankyrase II can also be tested for
modulation of telomere length. Cells are transfected with an
expression vector for the fragment or homolog, and the effect on
telomere length is measured by a suitable method, such as the
assays described in U.S. Pat. Nos. 5,707,795, 5,741,677, and
5,834,193.
[0083] A systematic approach can be used to determine functional
regions and homologs of Tankyrase II according to any of these
assays. For example, the viability of an assay system is confirmed
on the intact Tankyrase II protein; then a series of nested
fragments is tested to determine the minimum fragment that provides
the same activity. Similarly, amino acid substitutions can be
introduced into the sequence until the activity is ablated, thereby
determining what residues are critical for functional activity.
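The nested-fragment approach can be planned with a short Python sketch such as the following, which lists N-terminal and C-terminal truncations of a protein of a given length and selects the shortest fragment reported as active. The function names are hypothetical, and the `is_active` callback stands in for the result of a laboratory assay; the active region used in the example is invented.

```python
def nested_truncations(length, step, min_length):
    """1-based (start, end) ranges for nested N- and C-terminal truncations."""
    n_terminal = [(start, length)
                  for start in range(1, length - min_length + 2, step)]
    c_terminal = [(1, end)
                  for end in range(length, min_length - 1, -step)]
    return n_terminal, c_terminal


def minimal_active(fragments, is_active):
    """Shortest fragment for which `is_active` returns True, or None."""
    active = [f for f in fragments if is_active(f)]
    if not active:
        return None
    return min(active, key=lambda f: f[1] - f[0] + 1)


# Hypothetical 100-residue protein whose activity requires residues 30-70.
n_term, c_term = nested_truncations(100, 25, 50)
print(minimal_active(n_term + c_term, lambda f: f[0] <= 30 and f[1] >= 70))
```

Combining the best N-terminal and C-terminal boundaries in a second round of constructs would narrow the minimal fragment further.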
[0084] Peptides of this invention can also be used for the
preparation and testing of antibodies against Tankyrase II, for the
testing of other compounds for Tankyrase II binding activity, and
for the screening of potential Tankyrase II modulators. These
procedures are detailed further on in this disclosure.
[0085] Dominant Negative Mutants
[0086] Based on the sequence data provided in this disclosure,
someone skilled in the art will be able to develop dominant
negative polypeptide mutants of Tankyrase II, and polynucleotides
that encode them. These mutants may be used to inhibit the function
of Tankyrase II in a cell or reaction mixture. The production of
dominant negative mutants entails deleting or mutating an important
functional element of the native Tankyrase II. For example,
functional mutation or deletion of the ANK domain may produce
peptides that do not bind to TRF1 or TRF2, but retain SAM binding
activity. Conversely, a functional mutation or deletion of the SAM
domain may produce peptides that are deficient in binding to proteins
such as TRF1 or TRF2, but still have the ribosylation activity of
PARP. A functional mutation or deletion of the PARP domain (for
example, mutation of all or a subset of the residues from the
Tankyrase II C-terminus to alanine), may result in a peptide that
binds Tankyrase II associated proteins, but does not have any
ribosylation activity.
[0087] Muteins with point mutations can also be obtained.
Specifically, amino acids thought to be critical for the activity
of the domain could be changed to a neutral amino acid such as
alanine, and then reassayed for functional activity. For Tankyrase
II, mutations that may abolish ribosylation activity are changes to
His (position 1031), Gly (1032), Gly (1058), Tyr (1060), Tyr
(1071), and Glu (1138).
TABLE 2 Critical Residues for PARP Activity in Tankyrase II (SEQ. ID NO:20)

ERYTHRRKEV SEENHNHANE RMLFHGSPFV NAIIHKGFDE RHAYIGGMFG
AGIYFAENSS KSNQYVYGIG GGTGCPVHKD RSCYICHRQL LFCRVTLGKS
FLQFSAMKMA HSPPGHHSVT GRPSVNGLAL AEYVIYRGEQ AYPEYLITYQ
IMRPEGMVDG
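Mutein sequences with alanine substitutions can be generated in silico before the corresponding polynucleotides are prepared. The following Python sketch is illustrative only: the function names are hypothetical, and the short example sequence and its positions are numbered locally and do not correspond to the positions in the full-length Tankyrase II protein.

```python
def substitute(sequence, positions, new_residue="A"):
    """Copy of `sequence` with each 1-based position replaced by `new_residue`."""
    residues = list(sequence)
    for pos in positions:
        if not 1 <= pos <= len(residues):
            raise IndexError(f"position {pos} is outside the sequence")
        residues[pos - 1] = new_residue
    return "".join(residues)


def alanine_scan(sequence, positions):
    """One single-point mutein per position, keyed in the form 'H4A'."""
    return {f"{sequence[p - 1]}{p}A": substitute(sequence, [p])
            for p in positions}


# Toy eight-residue sequence with locally numbered positions 4 and 5.
print(alanine_scan("MLFHGSPF", [4, 5]))
print(substitute("MLFHGSPF", [4, 5]))
```

Each mutein would then be expressed and reassayed for functional activity as described above.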
[0088] Screening for Other Tankyrase II Binding Proteins
[0089] Those skilled in the art will readily appreciate that the
assays described earlier in this section can be adapted to screen
for other proteins that may be involved in telomere regulation,
cell proliferative capacity, senescence, and apoptosis. The
Tankyrase II domains ANK and SAM have characteristic features of
protein binding molecules, and can be used to identify binding
partners for Tankyrase II by incubating with a candidate compound
under conditions suitable for binding, typically a physiological
isotonic buffer containing any necessary cofactors that may promote
transmolecular interaction. The formation of binding complexes with
a candidate binding partner that is demonstrably specific (by
virtue of being higher affinity than the binding of other candidate
compounds) correlates with binding activity for Tankyrase II.
Positive controls include peptides or proteins which have binding
activity for Tankyrase II, while negative controls include
ubiquitous and generally unreactive compounds, such as albumin.
Candidates likely to screen positive for Tankyrase II binding
include fragments and homologs related to telomere-associated
proteins such as TRF1.
[0090] This type of conjugate binding assay can be conducted in
several different formats. For example, Tankyrase II containing
protein complexes can be isolated from human tissues or cell lines,
for example, by tracking Tankyrase II through standard protein
purification regimens by way of ribosylation activity or Tankyrase
II antibody binding. In a similar approach, natural sources of
Tankyrase II are solubilized in a suitable buffer, and Tankyrase II
complexes are immunoprecipitated. The conjugate binding partner is
then recovered from the complex, and characterized by physical and
chemical criteria (such as apparent molecular weight determined by
SDS gel electrophoresis), amino acid sequencing, or binding assays
with Tankyrase II domains. In another example, Tankyrase II is
labeled with a traceable substituent, such as biotin, a fluorescent
group, an enzyme, a radioisotope, or a peptide group (e.g., FLAG,
HA, myc, or an immunoreactive peptide sequence), and then combined
in a reaction solution with an isolated candidate binding partner,
or with a mixture of components (such as a cell extract) in which
compounds with Tankyrase II binding activity may be found.
Formation of complexes with the labeled Tankyrase II is then
detected (for example, by gel shift techniques or
immunoprecipitation), and correlated with binding activity for
Tankyrase II.
[0091] Another format is a coexpression system, using Tankyrase II,
a Tankyrase II fragment, or a Tankyrase II homolog as bait. For
example, a yeast two-hybrid screen system is employed, in which a
Tankyrase II encoding sequence is fused to one part of the
expression system, and a library of candidate binding partners is
fused to the complementary component needed for expression of the
marker. Cloned cells that express the activity of the marker
contain an insert that comprises the encoding sequence for a
Tankyrase II binding partner. Yeast two-hybrid screen systems are
described generally in Bianchi et al., EMBO J., 16:1785, 1997.
Reagents and suitable libraries (e.g., human fetal liver cDNA
transformants) are commercially available from Clontech, Palo Alto
Calif.
[0092] Antibodies
[0093] Antibody molecules of this invention include those that are
specific for any novel peptide encompassed in this disclosure.
These antibodies are useful for a number of purposes, including
assaying for the expression of Tankyrase-II, and purification of
Tankyrase-II peptides by affinity purification.
[0094] Polyclonal antibodies can be prepared by injecting a
vertebrate with a polypeptide of this invention in an immunogenic
form. If needed, immunogenicity of a polypeptide can be enhanced by
linking to a carrier such as KLH, or combining with an adjuvant,
such as Freund's adjuvant. Typically, a priming injection is
followed by a booster injection after about 4 weeks, and
antiserum is harvested a week later. If desired, the specific
antibody activity can be further purified by a combination of
techniques, which may include Protein-A chromatography, ammonium
sulfate precipitation, ion exchange chromatography, HPLC, and
immunoaffinity chromatography using the immunizing polypeptide
coupled to a solid support. Antibody fragments and other
derivatives can be prepared by standard immunochemical methods,
such as subjecting the antibody to cleavage with enzymes such as
papain, pepsin, or trypsin.
[0095] Any unwanted cross-reactivity can be removed by treating the
polyclonal antibody mixture with adsorbents made of the cross-reacting
antigens attached to a solid phase, and collecting the unbound fraction.
Contaminating activity against other proteins containing ANK, SAM,
or PARP domains, or against Tankyrase I, or against Tankyrase-II
from other species, can all be removed by adsorption if such
cross-reactivity would interfere with the intended use of the
antibody. Specificity of the original antisera can be improved to
start with, by immunizing with peptide fragments of Tankyrase-II
that are substantially distinct from the equivalent region of the
homologous protein. Alternatively, antibodies that cross-react with
Tankyrase I can be enriched by immunizing with peptide sequences
that are shared between the two proteins. This is illustrated in
Example 11.
[0096] Production of monoclonal antibodies is described in such
standard references as Harlow & Lane (1988), U.S. Pat. Nos.
4,491,632, 4,472,500 and 4,444,887, and Methods in Enzymology 73B:3
(1981). Briefly, a mammal is immunized as described above, and
antibody-producing cells (usually splenocytes) are harvested. Cells
are immortalized, for example, by fusion with a non-producing
myeloma, transfecting with Epstein Barr Virus, or transforming with
oncogenic DNA. The treated cells are cloned and cultured, and the
clones are selected that produce antibody of the desired
specificity.
[0097] Other methods of obtaining specific antibody molecules
(optimally in the form of single-chain variable regions) involve
contacting a library of immunocompetent cells or viral particles
with the target antigen, and growing out positively selected
clones. Immunocompetent phage can be constructed to express
immunoglobulin variable region segments on their surface. See Marks
et al., New Eng. J. Med. 335:730, 1996, International Patent
Applications WO 94/13804, WO 92/01047, WO 90/02809, and McGuiness
et al., Nature Biotechnol. 14:1449,1996.
[0098] Antibodies can be raised that distinguish between Tankyrase
II and Tankyrase I by selecting an immunogenic peptide from a
region unshared by the ANK, SAM, or PARP domains of Tankyrase I, or
other proteins having one of these domains. Suitable subregions of
Tankyrase II are shown in Table 3.
TABLE 3 Immunogen Sequences for Tankyrase II Specific Antibody
Amino Acid Sequence | Location | SEQ. ID NO:
MSGRRCAGGGMCASAAAEAVE | Beyond N-terminal of ANK domain (GC region) | 21
TAAMPPSALPSCYKPQVLNGVRSPG ATADALSSGPSSPSSLSAASSLDNLS GSFSELSSVVSSSGTEGASSLEKKEV PGVDFSITQFVRN | Sequence between ANK and SAM domains | 22
RPEGMVDG | Beyond C-terminal of PARP domain | 23
[0099] Antibody molecules in a polyclonal antiserum against intact
Tankyrase II can be screened to map immunogenic portions of the
amino acid sequence. Sequential peptides about 12 residues long,
overlapping by about 8 residues, are synthesized to cover the
entire protein (SEQ. ID NO:6). The peptides can be prepared on a
nylon membrane support by standard Fmoc chemistry, using a
SPOTS.TM. kit from Genosys according to manufacturer's directions.
Prepared membranes are overlaid with the antiserum, washed, and
overlaid with .beta.-galactosidase conjugated anti-immunoglobulin.
Positive staining identifies antigenic regions, which, in an
appropriate context, may themselves be immunogenic. There will also
be antibodies that span different parts of the primary structure,
or which rely on a conformational component not displayed in
smaller peptides.
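The peptide tiling described in this paragraph can be illustrated with a short sketch. The Python function below is not part of the disclosed method. It is a minimal illustration that assumes exactly 12 residues per peptide and exactly 8 residues of overlap (the text says "about"), and the function name is hypothetical. The last peptide is anchored at the C-terminus so that the final residues are always covered.

```python
def overlapping_peptides(protein, length=12, overlap=8):
    """Return (start, peptide) pairs tiling the protein; starts are 1-based."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    step = length - overlap  # each peptide starts this many residues after the previous one
    peptides = []
    start = 0
    while start + length < len(protein):
        peptides.append((start + 1, protein[start:start + length]))
        start += step
    # Final window is anchored to the C-terminus so the last residues are covered.
    last = max(len(protein) - length, 0)
    peptides.append((last + 1, protein[last:]))
    return peptides


# Illustration with the 21-residue GC region sequence listed in Table 3.
for start, peptide in overlapping_peptides("MSGRRCAGGGMCASAAAEAVE"):
    print(start, peptide)
```

For a full-length protein the same call would simply be given the complete amino acid sequence as a string.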
[0100] The antibodies of this invention can be used in immunoassays
to detect or quantitate any of the polypeptides of this invention,
including the natural form of Tankyrase II present in biological
fluid or tissue samples. For example, it may be desirable to
measure Tankyrase II in a clinical sample to determine whether the
level of Tankyrase II expression is abnormal, and then to correlate
the finding with the presence or status of a disease associated
with increased or decreased Tankyrase II activity or abundance.
[0101] General techniques of immunoassay can be found in "The
Immunoassay Handbook", Stockton Press NY, 1994; and "Methods of
Immunological Analysis", Weinheim: VCH Verlagsgesellschaft mbH,
1993. The antibody is combined with a test sample under conditions
where the antibody will bind specifically to any modulator that
might be present, but not any other proteins liable to be in the
sample. The complex formed can be measured in situ (U.S. Pat. Nos.
4,208,479 and 4,708,929), or by physically separating it from
unreacted reagents (U.S. Pat. No. 3,646,346). Separation assays
typically involve labeled Tankyrase-II reagent (competition assay),
or labeled antibody (sandwich assay) to facilitate detection and
quantitation of the complex. Assays of this nature can also be used
in a competitive format to identify antibodies that bind to the
same epitope on a target compound. In one such format, the
reference antibody is labeled, and tested for binding to Tankyrase
II in competition with a test antibody. Antibodies can also be
screened to identify those with inhibitory capacity for the binding
and catalytic activities of Tankyrase II.
[0102] Modulating Tankyrase II Activity
[0103] This invention provides a number of different approaches to
modulate Tankyrase II activity in a live cell. In one embodiment,
the cell is genetically altered using a polynucleotide that affects
expression of Tankyrase II at the transcription or translation
level. Suitable polynucleotides include antisense sequences,
ribozymes, or polynucleotides that form triplexes with the
chromosomal gene for Tankyrase II, all of which were described in
more detail earlier in this disclosure. In another embodiment,
activity of Tankyrase II within the cell is inhibited by a peptide
inside the cell that prevents Tankyrase II from exercising its
usual function. Suitable peptides include intracellular antibody
constructs that bind to regions of Tankyrase II necessary for
catalytic or molecular binding activity, and dominant negative
homologs that compete for the binding between Tankyrase II and a
Tankyrase II binding partner. Proteins of this nature can be
introduced into a cell by contacting the cell with a polynucleotide
expression vector for the intracellular antibody or the mutant
homolog.
[0104] Also contemplated are small molecule drugs that have the
capability of modulating either Tankyrase II catalytic activity or
its binding to conjugate partners. The ability to inhibit
association between Tankyrase II and accessory proteins can be
determined by introducing candidate inhibitors into any of the
peptide binding assays described earlier, and correlating a
decrease in protein complex formation with inhibitory capacity of
the candidate.
[0105] Compounds can be screened for an ability to modulate
ribosylation by preparing a reaction mixture comprising the test
compound, either Tankyrase II or the Tankyrase II PARP domain (or a
functional equivalent), the NAD.sup.+ substrate, and a ribosylation
target (Tankyrase II itself, or a Tankyrase II associated protein).
Ribosylation is monitored by incorporation of .sup.32P or .sup.33P
from labeled NAD.sup.+ substrate into the solid phase, by a change
of size of the target protein, or by any of the other techniques
described earlier. An increase in ribosylation of the target
correlates with an ability of the compound to enhance Tankyrase II
ribosylation activity, while a decrease in ribosylation of the
target correlates with inhibitory capacity of the test compound.
The compound can also be screened in one or more parallel assays to
determine whether it has the capacity to modulate the ribosylation
activity of other enzymes--such as Tankyrase I, and other proteins
containing PARP domains (reviewed recently by Still et al.,
Genomics 62:533, 1999).
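The readout logic of this screening assay can be sketched as a simple comparison against a reaction run without the test compound. The Python sketch below is illustrative only. The function name, the use of a no-compound control reaction, and the 20% cut-off are assumptions made for the example and are not specified in this disclosure.

```python
def classify_compound(signal_with_compound, signal_control, threshold=0.2):
    """Classify a test compound from labeled ADP-ribose incorporation.

    signal_with_compound and signal_control are incorporation measurements
    (for example, counts) with and without the compound. threshold is a
    hypothetical fractional change required to call an effect.
    """
    if signal_control <= 0:
        raise ValueError("control signal must be positive")
    change = (signal_with_compound - signal_control) / signal_control
    if change >= threshold:
        return "enhancer"      # increased ribosylation of the target
    if change <= -threshold:
        return "inhibitor"     # decreased ribosylation of the target
    return "no clear effect"


print(classify_compound(400, 1000))
```

Running the same classification on parallel assays with Tankyrase I or another PARP domain protein would indicate whether an effect is specific for Tankyrase II.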
[0106] Compounds that modulate the activity of Tankyrase II but not
other ribosylation enzymes can be selected when it is desirable to
obtain compounds that are specific for Tankyrase II. These assays
can be used to screen random combinatorial libraries of small
molecule compounds, or as part of rational drug design, based on
known PARP inhibitors such as 3-amino-benzamide. Other potential
Tankyrase II inhibitors include 4-amino-1,8-naphthalimide
(Schlicker et al., Int. J. Radiat. Biol. 75:91, 1999),
thiophenecarboxamides (Shinkwin et al., Bioorg. Med. Chem. 7:297,
1999), and 2-nitroimidazol-5-ylmethyl (Bioorg. Med. Chem. Lett.
9:2031, 1999).
[0107] It is potentially beneficial to modulate Tankyrase II
activity in conditions associated with overexpression or
underexpression of Tankyrase II. Peptides, expression systems, and
small molecule drugs can also be screened according to the effect
on cell biology. Cells expressing Tankyrase II treated with the
test system can thereafter be monitored for an effect on telomere
length as described earlier, or on replicative capacity in
proliferation culture.
[0108] The following examples are provided as further non-limiting
illustrations of particular embodiments of the invention.
EXAMPLES
Example 1
Identification of expressed sequence tags for Tankyrase II
[0109] A BLAST search against the GenBank dbEST database using the
Tankyrase I sequence identified several expressed sequence tags
(ESTs). Many of the ESTs were identical in DNA sequence to the
Tankyrase I gene. However, several ESTs coded peptides distinct
from Tankyrase I. Further evaluation of these ESTs revealed they
represented a distinct gene, termed Tankyrase II, since the DNA
sequence identity to Tankyrase I was significantly lower than the
amino acid identity, with a preponderance of silent third position
codon changes.
[0110] The ESTs R64714, AA244138, and AA244137 contain sequences of
the ankyrin domain of Tankyrase II; the EST AA307492 contains
sequences of the SAM domain; ESTs H11865, H17748, N57467, R06946,
AI247608, and R06902 contain sequences of the PARP domain.
Additional 5' sequence of the 3' EST AI247608 clone revealed it
diverges from the tankyrase gene and does not overlap with the SAM
domain. It may contain an unprocessed intron. The 3' ESTs H11505,
H17635, and N29528 were identified in GenBank as partners for the
5' ESTs H11865, H17748, and N57467, respectively. These 9 ESTs
along with additional sequence obtained from the H11865, H17748,
N57467, R06946 clones formed a contig containing the PARP domain
and .about.1 kbp of the 3' UTR, including a poly-A tail. These ESTs
formed 3 contigs that contained 3 of the Tankyrase II domains and
approximately 40% of the coding region.
[0111] Additionally, AA088990 and AI426537 are ESTs containing the
ankyrin and PARP domains of a putative mouse Tankyrase II,
respectively.
Example 2
Cloning of the N-terminus of Tankyrase II
[0112] To extend the ANK EST contig, Rapid Amplification of cDNA
Ends (RACE.TM.) (Gibco-BRL, #18374-058) was performed using the
primers tankII-2 and tankII-3 and (i) poly A+RNA from BJ
fibroblasts transduced with a retroviral vector expressing the
human TERT gene (pBABE-TERT) and (ii) the Marathon-Ready testis
cDNA (Clontech, #7414-1). This was followed by nested amplification
using primers LtankII-1 and LtankII-2, respectively.
[0113] The products from these amplifications were cloned into
pCR2.1-TOPO (InVitrogen, #45-0641). Four clones were identified by
PCR with primers (AAP [Gibco] or AP-1 [Clontech]) to the vector
and the ANK EST contig (LtankII-1 or LtankII-2), termed inside-out
PCR, to contain additional DNA 5' to the EST contig. Custom primers
were designed based on the evolving sequence data. Subsequent
sequence analysis indicated that only two, designated MP9 and MP12,
contained authentic Tankyrase II sequences. "Inside-out" PCR is the
term used to describe amplification using a primer pair in which
one primer is from the target gene (e.g., Tankyrase II) and the
second primer is specific for the vector.
[0114] Additional N-terminally extended clones were isolated by the
GeneTrapper (Gibco-BRL, #10356-020) cDNA clone enrichment procedure
using the oligonucleotides LtankII-4B, LtankII-5B, and LtankII-6B
and plasmid cDNA libraries from liver and spleen (Gibco-BRL,
#10422-012 and 10425-015, respectively). Approximately 100
GeneTrapper clones were screened by colony hybridization with a PCR
probe (described infra) from the ANK clone AA244138 and by
inside-out PCR (primers: SP6/tankII-2 and SP6/tankII-3) to identify
four clones (S10, S25, S34, and L11) that contained additional DNA
5' to the EST contig.
[0115] Sequence analysis of the 2 RACE and 4 GeneTrapper clones
formed a contig that extended approximately 200 bp downstream and
approximately 1100 bp upstream of the original ANK EST contig. The
5' most sequence terminated in DNA homologous to the most
N-terminal tankyrase ANK repeat just after the HPS domain.
Example 3
Identification of .lambda. Bacteriophage Clones of Tankyrase II
[0116] Three .lambda. bacteriophage human cDNA libraries,
.lambda.gt10 thymus (Clontech, HL1074a), .lambda.gt11 293 human
embryonic kidney cancer cell line, and .lambda.Triplx Testis
(Clontech, cat # HL5033t), were screened by plaque hybridization
with a probe from the ANK clone AA244138. The probe was generated
by PCR using the primers UtankII-5 and LtankII-7. Twenty-six phage
were positively identified through secondary and tertiary plaque
hybridizations. Using PCR the presence of the ANK EST contig was
confirmed in all 26 phage. Additional PCR was used to identify one
phage (.lambda.11L-1-1) from the 293 library that contained the
most N-terminal ankyrin repeat and the ANK, SAM, and PARP contigs.
The .lambda.11L-1-1 insert is believed to contain the entire
Tankyrase II coding sequence. Two other phage, .lambda.11L-1-3 and
.lambda.11L-1-4, from the 293 library were identified that
contained the most N-terminal ankyrin repeats and the original ANK
contig. Inside-out PCR (primers gt10-5'/LtankII-31 or
gt11-5'/LtankII-31), showed these clones contained up to 800 bp of
additional Tankyrase II sequence upstream of the most N-terminal
ankyrin repeat. These vector/N-terminal insert PCR products and PCR
products from these four phage that linked the SAM and ANK contigs
and the SAM and PARP contigs were sequenced directly.
[0117] The .lambda.11L-1-1, .lambda.11L-1-3, or .lambda.11L-1-4 can
be individually characterized by the following tests. Phage
.lambda.11L-1-1 contains DNA that can be amplified with the primer
pairs UtankII-5/LtankII-16, UtankII-5/LtankII-9, and
UtankII-3/LtankII-10 (Table 4). Phage .lambda.11L-1-3 contains DNA
that can be amplified with the primer pair UtankII-5/LtankII-16,
but not with the primer pairs UtankII-5/LtankII-9 and
UtankII-3/LtankII-10 (Table 4). Phage .lambda.11L-1-4 contains DNA
that can be amplified with the primer pairs UtankII-5/LtankII-16,
UtankII-5/LtankII-9, but not with the primer pair
UtankII-3/LtankII-10 (Table 4). Additional sequence was obtained by
amplifying phage .lambda.11L-1-4 DNA using UTankII-3 and LTankII-11
primers (PARP/SAM spanning sequence).
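The three PCR tests in this paragraph amount to a lookup from amplification pattern to phage. A minimal Python sketch of that lookup follows. The data structure and function name are hypothetical, and the sketch assumes each primer pair is scored simply as amplified or not amplified.

```python
# Primer pairs in the order used for the pattern keys below.
PRIMER_PAIRS = (
    "UtankII-5/LtankII-16",
    "UtankII-5/LtankII-9",
    "UtankII-3/LtankII-10",
)

# Amplification patterns as described in the text (True = product obtained).
PHAGE_PATTERNS = {
    (True, True, True): "lambda 11L-1-1",
    (True, False, False): "lambda 11L-1-3",
    (True, True, False): "lambda 11L-1-4",
}


def identify_phage(results):
    """Map a dict of primer pair -> bool to a phage name, or None if no match."""
    key = tuple(bool(results[pair]) for pair in PRIMER_PAIRS)
    return PHAGE_PATTERNS.get(key)


print(identify_phage({
    "UtankII-5/LtankII-16": True,
    "UtankII-5/LtankII-9": True,
    "UtankII-3/LtankII-10": False,
}))
```

A pattern that matches none of the three phage returns None, which would flag a failed reaction or an unexpected clone.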
Example 4
Tankyrase II Amino-terminal Domain Sequence
[0118] More than 200 GeneTrapper clones obtained using UTankII-4B,
UTankII-5B and UTankII-6B oligonucleotides were probed by colony
hybridization with a .sup.32P-labeled PCR fragment from clone MP9
(supra) using primers UTank2-30 and LTank2-1. Of the positive
clones identified, sequence was obtained from 5 independent clones
from two different cDNA libraries (SuperScript human liver cDNA and
SuperScript human spleen cDNA libraries [Clontech]). The clones
are: S4.66, S4.21, S6.7, S6.91, L5.4. Of these, L5.4 had the longest 5'
sequence. L5.4 was deposited with the ATCC (American Type Culture
Collection, 10801 University Boulevard, Manassas, Va.) and assigned
Accession No. 203919. The DNA encoding the amino-terminal region of
the Tankyrase II polypeptide is extremely GC-rich (>80% in the
sequence).
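GC content of the kind quoted above is a simple base count. The Python sketch below is an illustration only, with a hypothetical function name. Because the amino-terminal cDNA sequence is not reproduced in this paragraph, the example applies the calculation to primer UTank2-30 from Table 4 rather than to the GC-rich region itself.

```python
def gc_fraction(sequence):
    """Fraction of G and C bases in a DNA sequence; spaces are ignored."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    return sum(b in "GC" for b in bases) / len(bases)


# Primer UTank2-30 as listed in Table 4 (SEQ. ID NO:50).
print(round(gc_fraction("CGG CGG GCA GGA AAT CCA CC"), 2))
```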
[0119] FIGS. 2 and 3 show cDNA and protein sequence data obtained
for Tankyrase II.
TABLE 4 Primers for amplifying Tankyrase II sequences
Forward & Reverse Primer Designation | Nucleotide Sequence | SEQ. ID NO:
UTANKII-1 | GTT ACA TTT GCC ACA GGC AG | 24
UTANKII-2 | GTC TTT CTT GCA GTT CAG TG | 25
UTANKII-3 | GAG TCG AGA GAC TTA TCT CC | 26
UTANKII-4A | GAG CAC AGA GAT GGA GGT C | 27
UTANKII-4B | ATG TAC AGC AAC TCC TCC AAG A | 28
UTANKII-5A | CAG ACA ATT GCT GGA AGC TG | 29
UTANKII-5B | CAG ACA ATT GCT GGA AGC TGC A | 30
UTANKII-6A | CTA CTC CTG AGC TAT GGG TG | 31
UTANKII-6B | GTG TAC TGT TCA GAG TGT CAA C | 32
LTANKII-7 | CCA TGC TGG AGC AGA AGT TTG | 33
LTANKII-8 | GCT AAA ATC TCT CCT GGA ACC | 34
LTANKII-9 | GTT TGT GCC TAT GTC CAT AAG C | 35
LTANKII-10 | CAA AAG AGC AGC TGC CTG TG | 36
LTANKII-11 | CTG CAG GAA AGA CTT TCC CAA G | 37
UTANKII-12 | GCA GCC AGT GGC CCT CTA CG | 38
UTANKII-13 | GCC CCA CAG GCC TGT GGC C | 39
UTANKII-14 | GAA ACT AAT TCC CAC TAA CC | 40
LTANKII-15 | AAT AAA TAC TGG GCT AGT AC | 41
LTANKII-16 | AGG GTC TGC ACC ATG CTG GAG C | 42
LTANKII-17 | ATA AAT CAG CTA CAT TAA CTA C | 43
LTANKII-18 | CCC AGC TGC AAA ATG AAG T | 44
LTANKII-19 | AAT GAC TCT GCA GTT GAC AC | 45
UTANKII-20 | GAT ACA CTC ACC GGA GAA AAG | 46
LTANKII-21 | GTG AAC TGG ACA CCC AGT ACC | 47
UTANKII-22 | GGT ATG GTC GAT GGA TAA ATA G | 48
LTANKII-23 | GAA CAC AGT ATT GTA TTA G | 49
UTANK2-30 | CGG CGG GCA GGA AAT CCA CC | 50
LTANK2-31 | TTG GGG TCT GCA CCA TGT CG | 51
UTANK2-32 | TCC AGA GGC TGG TGA CCC CTG A | 52
LTANKII-33 | TCT GCT AAA TCC AAT GCT GTC C | 53
LTANKII-34 | TGC AGC GGG GTG GAT TTC CT | 54
LTANKII-35 | CAT TTT GAA GCA AAT ATT TA | 55
LTANKII-36 | GGA ATA AGG CCC CCA TTA TA | 56
LTANKII-35 | CAT TTT GAA GCA AAT ATT TA | 57
LTANKII-36 | GGA ATA AGG CCC CCA TTA TA | 58
Example 5
Northern Hybridization of Tankyrase II mRNA
[0120] A Northern blot (Human Multiple Tissue Northern (MTN.TM.)
Blot, obtained from Clontech, Cat #7780-1) was hybridized with
a 3'UTR probe at 2.times.10.sup.6 cpm/ml hybridization solution.
The 3' UTR fragment was amplified by PCR with UTank2-14 and
LTank2-15 primers using the EST clone N57467 as a template.
[0121] The Northern analysis showed that the Tankyrase II
transcript is about 6 to 7.5 kb in length, and is expressed in most
tissues, including brain, heart, colon, thymus, spleen, kidney,
liver, small intestine, lung, and peripheral blood leukocytes. It
appears to be particularly abundant in skeletal muscle and
placenta. "BJ RNA" is polyadenylated RNA isolated from a human
fibroblast cell line designated BJ.
Example 6
Plasmid Clones of the Tankyrase II cDNA
[0122] The isolated Tankyrase II cDNA bacteriophage clones were
transferred to plasmid vectors as follows:
[0123] 1. The Tankyrase II cDNA contained in bacteriophage
.lambda.11L-1 was removed as a BsiW1 fragment and inserted into the
Acc65 I site of pBluescript II SK+ (Stratagene) (designated
pGRN509).
[0124] 2. The Tankyrase II cDNA contained in bacteriophage
.lambda.11L-3 was removed as a BsiW1 fragment and inserted into the
Acc65 I site of pBluescript II SK+ (Stratagene) (pGRN510).
[0125] 3. The Tankyrase II cDNA contained in bacteriophage
.lambda.11L-4 was removed as a BsiW1 fragment and inserted into the
Acc65 I site of pBluescript II SK+ (Stratagene) (pGRN511).
[0126] The PARP domain from pGRN509 was amplified with the primer
hParp1 (5'-CC ATCGAT GCCAGCCATG GAG GTT CCA GGA GTA GAT-3'; SEQ. ID
NO:59) and primer hParp2 (5'-GCTCT AGA TCA GGC CTC ATA ATC TGG-3';
SEQ. ID NO:60) using PFU/Taq polymerase mixture. The resulting
fragment was TA cloned into the InVitrogen TA cloning vector
pCR2.1-TOPO.RTM.. A clone (pGRN513) was selected with the sense
strand downstream of the T7 promoter. The primer hParp1 introduces
an ATG and Kozak consensus sequence at the 5' end of the PARP
domain.
[0127] To assemble a full length cDNA containing the Tankyrase II
ORF, fragments from pGRN511 and pGRN509 were combined as follows.
The Not I fragment of pGRN511 was inserted into the Not I site of
pBluescript II KS+ (Stratagene) (pGRN512). pGRN512 was digested
with Nhe I and Cla I and the larger vector/Tankyrase II cDNA
fragment was isolated. The .about.2.1 kbp Nhe I-Cla I fragment of
pGRN509 was ligated to this fragment to generate a clone containing
the full length Tankyrase II cDNA ORF (pGRN514).
Example 7
Further Sequence Data for the Tankyrase II cDNA
[0128] The plasmids pGRN509 and pGRN511 were sequenced with an ABI
377 automated DNA sequencer by standard techniques using primers
complementary to the insert sequences.
[0129] FIG. 4 shows the revised cDNA sequence (SEQ. ID NO:5), and
the revised amino acid translation (SEQ. ID NO:6). The translated
protein product is presumed to begin at the Met encoded at position
224 of the cDNA sequence, and to end at position 3721. The Met is
assigned position No. 1 for purposes of numbering the amino acid
translation. However, the upstream polynucleotide sequence shown
contains no stop codon, and the translation starting Met may be
further upstream from the insert shown in the figure.
[0130] The Tankyrase II protein corresponding to nucleotides 224 to
3721 is 1166 amino acids long. The calculated
molecular weight is 126.8 kDa, and the calculated isoelectric point
is 6.78.
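The relationship between the cDNA coordinates and the length of the translation can be checked with simple arithmetic. The Python sketch below is illustrative, with a hypothetical function name. It takes the start position 224 and end position 3721 stated in paragraph [0129] as inclusive, 1-based coordinates, which is an assumption about how the positions are counted.

```python
def codons_in_span(first_nt, last_nt):
    """Number of complete codons between two 1-based, inclusive cDNA positions."""
    length = last_nt - first_nt + 1
    if length <= 0:
        raise ValueError("last_nt must not precede first_nt")
    if length % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return length // 3


# 3721 - 224 + 1 = 3498 nucleotides, i.e. 1166 codons.
print(codons_in_span(224, 3721))
```

The span from 224 to 3721 contains exactly 1166 codons, in agreement with the stated length of 1166 amino acids.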
[0131] FIG. 1 shows the location of the functional domains in the
Tankyrase II sequence. The position of each domain within the
sequence is shown in Table 5.
TABLE 5 Location of Functional Domains in Tankyrase II
Domain | Position
GC | 1 to 22
ANK | 23 to 859
SAM | 870 to 935
PARP | 1023 to 1161
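The domain boundaries of Table 5 can be held in a small lookup for checking which domain, if any, contains a given residue. The following Python sketch is an illustration only. The names are hypothetical, and the positions are taken as inclusive amino acid numbers counted from the presumed starting Met.

```python
# (domain, first residue, last residue), from Table 5; positions are inclusive.
DOMAINS = [
    ("GC", 1, 22),
    ("ANK", 23, 859),
    ("SAM", 870, 935),
    ("PARP", 1023, 1161),
]


def domain_at(residue):
    """Return the name of the domain containing the residue, or None."""
    for name, first, last in DOMAINS:
        if first <= residue <= last:
            return name
    return None  # inter-domain linker, C-terminal tail, or out of range


print(domain_at(900))
```

Residues in the linkers between domains, or beyond the PARP domain, return None.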
[0132] FIG. 5 and FIG. 6 compare Tankyrase II (SEQ. ID NO:6) with
its closest known intraspecies homolog, Tankyrase I (SEQ. ID
NOs:8), at the protein level, and at the cDNA level.
[0133] The degree of sequence identity of Tankyrase II relative to
Tankyrase I was determined in this example by dividing the two
proteins into their functional domains. Identity was then
calculated by dividing the number of matched residues by the number
of matched and mismatched residues over each area, scoring half a
point for each unmatched residue occurring in a gap or overhang on
either side. Results were as follows:
[0134] N-terminus: 7 matches/7 matches+20 mismatches+(79
gaps/2)=7%
[0135] Ankyrin repeats: 720/720+104+(7/2)=87%
[0136] Inter domain 1: 7/7+6+(2/2)=50%
[0137] SAM domain: 54/54+12=82%
[0138] Inter domain #2: 72/72+15=83%
[0139] PARP domain: 132/132+7=95%
[0140] C-terminus: 0/5+(8/2)=0%
[0141] Overall: 992/1241.5=79.9%
[0142] Overall (discounting N- and C-termini): 985/1133.5=86.9%
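The scoring rule used above (matched residues divided by matched plus mismatched residues, with each gap or overhang residue counted as half) can be written out as a short function. The Python sketch below is an illustration with a hypothetical function name. It reproduces the ankyrin repeat, first inter-domain, SAM and PARP figures from the counts listed above.

```python
def domain_identity(matches, mismatches, gap_residues=0):
    """Fractional identity with each gap or overhang residue scored as half a point."""
    denominator = matches + mismatches + gap_residues / 2
    if denominator <= 0:
        raise ValueError("no aligned residues to score")
    return matches / denominator


# Counts taken from the per-domain results listed in this example.
for name, counts in [
    ("Ankyrin repeats", (720, 104, 7)),
    ("Inter domain 1", (7, 6, 2)),
    ("SAM domain", (54, 12, 0)),
    ("PARP domain", (132, 7, 0)),
]:
    print(name, round(100 * domain_identity(*counts)))
```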
Example 8
Testing for PARP Activity
[0143] To produce a Tankyrase II peptide comprising the PARP
domain, pGRN513 was transcribed and translated in a 20 fold scale
up in vitro coupled transcription/translation (TnT) reaction.
Full-length Tankyrase II peptides can be obtained by similar
procedures, using plasmids designated pGRN514 or pGRN523.
[0144] Each plasmid was set up as follows, paired with a reaction
in which the plasmid DNA was omitted as an unprogrammed control.
The reaction mixture contained 20 .mu.g of circular plasmid pGRN513
or pGRN514 or pGRN523; 500 .mu.l rabbit reticulocyte lysate;
1.times.TnT Buffer; 20 .mu.l T7 RNA polymerase; 20 .mu.l 1 mM
complete amino acids; 20 .mu.l RNAguard.TM.; and dH.sub.2O to 1 ml
total volume. The reactions were incubated for 90 minutes at
30.degree. C., then pooled and made 50% with ammonium sulfate. The
resulting pellets were washed with 50% ammonium sulfate and
resuspended in either 400 .mu.l PARP buffer A (50 mM Tris-HCl pH 8,
4 mM MgCl.sub.2, 0.2 mM DTT, 50 mM NaCl, 10 mM
.beta.-mercaptoethanol, 1 mM PMSF) for the Tankyrase II reactions,
or in 100 .mu.l PARP buffer A for the unprogrammed control. The TnT
resuspended lysate was then dialysed overnight against two changes
of PARP buffer A to remove traces of ammonium sulfate.
[0145] The following assays were performed to determine PARP
activity:
[0146] 1. .sup.35S-labelled Tankyrase without NAD.sup.+
[0147] 2. .about.150 ng Tankyrase II PARP domain, with
NAD.sup.+
[0148] 3. .about.150 ng Tankyrase II PARP domain, with NAD.sup.+,
1.5 .mu.g TRF1, 5 .mu.g Histones
[0149] 4. .about.150 ng Tankyrase II PARP domain, with NAD.sup.+,
1.5 .mu.g TRF1, 5 .mu.g Histones
[0150] 5. .about.150 ng Tankyrase II PARP domain, with NAD.sup.+,
1.5 .mu.g TRF1, 5 .mu.g Histones and 1.6 mM 3-aminobenzamide
[0151] 6. Unprogrammed lysate with NAD.sup.+, 1.5 .mu.g TRF1, 5 .mu.g
Histones
[0152] 7. PARP control enzyme (Trevigen cat#4667-50-01), with
NAD.sup.+, TRF1, Histones
[0153] For Reaction 1, Tankyrase II was biosynthetically labeled
using [.sup.35S]methionine as a molecular weight marker of the
non-ribosylated form. The reaction mixture comprised 1 .mu.g of
circular plasmid pGRN513 or pGRN514 or pGRN523; 25 .mu.l rabbit
reticulocyte lysate; 1.times.TnT Buffer; 1 .mu.l T7 RNA polymerase;
1 .mu.l 1 mM methionine; 2 .mu.l [.sup.35S] Met (1000 Ci/mmol); 1
.mu.l RNAguard.TM.; and dH.sub.2O to 50 .mu.l total volume. The product
was precipitated with 50% ammonium sulfate but not dialysed.
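The 1 ml reaction of paragraph [0144] is described as a 20 fold scale-up of a standard 50 .mu.l TnT reaction. The arithmetic can be sketched as follows. The Python dictionary keys and function name are hypothetical, and only the components whose amounts scale directly between the two recipes are included. The amino acid components differ between the labeled and unlabeled reactions and are left out.

```python
# Components of the 50 ul reaction that are common to both recipes.
BASE_REACTION = {
    "plasmid_ug": 1.0,
    "reticulocyte_lysate_ul": 25.0,
    "t7_rna_polymerase_ul": 1.0,
    "rnaguard_ul": 1.0,
    "total_volume_ul": 50.0,
}


def scale_reaction(recipe, factor):
    """Multiply every component amount by the scale-up factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return {component: amount * factor for component, amount in recipe.items()}


for component, amount in scale_reaction(BASE_REACTION, 20).items():
    print(component, amount)
```

Scaling by 20 gives 20 .mu.g plasmid, 500 .mu.l lysate, 20 .mu.l each of T7 RNA polymerase and RNAguard, and a total volume of 1000 .mu.l, which matches the 1 ml reaction mixture described above.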
[0154] Reactions 2 to 6 were conducted under the following assay
conditions: 1.times.PARP enzyme buffer (Trevigen Cat#4667-50-02),
40 .mu.M [.sup.32P]NAD.sup.+ (50 .mu.Ci). Reactions 2, 3, and 5-7
were precipitated in 20% TCA, and the pellets were washed sequentially
with 5% TCA, 90% acetone/1 N HCl and 100% acetone. Pellets were
then resuspended in 40 .mu.L protein loading buffer and heated for
10 minutes at 80.degree. C.
[0155] Reaction 4 was immunoprecipitated with 10 .mu.L Anti-poly
(ADP-ribose) monoclonal antibody (Trevigen Cat#4335-MC-100) and 10
.mu.L Anti-poly (ADP-ribose) polyclonal antibody (Biomol Cat#
SA-276). The reaction was incubated on ice for 30 minutes in the
presence of the antibodies. 40 .mu.L Protein A slurry was added and
incubated with rotation at room temperature for 2 hours. Beads were washed
twice with PARP buffer A containing 50 mM NaCl, and once in PARP
buffer A containing 450 mM NaCl. Beads were boiled in 40 .mu.L
protein loading buffer. Reaction 7 was performed according to
manufacturer's directions (Trevigen). Samples from these reaction
mixtures were then analysed on a 12% SDS-PAGE gel. The gel was
dried and exposed to a phosphorimager screen and imaged.
Preliminary results of these assays have been inconclusive.
Example 9
Chromosomal Location of the Tankyrase II Gene
[0156] The Tankyrase II gene was localized to chromosome 10q by
radiation hybrid mapping (Boehnke et al., Am J. Hum Genet 49:1174,
1991; Walter et al., Nature Genet 7:22) using the medium resolution
Stanford G3 panel of 83 RH clones of the whole genome (created at
the Stanford Human Genome Center). A human lymphoblastoid cell line
(donor; rM) was exposed to 10,000 rad of X-rays and was then fused
with non-irradiated hamster recipient cells (A3). Eighty-three
independent somatic cell hybrid clones were isolated, and each
represents a fusion event between an irradiated donor cell and a
recipient hamster cell. The panel of G3 DNA was used for ordering
markers in the region of interest as well as establishing the
distance between these markers.
[0157] The primers used for RH mapping were UTANKII-20 and
LTANKII-21 (Table 4). The 83 pools were amplified independently and
13 (16%) scored positive for Tankyrase II. The amplification
results were submitted to the Stanford RH server, which then
provided the map location, 10q23.3, and the closest marker, STS
D10S536.
Example 10
Transcription and Translation of Tankyrase II PARP Domain
[0158] pGRN513 was tested by in vitro transcription/translation
(TnT) to confirm it encoded the appropriate sized protein. The PARP
domain is expected to run at approximately 35 kDa, as determined by
SDS polyacrylamide gel electrophoresis. TnT reactions were set up
for pGRN513 and pGRN125 (hTERT as a positive control for the TnT
reaction) with the following components:
[0159] 1 .mu.g of circular plasmid
[0160] 25 .mu.l rabbit reticulocyte lysate
[0161] 1.times.TNT Buffer (Promega Cat # L4610)
[0162] 1 .mu.l T7 RNA polymerase
[0163] 1 .mu.l 1 mM methionine
[0164] 2 .mu.l [.sup.35S] methionine (Amersham Cat
#SJ 1015, 1000 Ci/mmol)
[0165] 1 .mu.l RNAguard (Pharmacia Cat #27-0815-01)
[0166] dH.sub.2O to 50 .mu.l total volume
[0167] The reaction was incubated for 90 minutes at 30.degree. C.
40 .mu.l of the TnT reactions were precipitated with 50% ammonium
sulfate and resuspended in 40 .mu.l of buffer (20 mM HEPES-KOH pH
7.9, 2 mM MgCl.sub.2, 1 mM EGTA, 10% glycerol, 0.1% Nonidet P-40,
0.1 mM phenylmethylsulphonyl fluoride)/100 mM NaCl. 5 .mu.L of TnT
reaction, 5 .mu.l of ammonium sulfate cut TnT reaction and 5 .mu.L
of the ammonium sulfate cut were analyzed on a 12% SDS-PAGE gel.
pGRN513 generated the expected size fragment.
Example 11
Antibodies to Tankyrase II
[0168] Peptides are prepared on a synthesizer for use as immunogens
based on the sequence data shown in Table 6.
TABLE 6 Peptide Immunogens
Laboratory Designation | Sequence | Specificity | SEQ. ID NO:
GCJT-1 | MAASRRSQC | residues 1-8 of Tankyrase I | 61
GCJT-2 | MSGRRCAGK | residues 1-8 of Tankyrase II | 62
GCJT-3 | QEGISLGNSEADRQC | residues 481-494 of Tankyrase II | 63
QCJT-4 | GEYKKDELLEC | residues 269-276 of Tankyrase II; common to both proteins | 64
[0169] Underlined residues do not belong to the native sequence of
the proteins but are added to the peptides in order to couple them
to carriers for antibody production.
Biological Deposit
[0170] Phage .lambda.11L-1-1, .lambda.11L-1-3, .lambda.11L-1-4 were
deposited as a mixture with the ATCC (American Type Culture
Collection, 10801 University Boulevard, Manassas, Va.) on Apr. 12,
1999, under Accession No. 203919.
[0171] Phage are stored in a buffer of 5.8 g NaCl, 2 g
MgSO.sub.4.7H.sub.2O, 50 mL 1 M Tris-HCl pH 7.5, and 0.01%
gelatin (Difco). To isolate each phage, individual phage are
separated by plaque purification and tested by standard PCR
amplification. Tankyrase II sequence in phage .lambda.11L-1-1 can be
amplified with the primer pairs UtankII-5/LtankII-16,
UtankII-5/LtankII-9, and UtankII-3/LtankII-10 (Table 4, supra).
Tankyrase II sequence in phage .lambda.11L-1-3 can be amplified with
the primer pair UtankII-5/LtankII-16, but not with the primer pairs
UtankII-5/LtankII-9 and UtankII-3/LtankII-10. Tankyrase II sequence
in phage .lambda.11L-1-4 can be amplified with the primer pairs
UtankII-5/LtankII-16 and UtankII-5/LtankII-9, but not with the
primer pair UtankII-3/LtankII-10.
TABLE 7 Sequences Listed in this Disclosure
SEQ. ID NO: | Subject | Reference
1 | Human Tankyrase II DNA sequence | FIG. 2, this Invention. (60/128,577)
2 | Human Tankyrase II protein sequence | FIG. 2, this Invention. (60/128,577)
3 | Human Tankyrase II DNA sequence | FIG. 3, this Invention. (60/129,123)
4 | Human Tankyrase II protein sequence | FIG. 3, this Invention. (60/129,123)
5 | Human Tankyrase II DNA sequence | FIG. 4, this Invention.
6 | Human Tankyrase II protein sequence | FIG. 4, this Invention.
7 | Human Tankyrase I DNA sequence | GenBank Accession No. AF082556; Smith et al. Science 282:1484 (1998)
8 | Human Tankyrase I protein sequence | GenBank Accession No. AF082556; Smith et al. Science 282:1484 (1998)
9 | Human cDNA clone similar to Ankyrin G119 mRNA | GenBank Accession No. AA244138; R. Strausberg (unpublished)
10 | Human cDNA clone similar to Ankyrin-related ADP-ribose polymerase mRNA | GenBank Accession No. AW157349; L. Hillier et al. (unpublished)
[0172]
TABLE 8 Additional Sequence Data
GenBank Accession No. | Subject | Reference
U40705 | TRF1 sequence | Chong et al., Science 270:1663 (1995)
AF002999 | TRF2 sequence | Broccoli et al., Nature Genet. 17:231 (1997)
AF195512 | TIN2 sequence | Kim et al., Nat. Genet. 23:405 (1999)
For purposes of prosecution in the U.S.A., the DNA and encoded
amino acid sequences listed in this Table are hereby incorporated herein
by reference.
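The sequence listing that follows marks undetermined nucleotides with n or another ambiguity code (for example r, y, or w), and writes Xaa where the codon no longer specifies a single residue. The Python sketch below shows one way to work out which residues an ambiguous codon can encode. It assumes the standard genetic code and the IUPAC nucleotide ambiguity codes; the function name `possible_residues` is hypothetical and is not part of the disclosure. Residues are returned as one-letter codes, with `*` for a stop codon.

```python
from itertools import product

# IUPAC nucleotide codes that appear in the listing, expanded to plain bases.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "m": "ac", "k": "gt",
    "s": "cg", "w": "at", "n": "acgt",
}

# Standard genetic code, codons ordered t, c, a, g at each position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[i]
    for i, (a, b, c) in enumerate(product(BASES, repeat=3))
}


def possible_residues(codon):
    """Return the set of one-letter residues an ambiguous codon may encode."""
    options = [IUPAC[base] for base in codon.lower()]
    return {CODON_TABLE["".join(bases)] for bases in product(*options)}
```

Under these assumptions, `possible_residues("ncc")` gives Thr, Ala, Pro, and Ser, which matches the annotation for location 1 of SEQ. ID NO: 4, while `possible_residues("gar")` gives only Glu, which is why a codon with an ambiguity code can still be translated as a definite residue.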
[0173]
Sequence CWU 1
1
64 1 4493 DNA Homo sapiens CDS (1)..(3999) 1 nnn nnn nnn nnn nnn
nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn 48 Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 1 5 10 15 nnn nnn nnn
nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn 96 Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 20 25 30 nnn
nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn 144 Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 35 40
45 nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn
192 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
50 55 60 nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn
nnn nnn 240 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa 65 70 75 80 nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn
nnn nnn nnn nnn 288 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa 85 90 95 nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn
nnn nnn nnn nnn nnn nnn 336 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa 100 105 110 nnn nnn nnn nnn nnn nnn nnn nnn
nnn nnn nnn nnn nnn nnn nnn nnn 384 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 115 120 125 nnn nnn nnn nnn nnn nnn
nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn 432 Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 130 135 140 nnn nnn nnn nnn
nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn 480 Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 145 150 155 160 nnn
nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn 528 Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 165 170
175 nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn
576 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
180 185 190 nnn nnn nnn naa cca ttc cnn agg ctg gtg acc cct gaa aag
gtn aac 624 Xaa Xaa Xaa Xaa Pro Phe Xaa Arg Leu Val Thr Pro Glu Lys
Val Asn 195 200 205 anc cnc aac acg gng ggc agg aaa tcc acc ccg ctg
cac ttc ccc gca 672 Xaa Xaa Asn Thr Xaa Gly Arg Lys Ser Thr Pro Leu
His Phe Pro Ala 210 215 220 ggt ttt ggg cgg aaa aac ctn ntt aaa tat
ttg ctt caa aat ggt gca 720 Gly Phe Gly Arg Lys Asn Xaa Xaa Lys Tyr
Leu Leu Gln Asn Gly Ala 225 230 235 240 aat ntc caa nca ctt tat aat
ggg ggc ctt att cct ctt cat ant gca 768 Asn Xaa Gln Xaa Leu Tyr Asn
Gly Gly Leu Ile Pro Leu His Xaa Ala 245 250 255 tgc tct ttt ggt cat
gct aaa ant atc aat ctc ctt ttg cga cat ggt 816 Cys Ser Phe Gly His
Ala Lys Xaa Ile Asn Leu Leu Leu Arg His Gly 260 265 270 gca gac ccc
aat gct cga gat aat tgg aat tat act cct ctc cat gaa 864 Ala Asp Pro
Asn Ala Arg Asp Asn Trp Asn Tyr Thr Pro Leu His Glu 275 280 285 gct
gca att aaa gga aag att gat gtt tgc att gtg ctg tta cag cat 912 Ala
Ala Ile Lys Gly Lys Ile Asp Val Cys Ile Val Leu Leu Gln His 290 295
300 gga gct gag cca acc atc cga aat aca gat gga agg aca gca ttg gat
960 Gly Ala Glu Pro Thr Ile Arg Asn Thr Asp Gly Arg Thr Ala Leu Asp
305 310 315 320 tta gca gat cca tct gcc aaa gca gtg ctt act ggt gaa
tat aag aaa 1008 Leu Ala Asp Pro Ser Ala Lys Ala Val Leu Thr Gly
Glu Tyr Lys Lys 325 330 335 gat gaa ctc tta gaa agt gcc agg agt ggc
aat gaa gaa aaa atg atg 1056 Asp Glu Leu Leu Glu Ser Ala Arg Ser
Gly Asn Glu Glu Lys Met Met 340 345 350 gct cta ctc aca cca tta aat
gtc aac tgc cac gca agt gat ggc aga 1104 Ala Leu Leu Thr Pro Leu
Asn Val Asn Cys His Ala Ser Asp Gly Arg 355 360 365 aag tca act cca
tta cat ttg gca gca gga tat aac aga gta aag att 1152 Lys Ser Thr
Pro Leu His Leu Ala Ala Gly Tyr Asn Arg Val Lys Ile 370 375 380 gta
cag ctg tta ctg caa cat gga gct gat gtc cat gct aaa gat aaa 1200
Val Gln Leu Leu Leu Gln His Gly Ala Asp Val His Ala Lys Asp Lys 385
390 395 400 ggt gat ctg gta cca tta cac aat gcc tgt tct tat ggt cat
tat gaa 1248 Gly Asp Leu Val Pro Leu His Asn Ala Cys Ser Tyr Gly
His Tyr Glu 405 410 415 gta act gaa ctt ttg gtc aag cat ggt gcc tgt
gta aat gca atg gac 1296 Val Thr Glu Leu Leu Val Lys His Gly Ala
Cys Val Asn Ala Met Asp 420 425 430 ttg tgg caa ttc act cct ctt cat
gag gca gct tct aag aac agg gtt 1344 Leu Trp Gln Phe Thr Pro Leu
His Glu Ala Ala Ser Lys Asn Arg Val 435 440 445 gaa gta tgt tct ctt
ctc tta agt tat ggt gca gac cca aca ctg ctc 1392 Glu Val Cys Ser
Leu Leu Leu Ser Tyr Gly Ala Asp Pro Thr Leu Leu 450 455 460 aat tgt
cac aat aaa agt gct ata gac ttg gct ccc aca cca cag tta 1440 Asn
Cys His Asn Lys Ser Ala Ile Asp Leu Ala Pro Thr Pro Gln Leu 465 470
475 480 aaa gaa aga tta gca tat gaa ttt aaa ggc cac tcg ttg ctg caa
gct 1488 Lys Glu Arg Leu Ala Tyr Glu Phe Lys Gly His Ser Leu Leu
Gln Ala 485 490 495 gca cga gaa gct gat gtt act cga atc aaa aaa cat
ctc tct ctg gaa 1536 Ala Arg Glu Ala Asp Val Thr Arg Ile Lys Lys
His Leu Ser Leu Glu 500 505 510 atg gtg aat ttc aag cat cct caa aca
cat gaa aca gca ttg cat tgt 1584 Met Val Asn Phe Lys His Pro Gln
Thr His Glu Thr Ala Leu His Cys 515 520 525 gct gct gca tct cca tat
ccc aaa aga aag caa ata tgt gaa ctg ttg 1632 Ala Ala Ala Ser Pro
Tyr Pro Lys Arg Lys Gln Ile Cys Glu Leu Leu 530 535 540 cta aga aaa
gga gca aac atc aat gaa aag act aaa gaa ttc ttg act 1680 Leu Arg
Lys Gly Ala Asn Ile Asn Glu Lys Thr Lys Glu Phe Leu Thr 545 550 555
560 cct ctg cac gtg gca tct gag aaa gct cat aat gat gtt gtt gaa gta
1728 Pro Leu His Val Ala Ser Glu Lys Ala His Asn Asp Val Val Glu
Val 565 570 575 gtg gtg aaa cat gaa gca aag gtt aat gct ctg gat aat
ctt ggt cag 1776 Val Val Lys His Glu Ala Lys Val Asn Ala Leu Asp
Asn Leu Gly Gln 580 585 590 act tct cta cac aga gct gca tat tgt ggt
cat cta caa acc tgc cgc 1824 Thr Ser Leu His Arg Ala Ala Tyr Cys
Gly His Leu Gln Thr Cys Arg 595 600 605 cta ctc ctg agc tat ggg tgt
gat cct aac att ata tcc ctt cag ggc 1872 Leu Leu Leu Ser Tyr Gly
Cys Asp Pro Asn Ile Ile Ser Leu Gln Gly 610 615 620 ttt act gct tta
cag atg gga aat gaa aat gta cag caa ctc ctc caa 1920 Phe Thr Ala
Leu Gln Met Gly Asn Glu Asn Val Gln Gln Leu Leu Gln 625 630 635 640
gag ggt atc tca tta ggt aat tca gag gca gac aga caa ttg ctg gaa
1968 Glu Gly Ile Ser Leu Gly Asn Ser Glu Ala Asp Arg Gln Leu Leu
Glu 645 650 655 gct gca aag gct gga gat gtc gaa act gta aaa aaa ctg
tgt act gtt 2016 Ala Ala Lys Ala Gly Asp Val Glu Thr Val Lys Lys
Leu Cys Thr Val 660 665 670 cag agt gtc aac tgc aga gac att gaa ggg
cgt cag tct aca cca ctt 2064 Gln Ser Val Asn Cys Arg Asp Ile Glu
Gly Arg Gln Ser Thr Pro Leu 675 680 685 cat ttt gca gct ggg tat aac
aga gtg tcc gtg gtg gaa tat ctg cta 2112 His Phe Ala Ala Gly Tyr
Asn Arg Val Ser Val Val Glu Tyr Leu Leu 690 695 700 cag cat gga gct
gat gtg cat gct aaa gat aaa ggn ggc ctt gta cct 2160 Gln His Gly
Ala Asp Val His Ala Lys Asp Lys Gly Gly Leu Val Pro 705 710 715 720
ttg cac aat gca tgt tnt tat gga cat tat gaa gtt gca gaa ctt ctt
2208 Leu His Asn Ala Cys Xaa Tyr Gly His Tyr Glu Val Ala Glu Leu
Leu 725 730 735 gtt aaa cat gga gca gta gtt aat gta gct gat tta tgg
aaa ttt aca 2256 Val Lys His Gly Ala Val Val Asn Val Ala Asp Leu
Trp Lys Phe Thr 740 745 750 cct tta cat gaa gca gca gca aaa gga aaa
tat gaa att tgc aaa ctt 2304 Pro Leu His Glu Ala Ala Ala Lys Gly
Lys Tyr Glu Ile Cys Lys Leu 755 760 765 ctg ctc cag cat ggt gca gac
cct aca aaa aaa aac agg gat gga aat 2352 Leu Leu Gln His Gly Ala
Asp Pro Thr Lys Lys Asn Arg Asp Gly Asn 770 775 780 act ctt ttg gat
ctt gtt aaa gat gga gan aca gat att caa gat ntg 2400 Thr Leu Leu
Asp Leu Val Lys Asp Gly Xaa Thr Asp Ile Gln Asp Xaa 785 790 795 800
ctt agg gga gat gca gtt ttg tta gat gct gcc aag aag ggt tgt tta
2448 Leu Arg Gly Asp Ala Val Leu Leu Asp Ala Ala Lys Lys Gly Cys
Leu 805 810 815 gcc aga gtg aag aag ttn tnt ttt cct gat aat gta aat
tgc cgn gat 2496 Ala Arg Val Lys Lys Xaa Xaa Phe Pro Asp Asn Val
Asn Cys Arg Asp 820 825 830 acc caa ggc aga cat tca aca cct tta cat
tta gca ggt nnn nnn nnn 2544 Thr Gln Gly Arg His Ser Thr Pro Leu
His Leu Ala Gly Xaa Xaa Xaa 835 840 845 nnn nnn nnn nnn nnn nnn nnn
nnn nnn nnn nnn nnn nnn nnn nnn nnn 2592 Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 850 855 860 nnn nnn nnn nnn
nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn 2640 Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 865 870 875 880
nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn
2688 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa 885 890 895 nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn
nnn nnn nnn 2736 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa 900 905 910 nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn
nnn nnn nnn nnn nnn nnn 2784 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa 915 920 925 nnn nnn nnn nnn nnn nnn nnn
nnn nnn nnn nnn nnn nnn nnn nnn nnn 2832 Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 930 935 940 nnn nnn nnn nnn
nnn nnn nnn nnn nnn ntg aca gca gcc atg ccc cca 2880 Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Thr Ala Ala Met Pro Pro 945 950 955 960
tct gtt ctg ccc tct tgt aac aag cct caa gtg ctc aat ggt gtg aga
2928 Ser Val Leu Pro Ser Cys Asn Lys Pro Gln Val Leu Asn Gly Val
Arg 965 970 975 agc cca gga gcc act gca gat gct ctc tct tca ggt cca
tct agc cca 2976 Ser Pro Gly Ala Thr Ala Asp Ala Leu Ser Ser Gly
Pro Ser Ser Pro 980 985 990 tca agc ctt tct gca gcc agc agt ctt gac
aac tta tct ggg agt ttt 3024 Ser Ser Leu Ser Ala Ala Ser Ser Leu
Asp Asn Leu Ser Gly Ser Phe 995 1000 1005 tca gaa ctg tct tca gta
gtt agt tca agt gga aca gag ggt gct 3069 Ser Glu Leu Ser Ser Val
Val Ser Ser Ser Gly Thr Glu Gly Ala 1010 1015 1020 tcc agt ttg gag
aaa aag gag gtt cca gga gta gat ttt agc ata 3114 Ser Ser Leu Glu
Lys Lys Glu Val Pro Gly Val Asp Phe Ser Ile 1025 1030 1035 act caa
ttc gta agg aat ctt gga ctt gag cac cta atg gat ata 3159 Thr Gln
Phe Val Arg Asn Leu Gly Leu Glu His Leu Met Asp Ile 1040 1045 1050
ttt nag aga gaa cag atc act ttg gat gta tta gtt gag atg ggg 3204
Phe Xaa Arg Glu Gln Ile Thr Leu Asp Val Leu Val Glu Met Gly 1055
1060 1065 cac aag gag ctg aag gag att ggw atc aat gct tat gga cat
agg 3249 His Lys Glu Leu Lys Glu Ile Xaa Ile Asn Ala Tyr Gly His
Arg 1070 1075 1080 cac aaa cta att aaa gga gtc gag aga ctt atc tcc
gga caa caa 3294 His Lys Leu Ile Lys Gly Val Glu Arg Leu Ile Ser
Gly Gln Gln 1085 1090 1095 ggt ctt aac cca tat tta act ttg aac acc
tct ggt agt gga aca 3339 Gly Leu Asn Pro Tyr Leu Thr Leu Asn Thr
Ser Gly Ser Gly Thr 1100 1105 1110 att ctt ata gat ctg tct cct gat
gat aaa gag ttt cag tct gtg 3384 Ile Leu Ile Asp Leu Ser Pro Asp
Asp Lys Glu Phe Gln Ser Val 1115 1120 1125 gag gaa gag atg caa agt
aca gtt cga gag cac aga gat gga ggt 3429 Glu Glu Glu Met Gln Ser
Thr Val Arg Glu His Arg Asp Gly Gly 1130 1135 1140 cat gca ggt gga
atc ttc aac aga tac aat att ctc aag att cag 3474 His Ala Gly Gly
Ile Phe Asn Arg Tyr Asn Ile Leu Lys Ile Gln 1145 1150 1155 aag gtt
tgt aac ann nnn nnn nnn nga gcc aag att cgg cac gag 3519 Lys Val
Cys Asn Xaa Xaa Xaa Xaa Xaa Ala Lys Ile Arg His Glu 1160 1165 1170
gaa aga tac act cac cgg aga aaa gaa gtt tct gaa gaa aac cac 3564
Glu Arg Tyr Thr His Arg Arg Lys Glu Val Ser Glu Glu Asn His 1175
1180 1185 aac cat gcc aat gaa cga atg cta ttt cat ggg tct cct ttt
gtg 3609 Asn His Ala Asn Glu Arg Met Leu Phe His Gly Ser Pro Phe
Val 1190 1195 1200 aat gca att atc cac aaa ggc ttt gat gaa agg cat
gcg tac ata 3654 Asn Ala Ile Ile His Lys Gly Phe Asp Glu Arg His
Ala Tyr Ile 1205 1210 1215 ggt ggt atg ttt gga gct ggc att tat ttt
gct gaa aac tct tcc 3699 Gly Gly Met Phe Gly Ala Gly Ile Tyr Phe
Ala Glu Asn Ser Ser 1220 1225 1230 aaa agc aat caa tat gta tat gga
att gga gga ggt act ggg tgt 3744 Lys Ser Asn Gln Tyr Val Tyr Gly
Ile Gly Gly Gly Thr Gly Cys 1235 1240 1245 cca gtt cac aaa gac aga
tct tgt tac att tgc cac agg cag ctg 3789 Pro Val His Lys Asp Arg
Ser Cys Tyr Ile Cys His Arg Gln Leu 1250 1255 1260 ctc ttt tgc cgg
gta acc ttg gga aag tct ttc ctg cag ttc agt 3834 Leu Phe Cys Arg
Val Thr Leu Gly Lys Ser Phe Leu Gln Phe Ser 1265 1270 1275 gca atg
aaa atg gca cat tct cct cca ggt cat cac tca gtc act 3879 Ala Met
Lys Met Ala His Ser Pro Pro Gly His His Ser Val Thr 1280 1285 1290
ggt agg ccc agt gta aat ggc cta gca tta gct gaa tat gtt att 3924
Gly Arg Pro Ser Val Asn Gly Leu Ala Leu Ala Glu Tyr Val Ile 1295
1300 1305 tac aga gga gaa cag gct tat cct gag tat tta att act tac
cag 3969 Tyr Arg Gly Glu Gln Ala Tyr Pro Glu Tyr Leu Ile Thr Tyr
Gln 1310 1315 1320 att atg agg cct gaa ggt atg gtc gat gga
taaatagtta ttttaagaaa 4019 Ile Met Arg Pro Glu Gly Met Val Asp Gly
1325 1330 ctaattccac tgaacctaaa atcatcaaag cagcagtggc ctctacgttt
tactcctttg 4079 ctgaaaaaaa atcatcttgc ccacaggcct gtggcaaaag
gataaaaatg tgaacgaagt 4139 ttaacattct gacttgataa agctttaata
atgtacagtg ttttctaaat atttcctgtt 4199 ttttcagcac tttaacagat
gccattccag gttaaactgg gttgtctgta ctaaattata 4259 aacagagtta
acttgaacct tttatatgtt atgcattgat tctaacaaac tgtaatgccc 4319
tcaacagaac taattttact aatacaatac tgtgttcttt aaaacacagc atttacactg
4379 aatacaattt catttgtaaa actgtaaata agagcttttg tactagccca
gtatttattt 4439 acattgcttt gtaatataaa tctgttttag aactgcaaaa
aaaaaaaaaa aaaa 4493 2 1333 PRT Homo sapiens misc_feature (1)..(1)
The 'Xaa' at location 1 stands for Lys, Asn, Arg, Ser, Thr, Ile,
Met, Glu, Asp, Gly, Ala, Val, Gln, His, Pro, Leu, a stop codon,
Tyr, Trp, Cys, or Phe. 2 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa 1 5 10 15 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 20 25 30 Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 35 40 45 Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 50 55 60 Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 65 70 75 80
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 85
90 95 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa 100 105 110 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa 115 120 125 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa 130 135 140 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa 145 150 155 160 Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa 165 170 175 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa 180 185 190 Xaa Xaa Xaa Xaa Pro Phe Xaa Arg
Leu Val Thr Pro Glu Lys Val Asn 195 200 205 Xaa Xaa Asn Thr Xaa Gly
Arg Lys Ser Thr Pro Leu His Phe Pro Ala 210 215 220 Gly Phe Gly Arg
Lys Asn Xaa Xaa Lys Tyr Leu Leu Gln Asn Gly Ala 225 230 235 240 Asn
Xaa Gln Xaa Leu Tyr Asn Gly Gly Leu Ile Pro Leu His Xaa Ala 245 250
255 Cys Ser Phe Gly His Ala Lys Xaa Ile Asn Leu Leu Leu Arg His Gly
260 265 270 Ala Asp Pro Asn Ala Arg Asp Asn Trp Asn Tyr Thr Pro Leu
His Glu 275 280 285 Ala Ala Ile Lys Gly Lys Ile Asp Val Cys Ile Val
Leu Leu Gln His 290 295 300 Gly Ala Glu Pro Thr Ile Arg Asn Thr Asp
Gly Arg Thr Ala Leu Asp 305 310 315 320 Leu Ala Asp Pro Ser Ala Lys
Ala Val Leu Thr Gly Glu Tyr Lys Lys 325 330 335 Asp Glu Leu Leu Glu
Ser Ala Arg Ser Gly Asn Glu Glu Lys Met Met 340 345 350 Ala Leu Leu
Thr Pro Leu Asn Val Asn Cys His Ala Ser Asp Gly Arg 355 360 365 Lys
Ser Thr Pro Leu His Leu Ala Ala Gly Tyr Asn Arg Val Lys Ile 370 375
380 Val Gln Leu Leu Leu Gln His Gly Ala Asp Val His Ala Lys Asp Lys
385 390 395 400 Gly Asp Leu Val Pro Leu His Asn Ala Cys Ser Tyr Gly
His Tyr Glu 405 410 415 Val Thr Glu Leu Leu Val Lys His Gly Ala Cys
Val Asn Ala Met Asp 420 425 430 Leu Trp Gln Phe Thr Pro Leu His Glu
Ala Ala Ser Lys Asn Arg Val 435 440 445 Glu Val Cys Ser Leu Leu Leu
Ser Tyr Gly Ala Asp Pro Thr Leu Leu 450 455 460 Asn Cys His Asn Lys
Ser Ala Ile Asp Leu Ala Pro Thr Pro Gln Leu 465 470 475 480 Lys Glu
Arg Leu Ala Tyr Glu Phe Lys Gly His Ser Leu Leu Gln Ala 485 490 495
Ala Arg Glu Ala Asp Val Thr Arg Ile Lys Lys His Leu Ser Leu Glu 500
505 510 Met Val Asn Phe Lys His Pro Gln Thr His Glu Thr Ala Leu His
Cys 515 520 525 Ala Ala Ala Ser Pro Tyr Pro Lys Arg Lys Gln Ile Cys
Glu Leu Leu 530 535 540 Leu Arg Lys Gly Ala Asn Ile Asn Glu Lys Thr
Lys Glu Phe Leu Thr 545 550 555 560 Pro Leu His Val Ala Ser Glu Lys
Ala His Asn Asp Val Val Glu Val 565 570 575 Val Val Lys His Glu Ala
Lys Val Asn Ala Leu Asp Asn Leu Gly Gln 580 585 590 Thr Ser Leu His
Arg Ala Ala Tyr Cys Gly His Leu Gln Thr Cys Arg 595 600 605 Leu Leu
Leu Ser Tyr Gly Cys Asp Pro Asn Ile Ile Ser Leu Gln Gly 610 615 620
Phe Thr Ala Leu Gln Met Gly Asn Glu Asn Val Gln Gln Leu Leu Gln 625
630 635 640 Glu Gly Ile Ser Leu Gly Asn Ser Glu Ala Asp Arg Gln Leu
Leu Glu 645 650 655 Ala Ala Lys Ala Gly Asp Val Glu Thr Val Lys Lys
Leu Cys Thr Val 660 665 670 Gln Ser Val Asn Cys Arg Asp Ile Glu Gly
Arg Gln Ser Thr Pro Leu 675 680 685 His Phe Ala Ala Gly Tyr Asn Arg
Val Ser Val Val Glu Tyr Leu Leu 690 695 700 Gln His Gly Ala Asp Val
His Ala Lys Asp Lys Gly Gly Leu Val Pro 705 710 715 720 Leu His Asn
Ala Cys Xaa Tyr Gly His Tyr Glu Val Ala Glu Leu Leu 725 730 735 Val
Lys His Gly Ala Val Val Asn Val Ala Asp Leu Trp Lys Phe Thr 740 745
750 Pro Leu His Glu Ala Ala Ala Lys Gly Lys Tyr Glu Ile Cys Lys Leu
755 760 765 Leu Leu Gln His Gly Ala Asp Pro Thr Lys Lys Asn Arg Asp
Gly Asn 770 775 780 Thr Leu Leu Asp Leu Val Lys Asp Gly Xaa Thr Asp
Ile Gln Asp Xaa 785 790 795 800 Leu Arg Gly Asp Ala Val Leu Leu Asp
Ala Ala Lys Lys Gly Cys Leu 805 810 815 Ala Arg Val Lys Lys Xaa Xaa
Phe Pro Asp Asn Val Asn Cys Arg Asp 820 825 830 Thr Gln Gly Arg His
Ser Thr Pro Leu His Leu Ala Gly Xaa Xaa Xaa 835 840 845 Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 850 855 860 Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 865 870
875 880 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa 885 890 895 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa 900 905 910 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa 915 920 925 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa 930 935 940 Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Thr Ala Ala Met Pro Pro 945 950 955 960 Ser Val Leu Pro
Ser Cys Asn Lys Pro Gln Val Leu Asn Gly Val Arg 965 970 975 Ser Pro
Gly Ala Thr Ala Asp Ala Leu Ser Ser Gly Pro Ser Ser Pro 980 985 990
Ser Ser Leu Ser Ala Ala Ser Ser Leu Asp Asn Leu Ser Gly Ser Phe 995
1000 1005 Ser Glu Leu Ser Ser Val Val Ser Ser Ser Gly Thr Glu Gly
Ala 1010 1015 1020 Ser Ser Leu Glu Lys Lys Glu Val Pro Gly Val Asp
Phe Ser Ile 1025 1030 1035 Thr Gln Phe Val Arg Asn Leu Gly Leu Glu
His Leu Met Asp Ile 1040 1045 1050 Phe Xaa Arg Glu Gln Ile Thr Leu
Asp Val Leu Val Glu Met Gly 1055 1060 1065 His Lys Glu Leu Lys Glu
Ile Xaa Ile Asn Ala Tyr Gly His Arg 1070 1075 1080 His Lys Leu Ile
Lys Gly Val Glu Arg Leu Ile Ser Gly Gln Gln 1085 1090 1095 Gly Leu
Asn Pro Tyr Leu Thr Leu Asn Thr Ser Gly Ser Gly Thr 1100 1105 1110
Ile Leu Ile Asp Leu Ser Pro Asp Asp Lys Glu Phe Gln Ser Val 1115
1120 1125 Glu Glu Glu Met Gln Ser Thr Val Arg Glu His Arg Asp Gly
Gly 1130 1135 1140 His Ala Gly Gly Ile Phe Asn Arg Tyr Asn Ile Leu
Lys Ile Gln 1145 1150 1155 Lys Val Cys Asn Xaa Xaa Xaa Xaa Xaa Ala
Lys Ile Arg His Glu 1160 1165 1170 Glu Arg Tyr Thr His Arg Arg Lys
Glu Val Ser Glu Glu Asn His 1175 1180 1185 Asn His Ala Asn Glu Arg
Met Leu Phe His Gly Ser Pro Phe Val 1190 1195 1200 Asn Ala Ile Ile
His Lys Gly Phe Asp Glu Arg His Ala Tyr Ile 1205 1210 1215 Gly Gly
Met Phe Gly Ala Gly Ile Tyr Phe Ala Glu Asn Ser Ser 1220 1225 1230
Lys Ser Asn Gln Tyr Val Tyr Gly Ile Gly Gly Gly Thr Gly Cys 1235
1240 1245 Pro Val His Lys Asp Arg Ser Cys Tyr Ile Cys His Arg Gln
Leu 1250 1255 1260 Leu Phe Cys Arg Val Thr Leu Gly Lys Ser Phe Leu
Gln Phe Ser 1265 1270 1275 Ala Met Lys Met Ala His Ser Pro Pro Gly
His His Ser Val Thr 1280 1285 1290 Gly Arg Pro Ser Val Asn Gly Leu
Ala Leu Ala Glu Tyr Val Ile 1295 1300 1305 Tyr Arg Gly Glu Gln Ala
Tyr Pro Glu Tyr Leu Ile Thr Tyr Gln 1310 1315 1320 Ile Met Arg Pro
Glu Gly Met Val Asp Gly 1325 1330 3 4297 DNA Homo sapiens CDS
(1)..(3801) 3 ncc cac gcg tcc ggg cag gag ggg cct tgc cag ctt ccg
ccg ccg cgt 48 Xaa His Ala Ser Gly Gln Glu Gly Pro Cys Gln Leu Pro
Pro Pro Arg 1 5 10 15 cgt ttc agg acc cgg acg gcg gat tcg cgc tgc
ctc cgc cgc cgc ggg 96 Arg Phe Arg Thr Arg Thr Ala Asp Ser Arg Cys
Leu Arg Arg Arg Gly 20 25 30 gca gcc ggg ggg cag gga gcc cat cga
ang ggc gcg cgt ggg cgc ggc 144 Ala Ala Gly Gly Gln Gly Ala His Arg
Xaa Gly Ala Arg Gly Arg Gly 35 40 45 cat ggg act gcg ccg gat ccg
gtg aca gca ggg agc caa gcg gcc cgg 192 His Gly Thr Ala Pro Asp Pro
Val Thr Ala Gly Ser Gln Ala Ala Arg 50 55 60 gcc ctg agc gcg tct
tct ccg ggg ggc ctc gcc ctc ctg ctc gcg ggg 240 Ala Leu Ser Ala Ser
Ser Pro Gly Gly Leu Ala Leu Leu Leu Ala Gly 65 70 75 80 ccg ggg ctc
ctg ctc cgg ttg ctg gcg ctg ttg ctg gct gtg gcg gcg 288 Pro Gly Leu
Leu Leu Arg Leu Leu Ala Leu Leu Leu Ala Val Ala Ala 85 90 95 gcc
ang atc atg tcg ggt cgc cgc tgc gcc ggc ggg gga ncg gcc tgc 336 Ala
Xaa Ile Met Ser Gly Arg Arg Cys Ala Gly Gly Gly Xaa Ala Cys 100 105
110 gcg anc gcc gcg gcc gaa gcc gtg gaa ccg gcc gcc cga aan ctg ttc
384 Ala Xaa Ala Ala Ala Glu Ala Val Glu Pro Ala Ala Arg Xaa Leu Phe
115 120 125 gaa gcg tgc cgc aac ggg gac gtg gaa cga ntc aag aag ctg
gtg acn 432 Glu Ala Cys Arg Asn Gly Asp Val Glu Arg Xaa Lys Lys Leu
Val Xaa 130 135 140 cct gar aag gtg aac agc cgc gac acn gcg ggc agg
aaa tcc acc ccg 480 Pro Glu Lys Val Asn Ser Arg Asp Xaa Ala Gly Arg
Lys Ser Thr Pro 145 150 155 160 ctg cac tty ccc gca ngt ttt ggg cgg
aaa gac tta ntt raa tat ttg 528 Leu His Phe Pro Ala Xaa Phe Gly Arg
Lys Asp Leu Xaa Xaa Tyr Leu 165 170 175 ctt can aat ggt gca aat gty
caa nca cgt gat nat ggg ggc ctt att 576 Leu Thr Asn Gly Ala Asn Xaa
Gln Xaa Arg Asp Xaa Gly Gly Leu Ile 180 185 190 cct ctt cat aat gca
tgc tct ttt ggt cmt gct raa ant atc nat ctc 624 Pro Leu His Asn Ala
Cys Ser Phe Gly Xaa Ala Xaa Xaa Ile Xaa Leu 195 200 205 ctt ttg cna
cat ngt gca nam ccc aat gct cga gat aat tgg aat tat 672 Leu Leu Xaa
His Xaa Ala Xaa Pro Asn Ala Arg Asp Asn Trp Asn Tyr 210 215 220 act
cct cnc nat gaa gct gca att aaa gga aag att gan nnt tgc att 720 Thr
Pro Xaa Xaa Glu Ala Ala Ile Lys Gly Lys Ile Xaa Xaa Cys Ile 225 230
235 240 gtg ctg tta cag cat gga gct gag cca acc atc cga aat aca gat
gga 768 Val Leu Leu Gln His Gly Ala Glu Pro Thr Ile Arg Asn Thr Asp
Gly 245 250 255 agg aca gca ttg gat tta gca gat cca tct gcc aaa gca
gtg ctt act 816 Arg Thr Ala Leu Asp Leu Ala Asp Pro Ser Ala Lys Ala
Val Leu Thr 260 265 270 ggt gaa tat aag aaa gat gaa ctc tta gaa agt
gcc agg agt ggc aat 864 Gly Glu Tyr Lys Lys Asp Glu Leu Leu Glu Ser
Ala Arg Ser Gly Asn 275 280 285 gaa gaa aaa atg atg gct cta ctc aca
cca tta aat gtc aac tgc cac 912 Glu Glu Lys Met Met Ala Leu Leu Thr
Pro Leu Asn Val Asn Cys His 290 295 300 gca agt gat ggc aga aag tca
act cca tta cat ttg gca gca gga tat 960 Ala Ser Asp Gly Arg Lys Ser
Thr Pro Leu His Leu Ala Ala Gly Tyr 305 310 315 320 aac aga gta aag
att gta cag ctg tta ctg caa cat gga gct gat gtc 1008 Asn Arg Val
Lys Ile Val Gln Leu Leu Leu Gln His Gly Ala Asp Val 325 330 335 cat
gct aaa gat aaa ggt gat ctg gta cca tta cac aat gcc tgt tct 1056
His Ala Lys Asp Lys Gly Asp Leu Val Pro Leu His Asn Ala Cys Ser 340
345 350 tat ggt cat tat gaa gta act gaa ctt ttg gtc aag cat ggt gcc
tgt 1104 Tyr Gly His Tyr Glu Val Thr Glu Leu Leu Val Lys His Gly
Ala Cys 355 360 365 gta aat gca atg gac ttg tgg caa ttc act cct ctt
cat gag gca gct 1152 Val Asn Ala Met Asp Leu Trp Gln Phe Thr Pro
Leu His Glu Ala Ala 370 375 380 tct aag aac agg gtt gaa gta tgt tct
ctt ctc tta agt tat ggt gca 1200 Ser Lys Asn Arg Val Glu Val Cys
Ser Leu Leu Leu Ser Tyr Gly Ala 385 390 395 400 gac cca aca ctg ctc
aat tgt cac aat aaa agt gct ata gac ttg gct 1248 Asp Pro Thr Leu
Leu Asn Cys His Asn Lys Ser Ala Ile Asp Leu Ala 405 410 415 ccc aca
cca cag tta aaa gaa aga tta gca tat gaa ttt aaa ggc cac 1296 Pro
Thr Pro Gln Leu Lys Glu Arg Leu Ala Tyr Glu Phe Lys Gly His 420 425
430 tcg ttg ctg caa gct gca cga gaa gct gat gtt act cga atc aaa aaa
1344 Ser Leu Leu Gln Ala Ala Arg Glu Ala Asp Val Thr Arg Ile Lys
Lys 435 440 445 cat ctc tct ctg gaa atg gtg aat ttc aag cat cct caa
aca cat gaa 1392 His Leu Ser Leu Glu Met Val Asn Phe Lys His Pro
Gln Thr His Glu 450 455 460 aca gca ttg cat tgt gct gct gca tct cca
tat ccc aaa aga aag caa 1440 Thr Ala Leu His Cys Ala Ala Ala Ser
Pro Tyr Pro Lys Arg Lys Gln 465 470 475 480 ata tgt gaa ctg ttg cta
aga aaa gga gca aac atc aat gaa aag act 1488 Ile Cys Glu Leu Leu
Leu Arg Lys Gly Ala Asn Ile Asn Glu Lys Thr 485 490 495 aaa gaa ttc
ttg act cct ctg cac gtg gca tct gag aaa gct cat aat 1536 Lys Glu
Phe Leu Thr Pro Leu His Val Ala Ser Glu Lys Ala His Asn 500 505 510
gat gtt gtt gaa gta gtg gtg aaa cat gaa gca aag gtt aat gct ctg
1584 Asp Val Val Glu Val Val Val Lys His Glu Ala Lys Val Asn Ala
Leu 515 520 525 gat aat ctt ggt cag act tct cta cac aga gct gca tat
tgt ggt cat 1632 Asp Asn Leu Gly Gln Thr Ser Leu His Arg Ala Ala
Tyr Cys Gly His 530 535 540 cta caa acc tgc cgc cta ctc ctg agc tat
ggg tgt gat cct aac att 1680 Leu Gln Thr Cys Arg Leu Leu Leu Ser
Tyr Gly Cys Asp Pro Asn Ile 545 550 555 560 ata tcc ctt cag ggc ttt
act gct tta cag atg gga aat gaa aat gta 1728 Ile Ser Leu Gln Gly
Phe Thr Ala Leu Gln Met Gly Asn Glu Asn Val 565 570 575 cag caa ctc
ctc caa gag ggt atc tca tta ggt aat tca gag gca gac 1776 Gln Gln
Leu Leu Gln Glu Gly Ile Ser Leu Gly Asn Ser Glu Ala Asp 580 585 590
aga caa ttg ctg gaa gct gca aag gct gga gat gtc gaa act gta aaa
1824 Arg Gln Leu Leu Glu Ala Ala Lys Ala Gly Asp Val Glu Thr Val
Lys 595 600 605 aaa ctg tgt act gtt cag agt gtc aac tgc aga gac att
gaa ggg cgt 1872 Lys Leu Cys Thr Val Gln Ser Val Asn Cys Arg Asp
Ile Glu Gly Arg 610 615 620 cag tct aca cca ctt cat ttt gca gct ggg
tat aac aga gtg tcc gtg 1920 Gln Ser Thr Pro Leu His Phe Ala Ala
Gly Tyr Asn Arg Val Ser Val 625 630 635 640 gtg gaa tat ctg cta cag
cat gga gct gat gtg cat gct aaa gat aaa 1968 Val Glu Tyr Leu Leu
Gln His Gly Ala Asp Val His Ala Lys Asp Lys 645 650 655 ggn ggc ctt
gta cct ttg cac aat gca tgt tnt tat gga cat tat gaa 2016 Gly Gly
Leu Val Pro Leu His Asn Ala Cys Xaa Tyr Gly His Tyr Glu 660 665 670
gtt gca gaa ctt ctt gtt aaa cat gga gca gta gtt aat gta gct gat
2064 Val Ala Glu Leu Leu Val Lys His Gly Ala Val Val Asn Val Ala
Asp 675 680 685 tta tgg aaa ttt aca cct tta cat gaa gca gca gca aaa
gga aaa tat 2112 Leu Trp Lys Phe Thr Pro Leu His Glu Ala Ala Ala
Lys Gly Lys Tyr 690 695 700 gaa att tgc aaa ctt ctg ctc cag cat ggt
gca gac cct aca aaa aaa 2160 Glu Ile Cys Lys Leu Leu Leu Gln His
Gly Ala Asp Pro Thr Lys Lys 705 710 715 720 aaa aaa aaa gga aan att
cnt ttg gat ctt gtt aaa gat gga gan aca 2208 Lys Lys Lys Gly Xaa
Ile Xaa Leu Asp Leu Val Lys Asp Gly Xaa Thr 725 730 735 gat att caa
gat ntg ctt agg gga gat gca gtt ttg tta gat gct gcc 2256 Asp Ile
Gln Asp Xaa Leu Arg Gly Asp Ala Val Leu Leu Asp Ala Ala 740 745 750
aag aag ggt tgt tta gcc aga gtg aag aag ttn tnt ttt cct gat aat
2304 Lys Lys Gly Cys Leu Ala Arg Val Lys Lys Xaa Xaa Phe Pro Asp
Asn 755 760 765 gta aat tgc cgn gat acc caa ggc aga cat tca aca cct
tta cat tta 2352 Val Asn Cys Arg Asp Thr Gln Gly Arg His Ser Thr
Pro Leu His Leu 770 775 780 gca ggt nnn nnn nnn nnn nnn nnn nnn nnn
nnn nnn nnn nnn nnn nnn 2400 Ala Gly
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 785 790 795
800 nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn
2448 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa 805 810 815 nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn
nnn nnn nnn 2496 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa 820 825 830 nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn
nnn nnn nnn nnn nnn nnn 2544 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa 835 840 845 nnn nnn nnn nnn nnn nnn nnn
nnn nnn nnn nnn nnn nnn nnn nnn nnn 2592 Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 850 855 860 nnn nnn nnn nnn
nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn 2640 Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 865 870 875 880
nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn ntg aca gca gcc atg
2688 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Thr Ala Ala
Met 885 890 895 ccc cca tct gtt ctg ccc tct tgt aac aag cct caa gtg
ctc aat ggt 2736 Pro Pro Ser Val Leu Pro Ser Cys Asn Lys Pro Gln
Val Leu Asn Gly 900 905 910 gtg aga agc cca gga gcc act gca gat gct
ctc tct tca ggt cca tct 2784 Val Arg Ser Pro Gly Ala Thr Ala Asp
Ala Leu Ser Ser Gly Pro Ser 915 920 925 agc cca tca agc ctt tct gca
gcc agc agt ctt gac aac tta tct ggg 2832 Ser Pro Ser Ser Leu Ser
Ala Ala Ser Ser Leu Asp Asn Leu Ser Gly 930 935 940 agt ttt tca gaa
ctg tct tca gta gtt agt tca agt gga aca gag ggt 2880 Ser Phe Ser
Glu Leu Ser Ser Val Val Ser Ser Ser Gly Thr Glu Gly 945 950 955 960
gct tcc agt ttg gag aaa aag gag gtt cca gga gta gat ttt agc ata
2928 Ala Ser Ser Leu Glu Lys Lys Glu Val Pro Gly Val Asp Phe Ser
Ile 965 970 975 act caa ttc gta agg aat ctt gga ctt gag cac cta atg
gat ata ttt 2976 Thr Gln Phe Val Arg Asn Leu Gly Leu Glu His Leu
Met Asp Ile Phe 980 985 990 nag aga gaa cag atc act ttg gat gta tta
gtt gag atg ggg cac aag 3024 Xaa Arg Glu Gln Ile Thr Leu Asp Val
Leu Val Glu Met Gly His Lys 995 1000 1005 gag ctg aag gag att ggw
atc aat gct tat gga cat agg cac aaa 3069 Glu Leu Lys Glu Ile Xaa
Ile Asn Ala Tyr Gly His Arg His Lys 1010 1015 1020 cta att aaa agt
ttc gag aga ctt atc tcc gga caa caa ggt ctt 3114 Leu Ile Lys Ser
Phe Glu Arg Leu Ile Ser Gly Gln Gln Gly Leu 1025 1030 1035 aac cca
tat tta act ttg aac acc tct ggt agt gga aca att ctt 3159 Asn Pro
Tyr Leu Thr Leu Asn Thr Ser Gly Ser Gly Thr Ile Leu 1040 1045 1050
ata gat ctg tct cct gat gat aaa gag ttt cag tct gtg gag gaa 3204
Ile Asp Leu Ser Pro Asp Asp Lys Glu Phe Gln Ser Val Glu Glu 1055
1060 1065 gag atg caa agt aca gtt cga gag cac aga gat gga ggt cat
gca 3249 Glu Met Gln Ser Thr Val Arg Glu His Arg Asp Gly Gly His
Ala 1070 1075 1080 ggt gga atc ttc aac aga tac aat att ctc aag att
cag aag gtt 3294 Gly Gly Ile Phe Asn Arg Tyr Asn Ile Leu Lys Ile
Gln Lys Val 1085 1090 1095 tgt aac aga gcc aag att cgg cac gag gaa
aga tac act cac cgg 3339 Cys Asn Arg Ala Lys Ile Arg His Glu Glu
Arg Tyr Thr His Arg 1100 1105 1110 aga aaa gaa gtt tct gaa gaa aac
cac aac cat gcc aat gaa cga 3384 Arg Lys Glu Val Ser Glu Glu Asn
His Asn His Ala Asn Glu Arg 1115 1120 1125 atg cta ttt cat ggg tct
cct ttt gtg aat gca att atc cac aaa 3429 Met Leu Phe His Gly Ser
Pro Phe Val Asn Ala Ile Ile His Lys 1130 1135 1140 ggc ttt gat gaa
agg cat gcg tac ata ggt ggt atg ttt gga gct 3474 Gly Phe Asp Glu
Arg His Ala Tyr Ile Gly Gly Met Phe Gly Ala 1145 1150 1155 ggc att
tat ttt gct gaa aac tct tcc aaa agc aat caa tat gta 3519 Gly Ile
Tyr Phe Ala Glu Asn Ser Ser Lys Ser Asn Gln Tyr Val 1160 1165 1170
tat gga att gga gga ggt act ggg tgt cca gtt cac aaa gac aga 3564
Tyr Gly Ile Gly Gly Gly Thr Gly Cys Pro Val His Lys Asp Arg 1175
1180 1185 tct tgt tac att tgc cac agg cag ctg ctc ttt tgc cgg gta
acc 3609 Ser Cys Tyr Ile Cys His Arg Gln Leu Leu Phe Cys Arg Val
Thr 1190 1195 1200 ttg gga aag tct ttc ctg cag ttc agt gca atg aaa
atg gca cat 3654 Leu Gly Lys Ser Phe Leu Gln Phe Ser Ala Met Lys
Met Ala His 1205 1210 1215 tct cct cca ggt cat cac tca gtc act ggt
agg ccc agt gta aat 3699 Ser Pro Pro Gly His His Ser Val Thr Gly
Arg Pro Ser Val Asn 1220 1225 1230 ggc cta gca tta gct gaa tat gtt
att tac aga gga gaa cag gct 3744 Gly Leu Ala Leu Ala Glu Tyr Val
Ile Tyr Arg Gly Glu Gln Ala 1235 1240 1245 tat cct gag tat tta att
act tac cag att atg agg cct gaa ggt 3789 Tyr Pro Glu Tyr Leu Ile
Thr Tyr Gln Ile Met Arg Pro Glu Gly 1250 1255 1260 atg gtc gat gga
taaatagtta ttttaagaaa ctaattccac tgaacctaaa 3841 Met Val Asp Gly
1265 atcatcaaag cagcagtggc ctctacgttt tactcctttg ctgaaaaaaa
atcatcttgc 3901 ccacaggcct gtggcaaaag gataaaaatg tgaacgaagt
ttaacattct gacttgataa 3961 agctttaata atgtacagtg ttttctaaat
atttcctgtt ttttcagcac tttaacagat 4021 gccattccag gttaaactgg
gttgtctgta ctaaattata aacagagtta acttgaacct 4081 tttatatgtt
atgcattgat tctaacaaac tgtaatgccc tcaacagaac taattttact 4141
aatacaatac tgtgttcttt aaaacacagc atttacactg aatacaattt catttgtaaa
4201 actgtaaata agagcttttg tactagccca gtatttattt acattgcttt
gtaatataaa 4261 tctgttttag aactgcaaaa aaaaaaaaaa aaaatc 4297 4 1267
PRT Homo sapiens misc_feature (1)..(1) The 'Xaa' at location 1
stands for Thr, Ala, Pro, or Ser. 4 Xaa His Ala Ser Gly Gln Glu Gly
Pro Cys Gln Leu Pro Pro Pro Arg 1 5 10 15 Arg Phe Arg Thr Arg Thr
Ala Asp Ser Arg Cys Leu Arg Arg Arg Gly 20 25 30 Ala Ala Gly Gly
Gln Gly Ala His Arg Xaa Gly Ala Arg Gly Arg Gly 35 40 45 His Gly
Thr Ala Pro Asp Pro Val Thr Ala Gly Ser Gln Ala Ala Arg 50 55 60
Ala Leu Ser Ala Ser Ser Pro Gly Gly Leu Ala Leu Leu Leu Ala Gly 65
70 75 80 Pro Gly Leu Leu Leu Arg Leu Leu Ala Leu Leu Leu Ala Val
Ala Ala 85 90 95 Ala Xaa Ile Met Ser Gly Arg Arg Cys Ala Gly Gly
Gly Xaa Ala Cys 100 105 110 Ala Xaa Ala Ala Ala Glu Ala Val Glu Pro
Ala Ala Arg Xaa Leu Phe 115 120 125 Glu Ala Cys Arg Asn Gly Asp Val
Glu Arg Xaa Lys Lys Leu Val Xaa 130 135 140 Pro Glu Lys Val Asn Ser
Arg Asp Xaa Ala Gly Arg Lys Ser Thr Pro 145 150 155 160 Leu His Phe
Pro Ala Xaa Phe Gly Arg Lys Asp Leu Xaa Xaa Tyr Leu 165 170 175 Leu
Thr Asn Gly Ala Asn Xaa Gln Xaa Arg Asp Xaa Gly Gly Leu Ile 180 185
190 Pro Leu His Asn Ala Cys Ser Phe Gly Xaa Ala Xaa Xaa Ile Xaa Leu
195 200 205 Leu Leu Xaa His Xaa Ala Xaa Pro Asn Ala Arg Asp Asn Trp
Asn Tyr 210 215 220 Thr Pro Xaa Xaa Glu Ala Ala Ile Lys Gly Lys Ile
Xaa Xaa Cys Ile 225 230 235 240 Val Leu Leu Gln His Gly Ala Glu Pro
Thr Ile Arg Asn Thr Asp Gly 245 250 255 Arg Thr Ala Leu Asp Leu Ala
Asp Pro Ser Ala Lys Ala Val Leu Thr 260 265 270 Gly Glu Tyr Lys Lys
Asp Glu Leu Leu Glu Ser Ala Arg Ser Gly Asn 275 280 285 Glu Glu Lys
Met Met Ala Leu Leu Thr Pro Leu Asn Val Asn Cys His 290 295 300 Ala
Ser Asp Gly Arg Lys Ser Thr Pro Leu His Leu Ala Ala Gly Tyr 305 310
315 320 Asn Arg Val Lys Ile Val Gln Leu Leu Leu Gln His Gly Ala Asp
Val 325 330 335 His Ala Lys Asp Lys Gly Asp Leu Val Pro Leu His Asn
Ala Cys Ser 340 345 350 Tyr Gly His Tyr Glu Val Thr Glu Leu Leu Val
Lys His Gly Ala Cys 355 360 365 Val Asn Ala Met Asp Leu Trp Gln Phe
Thr Pro Leu His Glu Ala Ala 370 375 380 Ser Lys Asn Arg Val Glu Val
Cys Ser Leu Leu Leu Ser Tyr Gly Ala 385 390 395 400 Asp Pro Thr Leu
Leu Asn Cys His Asn Lys Ser Ala Ile Asp Leu Ala 405 410 415 Pro Thr
Pro Gln Leu Lys Glu Arg Leu Ala Tyr Glu Phe Lys Gly His 420 425 430
Ser Leu Leu Gln Ala Ala Arg Glu Ala Asp Val Thr Arg Ile Lys Lys 435
440 445 His Leu Ser Leu Glu Met Val Asn Phe Lys His Pro Gln Thr His
Glu 450 455 460 Thr Ala Leu His Cys Ala Ala Ala Ser Pro Tyr Pro Lys
Arg Lys Gln 465 470 475 480 Ile Cys Glu Leu Leu Leu Arg Lys Gly Ala
Asn Ile Asn Glu Lys Thr 485 490 495 Lys Glu Phe Leu Thr Pro Leu His
Val Ala Ser Glu Lys Ala His Asn 500 505 510 Asp Val Val Glu Val Val
Val Lys His Glu Ala Lys Val Asn Ala Leu 515 520 525 Asp Asn Leu Gly
Gln Thr Ser Leu His Arg Ala Ala Tyr Cys Gly His 530 535 540 Leu Gln
Thr Cys Arg Leu Leu Leu Ser Tyr Gly Cys Asp Pro Asn Ile 545 550 555
560 Ile Ser Leu Gln Gly Phe Thr Ala Leu Gln Met Gly Asn Glu Asn Val
565 570 575 Gln Gln Leu Leu Gln Glu Gly Ile Ser Leu Gly Asn Ser Glu
Ala Asp 580 585 590 Arg Gln Leu Leu Glu Ala Ala Lys Ala Gly Asp Val
Glu Thr Val Lys 595 600 605 Lys Leu Cys Thr Val Gln Ser Val Asn Cys
Arg Asp Ile Glu Gly Arg 610 615 620 Gln Ser Thr Pro Leu His Phe Ala
Ala Gly Tyr Asn Arg Val Ser Val 625 630 635 640 Val Glu Tyr Leu Leu
Gln His Gly Ala Asp Val His Ala Lys Asp Lys 645 650 655 Gly Gly Leu
Val Pro Leu His Asn Ala Cys Xaa Tyr Gly His Tyr Glu 660 665 670 Val
Ala Glu Leu Leu Val Lys His Gly Ala Val Val Asn Val Ala Asp 675 680
685 Leu Trp Lys Phe Thr Pro Leu His Glu Ala Ala Ala Lys Gly Lys Tyr
690 695 700 Glu Ile Cys Lys Leu Leu Leu Gln His Gly Ala Asp Pro Thr
Lys Lys 705 710 715 720 Lys Lys Lys Gly Xaa Ile Xaa Leu Asp Leu Val
Lys Asp Gly Xaa Thr 725 730 735 Asp Ile Gln Asp Xaa Leu Arg Gly Asp
Ala Val Leu Leu Asp Ala Ala 740 745 750 Lys Lys Gly Cys Leu Ala Arg
Val Lys Lys Xaa Xaa Phe Pro Asp Asn 755 760 765 Val Asn Cys Arg Asp
Thr Gln Gly Arg His Ser Thr Pro Leu His Leu 770 775 780 Ala Gly Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 785 790 795 800
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 805
810 815 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa 820 825 830 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa 835 840 845 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa 850 855 860 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Xaa 865 870 875 880 Xaa Xaa Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Thr Ala Ala Met 885 890 895 Pro Pro Ser Val
Leu Pro Ser Cys Asn Lys Pro Gln Val Leu Asn Gly 900 905 910 Val Arg
Ser Pro Gly Ala Thr Ala Asp Ala Leu Ser Ser Gly Pro Ser 915 920 925
Ser Pro Ser Ser Leu Ser Ala Ala Ser Ser Leu Asp Asn Leu Ser Gly 930
935 940 Ser Phe Ser Glu Leu Ser Ser Val Val Ser Ser Ser Gly Thr Glu
Gly 945 950 955 960 Ala Ser Ser Leu Glu Lys Lys Glu Val Pro Gly Val
Asp Phe Ser Ile 965 970 975 Thr Gln Phe Val Arg Asn Leu Gly Leu Glu
His Leu Met Asp Ile Phe 980 985 990 Xaa Arg Glu Gln Ile Thr Leu Asp
Val Leu Val Glu Met Gly His Lys 995 1000 1005 Glu Leu Lys Glu Ile
Xaa Ile Asn Ala Tyr Gly His Arg His Lys 1010 1015 1020 Leu Ile Lys
Ser Phe Glu Arg Leu Ile Ser Gly Gln Gln Gly Leu 1025 1030 1035 Asn
Pro Tyr Leu Thr Leu Asn Thr Ser Gly Ser Gly Thr Ile Leu 1040 1045
1050 Ile Asp Leu Ser Pro Asp Asp Lys Glu Phe Gln Ser Val Glu Glu
1055 1060 1065 Glu Met Gln Ser Thr Val Arg Glu His Arg Asp Gly Gly
His Ala 1070 1075 1080 Gly Gly Ile Phe Asn Arg Tyr Asn Ile Leu Lys
Ile Gln Lys Val 1085 1090 1095 Cys Asn Arg Ala Lys Ile Arg His Glu
Glu Arg Tyr Thr His Arg 1100 1105 1110 Arg Lys Glu Val Ser Glu Glu
Asn His Asn His Ala Asn Glu Arg 1115 1120 1125 Met Leu Phe His Gly
Ser Pro Phe Val Asn Ala Ile Ile His Lys 1130 1135 1140 Gly Phe Asp
Glu Arg His Ala Tyr Ile Gly Gly Met Phe Gly Ala 1145 1150 1155 Gly
Ile Tyr Phe Ala Glu Asn Ser Ser Lys Ser Asn Gln Tyr Val 1160 1165
1170 Tyr Gly Ile Gly Gly Gly Thr Gly Cys Pro Val His Lys Asp Arg
1175 1180 1185 Ser Cys Tyr Ile Cys His Arg Gln Leu Leu Phe Cys Arg
Val Thr 1190 1195 1200 Leu Gly Lys Ser Phe Leu Gln Phe Ser Ala Met
Lys Met Ala His 1205 1210 1215 Ser Pro Pro Gly His His Ser Val Thr
Gly Arg Pro Ser Val Asn 1220 1225 1230 Gly Leu Ala Leu Ala Glu Tyr
Val Ile Tyr Arg Gly Glu Gln Ala 1235 1240 1245 Tyr Pro Glu Tyr Leu
Ile Thr Tyr Gln Ile Met Arg Pro Glu Gly 1250 1255 1260 Met Val Asp
Gly 1265 5 4275 DNA Homo sapiens CDS (284)..(3781) 5 ggcaggaggg
gccttgccag cttccgccgc cgcgtcgttt caggacccgg acggcggatt 60
cgcgctgcct ccgccgccgc ggggcagccg gggggcaggg agcccatcga ggggcgcgcg
120 tgggcgcggc catgggactg cgccggatcc ggtgacagca gggagccaag
cggcccgggc 180 cctgagcgcg tcttctccgg ggggcctcgc cctcctgctc
gcggggccgg ggctcctgct 240 ccggttgctg gcgctgttgc tggctgtggc
ggcggccagg atc atg tcg ggt cgc 295 Met Ser Gly Arg 1 cgc tgc gcc
ggc ggg gga gcg gcc tgc gcg agc gcc gcg gcc gag gcc 343 Arg Cys Ala
Gly Gly Gly Ala Ala Cys Ala Ser Ala Ala Ala Glu Ala 5 10 15 20 gtg
gag ccg gcc gcc cga gag ctg ttc gag gcg tgc cgc aac ggg gac 391 Val
Glu Pro Ala Ala Arg Glu Leu Phe Glu Ala Cys Arg Asn Gly Asp 25 30
35 gtg gaa cga gtc aag agg ctg gtg acg cct gag aag gtg aac agc cgc
439 Val Glu Arg Val Lys Arg Leu Val Thr Pro Glu Lys Val Asn Ser Arg
40 45 50 gac acg gcg ggc agg aaa tcc acc ccg ctg cac ttc gcc gca
ggt ttt 487 Asp Thr Ala Gly Arg Lys Ser Thr Pro Leu His Phe Ala Ala
Gly Phe 55 60 65 ggg cgg aaa gac gta gtt gaa tat ttg ctt cag aat
ggt gca aat gtc 535 Gly Arg Lys Asp Val Val Glu Tyr Leu Leu Gln Asn
Gly Ala Asn Val 70 75 80 caa gca cgt gat gat ggg ggc ctt att cct
ctt cat aat gca tgc tct 583 Gln Ala Arg Asp Asp Gly Gly Leu Ile Pro
Leu His Asn Ala Cys Ser 85 90 95 100 ttt ggt cat gct gaa gta gtc
aat ctc ctt ttg cga cat ggt gca gac 631 Phe Gly His Ala Glu Val Val
Asn Leu Leu Leu Arg His Gly Ala Asp 105 110 115 ccc aat gct cga gat
aat tgg aat tat act cct ctc cat gaa gct gca 679 Pro Asn Ala Arg Asp
Asn Trp Asn Tyr Thr Pro Leu His Glu Ala Ala 120 125 130 att aaa gga
aag att gat gtt tgc att gtg ctg tta cag cat gga gct 727 Ile Lys Gly
Lys Ile Asp Val Cys Ile Val Leu Leu Gln His Gly Ala 135 140 145 gag cca acc atc cga aat aca
gat gga agg aca gca ttg gat tta gca 775 Glu Pro Thr Ile Arg Asn Thr
Asp Gly Arg Thr Ala Leu Asp Leu Ala 150 155 160 gat cca tct gcc aaa
gca gtg ctt act ggt gaa tat aag aaa gat gaa 823 Asp Pro Ser Ala Lys
Ala Val Leu Thr Gly Glu Tyr Lys Lys Asp Glu 165 170 175 180 ctc tta
gaa agt gcc agg agt ggc aat gaa gaa aaa atg atg gct cta 871 Leu Leu
Glu Ser Ala Arg Ser Gly Asn Glu Glu Lys Met Met Ala Leu 185 190 195
ctc aca cca tta aat gtc aac tgc cac gca agt gat ggc aga aag tca 919
Leu Thr Pro Leu Asn Val Asn Cys His Ala Ser Asp Gly Arg Lys Ser 200
205 210 act cca tta cat ttg gca gca gga tat aac aga gta aag att gta
cag 967 Thr Pro Leu His Leu Ala Ala Gly Tyr Asn Arg Val Lys Ile Val
Gln 215 220 225 ctg tta ctg caa cat gga gct gat gtc cat gct aaa gat
aaa ggt gat 1015 Leu Leu Leu Gln His Gly Ala Asp Val His Ala Lys
Asp Lys Gly Asp 230 235 240 ctg gta cca tta cac aat gcc tgt tct tat
ggt cat tat gaa gta act 1063 Leu Val Pro Leu His Asn Ala Cys Ser
Tyr Gly His Tyr Glu Val Thr 245 250 255 260 gaa ctt ttg gtc aag cat
ggt gcc tgt gta aat gca atg gac ttg tgg 1111 Glu Leu Leu Val Lys
His Gly Ala Cys Val Asn Ala Met Asp Leu Trp 265 270 275 caa ttc act
cct ctt cat gag gca gct tct aag aac agg gtt gaa gta 1159 Gln Phe
Thr Pro Leu His Glu Ala Ala Ser Lys Asn Arg Val Glu Val 280 285 290
tgt tct ctt ctc tta agt tat ggt gca gac cca aca ctg ctc aat tgt
1207 Cys Ser Leu Leu Leu Ser Tyr Gly Ala Asp Pro Thr Leu Leu Asn
Cys 295 300 305 cac aat aaa agt gct ata gac ttg gct ccc aca cca cag
tta aaa gaa 1255 His Asn Lys Ser Ala Ile Asp Leu Ala Pro Thr Pro
Gln Leu Lys Glu 310 315 320 aga tta gca tat gaa ttt aaa ggc cac tcg
ttg ctg caa gct gca cga 1303 Arg Leu Ala Tyr Glu Phe Lys Gly His
Ser Leu Leu Gln Ala Ala Arg 325 330 335 340 gaa gct gat gtt act cga
atc aaa aaa cat ctc tct ctg gaa atg gtg 1351 Glu Ala Asp Val Thr
Arg Ile Lys Lys His Leu Ser Leu Glu Met Val 345 350 355 aat ttc aag
cat cct caa aca cat gaa aca gca ttg cat tgt gct gct 1399 Asn Phe
Lys His Pro Gln Thr His Glu Thr Ala Leu His Cys Ala Ala 360 365 370
gca tct cca tat ccc aaa aga aag caa ata tgt gaa ctg ttg cta aga
1447 Ala Ser Pro Tyr Pro Lys Arg Lys Gln Ile Cys Glu Leu Leu Leu
Arg 375 380 385 aaa gga gca aac atc aat gaa aag act aaa gaa ttc ttg
act cct ctg 1495 Lys Gly Ala Asn Ile Asn Glu Lys Thr Lys Glu Phe
Leu Thr Pro Leu 390 395 400 cac gtg gca tct gag aaa gct cat aat gat
gtt gtt gaa gta gtg gtg 1543 His Val Ala Ser Glu Lys Ala His Asn
Asp Val Val Glu Val Val Val 405 410 415 420 aaa cat gaa gca aag gtt
aat gct ctg gat aat ctt ggt cag act tct 1591 Lys His Glu Ala Lys
Val Asn Ala Leu Asp Asn Leu Gly Gln Thr Ser 425 430 435 cta cac aga
gct gca tat tgt ggt cat cta caa acc tgc cgc cta ctc 1639 Leu His
Arg Ala Ala Tyr Cys Gly His Leu Gln Thr Cys Arg Leu Leu 440 445 450
ctg agc tat ggg tgt gat cct aac att ata tcc ctt cag ggc ttt act
1687 Leu Ser Tyr Gly Cys Asp Pro Asn Ile Ile Ser Leu Gln Gly Phe
Thr 455 460 465 gct tta cag atg gga aat gaa aat gta cag caa ctc ctc
caa gag ggt 1735 Ala Leu Gln Met Gly Asn Glu Asn Val Gln Gln Leu
Leu Gln Glu Gly 470 475 480 atc tca tta ggt aat tca gag gca gac aga
caa ttg ctg gaa gct gca 1783 Ile Ser Leu Gly Asn Ser Glu Ala Asp
Arg Gln Leu Leu Glu Ala Ala 485 490 495 500 aag gct gga gat gtc gaa
act gta aaa aaa ctg tgt act gtt cag agt 1831 Lys Ala Gly Asp Val
Glu Thr Val Lys Lys Leu Cys Thr Val Gln Ser 505 510 515 gtc aac tgc
aga gac att gaa ggg cgt cag tct aca cca ctt cat ttt 1879 Val Asn
Cys Arg Asp Ile Glu Gly Arg Gln Ser Thr Pro Leu His Phe 520 525 530
gca gct ggg tat aac aga gtg tcc gtg gtg gaa tat ctg cta cag cat
1927 Ala Ala Gly Tyr Asn Arg Val Ser Val Val Glu Tyr Leu Leu Gln
His 535 540 545 gga gct gat gtg cat gct aaa gat aaa gga ggc ctt gta
cct ttg cac 1975 Gly Ala Asp Val His Ala Lys Asp Lys Gly Gly Leu
Val Pro Leu His 550 555 560 aat gca tgt tct tat gga cat tat gaa gtt
gca gaa ctt ctt gtt aaa 2023 Asn Ala Cys Ser Tyr Gly His Tyr Glu
Val Ala Glu Leu Leu Val Lys 565 570 575 580 cat gga gca gta gtt aat
gta gct gat tta tgg aaa ttt aca cct tta 2071 His Gly Ala Val Val
Asn Val Ala Asp Leu Trp Lys Phe Thr Pro Leu 585 590 595 cat gaa gca
gca gca aaa gga aaa tat gaa att tgc aaa ctt ctg ctc 2119 His Glu
Ala Ala Ala Lys Gly Lys Tyr Glu Ile Cys Lys Leu Leu Leu 600 605 610
cag cat ggt gca gac cct aca aaa aaa aac agg gat gga aat act cct
2167 Gln His Gly Ala Asp Pro Thr Lys Lys Asn Arg Asp Gly Asn Thr
Pro 615 620 625 ttg gat ctt gtt aaa gat gga gat aca gat att caa gat
ctg ctt agg 2215 Leu Asp Leu Val Lys Asp Gly Asp Thr Asp Ile Gln
Asp Leu Leu Arg 630 635 640 gga gat gca gct ttg cta gat gct gcc aag
aag ggt tgt tta gcc aga 2263 Gly Asp Ala Ala Leu Leu Asp Ala Ala
Lys Lys Gly Cys Leu Ala Arg 645 650 655 660 gtg aag aag ttg tct tct
cct gat aat gta aat tgc cgc gat acc caa 2311 Val Lys Lys Leu Ser
Ser Pro Asp Asn Val Asn Cys Arg Asp Thr Gln 665 670 675 ggc aga cat
tca aca cct tta cat tta gca gct ggt tat aat aat tta 2359 Gly Arg
His Ser Thr Pro Leu His Leu Ala Ala Gly Tyr Asn Asn Leu 680 685 690
gaa gtt gca gag tat ttg tta caa cac gga gct gat gtg aat gcc caa
2407 Glu Val Ala Glu Tyr Leu Leu Gln His Gly Ala Asp Val Asn Ala
Gln 695 700 705 gac aaa gga gga ctt att cct tta cat aat gca gca tct
tac ggg cat 2455 Asp Lys Gly Gly Leu Ile Pro Leu His Asn Ala Ala
Ser Tyr Gly His 710 715 720 gta gat gta gca gct cta cta ata aag tat
aat gca tgt gtc aat gcc 2503 Val Asp Val Ala Ala Leu Leu Ile Lys
Tyr Asn Ala Cys Val Asn Ala 725 730 735 740 acg gac aaa tgg gct ttc
aca cct ttg cac gaa gca gcc caa aag gga 2551 Thr Asp Lys Trp Ala
Phe Thr Pro Leu His Glu Ala Ala Gln Lys Gly 745 750 755 cga aca cag
ctt tgt gct ttg ttg cta gcc cat gga gct gac ccg act 2599 Arg Thr
Gln Leu Cys Ala Leu Leu Leu Ala His Gly Ala Asp Pro Thr 760 765 770
ctt aaa aat cag gaa gga caa aca cct tta gat tta gtt tca gcg gat
2647 Leu Lys Asn Gln Glu Gly Gln Thr Pro Leu Asp Leu Val Ser Ala
Asp 775 780 785 gat gtc agc gct ctt ctg aca gca gcc atg ccc cca tct
gct ctg ccc 2695 Asp Val Ser Ala Leu Leu Thr Ala Ala Met Pro Pro
Ser Ala Leu Pro 790 795 800 tct tgt tac aag cct caa gtg ctc aat ggt
gtg aga agc cca gga gcc 2743 Ser Cys Tyr Lys Pro Gln Val Leu Asn
Gly Val Arg Ser Pro Gly Ala 805 810 815 820 act gca gat gct ctc tct
tca ggt cca tct agc cca tca agc ctt tct 2791 Thr Ala Asp Ala Leu
Ser Ser Gly Pro Ser Ser Pro Ser Ser Leu Ser 825 830 835 gca gcc agc
agt ctt gac aac tta tct ggg agt ttt tca gaa ctg tct 2839 Ala Ala
Ser Ser Leu Asp Asn Leu Ser Gly Ser Phe Ser Glu Leu Ser 840 845 850
tca gta gtt agt tca agt gga aca gag ggt gct tcc agt ttg gag aaa
2887 Ser Val Val Ser Ser Ser Gly Thr Glu Gly Ala Ser Ser Leu Glu
Lys 855 860 865 aag gag gtt cca gga gta gat ttt agc ata act caa ttc
gta agg aat 2935 Lys Glu Val Pro Gly Val Asp Phe Ser Ile Thr Gln
Phe Val Arg Asn 870 875 880 ctt gga ctt gag cac cta atg gat ata ttt
gag aga gaa cag atc act 2983 Leu Gly Leu Glu His Leu Met Asp Ile
Phe Glu Arg Glu Gln Ile Thr 885 890 895 900 ttg gat gta tta gtt gag
atg ggg cac aag gag ctg aag gag att gga 3031 Leu Asp Val Leu Val
Glu Met Gly His Lys Glu Leu Lys Glu Ile Gly 905 910 915 atc aat gct
tat gga cat agg cac aaa cta att aaa gga gtc gag aga 3079 Ile Asn
Ala Tyr Gly His Arg His Lys Leu Ile Lys Gly Val Glu Arg 920 925 930
ctt atc tcc gga caa caa ggt ctt aac cca tat tta act ttg aac acc
3127 Leu Ile Ser Gly Gln Gln Gly Leu Asn Pro Tyr Leu Thr Leu Asn
Thr 935 940 945 tct ggt agt gga aca att ctt ata gat ctg tct cct gat
gat aaa gag 3175 Ser Gly Ser Gly Thr Ile Leu Ile Asp Leu Ser Pro
Asp Asp Lys Glu 950 955 960 ttt cag tct gtg gag gaa gag atg caa agt
aca gtt cga gag cac aga 3223 Phe Gln Ser Val Glu Glu Glu Met Gln
Ser Thr Val Arg Glu His Arg 965 970 975 980 gat gga ggt cat gca ggt
gga atc ttc aac aga tac aat att ctc aag 3271 Asp Gly Gly His Ala
Gly Gly Ile Phe Asn Arg Tyr Asn Ile Leu Lys 985 990 995 att cag aag
gtt tgt aac aag aaa cta tgg gaa aga tac act cac 3316 Ile Gln Lys
Val Cys Asn Lys Lys Leu Trp Glu Arg Tyr Thr His 1000 1005 1010 cgg
aga aaa gaa gtt tct gaa gaa aac cac aac cat gcc aat gaa 3361 Arg
Arg Lys Glu Val Ser Glu Glu Asn His Asn His Ala Asn Glu 1015 1020
1025 cga atg cta ttt cat ggg tct cct ttt gtg aat gca att atc cac
3406 Arg Met Leu Phe His Gly Ser Pro Phe Val Asn Ala Ile Ile His
1030 1035 1040 aaa ggc ttt gat gaa agg cat gcg tac ata ggt ggt atg
ttt gga 3451 Lys Gly Phe Asp Glu Arg His Ala Tyr Ile Gly Gly Met
Phe Gly 1045 1050 1055 gct ggc att tat ttt gct gaa aac tct tcc aaa
agc aat caa tat 3496 Ala Gly Ile Tyr Phe Ala Glu Asn Ser Ser Lys
Ser Asn Gln Tyr 1060 1065 1070 gta tat gga att gga gga ggt act ggg
tgt cca gtt cac aaa gac 3541 Val Tyr Gly Ile Gly Gly Gly Thr Gly
Cys Pro Val His Lys Asp 1075 1080 1085 aga tct tgt tac att tgc cac
agg cag ctg ctc ttt tgc cgg gta 3586 Arg Ser Cys Tyr Ile Cys His
Arg Gln Leu Leu Phe Cys Arg Val 1090 1095 1100 acc ttg gga aag tct
ttc ctg cag ttc agt gca atg aaa atg gca 3631 Thr Leu Gly Lys Ser
Phe Leu Gln Phe Ser Ala Met Lys Met Ala 1105 1110 1115 cat tct cct
cca ggt cat cac tca gtc act ggt agg ccc agt gta 3676 His Ser Pro
Pro Gly His His Ser Val Thr Gly Arg Pro Ser Val 1120 1125 1130 aat
ggc cta gca tta gct gaa tat gtt att tac aga gga gaa cag 3721 Asn
Gly Leu Ala Leu Ala Glu Tyr Val Ile Tyr Arg Gly Glu Gln 1135 1140
1145 gct tat cct gag tat tta att act tac cag att atg agg cct gaa
3766 Ala Tyr Pro Glu Tyr Leu Ile Thr Tyr Gln Ile Met Arg Pro Glu
1150 1155 1160 ggt atg gtc gat gga taaatagtta ttttaagaaa ctaattccac
tgaacctaaa 3821 Gly Met Val Asp Gly 1165 atcatcaaag cagcagtggc
ctctacgttt tactcctttg ctgaaaaaaa atcatcttgc 3881 ccacaggcct
gtggcaaaag gataaaaatg tgaacgaagt ttaacattct gacttgataa 3941
agctttaata atgtacagtg ttttctaaat atttcctgtt ttttcagcac tttaacagat
4001 gccattccag gttaaactgg gttgtctgta ctaaattata aacagagtta
acttgaacct 4061 tttatatgtt atgcattgat tctaacaaac tgtaatgccc
tcaacagaac taattttact 4121 aatacaatac tgtgttcttt aaaacacagc
atttacactg aatacaattt catttgtaaa 4181 actgtaaata agagcttttg
tactagccca gtatttattt acattgcttt gtaatataaa 4241 tctgttttag
aactgcaaaa aaaaaaaaaa aaaa 4275 6 1166 PRT Homo sapiens 6 Met Ser
Gly Arg Arg Cys Ala Gly Gly Gly Ala Ala Cys Ala Ser Ala 1 5 10 15
Ala Ala Glu Ala Val Glu Pro Ala Ala Arg Glu Leu Phe Glu Ala Cys 20
25 30 Arg Asn Gly Asp Val Glu Arg Val Lys Arg Leu Val Thr Pro Glu
Lys 35 40 45 Val Asn Ser Arg Asp Thr Ala Gly Arg Lys Ser Thr Pro
Leu His Phe 50 55 60 Ala Ala Gly Phe Gly Arg Lys Asp Val Val Glu
Tyr Leu Leu Gln Asn 65 70 75 80 Gly Ala Asn Val Gln Ala Arg Asp Asp
Gly Gly Leu Ile Pro Leu His 85 90 95 Asn Ala Cys Ser Phe Gly His
Ala Glu Val Val Asn Leu Leu Leu Arg 100 105 110 His Gly Ala Asp Pro
Asn Ala Arg Asp Asn Trp Asn Tyr Thr Pro Leu 115 120 125 His Glu Ala
Ala Ile Lys Gly Lys Ile Asp Val Cys Ile Val Leu Leu 130 135 140 Gln
His Gly Ala Glu Pro Thr Ile Arg Asn Thr Asp Gly Arg Thr Ala 145 150
155 160 Leu Asp Leu Ala Asp Pro Ser Ala Lys Ala Val Leu Thr Gly Glu
Tyr 165 170 175 Lys Lys Asp Glu Leu Leu Glu Ser Ala Arg Ser Gly Asn
Glu Glu Lys 180 185 190 Met Met Ala Leu Leu Thr Pro Leu Asn Val Asn
Cys His Ala Ser Asp 195 200 205 Gly Arg Lys Ser Thr Pro Leu His Leu
Ala Ala Gly Tyr Asn Arg Val 210 215 220 Lys Ile Val Gln Leu Leu Leu
Gln His Gly Ala Asp Val His Ala Lys 225 230 235 240 Asp Lys Gly Asp
Leu Val Pro Leu His Asn Ala Cys Ser Tyr Gly His 245 250 255 Tyr Glu
Val Thr Glu Leu Leu Val Lys His Gly Ala Cys Val Asn Ala 260 265 270
Met Asp Leu Trp Gln Phe Thr Pro Leu His Glu Ala Ala Ser Lys Asn 275
280 285 Arg Val Glu Val Cys Ser Leu Leu Leu Ser Tyr Gly Ala Asp Pro
Thr 290 295 300 Leu Leu Asn Cys His Asn Lys Ser Ala Ile Asp Leu Ala
Pro Thr Pro 305 310 315 320 Gln Leu Lys Glu Arg Leu Ala Tyr Glu Phe
Lys Gly His Ser Leu Leu 325 330 335 Gln Ala Ala Arg Glu Ala Asp Val
Thr Arg Ile Lys Lys His Leu Ser 340 345 350 Leu Glu Met Val Asn Phe
Lys His Pro Gln Thr His Glu Thr Ala Leu 355 360 365 His Cys Ala Ala
Ala Ser Pro Tyr Pro Lys Arg Lys Gln Ile Cys Glu 370 375 380 Leu Leu
Leu Arg Lys Gly Ala Asn Ile Asn Glu Lys Thr Lys Glu Phe 385 390 395
400 Leu Thr Pro Leu His Val Ala Ser Glu Lys Ala His Asn Asp Val Val
405 410 415 Glu Val Val Val Lys His Glu Ala Lys Val Asn Ala Leu Asp
Asn Leu 420 425 430 Gly Gln Thr Ser Leu His Arg Ala Ala Tyr Cys Gly
His Leu Gln Thr 435 440 445 Cys Arg Leu Leu Leu Ser Tyr Gly Cys Asp
Pro Asn Ile Ile Ser Leu 450 455 460 Gln Gly Phe Thr Ala Leu Gln Met
Gly Asn Glu Asn Val Gln Gln Leu 465 470 475 480 Leu Gln Glu Gly Ile
Ser Leu Gly Asn Ser Glu Ala Asp Arg Gln Leu 485 490 495 Leu Glu Ala
Ala Lys Ala Gly Asp Val Glu Thr Val Lys Lys Leu Cys 500 505 510 Thr
Val Gln Ser Val Asn Cys Arg Asp Ile Glu Gly Arg Gln Ser Thr 515 520
525 Pro Leu His Phe Ala Ala Gly Tyr Asn Arg Val Ser Val Val Glu Tyr
530 535 540 Leu Leu Gln His Gly Ala Asp Val His Ala Lys Asp Lys Gly
Gly Leu 545 550 555 560 Val Pro Leu His Asn Ala Cys Ser Tyr Gly His
Tyr Glu Val Ala Glu 565 570 575 Leu Leu Val Lys His Gly Ala Val Val
Asn Val Ala Asp Leu Trp Lys 580 585 590 Phe Thr Pro Leu His Glu Ala
Ala Ala Lys Gly Lys Tyr Glu Ile Cys 595 600 605 Lys Leu Leu Leu Gln
His Gly Ala Asp Pro Thr Lys Lys Asn Arg Asp 610 615 620 Gly Asn Thr
Pro Leu Asp Leu Val Lys Asp Gly Asp Thr Asp Ile Gln 625 630 635 640
Asp Leu Leu Arg Gly Asp Ala Ala Leu Leu Asp Ala Ala Lys Lys Gly 645
650 655 Cys Leu Ala Arg Val Lys Lys Leu Ser Ser Pro Asp Asn Val Asn
Cys 660 665 670 Arg Asp Thr Gln Gly Arg His Ser Thr Pro Leu His Leu
Ala Ala Gly 675 680 685 Tyr Asn Asn Leu Glu Val Ala Glu Tyr Leu Leu
Gln His Gly Ala Asp 690 695 700 Val
Asn Ala Gln Asp Lys Gly Gly Leu Ile Pro Leu His Asn Ala Ala 705 710
715 720 Ser Tyr Gly His Val Asp Val Ala Ala Leu Leu Ile Lys Tyr Asn
Ala 725 730 735 Cys Val Asn Ala Thr Asp Lys Trp Ala Phe Thr Pro Leu
His Glu Ala 740 745 750 Ala Gln Lys Gly Arg Thr Gln Leu Cys Ala Leu
Leu Leu Ala His Gly 755 760 765 Ala Asp Pro Thr Leu Lys Asn Gln Glu
Gly Gln Thr Pro Leu Asp Leu 770 775 780 Val Ser Ala Asp Asp Val Ser
Ala Leu Leu Thr Ala Ala Met Pro Pro 785 790 795 800 Ser Ala Leu Pro
Ser Cys Tyr Lys Pro Gln Val Leu Asn Gly Val Arg 805 810 815 Ser Pro
Gly Ala Thr Ala Asp Ala Leu Ser Ser Gly Pro Ser Ser Pro 820 825 830
Ser Ser Leu Ser Ala Ala Ser Ser Leu Asp Asn Leu Ser Gly Ser Phe 835
840 845 Ser Glu Leu Ser Ser Val Val Ser Ser Ser Gly Thr Glu Gly Ala
Ser 850 855 860 Ser Leu Glu Lys Lys Glu Val Pro Gly Val Asp Phe Ser
Ile Thr Gln 865 870 875 880 Phe Val Arg Asn Leu Gly Leu Glu His Leu
Met Asp Ile Phe Glu Arg 885 890 895 Glu Gln Ile Thr Leu Asp Val Leu
Val Glu Met Gly His Lys Glu Leu 900 905 910 Lys Glu Ile Gly Ile Asn
Ala Tyr Gly His Arg His Lys Leu Ile Lys 915 920 925 Gly Val Glu Arg
Leu Ile Ser Gly Gln Gln Gly Leu Asn Pro Tyr Leu 930 935 940 Thr Leu
Asn Thr Ser Gly Ser Gly Thr Ile Leu Ile Asp Leu Ser Pro 945 950 955
960 Asp Asp Lys Glu Phe Gln Ser Val Glu Glu Glu Met Gln Ser Thr Val
965 970 975 Arg Glu His Arg Asp Gly Gly His Ala Gly Gly Ile Phe Asn
Arg Tyr 980 985 990 Asn Ile Leu Lys Ile Gln Lys Val Cys Asn Lys Lys
Leu Trp Glu Arg 995 1000 1005 Tyr Thr His Arg Arg Lys Glu Val Ser
Glu Glu Asn His Asn His 1010 1015 1020 Ala Asn Glu Arg Met Leu Phe
His Gly Ser Pro Phe Val Asn Ala 1025 1030 1035 Ile Ile His Lys Gly
Phe Asp Glu Arg His Ala Tyr Ile Gly Gly 1040 1045 1050 Met Phe Gly
Ala Gly Ile Tyr Phe Ala Glu Asn Ser Ser Lys Ser 1055 1060 1065 Asn
Gln Tyr Val Tyr Gly Ile Gly Gly Gly Thr Gly Cys Pro Val 1070 1075
1080 His Lys Asp Arg Ser Cys Tyr Ile Cys His Arg Gln Leu Leu Phe
1085 1090 1095 Cys Arg Val Thr Leu Gly Lys Ser Phe Leu Gln Phe Ser
Ala Met 1100 1105 1110 Lys Met Ala His Ser Pro Pro Gly His His Ser
Val Thr Gly Arg 1115 1120 1125 Pro Ser Val Asn Gly Leu Ala Leu Ala
Glu Tyr Val Ile Tyr Arg 1130 1135 1140 Gly Glu Gln Ala Tyr Pro Glu
Tyr Leu Ile Thr Tyr Gln Ile Met 1145 1150 1155 Arg Pro Glu Gly Met
Val Asp Gly 1160 1165 7 4134 DNA Homo sapiens 7 cgaagatggc
ggcgtcgcgt cgctctcagc atcatcacca ccatcatcaa caacagctcc 60
agcccgcccc aggggcttca gcgccgccgc cgccacctcc tcccccactc agccctggcc
120 tggccccggg gaccacccca gcctctccca cggccagcgg cctggccccc
ttcgcctccc 180 cgcggcacgg cctagcgctg ccggaggggg atggcagtcg
ggatccgccc gacaggcccc 240 gatccccgga cccggttgac ggtaccagct
gttgcagtac caccagcaca atctgtaccg 300 tcgccgccgc tcccgtggtc
ccagcggttt ctacttcatc tgccgctggg gtcgctccca 360 acccagccgg
cagtggcagt aacaattcac cgtcgtcctc ttcttccccg acttcttcct 420
catcttcctc tccatcctcc cctggatcga gcttggcgga gagccccgag gcggccggag
480 ttagcagcac agcaccactg gggcctgggg cagcaggacc tgggacaggg
gtcccagcag 540 tgagcggggc cctacgggaa ctgctggagg cctgtcgcaa
tggggacgtg tcccgggtaa 600 agaggctggt ggacgcggca aacgtaaatg
caaaggacat ggccggccgg aagtcttctc 660 ccctgcactt cgctgcaggt
tttggaagga aggatgttgt agaacactta ctacagatgg 720 gtgctaatgt
ccacgctcgt gatgatggag gtctcatccc gcttcataat gcctgttctt 780
ttggccatgc tgaggttgtg agtctgttat tgtgccaagg agctgatcca aatgccaggg
840 ataactggaa ctatacacct ctgcatgaag ctgctattaa agggaagatc
gatgtgtgca 900 ttgtgctgct gcagcacgga gctgacccaa acattcggaa
cactgatggg aaatcagccc 960 tggacctggc agatccttca gcaaaagctg
tccttacagg tgaatacaag aaagacgaac 1020 tcctagaagc tgctaggagt
ggtaatgaag aaaaactaat ggctttactg actcctctaa 1080 atgtgaattg
ccatgcaagt gatgggcgaa agtcgactcc tttacatcta gcagcgggct 1140
acaacagagt tcgaatagtt cagcttcttc ttcagcatgg tgctgatgtt catgcaaaag
1200 acaaaggtgg acttgtgcct cttcataatg catgttcata tggacattat
gaagtcacag 1260 aactgctact aaagcatgga gcttgtgtta atgccatgga
tctctggcag tttactccac 1320 tgcacgaggc tgcttccaag aaccgtgtag
aagtctgctc tttgttactt agccatggcg 1380 ctgatcctac gttagtcaac
tgccatggca aaagtgctgt ggatatggct ccaactccgg 1440 agcttaggga
gagattgact tatgaattta aaggtcattc tttactacaa gcagccagag 1500
aagcagactt agctaaagtt aaaaaaacac tcgctctgga aatcattaat ttcaaacaac
1560 cgcagtctca tgaaacagca ctgcactgtg ctgtggcctc tctgcatccc
aaacgtaaac 1620 aagtgacaga attgttactt agaaaaggag caaatgttaa
tgaaaaaaat aaagatttca 1680 tgactcccct gcatgttgca gccgaaagag
cccataatga tgtcatggaa gttctgcata 1740 agcatggcgc caagatgaat
gcactggaca cccttggtca gactgctttg catagagccg 1800 ccctagcagg
ccacctgcag acctgccgcc tcctgctgag ttacggctct gacccctcca 1860
tcatctcctt acaaggcttc acagcagcac agatgggcaa tgaagcagtg cagcagattc
1920 tgagtgagag tacacctata cgtacttctg atgttgatta tcgactctta
gaggcatcta 1980 aagctggaga cttggaaact gtgaagcaac tttgcagctc
tcaaaatgtg aattgtagag 2040 acttagaggg ccggcattcc acgcccttac
acttcgcagc aggctacaac cgcgtgtctg 2100 ttgtagagta cctgctacac
cacggtgccg atgtccatgc caaagacaag ggtggcttgg 2160 tgccccttca
taatgcctgt tcatatggac actatgaggt ggctgagctt ttagtaaggc 2220
atggggcttc tgtcaatgtg gcggacttat ggaaatttac ccctctccat gaagcagcag
2280 ctaaaggaaa gtatgaaatc tgcaagctcc ttttaaaaca tggagcagat
ccaactaaaa 2340 agaacagaga tggaaataca cctttggatt tggtaaagga
aggagacaca gatattcagg 2400 acttactgaa aggggatgct gctttgttgg
atgctgccaa gaagggctgc ctggcaagag 2460 tgcagaagct ctgtacccca
gagaatatca actgcagaga cacccagggc agaaattcaa 2520 cccctctgca
cctggcagca ggctataata acctggaagt agctgaatat cttctagagc 2580
atggagctga tgttaatgcc caggacaagg gtggtttaat tcctcttcat aatgcggcat
2640 cttatgggca tgttgacata gcggctttat tgataaaata caacacgtgt
gtaaatgcaa 2700 cagataagtg ggcgtttact cccctccatg aagcagccca
gaaaggaagg acgcagctgt 2760 gcgccctcct cctagcgcat ggtgcagacc
ccaccatgaa gaaccaggaa ggccagacgc 2820 ctctggatct ggcaacagct
gacgatatca gagctttgct gatagatgcc atgcccccag 2880 aggccttacc
tacctgtttt aaacctcagg ctactgtagt gagtgcctct ctgatctcac 2940
cagcatccac cccctcctgc ctctcggctg ccagcagcat agacaacctc actggccctt
3000 tagcagagtt ggccgtagga ggagcctcca atgcagggga tggcgccgcg
ggaacagaaa 3060 ggaaggaagg agaagttgct ggtcttgaca tgaatatcag
ccaatttcta aaaagccttg 3120 gccttgaaca ccttcgggat atctttgaaa
cagaacagat tacactagat gtgttggctg 3180 atatgggtca tgaagagttg
aaagaaatag gcatcaatgc atatgggcac cgccacaaat 3240 taatcaaagg
agtagaaaga ctcttaggtg gacaacaagg caccaatcct tatttgactt 3300
ttcactgtgt taatcaggga acgattttgc tggatcttgc tccagaagat aaagaatatc
3360 agtcagtgga agaagagatg caaagtacta ttcgagaaca cagagatggt
ggtaatgctg 3420 gcggcatctt caacagatac aatgtcattc gaattcaaaa
agttgtcaac aagaagttga 3480 gggagcggtt ctgccaccga cagaaggaag
tgtctgagga gaatcacaac catcacaatg 3540 agcgcatgtt gtttcatggt
tctcctttca ttaatgccat tattcataaa gggtttgatg 3600 agcgacatgc
atacatagga ggaatgtttg gggccgggat ttattttgct gaaaactcct 3660
caaaaagcaa ccaatatgtt tatggaattg gaggaggaac aggctgccct acacacaagg
3720 acaggtcatg ctatatatgt cacagacaaa tgctcttctg tagagtgacc
cttgggaaat 3780 cctttctgca gtttagcacc atgaaaatgg cccacgcgcc
tccagggcac cactcagtca 3840 ttggtagacc gagcgtcaat gggctggcat
atgctgaata tgtcatctac agaggagaac 3900 aggcataccc agagtatctt
atcacttacc agatcatgaa gccagaagcc ccttcccaga 3960 ccgcaacagc
cgcagagcag aagacctagt gaatgcctgc tggtgaaggc cagatcagat 4020
ttcaacctgg gactggatta cagaggattg tttctaataa caacatcaat attctagaag
4080 tccctgacag cctagaaata agctgtttgt cttctataaa gcattgctat agtg
4134 8 1327 PRT Homo sapiens 8 Met Ala Ala Ser Arg Arg Ser Gln His
His His His His His Gln Gln 1 5 10 15 Gln Leu Gln Pro Ala Pro Gly
Ala Ser Ala Pro Pro Pro Pro Pro Pro 20 25 30 Pro Pro Leu Ser Pro
Gly Leu Ala Pro Gly Thr Thr Pro Ala Ser Pro 35 40 45 Thr Ala Ser
Gly Leu Ala Pro Phe Ala Ser Pro Arg His Gly Leu Ala 50 55 60 Leu
Pro Glu Gly Asp Gly Ser Arg Asp Pro Pro Asp Arg Pro Arg Ser 65 70
75 80 Pro Asp Pro Val Asp Gly Thr Ser Cys Cys Ser Thr Thr Ser Thr
Ile 85 90 95 Cys Thr Val Ala Ala Ala Pro Val Val Pro Ala Val Ser
Thr Ser Ser 100 105 110 Ala Ala Gly Val Ala Pro Asn Pro Ala Gly Ser
Gly Ser Asn Asn Ser 115 120 125 Pro Ser Ser Ser Ser Ser Pro Thr Ser
Ser Ser Ser Ser Ser Pro Ser 130 135 140 Ser Pro Gly Ser Ser Leu Ala
Glu Ser Pro Glu Ala Ala Gly Val Ser 145 150 155 160 Ser Thr Ala Pro
Leu Gly Pro Gly Ala Ala Gly Pro Gly Thr Gly Val 165 170 175 Pro Ala
Val Ser Gly Ala Leu Arg Glu Leu Leu Glu Ala Cys Arg Asn 180 185 190
Gly Asp Val Ser Arg Val Lys Arg Leu Val Asp Ala Ala Asn Val Asn 195
200 205 Ala Lys Asp Met Ala Gly Arg Lys Ser Ser Pro Leu His Phe Ala
Ala 210 215 220 Gly Phe Gly Arg Lys Asp Val Val Glu His Leu Leu Gln
Met Gly Ala 225 230 235 240 Asn Val His Ala Arg Asp Asp Gly Gly Leu
Ile Pro Leu His Asn Ala 245 250 255 Cys Ser Phe Gly His Ala Glu Val
Val Ser Leu Leu Leu Cys Gln Gly 260 265 270 Ala Asp Pro Asn Ala Arg
Asp Asn Trp Asn Tyr Thr Pro Leu His Glu 275 280 285 Ala Ala Ile Lys
Gly Lys Ile Asp Val Cys Ile Val Leu Leu Gln His 290 295 300 Gly Ala
Asp Pro Asn Ile Arg Asn Thr Asp Gly Lys Ser Ala Leu Asp 305 310 315
320 Leu Ala Asp Pro Ser Ala Lys Ala Val Leu Thr Gly Glu Tyr Lys Lys
325 330 335 Asp Glu Leu Leu Glu Ala Ala Arg Ser Gly Asn Glu Glu Lys
Leu Met 340 345 350 Ala Leu Leu Thr Pro Leu Asn Val Asn Cys His Ala
Ser Asp Gly Arg 355 360 365 Lys Ser Thr Pro Leu His Leu Ala Ala Gly
Tyr Asn Arg Val Arg Ile 370 375 380 Val Gln Leu Leu Leu Gln His Gly
Ala Asp Val His Ala Lys Asp Lys 385 390 395 400 Gly Gly Leu Val Pro
Leu His Asn Ala Cys Ser Tyr Gly His Tyr Glu 405 410 415 Val Thr Glu
Leu Leu Leu Lys His Gly Ala Cys Val Asn Ala Met Asp 420 425 430 Leu
Trp Gln Phe Thr Pro Leu His Glu Ala Ala Ser Lys Asn Arg Val 435 440
445 Glu Val Cys Ser Leu Leu Leu Ser His Gly Ala Asp Pro Thr Leu Val
450 455 460 Asn Cys His Gly Lys Ser Ala Val Asp Met Ala Pro Thr Pro
Glu Leu 465 470 475 480 Arg Glu Arg Leu Thr Tyr Glu Phe Lys Gly His
Ser Leu Leu Gln Ala 485 490 495 Ala Arg Glu Ala Asp Leu Ala Lys Val
Lys Lys Thr Leu Ala Leu Glu 500 505 510 Ile Ile Asn Phe Lys Gln Pro
Gln Ser His Glu Thr Ala Leu His Cys 515 520 525 Ala Val Ala Ser Leu
His Pro Lys Arg Lys Gln Val Thr Glu Leu Leu 530 535 540 Leu Arg Lys
Gly Ala Asn Val Asn Glu Lys Asn Lys Asp Phe Met Thr 545 550 555 560
Pro Leu His Val Ala Ala Glu Arg Ala His Asn Asp Val Met Glu Val 565
570 575 Leu His Lys His Gly Ala Lys Met Asn Ala Leu Asp Thr Leu Gly
Gln 580 585 590 Thr Ala Leu His Arg Ala Ala Leu Ala Gly His Leu Gln
Thr Cys Arg 595 600 605 Leu Leu Leu Ser Tyr Gly Ser Asp Pro Ser Ile
Ile Ser Leu Gln Gly 610 615 620 Phe Thr Ala Ala Gln Met Gly Asn Glu
Ala Val Gln Gln Ile Leu Ser 625 630 635 640 Glu Ser Thr Pro Ile Arg
Thr Ser Asp Val Asp Tyr Arg Leu Leu Glu 645 650 655 Ala Ser Lys Ala
Gly Asp Leu Glu Thr Val Lys Gln Leu Cys Ser Ser 660 665 670 Gln Asn
Val Asn Cys Arg Asp Leu Glu Gly Arg His Ser Thr Pro Leu 675 680 685
His Phe Ala Ala Gly Tyr Asn Arg Val Ser Val Val Glu Tyr Leu Leu 690
695 700 His His Gly Ala Asp Val His Ala Lys Asp Lys Gly Gly Leu Val
Pro 705 710 715 720 Leu His Asn Ala Cys Ser Tyr Gly His Tyr Glu Val
Ala Glu Leu Leu 725 730 735 Val Arg His Gly Ala Ser Val Asn Val Ala
Asp Leu Trp Lys Phe Thr 740 745 750 Pro Leu His Glu Ala Ala Ala Lys
Gly Lys Tyr Glu Ile Cys Lys Leu 755 760 765 Leu Leu Lys His Gly Ala
Asp Pro Thr Lys Lys Asn Arg Asp Gly Asn 770 775 780 Thr Pro Leu Asp
Leu Val Lys Glu Gly Asp Thr Asp Ile Gln Asp Leu 785 790 795 800 Leu
Lys Gly Asp Ala Ala Leu Leu Asp Ala Ala Lys Lys Gly Cys Leu 805 810
815 Ala Arg Val Gln Lys Leu Cys Thr Pro Glu Asn Ile Asn Cys Arg Asp
820 825 830 Thr Gln Gly Arg Asn Ser Thr Pro Leu His Leu Ala Ala Gly
Tyr Asn 835 840 845 Asn Leu Glu Val Ala Glu Tyr Leu Leu Glu His Gly
Ala Asp Val Asn 850 855 860 Ala Gln Asp Lys Gly Gly Leu Ile Pro Leu
His Asn Ala Ala Ser Tyr 865 870 875 880 Gly His Val Asp Ile Ala Ala
Leu Leu Ile Lys Tyr Asn Thr Cys Val 885 890 895 Asn Ala Thr Asp Lys
Trp Ala Phe Thr Pro Leu His Glu Ala Ala Gln 900 905 910 Lys Gly Arg
Thr Gln Leu Cys Ala Leu Leu Leu Ala His Gly Ala Asp 915 920 925 Pro
Thr Met Lys Asn Gln Glu Gly Gln Thr Pro Leu Asp Leu Ala Thr 930 935
940 Ala Asp Asp Ile Arg Ala Leu Leu Ile Asp Ala Met Pro Pro Glu Ala
945 950 955 960 Leu Pro Thr Cys Phe Lys Pro Gln Ala Thr Val Val Ser
Ala Ser Leu 965 970 975 Ile Ser Pro Ala Ser Thr Pro Ser Cys Leu Ser
Ala Ala Ser Ser Ile 980 985 990 Asp Asn Leu Thr Gly Pro Leu Ala Glu
Leu Ala Val Gly Gly Ala Ser 995 1000 1005 Asn Ala Gly Asp Gly Ala
Ala Gly Thr Glu Arg Lys Glu Gly Glu 1010 1015 1020 Val Ala Gly Leu
Asp Met Asn Ile Ser Gln Phe Leu Lys Ser Leu 1025 1030 1035 Gly Leu
Glu His Leu Arg Asp Ile Phe Glu Thr Glu Gln Ile Thr 1040 1045 1050
Leu Asp Val Leu Ala Asp Met Gly His Glu Glu Leu Lys Glu Ile 1055
1060 1065 Gly Ile Asn Ala Tyr Gly His Arg His Lys Leu Ile Lys Gly
Val 1070 1075 1080 Glu Arg Leu Leu Gly Gly Gln Gln Gly Thr Asn Pro
Tyr Leu Thr 1085 1090 1095 Phe His Cys Val Asn Gln Gly Thr Ile Leu
Leu Asp Leu Ala Pro 1100 1105 1110 Glu Asp Lys Glu Tyr Gln Ser Val
Glu Glu Glu Met Gln Ser Thr 1115 1120 1125 Ile Arg Glu His Arg Asp
Gly Gly Asn Ala Gly Gly Ile Phe Asn 1130 1135 1140 Arg Tyr Asn Val
Ile Arg Ile Gln Lys Val Val Asn Lys Lys Leu 1145 1150 1155 Arg Glu
Arg Phe Cys His Arg Gln Lys Glu Val Ser Glu Glu Asn 1160 1165 1170
His Asn His His Asn Glu Arg Met Leu Phe His Gly Ser Pro Phe 1175
1180 1185 Ile Asn Ala Ile Ile His Lys Gly Phe Asp Glu Arg His Ala
Tyr 1190 1195 1200 Ile Gly Gly Met Phe Gly Ala Gly Ile Tyr Phe Ala
Glu Asn Ser 1205 1210 1215 Ser Lys Ser Asn Gln Tyr Val Tyr Gly Ile
Gly Gly Gly Thr Gly 1220 1225 1230 Cys Pro Thr His Lys Asp Arg Ser
Cys Tyr Ile Cys His Arg Gln 1235 1240 1245 Met Leu Phe Cys Arg Val
Thr Leu Gly Lys Ser Phe Leu Gln Phe 1250 1255 1260 Ser Thr Met Lys
Met Ala His Ala Pro Pro Gly His His Ser Val 1265 1270 1275 Ile Gly
Arg Pro Ser Val Asn Gly Leu Ala Tyr Ala Glu Tyr Val 1280 1285 1290 Ile Tyr Arg Gly
Glu Gln Ala Tyr Pro Glu Tyr Leu Ile Thr Tyr 1295 1300 1305 Gln Ile
Met Lys Pro Glu Ala Pro Ser Gln Thr Ala Thr Ala Ala 1310 1315 1320
Glu Gln Lys Thr 1325 9 384 DNA Homo sapiens 9 tcgacagaca attgctggaa
gctgcaaagg ctggagatgt cgaaactgta aaaaaactgt 60 gtactgttca
gagtgtcaac tgcagagaca ttgaagggcg tcagtctaca ccacttcatt 120
ttgcagctgg gtataacaga gtgtccgtgg tggaatatct gctacagcat ggagctgatg
180 tgcatgctaa agataaagga ggccttgtac ctttgcacaa tgcatgttct
tatggacatt 240 atgaagttgc agaacttctt gttaaacatg gagcagtagt
taatgtagct gatttatgga 300 aatttacacc tttacatgaa gcagcagcaa
aaggaaaata tgaaatttgc aaacttctgc 360 tccagcatgg tgcagaccct acaa 384
10 520 DNA Homo sapiens 10 tttttttaac tgtggtgtgg gagccaagtc
tatagcactt ttattgtgac aattgagcag 60 tgttgggtct gcaccataac
ttaagagaag agaacatact tcaaccctgt tcttagaagc 120 tgcctcatga
agaggagtga attgccacaa gtccattgca tttacacagg caccatgctt 180
gaccaaaagt tcagttactt cataatgacc ataagaacag gcattgtgta atggtaccag
240 atcaccttta tctttagcat ggacatcagc tccatgttgc agtaacagct
gtacaatctt 300 tactctgtta tatcctgctg ccaaatgtaa tggagttgac
tttctgccat cacttgcgtg 360 gcagttgaca tttaatggtg tgagtagagc
catcattttt tcttcattgc cactcctggc 420 actttctaag agttcatctt
tcttatattc accagtaagc actgctttgg cagatggatc 480 tgctaaatcc
aatgctgtcc ttccatctgt atttcggatg 520
11 22 DNA Artificial Sequence Primer 11 tccagaggct ggtgacccct ga 22
12 19 DNA Artificial Sequence Primer 12 ttgaactaac tactgaaga 19
13 21 DNA Artificial Sequence Primer 13 ctgtcttcag tagttagttc a 21
14 20 DNA Artificial Sequence Primer 14 gttacaaacc ttctgaatct 20
15 19 DNA Artificial Sequence Primer 15 gaaagataca ctcaccgga 19
16 20 DNA Artificial Sequence Primer 16 tagggttcag tgggaattag 20
17 18 DNA Artificial Sequence Primer 17 gactcctgga gcccgtca 18
18 18 DNA Artificial Sequence Primer 18 ggtagcgacc gggcgtca 18
19 33 PRT Artificial Sequence
Consensus 19 Xaa Gly Xaa Thr Pro Leu His Leu Ala Ala Arg Xaa Gly
His Val Glu 1 5 10 15 Val Val Lys Leu Leu Leu Asp Xaa Gly Ala Asp
Val Asn Ala Xaa Thr 20 25 30 Lys 20 160 PRT Homo sapiens 20 Glu Arg
Tyr Thr His Arg Arg Lys Glu Val Ser Glu Glu Asn His Asn 1 5 10 15
His Ala Asn Glu Arg Met Leu Phe His Gly Ser Pro Phe Val Asn Ala 20
25 30 Ile Ile His Lys Gly Phe Asp Glu Arg His Ala Tyr Ile Gly Gly
Met 35 40 45 Phe Gly Ala Gly Ile Tyr Phe Ala Glu Asn Ser Ser Lys
Ser Asn Gln 50 55 60 Tyr Val Tyr Gly Ile Gly Gly Gly Thr Gly Cys
Pro Val His Lys Asp 65 70 75 80 Arg Ser Cys Tyr Ile Cys His Arg Gln
Leu Leu Phe Cys Arg Val Thr 85 90 95 Leu Gly Lys Ser Phe Leu Gln
Phe Ser Ala Met Lys Met Ala His Ser 100 105 110 Pro Pro Gly His His
Ser Val Thr Gly Arg Pro Ser Val Asn Gly Leu 115 120 125 Ala Leu Ala
Glu Tyr Val Ile Tyr Arg Gly Glu Gln Ala Tyr Pro Glu 130 135 140 Tyr
Leu Ile Thr Tyr Gln Ile Met Arg Pro Glu Gly Met Val Asp Gly 145 150
155 160 21 22 PRT Homo sapiens 21 Met Ser Gly Arg Arg Cys Ala Gly
Gly Gly Ala Ala Cys Ala Ser Ala 1 5 10 15 Ala Ala Glu Ala Val Glu
20 22 90 PRT Homo sapiens 22 Thr Ala Ala Met Pro Pro Ser Ala Leu
Pro Ser Cys Tyr Lys Pro Gln 1 5 10 15 Val Leu Asn Gly Val Arg Ser
Pro Gly Ala Thr Ala Asp Ala Leu Ser 20 25 30 Ser Gly Pro Ser Ser
Pro Ser Ser Leu Ser Ala Ala Ser Ser Leu Asp 35 40 45 Asn Leu Ser
Gly Ser Phe Ser Glu Leu Ser Ser Val Val Ser Ser Ser 50 55 60 Gly
Thr Glu Gly Ala Ser Ser Leu Glu Lys Lys Glu Val Pro Gly Val 65 70
75 80 Asp Phe Ser Ile Thr Gln Phe Val Arg Asn 85 90 23 8 PRT Homo
sapiens 23 Arg Pro Glu Gly Met Val Asp Gly 1 5
24 20 DNA Artificial Sequence Primer 24 gttacatttg ccacaggcag 20
25 20 DNA Artificial Sequence Primer 25 gtctttcttg cagttcagtg 20
26 20 DNA Artificial Sequence Primer 26 gagtcgagag acttatctcc 20
27 19 DNA Artificial Sequence Primer 27 gagcacagag atggaggtc 19
28 22 DNA Artificial Sequence Primer 28 atgtacagca actcctccaa ga 22
29 20 DNA Artificial Sequence Primer 29 cagacaattg ctggaagctg 20
30 21 DNA Artificial Sequence Primer 30 cagacaattg ctggaagctg c 21
31 20 DNA Artificial Sequence Primer 31 ctactcctga gctatgggtg 20
32 22 DNA Artificial Sequence Primer 32 gtgtactgtt cagagtgtca ac 22
33 21 DNA Artificial Sequence Primer 33 ccatgctgga gcagaagttt g 21
34 21 DNA Artificial Sequence Primer 34 gctaaaatct ctcctggaac c 21
35 22 DNA Artificial Sequence Primer 35 gtttgtgcct atgtccataa gc 22
36 20 DNA Artificial Sequence Primer 36 caaaagagca gctgcctgtg 20
37 22 DNA Artificial Sequence Primer 37 ctgcaggaaa gactttccca ag 22
38 20 DNA Artificial Sequence Primer 38 gcagccagtg gccctctacg 20
39 19 DNA Artificial Sequence Primer 39 gccccacagg cctgtggcc 19
40 20 DNA Artificial Sequence Primer 40 gaaactaatt cccactaacc 20
41 20 DNA Artificial Sequence Primer 41 aataaatact gggctagtac 20
42 22 DNA Artificial Sequence Primer 42 agggtctgca ccatgctgga gc 22
43 22 DNA Artificial Sequence Primer 43 ataaatcagc tacattaact ac 22
44 19 DNA Artificial Sequence Primer 44 cccagctgca aaatgaagt 19
45 20 DNA Artificial Sequence Primer 45 aatgactctg cagttgacac 20
46 21 DNA Artificial Sequence Primer 46 gatacactca ccggagaaaa g 21
47 21 DNA Artificial Sequence Primer 47 gtgaactgga cacccagtac c 21
48 22 DNA Artificial Sequence Primer 48 ggtatggtcg atggataaat ag 22
49 19 DNA Artificial Sequence Primer 49 gaacacagta ttgtattag 19
50 20 DNA Artificial Sequence Primer 50 cggcgggcag gaaatccacc 20
51 20 DNA Artificial Sequence Primer 51 ttggggtctg caccatgtcg 20
52 22 DNA Artificial Sequence Primer 52 tccagaggct ggtgacccct ga 22
53 22 DNA Artificial Sequence Primer 53 tctgctaaat ccaatgctgt cc 22
54 20 DNA Artificial Sequence Primer 54 tgcagcgggg tggatttcct 20
55 20 DNA Artificial Sequence Primer 55 cattttgaag caaatattta 20
56 20 DNA Artificial Sequence Primer 56 ggaataaggc ccccattata 20
57 20 DNA Artificial Sequence Primer 57 cattttgaag caaatattta 20
58 20 DNA Artificial Sequence Primer 58 ggaataaggc ccccattata 20
59 34 DNA Artificial Sequence Primer 59 atcgatgcca gccatggagg ttccaggagt agat 34
60 26 DNA Artificial Sequence Primer 60 gctctagatc aggcctcata atctgg 26
61 9 PRT Homo sapiens 61 Met Ala Ala Ser Arg Arg Ser Gln Cys 1 5
62 9 PRT Homo sapiens 62 Met Ser Gly Arg Arg Cys Ala Gly Lys 1 5
63 15 PRT Homo sapiens 63 Gln Glu Gly Ile Ser Leu Gly Asn Ser Glu Ala Asp Arg Gln Cys 1 5 10 15
64 11 PRT Homo sapiens 64 Gly Glu Tyr Lys Lys Asp Glu Leu Leu Glu Cys 1 5 10