U.S. patent application number 13/212965 was published on 2012-03-29 for method for assaying protein-protein interaction.
This patent application is currently assigned to Life Technologies Corporation. Invention is credited to Richard Axel, Gilad Barnea, Kevin J. LEE, Walter Strapps.
Application Number | 13/212965 |
Publication Number | 20120077706 |
Family ID | 34084516 |
Filed Date | 2011-08-18 |
United States Patent Application | 20120077706 |
Kind Code | A1 |
LEE; Kevin J.; et al. | March 29, 2012 |
METHOD FOR ASSAYING PROTEIN-PROTEIN INTERACTION
Abstract
The invention relates to a method for determining if a test
compound, or a mix of compounds, modulates the interaction between
two proteins of interest. The determination is made possible via
the use of two recombinant molecules, one of which contains the
first protein, a cleavage site for a proteolytic molecule, and an
activator of a gene. The second recombinant molecule includes the
second protein and the proteolytic molecule. If the test compound
binds to the first protein, a reaction is initiated whereby the
activator is cleaved, and activates a reporter gene.
Inventors: | LEE; Kevin J. (New York, NY); Axel; Richard (New York, NY); Strapps; Walter (San Francisco, CA); Barnea; Gilad (Providence, RI) |
Assignee: | Life Technologies Corporation, Carlsbad, CA |
Family ID: | 34084516 |
Appl. No.: | 13/212965 |
Filed: | August 18, 2011 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
11346759 | Feb 2, 2006 | 8017398 |
13212965 | | |
10888313 | Jul 9, 2004 | 7049076 |
11346759 | | |
60566113 | Apr 27, 2004 | |
60511918 | Oct 15, 2003 | |
60485968 | Jul 9, 2003 | |
Current U.S. Class: | 506/11; 435/18; 435/188; 435/252.3; 435/252.31; 435/252.33; 435/252.34; 435/29; 435/320.1; 435/353; 435/357; 435/358; 435/365; 435/367; 435/369; 530/350; 536/23.2; 536/23.4 |
Current CPC Class: | G01N 33/9406 20130101; G01N 2500/10 20130101; C12N 15/1055 20130101; G01N 2333/726 20130101; G01N 33/6845 20130101 |
Class at Publication: | 506/11; 435/29; 435/252.33; 536/23.4; 435/320.1; 530/350; 536/23.2; 435/188; 435/252.31; 435/252.3; 435/252.34; 435/367; 435/357; 435/365; 435/358; 435/353; 435/369; 435/18 |
International Class: | C12Q 1/02 20060101 C12Q001/02; C12N 15/62 20060101 C12N015/62; C12N 15/63 20060101 C12N015/63; C40B 30/08 20060101 C40B030/08; C12N 9/96 20060101 C12N009/96; C12N 5/10 20060101 C12N005/10; C12Q 1/34 20060101 C12Q001/34; C12N 1/21 20060101 C12N001/21; C07K 14/00 20060101 C07K014/00 |
Claims
1. A method for determining if a test compound modulates a specific
protein/protein interaction of interest comprising contacting said
compound to a cell which has been transformed or transfected with
(a) a nucleic acid molecule which comprises: (i) a nucleotide
sequence which encodes said first test protein, (ii) a nucleotide
sequence encoding a cleavage site for a protease or a portion of a
protease, and (iii) a nucleotide sequence which encodes a protein
which activates a reporter gene in said cell, and (b) a nucleic
acid molecule which comprises: (i) a nucleotide sequence which
encodes a second test protein whose interaction with said first
test protein in the presence of said test compound is to be
measured, and (ii) a nucleotide sequence which encodes a protease
or a portion of a protease which is specific for said cleavage
site, and determining activity of said reporter gene as a
determination of whether said compound modulates said
protein/protein interaction.
2-27. (canceled)
28. Recombinant cell, transformed or transfected with: (a) a
nucleic acid molecule which comprises: (i) a nucleotide sequence
which encodes said first test protein, (ii) a nucleotide sequence
encoding a cleavage site for a protease or a portion of a protease,
and (iii) a nucleotide sequence which encodes a protein which
activates a reporter gene in said cell, and (b) a nucleic acid
molecule which comprises: (i) a nucleotide sequence which encodes a
second test protein whose interaction with said first test protein
in the presence of said test compound is to be measured, and (ii) a
nucleotide sequence which encodes a protease or a portion of a
protease which is specific for said cleavage site.
29-46. (canceled)
47. An isolated nucleic acid molecule which comprises, in 5' to 3'
order, (i) a nucleotide sequence which encodes a test protein, (ii)
a nucleotide sequence encoding a cleavage site for a protease or a
portion of a protease, and (iii) a nucleotide sequence which
encodes a protein which activates a reporter gene in said cell.
48. The isolated nucleic acid molecule of claim 47, wherein said
test protein is a membrane bound protein.
49. The isolated nucleic acid molecule of claim 48, wherein said
membrane bound protein is a transmembrane receptor.
50. The isolated nucleic acid molecule of claim 49, wherein said
transmembrane receptor is a GPCR.
51. The isolated nucleic acid molecule of claim 47, wherein said
protease or portion of a protease is tobacco etch virus nuclear
inclusion A protease.
52. The isolated nucleic acid molecule of claim 47, wherein said
protein which activates said reporter gene is a transcription
factor.
53. The isolated nucleic acid molecule of claim 52, wherein said
transcription factor is tTA or GAL4.
54. The isolated nucleic acid molecule of claim 48, wherein said
membrane bound protein is ADRB2, AVPR2, HTR1A, CHRM2, CCR5, DRD2,
or OPRK.
55. Expression vector comprising the isolated nucleic acid molecule
of claim 47, operably linked to a promoter.
56. An isolated nucleic acid molecule which comprises: (i)
nucleotide sequence which encodes a test protein whose interaction
with another test protein in the presence of a test compound is to
be measured, and (ii) a nucleotide sequence which encodes a
protease or a portion of a protease which is specific for said
cleavage site.
57. The isolated nucleic acid molecule of claim 56, wherein said
test protein is an inhibitory protein.
58. The isolated nucleic acid molecule of claim 57, wherein said
inhibitory protein is an arrestin.
59. Expression vector comprising the isolated nucleic acid molecule
of claim 56, operably linked to a promoter.
60. A fusion protein produced by expression of the isolated nucleic
acid molecule of claim 47.
61. A fusion protein produced by expression of the isolated nucleic
acid molecule of claim 56.
62. A test kit useful for determining if a test compound modulates
a specific protein/protein interaction of interest comprising the
recombinant cell according to claim 28.
63-76. (canceled)
Description
RELATED APPLICATIONS
[0001] This is a continuation-in-part of Application No. 60/566,113
filed Apr. 27, 2004, which is a continuation-in-part of Application
No. 60/511,918, filed Oct. 15, 2003, which is a
continuation-in-part of Application No. 60/485,968 filed Jul. 9,
2003, all of which are incorporated by reference in their
entirety.
FIELD OF THE INVENTION
[0002] This invention relates to methods for determining
interaction between molecules of interest. More particularly, it
relates to determining if a particular substance referred to as the
test compound modulates the interaction of two or more specific
proteins of interest, via determining activation of a reporter gene
in a cell, where the activation, or lack thereof, results from the
modulation or its absence. The determination occurs using
transformed or transfected cells, which are also a feature of the
invention, as are the agents used to transform or transfect
them.
BACKGROUND AND RELATED ART
[0003] The study of protein/protein interaction, as exemplified,
e.g., by the identification of ligands for receptors, is an area of
great interest. Even when a ligand or ligands for a given receptor
are known, there is interest in identifying more effective or more
selective ligands. GPCRs will be discussed herein as a
non-exclusive example of a class of proteins which can be studied
in this way.
[0004] The G-protein coupled receptors, or "GPCRs" hereafter, are
the largest class of cell surface receptors known for humans. Among
the ligands recognized by GPCRs are hormones, neurotransmitters,
peptides, glycoproteins, lipids, nucleotides, and ions. They also
act as receptors for light, odors, pheromones, and taste. Given
these various roles, it is perhaps not surprising that they are the
subject of intense research, seeking to identify drugs useful in
various conditions. The success rate has been phenomenal. Indeed,
Howard, et al., Trends Pharmacol. Sci., 22:132-140 (2001) estimate
that over 50% of marketed drugs act on such receptors. "GPCRs" as
used herein, refers to any member of the GPCR superfamily of
receptors characterized by a seven-transmembrane domain (7TM)
structure. Examples of these receptors include, but are not limited
to, the class A or "rhodopsin-like" receptors; the class B or
"secretin-like" receptors; the class C or "metabotropic
glutamate-like" receptors; the Frizzled and Smoothened-related
receptors; the adhesion receptor family or EGF-7TM/LNB-7TM
receptors; adiponectin receptors and related receptors; and
chemosensory receptors including odorant, taste, vomeronasal and
pheromone receptors. As examples, the GPCR superfamily in humans
includes but is not limited to those receptor molecules described
by Vassilatis, et al., Proc. Natl. Acad. Sci. USA, 100:4903-4908
(2003); Takeda, et al., FEBS Letters, 520:97-101 (2002);
Fredriksson, et al., Mol. Pharmacol., 63:1256-1272 (2003);
Glusman, et al., Genome Res., 11:685-702 (2001); and Zozulya, et
al., Genome Biol., 2:0018.1-0018.12 (2001), all of which are
incorporated by reference.
[0005] The mechanisms of action by which GPCRs function have been
explicated to some degree. In brief, when a GPCR binds a ligand, a
conformational change results, stimulating a cascade of reactions
leading to a change in cell physiology. It is thought that GPCRs
transduce signals by modulating the activity of intracellular,
heterotrimeric guanine nucleotide binding proteins, or "G
proteins". The complex of ligand and receptor stimulates guanine
nucleotide exchange and dissociation of the G protein heterotrimer
into α and βγ subunits.
[0006] Both the GTP-bound α subunit and the βγ
dimer can act to regulate various cellular effector proteins,
including adenylyl cyclase and phospholipase C (PLC). In
conventional cell based assays for GPCRs, receptor activity is
monitored by measuring the output of a G-protein regulated effector
pathway, such as the accumulation of cAMP that is produced by
adenylyl cyclase, or the release of intracellular calcium, which is
stimulated by PLC activity.
[0007] Conventional G-protein based, signal transduction assays
have been difficult to develop for some targets, as a result of two
major issues.
[0008] First, different GPCRs are coupled to different G protein
regulated signal transduction pathways, and G-protein based assays
are dependent on knowing the G-protein specificity of the target
receptor, or require engineering of the cellular system, to force
coupling of the target receptor to a particular effector pathway.
Second, all cells express a large number of endogenous GPCRs, as
well as other signaling factors. As a result, the effector pathways
that are measured may be modulated by other endogenous molecules in
addition to the target GPCR, potentially leading to false
results.
[0009] Regulation of G-protein activity is not the only result of
ligand/GPCR binding. Luttrell, et al., J. Cell Sci., 115:455-465
(2002), and Ferguson, Pharmacol. Rev., 53:1-24 (2001), both of
which are incorporated by reference, review other activities which
lead to termination of the GPCR signal. These termination processes
prevent excessive cell stimulation, and enforce temporal linkage
between extracellular signal and corresponding intracellular
pathway.
[0010] In the case of binding of an agonist to GPCR, serine and
threonine residues at the C terminus of the GPCR molecule are
phosphorylated. This phosphorylation is caused by the GPCR kinase,
or "GRK," family. Agonist complexed, C-terminal phosphorylated
GPCRs interact with arrestin family members, which "arrest"
receptor signaling. This binding inhibits coupling of the receptor
to G proteins, thereby targeting the receptor for internalization,
followed by degradation and/or recycling. Hence, the binding of a
ligand to a GPCR can be said to "modulate" the interaction between
the GPCR and arrestin protein, since the binding of ligand to GPCR
causes the arrestin to bind to the GPCR, thereby modulating its
activity. Hereafter, when "modulates" or any form thereof is used,
it refers simply to some change in the way the two proteins of the
invention interact, when the test compound is present, as compared
to how these two proteins interact, in its absence. For example,
the presence of the test compound may strengthen or enhance the
interaction of the two proteins, weaken it, inhibit it, or lessen
it in some way, manner or form which can then be detected.
[0011] This background information has led to alternate methods for
assaying activation and inhibition of GPCRs. These methods involve
monitoring interaction with arrestins. A major advantage of this
approach is that no knowledge of G-protein pathways is
necessary.
[0012] Oakley, et al., Assay Drug Dev. Technol., 1:21-30 (2002) and
U.S. Pat. Nos. 5,891,646 and 6,110,693, incorporated by reference,
describe assays where the redistribution of fluorescently labelled
arrestin molecules in the cytoplasm to activated receptors on the
cell surface is measured. These methods rely on high resolution
imaging of cells, in order to measure arrestin relocalization and
receptor activation. It will be recognized by the skilled artisan
that this is a complex, involved procedure.
[0013] Various other U.S. patents and patent applications dealing
with these points have issued and been filed. For example, U.S.
Pat. No. 6,528,271 to Bohn, et al., deals with assays for screening
for pain controlling medications, where the inhibitor of
β-arrestin binding is measured. Published U.S. patent
applications, such as 2004/0002119, 2003/0157553, 2003/0143626, and
2002/0132327, all describe different forms of assays involving
GPCRs. Published application 2002/0106379 describes a construct
which is used in an example which follows; however, it does not
teach or suggest the invention described herein.
[0014] It is an object of the invention to develop a simpler assay
for monitoring and/or determining modulation of specific
protein/protein interactions, where the proteins include but are
not limited to, membrane bound proteins, such as receptors, GPCRs
in particular. How this is accomplished will be seen in the
examples which follow.
SUMMARY OF THE INVENTION
[0015] Thus, in accordance with the present invention, there is
provided a method for determining if a test compound modulates a
specific protein/protein interaction of interest comprising
contacting said compound to a cell which has been transformed or
transfected with (a) a nucleic acid molecule which comprises, (i) a
nucleotide sequence which encodes said first test protein, (ii) a
nucleotide sequence encoding a cleavage site for a protease or a
portion of a protease, and (iii) a nucleotide sequence which
encodes a protein which activates a reporter gene in said cell, and
(b) a nucleic acid molecule which comprises, (i) a nucleotide
sequence which encodes a second test protein whose interaction with
said first test protein in the presence of said test compound is to
be measured, and (ii) a nucleotide sequence which encodes a
protease or a portion of a protease which is specific for said
cleavage site, and determining activity of said reporter gene as a
determination of whether said compound modulates said
protein/protein interaction.
[0016] The first test protein may be a membrane bound protein, such
as a transmembrane receptor, and in particular a GPCR. Particular
transmembrane receptors include β2-adrenergic receptor
(ADRB2), arginine vasopressin receptor 2 (AVPR2), serotonin
receptor 1a (HTR1A), m2 muscarinic acetylcholine receptor (CHRM2),
chemokine (C-C motif) receptor 5 (CCR5), dopamine D2 receptor
(DRD2), kappa opioid receptor (OPRK), or α1a-adrenergic
receptor (ADRA1A), although it is to be understood that in all cases
the invention is not limited to these specific embodiments. For
example, molecules such as the insulin-like growth factor-1 receptor
(IGF-1R), which is a tyrosine kinase, and proteins which are not
normally membrane bound, like estrogen receptor 1 (ESR1) and
estrogen receptor 2 (ESR2), may also be used. The protease or portion of a protease
may be a tobacco etch virus nuclear inclusion A protease. The
protein which activates said reporter gene may be a transcription
factor, such as tTA or GAL4. The second protein may be an
inhibitory protein, such as an arrestin. The cell may be a
eukaryote or a prokaryote. The reporter gene may be an exogenous
gene, such as β-galactosidase or luciferase.
[0017] The nucleotide sequence encoding said first test protein may
be modified to increase interaction with said second test protein.
Such modifications include but are not limited to replacing all or
part of the nucleotide sequence of the C-terminal region of said
first test protein with a nucleotide sequence which encodes an
amino acid sequence which has higher affinity for said second test
protein than the original sequence. For example, the C-terminal
region may be replaced by a nucleotide sequence encoding the
C-terminal region of AVPR2, AGTRL1, GRPR, F2RL1, CXCR2/IL-8B, CCR4,
or GRPR.
[0018] The method may comprise contacting more than one test
compound to a plurality of samples of cells, each of said samples
being contacted by one or more of said test compounds, wherein each
of said cell samples has been transformed or transfected with the
aforementioned nucleic acid molecules, and determining activity of
reporter genes in said plurality of said samples to determine if
any of said test compounds modulate a specific, protein/protein
interaction. The method may comprise contacting each of said
samples with one test compound, each of which differs from all
others, or comprise contacting each of said samples with a mixture
of said test compounds.
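The two screening layouts described in the preceding paragraph (one
distinct compound per sample, or a mixture of compounds per sample) can
be sketched roughly as follows. This is only an illustrative sketch:
the function names, the pool size, the threshold, and the compound
labels are hypothetical and do not appear in the specification, and
only an increase in reporter activity is flagged here, although
modulation may also appear as a decrease.

```python
from typing import Dict, List


def one_compound_per_sample(compounds: List[str]) -> Dict[int, List[str]]:
    """Each cell sample receives exactly one compound, different from all others."""
    return {i: [c] for i, c in enumerate(compounds)}


def mixtures_per_sample(compounds: List[str], pool_size: int) -> Dict[int, List[str]]:
    """Each cell sample receives a mixture of up to pool_size compounds."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    pools = [compounds[i:i + pool_size] for i in range(0, len(compounds), pool_size)]
    return dict(enumerate(pools))


def candidate_hits(
    layout: Dict[int, List[str]],
    reporter_activity: Dict[int, float],
    threshold: float,
) -> List[str]:
    """Collect compounds from samples whose reporter activity exceeds threshold.

    For a mixture, every compound in a flagged sample is only a candidate;
    follow-up with individual compounds would be needed to identify the
    active one.
    """
    candidates: List[str] = []
    for sample_id, contents in layout.items():
        if reporter_activity.get(sample_id, 0.0) > threshold:
            candidates.extend(contents)
    return candidates
```

With mixtures, fewer samples are needed per compound screened, at the
cost of a second round to resolve which member of a flagged mixture is
responsible.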
[0019] In another embodiment, there is provided a method for
determining if a test compound modulates one or more of a plurality
of protein interactions of interest, comprising contacting said
test compound to a plurality of samples of cells, each of which has
been transformed or transfected with (a) a first nucleic acid
molecule comprising, (i) a nucleotide sequence which encodes a
first test protein, (ii) a nucleotide sequence encoding a cleavage site
for a protease, and (iii) a nucleotide sequence which encodes a
protein which activates a reporter gene in said cell, and (b) a second
nucleic acid molecule which comprises, (i) a nucleotide sequence
which encodes a second test protein whose interaction with said
first test protein in the presence of said test compound of
interest is to be measured, and (ii) a nucleotide sequence which
encodes a protease or a portion of a protease which is specific for said
cleavage site, wherein said first test protein differs from other
first test proteins in each of said plurality of samples, and
determining activity of said reporter gene in one or more of
said plurality of samples as a determination of modulation of one
or more protein interactions of interest.
[0020] The second test protein may be different in each sample or
the same in each sample. All of said samples may be combined in a
common receptacle, and each sample comprises a different pair of
first and second test proteins. Alternatively, each sample may be
tested in a different receptacle. The reporter gene in a given
sample may differ from the reporter gene in other samples. The
mixture of test compounds may comprise or be present in a
biological sample, such as cerebrospinal fluid, urine, blood,
serum, pus, ascites, synovial fluid, a tissue extract, or an
exudate.
[0021] In yet another embodiment, there is provided a recombinant
cell, transformed or transfected with (a) a nucleic acid molecule
which comprises, (i) a nucleotide sequence which encodes said first
test protein, (ii) a nucleotide sequence encoding a cleavage site
for a protease or a portion of a protease, and (iii) a nucleotide
sequence which encodes a protein which activates a reporter gene in
said cell, and (b) a nucleic acid molecule which comprises, (i) a
nucleotide sequence which encodes a second test protein whose
interaction with said first test protein in the presence of said
test compound is to be measured, and (ii) a nucleotide sequence
which encodes a protease or a portion of a protease which is
specific for said cleavage site.
[0022] One or both of said nucleic acid molecules may be stably
incorporated into the genome of said cell. The cell also may have
been transformed or transfected with said reporter gene. The first
test protein may be a membrane bound protein, such as a
transmembrane receptor, and in particular a GPCR. Particular
transmembrane receptors include ADRB2, AVPR2, HTR1A, CHRM2, CCR5,
DRD2, OPRK, or ADRA1A.
[0023] The protease or portion of a protease may be a tobacco etch
virus nuclear inclusion A protease. The protein which activates
said reporter gene may be a transcription factor, such as tTA or
GAL4. The second protein may be an inhibitory protein. The cell may
be a eukaryote or a prokaryote. The reporter gene may be an
exogenous gene, such as β-galactosidase or luciferase. The
nucleotide sequence encoding said first test protein may be
modified to increase interaction with said second test protein,
such as by replacing all or part of the nucleotide sequence of the
C-terminal region of said first test protein with a nucleotide
sequence which encodes an amino acid sequence which has higher
affinity for said second test protein than the original sequence.
The C-terminal region may be replaced by a nucleotide sequence
encoding the C-terminal region of AVPR2, AGTRL1, GRPR, F2RL1,
CXCR2/IL-8B, CCR4, or GRPR.
[0024] In still yet another embodiment, there is provided an
isolated nucleic acid molecule which comprises, (i) a nucleotide
sequence which encodes a test protein, (ii) a nucleotide sequence
encoding a cleavage site for a protease or a portion of a protease,
and (iii) a nucleotide sequence which encodes a protein which
activates a reporter gene in said cell. The test protein may be a
membrane bound protein, such as a transmembrane receptor. A
particular type of transmembrane protein is a GPCR. Particular
transmembrane receptors include ADRB2, AVPR2, HTR1A, CHRM2, CCR5,
DRD2, OPRK, or ADRA1A. The protease or portion of a protease may be
a tobacco etch virus nuclear inclusion A protease. The protein
which activates said reporter gene may be a transcription factor,
such as tTA or GAL4. As above, the invention is not to be viewed as
limited to these specific embodiments.
[0025] In still a further embodiment, there is provided an
expression vector comprising an isolated nucleic acid molecule
which comprises, (i) a nucleotide sequence which encodes a test
protein (ii) a nucleotide sequence encoding a cleavage site for a
protease or a portion of a protease, and (iii) a nucleotide
sequence which encodes a protein which activates a reporter gene in
said cell, and further being operably linked to a promoter.
[0026] In still yet a further embodiment, there is provided an
isolated nucleic acid molecule which comprises, (i) a nucleotide
sequence which encodes a test protein whose interaction with
another test protein in the presence of a test compound is to be
measured, and (ii) a nucleotide sequence which encodes a protease
or a portion of a protease which is specific for said cleavage
site. The test protein may be an inhibitory protein, such as an
arrestin.
[0027] Also provided is an expression vector comprising an isolated
nucleic acid molecule which comprises, (i) a nucleotide sequence
which encodes a test protein whose interaction with another test
protein in the presence of a test compound is to be measured, and
(ii) a nucleotide sequence which encodes a protease or a portion of
a protease which is specific for said cleavage site, said nucleic
acid further being operably linked to a promoter.
[0028] An additional embodiment comprises a fusion protein produced
by expression of: [0029] an isolated nucleic acid molecule which
comprises, (i) a nucleotide sequence which encodes a test protein
(ii) a nucleotide sequence encoding a cleavage site for a protease
or a portion of a protease, and (iii) a nucleotide sequence which
encodes a protein which activates a reporter gene in said cell, and
further being operably linked to a promoter; or [0030] an isolated
nucleic acid molecule which comprises, (i) a nucleotide sequence
which encodes a test protein whose interaction with another test
protein in the presence of a test compound is to be measured, and
(ii) a nucleotide sequence which encodes a protease or a portion of
a protease which is specific for said cleavage site.
[0031] In yet another embodiment, there is provided a test kit
useful for determining if a test compound modulates a specific
protein/protein interaction of interest comprising a separate
portion of each of (a) a nucleic acid molecule which comprises, (i) a
nucleotide sequence which encodes said first test protein, (ii) a
nucleotide sequence encoding a cleavage site for a protease or a
portion of a protease, and (iii) a nucleotide sequence which encodes a
protein which activates a reporter gene in said cell, and (b) a
nucleic acid molecule which comprises, (i) a nucleotide sequence
which encodes a second test protein whose interaction with said
first test protein in the presence of said test compound is to be
measured, and (ii) a nucleotide sequence which encodes a protease or a
portion of a protease which is specific for said cleavage site, and
container means for holding each of (a) and (b) separately from
each other.
[0032] The first test protein may be a membrane bound protein, such
as a transmembrane receptor. A particular type of transmembrane
receptor is a GPCR.
Particular transmembrane receptors include ADRB2, AVPR2, HTR1A,
CHRM2, CCR5, DRD2, OPRK, or ADRA1A. The protease or portion of a
protease may be tobacco etch virus nuclear inclusion A protease.
The protein which activates said reporter gene may be a
transcription factor, such as tTA or GAL4. The second protein may
be an inhibitory protein, such as an arrestin. The kit may further
comprise a separate portion of an isolated nucleic acid molecule
which encodes a reporter gene. The reporter gene may encode
β-galactosidase or luciferase. The nucleotide sequence
encoding said first test protein may be modified to increase
interaction with said second test protein, such as by replacing all
or part of the nucleotide sequence of the C-terminal region of said
first test protein with a nucleotide sequence which encodes an
amino acid sequence which has higher affinity for said second test
protein than the original sequence. The nucleotide sequence of said
C-terminal region may be replaced by a nucleotide sequence encoding
the C-terminal region of AVPR2, AGTRL1, GRPR, F2RL1, CXCR2/IL-8B,
CCR4, or GRPR.
[0033] It is contemplated that any method or composition described
herein can be implemented with respect to any other method or
composition described herein. The use of the word "a" or "an" when
used in conjunction with the term "comprising" in the claims and/or
the specification may mean "one," but it is also consistent with
the meaning of "one or more," "at least one," and "one or more than
one."
[0034] These, and other, embodiments of the invention will be
better appreciated and understood when considered in conjunction
with the following description and the accompanying drawings. It
should be understood, however, that the following description,
while indicating various embodiments of the invention and numerous
specific details thereof, is given by way of illustration and not
of limitation. Many substitutions, modifications, additions and/or
rearrangements may be made within the scope of the invention
without departing from the spirit thereof, and the invention
includes all such substitutions, modifications, additions and/or
rearrangements.
BRIEF DESCRIPTION OF THE FIGURES
[0035] The following drawings form part of the present
specification and are included to further demonstrate certain
aspects of the present invention. The invention may be better
understood by reference to one or more of these drawings in
combination with the detailed description of specific embodiments
presented herein.
[0036] FIG. 1 shows the conceptual underpinnings of the invention,
pictorially, using ligand-receptor binding as an example.
[0037] FIGS. 2a and 2b show that the response of targets in assays
in accordance with the invention is dose dependent, both for
agonists and antagonists.
[0038] FIG. 3 shows that a dose response curve results with a
different target and a different agonist as well.
[0039] FIG. 4 depicts results obtained in accordance with the
invention, using the D2 dopamine receptor.
[0040] FIGS. 5a and 5b illustrate results of an assay which shows
that two molecules can be studied simultaneously.
[0041] FIG. 6 sets forth the result of another "multiplex" assay,
i.e., one where two molecules are studied simultaneously.
[0042] FIG. 7 presents data obtained from assays measuring EGFR
activity.
[0043] FIG. 8 presents data obtained from assays in accordance with
the invention, designed to measure the activity of human type I
interferon receptor.
[0044] FIG. 9 elaborates on the results in FIG. 7, showing a dose
response curve for IFN-α in the cells used to generate FIG.
7.
[0045] FIG. 10 shows the results of additional experiments where a
different transcription factor, and a different cell line, were
used.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
[0046] The present invention relates to methods for determining if
a substance of interest modulates interaction of a first test
protein, such as a membrane bound protein, like a receptor, e.g., a
transmembrane receptor, with a second test protein, like a member
of the arrestin family. The methodology involves cotransforming or
cotransfecting a cell, which may be prokaryotic or eukaryotic, with
two constructs. The first construct includes sequences encoding
(i) the first test protein, such as a transmembrane receptor, (ii)
a cleavage site for a protease, and (iii) a
protein which activates a reporter gene. The second construct
includes, (i) a sequence which encodes a second test protein whose
interaction with the first test protein is measured and/or
determined, and (ii) a nucleotide sequence which encodes a protease
or a portion of a protease sufficient to act on the cleavage site
that is part of the first construct. In especially preferred
embodiments, these constructs become stably integrated into the
cells.
[0047] The features of an embodiment of the invention are shown,
pictorially, in FIG. 1. In brief, first, standard techniques are
employed to fuse DNA encoding a transcription factor to DNA
encoding a first test protein, such as a transmembrane receptor
molecule, being studied. This fusion is accompanied by the
inclusion of a recognition and cleavage site for a protease not
expressed endogenously by the host cell being used in the
experiments.
[0048] DNA encoding this first fusion protein is introduced into
and is expressed by a cell which also contains a reporter gene
sequence, under the control of a promoter element which is
dependent upon the transcription factor fused to the first test
protein, e.g., the receptor. If the exogenous protease is not
present, the transcription factor remains tethered to the first
test protein and is unable to enter the nucleus to stimulate
expression of the reporter gene.
[0049] Recombinant techniques can also be used to produce a second
fusion protein. In the depicted embodiment, DNA encoding a member
of the arrestin family is fused to a DNA molecule encoding the
exogenous protease, resulting in a second fusion protein containing
the second test protein, i.e., the arrestin family member.
[0050] An assay is then carried out wherein the second fusion
protein is expressed, together with the first fusion protein, and a
test compound is contacted to the cells, preferably for a specific
length of time. If the test compound modulates interaction of the
two test proteins, e.g., by stimulating, promoting or enhancing the
association of the first and second test proteins, this leads to
release of the transcription factor, which in turn moves to the
nucleus, and provokes expression of the reporter gene. The activity
of the reporter gene is measured.
[0051] In an alternative system, the two test proteins may interact
in the absence of the test compound, and the test compound may
cause the two test proteins to dissociate, lessen or inhibit their
interaction. In such a case, proteolysis decreases in the presence
of the test compound, so the level of free, functionally active
transcription factor in the cell decreases, leading to a
measurable decrease in the activity of the reporter gene.
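The logic of the two assay formats described above can be sketched as a toy model. This is an illustration only, under the assumption of an idealized all-or-none response; the names `AssayState`, `reporter_on` and `classify_compound` are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayState:
    """Idealized state of one cell population (hypothetical model)."""
    proteins_interact: bool  # do the two test proteins associate?
    protease_present: bool   # is the protease fusion expressed?


def reporter_on(state: AssayState) -> bool:
    # The transcription factor is released only when the interaction
    # brings the protease to the cleavage site on the first fusion.
    return state.protease_present and state.proteins_interact


def classify_compound(interact_without: bool, interact_with: bool) -> str:
    """Compare reporter state without and with the test compound."""
    before = reporter_on(AssayState(interact_without, True))
    after = reporter_on(AssayState(interact_with, True))
    if after and not before:
        return "promotes interaction (reporter increases)"
    if before and not after:
        return "inhibits interaction (reporter decreases)"
    return "no detectable modulation"
```

A real assay yields graded reporter activity rather than a boolean, so the comparison would be made between measured signal levels.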
[0052] In the depicted embodiment, the arrestin protein, which is
the second test protein, binds to the receptor in the presence of
an agonist; however, it is to be understood that since receptors
are but one type of protein, the assay is not dependent upon the
use of receptor molecules, nor is agonist binding the only
interaction capable of being involved. Any protein will suffice,
although the interest in transmembrane proteins is clear. Further,
agonist binding to a receptor is not the only type of binding which
can be assayed. One can determine antagonists, per se and also
determine the relative strengths of different antagonists and/or
agonists in accordance with the invention.
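One way the relative strengths of different agonists might be compared is by estimating, for each compound, the dose giving a half-maximal reporter response. The sketch below is an assumption for illustration and is not a method stated in the disclosure; `estimate_ec50` is a hypothetical helper, and the dose and response values are made-up numbers. It interpolates on a log-dose scale between the two doses that bracket the half-maximal response.

```python
import math
from typing import Sequence


def estimate_ec50(doses: Sequence[float], responses: Sequence[float]) -> float:
    """Estimate the half-maximal dose by log-linear interpolation.

    Assumes doses are positive and ascending, and that the response
    rises with dose. Hypothetical helper for illustration only.
    """
    if len(doses) != len(responses) or len(doses) < 2:
        raise ValueError("need at least two matched dose/response points")
    low, high = min(responses), max(responses)
    half = low + (high - low) / 2.0
    points = list(zip(doses, responses))
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 <= half <= r1 and r1 != r0:
            fraction = (half - r0) / (r1 - r0)
            log_ec50 = math.log10(d0) + fraction * (math.log10(d1) - math.log10(d0))
            return 10 ** log_ec50
    raise ValueError("response never crosses the half-maximal level")


# Made-up example data: molar doses and reporter activity (arbitrary units).
example_doses = [1e-9, 1e-8, 1e-7, 1e-6]
example_responses = [0.0, 25.0, 75.0, 100.0]
example_ec50 = estimate_ec50(example_doses, example_responses)
```

Under this convention, a compound with a lower estimated half-maximal dose would be ranked as the stronger agonist. A curve fit to a sigmoidal model would normally be preferred when enough data points are available.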
[0053] Other details of the invention, including specific methods and
technology for making and using the subject matter thereof, are
described below.
I. EXPRESSION CONSTRUCTS AND TRANSFORMATION
[0054] The term "vector" is used to refer to a carrier nucleic acid
molecule into which a nucleic acid sequence can be inserted for
introduction into a cell where it can be replicated. A nucleic acid
sequence can be "exogenous," which means that it is foreign to the
cell into which the vector is being introduced or that the sequence
is homologous to a sequence in the cell but in a position within
the host cell nucleic acid in which the sequence is ordinarily not
found. Vectors include plasmids, cosmids, viruses (bacteriophage,
animal viruses, and plant viruses), and artificial chromosomes
(e.g., YACs). One of skill in the art would be well equipped to
construct a vector through standard recombinant techniques (see,
for example, Maniatis, et al., Molecular Cloning, A Laboratory
Manual (Cold Spring Harbor, 1990) and Ausubel, et al., 1994,
Current Protocols In Molecular Biology (John Wiley & Sons,
1996), both incorporated herein by reference).
[0055] The term "expression vector" refers to any type of genetic
construct comprising a nucleic acid coding for an RNA capable of
being transcribed. In some cases, RNA molecules are then translated
into a protein, polypeptide, or peptide. In other cases, these
sequences are not translated, for example, in the production of
antisense molecules or ribozymes. Expression vectors can contain a
variety of "control sequences," which refer to nucleic acid
sequences necessary for the transcription and possibly translation
of an operably linked coding sequence in a particular host cell. In
addition to control sequences that govern transcription and
translation, vectors and expression vectors may contain nucleotide
sequences that serve other functions as well and are described
infra.
[0056] In certain embodiments, a plasmid vector is contemplated for
use in cloning and gene transfer. In general, plasmid vectors
containing replicon and control sequences which are derived from
species compatible with the host cell are used in connection with
these hosts. The vector ordinarily carries a replication site, as
well as marking sequences which are capable of providing phenotypic
selection in transformed cells. In a non-limiting example, E. coli
is often transformed using derivatives of pBR322, a plasmid derived
from an E. coli species. pBR322 contains genes for ampicillin and
tetracycline resistance and thus provides easy means for
identifying transformed cells. The pBR plasmid, or other microbial
plasmid or phage must also contain, or be modified to contain, for
example, promoters which can be used by the microbial organism for
expression of its own proteins.
[0057] In addition, phage vectors containing replicon and control
sequences that are compatible with the host microorganism can be
used as transforming vectors in connection with these hosts. For
example, the phage lambda GEM™-11 may be utilized in making a
recombinant phage vector which can be used to transform host cells,
such as, for example, E. coli LE392.
[0058] Bacterial host cells, for example, E. coli, comprising the
expression vector, are grown in any of a number of suitable media,
for example, LB. The expression of the recombinant protein in
certain vectors may be induced, as would be understood by those of
skill in the art, by contacting a host cell with an agent specific
for certain promoters, e.g., by adding IPTG to the media or by
switching incubation to a higher temperature. After culturing the
bacteria for a further period, generally of between 2 and 24 h, the
cells are collected by centrifugation and washed to remove residual
media.
[0059] Many prokaryotic vectors can also be used to transform
eukaryotic host cells. However, it may be desirable to select
vectors that have been modified for the specific purpose of
expressing proteins in eukaryotic host cells. Expression systems
have been designed for regulated and/or high level expression in
such cells. For example, the insect cell/baculovirus system can
produce a high level of protein expression of a heterologous
nucleic acid segment, such as described in U.S. Pat. Nos. 5,871,986
and 4,879,236, both herein incorporated by reference, and which can
be bought, for example, under the name MAXBAC® 2.0 from
INVITROGEN® and BACPACK™ Baculovirus Expression System from
CLONTECH®.
[0060] Other examples of expression systems include
STRATAGENE®'s COMPLETE CONTROL™ Inducible Mammalian
Expression System, which involves a synthetic ecdysone-inducible
receptor, or its pET Expression System, an E. coli expression
system. Another example of an inducible expression system is
available from INVITROGEN®, which carries the T-REX™
(tetracycline-regulated expression) System, an inducible mammalian
expression system that uses the full-length CMV promoter.
INVITROGEN® also provides a yeast expression system called the
Pichia methanolica Expression System, which is designed for
high-level production of recombinant proteins in the methylotrophic
yeast Pichia methanolica. One of skill in the art would know how to
express a vector, such as an expression construct, to produce a
nucleic acid sequence or its cognate polypeptide, protein, or
peptide.
[0061] Regulatory Signals
[0062] The construct may contain additional 5' and/or 3' elements,
such as promoters, poly A sequences, and so forth. The elements may
be derived from the host cell, i.e., homologous to the host, or
they may be derived from a distinct source, i.e., heterologous.
[0063] A "promoter" is a control sequence that is a region of a
nucleic acid sequence at which initiation and rate of transcription
are controlled. It may contain genetic elements at which regulatory
proteins and molecules may bind, such as RNA polymerase and other
transcription factors, to initiate the specific transcription of a
nucleic acid sequence. The phrases "operatively positioned,"
"operatively linked," "under control," and "under transcriptional
control" mean that a promoter is in a correct functional location
and/or orientation in relation to a nucleic acid sequence to
control transcriptional initiation and/or expression of that
sequence.
[0064] A promoter generally comprises a sequence that functions to
position the start site for RNA synthesis. The best known example
of this is the TATA box, but in some promoters lacking a TATA box,
such as, for example, the promoter for the mammalian terminal
deoxynucleotidyl transferase gene and the promoter for the SV40
late genes, a discrete element overlying the start site itself
helps to fix the place of initiation. Additional promoter elements
regulate the frequency of transcriptional initiation. Typically,
these are located in the region 30-110 bp upstream of the start
site, although a number of promoters have been shown to contain
functional elements downstream of the start site as well. To bring
a coding sequence "under the control of" a promoter, one positions
the 5' end of the transcription initiation site of the
transcriptional reading frame "downstream" of (i.e., 3' of) the
chosen promoter. The "upstream" promoter stimulates transcription
of the DNA and promotes expression of the encoded RNA.
[0065] The spacing between promoter elements frequently is
flexible, so that promoter function is preserved when elements are
inverted or moved relative to one another. In the tk promoter, the
spacing between promoter elements can be increased to 50 bp apart
before activity begins to decline. Depending on the promoter, it
appears that individual elements can function either cooperatively
or independently to activate transcription. A promoter may or may
not be used in conjunction with an "enhancer," which refers to a
cis-acting regulatory sequence involved in the transcriptional
activation of a nucleic acid sequence.
[0066] A promoter may be one naturally associated with a nucleic
acid molecule, as may be obtained by isolating the 5' non-coding
sequences located upstream of the coding segment and/or exon. Such
a promoter can be referred to as "endogenous." Similarly, an
enhancer may be one naturally associated with a nucleic acid
molecule, located either downstream or upstream of that sequence.
Alternatively, certain advantages will be gained by positioning the
coding nucleic acid segment under the control of a recombinant or
heterologous promoter, which refers to a promoter that is not
normally associated with a nucleic acid molecule in its natural
environment. A recombinant or heterologous enhancer refers also to
an enhancer not normally associated with a nucleic acid molecule in
its natural environment. Such promoters or enhancers may include
promoters or enhancers of other genes, and promoters or enhancers
isolated from any other virus, or prokaryotic or eukaryotic cell,
and promoters or enhancers not "naturally occurring," i.e.,
containing different elements of different transcriptional
regulatory regions, and/or mutations that alter expression. For
example, promoters that are most commonly used in recombinant DNA
construction include the β-lactamase (penicillinase), lactose
and tryptophan (trp) promoter systems. In addition to producing
nucleic acid sequences of promoters and enhancers synthetically,
sequences may be produced using recombinant cloning and/or nucleic
acid amplification technology, including PCR™, in connection
with the compositions disclosed herein (see U.S. Pat. Nos.
4,683,202 and 5,928,906, each incorporated herein by reference).
Furthermore, it is contemplated that control sequences that direct
transcription and/or expression of sequences within non-nuclear
organelles such as mitochondria, chloroplasts, and the like, can be
employed as well.
[0067] Naturally, it will be important to employ a promoter and/or
enhancer that effectively directs the expression of the DNA segment
in the organelle, cell type, tissue, organ, or organism chosen for
expression. Those of skill in the art of molecular biology
generally know the use of promoters, enhancers, and cell type
combinations for protein expression, (see, for example Sambrook, et
al., 1989, incorporated herein by reference). The promoters
employed may be constitutive, tissue-specific, inducible, and/or
useful under the appropriate conditions to direct high level
expression of the introduced DNA segment, such as is advantageous
in the large-scale production of recombinant proteins and/or
peptides. The promoter may be heterologous or endogenous.
[0068] Additionally, any promoter/enhancer combination (as per, for
example, the Eukaryotic Promoter Data Base EPDB,
www.epd.isb-sib.ch/) could also be used to drive expression. Use of
a T3, T7 or SP6 cytoplasmic expression system is another possible
embodiment. Eukaryotic cells can support cytoplasmic transcription
from certain bacterial promoters if the appropriate bacterial
polymerase is provided, either as part of the delivery complex or
as an additional genetic expression construct.
[0069] A specific initiation signal also may be required for
efficient translation of coding sequences. These signals include
the ATG initiation codon or adjacent sequences. Exogenous
translational control signals, including the ATG initiation codon,
may need to be provided. One of ordinary skill in the art would
readily be capable of determining this and providing the necessary
signals. It is well known that the initiation codon must be
"in-frame" with the reading frame of the desired coding sequence to
ensure translation of the entire insert. The exogenous
translational control signals and initiation codons can be either
natural or synthetic. The efficiency of expression may be enhanced
by the inclusion of appropriate transcription enhancer
elements.
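The in-frame requirement can be illustrated with a short check on a coding sequence. This is a minimal sketch under simplifying assumptions (DNA given as a plain top-strand string, the standard stop codons TAA, TAG and TGA, and no introns); the function name `check_open_reading_frame` is hypothetical.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_open_reading_frame(seq: str) -> List[str]:
    """Return a list of problems found in a candidate coding sequence.

    An empty list means the sequence starts with ATG, has a length that
    is a multiple of three, and has no stop codon before its last codon.
    """
    seq = seq.upper()
    problems = []
    if not seq.startswith("ATG"):
        problems.append("missing ATG initiation codon")
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of three")
    # Split into complete codons, reading in frame from the first base.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            problems.append(f"premature stop codon {codon} at codon {index + 1}")
    return problems
```

For a fusion construct, the same check would be applied to the joined sequence, since a frame shift at any junction would prevent translation of the downstream portion.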
[0070] In certain embodiments of the invention, internal
ribosome entry site (IRES) elements are used to create multigene,
or polycistronic, messages. IRES elements are able to bypass the
ribosome scanning model of 5' methylated Cap dependent translation
and begin translation at internal sites (Pelletier and Sonenberg,
Nature, 334:320-325 (1988)). IRES elements from two members of the
picornavirus family (polio and encephalomyocarditis) have been
described (Pelletier and Sonenberg, supra), as well as an IRES from a
mammalian message (Macejak and Sarnow, Nature, 353:90-94
(1991)). IRES elements can be linked to heterologous open reading
frames. Multiple open reading frames can be transcribed together,
each separated by an IRES, creating polycistronic messages. By
virtue of the IRES element, each open reading frame is accessible
to ribosomes for efficient translation. Multiple genes can be
efficiently expressed using a single promoter/enhancer to
transcribe a single message (see U.S. Pat. Nos. 5,925,565 and
5,935,819, each herein incorporated by reference).
[0071] Other Vector Sequence Elements
[0072] Vectors can include a multiple cloning site (MCS), which is
a nucleic acid region that contains multiple restriction enzyme
sites, any of which can be used in conjunction with standard
recombinant technology to digest the vector (see, for example,
Carbonelli, et al., FEMS Microbiol. Lett., 172(1):75-82 (1999);
Levenson, et al., Hum. Gene Ther., 9(8):1233-1236 (1998); and Cocea,
Biotechniques, 23(5):814-816 (1997), each incorporated herein by
reference). "Restriction enzyme digestion" refers to catalytic
cleavage of a nucleic acid molecule with an enzyme that functions
only at specific locations in a nucleic acid molecule. Many of
these restriction enzymes are commercially available. Use of such
enzymes is widely understood by those of skill in the art.
Frequently, a vector is linearized or fragmented using a
restriction enzyme that cuts within the MCS to enable exogenous
sequences to be ligated to the vector. "Ligation" refers to the
process of forming phosphodiester bonds between two nucleic acid
fragments, which may or may not be contiguous with each other.
Techniques involving restriction enzymes and ligation reactions are
well known to those of skill in the art of recombinant
technology.
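Restriction enzyme digestion as defined above can be sketched in a few lines. The example is a simplification that treats the molecule as a linear top strand and ignores the complementary strand; the function names are hypothetical. It uses EcoRI as an example enzyme, which recognizes GAATTC and cuts after the first G; that enzyme is chosen here for illustration and is not named in this section.

```python
from typing import List


def find_sites(seq: str, site: str) -> List[int]:
    """Return the 0-based start index of every occurrence of a site."""
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def digest_linear(seq: str, site: str, cut_offset: int) -> List[str]:
    """Cut a linear sequence at each site.

    cut_offset is the number of bases from the start of the site to the
    cut position on the top strand (1 for EcoRI, G^AATTC).
    """
    seq = seq.upper()
    fragments = []
    previous = 0
    for position in find_sites(seq, site):
        cut = position + cut_offset
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])
    return fragments


# Arbitrary example sequence containing two EcoRI sites.
example_fragments = digest_linear("AAGAATTCTTGAATTCGG", "GAATTC", 1)
```

A circular vector cut at a single site within the MCS would yield one linear molecule rather than two fragments, which is the linearization step described above.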
[0073] Most transcribed eukaryotic RNA molecules will undergo RNA
splicing to remove introns from the primary transcripts. Vectors
containing genomic eukaryotic sequences may require donor and/or
acceptor splicing sites to ensure proper processing of the
transcript for protein expression (see, for example, Chandler, et
al., 1997, herein incorporated by reference).
[0074] The vectors or constructs of the present invention will
generally comprise at least one termination signal. A "termination
signal" or "terminator" comprises a DNA sequence involved in
specific termination of an RNA transcript by an RNA polymerase.
Thus, in certain embodiments a termination signal that ends the
production of an RNA transcript is contemplated. A terminator may
be necessary in vivo to achieve desirable message levels.
[0075] In eukaryotic systems, the terminator region may also
comprise specific DNA sequences that permit site-specific cleavage
of the new transcript so as to expose a polyadenylation site. This
signals a specialized endogenous polymerase to add a stretch of
about 200 adenosine residues (polyA) to the 3' end of the
transcript. RNA molecules modified with this polyA tail appear to
be more stable and are translated more efficiently. Thus, in other
embodiments involving eukaryotes, it is preferred that the
terminator comprises a signal for the cleavage of the RNA, and it
is more preferred that the terminator signal promotes
polyadenylation of the message. The terminator and/or
polyadenylation site elements can serve to enhance message levels
and to minimize read through from the cassette into other
sequences.
[0076] Terminators contemplated for use in the invention include
any known terminator of transcription described herein or known to
one of ordinary skill in the art, including but not being limited
to, for example, the termination sequences of genes, such as the
bovine growth hormone terminator, or viral termination sequences, such
as the SV40 terminator. In certain embodiments, the termination
signal may be a lack of transcribable or translatable sequence,
such as an untranslatable/untranscribable sequence due to a
sequence truncation.
[0077] In expression, particularly eukaryotic expression, one will
typically include a polyadenylation signal to effect proper
polyadenylation of the transcript. The nature of the
polyadenylation signal is not believed to be crucial to the
successful practice of the invention, and any such sequence may be
employed. Preferred embodiments include the SV40 polyadenylation
signal or the bovine growth hormone polyadenylation signal, both of
which are convenient, readily available, and known to function well
in various target cells. Polyadenylation may increase the stability
of the transcript or may facilitate cytoplasmic transport.
[0078] In order to propagate a vector in a host cell, it may
contain one or more origins of replication (often termed "ori"),
sites which are specific nucleotide sequences at which replication
is initiated. Alternatively, an autonomously replicating sequence
(ARS) can be employed if the host cell is yeast.
[0079] Transformation Methodology
[0080] Suitable methods for nucleic acid delivery for use with the
current invention are believed to include virtually any method by
which a nucleic acid molecule (e.g., DNA) can be introduced into a
cell as described herein or as would be known to one of ordinary
skill in the art. Such methods include, but are not limited to,
direct delivery of DNA such as by ex vivo transfection (Wilson, et
al., Science, 244:1344-1346 (1989); Nabel, et al., Science,
244:1342-1344 (1989)); by injection (U.S. Pat. Nos. 5,994,624,
5,981,274, 5,945,100, 5,780,448, 5,736,524, 5,702,932, 5,656,610,
5,589,466 and 5,580,859, each incorporated herein by reference),
including microinjection (Harlan and Weintraub, J. Cell Biol.,
101(3):1094-1099 (1985); U.S. Pat. No. 5,789,215, incorporated
herein by reference); by electroporation (U.S. Pat. No. 5,384,253,
incorporated herein by reference; Tur-Kaspa, et al., Mol. Cell
Biol., 6:716-718 (1986); Potter, et al., Proc. Natl. Acad. Sci.
USA, 81:7161-7165 (1984)); by calcium phosphate precipitation
(Graham and Van Der Eb, Virology, 52:456-467 (1973); Chen and
Okayama, Mol. Cell Biol., 7(8):2745-2752 (1987); Rippe, et al.,
Mol. Cell Biol., 10:689-695 (1990)); by using DEAE-dextran followed
by polyethylene glycol (Gopal, Mol. Cell Biol., 5:1188-1190 (1985));
by direct sonic loading (Fechheimer, et al., Proc. Natl. Acad. Sci.
USA, 89(17):8463-8467 (1987)); by liposome mediated transfection
(Nicolau and Sene, Biochem. & Biophys. Acta., 721:185-190
(1982); Fraley, et al., Proc. Natl. Acad. Sci. USA, 76:3348-3352
(1979); Nicolau, et al., Meth. Enzym., 149:157-176 (1987); Wong, et
al., Gene, 10:879-894 (1980); Kaneda, et al., Science, 243:375-378
(1989); Kato, et al., J. Biol. Chem., 266:3361-3364 (1991)); by
receptor-mediated transfection (Wu and Wu, J. Biol. Chem.,
262:4429-4432 (1987); Wu and Wu, 1988); by PEG-mediated
transformation of protoplasts (Omirulleh, et al., Plant Mol. Biol.,
21(3):415-428 (1987); U.S. Pat. Nos. 4,684,611 and 4,952,500, each
incorporated herein by reference); by
desiccation/inhibition-mediated DNA uptake (Potrykus, et al., Mol.
Gen. Genet., 199(2):169-177 (1985)); and any combination of such
methods.
II. COMPONENTS OF THE ASSAY SYSTEM
[0081] As with the method described herein, the products which are
features of the invention have preferred embodiments. For example,
in the "three part construct," i.e., the construct that contains sequences
encoding a test protein, the cleavage site, and the activator
protein, the test protein is preferably a membrane bound protein,
such as a transmembrane receptor, e.g., a member of the GPCR
family. These sequences can be modified so that the C terminus of
the protein they encode has better and stronger interactions with
the second protein. The modifications can include, e.g., replacing
a C-terminal encoding sequence of the test protein, such as a GPCR,
with the C terminal coding region for AVPR2, AGTRLI, GRPR, F2PLI,
CCR4, or CXCR2/IL-8, all of which are defined
supra.
[0082] The protein which activates the reporter gene may be a
protein which acts within the nucleus, like a transcription factor
(e.g., tTA, GAL4, etc.), or it may be a molecule that sets a
cascade of reactions in motion, leading to an intranuclear reaction
by another protein. The skilled artisan will be well versed in such
cascades.
[0083] The second construct, as described supra, includes a region
which encodes a protein that interacts with the first protein,
leading to some measurable phenomenon. The protein may be an
activator, an inhibitor, or, more generically, a "modulator" of
the first protein. Members of the arrestin family are preferred,
especially when the first protein is a GPCR, but other protein
encoding sequences may be used, especially when the first protein
is not a GPCR. The second part of these two part constructs encodes
the protease, or portion of a protease, which acts to remove the
activating molecule from the fusion protein encoded by the first
construct.
[0084] However, these preferred embodiments do not limit the
invention, as discussed in the following additional
embodiments.
[0085] Host Cells
[0086] As used herein, the terms "cell," "cell line," and "cell
culture" may be used interchangeably. All of these terms also
include their progeny, which is any and all subsequent generations.
It is understood that all progeny may not be identical due to
deliberate or inadvertent mutations. The host cells generally will
have been engineered to express a screenable or selectable marker
which is activated by the transcription factor that is part of a
fusion protein, along with the first test protein.
[0087] In the context of expressing a heterologous nucleic acid
sequence, "host cell" refers to a prokaryotic or eukaryotic cell
that is capable of replicating a vector and/or expressing a
heterologous gene encoded by a vector. When host cells are
"transfected" or "transformed" with nucleic acid molecules, they
are referred to as "engineered" or "recombinant" cells or host
cells, e.g., a cell into which an exogenous nucleic acid sequence,
such as, for example, a vector, has been introduced. Therefore,
recombinant cells are distinguishable from naturally-occurring
cells which do not contain a recombinantly introduced nucleic
acid.
[0088] Numerous cell lines and cultures are available for use as a
host cell, and they can be obtained through the American Type
Culture Collection (ATCC), which is an organization that serves as
an archive for living cultures and genetic materials
(www.atcc.org). An appropriate host can be determined by one of
skill in the art based on the vector backbone and the desired
result. A plasmid or cosmid, for example, can be introduced into a
prokaryote host cell for replication of many vectors. Cell types
available for vector replication and/or expression include, but are
not limited to, bacteria, such as E. coli (e.g., E. coli strain
RR1, E. coli LE392, E. coli B, E. coli X 1776 (ATCC No. 31537), as
well as E. coli W3110 (F-, lambda-, prototrophic, ATCC No. 273325),
DH5α, JM109, and KC8); bacilli such as Bacillus subtilis; and
other enterobacteriaceae such as Salmonella typhimurium, Serratia
marcescens, and various Pseudomonas species, as well as a number of
commercially available bacterial hosts such as SURE® Competent
Cells and SOLOPACK™ Gold Cells (STRATAGENE®, La Jolla). In
certain embodiments, bacterial cells such as E. coli LE392 are
particularly contemplated as host cells for phage viruses.
[0089] Examples of eukaryotic host cells for replication and/or
expression of a vector include, but are not limited to, HeLa,
NIH3T3, Jurkat, 293, COS, CHO, Saos, and PC12. Many host cells from
various cell types and organisms are available and would be known
to one of skill in the art. Similarly, a viral vector may be used
in conjunction with either a eukaryotic or prokaryotic host cell,
particularly one that is permissive for replication or expression
of the vector.
[0090] Test Proteins
[0091] The present invention contemplates the use of any two
proteins for which a physical interaction is known or suspected.
The proteins will exist as fusion proteins, a first test protein
fused to a transcription factor, and the second test protein fused
to a protease that recognizes a cleavage site in the first fusion
protein, cleavage of which releases the transcription factor. The
only requirements for the test proteins/fusions are (a) that the
first test protein cannot localize to the nucleus prior to
cleavage, and (b) that the protease must remain active following
both fusion to the second test protein and binding of the first
test protein to the second test protein.
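The pairing of the two fusion proteins and requirements (a) and (b) can be written down as a small data model. This is a sketch for illustration; the class names, field names and the `check_pair` function are hypothetical, and the boolean fields stand in for properties that would in practice be established experimentally.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FirstFusion:
    test_protein: str
    cleavage_site_for: str         # protease that recognizes the site
    transcription_factor: str
    nuclear_before_cleavage: bool  # requirement (a): must be False


@dataclass(frozen=True)
class SecondFusion:
    test_protein: str
    protease: str
    protease_active_when_fused_and_bound: bool  # requirement (b)


def check_pair(first: FirstFusion, second: SecondFusion) -> List[str]:
    """Return the reasons, if any, why the pair would not work as an assay."""
    problems = []
    if first.nuclear_before_cleavage:
        problems.append("first fusion reaches the nucleus before cleavage")
    if not second.protease_active_when_fused_and_bound:
        problems.append("protease is inactive after fusion or binding")
    if first.cleavage_site_for != second.protease:
        problems.append("protease does not recognize the cleavage site")
    return problems
```

For example, a receptor fused to a tTA transcription factor through a site for a given protease would be paired with an arrestin fused to that same protease; a mismatch between site and protease is reported as a problem.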
[0092] With respect to the first construct, the first test protein
may be, e.g., a naturally membrane bound protein, or one which has
been engineered to become membrane bound, via standard techniques.
The first test protein may be, e.g., a transmembrane receptor such
as any of the GPCRs, or any other transmembrane receptor of
interest, including, but not being limited to, receptor tyrosine
kinases, receptor serine threonine kinases, cytokine receptors, and
so forth. Further, as it is well known that portions of proteins
will function in the same manner as the full length first test
protein, such active portions of a first test protein are
encompassed by the definition of protein herein.
[0093] As will be evident to the skilled artisan, the present
invention may be used to assay for interaction with any protein,
and is not limited in its scope to assaying membrane bound
receptors, like the GPCRs. For example, the activity of other
classes of transmembrane receptors may be assayed, including but not limited to:
receptor tyrosine kinases (RTKs), such as IGF1R, the
epidermal growth factor receptor (EGFR), ErbB2/HER2/Neu or related
RTKs; receptor serine/threonine kinases, such as Transforming
Growth Factor-beta (TGFβ), activin, or Bone Morphogenetic
Protein (BMP) receptors; cytokine receptors, such as receptors for
the interferon family, interleukin, erythropoietin, G-CSF,
GM-CSF, and tumor necrosis factor (TNF), and leptin receptors; and other
receptors, which are not necessarily normally membrane bound, such
as estrogen receptor 1 (ESR1) and estrogen receptor 2 (ESR2). In
each case, the method involves transfecting a cell with a modified
receptor construct that directs the expression of a chimeric
protein containing the receptor of interest, to which is appended
a protease cleavage site followed by a
transcription factor. The cell is co-transfected with a
second construct that directs the expression of a chimeric protein
consisting of an interacting protein fused to the protease that
recognizes and cleaves the site described supra. In the case of
RTKs, such as the EGFR, this interacting protein may consist of a
SH2 (Src homology domain 2) containing protein or portion thereof,
such as phospholipase C (PLC) or Src homology 2 domain containing
transforming protein 1 (SHC1). In the case of receptor
serine/threonine kinases, such as TGFβ, activin, or BMP
receptors, this interacting protein may be a Smad protein or
portion thereof. In the case of cytokine receptors, such as
interferon-α/β or interferon-γ receptors,
this interacting protein may be a signal transducer and activator
of transcription (STAT) protein such as, but not being limited to,
Stat1 or Stat2; a Janus kinase (JAK) protein such as Jak1, Jak2, or Tyk2; or
portions thereof. In each case, the transfected cell contains a
reporter gene that is regulated by the transcription factor fused
to the receptor. An assay is then performed in which the
transfected cells are treated with a test compound for a specific
period and the reporter gene activity is measured at the end of the
test period. If the test compound activates the receptor of
interest, interactions between the receptor of interest and the
interacting protein are stimulated, leading to cleavage of the
protease site and release of the fused transcription factor, which
is in turn measurable as an increase in reporter gene activity.
[0094] Other possible test protein pairs include antibody-ligands,
enzyme-substrates, dimerizing proteins, components of signal
transduction cascades, and other protein pairs well known to the
art.
[0095] Reporters
[0096] The protein which activates a reporter gene may be any
protein having an impact on a gene, the expression or lack of expression of
which leads to a detectable signal. Typical protein reporters
include enzymes such as chloramphenicol acetyl transferase (CAT),
β-glucuronidase (GUS) or β-galactosidase. Also
contemplated are fluorescent and chemiluminescent proteins such as
green fluorescent protein, red fluorescent protein, cyan
fluorescent protein, luciferase, beta lactamase, and alkaline
phosphatase.
[0097] Transcription Factors and Repressors
[0098] In accordance with the present invention, transcription
factors are used to activate expression of a reporter gene in an
engineered host cell. Transcription factors are typically
classified according to the structure of their DNA-binding domain,
which are generally (a) zinc fingers, (b) helix-turn-helix, (c)
leucine zipper, (d) helix-loop-helix, or (e) high mobility groups.
The activator domains of transcription factors interact with the
components of the transcriptional apparatus (RNA polymerase) and
with other regulatory proteins, thereby affecting the efficiency of
DNA binding.
[0099] The Rel/Nuclear Factor kB (NF-kB) and Activating Protein-1
(AP-1) are among the most studied transcription factor families.
They have been identified as important components of signal
transduction pathways leading to pathological outcomes such as
inflammation and tumorigenesis. Other transcription factor families
include the heat shock/E2F family, POU family and the ATF family.
Particular transcription factors, such as tTA and GAL4, are
contemplated for use in accordance with the present invention.
[0100] Though transcription factors are one class of molecules that
can be used, the assays may be modified to accept the use of
transcriptional repressor molecules, where the measurable signal is
downregulation of a signal generator, or even cell death.
[0101] Proteases and Cleavage Sites
[0102] Proteases are well characterized enzymes that cleave other
proteins at a particular site. One family, the Ser/Thr proteases,
cleave at serine and threonine residues. Other proteases include
cysteine or thiol proteases, aspartic proteases,
metalloproteinases, aminopeptidases, di & tripeptidases,
carboxypeptidases, and peptidyl peptidases. The choice of these is
left to the skilled artisan and certainly need not be limited to
the molecules described herein. It is well known that enzymes have
catalytic domains and these can be used in place of full length
proteases. Such are encompassed by the invention as well. A
specific embodiment is the tobacco etch virus nuclear inclusion A
protease, or an active portion thereof. Other specific cleavage
sites for proteases may also be used, as will be clear to the
skilled artisan.
[0103] Modification of Test Proteins
[0104] The first test protein may be modified to enhance its
binding to the interacting protein in this assay. For example, it
is known that certain GPCRs bind arrestins more stably or with
greater affinity upon ligand stimulation and this enhanced
interaction is mediated by discrete domains, e.g., clusters of
serine and threonine residues in the C-terminal tail (Oakley, et
al., J. Biol. Chem., 274:32248-32257, 1999 and Oakley, et al., J.
Biol. Chem., 276:19452-19460, 2001). Using this as an example, it
is clear that the receptor encoding sequence itself may be
modified, so as to increase the affinity of the membrane bound
protein, such as the receptor, with the protein to which it binds.
Exemplary of such modifications are modifications of the C-terminal
region of the membrane bound protein, e.g., receptor, such as those
described supra, which involve replacing a portion of it with a
corresponding region of another receptor, which has higher affinity
for the binding protein, but does not impact the receptor function.
Examples 16 and 20, supra, show embodiments of this feature of the
invention.
[0105] In addition, the second test protein may be modified to
enhance its interaction with the first test protein. For example,
the assay may incorporate point mutants, truncations or other
variants of the second test protein, e.g., arrestin, that are known
to bind agonist-occupied GPCRs more stably or in a
phosphorylation-independent manner (Kovoor, et al., J. Biol. Chem.,
274:6831-6834, 1999).
III. ASSAY FORMATS
[0106] As discussed above, the present invention, in one
embodiment, offers a straightforward way to assess the interaction
of two test proteins when expressed in the same cell. A first
construct, as described supra, comprises a sequence encoding a
first protein, concatenated to a sequence encoding a cleavage site
for a protease or protease portion, which is itself concatenated to
a sequence encoding a reporter gene activator. By "concatenated" is
meant that the sequences described are fused to produce a single,
intact open reading frame, which may be translated into a single
polypeptide which contains all the elements. These may, but need
not be, separated by additional nucleotide sequences which may or
may not encode additional proteins or peptides. A second construct
inserted into the recombinant cells is also as described supra,
i.e., it contains both a sequence encoding a second protein, and
the protease or protease portion. Together, these elements
constitute the basic assay format when combined with a candidate
agent whose effect on target protein interaction is sought.
[0107] However, the invention may also be used to assay more than
one membrane bound protein, such as a receptor, simultaneously by
employing different reporter genes, each of which is stimulated by
the activation of a protein, such as the classes of proteins
described herein. For example, this may be accomplished by mixing
cells transfected with different receptor constructs and different
reporter genes, or by fusing different transcription factors to
each test receptor, and measuring the activity of each reporter
gene upon treatment with the test compound. For example, it may be
desirable to determine if a molecule of interest activates a first
receptor and also determine if side effects should be expected as a
result of interaction with a second receptor. In such a case one
may, e.g., involve a first cell line encoding a first receptor and
a first reporter, such as lacZ, and a second cell line encoding a
second receptor and a second reporter, such as GFP. Preferred
embodiments of such a system are seen in Examples 17 and 18. One
would mix the two cell lines, add the compound of interest, and
look for a positive effect on one, with no effect on the other.
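By way of illustration only, the selectivity readout described above can be sketched in Python. The reporter names follow the lacZ and GFP example in the text; the function name, the two-fold threshold and the numerical readings are hypothetical and are not taken from the examples.

```python
from typing import Dict


def is_selective(fold_by_reporter: Dict[str, float], target: str,
                 threshold: float = 2.0) -> bool:
    """Return True when only the target reporter is induced.

    fold_by_reporter maps a reporter name to its fold induction over
    untreated cells. The threshold of 2.0 is an arbitrary, assumed cutoff.
    """
    induced = {name for name, fold in fold_by_reporter.items() if fold >= threshold}
    return induced == {target}


# Hypothetical readings for a mixed population of two cell lines.
compound_a = {"lacZ": 5.0, "GFP": 1.1}  # only the first receptor responds
compound_b = {"lacZ": 5.0, "GFP": 4.0}  # both receptors respond

print(is_selective(compound_a, "lacZ"))
print(is_selective(compound_b, "lacZ"))
```

A compound such as the hypothetical compound_b would be flagged as a candidate for side effects mediated by the second receptor.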
[0108] It is contemplated that the invention relates to assays
where a single pair of interacting test proteins is examined and,
more preferably, to what will be referred to herein as "multiplex"
assays. Such assays may be carried out in various ways,
but in all cases, more than one pair of test proteins is tested
simultaneously. This may be accomplished, e.g., by providing more
than one sample of cells, each of which has been transformed or
transfected, to test each interacting pair of proteins. The
different transformed cells may be combined, and tested
simultaneously, in one receptacle, or each different type of
transformant may be placed in a different well, and then
tested.
[0109] The cells used for the multiplex assays described herein may
be, but need not be, the same. Similarly, the reporter system used
may, but need not be, the same in each sample. After the sample or
samples are placed in receptacles, such as wells of a microarray,
one or more compounds may be screened against the plurality of
interacting protein pairs set out in the receptacles.
[0110] The fusion proteins expressed by the constructs are also a
feature of the invention. Other aspects of the invention which will
be clear to the artisan, are antibodies which can identify the
fusion proteins as well as various protein based assays for
determining the presence of the protein, as well as hybridization
assays, such as assays based on PCR, which determine expression of
the gene.
IV. KITS
[0111] Any of the compositions described herein may be comprised in
a kit. The kits will thus comprise, in suitable container means,
the vectors or cells of the present invention, and any additional
agents that can be used in accordance with the present
invention.
[0112] The kits may comprise suitably aliquoted compositions of
the present invention. The components of the kits may be packaged
either in aqueous media or in lyophilized form. The container means
of the kits will generally include at least one vial, test tube,
flask, bottle, syringe or other container means, into which a
component may be placed, and preferably, suitably aliquoted. Where
there is more than one component in the kit, the kit also will
generally contain a second, third or other additional container
into which the additional components may be separately placed.
However, various combinations of components may be comprised in a
vial. The kits of the present invention also will typically include
a means for containing reagent containers in close confinement for
commercial sale. Such containers may include injection or
blow-molded plastic containers into which the desired vials are
retained.
[0113] When the components of the kit are provided in one or
more liquid solutions, the liquid solution is an aqueous solution,
with a sterile aqueous solution being particularly preferred.
However, the components of the kit may be provided as dried
powder(s). When reagents and/or components are provided as a dry
powder, the powder can be reconstituted by the addition of a
suitable solvent. It is envisioned that the solvent may also be
provided in another container means.
V. EXAMPLES
[0114] Specific embodiments describing the invention will be seen
in the examples which follow, but the invention should not be
deemed as limited thereto.
Example 1
[0115] A fusion construct was created, using DNA encoding human
.beta.2 adrenergic receptor, referred to hereafter as "ADRB2", in
accordance with standard nomenclature. Its nucleotide sequence can
be found at GenBank, under Accession Number NM.sub.--000024 (SEQ ID
NO: 1). The tetracycline controlled transactivator tTA, described
by Gossen, et al., Proc. Natl. Acad. Sci. USA, 89:5547-5551 (1992),
incorporated by reference, was also used. A sequence encoding the
recognition and cleavage site for tobacco etch virus nuclear
inclusion A protease, described by Parks, et al., Anal. Biochem.,
216:413-417 (1994), incorporated by reference, was inserted between
these sequences in the fusion coding gene. The CMV promoter region
was placed upstream of the ADRB2 coding region, and a poly A
sequence was placed downstream of the tTA region.
[0116] A fusion construct was prepared by first generating a form
of ADRB2 which lacked internal BamHI and BglII restriction sites.
Further, the endogenous stop codon was replaced with a unique BamHI
site.
[0117] Overlapping PCR was used to do this. To elaborate, a 5'
portion of the coding region was amplified with:
gattgaagat ctgccttctt gctggc (SEQ ID NO: 2), and
gcagaacttg gaagacctgc ggagtcc (SEQ ID NO: 3),
while a 3' portion of the coding region was amplified with:
ggactccgca ggtcttccaa gttctgc (SEQ ID NO: 4), and
ttcggatcct agcagtgagt catttgt (SEQ ID NO: 5).
[0118] The resulting PCR products have 27 nucleotides of
overlapping sequence and were purified via standard agarose gel
electrophoresis. These were mixed together, and amplified with SEQ
ID NO: 2, and SEQ ID NO: 5.
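The 27 nucleotide overlap can be checked directly from the two inner primers. The following Python sketch is illustrative only; the sequences are copied from SEQ ID NOs: 3 and 4, and the helper name is hypothetical.

```python
# Illustrative check that the inner primers of the overlapping PCR
# (SEQ ID NOs: 3 and 4) are exact reverse complements of one another.

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


seq_id_3 = "gcagaacttggaagacctgcggagtcc"  # reverse primer, 5' portion
seq_id_4 = "ggactccgcaggtcttccaagttctgc"  # forward primer, 3' portion

# Both primers anneal to the same stretch on opposite strands, so the two
# PCR products share that stretch and can prime one another when mixed.
overlap_is_exact = reverse_complement(seq_id_3) == seq_id_4
overlap_length = len(seq_id_4)

print(overlap_is_exact, overlap_length)
```

Because the two products share this stretch, amplification of the mixture with only the outer primers (SEQ ID NOs: 2 and 5) yields the joined coding region.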
[0119] PCR was also used to modify the coding region of tTA so that
the endogenous start codon was replaced with a TEV NIa-Pro cleavage
site. The cleavage site, defined by the seven amino acid sequence
ENLYFQS (SEQ ID NO: 6), is taught by Parks, et al., Anal. Biochem.,
216:413-417 (1994), incorporated by reference. The seventh amino
acid is known as the P1' position, and replacing it with other amino
acids is known to reduce the efficiency of cleavage by TEV NIa-Pro.
See Kapust, et al., Biochem. Biophys. Res. Commun., 294:949-955
(2002).
[0120] Variants where the seventh amino acid was changed to Tyr,
and where it was changed to Leu, were produced. These resulted in
intermediate and low efficiency cleavage sites, as compared to the
natural high efficiency site.
[0121] A DNA sequence encoding the natural high efficiency site was
added to the tTA coding region in two steps. Briefly, BamHI and
XbaI restriction sites were added to the 5' end and a XhoI
restriction site was added to the 3' end of the tTA coding region
by PCR with
ccggatcctc tagattagat aaaagtaaag tg (SEQ ID NO: 7) and
gactcgagct agcagtatcc tcgcgccccc taccc (SEQ ID NO: 8),
and the TEV NIa-Pro cleavage site was added to the 5' end by
ligating an oligonucleotide with the sequence
gagaacctgt acttccag (SEQ ID NO: 9)
between the BamHI and XbaI sites.
[0122] This DNA sequence was modified to encode the intermediate
and low efficiency cleavage sites by PCR using:
ggatccgaga acctgtactt ccagtacaga tta (SEQ ID NO: 10), and
ctcgagagat cctcgcgccc cctacccacc (SEQ ID NO: 11)
for ENLYFQY (SEQ ID NO: 12), and
ggatccgaga acctgtactt ccagctaaga tta (SEQ ID NO: 13), and
ctcgagagat cctcgcgccc cctacccacc (SEQ ID NO: 11)
for ENLYFQL (SEQ ID NO: 14).
[0123] These PCR steps also introduced a BamHI restriction site 5'
to the sequence encoding each cleavage site, and an XhoI
restriction site 3' to the tTA stop codon.
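The cleavage site sequences encoded by these oligonucleotides can be confirmed by translation. The Python sketch below is illustrative only. It uses the standard genetic code and the sequences of SEQ ID NOs: 9, 10 and 13; the reading frame is assumed to begin immediately after the 5' BamHI site (ggatcc), and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate lowercase DNA in frame 1, ignoring spaces and a partial last codon."""
    dna = dna.replace(" ", "").lower()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


BAMHI = "ggatcc"

seq_id_9 = "gagaacctgt acttccag"                   # ligated oligonucleotide
seq_id_10 = "ggatccgaga acctgtactt ccagtacaga tta"  # intermediate efficiency site
seq_id_13 = "ggatccgaga acctgtactt ccagctaaga tta"  # low efficiency site

first_six = translate(seq_id_9)
medium_site = translate(seq_id_10.replace(" ", "")[len(BAMHI):])
low_site = translate(seq_id_13.replace(" ", "")[len(BAMHI):])

print(first_six)    # first six residues of the site
print(medium_site)  # seventh residue is Tyr
print(low_site)     # seventh residue is Leu
```

On this reading, SEQ ID NO: 9 supplies only the first six residues of the site, and the seventh residue of the high efficiency site appears to be contributed by the adjoining XbaI site (tctaga), whose first codon, tct, encodes serine.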
[0124] The thus modified ADRB2 coding region was digested with
PstI, which cuts at nucleotide position 260 in the coding region,
and BamHI. This 3' fragment was ligated with the three variants of
tTA modified with the TEV NIa-Pro cleavage sites, that had been
digested with BamHI and XhoI, and the resulting complexes were
cloned into pBlueScript II, which had been digested with PstI and
XhoI.
[0125] A NotI restriction site was introduced 5' to the start codon
of the ADRB2 coding region, again via PCR, using
gcggccgcca ccatgaacgg taccgaaggc cca (SEQ ID NO: 15), and
ctggtgggtg gcccggtacc a (SEQ ID NO: 16).
[0126] The 5' fragment of modified ADRB2 coding region was
isolated, via digestion with NotI and PstI and was ligated into
each of the constructs of the 3' fragment of
ADRB2-TEV-NIa-Pro-cleavage site tTA fusions that had been digested
previously, to produce three, full length constructs encoding
fusion proteins.
[0127] Each construct was digested with NotI and XhoI, and was then
inserted into the commercially available expression vector pcDNA 3,
digested with NotI and XhoI.
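The restriction sites that the above primers are said to introduce can be located by a simple text search. The Python sketch below is illustrative only; the primer sequences are copied from SEQ ID NOs: 5, 7, 8 and 15, the recognition sequences are the standard ones for each enzyme, and the function name is hypothetical.

```python
# Recognition sequences of the enzymes named in Example 1.
SITES = {
    "BamHI": "ggatcc",
    "XbaI": "tctaga",
    "XhoI": "ctcgag",
    "NotI": "gcggccgc",
}

PRIMERS = {
    "SEQ ID NO: 5": "ttcggatcctagcagtgagtcatttgt",
    "SEQ ID NO: 7": "ccggatcctctagattagataaaagtaaagtg",
    "SEQ ID NO: 8": "gactcgagctagcagtatcctcgcgccccctaccc",
    "SEQ ID NO: 15": "gcggccgccaccatgaacggtaccgaaggccca",
}


def sites_in(primer: str) -> list:
    """Return the names of the listed enzymes whose site occurs in the primer."""
    return sorted(name for name, site in SITES.items() if site in primer)


for name, primer in PRIMERS.items():
    print(name, sites_in(primer))
```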
Example 2
[0128] A second construct was also made, whereby the coding
sequence for .beta.-arrestin 2 (GenBank, NM.sub.--004313) (SEQ ID
NO: 17), hereafter "ARRB2", was ligated to the catalytic
domain of the TEV NIa protease (i.e., amino acids 189-424 of mature
NIa protease, residues 2040-2279 in the TEV protein). To do this, a
DNA sequence encoding ARRB2 was modified, so as to add a BamHI
restriction site to its 5' end. Further, the sequence was modified
to replace the endogenous stop codon with a BamHI site. The
oligonucleotides
caggatcctc tggaatgggg gagaaacccg ggacc (SEQ ID NO: 18), and
ggatccgcag agttgatcat catagtcgtc (SEQ ID NO: 19)
were used. The resulting PCR product was cloned into the
commercially available vector pGEM-T EASY (Promega). The multiple
cloning site of the pGEM-T EASY vector includes an EcoRI site 5' to
the start codon of ARRB2.
[0129] The TEV NIa-Pro coding region was then modified to replace
the endogenous start codon with a BglII site, and to insert at the
3' end a sequence which encodes the influenza hemagglutinin epitope
YPYDVPDYA (SEQ ID NO: 20) in accordance with Kolodziej, et al.,
Meth. Enzymol., 194:508-519 (1991), followed by a stop codon, and a
NotI restriction site. This was accomplished via PCR, using
agatctagct tgtttaaggg accacgtg (SEQ ID NO: 21), and
gcggccgctc aagcgtaatc tggaacatca tatgggtacg agtacaccaa ttcattcatg ag
(SEQ ID NO: 22).
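That the reverse primer carries the epitope, the stop codon and the NotI site in the stated order can be seen by reading its reverse complement. The Python sketch below is illustrative only; it uses the sequence of SEQ ID NO: 22 and the standard genetic code, it assumes that the reverse complement is in frame from its first nucleotide, and the helper names are hypothetical.

```python
from itertools import product

BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate lowercase DNA in frame 1, ignoring a partial last codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


seq_id_22 = "gcggccgctcaagcgtaatctggaacatcatatgggtacgagtacaccaattcattcatgag"

coding_strand = reverse_complement(seq_id_22)
# Everything before the first stop codon is the encoded C-terminus.
encoded_tail = translate(coding_strand).split("*")[0]

print(coding_strand)
print(encoded_tail)
```

The encoded tail ends in YPYDVPDYA (SEQ ID NO: 20), and the coding strand ends in the NotI site gcggccgc. The residues preceding the epitope presumably correspond to the 3' end of the protease coding region to which the primer anneals.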
[0130] The resulting, modified ARRB2 coding region was digested
with EcoRI and BamHI, while the modified TEV coding region was
cleaved with BglII and NotI. Both fragments were ligated into a
commercially available pcDNA3 expression vector, digested with
EcoRI and NotI.
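The three-way ligation described above presumably relies on the fact that BamHI and BglII leave identical single-stranded overhangs, so that the BamHI end of the ARRB2 fragment can be joined to the BglII end of the TEV fragment. The Python sketch below is illustrative only; the recognition sequences and cut positions are the standard ones for these enzymes, and the function name is hypothetical.

```python
# (recognition site, offset of the cut on each strand) for palindromic
# enzymes that leave 5' overhangs, e.g. BamHI cuts g^gatcc.
ENZYMES = {
    "EcoRI": ("gaattc", 1),
    "BamHI": ("ggatcc", 1),
    "BglII": ("agatct", 1),
    "NotI": ("gcggccgc", 2),
}


def five_prime_overhang(enzyme: str) -> str:
    """Return the single-stranded 5' overhang left by the named enzyme."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


bamhi_site, bamhi_cut = ENZYMES["BamHI"]
bglii_site, bglii_cut = ENZYMES["BglII"]

# Sequence formed when a BamHI-cut end is ligated to a BglII-cut end.
junction = (bamhi_site[:bamhi_cut]
            + five_prime_overhang("BamHI")
            + bglii_site[len(bglii_site) - bglii_cut:])

print(five_prime_overhang("BamHI"), five_prime_overhang("BglII"))
print(junction)
```

The resulting junction, ggatct, is recognized by neither enzyme, while the EcoRI and NotI ends remain available for insertion into the vector.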
Example 3
[0131] Plasmids encoding ADRB2-TEV-NIa-Pro cleavage site-tTA and
the ARRB2-TEV-NIa protease fusion proteins were transfected into
HEK-293T cells, and into "clone 41," which is a derivative of
HEK-293T, that has a stably integrated .beta.-galactosidase gene
under control of a tTA dependent promoter. About 5.times.10.sup.4
cells were plated in each well of a 24 well plate, in DMEM medium
supplemented with 10% fetal bovine serum, 2 mM L-glutamine, 100
units/ml penicillin, 100 .mu.g/ml G418, and 5 .mu.g/ml puromycin.
Cells were grown to reach 50% confluency the next day, and were
then transfected, using 0.4 .mu.g plasmid DNA, and 2 .mu.l Fugene
(a proprietary transfection reagent containing lipids and other
material). The mix was combined in 100 .mu.l of DMEM medium, and
incubated for 15 minutes at room temperature prior to being added to
the cells. Transfected cells were incubated for 8-20 hours before
testing by adding drugs which are known agonists for the receptor,
and were then assayed 16-24 hours after drug addition.
Example 4
[0132] The levels of .beta.-galactosidase activity in the cells
were first measured by staining the cells with a chromogenic
substrate, i.e., "X-gal," as taught by MacGregor, et al., Somat.
Cell Mol. Genet., 13:253-265 (1987), incorporated by reference.
Following culture, cells were washed, twice, in D-PBS with calcium
and magnesium, fixed for 5 minutes in 4% paraformaldehyde, and then
washed two additional times with D-PBS with calcium and magnesium, for
10 minutes each time. Fixed cells were incubated with 5 mM
potassium ferricyanide, 5 mM potassium ferrocyanide, 2 mM
MgCl.sub.2, 0.1% X-Gal, that had been prepared from a 1:40 dilution
of 4% X-Gal stock in dimethylformamide, in D-PBS with calcium and
magnesium.
[0133] The reaction was incubated in the dark at room temperature
for from 3-4 hours to overnight. Substrate solution was removed,
and cells were mounted under glass coverslips with mowiol mounting
medium (10% mowiol, 0.1% 1,4-diazabicyclo[2.2.2]octane, 24%
glycerol).
[0134] The results indicated that cells transfected with either the
ADRB2-TEV-NIa-Pro cleavage site-tTA plasmid alone or the
ARRB2-TEV-NIa protease plasmid alone did not express
.beta.-galactosidase. A small fraction of cells transfected with
both plasmids did express .beta.-galactosidase, probably due to
basal levels of interaction between unstimulated ADRB2 and ARRB2.
About 3-5 fold more cells expressed the reporter gene after
treatment with either 10 .mu.M isoproterenol, or 10 .mu.M
epinephrine, both of which are ADRB2 agonists.
[0135] When the cells were pretreated for 5 minutes with the ADRB2
antagonist alprenolol (10 .mu.M), the agonist induced increase in
.beta.-galactosidase expressing cells was blocked, and treatment
with alprenolol alone had no apparent effect.
[0136] These results show that one can link agonist binding and
GPCR stimulation to transcriptional activation of a reporter
gene.
Example 5
[0137] A set of experiments were carried out in order to quantify
the level of reporter gene activity in the cells more precisely and
to maximize the signal-to-background ratio of the assay. This was
accomplished by measuring the level of reporter gene induction
using a commercially available chemiluminescence assay for
.beta.-galactosidase activity. Clone 41 cells were transfected with
the ADRB2-tTA fusion constructs, containing either the high, medium
or low efficiency cleavage sites, and the ARRB2-TEV-NIa protease
expression plasmid described supra. Cells were either untreated or
treated with 1 .mu.M isoproterenol 20 hours after the transfection,
and the luminescence assay was carried out 24 hours after the drug
addition. In brief, following cell culture, the medium was removed,
and 50 .mu.l of lysis buffer (100 mM potassium phosphate, pH 7.8,
0.2% Triton X-100) was added to each well. The cells were lysed via
incubation for 5 minutes, at room temperature, with mild agitation.
Lysates were collected and analyzed via commercially available
products.
[0138] In all cases, treatment with agonist increased levels of
.beta.-galactosidase activity. However, the background level of
reporter gene activity in untreated cells was lowest with the low
efficiency cleavage site, relative to the medium and high
efficiency sites. Further, agonist treatment resulted in a 4.8-fold
stimulation of reporter gene activity in cells transfected with the
low efficiency cleavage site, compared to 2.8-fold for the medium
efficiency cleavage site and 1.2-fold for the high efficiency
cleavage site. Thus, the highest signal-to-background ratio is
obtained by using the low efficiency protease cleavage site.
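The comparison made above amounts to dividing the reporter activity of treated cells by that of untreated cells and selecting the site with the largest ratio. The Python sketch below is illustrative only; the fold values are those reported in this example, while the function name and the raw readings in the usage line are hypothetical.

```python
def fold_induction(treated: float, untreated: float) -> float:
    """Return reporter activity in treated cells relative to untreated cells."""
    if untreated <= 0:
        raise ValueError("untreated activity must be positive")
    return treated / untreated


# Fold stimulation reported in Example 5 for each cleavage site.
reported_fold = {"low": 4.8, "medium": 2.8, "high": 1.2}

# The site giving the highest signal-to-background ratio.
best_site = max(reported_fold, key=reported_fold.get)
print(best_site)

# Hypothetical raw luminescence readings, for illustration of the ratio only.
print(fold_induction(treated=9.6, untreated=2.0))
```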
Example 6
[0139] These experiments were designed to verify that the agonist
stimulated increase in reporter gene expression is dependent on
binding and activation of the receptor by the agonist.
[0140] To do this, variants of the ADRB2-tTA fusion constructs were
generated following the protocols supra, except each contained a
mutant form of the receptor with a single amino acid change from D
to S at position 113, which results in a greatly reduced affinity
for the agonist isoproterenol. See Strader, et al., J. Biol. Chem.,
266:5-8 (1991). Three forms of the mutant receptor-tTA fusion
construct with each of the different cleavage sites were
formed.
[0141] The levels of .beta.-galactosidase activity were measured in
clone 41 cells co-transfected with the ADRB2-tTA fusion constructs
containing the D113S point mutation and the ARRB2-TEV-NIa protease
expression plasmid described previously. The activity tests were
carried out exactly as described, supra. The results indicated that
the agonist isoproterenol did not stimulate reporter gene
expression in cells expressing the mutant ADRB2-tTA fusion
constructs.
Example 7
[0142] These experiments were designed to examine whether the
agonist stimulated increase in reporter gene expression is
dependent on fusion of TEV NIa-Pro to ARRB2.
[0143] To do this, the levels of .beta.-galactosidase activity were
measured in clone 41 cells co-transfected with the ADRB2-tTA fusion
construct containing the low efficiency cleavage site and either
the ARRB2-TEV-NIa protease expression plasmid described supra, or a
control TEV-NIa protease fusion to the SH2 domain of phospholipase
C. The activity tests were carried out exactly as described, supra.
The results indicated that agonist-stimulated increase in reporter
gene expression was detected only when the TEV protease was fused
to ARRB2 and not when fused to an unrelated polypeptide.
Example 8
[0144] These experiments were designed to determine if gene
expression is induced selectively by agonists of the target
receptor, or if it can be stimulated by other molecules.
[0145] ATP is an agonist for G protein coupled receptors P2Y1 and
P2Y2, which are expressed endogenously by HEK-293T cells.
[0146] Experiments were carried out using clone 41 cells which were
cotransfected with the ADRB2-tTA fusion construct containing the
low efficiency cleavage site and the arrestin-TEV-NIa protease
fusion as described supra, which were treated with isoproterenol,
ATP, or left untreated. The assays were carried out as described,
supra.
[0147] The results indicated that induction of reporter gene
activity was specific to activation of the target receptor. Stimulation
of another GPCR pathway was irrelevant.
Example 9
[0148] A set of experiments were carried out using clone 41 cells
which were cotransfected with the ADRB2-tTA fusion construct
containing the low efficiency cleavage site and the ARRB2-TEV-NIa
protease fusion as described supra, which were treated with varying
amounts of one of the adrenergic receptor agonists isoproterenol
and epinephrine. The assays were carried out as described, supra.
The results presented in FIG. 2a show a dose-response curve for the
stimulation of reporter gene expression by these two ligands. Each
point represents the mean value obtained from three
experiments.
[0149] A set of experiments were carried out as described supra, in
which the co-transfected clone 41 cells were pretreated with
varying concentrations of the adrenergic receptor antagonist
alprenolol for 15 minutes, followed by treatment with 1 .mu.M
epinephrine. The results shown in FIG. 2b indicate a
dose-inhibition curve for this antagonist.
Example 10
[0150] A similar set of constructs were made to establish an assay
for the G protein coupled arginine vasopressin receptor 2 (AVPR2).
The AVPR2 coding region (Genbank Accession Number: NM.sub.--000054)
(SEQ ID NO: 23) was modified to place an EcoRI site at the 5' end
and replace the stop codon with a BamHI site using PCR with the
primers
gaattcatgc tcatggcgtc caccac (SEQ ID NO: 24) and
ggatcccgat gaagtgtcct tggccag (SEQ ID NO: 25).
[0151] The modified AVPR2 coding region was ligated into the three
ADRB2-tTA constructs described supra, which had been cut with EcoRI
and BamHI. This replaced the entire coding sequence of the ADRB2
with the coding sequence of AVPR2.
[0152] Clone 41 cells were co-transfected with the AVPR2-tTA fusion
construct containing the low efficiency cleavage site and the
ARRB2-TEV-NIa protease fusion described supra, and assays were
carried out using varying concentrations (1 pM to 2 .mu.M) of
[Arg8] vasopressin, an agonist for AVPR2. The data, presented in
FIG. 3, show a dose-response curve for this agonist, with an EC50
of 3.3 nM, which agrees with previously published data (Oakley, R.,
et al., Assay and Drug Development Technologies, 1:21-30 (2002)).
The maximal response resulted in an approximately 40-fold induction
of reporter gene expression over the background level.
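For orientation, a dose-response curve of this kind is commonly modelled with a Hill equation. The Python sketch below is illustrative only and is not the analysis used to generate FIG. 3. It takes the EC50 of 3.3 nM and the approximately 40-fold maximal induction from this example; the Hill coefficient of 1, the baseline of 1-fold and the function name are assumptions.

```python
def hill_response(conc_molar: float,
                  ec50_molar: float = 3.3e-9,
                  max_fold: float = 40.0,
                  hill_coefficient: float = 1.0) -> float:
    """Return the modelled fold induction at a given agonist concentration.

    The baseline (no agonist) is defined as 1-fold. The Hill coefficient
    of 1.0 is an assumption, not a value reported in the example.
    """
    if conc_molar <= 0:
        return 1.0
    fraction = 1.0 / (1.0 + (ec50_molar / conc_molar) ** hill_coefficient)
    return 1.0 + (max_fold - 1.0) * fraction


# Concentrations spanning the tested range of 1 pM to 2 uM.
for conc in (1e-12, 3.3e-9, 2e-6):
    print(conc, round(hill_response(conc), 2))
```

Under these assumptions the modelled response at the EC50 lies midway between the baseline and the maximal induction.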
Example 11
[0153] A similar set of constructs were made to establish an assay
for the G protein coupled serotonin receptor 1a (HTR1A). The HTR1A
coding region (Genbank Accession Number: NM.sub.--000524) (SEQ ID
NO: 26) was modified to place an EcoRI site at the 5' end and
replace the stop codon with a BamHI site using PCR with the
primers
gaattcatgg atgtgctcag ccctgg (SEQ ID NO: 27) and
ggatccctgg cggcagaact tacac (SEQ ID NO: 28).
[0154] The modified HTR1A coding region was ligated into the
AVPR2-tTA constructs described supra, which had been cut with EcoRI
and BamHI. This replaced the entire coding sequence of AVPR2 with
the coding sequence of HTR1A. The resulting construct will be
referred to as "HTR1A-tTA" hereafter.
[0155] Clone 41 cells were co-transfected with the HTR1A-tTA fusion
construct containing the low efficiency cleavage site and the
ARRB2-TEV-NIa protease fusion construct described supra, and assays
were carried out using 10 .mu.M 8-hydroxy-DPAT HBr (OH-DPAT), an
agonist for the HTR1A, as well as with 10 .mu.M serotonin, a
natural agonist for HTR1A. The assays were carried out as
described, supra. The maximal response to OH-DPAT resulted in a
6.3-fold induction of reporter gene expression over background
level and the maximal response to serotonin resulted in a 4.6-fold
induction of reporter gene expression over background level.
Example 12
[0156] Similar constructs were made to establish an assay for the G
protein coupled m2 muscarinic acetylcholine receptor (CHRM2). The
CHRM2 coding region (Genbank Accession Number: NM.sub.--000739)
(SEQ ID NO: 29) was modified to place an EcoRI site at the 5' end
and replace the stop codon with a BglII site using PCR with the
primers
gaattcatga ataactcaac aaactcc (SEQ ID NO: 30) and
agatctcctt gtagcgccta tgttc (SEQ ID NO: 31).
[0157] The modified CHRM2 coding region was ligated into the
AVPR2-tTA constructs described supra, which had been cut with EcoRI
and BamHI. This replaced the entire coding sequence of AVPR2 with
the coding sequence of CHRM2.
[0158] Clone 41 cells were co-transfected with the CHRM2-tTA fusion
construct containing the high efficiency cleavage site and the
ARRB2-TEV-NIa protease fusion described supra, where the
ARRB2-protease fusion protein was expressed under the control of
the Herpes Simplex Virus thymidine kinase (HSV-TK) promoter, and
assays were carried out using 10 .mu.M carbamylcholine Cl
(carbachol), an agonist for CHRM2, as described supra. The maximal
response to carbachol resulted in a 7.2-fold induction of reporter
gene expression over background.
Example 13
[0159] Constructs were also made to establish an assay for
the G protein coupled chemokine (C-C motif) receptor 5 (CCR5). The
CCR5 coding region (Genbank Accession Number: NM.sub.--000579) (SEQ
ID NO: 32) was modified to place a NotI site at the 5' end and
replace the stop codon with a BamHI site using PCR with the
primers
gcggccgcat ggattatcaa gtgtcaagtc c (SEQ ID NO: 33) and
ggatccctgg cggcagaact tacac (SEQ ID NO: 34).
[0160] The CCR5 coding region was also modified to place a BsaI
site at the 5' end which, when cut, leaves a nucleotide overhang
which is compatible with EcoRI cut DNA using the primers
ggtctccaat tcatggatta tcaagtgtca agt (SEQ ID NO: 35) and
gacgacagcc aggtacctat c (SEQ ID NO: 36).
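The compatibility of the BsaI-generated end with EcoRI-cut DNA can be checked from the primer sequence. BsaI recognizes ggtctc and cuts outside its recognition site, one nucleotide downstream on the top strand and five nucleotides downstream on the bottom strand, leaving a four nucleotide 5' overhang. The Python sketch below is illustrative only; it uses the sequence of SEQ ID NO: 35, and the function name is hypothetical.

```python
BSAI_SITE = "ggtctc"
ECORI_SITE = "gaattc"


def bsai_overhang(seq: str) -> str:
    """Return the 4 nt 5' overhang left by BsaI at its first site in seq.

    BsaI cuts 1 nt past its site on the top strand and 5 nt past it on
    the bottom strand, so the overhang is the 4 nt in between.
    """
    start = seq.index(BSAI_SITE)
    cut_top = start + len(BSAI_SITE) + 1
    return seq[cut_top:cut_top + 4]


seq_id_35 = "ggtctccaattcatggattatcaagtgtcaagt"

# EcoRI cuts g^aattc, leaving the central four nucleotides single stranded.
ecori_overhang = ECORI_SITE[1:5]

print(bsai_overhang(seq_id_35), ecori_overhang)
```

The two overhangs are identical, which is consistent with the statement that the BsaI-cut fragment can be ligated to EcoRI-cut DNA.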
[0161] The first modified coding region was cut with ClaI and BamHI
and the second was cut with BsaI and ClaI. Both fragments were
ligated into the AVPR2-tTA constructs described supra, which had
been cut with EcoRI and BamHI. This replaced the entire coding
sequence of AVPR2 with the coding sequence of CCR5.
[0162] The CCR5-tTA fusion construct containing the low efficiency
cleavage site was transfected into "clone 34" cells, which are a
derivative of the HEK cell line "clone 41" described supra, but
which contain a stably integrated ARRB2-TEV-NIa protease fusion
gene under the control of the CMV promoter. Assays were carried out
using 1 .mu.g/ml "Regulated on Activation, Normal T-Cell Expressed
and Secreted" (RANTES), a known agonist for CCR5. The maximal
response to RANTES, measured as described supra, resulted in an
approximately 40-fold induction of reporter gene expression over
the background.
Example 14
[0163] Next, a set of constructs were made to establish an assay
for the G protein coupled dopamine 2 receptor (DRD2). The DRD2
coding region (Genbank Accession Number: NM.sub.--000795) (SEQ ID
NO: 37) was modified to place an EcoRI site at the 5' end and
replace the stop codon with a BglII site using PCR with the
primers
gaattcatgg atccactgaa tctgtcc (SEQ ID NO: 38) and
agatctgcag tggaggatct tcagg (SEQ ID NO: 39).
[0164] The modified DRD2 coding region was ligated into the
AVPR2-tTA constructs described supra, cut with EcoRI and BamHI.
This replaced the entire coding sequence of AVPR2 with the coding
sequence of DRD2.
[0165] Clone 41 cells were co-transfected with the DRD2-tTA fusion
construct containing the medium efficiency cleavage site and the
ARRB2-TEV-NIa protease fusion described supra, and assays were
carried out using 10 .mu.M dopamine HCl (dopamine), an agonist for
DRD2. Results were measured as in the assays described supra. The
maximal response to dopamine resulted in a 2.7-fold induction of
reporter gene expression over the background.
Example 15
[0166] These experiments were designed to demonstrate enhancements
of the assay using arrestin variants that bind agonist-occupied
GPCRs more stably. First, a fusion of the TEV NIa protease to
.beta.-arrestin-1 (ARRB1) was constructed. The coding region of
ARRB1 (Genbank Accession Number: NM.sub.--004041) (SEQ ID NO: 40)
was modified to place an Asp718 site at the 5' end and replace the
stop codon with a BamHI site using PCR with the primers
ggtaccatgg gcgacaaagg gacgcgagtg (SEQ ID NO: 41) and
ggatcctctg ttgttgagct gtggagagcc tgtaccatcc tcctcttc (SEQ ID NO: 42).
[0167] The resulting modified ARRB1 coding region was cut with
Asp718 and EcoRI and with EcoRI and BamHI, while the modified TEV
NIa-Pro coding region described supra was cut with BglII and NotI.
All three fragments were ligated into a commercially available
pcDNA3 expression vector, which had been digested with Asp718 and
NotI.
[0168] Clone 41 cells were co-transfected with the DRD2-tTA fusion
construct containing the medium efficiency cleavage site and the
ARRB1-TEV-NIa protease fusion, and assays were carried out using 10
.mu.M dopamine HCl (dopamine), an agonist for the D2 receptor, as
described supra. The maximal response to dopamine resulted in a
2.1-fold induction of reporter gene expression over the
background.
[0169] Truncation of ARRB1 following amino acid 382 has been
reported to result in enhanced affinity for agonist-bound GPCRs,
independent of GRK-mediated phosphorylation (Kovoor A., et al., J.
Biol. Chem., 274(11):6831-6834 (1999)). To demonstrate the use of
such a "constitutively active" arrestin in the present assay, the
coding region of .beta.-arrestin-1 was modified to place an Asp718
site at the 5' end and a BamHI site after amino acid 382 using PCR
with SEQ ID NO: 41, supra
and
[0170] ggatccattt gtgtcaagtt ctatgag (SEQ ID NO: 43).
[0171] This results in an ARRB1 coding region which is 36 amino
acids shorter than the full-length coding region. The resulting
modified ARRB1 coding region, termed "ARRB1 (.DELTA.383)", was cut
with Asp718 and EcoRI and with EcoRI and BamHI, while the modified
TEV NIa-Pro coding region described supra was cut with BglII and
NotI. All three fragments were ligated into a commercially
available pcDNA3 expression vector, digested with Asp718 and
NotI.
[0172] Clone 41 cells were co-transfected with the DRD2-tTA fusion
construct containing the medium efficiency cleavage site and the
ARRB1 (.DELTA.383)-TEV-NIa protease fusion, and assays were carried
out using 10 .mu.M dopamine HCl (dopamine), an agonist for the DRD2
receptor, as described supra. The maximal response to dopamine
resulted in an 8.3-fold induction of reporter gene expression over
the background.
[0173] To examine the effect of a comparable truncation of the
ARRB2 coding region, the coding region of ARRB2 was modified to
place an Asp718 site at the 5' end and replace 81 nucleotides at
the 3' end with a BamHI site using PCR with the primers
TABLE-US-00016 ggtaccatgg gggagaaacc cgggacc (SEQ ID NO: 44) and
ggatcctgtg gcatagttgg tatc. (SEQ ID NO: 45)
[0174] This results in an ARRB2 coding region which is 27 amino
acids shorter than the full-length coding region. The resulting
modified ARRB2 coding region was cut with Asp718 and BamHI, while
the modified TEV NIa-Pro coding region described supra was cut with
BglII and NotI. Both fragments were ligated into a commercially
available pcDNA3 expression vector, digested with Asp718 and
NotI.
[0175] Clone 41 cells were co-transfected with the DRD2-tTA fusion
construct containing the medium efficiency cleavage site and the
ARRB2 (.DELTA.383)-TEV-NIa protease fusion, and assays were carried
out using 10 .mu.M dopamine HCl (dopamine), an agonist for the DRD2
receptor, as described supra. The maximal response to dopamine
resulted in a 2.1-fold induction of reporter gene expression over
the background.
[0176] These results, presented in FIG. 4, demonstrate that the DRD2
dopamine receptor assay shows the highest signal-to-background
ratio using the arrestin variant ARRB1 (.DELTA.383).
Example 16
[0177] This set of experiments was carried out to demonstrate
enhancements of the assay using receptor modifications that are
designed to increase affinity for the interacting protein. In this
example, the C-terminal tail domain of a test receptor was replaced
with the corresponding tail domain from AVPR2, a receptor known to
bind arrestins with high affinity. In these examples the fusion
junction was made 15-18 amino acids after the conserved NPXXY motif
at the end of the seventh transmembrane helix, which typically
corresponds to a position immediately after a putative
palmitoylation site in the receptor C-terminus.
[0178] First, PCR was used to produce a DNA fragment encoding the
C-terminal 29 amino acids from AVPR2, followed by the low
efficiency TEV cleavage site and tTA transcription factor. The
fragment was also designed such that the first two amino acids
(Ala, A and Arg, R) are encoded by the BssHII restriction site
GCGCGC. This was accomplished by amplifying the AVPR2-tTA construct
with the low efficiency cleavage site described supra, with the
primers
TABLE-US-00017 (SEQ ID NO: 46) tgtgcgcgcg gacgcacccc acccagcctg ggt
and (SEQ ID NO: 11) ctcgagagat cctcgcgccc cctacccacc.
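The statement that the BssHII site GCGCGC encodes Ala-Arg can be checked by translating the site, and the forward primer that carries it, with the standard genetic code. The sketch below assumes the reading frame starts at the first base of the primer, which is an assumption made for illustration; the helper names are hypothetical.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate from the first base, ignoring any trailing partial codon."""
    bases = "".join(dna.split()).upper()
    usable = len(bases) - len(bases) % 3
    return "".join(CODON_TABLE[bases[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    print(translate("GCGCGC"))  # the BssHII site alone
    forward_primer = "tgtgcgcgcg gacgcacccc acccagcctg ggt"  # SEQ ID NO: 46
    print(translate(forward_primer))
```

Under that frame assumption the site reads as the codons GCG and CGC, which is why the junction contributes Ala and Arg.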
[0179] Next, the coding region of the DRD2 was modified to place an
EcoRI site at the 5' end and to insert a BssHII site after the last
amino acid in the coding region (Cys-443). This was done using PCR
with the primers
TABLE-US-00018 (SEQ ID NO: 47) gaattcatgg atccactgaa tctgtcc and
(SEQ ID NO: 48) tgtgcgcgcg cagtggagga tcttcaggaa ggc.
[0180] The resulting modified D2 coding region was cut with EcoRI
and BssHII and the resulting AVPR2 C-terminal tail-low efficiency
cleavage site-tTA fragment was cut with BssHII and BamHI. Both
fragments were ligated into the AVPR2-low efficiency cleavage
site-tTA construct described supra, cut with EcoRI and BamHI.
[0181] Clone 41 cells were co-transfected with the DRD2-AVPR2
Tail-tTA fusion construct containing the low efficiency TEV
cleavage site and the ARRB2-TEV-NIa protease fusion described
supra, and assays were carried out using 10 .mu.M dopamine HCl
(dopamine), an agonist for the DRD2 receptor. The maximal response
to dopamine resulted in an approximately 60-fold induction of
reporter gene expression over the background.
[0182] A construct was made which modified the ADRB2 receptor
coding region by inserting an Asp718 site at the 5' end and by
placing a BssHII site after Cys-341. This was done using PCR with
the primers
TABLE-US-00019 (SEQ ID NO: 49) gcggccgcca ccatgaacgg taccgaaggc cca
and (SEQ ID NO: 50) tgtgcgcgcg cacagaagct cctggaaggc.
[0183] The modified ADRB2 receptor coding region was cut with EcoRI
and BssHII and the AVPR2 C-terminal tail-low efficiency cleavage
site-tTA fragment was cut with BssHII and BamHI. Both fragments
were ligated into the AVPR2-low efficiency cleavage site-tTA
construct described supra, cut with EcoRI and BamHI. The resulting
construct is "ADRB2-AVPR2 Tail-tTA." (Also see published
application U.S. 2002/0106379, supra, SEQ ID NO: 3 in
particular.)
[0184] Clone 41 cells were co-transfected with the ADRB2-AVPR2
Tail-tTA fusion construct containing the low efficiency TEV
cleavage site and the ARRB2-TEV-NIa protease fusion described
supra, and assays were carried out using 10 .mu.M isoproterenol, an
agonist for the ADRB2 receptor. The maximal response to
isoproterenol resulted in an approximately 10-fold induction of
reporter gene expression over the background.
[0185] A construct was made which modified the kappa opioid
receptor (OPRK; Genbank Accession Number: NM.sub.--000912) (SEQ ID
NO: 51) coding region by placing a BssHII site after Cys-345. This
was done using PCR with the primers
TABLE-US-00020 (SEQ ID NO: 52) ggtctacttg atgaattcct ggcc and (SEQ
ID NO: 53) gcgcgcacag aagtcccgga aacaccg
[0186] The modified OPRK receptor coding region was cut with EcoRI
and BssHII and the AVPR2 C-terminal tail-low efficiency cleavage
site-tTA fragment was cut with BssHII and XhoI. Both fragments were
ligated into a plasmid containing the modified OPRK receptor
sequence, cloned into pcDNA3.1+ at Asp718 (5') and XhoI (3'), which
had been digested with EcoRI and XhoI.
[0187] Clone 41 cells were co-transfected with the OPRK-AVPR2
Tail-tTA fusion construct containing the low efficiency cleavage
site and the ARRB2-TEV-NIa protease fusion described supra, and
assays were carried out using 10 .mu.M U-69593, an agonist for the
OPRK. The maximal response to U-69593 resulted in an approximately
12-fold induction of reporter gene expression over the
background.
Example 17
[0188] This experiment was designed to demonstrate the use of the
assay to measure the activity of two test receptors simultaneously
using a multiplex format.
[0189] Clone 41 cells and "clone 1H10" cells, which are cells of an
HEK-293T cell line containing a stable integration of the
luciferase gene under the control of a tTA-dependent promoter, were
each plated on 24-well culture dishes and were transiently
transfected with the chimeric ADRB2-AVPR2 Tail-tTA or the
DRD2-AVPR2 Tail-tTA fusion constructs described supra,
respectively. Transient transfections were performed using 100
.mu.l of media, 0.4 .mu.g of DNA and 2 .mu.l of FuGene reagent per
well. After 24 hr of incubation, Clone 41 cells expressing
ADRB2-AVPR2 Tail-tTA and clone 1H10 cells expressing DRD2-AVPR2
Tail-tTA were trypsinized, mixed in equal amounts, and replated in
12 wells of a 96-well plate. Triplicate wells were incubated
without drug addition or were immediately treated with 1 .mu.M
isoproterenol, 1 .mu.M dopamine, or a mixture of both agonists at 1
.mu.M. Cells were assayed for reporter gene activity approximately
24 hours after ligand addition. Medium was discarded, cells were
lysed in 40 .mu.l lysis buffer [100 mM potassium phosphate pH 7.8,
0.2% Triton X-100] and the cell lysate was assayed for
beta-galactosidase and for luciferase activity using commercially
available luminescent detection reagents.
[0190] The results are presented in FIGS. 5A and 5B. Treatment with
isoproterenol resulted in an approximately seven-fold induction of
beta-galactosidase reporter gene activity, whereas luciferase
activity remained unchanged. Treatment with dopamine resulted in a
3.5-fold induction of luciferase activity, while beta-galactosidase
activity remained unchanged. Treatment with both isoproterenol and
dopamine resulted in seven-fold and three-fold induction of
beta-galactosidase and luciferase activity, respectively.
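The multiplex logic of this example, in which each receptor is paired with its own reporter, can be sketched as a small lookup. The fold values are those reported above, with "unchanged" encoded as 1.0; the calling threshold of 2.0 is a hypothetical choice for illustration and is not specified in the application.

```python
# Fold inductions reported in Example 17 ("unchanged" is encoded as 1.0).
RESULTS = {
    "isoproterenol": {"beta_gal": 7.0, "luciferase": 1.0},
    "dopamine": {"beta_gal": 1.0, "luciferase": 3.5},
    "both": {"beta_gal": 7.0, "luciferase": 3.0},
}

# Clone 41 cells (beta-galactosidase reporter) carry the ADRB2 chimera;
# clone 1H10 cells (luciferase reporter) carry the DRD2 chimera.
REPORTER_TO_RECEPTOR = {"beta_gal": "ADRB2", "luciferase": "DRD2"}


def active_receptors(folds, threshold=2.0):
    """Return receptors whose reporter meets the (hypothetical) threshold."""
    return sorted(
        REPORTER_TO_RECEPTOR[reporter]
        for reporter, fold in folds.items()
        if fold >= threshold
    )


if __name__ == "__main__":
    for treatment, folds in RESULTS.items():
        print(treatment, active_receptors(folds))
```

Because the two reporters are read from the same lysate, a single well reports on both receptors at once.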
Example 18
[0191] This experiment was designed to demonstrate the use of the
assay to measure the activity of two test receptors simultaneously
using a multiplex format.
[0192] "Clone 34.9" cells, which are a derivative of clone 41 cells
and contain a stably integrated ARRB2-TEV NIa protease fusion
protein gene, were transiently transfected with the chimeric
OPRK-AVPR2 Tail-TEV-NIa-Pro cleavage (Leu)-tTA fusion construct
described supra. In parallel, "clone HTL 5B8.1" cells, which are an
HEK-293T cell line containing a stably integrated luciferase gene
under the control of a tTA-dependent promoter, were transiently
transfected with the ADRB2-AVPR2 Tail-TEV-NIa-Pro cleavage (Leu)-tTA
fusion construct described supra. In each case 5.times.10.sup.5
cells were plated in each well of a 6-well dish, and cultured for
24 hours in DMEM supplemented with 10% fetal bovine serum, 2 mM
L-Glutamine, 100 units/ml penicillin, 500 .mu.g/ml G418, and 3
.mu.g/ml puromycin. Cells were transiently transfected with 100
.mu.l of DMEM, 0.5 .mu.g of OPRK-AVPR2 Tail-TEV-NIa-Pro cleavage
(Leu)-tTA DNA, and 2.5 .mu.l Fugene ("clone 34.9 cells") or with
100 .mu.l of DMEM, 0.5 .mu.g of ADRB2-AVPR2 Tail-TEV-NIa-Pro
cleavage (Leu)-tTA DNA, 0.5 .mu.g of ARRB2-TEV NIa Protease DNA and
5 .mu.l Fugene ("clone HTL 5B8.1 cells"). Transiently transfected
cells were cultured for about 24 hours, and were then trypsinized,
mixed in equal amounts and replated in wells of a 96 well plate.
Cells were incubated for 24 hours before treatment with 10 .mu.M
U-69593, 10 .mu.M isoproterenol or a mixture of both agonists at 10
.mu.M. Sixteen wells were assayed for each experimental condition.
After 24 hours, cells were lysed and the activities of both the
beta-galactosidase and luciferase reporter genes were assayed as
described supra. The results are presented in FIG. 6. Treatment
with U-69593 resulted in an approximately 15-fold induction of
beta-galactosidase reporter gene activity, whereas luciferase
activity remained unchanged. Treatment with isoproterenol resulted
in a 145-fold induction of luciferase activity, while
beta-galactosidase activity remained unchanged. Treatment with both
U-69593 and isoproterenol resulted in nine-fold and 136-fold
induction of beta-galactosidase and luciferase activity,
respectively.
Example 19
[0193] This experiment was carried out to demonstrate the use of a
different transcription factor and promoter in the assay of the
invention.
[0194] A fusion construct was created, comprising DNA encoding
AVPR2, fused in frame to a DNA sequence encoding the amino acid
linker GSENLYFQLR (SEQ ID NO: 54) which included the low efficiency
cleavage site for TEV NIa-Pro described supra, fused in frame to a
DNA sequence encoding amino acids 2-147 of the yeast GAL4 protein
(GenBank Accession Number P04386) (SEQ ID NO: 55) followed by a
linker, i.e., of the sequence PELGSASAELTMVF (SEQ ID NO: 56),
followed by amino acids 368-549 of the murine nuclear factor
kappa-B chain p65 protein (GenBank Accession Number A37932) (SEQ ID
NO: 57). The CMV promoter was placed upstream of the AVPR2 coding
region and a polyA sequence was placed downstream of the GAL4-NFkB
region. This construct was designated AVPR2-TEV-NIa-Pro cleavage
(Leu)-GAL4.
[0195] HUL 5C1.1 is a derivative of HEK-293T cells, which contain a
stably integrated luciferase reporter gene under the control of a
GAL4 upstream activating sequence (UAS), commercially available
pFR-LUC.
[0196] This AVPR2-TEV-NIa-Pro cleavage (Leu)-GAL4 plasmid was
co-transfected along with the .beta.-arrestin2-TEV NIa Protease
described supra into HUL 5C1.1 cells. About 2.5.times.10.sup.4
cells were plated into each well of a 96 well-plate, in DMEM medium
supplemented with 10% fetal bovine serum, 2 mM L-Glutamine, 100
units/ml penicillin, 500 .mu.g/ml G418, and 3 .mu.g/ml puromycin.
Cells were grown to reach 50% confluency the next day and were
transfected with 10 .mu.l per well of a mixture consisting of 85
.mu.l of DMEM, 0.1 .mu.g of AVPR2-TEV-NIa-Pro cleavage (Leu)-GAL4
DNA, 0.1 .mu.g of ARRB2-TEV NIa Protease DNA, and 1 .mu.l Fugene,
which had been incubated for 15 minutes at room temperature prior
to addition to the cells. Transfected cells were cultured for about
16 hours before treatment with 10 .mu.M vasopressin. After six
hours, cells were lysed and luciferase activity was assayed as
described supra. Under these conditions, treatment with vasopressin
resulted in a 180-fold increase in reporter gene activity.
Example 20
[0197] This set of experiments was carried out to demonstrate
enhancements of the assay using further receptor modifications that
are designed to increase the affinity for the interacting protein.
In this example, the C-terminal tail domain of the test receptor is
replaced with the corresponding tail domain of one of the following
receptors: apelin J receptor--AGTRL1 (accession number:
NM.sub.--005161) (SEQ ID NO: 58), gastrin-releasing peptide
receptor--GRPR (accession number: NM.sub.--005314) (SEQ ID NO: 59),
proteinase-activated receptor 2--F2RL1 (accession number:
NM.sub.--005242) (SEQ ID NO: 60), CCR4 (accession number:
NM.sub.--005508) (SEQ ID NO: 61), chemokine (C-X-C motif) receptor
4--CXCR4 (accession number: NM.sub.--003467) (SEQ ID NO: 62), and
interleukin 8 receptor, beta--CXCR2/IL8b (accession number:
NM.sub.--001557) (SEQ ID NO: 63).
[0198] First, PCR was used to produce a DNA fragment encoding the
C-terminal tail of the above receptors. These fragments were
designed such that the first two amino acids (Ala, A and Arg, R)
are encoded by the BssHII restriction site.
[0199] The AGTRL1 C-terminal fragment was amplified with the
primers
TABLE-US-00021 (SEQ ID NO: 64) tgtgcgcgcg gccagagcag gtgcgca and
(SEQ ID NO: 65) gaggatccgt caaccacaag ggtctc.
[0200] The GRPR C-terminal fragment was amplified with the
primers
TABLE-US-00022 (SEQ ID NO: 66) tgtgcgcgcg gcctgatcat ccggtct and
(SEQ ID NO: 67) gaggatccga cataccgctc gtgaca.
[0201] The F2RL1 C-terminal fragment was amplified with the
primers
TABLE-US-00023 (SEQ ID NO: 68) tgtgcgcgca gtgtccgcac tgtaaagc and
(SEQ ID NO: 69) gaggatccat aggaggtctt aacagt.
[0202] The CCR4 C-terminal fragment was amplified with the
primers
TABLE-US-00024 (SEQ ID NO: 70) tgtgcgcgcg gcctttttgt gctctgc and
(SEQ ID NO: 71) gaggatccca gagcatcatg aagatc.
[0203] The CXCR2/IL8b C-terminal fragment was amplified with the
primers
TABLE-US-00025 (SEQ ID NO: 72) tgtgcgcgcg gcttgatcag caagggac and
(SEQ ID NO: 73) gaggatccga gagtagtgga agtgtg.
[0204] The CXCR4 C-terminal fragment was amplified with the
primers
TABLE-US-00026 (SEQ ID NO: 74) tgtgcgcgcg ggtccagcct caagatc and
(SEQ ID NO: 75) gaggatccgc tggagtgaaa acttga.
[0205] The resulting DNA fragments encoding the modified C-terminal
tail domains of these receptors were cut with BssHII and BamHI and
the fragments were ligated in frame to the OPRK receptor coding
region, replacing the AVPR2-C-terminal tail fragment, in the
OPRK-AVPR2 Tail-TEV-NIa-Pro cleavage (Leu)-tTA expression construct
described supra.
[0206] HTL 5B8.1 cells described supra were co-transfected with
each of the above modified OPRK coding region--TEV-NIa-Pro cleavage
(Leu)--tTA constructs and the .beta.-arrestin 2--TEV NIa protease
fusion described supra. About 2.5.times.10.sup.4 cells per well
were plated onto a 96 well-plate, in DMEM medium supplemented with
10% fetal bovine serum, 2 mM L-Glutamine, 100 units/ml penicillin,
500 .mu.g/ml G418, and 3 .mu.g/ml puromycin. Cells were grown to
reach 50% confluency the next day and were transfected with 10
.mu.l per well of a mixture consisting of 85 .mu.l of DMEM, 0.25
.mu.g of AVPR2-TEV-NIa-Pro cleavage (Leu)-GAL4 DNA, 0.25 .mu.g of
ARRB2-TEV NIa protease DNA, and 2.5 .mu.l Fugene (a proprietary
transfection reagent containing lipids and other material), which
had been incubated for 15 minutes at room temperature prior to
addition to the cells. Transfected cells were cultured for about 16
hours before treatment with 10 .mu.M U-69593. After six hours, cells
were lysed and luciferase activity was assayed as described supra.
Under these conditions, treatment with U-69593 resulted in the
following relative increases in reporter gene activity for each of
the modified OPRK receptors: OPRK-AGTRL1 C-terminal tail--30 fold;
OPRK-GRPR C-terminal tail--312 fold; OPRK-F2RL1 C-terminal
tail--69.5 fold; OPRK-CCR4 C-terminal tail--3.5 fold;
OPRK-CXCR4 C-terminal tail--9.3 fold; OPRK-IL8b C-terminal tail--113
fold.
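To compare the tail substitutions at a glance, the reported fold increases can be sorted from strongest to weakest response. The sketch below uses only the values given in this example; the dictionary and function names are illustrative.

```python
# Fold increases reported for each OPRK chimera, keyed by donor tail.
FOLD_BY_TAIL = {
    "AGTRL1": 30.0,
    "GRPR": 312.0,
    "F2RL1": 69.5,
    "CCR4": 3.5,
    "CXCR4": 9.3,
    "IL8b": 113.0,
}


def rank_tails(fold_by_tail):
    """Return (tail, fold) pairs ordered from highest to lowest fold increase."""
    return sorted(fold_by_tail.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for tail, fold in rank_tails(FOLD_BY_TAIL):
        print(f"{tail:7s} {fold:6.1f}-fold")
```

Sorted this way, the GRPR tail gives the largest response in this series and the CCR4 tail the smallest.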
Example 21
[0207] This experiment was designed to produce a cell line that
stably expressed the ARRB2-TEV NIa protease fusion protein
described supra.
[0208] A plasmid was made which expressed the ARRB2-TEV NIa
protease fusion protein under the control of the EF1.alpha.
promoter and also expressed the hygromycin resistance gene under
the control of the thymidine kinase (TK) promoter.
[0209] This plasmid was transfected into HTL 5B8.1, and clones
containing a stable genomic integration of the plasmid were
selected by culturing in the presence of 100 .mu.g/ml hygromycin.
Resistant clones were isolated and expanded and were screened by
transfection of the ADRB2-AVPR2 Tail-TEV-NIa-Pro cleavage (Leu)-tTA
plasmid described supra. Three cell lines that were selected using
this procedure were designated "HTLA 4C2.10", "HTLA 2C11.6" and
"HTLA 5D4". About 2.5.times.10.sup.4 cells per well were plated
onto a 96 well-plate, in DMEM medium supplemented with 10% fetal
bovine serum, 2 mM L-Glutamine, 100 units/ml penicillin, 500
.mu.g/ml G418, 3 .mu.g/ml puromycin, and 100 .mu.g/ml hygromycin.
Cells were grown to reach 50% confluency the next day and were
transfected with 10 .mu.l per well of a mixture consisting of 85
.mu.l of DMEM, 0.25 .mu.g of ADRB2-AVPR2-TEV-NIa-Pro cleavage
(Leu)-GAL4 DNA and 0.5 .mu.l Fugene, which had been incubated for
15 minutes at room temperature prior to addition to the cells.
Transfected cells were cultured for about 16 hours before treatment
with 10 .mu.M isoproterenol. After six hours, cells were lysed and
luciferase activity was assayed as described supra. Under these
conditions, treatment with isoproterenol resulted in a 112-fold
("HTLA 4C2.10"), 56-fold ("HTLA 2C11.6") and 180-fold ("HTLA 5D4")
increase in reporter gene activity in the three cell lines,
respectively.
Example 22
[0210] This experiment was designed to produce a cell line that
stably expressed the ARRB2-TEV NIa protease and the ADRB2-AVPR2
Tail-TEV-NIa-Pro cleavage (Leu)-tTA fusion proteins described
supra.
[0211] The ARRB2-TEV NIa protease plasmid containing the hygromycin
resistance gene was transfected together with the ADRB2-AVPR2
Tail-TEV-NIa-Pro cleavage (Leu)-tTA fusion protein plasmid
described supra into HTL 5B8.1 cells and clones containing stable
genomic integration of the plasmids were selected by culturing in
the presence of 100 .mu.g/ml hygromycin. Resistant clones were
isolated and expanded, and were screened by treating with 10 .mu.M
isoproterenol and measuring the induction of reporter gene activity
as described supra. Three cell lines that were selected using this
procedure were designated "HTLAR 1E4", "HTLAR 1C10" and "HTLAR
2G2". Treatment with isoproterenol for 6 hours resulted in a
208-fold ("HTLAR 1E4"), 197-fold ("HTLAR 1C10") and 390-fold
("HTLAR 2G2") increase in reporter gene activity in the three cell
lines, respectively.
Example 23
[0212] This experiment was designed to demonstrate the use of the
assay to measure the activity of the receptor tyrosine kinase
epidermal growth factor receptor (EGFR).
[0213] A first fusion construct was created, comprising DNA
encoding the human EGFR, which can be found at GenBank under the
Accession Number NM.sub.--005228 (SEQ ID NO: 76), fused in frame to
a DNA sequence encoding amino acids 3-335 of the
tetracycline-controlled transactivator tTA, described supra.
Inserted between these sequences is a DNA sequence encoding the
amino acid sequence GGSGSENLYFQL (SEQ ID NO: 77) which includes the
low efficiency cleavage site for TEV NIa-Pro, ENLYFQL (SEQ ID NO:
14), described supra. The CMV promoter was placed upstream of the
Epidermal Growth Factor Receptor coding region, and a polyA
sequence was placed downstream of the tTA region. This construct is
designated EGFR-TEV-NIa-Pro cleavage (Leu)-tTA.
[0214] A second fusion construct was created, comprising DNA
encoding the two SH2 domains of human Phospholipase C Gamma 1,
corresponding to amino acids 538-759 (GenBank accession number
NP.sub.--002651.2) (SEQ ID NO: 78) fused in frame to a DNA sequence
encoding the catalytic domain of mature TEV NIa protease, described
supra, corresponding to amino acids 2040-2279 (GenBank accession
number AAA47910) (SEQ ID NO: 79). Inserted between these sequences
is a linker DNA sequence encoding the amino acids NSSGGNSGS (SEQ ID
NO: 80). The CMV promoter was placed upstream of the PLC-Gamma SH2
domain coding sequence and a polyA sequence was placed downstream
of the TEV NIa protease sequence. This construct is designated PLC
Gamma1-TEV.
[0215] The EGFR-TEV-NIa-Pro cleavage (Leu)-tTA and PLC Gamma1-TEV
fusion constructs were transfected into clone HTL5B8.1 cells
described supra. About 2.5.times.10.sup.4 cells were plated into
each well of a 96 well-plate, in DMEM medium supplemented with 10%
fetal bovine serum, 2 mM L-Glutamine, 100 units/ml penicillin, 500
.mu.g/ml G418, and 3 .mu.g/ml puromycin. Cells were grown to reach
50% confluency the next day and were transfected with 15 .mu.l per
well of a mixture consisting of 100 .mu.l of DMEM, 0.4 .mu.g of
pcDNA3 DNA ("carrier" vector DNA), 0.04 .mu.g of EGFR-TEV-NIa-Pro
cleavage (Leu)-tTA DNA, 0.04 .mu.g of PLC Gamma1-TEV DNA, and 2
.mu.l Fugene (a proprietary transfection reagent containing lipids
and other material), which had been incubated for 15 minutes at
room temperature prior to addition to the cells. Transfected cells
were cultured for about 16 hours before treatment with specified
receptor agonists and inhibitors. After six hours, cells were lysed
and luciferase activity was assayed as described supra. Results are
shown in FIG. 7.
[0216] The addition of 2.5 ng/ml human Epidermal Growth Factor
(corresponding to the EC80 for this ligand) resulted in a 12.3 fold
increase of luciferase reporter gene activity, while addition of
100 ng/ml human Transforming Growth Factor--Alpha resulted in an
18.3 fold increase. Prior treatment with tyrosine kinase inhibitors
(70 .mu.M AG-494; 0.3 .mu.M AG-1478; 2 mM RG-130022) before
addition of human Epidermal Growth Factor blocked the induction of
reporter gene activity.
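The application refers to an EC80 without saying how it was derived. One common approach, shown here purely as a sketch, is to assume a Hill-type dose-response curve and solve it for the concentration that gives a chosen fraction of the maximal response. The Hill model, the default slope of 1, and the function name are all assumptions; no EC50 or slope is given in the application.

```python
def concentration_for_fraction(ec50, fraction, hill_slope=1.0):
    """Concentration giving `fraction` of the maximal response under a Hill model.

    Solves fraction = C**n / (EC50**n + C**n) for C, where n is the Hill slope.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    if ec50 <= 0 or hill_slope <= 0:
        raise ValueError("ec50 and hill_slope must be positive")
    return ec50 * (fraction / (1.0 - fraction)) ** (1.0 / hill_slope)


if __name__ == "__main__":
    # Hypothetical EC50 of 1.0 (arbitrary concentration units), slope of 1.
    print(concentration_for_fraction(1.0, 0.8))
```

With a slope of 1, the EC80 under this model is four times the EC50; steeper slopes bring the two values closer together.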
Example 24
[0217] This experiment was designed to demonstrate the use of the
assay to measure the activity of the human Type I Interferon
Receptor.
[0218] A fusion construct was created, comprising DNA encoding
human Interferon Receptor I (IFNAR1) (557 amino acids), which can
be found in Genbank under Accession Number NM.sub.--000629 (SEQ ID
NO: 81), fused in frame to a DNA sequence encoding amino acids
3-335 of the tetracycline controlled transactivator tTA, described
supra. Inserted between these sequences is a DNA sequence encoding
the amino acid sequence GSENLYFQL (SEQ ID NO: 82) which includes
the low efficiency cleavage site for TEV NIa-Pro, ENLYFQL (SEQ ID
NO: 14), described supra. The CMV promoter was placed upstream of
the Human Interferon Receptor I (IFNAR1) coding region, and a poly
A sequence was placed downstream of the tTA region. This construct
is designated IFNAR1-TEV-NIa-Pro cleavage (L)-tTA.
[0219] A second fusion construct was created, using DNA encoding
Human Interferon Receptor 2, splice variant 2 (IFNAR2.2) (515 amino
acids), which can be found at Genbank, under Accession Number
L41942 (SEQ ID NO: 83), fused in frame to a DNA sequence encoding
the catalytic domain of the TEV NIa protease, described supra,
corresponding to amino acids 2040-2279 (GenBank accession number
AAA47910) (SEQ ID NO: 84). Inserted between these sequences is a
DNA sequence encoding the amino acid sequence RS (Arg-Ser). The CMV
promoter region was placed upstream of the Human Interferon
Receptor 2 (IFNAR2.2) coding region, and a poly A sequence was
placed downstream of the TEV region. This construct is designated
IFNAR2.2-TEV.
[0220] Expression constructs were also generated in which the genes
for Human Signal Transducer and Activator of Transcription 1
(STAT1), found in Genbank, under Accession Number NM.sub.--007315
(SEQ ID NO: 85), Human Signal Transducer and Activator of
Transcription 2 (STAT2) found in Genbank, under Accession Number
NM.sub.--005419 (SEQ ID NO: 86), were expressed under the control
of the CMV promoter region. These constructs were designated
CMV-STAT1 and CMV-STAT2 respectively.
[0221] The IFNAR1-TEV-NIa-Pro cleavage (L)-tTA and IFNAR2.2-TEV
fusion constructs, together with CMV-STAT1 and CMV-STAT2 were
transiently transfected into HTL5B8.1 cells described supra. About
2.5.times.10.sup.4 cells were seeded in each well of a 96 well
plate and cultured in DMEM medium supplemented with 10% fetal
bovine serum, 2 mM L-glutamine, 100 units/ml penicillin, 100
.mu.g/ml G418, and 5 .mu.g/ml puromycin. After 24 hours of
incubation, cells were transfected with 15 ng of each
IFNAR1-TEV-NIa-Pro cleavage (L)-tTA, IFNAR2.2-TEV, CMV-STAT1 and
CMV-STAT2 DNA, or with 60 ng control pcDNA plasmid, together with
0.3 .mu.l Fugene per well. Transfected cells were cultured for 8-20
hours before treatment with 5000 U/ml human interferon-alpha or
5000 U/ml human interferon-beta. At the time of interferon
addition, medium was aspirated and replaced with 293 SFM II media
supplemented with 2 mM L-glutamine, 100 units/ml penicillin, 3
.mu.g/ml puromycin and 500 .mu.g/ml of G418. Interferon-treated
cells were cultured for an additional 18-20 hours before they were
assayed for luciferase reporter gene activity as described supra.
Results are shown in FIG. 8. Treatment with 5000 U/ml IFN-.alpha.
resulted in a 15-fold increase in reporter gene activity, while
treatment with 5000 U/ml IFN-.beta. resulted in a 10-fold increase.
Interferon treatment of HTL5B8.1 cells transfected with the control
plasmid pcDNA3 had no effect on reporter gene activity. FIG. 9
shows a dose-response curve generated for IFN-.alpha. in HTL5B8.1
cells transfected with IFNAR1(ENLYFQ(L))-tTA, IFNAR2.2-TEV, STAT1
and STAT2 expression constructs as described supra.
Example 25
[0222] This experiment was designed to demonstrate the use of the
assay to measure the activity of the human Type I Interferon
Receptor using a different transcription factor and a different
cell line.
[0223] A fusion construct was created, using DNA encoding Human
Interferon Receptor I (IFNAR1), fused in frame to a DNA sequence
encoding the GAL4-NF-.kappa.B-fusion, described supra. Inserted
between these sequences is a DNA sequence encoding the amino acid
sequence GSENLYFQL (SEQ ID NO: 87), which includes the low
efficiency cleavage site for TEV NIa-Pro, ENLYFQL (SEQ ID NO: 14),
described supra. The CMV promoter was placed upstream of the Human
Interferon Receptor I (IFNAR1) coding region, and a poly A sequence
was placed downstream of the GAL4-NF-.kappa.B region. This
construct is designated IFNAR1-TEV-NIa-Pro cleavage
(L)-GAL4-NF-.kappa.B.
[0224] CHO-K1 cells were then transiently transfected with a
mixture of five plasmids: IFNAR1-TEV-NIa-Pro cleavage
(L)-GAL4-NF-.kappa.B, IFNAR2.2-TEV, CMV-STAT1, CMV-STAT2 and
pFR-Luc, a luciferase reporter gene plasmid under the control of a
GAL4-dependent promoter. About 1.0.times.10.sup.4 cells per well
were seeded in a 96 well plate 24 hours prior to transfections in
DMEM medium supplemented with 10% fetal bovine serum, 2 mM
L-glutamine, 100 units/ml penicillin. Cells were transfected the
following day with 10 ng of reporter plasmid (pFR-Luc), plus 20 ng
of each of the expression constructs described supra, or with 10 ng
reporter plasmid plus 80 ng of control pcDNA3 plasmid, together
with 0.3 .mu.l Fugene per well. Transfected cells were cultured for
8-20 hours before treatment with 5000 U/ml human interferon-alpha.
At the time of interferon addition, medium was aspirated and
replaced with DMEM media supplemented with 2 mM L-glutamine, 100
units/ml penicillin. Interferon-treated cells were cultured for an
additional 6 hours before they were assayed for luciferase reporter
gene activity as described supra.
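The plasmid amounts in this transfection are chosen so that experimental and control wells receive the same total DNA mass. A minimal sketch of that bookkeeping, using the per-well amounts stated above, follows; the variable and function names are illustrative.

```python
REPORTER_NG = 10  # pFR-Luc per well
NG_PER_EXPRESSION_CONSTRUCT = 20
EXPRESSION_CONSTRUCTS = [
    "IFNAR1-TEV-NIa-Pro cleavage (L)-GAL4-NF-kB",
    "IFNAR2.2-TEV",
    "CMV-STAT1",
    "CMV-STAT2",
]
CONTROL_PCDNA3_NG = 80


def total_dna_ng(reporter_ng, other_plasmids_ng):
    """Total DNA mass per well: reporter plus all co-transfected plasmids."""
    return reporter_ng + sum(other_plasmids_ng)


if __name__ == "__main__":
    experimental = total_dna_ng(
        REPORTER_NG, [NG_PER_EXPRESSION_CONSTRUCT] * len(EXPRESSION_CONSTRUCTS)
    )
    control = total_dna_ng(REPORTER_NG, [CONTROL_PCDNA3_NG])
    print(experimental, control)
```

Matching the totals keeps the DNA-to-Fugene ratio the same in both conditions, so a difference in reporter signal is less likely to reflect transfection load alone.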
[0225] Results are shown in FIG. 10. IFN-.alpha. treatment of
CHO-K1 cells transfected with the reporter, IFNAR and STAT
constructs resulted in a 3-fold increase in reporter gene activity,
while interferon treatment of cells transfected with the reporter
and control plasmids had no effect on reporter gene activity.
Example 26
[0226] This set of experiments was carried out to demonstrate
additional enhancements of the assay using receptor modifications
designed to increase the affinity of the test receptor for the
interacting protein. In these examples, the fusion junction between
the test receptor and a C-terminal tail domain of GRPR (Genbank
Accession Number: NM.sub.--005314) (SEQ ID NO: 59) was made 17-23
amino acids after the conserved NPXXY motif at the end of the
seventh transmembrane helix.
[0227] First, PCR was used to produce a DNA fragment encoding the
C-terminal 42 amino acids from GRPR beginning 2 amino acids after
the putative palmitoylation site (hereafter referred to as GRPR
42aa). The fragment was designed such that the first amino acid of
the C-terminal tail is preceded by two amino acids (Ser, S and Arg,
R) which are encoded by the XbaI restriction site TCTAGA, and the
stop codon is replaced by two amino acids (Gly, G and Ser, S) which
are encoded by a BamHI restriction site GGATCC. This was
accomplished by amplifying a plasmid containing the GRPR coding
region with primers
TABLE-US-00027 (SEQ ID NO: 88) tctagaggcctgatcatccggtctcac and (SEQ
ID NO: 67) gaggatccgacataccgctcgtgaca
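Because SEQ ID NO: 67 is a reverse primer, the Gly-Ser codons contributed by the BamHI site are easiest to see on the sense strand. The sketch below takes the reverse complement of the primer and translates it with the standard genetic code, assuming the reading frame starts at the first base of the reverse complement (an assumption made for illustration). It also translates the XbaI site TCTAGA. Helper names are hypothetical.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, upper-cased."""
    bases = "".join(dna.split()).upper()
    return bases.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate from the first base, ignoring any trailing partial codon."""
    bases = "".join(dna.split()).upper()
    usable = len(bases) - len(bases) % 3
    return "".join(CODON_TABLE[bases[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    reverse_primer = "gaggatccgacataccgctcgtgaca"  # SEQ ID NO: 67
    sense = reverse_complement(reverse_primer)
    print(sense, translate(sense))
    print(translate("TCTAGA"))  # XbaI site
```

Under the stated frame assumption, the last two complete codons on the sense strand are GGA and TCC, the Gly-Ser pair described above, and the XbaI site reads as Ser-Arg.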
[0228] Next, the coding region of OPRK (Genbank Accession Number:
NM.sub.--000912) (SEQ ID NO: 51) was modified to insert an
XbaI site after Pro-347. This was done using PCR with the
primers
TABLE-US-00028 (SEQ ID NO: 52) ggtctacttgatgaattcctggcc and (SEQ ID
NO: 89) tctagatggaaaacagaagtcccggaaac
[0229] In addition, the coding region of ADRA1A (Genbank Accession
Number: NM.sub.--000680) (SEQ ID NO: 90) was modified to insert an
XbaI site after Lys-349. This was done using PCR with the
primers
TABLE-US-00029 (SEQ ID NO: 91) ctcggatatctaaacagctgcatcaa and (SEQ
ID NO: 92) tctagactttctgcagagacactggattc
[0230] In addition, the coding region of DRD2 (Genbank Accession
Number: NM.sub.--000795) (SEQ ID NO: 37) was modified to insert two
amino acids (Leu and Arg) and an XbaI site after Cys-343. This was
done using PCR with the primers
TABLE-US-00030 (SEQ ID NO: 38) gaattcatggatccactgaatctgtcc and (SEQ
ID NO: 93) tctagatcgaaggcagtggaggatcttcagg
[0231] The modified OPRK receptor coding region was cut with EcoRI
and XbaI and the GRPR 42aa C-terminal tail fragment was cut with
XbaI and BamHI. Both fragments were ligated into a plasmid
containing the OPRK receptor with the AVPR2 C-terminal
tail-low-efficiency cleavage site-tTA described supra which had
been digested with EcoRI and BamHI.
[0232] The modified ADRA1A receptor coding region was cut with
EcoRV and XbaI and the OPRK-GRPR 42aa Tail-tTA fusion construct
containing the low efficiency cleavage site was cut with XbaI and
XhoI. Both fragments were ligated into a plasmid containing the
ADRA1A receptor which had been digested with EcoRV and XhoI.
[0233] The modified DRD2 receptor coding region was cut with EcoRI
and XbaI and the OPRK-GRPR 42aa Tail-tTA fusion construct
containing the low efficiency cleavage site was cut with XbaI and
XhoI. Both fragments were ligated into a pcDNA6 plasmid digested
with EcoRI and XhoI.
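The restriction sites relied on in the cloning steps above can be located in the primers with a short scan. The following is a minimal sketch, not part of the described method: the primer sequences and enzyme names are taken from the text, the recognition sequences are the standard ones for these enzymes, and the names `SITES`, `PRIMERS`, and `find_sites` are introduced here only for illustration.

```python
# Recognition sequences of the enzymes named in the cloning steps.
# All five sites are palindromic, so scanning one strand is sufficient.
SITES = {
    "EcoRI": "GAATTC",
    "EcoRV": "GATATC",
    "XbaI": "TCTAGA",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
}

# Primer sequences as printed in the text, keyed by SEQ ID NO.
PRIMERS = {
    52: "ggtctacttgatgaattcctggcc",         # OPRK primer pair
    89: "tctagatggaaaacagaagtcccggaaac",
    91: "ctcggatatctaaacagctgcatcaa",       # ADRA1A primer pair
    92: "tctagactttctgcagagacactggattc",
    38: "gaattcatggatccactgaatctgtcc",      # DRD2 primer pair
    93: "tctagatcgaaggcagtggaggatcttcagg",
}


def find_sites(seq):
    """Return {enzyme: [0-based start positions]} for every site found in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)  # allow overlapping matches
        if positions:
            hits[enzyme] = positions
    return hits


for seq_id, primer in PRIMERS.items():
    print(f"SEQ ID NO: {seq_id}: {find_sites(primer)}")
```

Under these assumptions, each second primer of a pair begins with the XbaI site, and the first primers carry the EcoRI or EcoRV site used in the corresponding digest. The DRD2 primer of SEQ ID NO: 38 also contains a GGATCC run, which comes from the first codons of the coding sequence (atg gat cca) rather than from an added site.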
[0234] HTLA 2C11.6 cells, described supra, were transfected with
OPRK-GRPR 42aa Tail-tTA fusion construct containing the low
efficiency cleavage site and assays were carried out using 10 .mu.M
U-69593, an agonist for OPRK. The maximal response to U-69593
resulted in an approximately 200-fold increase in reporter gene
activity.
[0235] HTLA 2C11.6 cells were transfected with ADRA1A-GRPR 42aa
Tail-tTA fusion construct containing the low efficiency cleavage
site and assays were carried out using 10 .mu.M epinephrine, an
agonist for ADRA1A. The maximal response to epinephrine resulted in
an approximately 14-fold increase in reporter gene activity.
[0236] HTLA 2C11.6 cells were transfected with DRD2-GRPR 42aa
Tail-tTA fusion construct containing the low efficiency cleavage
site and assays were carried out using 10 .mu.M dopamine, an
agonist for DRD2. The maximal response to dopamine resulted in an
approximately 30-fold increase in reporter gene activity.
Example 27
[0237] This set of experiments was carried out to demonstrate
further enhancements of the assay using a different set of test
receptor modifications designed to increase the affinity for the
interacting protein. In these examples, the C-terminal domain of
the test receptor was replaced with a portion of the endogenous
C-terminal tail domain of GRPR.
[0238] First, PCR was used to produce a DNA fragment encoding the
truncated GRPR tail, specifically a sequence encoding 23 amino
acids from Gly-343 to Asn-365. The fragment was designed such that
the first amino acid of the C-terminal tail is preceded by two
amino acids (Ser, S and Arg, R) which are encoded by the XbaI
restriction site TCTAGA, and Ser-366 is replaced by two amino
acids (Gly, G and Ser, S) which are encoded by a BamHI restriction
site GGATCC. This was accomplished by amplifying a plasmid
containing the GRPR coding region with primers
TABLE-US-00031
(SEQ ID NO: 94) tctagaggcctgatcatccggtctcac and
(SEQ ID NO: 95) cggatccgttggtactcttgagg
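The statement that the XbaI and BamHI sites encode Ser-Arg and Gly-Ser can be checked by translating the primers with the standard genetic code. The sketch below is illustrative only: the helper names `translate` and `reverse_complement` are introduced here, and the reading frame of the second primer is inferred by aligning on the GGATCC site, which the text says is read as Gly-Ser.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA in frame 0; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


first_primer = "tctagaggcctgatcatccggtctcac"   # SEQ ID NO: 94
second_primer = "cggatccgttggtactcttgagg"      # SEQ ID NO: 95

print(translate("TCTAGA"), translate("GGATCC"))  # the two restriction sites
print(translate(first_primer))

# The second primer anneals to the opposite strand, so read its reverse
# complement, choosing the frame in which GGATCC falls on codon boundaries.
coding = reverse_complement(second_primer)
frame = coding.find("GGATCC") % 3
print(translate(coding[frame:]))
```

Read this way, the first primer begins Ser-Arg-Gly, matching the two added residues followed by the Gly that starts the tail, and the second primer ends Asn-Gly-Ser, matching Asn-365 followed by the two residues encoded by the BamHI site.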
[0239] Next the truncated GRPR fragment (hereafter referred to as
GRPR 23aa Tail) was cut with XbaI and BamHI and inserted into the
OPRK-GRPR 42aa Tail-tTA fusion construct containing the low
efficiency cleavage site described herein, digested with XbaI and
BamHI.
[0240] Similarly, the GRPR 23aa Tail fragment was cut with XbaI and
BamHI and inserted into the ADRA1A-GRPR 42aa Tail-tTA fusion
construct containing the low efficiency cleavage site described
herein, digested with XbaI and BamHI.
[0241] HTLA 2C11.6 cells were transfected with OPRK-GRPR 23aa
Tail-tTA fusion construct containing the low efficiency cleavage
site and assays were carried out using 10 .mu.M U-69593, an agonist
for OPRK. The maximal response to U-69593 resulted in an
approximately 115-fold induction of reporter gene expression over
the background.
[0242] HTLA 2C11.6 cells were transfected with ADRA1A-GRPR 23aa
Tail-tTA fusion construct containing the low efficiency cleavage
site and assays were carried out using 10 .mu.M epinephrine, an
agonist for ADRA1A. The maximal response to epinephrine resulted in
an approximately 102-fold induction of reporter gene expression
over the background.
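The fold-induction values reported for the 42aa and 23aa GRPR tail constructs can be placed side by side. The following is a small sketch using only the approximate values stated in the text; the dictionary layout and the function name `tail_ratio` are assumptions made for illustration.

```python
# Approximate fold-induction values as reported in the text for each
# receptor fused to the 42aa or the 23aa GRPR tail.
REPORTED = {
    "OPRK": {"42aa": 200, "23aa": 115},
    "ADRA1A": {"42aa": 14, "23aa": 102},
    "DRD2": {"42aa": 30},  # no 23aa result is reported for DRD2
}


def tail_ratio(receptor):
    """Return the 23aa/42aa fold-induction ratio, or None if either is missing."""
    values = REPORTED[receptor]
    if "42aa" not in values or "23aa" not in values:
        return None
    return values["23aa"] / values["42aa"]


for receptor in REPORTED:
    ratio = tail_ratio(receptor)
    label = "n/a" if ratio is None else f"{ratio:.2f}"
    print(f"{receptor}: 23aa/42aa = {label}")
```

On these figures the shorter tail gives roughly a seven-fold larger response for ADRA1A and a somewhat smaller response for OPRK. Since the values are single approximate numbers, the ratios are indicative only.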
Example 28
[0243] This experiment was designed to demonstrate the use of the
assay to measure the activity of the receptor tyrosine kinase
Insulin-like Growth Factor-1 Receptor (IGF1R), specifically by
monitoring the ligand-induced recruitment of the intracellular
signaling protein SHC1 (Src homology 2 domain-containing
transforming protein 1).
[0244] A first fusion construct was created, comprising DNA
encoding the human IGF-1R, which can be found at GenBank under the
Accession Number NM.sub.--000875 (SEQ ID NO: 96), fused in frame to
a DNA sequence encoding amino acids 3-335 of the
tetracycline-controlled transactivator tTA, described supra.
Inserted between these sequences is a DNA sequence encoding the
amino acid sequence GSENLYFQL (SEQ ID NO: 82) which includes the
low efficiency cleavage site for TEV NIa-Pro, ENLYFQL (SEQ ID NO:
14), described supra. The CMV promoter was placed upstream of the
IGF1R coding region, and a polyA sequence was placed downstream of
the tTA region. This construct is designated IGF1R-TEV-NIa-Pro
cleavage (Leu)-tTA.
[0245] A second fusion construct was created, comprising DNA
encoding the PTB domain of human SHC1, corresponding to amino acids
1-238 (GenBank accession number BC014158) (SEQ ID NO: 97) fused in
frame to a DNA sequence encoding the catalytic domain of mature TEV
NIa protease, described supra, corresponding to amino acids
2040-2279 (GenBank accession number AAA47910) (SEQ ID NO: 79).
Inserted between these sequences is a linker DNA sequence encoding
the amino acids NSGS (SEQ ID NO: 98). The CMV promoter was placed
upstream of the SHC1 PTB domain coding sequence and a polyA
sequence was placed downstream of the TEV NIa protease sequence.
This construct is designated SHC1-TEV.
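The layout of the two fusion constructs just described can be written down as ordered lists of parts, which makes the stated residue ranges and linker sequences easy to check. This is a sketch only: the `Part` class, the tuple names, and `describe` are hypothetical and introduced here for illustration, while the residue ranges and peptide sequences are those given in the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Part:
    name: str
    residues: Optional[Tuple[int, int]] = None  # inclusive range, if stated
    peptide: Optional[str] = None               # one-letter sequence, if stated

    def length(self):
        """Length in amino acids, or None when the text gives no range."""
        if self.peptide is not None:
            return len(self.peptide)
        if self.residues is not None:
            first, last = self.residues
            return last - first + 1
        return None


LOW_EFFICIENCY_SITE = "ENLYFQL"  # SEQ ID NO: 14

IGF1R_TTA = (
    Part("IGF1R"),                        # full coding region, NM_000875
    Part("linker", peptide="GSENLYFQL"),  # SEQ ID NO: 82
    Part("tTA", residues=(3, 335)),
)

SHC1_TEV = (
    Part("SHC1 PTB domain", residues=(1, 238)),
    Part("linker", peptide="NSGS"),       # SEQ ID NO: 98
    Part("TEV NIa protease", residues=(2040, 2279)),
)


def describe(construct):
    """Return the part names joined in N- to C-terminal order."""
    return " - ".join(part.name for part in construct)


for construct in (IGF1R_TTA, SHC1_TEV):
    print(describe(construct), [part.length() for part in construct])
```

With these ranges the tTA portion is 333 residues and the protease domain is 240 residues, and the linker of the first construct contains the low efficiency cleavage site in full.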
[0246] The IGF1R-TEV-NIa-Pro cleavage (Leu)-tTA and SHC1-TEV fusion
constructs were transfected into clone HTL5B8.1 cells described
supra. About 2.5.times.10.sup.4 cells were plated into each well of
a 96-well plate, in DMEM medium supplemented with 10% fetal bovine
serum, 2 mM L-Glutamine, 100 units/ml penicillin, 500 .mu.g/ml
G418, and 3 .mu.g/ml puromycin. Cells were grown to reach 50%
confluency the next day and were transfected with 15 .mu.l per well
of a mixture consisting of 100 .mu.l of DMEM, 0.2 .mu.g of
IGF1R-TEV-NIa-Pro cleavage (Leu)-tTA DNA, 0.2 .mu.g of SHC1-TEV
DNA, and 2 .mu.l Fugene (a proprietary transfection reagent
containing lipids and other material), which had been incubated for
15 minutes at room temperature prior to addition to the cells.
Transfected cells were cultured for about 16 hours before treatment
with a specific receptor agonist. After 24 hours, cells were lysed
and luciferase activity was assayed as described supra.
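The amount of each DNA delivered per well follows from the fraction of the transfection mixture that is added. The arithmetic is sketched below under two assumptions that are not stated in the text: the Fugene volume is taken as 2 microliters, and the volume contributed by the DNA stocks is treated as negligible. The names `MIX_UL`, `DNA_UG`, and `per_well` are illustrative.

```python
# Composition of one transfection mixture, as described in the text.
MIX_UL = {"DMEM": 100.0, "Fugene": 2.0}          # volumes in microliters
DNA_UG = {"IGF1R-tTA": 0.2, "SHC1-TEV": 0.2}     # DNA masses in micrograms
ALIQUOT_UL = 15.0                                 # volume added to each well


def per_well():
    """Return (total mix volume, ng of each DNA per well, ul Fugene per well)."""
    total_ul = sum(MIX_UL.values())  # assumption: DNA stock volume is negligible
    fraction = ALIQUOT_UL / total_ul
    dna_ng = {name: ug * 1000.0 * fraction for name, ug in DNA_UG.items()}
    fugene_ul = MIX_UL["Fugene"] * fraction
    return total_ul, dna_ng, fugene_ul


total_ul, dna_ng, fugene_ul = per_well()
print(f"mix volume: {total_ul} ul, full wells per mix: {int(total_ul // ALIQUOT_UL)}")
for name, ng in dna_ng.items():
    print(f"{name}: {ng:.1f} ng per well")
print(f"Fugene: {fugene_ul:.2f} ul per well")
```

Under these assumptions each well receives roughly 29 ng of each construct, and one mixture is enough for six wells.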
[0247] The addition of 1 .mu.M human Insulin-like Growth Factor 1
resulted in a 90-fold increase of luciferase reporter gene
activity.
Example 29
[0248] This experiment was designed to demonstrate the use of the
assay to measure the interaction of two test proteins that are not
normally membrane bound. In this example, the assay was used to
measure the ligand-induced dimerization of the nuclear steroid
hormone receptors, ESR1 (estrogen receptor 1 or ER alpha) and ESR2
(estrogen receptor 2 or ER beta). In this example, ESR1 is fused to
the transcription factor tTA, where the cleavage site for the TEV
NIa-Pro protease is inserted between the ESR1 and tTA sequences.
This ESR1-tTA fusion is tethered to the membrane by a fusion to the
intracellular, C-terminal end of the transmembrane protein CD8. CD8
essentially serves as an inert scaffold that tethers ESR1 to the
cytoplasmic side of the cell membrane. The transcription factor
fused thereto cannot enter the nucleus until interaction with ESR2
and protease. Any transmembrane protein could be used. This
CD8-ESR1-TEV NIa Pro cleavage-tTA fusion protein is expressed
together with a second fusion protein comprised of ESR2 and the TEV
NIa-Pro protease in a cell line containing a tTA-dependent reporter
gene. The estrogen-induced dimerization of ESR1 and ESR2 thereby
triggers the release of the tTA transcription factor from the
membrane bound fusion, which is detected by the subsequent
induction in reporter gene activity.
[0249] A fusion construct was created, comprising DNA encoding the
human CD8 gene (235 amino acids), which can be found in Genbank
under Accession Number NM.sub.--001768 (SEQ ID NO: 99), fused in
frame to a DNA sequence encoding the human ESR1 (596 amino acids),
which can be found in Genbank under Accession Number
NM.sub.--000125 (SEQ ID NO: 100). Inserted between these sequences
is a DNA sequence encoding the amino acid sequence GRA
(Gly-Arg-Ala). The resulting construct is then fused in frame to a
DNA sequence encoding amino acids 3-335 of the tetracycline
controlled transactivator tTA, described supra. Inserted between
these sequences is a DNA sequence encoding the amino acid sequence
GSENLYFQL (SEQ ID NO: 82) which includes the low efficiency
cleavage site for TEV NIa-Pro, ENLYFQL (SEQ ID NO: 14), described
supra. The CMV promoter was placed upstream of the Human CD8 coding
region, and a poly A sequence was placed downstream of the tTA
region. This construct is designated CD8-ESR1-TEV-NIa-Pro cleavage
(L)-tTA.
[0250] A second fusion construct was created, using DNA encoding
Human Estrogen Receptor beta (ESR2) (530 amino acids), which can be
found at Genbank, under Accession Number NM.sub.--001437 (SEQ ID
NO: 101), fused in frame to a DNA sequence encoding the catalytic
domain of the TEV NIa protease, described supra, corresponding to
amino acids 2040-2279 (GenBank accession number AAA47910) (SEQ ID
NO: 84). Inserted between these sequences is a DNA sequence
encoding the amino acid sequence RS (Arg-Ser). The CMV promoter
region was placed upstream of the Human Estrogen Receptor beta
(ESR2) coding region, and a poly A sequence was placed downstream
of the TEV region. This construct is designated ESR2-TEV.
[0251] The CD8-ESR1-TEV-NIa-Pro cleavage (L)-tTA and ESR2-TEV
fusion constructs, together with pcDNA3, were transiently
transfected into HTL5B8.1 cells described supra. About
2.0.times.10.sup.4 cells were seeded in each well of a 96-well
plate and cultured in phenol-free DMEM medium supplemented with 10%
fetal bovine serum, 2 mM L-glutamine, 100 units/ml penicillin, 100
.mu.g/ml G418, and 5 .mu.g/ml puromycin. After 24 hours of
incubation, cells were transfected with a mixture of 5 ng of
CD8-ESR1-TEV-NIa-Pro cleavage (L)-tTA, 15 ng of ESR2-TEV and 40 ng of
pcDNA3, together with 0.3 .mu.l Fugene per well. Six hours after
transfection, the cells were washed with PBS and incubated in 100
.mu.l of phenol-free DMEM without serum for 24 hours before
treatment with 50 nM 17-.beta. Estradiol. Ligand-treated cells were
cultured for an additional 18-20 hours before they were assayed for
luciferase reporter gene activity as described supra. Treatment
with 50 nM 17-.beta. Estradiol resulted in a 16-fold increase in
reporter gene activity.
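The per-well transfection amounts given above can be scaled to a whole plate with exact arithmetic. The following sketch uses only the per-well quantities stated in the text; the function `master_mix`, its optional `excess` parameter for pipetting overage, and the shortened construct labels are assumptions introduced for illustration.

```python
from fractions import Fraction

# Per-well quantities as stated in the text.
PER_WELL_NG = {"CD8-ESR1-tTA": 5, "ESR2-TEV": 15, "pcDNA3": 40}
FUGENE_UL_PER_WELL = Fraction(3, 10)


def master_mix(wells, excess=Fraction(0)):
    """Scale the per-well recipe to `wells` wells, plus a fractional excess."""
    factor = wells * (1 + excess)
    dna_ng = {name: ng * factor for name, ng in PER_WELL_NG.items()}
    return dna_ng, FUGENE_UL_PER_WELL * factor


total_ng = sum(PER_WELL_NG.values())
# Reagent-to-DNA ratio in microliters of Fugene per microgram of DNA.
fugene_ul_per_ug = FUGENE_UL_PER_WELL / Fraction(total_ng, 1000)

dna_ng, fugene_ul = master_mix(96)
print(f"total DNA per well: {total_ng} ng")
print(f"Fugene per microgram DNA: {float(fugene_ul_per_ug)} ul")
print({name: float(ng) for name, ng in dna_ng.items()}, float(fugene_ul))
```

The three DNAs are combined at a 1:3:8 mass ratio for 60 ng of DNA per well, which corresponds to 5 microliters of Fugene per microgram of DNA.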
[0252] Other features of the invention will be clear to the skilled
artisan and need not be reiterated here.
Sequence CWU 1
1
10112015DNAHomo sapiens 1actgcgaagc ggcttcttca gagcacgggc
tggaactggc aggcaccgcg agcccctagc 60acccgacaag ctgagtgtgc aggacgagtc
cccaccacac ccacaccaca gccgctgaat 120gaggcttcca ggcgtccgct
cgcggcccgc agagccccgc cgtgggtccg cccgctgagg 180cgcccccagc
cagtgcgctt acctgccaga ctgcgcgcca tggggcaacc cgggaacggc
240agcgccttct tgctggcacc caatagaagc catgcgccgg accacgacgt
cacgcagcaa 300agggacgagg tgtgggtggt gggcatgggc atcgtcatgt
ctctcatcgt cctggccatc 360gtgtttggca atgtgctggt catcacagcc
attgccaagt tcgagcgtct gcagacggtc 420accaactact tcatcacttc
actggcctgt gctgatctgg tcatgggcct ggcagtggtg 480ccctttgggg
ccgcccatat tcttatgaaa atgtggactt ttggcaactt ctggtgcgag
540ttttggactt ccattgatgt gctgtgcgtc acggccagca ttgagaccct
gtgcgtgatc 600gcagtggatc gctactttgc cattacttca cctttcaagt
accagagcct gctgaccaag 660aataaggccc gggtgatcat tctgatggtg
tggattgtgt caggccttac ctccttcttg 720cccattcaga tgcactggta
ccgggccacc caccaggaag ccatcaactg ctatgccaat 780gagacctgct
gtgacttctt cacgaaccaa gcctatgcca ttgcctcttc catcgtgtcc
840ttctacgttc ccctggtgat catggtcttc gtctactcca gggtctttca
ggaggccaaa 900aggcagctcc agaagattga caaatctgag ggccgcttcc
atgtccagaa ccttagccag 960gtggagcagg atgggcggac ggggcatgga
ctccgcagat cttccaagtt ctgcttgaag 1020gagcacaaag ccctcaagac
gttaggcatc atcatgggca ctttcaccct ctgctggctg 1080cccttcttca
tcgttaacat tgtgcatgtg atccaggata acctcatccg taaggaagtt
1140tacatcctcc taaattggat aggctatgtc aattctggtt tcaatcccct
tatctactgc 1200cggagcccag atttcaggat tgccttccag gagcttctgt
gcctgcgcag gtcttctttg 1260aaggcctatg ggaatggcta ctccagcaac
ggcaacacag gggagcagag tggatatcac 1320gtggaacagg agaaagaaaa
taaactgctg tgtgaagacc tcccaggcac ggaagacttt 1380gtgggccatc
aaggtactgt gcctagcgat aacattgatt cacaagggag gaattgtagt
1440acaaatgact cactgctgta aagcagtttt tctactttta aagacccccc
cccccccaac 1500agaacactaa acagactatt taacttgagg gtaataaact
tagaataaaa ttgtaaaaat 1560tgtatagaga tatgcagaag gaagggcatc
cttctgcctt ttttattttt ttaagctgta 1620aaaagagaga aaacttattt
gagtgattat ttgttatttg tacagttcag ttcctctttg 1680catggaattt
gtaagtttat gtctaaagag ctttagtcct agaggacctg agtctgctat
1740attttcatga cttttccatg tatctacctc actattcaag tattaggggt
aatatattgc 1800tgctggtaat ttgtatctga aggagatttt ccttcctaca
cccttggact tgaggatttt 1860gagtatctcg gacctttcag ctgtgaacat
ggactcttcc cccactcctc ttatttgctc 1920acacggggta ttttaggcag
ggatttgagg agcagcttca gttgttttcc cgagcaaagg 1980tctaaagttt
acagtaaata aaatgtttga ccatg 2015226DNAHomo Sapiens 2gattgaagat
ctgccttctt gctggc 26327DNAHomo Sapiens 3gcagaacttg gaagacctgc
ggagtcc 27427DNAHomo Sapiens 4ggactccgca ggtcttccaa gttctgc
27527DNAHomo Sapiens 5ttcggatcct agcagtgagt catttgt 2767PRTHomo
Sapiens 6Glu Asn Leu Tyr Phe Gln Ser1 5732DNAHomo Sapiens
7ccggatcctc tagattagat aaaagtaaag tg 32835DNAHomo Sapiens
8gactcgagct agcagtatcc tcgcgccccc taccc 35918DNAHomo Sapiens
9gagaacctgt acttccag 181033DNAHomo Sapiens 10ggatccgaga acctgtactt
ccagtacaga tta 331130DNAHomo Sapiens 11ctcgagagat cctcgcgccc
cctacccacc 30127PRTHomo Sapiens 12Glu Asn Leu Tyr Phe Gln Tyr1
51333DNAHomo Sapiens 13ggatccgaga acctgtactt ccagctaaga tta
33147PRTHomo Sapiens 14Glu Asn Leu Tyr Phe Gln Leu1 51533DNAHomo
Sapiens 15gcggccgcca ccatgaacgg taccgaaggc cca 331621DNAHomo
Sapiens 16ctggtgggtg gcccggtacc a 21171936DNAHomo sapiens
17ccccgcgtgt ctgctaggag agggcgggca gcgccgcggc gcgcgcgatc cggctgacgc
60atctggcccc ggttccccaa gaccagagcg gggccgggag ggagggggaa gaggcgagag
120cgcggagggc gcgcgtgcgc attggcgcgg ggaggagcag ggatcttggc
agcgggcgag 180gaggctgcga gcgagccgcg aaccgagcgg gcggcgggcg
cgcgcaccat gggggagaaa 240cccgggacca gggtcttcaa gaagtcgagc
cctaactgca agctcaccgt gtacttgggc 300aagcgggact tcgtagatca
cctggacaaa gtggaccctg tagatggcgt ggtgcttgtg 360gaccctgact
acctgaagga ccgcaaagtg tttgtgaccc tcacctgcgc cttccgctat
420ggccgtgaag acctggatgt gctgggcttg tccttccgca aagacctgtt
catcgccacc 480taccaggcct tccccccggt gcccaaccca ccccggcccc
ccacccgcct gcaggaccgg 540ctgctgagga agctgggcca gcatgcccac
cccttcttct tcaccatacc ccagaatctt 600ccatgctccg tcacactgca
gccaggccca gaggatacag gaaaggcctg cggcgtagac 660tttgagattc
gagccttctg tgctaaatca ctagaagaga aaagccacaa aaggaactct
720gtgcggctgg tgatccgaaa ggtgcagttc gccccggaga aacccggccc
ccagccttca 780gccgaaacca cacgccactt cctcatgtct gaccggtccc
tgcacctcga ggcttccctg 840gacaaggagc tgtactacca tggggagccc
ctcaatgtaa atgtccacgt caccaacaac 900tccaccaaga ccgtcaagaa
gatcaaagtc tctgtgagac agtacgccga catctgcctc 960ttcagcaccg
cccagtacaa gtgtcctgtg gctcaactcg aacaagatga ccaggtatct
1020cccagctcca cattctgtaa ggtgtacacc ataaccccac tgctcagcga
caaccgggag 1080aagcggggtc tcgccctgga tgggaaactc aagcacgagg
acaccaacct ggcttccagc 1140accatcgtga aggagggtgc caacaaggag
gtgctgggaa tcctggtgtc ctacagggtc 1200aaggtgaagc tggtggtgtc
tcgaggcggg gatgtctctg tggagctgcc ttttgttctt 1260atgcacccca
agccccacga ccacatcccc ctccccagac cccagtcagc cgctccggag
1320acagatgtcc ctgtggacac caacctcatt gaatttgata ccaactatgc
cacagatgat 1380gacattgtgt ttgaggactt tgcccggctt cggctgaagg
ggatgaagga tgacgactat 1440gatgatcaac tctgctagga agcggggtgg
gaagaaggga ggggatgggg ttgggagagg 1500tgagggcagg attaagatcc
ccactgtcaa tgggggattg tcccagcccc tcttcccttc 1560ccctcacctg
gaagcttctt caaccaatcc cttcacactc tctcccccat ccccccaaga
1620tacacactgg accctctctt gctgaatgtg ggcattaatt ttttgactgc
agctctgctt 1680ctccagcccc gccgtgggtg gcaagctgtg ttcataccta
aattttctgg aaggggacag 1740tgaaaagagg agtgacagga gggaaagggg
gagacaaaac tcctactctc aacctcacac 1800caacacctcc cattatcact
ctctctgccc ccattccttc aagaggagac cctttgggga 1860caaggccgtt
tctttgtttc tgagcataaa gaagaaaata aatcttttac taagcatgaa
1920aaaaaaaaaa aaaaaa 19361835DNAHomo Sapiens 18caggatcctc
tggaatgggg gagaaacccg ggacc 351930DNAHomo Sapiens 19ggatccgcag
agttgatcat catagtcgtc 30209PRTHomo Sapiens 20Tyr Pro Tyr Asp Val
Pro Asp Tyr Ala52128DNAHomo Sapiens 21agatctagct tgtttaaggg
accacgtg 282262DNAHomo Sapiens 22gcggccgctc aagcgtaatc tggaacatca
tatgggtacg agtacaccaa ttcattcatg 60ag 62231809DNAHomo sapiens
23agaagatcct gggttctgtg catccgtctg tctgaccatc cctctcaatc ttccctgccc
60aggactggcc atactgccac cgcacacgtg cacacacgcc aacaggcatc tgccatgctg
120gcatctctat aagggctcca gtccagagac cctgggccat tgaacttgct
cctcaggcag 180aggctgagtc cgcacatcac ctccaggccc tcagaacacc
tgccccagcc ccaccatgct 240catggcgtcc accacttccg ctgtgcctgg
gcatccctct ctgcccagcc tgcccagcaa 300cagcagccag gagaggccac
tggacacccg ggacccgctg ctagcccggg cggagctggc 360gctgctctcc
atagtctttg tggctgtggc cctgagcaat ggcctggtgc tggcggccct
420agctcggcgg ggccggcggg gccactgggc acccatacac gtcttcattg
gccacttgtg 480cctggccgac ctggccgtgg ctctgttcca agtgctgccc
cagctggcct ggaaggccac 540cgaccgcttc cgtgggccag atgccctgtg
tcgggccgtg aagtatctgc agatggtggg 600catgtatgcc tcctcctaca
tgatcctggc catgacgctg gaccgccacc gtgccatctg 660ccgtcccatg
ctggcgtacc gccatggaag tggggctcac tggaaccggc cggtgctagt
720ggcttgggcc ttctcgctcc ttctcagcct gccccagctc ttcatcttcg
cccagcgcaa 780cgtggaaggt ggcagcgggg tcactgactg ctgggcctgc
tttgcggagc cctggggccg 840tcgcacctat gtcacctgga ttgccctgat
ggtgttcgtg gcacctaccc tgggtatcgc 900cgcctgccag gtgctcatct
tccgggagat tcatgccagt ctggtgccag ggccatcaga 960gaggcctggg
gggcgccgca ggggacgccg gacaggcagc cccggtgagg gagcccacgt
1020gtcagcagct gtggccaaga ctgtgaggat gacgctagtg attgtggtcg
tctatgtgct 1080gtgctgggca cccttcttcc tggtgcagct gtgggccgcg
tgggacccgg aggcacctct 1140ggaaggggcg ccctttgtgc tactcatgtt
gctggccagc ctcaacagct gcaccaaccc 1200ctggatctat gcatctttca
gcagcagcgt gtcctcagag ctgcgaagct tgctctgctg 1260tgcccgggga
cgcaccccac ccagcctggg tccccaagat gagtcctgca ccaccgccag
1320ctcctccctg gccaaggaca cttcatcgtg aggagctgtt gggtgtcttg
cctctagagg 1380ctttgagaag ctcagctgcc ttcctggggc tggtcctggg
agccactggg agggggaccc 1440gtggagaatt ggccagagcc tgtggccccg
aggctgggac actgtgtggc cctggacaag 1500ccacagcccc tgcctgggtc
tccacatccc cagctgtatg aggagagctt caggccccag 1560gactgtgggg
gcccctcagg tcagctcact gagctgggtg taggaggggc tgcagcagag
1620gcctgaggag tggcaggaaa gagggagcag gtgcccccag gtgagacagc
ggtcccaggg 1680gcctgaaaag gaaggaccag gctggggcca ggggaccttc
ctgtctccgc ctttctaatc 1740cctccctcct cattctctcc ctaataaaaa
ttggagctct tttccacatg gcaaggggtc 1800tccttggaa 18092426DNAHomo
Sapiens 24gaattcatgc tcatggcgtc caccac 262527DNAHomo Sapiens
25ggatcccgat gaagtgtcct tggccag 27261266DNAHomo sapiens
26atggatgtgc tcagccctgg tcagggcaac aacaccacat caccaccggc tccctttgag
60accggcggca acactactgg tatctccgac gtgaccgtca gctaccaagt gatcacctct
120ctgctgctgg gcacgctcat cttctgcgcg gtgctgggca atgcgtgcgt
ggtggctgcc 180atcgccttgg agcgctccct gcagaacgtg gccaattatc
ttattggctc tttggcggtc 240accgacctca tggtgtcggt gttggtgctg
cccatggccg cgctgtatca ggtgctcaac 300aagtggacac tgggccaggt
aacctgcgac ctgttcatcg ccctcgacgt gctgtgctgc 360acctcatcca
tcttgcacct gtgcgccatc gcgctggaca ggtactgggc catcacggac
420cccatcgact acgtgaacaa gaggacgccc cggccgcgtg cgctcatctc
gctcacttgg 480cttattggct tcctcatctc tatcccgccc atcctgggct
ggcgcacccc ggaagaccgc 540tcggaccccg acgcatgcac cattagcaag
gatcatggct acactatcta ttccaccttt 600ggagctttct acatcccgct
gctgctcatg ctggttctct atgggcgcat attccgagct 660gcgcgcttcc
gcatccgcaa gacggtcaaa aaggtggaga agaccggagc ggacacccgc
720catggagcat ctcccgcccc gcagcccaag aagagtgtga atggagagtc
ggggagcagg 780aactggaggc tgggcgtgga gagcaaggct gggggtgctc
tgtgcgccaa tggcgcggtg 840aggcaaggtg acgatggcgc cgccctggag
gtgatcgagg tgcaccgagt gggcaactcc 900aaagagcact tgcctctgcc
cagcgaggct ggtcctaccc cttgtgcccc cgcctctttc 960gagaggaaaa
atgagcgcaa cgccgaggcg aagcgcaaga tggccctggc ccgagagagg
1020aagacagtga agacgctggg catcatcatg ggcaccttca tcctctgctg
gctgcccttc 1080ttcatcgtgg ctcttgttct gcccttctgc gagagcagct
gccacatgcc caccctgttg 1140ggcgccataa tcaattggct gggctactcc
aactctctgc ttaaccccgt catttacgca 1200tacttcaaca aggactttca
aaacgcgttt aagaagatca ttaagtgtaa cttctgccgc 1260cagtga
12662726DNAHomo Sapiens 27gaattcatgg atgtgctcag ccctgg
262825DNAHomo Sapiens 28ggatccctgg cggcagaact tacac 25291401DNAHomo
Sapiens 29atgaataact caacaaactc ctctaacaat agcctggctc ttacaagtcc
ttataagaca 60tttgaagtgg tgtttattgt cctggtggct ggatccctca gtttggtgac
cattatcggg 120aacatcctag tcatggtttc cattaaagtc aaccgccacc
tccagaccgt caacaattac 180tttttattca gcttggcctg tgctgacctt
atcataggtg ttttctccat gaacttgtac 240accctctaca ctgtgattgg
ttactggcct ttgggacctg tggtgtgtga cctttggcta 300gccctggact
atgtggtcag caatgcctca gttatgaatc tgctcatcat cagctttgac
360aggtacttct gtgtcacaaa acctctgacc tacccagtca agcggaccac
aaaaatggca 420ggtatgatga ttgcagctgc ctgggtcctc tctttcatcc
tctgggctcc agccattctc 480ttctggcagt tcattgtagg ggtgagaact
gtggaggatg gggagtgcta cattcagttt 540ttttccaatg ctgctgtcac
ctttggtacg gctattgcag ccttctattt gccagtgatc 600atcatgactg
tgctatattg gcacatatcc cgagccagca agagcaggat aaagaaggac
660aagaaggagc ctgttgccaa ccaagacccc gtttctccaa gtctggtaca
aggaaggata 720gtgaagccaa acaataacaa catgcccagc agtgacgatg
gcctggagca caacaaaatc 780cagaatggca aagcccccag ggatcctgtg
actgaaaact gtgttcaggg agaggagaag 840gagagctcca atgactccac
ctcagtcagt gctgttgcct ctaatatgag agatgatgaa 900ataacccagg
atgaaaacac agtttccact tccctgggcc attccaaaga tgagaactct
960aagcaaacat gcatcagaat tggcaccaag accccaaaaa gtgactcatg
taccccaact 1020aataccaccg tggaggtagt ggggtcttca ggtcagaatg
gagatgaaaa gcagaatatt 1080gtagcccgca agattgtgaa gatgactaag
cagcctgcaa aaaagaagcc tcctccttcc 1140cgggaaaaga aagtcaccag
gacaatcttg gctattctgt tggctttcat catcacttgg 1200gccccataca
atgtcatggt gctcattaac accttttgtg caccttgcat ccccaacact
1260gtgtggacaa ttggttactg gctttgttac atcaacagca ctatcaaccc
tgcctgctat 1320gcactttgca atgccacctt caagaagacc tttaaacacc
ttctcatgtg tcattataag 1380aacataggcg ctacaaggta a 14013027DNAHomo
Sapiens 30gaattcatga ataactcaac aaactcc 273125DNAHomo Sapiens
31agatctcctt gtagcgccta tgttc 25323655DNAHomo sapiens 32cttcagatag
attatatctg gagtgaagga tcctgccacc tacgtatctg gcatagtatt 60ctgtgtagtg
ggatgagcag agaacaaaaa caaaataatc cagtgagaaa agcccgtaaa
120taaaccttca gaccagagat ctattctcca gcttatttta agctcaactt
aaaaagaaga 180actgttctct gattcttttc gccttcaata cacttaatga
tttaactcca ccctccttca 240aaagaaacag catttcctac ttttatactg
tctatatgat tgatttgcac agctcatctg 300gccagaagag ctgagacatc
cgttccccta caagaaactc tccccgggtg gaacaagatg 360gattatcaag
tgtcaagtcc aatctatgac atcaattatt atacatcgga gccctgccaa
420aaaatcaatg tgaagcaaat cgcagcccgc ctcctgcctc cgctctactc
actggtgttc 480atctttggtt ttgtgggcaa catgctggtc atcctcatcc
tgataaactg caaaaggctg 540aagagcatga ctgacatcta cctgctcaac
ctggccatct ctgacctgtt tttccttctt 600actgtcccct tctgggctca
ctatgctgcc gcccagtggg actttggaaa tacaatgtgt 660caactcttga
cagggctcta ttttataggc ttcttctctg gaatcttctt catcatcctc
720ctgacaatcg ataggtacct ggctgtcgtc catgctgtgt ttgctttaaa
agccaggacg 780gtcacctttg gggtggtgac aagtgtgatc acttgggtgg
tggctgtgtt tgcgtctctc 840ccaggaatca tctttaccag atctcaaaaa
gaaggtcttc attacacctg cagctctcat 900tttccataca gtcagtatca
attctggaag aatttccaga cattaaagat agtcatcttg 960gggctggtcc
tgccgctgct tgtcatggtc atctgctact cgggaatcct aaaaactctg
1020cttcggtgtc gaaatgagaa gaagaggcac agggctgtga ggcttatctt
caccatcatg 1080attgtttatt ttctcttctg ggctccctac aacattgtcc
ttctcctgaa caccttccag 1140gaattctttg gcctgaataa ttgcagtagc
tctaacaggt tggaccaagc tatgcaggtg 1200acagagactc ttgggatgac
gcactgctgc atcaacccca tcatctatgc ctttgtcggg 1260gagaagttca
gaaactacct cttagtcttc ttccaaaagc acattgccaa acgcttctgc
1320aaatgctgtt ctattttcca gcaagaggct cccgagcgag caagctcagt
ttacacccga 1380tccactgggg agcaggaaat atctgtgggc ttgtgacacg
gactcaagtg ggctggtgac 1440ccagtcagag ttgtgcacat ggcttagttt
tcatacacag cctgggctgg gggtggggtg 1500ggagaggtct tttttaaaag
gaagttactg ttatagaggg tctaagattc atccatttat 1560ttggcatctg
tttaaagtag attagatctt ttaagcccat caattataga aagccaaatc
1620aaaatatgtt gatgaaaaat agcaaccttt ttatctcccc ttcacatgca
tcaagttatt 1680gacaaactct cccttcactc cgaaagttcc ttatgtatat
ttaaaagaaa gcctcagaga 1740attgctgatt cttgagttta gtgatctgaa
cagaaatacc aaaattattt cagaaatgta 1800caacttttta cctagtacaa
ggcaacatat aggttgtaaa tgtgtttaaa acaggtcttt 1860gtcttgctat
ggggagaaaa gacatgaata tgattagtaa agaaatgaca cttttcatgt
1920gtgatttccc ctccaaggta tggttaataa gtttcactga cttagaacca
ggcgagagac 1980ttgtggcctg ggagagctgg ggaagcttct taaatgagaa
ggaatttgag ttggatcatc 2040tattgctggc aaagacagaa gcctcactgc
aagcactgca tgggcaagct tggctgtaga 2100aggagacaga gctggttggg
aagacatggg gaggaaggac aaggctagat catgaagaac 2160cttgacggca
ttgctccgtc taagtcatga gctgagcagg gagatcctgg ttggtgttgc
2220agaaggttta ctctgtggcc aaaggagggt caggaaggat gagcatttag
ggcaaggaga 2280ccaccaacag ccctcaggtc agggtgagga tggcctctgc
taagctcaag gcgtgaggat 2340gggaaggagg gaggtattcg taaggatggg
aaggagggag gtattcgtgc agcatatgag 2400gatgcagagt cagcagaact
ggggtggatt tggtttggaa gtgagggtca gagaggagtc 2460agagagaatc
cctagtcttc aagcagattg gagaaaccct tgaaaagaca tcaagcacag
2520aaggaggagg aggaggttta ggtcaagaag aagatggatt ggtgtaaaag
gatgggtctg 2580gtttgcagag cttgaacaca gtctcaccca gactccaggc
tgtctttcac tgaatgcttc 2640tgacttcata gatttccttc ccatcccagc
tgaaatactg aggggtctcc aggaggagac 2700tagatttatg aatacacgag
gtatgaggtc taggaacata cttcagctca cacatgagat 2760ctaggtgagg
attgattacc tagtagtcat ttcatgggtt gttgggagga ttctatgagg
2820caaccacagg cagcatttag cacatactac acattcaata agcatcaaac
tcttagttac 2880tcattcaggg atagcactga gcaaagcatt gagcaaaggg
gtcccatata ggtgagggaa 2940gcctgaaaaa ctaagatgct gcctgcccag
tgcacacaag tgtaggtatc attttctgca 3000tttaaccgtc aataggcaaa
ggggggaagg gacatattca tttggaaata agctgccttg 3060agccttaaaa
cccacaaaag tacaatttac cagcctccgt atttcagact gaatgggggt
3120ggggggggcg ccttaggtac ttattccaga tgccttctcc agacaaacca
gaagcaacag 3180aaaaaatcgt ctctccctcc ctttgaaatg aatatacccc
ttagtgtttg ggtatattca 3240tttcaaaggg agagagagag gtttttttct
gttctttctc atatgattgt gcacatactt 3300gagactgttt tgaatttggg
ggatggctaa aaccatcata gtacaggtaa ggtgagggaa 3360tagtaagtgg
tgagaactac tcagggaatg aaggtgtcag aataataaga ggtgctactg
3420actttctcag cctctgaata tgaacggtga gcattgtggc tgtcagcagg
aagcaacgaa 3480gggaaatgtc tttccttttg ctcttaagtt gtggagagtg
caacagtagc ataggaccct 3540accctctggg ccaagtcaaa gacattctga
catcttagta tttgcatatt cttatgtatg 3600tgaaagttac aaattgcttg
aaagaaaata tgcatctaat aaaaaacacc ttcta 36553331DNAHomo Sapiens
33gcggccgcat ggattatcaa gtgtcaagtc c 313425DNAHomo Sapiens
34ggatccctgg cggcagaact tacac 253533DNAHomo Sapiens 35ggtctccaat
tcatggatta tcaagtgtca agt 333621DNAHomo Sapiens 36gacgacagcc
aggtacctat c 21372643DNAHomo sapiens 37ggcagccgtc cggggccgcc
actctcctcg gccggtccct ggctcccgga ggcggccgcg 60cgtggatgcg gcgggagctg
gaagcctcaa gcagccggcg ccgtctctgc cccggggcgc 120cctatggctt
gaagagcctg gccacccagt ggctccaccg ccctgatgga tccactgaat
180ctgtcctggt atgatgatga tctggagagg cagaactgga gccggccctt
caacgggtca 240gacgggaagg cggacagacc ccactacaac tactatgcca
cactgctcac cctgctcatc 300gctgtcatcg tcttcggcaa cgtgctggtg
tgcatggctg tgtcccgcga gaaggcgctg 360cagaccacca ccaactacct
gatcgtcagc ctcgcagtgg ccgacctcct cgtcgccaca 420ctggtcatgc
cctgggttgt ctacctggag gtggtaggtg agtggaaatt cagcaggatt
480cactgtgaca tcttcgtcac tctggacgtc atgatgtgca cggcgagcat
cctgaacttg 540tgtgccatca gcatcgacag gtacacagct gtggccatgc
ccatgctgta caatacgcgc 600tacagctcca agcgccgggt caccgtcatg
atctccatcg tctgggtcct gtccttcacc 660atctcctgcc cactcctctt
cggactcaat aacgcagacc agaacgagtg catcattgcc 720aacccggcct
tcgtggtcta ctcctccatc gtctccttct acgtgccctt cattgtcacc
780ctgctggtct acatcaagat ctacattgtc ctccgcagac gccgcaagcg
agtcaacacc 840aaacgcagca gccgagcttt cagggcccac ctgagggctc
cactaaaggg caactgtact 900caccccgagg acatgaaact ctgcaccgtt
atcatgaagt ctaatgggag tttcccagtg 960aacaggcgga gagtggaggc
tgcccggcga gcccaggagc tggagatgga gatgctctcc 1020agcaccagcc
cacccgagag gacccggtac agccccatcc cacccagcca ccaccagctg
1080actctccccg acccgtccca ccatggtctc cacagcactc ccgacagccc
cgccaaacca 1140gagaagaatg ggcatgccaa agaccacccc aagattgcca
agatctttga gatccagacc 1200atgcccaatg gcaaaacccg gacctccctc
aagaccatga gccgtaggaa gctctcccag 1260cagaaggaga agaaagccac
tcagatgctc gccattgttc tcggcgtgtt catcatctgc 1320tggctgccct
tcttcatcac acacatcctg aacatacact gtgactgcaa catcccgcct
1380gtcctgtaca gcgccttcac gtggctgggc tatgtcaaca gcgccgtgaa
ccccatcatc 1440tacaccacct tcaacattga gttccgcaag gccttcctga
agatcctcca ctgctgactc 1500tgctgcctgc ccgcacagca gcctgcttcc
cacctccctg cccaggccgg ccagcctcac 1560ccttgcgaac cgtgagcagg
aaggcctggg tggatcggcc tcctcttcac cccggcaggc 1620cctgcagtgt
tcgcttggct ccatgctcct cactgcccgc acaccctcac tctgccaggg
1680cagtgctagt gagctgggca tggtaccagc cctggggctg ggccccccag
ctcaggggca 1740gctcatagag tcccccctcc cacctccagt ccccctatcc
ttggcaccaa agatgcagcc 1800gccttccttg accttcctct ggggctctag
ggttgctgga gcctgagtca gggcccagag 1860gctgagtttt ctctttgtgg
ggcttggcgt ggagcaggcg gtggggagag atggacagtt 1920cacaccctgc
aaggcccaca ggaggcaagc aagctctctt gccgaggagc caggcaactt
1980cagtcctggg agacccatgt aaataccaga ctgcaggttg gaccccagag
attcccaagc 2040caaaaacctt agctccctcc cgcaccccga tgtggacctc
tactttccag gctagtccgg 2100acccacctca ccccgttaca gctccccaag
tggtttccac atgctctgag aagaggagcc 2160ctcatcttga agggcccagg
agggtctatg gggagaggaa ctccttggcc tagcccaccc 2220tgctgccttc
tgacggccct gcaatgtatc ccttctcaca gcacatgctg gccagcctgg
2280ggcctggcag ggaggtcagg ccctggaact ctatctgggc ctgggctagg
ggacatcaga 2340ggttctttga gggactgcct ctgccacact ctgacgcaaa
accactttcc ttttctattc 2400cttctggcct ttcctctctc ctgtttccct
tcccttccac tgcctctgcc ttagaggagc 2460ccacggctaa gaggctgctg
aaaaccatct ggcctggcct ggccctgccc tgaggaagga 2520ggggaagctg
cagcttggga gagcccctgg ggcctagact ctgtaacatc actatccatg
2580caccaaacta ataaaacttt gacgagtcac cttccaggac ccctgggtaa
aaaaaaaaaa 2640aaa 26433827DNAHomo Sapiens 38gaattcatgg atccactgaa
tctgtcc 273925DNAHomo Sapiens 39agatctgcag tggaggatct tcagg
25401301DNAHomo sapiens 40atgggcgaca aagggacgcg agtgttcaag
aaggccagtc caaatggaaa gctcaccgtc 60tacctgggaa agcgggactt tgtggaccac
atcgacctcg tggaccctgt ggatggtgtg 120gtcctggtgg atcctgagta
tctcaaagag cggagagtct atgtgacgct gacctgcgcc 180ttccgctatg
gccgggagga cctggatgtc ctgggcctga cctttcgcaa ggacctgttt
240gtggccaacg tacagtcgtt cccaccggcc cccgaggaca agaagcccct
gacgcggctg 300caggaacgcc tcatcaagaa gctgggcgag cacgcttacc
ctttcacctt tgagatccct 360ccaaaccttc catgttctgt gacactgcag
ccggggcccg aagacacggg gaaggcttgc 420ggtgtggact atgaagtcaa
agccttctgc gcggagaatt tggaggagaa gatccacaag 480cggaattctg
tgcgtctggt catccggaag gttcagtatg ccccagagag gcctggcccc
540cagcccacag ccgagaccac caggcagttc ctcatgtcgg acaagccctt
gcacctagaa 600gcctctctgg ataaggagat ctattaccat ggagaaccca
tcagcgtcaa cgtccacgtc 660accaacaaca ccaacaagac ggtgaagaag
atcaagatct cagtgcgcca gtatgcagac 720atctgccttt tcaacacagc
tcagtacaag tgccctgttg ccatggaaga ggctgatgac 780actgtggcac
ccagctcgac gttctgcaag gtctacacac tgaccccctt cctagccaat
840aaccgagaga agcggggcct cgccttggac gggaagctca agcacgaaga
cacgaacttg 900gcctctagca ccctgttgag ggaaggtgcc aaccgtgaga
tcctggggat cattgtttcc 960tacaaagtga aagtgaagct ggtggtgtct
cggggcggcc tgttgggaga tcttgcatcc 1020agcgacgtgg ccgtggaact
gcccttcacc ctaatgcacc ccaagcccaa agaggaaccc 1080ccgcatcggg
aagttccaga gaacgagacg ccagtagata ccaatctcat agaacttgac
1140acaaatgatg acgacattgt atttgaggac tttgctcgcc agagactgaa
aggcatgaag 1200gatgacaagg aggaagagga ggatggtacc ggctctccac
agctcaacaa cagatagacg 1260ggccggccct gcctccacgt ggctccggct
ccactctcgt g 13014130DNAHomo Sapiens 41ggtaccatgg gcgacaaagg
gacgcgagtg 304248DNAHomo Sapiens 42ggatcctctg ttgttgagct gtggagagcc
tgtaccatcc tcctcttc 484327DNAHomo Sapiens 43ggatccattt gtgtcaagtt
ctatgag 274427DNAHomo Sapiens 44ggtaccatgg gggagaaacc cgggacc
274524DNAHomo Sapiens 45ggatcctgtg gcatagttgg tatc 244633DNAHomo
Sapiens 46tgtgcgcgcg gacgcacccc acccagcctg ggt 334727DNAHomo
Sapiens 47gaattcatgg atccactgaa tctgtcc 274833DNAHomo Sapiens
48tgtgcgcgcg cagtggagga tcttcaggaa ggc 334933DNAHomo Sapiens
49gcggccgcca ccatgaacgg taccgaaggc cca 335030DNAHomo Sapiens
50tgtgcgcgcg cacagaagct cctggaaggc 30511602DNAHomo sapiens
51gagctccgtg ctgggaggtg ggaagggggc ttgaccctgg ggactcaggc agtctgggga
60cagttccacc aggggccggt gcctagaatt ggtgagggag gcacctcagg ggctggggga
120gaaggaacga gcgctcttcg cccctctctg gcacccagcg gcgcgcctgc
tggccggaaa 180ggcagcgaga agtccgttct ccctgtcctg cccccggcga
cttgcggccc gggtgggagt 240ccgcaggctc cgggtcccca gcgccgctgg
ccagggcgcg ggcaaagttt gcctctccgc 300gtccagccgg ttctttcgct
cccgcagcgc cgcaggtgcc gcctgtcctc gccttcctgc 360tgcaatcgcc
ccaccatgga ctccccgatc cagatcttcc gcggggagcc gggccctacc
420tgcgccccga gcgcctgcct gccccccaac agcagcgcct ggtttcccgg
ctgggccgag 480cccgacagca acggcagcgc cggctcggag gacgcgcagc
tggagcccgc gcacatctcc 540ccggccatcc cggtcatcat cacggcggtc
tactccgtag tgttcgtcgt gggcttggtg 600ggcaactcgc tggtcatgtt
cgtgatcatc cgatacacaa agatgaagac agcaaccaac 660atttacatat
ttaacctggc tttggcagat gctttagtta ctacaaccat gccctttcag
720agtacggtct acttgatgaa ttcctggcct tttggggatg tgctgtgcaa
gatagtaatt 780tccattgatt actacaacat gttcaccagc atcttcacct
tgaccatgat gagcgtggac 840cgctacattg ccgtgtgcca ccccgtgaag
gctttggact tccgcacacc cttgaaggca 900aagatcatca atatctgcat
ctggctgctg tcgtcatctg ttggcatctc tgcaatagtc 960cttggaggca
ccaaagtcag ggaagacgtc gatgtcattg agtgctcctt gcagttccca
1020gatgatgact actcctggtg ggacctcttc atgaagatct gcgtcttcat
ctttgccttc 1080gtgatccctg tcctcatcat catcgtctgc tacaccctga
tgatcctgcg tctcaagagc 1140gtccggctcc tttctggctc ccgagagaaa
gatcgcaacc tgcgtaggat caccagactg 1200gtcctggtgg tggtggcagt
cttcgtcgtc tgctggactc ccattcacat attcatcctg 1260gtggaggctc
tggggagcac ctcccacagc acagctgctc tctccagcta ttacttctgc
1320atcgccttag gctataccaa cagtagcctg aatcccattc tctacgcctt
tcttgatgaa 1380aacttcaagc ggtgtttccg ggacttctgc tttccactga
agatgaggat ggagcggcag 1440agcactagca gagtccgaaa tacagttcag
gatcctgctt acctgaggga catcgatggg 1500atgaataaac cagtatgact
agtcgtggag atgtcttcgt acagttcttc gggaagagag 1560gagttcaatg
atctaggttt aactcagatc actactgcag tc 16025224DNAHomo Sapiens
52ggtctacttg atgaattcct ggcc 245327DNAHomo Sapiens 53gcgcgcacag
aagtcccgga aacaccg 275410PRTHomo Sapiens 54Gly Ser Glu Asn Leu Tyr
Phe Gln Leu Arg1 5 1055881PRTHomo sapiens 55Met Lys Leu Leu Ser Ser
Ile Glu Gln Ala Cys Asp Ile Cys Arg Leu1 5 10 15Lys Lys Leu Lys Cys
Ser Lys Glu Lys Pro Lys Cys Ala Lys Cys Leu 20 25 30Lys Asn Asn Trp
Glu Cys Arg Tyr Ser Pro Lys Thr Lys Arg Ser Pro 35 40 45Leu Thr Arg
Ala His Leu Thr Glu Val Glu Ser Arg Leu Glu Arg Leu 50 55 60Glu Gln
Leu Phe Leu Leu Ile Phe Pro Arg Glu Asp Leu Asp Met Ile65 70 75
80Leu Lys Met Asp Ser Leu Gln Asp Ile Lys Ala Leu Leu Thr Gly Leu
85 90 95Phe Val Gln Asp Asn Val Asn Lys Asp Ala Val Thr Asp Arg Leu
Ala 100 105 110Ser Val Glu Thr Asp Met Pro Leu Thr Leu Arg Gln His
Arg Ile Ser 115 120 125Ala Thr Ser Ser Ser Glu Glu Ser Ser Asn Lys
Gly Gln Arg Gln Leu 130 135 140Thr Val Ser Ile Asp Ser Ala Ala His
His Asp Asn Ser Thr Ile Pro145 150 155 160Leu Asp Phe Met Pro Arg
Asp Ala Leu His Gly Phe Asp Trp Ser Glu 165 170 175Glu Asp Asp Met
Ser Asp Gly Leu Pro Phe Leu Lys Thr Asp Pro Asn 180 185 190Asn Asn
Gly Phe Phe Gly Asp Gly Ser Leu Leu Cys Ile Leu Arg Ser 195 200
205Ile Gly Phe Lys Pro Glu Asn Tyr Thr Asn Ser Asn Val Asn Arg Leu
210 215 220Pro Thr Met Ile Thr Asp Arg Tyr Thr Leu Ala Ser Arg Ser
Thr Thr225 230 235 240Ser Arg Leu Leu Gln Ser Tyr Leu Asn Asn Phe
His Pro Tyr Cys Pro 245 250 255Ile Val His Ser Pro Thr Leu Met Met
Leu Tyr Asn Asn Gln Ile Glu 260 265 270Ile Ala Ser Lys Asp Gln Trp
Gln Ile Leu Phe Asn Cys Ile Leu Ala 275 280 285Ile Gly Ala Trp Cys
Ile Glu Gly Glu Ser Thr Asp Ile Asp Val Phe 290 295 300Tyr Tyr Gln
Asn Ala Lys Ser His Leu Thr Ser Lys Val Phe Glu Ser305 310 315
320Gly Ser Ile Ile Leu Val Thr Ala Leu His Leu Leu Ser Arg Tyr Thr
325 330 335Gln Trp Arg Gln Lys Thr Asn Thr Ser Tyr Asn Phe His Ser
Phe Ser 340 345 350Ile Arg Met Ala Ile Ser Leu Gly Leu Asn Arg Asp
Leu Pro Ser Ser 355 360 365Phe Ser Asp Ser Ser Ile Leu Glu Gln Arg
Arg Arg Ile Trp Trp Ser 370 375 380Val Tyr Ser Trp Glu Ile Gln Leu
Ser Leu Leu Tyr Gly Arg Ser Ile385 390 395 400Gln Leu Ser Gln Asn
Thr Ile Ser Phe Pro Ser Ser Val Asp Asp Val 405 410 415Gln Arg Thr
Thr Thr Gly Pro Thr Ile Tyr His Gly Ile Ile Glu Thr 420 425 430Ala
Arg Leu Leu Gln Val Phe Thr Lys Ile Tyr Glu Leu Asp Lys Thr 435 440
445Val Thr Ala Glu Lys Ser Pro Ile Cys Ala Lys Lys Cys Leu Met Ile
450 455 460Cys Asn Glu Ile Glu Glu Val Ser Arg Gln Ala Pro Lys Phe
Leu Gln465 470 475 480Met Asp Ile Ser Thr Thr Ala Leu Thr Asn Leu
Leu Lys Glu His Pro 485 490 495Trp Leu Ser Phe Thr Arg Phe Glu Leu
Lys Trp Lys Gln Leu Ser Leu 500 505 510Ile Ile Tyr Val Leu Arg Asp
Phe Phe Thr Asn Phe Thr Gln Lys Lys 515 520 525Ser Gln Leu Glu Gln
Asp Gln Asn Asp His Gln Ser Tyr Glu Val Lys 530 535 540Arg Cys Ser
Ile Met Leu Ser Asp Ala Ala Gln Arg Thr Val Met Ser545 550 555
560Val Ser Ser Tyr Met Asp Asn His Asn Val Thr Pro Tyr Phe Ala Trp
565 570 575Asn Cys Ser Tyr Tyr Leu Phe Asn Ala Val Leu Val Pro Ile
Lys Thr 580 585 590Leu Leu Ser Asn Ser Lys Ser Asn Ala Glu Asn Asn
Glu Thr Ala Gln 595 600 605Leu Leu Gln Gln Ile Asn Thr Val Leu Met
Leu Leu Lys Lys Leu Ala 610 615 620Thr Phe Lys Ile Gln Thr Cys Glu
Lys Tyr Ile Gln Val Leu Glu Glu625 630 635 640Val Cys Ala Pro Phe
Leu Leu Ser Gln Cys Ala Ile Pro Leu Pro His 645 650 655Ile Ser Tyr
Asn Asn Ser Asn Gly Ser Ala Ile Lys Asn Ile Val Gly 660 665 670Ser
Ala Thr Ile Ala Gln Tyr Pro Thr Leu Pro Glu Glu Asn Val Asn 675 680
685Asn Ile Ser Val Lys Tyr Val Ser Pro Gly Ser Val Gly Pro Ser Pro
690 695 700Val Pro Leu Lys Ser Gly Ala Ser Phe Ser Asp Leu Val Lys
Leu Leu705 710 715 720Ser Asn Arg Pro Pro Ser Arg Asn Ser Pro Val
Thr Ile Pro Arg Ser 725 730 735Thr Pro Ser His Arg Ser Val Thr Pro
Phe Leu Gly Gln Gln Gln Gln 740 745 750Leu Gln Ser Leu Val Pro Leu
Thr Pro Ser Ala Leu Phe Gly Gly Ala 755 760 765Asn Phe Asn Gln Ser
Gly Asn Ile Ala Asp Ser Ser Leu Ser Phe Thr 770 775 780Phe Thr Asn
Ser Ser Asn Gly Pro Asn Leu Ile Thr Thr Gln Thr Asn785 790 795
800Ser Gln Ala Leu Ser Gln Pro Ile Ala Ser Ser Asn Val His Asp Asn
805 810 815Phe Met Asn Asn Glu Ile Thr Ala Ser Lys Ile Asp Asp Gly
Asn Asn 820 825 830Ser Lys Pro Leu Ser Pro Gly Trp Thr Asp Gln Thr
Ala Tyr Asn Ala 835 840 845Phe Gly Ile Thr Thr Gly Met Phe Asn Thr
Thr Thr Met Asp Asp Val 850 855 860Tyr Asn Tyr Leu Phe Asp Asp Glu
Asp Thr Pro Pro Asn Pro Lys Lys865 870 875 880Glu5613PRTHomo
Sapiens 56Pro Gln Lys Gly Ser Ala Ser Glu Lys Thr Met Val Phe1 5
1057549PRTHomo sapiens 57Met Asp Asp Leu Phe Pro Leu Ile Phe Pro
Ser Glu Pro Ala Gln Ala1 5 10 15Ser Gly Pro Tyr Val Glu Ile Ile Glu
Gln Pro Lys Gln Arg Gly Met 20 25 30Arg Phe Arg Tyr Lys Cys Glu Gly
Arg Ser Ala Gly Ser Ile Pro Gly 35 40 45Glu Arg Ser Thr Asp Thr Thr
Lys Thr His Pro Thr Ile Lys Ile Asn 50 55 60Gly Tyr Thr Gly Pro Gly
Thr Val Arg Ile Ser Leu Val Thr Lys Asp65 70 75 80Pro Pro His Arg
Pro His Pro His Glu Leu Val Gly Lys Asp Cys Arg 85 90 95Asp Gly Tyr
Tyr Glu Ala Asp Leu Cys Pro Asp Arg Ser Ile His Ser 100 105 110Phe
Gln Asn Leu Gly Ile Gln Cys Val Lys Lys Arg Asp Leu Glu Gln 115 120
125Ala Ile Ser Gln Arg Ile Gln Thr Asn Asn Asn Pro Phe His Val Pro
130 135 140Ile Glu Glu Gln Arg Gly Asp Tyr Asp Leu Asn Ala Val Arg
Leu Cys145 150 155 160Phe Gln Val Thr Val Arg Asp Pro Ala Gly Arg
Pro Leu Leu Leu Thr 165 170 175Pro Val Leu Ser His Pro Ile Phe Asp
Asn Arg Ala Pro Asn Thr Ala 180 185 190Glu Leu Lys Ile Cys Arg Val
Asn Arg Asn Ser Gly Ser Cys Leu Gly 195 200 205Gly Asp Glu Ile Phe
Leu Leu Cys Asp Lys Val Gln Lys Glu Asp Ile 210 215 220Glu Val Tyr
Phe Thr Gly Pro Gly Trp Glu Ala Arg Gly Ser Phe Ser225 230 235
240Gln Ala Asp Val His Arg Gln Val Ala Ile Val Phe Arg Thr Pro Pro
245 250 255Tyr Ala Asp Pro Ser Leu Gln Ala Pro Val Arg Val Ser Met
Gln Leu 260 265 270Arg Arg Pro Ser Asp Arg Glu Leu Ser Glu Pro Met
Glu Phe Gln Tyr 275 280 285Leu Pro Asp Thr Asp Asp Arg His Arg Ile
Glu Glu Lys Arg Lys Arg 290 295 300Thr Tyr Glu Thr Phe Lys Ser Ile
Met Lys Lys Ser Pro Phe Asn Gly305 310 315 320Pro Thr Glu Pro Arg
Pro Pro Thr Arg Arg Ile Ala Val Pro Thr Arg 325 330 335Asn Ser Thr
Ser Val Pro Lys Pro Ala Pro Gln Pro Tyr Thr Phe Pro 340 345 350Ala
Ser Leu Ser Thr Ile Asn Phe Asp Glu Phe Ser Pro Met Leu Leu 355 360
365Pro Ser Gly Gln Ile Ser Asn Gln Ala Leu Ala Leu Ala Pro Ser Ser
370 375 380Ala Pro Val Leu Ala Gln Thr Met Val Pro Ser Ser Ala Met
Val Pro385 390 395 400Leu Ala Gln Pro Pro Ala Pro Ala Pro Val Leu
Thr Pro Gly Pro Pro 405 410 415Gln Ser Leu Ser Ala Pro Val Pro Lys
Ser Thr Gln Ala Gly Glu Gly 420 425 430Thr Leu Ser Glu Ala Leu Leu
His Leu Gln Phe Asp Ala Asp Glu Asp 435 440 445Leu Gly Ala Leu Leu
Gly Asn Ser Thr Asp Pro Gly Val Phe Thr Asp 450 455 460Leu Ala Ser
Val Asp Asn Ser Glu Phe Gln Gln Leu Leu Asn Gln Gly465 470 475
480Val Ser Met Ser His Ser Thr Ala Glu Pro Met Leu Met Glu Tyr Pro
485 490 495Glu Ala Ile Thr Arg Leu Val Thr Gly Ser Gln Arg Pro Pro
Asp Pro 500 505 510Ala Pro Thr Pro Leu Gly Thr Ser Gly Leu Pro Asn
Gly Leu Ser Gly 515 520 525Asp Glu Asp Phe Ser Ser Ile Ala Asp Met
Asp Phe Ser Ala Leu Leu 530 535 540Ser Gln Ile Ser
Ser545581833DNAHomo sapiens 58ggaggtggga ggagggagtg acgagtcaag
gaggagacag ggacgcagga gggtgcaagg 60aagtgtctta actgagacgg gggtaaggca
agagagggtg gaggaaattc tgcaggagac 120aggcttcctc cagggtctgg
agaacccaga ggcagctcct cctgagtgct gggaaggact 180ctgggcatct
tcagcccttc ttactctctg aggctcaagc cagaaattca ggctgcttgc
240agagtgggtg acagagccac ggagctggtg tccctgggac cctctgcccg
tcttctctcc 300actccccagc atggaggaag gtggtgattt tgacaactac
tatggggcag acaaccagtc 360tgagtgtgag tacacagact ggaaatcctc
gggggccctc atccctgcca tctacatgtt 420ggtcttcctc ctgggcacca
cgggcaacgg tctggtgctc tggaccgtgt ttcggagcag 480ccgggagaag
aggcgctcag ctgatatctt cattgctagc ctggcggtgg ctgacctgac
540cttcgtggtg acgctgcccc tgtgggctac ctacacgtac cgggactatg
actggccctt 600tgggaccttc ttctgcaagc tcagcagcta cctcatcttc
gtcaacatgt acgccagcgt 660cttctgcctc accggcctca gcttcgaccg
ctacctggcc atcgtgaggc cagtggccaa 720tgctcggctg aggctgcggg
tcagcggggc cgtggccacg gcagttcttt gggtgctggc 780cgccctcctg
gccatgcctg tcatggtgtt acgcaccacc ggggacttgg agaacaccac
840taaggtgcag tgctacatgg actactccat ggtggccact gtgagctcag
agtgggcctg 900ggaggtgggc cttggggtct cgtccaccac cgtgggcttt
gtggtgccct tcaccatcat 960gctgacctgt tacttcttca tcgcccaaac
catcgctggc cacttccgca aggaacgcat 1020cgagggcctg cggaagcggc
gccggctgct cagcatcatc gtggtgctgg tggtgacctt 1080tgccctgtgc
tggatgccct accacctggt gaagacgctg tacatgctgg gcagcctgct
1140gcactggccc tgtgactttg acctcttcct catgaacatc ttcccctact
gcacctgcat 1200cagctacgtc aacagctgcc tcaacccctt cctctatgcc
tttttcgacc cccgcttccg 1260ccaggcctgc acctccatgc tctgctgtgg
ccagagcagg tgcgcaggca cctcccacag 1320cagcagtggg gagaagtcag
ccagctactc ttcggggcac agccaggggc ccggccccaa 1380catgggcaag
ggtggagaac agatgcacga gaaatccatc ccctacagcc aggagaccct
1440tgtggttgac tagggctggg agcagagaga agcctggcgc cctcggccct
ccccggcctt 1500tgcccttgct ttctgaaaat cagagtcacc tcctctgccc
agagctgtcc tcaaagcatc 1560cagtgaacac tggaagaggc ttctagaagg
gaagaaattg tccctctgag gccgccgtgg 1620gtgacctgca gagacttcct
gcctggaact catctgtgaa ctgggacaga agcagaggag 1680gctgcctgct
gtgatacccc cttacctccc ccagtgcctt cttcagaata tctgcactgt
1740cttctgatcc tgttagtcac tgtggttcat caaataaaac tgtttgtgca
actgttgtgt 1800ccaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaa
1833591666DNAHomo sapiens 59aactgcagcc agggagactc agactagaat
ggaggtagaa agaactgatg cagagtgggt 60ttaattctaa gcctttttgt ggctaagttt
tgttgttgtt aacttattga atttagagtt 120gtattgcact ggtcatgtga
aagccagagc agcaccagtg tcaaaatagt gacagagagt 180tttgaatacc
atagttagta tatatgtact cagagtattt ttattaaaga aggcaaagag
240cccggcatag atcttatctt catcttcact cggttgcaaa atcaatagtt
aagaaatagc 300atctaaggga acttttaggt gggaaaaaaa atctagagat
ggctctaaat gactgtttcc 360ttctgaactt ggaggtggac catttcatgc
actgcaacat ctccagtcac agtgcggatc 420tccccgtgaa cgatgactgg
tcccacccgg ggatcctcta tgtcatccct gcagtttatg 480gggttatcat
tctgataggc ctcattggca acatcacttt gatcaagatc ttctgtacag
540tcaagtccat gcgaaacgtt ccaaacctgt tcatttccag tctggctttg
ggagacctgc 600tcctcctaat aacgtgtgct ccagtggatg ccagcaggta
cctggctgac agatggctat 660ttggcaggat tggctgcaaa ctgatcccct
ttatacagct tacctctgtt ggggtgtctg 720tcttcacact cacggcgctc
tcggcagaca gatacaaagc cattgtccgg ccaatggata 780tccaggcctc
ccatgccctg atgaagatct gcctcaaagc cgcctttatc tggatcatct
840ccatgctgct ggccattcca gaggccgtgt tttctgacct ccatcccttc
catgaggaaa 900gcaccaacca gaccttcatt agctgtgccc catacccaca
ctctaatgag cttcacccca 960aaatccattc tatggcttcc tttctggtct
tctacgtcat cccactgtcg atcatctctg 1020tttactacta cttcattgct
aaaaatctga tccagagtgc ttacaatctt cccgtggaag 1080ggaatataca
tgtcaagaag cagattgaat cccggaagcg acttgccaag acagtgctgg
1140tgtttgtggg cctgttcgcc ttctgctggc tccccaatca tgtcatctac
ctgtaccgct 1200cctaccacta ctctgaggtg gacacctcca tgctccactt
tgtcaccagc atctgtgccc 1260gcctcctggc cttcaccaac tcctgcgtga
acccctttgc cctctacctg ctgagcaaga 1320gtttcaggaa acagttcaac
actcagctgc tctgttgcca gcctggcctg atcatccggt 1380ctcacagcac
tggaaggagt acaacctgca tgacctccct caagagtacc aacccctccg
1440tggccacctt tagcctcatc aatggaaaca tctgtcacga gcggtatgtc
tagattgacc 1500cttgattttg ccccctgagg gacggttttg ctttatggct
agacaggaac ccttgcatcc 1560attgttgtgt ctgtgccctc caaagagcct
tcagaatgct cctgagtggt gtaggtgggg 1620gtggggaggc ccaaatgatg
gatcaccatt atattttgaa agaagc 1666602876DNAHomo sapiens 60tgaaacctaa
cccgccctgg ggaggcgcgc agcagaggct ccgattcggg gcaggtgaga 60ggctgacttt
ctctcggtgc gtccagtgga gctctgagtt tcgaatcggc ggcggcggat
120tccccgcgcg cccggcgtcg gggcttccag gaggatgcgg agccccagcg
cggcgtggct 180gctgggggcc gccatcctgc tagcagcctc tctctcctgc
agtggcacca tccaaggaac 240caatagatcc tctaaaggaa gaagccttat
tggtaaggtt gatggcacat cccacgtcac 300tggaaaagga gttacagttg
aaacagtctt ttctgtggat gagttttctg catctgtcct 360cactggaaaa
ctgaccactg tcttccttcc aattgtctac acaattgtgt ttgtggtggg
420tttgccaagt aacggcatgg ccctgtgggt ctttcttttc cgaactaaga
agaagcaccc 480tgctgtgatt tacatggcca atctggcctt ggctgacctc
ctctctgtca tctggttccc 540cttgaagatt gcctatcaca tacatggcaa
caactggatt tatggggaag ctctttgtaa 600tgtgcttatt ggctttttct
atggcaacat gtactgttcc attctcttca tgacctgcct 660cagtgtgcag
aggtattggg tcatcgtgaa ccccatgggg cactccagga agaaggcaaa
720cattgccatt ggcatctccc tggcaatatg gctgctgatt ctgctggtca
ccatcccttt 780gtatgtcgtg aagcagacca tcttcattcc tgccctgaac
atcacgacct gtcatgatgt 840tttgcctgag cagctcttgg tgggagacat
gttcaattac ttcctctctc tggccattgg 900ggtctttctg ttcccagcct
tcctcacagc ctctgcctat gtgctgatga tcagaatgct 960gcgatcttct
gccatggatg aaaactcaga gaagaaaagg aagagggcca tcaaactcat
1020tgtcactgtc ctggccatgt acctgatctg cttcactcct agtaaccttc
tgcttgtggt 1080gcattatttt ctgattaaga gccagggcca gagccatgtc
tatgccctgt acattgtagc 1140cctctgcctc tctaccctta acagctgcat
cgaccccttt gtctattact ttgtttcaca 1200tgatttcagg gatcatgcaa
agaacgctct cctttgccga agtgtccgca ctgtaaagca 1260gatgcaagta
tccctcacct caaagaaaca ctccaggaaa tccagctctt actcttcaag
1320ttcaaccact gttaagacct cctattgagt tttccaggtc ctcagatggg
aattgcacag 1380taggatgtgg aacctgttta atgttatgag gacgtgtctg
ttatttccta atcaaaaagg 1440tctcaccaca taccatgtgg atgcagcacc
tctcaggatt gctaggagct cccctgtttg 1500catgagaaaa gtagtccccc
aaattaacat cagtgtctgt ttcagaatct ctctactcag 1560atgaccccag
aaactgaacc aacagaagca gacttttcag aagatggtga agacagaaac
1620ccagtaactt gcaaaaagta gacttggtgt gaagactcac ttctcagctg
aaattatata 1680tatacacata tatatatttt acatctggga tcatgataga
cttgttaggg cttcaaggcc 1740ctcagagatg atcagtccaa ctgaacgacc
ttacaaatga ggaaaccaag ataaatgagc 1800tgccagaatc aggtttccaa
tcaacagcag tgagttggga ttggacagta gaatttcaat 1860gtccagtgag
tgaggttctt gtaccacttc atcaaaatca tggatcttgg ctgggtgcgg
1920tgcctcatgc ctgtaatcct agcactttgg gaggctgagg caggcaatca
cttgaggtca 1980ggagttcgag accagcctgg ccatcatggc gaaacctcat
ctctactaaa aatacaaaag 2040ttaaccaggt gtgtggtgca cgtttgtaat
cccagttact caggaggctg aggcacaaga 2100attgagtatc actttaactc
aggaggcaga ggttgcagtg agccgagatt gcaccactgc 2160actccagctt
gggtgataaa ataaaataaa atagtcgtga atcttgttca aaatgcagat
2220tcctcagatt caataatgag agctcagact gggaacaggg cccaggaatc
tgtgtggtac 2280aaacctgcat ggtgtttatg cacacagaga tttgagaacc
attgttctga atgctgcttc 2340catttgacaa agtgccgtga taatttttga
aaagagaagc aaacaatggt gtctctttta 2400tgttcagctt ataatgaaat
ctgtttgttg acttattagg actttgaatt atttctttat 2460taaccctctg
agtttttgta tgtattatta ttaaagaaaa atgcaatcag gattttaaac
2520atgtaaatac aaattttgta taacttttga tgacttcagt gaaattttca
ggtagtctga 2580gtaatagatt gttttgccac ttagaatagc atttgccact
tagtatttta aaaaataatt 2640gttggagtat ttattgtcag ttttgttcac
ttgttatcta atacaaaatt ataaagcctt 2700cagagggttt ggaccacatc
tctttggaaa atagtttgca acatatttaa gagatacttg 2760atgccaaaat
gactttatac aacgattgta tttgtgactt ttaaaaataa ttattttatt
2820gtgtaattga tttataaata acaaaatttt ttttacaact taaaaaaaaa aaaaaa
2876611668DNAHomo sapiens 61gggagataac tcgtgctcac aggaagccac
gcacccttga aaggcaccgg gtccttctta 60gcatcgtgct tcctgagcaa gcctggcatt
gcctcacaga ccttcctcag agccgctttc 120agaaaagcaa gctgcttctg
gttgggccca gacctgcctt gaggagcctg tagagttaaa 180aaatgaaccc
cacggatata gcagacacca ccctcgatga aagcatatac agcaattact
240atctgtatga aagtatcccc aagccttgca ccaaagaagg catcaaggca
tttggggagc 300tcttcctgcc cccactgtat tccttggttt ttgtatttgg
tctgcttgga aattctgtgg 360tggttctggt cctgttcaaa tacaagcggc
tcaggtccat gactgatgtg tacctgctca 420accttgccat ctcggatctg
ctcttcgtgt tttccctccc tttttggggc tactatgcag 480cagaccagtg
ggtttttggg ctaggtctgt gcaagatgat ttcctggatg tacttggtgg
540gcttttacag tggcatattc tttgtcatgc tcatgagcat tgatagatac
ctggcaattg 600tgcacgcggt gttttccttg agggcaagga ccttgactta
tggggtcatc accagtttgg 660ctacatggtc agtggctgtg ttcgcctccc
ttcctggctt tctgttcagc acttgttata 720ctgagcgcaa ccatacctac
tgcaaaacca agtactctct caactccacg acgtggaagg 780ttctcagctc
cctggaaatc aacattctcg gattggtgat ccccttaggg atcatgctgt
840tttgctactc catgatcatc aggaccttgc agcattgtaa aaatgagaag
aagaacaagg 900cggtgaagat gatctttgcc gtggtggtcc tcttccttgg
gttctggaca ccttacaaca 960tagtgctctt cctagagacc ctggtggagc
tagaagtcct tcaggactgc acctttgaaa 1020gatacttgga ctatgccatc
caggccacag aaactctggc ttttgttcac tgctgcctta 1080atcccatcat
ctactttttt ctgggggaga aatttcgcaa gtacatccta cagctcttca
1140aaacctgcag gggccttttt gtgctctgcc aatactgtgg gctcctccaa
atttactctg 1200ctgacacccc cagctcatct tacacgcagt ccaccatgga
tcatgatctt catgatgctc 1260tgtagaaaaa tgaaatggtg aaatgcagag
tcaatgaact ttccacattc agagcttact 1320taaaattgta ttttggtaag
agatccctga gccagtgtca ggaggaaggc ttacacccac 1380agtggaaaga
cagcttctca tcctgcaggc agctttttct ctcccactag acaagtccag
1440cctggcaagg gttcacctgg gctgaggcat ccttcctcac accaggcttg
cctgcaggca 1500tgagtcagtc tgatgagaac tctgagcagt gcttgaatga
agttgtaggt aatattgcaa 1560ggcaaagact attcccttct aacctgaact
gatgggtttc tccagaggga attgcagagt 1620actggctgat ggagtaaatc
gctacctttt gctgtggcaa atgggccc 1668621679DNAHomo sapiens
62gtttgttggc tgcggcagca ggtagcaaag tgacgccgag ggcctgagtg ctccagtagc
60caccgcatct ggagaaccag cggttaccat ggaggggatc agtatataca cttcagataa
120ctacaccgag gaaatgggct caggggacta tgactccatg aaggaaccct
gtttccgtga 180agaaaatgct aatttcaata aaatcttcct gcccaccatc
tactccatca tcttcttaac 240tggcattgtg ggcaatggat tggtcatcct
ggtcatgggt taccagaaga aactgagaag 300catgacggac aagtacaggc
tgcacctgtc agtggccgac ctcctctttg tcatcacgct 360tcccttctgg
gcagttgatg ccgtggcaaa ctggtacttt gggaacttcc tatgcaaggc
420agtccatgtc atctacacag tcaacctcta cagcagtgtc ctcatcctgg
ccttcatcag 480tctggaccgc tacctggcca tcgtccacgc caccaacagt
cagaggccaa ggaagctgtt 540ggctgaaaag gtggtctatg ttggcgtctg
gatccctgcc ctcctgctga ctattcccga 600cttcatcttt gccaacgtca
gtgaggcaga tgacagatat atctgtgacc gcttctaccc 660caatgacttg
tgggtggttg tgttccagtt tcagcacatc atggttggcc ttatcctgcc
720tggtattgtc atcctgtcct gctattgcat tatcatctcc aagctgtcac
actccaaggg 780ccaccagaag cgcaaggccc tcaagaccac agtcatcctc
atcctggctt tcttcgcctg 840ttggctgcct tactacattg ggatcagcat
cgactccttc atcctcctgg aaatcatcaa 900gcaagggtgt gagtttgaga
acactgtgca caagtggatt tccatcaccg aggccctagc 960tttcttccac
tgttgtctga accccatcct ctatgctttc cttggagcca aatttaaaac
1020ctctgcccag cacgcactca cctctgtgag cagagggtcc agcctcaaga
tcctctccaa 1080aggaaagcga ggtggacatt catctgtttc cactgagtct
gagtcttcaa gttttcactc 1140cagctaacac agatgtaaaa gacttttttt
tatacgataa ataacttttt tttaagttac 1200acatttttca gatataaaag
actgaccaat attgtacagt ttttattgct tgttggattt 1260ttgtcttgtg
tttctttagt ttttgtgaag tttaattgac ttatttatat aaattttttt
1320tgtttcatat tgatgtgtgt ctaggcagga cctgtggcca agttcttagt
tgctgtatgt 1380ctcgtggtag gactgtagaa aagggaactg aacattccag
agcgtgtagt gaatcacgta 1440aagctagaaa tgatccccag ctgtttatgc
atagataatc tctccattcc cgtggaacgt 1500ttttcctgtt cttaagacgt
gattttgctg tagaagatgg cacttataac caaagcccaa 1560agtggtatag
aaatgctggt ttttcagttt tcaggagtgg gttgatttca gcacctacag
1620tgtacagtct tgtattaagt tgttaataaa agtacatgtt aaacttactt
agtgttatg 1679632859DNAHomo sapiens 63cattcagaga cagaaggtgg
atagacaaat ctccaccttc agactggtag gctcctccag 60aagccatcag acaggaagat
gtgaaaatcc ccagcactca tcccagaatc actaagtggc 120acctgtcctg
ggccaaagtc ccaggacaga cctcattgtt cctctgtggg aatacctccc
180caggagggca tcctggattt cccccttgca acccaggtca gaagtttcat
cgtcaaggtt 240gtttcatctt ttttttcctg tctaacagct ctgactacca
cccaaccttg aggcacagtg 300aagacatcgg tggccactcc aataacagca
ggtcacagct gctcttctgg aggtgtccta 360caggtgaaaa gcccagcgac
ccagtcagga tttaagttta cctcaaaaat ggaagatttt 420aacatggaga
gtgacagctt tgaagatttc tggaaaggtg aagatcttag taattacagt
480tacagctcta ccctgccccc ttttctacta gatgccgccc catgtgaacc
agaatccctg 540gaaatcaaca agtattttgt ggtcattatc tatgccctgg
tattcctgct gagcctgctg 600ggaaactccc tcgtgatgct ggtcatctta
tacagcaggg tcggccgctc cgtcactgat 660gtctacctgc tgaacctagc
cttggccgac ctactctttg ccctgacctt gcccatctgg 720gccgcctcca
aggtgaatgg ctggattttt ggcacattcc tgtgcaaggt ggtctcactc
780ctgaaggaag tcaacttcta tagtggcatc ctgctactgg cctgcatcag
tgtggaccgt 840tacctggcca ttgtccatgc cacacgcaca ctgacccaga
agcgctactt ggtcaaattc 900atatgtctca gcatctgggg tctgtccttg
ctcctggccc tgcctgtctt acttttccga 960aggaccgtct actcatccaa
tgttagccca gcctgctatg aggacatggg caacaataca 1020gcaaactggc
ggatgctgtt acggatcctg ccccagtcct ttggcttcat cgtgccactg
1080ctgatcatgc tgttctgcta cggattcacc ctgcgtacgc tgtttaaggc
ccacatgggg 1140cagaagcacc gggccatgcg ggtcatcttt gctgtcgtcc
tcatcttcct gctctgctgg 1200ctgccctaca acctggtcct gctggcagac
accctcatga ggacccaggt gatccaggag 1260acctgtgagc gccgcaatca
catcgaccgg gctctggatg ccaccgagat tctgggcatc 1320cttcacagct
gcctcaaccc cctcatctac gccttcattg gccagaagtt tcgccatgga
1380ctcctcaaga ttctagctat acatggcttg atcagcaagg actccctgcc
caaagacagc 1440aggccttcct ttgttggctc ttcttcaggg cacacttcca
ctactctcta agacctcctg 1500cctaagtgca gccccgtggg gttcctccct
tctcttcaca gtcacattcc aagcctcatg 1560tccactggtt cttcttggtc
tcagtgtcaa tgcagccccc attgtggtca caggaagtag 1620aggaggccac
gttcttacta gtttcccttg catggtttag aaagcttgcc ctggtgcctc
1680accccttgcc ataattacta tgtcatttgc tggagctctg cccatcctgc
ccctgagccc 1740atggcactct atgttctaag aagtgaaaat ctacactcca
gtgagacagc tctgcatact 1800cattaggatg gctagtatca aaagaaagaa
aatcaggctg gccaacgggg tgaaaccctg 1860tctctactaa aaatacaaaa
aaaaaaaaaa attagccggg cgtggtggtg agtgcctgta 1920atcacagcta
cttgggaggc tgagatggga gaatcacttg aacccgggag gcagaggttg
1980cagtgagccg agattgtgcc cctgcactcc agcctgagcg acagtgagac
tctgtctcag 2040tccatgaaga tgtagaggag aaactggaac tctcgagcgt
tgctgggggg gattgtaaaa 2100tggtgtgacc actgcagaag acagtatggc
agctttcctc aaaacttcag acatagaatt 2160aacacatgat cctgcaattc
cacttatagg aattgaccca caagaaatga aagcagggac 2220ttgaacccat
atttgtacac caatattcat agcagcttat tcacaagacc caaaaggcag
2280aagcaaccca aatgttcatc aatgaatgaa tgaatggcta agcaaaatgt
gatatgtacc 2340taacgaagta tccttcagcc tgaaagagga atgaagtact
catacatgtt acaacacgga 2400cgaaccttga aaactttatg ctaagtgaaa
taagccagac atcaacagat aaatagttta 2460tgattccacc tacatgaggt
actgagagtg aacaaattta cagagacaga aagcagaaca 2520gtgattacca
gggactgagg ggaggggagc atgggaagtg acggtttaat gggcacaggg
2580tttatgttta ggatgttgaa aaagttctgc agataaacag tagtgatagt
tgtaccgcaa 2640tgtgacttaa tgccactaaa ttgacactta aaaatggttt
aaatggtcaa ttttgttatg 2700tatattttat atcaatttaa aaaaaaacct
gagccccaaa aggtatttta atcaccaagg 2760ctgattaaac caaggctaga
accacctgcc tatatttttt gttaaatgat ttcattcaat 2820atcttttttt
taataaacca tttttacttg ggtgtttat 28596427DNAHomo Sapiens
64tgtgcgcgcg gccagagcag gtgcgca 276526DNAHomo Sapiens 65gaggatccgt
caaccacaag ggtctc 266627DNAHomo Sapiens 66tgtgcgcgcg gcctgatcat
ccggtct 276726DNAHomo Sapiens 67gaggatccga cataccgctc gtgaca
266828DNAHomo Sapiens 68tgtgcgcgca gtgtccgcac tgtaaagc
286926DNAHomo Sapiens 69gaggatccat aggaggtctt aacagt 267027DNAHomo
Sapiens 70tgtgcgcgcg gcctttttgt gctctgc 277126DNAHomo Sapiens
71gaggatccca gagcatcatg aagatc 267228DNAHomo Sapiens 72tgtgcgcgcg
gcttgatcag caagggac 287326DNAHomo Sapiens 73gaggatccga gagtagtgga
agtgtg 267427DNAHomo Sapiens 74tgtgcgcgcg ggtccagcct caagatc
277526DNAHomo Sapiens 75gaggatccgc tggagtgaaa acttga
26765616DNAHomo Sapiens 76ccccggcgca gcgcggccgc agcagcctcc
gccccccgca cggtgtgagc gcccgacgcg 60gccgaggcgg ccggagtccc gagctagccc
cggcggccgc cgccgcccag accggacgac 120aggccacctc gtcggcgtcc
gcccgagtcc ccgcctcgcc gccaacgcca caaccaccgc 180gcacggcccc
ctgactccgt ccagtattga tcgggagagc cggagcgagc tcttcgggga
240gcagcgatgc gaccctccgg gacggccggg gcagcgctcc tggcgctgct
ggctgcgctc 300tgcccggcga gtcgggctct ggaggaaaag aaagtttgcc
aaggcacgag taacaagctc 360acgcagttgg gcacttttga agatcatttt
ctcagcctcc agaggatgtt caataactgt 420gaggtggtcc ttgggaattt
ggaaattacc tatgtgcaga ggaattatga tctttccttc 480ttaaagacca
tccaggaggt ggctggttat gtcctcattg ccctcaacac agtggagcga
540attcctttgg aaaacctgca gatcatcaga ggaaatatgt actacgaaaa
ttcctatgcc 600ttagcagtct tatctaacta tgatgcaaat aaaaccggac
tgaaggagct gcccatgaga 660aatttacagg aaatcctgca tggcgccgtg
cggttcagca acaaccctgc cctgtgcaac 720gtggagagca tccagtggcg
ggacatagtc agcagtgact ttctcagcaa catgtcgatg 780gacttccaga
accacctggg cagctgccaa aagtgtgatc caagctgtcc caatgggagc
840tgctggggtg caggagagga gaactgccag aaactgacca aaatcatctg
tgcccagcag 900tgctccgggc gctgccgtgg caagtccccc agtgactgct
gccacaacca gtgtgctgca 960ggctgcacag gcccccggga gagcgactgc
ctggtctgcc gcaaattccg agacgaagcc 1020acgtgcaagg acacctgccc
cccactcatg ctctacaacc ccaccacgta ccagatggat 1080gtgaaccccg
agggcaaata cagctttggt gccacctgcg tgaagaagtg tccccgtaat
1140tatgtggtga cagatcacgg ctcgtgcgtc cgagcctgtg gggccgacag
ctatgagatg 1200gaggaagacg gcgtccgcaa gtgtaagaag tgcgaagggc
cttgccgcaa agtgtgtaac 1260ggaataggta ttggtgaatt taaagactca
ctctccataa atgctacgaa tattaaacac 1320ttcaaaaact gcacctccat
cagtggcgat ctccacatcc tgccggtggc atttaggggt 1380gactccttca
cacatactcc tcctctggat ccacaggaac tggatattct gaaaaccgta
1440aaggaaatca cagggttttt gctgattcag gcttggcctg aaaacaggac
ggacctccat 1500gcctttgaga acctagaaat catacgcggc aggaccaagc
aacatggtca gttttctctt 1560gcagtcgtca gcctgaacat aacatccttg
ggattacgct ccctcaagga gataagtgat 1620ggagatgtga taatttcagg
aaacaaaaat ttgtgctatg caaatacaat aaactggaaa 1680aaactgtttg
ggacctccgg tcagaaaacc aaaattataa gcaacagagg tgaaaacagc
1740tgcaaggcca caggccaggt ctgccatgcc ttgtgctccc ccgagggctg
ctggggcccg 1800gagcccaggg actgcgtctc ttgccggaat gtcagccgag
gcagggaatg cgtggacaag 1860tgcaaccttc tggagggtga gccaagggag
tttgtggaga actctgagtg catacagtgc 1920cacccagagt gcctgcctca
ggccatgaac atcacctgca caggacgggg accagacaac 1980tgtatccagt
gtgcccacta cattgacggc ccccactgcg tcaagacctg cccggcagga
2040gtcatgggag aaaacaacac cctggtctgg aagtacgcag acgccggcca
tgtgtgccac 2100ctgtgccatc caaactgcac ctacggatgc actgggccag
gtcttgaagg ctgtccaacg 2160aatgggccta agatcccgtc catcgccact
gggatggtgg gggccctcct cttgctgctg 2220gtggtggccc tggggatcgg
cctcttcatg cgaaggcgcc acatcgttcg gaagcgcacg 2280ctgcggaggc
tgctgcagga gagggagctt gtggagcctc ttacacccag tggagaagct
2340cccaaccaag ctctcttgag gatcttgaag gaaactgaat tcaaaaagat
caaagtgctg 2400ggctccggtg cgttcggcac ggtgtataag ggactctgga
tcccagaagg tgagaaagtt 2460aaaattcccg tcgctatcaa ggaattaaga
gaagcaacat ctccgaaagc caacaaggaa 2520atcctcgatg aagcctacgt
gatggccagc gtggacaacc cccacgtgtg ccgcctgctg 2580ggcatctgcc
tcacctccac cgtgcagctc atcacgcagc tcatgccctt cggctgcctc
2640ctggactatg tccgggaaca caaagacaat attggctccc agtacctgct
caactggtgt 2700gtgcagatcg caaagggcat gaactacttg gaggaccgtc
gcttggtgca ccgcgacctg 2760gcagccagga acgtactggt gaaaacaccg
cagcatgtca agatcacaga ttttgggctg 2820gccaaactgc tgggtgcgga
agagaaagaa taccatgcag aaggaggcaa agtgcctatc 2880aagtggatgg
cattggaatc aattttacac agaatctata cccaccagag tgatgtctgg
2940agctacgggg tgaccgtttg ggagttgatg acctttggat ccaagccata
tgacggaatc 3000cctgccagcg agatctcctc catcctggag aaaggagaac
gcctccctca gccacccata 3060tgtaccatcg atgtctacat gatcatggtc
aagtgctgga tgatagacgc agatagtcgc 3120ccaaagttcc gtgagttgat
catcgaattc tccaaaatgg cccgagaccc ccagcgctac 3180cttgtcattc
agggggatga aagaatgcat ttgccaagtc ctacagactc caacttctac
3240cgtgccctga tggatgaaga agacatggac gacgtggtgg atgccgacga
gtacctcatc 3300ccacagcagg gcttcttcag cagcccctcc acgtcacgga
ctcccctcct gagctctctg 3360agtgcaacca gcaacaattc caccgtggct
tgcattgata gaaatgggct gcaaagctgt 3420cccatcaagg aagacagctt
cttgcagcga tacagctcag accccacagg cgccttgact 3480gaggacagca
tagacgacac cttcctccca gtgcctgaat acataaacca gtccgttccc
3540aaaaggcccg ctggctctgt gcagaatcct gtctatcaca atcagcctct
gaaccccgcg 3600cccagcagag acccacacta ccaggacccc cacagcactg
cagtgggcaa ccccgagtat 3660ctcaacactg tccagcccac ctgtgtcaac
agcacattcg acagccctgc ccactgggcc 3720cagaaaggca gccaccaaat
tagcctggac aaccctgact accagcagga cttctttccc 3780aaggaagcca
agccaaatgg catctttaag ggctccacag ctgaaaatgc agaataccta
3840agggtcgcgc cacaaagcag tgaatttatt ggagcatgac cacggaggat
agtatgagcc 3900ctaaaaatcc agactctttc gatacccagg accaagccac
agcaggtcct ccatcccaac 3960agccatgccc gcattagctc ttagacccac
agactggttt tgcaacgttt acaccgacta 4020gccaggaagt acttccacct
cgggcacatt ttgggaagtt gcattccttt gtcttcaaac 4080tgtgaagcat
ttacagaaac gcatccagca agaatattgt ccctttgagc agaaatttat
4140ctttcaaaga ggtatatttg aaaaaaaaaa aaagtatatg tgaggatttt
tattgattgg 4200ggatcttgga gtttttcatt gtcgctattg atttttactt
caatgggctc ttccaacaag 4260gaagaagctt gctggtagca cttgctaccc
tgagttcatc caggcccaac tgtgagcaag 4320gagcacaagc cacaagtctt
ccagaggatg cttgattcca gtggttctgc ttcaaggctt 4380ccactgcaaa
acactaaaga tccaagaagg ccttcatggc cccagcaggc cggatcggta
4440ctgtatcaag tcatggcagg tacagtagga taagccactc tgtcccttcc
tgggcaaaga 4500agaaacggag gggatggaat tcttccttag acttactttt
gtaaaaatgt ccccacggta 4560cttactcccc actgatggac cagtggtttc
cagtcatgag cgttagactg acttgtttgt 4620cttccattcc attgttttga
aactcagtat gctgcccctg tcttgctgtc atgaaatcag 4680caagagagga
tgacacatca aataataact cggattccag cccacattgg attcatcagc
4740atttggacca atagcccaca gctgagaatg tggaatacct aaggatagca
ccgcttttgt 4800tctcgcaaaa acgtatctcc taatttgagg ctcagatgaa
atgcatcagg tcctttgggg 4860catagatcag aagactacaa aaatgaagct
gctctgaaat ctcctttagc catcacccca 4920accccccaaa attagtttgt
gttacttatg gaagatagtt ttctcctttt acttcacttc 4980aaaagctttt
tactcaaaga gtatatgttc cctccaggtc agctgccccc aaaccccctc
5040cttacgcttt gtcacacaaa aagtgtctct gccttgagtc atctattcaa
gcacttacag 5100ctctggccac aacagggcat tttacaggtg cgaatgacag
tagcattatg agtagtgtgg 5160aattcaggta gtaaatatga aactagggtt
tgaaattgat aatgctttca caacatttgc 5220agatgtttta gaaggaaaaa
agttccttcc taaaataatt tctctacaat tggaagattg 5280gaagattcag
ctagttagga gcccaccttt tttcctaatc tgtgtgtgcc ctgtaacctg
5340actggttaac agcagtcctt tgtaaacagt gttttaaact ctcctagtca
atatccaccc 5400catccaattt atcaaggaag aaatggttca gaaaatattt
tcagcctaca gttatgttca 5460gtcacacaca catacaaaat gttccttttg
cttttaaagt aatttttgac tcccagatca 5520gtcagagccc ctacagcatt
gttaagaaag tatttgattt ttgtctcaat gaaaataaaa 5580ctatattcat
ttccactcta aaaaaaaaaa aaaaaa 56167712PRTHomo Sapiens 77Gly Gly Ser
Gly Ser Glu Asn Leu Tyr Phe Gln Leu1 5 10781291PRTHomo sapiens
78Met Ala Gly Ala Ala Ser Pro Cys Ala Asn Gly Cys Gly Pro Gly Ala1
5 10 15Pro Ser Asp Ala Glu Val Leu His Leu Cys Arg Ser Leu Glu Val
Gly 20 25 30Thr Val Met Thr Leu Phe Tyr Ser Lys Lys Ser Gln Arg Pro
Glu Arg 35 40 45Lys Thr Phe Gln Val Lys Leu Glu Thr Arg Gln Ile Thr
Trp Ser Arg 50 55 60Gly Ala Asp Lys Ile Glu Gly Ala Ile Asp Ile Arg
Glu Ile Lys Glu65 70 75 80Ile Arg Pro Gly Lys Thr Ser Arg Asp Phe
Asp Arg Tyr Gln Glu Asp 85 90 95Pro Ala Phe Arg Pro Asp Gln Ser His
Cys Phe Val Ile Leu Tyr Gly 100 105 110Met Glu Phe Arg Leu Lys Thr
Leu Ser Leu Gln Ala Thr Ser Glu Asp 115 120 125Glu Val Asn Met Trp
Ile Lys Gly Leu Thr Trp Leu Met Glu Asp Thr 130 135 140Leu Gln Ala
Pro Thr Pro Leu Gln Ile Glu Arg Trp Leu Arg Lys Gln145 150 155
160Phe Tyr Ser Val Asp Arg Asn Arg Glu Asp Arg Ile Ser Ala Lys Asp
165 170 175Leu Lys Asn Met Leu Ser Gln Val Asn Tyr Arg Val Pro Asn
Met Arg 180 185 190Phe Leu Arg Glu Arg Leu Thr Asp Leu Glu Gln Arg
Ser Gly Asp Ile 195 200 205Thr Tyr Gly Gln Phe Ala Gln Leu Tyr Arg
Ser Leu Met Tyr Ser Ala 210 215 220Gln Lys Thr Met Asp Leu Pro Phe
Leu Glu Ala Ser Thr Leu Arg Ala225 230 235 240Gly Glu Arg Pro Glu
Leu Cys Arg Val Ser Leu Pro Glu Phe Gln Gln 245 250 255Phe Leu Leu
Asp Tyr Gln Gly Glu Leu Trp Ala Val Asp Arg Leu Gln 260 265 270Val
Gln Glu Phe Met Leu Ser Phe Leu Arg Asp Pro Leu Arg Glu Ile 275 280
285Glu Glu Pro Tyr Phe Phe Leu Asp Glu Phe Val Thr Phe Leu Phe Ser
290 295 300Lys Glu Asn Ser Val Trp Asn Ser Gln Leu Asp Ala Val Cys
Pro Asp305 310 315 320Thr Met Asn Asn Pro Leu Ser His Tyr Trp Ile
Ser Ser Ser His Asn 325 330 335Thr Tyr Leu Thr Gly Asp Gln Phe Ser
Ser Glu Ser Ser Leu Glu Ala 340 345 350Tyr Ala Arg Cys Leu Arg Met
Gly Cys Arg Cys Ile Glu Leu Asp Cys 355 360 365Trp Asp Gly Pro Asp
Gly Met Pro Val Ile Tyr His Gly His Thr Leu 370 375 380Thr Thr Lys
Ile Lys Phe Ser Asp Val Leu His Thr Ile Lys Glu His385 390 395
400Ala Phe Val Ala Ser Glu Tyr Pro Val Ile Leu Ser Ile Glu Asp His
405 410 415Cys Ser Ile Ala Gln Gln Arg Asn Met Ala Gln Tyr Phe Lys
Lys Val 420 425 430Leu Gly Asp Thr Leu Leu Thr Lys Pro Val Glu Ile
Ser Ala Asp Gly 435 440 445Leu Pro Ser Pro Asn Gln Leu Lys Arg Lys
Ile Leu Ile Lys His Lys 450 455 460Lys Leu Ala Glu Gly Ser Ala Tyr
Glu Glu Val Pro Thr Ser Met Met465 470 475 480Tyr Ser Glu Asn Asp
Ile Ser Asn Ser Ile Lys Asn Gly Ile Leu Tyr 485 490 495Leu Glu Asp
Pro Val Asn His Glu Trp Tyr Pro His Tyr Phe Val Leu 500 505 510Thr
Ser Ser Lys Ile Tyr Tyr Ser Glu Glu Thr Ser Ser Asp Gln Gly 515 520
525Asn Glu Asp Glu Glu Glu Pro Lys Glu Val Ser Ser Ser Thr Glu Leu
530 535 540His Ser Asn Glu Lys Trp Phe His Gly Lys Leu Gly Ala Gly
Arg Asp545 550 555 560Gly Arg His Ile Ala Glu Arg Leu Leu Thr Glu
Tyr Cys Ile Glu Thr 565 570 575Gly Ala Pro Asp Gly Ser Phe Leu Val
Arg Glu Ser Glu Thr Phe Val 580 585 590Gly Asp Tyr Thr Leu Ser Phe
Trp Arg Asn Gly Lys Val Gln His Cys 595 600 605Arg Ile His Ser Arg
Gln Asp Ala Gly Thr Pro Lys Phe Phe Leu Thr 610 615 620Asp Asn Leu
Val Phe Asp Ser Leu Tyr Asp Leu Ile Thr His Tyr Gln625 630 635
640Gln Val Pro Leu Arg Cys Asn Glu Phe Glu Met Arg Leu Ser Glu Pro
645 650 655Val Pro Gln Thr Asn Ala His Glu Ser Lys Glu Trp Tyr His
Ala Ser 660 665 670Leu Thr Arg Ala Gln Ala Glu His Met Leu Met Arg
Val Pro Arg Asp 675 680 685Gly Ala Phe Leu Val Arg Lys Arg Asn Glu
Pro Asn Ser Tyr Ala Ile 690 695 700Ser Phe Arg Ala Glu Gly Lys Ile
Lys His Cys Arg Val Gln Gln Glu705 710 715 720Gly Gln Thr Val Met
Leu Gly Asn Ser Glu Phe Asp Ser Leu Val Asp 725 730 735Leu Ile Ser
Tyr Tyr Glu Lys His Pro Leu Tyr Arg Lys Met Lys Leu 740 745 750Arg
Tyr Pro Ile Asn Glu Glu Ala Leu Glu Lys Ile Gly Thr Ala Glu 755 760
765Pro Asp Tyr Gly Ala Leu Tyr Glu Gly Arg Asn Pro Gly Phe Tyr Val
770 775 780Glu Ala Asn Pro Met Pro Thr Phe Lys Cys Ala Val Lys Ala
Leu Phe785 790 795 800Asp Tyr Lys Ala Gln Arg Glu Asp Glu Leu Thr
Phe Ile Lys Ser Ala 805 810 815Ile Ile Gln Asn Val Glu Lys Gln Glu
Gly Gly Trp Trp Arg Gly Asp 820 825 830Tyr Gly Gly Lys Lys Gln Leu
Trp Phe Pro Ser Asn Tyr Val Glu Glu 835 840 845Met Val Asn Pro Val
Ala Leu Glu Pro Glu Arg Glu His Leu Asp Glu 850 855 860Asn Ser Pro
Leu Gly Asp Leu Leu Arg Gly Val Leu Asp Val Pro Ala865 870 875
880Cys Gln Ile Ala Ile Arg Pro Glu Gly Lys Asn Asn Arg Leu Phe Val
885 890 895Phe Ser Ile Ser Met Ala Ser Val Ala His Trp Ser Leu Asp
Val Ala 900 905 910Ala Asp Ser Gln Glu Glu Leu Gln Asp Trp Val Lys
Lys Ile Arg Glu 915 920 925Val Ala Gln Thr Ala Asp Ala Arg Leu Thr
Glu Gly Lys Ile Met Glu 930 935 940Arg Arg Lys Lys Ile Ala Leu Glu
Leu Ser Glu Leu Val Val Tyr Cys945 950 955 960Arg Pro Val Pro Phe
Asp Glu Glu Lys Ile Gly Thr Glu Arg Ala Cys 965 970 975Tyr Arg Asp
Met Ser Ser Phe Pro Glu Thr Lys Ala Glu Lys Tyr Val 980 985 990Asn
Lys Ala Lys Gly Lys Lys Phe Leu Gln Tyr Asn Arg Leu Gln Leu 995
1000 1005Ser Arg Ile Tyr Pro Lys Gly Gln Arg Leu Asp Ser Ser Asn
Tyr Asp 1010 1015 1020Pro Leu Pro Met Trp Ile Cys Gly Ser Gln Leu
Val Ala Leu Asn Phe1025 1030 1035 1040Gln Thr Pro Asp Lys Pro Met
Gln Met Asn Gln Ala Leu Phe Met Thr 1045 1050 1055Gly Arg His Cys
Gly Tyr Val Leu Gln Pro Ser Thr Met Arg Asp Glu 1060 1065 1070Ala
Phe Asp Pro Phe Asp Lys Ser Ser Leu Arg Gly Leu Glu Pro Cys 1075
1080 1085Ala Ile Ser Ile Glu Val Leu Gly Ala Arg His Leu Pro Lys
Asn Gly 1090 1095 1100Arg Gly Ile Val Cys Pro Phe Val Glu Ile Glu
Val Ala Gly Ala Glu1105 1110 1115 1120Tyr Asp Ser Thr Lys Gln Lys
Thr Glu Phe Val Val Asp Asn Gly Leu 1125 1130 1135Asn Pro Val Trp
Pro Ala Lys Pro Phe His Phe Gln Ile Ser Asn Pro 1140 1145 1150Glu
Phe Ala Phe Leu Arg Phe Val Val Tyr Glu Glu Asp Met Phe Ser 1155
1160 1165Asp Gln Asn Phe Leu Ala Gln Ala Thr Phe Pro Val Lys Gly
Leu Lys 1170 1175 1180Thr Gly Tyr Arg Ala Val Pro Leu Lys Asn Asn
Tyr Ser Glu Asp Leu1185 1190 1195 1200Glu Leu Ala Ser Leu Leu Ile
Lys Ile Asp Ile Phe Pro Ala Lys Gln 1205 1210 1215Glu Asn Gly Asp
Leu Ser Pro Phe Ser Gly Thr Ser Leu Arg Glu Arg 1220 1225 1230Gly
Ser Asp Ala Ser Gly Gln Leu Phe His Gly Arg Ala Arg Glu Gly 1235
1240 1245Ser Phe Glu Ser Arg Tyr Gln Gln Pro Phe Glu Asp Phe Arg
Ile Ser 1250 1255 1260Gln Glu His Leu Ala Asp His Phe Asp Ser Arg
Glu Arg Arg Ala Pro1265 1270 1275 1280Arg Arg Thr Arg Val Asn Gly
Asp Asn Arg Leu 1285 1290793054PRTHomo sapiens 79Met Ala Leu Ile
Phe Gly Thr Val Asn Ala Asn Ile Leu Lys Glu Val1 5 10 15Phe Gly Gly
Ala Arg Met Ala Cys Val Thr Ser Ala His Met Ala Gly 20 25 30Ala Asn
Gly Ser Ile Leu Lys Lys Ala Glu Glu Thr Ser Arg Ala Ile 35 40 45Met
His Lys Pro Val Ile Phe Gly Glu Asp Tyr Ile Thr Glu Ala Asp 50 55
60Leu Pro Tyr Thr Pro Leu His Leu Glu Val Asp Ala Glu Met Glu Arg65
70 75 80Met Tyr Tyr Leu Gly Arg Arg Ala Leu Thr His Gly Lys Arg Arg
Lys 85 90 95Val Ser Val Asn Asn Lys Arg Asn Arg Arg Arg Lys Val Ala
Lys Thr 100 105 110Tyr Val Gly Arg Asp Ser Ile Val Glu Lys Ile Val
Val Pro His Thr 115 120 125Glu Arg Lys Val Asp Thr Thr Ala Ala Val
Glu Asp Ile Cys Asn Glu 130 135 140Ala Thr Thr Gln Leu Val His Asn
Ser Met Pro Lys Arg Lys Lys Gln145 150 155 160Lys Asn Phe Leu Pro
Ala Thr Ser Leu Ser Asn Val Tyr Ala Gln Thr 165 170 175Trp Ser Ile
Val Arg Lys Arg His Met Gln Val Glu Ile Ile Ser Lys 180 185 190Lys
Ser Val Arg Ala Arg Val Lys Arg Phe Glu Gly Ser Val Gln Leu 195 200
205Phe Ala Ser Val Arg His Met Tyr Gly Glu Arg Lys Arg Val Asp Leu
210 215 220Arg Ile Asp Asn Trp Gln Gln Glu Thr Leu Leu Asp Leu Ala
Lys Arg225 230 235 240Phe Lys Asn Glu Arg Val Asp Gln Ser Lys Leu
Thr Phe Gly Ser Ser 245 250 255Gly Leu Val Leu Arg Gln Gly Ser Tyr
Gly Pro Ala His Trp Tyr Arg 260 265 270His Gly Met Phe Ile Val Arg
Gly Arg Ser Asp Gly Met Leu Val Asp 275 280 285Ala Arg Ala Lys Val
Thr Phe Ala Val Cys His Ser Met Thr His Tyr 290 295 300Ser Asp Lys
Ser Ile Ser Glu Ala Phe Phe Ile Pro Tyr Ser Lys Lys305 310 315
320Phe Leu Glu Leu Arg Pro Asp Gly Ile Ser His Glu Cys Thr Arg Gly
325 330 335Val Ser Val Glu Arg Cys Gly Glu Val Ala Ala Ile Leu Thr
Gln Ala 340 345 350Leu Ser Pro Cys Gly Lys Ile Thr Cys Lys Arg Cys
Met Val Glu Thr 355 360 365Pro Asp Ile Val Glu Gly Glu Ser Gly Glu
Ser Val Thr Asn Gln Gly 370 375 380Lys Leu Leu Ala Met Leu Lys Glu
Gln Tyr Pro Asp Phe Pro Met Ala385 390 395 400Glu Lys Leu Leu Thr Arg Phe
Leu Gln Gln Lys Ser Leu Val Asn Thr 405 410 415Asn Leu Thr Ala Cys
Val Ser Val Lys Gln Leu Ile Gly Asp Arg Lys 420 425 430Gln Ala Pro
Phe Thr His Val Leu Ala Val Ser Glu Ile Leu Phe Lys 435 440 445Gly
Asn Lys Leu Thr Gly Ala Asp Leu Glu Glu Ala Ser Thr His Met 450 455
460Leu Glu Ile Ala Arg Phe Leu Asn Asn Arg Thr Glu Asn Met Arg
Ile465 470 475 480Gly His Leu Gly Ser Phe Arg Asn Lys Ile Ser Ser
Lys Ala His Val 485 490 495Asn Asn Ala Leu Met Cys Asp Asn Gln Leu
Asp Gln Asn Gly Asn Phe 500 505 510Ile Trp Gly Leu Arg Gly Ala His
Ala Lys Arg Phe Leu Lys Gly Phe 515 520 525Phe Thr Glu Ile Asp Pro
Asn Glu Gly Tyr Asp Lys Tyr Val Ile Arg 530 535 540Lys His Ile Arg
Gly Ser Arg Lys Leu Ala Ile Gly Asn Leu Ile Met545 550 555 560Ser
Thr Asp Phe Gln Thr Leu Arg Gln Gln Ile Gln Gly Glu Thr Ile 565 570
575Glu Arg Lys Glu Ile Gly Asn His Cys Ile Ser Met Arg Asn Gly Asn
580 585 590Tyr Val Tyr Pro Cys Cys Cys Val Thr Leu Glu Asp Gly Lys
Ala Gln 595 600 605Tyr Ser Asp Leu Lys His Pro Thr Lys Arg His Leu
Val Ile Gly Asn 610 615 620Ser Gly Asp Ser Lys Tyr Leu Asp Leu Pro
Val Leu Asn Glu Glu Lys625 630 635 640Met Tyr Ile Ala Asn Glu Gly
Tyr Cys Tyr Met Asn Ile Phe Phe Ala 645 650 655Leu Leu Val Asn Val
Lys Glu Glu Asp Ala Lys Asp Phe Thr Lys Phe 660 665 670Ile Arg Asp
Thr Ile Val Pro Lys Leu Gly Ala Trp Pro Thr Met Gln 675 680 685Asp
Val Ala Thr Ala Cys Tyr Leu Leu Ser Ile Leu Tyr Pro Asp Val 690 695
700Leu Arg Ala Glu Leu Pro Arg Ile Leu Val Asp His Asp Asn Lys
Thr705 710 715 720Met His Val Leu Asp Ser Tyr Gly Ser Arg Thr Thr
Gly Tyr His Met 725 730 735Leu Lys Met Asn Thr Thr Ser Gln Leu Ile
Glu Phe Val His Ser Gly 740 745 750Leu Glu Ser Glu Met Lys Thr Tyr
Asn Val Gly Gly Met Asn Arg Asp 755 760 765Val Val Thr Gln Gly Ala
Ile Glu Met Leu Ile Lys Ser Ile Tyr Lys 770 775 780Pro His Leu Met
Lys Gln Leu Leu Glu Glu Glu Pro Tyr Ile Ile Val785 790 795 800Leu
Ala Ile Val Ser Pro Ser Ile Leu Ile Ala Met Tyr Asn Ser Gly 805 810
815Thr Phe Glu Gln Ala Leu Gln Met Trp Leu Pro Asn Thr Met Arg Leu
820 825 830Ala Asn Leu Ala Ala Ile Leu Ser Ala Leu Ala Gln Lys Leu
Thr Leu 835 840 845Ala Asp Leu Phe Val Gln Gln Arg Asn Leu Ile Asn
Glu Tyr Ala Gln 850 855 860Val Ile Leu Asp Asn Leu Ile Asp Gly Val
Arg Val Asn His Ser Leu865 870 875 880Ser Leu Ala Met Glu Ile Val
Thr Ile Lys Leu Ala Thr Gln Glu Met 885 890 895Asp Met Ala Leu Arg
Glu Gly Gly Tyr Ala Val Thr Ser Glu Lys Val 900 905 910His Glu Met
Leu Glu Lys Asn Tyr Val Lys Ala Leu Lys Asp Ala Trp 915 920 925Asp
Glu Leu Thr Trp Leu Glu Lys Phe Ser Ala Ile Arg His Ser Arg 930 935
940Lys Leu Leu Lys Phe Gly Arg Lys Pro Leu Ile Met Lys Asn Thr
Val945 950 955 960Asp Cys Gly Gly His Ile Asp Leu Ser Val Lys Ser
Leu Phe Lys Phe 965 970 975His Leu Glu Leu Leu Lys Gly Thr Ile Ser
Arg Ala Val Asn Gly Gly 980 985 990Ala Arg Lys Val Arg Val Ala Lys
Asn Ala Met Thr Lys Gly Val Phe 995 1000 1005Leu Lys Ile Tyr Ser
Met Leu Pro Asp Val Tyr Lys Phe Ile Thr Val 1010 1015 1020Ser Ser
Val Leu Ser Leu Leu Leu Thr Phe Leu Phe Gln Ile Asp Cys1025 1030
1035 1040Met Ile Arg Ala His Arg Glu Ala Lys Val Ala Ala Gln Leu
Gln Lys 1045 1050 1055Glu Ser Glu Trp Asp Asn Ile Ile Asn Arg Thr
Phe Gln Tyr Ser Lys 1060 1065 1070Leu Glu Asn Pro Ile Gly Tyr Arg
Ser Thr Ala Glu Glu Arg Leu Gln 1075 1080 1085Ser Glu His Pro Glu
Ala Phe Glu Tyr Tyr Lys Phe Cys Ile Gly Lys 1090 1095 1100Glu Asp
Leu Val Glu Gln Ala Lys Gln Pro Glu Ile Ala Tyr Phe Glu1105 1110
1115 1120Lys Ile Ile Ala Phe Ile Thr Leu Val Leu Met Ala Phe Asp
Ala Glu 1125 1130 1135Arg Ser Asp Gly Val Phe Lys Ile Leu Asn Lys
Phe Lys Gly Ile Leu 1140 1145 1150Ser Ser Thr Glu Arg Glu Ile Ile
Tyr Thr Gln Ser Leu Asp Asp Tyr 1155 1160 1165Val Thr Thr Phe Asp
Asp Asn Met Thr Ile Asn Leu Glu Leu Asn Met 1170 1175 1180Asp Glu
Leu His Lys Thr Ser Leu Pro Gly Val Thr Phe Lys Gln Trp1185 1190
1195 1200Trp Asn Asn Gln Ile Ser Arg Gly Asn Val Lys Pro His Tyr
Arg Thr 1205 1210 1215Glu Gly His Phe Met Glu Phe Thr Arg Asp Thr
Ala Ala Ser Val Ala 1220 1225 1230Ser Glu Ile Ser His Ser Pro Ala
Arg Asp Phe Leu Val Arg Gly Ala 1235 1240 1245Val Gly Ser Gly Lys
Ser Thr Gly Leu Pro Tyr His Leu Ser Lys Arg 1250 1255 1260Gly Arg
Val Leu Met Leu Glu Pro Thr Arg Pro Leu Thr Asp Asn Met1265 1270
1275 1280His Lys Gln Leu Arg Ser Glu Pro Phe Asn Cys Phe Pro Thr
Leu Arg 1285 1290 1295Met Arg Gly Lys Ser Thr Phe Gly Ser Ser Pro
Ile Thr Val Met Thr 1300 1305 1310Ser Gly Phe Ala Leu His His Phe
Ala Arg Asn Ile Ala Glu Val Lys 1315 1320 1325Thr Tyr Asp Phe Val
Ile Ile Asp Glu Cys His Val Asn Asp Ala Ser 1330 1335 1340Ala Ile
Ala Phe Arg Asn Leu Leu Phe Glu His Glu Phe Glu Gly Lys1345 1350
1355 1360Val Leu Lys Val Ser Ala Thr Pro Pro Gly Arg Glu Val Glu
Phe Thr 1365 1370 1375Thr Gln Phe Pro Val Lys Leu Lys Ile Glu Glu
Ala Leu Ser Phe Gln 1380 1385 1390Glu Phe Val Ser Leu Gln Gly Thr
Gly Ala Asn Ala Asp Val Ile Ser 1395 1400 1405Cys Gly Asp Asn Ile
Leu Val Tyr Val Ala Ser Tyr Asn Asp Val Asp 1410 1415 1420Ser Leu
Gly Lys Leu Leu Val Gln Lys Gly Tyr Lys Val Ser Lys Ile1425 1430
1435 1440Asp Gly Arg Thr Met Lys Ser Gly Gly Thr Glu Ile Ile Thr
Glu Gly 1445 1450 1455Thr Ser Val Lys Lys His Phe Ile Val Ala Thr
Asn Ile Ile Glu Asn 1460 1465 1470Gly Val Thr Ile Asp Ile Asp Val
Val Val Asp Phe Gly Thr Lys Val 1475 1480 1485Val Pro Val Leu Asp
Val Asp Asn Arg Ala Val Gln Tyr Asn Lys Thr 1490 1495 1500Val Val
Ser Tyr Gly Glu Arg Ile Gln Lys Leu Gly Arg Val Gly Arg1505 1510
1515 1520His Lys Glu Gly Val Ala Leu Arg Ile Gly Gln Thr Asn Lys
Thr Leu 1525 1530 1535Val Glu Ile Pro Glu Met Val Ala Thr Glu Ala
Ala Phe Leu Cys Phe 1540 1545 1550Met Tyr Asn Leu Pro Val Thr Thr
Gln Ser Val Ser Thr Thr Leu Leu 1555 1560 1565Glu Asn Ala Thr Leu
Leu Gln Ala Arg Thr Met Ala Gln Phe Glu Leu 1570 1575 1580Ser Tyr
Phe Tyr Thr Ile Asn Phe Val Arg Phe Asp Gly Ser Met His1585 1590
1595 1600Pro Val Ile His Asp Lys Leu Lys Arg Phe Lys Leu His Thr
Cys Glu 1605 1610 1615Thr Phe Leu Asn Lys Leu Ala Ile Pro Asn Lys
Gly Leu Ser Ser Trp 1620 1625 1630Leu Thr Ser Gly Glu Tyr Lys Arg
Leu Gly Tyr Ile Ala Glu Asp Ala 1635 1640 1645Gly Ile Arg Ile Pro
Phe Val Cys Lys Glu Ile Pro Asp Ser Leu His 1650 1655 1660Glu Glu
Ile Trp His Ile Val Val Ala His Lys Gly Asp Ser Gly Ile1665 1670
1675 1680Gly Arg Leu Thr Ser Val Gln Ala Ala Lys Val Val Tyr Thr
Leu Gln 1685 1690 1695Thr Asp Val His Ser Ile Ala Arg Thr Leu Ala
Cys Ile Asn Arg Arg 1700 1705 1710Ile Ala Asp Glu Gln Met Lys Gln
Ser His Phe Glu Ala Ala Thr Gly 1715 1720 1725Arg Ala Phe Ser Phe
Thr Asn Tyr Ser Ile Gln Ser Ile Phe Asp Thr 1730 1735 1740Leu Lys
Ala Asn Tyr Ala Thr Lys His Thr Lys Glu Asn Ile Ala Val1745 1750
1755 1760Leu Gln Gln Ala Lys Asp Gln Leu Leu Glu Phe Ser Asn Leu
Ala Lys 1765 1770 1775Asp Gln Asp Val Thr Gly Ile Ile Gln Asp Phe
Asn His Leu Glu Thr 1780 1785 1790Ile Tyr Leu Gln Ser Asp Ser Glu
Val Ala Lys His Leu Lys Leu Lys 1795 1800 1805Ser His Trp Asn Lys
Ser Gln Ile Thr Arg Asp Ile Ile Ile Ala Leu 1810 1815 1820Ser Val
Leu Ile Gly Gly Gly Trp Met Leu Ala Thr Tyr Phe Lys Asp1825 1830
1835 1840Lys Phe Asn Glu Pro Val Tyr Phe Gln Gly Lys Lys Asn Gln
Lys His 1845 1850 1855Lys Leu Lys Met Arg Glu Ala Arg Gly Ala Arg
Gly Gln Tyr Glu Val 1860 1865 1870Ala Ala Glu Pro Glu Ala Leu Glu
His Tyr Phe Gly Ser Ala Tyr Asn 1875 1880 1885Asn Lys Gly Lys Arg
Lys Gly Thr Thr Arg Gly Met Gly Ala Lys Ser 1890 1895 1900Arg Lys
Phe Ile Asn Met Tyr Gly Phe Asp Pro Thr Asp Phe Ser Tyr1905 1910
1915 1920Ile Arg Phe Val Asp Pro Leu Thr Gly His Thr Ile Asp Glu
Ser Thr 1925 1930 1935Asn Ala Pro Ile Asp Leu Val Gln His Glu Phe
Gly Lys Val Arg Thr 1940 1945 1950Arg Met Leu Ile Asp Asp Glu Ile
Glu Pro Gln Ser Leu Ser Thr His 1955 1960 1965Thr Thr Ile His Ala
Tyr Leu Val Asn Ser Gly Thr Lys Lys Val Leu 1970 1975 1980Lys Val
Asp Leu Thr Pro His Ser Ser Leu Arg Ala Ser Glu Lys Ser1985 1990
1995 2000Thr Ala Ile Met Gly Phe Pro Glu Arg Glu Asn Glu Leu Arg
Gln Thr 2005 2010 2015Gly Met Ala Val Pro Val Ala Tyr Asp Gln Leu
Pro Pro Lys Asn Glu 2020 2025 2030Asp Leu Thr Phe Glu Gly Glu Ser
Leu Phe Lys Gly Pro Arg Asp Tyr 2035 2040 2045Asn Pro Ile Ser Ser
Thr Ile Cys His Leu Thr Asn Glu Ser Asp Gly 2050 2055 2060His Thr
Thr Ser Leu Tyr Gly Ile Gly Phe Gly Pro Phe Ile Ile Thr2065 2070
2075 2080Asn Lys His Leu Phe Arg Arg Asn Asn Gly Thr Leu Leu Val
Gln Ser 2085 2090 2095Leu His Gly Val Phe Lys Val Lys Asn Thr Thr
Thr Leu Gln Gln His 2100 2105 2110Leu Ile Asp Gly Arg Asp Met Ile
Ile Ile Arg Met Pro Lys Asp Phe 2115 2120 2125Pro Pro Phe Pro Gln
Lys Leu Lys Phe Arg Glu Pro Gln Arg Glu Glu 2130 2135 2140Arg Ile
Cys Leu Val Thr Thr Asn Phe Gln Thr Lys Ser Met Ser Ser2145 2150
2155 2160Met Val Ser Asp Thr Ser Cys Thr Phe Pro Ser Ser Asp Gly
Ile Phe 2165 2170 2175Trp Lys His Trp Ile Gln Thr Lys Asp Gly Gln
Cys Gly Ser Pro Leu 2180 2185 2190Val Ser Thr Arg Asp Gly Phe Ile
Val Gly Ile His Ser Ala Ser Asn 2195 2200 2205Phe Thr Asn Thr Asn
Asn Tyr Phe Thr Ser Val Pro Lys Asn Phe Met 2210 2215 2220Glu Leu
Leu Thr Asn Gln Glu Ala Gln Gln Trp Val Ser Gly Trp Arg2225 2230
2235 2240Leu Asn Ala Asp Ser Val Leu Trp Gly Gly His Lys Val Phe
Met Ser 2245 2250 2255Lys Pro Glu Glu Pro Phe Gln Pro Val Lys Glu
Ala Thr Gln Leu Met 2260 2265 2270Asn Glu Leu Val Tyr Ser Gln Gly
Glu Lys Arg Lys Trp Val Val Glu 2275 2280 2285Ala Leu Ser Gly Asn
Leu Arg Pro Val Ala Glu Cys Pro Ser Gln Leu 2290 2295 2300Val Thr
Lys His Val Val Lys Gly Lys Cys Pro Leu Phe Glu Leu Tyr2305 2310
2315 2320Leu Gln Leu Asn Pro Glu Lys Glu Ala Tyr Phe Lys Pro Met
Met Gly 2325 2330 2335Ala Tyr Lys Pro Ser Arg Leu Asn Arg Glu Ala
Phe Leu Lys Asp Ile 2340 2345 2350Leu Lys Tyr Ala Ser Glu Ile Glu
Ile Gly Asn Val Asp Cys Asp Leu 2355 2360 2365Leu Glu Leu Ala Ile
Ser Met Leu Val Thr Lys Leu Lys Ala Leu Gly 2370 2375 2380Phe Pro
Thr Val Asn Tyr Ile Thr Asp Pro Glu Glu Ile Phe Ser Ala2385 2390
2395 2400Leu Asn Met Lys Ala Ala Met Gly Ala Leu Tyr Lys Gly Lys
Lys Lys 2405 2410 2415Glu Ala Leu Ser Glu Leu Thr Leu Asp Glu Gln
Glu Ala Met Leu Lys 2420 2425 2430Ala Ser Cys Leu Arg Leu Tyr Thr
Gly Lys Leu Gly Ile Trp Asn Gly 2435 2440 2445Ser Leu Lys Ala Glu
Leu Arg Pro Ile Glu Lys Val Glu Asn Asn Lys 2450 2455 2460Thr Arg
Thr Phe Thr Ala Ala Pro Ile Asp Thr Leu Leu Ala Gly Lys2465 2470
2475 2480Val Cys Val Asp Asp Phe Asn Asn Gln Phe Tyr Asp Leu Asn
Ile Lys 2485 2490 2495Ala Pro Trp Thr Val Gly Met Thr Lys Phe Tyr
Gln Gly Trp Asn Glu 2500 2505 2510Leu Met Glu Ala Leu Pro Ser Gly
Trp Val Tyr Cys Asp Ala Asp Gly 2515 2520 2525Ser Gln Phe Asp Ser
Ser Leu Thr Pro Phe Leu Ile Asn Ala Val Leu 2530 2535 2540Lys Val
Arg Leu Ala Phe Met Glu Glu Trp Asp Ile Gly Glu Gln Met2545 2550
2555 2560Leu Arg Asn Leu Tyr Thr Glu Ile Val Tyr Thr Pro Ile Leu
Thr Pro 2565 2570 2575Asp Gly Thr Ile Ile Lys Lys His Lys Gly Asn
Asn Ser Gly Gln Pro 2580 2585 2590Ser Thr Val Val Asp Asn Thr Leu
Met Val Ile Ile Ala Met Leu Tyr 2595 2600 2605Thr Cys Glu Lys Cys
Gly Ile Asn Lys Glu Glu Ile Val Tyr Tyr Val 2610 2615 2620Asn Gly
Asp Asp Leu Leu Ile Ala Ile His Pro Asp Lys Ala Glu Arg2625 2630
2635 2640Leu Ser Arg Phe Lys Glu Ser Phe Gly Glu Leu Gly Leu Lys
Tyr Glu 2645 2650 2655Phe Asp Cys Thr Thr Arg Asp Lys Thr Gln Leu
Trp Phe Met Ser His 2660 2665 2670Arg Ala Leu Glu Arg Asp Gly Met
Tyr Ile Pro Lys Leu Glu Glu Glu 2675 2680 2685Arg Ile Val Ser Ile
Leu Glu Trp Asp Arg Ser Lys Glu Pro Ser His 2690 2695 2700Arg Leu
Glu Ala Ile Cys Ala Ser Met Ile Glu Ala Trp Gly Tyr Asp2705 2710
2715 2720Lys Leu Val Glu Glu Ile Arg Asn Phe Tyr Ala Trp Val Leu
Glu Gln 2725 2730 2735Ala Pro Tyr Ser Gln Leu Ala Glu Glu Gly Lys
Ala Pro Tyr Leu Ala 2740 2745 2750Glu Thr Ala Leu Lys Phe Leu Tyr
Thr Ser Gln His Gly Thr Asn Ser 2755 2760 2765Glu Ile Glu Glu Tyr
Leu Lys Val Leu Tyr Asp Tyr Asp Ile Pro Thr
2770 2775 2780Thr Glu Asn Leu Tyr Phe Gln Ser Gly Thr Val Asp Ala
Gly Ala Asp2785 2790 2795 2800Ala Gly Lys Lys Lys Asp Gln Lys Asp
Asp Lys Val Ala Glu Gln Ala 2805 2810 2815Ser Lys Asp Arg Asp Val
Asn Ala Gly Thr Ser Gly Thr Phe Ser Val 2820 2825 2830Pro Arg Ile
Asn Ala Met Ala Thr Lys Leu Gln Tyr Pro Arg Met Arg 2835 2840
2845Gly Glu Val Val Val Asn Leu Asn His Leu Leu Gly Tyr Lys Pro Gln
2850 2855 2860Gln Ile Asp Leu Ser Asn Ala Arg Ala Thr His Glu Gln
Phe Ala Ala2865 2870 2875 2880Trp His Gln Ala Val Met Thr Ala Tyr
Gly Val Asn Glu Glu Gln Met 2885 2890 2895Lys Ile Leu Leu Asn Gly
Phe Met Val Trp Cys Ile Glu Asn Gly Thr 2900 2905 2910Ser Pro Asn
Leu Asn Gly Thr Trp Val Met Met Asp Gly Glu Asp Gln 2915 2920
2925Val Ser Tyr Pro Leu Lys Pro Met Val Glu Asn Ala Gln Pro Thr Leu
2930 2935 2940Arg Gln Ile Met Thr His Phe Ser Asp Leu Ala Glu Ala
Tyr Ile Glu2945 2950 2955 2960Met Arg Asn Arg Glu Arg Pro Tyr Met
Pro Arg Tyr Gly Leu Gln Arg 2965 2970 2975Asn Ile Thr Asp Met Ser
Leu Ser Arg Tyr Ala Phe Asp Phe Tyr Glu 2980 2985 2990Leu Thr Ser
Lys Thr Pro Val Arg Ala Arg Glu Ala His Met Gln Met 2995 3000
3005Lys Ala Ala Ala Val Arg Asn Ser Gly Thr Arg Leu Phe Gly Leu Asp
3010 3015 3020Gly Asn Val Gly Thr Ala Glu Glu Asp Thr Glu Arg His
Thr Ala His3025 3030 3035 3040Asp Val Asn Arg Asn Met His Thr Leu
Leu Gly Val Arg Gln 3045 3050809PRTHomo sapiens 80Asn Ser Ser Gly
Gly Asn Ser Gly Ser1 5812755DNAHomo sapiens 81ttaggacggg gcgatggcgg
ctgagaggag ctgcgcgtgc gcgaacatgt aactggtggg 60atctgcggcg gctcccagat
gatggtcgtc ctcctgggcg cgacgaccct agtgctcgtc 120gccgtgggcc
catgggtgtt gtccgcagcc gcaggtggaa aaaatctaaa atctcctcaa
180aaagtagagg tcgacatcat agatgacaac tttatcctga ggtggaacag
gagcgatgag 240tctgtcggga atgtgacttt ttcattcgat tatcaaaaaa
ctgggatgga taattggata 300aaattgtctg ggtgtcagaa tattactagt
accaaatgca acttttcttc actcaagctg 360aatgtttatg aagaaattaa
attgcgtata agagcagaaa aagaaaacac ttcttcatgg 420tatgaggttg
actcatttac accatttcgc aaagctcaga ttggtcctcc agaagtacat
480ttagaagctg aagataaggc aatagtgata cacatctctc ctggaacaaa
agatagtgtt 540atgtgggctt tggatggttt aagctttaca tatagcttac
ttatctggaa aaactcttca 600ggtgtagaag aaaggattga aaatatttat
tccagacata aaatttataa actctcacca 660gagactactt attgtctaaa
agttaaagca gcactactta cgtcatggaa aattggtgtc 720tatagtccag
tacattgtat aaagaccaca gttgaaaatg aactacctcc accagaaaat
780atagaagtca gtgtccaaaa tcagaactat gttcttaaat gggattatac
atatgcaaac 840atgacctttc aagttcagtg gctccacgcc tttttaaaaa
ggaatcctgg aaaccatttg 900tataaatgga aacaaatacc tgactgtgaa
aatgtcaaaa ctacccagtg tgtctttcct 960caaaacgttt tccaaaaagg
aatttacctt ctccgcgtac aagcatctga tggaaataac 1020acatcttttt
ggtctgaaga gataaagttt gatactgaaa tacaagcttt cctacttcct
1080ccagtcttta acattagatc ccttagtgat tcattccata tctatatcgg
tgctccaaaa 1140cagtctggaa acacgcctgt gatccaggat tatccactga
tttatgaaat tattttttgg 1200gaaaacactt caaatgctga gagaaaaatt
atcgagaaaa aaactgatgt tacagttcct 1260aatttgaaac cactgactgt
atattgtgtg aaagccagag cacacaccat ggatgaaaag 1320ctgaataaaa
gcagtgtttt tagtgacgct gtatgtgaga aaacaaaacc aggaaatacc
1380tctaaaattt ggcttatagt tggaatttgt attgcattat ttgctctccc
gtttgtcatt 1440tatgctgcga aagtcttctt gagatgcatc aattatgtct
tctttccatc acttaaacct 1500tcttccagta tagatgagta tttctctgaa
cagccattga agaatcttct gctttcaact 1560tctgaggaac aaatcgaaaa
atgtttcata attgaaaata taagcacaat tgctacagta 1620gaagaaacta
atcaaactga tgaagatcat aaaaaataca gttcccaaac tagccaagat
1680tcaggaaatt attctaatga agatgaaagc gaaagtaaaa caagtgaaga
actacagcag 1740gactttgtat gaccagaaat gaactgtgtc aagtataagg
tttttcagca ggagttacac 1800tgggagcctg aggtcctcac cttcctctca
gtaactacag agaggacgtt tcctgtttag 1860ggaaagaaaa aacatcttca
gatcataggt cctaaaaata cgggcaagct cttaactatt 1920taaaaatgaa
attacaggcc cgggcacggt ggctcacacc tgtaatccca gcactttggg
1980aggctgaggc aggcagatca tgaggtcaag agatcgagac cagcctggcc
aacgtggtga 2040aaccccatct ctactaaaaa tacaaaaatt agccgggtag
taggtaggcg cgcgcctgtt 2100gtcttagcta ctcaggaggc tgaggcagga
gaatcgcttg aaaacaggag gtggaggttg 2160cagtgagccg agatcacgcc
actgcactcc agcctggtga cagcgtgaga ctctttaaaa 2220aaagaaatta
aaagagttga gacaaacgtt tcctacattc ttttccatgt gtaaaatcat
2280gaaaaagcct gtcaccggac ttgcattgga tgagatgagt cagaccaaaa
cagtggccac 2340ccgtcttcct cctgtgagcc taagtgcagc cgtgctagct
gcgcaccgtg gctaaggatg 2400acgtctgtgt tcctgtccat cactgatgct
gctggctact gcatgtgcca cacctgtctg 2460ttcgccattc ctaacattct
gtttcattct tcctcgggag atatttcaaa catttggtct 2520tttcttttaa
cactgagggt aggcccttag gaaatttatt taggaaagtc tgaacacgtt
2580atcacttggt tttctggaaa gtagcttacc ctagaaaaca gctgcaaatg
ccagaaagat 2640gatccctaaa aatgttgagg gacttctgtt cattcatccc
gagaacattg gcttccacat 2700cacagtatct acccttacat ggtttaggat
taaagccagg caatctttta ctatg 2755829PRTHomo sapiens 82Gly Ser Glu
Asn Leu Tyr Phe Gln Leu1 5832897DNAHomo sapiens 83cccgcactaa
agacgcttct tcccggcggg taggaatccc gccggcgagc cgaacagttc 60cccgagcgca
gcccgcggac caccacccgg ccgcacgggc cgcttttgtc ccccgcccgc
120cgcttctgtc cgagaggccg cccgcgaggc gcatcctgac cgcgagcgtc
gggtcccaga 180gccgggcgcg gctggggccc gaggctagca tctctcggga
gccgcaaggc gagagctgca 240aagtttaatt agacacttca gaattttgat
cacctaatgt tgatttcaga tgtaaaagtc 300aagagaagac tctaaaaata
gcaaagatgc ttttgagcca gaatgccttc atcttcagat 360cacttaattt
ggttctcatg gtgtatatca gcctcgtgtt tggtatttca tatgattcgc
420ctgattacac agatgaatct tgcactttca agatatcatt gcgaaatttc
cggtccatct 480tatcatggga attaaaaaac cactccattg taccaactca
ctatacattg ctgtatacaa 540tcatgagtaa accagaagat ttgaaggtgg
ttaagaactg tgcaaatacc acaagatcat 600tttgtgacct cacagatgag
tggagaagca cacacgaggc ctatgtcacc gtcctagaag 660gattcagcgg
gaacacaacg ttgttcagtt gctcacacaa tttctggctg gccatagaca
720tgtcttttga accaccagag tttgagattg ttggttttac caaccacatt
aatgtgatgg 780tgaaatttcc atctattgtt gaggaagaat tacagtttga
tttatctctc gtcattgaag 840aacagtcaga gggaattgtt aagaagcata
aacccgaaat aaaaggaaac atgagtggaa 900atttcaccta tatcattgac
aagttaattc caaacacgaa ctactgtgta tctgtttatt 960tagagcacag
tgatgagcaa gcagtaataa agtctccctt aaaatgcacc ctccttccac
1020ctggccagga atcagaatca gcagaatctg ccaaaatagg aggaataatt
actgtgtttt 1080tgatagcatt ggtcttgaca agcaccatag tgacactgaa
atggattggt tatatatgct 1140taagaaatag cctccccaaa gtcttgaatt
ttcataactt tttagcctgg ccatttccta 1200acctgccacc gttggaagcc
atggatatgg tggaggtcat ttacatcaac agaaagaaga 1260aagtgtggga
ttataattat gatgatgaaa gtgatagcga tactgaggca gcgcccagga
1320caagtggcgg tggctatacc atgcatggac tgactgtcag gcctctgggt
caggcctctg 1380ccacctctac agaatcccag ttgatagacc cggagtccga
ggaggagcct gacctgcctg 1440aggttgatgt ggagctcccc acgatgccaa
aggacagccc tcagcagttg gaactcttga 1500gtgggccctg tgagaggaga
aagagtccac tccaggaccc ttttcccgaa gaggactaca 1560gctccacgga
ggggtctggg ggcagaatta ccttcaatgt ggacttaaac tctgtgtttt
1620tgagagttct tgatgacgag gacagtgacg acttagaagc ccctctgatg
ctatcgtctc 1680atctggaaga gatggttgac ccagaggatc ctgataatgt
gcaatcaaac catttgctgg 1740ccagcgggga agggacacag ccaacctttc
ccagcccctc ttcagagggc ctgtggtccg 1800aagatgctcc atctgatcaa
agtgacactt ctgagtcaga tgttgacctt ggggatggtt 1860atataatgag
atgactccaa aactattgaa tgaacttgga cagacaagca cctacagggt
1920tctttgtctc tgcatcctaa cttgctgcct tatcgtctgc aagtgttctc
caagggaagg 1980aggaggaaac tgtggtgttc ctttcttcca ggtgacatca
cctatgcaca ttcccagtat 2040ggggaccata gtatcattca gtgcattgtt
tacatattca aagtggtgca ctttgaagga 2100agcacatgtg cacctttcct
ttacactaat gcacttagga tgtttctgca tcatgtctac 2160cagggagcag
ggttccccac agtttcagag gtggtccagg accctatgat atttctcttc
2220tttcgttctt tttttttttt ttttgagaca gagtctcgtt ctgtcgccca
agctggagcg 2280caatggtgtg atcttggctc actgcaacat ccgcctcccg
ggttcaggtg attctcctgc 2340ctcagcctcc ctcgcaagta gctgggatta
caggcgcctg ccaccatgcc tagcaaattt 2400ttgtattttt agtggagaca
ggattttacc atgttggcca ggctggtctc gaactcctga 2460cctcaagtga
tctgccctcc tcagcctcgt aaagtgctgg gattacaggg gtgagccgct
2520gtgcctggct ggccctgtga tatttctgtg aaataaattg ggccagggtg
ggagcaggga 2580aagaaaagga aaatagtagc aagagctgca aagcaggcag
gaagggagga ggagagccag 2640gtgagcagtg gagagaaggg gggccctgca
caaggaaaca gggaagagcc atcgaagttt 2700cagtcggtga gccttgggca
cctcacccat gtcacatcct gtctcctgca attggaattc 2760caccttgtcc
agccctcccc agttaaagtg gggaagacag actttaggat cacgtgtgtg
2820actaatacag aaaggaaaca tggcgtcggg gagagggata aaacctgaat
gccatatttt 2880aagttaaaaa aaaaaaa 2897843054PRTHomo sapiens 84Met
Ala Leu Ile Phe Gly Thr Val Asn Ala Asn Ile Leu Lys Glu Val1 5 10
15Phe Gly Gly Ala Arg Met Ala Cys Val Thr Ser Ala His Met Ala Gly
20 25 30Ala Asn Gly Ser Ile Leu Lys Lys Ala Glu Glu Thr Ser Arg Ala
Ile 35 40 45Met His Lys Pro Val Ile Phe Gly Glu Asp Tyr Ile Thr Glu
Ala Asp 50 55 60Leu Pro Tyr Thr Pro Leu His Leu Glu Val Asp Ala Glu
Met Glu Arg65 70 75 80Met Tyr Tyr Leu Gly Arg Arg Ala Leu Thr His
Gly Lys Arg Arg Lys 85 90 95Val Ser Val Asn Asn Lys Arg Asn Arg Arg
Arg Lys Val Ala Lys Thr 100 105 110Tyr Val Gly Arg Asp Ser Ile Val
Glu Lys Ile Val Val Pro His Thr 115 120 125Glu Arg Lys Val Asp Thr
Thr Ala Ala Val Glu Asp Ile Cys Asn Glu 130 135 140Ala Thr Thr Gln
Leu Val His Asn Ser Met Pro Lys Arg Lys Lys Gln145 150 155 160Lys
Asn Phe Leu Pro Ala Thr Ser Leu Ser Asn Val Tyr Ala Gln Thr 165 170
175Trp Ser Ile Val Arg Lys Arg His Met Gln Val Glu Ile Ile Ser Lys
180 185 190Lys Ser Val Arg Ala Arg Val Lys Arg Phe Glu Gly Ser Val
Gln Leu 195 200 205Phe Ala Ser Val Arg His Met Tyr Gly Glu Arg Lys
Arg Val Asp Leu 210 215 220Arg Ile Asp Asn Trp Gln Gln Glu Thr Leu
Leu Asp Leu Ala Lys Arg225 230 235 240Phe Lys Asn Glu Arg Val Asp
Gln Ser Lys Leu Thr Phe Gly Ser Ser 245 250 255Gly Leu Val Leu Arg
Gln Gly Ser Tyr Gly Pro Ala His Trp Tyr Arg 260 265 270His Gly Met
Phe Ile Val Arg Gly Arg Ser Asp Gly Met Leu Val Asp 275 280 285Ala
Arg Ala Lys Val Thr Phe Ala Val Cys His Ser Met Thr His Tyr 290 295
300Ser Asp Lys Ser Ile Ser Glu Ala Phe Phe Ile Pro Tyr Ser Lys
Lys305 310 315 320Phe Leu Glu Leu Arg Pro Asp Gly Ile Ser His Glu
Cys Thr Arg Gly 325 330 335Val Ser Val Glu Arg Cys Gly Glu Val Ala
Ala Ile Leu Thr Gln Ala 340 345 350Leu Ser Pro Cys Gly Lys Ile Thr
Cys Lys Arg Cys Met Val Glu Thr 355 360 365Pro Asp Ile Val Glu Gly
Glu Ser Gly Glu Ser Val Thr Asn Gln Gly 370 375 380Lys Leu Leu Ala
Met Leu Lys Glu Gln Tyr Pro Asp Phe Pro Met Ala385 390 395 400Glu
Lys Leu Leu Thr Arg Phe Leu Gln Gln Lys Ser Leu Val Asn Thr 405 410
415Asn Leu Thr Ala Cys Val Ser Val Lys Gln Leu Ile Gly Asp Arg Lys
420 425 430Gln Ala Pro Phe Thr His Val Leu Ala Val Ser Glu Ile Leu
Phe Lys 435 440 445Gly Asn Lys Leu Thr Gly Ala Asp Leu Glu Glu Ala
Ser Thr His Met 450 455 460Leu Glu Ile Ala Arg Phe Leu Asn Asn Arg
Thr Glu Asn Met Arg Ile465 470 475 480Gly His Leu Gly Ser Phe Arg
Asn Lys Ile Ser Ser Lys Ala His Val 485 490 495Asn Asn Ala Leu Met
Cys Asp Asn Gln Leu Asp Gln Asn Gly Asn Phe 500 505 510Ile Trp Gly
Leu Arg Gly Ala His Ala Lys Arg Phe Leu Lys Gly Phe 515 520 525Phe
Thr Glu Ile Asp Pro Asn Glu Gly Tyr Asp Lys Tyr Val Ile Arg 530 535
540Lys His Ile Arg Gly Ser Arg Lys Leu Ala Ile Gly Asn Leu Ile
Met545 550 555 560Ser Thr Asp Phe Gln Thr Leu Arg Gln Gln Ile Gln
Gly Glu Thr Ile 565 570 575Glu Arg Lys Glu Ile Gly Asn His Cys Ile
Ser Met Arg Asn Gly Asn 580 585 590Tyr Val Tyr Pro Cys Cys Cys Val
Thr Leu Glu Asp Gly Lys Ala Gln 595 600 605Tyr Ser Asp Leu Lys His
Pro Thr Lys Arg His Leu Val Ile Gly Asn 610 615 620Ser Gly Asp Ser
Lys Tyr Leu Asp Leu Pro Val Leu Asn Glu Glu Lys625 630 635 640Met
Tyr Ile Ala Asn Glu Gly Tyr Cys Tyr Met Asn Ile Phe Phe Ala 645 650
655Leu Leu Val Asn Val Lys Glu Glu Asp Ala Lys Asp Phe Thr Lys Phe
660 665 670Ile Arg Asp Thr Ile Val Pro Lys Leu Gly Ala Trp Pro Thr
Met Gln 675 680 685Asp Val Ala Thr Ala Cys Tyr Leu Leu Ser Ile Leu
Tyr Pro Asp Val 690 695 700Leu Arg Ala Glu Leu Pro Arg Ile Leu Val
Asp His Asp Asn Lys Thr705 710 715 720Met His Val Leu Asp Ser Tyr
Gly Ser Arg Thr Thr Gly Tyr His Met 725 730 735Leu Lys Met Asn Thr
Thr Ser Gln Leu Ile Glu Phe Val His Ser Gly 740 745 750Leu Glu Ser
Glu Met Lys Thr Tyr Asn Val Gly Gly Met Asn Arg Asp 755 760 765Val
Val Thr Gln Gly Ala Ile Glu Met Leu Ile Lys Ser Ile Tyr Lys 770 775
780Pro His Leu Met Lys Gln Leu Leu Glu Glu Glu Pro Tyr Ile Ile
Val785 790 795 800Leu Ala Ile Val Ser Pro Ser Ile Leu Ile Ala Met
Tyr Asn Ser Gly 805 810 815Thr Phe Glu Gln Ala Leu Gln Met Trp Leu
Pro Asn Thr Met Arg Leu 820 825 830Ala Asn Leu Ala Ala Ile Leu Ser
Ala Leu Ala Gln Lys Leu Thr Leu 835 840 845Ala Asp Leu Phe Val Gln
Gln Arg Asn Leu Ile Asn Glu Tyr Ala Gln 850 855 860Val Ile Leu Asp
Asn Leu Ile Asp Gly Val Arg Val Asn His Ser Leu865 870 875 880Ser
Leu Ala Met Glu Ile Val Thr Ile Lys Leu Ala Thr Gln Glu Met 885 890
895Asp Met Ala Leu Arg Glu Gly Gly Tyr Ala Val Thr Ser Glu Lys Val
900 905 910His Glu Met Leu Glu Lys Asn Tyr Val Lys Ala Leu Lys Asp
Ala Trp 915 920 925Asp Glu Leu Thr Trp Leu Glu Lys Phe Ser Ala Ile
Arg His Ser Arg 930 935 940Lys Leu Leu Lys Phe Gly Arg Lys Pro Leu
Ile Met Lys Asn Thr Val945 950 955 960Asp Cys Gly Gly His Ile Asp
Leu Ser Val Lys Ser Leu Phe Lys Phe 965 970 975His Leu Glu Leu Leu
Lys Gly Thr Ile Ser Arg Ala Val Asn Gly Gly 980 985 990Ala Arg Lys
Val Arg Val Ala Lys Asn Ala Met Thr Lys Gly Val Phe 995 1000
1005Leu Lys Ile Tyr Ser Met Leu Pro Asp Val Tyr Lys Phe Ile Thr
1010 1015 1020Val Ser Ser Val Leu Ser Leu Leu Leu Thr Phe Leu Phe
Gln Ile 1025 1030 1035Asp Cys Met Ile Arg Ala His Arg Glu Ala Lys
Val Ala Ala Gln 1040 1045 1050Leu Gln Lys Glu Ser Glu Trp Asp Asn
Ile Ile Asn Arg Thr Phe 1055 1060 1065Gln Tyr Ser Lys Leu Glu Asn
Pro Ile Gly Tyr Arg Ser Thr Ala 1070 1075 1080Glu Glu Arg Leu Gln
Ser Glu His Pro Glu Ala Phe Glu Tyr Tyr 1085 1090 1095Lys Phe Cys
Ile Gly Lys Glu Asp Leu Val Glu Gln Ala Lys Gln 1100 1105 1110Pro
Glu Ile Ala Tyr Phe Glu Lys Ile Ile Ala Phe Ile Thr Leu 1115 1120
1125Val Leu Met Ala Phe Asp Ala Glu Arg Ser Asp Gly Val Phe Lys
1130 1135 1140Ile Leu Asn Lys Phe Lys Gly Ile Leu Ser Ser Thr Glu
Arg Glu 1145 1150 1155Ile Ile Tyr Thr Gln Ser Leu Asp Asp Tyr Val
Thr Thr Phe Asp 1160 1165 1170Asp Asn Met Thr Ile Asn Leu Glu Leu
Asn Met Asp Glu Leu His 1175 1180 1185Lys Thr Ser Leu Pro Gly Val
Thr Phe Lys Gln Trp Trp Asn Asn 1190 1195 1200Gln Ile Ser Arg Gly
Asn Val Lys Pro His Tyr Arg Thr Glu Gly 1205
1210 1215His Phe Met Glu Phe Thr Arg Asp Thr Ala Ala Ser Val Ala
Ser 1220 1225 1230Glu Ile Ser His Ser Pro Ala Arg Asp Phe Leu Val
Arg Gly Ala 1235 1240 1245Val Gly Ser Gly Lys Ser Thr Gly Leu Pro
Tyr His Leu Ser Lys 1250 1255 1260Arg Gly Arg Val Leu Met Leu Glu
Pro Thr Arg Pro Leu Thr Asp 1265 1270 1275Asn Met His Lys Gln Leu
Arg Ser Glu Pro Phe Asn Cys Phe Pro 1280 1285 1290Thr Leu Arg Met
Arg Gly Lys Ser Thr Phe Gly Ser Ser Pro Ile 1295 1300 1305Thr Val
Met Thr Ser Gly Phe Ala Leu His His Phe Ala Arg Asn 1310 1315
1320Ile Ala Glu Val Lys Thr Tyr Asp Phe Val Ile Ile Asp Glu Cys
1325 1330 1335His Val Asn Asp Ala Ser Ala Ile Ala Phe Arg Asn Leu
Leu Phe 1340 1345 1350Glu His Glu Phe Glu Gly Lys Val Leu Lys Val
Ser Ala Thr Pro 1355 1360 1365Pro Gly Arg Glu Val Glu Phe Thr Thr
Gln Phe Pro Val Lys Leu 1370 1375 1380Lys Ile Glu Glu Ala Leu Ser
Phe Gln Glu Phe Val Ser Leu Gln 1385 1390 1395Gly Thr Gly Ala Asn
Ala Asp Val Ile Ser Cys Gly Asp Asn Ile 1400 1405 1410Leu Val Tyr
Val Ala Ser Tyr Asn Asp Val Asp Ser Leu Gly Lys 1415 1420 1425Leu
Leu Val Gln Lys Gly Tyr Lys Val Ser Lys Ile Asp Gly Arg 1430 1435
1440Thr Met Lys Ser Gly Gly Thr Glu Ile Ile Thr Glu Gly Thr Ser
1445 1450 1455Val Lys Lys His Phe Ile Val Ala Thr Asn Ile Ile Glu
Asn Gly 1460 1465 1470Val Thr Ile Asp Ile Asp Val Val Val Asp Phe
Gly Thr Lys Val 1475 1480 1485Val Pro Val Leu Asp Val Asp Asn Arg
Ala Val Gln Tyr Asn Lys 1490 1495 1500Thr Val Val Ser Tyr Gly Glu
Arg Ile Gln Lys Leu Gly Arg Val 1505 1510 1515Gly Arg His Lys Glu
Gly Val Ala Leu Arg Ile Gly Gln Thr Asn 1520 1525 1530Lys Thr Leu
Val Glu Ile Pro Glu Met Val Ala Thr Glu Ala Ala 1535 1540 1545Phe
Leu Cys Phe Met Tyr Asn Leu Pro Val Thr Thr Gln Ser Val 1550 1555
1560Ser Thr Thr Leu Leu Glu Asn Ala Thr Leu Leu Gln Ala Arg Thr
1565 1570 1575Met Ala Gln Phe Glu Leu Ser Tyr Phe Tyr Thr Ile Asn
Phe Val 1580 1585 1590Arg Phe Asp Gly Ser Met His Pro Val Ile His
Asp Lys Leu Lys 1595 1600 1605Arg Phe Lys Leu His Thr Cys Glu Thr
Phe Leu Asn Lys Leu Ala 1610 1615 1620Ile Pro Asn Lys Gly Leu Ser
Ser Trp Leu Thr Ser Gly Glu Tyr 1625 1630 1635Lys Arg Leu Gly Tyr
Ile Ala Glu Asp Ala Gly Ile Arg Ile Pro 1640 1645 1650Phe Val Cys
Lys Glu Ile Pro Asp Ser Leu His Glu Glu Ile Trp 1655 1660 1665His
Ile Val Val Ala His Lys Gly Asp Ser Gly Ile Gly Arg Leu 1670 1675
1680Thr Ser Val Gln Ala Ala Lys Val Val Tyr Thr Leu Gln Thr Asp
1685 1690 1695Val His Ser Ile Ala Arg Thr Leu Ala Cys Ile Asn Arg
Arg Ile 1700 1705 1710Ala Asp Glu Gln Met Lys Gln Ser His Phe Glu
Ala Ala Thr Gly 1715 1720 1725Arg Ala Phe Ser Phe Thr Asn Tyr Ser
Ile Gln Ser Ile Phe Asp 1730 1735 1740Thr Leu Lys Ala Asn Tyr Ala
Thr Lys His Thr Lys Glu Asn Ile 1745 1750 1755Ala Val Leu Gln Gln
Ala Lys Asp Gln Leu Leu Glu Phe Ser Asn 1760 1765 1770Leu Ala Lys
Asp Gln Asp Val Thr Gly Ile Ile Gln Asp Phe Asn 1775 1780 1785His
Leu Glu Thr Ile Tyr Leu Gln Ser Asp Ser Glu Val Ala Lys 1790 1795
1800His Leu Lys Leu Lys Ser His Trp Asn Lys Ser Gln Ile Thr Arg
1805 1810 1815Asp Ile Ile Ile Ala Leu Ser Val Leu Ile Gly Gly Gly
Trp Met 1820 1825 1830Leu Ala Thr Tyr Phe Lys Asp Lys Phe Asn Glu
Pro Val Tyr Phe 1835 1840 1845Gln Gly Lys Lys Asn Gln Lys His Lys
Leu Lys Met Arg Glu Ala 1850 1855 1860Arg Gly Ala Arg Gly Gln Tyr
Glu Val Ala Ala Glu Pro Glu Ala 1865 1870 1875Leu Glu His Tyr Phe
Gly Ser Ala Tyr Asn Asn Lys Gly Lys Arg 1880 1885 1890Lys Gly Thr
Thr Arg Gly Met Gly Ala Lys Ser Arg Lys Phe Ile 1895 1900 1905Asn
Met Tyr Gly Phe Asp Pro Thr Asp Phe Ser Tyr Ile Arg Phe 1910 1915
1920Val Asp Pro Leu Thr Gly His Thr Ile Asp Glu Ser Thr Asn Ala
1925 1930 1935Pro Ile Asp Leu Val Gln His Glu Phe Gly Lys Val Arg
Thr Arg 1940 1945 1950Met Leu Ile Asp Asp Glu Ile Glu Pro Gln Ser
Leu Ser Thr His 1955 1960 1965Thr Thr Ile His Ala Tyr Leu Val Asn
Ser Gly Thr Lys Lys Val 1970 1975 1980Leu Lys Val Asp Leu Thr Pro
His Ser Ser Leu Arg Ala Ser Glu 1985 1990 1995Lys Ser Thr Ala Ile
Met Gly Phe Pro Glu Arg Glu Asn Glu Leu 2000 2005 2010Arg Gln Thr
Gly Met Ala Val Pro Val Ala Tyr Asp Gln Leu Pro 2015 2020 2025Pro
Lys Asn Glu Asp Leu Thr Phe Glu Gly Glu Ser Leu Phe Lys 2030 2035
2040Gly Pro Arg Asp Tyr Asn Pro Ile Ser Ser Thr Ile Cys His Leu
2045 2050 2055Thr Asn Glu Ser Asp Gly His Thr Thr Ser Leu Tyr Gly
Ile Gly 2060 2065 2070Phe Gly Pro Phe Ile Ile Thr Asn Lys His Leu
Phe Arg Arg Asn 2075 2080 2085Asn Gly Thr Leu Leu Val Gln Ser Leu
His Gly Val Phe Lys Val 2090 2095 2100Lys Asn Thr Thr Thr Leu Gln
Gln His Leu Ile Asp Gly Arg Asp 2105 2110 2115Met Ile Ile Ile Arg
Met Pro Lys Asp Phe Pro Pro Phe Pro Gln 2120 2125 2130Lys Leu Lys
Phe Arg Glu Pro Gln Arg Glu Glu Arg Ile Cys Leu 2135 2140 2145Val
Thr Thr Asn Phe Gln Thr Lys Ser Met Ser Ser Met Val Ser 2150 2155
2160Asp Thr Ser Cys Thr Phe Pro Ser Ser Asp Gly Ile Phe Trp Lys
2165 2170 2175His Trp Ile Gln Thr Lys Asp Gly Gln Cys Gly Ser Pro
Leu Val 2180 2185 2190Ser Thr Arg Asp Gly Phe Ile Val Gly Ile His
Ser Ala Ser Asn 2195 2200 2205Phe Thr Asn Thr Asn Asn Tyr Phe Thr
Ser Val Pro Lys Asn Phe 2210 2215 2220Met Glu Leu Leu Thr Asn Gln
Glu Ala Gln Gln Trp Val Ser Gly 2225 2230 2235Trp Arg Leu Asn Ala
Asp Ser Val Leu Trp Gly Gly His Lys Val 2240 2245 2250Phe Met Ser
Lys Pro Glu Glu Pro Phe Gln Pro Val Lys Glu Ala 2255 2260 2265Thr
Gln Leu Met Asn Glu Leu Val Tyr Ser Gln Gly Glu Lys Arg 2270 2275
2280Lys Trp Val Val Glu Ala Leu Ser Gly Asn Leu Arg Pro Val Ala
2285 2290 2295Glu Cys Pro Ser Gln Leu Val Thr Lys His Val Val Lys
Gly Lys 2300 2305 2310Cys Pro Leu Phe Glu Leu Tyr Leu Gln Leu Asn
Pro Glu Lys Glu 2315 2320 2325Ala Tyr Phe Lys Pro Met Met Gly Ala
Tyr Lys Pro Ser Arg Leu 2330 2335 2340Asn Arg Glu Ala Phe Leu Lys
Asp Ile Leu Lys Tyr Ala Ser Glu 2345 2350 2355Ile Glu Ile Gly Asn
Val Asp Cys Asp Leu Leu Glu Leu Ala Ile 2360 2365 2370Ser Met Leu
Val Thr Lys Leu Lys Ala Leu Gly Phe Pro Thr Val 2375 2380 2385Asn
Tyr Ile Thr Asp Pro Glu Glu Ile Phe Ser Ala Leu Asn Met 2390 2395
2400Lys Ala Ala Met Gly Ala Leu Tyr Lys Gly Lys Lys Lys Glu Ala
2405 2410 2415Leu Ser Glu Leu Thr Leu Asp Glu Gln Glu Ala Met Leu
Lys Ala 2420 2425 2430Ser Cys Leu Arg Leu Tyr Thr Gly Lys Leu Gly
Ile Trp Asn Gly 2435 2440 2445Ser Leu Lys Ala Glu Leu Arg Pro Ile
Glu Lys Val Glu Asn Asn 2450 2455 2460Lys Thr Arg Thr Phe Thr Ala
Ala Pro Ile Asp Thr Leu Leu Ala 2465 2470 2475Gly Lys Val Cys Val
Asp Asp Phe Asn Asn Gln Phe Tyr Asp Leu 2480 2485 2490Asn Ile Lys
Ala Pro Trp Thr Val Gly Met Thr Lys Phe Tyr Gln 2495 2500 2505Gly
Trp Asn Glu Leu Met Glu Ala Leu Pro Ser Gly Trp Val Tyr 2510 2515
2520Cys Asp Ala Asp Gly Ser Gln Phe Asp Ser Ser Leu Thr Pro Phe
2525 2530 2535Leu Ile Asn Ala Val Leu Lys Val Arg Leu Ala Phe Met
Glu Glu 2540 2545 2550Trp Asp Ile Gly Glu Gln Met Leu Arg Asn Leu
Tyr Thr Glu Ile 2555 2560 2565Val Tyr Thr Pro Ile Leu Thr Pro Asp
Gly Thr Ile Ile Lys Lys 2570 2575 2580His Lys Gly Asn Asn Ser Gly
Gln Pro Ser Thr Val Val Asp Asn 2585 2590 2595Thr Leu Met Val Ile
Ile Ala Met Leu Tyr Thr Cys Glu Lys Cys 2600 2605 2610Gly Ile Asn
Lys Glu Glu Ile Val Tyr Tyr Val Asn Gly Asp Asp 2615 2620 2625Leu
Leu Ile Ala Ile His Pro Asp Lys Ala Glu Arg Leu Ser Arg 2630 2635
2640Phe Lys Glu Ser Phe Gly Glu Leu Gly Leu Lys Tyr Glu Phe Asp
2645 2650 2655Cys Thr Thr Arg Asp Lys Thr Gln Leu Trp Phe Met Ser
His Arg 2660 2665 2670Ala Leu Glu Arg Asp Gly Met Tyr Ile Pro Lys
Leu Glu Glu Glu 2675 2680 2685Arg Ile Val Ser Ile Leu Glu Trp Asp
Arg Ser Lys Glu Pro Ser 2690 2695 2700His Arg Leu Glu Ala Ile Cys
Ala Ser Met Ile Glu Ala Trp Gly 2705 2710 2715Tyr Asp Lys Leu Val
Glu Glu Ile Arg Asn Phe Tyr Ala Trp Val 2720 2725 2730Leu Glu Gln
Ala Pro Tyr Ser Gln Leu Ala Glu Glu Gly Lys Ala 2735 2740 2745Pro
Tyr Leu Ala Glu Thr Ala Leu Lys Phe Leu Tyr Thr Ser Gln 2750 2755
2760His Gly Thr Asn Ser Glu Ile Glu Glu Tyr Leu Lys Val Leu Tyr
2765 2770 2775Asp Tyr Asp Ile Pro Thr Thr Glu Asn Leu Tyr Phe Gln
Ser Gly 2780 2785 2790Thr Val Asp Ala Gly Ala Asp Ala Gly Lys Lys
Lys Asp Gln Lys 2795 2800 2805Asp Asp Lys Val Ala Glu Gln Ala Ser
Lys Asp Arg Asp Val Asn 2810 2815 2820Ala Gly Thr Ser Gly Thr Phe
Ser Val Pro Arg Ile Asn Ala Met 2825 2830 2835Ala Thr Lys Leu Gln
Tyr Pro Arg Met Arg Gly Glu Val Val Val 2840 2845 2850Asn Leu Asn
His Leu Leu Gly Tyr Lys Pro Gln Gln Ile Asp Leu 2855 2860 2865Ser
Asn Ala Arg Ala Thr His Glu Gln Phe Ala Ala Trp His Gln 2870 2875
2880Ala Val Met Thr Ala Tyr Gly Val Asn Glu Glu Gln Met Lys Ile
2885 2890 2895Leu Leu Asn Gly Phe Met Val Trp Cys Ile Glu Asn Gly
Thr Ser 2900 2905 2910Pro Asn Leu Asn Gly Thr Trp Val Met Met Asp
Gly Glu Asp Gln 2915 2920 2925Val Ser Tyr Pro Leu Lys Pro Met Val
Glu Asn Ala Gln Pro Thr 2930 2935 2940Leu Arg Gln Ile Met Thr His
Phe Ser Asp Leu Ala Glu Ala Tyr 2945 2950 2955Ile Glu Met Arg Asn
Arg Glu Arg Pro Tyr Met Pro Arg Tyr Gly 2960 2965 2970Leu Gln Arg
Asn Ile Thr Asp Met Ser Leu Ser Arg Tyr Ala Phe 2975 2980 2985Asp
Phe Tyr Glu Leu Thr Ser Lys Thr Pro Val Arg Ala Arg Glu 2990 2995
3000Ala His Met Gln Met Lys Ala Ala Ala Val Arg Asn Ser Gly Thr
3005 3010 3015Arg Leu Phe Gly Leu Asp Gly Asn Val Gly Thr Ala Glu
Glu Asp 3020 3025 3030Thr Glu Arg His Thr Ala His Asp Val Asn Arg
Asn Met His Thr 3035 3040 3045Leu Leu Gly Val Arg Gln
3050
85 4157 DNA Homo sapiens
85 agcggggcgg ggcgccagcg ctgccttttc
tcctgccggg tagtttcgct ttcctgcgca 60gagtctgcgg aggggctcgg ctgcaccggg
gggatcgcgc ctggcagacc ccagaccgag 120cagaggcgac ccagcgcgct
cgggagaggc tgcaccgccg cgcccccgcc tagcccttcc 180ggatcctgcg
cgcagaaaag tttcatttgc tgtatgccat cctcgagagc tgtctaggtt
240aacgttcgca ctctgtgtat ataacctcga cagtcttggc acctaacgtg
ctgtgcgtag 300ctgctccttt ggttgaatcc ccaggccctt gttggggcac
aaggtggcag gatgtctcag 360tggtacgaac ttcagcagct tgactcaaaa
ttcctggagc aggttcacca gctttatgat 420gacagttttc ccatggaaat
cagacagtac ctggcacagt ggttagaaaa gcaagactgg 480gagcacgctg
ccaatgatgt ttcatttgcc accatccgtt ttcatgacct cctgtcacag
540ctggatgatc aatatagtcg cttttctttg gagaataact tcttgctaca
gcataacata 600aggaaaagca agcgtaatct tcaggataat tttcaggaag
acccaatcca gatgtctatg 660atcatttaca gctgtctgaa ggaagaaagg
aaaattctgg aaaacgccca gagatttaat 720caggctcagt cggggaatat
tcagagcaca gtgatgttag acaaacagaa agagcttgac 780agtaaagtca
gaaatgtgaa ggacaaggtt atgtgtatag agcatgaaat caagagcctg
840gaagatttac aagatgaata tgacttcaaa tgcaaaacct tgcagaacag
agaacacgag 900accaatggtg tggcaaagag tgatcagaaa caagaacagc
tgttactcaa gaagatgtat 960ttaatgcttg acaataagag aaaggaagta
gttcacaaaa taatagagtt gctgaatgtc 1020actgaactta cccagaatgc
cctgattaat gatgaactag tggagtggaa gcggagacag 1080cagagcgcct
gtattggggg gccgcccaat gcttgcttgg atcagctgca gaactggttc
1140actatagttg cggagagtct gcagcaagtt cggcagcagc ttaaaaagtt
ggaggaattg 1200gaacagaaat acacctacga acatgaccct atcacaaaaa
acaaacaagt gttatgggac 1260cgcaccttca gtcttttcca gcagctcatt
cagagctcgt ttgtggtgga aagacagccc 1320tgcatgccaa cgcaccctca
gaggccgctg gtcttgaaga caggggtcca gttcactgtg 1380aagttgagac
tgttggtgaa attgcaagag ctgaattata atttgaaagt caaagtctta
1440tttgataaag atgtgaatga gagaaataca gtaaaaggat ttaggaagtt
caacattttg 1500ggcacgcaca caaaagtgat gaacatggag gagtccacca
atggcagtct ggcggctgaa 1560tttcggcacc tgcaattgaa agaacagaaa
aatgctggca ccagaacgaa tgagggtcct 1620ctcatcgtta ctgaagagct
tcactccctt agttttgaaa cccaattgtg ccagcctggt 1680ttggtaattg
acctcgagac gacctctctg cccgttgtgg tgatctccaa cgtcagccag
1740ctcccgagcg gttgggcctc catcctttgg tacaacatgc tggtggcgga
acccaggaat 1800ctgtccttct tcctgactcc accatgtgca cgatgggctc
agctttcaga agtgctgagt 1860tggcagtttt cttctgtcac caaaagaggt
ctcaatgtgg accagctgaa catgttggga 1920gagaagcttc ttggtcctaa
cgccagcccc gatggtctca ttccgtggac gaggttttgt 1980aaggaaaata
taaatgataa aaattttccc ttctggcttt ggattgaaag catcctagaa
2040ctcattaaaa aacacctgct ccctctctgg aatgatgggt gcatcatggg
cttcatcagc 2100aaggagcgag agcgtgccct gttgaaggac cagcagccgg
ggaccttcct gctgcggttc 2160agtgagagct cccgggaagg ggccatcaca
ttcacatggg tggagcggtc ccagaacgga 2220ggcgaacctg acttccatgc
ggttgaaccc tacacgaaga aagaactttc tgctgttact 2280ttccctgaca
tcattcgcaa ttacaaagtc atggctgctg agaatattcc tgagaatccc
2340ctgaagtatc tgtatccaaa tattgacaaa gaccatgcct ttggaaagta
ttactccagg 2400ccaaaggaag caccagagcc aatggaactt gatggcccta
aaggaactgg atatatcaag 2460actgagttga tttctgtgtc tgaagttcac
ccttctagac ttcagaccac agacaacctg 2520ctccccatgt ctcctgagga
gtttgacgag gtgtctcgga tagtgggctc tgtagaattc 2580gacagtatga
tgaacacagt atagagcatg aatttttttc atcttctctg gcgacagttt
2640tccttctcat ctgtgattcc ctcctgctac tctgttcctt cacatcctgt
gtttctaggg 2700aaatgaaaga aaggccagca aattcgctgc aacctgttga
tagcaagtga atttttctct 2760aactcagaaa catcagttac tctgaagggc
atcatgcatc ttactgaagg taaaattgaa 2820aggcattctc tgaagagtgg
gtttcacaag tgaaaaacat ccagatacac ccaaagtatc 2880aggacgagaa
tgagggtcct ttgggaaagg agaagttaag caacatctag caaatgttat
2940gcataaagtc agtgcccaac tgttataggt tgttggataa atcagtggtt
atttagggaa 3000ctgcttgacg taggaacggt aaatttctgt gggagaattc
ttacatgttt tctttgcttt 3060aagtgtaact ggcagttttc cattggttta
cctgtgaaat agttcaaagc caagtttata 3120tacaattata tcagtcctct
ttcaaaggta gccatcatgg atctggtagg gggaaaatgt 3180gtattttatt
acatctttca cattggctat ttaaagacaa agacaaattc tgtttcttga
3240gaagagaata ttagctttac tgtttgttat ggcttaatga cactagctaa
tatcaataga 3300aggatgtaca tttccaaatt cacaagttgt gtttgatatc
caaagctgaa tacattctgc 3360tttcatcttg gtcacataca attattttta
cagttctccc aagggagtta ggctattcac 3420aaccactcat tcaaaagttg
aaattaacca tagatgtaga taaactcaga aatttaattc 3480atgtttctta
aatgggctac tttgtccttt ttgttattag ggtggtattt agtctattag
3540ccacaaaatt gggaaaggag tagaaaaagc agtaactgac aacttgaata
atacaccaga 3600gataatatga gaatcagatc atttcaaaac tcatttccta
tgtaactgca ttgagaactg 3660catatgtttc gctgatatat
gtgtttttca catttgcgaa tggttccatt ctctctcctg 3720tactttttcc
agacactttt ttgagtggat gatgtttcgt gaagtatact gtatttttac
3780ctttttcctt ccttatcact gacacaaaaa gtagattaag agatgggttt
gacaaggttc 3840ttccctttta catactgctg tctatgtggc tgtatcttgt
ttttccacta ctgctaccac 3900aactatatta tcatgcaaat gctgtattct
tctttggtgg agataaagat ttcttgagtt 3960ttgttttaaa attaaagcta
aagtatctgt attgcattaa atataatatg cacacagtgc 4020tttccgtggc
actgcataca atctgaggcc tcctctctca gtttttatat agatggcgag
4080aacctaagtt tcagttgatt ttacaattga aatgactaaa aaacaaagaa
gacaacatta 4140aaacaatatt gtttcta 4157
86 4451 DNA Homo sapiens
86 gctcatacta gggacgggaa gtcgcgacca gagccattgg agggcgcggg gactgcaacc
60ctaatcagca gagcccaaat ggcgcagtgg gaaatgctgc agaatcttga cagccccttt
120caggatcagc tgcaccagct ttactcgcac agcctcctgc ctgtggacat
tcgacagtac 180ttggctgtct ggattgaaga ccagaactgg caggaagctg
cacttgggag tgatgattcc 240aaggctacca tgctattctt ccacttcttg
gatcagctga actatgagtg tggccgttgc 300agccaggacc cagagtcctt
gttgctgcag cacaatttgc ggaaattctg ccgggacatt 360cagccctttt
cccaggatcc tacccagttg gctgagatga tctttaacct ccttctggaa
420gaaaaaagaa ttttgatcca ggctcagagg gcccaattgg aacaaggaga
gccagttctc 480gaaacacctg tggagagcca gcaacatgag attgaatccc
ggatcctgga tttaagggct 540atgatggaga agctggtaaa atccatcagc
caactgaaag accagcagga tgtcttctgc 600ttccgatata agatccaggc
caaagggaag acaccctctc tggaccccca tcagaccaaa 660gagcagaaga
ttctgcagga aactctcaat gaactggaca aaaggagaaa ggaggtgctg
720gatgcctcca aagcactgct aggccgatta actaccctaa tcgagctact
gctgccaaag 780ttggaggagt ggaaggccca gcagcaaaaa gcctgcatca
gagctcccat tgaccacggg 840ttggaacagc tggagacatg gttcacagct
ggagcaaagc tgttgtttca cctgaggcag 900ctgctgaagg agctgaaggg
actgagttgc ctggttagct atcaggatga ccctctgacc 960aaaggggtgg
acctacgcaa cgcccaggtc acagagttgc tacagcgtct gctccacaga
1020gcctttgtgg tagaaaccca gccctgcatg ccccaaactc cccatcgacc
cctcatcctc 1080aagactggca gcaagttcac cgtccgaaca aggctgctgg
tgagactcca ggaaggcaat 1140gagtcactga ctgtggaagt ctccattgac
aggaatcctc ctcaattaca aggcttccgg 1200aagttcaaca ttctgacttc
aaaccagaaa actttgaccc ccgagaaggg gcagagtcag 1260ggtttgattt
gggactttgg ttacctgact ctggtggagc aacgttcagg tggttcagga
1320aagggcagca ataaggggcc actaggtgtg acagaggaac tgcacatcat
cagcttcacg 1380gtcaaatata cctaccaggg tctgaagcag gagctgaaaa
cggacaccct ccctgtggtg 1440attatttcca acatgaacca gctctcaatt
gcctgggctt cagttctctg gttcaatttg 1500ctcagcccaa accttcagaa
ccagcagttc ttctccaacc cccccaaggc cccctggagc 1560ttgctgggcc
ctgctctcag ttggcagttc tcctcctatg ttggccgagg cctcaactca
1620gaccagctga gcatgctgag aaacaagctg ttcgggcaga actgtaggac
tgaggatcca 1680ttattgtcct gggctgactt cactaagcga gagagccctc
ctggcaagtt accattctgg 1740acatggctgg acaaaattct ggagttggta
catgaccacc tgaaggatct ctggaatgat 1800ggacgcatca tgggctttgt
gagtcggagc caggagcgcc ggctgctgaa gaagaccatg 1860tctggcacct
ttctactgcg cttcagtgaa tcgtcagaag ggggcattac ctgctcctgg
1920gtggagcacc aggatgatga caaggtgctc atctactctg tgcaaccgta
cacgaaggag 1980gtgctgcagt cactcccgct gactgaaatc atccgccatt
accagttgct cactgaggag 2040aatatacctg aaaacccact gcgcttcctc
tatccccgaa tcccccggga tgaagctttt 2100gggtgctact accaggagaa
agttaatctc caggaacgga ggaaatacct gaaacacagg 2160ctcattgtgg
tctctaatag acaggtggat gaactgcaac aaccgctgga gcttaagcca
2220gagccagagc tggagtcatt agagctggaa ctagggctgg tgccagagcc
agagctcagc 2280ctggacttag agccactgct gaaggcaggg ctggatctgg
ggccagagct agagtctgtg 2340ctggagtcca ctctggagcc tgtgatagag
cccacactat gcatggtatc acaaacagtg 2400ccagagccag accaaggacc
tgtatcacag ccagtgccag agccagattt gccctgtgat 2460ctgagacatt
tgaacactga gccaatggaa atcttcagaa actgtgtaaa gattgaagaa
2520atcatgccga atggtgaccc actgttggct ggccagaaca ccgtggatga
ggtttacgtc 2580tcccgcccca gccacttcta cactgatgga cccttgatgc
cttctgactt ctaggaacca 2640catttcctct gttcttttca tatctcttgc
ccttcctact cctcatagca tgatattgtt 2700ctccaaggat gggaatcagg
catgtgtccc ttccaagctg tgttaactgt tcaaactcag 2760gcctgtgtga
ctccattggg gtgagaggtg aaagcataac atgggtacag aggggacaac
2820aatgaatcag aacagatgct gagccatagg tctaaatagg atcctggagg
ctgcctgctg 2880tgctgggagg tataggggtc ctgggggcag gccagggcag
ttgacaggta cttggagggc 2940tcagggcagt ggcttctttc cagtatggaa
ggatttcaac attttaatag ttggttaggc 3000taaactggtg catactggca
ttggcccttg gtggggagca cagacacagg ataggactcc 3060atttctttct
tccattcctt catgtctagg ataacttgct ttcttctttc ctttactcct
3120ggctcaagcc ctgaatttct tcttttcctg caggggttga gagctttctg
ccttagccta 3180ccatgtgaaa ctctaccctg aagaaaggga tggataggaa
gtagacctct ttttcttacc 3240agtctcctcc cctactctgc ccctaagctg
gctgtacctg ttcctccccc ataaaatgat 3300cctgccaatc taatgtgagt
gtgaagcttt gcacactagt ttatgctacc tagtctccac 3360tttctcaatg
cttaggagac agatcactcc tggaggctgg ggatggtagg attgctgggg
3420attttttttt ttttaaacag ggtctcactc tgttgcccag gctagagtgc
aatggtgcaa 3480tcacagctca ctgcagcctc aacctcctgg gttcaagcaa
tcctcctacc tcagcctcct 3540gggtagctag caccatggca tgcgccacca
tgccctattt ttttttttta aagacagggt 3600cttgctatat tgcccaggct
ggtcttgaac tgggctcaag tgatcctcac gccttggcct 3660cccaaagtgc
tgggattata ggcatgagcc actgtgcttg gccaggattt tttttttttt
3720ttttttgaga tggagtttct ctcttgttgt ccaggctgga gtgcaatggt
gtgatctcgg 3780ctcactgcaa cctccgcctt ccgggttcaa gtgactctcc
tgcctcagcc tccccagtag 3840ctgggattac agatctgcac caccatgccc
agctaatttt gtatttttag tagagacggg 3900gtttctccat gttggtcagg
ctggtctcga actcctgacc tcaagtgatc tgtccacctc 3960ggcctcccag
agtgctggga ttacaggcgt gagccactgt tcccagcagg aatttctttt
4020ttatagtatt ggataaagtt tggtgttttt acagaggaga agcaatgggt
cttagctctt 4080tctctattat gttatcatcc tccctttttt gtacaatatg
ttgtttacct gaaaggaagg 4140tttctattcg ttggttgtgg acctggacaa
agtccaagtc tgtggaactt aaaaccttga 4200aggtctgtca taggactctg
gacaatctca caccttagct attcccaggg aaccccaggg 4260ggcaactgac
attgctccaa gatgttctcc tgatgtagct tgagatataa aggaaaggcc
4320ctgcacaggt ggctgtttct tgtctgttat gtcagaggaa cagtcctgtt
cagaaagggg 4380ctcttctgag cagaaatggc taataaactt tgtgctgatc
tggaaaaaaa aaaaaaaaaa 4440aaaaaaaaaa a 4451879PRTHomo sapiens 87Gly
Ser Glu Asn Leu Tyr Phe Gln Leu1 58827DNAhomo sapiens 88tctagaggcc
tgatcatccg gtctcac 278929DNAhomo sapiens 89tctagatgga aaacagaagt
cccggaaac 29902290DNAHomo sapiens 90gaattccgaa tcatgtgcag
aatgctgaat cttcccccag ccaggacgaa taagacagcg 60cggaaaagca gattctcgta
attctggaat tgcatgttgc aaggagtctc ctggatcttc 120gcacccagct
tcgggtaggg agggagtccg ggtcccgggc taggccagcc cggcaggtgg
180agagggtccc cggcagcccc gcgcgcccct ggccatgtct ttaatgccct
gccccttcat 240gtggccttct gagggttccc agggctggcc agggttgttt
cccacccgcg cgcgcgctct 300cacccccagc caaacccacc tggcagggct
ccctccagcc gagacctttt gattcccggc 360tcccgcgctc ccgcctccgc
gccagcccgg gaggtggccc tggacagccg gacctcgccc 420ggccccggct
gggaccatgg tgtttctctc gggaaatgct tccgacagct ccaactgcac
480ccaaccgccg gcaccggtga acatttccaa ggccattctg ctcggggtga
tcttgggggg 540cctcattctt ttcggggtgc tgggtaacat cctagtgatc
ctctccgtag cctgtcaccg 600acacctgcac tcagtcacgc actactacat
cgtcaacctg gcggtggccg acctcctgct 660cacctccacg gtgctgccct
tctccgccat cttcgaggtc ctaggctact gggccttcgg 720cagggtcttc
tgcaacatct gggcggcagt ggatgtgctg tgctgcaccg cgtccatcat
780gggcctctgc atcatctcca tcgaccgcta catcggcgtg agctacccgc
tgcgctaccc 840aaccatcgtc acccagagga ggggtctcat ggctctgctc
tgcgtctggg cactctccct 900ggtcatatcc attggacccc tgttcggctg
gaggcagccg gcccccgagg acgagaccat 960ctgccagatc aacgaggagc
cgggctacgt gctcttctca gcgctgggct ccttctacct 1020gcctctggcc
atcatcctgg tcatgtactg ccgcgtctac gtggtggcca agagggagag
1080ccggggcctc aagtctggcc tcaagaccga caagtcggac tcggagcaag
tgacgctccg 1140catccatcgg aaaaacgccc cggcaggagg cagcgggatg
gccagcgcca agaccaagac 1200gcacttctca gtgaggctcc tcaagttctc
ccgggagaag aaagcggcca aaacgctggg 1260catcgtggtc ggctgcttcg
tcctctgctg gctgcctttt ttcttagtca tgcccattgg 1320gtctttcttc
cctgatttca agccctctga aacagttttt aaaatagtat tttggctcgg
1380atatctaaac agctgcatca accccatcat atacccatgc tccagccaag
agttcaaaaa 1440ggcctttcag aatgtcttga gaatccagtg tctccgcaga
aagcagtctt ccaaacatgc 1500cctgggctac accctgcacc cgcccagcca
ggccgtggaa gggcaacaca aggacatggt 1560gcgcatcccc gtgggatcaa
gagagacctt ctacaggatc tccaagacgg atggcgtttg 1620tgaatggaaa
tttttctctt ccatgccccg tggatctgcc aggattacag tgtccaaaga
1680ccaatcctcc tgtaccacag cccgggtgag aagtaaaagc tttttggagg
tctgctgctg 1740tgtagggccc tcaaccccca gccttgacaa gaaccatcaa
gttccaacca ttaaggtcca 1800caccatctcc ctcagtgaga acggggagga
agtctaggac aggaaagatg cagaggaaag 1860gggaataatc ttaggtaccc
accccacttc cttctcggaa ggccagctct tcttggagga 1920caagacagga
ccaatcaaag aggggacctg ctgggaatgg ggtgggtggt agacccaact
1980catcaggcag cgggtagggc acagggaaga gggagggtgt ctcacaacca
accagttcag 2040aatgatacgg aacagcattt ccctgcagct aatgctttct
tggtcactct gtgcccactt 2100caacgaaaac caccatggga aacagaattt
catgcacaat ccaaaagact ataaatatag 2160gattatgatt tcatcatgaa
tattttgagc acacactcta agtttggagc tatttcttga 2220tggaagtgag
gggattttat tttcaggctc aacctactga cagccacatt tgacatttat
2280gccggaattc 2290
91 26 DNA homo sapiens
91 ctcggatatc taaacagctg catcaa 26
92 29 DNA homo sapiens
92 tctagacttt ctgcagagac actggattc 29
93 31 DNA homo sapiens
93 tctagatcga aggcagtgga ggatcttcag g 31
94 27 DNA homo sapiens
94 tctagaggcc tgatcatccg gtctcac 27
95 23 DNA homo sapiens
95 cggatccgtt ggtactcttg agg 23
96 4989 DNA homo sapiens
96 tttttttttt ttttgagaaa gggaatttca tcccaaataa aaggaatgaa gtctggctcc
60ggaggagggt ccccgacctc gctgtggggg ctcctgtttc tctccgccgc gctctcgctc
120tggccgacga gtggagaaat ctgcgggcca ggcatcgaca tccgcaacga
ctatcagcag 180ctgaagcgcc tggagaactg cacggtgatc gagggctacc
tccacatcct gctcatctcc 240aaggccgagg actaccgcag ctaccgcttc
cccaagctca cggtcattac cgagtacttg 300ctgctgttcc gagtggctgg
cctcgagagc ctcggagacc tcttccccaa cctcacggtc 360atccgcggct
ggaaactctt ctacaactac gccctggtca tcttcgagat gaccaatctc
420aaggatattg ggctttacaa cctgaggaac attactcggg gggccatcag
gattgagaaa 480aatgctgacc tctgttacct ctccactgtg gactggtccc
tgatcctgga tgcggtgtcc 540aataactaca ttgtggggaa taagccccca
aaggaatgtg gggacctgtg tccagggacc 600atggaggaga agccgatgtg
tgagaagacc accatcaaca atgagtacaa ctaccgctgc 660tggaccacaa
accgctgcca gaaaatgtgc ccaagcacgt gtgggaagcg ggcgtgcacc
720gagaacaatg agtgctgcca ccccgagtgc ctgggcagct gcagcgcgcc
tgacaacgac 780acggcctgtg tagcttgccg ccactactac tatgccggtg
tctgtgtgcc tgcctgcccg 840cccaacacct acaggtttga gggctggcgc
tgtgtggacc gtgacttctg cgccaacatc 900ctcagcgccg agagcagcga
ctccgagggg tttgtgatcc acgacggcga gtgcatgcag 960gagtgcccct
cgggcttcat ccgcaacggc agccagagca tgtactgcat cccttgtgaa
1020ggtccttgcc cgaaggtctg tgaggaagaa aagaaaacaa agaccattga
ttctgttact 1080tctgctcaga tgctccaagg atgcaccatc ttcaagggca
atttgctcat taacatccga 1140cgggggaata acattgcttc agagctggag
aacttcatgg ggctcatcga ggtggtgacg 1200ggctacgtga agatccgcca
ttctcatgcc ttggtctcct tgtccttcct aaaaaacctt 1260cgcctcatcc
taggagagga gcagctagaa gggaattact ccttctacgt cctcgacaac
1320cagaacttgc agcaactgtg ggactgggac caccgcaacc tgaccatcaa
agcagggaaa 1380atgtactttg ctttcaatcc caaattatgt gtttccgaaa
tttaccgcat ggaggaagtg 1440acggggacta aagggcgcca aagcaaaggg
gacataaaca ccaggaacaa cggggagaga 1500gcctcctgtg aaagtgacgt
cctgcatttc acctccacca ccacgtcgaa gaatcgcatc 1560atcataacct
ggcaccggta ccggccccct gactacaggg atctcatcag cttcaccgtt
1620tactacaagg aagcaccctt taagaatgtc acagagtatg atgggcagga
tgcctgcggc 1680tccaacagct ggaacatggt ggacgtggac ctcccgccca
acaaggacgt ggagcccggc 1740atcttactac atgggctgaa gccctggact
cagtacgccg tttacgtcaa ggctgtgacc 1800ctcaccatgg tggagaacga
ccatatccgt ggggccaaga gtgagatctt gtacattcgc 1860accaatgctt
cagttccttc cattcccttg gacgttcttt cagcatcgaa ctcctcttct
1920cagttaatcg tgaagtggaa ccctccctct ctgcccaacg gcaacctgag
ttactacatt 1980gtgcgctggc agcggcagcc tcaggacggc tacctttacc
ggcacaatta ctgctccaaa 2040gacaaaatcc ccatcaggaa gtatgccgac
ggcaccatcg acattgagga ggtcacagag 2100aaccccaaga ctgaggtgtg
tggtggggag aaagggcctt gctgcgcctg ccccaaaact 2160gaagccgaga
agcaggccga gaaggaggag gctgaatacc gcaaagtctt tgagaatttc
2220ctgcacaact ccatcttcgt gcccagacct gaaaggaagc ggagagatgt
catgcaagtg 2280gccaacacca ccatgtccag ccgaagcagg aacaccacgg
ccgcagacac ctacaacatc 2340accgacccgg aagagctgga gacagagtac
cctttctttg agagcagagt ggataacaag 2400gagagaactg tcatttctaa
ccttcggcct ttcacattgt accgcatcga tatccacagc 2460tgcaaccacg
aggctgagaa gctgggctgc agcgcctcca acttcgtctt tgcaaggact
2520atgcccgcag aaggagcaga tgacattcct gggccagtga cctgggagcc
aaggcctgaa 2580aactccatct ttttaaagtg gccggaacct gagaatccca
atggattgat tctaatgtat 2640gaaataaaat acggatcaca agttgaggat
cagcgagaat gtgtgtccag acaggaatac 2700aggaagtatg gaggggccaa
gctaaaccgg ctaaacccgg ggaactacac agcccggatt 2760caggccacat
ctctctctgg gaatgggtcg tggacagatc ctgtgttctt ctatgtccag
2820gccaaaacag gatatgaaaa cttcatccat ctgatcatcg ctctgcccgt
cgctgtcctg 2880ttgatcgtgg gagggttggt gattatgctg tacgtcttcc
atagaaagag aaataacagc 2940aggctgggga atggagtgct gtatgcctct
gtgaacccgg agtacttcag cgctgctgat 3000gtgtacgttc ctgatgagtg
ggaggtggct cgggagaaga tcaccatgag ccgggaactt 3060gggcaggggt
cgtttgggat ggtctatgaa ggagttgcca agggtgtggt gaaagatgaa
3120cctgaaacca gagtggccat taaaacagtg aacgaggccg caagcatgcg
tgagaggatt 3180gagtttctca acgaagcttc tgtgatgaag gagttcaatt
gtcaccatgt ggtgcgattg 3240ctgggtgtgg tgtcccaagg ccagccaaca
ctggtcatca tggaactgat gacacggggc 3300gatctcaaaa gttatctccg
gtctctgagg ccagaaatgg agaataatcc agtcctagca 3360cctccaagcc
tgagcaagat gattcagatg gccggagaga ttgcagacgg catggcatac
3420ctcaacgcca ataagttcgt ccacagagac cttgctgccc ggaattgcat
ggtagccgaa 3480gatttcacag tcaaaatcgg agattttggt atgacgcgag
atatctatga gacagactat 3540taccggaaag gaggcaaagg gctgctgccc
gtgcgctgga tgtctcctga gtccctcaag 3600gatggagtct tcaccactta
ctcggacgtc tggtccttcg gggtcgtcct ctgggagatc 3660gccacactgg
ccgagcagcc ctaccagggc ttgtccaacg agcaagtcct tcgcttcgtc
3720atggagggcg gccttctgga caagccagac aactgtcctg acatgctgtt
tgaactgatg 3780cgcatgtgct ggcagtataa ccccaagatg aggccttcct
tcctggagat catcagcagc 3840atcaaagagg agatggagcc tggcttccgg
gaggtctcct tctactacag cgaggagaac 3900aagctgcccg agccggagga
gctggacctg gagccagaga acatggagag cgtccccctg 3960gacccctcgg
cctcctcgtc ctccctgcca ctgcccgaca gacactcagg acacaaggcc
4020gagaacggcc ccggccctgg ggtgctggtc ctccgcgcca gcttcgacga
gagacagcct 4080tacgcccaca tgaacggggg ccgcaagaac gagcgggcct
tgccgctgcc ccagtcttcg 4140acctgctgat ccttggatcc tgaatctgtg
caaacagtaa cgtgtgcgca cgcgcagcgg 4200ggtggggggg gagagagagt
tttaacaatc cattcacaag cctcctgtac ctcagtggat 4260cttcagttct
gcccttgctg cccgcgggag acagcttctc tgcagtaaaa cacatttggg
4320atgttccttt tttcaatatg caagcagctt tttattccct gcccaaaccc
ttaactgaca 4380tgggccttta agaaccttaa tgacaacact taatagcaac
agagcacttg agaaccagtc 4440tcctcactct gtccctgtcc ttccctgttc
tccctttctc tctcctctct gcttcataac 4500ggaaaaataa ttgccacaag
tccagctggg aagccctttt tatcagtttg aggaagtggc 4560tgtccctgtg
gccccatcca accactgtac acacccgcct gacaccgtgg gtcattacaa
4620aaaaacacgt ggagatggaa atttttacct ttatctttca cctttctagg
gacatgaaat 4680ttacaaaggg ccatcgttca tccaaggctg ttaccatttt
aacgctgcct aattttgcca 4740aaatcctgaa ctttctccct catcggcccg
gcgctgattc ctcgtgtccg gaggcatggg 4800tgagcatggc agctggttgc
tccatttgag agacacgctg gcgacacact ccgtccatcc 4860gactgcccct
gctgtgctgc tcaaggccac aggcacacag gtctcattgc ttctgactag
4920attattattt gggggaactg gacacaatag gtctttctct cagtgaaggt
ggggagaagc 4980tgaaccggc 4989973076DNAhomo sapiens 97gtttctccag
ggaggcaggg cccggggaga aagttggagc ggtaacctaa gctggcagtg 60gcgtgatccg
gcaccaaatc ggcccgcggt gcggtgcgga gactccatga ggccctggac
120atgaacaagc tgagtggagg cggcgggcgc aggactcggg tggaaggggg
ccagcttggg 180ggcgaggagt ggacccgcca cgggagcttt gtcaataagc
ccacgcgggg ctggctgcat 240cccaacgaca aagtcatggg acccggggtt
tcctacttgg ttcggtacat gggttgtgtg 300gaggtcctcc agtcaatgcg
tgccctggac ttcaacaccc ggactcaggt caccagggag 360gccatcagtc
tggtgtgtga ggctgtgccg ggtgctaagg gggcgacaag gaggagaaag
420ccctgtagcc gcccgctcag ctctatcctg gggaggagta acctgaaatt
tgctggaatg 480ccaatcactc tcaccgtctc caccagcagc ctcaacctca
tggccgcaga ctgcaaacag 540atcatcgcca accaccacat gcaatctatc
tcatttgcat ccggcgggga tccggacaca 600gccgagtatg tcgcctatgt
tgccaaagac cctgtgaatc agagagcctg ccacattctg 660gagtgtcccg
aagggcttgc ccaggatgtc atcagcacca ttggccaggc cttcgagttg
720cgcttcaaac aatacctcag gaacccaccc aaactggtca cccctcatga
caggatggct 780ggctttgatg gctcagcatg ggatgaggag gaggaagagc
cacctgacca tcagtactat 840aatgacttcc cggggaagga accccccttg
gggggggtgg tagacatgag gcttcgggaa 900ggagccgctc caggggctgc
tcgacccact gcacccaatg cccagacccc cagccacttg 960ggagctacat
tgcctgtagg acagcctgtt gggggagatc cagaagtccg caaacagatg
1020ccacctccac caccctgtcc agcaggcaga gagctttttg atgatccctc
ctatgtcaac 1080gtccagaacc tagacaaggc ccggcaagca gtgggtggtg
ctgggccccc caatcctgct 1140atcaatggca gtgcaccccg ggacctgttt
gacatgaagc ccttcgaaga tgctcttcgc 1200gtgcctccac ctccccagtc
ggtgtccatg gctgagcagc tccgagggga gccctggttc 1260catgggaagc
tgagccggcg ggaggctgag gcactgctgc agctcaatgg ggacttcctg
1320gtacgggaga gcacgaccac acctggccag tatgtgctca ctggcttgca
gagtgggcag 1380cctaagcatt tgctactggt ggaccctgag ggtgtggttc
ggactaagga tcaccgcttt 1440gaaagtgtca gtcaccttat cagctaccac
atggacaatc acttgcccat catctctgcg 1500ggcagcgaac tgtgtctaca
gcaacctgtg gagcggaaac tgtgatctgc cctagcgctc 1560tcttccagaa
gatgccctcc aatcctttcc accctattcc ctaactctcg ggacctcgtt
1620tgggagtgtt ctgtgggctt ggccttgtgt cagagctggg agtagcatgg
actctgggtt 1680tcatatccag ctgagtgaga gggtttgagt caaaagcctg
ggtgagaatc ctgcctctcc 1740ccaaacatta atcaccaaag tattaatgta
cagagtggcc cctcacctgg gcctttcctg 1800tgccaacctg atgccccttc
cccaagaagg tgagtgcttg tcatggaaaa tgtcctgtgg 1860tgacaggccc
agtggaacag tcacccttct gggcaagggg gaacaaatca cacctctggg
1920cttcagggta tcccagaccc ctctcaacac ccgccccccc
catgtttaaa ctttgtgcct 1980ttgaccatct cttaggtcta atgatatttt
atgcaaacag ttcttggacc cctgaattca 2040atgacaggga tgccaacacc
ttcttggctt ctgggacctg tgttcttgct gagcaccctc 2100tccggtttgg
gttgggataa cagaggcagg agtggcagct gtcccctctc cctggggata
2160tgcaaccctt agagattgcc ccagagcccc actcccggcc aggcgggaga
tggacccctc 2220ccttgctcag tgcctcctgg ccggggcccc tcaccccaag
gggtctgtat atacatttca 2280taaggcctgc cctcccatgt tgcatgccta
tgtactctac gccaaagtgc agcccttcct 2340cctgaagcct ctgccctgcc
tccctttctg ggagggcggg gtgggggtga ctgaatttgg 2400gcctcttgta
cagttaactc tcccaggtgg attttgtgga ggtgagaaaa ggggcattga
2460gactataaag cagtagacaa tccccacata ccatctgtag agttggaact
gcattctttt 2520aaagttttat atgcatatat tttagggctg tagacttact
ttcctatttt cttttccatt 2580gcttattctt gagcacaaaa tgataatcaa
ttattacatt tatacatcac ctttttgact 2640tttccaagcc cttttacagc
tcttggcatt ttcctcgcct aggcctgtga ggtaactggg 2700atcgcacctt
ttataccaga gacctgaggc agatgaaatt tatttccatc taggactaga
2760aaaacttggg tctcttaccg cgagactgag aggcagaagt cagcccgaat
gcctgtcagt 2820ttcatggagg ggaaacgcaa aacctgcagt tcctgagtac
cttctacagg cccggcccag 2880cctaggcccg gggtggccac accacagcaa
gccggccccc cctcttttgg ccttgtggat 2940aagggagagt tgaccgtttt
catcctggcc tccttttgct gtttggatgt ttccacgggt 3000ctcacttata
ccaaagggaa aactcttcat taaagtccgt atttcttcta aaaaaaaaaa
3060aaaaaaaaaa aaaaaa 3076
98 4 PRT homo sapiens
98 Asn Ser Gly Ser
1
99 2261 DNA homo sapiens
99 gaaatcaggc tccgggccgg ccgaagggcg
caactttccc ccctcggcgc cccaccggct 60cccgcgcgcc tcccctcgcg cccgagcttc
gagccaagca gcgtcctggg gagcgcgtca 120tggccttacc agtgaccgcc
ttgctcctgc cgctggcctt gctgctccac gccgccaggc 180cgagccagtt
ccgggtgtcg ccgctggatc ggacctggaa cctgggcgag acagtggagc
240tgaagtgcca ggtgctgctg tccaacccga cgtcgggctg ctcgtggctc
ttccagccgc 300gcggcgccgc cgccagtccc accttcctcc tatacctctc
ccaaaacaag cccaaggcgg 360ccgaggggct ggacacccag cggttctcgg
gcaagaggtt gggggacacc ttcgtcctca 420ccctgagcga cttccgccga
gagaacgagg gctactattt ctgctcggcc ctgagcaact 480ccatcatgta
cttcagccac ttcgtgccgg tcttcctgcc agcgaagccc accacgacgc
540cagcgccgcg accaccaaca ccggcgccca ccatcgcgtc gcagcccctg
tccctgcgcc 600cagaggcgtg ccggccagcg gcggggggcg cagtgcacac
gagggggctg gacttcgcct 660gtgatatcta catctgggcg cccttggccg
ggacttgtgg ggtccttctc ctgtcactgg 720ttatcaccct ttactgcaac
cacaggaacc gaagacgtgt ttgcaaatgt ccccggcctg 780tggtcaaatc
gggagacaag cccagccttt cggcgagata cgtctaaccc tgtgcaacag
840ccactacatt acttcaaact gagatccttc cttttgaggg agcaagtcct
tccctttcat 900tttttccagt cttcctccct gtgtattcat tctcatgatt
attattttag tgggggcggg 960gtgggaaaga ttactttttc tttatgtgtt
tgacgggaaa caaaactagg taaaatctac 1020agtacaccac aagggtcaca
atactgttgt gcgcacatcg cggtagggcg tggaaagggg 1080caggccagag
ctacccgcag agttctcaga atcatgctga gagagctgga ggcacccatg
1140ccatctcaac ctcttccccg cccgttttac aaagggggag gctaaagccc
agagacagct 1200tgatcaaagg cacacagcaa gtcagggttg gagcagtagc
tggagggacc ttgtctccca 1260gctcagggct ctttcctcca caccattcag
gtctttcttt ccgaggcccc tgtctcaggg 1320tgaggtgctt gagtctccaa
cggcaaggga acaagtactt cttgatacct gggatactgt 1380gcccagagcc
tcgaggaggt aatgaattaa agaagagaac tgcctttggc agagttctat
1440aatgtaaaca atatcagact tttttttttt ataatcaagc ctaaaattgt
atagacctaa 1500aataaaatga agtggtgagc ttaaccctgg aaaatgaatc
cctctatctc taaagaaaat 1560ctctgtgaaa cccctatgtg gaggcggaat
tgctctccca gcccttgcat tgcagagggg 1620cccatgaaag aggacaggct
acccctttac aaatagaatt tgagcatcag tgaggttaaa 1680ctaaggccct
cttgaatctc tgaatttgag atacaaacat gttcctggga tcactgatga
1740ctttttatac tttgtaaaga caattgttgg agagcccctc acacagccct
ggcctctgct 1800caactagcag atacagggat gaggcagacc tgactctctt
aaggaggctg agagcccaaa 1860ctgctgtccc aaacatgcac ttccttgctt
aaggtatggt acaagcaatg cctgcccatt 1920ggagagaaaa aacttaagta
gataaggaaa taagaaccac tcataattct tcaccttagg 1980aataatctcc
tgttaatatg gtgtacattc ttcctgatta ttttctacac atacatgtaa
2040aatatgtctt tcttttttaa atagggttgt actatgctgt tatgagtggc
tttaatgaat 2100aaacatttgt agcatcctct ttaatgggta aacagcaaaa
aaaaaaaaaa aaaaaaaaaa 2160aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2220aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa a 2261
100 6450 DNA homo sapiens 100
gagttgtgcc
tggagtgatg tttaagccaa tgtcagggca aggcaacagt ccctggccgt 60cctccagcac
ctttgtaatg catatgagct cgggagacca gtacttaaag ttggaggccc
120gggagcccag gagctggcgg agggcgttcg tcctgggagc tgcacttgct
ccgtcgggtc 180gccggcttca ccggaccgca ggctcccggg gcagggccgg
ggccagagct cgcgtgtcgg 240cgggacatgc gctgcgtcgc ctctaacctc
gggctgtgct ctttttccag gtggcccgcc 300ggtttctgag ccttctgccc
tgcggggaca cggtctgcac cctgcccgcg gccacggacc 360atgaccatga
ccctccacac caaagcatct gggatggccc tactgcatca gatccaaggg
420aacgagctgg agcccctgaa ccgtccgcag ctcaagatcc ccctggagcg
gcccctgggc 480gaggtgtacc tggacagcag caagcccgcc gtgtacaact
accccgaggg cgccgcctac 540gagttcaacg ccgcggccgc cgccaacgcg
caggtctacg gtcagaccgg cctcccctac 600ggccccgggt ctgaggctgc
ggcgttcggc tccaacggcc tggggggttt ccccccactc 660aacagcgtgt
ctccgagccc gctgatgcta ctgcacccgc cgccgcagct gtcgcctttc
720ctgcagcccc acggccagca ggtgccctac tacctggaga acgagcccag
cggctacacg 780gtgcgcgagg ccggcccgcc ggcattctac aggccaaatt
cagataatcg acgccagggt 840ggcagagaaa gattggccag taccaatgac
aagggaagta tggctatgga atctgccaag 900gagactcgct actgtgcagt
gtgcaatgac tatgcttcag gctaccatta tggagtctgg 960tcctgtgagg
gctgcaaggc cttcttcaag agaagtattc aaggacataa cgactatatg
1020tgtccagcca ccaaccagtg caccattgat aaaaacagga ggaagagctg
ccaggcctgc 1080cggctccgca aatgctacga agtgggaatg atgaaaggtg
ggatacgaaa agaccgaaga 1140ggagggagaa tgttgaaaca caagcgccag
agagatgatg gggagggcag gggtgaagtg 1200gggtctgctg gagacatgag
agctgccaac ctttggccaa gcccgctcat gatcaaacgc 1260tctaagaaga
acagcctggc cttgtccctg acggccgacc agatggtcag tgccttgttg
1320gatgctgagc cccccatact ctattccgag tatgatccta ccagaccctt
cagtgaagct 1380tcgatgatgg gcttactgac caacctggca gacagggagc
tggttcacat gatcaactgg 1440gcgaagaggg tgccaggctt tgtggatttg
accctccatg atcaggtcca ccttctagaa 1500tgtgcctggc tagagatcct
gatgattggt ctcgtctggc gctccatgga gcacccagtg 1560aagctactgt
ttgctcctaa cttgctcttg gacaggaacc agggaaaatg tgtagagggc
1620atggtggaga tcttcgacat gctgctggct acatcatctc ggttccgcat
gatgaatctg 1680cagggagagg agtttgtgtg cctcaaatct attattttgc
ttaattctgg agtgtacaca 1740tttctgtcca gcaccctgaa gtctctggaa
gagaaggacc atatccaccg agtcctggac 1800aagatcacag acactttgat
ccacctgatg gccaaggcag gcctgaccct gcagcagcag 1860caccagcggc
tggcccagct cctcctcatc ctctcccaca tcaggcacat gagtaacaaa
1920ggcatggagc atctgtacag catgaagtgc aagaacgtgg tgcccctcta
tgacctgctg 1980ctggagatgc tggacgccca ccgcctacat gcgcccacta
gccgtggagg ggcatccgtg 2040gaggagacgg accaaagcca cttggccact
gcgggctcta cttcatcgca ttccttgcaa 2100aagtattaca tcacggggga
ggcagagggt ttccctgcca cagtctgaga gctccctggc 2160tcccacacgg
ttcagataat ccctgctgca ttttaccctc atcatgcacc actttagcca
2220aattctgtct cctgcataca ctccggcatg catccaacac caatggcttt
ctagatgagt 2280ggccattcat ttgcttgctc agttcttagt ggcacatctt
ctgtcttctg ttgggaacag 2340ccaaagggat tccaaggcta aatctttgta
acagctctct ttcccccttg ctatgttact 2400aagcgtgagg attcccgtag
ctcttcacag ctgaactcag tctatgggtt ggggctcaga 2460taactctgtg
catttaagct acttgtagag acccaggcct ggagagtaga cattttgcct
2520ctgataagca ctttttaaat ggctctaaga ataagccaca gcaaagaatt
taaagtggct 2580cctttaattg gtgacttgga gaaagctagg tcaagggttt
attatagcac cctcttgtat 2640tcctatggca atgcatcctt ttatgaaagt
ggtacacctt aaagctttta tatgactgta 2700gcagagtatc tggtgattgt
caattcactt ccccctatag gaatacaagg ggccacacag 2760ggaaggcaga
tcccctagtt ggccaagact tattttaact tgatacactg cagattcaga
2820gtgtcctgaa gctctgcctc tggctttccg gtcatgggtt ccagttaatt
catgcctccc 2880atggacctat ggagagcaac aagttgatct tagttaagtc
tccctatatg agggataagt 2940tcctgatttt tgtttttatt tttgtgttac
aaaagaaagc cctccctccc tgaacttgca 3000gtaaggtcag cttcaggacc
tgttccagtg ggcactgtac ttggatcttc ccggcgtgtg 3060tgtgccttac
acaggggtga actgttcact gtggtgatgc atgatgaggg taaatggtag
3120ttgaaaggag caggggccct ggtgttgcat ttagccctgg ggcatggagc
tgaacagtac 3180ttgtgcagga ttgttgtggc tactagagaa caagagggaa
agtagggcag aaactggata 3240cagttctgag cacagccaga cttgctcagg
tggccctgca caggctgcag ctacctagga 3300acattccttg cagaccccgc
attgcctttg ggggtgccct gggatccctg gggtagtcca 3360gctcttattc
atttcccagc gtggccctgg ttggaagaag cagctgtcaa gttgtagaca
3420gctgtgttcc tacaattggc ccagcaccct ggggcacggg agaagggtgg
ggaccgttgc 3480tgtcactact caggctgact ggggcctggt cagattacgt
atgcccttgg tggtttagag 3540ataatccaaa atcagggttt ggtttgggga
agaaaatcct cccccttcct cccccgcccc 3600gttccctacc gcctccactc
ctgccagctc atttccttca atttcctttg acctataggc 3660taaaaaagaa
aggctcattc cagccacagg gcagccttcc ctgggccttt gcttctctag
3720cacaattatg ggttacttcc tttttcttaa caaaaaagaa tgtttgattt
cctctgggtg 3780accttattgt ctgtaattga aaccctattg agaggtgatg
tctgtgttag ccaatgaccc 3840aggtagctgc tcgggcttct cttggtatgt
cttgtttgga aaagtggatt tcattcattt 3900ctgattgtcc agttaagtga
tcaccaaagg actgagaatc tgggagggca aaaaaaaaaa 3960aaaaagtttt
tatgtgcact taaatttggg gacaatttta tgtatctgtg ttaaggatat
4020gcttaagaac ataattcttt tgttgctgtt tgtttaagaa gcaccttagt
ttgtttaaga 4080agcaccttat atagtataat atatattttt ttgaaattac
attgcttgtt tatcagacaa 4140ttgaatgtag taattctgtt ctggatttaa
tttgactggg ttaacatgca aaaaccaagg 4200aaaaatattt agtttttttt
tttttttttg tatacttttc aagctacctt gtcatgtata 4260cagtcattta
tgcctaaagc ctggtgatta ttcatttaaa tgaagatcac atttcatatc
4320aacttttgta tccacagtag acaaaatagc actaatccag atgcctattg
ttggatattg 4380aatgacagac aatcttatgt agcaaagatt atgcctgaaa
aggaaaatta ttcagggcag 4440ctaattttgc ttttaccaaa atatcagtag
taatattttt ggacagtagc taatgggtca 4500gtgggttctt tttaatgttt
atacttagat tttcttttaa aaaaattaaa ataaaacaaa 4560aaaaatttct
aggactagac gatgtaatac cagctaaagc caaacaatta tacagtggaa
4620ggttttacat tattcatcca atgtgtttct attcatgtta agatactact
acatttgaag 4680tgggcagaga acatcagatg attgaaatgt tcgcccaggg
gtctccagca actttggaaa 4740tctctttgta tttttacttg aagtgccact
aatggacagc agatattttc tggctgatgt 4800tggtattggg tgtaggaaca
tgatttaaaa aaaaaactct tgcctctgct ttcccccact 4860ctgaggcaag
ttaaaatgta aaagatgtga tttatctggg gggctcaggt atggtgggga
4920agtggattca ggaatctggg gaatggcaaa tatattaaga agagtattga
aagtatttgg 4980aggaaaatgg ttaattctgg gtgtgcacca aggttcagta
gagtccactt ctgccctgga 5040gaccacaaat caactagctc catttacagc
catttctaaa atggcagctt cagttctaga 5100gaagaaagaa caacatcagc
agtaaagtcc atggaatagc tagtggtctg tgtttctttt 5160cgccattgcc
tagcttgccg taatgattct ataatgccat catgcagcaa ttatgagagg
5220ctaggtcatc caaagagaag accctatcaa tgtaggttgc aaaatctaac
ccctaaggaa 5280gtgcagtctt tgatttgatt tccctagtaa ccttgcagat
atgtttaacc aagccatagc 5340ccatgccttt tgagggctga acaaataagg
gacttactga taatttactt ttgatcacat 5400taaggtgttc tcaccttgaa
atcttataca ctgaaatggc cattgattta ggccactggc 5460ttagagtact
ccttcccctg catgacactg attacaaata ctttcctatt catactttcc
5520aattatgaga tggactgtgg gtactgggag tgatcactaa caccatagta
atgtctaata 5580ttcacaggca gatctgcttg gggaagctag ttatgtgaaa
ggcaaataaa gtcatacagt 5640agctcaaaag gcaaccataa ttctctttgg
tgcaagtctt gggagcgtga tctagattac 5700actgcaccat tcccaagtta
atcccctgaa aacttactct caactggagc aaatgaactt 5760tggtcccaaa
tatccatctt ttcagtagcg ttaattatgc tctgtttcca actgcatttc
5820ctttccaatt gaattaaagt gtggcctcgt ttttagtcat ttaaaattgt
tttctaagta 5880attgctgcct ctattatggc acttcaattt tgcactgtct
tttgagattc aagaaaaatt 5940tctattcatt tttttgcatc caattgtgcc
tgaactttta aaatatgtaa atgctgccat 6000gttccaaacc catcgtcagt
gtgtgtgttt agagctgtgc accctagaaa caacatactt 6060gtcccatgag
caggtgcctg agacacagac ccctttgcat tcacagagag gtcattggtt
6120atagagactt gaattaataa gtgacattat gccagtttct gttctctcac
aggtgataaa 6180caatgctttt tgtgcactac atactcttca gtgtagagct
cttgttttat gggaaaaggc 6240tcaaatgcca aattgtgttt gatggattaa
tatgcccttt tgccgatgca tactattact 6300gatgtgactc ggttttgtcg
cagctttgct ttgtttaatg aaacacactt gtaaacctct 6360tttgcacttt
gaaaaagaat ccagcgggat gctcgagcac ctgtaaacaa ttttctcaac
6420ctatttgatg ttcaaataaa gaattaaact 6450
101 2011 DNA homo sapiens 101
tttcagtttc tccagctgct ggctttttgg acacccactc ccccgccagg
aggcagttgc 60aagcgcggag gctgcgagaa ataactgcct cttgaaactt gcagggcgaa
gagcaggcgg 120cgagcgctgg gccggggagg gaccacccga gctgcgacgg
gctctggggc tgcggggcag 180ggctggcgcc cggagcctga gctgcaggag
gtgcgctcgc tttcctcaac aggtggcggc 240ggggcgcgcg ccgggagacc
ccccctaatg cgggaaaagc acgtgtccgc attttagaga 300aggcaaggcc
ggtgtgttta tctgcaagcc attatacttg cccacgaatc tttgagaaca
360ttataatgac ctttgtgcct cttcttgcaa ggtgttttct cagctgttat
ctcaagacat 420ggatataaaa aactcaccat ctagccttaa ttctccttcc
tcctacaact gcagtcaatc 480catcttaccc ctggagcacg gctccatata
cataccttcc tcctatgtag acagccacca 540tgaatatcca gccatgacat
tctatagccc tgctgtgatg aattacagca ttcccagcaa 600tgtcactaac
ttggaaggtg ggcctggtcg gcagaccaca agcccaaatg tgttgtggcc
660aacacctggg cacctttctc ctttagtggt ccatcgccag ttatcacatc
tgtatgcgga 720acctcaaaag agtccctggt gtgaagcaag atcgctagaa
cacaccttac ctgtaaacag 780agagacactg aaaaggaagg ttagtgggaa
ccgttgcgcc agccctgtta ctggtccagg 840ttcaaagagg gatgctcact
tctgcgctgt ctgcagcgat tacgcatcgg gatatcacta 900tggagtctgg
tcgtgtgaag gatgtaaggc cttttttaaa agaagcattc aaggacataa
960tgattatatt tgtccagcta caaatcagtg tacaatcgat aaaaaccggc
gcaagagctg 1020ccaggcctgc cgacttcgga agtgttacga agtgggaatg
gtgaagtgtg gctcccggag 1080agagagatgt gggtaccgcc ttgtgcggag
acagagaagt gccgacgagc agctgcactg 1140tgccggcaag gccaagagaa
gtggcggcca cgcgccccga gtgcgggagc tgctgctgga 1200cgccctgagc
cccgagcagc tagtgctcac cctcctggag gctgagccgc cccatgtgct
1260gatcagccgc cccagtgcgc ccttcaccga ggcctccatg atgatgtccc
tgaccaagtt 1320ggccgacaag gagttggtac acatgatcag ctgggccaag
aagattcccg gctttgtgga 1380gctcagcctg ttcgaccaag tgcggctctt
ggagagctgt tggatggagg tgttaatgat 1440ggggctgatg tggcgctcaa
ttgaccaccc cggcaagctc atctttgctc cagatcttgt 1500tctggacagg
gatgagggga aatgcgtaga aggaattctg gaaatctttg acatgctcct
1560ggcaactact tcaaggtttc gagagttaaa actccaacac aaagaatatc
tctgtgtcaa 1620ggccatgatc ctgctcaatt ccagtatgta ccctctggtc
acagcgaccc aggatgctga 1680cagcagccgg aagctggctc acttgctgaa
cgccgtgacc gatgctttgg tttgggtgat 1740tgccaagagc ggcatctcct
cccagcagca atccatgcgc ctggctaacc tcctgatgct 1800cctgtcccac
gtcaggcatg cgagtaacaa gggcatggaa catctgctca acatgaagtg
1860caaaaatgtg gtcccagtgt atgacctgct gctggagatg ctgaatgccc
acgtgcttcg 1920cgggtgcaag tcctccatca cggggtccga gtgcagcccg
gcagaggaca gtaaaagcaa 1980agagggctcc cagaacccac agtctcagtg a
2011
* * * * *
References