U.S. patent application number 11/340211 was filed with the patent office on 2006-12-28 for nod nucleic acids and polypeptides.
Invention is credited to Naohiro Inohara, Gabriel Nunez.
Application Number | 20060292590 11/340211 |
Family ID | 33513837 |
Filed Date | 2006-12-28 |
United States Patent
Application |
20060292590 |
Kind Code |
A1 |
Inohara; Naohiro ; et
al. |
December 28, 2006 |
NOD nucleic acids and polypeptides
Abstract
The present invention relates to the NOD proteins and nucleic
acids encoding the NOD proteins. The present invention further
provides assays for the detection of NOD polymorphisms and
mutations associated with disease states, as well as methods of
screening for ligands and modulators of NOD proteins.
Inventors: |
Inohara; Naohiro; (Ann
Arbor, MI) ; Nunez; Gabriel; (Ann Arbor, MI) |
Correspondence
Address: |
Medlen & Carroll, LLP
Suite 350
101 Howard Street
San Francisco
CA
94105
US
|
Family ID: |
33513837 |
Appl. No.: |
11/340211 |
Filed: |
January 26, 2006 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
10794342 | Mar 5, 2004 | 7041491 |
11340211 | Jan 26, 2006 | |
60452274 | Mar 5, 2003 | |
Current U.S.
Class: |
435/6.11 ;
536/24.3 |
Current CPC
Class: |
C07H 21/04 20130101;
C07K 14/47 20130101 |
Class at
Publication: |
435/006 ;
536/024.3 |
International
Class: |
C12Q 1/68 20060101
C12Q001/68; C07H 21/04 20060101 C07H021/04 |
Government Interests
[0002] This invention was made with government support under Grants
No. DK61707 and GM60421 awarded by the National Institutes of
Health. The Government has certain rights in the invention.
Claims
1. A composition comprising an isolated and purified nucleic acid
sequence encoding a protein selected from the group consisting of
SEQ ID NOs: 12-21.
2. The composition of claim 1, wherein said sequence is operably
linked to a heterologous promoter.
3. The composition of claim 1, wherein said sequence is contained
within a vector.
4. The composition of claim 3, wherein said vector is within a host
cell.
5. The composition of claim 1, wherein said nucleic acid is
selected from the group consisting of SEQ ID NOs: 1-10 and variants
thereof that are at least 80% identical to SEQ ID NOs: 1-10.
6. The composition of claim 5, wherein said protein is at least 90%
identical to SEQ ID NOs: 12-21.
7. The composition of claim 5, wherein said protein is at least 95%
identical to SEQ ID NOs: 12-21.
8. The composition of claim 1, wherein said nucleic acid sequence
is selected from the group consisting of SEQ ID NOs: 1-10.
9-17. (canceled)
Description
[0001] This application claims priority to provisional patent
application Ser. No. 60/452,274, filed Mar. 05, 2003; which is
herein incorporated by reference in its entirety.
FIELD OF THE INVENTION
[0003] The present invention relates to the NOD proteins and
nucleic acids encoding the NOD proteins. The present invention
further provides assays for the detection of NOD polymorphisms and
mutations associated with disease states, as well as methods of
screening for ligands and modulators of NOD proteins.
BACKGROUND OF THE INVENTION
[0004] The removal of infectious agents by the host is fundamental
for the survival of multicellular organisms. In animals and plants,
the initial detection of microbial agents relies on specialized
host receptors that recognize molecules expressed exclusively by
microbes (Dang and Jones, Nature 411, 826-833 (2001); Medzhitov,
Nature Rev. Immunol. 1, 135-145 (2001)). In animals, detection of
microbial agents is mediated by the recognition of
pathogen-associated molecular patterns (PAMPs) by specific host
pattern-recognition receptors (PRRs) (Medzhitov, supra). Because
the structure of each PAMP is highly conserved and invariant in
microorganisms of the same class, the animal can recognize most or
all microbes with a limited number of PRRs. The identification and
characterization of plasma membrane Toll-like receptors (TLRs) as
PRRs have provided fundamental insight into the mechanisms of host
defense in animals. There is now compelling evidence that TLRs play
a pivotal role in mediating immune responses to bacterial pathogens
(Medzhitov, supra; Akira et al., Nat. Immunol. 2, 675-680 (2001)).
In mammals, TLRs mediate host immune responses by inducing the
secretion of several proinflammatory cytokines and co-stimulatory
surface molecules through the activation of transcriptional factors
including NF-κB (Medzhitov, supra; Akira et al., supra).
SUMMARY OF THE INVENTION
[0005] The present invention relates to the NOD proteins and
nucleic acids encoding the NOD proteins. The present invention
further provides assays for the detection of NOD polymorphisms and
mutations associated with disease states, as well as methods of
screening for ligands and modulators of NOD proteins.
[0006] Accordingly, in some embodiments, the present invention
provides a composition comprising an isolated and purified nucleic
acid sequence encoding a protein selected from the group consisting
of SEQ ID NOs: 12-22. In some embodiments, the sequence is operably
linked to a heterologous promoter. In some embodiments, the
sequence is contained within a vector. In some embodiments, the
vector is within a host cell. In some embodiments, the nucleic acid
comprises one of SEQ ID NOs: 1-11 and variants thereof that are at
least 80%, preferably at least 90%, and even more preferably at
least 95% identical to SEQ ID NOs: 1-11. In some embodiments, the
nucleic acid comprises one of SEQ ID NOs: 1-11.
[0007] The present invention further provides a composition
comprising a polypeptide having an amino acid sequence comprising
SEQ ID NOs: 12-22 or variants thereof that are at least 80%
identical to SEQ ID NOs: 12-22. In some embodiments, the
polypeptide is at least 90%, and preferably at least 95% identical
to SEQ ID NOs: 12-22. In some embodiments, the polypeptide
comprises one of SEQ ID NOs: 12-22.
[0008] The present invention additionally provides a method of
generating an inflammation profile, comprising providing a sample
from a subject, wherein the sample comprises nucleic acid; and
detecting the presence or absence of expression of at least two NOD
genes in the sample, thereby generating an inflammation profile. In
some embodiments, the detecting comprises detecting the presence or
absence of expression of at least 5, and preferably at least 10 NOD
genes in said sample. In some embodiments, the nucleic acid
comprises genomic DNA. In other embodiments, the nucleic acid
comprises mRNA.
DESCRIPTION OF THE FIGURES
[0009] FIG. 1 shows the domain structures of exemplary NOD nucleic
acids and proteins of some embodiments of the present invention.
CARD, caspase-recruitment domain; DC, dendritic cell; DT,
DEFCAP/TUCAN expanded homology domain; EBD, effector-binding
domain; NOD, nucleotide-binding oligomerization domain; PYD, pyrin
domain; LRR, leucine-rich repeat; WD40R, WD40 repeat; BIR,
baculoviral inhibitor-of-apoptosis repeat; TIR, Toll/interleukin-1
receptor.
[0010] FIG. 2 shows an induced proximity model of NOD protein
activation. EBD, effector binding domain; LRD, ligand recognition
domain; NOD, Nucleotide-binding oligomerization domain.
[0011] FIG. 3 shows signaling pathways mediated by NOD1, NOD2, IPAF
and Cryopyrin.
[0012] FIG. 4 shows a model for the role of NOD1, NOD2 and related
NODs in innate and adaptive immunity. APC, antigen-presenting cell;
MHC-II, major histocompatibility complex class II molecules; TCR,
T-cell receptor; TLR, Toll-like receptors.
[0013] FIG. 5 shows hypothetical mechanisms of disease in patients
with mutations in NOD2, Cryopyrin, CIITA and Pyrin.
[0014] FIG. 6 shows Table 2.
[0015] FIG. 7 shows the nucleic acid sequence of NOD3 (SEQ ID NO:
1).
[0016] FIG. 8 shows the nucleic acid sequence of NOD5 (SEQ ID
NO:2).
[0017] FIG. 9 shows the nucleic acid sequence of NOD6 (SEQ ID
NO:3).
[0018] FIG. 10 shows the nucleic acid sequence of NOD8 (SEQ ID
NO:4).
[0019] FIG. 11 shows the nucleic acid sequence of NOD9 (SEQ ID
NO:5).
[0020] FIG. 12 shows the nucleic acid sequence of NOD12 (SEQ ID
NO:6).
[0021] FIG. 13 shows the nucleic acid sequence of NOD14 (SEQ ID
NO:7).
[0022] FIG. 14 shows the nucleic acid sequence of NOD17 (SEQ ID
NO:9).
[0023] FIG. 15 shows the nucleic acid sequence of NOD26 (SEQ ID NO:
10).
[0024] FIG. 16 shows the nucleic acid sequence of NOD27 (SEQ ID NO:
11).
[0025] FIG. 17 shows the amino acid sequence of NOD3 (SEQ ID
NO:12).
[0026] FIG. 18 shows the amino acid sequence of NOD5 (SEQ ID NO:
13).
[0027] FIG. 19 shows the amino acid sequence of NOD6 (SEQ ID
NO:14).
[0028] FIG. 20 shows the amino acid sequence of NOD8 (SEQ ID NO:
15).
[0029] FIG. 21 shows the amino acid sequence of NOD9 (SEQ ID
NO:16).
[0030] FIG. 22 shows the amino acid sequence of NOD12 (SEQ ID NO:
17).
[0031] FIG. 23 shows the amino acid sequence of NOD14 (SEQ ID
NO:18).
[0032] FIG. 24 shows the amino acid sequence of NOD17 (SEQ ID
NO:20).
[0033] FIG. 25 shows the amino acid sequence of NOD26 (SEQ ID
NO:21).
[0034] FIG. 26 shows the amino acid sequence of NOD27 (SEQ ID
NO:22).
[0035] FIG. 27 shows the nucleic acid sequence of NOD16 (SEQ ID
NO:8).
[0036] FIG. 28 shows the amino acid sequence of NOD16 (SEQ ID
NO:19).
DEFINITIONS
[0037] To facilitate understanding of the invention, a number of
terms are defined below.
[0038] As used herein, the term "NOD" when used in reference to a
protein or nucleic acid refers to a NOD protein or nucleic acid
encoding a NOD protein of the present invention. The term NOD
encompasses both proteins that are identical to wild-type NODs and
those that are derived from wild-type NODs (e.g., variants of NOD
polypeptides of the present invention or chimeric genes
constructed with portions of NOD coding regions). In some
embodiments, the "NOD" is a wild-type NOD nucleic acid (SEQ ID NOs:
1-11) or amino acid (SEQ ID NOs: 12-22) sequence. In other
embodiments, the "NOD" is a variant or mutant.
[0039] As used herein, the term "instructions for using said kit
for said detecting the presence or absence of a variant NOD nucleic
acid or polypeptide in said biological sample" includes
instructions for using the reagents contained in the kit for the
detection of variant and wild type NOD nucleic acids or
polypeptides. In some embodiments, the instructions further
comprise the statement of intended use required by the U.S. Food
and Drug Administration (FDA) in labeling in vitro diagnostic
products. The FDA classifies in vitro diagnostics as medical
devices and requires that they be approved through the 510(k)
procedure. Information required in an application under 510(k)
includes: 1) The in vitro diagnostic product name, including the
trade or proprietary name, the common or usual name, and the
classification name of the device; 2) The intended use of the
product; 3) The establishment registration number, if applicable,
of the owner or operator submitting the 510(k) submission; the
class in which the in vitro diagnostic product was placed under
section 513 of the FD&C Act, if known, its appropriate panel,
or, if the owner or operator determines that the device has not
been classified under such section, a statement of that
determination and the basis for the determination that the in vitro
diagnostic product is not so classified; 4) Proposed labels,
labeling and advertisements sufficient to describe the in vitro
diagnostic product, its intended use, and directions for use. Where
applicable, photographs or engineering drawings should be supplied;
5) A statement indicating that the device is similar to and/or
different from other in vitro diagnostic products of comparable
type in commercial distribution in the U.S., accompanied by data to
support the statement; 6) A 510(k) summary of the safety and
effectiveness data upon which the substantial equivalence
determination is based; or a statement that the 510(k) safety and
effectiveness information supporting the FDA finding of substantial
equivalence will be made available to any person within 30 days of
a written request; 7) A statement that the submitter believes, to
the best of their knowledge, that all data and information
submitted in the premarket notification are truthful and accurate
and that no material fact has been omitted; 8) Any additional
information regarding the in vitro diagnostic product requested
that is necessary for the FDA to make a substantial equivalency
determination. Additional information is available at the Internet
web page of the U.S. FDA.
[0040] As used herein, the term "inflammation profile" refers to
the pattern of expression of two or more NOD genes of the present
invention (e.g., the NOD genes described by SEQ ID NOs: 1-11). In
some embodiments, the pattern of expression comprises the presence
or absence of expression. In other embodiments, the pattern of
expression comprises the level of expression or localization of
expression of the NOD genes. The inflammation profiles of the
present invention find use in the characterization of inflammatory
diseases and in determining a subject's risk of contracting an
inflammatory disease. For example, in some embodiments,
inflammation profiles from a subject are compared to control
profiles associated with disease or predisposition to disease.
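The presence-or-absence form of an inflammation profile, and its comparison to a control profile, can be sketched as follows. This is only an illustrative sketch: the expression values, the detection threshold, the control profile, and the function names are hypothetical and are not taken from the specification.

```python
from typing import Dict


def inflammation_profile(expression: Dict[str, float],
                         threshold: float = 1.0) -> Dict[str, bool]:
    """Map each NOD gene to True (expression detected) or False (not detected).

    `threshold` is a hypothetical detection cutoff in arbitrary units.
    """
    if len(expression) < 2:
        raise ValueError("a profile requires at least two NOD genes")
    return {gene: level >= threshold for gene, level in expression.items()}


def profile_agreement(sample: Dict[str, bool],
                      control: Dict[str, bool]) -> float:
    """Fraction of shared genes whose presence/absence call matches the control."""
    shared = sorted(set(sample) & set(control))
    if not shared:
        raise ValueError("profiles have no genes in common")
    matches = sum(sample[gene] == control[gene] for gene in shared)
    return matches / len(shared)


# Hypothetical measurements for three NOD genes in one subject sample.
sample = inflammation_profile({"NOD3": 4.2, "NOD5": 0.1, "NOD6": 2.7})
# Hypothetical control profile associated with a disease state.
control = {"NOD3": True, "NOD5": True, "NOD6": True}
agreement = profile_agreement(sample, control)
```

A profile based on expression level or localization, as contemplated in other embodiments, would replace the boolean call with the measured value.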
[0041] The term "gene" refers to a nucleic acid (e.g., DNA)
sequence that comprises coding sequences necessary for the
production of a polypeptide, RNA (e.g., including but not limited
to, mRNA, tRNA and rRNA) or precursor (e.g., NOD). The polypeptide,
RNA, or precursor can be encoded by a full length coding sequence
or by any portion of the coding sequence so long as the desired
activity or functional properties (e.g., enzymatic activity, ligand
binding, signal transduction, etc.) of the full-length or fragment
are retained. The term also encompasses the coding region of a
structural gene and the sequences located adjacent to the coding
region on both the 5' and 3' ends for a distance of about 1 kb on
either end such that the gene corresponds to the length of the
full-length mRNA. The sequences that are located 5' of the coding
region and which are present on the mRNA are referred to as 5'
untranslated sequences. The sequences that are located 3' or
downstream of the coding region and that are present on the mRNA
are referred to as 3' untranslated sequences. The term "gene"
encompasses both cDNA and genomic forms of a gene. A genomic form
or clone of a gene contains the coding region interrupted with
non-coding sequences termed "introns" or "intervening regions" or
"intervening sequences." Introns are segments of a gene that are
transcribed into nuclear RNA (hnRNA); introns may contain
regulatory elements such as enhancers. Introns are removed or
"spliced out" from the nuclear or primary transcript; introns
therefore are absent in the messenger RNA (mRNA) transcript. The
mRNA functions during translation to specify the sequence or order
of amino acids in a nascent polypeptide.
[0042] In particular, the term "NOD gene" or "NOD genes" refers to
the full-length NOD nucleotide sequence (e.g., contained in SEQ ID
NOs: 1-11). However, it is also intended that the term encompass
fragments of the NOD sequences, mutants of the NOD sequences, as
well as other domains within the full-length NOD nucleotide
sequences. Furthermore, the terms "NOD nucleotide sequence" and "NOD
polynucleotide sequence" encompass DNA, cDNA, and RNA (e.g.,
mRNA) sequences.
[0043] Where "amino acid sequence" is recited herein to refer to an
amino acid sequence of a naturally occurring protein molecule,
"amino acid sequence" and like terms, such as "polypeptide" or
"protein" are not meant to limit the amino acid sequence to the
complete, native amino acid sequence associated with the recited
protein molecule.
[0044] In addition to containing introns, genomic forms of a gene
may also include sequences located on both the 5' and 3' end of the
sequences that are present on the RNA transcript. These sequences
are referred to as "flanking" sequences or regions (these flanking
sequences are located 5' or 3' to the non-translated sequences
present on the mRNA transcript). The 5' flanking region may contain
regulatory sequences such as promoters and enhancers that control
or influence the transcription of the gene. The 3' flanking region
may contain sequences that direct the termination of transcription,
post-transcriptional cleavage and polyadenylation.
[0045] The term "wild-type" refers to a gene or gene product that
has the characteristics of that gene or gene product when isolated
from a naturally occurring source. A wild-type gene is that which
is most frequently observed in a population and is thus arbitrarily
designated the "normal" or "wild-type" form of the gene. In contrast,
the terms "modified," "mutant," "polymorphism," and "variant" refer
to a gene or gene product that displays modifications in sequence
and/or functional properties (i.e., altered characteristics) when
compared to the wild-type gene or gene product. It is noted that
naturally-occurring mutants can be isolated; these are identified
by the fact that they have altered characteristics when compared to
the wild-type gene or gene product.
[0046] As used herein, the terms "nucleic acid molecule encoding,"
"DNA sequence encoding," and "DNA encoding" refer to the order or
sequence of deoxyribonucleotides along a strand of deoxyribonucleic
acid. The order of these deoxyribonucleotides determines the order
of amino acids along the polypeptide (protein) chain. The DNA
sequence thus codes for the amino acid sequence.
[0047] DNA molecules are said to have "5' ends" and "3' ends"
because mononucleotides are reacted to make oligonucleotides or
polynucleotides in a manner such that the 5' phosphate of one
mononucleotide pentose ring is attached to the 3' oxygen of its
neighbor in one direction via a phosphodiester linkage. Therefore,
an end of an oligonucleotide or polynucleotide is referred to as the
"5' end" if its 5' phosphate is not linked to the 3' oxygen of a
mononucleotide pentose ring and as the "3' end" if its 3' oxygen is
not linked to a 5' phosphate of a subsequent mononucleotide pentose
ring. As used herein, a nucleic acid sequence, even if internal to
a larger oligonucleotide or polynucleotide, also may be said to
have 5' and 3' ends. In either a linear or circular DNA molecule,
discrete elements are referred to as being "upstream" or 5' of the
"downstream" or 3' elements. This terminology reflects the fact
that transcription proceeds in a 5' to 3' fashion along the DNA
strand. The promoter and enhancer elements that direct
transcription of a linked gene are generally located 5' or upstream
of the coding region. However, enhancer elements can exert their
effect even when located 3' of the promoter element and the coding
region. Transcription termination and polyadenylation signals are
located 3' or downstream of the coding region.
[0048] As used herein, the terms "an oligonucleotide having a
nucleotide sequence encoding a gene" and "polynucleotide having a
nucleotide sequence encoding a gene" mean a nucleic acid sequence
comprising the coding region of a gene or, in other words, the
nucleic acid sequence that encodes a gene product. The coding
region may be present in a cDNA, genomic DNA, or RNA form. When
present in a DNA form, the oligonucleotide or polynucleotide may be
single-stranded (i.e., the sense strand) or double-stranded.
Suitable control elements such as enhancers/promoters, splice
junctions, polyadenylation signals, etc. may be placed in close
proximity to the coding region of the gene if needed to permit
proper initiation of transcription and/or correct processing of the
primary RNA transcript. Alternatively, the coding region utilized
in the expression vectors of the present invention may contain
endogenous enhancers/promoters, splice junctions, intervening
sequences, polyadenylation signals, etc. or a combination of both
endogenous and exogenous control elements.
[0049] As used herein, the term "regulatory element" refers to a
genetic element that controls some aspect of the expression of
nucleic acid sequences. For example, a promoter is a regulatory
element that facilitates the initiation of transcription of an
operably linked coding region. Other regulatory elements include
splicing signals, polyadenylation signals, termination signals,
etc.
[0050] As used herein, the terms "complementary" or
"complementarity" are used in reference to polynucleotides (i.e., a
sequence of nucleotides) related by the base-pairing rules. For
example, the sequence "5'-A-G-T-3'" is complementary to the
sequence "3'-T-C-A-5'." Complementarity may be "partial," in which
only some of the nucleic acids' bases are matched according to the
base pairing rules. Or, there may be "complete" or "total"
complementarity between the nucleic acids. The degree of
complementarity between nucleic acid strands has significant
effects on the efficiency and strength of hybridization between
nucleic acid strands. This is of particular importance in
amplification reactions, as well as detection methods that depend
upon binding between nucleic acids. Complementarity can include the
formation of base pairs between any type of nucleotides, including
non-natural bases, modified bases, synthetic bases and the
like.
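The base-pairing rules and the distinction between partial and complete complementarity can be illustrated with a short sketch. It assumes the four standard DNA bases only (non-natural or modified bases are not handled), and the function names are hypothetical.

```python
# Watson-Crick pairing for the four standard DNA bases.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def complementarity(strand_a: str, strand_b: str) -> float:
    """Fraction of paired positions when two equal-length strands,
    both written 5' to 3', are aligned antiparallel."""
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    a = strand_a.upper()
    b = strand_b.upper()[::-1]  # read strand_b in the 3' to 5' direction
    matched = sum(PAIRS.get(x) == y for x, y in zip(a, b))
    return matched / len(a)


# 5'-A-G-T-3' pairs with 3'-T-C-A-5', which reads 5'-A-C-T-3'.
partner = reverse_complement("AGT")
complete = complementarity("AGT", "ACT")  # every position pairs
partial = complementarity("AGT", "ACC")   # one mismatched position
```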
[0051] The term "homology" refers to a degree of complementarity.
There may be partial homology or complete homology (i.e.,
identity). A partially complementary sequence is one that at least
partially inhibits a completely complementary sequence from
hybridizing to a target nucleic acid and is referred to using the
functional term "substantially homologous." The term "inhibition of
binding," when used in reference to nucleic acid binding, refers to
inhibition of binding caused by competition of homologous sequences
for binding to a target sequence. The inhibition of hybridization
of the completely complementary sequence to the target sequence may
be examined using a hybridization assay (Southern or Northern blot,
solution hybridization and the like) under conditions of low
stringency. A substantially homologous sequence or probe will
compete for and inhibit the binding (i.e., the hybridization) of a
completely homologous sequence to a target under conditions of low
stringency. This is not to say that conditions of low stringency
are such that non-specific binding is permitted; low stringency
conditions require that the binding of two sequences to one another
be a specific (i.e., selective) interaction. The absence of
non-specific binding may be tested by the use of a second target
that lacks even a partial degree of complementarity (e.g., less
than about 30% identity); in the absence of non-specific binding
the probe will not hybridize to the second non-complementary
target.
[0052] The art knows well that numerous equivalent conditions may
be employed to comprise low stringency conditions; factors such as
the length and nature (DNA, RNA, base composition) of the probe and
nature of the target (DNA, RNA, base composition, present in
solution or immobilized, etc.) and the concentration of the salts
and other components (e.g., the presence or absence of formamide,
dextran sulfate, polyethylene glycol) are considered and the
hybridization solution may be varied to generate conditions of low
stringency hybridization different from, but equivalent to, the
above listed conditions. In addition, the art knows conditions that
promote hybridization under conditions of high stringency (e.g.,
increasing the temperature of the hybridization and/or wash steps,
the use of formamide in the hybridization solution, etc.).
[0053] When used in reference to a double-stranded nucleic acid
sequence such as a cDNA or genomic clone, the term "substantially
homologous" refers to any probe that can hybridize to either or
both strands of the double-stranded nucleic acid sequence under
conditions of low stringency as described above.
[0054] A gene may produce multiple RNA species that are generated
by differential splicing of the primary RNA transcript. cDNAs that
are splice variants of the same gene will contain regions of
sequence identity or complete homology (representing the presence
of the same exon or portion of the same exon on both cDNAs) and
regions of complete non-identity (for example, representing the
presence of exon "A" on cDNA 1 wherein cDNA 2 contains exon "B"
instead). Because the two cDNAs contain regions of sequence
identity they will both hybridize to a probe derived from the
entire gene or portions of the gene containing sequences found on
both cDNAs; the two splice variants are therefore substantially
homologous to such a probe and to each other.
[0055] When used in reference to a single-stranded nucleic acid
sequence, the term "substantially homologous" refers to any probe
that can hybridize (i.e., it is the complement of) the
single-stranded nucleic acid sequence under conditions of low
stringency as described above.
[0056] As used herein, the term "competes for binding" is used in
reference to a first polypeptide with an activity which binds to
the same substrate as does a second polypeptide with an activity,
where the second polypeptide is a variant of the first polypeptide
or a related or dissimilar polypeptide. The efficiency (e.g.,
kinetics or thermodynamics) of binding by the first polypeptide may
be the same as or greater than or less than the efficiency of
substrate binding by the second polypeptide. For example, the
equilibrium binding constant (K_D) for binding to the substrate
may be different for the two polypeptides. The term "K_m" as
used herein refers to the Michaelis-Menten constant for an enzyme
and is defined as the concentration of the specific substrate at
which a given enzyme yields one-half its maximum velocity in an
enzyme catalyzed reaction.
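The definition of K_m can be checked numerically against the standard Michaelis-Menten rate law, v = V_max [S] / (K_m + [S]). The rate law itself is not recited in this specification, and the function name and numerical values below are hypothetical; the sketch only confirms that the rate is one-half of V_max when the substrate concentration equals K_m.

```python
def michaelis_menten_rate(substrate_conc: float, v_max: float, k_m: float) -> float:
    """Initial reaction velocity under the standard Michaelis-Menten rate law."""
    if substrate_conc < 0 or v_max < 0 or k_m <= 0:
        raise ValueError("concentrations and V_max must be non-negative; K_m must be positive")
    return v_max * substrate_conc / (k_m + substrate_conc)


# Hypothetical enzyme: V_max = 10 (arbitrary units), K_m = 2.5 (same units as [S]).
rate_at_km = michaelis_menten_rate(substrate_conc=2.5, v_max=10.0, k_m=2.5)
```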
[0057] As used herein, the term "hybridization" is used in
reference to the pairing of complementary nucleic acids.
Hybridization and the strength of hybridization (i.e., the strength
of the association between the nucleic acids) are influenced by such
factors as the degree of complementarity between the nucleic acids,
stringency of the conditions involved, the T_m of the formed
hybrid, and the G:C ratio within the nucleic acids.
[0058] As used herein, the term "T_m" is used in reference to
the "melting temperature." The melting temperature is the
temperature at which a population of double-stranded nucleic acid
molecules becomes half dissociated into single strands. The
equation for calculating the T_m of nucleic acids is well known
in the art. As indicated by standard references, a simple estimate
of the T_m value may be calculated by the equation:
T_m = 81.5 + 0.41(% G+C), when a nucleic acid is in aqueous
solution at 1 M NaCl (See e.g., Anderson and Young, Quantitative
Filter Hybridization, in Nucleic Acid Hybridization [1985]). Other
references include more sophisticated computations that take
structural as well as sequence characteristics into account for the
calculation of T_m.
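The simple estimate given above can be computed directly from a sequence. The sketch below assumes the result is in degrees Celsius and that the sequence contains only standard bases; the function names and the example sequence are hypothetical, and the more sophisticated computations mentioned above are not attempted.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    s = seq.upper()
    if not s:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "GC" for base in s) / len(s)


def estimate_tm(seq: str) -> float:
    """Simple estimate T_m = 81.5 + 0.41 * (% G+C), for 1 M NaCl."""
    return 81.5 + 0.41 * gc_percent(seq)


# Hypothetical 8-base sequence with 50% G+C content.
tm = estimate_tm("ATGCATGC")
```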
[0059] As used herein the term "stringency" is used in reference to
the conditions of temperature, ionic strength, and the presence of
other compounds such as organic solvents, under which nucleic acid
hybridizations are conducted. Those skilled in the art will
recognize that "stringency" conditions may be altered by varying
the parameters just described either individually or in concert.
With "high stringency" conditions, nucleic acid base pairing will
occur only between nucleic acid fragments that have a high
frequency of complementary base sequences (e.g., hybridization
under "high stringency" conditions may occur between homologs with
about 85-100% identity, preferably about 70-100% identity). With
medium stringency conditions, nucleic acid base pairing will occur
between nucleic acids with an intermediate frequency of
complementary base sequences (e.g., hybridization under "medium
stringency" conditions may occur between homologs with about 50-70%
identity). Thus, conditions of "weak" or "low" stringency are often
required with nucleic acids that are derived from organisms that
are genetically diverse, as the frequency of complementary
sequences is usually less.
[0060] "High stringency conditions" when used in reference to
nucleic acid hybridization comprise conditions equivalent to
binding or hybridization at 42°C in a solution consisting
of 5× SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O
and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS,
5× Denhardt's reagent and 100 µg/ml denatured salmon sperm
DNA followed by washing in a solution comprising 0.1× SSPE,
1.0% SDS at 42°C when a probe of about 500 nucleotides in
length is employed.
[0061] "Medium stringency conditions" when used in reference to
nucleic acid hybridization comprise conditions equivalent to
binding or hybridization at 42°C in a solution consisting
of 5× SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O
and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS,
5× Denhardt's reagent and 100 µg/ml denatured salmon sperm
DNA followed by washing in a solution comprising 1.0× SSPE,
1.0% SDS at 42°C when a probe of about 500 nucleotides in
length is employed. "Low stringency conditions" comprise conditions
equivalent to binding or hybridization at 42°C in a
solution consisting of 5× SSPE (43.8 g/l NaCl, 6.9 g/l
NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4
with NaOH), 0.1% SDS, 5× Denhardt's reagent [50×
Denhardt's contains per 500 ml: 5 g Ficoll (Type 400, Pharmacia), 5
g BSA (Fraction V; Sigma)] and 100 µg/ml denatured salmon sperm
DNA followed by washing in a solution comprising 5× SSPE, 0.1%
SDS at 42°C when a probe of about 500 nucleotides in
length is employed.
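The wash solutions recited above differ mainly in SSPE strength (0.1×, 1.0× and 5×). As a worked illustration, the sketch below scales the 5× SSPE composition given above to other strengths, assuming that component concentrations scale linearly with the stated strength. The dictionary and function names are hypothetical.

```python
# Composition of 5x SSPE as recited in the text, in grams per liter.
SSPE_5X_G_PER_L = {"NaCl": 43.8, "NaH2PO4.H2O": 6.9, "EDTA": 1.85}

# SSPE strength of the wash solution for each stringency level, from the text.
WASH_STRENGTH = {"high": 0.1, "medium": 1.0, "low": 5.0}


def sspe_grams_per_liter(strength: float) -> dict:
    """Component masses (g/l) for SSPE at the given strength (e.g. 0.1 for 0.1x),
    assuming linear scaling from the 5x composition."""
    if strength <= 0:
        raise ValueError("strength must be positive")
    factor = strength / 5.0
    return {name: grams * factor for name, grams in SSPE_5X_G_PER_L.items()}


high_wash = sspe_grams_per_liter(WASH_STRENGTH["high"])
medium_wash = sspe_grams_per_liter(WASH_STRENGTH["medium"])
```

Under this assumption the high stringency wash contains one-tenth the salt of the medium stringency wash, which is what makes it the more stringent of the two at the same temperature.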
[0062] The present invention is not limited to the hybridization of
probes of about 500 nucleotides in length. The present invention
contemplates the use of probes between approximately 10 nucleotides
up to several thousand (e.g., at least 5000) nucleotides in length.
One skilled in the relevant art understands that stringency conditions
may be altered for probes of other sizes (See e.g., Anderson and
Young, Quantitative Filter Hybridization, in Nucleic Acid
Hybridization [1985] and Sambrook et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Press, NY [1989]).
[0063] The following terms are used to describe the sequence
relationships between two or more polynucleotides: "reference
sequence", "sequence identity", "percentage of sequence identity",
and "substantial identity". A "reference sequence" is a defined
sequence used as a basis for a sequence comparison; a reference
sequence may be a subset of a larger sequence, for example, as a
segment of a full-length cDNA sequence given in a sequence listing
or may comprise a complete gene sequence. Generally, a reference
sequence is at least 20 nucleotides in length, frequently at least
25 nucleotides in length, and often at least 50 nucleotides in
length. Since two polynucleotides may each (1) comprise a sequence
(i.e., a portion of the complete polynucleotide sequence) that is
similar between the two polynucleotides, and (2) may further
comprise a sequence that is divergent between the two
polynucleotides, sequence comparisons between two (or more)
polynucleotides are typically performed by comparing sequences of
the two polynucleotides over a "comparison window" to identify and
compare local regions of sequence similarity. A "comparison
window", as used herein, refers to a conceptual segment of at least
20 contiguous nucleotide positions wherein a polynucleotide
sequence may be compared to a reference sequence of at least 20
contiguous nucleotides and wherein the portion of the
polynucleotide sequence in the comparison window may comprise
additions or deletions (i.e., gaps) of 20 percent or less as
compared to the reference sequence (which does not comprise
additions or deletions) for optimal alignment of the two sequences.
Optimal alignment of sequences for aligning a comparison window may
be conducted by the local homology algorithm of Smith and Waterman
[Smith and Waterman, Adv. Appl. Math. 2: 482 (1981)], by the
homology alignment algorithm of Needleman and Wunsch [Needleman and
Wunsch, J. Mol. Biol. 48:443 (1970)], by the search for similarity
method of Pearson and Lipman [Pearson and Lipman, Proc. Natl. Acad.
Sci. (U.S.A.) 85:2444 (1988)], by computerized implementations of
these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin
Genetics Software Package Release 7.0, Genetics Computer Group, 575
Science Dr., Madison, Wis.), or by inspection, and the best
alignment (i.e., resulting in the highest percentage of homology
over the comparison window) generated by the various methods is
selected. The term "sequence identity" means that two
polynucleotide sequences are identical (i.e., on a
nucleotide-by-nucleotide basis) over the window of comparison. The
term "percentage of sequence identity" is calculated by comparing
two optimally aligned sequences over the window of comparison,
determining the number of positions at which the identical nucleic
acid base (e.g., A, T, C, G, U, or I) occurs in both sequences to
yield the number of matched positions, dividing the number of
matched positions by the total number of positions in the window of
comparison (i.e., the window size), and multiplying the result by
100 to yield the percentage of sequence identity. The term
"substantial identity" as used herein denotes a characteristic of a
polynucleotide sequence, wherein the polynucleotide comprises a
sequence that has at least 85 percent sequence identity, preferably
at least 90 to 95 percent sequence identity, more usually at least
99 percent sequence identity as compared to a reference sequence
over a comparison window of at least 20 nucleotide positions,
frequently over a window of at least 25-50 nucleotides, wherein the
percentage of sequence identity is calculated by comparing the
reference sequence to the polynucleotide sequence which may include
deletions or additions which total 20 percent or less of the
reference sequence over the window of comparison. The reference
sequence may be a subset of a larger sequence, for example, as a
segment of the full-length sequences of the compositions claimed in
the present invention (e.g., NOD).
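By way of illustration only, the calculation of "percentage of sequence identity" described above may be sketched as follows. The sketch assumes that the two sequences have already been optimally aligned by one of the methods cited above, that gaps are written as "-", and that the window size is the length of the alignment; the function names, the gap character, and the default threshold arguments are hypothetical and are not part of the present disclosure. The sketch does not check the 20 percent limit on additions or deletions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of sequence identity over a comparison window.

    Both inputs are assumed to be pre-aligned and of equal length,
    with '-' marking an addition or deletion (gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    window_size = len(aligned_a)
    # Count positions at which the identical base occurs in both sequences;
    # a gap paired with a gap is not counted as a match.
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / window_size


def is_substantially_identical(
    aligned_a: str,
    aligned_b: str,
    threshold: float = 85.0,  # "at least 85 percent sequence identity"
    min_window: int = 20,     # "at least 20 nucleotide positions"
) -> bool:
    """Apply the minimum window size and identity threshold to an alignment."""
    if len(aligned_a) < min_window:
        return False
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned 20-nucleotide sequences that differ at two positions have 18 matched positions, giving 90 percent sequence identity, which meets the 85 percent threshold.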
[0064] As applied to polypeptides, the term "substantial identity"
means that two peptide sequences, when optimally aligned, such as
by the programs GAP or BESTFIT using default gap weights, share at
least 80 percent sequence identity, preferably at least 90 percent
sequence identity, more preferably at least 95 percent sequence
identity or more (e.g., 99 percent sequence identity). Preferably,
residue positions that are not identical differ by conservative
amino acid substitutions. Conservative amino acid substitutions
refer to the interchangeability of residues having similar side
chains. For example, a group of amino acids having aliphatic side
chains is glycine, alanine, valine, leucine, and isoleucine; a
group of amino acids having aliphatic-hydroxyl side chains is
serine and threonine; a group of amino acids having
amide-containing side chains is asparagine and glutamine; a group
of amino acids having aromatic side chains is phenylalanine,
tyrosine, and tryptophan; a group of amino acids having basic side
chains is lysine, arginine, and histidine; and a group of amino
acids having sulfur-containing side chains is cysteine and
methionine. Preferred conservative amino acid substitution groups
are: valine-leucine-isoleucine, phenylalanine-tyrosine,
lysine-arginine, alanine-valine, and asparagine-glutamine.
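By way of illustration only, the side-chain groups and preferred substitution groups listed above may be encoded as a simple lookup. The one-letter amino acid codes, the group labels, and the function names are assumptions introduced for this sketch; the groups themselves are those recited above.

```python
from typing import Optional

# Side-chain groups recited above, written with one-letter amino acid codes.
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),
    "aliphatic-hydroxyl": set("ST"),
    "amide-containing": set("NQ"),
    "aromatic": set("FYW"),
    "basic": set("KRH"),
    "sulfur-containing": set("CM"),
}

# Preferred conservative substitution groups recited above.
PREFERRED_GROUPS = [set("VLI"), set("FY"), set("KR"), set("AV"), set("NQ")]


def shared_side_chain_group(a: str, b: str) -> Optional[str]:
    """Return the name of a side-chain group containing both residues, if any."""
    a, b = a.upper(), b.upper()
    for name, members in SIDE_CHAIN_GROUPS.items():
        if a in members and b in members:
            return name
    return None


def is_conservative(a: str, b: str) -> bool:
    """True if two different residues share one of the side-chain groups."""
    return a.upper() != b.upper() and shared_side_chain_group(a, b) is not None


def is_preferred_conservative(a: str, b: str) -> bool:
    """True if two different residues share one of the preferred groups."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in PREFERRED_GROUPS)
```

Under this encoding, a glycine-to-alanine change is conservative (both aliphatic) but does not fall within a preferred group, whereas an alanine-to-valine change does.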
[0065] The term "fragment" as used herein refers to a polypeptide
that has an amino-terminal and/or carboxy-terminal deletion as
compared to the native protein, but where the remaining amino acid
sequence is identical to the corresponding positions in the amino
acid sequence deduced from a full-length cDNA sequence. Fragments
typically are at least 4 amino acids long, preferably at least 20
amino acids long, usually at least 50 amino acids long or longer,
and span the portion of the polypeptide required for intermolecular
binding of the compositions (claimed in the present invention) with
its various ligands and/or substrates.
[0066] The term "polymorphic locus" is a locus present in a
population that shows variation between members of the population
(i.e., the most common allele has a frequency of less than 0.95).
In contrast, a "monomorphic locus" is a genetic locus at which little or
no variation is seen between members of the population (generally
taken to be a locus at which the most common allele exceeds a
frequency of 0.95 in the gene pool of the population).
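By way of illustration only, the 0.95 frequency criterion above may be applied to a list of observed alleles as follows. The function name and the input representation (one entry per sampled allele) are hypothetical. The definitions above do not address a frequency of exactly 0.95; this sketch assumes such a locus is treated as monomorphic.

```python
from collections import Counter
from typing import Sequence


def classify_locus(alleles: Sequence[str], threshold: float = 0.95) -> str:
    """Classify a locus from the alleles observed in a population sample."""
    if not alleles:
        raise ValueError("no alleles observed")
    counts = Counter(alleles)
    # Frequency of the most common allele in the sample.
    top_frequency = counts.most_common(1)[0][1] / len(alleles)
    # Assumption: a frequency of exactly `threshold` counts as monomorphic.
    return "polymorphic" if top_frequency < threshold else "monomorphic"
```

For example, a sample of 90 "A" alleles and 10 "G" alleles has a most-common-allele frequency of 0.90 and is classified as polymorphic.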
[0067] As used herein, the term "genetic variation information" or
"genetic variant information" refers to the presence or absence of
one or more variant nucleic acid sequences (e.g., polymorphisms or
mutations) in a given allele of a particular gene (e.g., a NOD gene
of the present invention).
[0068] As used herein, the term "detection assay" refers to an
assay for detecting the presence or absence of variant nucleic acid
sequences (e.g., polymorphisms or mutations) in a given allele of a
particular gene (e.g., a NOD gene). Examples of suitable detection
assays include, but are not limited to, those described below in
Section III B.
[0069] The term "naturally-occurring" as used herein as applied to
an object refers to the fact that an object can be found in nature.
For example, a polypeptide or polynucleotide sequence that is
present in an organism (including viruses) that can be isolated
from a source in nature and which has not been intentionally
modified by man in the laboratory is naturally-occurring.
[0070] "Amplification" is a special case of nucleic acid
replication involving template specificity. It is to be contrasted
with non-specific template replication (i.e., replication that is
template-dependent but not dependent on a specific template).
Template specificity is here distinguished from fidelity of
replication (i.e., synthesis of the proper polynucleotide sequence)
and nucleotide (ribo- or deoxyribo-) specificity. Template
specificity is frequently described in terms of "target"
specificity. Target sequences are "targets" in the sense that they
are sought to be sorted out from other nucleic acid. Amplification
techniques have been designed primarily for this sorting out.
[0071] Template specificity is achieved in most amplification
techniques by the choice of enzyme. Amplification enzymes are
enzymes that, under the conditions in which they are used, will process only
specific sequences of nucleic acid in a heterogeneous mixture of
nucleic acid. For example, in the case of Qβ replicase, MDV-1
RNA is the specific template for the replicase (D. L. Kacian et
al., Proc. Natl. Acad. Sci. USA 69:3038 [1972]). Other nucleic acid
will not be replicated by this amplification enzyme. Similarly, in
the case of T7 RNA polymerase, this amplification enzyme has a
stringent specificity for its own promoters (Chamberlin et al.,
Nature 228:227 [1970]). In the case of T4 DNA ligase, the enzyme
will not ligate the two oligonucleotides or polynucleotides, where
there is a mismatch between the oligonucleotide or polynucleotide
substrate and the template at the ligation junction (D. Y. Wu and
R. B. Wallace, Genomics 4:560 [1989]). Finally, Taq and Pfu
polymerases, by virtue of their ability to function at high
temperature, are found to display high specificity for the
sequences bounded and thus defined by the primers; the high
temperature results in thermodynamic conditions that favor primer
hybridization with the target sequences and not hybridization with
non-target sequences (H. A. Erlich (ed.), PCR Technology, Stockton
Press [1989]).
[0072] As used herein, the term "amplifiable nucleic acid" is used
in reference to nucleic acids that may be amplified by any
amplification method. It is contemplated that "amplifiable nucleic
acid" will usually comprise "sample template."
[0073] As used herein, the term "sample template" refers to nucleic
acid originating from a sample that is analyzed for the presence of
"target" (defined below). In contrast, "background template" is
used in reference to nucleic acid other than sample template that
may or may not be present in a sample. Background template is most
often inadvertent. It may be the result of carryover, or it may be
due to the presence of nucleic acid contaminants sought to be
purified away from the sample. For example, nucleic acids from
organisms other than those to be detected may be present as
background in a test sample.
[0074] As used herein, the term "primer" refers to an
oligonucleotide, whether occurring naturally as in a purified
restriction digest or produced synthetically, which is capable of
acting as a point of initiation of synthesis when placed under
conditions in which synthesis of a primer extension product which
is complementary to a nucleic acid strand is induced, (i.e., in the
presence of nucleotides and an inducing agent such as DNA
polymerase and at a suitable temperature and pH). The primer is
preferably single stranded for maximum efficiency in amplification,
but may alternatively be double stranded. If double stranded, the
primer is first treated to separate its strands before being used
to prepare extension products. Preferably, the primer is an
oligodeoxyribonucleotide. The primer must be sufficiently long to
prime the synthesis of extension products in the presence of the
inducing agent. The exact lengths of the primers will depend on
many factors, including temperature, source of primer and the use
of the method.
[0075] As used herein, the term "probe" refers to an
oligonucleotide (i.e., a sequence of nucleotides), whether
occurring naturally as in a purified restriction digest or produced
synthetically, recombinantly or by PCR amplification, that is
capable of hybridizing to another oligonucleotide of interest. A
probe may be single-stranded or double-stranded. Probes are useful
in the detection, identification and isolation of particular gene
sequences. It is contemplated that any probe used in the present
invention will be labeled with any "reporter molecule," so that it is
detectable in any detection system, including, but not limited to
enzyme (e.g., ELISA, as well as enzyme-based histochemical assays),
fluorescent, radioactive, and luminescent systems. It is not
intended that the present invention be limited to any particular
detection system or label.
[0076] As used herein, the term "target," refers to a nucleic acid
sequence or structure to be detected or characterized. Thus, the
"target" is sought to be sorted out from other nucleic acid
sequences. A "segment" is defined as a region of nucleic acid
within the target sequence.
[0077] As used herein, the term "polymerase chain reaction" ("PCR")
refers to the method of K. B. Mullis described in U.S. Pat. Nos.
4,683,195, 4,683,202, and 4,965,188, hereby incorporated by reference,
for increasing the concentration of a segment of
a target sequence in a mixture of genomic DNA without cloning or
purification. This process for amplifying the target sequence
consists of introducing a large excess of two oligonucleotide
primers to the DNA mixture containing the desired target sequence,
followed by a precise sequence of thermal cycling in the presence
of a DNA polymerase. The two primers are complementary to their
respective strands of the double stranded target sequence. To
effect amplification, the mixture is denatured and the primers then
annealed to their complementary sequences within the target
molecule. Following annealing, the primers are extended with a
polymerase so as to form a new pair of complementary strands. The
steps of denaturation, primer annealing, and polymerase extension
can be repeated many times (i.e., denaturation, annealing and
extension constitute one "cycle"; there can be numerous "cycles")
to obtain a high concentration of an amplified segment of the
desired target sequence. The length of the amplified segment of the
desired target sequence is determined by the relative positions of
the primers with respect to each other, and therefore, this length
is a controllable parameter. By virtue of the repeating aspect of
the process, the method is referred to as the "polymerase chain
reaction" (hereinafter "PCR"). Because the desired amplified
segments of the target sequence become the predominant sequences
(in terms of concentration) in the mixture, they are said to be
"PCR amplified."
[0078] With PCR, it is possible to amplify a single copy of a
specific target sequence in genomic DNA to a level detectable by
several different methodologies (e.g., hybridization with a labeled
probe; incorporation of biotinylated primers followed by
avidin-enzyme conjugate detection; incorporation of
³²P-labeled deoxynucleotide triphosphates, such as dCTP or
dATP, into the amplified segment). In addition to genomic DNA, any
oligonucleotide or polynucleotide sequence can be amplified with
the appropriate set of primer molecules. In particular, the
amplified segments created by the PCR process itself are,
themselves, efficient templates for subsequent PCR
amplifications.
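By way of illustration only, the accumulation of amplified segment over repeated cycles may be estimated as follows. The sketch assumes, as an idealization not stated above, that each cycle multiplies the number of target copies by (1 + efficiency), where an efficiency of 1.0 corresponds to perfect doubling; the function name and the efficiency parameter are hypothetical, and real reactions plateau as reagents are consumed.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of target copies after a given number of PCR cycles.

    efficiency is the fraction of templates copied per cycle
    (1.0 = perfect doubling). This is an illustrative model only.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles
```

Under this idealization, a single copy subjected to 30 cycles at perfect efficiency yields 2 to the 30th power copies, or roughly one billion, which is consistent with a single copy becoming detectable.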
[0079] As used herein, the terms "PCR product," "PCR fragment," and
"amplification product" refer to the resultant mixture of compounds
after two or more cycles of the PCR steps of denaturation,
annealing and extension are complete. These terms encompass the
case where there has been amplification of one or more segments of
one or more target sequences.
[0080] As used herein, the term "amplification reagents" refers to
those reagents (deoxyribonucleotide triphosphates, buffer, etc.),
needed for amplification except for primers, nucleic acid template,
and the amplification enzyme. Typically, amplification reagents
along with other reaction components are placed and contained in a
reaction vessel (test tube, microwell, etc.).
[0081] As used herein, the terms "restriction endonucleases" and
"restriction enzymes" refer to bacterial enzymes, each of which cut
double-stranded DNA at or near a specific nucleotide sequence.
[0082] As used herein, the term "recombinant DNA molecule" refers
to a DNA molecule that is comprised of segments of
DNA joined together by means of molecular biological
techniques.
[0083] As used herein, the term "antisense" is used in reference to
RNA sequences that are complementary to a specific RNA sequence
(e.g., mRNA). Included within this definition are antisense RNA
("asRNA") molecules involved in gene regulation by bacteria.
Antisense RNA may be produced by any method, including synthesis by
splicing the gene(s) of interest in a reverse orientation to a
viral promoter that permits the synthesis of a coding strand. Once
introduced into an embryo, this transcribed strand combines with
natural mRNA produced by the embryo to form duplexes. These
duplexes then block either the further transcription of the mRNA or
its translation. In this manner, mutant phenotypes may be
generated. The term "antisense strand" is used in reference to a
nucleic acid strand that is complementary to the "sense" strand.
The designation (-) (i.e., "negative") is sometimes used in
reference to the antisense strand, with the designation (+)
sometimes used in reference to the sense (i.e., "positive")
strand.
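By way of illustration only, the relationship between a sense (+) strand and its complementary antisense (-) strand may be sketched as a reverse complement. The sketch assumes a DNA alphabet (A, C, G, T) and the common convention of writing both strands 5' to 3'; the function name is hypothetical.

```python
# Watson-Crick complements for a DNA alphabet (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_strand(sense: str) -> str:
    """Return the strand complementary to `sense`, written 5' to 3'."""
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return sense.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sense strand, reflecting that the two strands are mutually complementary.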
[0084] The term "isolated" when used in relation to a nucleic acid,
as in "an isolated oligonucleotide" or "isolated polynucleotide"
refers to a nucleic acid sequence that is identified and separated
from at least one contaminant nucleic acid with which it is
ordinarily associated in its natural source. Isolated nucleic acid
is present in a form or setting that is different from that in
which it is found in nature. In contrast, non-isolated nucleic
acids are nucleic acids such as DNA and RNA found in the state they
exist in nature. For example, a given DNA sequence (e.g., a gene)
is found on the host cell chromosome in proximity to neighboring
genes; RNA sequences, such as a specific mRNA sequence encoding a
specific protein, are found in the cell as a mixture with numerous
other mRNAs that encode a multitude of proteins. However, isolated
nucleic acid encoding NOD includes, by way of example, such nucleic
acid in cells ordinarily expressing NOD where the nucleic acid is
in a chromosomal location different from that of natural cells, or
is otherwise flanked by a different nucleic acid sequence than that
found in nature. The isolated nucleic acid, oligonucleotide, or
polynucleotide may be present in single-stranded or double-stranded
form. When an isolated nucleic acid, oligonucleotide or
polynucleotide is to be utilized to express a protein, the
oligonucleotide or polynucleotide will contain at a minimum the
sense or coding strand (i.e., the oligonucleotide or polynucleotide
may be single-stranded), but may contain both the sense and anti-sense
strands (i.e., the oligonucleotide or polynucleotide may be
double-stranded).
[0085] As used herein, a "portion of a chromosome" refers to a
discrete section of the chromosome. Chromosomes are divided into
sites or sections by cytogeneticists as follows: the short
(relative to the centromere) arm of a chromosome is termed the "p"
arm; the long arm is termed the "q" arm. Each arm is then divided
into 2 regions termed region 1 and region 2 (region 1 is closest to
the centromere). Each region is further divided into bands. The
bands may be further divided into sub-bands. For example, the
11p15.5 portion of human chromosome 11 is the portion located on
chromosome 11 (11) on the short arm (p) in the first region (1) in
the 5th band (5) in sub-band 5 (.5). A portion of a chromosome may
be "altered;" for instance the entire portion may be absent due to
a deletion or may be rearranged (e.g., inversions, translocations,
expanded or contracted due to changes in repeat regions). In the
case of a deletion, an attempt to hybridize (i.e., specifically
bind) a probe homologous to a particular portion of a chromosome
could result in a negative result (i.e., the probe could not bind
to the sample containing genetic material suspected of containing
the missing portion of the chromosome). Thus, hybridization of a
probe homologous to a particular portion of a chromosome may be
used to detect alterations in a portion of a chromosome.
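By way of illustration only, a chromosome portion written in the notation of the example above (e.g., 11p15.5) may be split into its parts as follows. The sketch assumes a single-digit region and a single-digit band, as in that example; the class name, field names, and function name are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class ChromosomePortion(NamedTuple):
    chromosome: str          # e.g., "11", "X"
    arm: str                 # "p" (short arm) or "q" (long arm)
    region: int
    band: int
    sub_band: Optional[int]  # None when no sub-band is given


# Assumed layout: chromosome, arm, one region digit, one band digit,
# and an optional ".sub-band" suffix.
_PORTION = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d)(\d)(?:\.(\d+))?$")


def parse_portion(label: str) -> ChromosomePortion:
    """Parse a label such as '11p15.5' into its named components."""
    match = _PORTION.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized chromosome portion: {label!r}")
    chromosome, arm, region, band, sub_band = match.groups()
    return ChromosomePortion(
        chromosome,
        arm,
        int(region),
        int(band),
        int(sub_band) if sub_band is not None else None,
    )
```

Parsing "11p15.5" in this way yields chromosome 11, arm p, region 1, band 5, and sub-band 5, matching the breakdown given above.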
[0086] The term "sequences associated with a chromosome" means
preparations of chromosomes (e.g., spreads of metaphase
chromosomes), nucleic acid extracted from a sample containing
chromosomal DNA (e.g., preparations of genomic DNA); the RNA that
is produced by transcription of genes located on a chromosome
(e.g., hnRNA and mRNA), and cDNA copies of the RNA transcribed from
the DNA located on a chromosome. Sequences associated with a
chromosome may be detected by numerous techniques including probing
of Southern and Northern blots and in situ hybridization to RNA,
DNA, or metaphase chromosomes with probes containing sequences
homologous to the nucleic acids in the above listed
preparations.
[0087] As used herein the term "portion" when in reference to a
nucleotide sequence (as in "a portion of a given nucleotide
sequence") refers to fragments of that sequence. The fragments may
range in size from four nucleotides to the entire nucleotide
sequence minus one nucleotide (10 nucleotides, 20, 30, 40, 50, 100,
200, etc.).
[0088] As used herein the term "coding region" when used in
reference to a structural gene refers to the nucleotide sequences
that encode the amino acids found in the nascent polypeptide as a
result of translation of a mRNA molecule. The coding region is
bounded, in eukaryotes, on the 5' side by the nucleotide triplet
"ATG" that encodes the initiator methionine and on the 3' side by
one of the three triplets, which specify stop codons (i.e., TAA,
TAG, TGA).
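By way of illustration only, the boundaries of a coding region as defined above may be located in a DNA sequence as follows. The sketch assumes that the first "ATG" in the sequence is the initiator codon and reads in that frame until the first stop codon; the function name is hypothetical, and real gene structures (e.g., introns or alternative start sites) are not handled.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_region(sequence: str) -> Optional[str]:
    """Return the region from the first ATG through the first in-frame stop codon.

    Returns None if no ATG is found or no in-frame stop codon follows it.
    """
    sequence = sequence.upper()
    start = sequence.find("ATG")
    if start == -1:
        return None
    # Step through the sequence one triplet at a time in the ATG frame.
    for i in range(start, len(sequence) - 2, 3):
        if sequence[i:i + 3] in STOP_CODONS:
            return sequence[start:i + 3]
    return None
```

For example, in the sequence CCATGAAATTTTAGGG the region ATGAAATTTTAG is bounded on the 5' side by ATG and on the 3' side by the stop codon TAG.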
[0089] As used herein, the term "purified" or "to purify" refers to
the removal of contaminants from a sample. For example, NOD
antibodies are purified by removal of contaminating
non-immunoglobulin proteins; they are also purified by the removal
of immunoglobulin that does not bind a NOD polypeptide. The removal
of non-immunoglobulin proteins and/or the removal of
immunoglobulins that do not bind a NOD polypeptide results in an
increase in the percent of NOD-reactive immunoglobulins in the
sample. In another example, recombinant NOD polypeptides are
expressed in bacterial host cells and the polypeptides are purified
by the removal of host cell proteins; the percent of recombinant
NOD polypeptides is thereby increased in the sample.
[0090] The term "recombinant DNA molecule" as used herein refers to
a DNA molecule that is comprised of segments of DNA joined together
by means of molecular biological techniques.
[0091] The term "recombinant protein" or "recombinant polypeptide"
as used herein refers to a protein molecule that is expressed from
a recombinant DNA molecule.
[0092] The term "native protein" as used herein, is used to
indicate a protein that does not contain amino acid residues
encoded by vector sequences; that is the native protein contains
only those amino acids found in the protein as it occurs in nature.
A native protein may be produced by recombinant means or may be
isolated from a naturally occurring source.
[0093] As used herein the term "portion" when in reference to a
protein (as in "a portion of a given protein") refers to fragments
of that protein. The fragments may range in size from four
consecutive amino acid residues to the entire amino acid sequence
minus one amino acid.
[0094] The term "Southern blot," refers to the analysis of DNA on
agarose or acrylamide gels to fractionate the DNA according to size
followed by transfer of the DNA from the gel to a solid support,
such as nitrocellulose or a nylon membrane. The immobilized DNA is
then probed with a labeled probe to detect DNA species
complementary to the probe used. The DNA may be cleaved with
restriction enzymes prior to electrophoresis. Following
electrophoresis, the DNA may be partially depurinated and denatured
prior to or during transfer to the solid support. Southern blots
are a standard tool of molecular biologists (J. Sambrook et al.,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press,
NY, pp 9.31-9.58 [1989]).
[0095] The term "Northern blot," as used herein refers to the
analysis of RNA by electrophoresis of RNA on agarose gels to
fractionate the RNA according to size followed by transfer of the
RNA from the gel to a solid support, such as nitrocellulose or a
nylon membrane. The immobilized RNA is then probed with a labeled
probe to detect RNA species complementary to the probe used.
Northern blots are a standard tool of molecular biologists (J.
Sambrook, et al., supra, pp 7.39-7.52 [1989]).
[0096] The term "Western blot" refers to the analysis of protein(s)
(or polypeptides) immobilized onto a support such as nitrocellulose
or a membrane. The proteins are run on acrylamide gels to separate
the proteins, followed by transfer of the protein from the gel to a
solid support, such as nitrocellulose or a nylon membrane. The
immobilized proteins are then exposed to antibodies with reactivity
against an antigen of interest. The binding of the antibodies may
be detected by various methods, including the use of radiolabeled
antibodies.
[0097] The term "antigenic determinant" as used herein refers to
that portion of an antigen that makes contact with a particular
antibody (i.e., an epitope). When a protein or fragment of a
protein is used to immunize a host animal, numerous regions of the
protein may induce the production of antibodies that bind
specifically to a given region or three-dimensional structure on
the protein; these regions or structures are referred to as
antigenic determinants. An antigenic determinant may compete with
the intact antigen (i.e., the "immunogen" used to elicit the immune
response) for binding to an antibody.
[0098] The term "transgene" as used herein refers to a foreign,
heterologous, or autologous gene that is placed into an organism by
introducing the gene into newly fertilized eggs or early embryos.
The term "foreign gene" refers to any nucleic acid (e.g., gene
sequence) that is introduced into the genome of an animal by
experimental manipulations and may include gene sequences found in
that animal so long as the introduced gene does not reside in the
same location as does the naturally-occurring gene. The term
"autologous gene" is intended to encompass variants (e.g.,
polymorphisms or mutants) of the naturally occurring gene. The term
transgene thus encompasses the replacement of the naturally
occurring gene with a variant form of the gene.
[0099] As used herein, the term "vector" is used in reference to
nucleic acid molecules that transfer DNA segment(s) from one cell
to another. The term "vehicle" is sometimes used interchangeably
with "vector."
[0100] The term "expression vector" as used herein refers to a
recombinant DNA molecule containing a desired coding sequence and
appropriate nucleic acid sequences necessary for the expression of
the operably linked coding sequence in a particular host organism.
Nucleic acid sequences necessary for expression in prokaryotes
usually include a promoter, an operator (optional), and a ribosome
binding site, often along with other sequences. Eukaryotic cells
are known to utilize promoters, enhancers, and termination and
polyadenylation signals.
[0101] As used herein, the term "host cell" refers to any
eukaryotic or prokaryotic cell (e.g., bacterial cells such as E.
coli, yeast cells, mammalian cells, avian cells, amphibian cells,
plant cells, fish cells, and insect cells), whether located in
vitro or in vivo. For example, host cells may be located in a
transgenic animal.
[0102] The terms "overexpression" and "overexpressing" and
grammatical equivalents, are used in reference to levels of mRNA to
indicate a level of expression approximately 3-fold higher than
that typically observed in a given tissue in a control or
non-transgenic animal. Levels of mRNA are measured using any of a
number of techniques known to those skilled in the art including,
but not limited to, Northern blot analysis (See, Example 10, for a
protocol for performing Northern blot analysis). Appropriate
controls are included on the Northern blot to control for
differences in the amount of RNA loaded from each tissue analyzed
(e.g., the amount of 28S rRNA, an abundant RNA transcript present
at essentially the same amount in all tissues, present in each
sample can be used as a means of normalizing or standardizing the
NOD mRNA-specific signal observed on Northern blots). The amount
of mRNA present in the band corresponding in size to the correctly
spliced NOD transgene RNA is quantified; other minor species of RNA
which hybridize to the transgene probe are not considered in the
quantification of the expression of the transgenic mRNA.
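By way of illustration only, the normalization and fold comparison described above may be sketched as follows. The sketch assumes that band intensities have already been quantified in arbitrary units, and it treats "approximately 3-fold higher" as a fold change of at least 3.0; that cutoff, the function names, and the parameter names are assumptions introduced for this sketch.

```python
def normalized_signal(mrna_signal: float, rrna_28s_signal: float) -> float:
    """Normalize an mRNA band signal to the 28S rRNA loading control."""
    if rrna_28s_signal <= 0:
        raise ValueError("28S rRNA signal must be positive")
    return mrna_signal / rrna_28s_signal


def fold_change(
    test_mrna: float,
    test_28s: float,
    control_mrna: float,
    control_28s: float,
) -> float:
    """Ratio of normalized signal in the test tissue to that in the control."""
    control = normalized_signal(control_mrna, control_28s)
    if control <= 0:
        raise ValueError("control signal must be positive")
    return normalized_signal(test_mrna, test_28s) / control


def is_overexpressed(fold: float, threshold: float = 3.0) -> bool:
    """Assumed cutoff: a fold change of at least `threshold` is overexpression."""
    return fold >= threshold
```

For example, a test sample with an mRNA signal of 600 and a 28S signal of 100, compared with a control of 200 and 100, gives a fold change of 3.0.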
[0103] The term "transfection" as used herein refers to the
introduction of foreign DNA into eukaryotic cells. Transfection may
be accomplished by a variety of means known to the art including
calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated
transfection, polybrene-mediated transfection, electroporation,
microinjection, liposome fusion, lipofection, protoplast fusion,
retroviral infection, and biolistics.
[0104] The term "stable transfection" or "stably transfected"
refers to the introduction and integration of foreign DNA into the
genome of the transfected cell. The term "stable transfectant"
refers to a cell that has stably integrated foreign DNA into the
genomic DNA.
[0105] The term "transient transfection" or "transiently
transfected" refers to the introduction of foreign DNA into a cell
where the foreign DNA fails to integrate into the genome of the
transfected cell. The foreign DNA persists in the nucleus of the
transfected cell for several days. During this time the foreign DNA
is subject to the regulatory controls that govern the expression of
endogenous genes in the chromosomes. The term "transient
transfectant" refers to cells that have taken up foreign DNA but
have failed to integrate this DNA.
[0106] The term "calcium phosphate co-precipitation" refers to a
technique for the introduction of nucleic acids into a cell. The
uptake of nucleic acids by cells is enhanced when the nucleic acid
is presented as a calcium phosphate-nucleic acid co-precipitate.
The original technique of Graham and van der Eb (Graham and van der
Eb, Virol., 52:456 [1973]), has been modified by several groups to
optimize conditions for particular types of cells. The art is well
aware of these numerous modifications.
[0107] A "composition comprising a given polynucleotide sequence"
as used herein refers broadly to any composition containing the
given polynucleotide sequence. The composition may comprise an
aqueous solution. Compositions comprising polynucleotide sequences
encoding NODs (e.g., SEQ ID NOs:1-11) or fragments thereof may be
employed as hybridization probes. In this case, the NOD encoding
polynucleotide sequences are typically employed in an aqueous
solution containing salts (e.g., NaCl), detergents (e.g., SDS), and
other components (e.g., Denhardt's solution, dry milk, salmon sperm
DNA, etc.).
[0108] The term "test compound" refers to any chemical entity,
pharmaceutical, drug, and the like that can be used to treat or
prevent a disease, illness, sickness, or disorder of bodily
function, or otherwise alter the physiological or cellular status
of a sample. Test compounds comprise both known and potential
therapeutic compounds. A test compound can be determined to be
therapeutic by screening using the screening methods of the present
invention. A "known therapeutic compound" refers to a therapeutic
compound that has been shown (e.g., through animal trials or prior
experience with administration to humans) to be effective in such
treatment or prevention.
[0109] The term "sample" as used herein is used in its broadest
sense. A sample suspected of containing a human chromosome or
sequences associated with a human chromosome may comprise a cell,
chromosomes isolated from a cell (e.g., a spread of metaphase
chromosomes), genomic DNA (in solution or bound to a solid support
such as for Southern blot analysis), RNA (in solution or bound to a
solid support such as for Northern blot analysis), cDNA (in
solution or bound to a solid support) and the like. A sample
suspected of containing a protein may comprise a cell, a portion of
a tissue, an extract containing one or more proteins and the
like.
[0110] As used herein, the term "response," when used in reference
to an assay, refers to the generation of a detectable signal (e.g.,
accumulation of reporter protein, increase in ion concentration,
accumulation of a detectable chemical product).
[0111] As used herein, the term "reporter gene" refers to a gene
encoding a protein that may be assayed. Examples of reporter genes
include, but are not limited to, luciferase (See, e.g., deWet et
al., Mol. Cell. Biol. 7:725 [1987] and U.S. Pat. Nos. 6,074,859;
5,976,796; 5,674,713; and 5,618,682; all of which are incorporated
herein by reference), green fluorescent protein (e.g., GenBank
Accession Number U43284; a number of GFP variants are commercially
available from CLONTECH Laboratories, Palo Alto, Calif.),
chloramphenicol acetyltransferase, β-galactosidase, alkaline
phosphatase, and horseradish peroxidase.
[0112] As used herein, the terms "computer memory" and "computer
memory device" refer to any storage media readable by a computer
processor. Examples of computer memory include, but are not limited
to, RAM, ROM, computer chips, digital video discs (DVDs), compact
discs (CDs), hard disk drives (HDD), and magnetic tape.
[0113] As used herein, the term "computer readable medium" refers
to any device or system for storing and providing information
(e.g., data and instructions) to a computer processor. Examples of
computer readable media include, but are not limited to, DVDs, CDs,
hard disk drives, magnetic tape and servers for streaming media
over networks.
[0114] As used herein, the term "entering" as in "entering said
genetic variation information into said computer" refers to
transferring information to a "computer readable medium."
Information may be transferred by any suitable method, including
but not limited to, manually (e.g., by typing into a computer) or
automatically (e.g., by transfer from another "computer readable
medium" via a "processor").
[0115] As used herein, the terms "processor" and "central
processing unit" or "CPU" are used interchangeably and refer to a
device that is able to read a program from a computer memory (e.g.,
ROM or other computer memory) and perform a set of steps according
to the program.
[0116] As used herein, the term "computer implemented method"
refers to a method utilizing a "CPU" and "computer readable
medium."
GENERAL DESCRIPTION OF THE INVENTION
[0117] The nucleotide-binding oligomerization domain (NOD) was
first found in Apaf-1 and its nematode homologue CED-4, two pivotal
regulators of developmental and p53-dependent programmed cell death
(Lui and Hengartner, supra; Derry et al., supra). Subsequently, two
NOD-containing molecules, NOD1 (CARD4) and NOD2, were identified
through database searches for Apaf-1/CED-4 homologues. Since then,
the NOD protein family has greatly expanded and currently contains
a large number of proteins from animals, plants, fungi and
bacteria, including >20 human proteins homologous to Apaf-1 and
NOD1 (FIG. 1). The majority of NOD family members comprise three
distinct functional domains: an amino-terminal effector binding
domain (EBD), a centrally located NOD and a carboxy-terminal ligand
recognition domain (LRD) (Table 2). The NOD mediates
self-oligomerization, which, in some embodiments, functions
in the activation of downstream effector molecules. The EBD of
mammalian NOD proteins mediates the binding to effector molecules
which determines the downstream events activated upon signaling,
including apoptosis and NF-.kappa.B activation (Table 2).
[0118] Some NOD proteins share the same type of effector domain
(e.g., CARD or PYD). In some embodiments, the NOD proteins activate
different signaling cascades as the interaction between these
domains and those present in downstream binding partners is highly
specific. For example, the PYD of ASC, a downstream adaptor
molecule involved in NOD signalling, associates with the PYD of
cryopyrin, but not with the PYD present in NALP2, PAN2, PYPAF3,
PYPAF4, PYPAF6 or NOD27 (Grenier et al., FEBS Lett. 530, 73-78
(2002)). In other embodiments, certain NOD proteins like NOD1 and
NOD2 interact with and use a common downstream molecule, RICK, to
activate identical or similar signalling pathways (FIG. 3).
Transient expression of NOD1 and NOD2 in mammalian cells induces
NF-.kappa.B activation (Bertin et al., J. Biol. Chem. 274,
12955-12958 (1999); Inohara et al., J. Biol. Chem. 274, 14560-14567
(1999); Ogura et al., J. Biol. Chem. 276, 4812-4818 (2001)).
Mutational analyses demonstrated that the CARDs and the NODs of NOD1
and NOD2 were required for the induction of NF-.kappa.B, whereas
their LRRs were dispensable (Inohara et al., J. Biol. Chem. 274,
14560-14567 (1999); Ogura et al., J. Biol. Chem. 276, 4812-4818
(2001)). Thus, in some embodiments, the CARDs act as effector
domains for NOD1 and NOD2 signalling. Both NOD1 and NOD2
physically associate with RICK, a CARD-containing protein kinase,
through homophilic CARD-CARD interactions (Inohara et al., J.
Biol. Chem. 274, 14560-14567 (1999); Ogura et al., J. Biol. Chem.
276, 4812-4818 (2001)). A role for RICK in NOD1 and NOD2 signalling
is supported by several studies (Inohara et al., supra; Ogura et
al., supra).
[0119] Several NOD-LRR proteins, including IPAF, cryopyrin, and
DEFCAP, associate with ASC (Manji et al., J. Biol. Chem. 277,
11570-11575 (2002); Geddes et al., Biochem. Biophys. Res. Commun.
284, 77-82 (2001); Martinon et al., Mol. Cell. 10, 417-426 (2002)).
ASC (also called TMS1/PYCARD) is an adaptor molecule originally
identified in a sub-cytosolic fraction called the "speck" in cells
undergoing apoptosis. ASC is composed of an amino-terminal PYD and
a carboxy-terminal CARD. Co-expression of ASC with several
PYD-containing NOD proteins including cryopyrin, PYPAF5 or PYPAF7,
as well as with the CARD-containing IPAF, induces NF-.kappa.B
activation (Manji et al., supra). Thus, in some embodiments,
PYD-containing NOD proteins use the adaptor ASC for signaling
(Grenier et al., supra). NF-.kappa.B activation induced through ASC
signalling is inhibited by dominant forms of NEMO/IKK.gamma. (Manji
et al., supra; Grenier et al., supra). Thus, ASC signals, as was
reported for RICK, through the common IKK signalling pathway of
NF-.kappa.B activation (FIG. 3).
[0120] Multiple NOD proteins including NOD1, NOD2, IPAF and DEFCAP
promote activation of pro-inflammatory caspases. For example, NOD1
promotes caspase-1 activation in transient overexpression studies
(Yoo et al., Biochem. Biophys. Res. Commun. 299, 652-658 (2002)).
IPAF, cryopyrin, DEFCAP, PYPAF5 and PYPAF7 have been found to
regulate, in the presence of ASC, the activation of caspase-1,
interleukin-1.beta. converting enzyme (Grenier et al., supra; Wang
et al., J. Biol. Chem. 277, 29874-29880 (2002)). DEFCAP, the only
NOD family member known to possess both a CARD and PYD, can form an
endogenous multi-protein complex containing ASC, caspase-1 and
caspase-5 dubbed "the inflammasome" which promotes caspase
activation and processing of pro-interleukin-1.beta. (Martinon et
al., Mol. Cell. 10, 417-426 (2002)).
[0121] In some embodiments, NOD proteins (e.g., Apaf-1, NOD1,
NOD2, DEFCAP, IPAF and cryopyrin) induce or enhance apoptosis
(Inohara et al., J. Biol. Chem. 274, 14560-14567 (1999); Ogura et
al., J. Biol. Chem. 276, 4812-4818 (2001); Geddes et al., Biochem.
Biophys. Res. Commun. 284, 77-82 (2001); Poyet et al., J. Biol.
Chem. 276, 28309-28313 (2001); Hlaing et al., J. Biol. Chem. 276,
9230-9238 (2001); Zou et al., Cell 90, 405-413 (1997)). For
example, NOD1 and DEFCAP interact with multiple caspases and/or
Apaf-1 (Hlaing et al., supra; Inohara and Nunez, Oncogene, 20,
6473-6481 (2001)). Co-expression of IPAF or cryopyrin with ASC or
forced oligomerization of IPAF or cryopyrin induces apoptosis in
mammalian cells, which requires caspase activity. NOD1, IPAF,
cryopyrin, PYPAF5 and PYPAF7 induce both NF-.kappa.B and caspase-1
activation. Thus, in some embodiments, NOD pro-apoptotic activity
results from the activation of inflammatory caspases. In other
embodiments, apoptotic caspases contribute to the activation of
inflammatory caspases.
[0122] In some embodiments, the induction of both NF-.kappa.B and
apoptosis by NOD proteins is similar to that observed with TLRs,
PKR and death receptors (DRs), which induce apoptosis through the
activation of caspases. Upon DR signalling, the induction of
apoptosis is suppressed in vivo by simultaneous activation of
NF-.kappa.B, which leads to the expression of anti-apoptotic genes (Beg
and Baltimore, Science 274, 782-784 (1996); Wang et al., Science
281, 1680-1683 (1998); Micheau et al., Mol. Cell. Biol. 21,
5299-5305 (2001)). Thus, in some embodiments, under physiological
conditions, the pro-apoptotic activity induced through NOD proteins
is suppressed by simultaneous induction of NF-.kappa.B
activity.
[0123] Genetic variation in three human NOD proteins has been
implicated in the development of genetic diseases (Hull et al.,
Curr Opin Rheumatol. 15, 61-69 (2003)). For example, mutations in
CIITA are known to cause type II lymphocyte bare syndrome (LBS), a
hereditary immunodeficiency disorder characterized by the absence
of MHCII expression (Steimle et al., Cell 75, 135-146 (1993); Reith
and Mach, Annu Rev Immunol. 19, 331-373. (2001)). More recently,
mutations in NOD2 and CIAS1 (the gene encoding cryopyrin) have been
implicated in several autoinflammatory diseases. A frameshift
mutation, L1007fsinsC, and two missense mutations (G908R and R702W)
in NOD2 are associated with Crohn's disease (CD), a common
inflammatory disease of the intestinal tract (Ogura et al., Nature
411, 603-606 (2001); Hugot et al., Nature 411, 599-603 (2001);
Hampe et al., Lancet 357, 1925-1928 (2001)). Having one copy of the
mutated alleles confers a 2-4-fold increased risk of developing CD,
whereas homozygosity or compound heterozygosity for NOD2 mutations
increases the risk 20-40-fold, indicating that lack of NOD2
function is important for disease development. All three
CD-associated mutations result in proteins that are deficient in
inducing PGN- and MDP-mediated NF-.kappa.B activation. Activation
of NF-.kappa.B induced by MDP is absent in mononuclear cells
derived from CD patients homozygous for L1007fsinsC.
[0124] In addition to CD, missense mutations in the coding region
of NOD2 have been associated with Blau syndrome, an autosomal
dominant trait characterized by arthritis, uveitis and skin rashes
(Miceli-Richard et al., Nat. Genet. 29, 19-20 (2001)). NOD2
mutations resulting in Blau syndrome are located in the NOD
(Miceli-Richard et al., supra). NOD2 mutant proteins found in
patients with Blau syndrome induce increased basal NF-.kappa.B
activity, when compared to wild-type NOD2. Thus, variant proteins
found in patients with Blau syndrome may represent constitutively
active NOD2 mutations. This is in contrast to CD-associated NOD2
variants, which have normal or reduced levels of basal activity but
are defective in their response to bacterial components (Ogura et al.,
Nature 411, 603-606 (2001); Bonen et al., Gastroenterology 124,
140-146 (2003)).
[0125] Mutations in the CIAS1 gene, which encodes cryopyrin, are
the cause of several autoinflammatory syndromes characterized by
recurrent episodes of seemingly unprovoked inflammation (Hoffman et
al., Nature Genet. 29, 301-305 (2001); Feldmann et al., Am. J. Hum.
Genet. 71, 198-203 (2002); Aksentijevich et al., Arthritis Rheum.
46, 3340-3348 (2002); Aganna et al., Arthritis Rheum., 46,
2445-2452 (2002)). These autosomal-dominant diseases include
familial cold autoinflammatory syndrome (FACS), Muckle-Wells
syndrome (MWS) and neonatal-onset multisystem inflammatory disease
(NOMID, also known as chronic infantile neurologic cutaneous
articular syndrome or CINCA). Patients with FACS, MWS and NOMID
carry missense mutations that localize to the NOD of cryopyrin. The
R260W mutation associated with FACS and MWS corresponds to the
R334W NOD2 mutation found in Blau syndrome (Miceli-Richard et al.,
supra). The present invention is not limited to a particular
mechanism. Indeed, an understanding of the mechanism of the present
invention is not required to practice the present invention.
Nonetheless, it is contemplated that this observation suggests that
R260W cryopyrin may represent a constitutively active mutation
which may lead to a deregulated activation of NF-.kappa.B and
inflammatory caspases (FIG. 5).
[0126] Pyrin has been implicated in familial Mediterranean fever
(FMF), an autosomal-recessive disease characterized by recurrent
episodes of fever and localized inflammation (The International FMF
Consortium, Cell 90, 797-807 (1997)). The gene mutated in FMF
encodes a protein called pyrin, which is composed of an
amino-terminal PYD, a B-type zinc-finger box, a coiled coil, a PRY
domain and a SplA and ryanodine receptor (SPRY) domain (The
International FMF Consortium, supra).
[0127] In some embodiments, the present invention provides novel
NOD genes (e.g., those described in SEQ ID NOs: 1-22 and Table 1).
The novel NOD genes of the present invention were identified by
searching public gene databases for proteins with homology to known
NOD proteins. The present invention is not limited to a particular
mechanism. Indeed, an understanding of the mechanism of the present
invention is not necessary to practice the present invention.
Nonetheless, it is contemplated that these genes are associated
with inflammatory diseases. In particular, linkage analysis of
NOD27 conducted during the course of development of the present
invention revealed a locus in the chromosomal region that is
associated with psoriasis. Accordingly, it is
further contemplated that NOD27 is associated with psoriasis.
[0128] In some embodiments, the present invention provides an
"expression profile" of inflammatory diseases. For example, in some
embodiments, the expression and/or presence of variant alleles of
the NOD proteins of the present invention is determined. Such
expression profiles can then be correlated with disease states or
susceptibility to disease.
DETAILED DESCRIPTION OF THE INVENTION
[0129] The present invention relates to the NOD proteins and
nucleic acids encoding the NOD proteins. The present invention
further provides assays for the detection of NOD polymorphisms and
mutations associated with disease states. Exemplary embodiments of
the present invention are described below.
I. NOD Polynucleotides
[0130] As described above, the present invention provides novel NOD
family genes. Accordingly, the present invention provides nucleic
acids encoding NOD genes, homologs, variants (e.g., polymorphisms
and mutants), including but not limited to, those described in SEQ
ID NOs: 1-11. Table 1 describes the NOD genes of the present
invention. In some embodiments, the present invention provides
polynucleotide sequences that are capable of hybridizing to SEQ ID
NOs: 1-11 under conditions of low to high stringency as long as the
polynucleotide sequence capable of hybridizing encodes a protein
that retains a biological activity of the naturally occurring NODs.
In some embodiments, the protein that retains a biological activity
of naturally occurring NOD is 70% homologous to wild-type NOD,
preferably 80% homologous to wild-type NOD, more preferably 90%
homologous to wild-type NOD, and most preferably 95% homologous to
wild-type NOD. In preferred embodiments, hybridization conditions
are based on the melting temperature (T.sub.m) of the nucleic acid
binding complex and confer a defined "stringency" as explained
above (See e.g., Wahl, et al., Meth. Enzymol., 152:399-407 [1987],
incorporated herein by reference).
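By way of illustration only, the homology thresholds recited above (70%, 80%, 90% and 95%) can be sketched in Python. This is a minimal sketch that assumes "homologous" is measured as percent identity over two sequences that have already been aligned to equal length; the specification does not prescribe a particular measure, and the names `percent_identity`, `homology_tier` and `TIERS` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned sequences of equal length.

    A gap ('-') opposite a residue counts as a mismatch; columns where
    both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Thresholds taken from the text; treating each as a minimum is an assumption.
TIERS = [
    (95.0, "most preferred"),
    (90.0, "more preferred"),
    (80.0, "preferred"),
    (70.0, "some embodiments"),
]


def homology_tier(identity: float) -> str:
    """Return the label of the highest threshold the identity value meets."""
    for threshold, label in TIERS:
        if identity >= threshold:
            return label
    return "below 70%"
```

A caller would align a candidate protein against the wild-type NOD sequence with a tool of its choice and pass the two aligned strings to `percent_identity`.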
[0131] In other embodiments of the present invention, additional
alleles of NOD genes are provided. In preferred embodiments,
alleles result from a polymorphism or mutation (i.e., a change in
the nucleic acid sequence) and generally produce altered mRNAs or
polypeptides whose structure or function may or may not be altered.
Any given gene may have none, one or many allelic forms. Common
mutational changes that give rise to alleles are generally ascribed
to deletions, additions or substitutions of nucleic acids. Each of
these types of changes may occur alone, or in combination with the
others, and at the rate of one or more times in a given sequence.
Examples of the alleles of the present invention include those
encoded by SEQ ID NOs: 1-11 (wild type) and disease alleles
thereof.
[0132] In still other embodiments of the present invention, the
nucleotide sequences of the present invention may be engineered in
order to alter an NOD coding sequence for a variety of reasons,
including but not limited to, alterations which modify the cloning,
processing and/or expression of the gene product. For example,
mutations may be introduced using techniques that are well known in
the art (e.g., site-directed mutagenesis to insert new restriction
sites, to alter glycosylation patterns, to change codon preference,
etc.).
[0133] In some embodiments of the present invention, the
polynucleotide sequence of NOD may be extended utilizing the
nucleotide sequence (e.g., SEQ ID NOs: 1-11) in various methods
known in the art to detect upstream sequences such as promoters and
regulatory elements. For example, it is contemplated that
restriction-site polymerase chain reaction (PCR) will find use in
the present invention. This is a direct method that uses universal
primers to retrieve unknown sequence adjacent to a known locus
(Gobinda et al., PCR Methods Applic., 2:318-22 [1993]). First,
genomic DNA is amplified in the presence of a primer to a linker
sequence and a primer specific to the known region. The amplified
sequences are then subjected to a second round of PCR with the same
linker primer and another specific primer internal to the first
one. Products of each round of PCR are transcribed with an
appropriate RNA polymerase and sequenced using reverse
transcriptase.
[0134] In another embodiment, inverse PCR can be used to amplify or
extend sequences using divergent primers based on a known region
(Triglia et al., Nucleic Acids Res., 16:8186 [1988]). The primers
may be designed using Oligo 4.0 (National Biosciences Inc, Plymouth
Minn.), or another appropriate program, to be 22-30 nucleotides in
length, to have a GC content of 50% or more, and to anneal to the
target sequence at temperatures about 68-72.degree. C. The method
uses several restriction enzymes to generate a suitable fragment in
the known region of a gene. The fragment is then circularized by
intramolecular ligation and used as a PCR template. In still other
embodiments, walking PCR is utilized. Walking PCR is a method for
targeted gene walking that permits retrieval of unknown sequence
(Parker et al., Nucleic Acids Res., 19:3055-60 [1991]). The
PROMOTERFINDER kit (Clontech) uses PCR, nested primers and special
libraries to "walk in" genomic DNA. This process avoids the need to
screen libraries and is useful in finding intron/exon
junctions.
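The primer design criteria stated above for inverse PCR (22-30 nucleotides in length, a GC content of 50% or more, and annealing at about 68-72.degree. C.) can be checked with a short Python sketch. This is illustrative only and is not the method used by Oligo 4.0 or any other program: the function names are hypothetical, and the annealing temperature is taken as a caller-supplied estimate because the text does not specify how it is calculated.

```python
from typing import Dict, Optional


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer made of A, C, G and T."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_primer(primer: str,
                 anneal_temp_c: Optional[float] = None) -> Dict[str, bool]:
    """Report which of the stated design criteria a primer satisfies.

    anneal_temp_c is an externally estimated annealing temperature in
    degrees Celsius; it is only checked when provided.
    """
    checks = {
        "length_22_to_30": 22 <= len(primer) <= 30,
        "gc_at_least_50_percent": gc_fraction(primer) >= 0.5,
    }
    if anneal_temp_c is not None:
        checks["anneal_68_to_72_c"] = 68.0 <= anneal_temp_c <= 72.0
    return checks
```

A primer is acceptable under these criteria when every value in the returned dictionary is true, for example `all(check_primer(seq, 70.0).values())`.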
[0135] Preferred libraries for screening for full length cDNAs
include mammalian libraries that have been size-selected to include
larger cDNAs. Also, random primed libraries are preferred, in that
they will contain more sequences that contain the 5' and upstream
gene regions. A randomly primed library may be particularly useful
in cases where an oligo d(T) library does not yield full-length
cDNA. Genomic mammalian libraries are useful for obtaining introns
and extending 5' sequence.
[0136] In other embodiments of the present invention, variants of
the disclosed NOD sequences are provided. In preferred embodiments,
variants result from polymorphisms or mutations (i.e., a change in
the nucleic acid sequence) and generally produce altered mRNAs or
polypeptides whose structure or function may or may not be altered.
Any given gene may have none, one, or many variant forms. Common
mutational changes that give rise to variants are generally
ascribed to deletions, additions or substitutions of nucleic acids.
Each of these types of changes may occur alone, or in combination
with the others, and at the rate of one or more times in a given
sequence.
[0137] It is contemplated that it is possible to modify the
structure of a peptide having a function (e.g., NOD function) for
such purposes as altering the biological activity (e.g., Nod
signaling). Such modified peptides are considered functional
equivalents of peptides having an activity of a NOD peptide as
defined herein. A modified peptide can be produced in which the
nucleotide sequence encoding the polypeptide has been altered, such
as by substitution, deletion, or addition. In particularly
preferred embodiments, these modifications do not significantly
reduce the biological activity of the modified NOD genes. In other
words, construct "X" can be evaluated in order to determine whether
it is a member of the genus of modified or variant NODs of the
present invention as defined functionally, rather than
structurally. In preferred embodiments, the activity of variant NOD
polypeptides is evaluated by methods described herein (e.g., the
generation of transgenic animals or the use of signaling
assays).
[0138] Moreover, as described above, variant forms of NOD genes are
also contemplated as being equivalent to those peptides and DNA
molecules that are set forth in more detail herein. For example, it
is contemplated that isolated replacement of a leucine with an
isoleucine or valine, an aspartate with a glutamate, a threonine
with a serine, or a similar replacement of an amino acid with a
structurally related amino acid (i.e., conservative mutations) will
not have a major effect on the biological activity of the resulting
molecule. Accordingly, some embodiments of the present invention
provide variants of NOD disclosed herein containing conservative
replacements. Conservative replacements are those that take place
within a family of amino acids that are related in their side
chains. Genetically encoded amino acids can be divided into four
families: (1) acidic (aspartate, glutamate); (2) basic (lysine,
arginine, histidine); (3) nonpolar (alanine, valine, leucine,
isoleucine, proline, phenylalanine, methionine, tryptophan); and
(4) uncharged polar (glycine, asparagine, glutamine, cysteine,
serine, threonine, tyrosine). Phenylalanine, tryptophan, and
tyrosine are sometimes classified jointly as aromatic amino acids.
In similar fashion, the amino acid repertoire can be grouped as (1)
acidic (aspartate, glutamate); (2) basic (lysine, arginine,
histidine); (3) aliphatic (glycine, alanine, valine, leucine,
isoleucine, serine, threonine), with serine and threonine
optionally grouped separately as aliphatic-hydroxyl; (4)
aromatic (phenylalanine, tyrosine, tryptophan); (5) amide
(asparagine, glutamine); and (6) sulfur-containing (cysteine and
methionine) (e.g., Stryer ed., Biochemistry, pg. 17-21, 2nd ed, WH
Freeman and Co., 1981). Whether a change in the amino acid sequence
of a peptide results in a functional polypeptide can be readily
determined by assessing the ability of the variant peptide to
function in a fashion similar to the wild-type protein. Peptides
having more than one replacement can readily be tested in the same
manner.
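The four-family grouping of genetically encoded amino acids described above lends itself to a simple lookup. The following Python sketch is illustrative only: it classifies a single replacement as conservative when both residues fall within the same family, using one-letter amino acid codes. The names `FAMILIES`, `FAMILY_OF` and `is_conservative` are hypothetical, and the sketch says nothing about whether a given variant retains biological activity, which the text states is determined by functional testing.

```python
# Four families as listed in the text, in one-letter amino acid codes.
FAMILIES = {
    "acidic": "DE",               # aspartate, glutamate
    "basic": "KRH",               # lysine, arginine, histidine
    "nonpolar": "AVLIPFMW",       # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "uncharged polar": "GNQCSTY", # Gly, Asn, Gln, Cys, Ser, Thr, Tyr
}

# Reverse lookup: residue -> family name.
FAMILY_OF = {aa: name for name, members in FAMILIES.items() for aa in members}


def is_conservative(original: str, replacement: str) -> bool:
    """True if two different residues belong to the same family."""
    a, b = original.upper(), replacement.upper()
    if a not in FAMILY_OF or b not in FAMILY_OF:
        raise ValueError("expected one-letter codes for the 20 encoded amino acids")
    return a != b and FAMILY_OF[a] == FAMILY_OF[b]
```

Under this grouping the replacements named in the text (leucine with isoleucine or valine, aspartate with glutamate, threonine with serine) are all classified as conservative.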
[0139] More rarely, a variant includes "nonconservative" changes
(e.g., replacement of a glycine with a tryptophan). Analogous minor
variations can also include amino acid deletions or insertions, or
both. Guidance in determining which amino acid residues can be
substituted, inserted, or deleted without abolishing biological
activity can be found using computer programs (e.g., LASERGENE
software, DNASTAR Inc., Madison, Wis.).
[0140] As described in more detail below, variants may be produced
by methods such as directed evolution or other techniques for
producing combinatorial libraries of variants. In still other
embodiments of the present invention,
the nucleotide sequences of the present invention may be engineered
in order to alter a NOD coding sequence including, but not limited
to, alterations that modify the cloning, processing, localization,
secretion, and/or expression of the gene product. For example,
mutations may be introduced using techniques that are well known in
the art (e.g., site-directed mutagenesis to insert new restriction
sites, alter glycosylation patterns, or change codon preference,
etc.).

TABLE 1
Nod Genes

Nod Gene   SEQ ID NO (Nucleic acid)   SEQ ID NO (Polypeptide)
Nod3       1                          12
Nod5       2                          13
Nod6       3                          14
Nod8       4                          15
Nod9       5                          16
Nod12      6                          17
Nod14      7                          18
Nod16      8                          19
Nod17      9                          20
Nod26      10                         21
Nod27      11                         22
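For use in a computer implemented method, the correspondence set out in Table 1 can be held as a simple lookup. The Python sketch below is illustrative only; the values are copied from Table 1, while the names `NOD_SEQ_IDS`, `nucleic_acid_seq_id` and `polypeptide_seq_id` are hypothetical.

```python
# Gene name -> (nucleic acid SEQ ID NO, polypeptide SEQ ID NO), from Table 1.
NOD_SEQ_IDS = {
    "Nod3": (1, 12),
    "Nod5": (2, 13),
    "Nod6": (3, 14),
    "Nod8": (4, 15),
    "Nod9": (5, 16),
    "Nod12": (6, 17),
    "Nod14": (7, 18),
    "Nod16": (8, 19),
    "Nod17": (9, 20),
    "Nod26": (10, 21),
    "Nod27": (11, 22),
}


def nucleic_acid_seq_id(gene: str) -> int:
    """SEQ ID NO of the nucleic acid sequence for a gene in Table 1."""
    return NOD_SEQ_IDS[gene][0]


def polypeptide_seq_id(gene: str) -> int:
    """SEQ ID NO of the polypeptide sequence for a gene in Table 1."""
    return NOD_SEQ_IDS[gene][1]
```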
II. NOD Polypeptides
[0141] In other embodiments, the present invention provides NOD
polynucleotide sequences that encode NOD polypeptide sequences
(e.g., the polypeptides of SEQ ID NOs: 12-22). Other embodiments of
the present invention provide fragments, fusion proteins or
functional equivalents of these NOD proteins. In some embodiments,
the present invention provides mutants of NOD polypeptides. In
still other embodiments of the present invention, nucleic acid
sequences corresponding to NOD variants, homologs, and mutants may
be used to generate recombinant DNA molecules that direct the
expression of the NOD variants, homologs, and mutants in
appropriate host cells. In some embodiments of the present
invention, the polypeptide may be a naturally purified product, in
other embodiments it may be a product of chemical synthetic
procedures, and in still other embodiments it may be produced by
recombinant techniques using a prokaryotic or eukaryotic host
(e.g., by bacterial, yeast, higher plant, insect and mammalian
cells in culture). In some embodiments, depending upon the host
employed in a recombinant production procedure, the polypeptide of
the present invention may be glycosylated or may be
non-glycosylated. In other embodiments, the polypeptides of the
invention may also include an initial methionine amino acid
residue.
[0142] In one embodiment of the present invention, due to the
inherent degeneracy of the genetic code, DNA sequences other than
the polynucleotide sequences of SEQ ID NOs: 1-11 that encode
substantially the same or a functionally equivalent amino acid
sequence, may be used to clone and express NOD. In general, such
polynucleotide sequences hybridize to SEQ ID NOs: 1-11 under
conditions of high to medium stringency as described above. As will
be understood by those of skill in the art, it may be advantageous
to produce NOD-encoding nucleotide sequences possessing
non-naturally occurring codons. Therefore, in some preferred
embodiments, codons preferred by a particular prokaryotic or
eukaryotic host (Murray et al., Nucl. Acids Res., 17 [1989]) are
selected, for example, to increase the rate of NOD expression or to
produce recombinant RNA transcripts having desirable properties,
such as a longer half-life, than transcripts produced from
naturally occurring sequence.
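The degeneracy of the genetic code referred to above can be illustrated with a short Python sketch using the standard genetic code. The sketch shows that two different DNA sequences can encode the same amino acid sequence and lists the synonymous codons available for a given residue. It is illustrative only: the example sequences are arbitrary and are not taken from SEQ ID NOs: 1-11, the function names are hypothetical, and no host-specific codon preference data are assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence, read in frame from its first base."""
    seq = dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def synonymous_codons(amino_acid: str) -> list:
    """All codons encoding the given one-letter amino acid, sorted."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid.upper())
```

For example, the arbitrary sequences `ATGCTGGATAAA` and `ATGTTAGACAAG` differ at three positions yet both translate to the same four-residue peptide. Selecting among the entries returned by `synonymous_codons` according to a host's codon preference is one way the codon selection described above could be implemented.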
[0143] 1. Vectors for Production of NOD
[0144] The polynucleotides of the present invention may be employed
for producing polypeptides by recombinant techniques. Thus, for
example, the polynucleotide may be included in any one of a variety
of expression vectors for expressing a polypeptide. In some
embodiments of the present invention, vectors include, but are not
limited to, chromosomal, nonchromosomal and synthetic DNA sequences
(e.g., derivatives of SV40, bacterial plasmids, phage DNA,
baculovirus, yeast plasmids, vectors derived from combinations of
plasmids and phage DNA, and viral DNA such as vaccinia, adenovirus,
fowl pox virus, and pseudorabies). It is contemplated that any
vector may be used as long as it is replicable and viable in the
host.
[0145] In particular, some embodiments of the present invention
provide recombinant constructs comprising one or more of the
sequences as broadly described above (e.g., SEQ ID NOs: 1-11). In
some embodiments of the present invention, the constructs comprise
a vector, such as a plasmid or viral vector, into which a sequence
of the invention has been inserted, in a forward or reverse
orientation. In still other embodiments, the heterologous
structural sequence (e.g., SEQ ID NOs: 1-11) is assembled in
appropriate phase with translation initiation and termination
sequences. In preferred embodiments of the present invention, the
appropriate DNA sequence is inserted into the vector using any of a
variety of procedures. In general, the DNA sequence is inserted
into an appropriate restriction endonuclease site(s) by procedures
known in the art.
[0146] Large numbers of suitable vectors are known to those of
skill in the art, and are commercially available. Such vectors
include, but are not limited to, the following vectors: 1)
Bacterial--pQE70, pQE60, pQE-9 (Qiagen), pBS, pD10, phagescript,
psiX174, pbluescript SK, pBSKS, pNH8A, pNH16a, pNH18A, pNH46A
(Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5
(Pharmacia); 2) Eukaryotic--pWLNEO, pSV2CAT, pOG44, pXT1, pSG
(Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia); and 3)
Baculovirus--pPbac and pMbac (Stratagene). Any other plasmid or
vector may be used as long as it is replicable and viable in the
host. In some preferred embodiments of the present invention,
mammalian expression vectors comprise an origin of replication, a
suitable promoter and enhancer, and also any necessary ribosome
binding sites, polyadenylation sites, splice donor and acceptor
sites, transcriptional termination sequences, and 5' flanking
non-transcribed sequences. In other embodiments, DNA sequences
derived from the SV40 splice and polyadenylation sites may be used
to provide the required non-transcribed genetic elements.
[0147] In certain embodiments of the present invention, the DNA
sequence in the expression vector is operatively linked to an
appropriate expression control sequence(s) (promoter) to direct mRNA
synthesis. Promoters useful in the present invention include, but
are not limited to, the LTR or SV40 promoter, the E. coli lac or
trp, the phage lambda P.sub.L and P.sub.R, T3 and T7 promoters, and
the cytomegalovirus (CMV) immediate early, herpes simplex virus
(HSV) thymidine kinase, and mouse metallothionein-I promoters and
other promoters known to control expression of genes in prokaryotic
or eukaryotic cells or their viruses. In other embodiments of the
present invention, recombinant expression vectors include origins
of replication and selectable markers permitting transformation of
the host cell (e.g., dihydrofolate reductase or neomycin resistance
for eukaryotic cell culture, or tetracycline or ampicillin
resistance in E. coli).
[0148] In some embodiments of the present invention, transcription
of the DNA encoding the polypeptides of the present invention by
higher eukaryotes is increased by inserting an enhancer sequence
into the vector. Enhancers are cis-acting elements of DNA, usually
from about 10 to 300 bp, that act on a promoter to increase its
transcription. Enhancers useful in the present invention include,
but are not limited to, the SV40 enhancer on the late side of the
replication origin (bp 100 to 270), a cytomegalovirus early promoter
enhancer, the polyoma enhancer on the late side of the replication
origin, and adenovirus enhancers.
[0149] In other embodiments, the expression vector also contains a
ribosome binding site for translation initiation and a
transcription terminator. In still other embodiments of the present
invention, the vector may also include appropriate sequences for
amplifying expression.
[0150] 2. Host Cells for Production of NOD Polypeptides
[0151] In a further embodiment, the present invention provides host
cells containing the above-described constructs. In some
embodiments of the present invention, the host cell is a higher
eukaryotic cell (e.g., a mammalian or insect cell). In other
embodiments of the present invention, the host cell is a lower
eukaryotic cell (e.g., a yeast cell). In still other embodiments of
the present invention, the host cell can be a prokaryotic cell
(e.g., a bacterial cell). Specific examples of host cells include,
but are not limited to, Escherichia coli, Salmonella typhimurium,
Bacillus subtilis, and various species within the genera
Pseudomonas, Streptomyces, and Staphylococcus, as well as
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Drosophila
S2 cells, Spodoptera Sf9 cells, Chinese hamster ovary (CHO) cells,
COS-7 lines of monkey kidney fibroblasts (Gluzman, Cell 23:175
[1981]), C127, 3T3, 293, 293T, HeLa and BHK cell lines.
[0152] The constructs in host cells can be used in a conventional
manner to produce the gene product encoded by the recombinant
sequence. In some embodiments, introduction of the construct into
the host cell can be accomplished by calcium phosphate
transfection, DEAE-Dextran mediated transfection, or
electroporation (See e.g., Davis et al., Basic Methods in Molecular
Biology, [1986]). Alternatively, in some embodiments of the present
invention, the polypeptides of the invention can be synthetically
produced by conventional peptide synthesizers.
[0153] Proteins can be expressed in mammalian cells, yeast,
bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to
produce such proteins using RNAs derived from the DNA constructs of
the present invention. Appropriate cloning and expression vectors
for use with prokaryotic and eukaryotic hosts are described by
Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second
Edition, Cold Spring Harbor, N.Y., [1989].
[0154] In some embodiments of the present invention, following
transformation of a suitable host strain and growth of the host
strain to an appropriate cell density, the selected promoter is
induced by appropriate means (e.g., temperature shift or chemical
induction) and cells are cultured for an additional period. In
other embodiments of the present invention, cells are typically
harvested by centrifugation, disrupted by physical or chemical
means, and the resulting crude extract retained for further
purification. In still other embodiments of the present invention,
microbial cells employed in expression of proteins can be disrupted
by any convenient method, including freeze-thaw cycling,
sonication, mechanical disruption, or use of cell lysing
agents.
[0155] 3. Purification of NOD polypeptides
[0156] The present invention also provides methods for recovering
and purifying NOD polypeptides from recombinant cell cultures
including, but not limited to, ammonium sulfate or ethanol
precipitation, acid extraction, anion or cation exchange
chromatography, phosphocellulose chromatography, hydrophobic
interaction chromatography, affinity chromatography,
hydroxylapatite chromatography and lectin chromatography. In other
embodiments of the present invention, protein-refolding steps can
be used as necessary, in completing configuration of the mature
protein. In still other embodiments of the present invention, high
performance liquid chromatography (HPLC) can be employed for final
purification steps.
[0157] The present invention further provides polynucleotides
having a coding sequence of a NOD gene (e.g., SEQ ID NOs: 1-11)
fused in frame to a marker sequence that allows for purification of
the polypeptide of the present invention. A non-limiting example of
a marker sequence is a hexahistidine tag which may be supplied by a
vector, preferably a pQE-9 vector, which provides for purification
of the polypeptide fused to the marker in the case of a bacterial
host, or, for example, the marker sequence may be a hemagglutinin
(HA) tag when a mammalian host (e.g., COS-7 cells) is used. The HA
tag corresponds to an epitope derived from the influenza
hemagglutinin protein (Wilson et al., Cell, 37:767 [1984]).
[0158] 4. Truncation Mutants of NOD Polypeptide
[0159] In addition, the present invention provides fragments of NOD
polypeptides (i.e., truncation mutants). In some embodiments of the
present invention, when expression of a portion of the NOD protein
is desired, it may be necessary to add a start codon (ATG) to the
oligonucleotide fragment containing the desired sequence to be
expressed. It is well known in the art that a methionine at the
N-terminal position can be enzymatically cleaved by the use of the
enzyme methionine aminopeptidase (MAP). MAP has been cloned from E.
coli (Ben-Bassat et al., J. Bacteriol., 169:751 [1987]) and
Salmonella typhimurium and its in vitro activity has been
demonstrated on recombinant proteins (Miller et al., Proc. Natl.
Acad. Sci. USA 84:2718 [1990]). Therefore, removal of an N-terminal
methionine, if desired, can be achieved either in vivo by
expressing such recombinant polypeptides in a host which produces
MAP (e.g., E. coli or CM89 or S. cerevisiae), or in vitro by use of
purified MAP.
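By way of illustration only, the addition of a start codon to a coding fragment can be sketched as follows. The function name and the example fragment are hypothetical placeholders (the fragment is not a NOD sequence), and the sketch assumes the fragment is already in the desired reading frame.

```python
def add_start_codon(fragment: str) -> str:
    """Prepend ATG to a coding fragment unless it already starts with one.

    Assumption: the fragment is supplied in frame, so its length must be
    a multiple of three.
    """
    seq = fragment.upper()
    if len(seq) % 3 != 0:
        raise ValueError("fragment length must be a multiple of 3 to stay in frame")
    if seq.startswith("ATG"):
        return seq
    return "ATG" + seq


# Hypothetical nine-base fragment (three codons), used only as a placeholder.
expressible = add_start_codon("GCTGAAAAA")
print(expressible)
```

The N-terminal methionine encoded by the added codon is the residue that MAP may subsequently remove.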
[0160] 5. Fusion Proteins Containing NOD
[0161] The present invention also provides fusion proteins
incorporating all or part of the NOD polypeptides of the present
invention. Accordingly, in some embodiments of the present
invention, the coding sequences for the polypeptide can be
incorporated as a part of a fusion gene including a nucleotide
sequence encoding a different polypeptide. It is contemplated that
this type of expression system will find use under conditions where
it is desirable to produce an immunogenic fragment of a NOD
protein. In some embodiments of the present invention, the VP6
capsid protein of rotavirus is used as an immunologic carrier
protein for portions of a NOD polypeptide, either in the monomeric
form or in the form of a viral particle. In other embodiments of
the present invention, the nucleic acid sequences corresponding to
the portion of a NOD polypeptide against which antibodies are to be
raised can be incorporated into a fusion gene construct which
includes coding sequences for a late vaccinia virus structural
protein to produce a set of recombinant viruses expressing fusion
proteins comprising a portion of NOD as part of the virion. It has
been demonstrated with the use of immunogenic fusion proteins
utilizing the hepatitis B surface antigen fusion proteins that
recombinant hepatitis B virions can be utilized in this role as
well. Similarly, in other embodiments of the present invention,
chimeric constructs coding for fusion proteins containing a portion
of a NOD polypeptide and the poliovirus capsid protein are created
to enhance immunogenicity of the set of polypeptide antigens (See
e.g., EP Publication No. 025949; and Evans et al., Nature 339:385
[1989]; Huang et al., J. Virol., 62:3855 [1988]; and Schlienger et
al., J. Virol., 66:2 [1992]).
[0162] In still other embodiments of the present invention, the
multiple antigen peptide system for peptide-based immunization can
be utilized. In this system, a desired portion of NOD is obtained
directly from organo-chemical synthesis of the peptide onto an
oligomeric branching lysine core (see e.g., Posnett et al., J.
Biol. Chem., 263:1719 [1988]; and Nardelli et al., J. Immunol.,
148:914 [1992]). In other embodiments of the present invention,
antigenic determinants of the NOD proteins can also be expressed
and presented by bacterial cells.
[0163] In addition to utilizing fusion proteins to enhance
immunogenicity, it is widely appreciated that fusion proteins can
also facilitate the expression of proteins, such as a NOD protein
of the present invention. Accordingly, in some embodiments of the
present invention, NOD polypeptides can be generated as
glutathione-S-transferase (GST) fusion proteins. It is
contemplated that such GST fusion proteins will enable easy
purification of NOD polypeptides, such as by the use of
glutathione-derivatized matrices (See e.g., Ausubel et al. (eds.),
Current Protocols in Molecular Biology, John Wiley & Sons, NY
[1991]). In another embodiment of the present invention, a fusion
gene coding for a purification leader sequence, such as a
poly-(His)/enterokinase cleavage site sequence at the N-terminus of
the desired portion of a NOD polypeptide, can allow purification of
the expressed NOD fusion protein by affinity chromatography using a
Ni2+ metal resin. In still another embodiment of the present
invention, the purification leader sequence can then be
subsequently removed by treatment with enterokinase (See e.g.,
Hochuli et al., J. Chromatogr., 411:177 [1987]; and Janknecht et
al., Proc. Natl. Acad. Sci. USA 88:8972).
[0164] Techniques for making fusion genes are well known.
Essentially, the joining of various DNA fragments coding for
different polypeptide sequences is performed in accordance with
conventional techniques, employing blunt-ended or stagger-ended
termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate,
alkaline phosphatase treatment to avoid undesirable joining, and
enzymatic ligation. In another embodiment of the present invention,
the fusion gene can be synthesized by conventional techniques
including automated DNA synthesizers. Alternatively, in other
embodiments of the present invention, PCR amplification of gene
fragments can be carried out using anchor primers which give rise
to complementary overhangs between two consecutive gene fragments
which can subsequently be annealed to generate a chimeric gene
sequence (See e.g., Current Protocols in Molecular Biology,
supra).
[0165] 6. Variants of NOD
[0166] Still other embodiments of the present invention provide
mutant or variant forms of NOD polypeptides (i.e., muteins). It is
possible to modify the structure of a peptide having an activity of
a NOD polypeptide of the present invention for such purposes as
enhancing therapeutic or prophylactic efficacy, or stability (e.g.,
ex vivo shelf life, and/or resistance to proteolytic degradation in
vivo). Such modified peptides are considered functional equivalents
of peptides having an activity of the subject NOD proteins as
defined herein. A modified peptide can be produced in which the
amino acid sequence has been altered, such as by amino acid
substitution, deletion, or addition.
[0167] Moreover, as described above, variant forms (e.g., mutants
or polymorphic sequences) of the subject NOD proteins are also
contemplated as being equivalent to those peptides and DNA
molecules that are set forth in more detail. For example, as
described above, the present invention encompasses mutant and
variant proteins that contain conservative or non-conservative
amino acid substitutions.
[0168] This invention further contemplates a method of generating
sets of combinatorial mutants of the present NOD proteins, as well
as truncation mutants, and is especially useful for identifying
potential variant sequences (i.e., mutants or polymorphic
sequences) that are involved in inflammatory diseases or resistance
to inflammatory diseases. The purpose of screening such
combinatorial libraries is to generate, for example, novel NOD
variants that can act as either agonists or antagonists, or
alternatively, possess novel activities altogether.
[0169] Therefore, in some embodiments of the present invention, NOD
variants are engineered by the present method to provide altered
(e.g., increased or decreased) biological activity. In other
embodiments of the present invention, combinatorially-derived
variants are generated which have a selective potency relative to a
naturally occurring NOD. Such proteins, when expressed from
recombinant DNA constructs, can be used in gene therapy
protocols.
[0170] Still other embodiments of the present invention provide NOD
variants that have intracellular half-lives dramatically different
than the corresponding wild-type protein. For example, the altered
protein can be rendered either more stable or less stable to
proteolytic degradation or other cellular processes that result in
destruction of, or otherwise inactivate, NOD polypeptides. Such
variants, and the genes which encode them, can be utilized to alter
the location of NOD expression by modulating the half-life of the
protein. For instance, a short half-life can give rise to more
transient NOD biological effects and, when part of an inducible
expression system, can allow tighter control of NOD levels within
the cell. As above, such proteins, and particularly their
recombinant nucleic acid constructs, can be used in gene therapy
protocols.
[0171] In still other embodiments of the present invention, NOD
variants are generated by the combinatorial approach to act as
antagonists, in that they are able to interfere with the ability of
the corresponding wild-type protein to regulate cell function.
[0172] In some embodiments of the combinatorial mutagenesis
approach of the present invention, the amino acid sequences for a
population of NOD homologs, variants or other related proteins are
aligned, preferably to promote the highest homology possible. Such
a population of variants can include, for example, NOD homologs
from one or more species, or NOD variants from the same species but
which differ due to mutation or polymorphisms. Amino acids that
appear at each position of the aligned sequences are selected to
create a degenerate set of combinatorial sequences.
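By way of illustration only, the selection of residues at each aligned position and the enumeration of the resulting combinatorial set can be sketched as follows. The function names and the three short aligned sequences are hypothetical placeholders, and the sketch assumes an ungapped alignment of equal-length sequences.

```python
from itertools import product


def position_sets(aligned):
    """Return the sorted residues observed at each column of the alignment."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [sorted(set(column)) for column in zip(*aligned)]


def combinatorial_set(aligned):
    """Enumerate every sequence that draws one observed residue per position."""
    return ["".join(choice) for choice in product(*position_sets(aligned))]


# Hypothetical aligned homologs; real alignments would be far longer.
homologs = ["MKLV", "MRLV", "MKIV"]
print(position_sets(homologs))      # [['M'], ['K', 'R'], ['I', 'L'], ['V']]
print(combinatorial_set(homologs))  # ['MKIV', 'MKLV', 'MRIV', 'MRLV']
```

The size of the set is the product of the number of residues observed at each position, so it grows rapidly with the number of variable positions.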
[0173] In a preferred embodiment of the present invention, the
combinatorial NOD library is produced by way of a degenerate
library of genes encoding a library of polypeptides which each
include at least a portion of potential NOD protein sequences. For
example, a mixture of synthetic oligonucleotides can be
enzymatically ligated into gene sequences such that the degenerate
set of potential NOD sequences are expressible as individual
polypeptides, or alternatively, as a set of larger fusion proteins
(e.g., for phage display) containing the set of NOD sequences
therein.
[0174] There are many ways by which the library of potential NOD
homologs and variants can be generated from a degenerate
oligonucleotide sequence. In some embodiments, chemical synthesis
of a degenerate gene sequence is carried out in an automatic DNA
synthesizer, and the synthetic genes are ligated into an
appropriate gene for expression. The purpose of a degenerate set of
genes is to provide, in one mixture, all of the sequences encoding
the desired set of potential NOD sequences. The synthesis of
degenerate oligonucleotides is well known in the art (See e.g.,
Narang, Tetrahedron Lett., 39:39 [1983]; Itakura et al.,
Recombinant DNA, in Walton (ed.), Proceedings of the 3rd Cleveland
Symposium on Macromolecules, Elsevier, Amsterdam, pp 273-289
[1981]; Itakura et al., Annu. Rev. Biochem., 53:323 [1984]; Itakura
et al., Science 198:1056 [1984]; Ike et al., Nucl. Acid Res.,
11:477 [1983]). Such techniques have been employed in the directed
evolution of other proteins (See e.g., Scott et al., Science
249:386 [1990]; Roberts et al., Proc. Natl. Acad. Sci. USA 89:2429
[1992]; Devlin et al., Science 249: 404 [1990]; Cwirla et al.,
Proc. Natl. Acad. Sci. USA 87: 6378 [1990]; each of which is herein
incorporated by reference; as well as U.S. Pat. Nos. 5,223,409,
5,198,346, and 5,096,815; each of which is incorporated herein by
reference).
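By way of illustration only, the way a single degenerate oligonucleotide represents a mixture of sequences can be sketched by expanding standard IUPAC nucleotide ambiguity codes. The function name and the example oligonucleotides are hypothetical and are not taken from the sequences of the present invention.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo: str):
    """List every unambiguous sequence encoded by a degenerate oligonucleotide."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in oligo.upper()))]


# The degenerate codon ARG covers AAG (lysine) and AGG (arginine).
print(expand_degenerate("ARG"))       # ['AAG', 'AGG']
# An NNK codon expands to 4 * 4 * 2 = 32 distinct codons.
print(len(expand_degenerate("NNK")))  # 32
```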
[0175] It is contemplated that the NOD nucleic acids of the present
invention (e.g., SEQ ID NOs: 1-11, and fragments and variants
thereof) can be utilized as starting nucleic acids for directed
evolution. These techniques can be utilized to develop NOD variants
having desirable properties such as increased or decreased
biological activity.
[0176] In some embodiments, artificial evolution is performed by
random mutagenesis (e.g., by utilizing error-prone PCR to introduce
random mutations into a given coding sequence). This method
requires that the frequency of mutation be finely tuned. As a
general rule, beneficial mutations are rare, while deleterious
mutations are common. This is because the combination of a
deleterious mutation and a beneficial mutation often results in an
inactive enzyme. The ideal number of base substitutions for a
targeted gene is usually between 1.5 and 5 (Moore and Arnold, Nat.
Biotech., 14, 458 [1996]; Leung et al., Technique, 1:11 [1989];
Eckert and Kunkel, PCR Methods Appl., 1: 17-24 [1991]; Caldwell and
Joyce, PCR Methods Appl., 2:28 [1992]; and Zhao and Arnold, Nuc.
Acids. Res., 25:1307 [1997]). After mutagenesis, the resulting
clones are selected for desirable activity (e.g., screened for NOD
activity). Successive rounds of mutagenesis and selection are often
necessary to develop enzymes with desirable properties. It should
be noted that only the useful mutations are carried over to the
next round of mutagenesis.
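By way of illustration only, the tuning of mutation frequency can be sketched with a simple in silico model in which each base is substituted independently with a probability chosen to give a target mean number of substitutions per gene. The function name, the placeholder sequence, and the independent-substitution model are assumptions made for the sketch; the sketch does not model the actual error spectrum of any polymerase.

```python
import random


def mutate(seq: str, mean_substitutions: float, rng: random.Random) -> str:
    """Substitute each base with probability mean_substitutions / len(seq)."""
    rate = mean_substitutions / len(seq)
    out = []
    for base in seq:
        if rng.random() < rate:
            # Replace with one of the three other bases, chosen uniformly.
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


rng = random.Random(0)   # seeded so the sketch is reproducible
parent = "ACGT" * 250    # hypothetical 1000-base placeholder, not a NOD gene
child = mutate(parent, mean_substitutions=3, rng=rng)
print(sum(a != b for a, b in zip(parent, child)), "substitutions introduced")
```

Under this model, raising or lowering mean_substitutions shifts the distribution of substitutions per clone, which is the quantity the text above describes as needing to be finely tuned.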
[0177] In other embodiments of the present invention, the
polynucleotides of the present invention are used in gene shuffling
or sexual PCR procedures (e.g., Smith, Nature, 370:324 [1994]; U.S.
Pat. Nos. 5,837,458; 5,830,721; 5,811,238; 5,733,731; all of which
are herein incorporated by reference). Gene shuffling involves
random fragmentation of several mutant DNAs followed by their
reassembly by PCR into full length molecules. Examples of various
gene shuffling procedures include, but are not limited to, assembly
following DNase treatment, the staggered extension process (STEP),
and random priming in vitro recombination. In the DNase mediated
method, DNA segments isolated from a pool of positive mutants are
cleaved into random fragments with DNaseI and subjected to multiple
rounds of PCR with no added primer. The lengths of random fragments
approach that of the uncleaved segment as the PCR cycles proceed,
resulting in mutations present in different clones becoming
mixed and accumulating in some of the resulting sequences. Multiple
cycles of selection and shuffling have led to the functional
enhancement of several enzymes (Stemmer, Nature, 370:398 [1994];
Stemmer, Proc. Natl. Acad. Sci. USA, 91:10747 [1994]; Crameri et
al., Nat. Biotech., 14:315 [1996]; Zhang et al., Proc. Natl. Acad.
Sci. USA, 94:4504 [1997]; and Crameri et al., Nat. Biotech., 15:436
[1997]). Variants produced by directed evolution can be screened
for NOD activity by the methods described herein.
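By way of illustration only, the mixing of mutations from different parent clones can be sketched with a simplified template-switching model, in which a chimeric sequence is assembled from segments copied from randomly chosen parents. The function name and the two placeholder parents are hypothetical, and the model abstracts away fragmentation, annealing, and extension chemistry entirely.

```python
import random


def shuffle_parents(parents, n_crossovers: int, rng: random.Random) -> str:
    """Build a chimera by switching between equal-length parents at random points."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must all have the same length")
    # Choose distinct crossover positions strictly inside the sequence.
    points = sorted(rng.sample(range(1, length), n_crossovers))
    child, start = [], 0
    for end in points + [length]:
        template = rng.choice(parents)   # template used for this segment
        child.append(template[start:end])
        start = end
    return "".join(child)


rng = random.Random(1)                  # seeded so the sketch is reproducible
parents = ["AAAAAAAAAA", "CCCCCCCCCC"]  # hypothetical placeholders
print(shuffle_parents(parents, n_crossovers=3, rng=rng))
```

Each position of the chimera carries the base of one parent at that same position, so point mutations that arose in separate clones can end up together in one sequence.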
[0178] A wide range of techniques are known in the art for
screening gene products of combinatorial libraries made by point
mutations, and for screening cDNA libraries for gene products
having a certain property. Such techniques will be generally
adaptable for rapid screening of the gene libraries generated by
the combinatorial mutagenesis or recombination of NOD homologs or
variants. The most widely used techniques for screening large gene
libraries typically comprise cloning the gene library into
replicable expression vectors, transforming appropriate cells with
the resulting library of vectors, and expressing the combinatorial
genes under conditions in which detection of a desired activity
facilitates relatively easy isolation of the vector encoding the
gene whose product was detected.
[0179] 7. Chemical Synthesis of NOD Polypeptides
[0180] In an alternate embodiment of the invention, the coding
sequence of NOD is synthesized, whole or in part, using chemical
methods well known in the art (See e.g., Caruthers et al., Nucl.
Acids Res. Symp. Ser., 7:215 [1980]; Crea and Horn, Nucl. Acids
Res., 9:2331 [1980]; Matteucci and Caruthers, Tetrahedron Lett.,
21:719 [1980]; and Chow and Kempe, Nucl. Acids Res., 9:2807
[1981]). In other embodiments of the present invention, the protein
itself is produced using chemical methods to synthesize either an
entire NOD amino acid sequence or a portion thereof. For example,
peptides can be synthesized by solid phase techniques, cleaved from
the resin, and purified by preparative high performance liquid
chromatography (See e.g., Creighton, Proteins: Structures and
Molecular Principles, W H Freeman and Co, New York N.Y. [1983]). In
other embodiments of the present invention, the composition of the
synthetic peptides is confirmed by amino acid analysis or
sequencing (See e.g., Creighton, supra).
[0181] Direct peptide synthesis can be performed using various
solid-phase techniques (Roberge et al., Science 269:202 [1995]) and
automated synthesis may be achieved, for example, using an ABI 431A
Peptide Synthesizer (Perkin Elmer) in accordance with the
instructions provided by the manufacturer. Additionally, the amino
acid sequence of a NOD polypeptide, or any part thereof, may be
altered during direct synthesis and/or combined using chemical
methods with other sequences to produce a variant polypeptide.
III. Detection of NOD Alleles
[0182] In some embodiments, the present invention provides methods
of detecting the presence of wild type or variant (e.g., mutant or
polymorphic) NOD nucleic acids or polypeptides. The detection of
mutant NOD polypeptides finds use in the diagnosis of disease
(e.g., inflammatory disease).
A. Detection of Variant NOD Alleles
[0183] In some embodiments, the present invention provides alleles
of NOD that increase a patient's susceptibility to inflammatory
diseases. Any mutation that results in an altered phenotype (e.g.,
increase in inflammatory disease or resistance to inflammatory
disease) is within the scope of the present invention.
[0184] Accordingly, the present invention provides methods for
determining whether a patient has an increased susceptibility to an
inflammatory disease by determining whether the individual has a
variant NOD allele. In other embodiments, the present invention
provides methods for providing a prognosis of increased risk for
inflammatory disease to an individual based on the presence or
absence of one or more variant alleles of NOD.
[0185] A number of methods are available for analysis of variant
(e.g., mutant or polymorphic) nucleic acid sequences. Assays for
detecting variants (e.g., polymorphisms or mutations) fall into
several categories including, but not limited to, direct sequencing
assays, fragment polymorphism assays, hybridization assays, and
computer based data analysis. Protocols and commercially available
kits or services for performing multiple variations of these assays
are available. In some embodiments, assays are performed in
combination or in hybrid (e.g., different reagents or technologies
from several assays are combined to yield one assay). The following
exemplary assays are useful in the present invention: direct
sequencing assays, PCR assays, mutational analysis by dHPLC (e.g.,
available from Transgenomic, Omaha, Nebr. or Varian, Palo Alto,
Calif.), fragment length polymorphism assays (e.g., RFLP or CFLP
(See e.g., U.S. Pat. Nos. 5,843,654; 5,843,669;
5,719,208; and 5,888,780; each of which is herein incorporated by
reference)), hybridization assays (e.g., direct detection of
hybridization, detection of hybridization using DNA chip assays
(See e.g., U.S. Pat. Nos. 6,045,996; 5,925,525; 5,858,659;
6,017,696; 6,068,818; 6,051,380; 6,001,311; 5,985,551; 5,474,796;
PCT Publications WO 99/67641 and WO 00/39587, each of which is
herein incorporated by reference), enzymatic detection of
hybridization (See e.g., U.S. Pat. Nos. 5,846,717, 6,090,543;
6,001,567; 5,985,557; 5,994,069; 5,962,233; 5,538,848; 5,952,174
and 5,919,626, each of which is herein incorporated by reference)),
and mass spectrometry assays. In addition, assays for the detection
of variant NOD proteins find use in the present invention (e.g.,
cell free translation methods, See e.g., U.S. Pat. No. 6,303,337,
herein incorporated by reference) and antibody binding assays.
[0186] B. Kits for Analyzing Risk of Inflammatory Disease
[0187] The present invention also provides kits for determining
whether an individual contains a wild-type or variant (e.g., mutant
or polymorphic) allele or polypeptide of NOD. In some embodiments,
the kits are useful for determining whether the subject is at risk of
developing an inflammatory disease (e.g., Crohn's disease or
psoriasis). The diagnostic kits are produced in a variety of ways.
In some embodiments, the kits contain at least one reagent for
specifically detecting a mutant NOD allele or protein. In preferred
embodiments, the reagent is a nucleic acid that hybridizes to
nucleic acids containing the mutation and that does not bind to
nucleic acids that do not contain the mutation. In other
embodiments, the reagents are primers for amplifying the region of
DNA containing the mutation. In still other embodiments, the
reagents are antibodies that preferentially bind either the
wild-type or mutant NOD proteins.
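By way of illustration only, the behavior of a nucleic acid reagent that hybridizes to a mutant sequence but not to the corresponding wild-type sequence can be sketched as an exact-match check against the reverse complement of the probe. The function names and the two twelve-base sequences are hypothetical placeholders, and treating hybridization as exact complementarity is a deliberate simplification that ignores mismatch tolerance and stringency conditions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_binds(probe: str, target: str) -> bool:
    """Simplified model: the probe binds if its exact complement occurs in the target."""
    return reverse_complement(probe) in target.upper()


# Hypothetical placeholder sequences that differ at a single base (T versus G).
wild_type = "ATGGCCTTAGGA"
mutant = "ATGGCCTGAGGA"

# Probe designed against the mutant region spanning the changed base.
probe = reverse_complement("GCCTGAGG")

print(probe_binds(probe, mutant))     # True
print(probe_binds(probe, wild_type))  # False
```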
[0188] In some embodiments, the kit contains instructions for
determining whether the subject is at risk for an inflammatory
disease. In preferred embodiments, the instructions specify that
risk for developing an inflammatory disease is determined by
detecting the presence or absence of a mutant NOD allele in the
subject, wherein subjects having a mutant allele are at greater
risk for developing an inflammatory disease.
[0189] The presence or absence of a disease-associated mutation in
a NOD gene can be used to make therapeutic or other medical
decisions. For example, couples with a family history of
inflammatory diseases may choose to conceive a child via in vitro
fertilization and pre-implantation genetic screening. In this case,
fertilized embryos are screened for mutant (e.g., disease
associated) alleles of a NOD gene and only embryos with wild type
alleles are implanted in the uterus.
[0190] In other embodiments, in utero screening is performed on a
developing fetus (e.g., amniocentesis or chorionic villi
screening). In still other embodiments, genetic screening of
newborn babies or very young children is performed. The early
detection of a NOD allele known to be associated with an
inflammatory disease allows for early intervention (e.g., genetic
or pharmaceutical therapies).
[0191] In some embodiments, the kits include ancillary reagents
such as buffering agents, nucleic acid stabilizing reagents,
protein stabilizing reagents, and signal producing systems (e.g.,
fluorescence generating systems such as FRET systems). The test kit may
be packaged in any suitable manner, typically with the elements in
a single container or various containers as necessary along with a
sheet of instructions for carrying out the test. In some
embodiments, the kits also preferably include a positive control
sample.
[0192] C. Bioinformatics
[0193] In some embodiments, the present invention provides methods
of determining an individual's risk of developing an inflammatory
disease based on the presence of one or more variant alleles of a
NOD gene. In some embodiments, the analysis of variant data is
processed by a computer using information stored on a computer
(e.g., in a database). For example, in some embodiments, the
present invention provides a bioinformatics research system
comprising a plurality of computers running a multi-platform object
oriented programming language (See e.g., U.S. Pat. No. 6,125,383;
herein incorporated by reference). In some embodiments, one of the
computers stores genetics data (e.g., the risk of contracting an
inflammatory disease associated with a given polymorphism, as well
as the sequences). In some embodiments, one of the computers stores
application programs (e.g., for analyzing the results of detection
assays). Results are then delivered to the user (e.g., via one of
the computers or via the internet).
[0194] For example, in some embodiments, a computer-based analysis
program is used to translate the raw data generated by the
detection assay (e.g., the presence, absence, or amount of a given
NOD allele or polypeptide) into data of predictive value for a
clinician. The clinician can access the predictive data using any
suitable means. Thus, in some preferred embodiments, the present
invention provides the further benefit that the clinician, who is
not likely to be trained in genetics or molecular biology, need not
understand the raw data. The data is presented directly to the
clinician in its most useful form. The clinician is then able to
immediately utilize the information in order to optimize the care
of the subject.
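By way of illustration only, the translation of raw detection data into a summary for a clinician can be sketched as a lookup against stored allele annotations. The allele identifiers, the annotation table, and the risk labels below are hypothetical placeholders introduced for the sketch; they are not clinical data and do not correspond to any allele described herein.

```python
# Hypothetical annotation table: allele identifier -> stored risk label.
# Placeholder values only; a real system would draw these from a curated database.
ALLELE_ANNOTATIONS = {
    "variant_A": "elevated",
    "variant_B": "elevated",
}


def summarize(detected_alleles):
    """Reduce a collection of detected allele identifiers to a short summary."""
    flagged = sorted(a for a in detected_alleles if a in ALLELE_ANNOTATIONS)
    return {
        "risk": "elevated" if flagged else "baseline",
        "flagged_alleles": flagged,
    }


# Raw assay output for a hypothetical subject.
print(summarize(["wild_type", "variant_A"]))
```

The clinician sees only the summary dictionary, while the raw allele calls and the annotation table remain with the analysis program.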
[0195] The present invention contemplates any method capable of
receiving, processing, and transmitting the information to and from
laboratories conducting the assays, information providers, medical
personnel, and subjects. For example, in some embodiments of the
present invention, a sample (e.g., a biopsy or a serum or urine
sample) is obtained from a subject and submitted to a profiling
service (e.g., clinical lab at a medical facility, genomic
profiling business, etc.), located in any part of the world (e.g.,
in a country different than the country where the subject resides
or where the information is ultimately used) to generate raw data.
Where the sample comprises a tissue or other biological sample, the
subject may visit a medical center to have the sample obtained and
sent to the profiling center, or subjects may collect the sample
themselves (e.g., a urine sample) and directly send it to a
profiling center. Where the sample comprises previously determined
biological information, the information may be directly sent to the
profiling service by the subject (e.g., an information card
containing the information may be scanned by a computer and the
data transmitted to a computer of the profiling center using an
electronic communication system). Once received by the profiling
service, the sample is processed and a profile is produced (i.e.,
presence of wild type or mutant NOD genes or polypeptides),
specific for the diagnostic or prognostic information desired for
the subject.
[0196] The profile data is then prepared in a format suitable for
interpretation by a treating clinician. For example, rather than
providing raw data, the prepared format may represent a diagnosis
or risk assessment (e.g., likelihood of developing an inflammatory
disease) for the subject, along with recommendations for particular
treatment options. The data may be displayed to the clinician by
any suitable method. For example, in some embodiments, the
profiling service generates a report that can be printed for the
clinician (e.g., at the point of care) or displayed to the
clinician on a computer monitor.
[0197] In some embodiments, the information is first analyzed at
the point of care or at a regional facility. The raw data is then
sent to a central processing facility for further analysis and/or
to convert the raw data to information useful for a clinician or
patient. The central processing facility provides the advantage of
privacy (all data is stored in a central facility with uniform
security protocols), speed, and uniformity of data analysis. The
central processing facility can then control the fate of the data
following treatment of the subject. For example, using an
electronic communication system, the central facility can provide
data to the clinician, the subject, or researchers.
[0198] In some embodiments, the subject is able to directly access
the data using the electronic communication system. The subject may
choose further intervention or counseling based on the results. In
some embodiments, the data is used for research use. For example,
the data may be used to further optimize the association of a given
NOD allele with inflammatory diseases.
IV. Generation of NOD Antibodies
[0199] The present invention provides isolated antibodies or
antibody fragments (e.g., Fab fragments). Antibodies can be
generated to allow for the detection of a NOD protein of the
present invention. The antibodies may be prepared using various
immunogens. In one embodiment, the immunogen is a human NOD peptide
to generate antibodies that recognize human NOD. Such antibodies
include, but are not limited to polyclonal, monoclonal, chimeric,
single chain, Fab fragments, Fab expression libraries, or
recombinant (e.g., chimeric, humanized, etc.) antibodies, as long
as they can recognize the protein. Antibodies can be produced by
using a protein of the present invention as the antigen according
to a conventional antibody or antiserum preparation process.
[0200] Various procedures known in the art may be used for the
production of polyclonal antibodies directed against a NOD
polypeptide. For the production of antibody, various host animals
can be immunized by injection with the peptide corresponding to the
NOD epitope including but not limited to rabbits, mice, rats,
sheep, goats, etc. In a preferred embodiment, the peptide is
conjugated to an immunogenic carrier (e.g., diphtheria toxoid,
bovine serum albumin (BSA), or keyhole limpet hemocyanin (KLH)).
Various adjuvants may be used to increase the immunological
response, depending on the host species, including but not limited
to Freund's (complete and incomplete), mineral gels (e.g., aluminum
hydroxide), surface active substances (e.g., lysolecithin, pluronic
polyols, polyanions, peptides, oil emulsions, keyhole limpet
hemocyanins, dinitrophenol, and potentially useful human adjuvants
such as BCG (Bacille Calmette-Guerin) and Corynebacterium
parvum).
[0201] For preparation of monoclonal antibodies directed toward
NOD, it is contemplated that any technique that provides for the
production of antibody molecules by continuous cell lines in
culture will find use with the present invention (See e.g., Harlow
and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, N.Y.). These include but are
not limited to the hybridoma technique originally developed by
Kohler and Milstein (Kohler and Milstein, Nature 256:495-497
[1975]), as well as the trioma technique, the human B-cell
hybridoma technique (See e.g., Kozbor et al., Immunol. Tod., 4:72
[1983]), and the EBV-hybridoma technique to produce human
monoclonal antibodies (Cole et al., in Monoclonal Antibodies and
Cancer Therapy, Alan R. Liss, Inc., pp. 77-96 [1985]).
[0202] In an additional embodiment of the invention, monoclonal
antibodies are produced in germ-free animals utilizing technology
such as that described in PCT/US90/02545. Furthermore, it is
contemplated that human antibodies will be generated by human
hybridomas (Cote et al., Proc. Natl. Acad. Sci. USA 80:2026-2030
[1983]) or by transforming human B cells with EBV in vitro
(Cole et al., in Monoclonal Antibodies and Cancer Therapy, Alan R.
Liss, pp. 77-96 [1985]).
[0203] In addition, it is contemplated that techniques described
for the production of single chain antibodies (U.S. Pat. No.
4,946,778; herein incorporated by reference) will find use in
producing NOD specific single chain antibodies. An additional
embodiment of the invention utilizes the techniques described for
the construction of Fab expression libraries (Huse et al., Science
246:1275-1281 [1989]) to allow rapid and easy identification of
monoclonal Fab fragments with the desired specificity for a NOD
polypeptide.
[0204] In other embodiments, the present invention contemplates
recombinant antibodies or fragments thereof to the proteins of the
present invention. Recombinant antibodies include, but are not
limited to, humanized and chimeric antibodies. Methods for
generating recombinant antibodies are known in the art (See e.g.,
U.S. Pat. Nos. 6,180,370 and 6,277,969 and "Monoclonal Antibodies"
H. Zola, BIOS Scientific Publishers Limited 2000. Springer-Verlag
New York, Inc., New York; each of which is herein incorporated by
reference).
[0205] It is contemplated that any technique suitable for producing
antibody fragments will find use in generating antibody fragments
that contain the idiotype (antigen binding region) of the antibody
molecule. For example, such fragments include but are not limited
to: F(ab')2 fragment that can be produced by pepsin digestion of
the antibody molecule; Fab' fragments that can be generated by
reducing the disulfide bridges of the F(ab')2 fragment, and Fab
fragments that can be generated by treating the antibody molecule
with papain and a reducing agent.
[0206] In the production of antibodies, it is contemplated that
screening for the desired antibody will be accomplished by
techniques known in the art (e.g., radioimmunoassay, ELISA
(enzyme-linked immunosorbent assay), "sandwich" immunoassays,
immunoradiometric assays, gel diffusion precipitation reactions,
immunodiffusion assays, in situ immunoassays (e.g., using colloidal
gold, enzyme or radioisotope labels), Western blots, precipitation
reactions, agglutination assays (e.g., gel agglutination assays,
hemagglutination assays, etc.), complement fixation assays,
immunofluorescence assays, protein A assays, and
immunoelectrophoresis assays, etc.).
[0207] In one embodiment, antibody binding is detected by detecting
a label on the primary antibody. In another embodiment, the primary
antibody is detected by detecting binding of a secondary antibody
or reagent to the primary antibody. In a further embodiment, the
secondary antibody is labeled. Many means are known in the art for
detecting binding in an immunoassay and are within the scope of the
present invention. As is well known in the art, the immunogenic
peptide should be provided free of the carrier molecule used in any
immunization protocol. For example, if the peptide was conjugated
to KLH, it may be conjugated to BSA, or used directly, in a
screening assay.
[0208] The foregoing antibodies can be used in methods known in the
art relating to the localization and structure of NOD (e.g., for
Western blotting), measuring levels thereof in appropriate
biological samples, etc. The antibodies can be used to detect a NOD
in a biological sample from an individual. The biological sample
can be a biological fluid, such as, but not limited to, blood,
serum, plasma, interstitial fluid, urine, cerebrospinal fluid, and
the like, containing cells.
[0209] The biological samples can then be tested directly for the
presence of a human NOD using an appropriate strategy (e.g., ELISA
or radioimmunoassay) and format (e.g., microwells, dipstick (e.g.,
as described in International Patent Publication WO 93/03367), etc.).
Alternatively, proteins in the sample can be size separated (e.g.,
by polyacrylamide gel electrophoresis (PAGE), in the presence or
absence of sodium dodecyl sulfate (SDS)), and the presence of NOD
detected by immunoblotting (Western blotting). Immunoblotting
techniques are generally more effective with antibodies generated
against a peptide corresponding to an epitope of a protein, and
hence, are particularly suited to the present invention.
[0210] Another method uses antibodies as agents to alter signal
transduction. Specific antibodies that bind to the binding domains
of NOD or other proteins involved in intracellular signaling can be
used to inhibit the interaction between the various proteins and
their interaction with other ligands. Antibodies that bind to the
complex can also be used therapeutically to inhibit interactions of
the protein complex in the signal transduction pathways leading to
the various physiological and cellular effects of NOD. Such
antibodies can also be used diagnostically to measure abnormal
expression of NOD, or the aberrant formation of protein complexes,
which may be indicative of a disease state.
V. Gene Therapy Using NOD
[0211] The present invention also provides methods and compositions
suitable for gene therapy to alter NOD expression, production, or
function. As described above, the present invention provides human
NOD genes and provides methods of obtaining NOD genes from other
species. Thus, the methods described below are generally applicable
across many species. In some embodiments, it is contemplated that
the gene therapy is performed by providing a subject with a
wild-type allele of a NOD gene (i.e., an allele that is not a NOD
disease allele (e.g., one free of disease-causing polymorphisms or
mutations)). Subjects in need of such therapy are
identified by the methods described above.
[0212] Viral vectors commonly used for in vivo or ex vivo targeting
and therapy procedures are DNA-based vectors and retroviral
vectors. Methods for constructing and using viral vectors are known
in the art (See e.g., Miller and Rosman, BioTech., 7:980-990
[1992]). Preferably, the viral vectors are replication defective,
that is, they are unable to replicate autonomously in the target
cell. In general, the genome of the replication defective viral
vectors that are used within the scope of the present invention
lack at least one region that is necessary for the replication of
the virus in the infected cell. These regions can either be
eliminated (in whole or in part), or be rendered non-functional by
any technique known to a person skilled in the art. These
techniques include the total removal, substitution (by other
sequences, in particular by the inserted nucleic acid), partial
deletion or addition of one or more bases to an essential (for
replication) region. Such techniques may be performed in vitro
(i.e., on the isolated DNA) or in situ, using the techniques of
genetic manipulation or by treatment with mutagenic agents.
[0213] Preferably, the replication defective virus retains the
sequences of its genome that are necessary for encapsidating the
viral particles. DNA viral vectors include attenuated or
defective DNA viruses, including, but not limited to, herpes
simplex virus (HSV), papillomavirus, Epstein Barr virus (EBV),
adenovirus, adeno-associated virus (AAV), and the like. Defective
viruses that entirely or almost entirely lack viral genes are
preferred, as a defective virus is not infective after introduction
into a cell. Use of defective viral vectors allows for
administration to cells in a specific, localized area, without
concern that the vector can infect other cells. Thus, a specific
tissue can be specifically targeted. Examples of particular vectors
include, but are not limited to, a defective herpes virus 1 (HSV1)
vector (Kaplitt et al., Mol. Cell. Neurosci., 2:320-330 [1991]),
a defective herpes virus vector lacking a glycoprotein L gene (See
e.g., Patent Publication RD 371005 A), or other defective herpes
virus vectors (See e.g., WO 94/21807; and WO 92/05263); an
attenuated adenovirus vector, such as the vector described by
Stratford-Perricaudet et al. (J. Clin. Invest., 90:626-630 [1992];
See also, La Salle et al., Science 259:988-990 [1993]); and a
defective adeno-associated virus vector (Samulski et al., J.
Virol., 61:3096-3101 [1987]; Samulski et al., J. Virol.,
63:3822-3828 [1989]; and Lebkowski et al., Mol. Cell. Biol.,
8:3988-3996 [1988]).
[0214] Preferably, for in vivo administration, an appropriate
immunosuppressive treatment is employed in conjunction with the
viral vector (e.g., adenovirus vector), to avoid
immuno-deactivation of the viral vector and transfected cells. For
example, immunosuppressive cytokines, such as interleukin-12
(IL-12), interferon-gamma (IFN-γ), or anti-CD4 antibody, can
be administered to block humoral or cellular immune responses to
the viral vectors. In addition, it is advantageous to employ a
viral vector that is engineered to express a minimal number of
antigens.
[0215] In a preferred embodiment, the vector is an adenovirus
vector. Adenoviruses are eukaryotic DNA viruses that can be
modified to efficiently deliver a nucleic acid of the invention to
a variety of cell types. Various serotypes of adenovirus exist. Of
these serotypes, preference is given, within the scope of the
present invention, to type 2 or type 5 human adenoviruses (Ad 2 or
Ad 5), or adenoviruses of animal origin (See e.g., WO 94/26914).
Those adenoviruses of animal origin that can be used within the
scope of the present invention include adenoviruses of canine,
bovine, murine (e.g., Mav1, Beard et al., Virol., 75-81 [1990]),
ovine, porcine, avian, and simian (e.g., SAV) origin. Preferably,
the adenovirus of animal origin is a canine adenovirus, more
preferably a CAV2 adenovirus (e.g. Manhattan or A26/61 strain (ATCC
VR-800)).
[0216] Preferably, the replication defective adenoviral vectors of
the invention comprise the ITRs, an encapsidation sequence and the
nucleic acid of interest. Still more preferably, at least the E1
region of the adenoviral vector is non-functional. The deletion in
the E1 region preferably extends from nucleotides 455 to 3329 in
the sequence of the Ad5 adenovirus (PvuII-BglII fragment) or 382 to
3446 (HinfII-Sau3A fragment). Other regions may also be modified,
in particular the E3 region (e.g., WO 95/02697), the E2 region
(e.g., WO 94/28938), the E4 region (e.g., WO 94/28152, WO 94/12649
and WO 95/02697), or in any of the late genes L1-L5.
[0217] In a preferred embodiment, the adenoviral vector has a
deletion in the E1 region (Ad 1.0). Examples of E1-deleted
adenoviruses are disclosed in EP 185,573, the contents of which are
incorporated herein by reference. In another preferred embodiment,
the adenoviral vector has a deletion in the E1 and E4 regions (Ad
3.0). Examples of E1/E4-deleted adenoviruses are disclosed in WO
95/02697 and WO 96/22378. In still another preferred embodiment,
the adenoviral vector has a deletion in the E1 region into which
the E4 region and the nucleic acid sequence are inserted.
[0218] The replication defective recombinant adenoviruses according
to the invention can be prepared by any technique known to the
person skilled in the art (See e.g., Levrero et al., Gene 101:195
[1991]; EP 185 573; and Graham, EMBO J., 3:2917 [1984]). In
particular, they can be prepared by homologous recombination
between an adenovirus and a plasmid that carries, inter alia, the
DNA sequence of interest. The homologous recombination is
accomplished following co-transfection of the adenovirus and
plasmid into an appropriate cell line. The cell line that is
employed should preferably (i) be transformable by the elements to
be used, and (ii) contain the sequences that are able to complement
the part of the genome of the replication defective adenovirus,
preferably in integrated form in order to avoid the risks of
recombination. Examples of cell lines that may be used are the
human embryonic kidney cell line 293 (Graham et al., J. Gen.
Virol., 36:59 [1977]), which contains the left-hand portion of the
genome of an Ad5 adenovirus (12%) integrated into its genome, and
cell lines that are able to complement the E1 and E4 functions, as
described in applications WO 94/26914 and WO 95/02697. Recombinant
adenoviruses are recovered and purified using standard molecular
biological techniques that are well known to one of ordinary skill
in the art.
[0219] The adeno-associated viruses (AAV) are DNA viruses of
relatively small size that can integrate, in a stable and
site-specific manner, into the genome of the cells that they
infect. They are able to infect a wide spectrum of cells without
inducing any effects on cellular growth, morphology or
differentiation, and they do not appear to be involved in human
pathologies. The AAV genome has been cloned, sequenced and
characterized. It encompasses approximately 4700 bases and contains
an inverted terminal repeat (ITR) region of approximately 145 bases
at each end, which serves as an origin of replication for the
virus. The remainder of the genome is divided into two essential
regions that carry the encapsidation functions: the left-hand part
of the genome, that contains the rep gene involved in viral
replication and expression of the viral genes; and the right-hand
part of the genome, that contains the cap gene encoding the capsid
proteins of the virus.
[0220] The use of vectors derived from the AAVs for transferring
genes in vitro and in vivo has been described (See e.g., WO
91/18088; WO 93/09239; U.S. Pat. No. 4,797,368; U.S. Pat. No.
5,139,941; and EP 488 528, all of which are herein incorporated by
reference). These publications describe various AAV-derived
constructs in which the rep and/or cap genes are deleted and
replaced by a gene of interest, and the use of these constructs for
transferring the gene of interest in vitro (into cultured cells) or
in vivo (directly into an organism). The replication defective
recombinant AAVs according to the invention can be prepared by
co-transfecting a plasmid containing the nucleic acid sequence of
interest flanked by two AAV inverted terminal repeat (ITR) regions,
and a plasmid carrying the AAV encapsidation genes (rep and cap
genes), into a cell line that is infected with a human helper virus
(for example an adenovirus). The AAV recombinants that are produced
are then purified by standard techniques.
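By way of illustration only, the genome figures given above can be turned into a rough estimate of how much sequence lies between the two ITRs. The following Python sketch uses the approximate 4700-base genome and 145-base ITR values stated above; the function name is hypothetical, and treating the wild-type genome length as an upper bound for a recombinant construct is an assumption of this sketch, not a limit stated in this specification.

```python
def bases_between_itrs(genome_len=4700, itr_len=145):
    """Approximate number of bases lying between the two terminal ITRs.

    Defaults are the approximate values given in the text; both are
    estimates, so the result is an estimate as well.
    """
    if genome_len < 2 * itr_len:
        raise ValueError("genome is shorter than its two ITRs")
    return genome_len - 2 * itr_len


if __name__ == "__main__":
    # 4700 - 2 * 145 = 4410 bases between the ITRs
    print(bases_between_itrs())
```

Under these assumptions, a sequence of interest flanked by two ITRs would need to fit within roughly 4400 bases to match the size of the wild-type genome.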
[0221] In another embodiment, the gene can be introduced in a
retroviral vector (e.g., as described in U.S. Pat. Nos. 5,399,346,
4,650,764, 4,980,289 and 5,124,263; all of which are herein
incorporated by reference; Mann et al., Cell 33:153 [1983];
Markowitz et al., J. Virol., 62:1120 [1988]; PCT/US95/14575; EP
453242; EP178220; Bernstein et al. Genet. Eng., 7:235 [1985];
McCormick, BioTechnol., 3:689 [1985]; WO 95/07358; and Kuo et al.,
Blood 82:845 [1993]). The retroviruses are integrating viruses that
infect dividing cells. The retrovirus genome includes two LTRs, an
encapsidation sequence and three coding regions (gag, pol and env).
In recombinant retroviral vectors, the gag, pol and env genes are
generally deleted, in whole or in part, and replaced with a
heterologous nucleic acid sequence of interest. These vectors can
be constructed from different types of retrovirus, such as HIV,
MoMuLV ("murine Moloney leukemia virus"), MSV ("murine Moloney
sarcoma virus"), HaSV ("Harvey sarcoma virus"), SNV ("spleen
necrosis virus"), RSV ("Rous sarcoma virus") and Friend virus.
Defective retroviral vectors are also disclosed in WO 95/02697.
[0222] In general, in order to construct recombinant retroviruses
containing a nucleic acid sequence, a plasmid is constructed that
contains the LTRs, the encapsidation sequence and the coding
sequence. This construct is used to transfect a packaging cell
line, which cell line is able to supply in trans the retroviral
functions that are deficient in the plasmid. In general, the
packaging cell lines are thus able to express the gag, pol and env
genes. Such packaging cell lines have been described in the prior
art, in particular the cell line PA317 (U.S. Pat. No. 4,861,719,
herein incorporated by reference), the PsiCRIP cell line (See,
WO90/02806), and the GP+envAm-12 cell line (See, WO89/07150). In
addition, the recombinant retroviral vectors can contain
modifications within the LTRs for suppressing transcriptional
activity as well as extensive encapsidation sequences that may
include a part of the gag gene (Bender et al., J. Virol., 61:1639
[1987]). Recombinant retroviral vectors are purified by standard
techniques known to those having ordinary skill in the art.
[0223] Alternatively, the vector can be introduced in vivo by
lipofection. For the past decade, there has been increasing use of
liposomes for encapsulation and transfection of nucleic acids in
vitro. Synthetic cationic lipids designed to limit the difficulties
and dangers encountered with liposome mediated transfection can be
used to prepare liposomes for in vivo transfection of a gene
encoding a marker (Felgner et al., Proc. Natl. Acad. Sci. USA
84:7413-7417 [1987]; See also, Mackey, et al., Proc. Natl. Acad.
Sci. USA 85:8027-8031 [1988]; Ulmer et al., Science 259:1745-1748
[1993]). The use of cationic lipids may promote encapsulation of
negatively charged nucleic acids, and also promote fusion with
negatively charged cell membranes (Felgner and Ringold, Nature
337:387-388 [1989]). Particularly useful lipid compounds and
compositions for transfer of nucleic acids are described in
WO95/18863 and WO96/17823, and in U.S. Pat. No. 5,459,127, herein
incorporated by reference.
[0224] Other molecules are also useful for facilitating
transfection of a nucleic acid in vivo, such as a cationic
oligopeptide (e.g., WO95/21931), peptides derived from DNA binding
proteins (e.g., WO96/25508), or a cationic polymer (e.g.,
WO95/21931).
[0225] It is also possible to introduce the vector in vivo as a
naked DNA plasmid. Methods for formulating and administering naked
DNA to mammalian muscle tissue are disclosed in U.S. Pat. Nos.
5,580,859 and 5,589,466, both of which are herein incorporated by
reference.
[0226] DNA vectors for gene therapy can be introduced into the
desired host cells by methods known in the art, including but not
limited to transfection, electroporation, microinjection,
transduction, cell fusion, DEAE dextran, calcium phosphate
precipitation, use of a gene gun, or use of a DNA vector
transporter (See e.g., Wu et al., J. Biol. Chem., 267:963 [1992];
Wu and Wu, J. Biol. Chem., 263:14621 [1988]; and Williams et al.,
Proc. Natl. Acad. Sci. USA 88:2726 [1991]). Receptor-mediated DNA
delivery approaches can also be used (Curiel et al., Hum. Gene
Ther., 3:147 [1992]; and Wu and Wu, J. Biol. Chem., 262:4429
[1987]).
VI. Transgenic Animals Expressing Exogenous NOD Genes and Homologs,
Mutants, and Variants Thereof
[0227] The present invention contemplates the generation of
transgenic animals comprising an exogenous NOD gene or homologs,
mutants, or variants thereof. In preferred embodiments, the
transgenic animal displays an altered phenotype as compared to
wild-type animals. In some embodiments, the altered phenotype is
the overexpression of mRNA for a NOD gene as compared to wild-type
levels of NOD expression. In other embodiments, the altered
phenotype is the decreased expression of mRNA for an endogenous NOD
gene as compared to wild-type levels of endogenous NOD expression.
In some preferred embodiments, the transgenic animals comprise
mutant alleles of NOD. Methods for analyzing the presence or
absence of such phenotypes include Northern blotting, mRNA
protection assays, and RT-PCR. In other embodiments, the transgenic
mice have a knock out mutation of a NOD gene. In preferred
embodiments, the transgenic animals display an altered
susceptibility to inflammatory diseases.
[0228] Such animals find use in research applications (e.g.,
identifying signaling pathways that a NOD protein is involved in),
as well as drug screening applications (e.g., to screen for drugs
that prevent or treat inflammatory diseases). For example, in some
embodiments, test compounds (e.g., a drug that is suspected of
being useful to treat an inflammatory disease) are administered to
the transgenic animals and control animals with a wild type NOD
allele and the effects evaluated. The effects of the test and
control compounds on disease symptoms are then assessed.
[0229] The transgenic animals can be generated via a variety of
methods. In some embodiments, embryonal cells at various
developmental stages are used to introduce transgenes for the
production of transgenic animals. Different methods are used
depending on the stage of development of the embryonal cell. The
zygote is the best target for micro-injection. In the mouse, the
male pronucleus reaches the size of approximately 20 micrometers in
diameter, which allows reproducible injection of 1-2 picoliters
(pl) of DNA solution. The use of zygotes as a target for gene
transfer has a major advantage in that in most cases the injected
DNA will be incorporated into the host genome before the first
cleavage (Brinster et al., Proc. Natl. Acad. Sci. USA 82:4438-4442
[1985]). As a consequence, all cells of the transgenic non-human
animal will carry the incorporated transgene. This will in general
also be reflected in the efficient transmission of the transgene to
offspring of the founder since 50% of the germ cells will harbor
the transgene. U.S. Pat. No. 4,873,191 describes a method for the
micro-injection of zygotes; the disclosure of this patent is
incorporated herein in its entirety.
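The 50% figure above lends itself to a simple worked calculation. The following Python sketch is illustrative only: it assumes a non-mosaic founder carrying the transgene at a single integration site, so that each offspring independently inherits the transgene with the stated probability of 0.5. The function names are hypothetical and do not appear elsewhere in this specification.

```python
def expected_transgenic(n_offspring, germ_cell_fraction=0.5):
    """Expected number of offspring that inherit the transgene."""
    _validate(n_offspring, germ_cell_fraction)
    return n_offspring * germ_cell_fraction


def prob_at_least_one_transgenic(n_offspring, germ_cell_fraction=0.5):
    """Probability that at least one of n offspring carries the transgene,
    assuming each offspring inherits it independently."""
    _validate(n_offspring, germ_cell_fraction)
    return 1.0 - (1.0 - germ_cell_fraction) ** n_offspring


def _validate(n_offspring, germ_cell_fraction):
    if n_offspring < 0:
        raise ValueError("n_offspring must be non-negative")
    if not 0.0 <= germ_cell_fraction <= 1.0:
        raise ValueError("germ_cell_fraction must lie in [0, 1]")


if __name__ == "__main__":
    print(expected_transgenic(8))           # 4.0
    print(prob_at_least_one_transgenic(3))  # 0.875
```

A mosaic founder, such as those obtained by the retroviral methods described below, would be modeled by passing a germ_cell_fraction lower than 0.5.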
[0230] In other embodiments, retroviral infection is used to
introduce transgenes into a non-human animal. In some embodiments,
the retroviral vector is utilized to transfect oocytes by injecting
the retroviral vector into the perivitelline space of the oocyte
(U.S. Pat. No. 6,080,912, incorporated herein by reference). In
other embodiments, the developing non-human embryo can be cultured
in vitro to the blastocyst stage. During this time, the blastomeres
can be targets for retroviral infection (Jaenisch, Proc. Natl.
Acad. Sci. USA 73:1260 [1976]). Efficient infection of the
blastomeres is obtained by enzymatic treatment to remove the zona
pellucida (Hogan et al., in Manipulating the Mouse Embryo, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. [1986]).
The viral vector system used to introduce the transgene is
typically a replication-defective retrovirus carrying the transgene
(Jahner et al., Proc. Natl. Acad. Sci. USA 82:6927 [1985]).
Transfection is easily and efficiently obtained by culturing the
blastomeres on a monolayer of virus-producing cells (Van der
Putten, supra; Stewart, et al., EMBO J., 6:383 [1987]).
Alternatively, infection can be performed at a later stage. Virus
or virus-producing cells can be injected into the blastocoele
(Jahner et al., Nature 298:623 [1982]). Most of the founders will
be mosaic for the transgene since incorporation occurs only in a
subset of cells that form the transgenic animal. Further, the
founder may contain various retroviral insertions of the transgene
at different positions in the genome that generally will segregate
in the offspring. In addition, it is also possible to introduce
transgenes into the germline, albeit with low efficiency, by
intrauterine retroviral infection of the midgestation embryo
(Jahner et al., supra [1982]). Additional means of using
retroviruses or retroviral vectors to create transgenic animals
known to the art involve the micro-injection of retroviral
particles or mitomycin C-treated cells producing retrovirus into
the perivitelline space of fertilized eggs or early embryos (PCT
International Application WO 90/08832 [1990], and Haskell and
Bowen, Mol. Reprod. Dev., 40:386 [1995]).
[0231] In other embodiments, the transgene is introduced into
embryonic stem cells and the transfected stem cells are utilized to
form an embryo. ES cells are obtained by culturing pre-implantation
embryos in vitro under appropriate conditions (Evans et al., Nature
292:154 [1981]; Bradley et al., Nature 309:255 [1984]; Gossler et
al., Proc. Natl. Acad. Sci. USA 83:9065 [1986]; and Robertson et al.,
Nature 322:445 [1986]). Transgenes can be efficiently introduced
into the ES cells by DNA transfection by a variety of methods known
to the art including calcium phosphate co-precipitation, protoplast
or spheroplast fusion, lipofection and DEAE-dextran-mediated
transfection. Transgenes may also be introduced into ES cells by
retrovirus-mediated transduction or by micro-injection. Such
transfected ES cells can thereafter colonize an embryo following
their introduction into the blastocoel of a blastocyst-stage embryo
and contribute to the germ line of the resulting chimeric animal
(for review, See, Jaenisch, Science 240:1468 [1988]). Prior to the
introduction of transfected ES cells into the blastocoel, the
transfected ES cells may be subjected to various selection
protocols to enrich for ES cells which have integrated the
transgene assuming that the transgene provides a means for such
selection. Alternatively, the polymerase chain reaction may be used
to screen for ES cells that have integrated the transgene. This
technique obviates the need for growth of the transfected ES cells
under appropriate selective conditions prior to transfer into the
blastocoel.
[0232] In still other embodiments, homologous recombination is
utilized to knock-out gene function or create deletion mutants
(e.g., mutants in which a particular domain of a NOD is deleted).
Methods for homologous recombination are described in U.S. Pat. No.
5,614,396, incorporated herein by reference.
VII. Drug Screening Using NOD
[0233] In some embodiments, the isolated nucleic acid and
polypeptides of NOD genes of the present invention (e.g., SEQ ID
NOS: 1-22) and related proteins and nucleic acids are used in drug
screening applications for compounds that alter (e.g., enhance or
inhibit) NOD activity and signaling. The present invention further
provides methods of identifying ligands of the NOD proteins of the
present invention.
[0234] As described above, NOD family proteins (e.g., Nod2) have
been shown to mediate the host response to bacterial muropeptides.
The present invention is not limited to a particular mechanism.
Indeed, an understanding of the mechanism is not necessary to
practice the present invention. Nonetheless, it is contemplated
that the NOD family proteins of the present invention are involved
in host responses to microbes (e.g., bacteria, virus, fungi, etc.).
It is further contemplated that some NODs recognize endogenous
compounds (e.g., derived from host cells) as ligands. For example,
some NODs may recognize host cell proteins induced by stress (e.g.
heat shock proteins). Accordingly, in some embodiments, the present
invention provides methods of screening for ligands of NOD family
proteins (e.g., ligands derived from microbes or host factors). For
example, in some embodiments, an assay that measures NOD signaling
is used to screen libraries of compounds (e.g., microbial or host
derived compounds) for their ability to alter NOD family
signaling.
[0235] In other embodiments, the present invention provides methods
of screening compounds for the ability to alter NOD signaling
mediated by natural ligands (e.g., identified using the methods
described above). Such compounds find use in the treatment of
disease mediated by NOD family members (e.g., inflammatory
diseases).
[0236] In one screening method, the two-hybrid system is used to
screen for compounds (e.g., proteins) capable of altering NOD
function(s) (e.g., interaction with a binding partner) in vitro or
in vivo. In one embodiment, a GAL4 binding site, linked to a
reporter gene such as lacZ, is contacted in the presence and
absence of a candidate compound with a GAL4 binding domain linked
to a NOD fragment and a GAL4 transactivation domain II linked to a
binding partner fragment. Expression of the reporter gene is
monitored and a decrease in the expression is an indication that
the candidate compound inhibits the interaction of a NOD with the
binding partner. Alternately, the effect of candidate compounds on
the interaction of a NOD with other proteins (e.g., proteins known
to interact directly or indirectly with the binding partner) can be
tested in a similar manner. In some embodiments, the present
invention provides methods of identifying NOD binding partners or
ligands that utilize immunoprecipitation. In some embodiments,
antibodies to NOD proteins are utilized to immunoprecipitate NODs
and any bound proteins. In other embodiments, NOD fusion proteins
are generated with tags and antibodies to the tags are utilized for
immunoprecipitation. Potential binding partners that
immunoprecipitate with NODs can be identified using any suitable
method.
[0237] In another screening method, candidate compounds are
evaluated for their ability to alter NOD signaling by contacting
NOD, binding partners, binding partner-associated proteins, or
fragments thereof, with the candidate compound and determining
binding of the candidate compound to the peptide. The protein or
protein fragments is/are immobilized using methods known in the art
such as binding a GST-NOD fusion protein to a polymeric bead
containing glutathione. A chimeric gene encoding a GST fusion
protein is constructed by fusing DNA encoding the polypeptide or
polypeptide fragment of interest to the DNA encoding the carboxyl
terminus of GST (See e.g., Smith et al., Gene 67:31 [1988]). The
fusion construct is then transformed into a suitable expression
system (e.g., E. coli XA90) in which the expression of the GST
fusion protein can be induced with
isopropyl-β-D-thiogalactopyranoside (IPTG). Induction with
IPTG should yield the fusion protein as a major constituent of
soluble, cellular proteins. The fusion proteins can be purified by
methods known to those skilled in the art, including purification
by glutathione affinity chromatography. Binding of the candidate
compound to the proteins or protein fragments is correlated with
the ability of the compound to disrupt the signal transduction
pathway and thus regulate NOD physiological effects (e.g.,
inflammatory disease).
[0238] In another screening method, one of the components of the
NOD/binding partner signaling system is immobilized. Polypeptides
can be immobilized using methods known in the art, such as
adsorption onto a plastic microtiter plate or specific binding of a
GST-fusion protein to a polymeric bead containing glutathione. For
example, in some embodiments, GST-NOD is bound to
glutathione-Sepharose beads. The immobilized peptide is then
contacted with another peptide with which it is capable of binding
in the presence and absence of a candidate compound. Unbound
peptide is then removed and the complex solubilized and analyzed to
determine the amount of bound labeled peptide. A decrease in
binding is an indication that the candidate compound inhibits the
interaction of NOD with the other peptide. A variation of this
method allows for the screening of compounds that are capable of
disrupting a previously-formed protein/protein complex. For
example, in some embodiments a complex comprising a NOD or a NOD
fragment bound to another peptide is immobilized as described above
and contacted with a candidate compound. The dissolution of the
complex by the candidate compound correlates with the ability of
the compound to disrupt or inhibit the interaction between NOD and
the other peptide.
[0239] Another technique for drug screening provides high
throughput screening for compounds having suitable binding affinity
to NOD peptides and is described in detail in WO 84/03564,
incorporated herein by reference. Briefly, large numbers of
different small peptide test compounds are synthesized on a solid
substrate, such as plastic pins or some other surface. The peptide
test compounds are then reacted with NOD peptides and washed. Bound
NOD peptides are then detected by methods well known in the
art.
[0240] Another technique uses NOD antibodies, generated as
discussed above. Such antibodies are capable of specifically
binding to NOD peptides and compete with a test compound for
binding to NOD. In this manner, the antibodies can be used to
detect the presence of any peptide that shares one or more
antigenic determinants of a NOD peptide.
[0241] The present invention contemplates many other means of
screening compounds. The examples provided above are presented
merely to illustrate a range of techniques available. One of
ordinary skill in the art will appreciate that many other screening
methods can be used.
[0242] In particular, the present invention contemplates the use of
cell lines transfected with NOD genes and variants thereof for
screening compounds for activity, and in particular to high
throughput screening of compounds from combinatorial libraries
(e.g., libraries containing greater than 10.sup.4 compounds). The
cell lines of the present invention can be used in a variety of
screening methods. In some embodiments, the cells can be used in
second messenger assays that monitor signal transduction following
activation of cell-surface receptors. In other embodiments, the
cells can be used in reporter gene assays that monitor cellular
responses at the transcription/translation level. In still further
embodiments, the cells can be used in cell proliferation assays to
monitor the overall growth/no growth response of cells to external
stimuli.
[0243] In second messenger assays, the host cells are preferably
transfected as described above with vectors encoding NOD or
variants or mutants thereof. The host cells are then treated with a
compound or plurality of compounds (e.g., from a combinatorial
library) and assayed for the presence or absence of a response. It
is contemplated that at least some of the compounds in the
combinatorial library can serve as agonists, antagonists,
activators, or inhibitors of the protein or proteins encoded by the
vectors. It is also contemplated that at least some of the
compounds in the combinatorial library can serve as agonists,
antagonists, activators, or inhibitors of protein acting upstream
or downstream of the protein encoded by the vector in a signal
transduction pathway.
[0244] In some embodiments, the second messenger assays measure
fluorescent signals from reporter molecules that respond to
intracellular changes (e.g., Ca.sup.2+ concentration, membrane
potential, pH, IP.sub.3, cAMP, arachidonic acid release) due to
stimulation of membrane receptors and ion channels (e.g., ligand
gated ion channels; see Denyer et al., Drug Discov. Today 3:323
[1998]; and Gonzales et al., Drug. Discov. Today 4:431-39 [1999]).
Examples of reporter molecules include, but are not limited to,
FRET (fluorescence resonance energy transfer) systems (e.g.,
Cuo-lipids and oxonols, EDAN/DABCYL), calcium sensitive indicators
(e.g., Fluo-3, FURA 2, INDO 1, and FLUO3/AM, BAPTA AM),
chloride-sensitive indicators (e.g., SPQ, SPA), potassium-sensitive
indicators (e.g., PBFI), sodium-sensitive indicators (e.g., SBFI),
and pH sensitive indicators (e.g., BCECF).
[0245] In general, the host cells are loaded with the indicator
prior to exposure to the compound. Responses of the host cells to
treatment with the compounds can be detected by methods known in
the art, including, but not limited to, fluorescence microscopy,
confocal microscopy (e.g., FCS systems), flow cytometry,
microfluidic devices, FLIPR systems (See, e.g., Schroeder and
Neagle, J. Biomol. Screening 1:75 [1996]), and plate-reading
systems. In some preferred embodiments, the response (e.g.,
increase in fluorescent intensity) caused by a compound of unknown
activity is compared to the response generated by a known agonist
and expressed as a percentage of the maximal response of the known
agonist. The maximum response caused by a known agonist is defined
as a 100% response. Likewise, the maximal response recorded after
addition of an agonist to a sample containing a known or test
antagonist is detectably lower than the 100% response.
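By way of illustration only, the normalization described above can be sketched in Python as follows. The function name, the baseline subtraction, and the fluorescence readings are assumptions introduced for this example and are not part of the disclosed methods; the maximal response of the known agonist is taken as the 100% response.

```python
def percent_of_max_response(signal, baseline, agonist_max):
    """Express a response as a percentage of the known agonist's maximal response.

    Assumption: a baseline (unstimulated) reading is subtracted from both the
    test signal and the agonist maximum before the ratio is taken.
    """
    span = agonist_max - baseline
    if span <= 0:
        raise ValueError("agonist maximum must exceed the baseline reading")
    return 100.0 * (signal - baseline) / span


# Hypothetical fluorescence readings (arbitrary units), for illustration only.
baseline = 200.0
agonist_max = 1200.0  # defined as the 100% response
readings = {"compound_A": 950.0, "compound_B": 260.0}

for name, signal in readings.items():
    print(name, round(percent_of_max_response(signal, baseline, agonist_max), 1))
# compound_A 75.0
# compound_B 6.0
```

In this sketch, an antagonist would be recognized by a reading below 100% when the known agonist is added to a sample that already contains the antagonist.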
[0246] The cells are also useful in reporter gene assays. Reporter
gene assays involve the use of host cells transfected with vectors
encoding a nucleic acid comprising transcriptional control elements
of a target gene (i.e., a gene that controls the biological
expression and function of a disease target) spliced to a coding
sequence for a reporter gene. Therefore, activation of the target
gene results in activation of the reporter gene product. In some
embodiments, the reporter gene construct comprises the 5'
regulatory region (e.g., promoters and/or enhancers) of a protein
whose expression is controlled by NOD in operable association with
a reporter gene. Examples of reporter genes finding use in the
present invention include, but are not limited to, chloramphenicol
acetyltransferase, alkaline phosphatase, firefly and bacterial
luciferases, .beta.-galactosidase, .beta.-lactamase, and green
fluorescent protein. The production of these proteins, with the
exception of green fluorescent protein, is detected through the use
of chemiluminescent, colorimetric, or bioluminescent products of
specific substrates (e.g., X-gal and luciferin). Comparisons
between compounds of known and unknown activities may be conducted
as described above.
[0247] Specifically, the present invention provides screening
methods for identifying modulators, i.e., candidate or test
compounds or agents (e.g., proteins, peptides, peptidomimetics,
peptoids, small molecules or other drugs) which bind to a NOD of
the present invention, have an inhibitory (or stimulatory) effect
on, for example, NOD expression or NOD activity, or have a
stimulatory or inhibitory effect on, for example, the expression or
activity of a NOD substrate. Compounds thus identified can be used
to modulate the activity of target gene products (e.g., NOD genes)
either directly or indirectly in a therapeutic protocol, to
elaborate the biological function of the target gene product, or to
identify compounds that disrupt normal target gene interactions.
Compounds that stimulate the activity of a variant NOD or mimic
the activity of a non-functional variant are particularly useful in
the treatment of inflammatory diseases.
[0248] In one embodiment, the invention provides assays for
screening candidate or test compounds that are substrates of a NOD
protein or polypeptide or a biologically active portion thereof. In
another embodiment, the invention provides assays for screening
candidate or test compounds that bind to or modulate the activity
of a NOD protein or polypeptide or a biologically active portion
thereof.
[0249] The test compounds of the present invention can be obtained
using any of the numerous approaches in combinatorial library
methods known in the art, including biological libraries; peptoid
libraries (libraries of molecules having the functionalities of
peptides, but with a novel, non-peptide backbone, which are
resistant to enzymatic degradation but which nevertheless remain
bioactive; see, e.g., Zuckermann et al., J. Med. Chem. 37:2678-85
[1994]); spatially addressable parallel solid phase or solution
phase libraries; synthetic library methods requiring deconvolution;
the `one-bead one-compound` library method; and synthetic library
methods using affinity chromatography selection. The biological
library and peptoid library approaches are preferred for use with
peptide libraries, while the other four approaches are applicable
to peptide, non-peptide oligomer or small molecule libraries of
compounds (Lam (1997) Anticancer Drug Des. 12:145).
[0250] Examples of methods for the synthesis of molecular libraries
can be found in the art, for example in: DeWitt et al., Proc. Natl.
Acad. Sci. U.S.A. 90:6909 [1993]; Erb et al., Proc. Natl. Acad. Sci.
USA 91:11422 [1994]; Zuckermann et al., J. Med. Chem. 37:2678
[1994]; Cho et al., Science 261:1303 [1993]; Carell et al., Angew.
Chem. Int. Ed. Engl. 33:2059 [1994]; Carell et al., Angew. Chem.
Int. Ed. Engl. 33:2061 [1994]; and Gallop et al., J. Med. Chem.
37:1233 [1994].
[0251] Libraries of compounds may be presented in solution (e.g.,
Houghten, Biotechniques 13:412-421 [1992]), or on beads (Lam,
Nature 354:82-84 [1991]), chips (Fodor, Nature 364:555-556 [1993]),
bacteria or spores (U.S. Pat. No. 5,223,409; herein incorporated by
reference), plasmids (Cull et al., Proc. Natl. Acad. Sci. USA
89:1865-1869 [1992]) or on phage (Scott and Smith, Science
249:386-390 [1990]; Devlin, Science 249:404-406 [1990]; Cwirla et
al., Proc. Natl. Acad. Sci. 87:6378-6382 [1990]; Felici, J. Mol.
Biol. 222:301 [1991]).
[0252] In one embodiment, an assay is a cell-based assay in which a
cell that expresses a NOD protein or biologically active portion
thereof is contacted with a test compound, and the ability of the
test compound to modulate a NOD's activity is determined.
Determining the ability of the test compound to modulate NOD
activity can be accomplished by monitoring, for example, changes in
enzymatic activity. The cell, for example, can be of mammalian
origin.
[0253] The ability of the test compound to modulate NOD binding to
a compound, e.g., a NOD substrate, can also be evaluated. This can
be accomplished, for example, by coupling the compound, e.g., the
substrate, with a radioisotope or enzymatic label such that binding
of the compound, e.g., the substrate, to a NOD can be determined by
detecting the labeled compound, e.g., substrate, in a complex.
[0254] Alternatively, a NOD is coupled with a radioisotope or
enzymatic label to monitor the ability of a test compound to
modulate NOD binding to a NOD substrate in a complex. For example,
compounds (e.g., substrates) can be labeled with .sup.125I,
.sup.35S, .sup.14C or .sup.3H, either directly or indirectly, and
the radioisotope detected by direct counting of radioemission or
by scintillation counting. Alternatively, compounds can be
enzymatically labeled with, for example, horseradish peroxidase,
alkaline phosphatase, or luciferase, and the enzymatic label
detected by determination of conversion of an appropriate substrate
to product.
[0255] The ability of a compound (e.g., a NOD substrate) to
interact with a NOD with or without the labeling of any of the
interactants can be evaluated. For example, a microphysiometer can
be used to detect the interaction of a compound with a NOD without
the labeling of either the compound or the NOD (McConnell et al.
Science 257:1906-1912 [1992]). As used herein, a "microphysiometer"
(e.g., Cytosensor) is an analytical instrument that measures the
rate at which a cell acidifies its environment using a
light-addressable potentiometric sensor (LAPS). Changes in this
acidification rate can be used as an indicator of the interaction
between a compound and a NOD polypeptide.
[0256] In yet another embodiment, a cell-free assay is provided in
which a NOD protein or biologically active portion thereof is
contacted with a test compound and the ability of the test compound
to bind to the NOD protein or biologically active portion thereof
is evaluated. Preferred biologically active portions of NOD
proteins to be used in assays of the present invention include
fragments that participate in interactions with substrates or other
proteins, e.g., fragments with high surface probability scores.
[0257] Cell-free assays involve preparing a reaction mixture of the
target gene protein and the test compound under conditions and for
a time sufficient to allow the two components to interact and bind,
thus forming a complex that can be removed and/or detected.
[0258] The interaction between two molecules can also be detected,
e.g., using fluorescence energy transfer (FRET) (see, for example,
Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos et al.,
U.S. Pat. No. 4,968,103; each of which is herein incorporated by
reference). A fluorophore label is selected such that a first donor
molecule's emitted fluorescent energy will be absorbed by a
fluorescent label on a second, `acceptor` molecule, which in turn
is able to fluoresce due to the absorbed energy.
[0259] Alternately, the `donor` protein molecule may simply utilize
the natural fluorescent energy of tryptophan residues. Labels are
chosen that emit different wavelengths of light, such that the
`acceptor` molecule label may be differentiated from that of the
`donor`. Since the efficiency of energy transfer between the labels
is related to the distance separating the molecules, the spatial
relationship between the molecules can be assessed. In a situation
in which binding occurs between the molecules, the fluorescent
emission of the `acceptor` molecule label in the assay should
be maximal. A FRET binding event can be conveniently measured
through standard fluorometric detection means well known in the art
(e.g., using a fluorimeter).
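As a hedged illustration of the distance dependence noted above, the standard Forster relation, E = 1/(1 + (r/R0)^6), can be evaluated as sketched below. The function name and the example distances are assumptions made for this illustration; R0 (the Forster radius, at which transfer efficiency is 50%) depends on the particular donor/acceptor pair and is given a hypothetical value here.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Energy transfer efficiency from the Forster relation E = 1 / (1 + (r / R0)**6)."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and the Forster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Hypothetical Forster radius for an unspecified donor/acceptor pair.
R0_NM = 5.0

# Efficiency falls steeply as the labels move apart, so a bound complex
# (short distance) gives a much stronger acceptor signal than unbound molecules.
for r in (2.5, 5.0, 7.5, 10.0):
    print(f"r = {r:4.1f} nm  ->  E = {fret_efficiency(r, R0_NM):.3f}")
```

Because efficiency varies with the sixth power of the distance ratio, the acceptor emission changes sharply near R0, which is why the measurement is sensitive to whether the two labeled molecules are bound.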
[0260] In another embodiment, determining the ability of a NOD
protein to bind to a target molecule can be accomplished using
real-time Biomolecular Interaction Analysis (BIA) (see, e.g.,
Sjolander and Urbaniczky, Anal. Chem. 63:2338-2345 [1991] and Szabo
et al. Curr. Opin. Struct. Biol. 5:699-705 [1995]). "Surface
plasmon resonance" or "BIA" detects biospecific interactions in
real time, without labeling any of the interactants (e.g.,
BIAcore). Changes in the mass at the binding surface (indicative of
a binding event) result in alterations of the refractive index of
light near the surface (the optical phenomenon of surface plasmon
resonance (SPR)), resulting in a detectable signal that can be used
as an indication of real-time reactions between biological
molecules.
[0261] In one embodiment, the target gene product or the test
substance is anchored onto a solid phase. The target gene
product/test compound complexes anchored on the solid phase can be
detected at the end of the reaction. Preferably, the target gene
product can be anchored onto a solid surface, and the test
compound (which is not anchored) can be labeled, either directly
or indirectly, with detectable labels discussed herein.
[0262] It may be desirable to immobilize a NOD protein, an anti-NOD
antibody or its target molecule to facilitate separation of
complexed from non-complexed forms of one or both of the proteins,
as well as to accommodate automation of the assay. Binding of a
test compound to a NOD protein, or interaction of a NOD protein
with a target molecule in the presence and absence of a candidate
compound, can be accomplished in any vessel suitable for containing
the reactants. Examples of such vessels include microtiter plates,
test tubes, and micro-centrifuge tubes. In one embodiment, a fusion
protein can be provided which adds a domain that allows one or both
of the proteins to be bound to a matrix. For example,
glutathione-S-transferase-NOD fusion proteins or
glutathione-S-transferase/target fusion proteins can be adsorbed
onto glutathione Sepharose beads (Sigma Chemical, St. Louis, Mo.)
or glutathione-derivatized microtiter plates, which are then
combined with the test compound or the test compound and either the
non-adsorbed target protein or NOD protein, and the mixture
incubated under conditions conducive for complex formation (e.g.,
at physiological conditions for salt and pH). Following incubation,
the beads or microtiter plate wells are washed to remove any
unbound components, the matrix is immobilized in the case of beads,
and the complex is determined either directly or indirectly, for example, as
described above.
[0263] Alternatively, the complexes can be dissociated from the
matrix, and the level of NOD binding or activity determined using
standard techniques. Other techniques for immobilizing either a NOD
protein or a target molecule on matrices include using conjugation
of biotin and streptavidin. Biotinylated NOD protein or target
molecules can be prepared from biotin-NHS (N-hydroxy-succinimide)
using techniques known in the art (e.g., biotinylation kit, Pierce
Chemicals, Rockford, Ill.), and immobilized in the wells of
streptavidin-coated 96 well plates (Pierce Chemical).
[0264] In order to conduct the assay, the non-immobilized component
is added to the coated surface containing the anchored component.
After the reaction is complete, unreacted components are removed
(e.g., by washing) under conditions such that any complexes formed
will remain immobilized on the solid surface. The detection of
complexes anchored on the solid surface can be accomplished in a
number of ways. Where the previously non-immobilized component is
pre-labeled, the detection of label immobilized on the surface
indicates that complexes were formed. Where the previously
non-immobilized component is not pre-labeled, an indirect label can
be used to detect complexes anchored on the surface; e.g., using a
labeled antibody specific for the immobilized component (the
antibody, in turn, can be directly labeled or indirectly labeled
with, e.g., a labeled anti-IgG antibody).
[0265] This assay is performed utilizing antibodies reactive with
NOD protein or target molecules but which do not interfere with
binding of the NOD protein to its target molecule. Such antibodies
can be derivatized to the wells of the plate, and unbound target or
NOD protein trapped in the wells by antibody conjugation. Methods
for detecting such complexes, in addition to those described above
for the GST-immobilized complexes, include immunodetection of
complexes using antibodies reactive with the NOD protein or target
molecule, as well as enzyme-linked assays which rely on detecting
an enzymatic activity associated with the NOD protein or target
molecule.
[0266] Alternatively, cell free assays can be conducted in a liquid
phase. In such an assay, the reaction products are separated from
unreacted components, by any of a number of standard techniques,
including, but not limited to: differential centrifugation (see,
for example, Rivas and Minton, Trends Biochem Sci 18:284-7 [1993]);
chromatography (gel filtration chromatography, ion-exchange
chromatography); electrophoresis (see, e.g., Ausubel et al., eds.
Current Protocols in Molecular Biology 1999, J. Wiley: New York.);
and immunoprecipitation (see, for example, Ausubel et al., eds.
Current Protocols in Molecular Biology 1999, J. Wiley: New York).
Such resins and chromatographic techniques are known to one skilled
in the art (See e.g., Heegaard J. Mol. Recognit 11: 141-8 [1998];
Hage and Tweed J. Chromatogr. Biomed. Sci. Appl 699:499-525 [1997]).
Further, fluorescence energy transfer may also be conveniently
utilized, as described herein, to detect binding without further
purification of the complex from solution.
[0267] The assay can include contacting the NOD protein or
biologically active portion thereof with a known compound that
binds the NOD to form an assay mixture, contacting the assay
mixture with a test compound, and determining the ability of the
test compound to interact with a NOD protein, wherein determining
the ability of the test compound to interact with a NOD protein
includes determining the ability of the test compound to
preferentially bind to NOD or biologically active portion thereof,
or to modulate the activity of a target molecule, as compared to
the known compound.
[0268] To the extent that a NOD can, in vivo, interact with one or
more cellular or extracellular macromolecules, such as proteins,
inhibitors of such an interaction are useful. A homogeneous assay
can be used to identify inhibitors.
[0269] For example, a preformed complex of the target gene product
and the interactive cellular or extracellular binding partner
product is prepared such that either the target gene products or
their binding partners are labeled, but the signal generated by the
label is quenched due to complex formation (see, e.g., U.S. Pat.
No. 4,109,496, herein incorporated by reference, that utilizes this
approach for immunoassays). The addition of a test substance that
competes with and displaces one of the species from the preformed
complex will result in the generation of a signal above background.
In this way, test substances that disrupt target gene
product-binding partner interaction can be identified.
Alternatively, a NOD protein can be used as a "bait protein" in a
two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No.
5,283,317; Zervos et al., Cell 72:223-232 [1993]; Madura et al., J.
Biol. Chem. 268:12046-12054 [1993]; Bartel et al., Biotechniques
14:920-924 [1993]; Iwabuchi et al., Oncogene 8:1693-1696 [1993];
and Brent WO 94/10300; each of which is herein incorporated by
reference), to identify other proteins that bind to or interact
with a NOD ("NOD-binding proteins" or "NOD-bp") and are involved in
NOD activity. Such NOD-bps can be activators or inhibitors of
signals by the NOD proteins or targets as, for example, downstream
elements of a NOD-mediated signaling pathway.
[0270] Modulators of NOD expression can also be identified. For
example, a cell or cell free mixture is contacted with a candidate
compound and the expression of a NOD mRNA or protein evaluated
relative to the level of expression of the NOD mRNA or protein in
the absence of the candidate compound. When expression of the NOD
mRNA or protein is greater in the presence of the candidate
compound than in its absence, the candidate compound is identified
as a stimulator of a NOD mRNA or protein expression. Alternatively,
when expression of NOD mRNA or protein is less (i.e., statistically
significantly less) in the presence of the candidate compound than
in its absence, the candidate compound is identified as an
inhibitor of NOD mRNA or protein expression. The level of NOD mRNA
or protein expression can be determined by methods described herein
for detecting NOD mRNA or protein.
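The comparison described above can be sketched, by way of example only, as follows. The permutation test, the 0.05 threshold, the function names, and the replicate expression values are all assumptions introduced for this illustration; the text above does not specify a statistical method. This sketch also applies the significance requirement to both stimulators and inhibitors.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_permutations + 1)


def classify_modulator(treated, control, alpha=0.05):
    """Label a candidate compound from expression with and without the compound."""
    if permutation_p_value(treated, control) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(control) else "inhibitor"


# Hypothetical relative NOD mRNA levels from five replicates each.
control = [1.00, 1.05, 0.95, 1.02, 0.98]
with_compound_x = [2.1, 1.9, 2.0, 2.2, 1.8]
with_compound_y = [0.40, 0.50, 0.45, 0.55, 0.50]

print("compound X:", classify_modulator(with_compound_x, control))
print("compound Y:", classify_modulator(with_compound_y, control))
```

A permutation test is used here only because it needs no distributional assumptions and no third-party library; any suitable statistical test could be substituted.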
[0271] A modulating agent can be identified using a cell-based or a
cell free assay, and the ability of the agent to modulate the
activity of a NOD protein can be confirmed in vivo, e.g., in an
animal such as an animal model for a disease (e.g., an animal with
inflammatory disease).
B. Therapeutic Agents
[0272] This invention further pertains to novel agents identified
by the above-described screening assays. Accordingly, it is within
the scope of this invention to further use an agent identified as
described herein (e.g., a NOD modulating agent or mimetic, a NOD
specific antibody, or a NOD-binding partner) in an appropriate
animal model (such as those described herein) to determine the
efficacy, toxicity, side effects, or mechanism of action, of
treatment with such an agent. Furthermore, as described above,
novel agents identified by the above-described screening assays can
be, e.g., used for treatments of inflammatory disease (e.g.,
including, but not limited to, psoriasis or Crohn's disease). In
some embodiments, the agents are NOD ligands or ligand analogs
(e.g., identified using the drug screening methods described
above).
IX. Pharmaceutical Compositions Containing NOD Nucleic Acid,
Peptides, and Analogs
[0273] The present invention further provides pharmaceutical
compositions which may comprise all or portions of NOD
polynucleotide sequences, NOD polypeptides, inhibitors or
antagonists of NOD bioactivity, including antibodies, alone or in
combination with at least one other agent, such as a stabilizing
compound, and may be administered in any sterile, biocompatible
pharmaceutical carrier, including, but not limited to, saline,
buffered saline, dextrose, and water.
[0274] The methods of the present invention find use in treating
diseases or altering physiological states characterized by mutant
NOD alleles (e.g., inflammatory disease). Peptides can be
administered to the patient intravenously in a pharmaceutically
acceptable carrier such as physiological saline. Standard methods
for intracellular delivery of peptides can be used (e.g., delivery
via liposome). Such methods are well known to those of ordinary
skill in the art. The formulations of this invention are useful for
parenteral administration, such as intravenous, subcutaneous,
intramuscular, and intraperitoneal. Therapeutic administration of a
polypeptide intracellularly can also be accomplished using gene
therapy as described above.
[0275] As is well known in the medical arts, dosages for any one
patient depend upon many factors, including the patient's size,
body surface area, age, the particular compound to be administered,
sex, time and route of administration, general health, and
interaction with other drugs being concurrently administered.
[0276] Accordingly, in some embodiments of the present invention,
NOD nucleotide and NOD amino acid sequences can be administered to
a patient alone, or in combination with other nucleotide sequences,
drugs or hormones or in pharmaceutical compositions where it is
mixed with excipient(s) or other pharmaceutically acceptable
carriers. In one embodiment of the present invention, the
pharmaceutically acceptable carrier is pharmaceutically inert. In
another embodiment of the present invention, NOD polynucleotide
sequences or NOD amino acid sequences may be administered alone to
individuals subject to or suffering from a disease.
[0277] Depending on the condition being treated, these
pharmaceutical compositions may be formulated and administered
systemically or locally. Techniques for formulation and
administration may be found in the latest edition of "Remington's
Pharmaceutical Sciences" (Mack Publishing Co, Easton Pa.). Suitable
routes may, for example, include oral or transmucosal
administration; as well as parenteral delivery, including
intramuscular, subcutaneous, intramedullary, intrathecal,
intraventricular, intravenous, intraperitoneal, or intranasal
administration.
[0278] For injection, the pharmaceutical compositions of the
invention may be formulated in aqueous solutions, preferably in
physiologically compatible buffers such as Hanks' solution,
Ringer's solution, or physiologically buffered saline. For tissue
or cellular administration, penetrants appropriate to the
particular barrier to be permeated are used in the formulation.
Such penetrants are generally known in the art.
[0279] In other embodiments, the pharmaceutical compositions of the
present invention can be formulated using pharmaceutically
acceptable carriers well known in the art in dosages suitable for
oral administration. Such carriers enable the pharmaceutical
compositions to be formulated as tablets, pills, capsules, liquids,
gels, syrups, slurries, suspensions and the like, for oral or nasal
ingestion by a patient to be treated.
[0280] Pharmaceutical compositions suitable for use in the present
invention include compositions wherein the active ingredients are
contained in an effective amount to achieve the intended purpose.
For example, an effective amount of NOD may be that amount that
suppresses apoptosis. Determination of effective amounts is well
within the capability of those skilled in the art, especially in
light of the disclosure provided herein.
[0281] In addition to the active ingredients these pharmaceutical
compositions may contain suitable pharmaceutically acceptable
carriers comprising excipients and auxiliaries that facilitate
processing of the active compounds into preparations that can be
used pharmaceutically. The preparations formulated for oral
administration may be in the form of tablets, dragees, capsules, or
solutions.
[0282] The pharmaceutical compositions of the present invention may
be manufactured in a manner that is itself known (e.g., by means of
conventional mixing, dissolving, granulating, dragee-making,
levigating, emulsifying, encapsulating, entrapping or lyophilizing
processes).
[0283] Pharmaceutical formulations for parenteral administration
include aqueous solutions of the active compounds in water-soluble
form. Additionally, suspensions of the active compounds may be
prepared as appropriate oily injection suspensions. Suitable
lipophilic solvents or vehicles include fatty oils such as sesame
oil, or synthetic fatty acid esters, such as ethyl oleate or
triglycerides, or liposomes. Aqueous injection suspensions may
contain substances that increase the viscosity of the suspension,
such as sodium carboxymethyl cellulose, sorbitol, or dextran.
Optionally, the suspension may also contain suitable stabilizers or
agents that increase the solubility of the compounds to allow for
the preparation of highly concentrated solutions.
[0284] Pharmaceutical preparations for oral use can be obtained by
combining the active compounds with solid excipient, optionally
grinding a resulting mixture, and processing the mixture of
granules, after adding suitable auxiliaries, if desired, to obtain
tablets or dragee cores. Suitable excipients are carbohydrate or
protein fillers such as sugars, including lactose, sucrose,
mannitol, or sorbitol; starch from corn, wheat, rice, potato, etc;
cellulose such as methyl cellulose, hydroxypropylmethyl-cellulose,
or sodium carboxymethylcellulose; and gums including arabic and
tragacanth; and proteins such as gelatin and collagen. If desired,
disintegrating or solubilizing agents may be added, such as the
cross-linked polyvinyl pyrrolidone, agar, alginic acid or a salt
thereof such as sodium alginate.
[0285] Dragee cores are provided with suitable coatings such as
concentrated sugar solutions, which may also contain gum arabic,
talc, polyvinylpyrrolidone, carbopol gel, polyethylene glycol,
and/or titanium dioxide, lacquer solutions, and suitable organic
solvents or solvent mixtures. Dyestuffs or pigments may be added to
the tablets or dragee coatings for product identification or to
characterize the quantity of active compound (i.e., dosage).
[0286] Pharmaceutical preparations that can be used orally include
push-fit capsules made of gelatin, as well as soft, sealed capsules
made of gelatin and a coating such as glycerol or sorbitol. The
push-fit capsules can contain the active ingredients mixed with a
filler or binders such as lactose or starches, lubricants such as
talc or magnesium stearate, and, optionally, stabilizers. In soft
capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid
polyethylene glycol with or without stabilizers.
[0287] Compositions comprising a compound of the invention
formulated in a pharmaceutically acceptable carrier may be prepared,
placed in an appropriate container, and labeled for treatment of an
indicated condition. For polynucleotide or amino acid sequences of
NOD, conditions indicated on the label may include treatment of
conditions related to inflammatory diseases.
[0288] The pharmaceutical composition may be provided as a salt and
can be formed with many acids, including but not limited to
hydrochloric, sulfuric, acetic, lactic, tartaric, malic, succinic,
etc. Salts tend to be more soluble in aqueous or other protonic
solvents than are the corresponding free base forms. In other
cases, the preferred preparation may be a lyophilized powder in 1
mM-50 mM histidine, 0.1%-2% sucrose, 2%-7% mannitol at a pH range
of 4.5 to 5.5 that is combined with buffer prior to use.
[0289] For any compound used in the method of the invention, the
therapeutically effective dose can be estimated initially from cell
culture assays. Then, preferably, dosage can be formulated in
animal models (particularly murine models) to achieve a desirable
circulating concentration range that adjusts NOD levels.
[0290] A therapeutically effective dose refers to that amount of
NOD that ameliorates symptoms of the disease state. Toxicity and
therapeutic efficacy of such compounds can be determined by
standard pharmaceutical procedures in cell cultures or experimental
animals, e.g., for determining the LD.sub.50 (the dose lethal to
50% of the population) and the ED.sub.50 (the dose therapeutically
effective in 50% of the population). The dose ratio between toxic
and therapeutic effects is the therapeutic index, and it can be
expressed as the ratio LD.sub.50/ED.sub.50. Compounds that exhibit
large therapeutic indices are preferred. The data obtained from
these cell culture assays and additional animal studies can be used
in formulating a range of dosage for human use. The dosage of such
compounds lies preferably within a range of circulating
concentrations that include the ED.sub.50 with little or no
toxicity. The dosage varies within this range depending upon the
dosage form employed, sensitivity of the patient, and the route of
administration.
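By way of illustration only, the therapeutic index described above can be computed as in the following minimal Python sketch. The function name `therapeutic_index` and the numerical values are hypothetical and are not taken from any experiment described herein.

```python
def therapeutic_index(ld50, ed50):
    """Return the ratio LD50/ED50.

    Both arguments must be positive and expressed in the same units
    (for example, mg per kg of body weight).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must both be positive")
    return ld50 / ed50


# Hypothetical values, chosen only to show the arithmetic.
print(therapeutic_index(ld50=500.0, ed50=5.0))  # prints 100.0
```

A larger ratio indicates a wider margin between the dose that is lethal in 50% of the population and the dose that is effective in 50% of the population, which is why compounds with large therapeutic indices are preferred.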
[0291] The exact dosage is chosen by the individual physician in
view of the patient to be treated. Dosage and administration are
adjusted to provide sufficient levels of the active moiety or to
maintain the desired effect. Additional factors which may be taken
into account include the severity of the disease state; age,
weight, and gender of the patient; diet, time and frequency of
administration, drug combination(s), reaction sensitivities, and
tolerance/response to therapy. Long-acting pharmaceutical
compositions might be administered every 3 to 4 days, every week,
or once every two weeks, depending on half-life and clearance rate
of the particular formulation.
[0292] Normal dosage amounts may vary from 0.1 to 100,000
micrograms, up to a total dose of about 1 g, depending upon the
route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature (See, U.S. Pat.
Nos. 4,657,760; 5,206,344; or 5,225,212, all of which are herein
incorporated by reference). Those skilled in the art will employ
different formulations for NOD than for the inhibitors of NOD.
Administration to the bone marrow may necessitate delivery in a
manner different from intravenous injections.
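As a purely illustrative aid to the dosage figures recited above, the following Python sketch converts micrograms to grams and checks a candidate amount against the stated range. The constant and function names are hypothetical, and reading the 1 g figure as a cap on the total amount administered is an assumption.

```python
MICROGRAMS_PER_GRAM = 1_000_000

DOSE_MIN_UG = 0.1        # lower bound of the stated range, in micrograms
DOSE_MAX_UG = 100_000.0  # upper bound of the stated range, in micrograms
TOTAL_DOSE_CAP_G = 1.0   # "about 1 g" total dose (interpretation assumed)


def micrograms_to_grams(amount_ug):
    """Convert an amount in micrograms to grams."""
    return amount_ug / MICROGRAMS_PER_GRAM


def in_stated_range(amount_ug):
    """True if a single amount lies within 0.1 to 100,000 micrograms."""
    return DOSE_MIN_UG <= amount_ug <= DOSE_MAX_UG


def total_within_cap(amounts_ug):
    """True if the summed amounts do not exceed the assumed 1 g total."""
    return micrograms_to_grams(sum(amounts_ug)) <= TOTAL_DOSE_CAP_G
```

Under these assumptions the upper end of the range, 100,000 micrograms, corresponds to 0.1 g, so ten such amounts would reach the 1 g total.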
[0293] All publications and patents mentioned in the above
specification are herein incorporated by reference. Various
modifications and variations of the described method and system of
the invention will be apparent to those skilled in the art without
departing from the scope and spirit of the invention. Although the
invention has been described in connection with specific preferred
embodiments, it should be understood that the invention as claimed
should not be unduly limited to such specific embodiments. Indeed,
various modifications of the described modes for carrying out the
invention that are obvious to those skilled in the relevant fields
are intended to be within the scope of the following claims.
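The sequence listing that follows is printed as ten-base groups interleaved with running position counters (60, 120, and so on). As a convenience for readers who wish to recover a contiguous sequence from such text, a minimal Python sketch is given here. The function names are hypothetical, and the sketch assumes that the record header (sequence number, length, molecule type, and organism) has already been removed from the fragment.

```python
import re

# Lower-case nucleotide groups as printed in the listing (n = unknown base).
BASE_GROUP = re.compile(r"[acgtn]+")


def parse_fragment(text):
    """Split a flattened listing fragment into (sequence, counters).

    Purely numeric tokens are treated as running position counters;
    every other token must be a group of nucleotide letters.
    """
    groups = []
    counters = []
    for token in text.split():
        if token.isdigit():
            counters.append(int(token))
        elif BASE_GROUP.fullmatch(token):
            groups.append(token)
        else:
            raise ValueError(f"unexpected token: {token!r}")
    return "".join(groups), counters


def length_matches_last_counter(sequence, counters):
    """True if the final counter equals the number of bases recovered."""
    return bool(counters) and counters[-1] == len(sequence)


# First printed row of SEQ ID NO: 1, copied from the listing.
fragment = (
    "tgagaactca ggctggcaca gggattccca "
    "gggcatctac caccacgcag ctggagcagg 60"
)
sequence, counters = parse_fragment(fragment)
print(len(sequence), counters)  # prints: 60 [60]
```

Comparing the final counter with the recovered length gives a quick check that no group was lost or duplicated when the text was copied.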
Sequence CWU 1
1
22 1 3414 DNA Homo sapiens 1 tgagaactca ggctggcaca gggattccca
gggcatctac caccacgcag ctggagcagg 60 gctgagccca ggagcatgga
gatggacgcc cccaggcccc ccagtcttgc tgtccctgga 120 gcagcatcga
ggcccgggag gctgctggat ggggggcacg gcaggcagca ggttcaggcc 180
ctctcttcac agctcctgga ggtgatcccc gactccatga ggaagcaaga ggtgcggacg
240 ggcagggagg ccggccaggg ccacggtacg ggctccccag ccgagcaggt
gaaagccctc 300 atggatctgc tggctgggaa gggcagtcaa ggctcccacg
ccccgcaggc cctggatagg 360 acaccggatg ccccgctggg gccctgcagc
aatgactcaa ggatacagag gcaccgcaag 420 gccctgctga gcaaggtggg
aggtggcccg gagctgggcg gaccctggca caggctggcc 480 tccctcctgc
tggtggaggg cctgacggac ctgcagctga gggaacacga cttcacacag 540
gtggaggcca cccgcggggg cgggcacccc gccaggaccg tcgccctgga ccggctcttc
600 ctgcctctct cccgggtgtc tgtcccaccc cgggtctcca tcactatcgg
ggtggccggc 660 atgggcaaga ccaccctggt gaggcacttc gtccgcctct
gggcccatgg gcaggtcggc 720 aaggacttct cgctggtgct gcctctgacc
ttccgggatc tcaacaccca cgagaagctg 780 tgtgccgacc gactcatctg
ctcggtcttc ccgcacgtcg gggagcccag cctggcggtg 840 gcagtcccag
ccagggccct cctgatcctg gacggcttgg atgagtgcag gacgcctctg 900
gacttctcca acaccgtggc ctgcacggac ccaaagaagg agatcccggt ggaccacctg
960 atcaccaaca tcatccgtgg caacctcttt ccggaagttt ccatctggat
cacctcccgt 1020 cccagtgcat ctggccagat cccagggggc ctggtggacc
ggatgacgga gatccggggc 1080 tttaacgagg aggagatcaa ggtgtgtttg
gagcagatgt tccccgagga ccaggccctt 1140 ctgggctgga tgctgagcca
agtgcaggct gacagggccc tgtacctgat gtgcaccgtc 1200 ccagccttct
gcaggctcac ggggatggcg ctaggccacc tgtggcgcag caggacgggg 1260
ccccaggatg cagagctgtg gcccccgagg accctgtgcg agctctactc atggtacttt
1320 aggatggccc tcagcgggga ggggcaggag aagggcaagg caagccctcg
catcgagcag 1380 gtggcccatg gtggccgcaa gatggtgggg acattgggcc
gtctggcctt ccatgggctg 1440 ctcaagaaga aatacgtgtt ttacgagcaa
gacatgaagg cgtttggtgt agacctcgct 1500 ctgctgcagg gcgccccgtg
cagctgcttc ctgcagagag aggagacgtt ggcatcgtca 1560 gtggcctact
gcttcaccca cctgtccctg caggagtttg tggcagccgc gtattactat 1620
ggcgcatcca ggagggccat cttcgacctc ttcactgaga gcggcgtatc ctggcccagg
1680 ctgggcttcc tcacgcattt caggagcgca gcccagcggg ccatgcaggc
agaggacggg 1740 aggctggacg tgttcctgcg cttcctctcc ggcctcttgt
ctccgagggt caatgccctc 1800 ctggccggct ccctgctggc ccaaggcgag
caccaggcct accggaccca ggtggctgag 1860 ctcctgcagg gctgcctgcg
ccccgatgcc gcagtctgtg cacgggccat caacgtgttg 1920 cactgcctgc
atgagctgca gcacaccgag ctggcccgca gcgtggagga ggccatggag 1980
agcggggccc tggccaggct gactggtccc gcgcaccgcg ctgccctggc ctacctcctg
2040 caggtgtccg acgcctgtgc ccaggaggcc aacctgtccc tgagcctcag
ccagggcgtc 2100 cttcagagcc tgctgcccca gctgctctac tgccggaagc
tcaggctgga caccaaccag 2160 ttccaggacc ccgtgatgga gctgctgggc
agcgtgctga gtgggaagga ctgtcgcatt 2220 cagaagatca gcttggcgga
gaaccagatc agtaacaaag gggccaaagc tctggccaga 2280 tccctcttgg
tcaacagaag tctgacctct ctggacctcc gcggtaactc cattggacca 2340
caaggggcca aggcgctggc agacgctttg aagatcaacc gcaccctgac ctccctgagc
2400 ctccagggca acaccgttag ggatgatggt gccaggtcca tggctgaggc
cttggcctcc 2460 aaccggaccc tctccatgct gcacctgcag aagaacagca
tcgggcccat gggagcccag 2520 cggatggcag atgccttgaa gcagaacagg
agtctgaaag agctcatgtt ctccagtaat 2580 agtattggtg atggaggtgc
caaggccctg gctgaggccc tgaaggtgaa ccagggcctg 2640 gagagcctgg
acctgcagag caattccatc agtgacgcag gagtggcagc actgatgggg 2700
gccctctgca ccaaccagac cctcctcagc ctcagccttc gagaaaactc catcagtccc
2760 gagggagccc aggccatcgc tcatgccctc tgcgccaaca gcaccctgaa
gaacctggac 2820 ctgacagcca acctcctcca cgaccagggt gcccgggcca
tcgcagtggc agtgagagaa 2880 aaccgcaccc tcacctccct tcacctgcag
tggaacttca tccaggccgg cgctgcccag 2940 gccctgggac aagcactaca
gctcaacagg agcctcacca gcttagattt acaggagaac 3000 gccatcgggg
atgacggagc gtgtgcggtg gcccgtgcac tgaaggtcaa cacagccctc 3060
actgctctct atctccaggt ggcctcaatt ggtgcttcag gcgcccaggt gctaggggaa
3120 gccttggctg tgaacagaac cttggagatt ctcgacttaa gaggaaatgc
cattggggtg 3180 gctggagcca aagccctggc aaatgctctg aaggtaaact
caagtctccg gagactcaat 3240 cttcaagaga attctctggg gatggacggg
gcgatatgca ttgccacagc actgtctgga 3300 aaccacaggc tccagcatat
caatctccag ggaaaccaca ttggggactc cggggccagg 3360 atgatctcag
aggccatcaa gacaaatgct cccacgtgca ctgttgaaat gtga 3414 2 3521 DNA
Homo sapiens 2 aggcctgaat atttggacaa gatggcagat tcatcatcat
cttctttctt tcctgatttt 60 gggctgctat tgtatttgga ggagctaaac
aaagaggaat taaatacatt caagttattc 120 ctaaaggaga ccatggaacc
tgagcatggc ctgacaccct ggaatgaagt gaagaaggcc 180 aggcgggagg
acctggccaa tttgatgaag aaatattatc caggagagaa agcctggagt 240
gtgtctctca aaatctttgg caagatgaac ctgaaggatc tgtgtgagag agcgaaagaa
300 gagatcaact ggtcggccca gactatagga ccagatgatg ccaaggctgg
agagacacaa 360 gaagatcagg aggcagtgct gggtgatgga acagaataca
gaaatagaat aaaggaaaaa 420 ttttgcatca cttgggacaa gaagtctttg
gctggaaagc ctgaagattt ccatcatgga 480 attgcagaga aagatagaaa
actgttggaa cacttgttcg atgtggatgt caaaaccggt 540 gcacagccac
agatcgtggt gcttcaggga gctgctggag ttgggaaaac aaccttggtg 600
agaaaggcaa tgttagattg ggcagagggc agtctctacc agcagaggtt taagtatgtt
660 ttttatctca atgggagaga aattaaccag ctgaaagaga gaagctttgc
tcaattgata 720 tcaaaggact ggcccagcac agaaggcccc attgaagaaa
tcatgtacca gccaagtagc 780 ctcttgttta ttattgacag tttcgatgaa
ctgaactttg cctttgaaga acctgagttt 840 gcactgtgcg aagactggac
ccaagaacac ccagtgtcct tcctcatgag tagtttgctg 900 aggaaagtga
tgctccctga ggcatcctta ttggtgacaa caagactcac aacttctaag 960
agactaaagc agttgttgaa gaatcaccat tatgtagagc tactaggaat gtctgaggat
1020 gcaagagagg agtatattta ccagtttttt gaagataaga ggtgggccat
gaaagtattc 1080 agttcactaa aaagcaatga gatgctgttt agcatgtgcc
aagtccccct agtgtgctgg 1140 gccgcttgta cttgtctgaa gcagcaaatg
gagaagggtg gtgatgtcac attgacctgc 1200 caaacaacca cagctctgtt
tacctgctat atttctagct tgttcacacc agtagatgga 1260 ggctctccta
gtctacccaa ccaagcccag ctgagaagac tgtgccaagt cgctgccaaa 1320
ggaatatgga ctatgactta cgtgttttac agagaaaatc tcagaaggct tgggttaact
1380 caatctgatg tctctagttt tatggacagc aatattattc agaaggacgc
agagtatgaa 1440 aactgctatg tgttcaccca ccttcatgtt caggagtttt
ttgcagctat gttctatatg 1500 ttgaaaggca gttgggaagc tgggaaccct
tcctgccagc cttttgaaga tttgaagtca 1560 ttacttcaaa gcacaagtta
taaagacccc catttgacac agatgaagtg ctttttgttt 1620 ggccttttga
atgaagatcg agtaaaacaa ctggagagga cttttaactg taaaatgtca 1680
ctgaagataa aatcaaagtt acttcagtgt atggaagtat taggaaacag tgactattct
1740 ccatcacagc tgggatttct ggagttgttt cactgtctgt atgagactca
agataaagcg 1800 tttataagcc aggcaatgag atgtttccca aaggttgcca
ttaatatttg tgagaaaata 1860 catttgcttg tatcttcttt ctgccttaag
cactgccggt gtttgcggac catcaggctg 1920 tctgtaactg tggtatttga
gaagaagata ttaaaaacaa gcctcccaac taacacttgg 1980 gatggtgatc
gcattactca ctgttggcaa gatctctgtt ctgtgcttca tacaaatgaa 2040
cacttgagag aattggacct gtaccatagc aaccttgata aatcagcaat gaatatcctg
2100 catcatgaac taaggcaccc aaactgtaaa ctacaaaagc tactgttgaa
atttatcact 2160 ttccctgatg gttgtcagga tatctctact tctttgattc
ataacaagaa tctgatgcat 2220 cttgacctaa aagggagtga tataggggat
aatggagtaa agtcattgtg tgaggccttg 2280 aaacacccag agtgtaaact
acagactctc aggctggaat cttgcaacct aactgtattt 2340 tgttgtctaa
atatatctaa tgctctcatc agaagccaga gcctgatatt tctgaatctg 2400
tcaaccaata atctgttgga tgatggagtg cagcttttgt gtgaggcctt aagacatcca
2460 aagtgttatc tagagagact gtccttagaa agctgtggtc tcacagaggc
tggctgtgag 2520 tatctttctt tggctctcat cagcaataaa agactgacac
atttgtgctt ggcagacaat 2580 gtcttgggtg atggtggagt aaagcttatg
agtgatgccc tgcaacatgc acaatgtact 2640 ctgaagagcc ttgtgctgag
gcgttgccat ttcacttcac ttagcagtga atatctgtca 2700 acttctcttc
tacacaacaa gagcctgacg catctggatc taggatcaaa ctggctacaa 2760
gacaatggag tgaagcttct gtgtgatgtc tttcggcatc caagctgtaa tcttcaggac
2820 ttggaattga tgggctgtgt tctcactaat gcatgttgtc tggatctggc
ttctgttatt 2880 ttgaataacc caaacctgag gagcctggac cttgggaaca
acgatttgca ggatgatgga 2940 gtgaaaattc tgtgtgatgc tttgagatat
ccaaactgta acattcagag gctcgggttg 3000 gaatactgtg gtttgacatc
tctctgctgt caagatctct cctctgctct tatctgcaac 3060 aaaagactga
taaaaatgaa tctgacacag aataccttag gatatgaagg aattgtgaag 3120
ttatataaag tcttgaagtc tcctaagtgt aaactacaag ttctagggtt gtgcaaagag
3180 gcatttgatg aggaagccca gaagctgctg gaagctgtgg gagttagcaa
tccacactta 3240 atcattaagc cagattgtaa ctatcataat gaagaagatg
tgtcttggtg gtggtgtttc 3300 tgatttgaag aaactgacat tcctttaaaa
atataaatat aaatacatac atacatagat 3360 atatacccag acttgggtgc
ttagcttcag atactctatg cccagagata gtgcacttgg 3420 cagctgtcag
ataccattca tctacttctc tgtaaaatgt ctgttctact tcacacagtg 3480
gtcgagaggc taaaataaaa tgaaaagcat aaaactctct g 3521 3 3484 DNA Homo
sapiens 3 acacctcagt tcacaatcct ggggcgatat ggcagaatct tttttttcgg
attttggctt 60 gttgtggtat ctgaaggagc tcagaaagga agagttttgg
aaatttaagg agctcctcaa 120 acaacctttg gagaaatttg aactcaagcc
aatcccctgg gctgagctga agaaggcctc 180 caaagaagat gtagcaaagc
tgctggacaa acattaccca ggaaagcagg catgggaggt 240 aacactgaac
ctgtttctac agatcaatag gaaagatctc tggacaaagg ctcaggaaga 300
gatgagaaat aagctaaacc catacagaaa gcatatgaag gaaacatttc aactcatatg
360 ggagaaggaa acctgtcttc acgtccctga gcatttctac aaagaaacca
tgaaaaatga 420 gtataaagaa ttgaatgacg catatactgc tgcggctaga
cgacacactg tggtcctgga 480 aggtcctgat ggaattggaa aaacaaccct
tttaagaaaa gtgatgttgg actgggcaga 540 gggaaactta tggaaggaca
ggttcacatt tgtgtttttc ctcaatgtct gtgaaatgaa 600 cggtatcgca
gagaccagct tactggagct cctctctagg gactggccgg agtcttcaga 660
gaagatcgaa gacatttttt cccagccaga gagaattctg ttcatcatgg atggctttga
720 gcaactgaag tttaacttac aacttaaggc tgacttgagc gatgattgga
ggcagcggca 780 gccaatgcca attatcctga gcagtttgtt gcaaaaaaag
atgcttccag aatcctctct 840 ccttattgca ttaggaaaac tggctatgca
aaaacactat tttatgttgc ggcatccaaa 900 actcataaag ctcttaggat
tcagtgaatc tgaaaagaag tcgtatttct cctacttctt 960 tggtgagaag
agcaaagccc tgaaagtctt caattttgtg agagataatg ggccgctgtt 1020
tatcttgtgc cataatccct ttacgtgctg gttggtctgt acttgtgtga aacagaggct
1080 agagagggga gaagaccttg aaataaactc ccaaaacacc acctatttat
atgcatcctt 1140 tttaacaact gtattcaaag caggaagtca gagttttcca
cctaaggtga acagagcccg 1200 actaaaaagc ctgtgtgctt tggctgcaga
gggaatttgg acatatacat ttgtattttc 1260 ccatggggat ctccggagga
atgggttatc tgagtctgag ggcgtgatgt gggtgggtat 1320 gagactcctc
caaaggagag gggactgttt tgccttcatg catctgtgta tccaagagtt 1380
ttgtgccgcc atgttttatt tgctcaaacg acccaaagac gatcctaacc cggccattgg
1440 aagcataacc cagcttgtaa gagcaagtgt ggttcagcct caaaccctct
tgacccaggt 1500 ggggatattc atgtttggaa tttcaacaga agaaatcgtc
agcatgctgg agacctcctt 1560 tggttttcca ctgtcaaaag acctaaagca
ggaaataacc caatgccttg aaagtttaag 1620 tcaatgtgaa gctgataggg
aagccatagc tttccaggaa ctattcattg gtttgtttga 1680 aactcaggaa
aaagaatttg taaccaaagt gatgaatttc tttgaagaag ttttcattta 1740
tattggtaac atagaacatt tggtaatagc ttcattctgc ctgaagcatt gtcaacattt
1800 aacgacactt cgcatgtgtg tggagaatat ctttccagat gactcaggat
gcatctcaga 1860 ttacaatgag aagctcgtct actggcggga gctttgctca
atgttcatta ccaacaagaa 1920 cttccagatt ttagacatgg aaaataccag
ccttgatgat ccctccctgg cgattctttg 1980 caaagcgctg gctcagcctg
tttgtaaact ccgaaaactc atatttactt ctgtgtactt 2040 tggacatgat
tcagaattat ttaaggcagt tcttcacaac cctcatctga aacttctgag 2100
cctgtacggc actagcctct cccagtctga catcagacac ctgtgtgaga cgctgaaaca
2160 tccaatgtgc aagatagaag agctgatact gggaaagtgt gacatctcca
gtgaagtttg 2220 tgaagacatc gcctccgtcc tggcctgcaa cagcaagctg
aaacacctct ccttggtaga 2280 aaatcccttg agggacgaag gaatgacgtt
gctgtgtgaa gccctgaagc actcacactg 2340 tgccctggag aggctgatgt
tgatgtactg ctgtctcacc tctgtctcct gtgactccat 2400 ttccgaagtc
ctcttgtgca gtaagtccct gtccctcctc gatctgggct caaatgccct 2460
ggaagataat ggagtggcat ctctgtgtgc agcgctgaag cacccaggct gcagcatacg
2520 ggagctgtgg ttgatgggct gtttccttac ttccgattcc tgtaaggaca
ttgctgctgt 2580 tcttatttgc aatgggaaac tgaagaccct gaaacttggg
cataatgaaa taggagacac 2640 tggtgtcaga cagttatgtg cagctttgca
gcatcctcac tgtaaattag agtgtctcgg 2700 gctgcaaacg tgtccgatca
cccgtgcctg ctgcgacgac atcgccgcag cactcatcgc 2760 ctgcaaaaca
ctgaggagcc tgaacctcga ctggattgcc ttggatgctg atgcagtggt 2820
ggtgctgtgt gaggcattga gccacccgga ctgtgccctg cagatgctgg ggctgcacaa
2880 atctggcttt gatgaagaaa ctcagaagat cctgatgtct gtggaagaaa
aaattcccca 2940 tctgaccatt tcacatggac cttggattga cgaggaatac
aagatcaggg gtgtgctcct 3000 ctgatgggga acaccctgaa gtagtcgtct
cacaaaggct ttccttggcc acagtgggac 3060 cttcacctgg cacctctatc
ctgtaattgc acatcatggc agcagggctg tgatttcaga 3120 ggtactccct
aagtgttcta gcaatatgat tatggagtgt gattcagtgt acatgctgat 3180
tgtctttgcc tcggtcctat atccccttgt ctttagaaat cccatcctgc cttgtgatat
3240 ttagaagcac aagtacgtta aacaagtgct aaacgctctg gaaagcatgg
ctttattttc 3300 ttaatggatg tcttggtgtg taggagcatg catttgtagg
caccacaatc cggatacttc 3360 tgacacagaa gtgatgctag aatgtgtcta
tagattgtat tgctagcatc cagactttct 3420 agtttgtcca gatttcgatt
tgatcaattt tcttgtccaa taaaaaagca tttccaaatc 3480 tcta 3484 4 1974
DNA Homo sapiens 4 atcaccatgg ccatggccaa ggccagaaag ccccgggagg
cattgctctg ggccttgagt 60 gaccttgagg agaacgattt caagaagtta
aagttctact tacgggatat gaccctgtct 120 gagggccagc ccccactggc
cagaggggag ttggagggcc tgattccggt ggacctggca 180 gaattactga
tttcaaagta tggagaaaag gaggctgtga aagttgtcct caagggcttg 240
aaggtcatga acctgttgga acttgtggac cagctcagcc atatttgtct gcatgattac
300 agagaagtat accgagagca tgtgcgctgc ctagaggaat ggcaggaagc
aggagtcaat 360 ggcagataca accaggtgct cctggtggcc aagcccagct
cagagagccc agaatcactt 420 gcctgcccct tcccggagca ggagctggag
tctgtcacgg tggaggctct atttgattca 480 ggggaaaagc cctcactggc
cccatcctta gttgtgctac aggggtcggc tggcactgga 540 aagacaactc
tcgccagaaa aatggtgttg gactgggcca ccggtactct gtacccaggc 600
cggtttgatt atgtctttta tgtaagctgc aaagaagtgg tcctgctgct ggagagcaaa
660 ctggagcagc tccttttctg gtgctgcggg gacaatcaag cccctgtcac
agagattctg 720 aggcagccag agcggctcct gttcatcctg gatggctttg
atgagctgca gaggcccttt 780 gaagaaaagt tgaagaagag gggtttgagt
cccaaggaga gcctgctgca ccttctaatt 840 aggagacata cactccccac
gtgctccctt ctcatcacca cccggcccct ggctttgagg 900 aatctggagc
ccttgctgaa acaagcacgt catgtccata tcctaggctt ctctgaggag 960
gagagggcga ggtacttcag ctcctatttc acggatgaga agcaagctga ccgtgccttc
1020 gacattgtac agaaaaatga cattctctac aaagcgtgtc aggttccagg
catttgctgg 1080 gtggtctgct cctggctgca ggggcagatg gagagaggca
aagttgtctt agagacacct 1140 agaaacagca ctgacatctt catggcttac
gtctccacct ttctgccgcc cgatgatgat 1200 gggggctgct ccgagctttc
ccggcacagg gtcctgagga gtctgtgctc cctagcagct 1260 gaagggattc
agcaccagag gttcctattt gaagaagctg agctcaggaa acataattta 1320
gatggcccca ggcttgccgc tttcctgagt agtaacgact accaattggg acttgccatc
1380 aagaagttct acagcttccg ccacatcagc ttccaggact tttttcatgc
catgtcttac 1440 ctggtgaaag aggaccaaag ccggctgggg aaggagtccc
gcagagaagt gcaaaggctg 1500 ctggaggtaa aggagcagga agggaatgat
gagatgaccc tcactatgca gtttttactg 1560 gacatctcga aaaaagacag
cttctcgaac ttggagctca agttctgctt cagaatttct 1620 ccctgtttag
cgcaggatct gaagcatttt aaagaacaga tggaatctat gaagcacaac 1680
aggacctggg atttggaatt ctccctgtat gaagctaaaa taaagaatct ggtaaaaggt
1740 attcagatga acaatgtatc attcaagata aaacattcaa atgaaaagaa
atcacagagc 1800 cagaatttat tttctgtcaa aagcagcttg agtcatggac
ctaaggagga gcaaaaatgt 1860 ccttctgtcc atggacagaa ggagggcaaa
gataatatag caggaacaca aaaggaagct 1920 tctactggaa aaggcagagg
gacagaggaa acaccaaaaa atacttacat ataa 1974 5 3525 DNA Homo sapiens
5 gctctgacct tctttcccag gatgaggtgg ggccaccatt tgcccagggc ctcttggggc
60 tctggtttta gaagagcact ccagcgacca gatgatcgta tccccttcct
gatccactgg 120 agttggcccc ttcaagggga gcgtcccttt gggcccccta
gggcctttat acgccaccac 180 ggaagctcgg tagatagcgc tcccccatcc
gggaggcatg gacggctgtt ccccagcgcc 240 tctgcaactg aagctataca
gcggcaccgc cggaacctgg ctgagtggtt cagccggctg 300 cccagggagg
agcgccagtt tggcccaacc tttgccctag acacggtcca cgttgaccct 360
gtgatccgcg agagtacccc tgatgagcta cttcgcccac ccgcggagct ggccctggag
420 catcagccac cccaggccgg gctcccccca ctggccttgt ctcagctctt
taacccggat 480 gcctgtgggc gccgggtgca gacagtggtg ctgtatggga
cagtgggcac aggcaagagc 540 acgctggtgc gcaagatggt tctggactgg
tgttatgggc ggctgccggc cttcgagctg 600 ctcatcccct tctcctgtga
ggacctgtca tccctgggcc ctgccccagc ctccctgtgc 660 caacttgtgg
cccagcgcta cacgcccctg aaggaggttc tgcccctgat ggctgctgct 720
gggtcccacc tcctctttgt gctccatggc ttagagcatc tcaacctcga cttccggctg
780 gcaggcacgg gactttgtag tgacccggag gaaccgcagg aaccagctgc
tatcatcgtc 840 aacctgctgc gcaaatacat gctgcctcag gccagcattc
tggtgaccac tcggccctct 900 gccattggcc gtatccccag caagtacgtg
ggccgctatg gtgagatctg cggtttctct 960 gataccaacc tgcagaagct
ctacttccag ctccgcctca accagccgta ctgcgggtat 1020 gccgttggcg
gttcaggtgt ctctgccaca ccagctcagc gtgaccacct ggtgcagatg 1080
ctctcccgga acctggaggg gcaccaccag atagccgctg cctgcttcct gccgtcctat
1140 tgctggctcg tttgtgccac cttgcacttc ctgcatgccc ccacgcctgc
tgggcagacc 1200 cttacaagca tctataccag cttcctgcgc ctcaacttca
gcggggaaac cctggacagc 1260 actgacccct ccaatttgtc cctgatggcc
tatgcagccc gaaccatggg caagttggcc 1320 tatgaggggg tgtcctcccg
caagacctac ttctctgaag aggatgtctg tggctgcctg 1380 gaggctggca
tcaggacgga ggaggagttt cagctgctgc acatcttccg tcgggatgcc 1440
ctgaggtttt tcctggcccc atgtgtggag ccagggcgtg caggcacctt cgtgttcacc
1500 gtgcccgcca tgcaggaata cctggctgcc ctctacattg tgctgggttt
gcgcaagacg 1560 accctgcaaa aggtgggcaa ggaagtggct gagctcgtgg
gccgtgttgg ggaggacgtc 1620 agcctggtac tgggcatcat ggccaagctg
ctgcctctgc gggctctgcc tctgctcttc 1680 aacctgatca aggtggttcc
acgagtgttt gggcgcatgg tgggtaaaag ccgggaggcg 1740 gtggctcagg
ccatggtgct ggagatgttt cgagaggagg actactacaa cgatgatgtt 1800
ctggaccaga tgggcgccag tatcctgggc gtggagggcc cccggcgcca cccagatgag
1860 ccccctgagg atgaagtctt cgagctcttc cccatgttca tgggggggct
tctctctgcc 1920 cacaaccgag ctgtgctagc tcagcttggc tgccccatca
agaacctgga tgccctggag 1980 aatgcccagg ccatcaagaa gaagctgggc
aagctgggcc ggcaggtgct gcccccatca 2040 gagctccttg accacctctt
cttccactat gagttccaga accagcgctt ctccgctgag 2100 gtgctcagct
ccctgcgtca gctcaacctg gcaggtgtgc gcatgacacc agtcaagtgc 2160
acagtggtgg cagctgtgct gggcagcgga aggcatgccc tggatgaggt gaacttggcc
2220 tcctgccagc tagatcctgc tgggctgcgc acactcctgc ctgtcttcct
gcgtgcccgg 2280 aagctgggct tgcaactcaa cagcctgggc cctgaggcct
gcaaggacct ccgagacctg 2340 ttgctgcatg accagtgcca aattaccaca
ctgcggctgt ccaacaaccc gctgacggag 2400 gcaggtgttg ccgtgctaat
ggaggggctg gcaggaaaca cctcagtgac
gcacctgtcc 2460 ctgctgcaca cgggccttgg ggacgaaggc ctggagctgc
tggctgccca gctggaccgc 2520 aaccggcagc tgcaggagct gaacgtggcg
tacaacggtg ctggtgacac agcggccctg 2580 gccctggcca gagctgcccg
ggagcaccct tccctggaac tgctacacct ctacttcaat 2640 gagctgagct
cagagggccg ccaggtcttg cgagacttgg ggggtgctgc tgaaggtggt 2700
gcccgggtgg tggtgtcact gacagagggg acggcggtgt cagaatactg gtcagtgatc
2760 ctcagtgaag tccagcggaa cctcaatagc tgggatcggg cccgggttca
gcgacacctt 2820 gagctcctac tgcgggatct ggaagatagc cggggtgcca
cccttaatcc ttggcgcaag 2880 gcccagctgc tgcgagtgga gggcgaggtc
agggccctcc tggagcagct gggaagctct 2940 ggaagctgag acactggcgg
caggcaccta gctatgtgac cactggccct aaaccttttc 3000 cctctgtggc
ctcctggctt gcactgctcc ctctagaaag attccttcag gtctggaggc 3060
agaggaatgg gcatagctga gccagttgcc ctcctagggc atgtttgacc aggactgagt
3120 ctggaatctc caagttaaag atggtgaatc aatgcttcgg gcttggagat
ggaacatgcc 3180 tcctctccat tcagctagaa ggaccaaagc atgtggcatt
tggatggcca gagtgccctg 3240 aagcaccact accaaccttg cctccccctc
ctctcaaaga gcctctgatt gtgtcaccaa 3300 ggggctcaca tcttatgtct
gccatgccag gggtgtcgcc atccagatgt gttggaagct 3360 tcccctcctg
ccttatgctc acctgtggac accgaggatg ccctcacatt ggtgctttct 3420
cctcatcctc atgccccctt tgccacaatg gtatgatggc ttggtagccc ctcgaggcag
3480 atgcacctga cttgctgcta ttaaaaagcc gtgtgccttc tacca 3525 6 3373
DNA Homo sapiens 6 ttcttcagcc ttaacctaag gtctcatact cggagcacta
tgacatcgcc ccagctagag 60 tggactctgc agacccttct ggagcagctg
aacgaggatg aattaaagag tttcaaatcc 120 cttttatggg cttttcccct
cgaagacgtg ctacagaaga ccccatggtc tgaggtggaa 180 gaggctgatg
gcgagaaact ggcagaaatt ctggtcaaca cctcctcaga aaattggata 240
aggaatgcga ctgtgaacat cttggaagag atgaatctca cggaattgtg taagatggca
300 aaggctgaga tgatggagga cggacaggtg caagaaatag ataatcctga
gctgggagat 360 gcagaagaag actcggagtt agcaaagcca ggtgaaaagg
aaggatggag aaattcaatg 420 gagaaacagt ctttggtctg gaagaacacc
ttttggcaag gagacattga caatttccat 480 gacgacgtca ctctgagaaa
ccaacggttc attccattct tgaatcccag aacacccagg 540 aagctaacac
cttacacggt ggtgctgcac ggccccgcag gcgtggggaa aaccacgctg 600
gccaaaaagt gtatgctgga ctggacagac tgcaacctca gcccgacgct cagatacgcg
660 ttctacctca gctgcaagga gctcagccgc atgggcccct gcagttttgc
agagctgatc 720 tccaaagact ggcctgaatt gcaggatgac attccaagca
tcctagccca agcacagaga 780 atcctgttcg tggtcgatgg ccttgatgag
ctgaaagtcc cacctggggc gctgatccag 840 gacatctgcg gggactggga
gaagaagaag ccggtgcccg tcctcctggg gagtttgctg 900 aagaggaaga
tgttacccag ggcagccttg ctggtcacca cgcggcccag ggcactgagg 960
gacctccagc tcctggcgca gcagccgatc tacgtaaggg tggagggctt cctggaggag
1020 gacaggaggg cctatttcct gagacacttt ggagacgagg accaagccat
gcgtgccttt 1080 gagctaatga ggagcaacgc ggccctgttc cagctgggct
cggcccccgc ggtgtgctgg 1140 attgtgtgca cgactctgaa gctgcagatg
gagaaggggg aggacccggt ccccacctgc 1200 ctcacccgca cggggctgtt
cctgcgtttc ctctgcagcc ggttcccgca gggcgcacag 1260 ctgcggggcg
cgctgcggac gctgagcctc ctggccgcgc agggcctgtg ggcgcagatg 1320
tccgtgttcc accgagagga cctggaaagg ctcggggtgc aggagtccga cctccgtctg
1380 ttcctggacg gagacatcct ccgccaggac agagtctcca aaggctgcta
ctccttcatc 1440 cacctcagct tccagcagtt tctcactgcc ctgttctacg
ccctggagaa ggaggagggg 1500 gaggacaggg acggccacgc ctgggacatc
ggggacgtac agaagctgct ttccggagaa 1560 gaaagactca agaaccccga
cctgattcaa gtaggacact tcttattcgg cctcgctaac 1620 gagaagagag
ccaaggagtt ggaggccact tttggctgcc ggatgtcacc ggacatcaaa 1680
caggaattgc tgcaatgcaa agcacatctt catgcaaata agcccttatc cgtgaccgac
1740 ctgaaggagg tcttgggctg cctgtatgag tctcaggagg aggagctggc
gaaggtggtg 1800 gtggccccgt tcaaggaaat ttctattcac ctgacaaata
cttctgaagt gatgcattgt 1860 tccttcagcc tgaagcattg tcaagacttg
cagaaactct cactgcaggt agcaaagggg 1920 gtgttcctgg agaattacat
ggattttgaa ctggacattg aatttgaaag ctcaaacagc 1980 aacctcaagt
ttctggaagt gaaacaaagc ttcctgagtg actcttctgt gcggattctt 2040
tgtgaccacg taacccgtag cacctgtcat ctgcagaaag tggagattaa aaacgtcacc
2100 cctgacaccg cgtaccggga cttctgtctt gctttcattg ggaagaagac
cctcacgcac 2160 ctgaccctgg cagggcacat cgagtgggaa cgcacgatga
tgctgatgct gtgtgacctg 2220 ctcagaaatc ataaatgcaa cctgcagtac
ctgaggttgg gaggtcactg tgccaccccg 2280 gagcagtggg ctgaattctt
ctatgtcctc aaagccaacc agtccctgaa gcacctgcgt 2340 ctctcagcca
atgtgctcct ggatgagggt gccatgttgc tgtacaagac catgacacgc 2400
ccaaaacact tcctgcagat gttgtcgttg gaaaactgtc gtcttacaga agccagttgc
2460 aaggaccttg ctgctgtctt ggttgtcagc aagaagctga cacacctgtg
cttggccaag 2520 aaccccattg gggatacagg ggtgaagttt ctgtgtgagg
gcttgagtta ccctgattgt 2580 aaactgcaga ccttggtgtt acagcaatgc
agcataacca agcttggctg tagatatctc 2640 tcagaggcgc tccaagaagc
ctgcagcctc acaaacctgg acttgagtat caaccagata 2700 gctcgtggat
tgtggattct ctgtcaggcg ttagagaatc caaactgtaa cctaaaacac 2760
ctacgcctct ggagctgctc cctcatgcct ttctattgtc agcatcttgg atctgctctc
2820 ctcagcaatc agaagcttga aactctggac ctgggccaga atcatttgtg
gaagagtggc 2880 ataattaagc tctttggggt tctaagacaa agaactggat
ccttgaagat actcaggttg 2940 aagacctatg aaactaattt ggaaatcaag
aagctgttgg aggaagtgaa agaaaagaat 3000 cccaagctga ctattgattg
caatgcttcc ggggcaacgg cacctccgtg ctgtgacttt 3060 ttttgctgag
cagcctggga tcgctctacg aattacacag gaagcgggat tcgggtctct 3120
aagatgtctt atgaatgcag gtcagagggt cacatgttaa cactagagtc tgtcgagagg
3180 taggatttga cactggtttt ctcactattt ttgggagatt ctgcacgagt
cacgcacccc 3240 cttcacatga cgctatgtac tttctcacag ggataataaa
gttagagcac tctcgttgca 3300 gctgcgttta ttgacatgct caggagcaaa
cctgcaataa acatggtact ctgtgcttcg 3360 tctaggagga agt 3373 7 3540
DNA Homo sapiens 7 tgagaaactg catgtgttgg gcaagatgaa cttttctgta
atcacctgcc ccaacggtgg 60 taccaaccaa gggcttctgc cttacctgat
ggccctggat cagtatcagc tggaggaatt 120 caagctttgc ttggaacccc
agcagctgat ggacttctgg tcggcccccc aggggcactt 180 cccgcgtatc
ccctgggcaa acttgagagc tgccgaccct ttgaatctgt cctttctttt 240
ggatgaacac ttcccaaaag gtcaggcatg gaaagtggtc ctcggcatct tccagacaat
300 gaatctgacc tcactgtgtg agaaagttag agccgagatg aaagagaatg
tgcagaccca 360 agagctgcaa gatccaaccc aggaagatct agagatgcta
gaagcagcag cagggaatat 420 gcagacccag ggatgccaag atccaaacca
agaagaacta gacgagctag aagaagaaac 480 agggaatgta caggcccagg
gatgccaaga tccaaaccaa gaagaaccag agatgctaga 540 ggaagcagac
cacagaagaa aatacagaga gaacatgaag gctgaactac tggagacatg 600
ggacaacatc agttggccta aagaccacgt atatatccgt aatacatcaa aggacgaaca
660 tgaggaactg cagcgcctac tggatcctaa taggactaga gcccaggccc
agacgatagt 720 cttggtgggg agggcagggg ttgggaagac caccttggca
atgcaggcta tgctgcactg 780 ggcaaatgga gttctctttc agcaaaggtt
ctcctatgtt ttctatctca gctgccataa 840 aataaggtac atgaaggaaa
ctacctttgc tgaattgatt tctttggatt ggcccgattt 900 tgatgccccc
attgaagagt tcatgtctca accagagaag ctcctgttta ttattgatgg 960
ctttgaggaa ataatcatat ctgagtcacg ctctgagagc ttggatgatg gctcgccatg
1020 tacagactgg taccaggagc tcccagtgac caaaatccta cacagcttgt
tgaagaaaga 1080 attggttccc ctggctacct tactgatcac gatcaagacc
tggtttgtga gagatcttaa 1140 ggcctcatta gtgaatccat gctttgtaca
aattacaggg ttcacagggg acgacctacg 1200 ggtatatttc atgagacact
ttgatgactc aagtgaagtt gagaaaatcc tgcagcagct 1260 aagaaaaaac
gaaactctct ttcattcctg cagtgccccc atggtgtgtt ggaccgtatg 1320
ttcctgtctg aagcagccga aggtgaggta ttacgatctc cagtcaatca ctcagactac
1380 caccagtctg tatgcctatt ttttctccaa cttgttctcc acagcagagg
tagatttggc 1440 agatgacagc tggccaggac aatggagggc cctctgcagt
ctggccatag aagggctgtg 1500 gtctatgaac ttcacgttta acaaagaaga
cactgagatc gagggcctgg aagtgccttt 1560 cattgattct ctctacgagt
tcaatattct tcaaaagatc aatgactgtg ggggttgcac 1620 tactttcacc
cacctaagtt tccaggagtt ttttgcagcc atgtcctttg tgctagagga 1680
acctagagaa ttccctcccc attccacaaa gccacaagag atgaagatgt tactgcaaca
1740 cgtcttgctt gacaaagaag cctactggac tccagtggtt ctgttcttct
ttggtctttt 1800 aaataaaaac atagcaagag aactggaaga tactttgcat
tgtaaaatat ctcccagggt 1860 aatggaggaa ttattaaagt ggggagaaga
gttaggtaag gctgaaagtg cctctctcca 1920 atttcacatt ctacgacttt
ttcactgcct acacgagtcc caggaggaag acttcacaaa 1980 gaagatgttg
ggtcgtatct ttgaagttga ccttaatatt ttggaggacg aagaactcca 2040
agcttcttca ttttgcctaa agcactgtaa aaggttaaat aagctaaggc tttctgttag
2100 cagtcacatc cttgaaaggg acttggaaat tctggagaca agcaagtttg
attccaggat 2160 gcacgcatgg aacagcattt gctctacgtt ggtcacaaat
gagaatctgc atgagctaga 2220 cctgagtaac agcaaacttc atgcttcctc
tgtgaagggt ctctgtcttg cactgaaaaa 2280 tccaagatgc aaagtccaga
aactgacgtg caaatcggta actcctgagt gggttctgca 2340 ggacctcatt
attgcccttc agggtaacag caagctgacc catctgaact tcagctctaa 2400
caagctggga atgactgtcc ccctgattct taaagctttg agacactcag cttgcaacct
2460 caagtatctg tgcctggaga aatgcaactt gtcggcagcc agctgtcagg
acctagcctt 2520 gtttctcacc agcatccaac acgtaactcg attgtgcctg
ggatttaatc ggctccaaga 2580 tgatggcata aagctattgt gtgcggccct
gactcacccc aagtgtgcct tagagagact 2640 ggagctctgg ttttgccagc
tggcagcacc cgcttgcaag cacttgtcag atgctctcct 2700 gcagaacagg
agcctgacac acctgaatct gagcaagaac agcctgagag acgagggagt 2760
caagttcctg tgtgaggcct tgggtcgccc agatggtaac ctgcagagcc tgaatttgtc
2820 aggttgttct ttcacaagag agggctgtgg agagctggct aatgccctca
gccataatca 2880 taatgtgaaa atcttagatt tgggagaaaa tgatcttcag
gatgatggag tgaagctact 2940 gtgtgaggct ctgaaaccac atcgtgcatt
gcacacactt gggttggcga aatgcaatct 3000 gacaactgct tgctgccagc
atctcttctc tgttctcagc agcagtaaga gcctggtcaa 3060 tctgaacctt
ctaggcaatg aattggatac tgatggtgtc aagatgctat gcttcaaaaa 3120
gacctgcaca atgtagtgag agaggagata cagacctcac agaaggagct ctgtctgaaa
3180 ctcaagtgtg cgtgggattt taatgacctt gaagacaagt ggtggtggtg
atcccacgga 3240 ttagatgcca cgtggcttga ccatggatct tgggggaaag
ccaccaggac atcctggcct 3300 gtgtgtcgct ccaatgtcac catttgtggg
gacaaatgag ctgttccctg caggaggctt 3360 tgtcacggtt gttggaggcc
gcccattgca cgcccaggtc tggaatccta gtgtaatact 3420 gtgtctggta
ccaagatcat aagttggctg tgccttcagt cttgtctatg tcctccttgg 3480
tgtaatgttt ttaattcttg gaggtgttga gagaattcaa taaagcaaag catataaaaa
3540 8 3934 DNA Homo sapiens 8 gtctcgtgtt tctctcttcc aatcggttgt
ctttatcgtg gacactgagg tgttctctgc 60 cttgactaaa gatgagtgac
gtgaatccac cctctgacac ccccattccc ttttcatcct 120 cctccactca
cagttctcat attccgccct ggacattctc ttgctacccc ggctccccat 180
gtgaaaatgg ggtcatgctg tacatgagaa acgtgagcca tgaggagcta caacggttca
240 agcagctctt actgactgag ctcagtactg gcaccatgcc catcacctgg
gaccaggtcg 300 agacagccag ctgggcagag gtggttcatc tcttgataga
gcgtttccct ggacgacgcg 360 cttgggatgt gacttcgaac atctttgcca
ttatgaactg tgataaaatg tgtgttgtag 420 tccgcagaga gataaatgcc
attctgccta ccttggaacc agaggacttg aatgtgggag 480 aaacacaggt
gaatctggag gaaggagaat ctggtaaaat acggcggtat aaatcgaatg 540
tgatggaaaa gtttttcccc atatgggaca ttacgacttg gcctggaaac cagagggact
600 tcttctacca aggtgtacac aggcacgagg agtacttacc atgtctgctt
ctgcccaaaa 660 gaccccaggg tagacagccc aagaccgtgg ccatacaggg
agctcctggg atcggaaaaa 720 caatcctggc caaaaaggtg atgtttgagt
gggccagaaa caagttctac gcccacaagc 780 gctggtgtgc tttctacttc
cattgccaag aggtgaacca gacgacagac cagagcttct 840 ccgagctgat
tgagcaaaag tggcctggat ctcaggacct cgtgtcaaag attatgtcca 900
aacccgacca acttctgctg ctcttggatg gctttgagga gctcacatct accctcattg
960 acagactgga ggacctgagt gaagactgga ggcagaaatt gcctgggtct
gtcctactga 1020 gcagtttgct gagcaaaacg atgcttccag aggccacgct
actgatcatg ataagattta 1080 cctcttggca gacatgcaag cccttgctga
aatgtccctc tctcgtaacc cttccggggt 1140 ttaatacgat ggaaaaaatc
aagtatttcc agatgtattt tggacacaca gaggagggag 1200 accaagtctt
gagtttcgcc atggaaaaca ccattctctt ctccatgtgc cgggtccctg 1260
tggtttgctg gatggtctgc tctggtctga aacagcaaat ggagagagga aacaatctca
1320 cacagtcatg tccaaatgcc acctctgtgt tcgtccggta tatttctagc
ttgtttccca 1380 ccagagctga gaacttttcc agaaagatcc accaagcaca
actggaaggt ctgtgtcact 1440 tggccgcaga cagcatgtgg cacaggaaat
gggtgttagg taaagaagat cttgaggaag 1500 ccaagctgga tcagacggga
gtcaccgcct tccttggcat gagtattctt cggagaattg 1560 caggtgagga
agaccactat gtctttaccc tcgtgacttt tcaggaattt tttgcggcct 1620
tgttttatgt tctctgtttc ccacaaagac tcaaaaattt tcatgtgttg agccacgtga
1680 atatccagcg cctgatagcg agtcccagag gaagcaaaag ctatctctct
cacatgggac 1740 ttttcttatt cggttttctg aacgaggcct gcgcttcggc
cgtggaacag tcattccaat 1800 gcaaggtgtc tttcggtaat aagaggaaac
tgctgaaagt catacctctg ttgcataaat 1860 gtgacccacc ttctccgggc
agtggggtcc cgcagttatt ctactgtctg catgaaatcc 1920 gggaggaagc
ctttgtaagc caagccctaa atgattatca taaagttgtc ttgagaattg 1980
gcaacaacaa agaagttcaa gtgtctgctt tttgcctgaa gcggtgtcaa tatttgcatg
2040 aggtggaact gaccgtcacc ctgaacttca tgaacgtgtg gaagctcagc
tccagctccc 2100 atcctggctc tgaagcgcca gagagcaatg ggctgcatcg
ttggtggcaa gacttatgct 2160 ctgtgtttgc aacgaatgat aagctggaag
tcctgactat gaccaacagt gttttggggc 2220 ctcctttttt gaaggctctc
gcggccgcac tgaggcaccc tcagtgcaaa ctgcaaaagc 2280 tactcctaag
gcgtgtgaat agcaccatgt tgaaccagga cttaatcggt gttttgacgg 2340
ggaaccagca tctgagatac ttggaaatac aacatgtgga agtggagtcc aaagctgtga
2400 agcttctatg cagggtgctg agatcccccc ggtgccgtct gcagtgtctc
aggttggaag 2460 actgcttggc cacccctaga atttggactg atcttggcaa
taatcttcaa ggtaacgggc 2520 atctaaagac tctcatacta agaaaaaact
ccctggagaa ctgtggggcg tattacctgt 2580 ctgtggccca gctggagagg
ctgtcgatag agaactgcaa ccttacacag cttacttgtg 2640 aaagccttgc
ctcctgtctc aggcagagta agatgctgac ccacctgagc ttggcagaaa 2700
acgccttgaa agatgaaggg gccaagcata tttggaatgc cctgccacac ctgagatgtc
2760 ctctgcagag gctggtactg agaaagtgtg acttgacctt taattgctgt
caggatatga 2820 tctctgcgct ctgtaaaaat aaaaccctga aaagtcttga
cctaagtttt aatagcctga 2880 aggatgatgg ggtgatcctg ctgtgtgagg
ccctgaagaa ccctgactgt acattacaga 2940 tcctggagct ggaaaactgc
ctgttcacct ccatctgctg ccaggccatg gcttccatgc 3000 tccgcaaaaa
ccaacatctg agacatctgg acttgagcaa gaatgcgatt ggagtctatg 3060
gtattctgac cttgtgcgag gccttctcaa gccaaaagaa gagagaagag gtcattttct
3120 gtattcctgc ctggactcga ataactagct tctccccaac tcctcaccca
cccgacttca 3180 cgggaaaaag tgactgccta tcccagatta atccttaggc
cgtccagtca tctttctctg 3240 gggcttgatt gatcagttcc cactctgaca
actggcaaat accaggcgtt atcatcctgt 3300 atgcattaac gtactttccc
ctgaaacaga gcaacccagt caacaccaca gaacctcagc 3360 tttgaaccct
ggagtgagga cggtgatgcc ctgtgtgtat taatatgcta tgtaaggctg 3420
ggcgtggtgg ctcacgcctg taacccagca ctatgggagg tcgaggtggg cagattacct
3480 gaggtcagga gttccagacc agcctggcca acatggtgaa accccgcctc
tactaaaaaa 3540 aaaaatacaa aaaattaggc gtggtggtgg gctcctgtaa
tcccagctgc tcgggaggct 3600 gaggcaggag aatcacttga atctaggagg
cagagtttgc agtgagctga gatcacgcca 3660 ttgcactcca gcctgggcga
cagagcaaga ctctgtctca agaagaaaaa aaaaatacat 3720 atacacataa
atatatatat gtgtgtgtgt atatatatat atatatatat atatgctata 3780
taaagtttaa atgaaatgct ttgagtcacc taagacagga tatagacaaa gtcttcatcg
3840 tcttcttgct tcttctacct ttatttattc tcagctctga atgtatgaac
ctgctcaatc 3900 acctcatctt aaaaataaaa tcactgtccc taga 3934
9 3102 DNA Homo sapiens 9
atggcagaat cggattctac tgactttgac ctgctgtggt
atctagagaa tctcagtgac 60 aaggaatttc agagttttaa gaagtatctg
gcacgcaaga ttcttgattt caaactgcca 120 cagtttccac tgatacagat
gacaaaagaa gaactggcta acgtgttgcc aatctcttat 180 gagggacagt
atatatggaa tatgctcttc agcatatttt caatgatgcg taaggaagat 240
ctttgtagga agatcattgg cagacgaaac cgcaatcagg aggcatgcaa agctgtcatg
300 aggagaaaat tcatgctgca atgggaaagt cacacttttg gaaaatttca
ttataaattt 360 tttcgtgacg tttcgtcaga tgtgttctac atacttcaat
tagcctatga ttctaccagc 420 tattattcag caaacaatct caatgtgttc
ctgatgggag agagagcatc tggaaaaact 480 attgttataa atctggctgt
gttgaggtgg atcaagggtg agatgtggca gaacatgatc 540 tcgtacgtcg
ttcacctcac ttctcacgaa ataaaccaga tgaccaacag cagcttggct 600
gagctaatcg ccaaggactg gcctgacggc caggctccca ttgcagacat cctgtctgat
660 cccaagaaac tccttttcat cctcgaggac ttggacaaca taagattcga
gttaaatgtc 720 aatgaaagtg ctttgtgtag taacagcacc cagaaagttc
ccattccagt tctcctggtc 780 agtttgctga agagaaaaat ggctccaggc
tgctggttcc tcatctcctc aaggcccaca 840 cgtgggaata atgtaaaaac
gttcttgaaa gaggtagatt gctgcacgac cttgcagctg 900 tcgaatggga
agagggagat atattttaac tctttcttta aagaccgcca gagggcgtcg 960
gcagccctcc agcttgtaca tgaggatgaa atactcgtgg gtctgtgccg agtcgccatc
1020 ttatgctgga tcacgtgtac tgtcctgaag cggcagatgg acaaggggcg
tgacttccag 1080 ctctgctgcc aaacacccac tgatctacat gcccactttc
ttgctgatgc gttgacatca 1140 gaggctggac ttactgccaa tcagtatcac
ctaggtctcc taaaacgtct gtgtttgctg 1200 gctgcaggag gactgtttct
gagcaccctg aatttcagtg gtgaagacct cagatgtgtt 1260 gggtttactg
aggctgatgt ctctgtgttg caggccgcga atattctttt gccgagcaac 1320
actcataaag accgttacaa gttcatacac ttgaacgtcc aggagttttg tacagccatt
1380 gcatttctga tggcagtacc caactatctg atcccctcag gcagcagaga
gtataaagag 1440 aagagagaac aatactctga ctttaatcaa gtgtttactt
tcatttttgg tcttctaaat 1500 gcaaacagga gaaagattct tgagacatcc
tttggatacc agctaccgat ggtagacagc 1560 ttcaagtggt actcggtggg
atacatgaaa catttggacc gtgacccgga aaagttgacg 1620 caccatatgc
ctttgtttta ctgtctctat gagaatcggg aagaagaatt tgtgaagacg 1680
attgtggatg ctctcatgga ggttacagtt taccttcaat cagacaagga tatgatggtc
1740 tcattatact gtctggatta ctgctgtcac ctgaggacac ttaagttgag
cgttcagcgc 1800 atctttcaaa acaaagagcc acttataagg ccaactgcta
gtcaaatgaa gagccttgtc 1860 tactggagag agatctgctc tcttttttat
acaatggaga gcctccggga gctgcatatc 1920 tttgacaatg accttaatgg
tatttcagaa aggattctgt ctaaagccct ggagcattct 1980 agctgtaaac
ttcgcacact caagttgtcc tatgtctcga ctgcttctgg ttttgaagac 2040
ttactcaagg ctttggctcg taatcggagc ctgacatacc tgagtatcaa ctgtacgtcc
2100 atttccctaa atatgttttc acttctgcat gacatcctgc acgagcccac
atgccaaata 2160 agtcatctga gcttgatgaa atgtgatttg cgagccagcg
aatgcgaaga aatcgcctct 2220 ctcctcatca gtggcgggag tctgagaaaa
ctgaccttat ccagcaatcc gctgaggagc 2280 gacgggatga acatactgtg
tgatgccttg cttcatccca actgcactct tatatcactg 2340 gtgttagtct
tctgctgtct cactgaaaat tgctgcagcg cccttggaag agtgcttctg 2400
ttcagcccaa ctctaagaca actagacctg tgtgtgaatc gcttaaaaaa ttacggagtg
2460 ttgcatgtga cgtttccctt gctgtttcca acctgtcagt tagaggagct
tcatctgtct 2520 ggctgtttct ttagcagcga tatctgtcaa tatattgcca
tagttattgc tactaatgaa 2580 aaactgagga gcctggagat tgggagcaac
aaaatagaag atgcaggaat gcagctgcta 2640 tgtggtggtt tgagacatcc
caactgcatg ttggtgaata ttgggctaga agagtgcatg 2700 ttaaccagtg
cctgctgtcg atctcttgcc tctgttctta ccaccaacaa aacactagaa 2760
agactcaact tgcttcaaaa tcacttgggc aatgatggag ttgcaaaact tcttgagagc
2820 ttgatcagcc cagattgtgt acttaaggta gttgggcttc cattaactgg
cctgaacaca 2880 caaacccagc agttgctgat gactgtaaag gaaagaaaac
ccagtttgat ctttctgtct 2940
gaaacttggt ctttaaagga aggcagagaa attggtgtga cacctgcttc tcagccaggt
3000 tcaataatac ctaattctaa tttggattac atgtttttca aatttcccag
aatgtctgca 3060 gccatgagaa cgtcaaatac agcatctagg caaccccttt ga 3102
10 2928 DNA Homo sapiens 10
atgaggtggg gccaccattt gcccagggcc
tcttggggct ctggttttag aagagcactc 60 cagcgaccag atgatcgtat
ccccttcctg atccactgga gttggcccct tcaaggggag 120 cgtccctttg
ggccccctag ggcctttata cgccaccacg gaagctcggt agatagcgct 180
cccccacccg ggaggcatgg acggctgttc cccagcgcct ctgcaactga agctatacag
240 cggcaccgcc ggaacctggc tgagtggttc agccggctgc ccagggagga
gcgccagttt 300 ggcccaacct ttgccctaga cacggtccac gttgaccctg
tgatccgcga gagtacccct 360 gatgagctac ttcgcccacc cgcggagctg
gccctggagc atcagccacc ccaggccggg 420 ctccccccac tggccttgtc
tcagctcttt aacccggatg cctgtgggcg ccgggtgcag 480 acagtggtgc
tgtatgggac agtgggcaca ggcaagagca cgctggtgcg caagatggtt 540
ctggactggt gttatgggcg gctgccggcc ttcgagctgc tcatcccctt ctcctgtgag
600 gacctgtcat ccctgggccc tgccccagcc tccctgtgcc aacttgtggc
ccagcgctac 660 acgcccctga aggaggttct gcccctgatg gctgctgctg
ggtcccacct cctctttgtg 720 ctccatggct tagagcatct caacctcgac
ttccggctgg caggcacggg actttgtagt 780 gacccggagg aaccgcagga
accagctgct atcatcgtca acctgctgcg caaatacatg 840 ctgcctcagg
ccagcattct ggtgaccact cggccctctg ccattggccg tatccccagc 900
aagtacgtgg gccgctatgg tgagatctgc ggtttctctg ataccaacct gcagaagctc
960 tacttccagc tccgcctcaa ccagccgtac tgcgggtatg ccgttggcgg
ttcaggtgtc 1020 tctgccacac cagctcagcg tgaccacctg gtgcagatgc
tctcccggaa cctggagggg 1080 caccaccaga tagccgctgc ctgcttcctg
ccgtcctatt gctggctcgt ttgtgccacc 1140 ttgcacttcc tgcatgcccc
cacgcctgct gggcagaccc ttacaagcat ctataccagc 1200 ttcctgcgcc
tcaacttcag cggggaaacc ctggacagca ctgacccctc caatttgtcc 1260
ctgatggcct atgcagcccg aaccatgggc aagttggcct atgagggggt gtcctcccgc
1320 aagacctact tctctgaaga ggatgtctgt ggctgcctgg aggctggcat
caggacggag 1380 gaggagtttc agctgctgca catcttccgt cgggatgccc
tgaggttttt cctggcccca 1440 tgtgtggagc cagggcgtgc aggcaccttc
gtgttcaccg tgcccgccat gcaggaatac 1500 ctggctgccc tctacattgt
gctgggtttg cgcaagacga ccctgcaaaa ggtgggcaag 1560 gaagtggctg
agctcgtggg ccgtgttggg gaggacgtca gcctggtact gggcatcatg 1620
gccaagctgc tgcctctgcg ggctctgcct ctgctcttca acctgatcaa ggtggttcca
1680 cgagtgtttg ggcgcatggt gggtaaaagc cgggaggcgg tggctcaggc
catggtgctg 1740 gagatgtttc gagaggagga ctactacaac gatgatgttc
tggaccagat gggcgccagt 1800 atcctgggcg tggagggccc ccggcgccac
ccagatgagc cccctgagga tgaagtcttc 1860 gagctcttcc ccatgttcat
gggggggctt ctctctgccc acaaccgagc tgtgctagct 1920 cagcttggct
gccccatcaa gaacctggat gccctggaga atgcccaggc catcaagaag 1980
aagctgggca agctgggccg gcaggtgctg cccccatcag agctccttga ccacctcttc
2040 ttccactatg agttccagaa ccagcgcttc tccgctgagg tgctcagctc
cctgcgtcag 2100 ctcaacctgg caggtgtgcg catgacacca gtcaagtgca
cagtggtggc agctgtgctg 2160 ggcagcggaa ggcatgccct ggatgaggtg
aacttggcct cctgccagct agatcctgct 2220 gggctgcgca cactcctgcc
tgtcttcctg cgtgcccgga agctgggctt gcaactcaac 2280 agcctgggcc
ctgaggcctg caaggacctc cgagacctgt tgctgcatga ccagtgccaa 2340
attaccacac tgcggctgtc caacaacccg ctgacggcgg caggtgttgc cgtgctaatg
2400 gaggggctgg caggaaacac ctcagtgacg cacctgtccc tgctgcacac
gggccttggg 2460 gacgaaggcc tggagctgct ggctgcccag ctggaccgca
accggcagct gcaggagctg 2520 aacgtggcgt acaacggtgc tggtgacaca
gcggccctgg ccctggccag agctgcccgg 2580 gagcaccctt ccctggaact
gctacacctc tacttcaatg agctgagctc agagggccgc 2640 caggtcttgc
gagacttggg gggtgctgct gaaggtggtg cccgggtggt ggtgtcactg 2700
acagagggga cggcggtgtc agaatactgg tcagtgatcc tcagtgaagt ccagcggaac
2760 ctcaatagct gggatcgggc ccgggttcag cgacaccttg agctcctact
gcgggatctg 2820 gaagatagcc ggggtgccac ccttaatcct tggcgcaagg
cccagctgct gcgagtggag 2880 ggcgaggtca gggccctcct ggagcagctg
ggaagctctg gaagctga 2928
11 6763 DNA Homo sapiens 11
ggaggagccg
cgagcgctga gggtgagtgc cgggagctct gagggagtct gcactatgga 60
aacaacctgt caatccagct caaggcacac atagcccaga cacccatgag accctctccg
120 tggggaccct agagcaccta tcatgaacga ggagaccaag gctggctcct
catggacccc 180 gttggcctcc agctcggcaa caagaacctg tggagctgtc
ttgtgaggct gctcaccaaa 240 gacccagaat ggctgaacgc caagatgaag
ttcttcctcc ccaacacgga cctggattcc 300 aggaacgaga ccttggaccc
tgaacagaga gtcatcctgc aactcaacaa gctgcatgtc 360 cagggttcgg
acacctggca gtctttcatt cattgcgtgt gcatgcagct ggaggtgcct 420
ctggacctgg aggtgcttct gctaagtact tttggctatg atgatgggtt caccagccag
480 ctgggagctg aggggaaaag ccaacctgaa tctcagctcc accatggcct
gaagcgccca 540 catcagagct gtgggtcctc accccgccgg aagcagtgca
agaagcagca gctagagttg 600 gccaagaagt acctgcagct cctgcggacc
tctgcccagc agcgctacag gagccaaatc 660 cctgggtcag ggcagcccca
cgccttccac caggtctatg tccctccaat cctgcgccgg 720 gccacagcat
ccttagacac tccggagggg gccattatgg gggacgtcaa ggtggaagat 780
ggtgctgacg tgagcatctc ggacctcttc aacaccaggg ttaacaaggg cccgagggtg
840 accgtgcttt tggggaaggc tggcatgggc aagaccacgc tggcccaccg
gctctgccag 900 aagtgggcag agggccatct gaactgtttc caggccctgt
tcctttttga attccgccag 960 ctcaacttga tcacgaggtt cctgacaccg
tccgagctcc tttttgatct gtacctgagc 1020 cctgaatcgg accacgacac
tgtcttccag tacctggaga agaacgctga ccaagtcctg 1080 ctgatctttg
atgggctaga tgaggccctc cagcctatgg gtcctgatgg cccaggccca 1140
gtcctcaccc ttttctccca tctctgcaat gggaccctcc tgcctggctg ccgggtgatg
1200 gctacctccc gtccagggaa gctgcctgcc tgcctgcctg cagaggcagc
catggtccac 1260 atgttgggct ttgatgggcc acgggtggaa gaatatgtga
atcacttctt cagcgcccag 1320 ccatcgcggg agggggccct ggtggagtta
cagacaaatg gacgtctccg aagcctgtgt 1380 gcggtgcccg cactgtgcca
agtcgcctgt ctctgcctcc accatctgct tcctgaccac 1440 gccccaggcc
agtctgtggc cctcctgccc aacatgactc agctctatat gcagatggtg 1500
ctcgccctca gcccccctgg gcacttgccc acctcgtccc tactggacct gggggaggtg
1560 gccctgaggg gcctggagac agggaaggtt atcttctatg caaaagatat
tgctccaccc 1620 ttgatagctt ttggggccac tcacagcctg ctgacttcct
tctgcgtctg cacaggccct 1680 gggcaccagc agacaggcta tgctttcacc
cacctcagcc tgcaggagtt tcttgctgcc 1740 ctgcacctga tggccagccc
caaggtgaac aaagacacac ttacccagta tgttaccctc 1800 cattcccgct
gggtacagcg gaccaaagct agactgggcc tctcagacca cctccccacc 1860
ttcctggcgg gcctggcatc ctgcacctgc cgccccttcc ttagccacct ggcgcagggc
1920 aatgaggact gtgtgggtgc caagcaggct gctgtagtgc aggtgttgaa
gaagttggcc 1980 acccgcaagc tcacagggcc aaaggttgta gagctgtgtc
actgtgtgga tgagacacag 2040 gagcctgagc tggccagtct caccgcacaa
agcctcccct atcaactgcc cttccacaat 2100 ttcccactga cctgcaccga
cctggccacc ctgaccaaca tcctagagca cagggaggcc 2160 cccatccacc
tggattttga tggctgtccc ctggagcccc actgccctga ggctctggta 2220
ggctgtgggc agatagagaa tctcagcttt aagagcagga agtgtgggga tgcctttgca
2280 gaagccctct ccaggagctt gccgacaatg gggaggctgc agatgctggg
gttagcagga 2340 agtaaaatca ctgcccgagg catcagccac ctggtgaaag
ctttgcctct ctgtccacag 2400 ctgaaagaag tcagttttcg ggacaaccag
ctcagtgacc aggtggtgct gaacattgtg 2460 gaggttctcc ctcacctacc
acggctccgg aagcttgacc tgagcagcaa cagcatctgc 2520 gtgtcaaccc
tactctgctt ggcaagggtg gcagtcacgt gtcctaccgt caggatgctt 2580
caggccaggg agcggaccat catcttcctt ctttccccgc ccacagagac aactgcagag
2640 ctacaaagag ctccagacct gcaggaaagt gacggccaga ggaaaggggc
tcagagcaga 2700 agcttgacgc tcaggctgca gaagtgtcag ctccaggtcc
acgatgcgga ggccctcata 2760 gccctgctcc aggaaggccc tcacctggag
gaagtggacc tctcagggaa ccagctggaa 2820 gatgaaggct gtcggctgat
ggcagaggct gcatcccagc tgcacatcgc caggaagctg 2880 gacctcagcg
acaacgggct ttctgtggcc ggggtgcatt gtgtgctgag ggccgtgagt 2940
gcgtgctgga ccctggcaga gctgcacatc agcctgcagc acaaaactgt gatcttcatg
3000 tttgcccagg agccagagga gcagaagggg ccccaggaga gggctgcatt
tcttgacagc 3060 ctcatgctcc agatgccctc tgagctgcct ctgagctccc
gaaggatgag gctgacacat 3120 tgtggcctcc aagaaaagca cctagagcag
ctctgcaagg ctctgggagg aagctgccac 3180 ctcggtcacc tccacctcga
cttctcaggc aatgctctgg gggatgaagg tgcagcccgg 3240 ctggctcagc
tgctcccagg gctgggagct ctgcagtcct tgaacctcag tgagaacggt 3300
ttgtccctgg atgccgtgtt gggcttggtt cggtgcttct ccactctgca gtggctcttc
3360 cgcttggaca tcagctttga aagccaacac atcctcctga gaggggacaa
gacaagcagg 3420 gatatgtggg ccactggatc tttgccagac ttcccagctg
cagccaagtt cttagggttc 3480 cgtcagcgct gcatccccag gagcctctgc
ctcagtgagt gtcctctgga gcccccaagc 3540 ctcacccgcc tctgtgccac
tctgaaggac tgcccgggac ccctggaact gcaattgtcc 3600 tgtgagttcc
tgagtgacca gagcctggag actctactgg actgcttacc tcaactccct 3660
cagctgagcc tgctgcagct gagccagacg ggactgtccc cgaaaagccc cttcctgctg
3720 gccaacacct taagcctgtg tccacgggtt aaaaaggtgg atctcaggtc
cctgcaccat 3780 gcaactttgc acttcagatc caacgaggag gaggaaggcg
tgtgctgtgg caggttcaca 3840 ggctgcagcc tcagccagga gcacgtagag
tcactctgct ggttgctgag caagtgtaaa 3900 gacctcagcc aggtggatct
ctcagcaaac ctgctgggcg acagcggact cagatgcctt 3960 ctggaatgtc
tgccgcaggt gcccatctcc ggtttgcttg atctgagtca caacagcatt 4020
tctcaggaaa gtgccctgta cctgctggag acactgccct cctgcccacg tgtccgggag
4080 gcctcagtga acctgggctc tgagcagagc ttccggattc acttctccag
agaggaccag 4140 gctgggaaga cactcaggct aagtgagtgc agcttccggc
cagagcacgt gtccaggctg 4200 gccaccggct tgagcaagtc cctgcagctg
acggagctca cgctgaccca gtgctgcctg 4260 ggccagaagc agctggccat
cctcctgagc ttggtggggc gacccgcagg gctgttcagc 4320 ctcagggtgc
aggagccgtg ggcggacaga gccagggttc tctccctgtt agaagtctgc 4380
gcccaggcct caggcagtgt cactgaaatc agcatctccg agacccagca gcagctctgt
4440 gtccagctgg aatttcctcg ccaggaagag aatccagaag ctgtggcact
caggttggct 4500 cactgtgacc ttggagccca ccacagcctt cttgtcgggc
agctgatgga gacatgtgcc 4560 aggctgcagc agctcagctt gtctcaggtt
aacctctgtg aggacgatga tgccagttcc 4620 ctgctgctgc agagcctcct
gctgtccctc tctgagctga agacatttcg gctgacctcc 4680 agctgtgtga
gcaccgaggg cctcgcccac ctggcatctg gtctgggcca ctgccaccac 4740
ttggaggagc tggacttgtc taacaatcaa tttgatgagg agggcaccaa ggcgctgatg
4800 agggcccttg aggggaaatg gatgctaaag aggctggacc tcagtcacct
tctgctgaac 4860 agctccacct tggccttgct tactcacaga ctaagccaga
tgacctgcct gcagagcctc 4920 agactgaaca ggaacagtat cggtgatgtc
ggttgctgcc acctttctga ggctctcagg 4980 gctgccacca gcctagagga
gctggacttg agccacaacc agattggaga cgctggtgtc 5040 cagcacttag
ctaccatcct gcctgggctg ccagagctca ggaagataga cctctcaggg 5100
aatagcatca gctcagccgg gggagtgcag ttggcagagt ctctcgttct ttgcaggcgc
5160 ctggaggagt tgatgcttgg ctgcaatgcc ctgggggatc ccacagccct
ggggctggct 5220 caggagctgc cccagcacct gagggtccta cacctaccat
tcagccatct gggcccaggt 5280 ggggccctga gcctggccca ggccctggat
ggatcccccc atttggaaga gatcagcttg 5340 gcggaaaaca acctggctgg
aggggtcctg cgtttctgta tggagctccc gctgctcaga 5400 cagatagacc
tggtttcctg taagattgac aaccagactg ccaagctcct cacctccagc 5460
ttcacgagct gccctgccct ggaagtaatc ttgctgtcct ggaatctcct cggggatgag
5520 gcagctgccg agctggccca ggtgctgccg aagatgggcc ggctgaagag
agtggacctg 5580 gagaagaatc agatcacagc tttgggggcc tggctcctgg
ctgaaggact ggcccagggg 5640 tctagcatcc aagtcatccg cctctggaat
aaccccattc cctgcgacat ggcccagcac 5700 ctgaagagcc aggagcccag
gctggacttt gccttctttg acaaccagcc ccaggcccct 5760 tggggtactt
gatggccccc tcaagacctt tggaatccag ccaagtgatg cacccaaatg 5820
atccaccttt cgcccactgg gataaatgac tcaggaaaga agagcctcgg cagggcgctc
5880 tgcactccac ccaggaggaa ggatacgtgt gtcctgctgc agtcctcagg
gagaactttt 5940 ttgggaacca ggagctgggt ctggacaaag gagtaccctg
cattacgtgg gatatgtgtg 6000 atcaattggg gacatgcgac acacaatgag
ggtgtcatga caatgcatga cacgtacggt 6060 tatatgtggc agtgtgaccc
cttgacatgt ggcgttacat gaaagtcagt gtggcacgtg 6120 ttctgtggca
tgggtgctgg catcccaagt ggcaggatac atgattgttg gtctatatat 6180
gacacatgac aaatgtccat gtcacaggac tcatggctgg ccagatgacc tcaggctggc
6240 ccaagatcta atttattaat ttttaaagca aatacatatt tatagattgt
gtgtatggag 6300 cagctaagtc aggaaaagtc ttccgcccga gctgggaggg
gagagtgtcc atgcactgac 6360 cagtccaggg gctcaagggc cagggctctg
gaacaagcca gggactcagc cattaagtcc 6420 cctcctgcct caatcctcag
cctacccatc tataaacttg atgactcctc ccttacttac 6480 atactagctt
ccaaggacag gtggaggtag ggccagcctg gcgggagtgg agaagcccag 6540
tctgtcctat gtaagggaca aagccaggtc taatggtact gggtaggggg cactgccaag
6600 acaataagct aggctactgg gtccagctac tactttggtg ggattcaggt
gagtctccat 6660 gcacttcaca tgttacccag tgttcttgtt acttccaagg
agaaccaaga atggctctgt 6720 cacactcgaa gccaggtttg atcaataaac
acaatggtat tcc 6763
12 1112 PRT Homo sapiens 12
Met Glu Met Asp Ala
Pro Arg Pro Pro Ser Leu Ala Val Pro Gly Ala 1 5 10 15 Ala Ser Arg
Pro Gly Arg Leu Leu Asp Gly Gly His Gly Arg Gln Gln 20 25 30 Val
Gln Ala Leu Ser Ser Gln Leu Leu Glu Val Ile Pro Asp Ser Met 35 40
45 Arg Lys Gln Glu Val Arg Thr Gly Arg Glu Ala Gly Gln Gly His Gly
50 55 60 Thr Gly Ser Pro Ala Glu Gln Val Lys Ala Leu Met Asp Leu
Leu Ala 65 70 75 80 Gly Lys Gly Ser Gln Gly Ser His Ala Pro Gln Ala
Leu Asp Arg Thr 85 90 95 Pro Asp Ala Pro Leu Gly Pro Cys Ser Asn
Asp Ser Arg Ile Gln Arg 100 105 110 His Arg Lys Ala Leu Leu Ser Lys
Val Gly Gly Gly Pro Glu Leu Gly 115 120 125 Gly Pro Trp His Arg Leu
Ala Ser Leu Leu Leu Val Glu Gly Leu Thr 130 135 140 Asp Leu Gln Leu
Arg Glu His Asp Phe Thr Gln Val Glu Ala Thr Arg 145 150 155 160 Gly
Gly Gly His Pro Ala Arg Thr Val Ala Leu Asp Arg Leu Phe Leu 165 170
175 Pro Leu Ser Arg Val Ser Val Pro Pro Arg Val Ser Ile Thr Ile Gly
180 185 190 Val Ala Gly Met Gly Lys Thr Thr Leu Val Arg His Phe Val
Arg Leu 195 200 205 Trp Ala His Gly Gln Val Gly Lys Asp Phe Ser Leu
Val Leu Pro Leu 210 215 220 Thr Phe Arg Asp Leu Asn Thr His Glu Lys
Leu Cys Ala Asp Arg Leu 225 230 235 240 Ile Cys Ser Val Phe Pro His
Val Gly Glu Pro Ser Leu Ala Val Ala 245 250 255 Val Pro Ala Arg Ala
Leu Leu Ile Leu Asp Gly Leu Asp Glu Cys Arg 260 265 270 Thr Pro Leu
Asp Phe Ser Asn Thr Val Ala Cys Thr Asp Pro Lys Lys 275 280 285 Glu
Ile Pro Val Asp His Leu Ile Thr Asn Ile Ile Arg Gly Asn Leu 290 295
300 Phe Pro Glu Val Ser Ile Trp Ile Thr Ser Arg Pro Ser Ala Ser Gly
305 310 315 320 Gln Ile Pro Gly Gly Leu Val Asp Arg Met Thr Glu Ile
Arg Gly Phe 325 330 335 Asn Glu Glu Glu Ile Lys Val Cys Leu Glu Gln
Met Phe Pro Glu Asp 340 345 350 Gln Ala Leu Leu Gly Trp Met Leu Ser
Gln Val Gln Ala Asp Arg Ala 355 360 365 Leu Tyr Leu Met Cys Thr Val
Pro Ala Phe Cys Arg Leu Thr Gly Met 370 375 380 Ala Leu Gly His Leu
Trp Arg Ser Arg Thr Gly Pro Gln Asp Ala Glu 385 390 395 400 Leu Trp
Pro Pro Arg Thr Leu Cys Glu Leu Tyr Ser Trp Tyr Phe Arg 405 410 415
Met Ala Leu Ser Gly Glu Gly Gln Glu Lys Gly Lys Ala Ser Pro Arg 420
425 430 Ile Glu Gln Val Ala His Gly Gly Arg Lys Met Val Gly Thr Leu
Gly 435 440 445 Arg Leu Ala Phe His Gly Leu Leu Lys Lys Lys Tyr Val
Phe Tyr Glu 450 455 460 Gln Asp Met Lys Ala Phe Gly Val Asp Leu Ala
Leu Leu Gln Gly Ala 465 470 475 480 Pro Cys Ser Cys Phe Leu Gln Arg
Glu Glu Thr Leu Ala Ser Ser Val 485 490 495 Ala Tyr Cys Phe Thr His
Leu Ser Leu Gln Glu Phe Val Ala Ala Ala 500 505 510 Tyr Tyr Tyr Gly
Ala Ser Arg Arg Ala Ile Phe Asp Leu Phe Thr Glu 515 520 525 Ser Gly
Val Ser Trp Pro Arg Leu Gly Phe Leu Thr His Phe Arg Ser 530 535 540
Ala Ala Gln Arg Ala Met Gln Ala Glu Asp Gly Arg Leu Asp Val Phe 545
550 555 560 Leu Arg Phe Leu Ser Gly Leu Leu Ser Pro Arg Val Asn Ala
Leu Leu 565 570 575 Ala Gly Ser Leu Leu Ala Gln Gly Glu His Gln Ala
Tyr Arg Thr Gln 580 585 590 Val Ala Glu Leu Leu Gln Gly Cys Leu Arg
Pro Asp Ala Ala Val Cys 595 600 605 Ala Arg Ala Ile Asn Val Leu His
Cys Leu His Glu Leu Gln His Thr 610 615 620 Glu Leu Ala Arg Ser Val
Glu Glu Ala Met Glu Ser Gly Ala Leu Ala 625 630 635 640 Arg Leu Thr
Gly Pro Ala His Arg Ala Ala Leu Ala Tyr Leu Leu Gln 645 650 655 Val
Ser Asp Ala Cys Ala Gln Glu Ala Asn Leu Ser Leu Ser Leu Ser 660 665
670 Gln Gly Val Leu Gln Ser Leu Leu Pro Gln Leu Leu Tyr Cys Arg Lys
675 680 685 Leu Arg Leu Asp Thr Asn Gln Phe Gln Asp Pro Val Met Glu
Leu Leu 690 695 700 Gly Ser Val Leu Ser Gly Lys Asp Cys Arg Ile Gln
Lys Ile Ser Leu 705 710 715 720 Ala Glu Asn Gln Ile Ser Asn Lys Gly
Ala Lys Ala Leu Ala Arg Ser 725 730 735 Leu Leu Val Asn Arg Ser Leu
Thr Ser Leu Asp Leu Arg Gly Asn Ser 740 745 750 Ile Gly Pro Gln Gly
Ala Lys Ala Leu Ala Asp Ala Leu Lys Ile Asn 755 760 765 Arg Thr Leu
Thr Ser Leu Ser Leu Gln Gly Asn Thr Val Arg Asp Asp 770 775 780 Gly
Ala Arg Ser Met Ala Glu Ala Leu Ala Ser Asn Arg Thr Leu Ser 785 790
795 800 Met Leu His Leu Gln Lys Asn Ser Ile Gly Pro Met Gly Ala Gln
Arg 805 810 815 Met Ala Asp Ala Leu Lys Gln Asn Arg Ser Leu Lys Glu
Leu Met Phe
820 825 830 Ser Ser Asn Ser Ile Gly Asp Gly Gly Ala Lys Ala Leu Ala
Glu Ala 835 840 845 Leu Lys Val Asn Gln Gly Leu Glu Ser Leu Asp Leu
Gln Ser Asn Ser 850 855 860 Ile Ser Asp Ala Gly Val Ala Ala Leu Met
Gly Ala Leu Cys Thr Asn 865 870 875 880 Gln Thr Leu Leu Ser Leu Ser
Leu Arg Glu Asn Ser Ile Ser Pro Glu 885 890 895 Gly Ala Gln Ala Ile
Ala His Ala Leu Cys Ala Asn Ser Thr Leu Lys 900 905 910 Asn Leu Asp
Leu Thr Ala Asn Leu Leu His Asp Gln Gly Ala Arg Ala 915 920 925 Ile
Ala Val Ala Val Arg Glu Asn Arg Thr Leu Thr Ser Leu His Leu 930 935
940 Gln Trp Asn Phe Ile Gln Ala Gly Ala Ala Gln Ala Leu Gly Gln Ala
945 950 955 960 Leu Gln Leu Asn Arg Ser Leu Thr Ser Leu Asp Leu Gln
Glu Asn Ala 965 970 975 Ile Gly Asp Asp Gly Ala Cys Ala Val Ala Arg
Ala Leu Lys Val Asn 980 985 990 Thr Ala Leu Thr Ala Leu Tyr Leu Gln
Val Ala Ser Ile Gly Ala Ser 995 1000 1005 Gly Ala Gln Val Leu Gly
Glu Ala Leu Ala Val Asn Arg Thr Leu 1010 1015 1020 Glu Ile Leu Asp
Leu Arg Gly Asn Ala Ile Gly Val Ala Gly Ala 1025 1030 1035 Lys Ala
Leu Ala Asn Ala Leu Lys Val Asn Ser Ser Leu Arg Arg 1040 1045 1050
Leu Asn Leu Gln Glu Asn Ser Leu Gly Met Asp Gly Ala Ile Cys 1055
1060 1065 Ile Ala Thr Ala Leu Ser Gly Asn His Arg Leu Gln His Ile
Asn 1070 1075 1080 Leu Gln Gly Asn His Ile Gly Asp Ser Gly Ala Arg
Met Ile Ser 1085 1090 1095 Glu Ala Ile Lys Thr Asn Ala Pro Thr Cys
Thr Val Glu Met 1100 1105 1110
13 1093 PRT Homo sapiens 13
Met Ala
Asp Ser Ser Ser Ser Ser Phe Phe Pro Asp Phe Gly Leu Leu 1 5 10 15
Leu Tyr Leu Glu Glu Leu Asn Lys Glu Glu Leu Asn Thr Phe Lys Leu 20
25 30 Phe Leu Lys Glu Thr Met Glu Pro Glu His Gly Leu Thr Pro Trp
Asn 35 40 45 Glu Val Lys Lys Ala Arg Arg Glu Asp Leu Ala Asn Leu
Met Lys Lys 50 55 60 Tyr Tyr Pro Gly Glu Lys Ala Trp Ser Val Ser
Leu Lys Ile Phe Gly 65 70 75 80 Lys Met Asn Leu Lys Asp Leu Cys Glu
Arg Ala Lys Glu Glu Ile Asn 85 90 95 Trp Ser Ala Gln Thr Ile Gly
Pro Asp Asp Ala Lys Ala Gly Glu Thr 100 105 110 Gln Glu Asp Gln Glu
Ala Val Leu Gly Asp Gly Thr Glu Tyr Arg Asn 115 120 125 Arg Ile Lys
Glu Lys Phe Cys Ile Thr Trp Asp Lys Lys Ser Leu Ala 130 135 140 Gly
Lys Pro Glu Asp Phe His His Gly Ile Ala Glu Lys Asp Arg Lys 145 150
155 160 Leu Leu Glu His Leu Phe Asp Val Asp Val Lys Thr Gly Ala Gln
Pro 165 170 175 Gln Ile Val Val Leu Gln Gly Ala Ala Gly Val Gly Lys
Thr Thr Leu 180 185 190 Val Arg Lys Ala Met Leu Asp Trp Ala Glu Gly
Ser Leu Tyr Gln Gln 195 200 205 Arg Phe Lys Tyr Val Phe Tyr Leu Asn
Gly Arg Glu Ile Asn Gln Leu 210 215 220 Lys Glu Arg Ser Phe Ala Gln
Leu Ile Ser Lys Asp Trp Pro Ser Thr 225 230 235 240 Glu Gly Pro Ile
Glu Glu Ile Met Tyr Gln Pro Ser Ser Leu Leu Phe 245 250 255 Ile Ile
Asp Ser Phe Asp Glu Leu Asn Phe Ala Phe Glu Glu Pro Glu 260 265 270
Phe Ala Leu Cys Glu Asp Trp Thr Gln Glu His Pro Val Ser Phe Leu 275
280 285 Met Ser Ser Leu Leu Arg Lys Val Met Leu Pro Glu Ala Ser Leu
Leu 290 295 300 Val Thr Thr Arg Leu Thr Thr Ser Lys Arg Leu Lys Gln
Leu Leu Lys 305 310 315 320 Asn His His Tyr Val Glu Leu Leu Gly Met
Ser Glu Asp Ala Arg Glu 325 330 335 Glu Tyr Ile Tyr Gln Phe Phe Glu
Asp Lys Arg Trp Ala Met Lys Val 340 345 350 Phe Ser Ser Leu Lys Ser
Asn Glu Met Leu Phe Ser Met Cys Gln Val 355 360 365 Pro Leu Val Cys
Trp Ala Ala Cys Thr Cys Leu Lys Gln Gln Met Glu 370 375 380 Lys Gly
Gly Asp Val Thr Leu Thr Cys Gln Thr Thr Thr Ala Leu Phe 385 390 395
400 Thr Cys Tyr Ile Ser Ser Leu Phe Thr Pro Val Asp Gly Gly Ser Pro
405 410 415 Ser Leu Pro Asn Gln Ala Gln Leu Arg Arg Leu Cys Gln Val
Ala Ala 420 425 430 Lys Gly Ile Trp Thr Met Thr Tyr Val Phe Tyr Arg
Glu Asn Leu Arg 435 440 445 Arg Leu Gly Leu Thr Gln Ser Asp Val Ser
Ser Phe Met Asp Ser Asn 450 455 460 Ile Ile Gln Lys Asp Ala Glu Tyr
Glu Asn Cys Tyr Val Phe Thr His 465 470 475 480 Leu His Val Gln Glu
Phe Phe Ala Ala Met Phe Tyr Met Leu Lys Gly 485 490 495 Ser Trp Glu
Ala Gly Asn Pro Ser Cys Gln Pro Phe Glu Asp Leu Lys 500 505 510 Ser
Leu Leu Gln Ser Thr Ser Tyr Lys Asp Pro His Leu Thr Gln Met 515 520
525 Lys Cys Phe Leu Phe Gly Leu Leu Asn Glu Asp Arg Val Lys Gln Leu
530 535 540 Glu Arg Thr Phe Asn Cys Lys Met Ser Leu Lys Ile Lys Ser
Lys Leu 545 550 555 560 Leu Gln Cys Met Glu Val Leu Gly Asn Ser Asp
Tyr Ser Pro Ser Gln 565 570 575 Leu Gly Phe Leu Glu Leu Phe His Cys
Leu Tyr Glu Thr Gln Asp Lys 580 585 590 Ala Phe Ile Ser Gln Ala Met
Arg Cys Phe Pro Lys Val Ala Ile Asn 595 600 605 Ile Cys Glu Lys Ile
His Leu Leu Val Ser Ser Phe Cys Leu Lys His 610 615 620 Cys Arg Cys
Leu Arg Thr Ile Arg Leu Ser Val Thr Val Val Phe Glu 625 630 635 640
Lys Lys Ile Leu Lys Thr Ser Leu Pro Thr Asn Thr Trp Asp Gly Asp 645
650 655 Arg Ile Thr His Cys Trp Gln Asp Leu Cys Ser Val Leu His Thr
Asn 660 665 670 Glu His Leu Arg Glu Leu Asp Leu Tyr His Ser Asn Leu
Asp Lys Ser 675 680 685 Ala Met Asn Ile Leu His His Glu Leu Arg His
Pro Asn Cys Lys Leu 690 695 700 Gln Lys Leu Leu Leu Lys Phe Ile Thr
Phe Pro Asp Gly Cys Gln Asp 705 710 715 720 Ile Ser Thr Ser Leu Ile
His Asn Lys Asn Leu Met His Leu Asp Leu 725 730 735 Lys Gly Ser Asp
Ile Gly Asp Asn Gly Val Lys Ser Leu Cys Glu Ala 740 745 750 Leu Lys
His Pro Glu Cys Lys Leu Gln Thr Leu Arg Leu Glu Ser Cys 755 760 765
Asn Leu Thr Val Phe Cys Cys Leu Asn Ile Ser Asn Ala Leu Ile Arg 770
775 780 Ser Gln Ser Leu Ile Phe Leu Asn Leu Ser Thr Asn Asn Leu Leu
Asp 785 790 795 800 Asp Gly Val Gln Leu Leu Cys Glu Ala Leu Arg His
Pro Lys Cys Tyr 805 810 815 Leu Glu Arg Leu Ser Leu Glu Ser Cys Gly
Leu Thr Glu Ala Gly Cys 820 825 830 Glu Tyr Leu Ser Leu Ala Leu Ile
Ser Asn Lys Arg Leu Thr His Leu 835 840 845 Cys Leu Ala Asp Asn Val
Leu Gly Asp Gly Gly Val Lys Leu Met Ser 850 855 860 Asp Ala Leu Gln
His Ala Gln Cys Thr Leu Lys Ser Leu Val Leu Arg 865 870 875 880 Arg
Cys His Phe Thr Ser Leu Ser Ser Glu Tyr Leu Ser Thr Ser Leu 885 890
895 Leu His Asn Lys Ser Leu Thr His Leu Asp Leu Gly Ser Asn Trp Leu
900 905 910 Gln Asp Asn Gly Val Lys Leu Leu Cys Asp Val Phe Arg His
Pro Ser 915 920 925 Cys Asn Leu Gln Asp Leu Glu Leu Met Gly Cys Val
Leu Thr Asn Ala 930 935 940 Cys Cys Leu Asp Leu Ala Ser Val Ile Leu
Asn Asn Pro Asn Leu Arg 945 950 955 960 Ser Leu Asp Leu Gly Asn Asn
Asp Leu Gln Asp Asp Gly Val Lys Ile 965 970 975 Leu Cys Asp Ala Leu
Arg Tyr Pro Asn Cys Asn Ile Gln Arg Leu Gly 980 985 990 Leu Glu Tyr
Cys Gly Leu Thr Ser Leu Cys Cys Gln Asp Leu Ser Ser 995 1000 1005
Ala Leu Ile Cys Asn Lys Arg Leu Ile Lys Met Asn Leu Thr Gln 1010
1015 1020 Asn Thr Leu Gly Tyr Glu Gly Ile Val Lys Leu Tyr Lys Val
Leu 1025 1030 1035 Lys Ser Pro Lys Cys Lys Leu Gln Val Leu Gly Leu
Cys Lys Glu 1040 1045 1050 Ala Phe Asp Glu Glu Ala Gln Lys Leu Leu
Glu Ala Val Gly Val 1055 1060 1065 Ser Asn Pro His Leu Ile Ile Lys
Pro Asp Cys Asn Tyr His Asn 1070 1075 1080 Glu Glu Asp Val Ser Trp
Trp Trp Cys Phe 1085 1090
14 991 PRT Homo sapiens 14
Met Ala Glu
Ser Phe Phe Ser Asp Phe Gly Leu Leu Trp Tyr Leu Lys 1 5 10 15 Glu
Leu Arg Lys Glu Glu Phe Trp Lys Phe Lys Glu Leu Leu Lys Gln 20 25
30 Pro Leu Glu Lys Phe Glu Leu Lys Pro Ile Pro Trp Ala Glu Leu Lys
35 40 45 Lys Ala Ser Lys Glu Asp Val Ala Lys Leu Leu Asp Lys His
Tyr Pro 50 55 60 Gly Lys Gln Ala Trp Glu Val Thr Leu Asn Leu Phe
Leu Gln Ile Asn 65 70 75 80 Arg Lys Asp Leu Trp Thr Lys Ala Gln Glu
Glu Met Arg Asn Lys Leu 85 90 95 Asn Pro Tyr Arg Lys His Met Lys
Glu Thr Phe Gln Leu Ile Trp Glu 100 105 110 Lys Glu Thr Cys Leu His
Val Pro Glu His Phe Tyr Lys Glu Thr Met 115 120 125 Lys Asn Glu Tyr
Lys Glu Leu Asn Asp Ala Tyr Thr Ala Ala Ala Arg 130 135 140 Arg His
Thr Val Val Leu Glu Gly Pro Asp Gly Ile Gly Lys Thr Thr 145 150 155
160 Leu Leu Arg Lys Val Met Leu Asp Trp Ala Glu Gly Asn Leu Trp Lys
165 170 175 Asp Arg Phe Thr Phe Val Phe Phe Leu Asn Val Cys Glu Met
Asn Gly 180 185 190 Ile Ala Glu Thr Ser Leu Leu Glu Leu Leu Ser Arg
Asp Trp Pro Glu 195 200 205 Ser Ser Glu Lys Ile Glu Asp Ile Phe Ser
Gln Pro Glu Arg Ile Leu 210 215 220 Phe Ile Met Asp Gly Phe Glu Gln
Leu Lys Phe Asn Leu Gln Leu Lys 225 230 235 240 Ala Asp Leu Ser Asp
Asp Trp Arg Gln Arg Gln Pro Met Pro Ile Ile 245 250 255 Leu Ser Ser
Leu Leu Gln Lys Lys Met Leu Pro Glu Ser Ser Leu Leu 260 265 270 Ile
Ala Leu Gly Lys Leu Ala Met Gln Lys His Tyr Phe Met Leu Arg 275 280
285 His Pro Lys Leu Ile Lys Leu Leu Gly Phe Ser Glu Ser Glu Lys Lys
290 295 300 Ser Tyr Phe Ser Tyr Phe Phe Gly Glu Lys Ser Lys Ala Leu
Lys Val 305 310 315 320 Phe Asn Phe Val Arg Asp Asn Gly Pro Leu Phe
Ile Leu Cys His Asn 325 330 335 Pro Phe Thr Cys Trp Leu Val Cys Thr
Cys Val Lys Gln Arg Leu Glu 340 345 350 Arg Gly Glu Asp Leu Glu Ile
Asn Ser Gln Asn Thr Thr Tyr Leu Tyr 355 360 365 Ala Ser Phe Leu Thr
Thr Val Phe Lys Ala Gly Ser Gln Ser Phe Pro 370 375 380 Pro Lys Val
Asn Arg Ala Arg Leu Lys Ser Leu Cys Ala Leu Ala Ala 385 390 395 400
Glu Gly Ile Trp Thr Tyr Thr Phe Val Phe Ser His Gly Asp Leu Arg 405
410 415 Arg Asn Gly Leu Ser Glu Ser Glu Gly Val Met Trp Val Gly Met
Arg 420 425 430 Leu Leu Gln Arg Arg Gly Asp Cys Phe Ala Phe Met His
Leu Cys Ile 435 440 445 Gln Glu Phe Cys Ala Ala Met Phe Tyr Leu Leu
Lys Arg Pro Lys Asp 450 455 460 Asp Pro Asn Pro Ala Ile Gly Ser Ile
Thr Gln Leu Val Arg Ala Ser 465 470 475 480 Val Val Gln Pro Gln Thr
Leu Leu Thr Gln Val Gly Ile Phe Met Phe 485 490 495 Gly Ile Ser Thr
Glu Glu Ile Val Ser Met Leu Glu Thr Ser Phe Gly 500 505 510 Phe Pro
Leu Ser Lys Asp Leu Lys Gln Glu Ile Thr Gln Cys Leu Glu 515 520 525
Ser Leu Ser Gln Cys Glu Ala Asp Arg Glu Ala Ile Ala Phe Gln Glu 530
535 540 Leu Phe Ile Gly Leu Phe Glu Thr Gln Glu Lys Glu Phe Val Thr
Lys 545 550 555 560 Val Met Asn Phe Phe Glu Glu Val Phe Ile Tyr Ile
Gly Asn Ile Glu 565 570 575 His Leu Val Ile Ala Ser Phe Cys Leu Lys
His Cys Gln His Leu Thr 580 585 590 Thr Leu Arg Met Cys Val Glu Asn
Ile Phe Pro Asp Asp Ser Gly Cys 595 600 605 Ile Ser Asp Tyr Asn Glu
Lys Leu Val Tyr Trp Arg Glu Leu Cys Ser 610 615 620 Met Phe Ile Thr
Asn Lys Asn Phe Gln Ile Leu Asp Met Glu Asn Thr 625 630 635 640 Ser
Leu Asp Asp Pro Ser Leu Ala Ile Leu Cys Lys Ala Leu Ala Gln 645 650
655 Pro Val Cys Lys Leu Arg Lys Leu Ile Phe Thr Ser Val Tyr Phe Gly
660 665 670 His Asp Ser Glu Leu Phe Lys Ala Val Leu His Asn Pro His
Leu Lys 675 680 685 Leu Leu Ser Leu Tyr Gly Thr Ser Leu Ser Gln Ser
Asp Ile Arg His 690 695 700 Leu Cys Glu Thr Leu Lys His Pro Met Cys
Lys Ile Glu Glu Leu Ile 705 710 715 720 Leu Gly Lys Cys Asp Ile Ser
Ser Glu Val Cys Glu Asp Ile Ala Ser 725 730 735 Val Leu Ala Cys Asn
Ser Lys Leu Lys His Leu Ser Leu Val Glu Asn 740 745 750 Pro Leu Arg
Asp Glu Gly Met Thr Leu Leu Cys Glu Ala Leu Lys His 755 760 765 Ser
His Cys Ala Leu Glu Arg Leu Met Leu Met Tyr Cys Cys Leu Thr 770 775
780 Ser Val Ser Cys Asp Ser Ile Ser Glu Val Leu Leu Cys Ser Lys Ser
785 790 795 800 Leu Ser Leu Leu Asp Leu Gly Ser Asn Ala Leu Glu Asp
Asn Gly Val 805 810 815 Ala Ser Leu Cys Ala Ala Leu Lys His Pro Gly
Cys Ser Ile Arg Glu 820 825 830 Leu Trp Leu Met Gly Cys Phe Leu Thr
Ser Asp Ser Cys Lys Asp Ile 835 840 845 Ala Ala Val Leu Ile Cys Asn
Gly Lys Leu Lys Thr Leu Lys Leu Gly 850 855 860 His Asn Glu Ile Gly
Asp Thr Gly Val Arg Gln Leu Cys Ala Ala Leu 865 870 875 880 Gln His
Pro His Cys Lys Leu Glu Cys Leu Gly Leu Gln Thr Cys Pro 885 890 895
Ile Thr Arg Ala Cys Cys Asp Asp Ile Ala Ala Ala Leu Ile Ala Cys 900
905 910 Lys Thr Leu Arg Ser Leu Asn Leu Asp Trp Ile Ala Leu Asp Ala
Asp 915 920 925 Ala Val Val Val Leu Cys Glu Ala Leu Ser His Pro Asp
Cys Ala Leu 930 935 940 Gln Met Leu Gly Leu His Lys Ser Gly Phe Asp
Glu Glu Thr Gln Lys 945 950 955 960 Ile Leu Met Ser Val Glu Glu Lys
Ile Pro His Leu Thr Ile Ser His 965 970 975 Gly Pro Trp Ile Asp Glu
Glu Tyr Lys Ile Arg Gly Val Leu Leu 980 985 990
15 655 PRT Homo sapiens 15
Met Ala Met Ala Lys Ala Arg Lys Pro Arg Glu Ala Leu Leu
Trp Ala 1 5 10 15 Leu Ser Asp Leu Glu Glu Asn Asp Phe Lys Lys Leu
Lys Phe Tyr Leu 20 25 30 Arg Asp Met Thr Leu Ser Glu Gly Gln Pro
Pro Leu Ala Arg Gly Glu 35 40 45 Leu Glu Gly Leu Ile Pro Val Asp
Leu Ala Glu Leu Leu Ile Ser Lys 50 55 60 Tyr Gly Glu Lys
Glu Ala Val Lys Val Val Leu Lys Gly Leu Lys Val 65 70 75 80 Met Asn
Leu Leu Glu Leu Val Asp Gln Leu Ser His Ile Cys Leu His 85 90 95
Asp Tyr Arg Glu Val Tyr Arg Glu His Val Arg Cys Leu Glu Glu Trp 100
105 110 Gln Glu Ala Gly Val Asn Gly Arg Tyr Asn Gln Val Leu Leu Val
Ala 115 120 125 Lys Pro Ser Ser Glu Ser Pro Glu Ser Leu Ala Cys Pro
Phe Pro Glu 130 135 140 Gln Glu Leu Glu Ser Val Thr Val Glu Ala Leu
Phe Asp Ser Gly Glu 145 150 155 160 Lys Pro Ser Leu Ala Pro Ser Leu
Val Val Leu Gln Gly Ser Ala Gly 165 170 175 Thr Gly Lys Thr Thr Leu
Ala Arg Lys Met Val Leu Asp Trp Ala Thr 180 185 190 Gly Thr Leu Tyr
Pro Gly Arg Phe Asp Tyr Val Phe Tyr Val Ser Cys 195 200 205 Lys Glu
Val Val Leu Leu Leu Glu Ser Lys Leu Glu Gln Leu Leu Phe 210 215 220
Trp Cys Cys Gly Asp Asn Gln Ala Pro Val Thr Glu Ile Leu Arg Gln 225
230 235 240 Pro Glu Arg Leu Leu Phe Ile Leu Asp Gly Phe Asp Glu Leu
Gln Arg 245 250 255 Pro Phe Glu Glu Lys Leu Lys Lys Arg Gly Leu Ser
Pro Lys Glu Ser 260 265 270 Leu Leu His Leu Leu Ile Arg Arg His Thr
Leu Pro Thr Cys Ser Leu 275 280 285 Leu Ile Thr Thr Arg Pro Leu Ala
Leu Arg Asn Leu Glu Pro Leu Leu 290 295 300 Lys Gln Ala Arg His Val
His Ile Leu Gly Phe Ser Glu Glu Glu Arg 305 310 315 320 Ala Arg Tyr
Phe Ser Ser Tyr Phe Thr Asp Glu Lys Gln Ala Asp Arg 325 330 335 Ala
Phe Asp Ile Val Gln Lys Asn Asp Ile Leu Tyr Lys Ala Cys Gln 340 345
350 Val Pro Gly Ile Cys Trp Val Val Cys Ser Trp Leu Gln Gly Gln Met
355 360 365 Glu Arg Gly Lys Val Val Leu Glu Thr Pro Arg Asn Ser Thr
Asp Ile 370 375 380 Phe Met Ala Tyr Val Ser Thr Phe Leu Pro Pro Asp
Asp Asp Gly Gly 385 390 395 400 Cys Ser Glu Leu Ser Arg His Arg Val
Leu Arg Ser Leu Cys Ser Leu 405 410 415 Ala Ala Glu Gly Ile Gln His
Gln Arg Phe Leu Phe Glu Glu Ala Glu 420 425 430 Leu Arg Lys His Asn
Leu Asp Gly Pro Arg Leu Ala Ala Phe Leu Ser 435 440 445 Ser Asn Asp
Tyr Gln Leu Gly Leu Ala Ile Lys Lys Phe Tyr Ser Phe 450 455 460 Arg
His Ile Ser Phe Gln Asp Phe Phe His Ala Met Ser Tyr Leu Val 465 470
475 480 Lys Glu Asp Gln Ser Arg Leu Gly Lys Glu Ser Arg Arg Glu Val
Gln 485 490 495 Arg Leu Leu Glu Val Lys Glu Gln Glu Gly Asn Asp Glu
Met Thr Leu 500 505 510 Thr Met Gln Phe Leu Leu Asp Ile Ser Lys Lys
Asp Ser Phe Ser Asn 515 520 525 Leu Glu Leu Lys Phe Cys Phe Arg Ile
Ser Pro Cys Leu Ala Gln Asp 530 535 540 Leu Lys His Phe Lys Glu Gln
Met Glu Ser Met Lys His Asn Arg Thr 545 550 555 560 Trp Asp Leu Glu
Phe Ser Leu Tyr Glu Ala Lys Ile Lys Asn Leu Val 565 570 575 Lys Gly
Ile Gln Met Asn Asn Val Ser Phe Lys Ile Lys His Ser Asn 580 585 590
Glu Lys Lys Ser Gln Ser Gln Asn Leu Phe Ser Val Lys Ser Ser Leu 595
600 605 Ser His Gly Pro Lys Glu Glu Gln Lys Cys Pro Ser Val His Gly
Gln 610 615 620 Lys Glu Gly Lys Asp Asn Ile Ala Gly Thr Gln Lys Glu
Ala Ser Thr 625 630 635 640 Gly Lys Gly Arg Gly Thr Glu Glu Thr Pro
Lys Asn Thr Tyr Ile 645 650 655
16 975 PRT Homo sapiens 16
Met Arg
Trp Gly His His Leu Pro Arg Ala Ser Trp Gly Ser Gly Phe 1 5 10 15
Arg Arg Ala Leu Gln Arg Pro Asp Asp Arg Ile Pro Phe Leu Ile His 20
25 30 Trp Ser Trp Pro Leu Gln Gly Glu Arg Pro Phe Gly Pro Pro Arg
Ala 35 40 45 Phe Ile Arg His His Gly Ser Ser Val Asp Ser Ala Pro
Pro Ser Gly 50 55 60 Arg His Gly Arg Leu Phe Pro Ser Ala Ser Ala
Thr Glu Ala Ile Gln 65 70 75 80 Arg His Arg Arg Asn Leu Ala Glu Trp
Phe Ser Arg Leu Pro Arg Glu 85 90 95 Glu Arg Gln Phe Gly Pro Thr
Phe Ala Leu Asp Thr Val His Val Asp 100 105 110 Pro Val Ile Arg Glu
Ser Thr Pro Asp Glu Leu Leu Arg Pro Pro Ala 115 120 125 Glu Leu Ala
Leu Glu His Gln Pro Pro Gln Ala Gly Leu Pro Pro Leu 130 135 140 Ala
Leu Ser Gln Leu Phe Asn Pro Asp Ala Cys Gly Arg Arg Val Gln 145 150
155 160 Thr Val Val Leu Tyr Gly Thr Val Gly Thr Gly Lys Ser Thr Leu
Val 165 170 175 Arg Lys Met Val Leu Asp Trp Cys Tyr Gly Arg Leu Pro
Ala Phe Glu 180 185 190 Leu Leu Ile Pro Phe Ser Cys Glu Asp Leu Ser
Ser Leu Gly Pro Ala 195 200 205 Pro Ala Ser Leu Cys Gln Leu Val Ala
Gln Arg Tyr Thr Pro Leu Lys 210 215 220 Glu Val Leu Pro Leu Met Ala
Ala Ala Gly Ser His Leu Leu Phe Val 225 230 235 240 Leu His Gly Leu
Glu His Leu Asn Leu Asp Phe Arg Leu Ala Gly Thr 245 250 255 Gly Leu
Cys Ser Asp Pro Glu Glu Pro Gln Glu Pro Ala Ala Ile Ile 260 265 270
Val Asn Leu Leu Arg Lys Tyr Met Leu Pro Gln Ala Ser Ile Leu Val 275
280 285 Thr Thr Arg Pro Ser Ala Ile Gly Arg Ile Pro Ser Lys Tyr Val
Gly 290 295 300 Arg Tyr Gly Glu Ile Cys Gly Phe Ser Asp Thr Asn Leu
Gln Lys Leu 305 310 315 320 Tyr Phe Gln Leu Arg Leu Asn Gln Pro Tyr
Cys Gly Tyr Ala Val Gly 325 330 335 Gly Ser Gly Val Ser Ala Thr Pro
Ala Gln Arg Asp His Leu Val Gln 340 345 350 Met Leu Ser Arg Asn Leu
Glu Gly His His Gln Ile Ala Ala Ala Cys 355 360 365 Phe Leu Pro Ser
Tyr Cys Trp Leu Val Cys Ala Thr Leu His Phe Leu 370 375 380 His Ala
Pro Thr Pro Ala Gly Gln Thr Leu Thr Ser Ile Tyr Thr Ser 385 390 395
400 Phe Leu Arg Leu Asn Phe Ser Gly Glu Thr Leu Asp Ser Thr Asp Pro
405 410 415 Ser Asn Leu Ser Leu Met Ala Tyr Ala Ala Arg Thr Met Gly
Lys Leu 420 425 430 Ala Tyr Glu Gly Val Ser Ser Arg Lys Thr Tyr Phe
Ser Glu Glu Asp 435 440 445 Val Cys Gly Cys Leu Glu Ala Gly Ile Arg
Thr Glu Glu Glu Phe Gln 450 455 460 Leu Leu His Ile Phe Arg Arg Asp
Ala Leu Arg Phe Phe Leu Ala Pro 465 470 475 480 Cys Val Glu Pro Gly
Arg Ala Gly Thr Phe Val Phe Thr Val Pro Ala 485 490 495 Met Gln Glu
Tyr Leu Ala Ala Leu Tyr Ile Val Leu Gly Leu Arg Lys 500 505 510 Thr
Thr Leu Gln Lys Val Gly Lys Glu Val Ala Glu Leu Val Gly Arg 515 520
525 Val Gly Glu Asp Val Ser Leu Val Leu Gly Ile Met Ala Lys Leu Leu
530 535 540 Pro Leu Arg Ala Leu Pro Leu Leu Phe Asn Leu Ile Lys Val
Val Pro 545 550 555 560 Arg Val Phe Gly Arg Met Val Gly Lys Ser Arg
Glu Ala Val Ala Gln 565 570 575 Ala Met Val Leu Glu Met Phe Arg Glu
Glu Asp Tyr Tyr Asn Asp Asp 580 585 590 Val Leu Asp Gln Met Gly Ala
Ser Ile Leu Gly Val Glu Gly Pro Arg 595 600 605 Arg His Pro Asp Glu
Pro Pro Glu Asp Glu Val Phe Glu Leu Phe Pro 610 615 620 Met Phe Met
Gly Gly Leu Leu Ser Ala His Asn Arg Ala Val Leu Ala 625 630 635 640
Gln Leu Gly Cys Pro Ile Lys Asn Leu Asp Ala Leu Glu Asn Ala Gln 645
650 655 Ala Ile Lys Lys Lys Leu Gly Lys Leu Gly Arg Gln Val Leu Pro
Pro 660 665 670 Ser Glu Leu Leu Asp His Leu Phe Phe His Tyr Glu Phe
Gln Asn Gln 675 680 685 Arg Phe Ser Ala Glu Val Leu Ser Ser Leu Arg
Gln Leu Asn Leu Ala 690 695 700 Gly Val Arg Met Thr Pro Val Lys Cys
Thr Val Val Ala Ala Val Leu 705 710 715 720 Gly Ser Gly Arg His Ala
Leu Asp Glu Val Asn Leu Ala Ser Cys Gln 725 730 735 Leu Asp Pro Ala
Gly Leu Arg Thr Leu Leu Pro Val Phe Leu Arg Ala 740 745 750 Arg Lys
Leu Gly Leu Gln Leu Asn Ser Leu Gly Pro Glu Ala Cys Lys 755 760 765
Asp Leu Arg Asp Leu Leu Leu His Asp Gln Cys Gln Ile Thr Thr Leu 770
775 780 Arg Leu Ser Asn Asn Pro Leu Thr Glu Ala Gly Val Ala Val Leu
Met 785 790 795 800 Glu Gly Leu Ala Gly Asn Thr Ser Val Thr His Leu
Ser Leu Leu His 805 810 815 Thr Gly Leu Gly Asp Glu Gly Leu Glu Leu
Leu Ala Ala Gln Leu Asp 820 825 830 Arg Asn Arg Gln Leu Gln Glu Leu
Asn Val Ala Tyr Asn Gly Ala Gly 835 840 845 Asp Thr Ala Ala Leu Ala
Leu Ala Arg Ala Ala Arg Glu His Pro Ser 850 855 860 Leu Glu Leu Leu
His Leu Tyr Phe Asn Glu Leu Ser Ser Glu Gly Arg 865 870 875 880 Gln
Val Leu Arg Asp Leu Gly Gly Ala Ala Glu Gly Gly Ala Arg Val 885 890
895 Val Val Ser Leu Thr Glu Gly Thr Ala Val Ser Glu Tyr Trp Ser Val
900 905 910 Ile Leu Ser Glu Val Gln Arg Asn Leu Asn Ser Trp Asp Arg
Ala Arg 915 920 925 Val Gln Arg His Leu Glu Leu Leu Leu Arg Asp Leu
Glu Asp Ser Arg 930 935 940 Gly Ala Thr Leu Asn Pro Trp Arg Lys Ala
Gln Leu Leu Arg Val Glu 945 950 955 960 Gly Glu Val Arg Ala Leu Leu
Glu Gln Leu Gly Ser Ser Gly Ser 965 970 975
17 1009 PRT Homo sapiens 17
Met Thr Ser Pro Gln Leu Glu Trp Thr Leu Gln Thr Leu Leu
Glu Gln 1 5 10 15 Leu Asn Glu Asp Glu Leu Lys Ser Phe Lys Ser Leu
Leu Trp Ala Phe 20 25 30 Pro Leu Glu Asp Val Leu Gln Lys Thr Pro
Trp Ser Glu Val Glu Glu 35 40 45 Ala Asp Gly Glu Lys Leu Ala Glu
Ile Leu Val Asn Thr Ser Ser Glu 50 55 60 Asn Trp Ile Arg Asn Ala
Thr Val Asn Ile Leu Glu Glu Met Asn Leu 65 70 75 80 Thr Glu Leu Cys
Lys Met Ala Lys Ala Glu Met Met Glu Asp Gly Gln 85 90 95 Val Gln
Glu Ile Asp Asn Pro Glu Leu Gly Asp Ala Glu Glu Asp Ser 100 105 110
Glu Leu Ala Lys Pro Gly Glu Lys Glu Gly Trp Arg Asn Ser Met Glu 115
120 125 Lys Gln Ser Leu Val Trp Lys Asn Thr Phe Trp Gln Gly Asp Ile
Asp 130 135 140 Asn Phe His Asp Asp Val Thr Leu Arg Asn Gln Arg Phe
Ile Pro Phe 145 150 155 160 Leu Asn Pro Arg Thr Pro Arg Lys Leu Thr
Pro Tyr Thr Val Val Leu 165 170 175 His Gly Pro Ala Gly Val Gly Lys
Thr Thr Leu Ala Lys Lys Cys Met 180 185 190 Leu Asp Trp Thr Asp Cys
Asn Leu Ser Pro Thr Leu Arg Tyr Ala Phe 195 200 205 Tyr Leu Ser Cys
Lys Glu Leu Ser Arg Met Gly Pro Cys Ser Phe Ala 210 215 220 Glu Leu
Ile Ser Lys Asp Trp Pro Glu Leu Gln Asp Asp Ile Pro Ser 225 230 235
240 Ile Leu Ala Gln Ala Gln Arg Ile Leu Phe Val Val Asp Gly Leu Asp
245 250 255 Glu Leu Lys Val Pro Pro Gly Ala Leu Ile Gln Asp Ile Cys
Gly Asp 260 265 270 Trp Glu Lys Lys Lys Pro Val Pro Val Leu Leu Gly
Ser Leu Leu Lys 275 280 285 Arg Lys Met Leu Pro Arg Ala Ala Leu Leu
Val Thr Thr Arg Pro Arg 290 295 300 Ala Leu Arg Asp Leu Gln Leu Leu
Ala Gln Gln Pro Ile Tyr Val Arg 305 310 315 320 Val Glu Gly Phe Leu
Glu Glu Asp Arg Arg Ala Tyr Phe Leu Arg His 325 330 335 Phe Gly Asp
Glu Asp Gln Ala Met Arg Ala Phe Glu Leu Met Arg Ser 340 345 350 Asn
Ala Ala Leu Phe Gln Leu Gly Ser Ala Pro Ala Val Cys Trp Ile 355 360
365 Val Cys Thr Thr Leu Lys Leu Gln Met Glu Lys Gly Glu Asp Pro Val
370 375 380 Pro Thr Cys Leu Thr Arg Thr Gly Leu Phe Leu Arg Phe Leu
Cys Ser 385 390 395 400 Arg Phe Pro Gln Gly Ala Gln Leu Arg Gly Ala
Leu Arg Thr Leu Ser 405 410 415 Leu Leu Ala Ala Gln Gly Leu Trp Ala
Gln Met Ser Val Phe His Arg 420 425 430 Glu Asp Leu Glu Arg Leu Gly
Val Gln Glu Ser Asp Leu Arg Leu Phe 435 440 445 Leu Asp Gly Asp Ile
Leu Arg Gln Asp Arg Val Ser Lys Gly Cys Tyr 450 455 460 Ser Phe Ile
His Leu Ser Phe Gln Gln Phe Leu Thr Ala Leu Phe Tyr 465 470 475 480
Ala Leu Glu Lys Glu Glu Gly Glu Asp Arg Asp Gly His Ala Trp Asp 485
490 495 Ile Gly Asp Val Gln Lys Leu Leu Ser Gly Glu Glu Arg Leu Lys
Asn 500 505 510 Pro Asp Leu Ile Gln Val Gly His Phe Leu Phe Gly Leu
Ala Asn Glu 515 520 525 Lys Arg Ala Lys Glu Leu Glu Ala Thr Phe Gly
Cys Arg Met Ser Pro 530 535 540 Asp Ile Lys Gln Glu Leu Leu Gln Cys
Lys Ala His Leu His Ala Asn 545 550 555 560 Lys Pro Leu Ser Val Thr
Asp Leu Lys Glu Val Leu Gly Cys Leu Tyr 565 570 575 Glu Ser Gln Glu
Glu Glu Leu Ala Lys Val Val Val Ala Pro Phe Lys 580 585 590 Glu Ile
Ser Ile His Leu Thr Asn Thr Ser Glu Val Met His Cys Ser 595 600 605
Phe Ser Leu Lys His Cys Gln Asp Leu Gln Lys Leu Ser Leu Gln Val 610
615 620 Ala Lys Gly Val Phe Leu Glu Asn Tyr Met Asp Phe Glu Leu Asp
Ile 625 630 635 640 Glu Phe Glu Ser Ser Asn Ser Asn Leu Lys Phe Leu
Glu Val Lys Gln 645 650 655 Ser Phe Leu Ser Asp Ser Ser Val Arg Ile
Leu Cys Asp His Val Thr 660 665 670 Arg Ser Thr Cys His Leu Gln Lys
Val Glu Ile Lys Asn Val Thr Pro 675 680 685 Asp Thr Ala Tyr Arg Asp
Phe Cys Leu Ala Phe Ile Gly Lys Lys Thr 690 695 700 Leu Thr His Leu
Thr Leu Ala Gly His Ile Glu Trp Glu Arg Thr Met 705 710 715 720 Met
Leu Met Leu Cys Asp Leu Leu Arg Asn His Lys Cys Asn Leu Gln 725 730
735 Tyr Leu Arg Leu Gly Gly His Cys Ala Thr Pro Glu Gln Trp Ala Glu
740 745 750 Phe Phe Tyr Val Leu Lys Ala Asn Gln Ser Leu Lys His Leu
Arg Leu 755 760 765 Ser Ala Asn Val Leu Leu Asp Glu Gly Ala Met Leu
Leu Tyr Lys Thr 770 775 780 Met Thr Arg Pro Lys His Phe Leu Gln Met
Leu Ser Leu Glu Asn Cys 785 790 795 800 Arg Leu Thr Glu Ala Ser Cys
Lys Asp Leu Ala Ala Val Leu Val Val 805 810 815 Ser Lys Lys Leu Thr
His Leu Cys Leu Ala Lys Asn Pro Ile Gly Asp 820 825 830 Thr Gly Val
Lys Phe Leu Cys Glu Gly Leu Ser Tyr Pro Asp Cys Lys 835 840 845 Leu
Gln Thr Leu Val Leu Gln Gln Cys Ser Ile Thr Lys Leu Gly Cys 850 855
860 Arg Tyr Leu Ser Glu Ala Leu Gln Glu Ala Cys Ser Leu Thr Asn Leu
865 870 875 880
Asp Leu Ser Ile Asn Gln Ile Ala Arg Gly Leu Trp Ile Leu Cys Gln
885 890 895 Ala Leu Glu Asn Pro Asn Cys Asn Leu Lys His Leu Arg Leu
Trp Ser 900 905 910 Cys Ser Leu Met Pro Phe Tyr Cys Gln His Leu Gly
Ser Ala Leu Leu 915 920 925 Ser Asn Gln Lys Leu Glu Thr Leu Asp Leu
Gly Gln Asn His Leu Trp 930 935 940 Lys Ser Gly Ile Ile Lys Leu Phe
Gly Val Leu Arg Gln Arg Thr Gly 945 950 955 960 Ser Leu Lys Ile Leu
Arg Leu Lys Thr Tyr Glu Thr Asn Leu Glu Ile 965 970 975 Lys Lys Leu
Leu Glu Glu Val Lys Glu Lys Asn Pro Lys Leu Thr Ile 980 985 990 Asp
Cys Asn Ala Ser Gly Ala Thr Ala Pro Pro Cys Cys Asp Phe Phe 995
1000 1005 Cys
18 1036 PRT Homo sapiens 18
Met Asn Phe Ser Val Ile
Thr Cys Pro Asn Gly Gly Thr Asn Gln Gly 1 5 10 15 Leu Leu Pro Tyr
Leu Met Ala Leu Asp Gln Tyr Gln Leu Glu Glu Phe 20 25 30 Lys Leu
Cys Leu Glu Pro Gln Gln Leu Met Asp Phe Trp Ser Ala Pro 35 40 45
Gln Gly His Phe Pro Arg Ile Pro Trp Ala Asn Leu Arg Ala Ala Asp 50
55 60 Pro Leu Asn Leu Ser Phe Leu Leu Asp Glu His Phe Pro Lys Gly
Gln 65 70 75 80 Ala Trp Lys Val Val Leu Gly Ile Phe Gln Thr Met Asn
Leu Thr Ser 85 90 95 Leu Cys Glu Lys Val Arg Ala Glu Met Lys Glu
Asn Val Gln Thr Gln 100 105 110 Glu Leu Gln Asp Pro Thr Gln Glu Asp
Leu Glu Met Leu Glu Ala Ala 115 120 125 Ala Gly Asn Met Gln Thr Gln
Gly Cys Gln Asp Pro Asn Gln Glu Glu 130 135 140 Leu Asp Glu Leu Glu
Glu Glu Thr Gly Asn Val Gln Ala Gln Gly Cys 145 150 155 160 Gln Asp
Pro Asn Gln Glu Glu Pro Glu Met Leu Glu Glu Ala Asp His 165 170 175
Arg Arg Lys Tyr Arg Glu Asn Met Lys Ala Glu Leu Leu Glu Thr Trp 180
185 190 Asp Asn Ile Ser Trp Pro Lys Asp His Val Tyr Ile Arg Asn Thr
Ser 195 200 205 Lys Asp Glu His Glu Glu Leu Gln Arg Leu Leu Asp Pro
Asn Arg Thr 210 215 220 Arg Ala Gln Ala Gln Thr Ile Val Leu Val Gly
Arg Ala Gly Val Gly 225 230 235 240 Lys Thr Thr Leu Ala Met Gln Ala
Met Leu His Trp Ala Asn Gly Val 245 250 255 Leu Phe Gln Gln Arg Phe
Ser Tyr Val Phe Tyr Leu Ser Cys His Lys 260 265 270 Ile Arg Tyr Met
Lys Glu Thr Thr Phe Ala Glu Leu Ile Ser Leu Asp 275 280 285 Trp Pro
Asp Phe Asp Ala Pro Ile Glu Glu Phe Met Ser Gln Pro Glu 290 295 300
Lys Leu Leu Phe Ile Ile Asp Gly Phe Glu Glu Ile Ile Ile Ser Glu 305
310 315 320 Ser Arg Ser Glu Ser Leu Asp Asp Gly Ser Pro Cys Thr Asp
Trp Tyr 325 330 335 Gln Glu Leu Pro Val Thr Lys Ile Leu His Ser Leu
Leu Lys Lys Glu 340 345 350 Leu Val Pro Leu Ala Thr Leu Leu Ile Thr
Ile Lys Thr Trp Phe Val 355 360 365 Arg Asp Leu Lys Ala Ser Leu Val
Asn Pro Cys Phe Val Gln Ile Thr 370 375 380 Gly Phe Thr Gly Asp Asp
Leu Arg Val Tyr Phe Met Arg His Phe Asp 385 390 395 400 Asp Ser Ser
Glu Val Glu Lys Ile Leu Gln Gln Leu Arg Lys Asn Glu 405 410 415 Thr
Leu Phe His Ser Cys Ser Ala Pro Met Val Cys Trp Thr Val Cys 420 425
430 Ser Cys Leu Lys Gln Pro Lys Val Arg Tyr Tyr Asp Leu Gln Ser Ile
435 440 445 Thr Gln Thr Thr Thr Ser Leu Tyr Ala Tyr Phe Phe Ser Asn
Leu Phe 450 455 460 Ser Thr Ala Glu Val Asp Leu Ala Asp Asp Ser Trp
Pro Gly Gln Trp 465 470 475 480 Arg Ala Leu Cys Ser Leu Ala Ile Glu
Gly Leu Trp Ser Met Asn Phe 485 490 495 Thr Phe Asn Lys Glu Asp Thr
Glu Ile Glu Gly Leu Glu Val Pro Phe 500 505 510 Ile Asp Ser Leu Tyr
Glu Phe Asn Ile Leu Gln Lys Ile Asn Asp Cys 515 520 525 Gly Gly Cys
Thr Thr Phe Thr His Leu Ser Phe Gln Glu Phe Phe Ala 530 535 540 Ala
Met Ser Phe Val Leu Glu Glu Pro Arg Glu Phe Pro Pro His Ser 545 550
555 560 Thr Lys Pro Gln Glu Met Lys Met Leu Leu Gln His Val Leu Leu
Asp 565 570 575 Lys Glu Ala Tyr Trp Thr Pro Val Val Leu Phe Phe Phe
Gly Leu Leu 580 585 590 Asn Lys Asn Ile Ala Arg Glu Leu Glu Asp Thr
Leu His Cys Lys Ile 595 600 605 Ser Pro Arg Val Met Glu Glu Leu Leu
Lys Trp Gly Glu Glu Leu Gly 610 615 620 Lys Ala Glu Ser Ala Ser Leu
Gln Phe His Ile Leu Arg Leu Phe His 625 630 635 640 Cys Leu His Glu
Ser Gln Glu Glu Asp Phe Thr Lys Lys Met Leu Gly 645 650 655 Arg Ile
Phe Glu Val Asp Leu Asn Ile Leu Glu Asp Glu Glu Leu Gln 660 665 670
Ala Ser Ser Phe Cys Leu Lys His Cys Lys Arg Leu Asn Lys Leu Arg 675
680 685 Leu Ser Val Ser Ser His Ile Leu Glu Arg Asp Leu Glu Ile Leu
Glu 690 695 700 Thr Ser Lys Phe Asp Ser Arg Met His Ala Trp Asn Ser
Ile Cys Ser 705 710 715 720 Thr Leu Val Thr Asn Glu Asn Leu His Glu
Leu Asp Leu Ser Asn Ser 725 730 735 Lys Leu His Ala Ser Ser Val Lys
Gly Leu Cys Leu Ala Leu Lys Asn 740 745 750 Pro Arg Cys Lys Val Gln
Lys Leu Thr Cys Lys Ser Val Thr Pro Glu 755 760 765 Trp Val Leu Gln
Asp Leu Ile Ile Ala Leu Gln Gly Asn Ser Lys Leu 770 775 780 Thr His
Leu Asn Phe Ser Ser Asn Lys Leu Gly Met Thr Val Pro Leu 785 790 795
800 Ile Leu Lys Ala Leu Arg His Ser Ala Cys Asn Leu Lys Tyr Leu Cys
805 810 815 Leu Glu Lys Cys Asn Leu Ser Ala Ala Ser Cys Gln Asp Leu
Ala Leu 820 825 830 Phe Leu Thr Ser Ile Gln His Val Thr Arg Leu Cys
Leu Gly Phe Asn 835 840 845 Arg Leu Gln Asp Asp Gly Ile Lys Leu Leu
Cys Ala Ala Leu Thr His 850 855 860 Pro Lys Cys Ala Leu Glu Arg Leu
Glu Leu Trp Phe Cys Gln Leu Ala 865 870 875 880 Ala Pro Ala Cys Lys
His Leu Ser Asp Ala Leu Leu Gln Asn Arg Ser 885 890 895 Leu Thr His
Leu Asn Leu Ser Lys Asn Ser Leu Arg Asp Glu Gly Val 900 905 910 Lys
Phe Leu Cys Glu Ala Leu Gly Arg Pro Asp Gly Asn Leu Gln Ser 915 920
925 Leu Asn Leu Ser Gly Cys Ser Phe Thr Arg Glu Gly Cys Gly Glu Leu
930 935 940 Ala Asn Ala Leu Ser His Asn His Asn Val Lys Ile Leu Asp
Leu Gly 945 950 955 960 Glu Asn Asp Leu Gln Asp Asp Gly Val Lys Leu
Leu Cys Glu Ala Leu 965 970 975 Lys Pro His Arg Ala Leu His Thr Leu
Gly Leu Ala Lys Cys Asn Leu 980 985 990 Thr Thr Ala Cys Cys Gln His
Leu Phe Ser Val Leu Ser Ser Ser Lys 995 1000 1005 Ser Leu Val Asn
Leu Asn Leu Leu Gly Asn Glu Leu Asp Thr Asp 1010 1015 1020 Gly Val
Lys Met Leu Cys Phe Lys Lys Thr Cys Thr Met 1025 1030 1035
19 1048 PRT Homo sapiens 19
Met Ser Asp Val Asn Pro Pro Ser Asp Thr Pro Ile
Pro Phe Ser Ser 1 5 10 15 Ser Ser Thr His Ser Ser His Ile Pro Pro
Trp Thr Phe Ser Cys Tyr 20 25 30 Pro Gly Ser Pro Cys Glu Asn Gly
Val Met Leu Tyr Met Arg Asn Val 35 40 45 Ser His Glu Glu Leu Gln
Arg Phe Lys Gln Leu Leu Leu Thr Glu Leu 50 55 60 Ser Thr Gly Thr
Met Pro Ile Thr Trp Asp Gln Val Glu Thr Ala Ser 65 70 75 80 Trp Ala
Glu Val Val His Leu Leu Ile Glu Arg Phe Pro Gly Arg Arg 85 90 95
Ala Trp Asp Val Thr Ser Asn Ile Phe Ala Ile Met Asn Cys Asp Lys 100
105 110 Met Cys Val Val Val Arg Arg Glu Ile Asn Ala Ile Leu Pro Thr
Leu 115 120 125 Glu Pro Glu Asp Leu Asn Val Gly Glu Thr Gln Val Asn
Leu Glu Glu 130 135 140 Gly Glu Ser Gly Lys Ile Arg Arg Tyr Lys Ser
Asn Val Met Glu Lys 145 150 155 160 Phe Phe Pro Ile Trp Asp Ile Thr
Thr Trp Pro Gly Asn Gln Arg Asp 165 170 175 Phe Phe Tyr Gln Gly Val
His Arg His Glu Glu Tyr Leu Pro Cys Leu 180 185 190 Leu Leu Pro Lys
Arg Pro Gln Gly Arg Gln Pro Lys Thr Val Ala Ile 195 200 205 Gln Gly
Ala Pro Gly Ile Gly Lys Thr Ile Leu Ala Lys Lys Val Met 210 215 220
Phe Glu Trp Ala Arg Asn Lys Phe Tyr Ala His Lys Arg Trp Cys Ala 225
230 235 240 Phe Tyr Phe His Cys Gln Glu Val Asn Gln Thr Thr Asp Gln
Ser Phe 245 250 255 Ser Glu Leu Ile Glu Gln Lys Trp Pro Gly Ser Gln
Asp Leu Val Ser 260 265 270 Lys Ile Met Ser Lys Pro Asp Gln Leu Leu
Leu Leu Leu Asp Gly Phe 275 280 285 Glu Glu Leu Thr Ser Thr Leu Ile
Asp Arg Leu Glu Asp Leu Ser Glu 290 295 300 Asp Trp Arg Gln Lys Leu
Pro Gly Ser Val Leu Leu Ser Ser Leu Leu 305 310 315 320 Ser Lys Thr
Met Leu Pro Glu Ala Thr Leu Leu Ile Met Ile Arg Phe 325 330 335 Thr
Ser Trp Gln Thr Cys Lys Pro Leu Leu Lys Cys Pro Ser Leu Val 340 345
350 Thr Leu Pro Gly Phe Asn Thr Met Glu Lys Ile Lys Tyr Phe Gln Met
355 360 365 Tyr Phe Gly His Thr Glu Glu Gly Asp Gln Val Leu Ser Phe
Ala Met 370 375 380 Glu Asn Thr Ile Leu Phe Ser Met Cys Arg Val Pro
Val Val Cys Trp 385 390 395 400 Met Val Cys Ser Gly Leu Lys Gln Gln
Met Glu Arg Gly Asn Asn Leu 405 410 415 Thr Gln Ser Cys Pro Asn Ala
Thr Ser Val Phe Val Arg Tyr Ile Ser 420 425 430 Ser Leu Phe Pro Thr
Arg Ala Glu Asn Phe Ser Arg Lys Ile His Gln 435 440 445 Ala Gln Leu
Glu Gly Leu Cys His Leu Ala Ala Asp Ser Met Trp His 450 455 460 Arg
Lys Trp Val Leu Gly Lys Glu Asp Leu Glu Glu Ala Lys Leu Asp 465 470
475 480 Gln Thr Gly Val Thr Ala Phe Leu Gly Met Ser Ile Leu Arg Arg
Ile 485 490 495 Ala Gly Glu Glu Asp His Tyr Val Phe Thr Leu Val Thr
Phe Gln Glu 500 505 510 Phe Phe Ala Ala Leu Phe Tyr Val Leu Cys Phe
Pro Gln Arg Leu Lys 515 520 525 Asn Phe His Val Leu Ser His Val Asn
Ile Gln Arg Leu Ile Ala Ser 530 535 540 Pro Arg Gly Ser Lys Ser Tyr
Leu Ser His Met Gly Leu Phe Leu Phe 545 550 555 560 Gly Phe Leu Asn
Glu Ala Cys Ala Ser Ala Val Glu Gln Ser Phe Gln 565 570 575 Cys Lys
Val Ser Phe Gly Asn Lys Arg Lys Leu Leu Lys Val Ile Pro 580 585 590
Leu Leu His Lys Cys Asp Pro Pro Ser Pro Gly Ser Gly Val Pro Gln 595
600 605 Leu Phe Tyr Cys Leu His Glu Ile Arg Glu Glu Ala Phe Val Ser
Gln 610 615 620 Ala Leu Asn Asp Tyr His Lys Val Val Leu Arg Ile Gly
Asn Asn Lys 625 630 635 640 Glu Val Gln Val Ser Ala Phe Cys Leu Lys
Arg Cys Gln Tyr Leu His 645 650 655 Glu Val Glu Leu Thr Val Thr Leu
Asn Phe Met Asn Val Trp Lys Leu 660 665 670 Ser Ser Ser Ser His Pro
Gly Ser Glu Ala Pro Glu Ser Asn Gly Leu 675 680 685 His Arg Trp Trp
Gln Asp Leu Cys Ser Val Phe Ala Thr Asn Asp Lys 690 695 700 Leu Glu
Val Leu Thr Met Thr Asn Ser Val Leu Gly Pro Pro Phe Leu 705 710 715
720 Lys Ala Leu Ala Ala Ala Leu Arg His Pro Gln Cys Lys Leu Gln Lys
725 730 735 Leu Leu Leu Arg Arg Val Asn Ser Thr Met Leu Asn Gln Asp
Leu Ile 740 745 750 Gly Val Leu Thr Gly Asn Gln His Leu Arg Tyr Leu
Glu Ile Gln His 755 760 765 Val Glu Val Glu Ser Lys Ala Val Lys Leu
Leu Cys Arg Val Leu Arg 770 775 780 Ser Pro Arg Cys Arg Leu Gln Cys
Leu Arg Leu Glu Asp Cys Leu Ala 785 790 795 800 Thr Pro Arg Ile Trp
Thr Asp Leu Gly Asn Asn Leu Gln Gly Asn Gly 805 810 815 His Leu Lys
Thr Leu Ile Leu Arg Lys Asn Ser Leu Glu Asn Cys Gly 820 825 830 Ala
Tyr Tyr Leu Ser Val Ala Gln Leu Glu Arg Leu Ser Ile Glu Asn 835 840
845 Cys Asn Leu Thr Gln Leu Thr Cys Glu Ser Leu Ala Ser Cys Leu Arg
850 855 860 Gln Ser Lys Met Leu Thr His Leu Ser Leu Ala Glu Asn Ala
Leu Lys 865 870 875 880 Asp Glu Gly Ala Lys His Ile Trp Asn Ala Leu
Pro His Leu Arg Cys 885 890 895 Pro Leu Gln Arg Leu Val Leu Arg Lys
Cys Asp Leu Thr Phe Asn Cys 900 905 910 Cys Gln Asp Met Ile Ser Ala
Leu Cys Lys Asn Lys Thr Leu Lys Ser 915 920 925 Leu Asp Leu Ser Phe
Asn Ser Leu Lys Asp Asp Gly Val Ile Leu Leu 930 935 940 Cys Glu Ala
Leu Lys Asn Pro Asp Cys Thr Leu Gln Ile Leu Glu Leu 945 950 955 960
Glu Asn Cys Leu Phe Thr Ser Ile Cys Cys Gln Ala Met Ala Ser Met 965
970 975 Leu Arg Lys Asn Gln His Leu Arg His Leu Asp Leu Ser Lys Asn
Ala 980 985 990 Ile Gly Val Tyr Gly Ile Leu Thr Leu Cys Glu Ala Phe
Ser Ser Gln 995 1000 1005 Lys Lys Arg Glu Glu Val Ile Phe Cys Ile
Pro Ala Trp Thr Arg 1010 1015 1020 Ile Thr Ser Phe Ser Pro Thr Pro
His Pro Pro Asp Phe Thr Gly 1025 1030 1035 Lys Ser Asp Cys Leu Ser
Gln Ile Asn Pro 1040 1045
20 1033 PRT Homo sapiens 20
Met Ala Glu
Ser Asp Ser Thr Asp Phe Asp Leu Leu Trp Tyr Leu Glu 1 5 10 15 Asn
Leu Ser Asp Lys Glu Phe Gln Ser Phe Lys Lys Tyr Leu Ala Arg 20 25
30 Lys Ile Leu Asp Phe Lys Leu Pro Gln Phe Pro Leu Ile Gln Met Thr
35 40 45 Lys Glu Glu Leu Ala Asn Val Leu Pro Ile Ser Tyr Glu Gly
Gln Tyr 50 55 60 Ile Trp Asn Met Leu Phe Ser Ile Phe Ser Met Met
Arg Lys Glu Asp 65 70 75 80 Leu Cys Arg Lys Ile Ile Gly Arg Arg Asn
Arg Asn Gln Glu Ala Cys 85 90 95 Lys Ala Val Met Arg Arg Lys Phe
Met Leu Gln Trp Glu Ser His Thr 100 105 110 Phe Gly Lys Phe His Tyr
Lys Phe Phe Arg Asp Val Ser Ser Asp Val 115 120 125 Phe Tyr Ile Leu
Gln Leu Ala Tyr Asp Ser Thr Ser Tyr Tyr Ser Ala 130 135 140 Asn Asn
Leu Asn Val Phe Leu Met Gly Glu Arg Ala Ser Gly Lys Thr 145 150 155
160 Ile Val Ile Asn Leu Ala Val Leu Arg Trp Ile Lys Gly Glu Met Trp
165 170 175 Gln Asn Met Ile Ser Tyr Val Val His Leu Thr Ser His Glu
Ile Asn 180 185 190 Gln Met Thr Asn Ser Ser Leu Ala Glu Leu Ile Ala
Lys Asp Trp Pro 195 200 205 Asp Gly Gln Ala Pro Ile Ala Asp Ile Leu
Ser Asp Pro Lys Lys Leu 210 215 220
Leu Phe Ile Leu Glu Asp Leu Asp Asn Ile Arg Phe Glu Leu Asn Val
225 230 235 240 Asn Glu Ser Ala Leu Cys Ser Asn Ser Thr Gln Lys Val
Pro Ile Pro 245 250 255 Val Leu Leu Val Ser Leu Leu Lys Arg Lys Met
Ala Pro Gly Cys Trp 260 265 270 Phe Leu Ile Ser Ser Arg Pro Thr Arg
Gly Asn Asn Val Lys Thr Phe 275 280 285 Leu Lys Glu Val Asp Cys Cys
Thr Thr Leu Gln Leu Ser Asn Gly Lys 290 295 300 Arg Glu Ile Tyr Phe
Asn Ser Phe Phe Lys Asp Arg Gln Arg Ala Ser 305 310 315 320 Ala Ala
Leu Gln Leu Val His Glu Asp Glu Ile Leu Val Gly Leu Cys 325 330 335
Arg Val Ala Ile Leu Cys Trp Ile Thr Cys Thr Val Leu Lys Arg Gln 340
345 350 Met Asp Lys Gly Arg Asp Phe Gln Leu Cys Cys Gln Thr Pro Thr
Asp 355 360 365 Leu His Ala His Phe Leu Ala Asp Ala Leu Thr Ser Glu
Ala Gly Leu 370 375 380 Thr Ala Asn Gln Tyr His Leu Gly Leu Leu Lys
Arg Leu Cys Leu Leu 385 390 395 400 Ala Ala Gly Gly Leu Phe Leu Ser
Thr Leu Asn Phe Ser Gly Glu Asp 405 410 415 Leu Arg Cys Val Gly Phe
Thr Glu Ala Asp Val Ser Val Leu Gln Ala 420 425 430 Ala Asn Ile Leu
Leu Pro Ser Asn Thr His Lys Asp Arg Tyr Lys Phe 435 440 445 Ile His
Leu Asn Val Gln Glu Phe Cys Thr Ala Ile Ala Phe Leu Met 450 455 460
Ala Val Pro Asn Tyr Leu Ile Pro Ser Gly Ser Arg Glu Tyr Lys Glu 465
470 475 480 Lys Arg Glu Gln Tyr Ser Asp Phe Asn Gln Val Phe Thr Phe
Ile Phe 485 490 495 Gly Leu Leu Asn Ala Asn Arg Arg Lys Ile Leu Glu
Thr Ser Phe Gly 500 505 510 Tyr Gln Leu Pro Met Val Asp Ser Phe Lys
Trp Tyr Ser Val Gly Tyr 515 520 525 Met Lys His Leu Asp Arg Asp Pro
Glu Lys Leu Thr His His Met Pro 530 535 540 Leu Phe Tyr Cys Leu Tyr
Glu Asn Arg Glu Glu Glu Phe Val Lys Thr 545 550 555 560 Ile Val Asp
Ala Leu Met Glu Val Thr Val Tyr Leu Gln Ser Asp Lys 565 570 575 Asp
Met Met Val Ser Leu Tyr Cys Leu Asp Tyr Cys Cys His Leu Arg 580 585
590 Thr Leu Lys Leu Ser Val Gln Arg Ile Phe Gln Asn Lys Glu Pro Leu
595 600 605 Ile Arg Pro Thr Ala Ser Gln Met Lys Ser Leu Val Tyr Trp
Arg Glu 610 615 620 Ile Cys Ser Leu Phe Tyr Thr Met Glu Ser Leu Arg
Glu Leu His Ile 625 630 635 640 Phe Asp Asn Asp Leu Asn Gly Ile Ser
Glu Arg Ile Leu Ser Lys Ala 645 650 655 Leu Glu His Ser Ser Cys Lys
Leu Arg Thr Leu Lys Leu Ser Tyr Val 660 665 670 Ser Thr Ala Ser Gly
Phe Glu Asp Leu Leu Lys Ala Leu Ala Arg Asn 675 680 685 Arg Ser Leu
Thr Tyr Leu Ser Ile Asn Cys Thr Ser Ile Ser Leu Asn 690 695 700 Met
Phe Ser Leu Leu His Asp Ile Leu His Glu Pro Thr Cys Gln Ile 705 710
715 720 Ser His Leu Ser Leu Met Lys Cys Asp Leu Arg Ala Ser Glu Cys
Glu 725 730 735 Glu Ile Ala Ser Leu Leu Ile Ser Gly Gly Ser Leu Arg
Lys Leu Thr 740 745 750 Leu Ser Ser Asn Pro Leu Arg Ser Asp Gly Met
Asn Ile Leu Cys Asp 755 760 765 Ala Leu Leu His Pro Asn Cys Thr Leu
Ile Ser Leu Val Leu Val Phe 770 775 780 Cys Cys Leu Thr Glu Asn Cys
Cys Ser Ala Leu Gly Arg Val Leu Leu 785 790 795 800 Phe Ser Pro Thr
Leu Arg Gln Leu Asp Leu Cys Val Asn Arg Leu Lys 805 810 815 Asn Tyr
Gly Val Leu His Val Thr Phe Pro Leu Leu Phe Pro Thr Cys 820 825 830
Gln Leu Glu Glu Leu His Leu Ser Gly Cys Phe Phe Ser Ser Asp Ile 835
840 845 Cys Gln Tyr Ile Ala Ile Val Ile Ala Thr Asn Glu Lys Leu Arg
Ser 850 855 860 Leu Glu Ile Gly Ser Asn Lys Ile Glu Asp Ala Gly Met
Gln Leu Leu 865 870 875 880 Cys Gly Gly Leu Arg His Pro Asn Cys Met
Leu Val Asn Ile Gly Leu 885 890 895 Glu Glu Cys Met Leu Thr Ser Ala
Cys Cys Arg Ser Leu Ala Ser Val 900 905 910 Leu Thr Thr Asn Lys Thr
Leu Glu Arg Leu Asn Leu Leu Gln Asn His 915 920 925 Leu Gly Asn Asp
Gly Val Ala Lys Leu Leu Glu Ser Leu Ile Ser Pro 930 935 940 Asp Cys
Val Leu Lys Val Val Gly Leu Pro Leu Thr Gly Leu Asn Thr 945 950 955
960 Gln Thr Gln Gln Leu Leu Met Thr Val Lys Glu Arg Lys Pro Ser Leu
965 970 975 Ile Phe Leu Ser Glu Thr Trp Ser Leu Lys Glu Gly Arg Glu
Ile Gly 980 985 990 Val Thr Pro Ala Ser Gln Pro Gly Ser Ile Ile Pro
Asn Ser Asn Leu 995 1000 1005 Asp Tyr Met Phe Phe Lys Phe Pro Arg
Met Ser Ala Ala Met Arg 1010 1015 1020 Thr Ser Asn Thr Ala Ser Arg
Gln Pro Leu 1025 1030
21 975 PRT Homo sapiens 21
Met Arg Trp Gly
His His Leu Pro Arg Ala Ser Trp Gly Ser Gly Phe 1 5 10 15 Arg Arg
Ala Leu Gln Arg Pro Asp Asp Arg Ile Pro Phe Leu Ile His 20 25 30
Trp Ser Trp Pro Leu Gln Gly Glu Arg Pro Phe Gly Pro Pro Arg Ala 35
40 45 Phe Ile Arg His His Gly Ser Ser Val Asp Ser Ala Pro Pro Pro
Gly 50 55 60 Arg His Gly Arg Leu Phe Pro Ser Ala Ser Ala Thr Glu
Ala Ile Gln 65 70 75 80 Arg His Arg Arg Asn Leu Ala Glu Trp Phe Ser
Arg Leu Pro Arg Glu 85 90 95 Glu Arg Gln Phe Gly Pro Thr Phe Ala
Leu Asp Thr Val His Val Asp 100 105 110 Pro Val Ile Arg Glu Ser Thr
Pro Asp Glu Leu Leu Arg Pro Pro Ala 115 120 125 Glu Leu Ala Leu Glu
His Gln Pro Pro Gln Ala Gly Leu Pro Pro Leu 130 135 140 Ala Leu Ser
Gln Leu Phe Asn Pro Asp Ala Cys Gly Arg Arg Val Gln 145 150 155 160
Thr Val Val Leu Tyr Gly Thr Val Gly Thr Gly Lys Ser Thr Leu Val 165
170 175 Arg Lys Met Val Leu Asp Trp Cys Tyr Gly Arg Leu Pro Ala Phe
Glu 180 185 190 Leu Leu Ile Pro Phe Ser Cys Glu Asp Leu Ser Ser Leu
Gly Pro Ala 195 200 205 Pro Ala Ser Leu Cys Gln Leu Val Ala Gln Arg
Tyr Thr Pro Leu Lys 210 215 220 Glu Val Leu Pro Leu Met Ala Ala Ala
Gly Ser His Leu Leu Phe Val 225 230 235 240 Leu His Gly Leu Glu His
Leu Asn Leu Asp Phe Arg Leu Ala Gly Thr 245 250 255 Gly Leu Cys Ser
Asp Pro Glu Glu Pro Gln Glu Pro Ala Ala Ile Ile 260 265 270 Val Asn
Leu Leu Arg Lys Tyr Met Leu Pro Gln Ala Ser Ile Leu Val 275 280 285
Thr Thr Arg Pro Ser Ala Ile Gly Arg Ile Pro Ser Lys Tyr Val Gly 290
295 300 Arg Tyr Gly Glu Ile Cys Gly Phe Ser Asp Thr Asn Leu Gln Lys
Leu 305 310 315 320 Tyr Phe Gln Leu Arg Leu Asn Gln Pro Tyr Cys Gly
Tyr Ala Val Gly 325 330 335 Gly Ser Gly Val Ser Ala Thr Pro Ala Gln
Arg Asp His Leu Val Gln 340 345 350 Met Leu Ser Arg Asn Leu Glu Gly
His His Gln Ile Ala Ala Ala Cys 355 360 365 Phe Leu Pro Ser Tyr Cys
Trp Leu Val Cys Ala Thr Leu His Phe Leu 370 375 380 His Ala Pro Thr
Pro Ala Gly Gln Thr Leu Thr Ser Ile Tyr Thr Ser 385 390 395 400 Phe
Leu Arg Leu Asn Phe Ser Gly Glu Thr Leu Asp Ser Thr Asp Pro 405 410
415 Ser Asn Leu Ser Leu Met Ala Tyr Ala Ala Arg Thr Met Gly Lys Leu
420 425 430 Ala Tyr Glu Gly Val Ser Ser Arg Lys Thr Tyr Phe Ser Glu
Glu Asp 435 440 445 Val Cys Gly Cys Leu Glu Ala Gly Ile Arg Thr Glu
Glu Glu Phe Gln 450 455 460 Leu Leu His Ile Phe Arg Arg Asp Ala Leu
Arg Phe Phe Leu Ala Pro 465 470 475 480 Cys Val Glu Pro Gly Arg Ala
Gly Thr Phe Val Phe Thr Val Pro Ala 485 490 495 Met Gln Glu Tyr Leu
Ala Ala Leu Tyr Ile Val Leu Gly Leu Arg Lys 500 505 510 Thr Thr Leu
Gln Lys Val Gly Lys Glu Val Ala Glu Leu Val Gly Arg 515 520 525 Val
Gly Glu Asp Val Ser Leu Val Leu Gly Ile Met Ala Lys Leu Leu 530 535
540 Pro Leu Arg Ala Leu Pro Leu Leu Phe Asn Leu Ile Lys Val Val Pro
545 550 555 560 Arg Val Phe Gly Arg Met Val Gly Lys Ser Arg Glu Ala
Val Ala Gln 565 570 575 Ala Met Val Leu Glu Met Phe Arg Glu Glu Asp
Tyr Tyr Asn Asp Asp 580 585 590 Val Leu Asp Gln Met Gly Ala Ser Ile
Leu Gly Val Glu Gly Pro Arg 595 600 605 Arg His Pro Asp Glu Pro Pro
Glu Asp Glu Val Phe Glu Leu Phe Pro 610 615 620 Met Phe Met Gly Gly
Leu Leu Ser Ala His Asn Arg Ala Val Leu Ala 625 630 635 640 Gln Leu
Gly Cys Pro Ile Lys Asn Leu Asp Ala Leu Glu Asn Ala Gln 645 650 655
Ala Ile Lys Lys Lys Leu Gly Lys Leu Gly Arg Gln Val Leu Pro Pro 660
665 670 Ser Glu Leu Leu Asp His Leu Phe Phe His Tyr Glu Phe Gln Asn
Gln 675 680 685 Arg Phe Ser Ala Glu Val Leu Ser Ser Leu Arg Gln Leu
Asn Leu Ala 690 695 700 Gly Val Arg Met Thr Pro Val Lys Cys Thr Val
Val Ala Ala Val Leu 705 710 715 720 Gly Ser Gly Arg His Ala Leu Asp
Glu Val Asn Leu Ala Ser Cys Gln 725 730 735 Leu Asp Pro Ala Gly Leu
Arg Thr Leu Leu Pro Val Phe Leu Arg Ala 740 745 750 Arg Lys Leu Gly
Leu Gln Leu Asn Ser Leu Gly Pro Glu Ala Cys Lys 755 760 765 Asp Leu
Arg Asp Leu Leu Leu His Asp Gln Cys Gln Ile Thr Thr Leu 770 775 780
Arg Leu Ser Asn Asn Pro Leu Thr Ala Ala Gly Val Ala Val Leu Met 785
790 795 800 Glu Gly Leu Ala Gly Asn Thr Ser Val Thr His Leu Ser Leu
Leu His 805 810 815 Thr Gly Leu Gly Asp Glu Gly Leu Glu Leu Leu Ala
Ala Gln Leu Asp 820 825 830 Arg Asn Arg Gln Leu Gln Glu Leu Asn Val
Ala Tyr Asn Gly Ala Gly 835 840 845 Asp Thr Ala Ala Leu Ala Leu Ala
Arg Ala Ala Arg Glu His Pro Ser 850 855 860 Leu Glu Leu Leu His Leu
Tyr Phe Asn Glu Leu Ser Ser Glu Gly Arg 865 870 875 880 Gln Val Leu
Arg Asp Leu Gly Gly Ala Ala Glu Gly Gly Ala Arg Val 885 890 895 Val
Val Ser Leu Thr Glu Gly Thr Ala Val Ser Glu Tyr Trp Ser Val 900 905
910 Ile Leu Ser Glu Val Gln Arg Asn Leu Asn Ser Trp Asp Arg Ala Arg
915 920 925 Val Gln Arg His Leu Glu Leu Leu Leu Arg Asp Leu Glu Asp
Ser Arg 930 935 940 Gly Ala Thr Leu Asn Pro Trp Arg Lys Ala Gln Leu
Leu Arg Val Glu 945 950 955 960 Gly Glu Val Arg Ala Leu Leu Glu Gln
Leu Gly Ser Ser Gly Ser 965 970 975
22 1866 PRT Homo sapiens 22
Met Asp Pro Val Gly Leu Gln Leu Gly Asn Lys Asn Leu Trp Ser Cys 1 5 10
15 Leu Val Arg Leu Leu Thr Lys Asp Pro Glu Trp Leu Asn Ala Lys Met
20 25 30 Lys Phe Phe Leu Pro Asn Thr Asp Leu Asp Ser Arg Asn Glu
Thr Leu 35 40 45 Asp Pro Glu Gln Arg Val Ile Leu Gln Leu Asn Lys
Leu His Val Gln 50 55 60 Gly Ser Asp Thr Trp Gln Ser Phe Ile His
Cys Val Cys Met Gln Leu 65 70 75 80 Glu Val Pro Leu Asp Leu Glu Val
Leu Leu Leu Ser Thr Phe Gly Tyr 85 90 95 Asp Asp Gly Phe Thr Ser
Gln Leu Gly Ala Glu Gly Lys Ser Gln Pro 100 105 110 Glu Ser Gln Leu
His His Gly Leu Lys Arg Pro His Gln Ser Cys Gly 115 120 125 Ser Ser
Pro Arg Arg Lys Gln Cys Lys Lys Gln Gln Leu Glu Leu Ala 130 135 140
Lys Lys Tyr Leu Gln Leu Leu Arg Thr Ser Ala Gln Gln Arg Tyr Arg 145
150 155 160 Ser Gln Ile Pro Gly Ser Gly Gln Pro His Ala Phe His Gln
Val Tyr 165 170 175 Val Pro Pro Ile Leu Arg Arg Ala Thr Ala Ser Leu
Asp Thr Pro Glu 180 185 190 Gly Ala Ile Met Gly Asp Val Lys Val Glu
Asp Gly Ala Asp Val Ser 195 200 205 Ile Ser Asp Leu Phe Asn Thr Arg
Val Asn Lys Gly Pro Arg Val Thr 210 215 220 Val Leu Leu Gly Lys Ala
Gly Met Gly Lys Thr Thr Leu Ala His Arg 225 230 235 240 Leu Cys Gln
Lys Trp Ala Glu Gly His Leu Asn Cys Phe Gln Ala Leu 245 250 255 Phe
Leu Phe Glu Phe Arg Gln Leu Asn Leu Ile Thr Arg Phe Leu Thr 260 265
270 Pro Ser Glu Leu Leu Phe Asp Leu Tyr Leu Ser Pro Glu Ser Asp His
275 280 285 Asp Thr Val Phe Gln Tyr Leu Glu Lys Asn Ala Asp Gln Val
Leu Leu 290 295 300 Ile Phe Asp Gly Leu Asp Glu Ala Leu Gln Pro Met
Gly Pro Asp Gly 305 310 315 320 Pro Gly Pro Val Leu Thr Leu Phe Ser
His Leu Cys Asn Gly Thr Leu 325 330 335 Leu Pro Gly Cys Arg Val Met
Ala Thr Ser Arg Pro Gly Lys Leu Pro 340 345 350 Ala Cys Leu Pro Ala
Glu Ala Ala Met Val His Met Leu Gly Phe Asp 355 360 365 Gly Pro Arg
Val Glu Glu Tyr Val Asn His Phe Phe Ser Ala Gln Pro 370 375 380 Ser
Arg Glu Gly Ala Leu Val Glu Leu Gln Thr Asn Gly Arg Leu Arg 385 390
395 400 Ser Leu Cys Ala Val Pro Ala Leu Cys Gln Val Ala Cys Leu Cys
Leu 405 410 415 His His Leu Leu Pro Asp His Ala Pro Gly Gln Ser Val
Ala Leu Leu 420 425 430 Pro Asn Met Thr Gln Leu Tyr Met Gln Met Val
Leu Ala Leu Ser Pro 435 440 445 Pro Gly His Leu Pro Thr Ser Ser Leu
Leu Asp Leu Gly Glu Val Ala 450 455 460 Leu Arg Gly Leu Glu Thr Gly
Lys Val Ile Phe Tyr Ala Lys Asp Ile 465 470 475 480 Ala Pro Pro Leu
Ile Ala Phe Gly Ala Thr His Ser Leu Leu Thr Ser 485 490 495 Phe Cys
Val Cys Thr Gly Pro Gly His Gln Gln Thr Gly Tyr Ala Phe 500 505 510
Thr His Leu Ser Leu Gln Glu Phe Leu Ala Ala Leu His Leu Met Ala 515
520 525 Ser Pro Lys Val Asn Lys Asp Thr Leu Thr Gln Tyr Val Thr Leu
His 530 535 540 Ser Arg Trp Val Gln Arg Thr Lys Ala Arg Leu Gly Leu
Ser Asp His 545 550 555 560 Leu Pro Thr Phe Leu Ala Gly Leu Ala Ser
Cys Thr Cys Arg Pro Phe 565 570 575 Leu Ser His Leu Ala Gln Gly Asn
Glu Asp Cys Val Gly Ala Lys Gln 580 585 590 Ala Ala Val Val Gln Val
Leu Lys Lys Leu Ala Thr Arg Lys Leu Thr 595 600 605 Gly Pro Lys Val
Val Glu Leu Cys His Cys Val Asp Glu Thr Gln Glu 610 615 620 Pro Glu
Leu Ala Ser Leu Thr Ala Gln Ser Leu Pro Tyr Gln Leu Pro 625 630 635
640 Phe His Asn Phe Pro Leu Thr Cys Thr Asp Leu Ala Thr Leu Thr Asn
645 650 655 Ile Leu Glu His Arg Glu Ala Pro Ile His Leu Asp Phe Asp
Gly Cys 660 665
670 Pro Leu Glu Pro His Cys Pro Glu Ala Leu Val Gly Cys Gly Gln Ile
675 680 685 Glu Asn Leu Ser Phe Lys Ser Arg Lys Cys Gly Asp Ala Phe
Ala Glu 690 695 700 Ala Leu Ser Arg Ser Leu Pro Thr Met Gly Arg Leu
Gln Met Leu Gly 705 710 715 720 Leu Ala Gly Ser Lys Ile Thr Ala Arg
Gly Ile Ser His Leu Val Lys 725 730 735 Ala Leu Pro Leu Cys Pro Gln
Leu Lys Glu Val Ser Phe Arg Asp Asn 740 745 750 Gln Leu Ser Asp Gln
Val Val Leu Asn Ile Val Glu Val Leu Pro His 755 760 765 Leu Pro Arg
Leu Arg Lys Leu Asp Leu Ser Ser Asn Ser Ile Cys Val 770 775 780 Ser
Thr Leu Leu Cys Leu Ala Arg Val Ala Val Thr Cys Pro Thr Val 785 790
795 800 Arg Met Leu Gln Ala Arg Glu Arg Thr Ile Ile Phe Leu Leu Ser
Pro 805 810 815 Pro Thr Glu Thr Thr Ala Glu Leu Gln Arg Ala Pro Asp
Leu Gln Glu 820 825 830 Ser Asp Gly Gln Arg Lys Gly Ala Gln Ser Arg
Ser Leu Thr Leu Arg 835 840 845 Leu Gln Lys Cys Gln Leu Gln Val His
Asp Ala Glu Ala Leu Ile Ala 850 855 860 Leu Leu Gln Glu Gly Pro His
Leu Glu Glu Val Asp Leu Ser Gly Asn 865 870 875 880 Gln Leu Glu Asp
Glu Gly Cys Arg Leu Met Ala Glu Ala Ala Ser Gln 885 890 895 Leu His
Ile Ala Arg Lys Leu Asp Leu Ser Asp Asn Gly Leu Ser Val 900 905 910
Ala Gly Val His Cys Val Leu Arg Ala Val Ser Ala Cys Trp Thr Leu 915
920 925 Ala Glu Leu His Ile Ser Leu Gln His Lys Thr Val Ile Phe Met
Phe 930 935 940 Ala Gln Glu Pro Glu Glu Gln Lys Gly Pro Gln Glu Arg
Ala Ala Phe 945 950 955 960 Leu Asp Ser Leu Met Leu Gln Met Pro Ser
Glu Leu Pro Leu Ser Ser 965 970 975 Arg Arg Met Arg Leu Thr His Cys
Gly Leu Gln Glu Lys His Leu Glu 980 985 990 Gln Leu Cys Lys Ala Leu
Gly Gly Ser Cys His Leu Gly His Leu His 995 1000 1005 Leu Asp Phe
Ser Gly Asn Ala Leu Gly Asp Glu Gly Ala Ala Arg 1010 1015 1020 Leu
Ala Gln Leu Leu Pro Gly Leu Gly Ala Leu Gln Ser Leu Asn 1025 1030
1035 Leu Ser Glu Asn Gly Leu Ser Leu Asp Ala Val Leu Gly Leu Val
1040 1045 1050 Arg Cys Phe Ser Thr Leu Gln Trp Leu Phe Arg Leu Asp
Ile Ser 1055 1060 1065 Phe Glu Ser Gln His Ile Leu Leu Arg Gly Asp
Lys Thr Ser Arg 1070 1075 1080 Asp Met Trp Ala Thr Gly Ser Leu Pro
Asp Phe Pro Ala Ala Ala 1085 1090 1095 Lys Phe Leu Gly Phe Arg Gln
Arg Cys Ile Pro Arg Ser Leu Cys 1100 1105 1110 Leu Ser Glu Cys Pro
Leu Glu Pro Pro Ser Leu Thr Arg Leu Cys 1115 1120 1125 Ala Thr Leu
Lys Asp Cys Pro Gly Pro Leu Glu Leu Gln Leu Ser 1130 1135 1140 Cys
Glu Phe Leu Ser Asp Gln Ser Leu Glu Thr Leu Leu Asp Cys 1145 1150
1155 Leu Pro Gln Leu Pro Gln Leu Ser Leu Leu Gln Leu Ser Gln Thr
1160 1165 1170 Gly Leu Ser Pro Lys Ser Pro Phe Leu Leu Ala Asn Thr
Leu Ser 1175 1180 1185 Leu Cys Pro Arg Val Lys Lys Val Asp Leu Arg
Ser Leu His His 1190 1195 1200 Ala Thr Leu His Phe Arg Ser Asn Glu
Glu Glu Glu Gly Val Cys 1205 1210 1215 Cys Gly Arg Phe Thr Gly Cys
Ser Leu Ser Gln Glu His Val Glu 1220 1225 1230 Ser Leu Cys Trp Leu
Leu Ser Lys Cys Lys Asp Leu Ser Gln Val 1235 1240 1245 Asp Leu Ser
Ala Asn Leu Leu Gly Asp Ser Gly Leu Arg Cys Leu 1250 1255 1260 Leu
Glu Cys Leu Pro Gln Val Pro Ile Ser Gly Leu Leu Asp Leu 1265 1270
1275 Ser His Asn Ser Ile Ser Gln Glu Ser Ala Leu Tyr Leu Leu Glu
1280 1285 1290 Thr Leu Pro Ser Cys Pro Arg Val Arg Glu Ala Ser Val
Asn Leu 1295 1300 1305 Gly Ser Glu Gln Ser Phe Arg Ile His Phe Ser
Arg Glu Asp Gln 1310 1315 1320 Ala Gly Lys Thr Leu Arg Leu Ser Glu
Cys Ser Phe Arg Pro Glu 1325 1330 1335 His Val Ser Arg Leu Ala Thr
Gly Leu Ser Lys Ser Leu Gln Leu 1340 1345 1350 Thr Glu Leu Thr Leu
Thr Gln Cys Cys Leu Gly Gln Lys Gln Leu 1355 1360 1365 Ala Ile Leu
Leu Ser Leu Val Gly Arg Pro Ala Gly Leu Phe Ser 1370 1375 1380 Leu
Arg Val Gln Glu Pro Trp Ala Asp Arg Ala Arg Val Leu Ser 1385 1390
1395 Leu Leu Glu Val Cys Ala Gln Ala Ser Gly Ser Val Thr Glu Ile
1400 1405 1410 Ser Ile Ser Glu Thr Gln Gln Gln Leu Cys Val Gln Leu
Glu Phe 1415 1420 1425 Pro Arg Gln Glu Glu Asn Pro Glu Ala Val Ala
Leu Arg Leu Ala 1430 1435 1440 His Cys Asp Leu Gly Ala His His Ser
Leu Leu Val Gly Gln Leu 1445 1450 1455 Met Glu Thr Cys Ala Arg Leu
Gln Gln Leu Ser Leu Ser Gln Val 1460 1465 1470 Asn Leu Cys Glu Asp
Asp Asp Ala Ser Ser Leu Leu Leu Gln Ser 1475 1480 1485 Leu Leu Leu
Ser Leu Ser Glu Leu Lys Thr Phe Arg Leu Thr Ser 1490 1495 1500 Ser
Cys Val Ser Thr Glu Gly Leu Ala His Leu Ala Ser Gly Leu 1505 1510
1515 Gly His Cys His His Leu Glu Glu Leu Asp Leu Ser Asn Asn Gln
1520 1525 1530 Phe Asp Glu Glu Gly Thr Lys Ala Leu Met Arg Ala Leu
Glu Gly 1535 1540 1545 Lys Trp Met Leu Lys Arg Leu Asp Leu Ser His
Leu Leu Leu Asn 1550 1555 1560 Ser Ser Thr Leu Ala Leu Leu Thr His
Arg Leu Ser Gln Met Thr 1565 1570 1575 Cys Leu Gln Ser Leu Arg Leu
Asn Arg Asn Ser Ile Gly Asp Val 1580 1585 1590 Gly Cys Cys His Leu
Ser Glu Ala Leu Arg Ala Ala Thr Ser Leu 1595 1600 1605 Glu Glu Leu
Asp Leu Ser His Asn Gln Ile Gly Asp Ala Gly Val 1610 1615 1620 Gln
His Leu Ala Thr Ile Leu Pro Gly Leu Pro Glu Leu Arg Lys 1625 1630
1635 Ile Asp Leu Ser Gly Asn Ser Ile Ser Ser Ala Gly Gly Val Gln
1640 1645 1650 Leu Ala Glu Ser Leu Val Leu Cys Arg Arg Leu Glu Glu
Leu Met 1655 1660 1665 Leu Gly Cys Asn Ala Leu Gly Asp Pro Thr Ala
Leu Gly Leu Ala 1670 1675 1680 Gln Glu Leu Pro Gln His Leu Arg Val
Leu His Leu Pro Phe Ser 1685 1690 1695 His Leu Gly Pro Gly Gly Ala
Leu Ser Leu Ala Gln Ala Leu Asp 1700 1705 1710 Gly Ser Pro His Leu
Glu Glu Ile Ser Leu Ala Glu Asn Asn Leu 1715 1720 1725 Ala Gly Gly
Val Leu Arg Phe Cys Met Glu Leu Pro Leu Leu Arg 1730 1735 1740 Gln
Ile Asp Leu Val Ser Cys Lys Ile Asp Asn Gln Thr Ala Lys 1745 1750
1755 Leu Leu Thr Ser Ser Phe Thr Ser Cys Pro Ala Leu Glu Val Ile
1760 1765 1770 Leu Leu Ser Trp Asn Leu Leu Gly Asp Glu Ala Ala Ala
Glu Leu 1775 1780 1785 Ala Gln Val Leu Pro Lys Met Gly Arg Leu Lys
Arg Val Asp Leu 1790 1795 1800 Glu Lys Asn Gln Ile Thr Ala Leu Gly
Ala Trp Leu Leu Ala Glu 1805 1810 1815 Gly Leu Ala Gln Gly Ser Ser
Ile Gln Val Ile Arg Leu Trp Asn 1820 1825 1830 Asn Pro Ile Pro Cys
Asp Met Ala Gln His Leu Lys Ser Gln Glu 1835 1840 1845 Pro Arg Leu
Asp Phe Ala Phe Phe Asp Asn Gln Pro Gln Ala Pro 1850 1855 1860 Trp
Gly Thr 1865
* * * * *