U.S. patent application number 13/719835 was filed with the patent office on 2012-12-19 and published on 2017-03-09 for rearranged TT virus molecules for use in diagnosis, prevention and treatment of cancer and autoimmunity.
This patent application is currently assigned to Deutsches Krebsforschungszentrum Stiftung Des Offentlichen Rechtes. The applicant listed for this patent is Deutsches Krebsforschungszentrum. Invention is credited to Ethel-Michele De Villiers, Harald Zur Hausen.
Application Number | 20170066802 13/719835 |
Document ID | / |
Family ID | 49235342 |
Filed Date | 2017-03-09 |
United States Patent Application | 20170066802
Kind Code | A9
De Villiers; Ethel-Michele; et al. | March 9, 2017
REARRANGED TT VIRUS MOLECULES FOR USE IN DIAGNOSIS, PREVENTION AND
TREATMENT OF CANCER AND AUTOIMMUNITY
Abstract
The present invention relates to rearranged molecules of (a) a
specific TT virus sequence and (b) a nucleotide sequence encoding a
polypeptide showing homology to mammalian proteins associated with
cancer and autoimmune diseases that are capable of replicating
autonomously for use in diagnosis, prevention and treatment of
diseases like cancer and autoimmunity.
Inventors: | De Villiers; Ethel-Michele (Waldmichelbach, DE); Zur Hausen; Harald (Waldmichelbach, DE)
Applicant: |
Name | City | State | Country | Type
Deutsches Krebsforschungszentrum | Heidelberg | | DE |
Assignee: | Deutsches Krebsforschungszentrum Stiftung Des Offentlichen Rechtes, Heidelberg, DE
Prior Publication: |
Document Identifier | Publication Date
US 20130259869 A1 | October 3, 2013
Family ID: | 49235342
Appl. No.: | 13/719835
Filed: | December 19, 2012
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number
PCT/EP11/03119 | Jun 24, 2011 |
13719835 | |
12821634 | Jun 23, 2010 |
PCT/EP11/03119 | |
Current U.S. Class: | 1/1
Current CPC Class: | G01N 2800/50 20130101; C07K 16/081 20130101; C07K 14/005 20130101; C12Q 1/6886 20130101; C12Q 1/701 20130101; C12N 7/00 20130101; C07K 14/01 20130101; C12N 2750/00022 20130101; G01N 33/56983 20130101; C12N 2750/00021 20130101; C12Q 1/6883 20130101; C12N 15/1131 20130101
International Class: | C07K 14/01 20060101 C07K014/01; C12N 15/113 20060101 C12N015/113; C12Q 1/70 20060101 C12Q001/70; C07K 16/08 20060101 C07K016/08; G01N 33/569 20060101 G01N033/569
Foreign Application Data
Date | Code | Application Number
Jun 23, 2010 | EP | 10006541
Nov 23, 2010 | EP | 10014907
Claims
1. A rearranged TT virus polynucleic acid comprising (a) a
nucleotide sequence shown in FIG. 6; (b) a nucleotide sequence
which shows at least 70% identity to a nucleotide sequence of (a)
and is capable of replicating autonomously and/or inducing
autonomous replication; (c) a fragment of a nucleotide sequence of
(a) or (b) which is capable of replicating autonomously; (d) a
nucleotide sequence which is the complement of the nucleotide
sequence of (a), (b), or (c); or (e) a nucleotide sequence which is
redundant as a result of the degeneracy of the genetic code
compared to any of the above-given nucleotide sequences.
2. The rearranged TT virus polynucleic acid of claim 1 consisting
of (a) a nucleotide sequence shown in FIG. 6; (b) a nucleotide
sequence which shows at least 70% identity to a nucleotide sequence
of (a) and is capable of replicating autonomously and/or inducing
autonomous replication; (c) a fragment of a nucleotide sequence of
(a) or (b) which is capable of replicating autonomously; (d) a
nucleotide sequence which is the complement of the nucleotide
sequence of (a), (b), or (c); or (e) a nucleotide sequence which is
redundant as a result of the degeneracy of the genetic code
compared to any of the above-given nucleotide sequences.
3. The rearranged TT virus polynucleic acid of claim 1, wherein
said nucleotide sequence of (a), (b), (c), (d) or (e) is linked to
a polynucleic acid encoding a polypeptide containing a signature
motif of a mammalian protein or allergen being associated with
cancer or an autoimmune disease.
4. The rearranged TT virus polynucleic acid of claim 1 which is
present as a single- or double-stranded extrachromosomal
episome.
5. The rearranged TT virus polynucleic acid of claim 1 which is a
single-stranded DNA.
6. The rearranged TT virus polynucleic acid of claim 1 which is
linked to a host cell DNA.
7. The rearranged TT virus polynucleic acid of claim 6 having at
least one of the following properties: (a) growth-stimulation; (b)
oncogene function; (c) tumor suppressor gene-like function; or (d)
stimulation of autoimmune reactions.
8. The TT virus polynucleic acid of claim 1 comprising a nucleotide
sequence being selected from the group of nucleotide sequences
shown in FIGS. 8, 9 and 11 to 13.
9. The rearranged TT virus of claim 1, wherein said polypeptide is
a polypeptide as shown in Table 1.
10. An oligonucleotide primer comprising part of a polynucleic acid
according to claim 1, with said primer being able to act as primer
for specifically sequencing or specifically amplifying said
polynucleic acid.
11. The oligonucleotide primer of claim 10 having a nucleotide
sequence being selected from the group consisting of the nucleotide
sequences shown in Table 2 and FIG. 10.
12. An oligonucleotide probe comprising part of a polynucleic acid
according to claim 1, wherein said probe can specifically hybridize
to said polynucleic acid.
13. The oligonucleotide probe of claim 12 having a nucleotide
sequence being selected from the group consisting of the nucleotide
sequences shown in Table 2 and FIG. 10.
14. The oligonucleotide probe of claim 12, which is detectably
labelled or attached to a solid support.
15. The oligonucleotide primer of claim 10 having a length of at
least 13 bases.
16. An expression vector comprising a rearranged TT virus
polynucleic acid of claim 1 operably linked to prokaryotic,
eukaryotic or viral transcription and translation control
elements.
17. The expression vector of claim 16 which is an artificial
chromosome.
18. A host cell transformed with an expression vector according to
claim 16.
19. A polypeptide being encoded by a rearranged TT virus
polynucleic acid of claim 1.
20. An antibody or fragment thereof specifically binding to a
polypeptide of claim 19.
21. The antibody or fragment thereof of claim 20, wherein said
antibody or fragment is detectably labelled.
22. A diagnostic kit for determining the presence of a rearranged
TT virus polynucleic acid of claim 1, said kit comprising a primer
according to claim 10.
23. A diagnostic kit for determining a predisposition or an early
stage of cancer or an autoimmune disease comprising an antibody
according to claim 20.
24. A method for the detection of a rearranged TTV polynucleic acid
according to FIG. 6 in a biological sample, comprising: (a)
optionally extracting sample polynucleic acid, (b) amplifying the
polynucleic acid as described above with at least one primer
according to claim 10, optionally a labelled primer, and (c)
detecting the amplified polynucleic acid.
25. A method for the detection of a rearranged TTV polynucleic acid
according to FIG. 6 in a biological sample, comprising: (a)
optionally extracting sample polynucleic acid, (b) hybridizing the
polynucleic acid as described above with at least one probe
according to claim 12, optionally a labelled probe, and (c)
detecting the hybridized polynucleic acid.
26. A method for detecting a polypeptide of claim 19 present in a
biological sample, comprising: (a) contacting the biological sample
for the presence of such polypeptide or antibody as defined above,
and (b) detecting the immunological complex formed between said
antibody and said polypeptide.
27. An antisense oligonucleotide reducing or inhibiting the
expression of a rearranged TT virus polynucleic acid of claim
1.
28. The antisense oligonucleotide of claim 27, which is an iRNA
comprising a sense sequence and an antisense sequence, wherein the
sense and antisense sequences form an RNA duplex and wherein the
antisense sequence comprises a nucleotide sequence sufficiently
complementary to the nucleotide sequence of the rearranged TT virus
polynucleic acid of FIG. 6.
29. A pharmaceutical composition comprising the antibody of claim
20, and a suitable pharmaceutical carrier.
30. A pharmaceutical composition comprising the antisense
oligonucleotide of claim 27 comprising administering the antibody
of claim 20.
31. A method of preventing or treating cancer or an autoimmune
disease or early stages thereof.
32. The method according to claim 31, wherein said autoimmune
disease is multiple sclerosis (MS), asthma, polyarthritis,
diabetes, lupus erythematodes, celiac disease, colitis ulcerosa, or
Crohn's disease.
33. The method according to claim 31, wherein said cancer is breast
cancer, colorectal cancer, pancreatic cancer, cervical cancer,
Hodgkin's lymphoma, B-lymphoma, acute lymphocytic leukaemia, or
Burkitt's lymphoma.
34. A vaccine comprising a rearranged TT virus polynucleic acid of
claim 1.
35. A method of immunizing a mammal against a TT virus infection
comprising administering the rearranged TT virus polynucleic acid
of claim 1.
36. A method for the generation of a database for determining the
risk to develop cancer or an autoimmune disease, comprising the
following steps (a) determining the nucleotide sequence of a host
cell DNA linked to a rearranged TT virus polynucleic acid according
to claim 1 and being present in episomal form, if present, in a
sample from a patient suffering from at least one of said diseases;
and (b) compiling sequences determined in step (a) associated with
said diseases in a database.
37. A method for evaluating the risk to develop cancer or an
autoimmune disease of a patient suspected of being at risk of
developing such disease, comprising the following steps (a)
determining the nucleotide sequence of genomic host cell DNA linked
to a rearranged TT virus polynucleic acid according to claim 1 and
being present in episomal form, if present, in a sample from said
patient; and (b) comparing sequences determined in step (a) with
the sequences compiled in a database generated by, (i) step (a) of
claim 36 (ii) step (b) of claim 36 wherein the absence of a host
cell DNA linked to a TT virus polynucleic acid or the presence only
of genomic host cell DNA linked to a TT virus polynucleic acid not
represented in said database indicates that the risk of developing
such disease is decreased or absent.
38. A process for the in vitro replication and propagation of
Torque teno viruses (TTV) comprising the following steps: (a)
transfecting linearized TTV DNA into 293TT cells expressing high
levels of SV40 large T antigen; (b) harvesting the cells and
isolating cells showing the presence of TTV DNA; (c) culturing the
cells obtained in step (b) for at least three days; and (d)
harvesting the cells of step (c).
39. The process of claim 38, wherein the TTV is a rearranged TTV
according to FIG. 6.
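By way of illustration only, the sequence relationships recited in claim 1, items (b) and (d), can be sketched in Python. The claims do not state how percent identity is to be computed. The sketch below assumes two sequences that have already been aligned to equal length with "-" as the gap character, and it counts identical columns over all alignment columns, which is only one of several conventions in use. It also assumes that the complement of item (d) is read 5' to 3', i.e. as the reverse complement. The function names and any sequences passed to them are hypothetical and are not taken from the figures.

```python
# Illustrative sketch only; function names and conventions are assumptions.

# Translation table for the four unambiguous DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand read 5' to 3' (assumed reading of item (d))."""
    return seq.translate(COMPLEMENT)[::-1]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Both inputs must have the same length and use '-' for gaps.
    Columns in which both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    # A column counts as a match only if both residues are identical non-gap bases.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """Check the 'at least 70% identity' criterion under the convention above."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under this convention, a candidate sequence aligned against a reference would satisfy the identity criterion of item (b) when `meets_identity_threshold` returns `True`. The functional requirement of autonomous replication is an experimental question that no sequence comparison can answer.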
Description
[0001] This application is a continuation-in-part application of
international patent application Serial No. PCT/EP2011/003119 filed
24 Jun. 2011, which published as PCT Publication No. WO 2011/160848
on 29 Dec. 2011, which claims priority to U.S. patent application
Ser. No. 12/821,634 filed 23 Jun. 2010 and Ser. No. 12/952,300
filed 23 Nov. 2010 and European patent application Serial Nos. EP
10006541 filed 23 Jun. 2010 and EP 10014907 filed 23 Nov. 2010.
[0002] The foregoing applications, and all documents cited therein
or during their prosecution ("appln cited documents") and all
documents cited or referenced in the appln cited documents, and all
documents cited or referenced herein ("herein cited documents"),
and all documents cited or referenced in herein cited documents,
together with any manufacturer's instructions, descriptions,
product specifications, and product sheets for any products
mentioned herein or in any document incorporated by reference
herein, are hereby incorporated herein by reference, and may be
employed in the practice of the invention. More specifically, all
referenced documents are incorporated by reference to the same
extent as if each individual document was specifically and
individually indicated to be incorporated by reference.
FIELD OF THE INVENTION
[0003] The present invention relates to rearranged molecules of (a)
a specific TT virus sequence and (b) a nucleotide sequence encoding
a polypeptide showing homology to mammalian proteins associated
with cancer or an autoimmune disease that are capable of
replicating autonomously for use in diagnosis, prevention and
treatment of diseases like cancer or autoimmunity.
BACKGROUND OF THE INVENTION
[0004] The family Anelloviridae includes Torque teno viruses (TTV),
TT-midiviruses (TTMDV) and TT-miniviruses (TTMV), the majority
originating from samples of human origin (Nishizawa et al., 1997;
Takahashi et al., 2000; Ninomiya et al., 2007; Okamoto, 2009;
Biagini and de Micco, 2010). The plurality of this family of ssDNA
viruses is reflected not only in DNA sequence, but also in genome
size and organization.
[0005] Multiple attempts have been made to find a suitable in vitro
system for the replication and propagation of TT viruses.
Replicative forms of its DNA have been demonstrated in bone marrow
cells and in the liver (Kanda et al., 1999; Okamoto et al., 2000a,
c, d). Peripheral blood acts as reservoir for TT viruses (Okamoto
et al., 2000b) and replication in vivo seems to occur preferably in
activated mononuclear cells (Maggi et al., 2001b; Mariscal et al.,
2002; Maggi et al., 2010). Although in vitro transcription has been
investigated in a variety of cell lines (Kamahora et al., 2000;
Kamada et al., 2004; Kakkola et al., 2007; 2009; Qiu et al., 2005;
Muller et al., 2008), long term replication leading to virus
production has been difficult to achieve (Leppik et al., 2007).
[0006] The presence of a variety of intragenomic rearranged TT
subviral molecules in serum samples and the in vitro transcription
of a subviral molecule constituting only 10% of the complete
genome initiated the discussion whether TT viruses may share
similarities to the plant virus family Geminiviridae (Leppik et al.,
2007; de Villiers et al., 2009). Both mono- and bipartite
Geminiviruses associate with single-stranded DNA satellites to form
disease-inducing complexes (Saunders et al., 2000; Stanley, 2004;
Nawaz-ul-Rehman and Fauquet, 2009; Jeske 2009; Paprotka et al.,
2010; Patil et al., 2010).
[0007] Infections occur within the first days of life, with close to
100% of infants being infected by one year of age. The primary
route of infection, however, remains unclear (Kazi et al.,
2000; Peng et al., 2002; Ninomiya et al., 2008). The ubiquitous
nature of TTV infections has hampered efforts to associate it with
the pathogenesis of disease (Jelcic et al., 2004; Leppik et al.,
2007; de Villiers et al., 2009; Okamoto, 2009). Possible
etiological associations with diseases of the liver (reviewed in
Okamoto, 2009), respiratory tract (Biagini et al., 2003; Maggi et
al., 2003a,b; Pifferi et al., 2005), hematopoietic malignancies
(Jelcic et al., 2004; Leppik et al., 2007; de Villiers et al.,
2002; 2009; Shiramizu et al., 2002; Garbuglia et al., 2003; zur
Hausen and de Villiers, 2005) and auto-immune diseases (Sospedra et
al., 2005; Maggi et al., 2001a; 2007; de Villiers et al., 2009)
have been reported. During the past years, additional data has been
compiled indicative of an association of TT virus infection with
human malignant tumors. A high rate of TT virus load has been noted
in a spleen biopsy of a patient with Hodgkin's lymphoma (24
individual TTV genotypes). Similarly, other reports describe a
higher rate of TTV prevalence in colorectal and esophageal cancer
and in hematopoietic malignancies in comparison to non-tumorous
tissue from the same or other patients. Yet, the ubiquity of these
infections rendered an interpretation of these results rather
difficult and did not permit a linkage of these observations with
tumor development.
[0008] Citation or identification of any document in this
application is not an admission that such document is available as
prior art to the present invention.
SUMMARY OF THE INVENTION
[0009] Thus, the technical problem underlying the present invention
is to identify specific TTV sequences that might be clearly
associated with diseases like cancer or autoimmune diseases and,
thus, to provide means for diagnosis and therapy.
[0010] The solution to said technical problem is achieved by
providing the embodiments characterized in the claims. During the
experiments resulting in the present invention more than 200
genomes of TT viruses have been isolated. The isolates grouping in
the genus Alphatorquevirus (ca 3.8 kb in size) share very low DNA
sequence homology and differ in their genome organization. A short
stretch (71 bp) of the intergenic region is highly conserved among
all human TTV isolates (Peng et al., 2002) and is widely used to
demonstrate TT virus infection. Samples from a broad spectrum of
diseases were analysed for the presence of torque teno virus DNA by
applying PCR-amplification of this conserved region (Jelcic et al.,
2004; Leppik et al., 2007; de Villiers et al., 2009; Sospedra et
al., 2005; de Villiers and Gunst, unpublished results).
Identification of individual TT virus types, however, requires the
amplification of full-length genomes. Thus far, 93 full-length
genomes of TTVs (ca 3.8 kb) were isolated from human samples
(Jelcic et al., 2004; Leppik et al., 2007; de Villiers et al.,
2009; present experiments). These included samples obtained from
healthy individuals, patients with leukaemia and lymphoma,
rheumatoid arthritis, multiple sclerosis and kidney disease. The
present invention describes the in vitro replication and
transcription of 12 isolates after initial transfection of the
genomic DNA, followed by virus propagation using frozen infected
cells or purified particles. Intragenomic rearranged subviral
molecules, μTTV (microTTV), appearing in early passages were
cloned and characterized. These also propagated independently in
cell culture, resulting in novel particle-like structures that are
able to infect virus-free 293TT cells.
[0011] Accordingly, it is an object of the invention to not
encompass within the invention any previously known product,
process of making the product, or method of using the product such
that Applicants reserve the right and hereby disclose a disclaimer
of any previously known product, process, or method. It is further
noted that the invention does not intend to encompass within the
scope of the invention any product, process, or making of the
product or method of using the product, which does not meet the
written description and enablement requirements of the USPTO (35
U.S.C. § 112, first paragraph) or the EPO (Article 83 of the
EPC), such that Applicants reserve the right and hereby disclose a
disclaimer of any previously described product, process of making
the product, or method of using the product.
[0012] It is noted that in this disclosure and particularly in the
claims and/or paragraphs, terms such as "comprises", "comprised",
"comprising" and the like can have the meaning attributed to them in
U.S. Patent law; e.g., they can mean "includes", "included",
"including", and the like; and that terms such as "consisting
essentially of" and "consists essentially of" have the meaning
ascribed to them in U.S. Patent law, e.g., they allow for elements
not explicitly recited, but exclude elements that are found in the
prior art or that affect a basic or novel characteristic of the
invention.
[0013] These and other embodiments are disclosed or are obvious
from and encompassed by, the following Detailed Description.
BRIEF DESCRIPTION OF THE DRAWINGS
[0014] The following detailed description, given by way of example,
but not intended to limit the invention solely to the specific
embodiments described, may best be understood in conjunction with
the accompanying drawings.
[0015] FIG. 1: PCR amplification of a 71 base fragment containing
the highly conserved TTV region (HCR) in 4 different cell lines,
L1236 (EBV-negative Hodgkin's lymphoma line), HSB-2 (acute
lymphoblastic leukemia line), KR and IGL (melanoma cell lines) and
placenta DNA
[0016] FIG. 2: Spooled DNA remaining in the supernatant of L1236
cells after precipitation and removal of high molecular weight DNA
and RNase digestion
[0017] Two bands are visible in the region between 4.3 and 6.6 base
bands.
[0018] FIG. 3: Outwards-directed long-PCR, using primers of the 71
base TTV HCR region in HSB-2 DNA
[0019] Two bands are visible in regions corresponding to 4.5 to 7
kb. In addition, bands emerge in the region corresponding to 0.4 to
0.7 kb.
[0020] FIGS. 4A and 4B: Schematic outline of the TTV oncogene
concept
[0021] The left part (FIG. 4A) represents the genomic organization
of wild-type TTV genomes. The right part (FIG. 4B) envisages the
integration of host cell DNA into the single-stranded plasmids.
[0022] FIG. 5: Schematic outline of the TTV host cell DNA
autoimmunity concept
[0023] The modified host cell genes should code for immuno-reactive
antigenic epitopes.
[0024] FIG. 6: PCR amplification of the 71 base highly conserved
region (HCR) from the DNA of 4 different cell lines
[0025] The arrows point to the two sites with variations in the
nucleotide sequences.
[0026] FIGS. 7A-C:
[0027] (A) The autonomously replicating 719 base TTV DNA (right) and
the complete TTV sequence from which it is derived. The nucleotide
composition of both molecules is found in FIGS. 11A+B.
[0028] (B) The autonomously replicating 621 base TTV DNA (right) and
the complete DNA sequence from which it is derived. The nucleotide
composition of both molecules is found in FIGS. 12A+B.
[0029] (C) The autonomously replicating 642 base TTV DNA (right) and
the complete DNA sequence from which it is derived. The nucleotide
composition of both molecules is found in FIGS. 13A+B.
[0030] FIGS. 8A-L: Three exemplary chimeric TTV/truncated host cell
DNA sequences from brain biopsies of patients with multiple
sclerosis
[0031] (A-D) Chimeric cellular sequences derived from chromosome 1
with some homologies to prion and Wilms tumor sequences and the 3'
end of myeloid lymphoid leukemia 3 (MLL3) pseudogene. Human DNA
sequence from clone RP 11-14N7 on chromosome 1. Contains 3' end of
a myeloid/lymphoid or mixed lineage leukemia 3 (MLL3) pseudogene, a
seven transmembrane helix receptor pseudogene, the 5'-end of a
novel gene.
[0032] (E-G) Chimeric cellular sequences derived from chromosome
16. Homologies to transcription factor 3 (TF 3C), protein
signatures for chemokine receptors and leukotriene B4 receptor.
[0033] (H-L) Chimeric cellular sequences derived from chromosome
10, truncated sequence of myosin, reactivity reported for multiple
sclerosis patients and those with rheumatoid arthritis (sequence
contains both full primers front and back).
[0034] FIGS. 9A-H: Three exemplary chimeric TTV/truncated host cell
DNA sequences from cell lines derived from patients with Hodgkin's
disease or leukemia
[0035] (A-C) Chromosome 1 sequences with part of transgelin 2, the
IGSF9 gene for immunoglobulin superfamily member 9, the SLAM9
gene.
[0036] (D-F) Translated protein sequences with substantial homology
to the oncogenes v-myb (avian myeloblastosis viral oncogene), but
also to c-myb. This sequence was amplified with the forward primer
at both ends.
[0037] (G-H) Derived from chromosome 10. High homology with
"Deleted in malignant 1 Protein" (DMBT), an identified tumor
suppressor gene. This sequence was amplified with the forward
primer at both ends.
[0038] FIG. 10: Primer sequences used in the reactions described in
the Examples, derived from the 71 base HCR.
[0039] FIGS. 11A-D:
[0040] (A-C) Complete TTV sequence from which autonomously
replicating 719 base DNA has been obtained.
[0041] (D) Complete sequence of the autonomously replicating 719
base TTV DNA.
[0042] FIGS. 12A-D:
[0043] (A-C) Complete TTV sequence (tth25) from which autonomously
replicating 621 base DNA has been obtained.
[0044] (D) Complete sequence of the autonomously replicating 621
base TTV DNA.
[0045] FIGS. 13A-D:
[0046] (A-C) Complete TTV sequence (ttrh2l5) from which
autonomously replicating 642 base DNA has been obtained.
[0047] (D) Complete sequence of the autonomously replicating 642
base TTV DNA.
[0048] FIGS. 14A-K: Open reading frames (ORFs) found within the
nucleotide sequence of 71 nt
[0049] zyb2.1.pep, zyb9.1.pep, and zkb69.1.pep are starting at the
first triplet, zyb2.3.pep, zyb9.3.pep, zkb5.3.pep, and zkb69.3.pep
are starting from the third triplet. This region is actively
transcribed.
[0050] FIG. 15: Digestion of single-stranded DNA by mung-bean
nuclease (MBN)
[0051] Lanes 2 and 3 show that the amplified DNA may be digested by
pre-treatment with MBN. Lanes 5 and 6 demonstrate that plasmid-DNA
pretreated in the same way is not digested by MBN.
[0052] FIGS. 16A-B: Schematic presentation of the ORF1 of a number
of TTV-HD isolates. ORF1 was either divided into one to several
smaller ORFs or fused to other ORFs.
[0053] FIGS. 17A-G: Transcripts isolated during in vitro
replication of TTV-HD isolates
[0054] Labelling of individual transcripts indicates "isolate.5'-
or 3'-race (s--single strand).no". TTV-isolate numbers (1-12)
indicated with respective schematic genome and TTV-HD number.
*--transcripts which were more often isolated.
[0055] FIG. 18: Phylogenetic tree showing TTV species and isolates
of genus Alphatorquevirus, as well as all TTV-HD types
[0056] TTV-HD types propagated in in vitro cell cultures are
encircled.
[0057] FIGS. 19A-C: Propagation of full-length TTV-HD genomes in
293TT cells
[0058] Examples of propagation of
[0059] (A) TTV-HD14b, TTV-HD14c, TTV-HD14a, and TTV-HD14e (lanes
1-4), TTV-HD15a (lane 5) and TTV-HD16a (lane 6) after nested PCR
amplification;
[0060] (B) TTV-HD20a (lane 7), TTV-HD3a (lane 8), TTV-HD1a (lane
9), TTV-HD23b, TTV-HD23d, and TTV-HD23a (lanes 10-12) after single
PCR amplification. a, b and c--examples of propagations,
approximately 7 days after infection. b-1, b-2, and b-3 indicate
variability observed when propagating the same passage.
[0061] (C) Daily sampling of TTV-HD14e (nested PCR) and TTV-HD23b
cultures.
[0062] M--DNA size marker; *--indicate subviral molecules of
different cultures.
[0063] FIGS. 20A-D: Schematic presentation of full-length TTV-HD
with their respective μTTV-HD molecules
[0064] Numbers indicate ORFs in the DNA genome.
[0065] FIGS. 21A-C: Independent propagation of μTTV-HD
[0066] μTTV-HD15 replicated stronger after initial transfection,
but decreased over time (*--indicate nested PCR amplification).
μTTV-HD1 and μTTV-HD23.2 replicated increasingly after additional
propagation steps. μTTV-HD23.2 molecules formed during replication
of μTTV-HD23.1.
[0067] FIGS. 22A-B:
[0068] (A) Partially purified virus-like particles
[0069] Particles were lysed and content separated on agarose gel.
[0070] (B) Partially purified μTTV particles
[0071] Particles were lysed and DNA content separated on agarose
gel.
[0072] 3--TTV-HD14a, 5--μTTV-14, 6--TTV-HD16a, 8--TTV-HD3a,
9--μTTV-HD1, 12--TTV-HD23a, 12a--μTTV-HD12.1, 12b--μTTV-HD12.2
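The outwards-directed long-PCR referred to for FIG. 3 can be illustrated with a short sizing calculation, given here only as a sketch in Python. It assumes a single circular template, 0-based coordinates on one strand, and hypothetical primer positions inside the conserved region; none of these values is taken from the Examples. Under these assumptions, two primers that point away from each other amplify almost the entire circle, so the product length tracks the size of the circular molecule rather than the distance between the primers.

```python
# Illustrative sketch only; coordinates and primer positions are assumptions.

def outward_pcr_product_length(circle_len: int, fwd_5prime: int, rev_5prime: int) -> int:
    """Expected product length for outward-directed primers on a circular template.

    circle_len  -- length of the circular molecule in bases
    fwd_5prime  -- 0-based position of the 5' end of the forward primer,
                   which is extended towards increasing coordinates
    rev_5prime  -- 0-based position (on the same coordinate system) of the
                   5' end of the reverse primer, which is extended towards
                   decreasing coordinates

    The product runs from fwd_5prime around the circle to rev_5prime, inclusive.
    """
    if circle_len <= 0:
        raise ValueError("circle_len must be positive")
    if not (0 <= fwd_5prime < circle_len and 0 <= rev_5prime < circle_len):
        raise ValueError("primer positions must lie on the circle")
    # Modular arithmetic handles the wrap-around past the origin of the circle.
    return (rev_5prime - fwd_5prime) % circle_len + 1
```

With hypothetical primer 5' ends at positions 50 (forward) and 20 (reverse), a circle of 3800 bases gives a product of 3771 bases, whereas a circle of 719 bases gives 690 bases. The calculation says nothing about products that arise from templates carrying additional host cell DNA.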
DETAILED DESCRIPTION OF THE INVENTION
[0073] The ubiquity of torque teno viruses, together with the
absence of suitable in vitro culture systems, has hampered progress
in investigating this group of viruses. The multitude and
heterogeneity of types (Biagini and de Micco, 2010; Okamoto, 2009),
as well as their ubiquitous presence in hematopoietic cells
(Takahashi et al., 2002; Kanda et al., 1999; Zhong et al., 2002),
have added to the delay in gaining information on whether these
viruses are involved in the pathogenesis of any disease. A spectrum
of TTV types was isolated (Jelcic et al., 2004; Leppik et al.,
2007; de Villiers et al., 2009; present invention). Full-length
genomes of a number of TTV types were often isolated from an
individual sample depending on the composition of primers used for
long-distance PCR amplification. The scattered distribution of the
new isolates of the present invention on a phylogenetic tree of
genus Alphatorquevirus (FIG. 18) indicates their heterogeneity,
irrespective of origin. The variation in genome organization
resulting from minor differences in sequence identity across the
genome was often observed between isolates of the same type and has
prompted questions as to the functionality of these modified
genes.
[0074] In the past, attempts were made to propagate TTV genomes in a
number of cell lines and in peripheral blood monocytes under
varying in vitro culturing conditions. Moderate success with single
isolates was achieved in Hodgkin's lymphoma cell lines and in 293T
cells. Replication was, however, slow and occurred at low levels
(Leppik et al., 2007; Leppik and de Villiers, unpublished data).
The studies of the present invention used the human embryonic kidney
cell line 293TT, which was engineered to express high levels of SV40
large T antigen (Buck et al., 2005). Transfecting TTV genomes into
these cells resulted in virus DNA replication and production of
virus-like particles of ca. 30 nm in size (FIG. 22). The structures
of these virus-like particles differ from those previously
published as TTV particles (Itoh et al., 2000). This is possibly a
consequence of the isolation of the latter from faeces.
[0075] The differences in the level of DNA replication observed
between TTV-isolates cannot presently be explained. Phylogenetic
information does not provide an answer. Notably, the 6 isolates
(TTV-HD14, TTV-HD15 and TTV-HD16) that originated from brain
biopsies of patients with multiple sclerosis all replicated much
less in the system of the present invention. Virus production
(FIG. 22) or virus propagation (FIGS. 19 and 21) did not seem to be
influenced despite the varying levels of DNA replication or
modifications in the genome organization, which included modified
ORF1s. Transcription levels, however, seemed to be influenced, and
fewer of the common transcripts described for other TTV-types were
detected in the four TTV-14 isolates than in TTV-HD15a and
TTV-HD16a cultures. Previously reported transcripts (Leppik et al.,
2007; Kakkola et al., 2009) were isolated from all infected
cultures. Interestingly, no transcript was identified which would
code for full-length ORF1 protein (suspected to play a major role
in coding for the viral capsid, but not yet proven) of any of the
TTV-HD types studied, despite the isolation of full-length
genome-carrying virus-like particles from all infected cultures. A
number of putative protein sequences were identified which may have
resulted from fusion products of any two or three genes.
Translation strategies known to be used by viruses, such as leaky
scanning, re-initiation and ribosomal shunting (Ryabova et al.,
2006) might be involved here. Dual coding in alternative reading
frames is an additional mechanism which may be involved (Kovacs et
al., 2010). Interestingly, transcripts of the control region were
also isolated. Here two groups of transcripts were identified. One
group involved transcripts spanning at least part of the intergenic
region and extending into the rest of the genome covering the known
genes. The second group consisted of transcripts varying in length
and without recognizable coding capacity. It has been proposed that
the nature of the TTV intergenic region with its high GC content
may play a role in transcription-dependent replication blockage
(Belotserkovskii et al., 2010).
[0076] A very prominent observation in the present study is the
formation of subviral molecules early in the replication cycle of
the majority of the isolates obtained. Two
groups of subviral molecules were distinguished. The formation of
multiple subviral DNA molecules ranging in size occurred frequently
and extensively in TTV-HD20a-, TTV-HD3a- and TTV-HD1a-infected
cultures. Previously, similar rearranged subviral molecules were
demonstrated in serum samples (Leppik et al., 2007). Transfection
into L428 cells (Hodgkin's lymphoma cell line) of a small number of
the subviral genomes originating from sera resulted in limited
replication and transcription for a few days (de Villiers et al.,
2009). Data shown in the present invention indicate a role as
defective interfering particles during in vitro replication of the
full-length genome. Replication of the full-length genome is
reduced while levels of subviral molecules increase simultaneously
(FIG. 19b). Similar subviral molecules were occasionally
and inconsistently demonstrated in cultures of the other 9
isolates, but did not influence the replication of the full-length
genome. This difference underlines not only the diversity
between TTV types, but also that this phenomenon does not result
from PCR artifacts. Similar defective interfering molecules have
also been reported in Geminiviruses where they accumulate during
improper replication (Jeske, 2009).
[0077] The second group of subviral molecules, μTTV, evolved
during replication of TTV isolates TTV-HD14b, TTV-HD14c, TTV-HD14a
and TTV-HD14e, TTV-HD15a, TTV-HD16a, TTV-HD1a, TTV-HD23b, TTV-HD23d
and TTV-HD23a and remained constant in size and composition during
propagation, as evidenced after cloning and sequencing. Their
production in the case of the latter 4 isolates seemed to be
influenced by culturing conditions. Interestingly, the subviral
molecule μTTV-HD1 in the TTV-HD1a-infected culture was
detectable in the cell culture even after loss of detectable
parental full-length genome (FIG. 19c). Two molecules,
μTTV-HD23.1 (409 bases) and μTTV-HD23.2 (642 bases), were
isolated from all 3 TTV-HD23-infected cultures. μTTV-HD23.2 is
composed of the μTTV-HD23.1 molecule plus a duplication of 306
nt of the smaller molecule. Subviral molecules (μTTV-HD14) which
were isolated from the 4 TTV-HD14 cultures were all identical in
sequence and appeared very early after the initial transfection of
the parental genome. The production of these smaller molecules did
not seem to be influenced by the variation in genome structure
between isolates of the same TTV type. All subviral molecules were
composed of parts of the parental TTV type, although the genome
regions involved differed. They were all amplified by
long-distance PCR using the same back-to-back primers as for
amplification of the parental genome. The episomal replication of a
TTV subviral molecule isolated from a serum sample over a period of
23 days had previously been observed (de Villiers et al., 2009).
Multimeric subviral RNA was demonstrated during this process. The
subviral molecules reported in the present invention are able to
replicate autonomously, may be propagated in vitro (FIG. 21) and
appear to be related to small protein structures observed in these
cultures by electron microscopy (FIG. 22). It is not known whether
they are transmitted as part of an infectious TT virus or whether
they are induced only after infection by the parent virus and then
transmitted by autonomously infecting other cells. Similar subviral
DNAs have been associated with the geminivirus disease complex
(Stanley, 2004). β-satellites enhance symptom phenotypes in
plants. They share a network of protein interactions with
geminiviruses and are dependent on them for trans-replication,
encapsidation and vector transmission. The only sequence shared
between β-satellites and geminiviruses lies in the short
origin of replication (Nawaz-ul-Rehman and Fauquet, 2009; Patil and
Fauquet, 2010; Paprotka et al., 2010). This is in contrast to the
TTV subviral molecules (μTTV) which share almost identical
sequences with the parental genome. The cytopathic effect observed
during in vitro propagation of the TTV subviral molecules of the
present invention points to their possible role as the
disease-inducing component of some torque teno viruses. Signature
motifs of proteins involved in autoimmune disease have been
identified by in silico analyses of putative proteins expressed by
these subviral molecules, as well as from virus transcripts
isolated from the TTV-infected cultures.
[0078] The observation of a DNA encoding a protein containing a
signature motif of a mammalian protein associated with cancer or an
autoimmune disease linked to the 72 bp highly conserved TT virus
region (HCR) is the basis for the following conclusion: The
rearranged open reading frames of TTV and μTTV code for
antigenic epitopes which mimic cellular protein sequences which are
attacked in cancer or autoimmune diseases. Their shared, but not
identical sequence should provoke an immune response against these
epitopes present also in normal tissue.
[0079] The surprising observation of host cell DNA linked in an
apparently single-stranded form to TT virus HCR is the basis for
the following conclusion: TT viral sequences have not yet been
demonstrated as integrated into double-stranded cellular DNA,
persisting within host cell chromosomes. Thus, the opposite finding
of host cell DNA, linked in a single-stranded state to the TTV HCR
should have biological significance. The present data indicate
their long-time persistence as episomes in human cancer cell lines,
pointing to a role of this persistence in cell proliferation. Two
aspects seem to require specific consideration: a possible role of
those recombinants in cancer and in autoimmunity.
[0080] One possibility is the random integration of host cell
sequences into TTV episomes. This may happen after strand
displacement in the course of aberrant DNA replication or after
reverse transcription of cellular RNA. In case of random
integration a larger number of recombinants should be innocuous and
harmless for cells carrying these recombinants. A growth-promoting
property of transcripts of the TTV HCR, as well as integration and
transcription of growth-stimulating host cell genes, their
modification in the process of integration or their dysregulation
by the TTV HCR, however, will result in proliferative consequences.
These episomes should acquire immortalizing and under certain
conditions transforming properties. In combination with additional
modifications of the host cell genome they may direct malignant
growth. This mode of action reveals a distant resemblance to the
insertion of cellular oncogenes into retroviral genomes.
[0081] The previous considerations are summarized in FIG. 4.
Obviously, the recombination between the TTV regulatory region and
cellular nucleic acids must be a relatively frequent process, since
such recombinants are found in the majority of cell lines thus far
analyzed. It also should contribute to cell proliferation,
otherwise the regular persistence of such molecules, in part over
decades of continuous proliferation, would be difficult to explain.
It is assumed that this type of recombination is a random process,
involving different types of cellular genes. The coding function of
the TTV HCR and/or the uptake of genes steering cell proliferation,
or blocking the function of proliferation antagonists, or
inhibiting cell differentiation should lead to an accumulation of
cells containing these types of recombinants. It is envisaged that
this, in combination with additional mutational or recombinational
events of the cells harbouring such TTV-host cell nucleic acid
recombinants, provides a selective advantage for cells carrying
such episomes. The presence of the latter would represent a prime
risk factor for malignant conversion. In this sense those
recombinations should be of general importance for different types
of human cancers, although a certain degree of specificity for a
limited set of genes would be expected for individual cancer
types.
[0082] The implications of this model are profound. They range from
cancer prevention and early detection to cancer therapy. The
important role of TTV infections and of the persistence of TTV HCR
is stressed by the available information. Prevention of these
infections should reduce the risk for the development of the
described recombinants. The diagnosis of specific recombinants
would probably contribute to cancer risk assessment. Profound
implications would be expected for cancer therapy: the TTV HCR
emerges as the prime determinant for the persistence and
maintenance of the single-stranded episomes. Since this region
appears to be part of an open reading frame, it should be
vulnerable to small interfering RNAs or DNAs. Thus, it offers a
suitable target for future therapeutic deliberations.
[0083] Two other aspects deserve discussion: certain parallels
which seem to exist with retroviral carcinogenesis in rodents and
chickens, and the use of autonomously replicating TTV-based vector
systems for gene therapy. Insertional mutagenesis, the uptake and
modification of cellular growth-stimulating genes, rendering them
into oncogenes, has frequently been analyzed in animal systems. This
has thus far not been reported for human cancers. Do TT viruses
fill this niche in human and other primate cells? Do TTVs compete
successfully with retrovirus infections in taking over their role
in specific species? The episomal persistence of single-stranded
DNA, however, emerges as a remarkable difference to
retrovirus-induced carcinogenesis.
[0084] Autonomously replicating subviral DNA molecules of
approximately 400 bases of TTV origin have been described before.
It is tempting to speculate that they or specific TTV-host cell
recombinants may represent optimal vector systems for future
approaches in gene therapy and for the construction of artificial
chromosomes.
[0085] The existence of TTV host cell nucleic acid recombinants
also permits a novel view on aspects of autoimmune diseases and
other chronic diseases (potentially even conditions like
arteriosclerosis and Alzheimer's disease). Modification or
dysregulation of cellular proteins may originate from insertional
events of cellular genes into single-stranded DNA or to the
different HCRs exerted by TTV elements (FIG. 5). They could provide
a convenient explanation for autoimmune reactions, even for local
ones, like in multiple sclerosis (MS) or Crohn's disease. In the
latter two cases in particular, the reactivation of other local
infections (potentially herpes-type viruses) would provide a
stimulus for the local amplification and gene activity of the
respective TTV-host cell nucleic acid recombinants. In MS, this
could explain recurrent episodes of disease progression. A model of
the autoimmunity concept is depicted in FIG. 5.
[0086] Similarly, rearranged TT virus molecules of 719, 642, and
621 bases have been identified which replicate autonomously upon
transfection of specific cell lines. Their DNA composition and
derivation from specific complete TTV genotypes is shown in FIG. 6.
Here the rearrangement results in novel open reading frames in part
with epitopes related to those of juvenile diabetes and rheumatoid
arthritis.
[0087] The models of the present invention for a role of TTV-host
cell nucleic acid recombinants are based on the demonstration of the
single-stranded chimeric molecules between the TTV HCR and host
cell DNA and rearranged autonomously replicating TTV molecules of
substantially reduced molecular weights. Both the TTV oncogene
concept and the TTV autoimmunity concept will clearly provide novel
approaches to prevention, diagnosis, and in particular to therapy
of these conditions and will improve the prognosis of the
respective patients.
[0088] Unless defined otherwise, all technical and scientific terms
used herein have the same meaning as commonly understood by those
of ordinary skill in the art to which the invention belongs.
Although any methods and materials similar or equivalent to those
described herein may be used in the practice or testing of the
present invention, preferred methods and materials are described.
For the purposes of the present invention, the following terms are
defined below.
[0089] By "signature motif of a mammalian protein being associated
with an autoimmune disease" is meant an amino acid sequence showing
striking identity to a motif that may be found in any of the
proteins listed in Table 1. Preferably, the length of the signature
motif is at least 5 aa, preferably at least 10 aa, more preferably
at least 20 aa, and most preferably at least 30 aa and/or the
degree of identity of this signature motif to a corresponding motif
in a mammalian protein is at least 50%, 60%, 70%, 80%, 90% or
95%.
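As a hedged illustration of the thresholds in this definition, the following sketch computes percent identity over a pre-aligned pair of motifs and checks it against a length and an identity cutoff. The function names, the convention of counting gap columns in the denominator, and the default cutoffs (5 aa, 50%) are assumptions made for illustration; the specification does not state how identity is calculated. The example pair is the OPS4_DROPS motif and the aligned query segment listed in Table 1 (B).

```python
def percent_identity(motif: str, candidate: str, gap: str = "-") -> float:
    """Percent of aligned columns holding the same residue.

    Assumed convention: both strings are already aligned to equal length,
    and gap columns count toward the denominator.
    """
    if len(motif) != len(candidate):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not motif:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for a, b in zip(motif.upper(), candidate.upper())
        if a == b and a != gap
    )
    return 100.0 * matches / len(motif)


def meets_definition(motif: str, candidate: str,
                     min_length: int = 5, min_identity: float = 50.0) -> bool:
    """Check the illustrative length and identity cutoffs (hypothetical helper)."""
    return (len(motif) >= min_length
            and percent_identity(motif, candidate) >= min_identity)


# Motif III of OPS4_DROPS versus the aligned query segment from Table 1 (B):
# 7 of the 12 aligned columns are identical.
identity = percent_identity("IYNSFHRGFALG", "-YESFHRGHAAF")
print(f"{identity:.1f}% identical")
print(meets_definition("IYNSFHRGFALG", "-YESFHRGHAAF"))
```

Raising `min_length` or `min_identity` corresponds to the "preferably" and "more preferably" tiers of the definition.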
[0090] By "antibody" is meant a protein of the immunoglobulin
family that is capable of combining, interacting or otherwise
associating with an antigen. The term "antigen" is used herein in
its broadest sense to refer to a substance that is capable of
reacting in and/or inducing an immune response. Typically, but not
necessarily, antigens are foreign to the host animal in which they
produce immune reactions.
[0091] By "epitope" is meant that part of an antigenic molecule
against which a particular immune response is directed. Typically,
in an animal, antigens present several or even many antigenic
determinants simultaneously. Thus, the terms "epitope" and
"antigenic determinant" mean an amino acid sequence that is
immunoreactive. Generally an epitope consists of 4, and more
usually 5, 6, 7, 8 or 9 contiguous amino acids. However, it should
also be clear that an epitope need not be composed of a contiguous
amino acid sequence. The immunoreactive sequence may be separated
by a linker, which is not a functional part of the epitope. The
linker does not need to be an amino acid sequence, but may be any
molecule that allows the formation of the desired epitope.
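The contiguous case described above can be sketched as a sliding window over a protein sequence. This is a minimal illustration only: the function name is hypothetical, the example sequence is the query segment of one OPSINRH3RH4_7 hit in Table 1 (B), and enumerating windows says nothing about whether any of them is actually immunoreactive. Non-contiguous epitopes are not covered by this sketch.

```python
def linear_epitope_windows(protein: str, k: int) -> list[str]:
    """Return every contiguous stretch of k residues (candidate linear epitopes)."""
    if k <= 0:
        raise ValueError("window length must be positive")
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]


segment = "RFGVQQRLPWVHSSQETQS"  # 19 residues
for k in (4, 9):
    windows = linear_epitope_windows(segment, k)
    print(k, len(windows), windows[0])
```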
[0092] The term "biological sample" as used herein refers to a
sample that may be extracted, untreated, treated, diluted or
concentrated from an animal. Biological sample refers to any
biological sample (tissue or fluid) containing a TTV polynucleic
acid of the invention and refers more particularly to blood serum
samples, plasma samples, biopsy samples, cerebrospinal fluid
samples etc.
[0093] By "carrier" is meant any substance of typically high
molecular weight to which a non- or poorly immunogenic substance
(e.g., a hapten) is naturally or artificially linked to enhance its
immunogenicity.
[0094] The term "diagnosis" is used herein in its broadest sense to
include detection of an antigen reactive to a sub-immunoglobulin
antigen-binding molecule. Also included within its scope is the
analysis of disorder mechanisms. Accordingly, the term "diagnosis"
includes the use of monoclonal antibodies for research purposes as
tools to detect and understand mechanisms associated with a disease
or condition of interest. It also includes the diagnostic use of
TTV polynucleic acid of the invention for the detection of
homologous or complementary RNA transcribed from such
molecules.
[0095] The term "immunogenicity" is used herein in its broadest
sense to include the property of evoking an immune response within
an organism. Immunogenicity typically depends partly upon the size
of the substance in question, and partly upon how unlike host
molecules it is. It is generally considered that highly conserved
proteins tend to have rather low immunogenicity.
[0096] The term "patient" refers to patients of human or other
mammal origin and includes any individual it is desired to examine
or treat using the methods of the invention. However, it will be
understood that "patient" does not imply that symptoms are present.
Suitable mammals that fall within the scope of the invention
include, but are not restricted to, primates, livestock animals
(e.g., sheep, cows, horses, donkeys, pigs), laboratory test animals
(e.g., rabbits, mice, rats, guinea pigs, hamsters), companion
animals (e.g., cats, dogs) and captive wild animals (e.g., foxes,
deer, dingoes).
[0097] By "pharmaceutically acceptable carrier" is meant a solid or
liquid filler, diluent or encapsulating substance that may be
safely used in any kind of administration.
[0098] The term "related disease or condition" is used herein to
refer to a disease or condition that is related anatomically,
physiologically, pathologically and/or symptomatically to a
reference disease or condition. For example, diseases or conditions
may be related to one another by affecting similar anatomical
locations (e.g., affecting the same organ or body part), affecting
different organs or body parts with similar physiological function
(e.g., the oesophagus, duodenum and colon, which rely on peristalsis
to move food from one end of the alimentary canal to the other), by
having similar or overlapping pathologies (e.g., tissue damage or
rupture, apoptosis, necrosis) or by having similar or overlapping
symptoms (e.g., allergic response, inflammation, lymphocytosis).
Thus, for example, an antigen associated with ulcerative colitis may
also be associated with perforation of the colon because these
diseases affect the same organ (i.e., colon).
[0099] The term "treating" is used herein in its broadest sense to
include both therapeutic and prophylactic (i.e., preventative)
treatment designed to ameliorate the disease or condition.
[0100] The term "episome" is used herein to refer to a portion of
genetic material that may exist independent of the main body of
genetic material (chromosome) at some times or continuously and
replicate autonomously, while at other times is able to integrate
into the chromosome. Examples of episomes include insertion
sequences, transposons and the TTV of the invention.
[0101] The present invention provides a rearranged TT virus
polynucleic acid which may comprise (or consist of)
[0102] (a) a nucleotide sequence shown in FIG. 6;
[0103] (b) a nucleotide sequence which shows at least 70%, 80%, 90%,
95% or at least 98% identity to a nucleotide sequence of (a) and is
capable of replicating autonomously;
[0104] (c) a fragment of a nucleotide sequence of (a) or (b) which is
capable of replicating autonomously and/or inducing autonomous
replication;
[0105] (d) a nucleotide sequence which is the complement of the
nucleotide sequence of (a), (b), or (c); or
[0106] (e) a nucleotide sequence which is redundant as a result of
the degeneracy of the genetic code compared to any of the above-given
nucleotide sequences,
[0107] wherein, preferably, said nucleotide sequence of (a), (b),
(c), (d) or (e) is linked to a polynucleic acid encoding a protein
containing a signature motif of a protein being associated with
cancer or an autoimmune disease via a phosphodiester bond.
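Item (d) above refers to the complement of a nucleotide sequence. A minimal sketch of that operation for DNA follows; the function names are hypothetical, and because the text does not say whether the complement is to be written in the same or the reverse orientation, both forms are shown. Only unambiguous bases (A, C, G, T) are handled.

```python
# Watson-Crick pairing for DNA, preserving upper/lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Base-by-base complement, written in the same orientation as the input."""
    unknown = set(seq) - set("ACGTacgt")
    if unknown:
        raise ValueError(f"unsupported characters: {sorted(unknown)}")
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complementary strand written 5' to 3'."""
    return complement(seq)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```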
[0108] Preferably, the protein is a mammalian protein. Particularly
preferably the mammalian protein is a human protein. In another
embodiment of the invention the protein is an allergen such as
gluten.
[0109] The present invention also provides fragments of the
nucleotide sequences of the present invention described above that
are capable of replicating autonomously. The skilled person may
arrive at fragments still having the biological activity of the
full-length molecule without undue experimentation. The lengths of
the fragments are not critical; however, fragments having a length
of at least 45, 55 or 65 nt are preferred.
[0110] The person skilled in the art may easily determine which
nucleic acid sequences are related to the nucleotide sequence of
FIG. 6 or which fragments are still capable of replicating
autonomously by using standard assays or the assays described in
the examples, below.
[0111] The present invention also provides polynucleic acid
sequences which are redundant as a result of the degeneracy of the
genetic code compared to any of the above-given nucleotide
sequences. These variant polynucleic acid sequences will thus
encode the same amino acid sequence as the polynucleic acids they
are derived from.
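The notion of redundancy through degeneracy can be illustrated with a short sketch that translates two coding sequences and compares the resulting amino acid sequences. It assumes the standard genetic code and an open reading frame starting at the first base; the function names and the toy sequences are hypothetical and are not taken from FIG. 6.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_protein(dna_a: str, dna_b: str) -> bool:
    """True if two sequences differ only by synonymous codon choices."""
    return translate(dna_a) == translate(dna_b)


# CGT, AGA and AGG all encode arginine, so both sequences translate to MRR.
print(translate("ATGCGTCGT"), translate("ATGAGAAGG"))
print(encode_same_protein("ATGCGTCGT", "ATGAGAAGG"))
```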
[0112] The term "polynucleic acid" refers to a single-stranded or
double-stranded nucleic acid sequence. A polynucleic acid may
consist of deoxyribonucleotides or ribonucleotides, nucleotide
analogues or modified nucleotides, or may have been adapted for
therapeutic purposes. Preferably, the rearranged TT virus
polynucleic acid is a single-stranded DNA.
[0113] Preferably, the rearranged TT virus polynucleic acid of the
invention is present as an extrachromosomal episome.
[0114] Preferably, the mammalian protein associated with cancer or
an autoimmune disease or allergen associated with an autoimmune
disease is a protein as shown in Table 1.
TABLE-US-00001 TABLE 1
(A) Examples of signature motifs identified in putative proteins
resulting from TTV-HD transcripts and full-length genomes
Protamine 1 + 2
Leukotriene B4 receptor
AIRE (AutoImmune Regulator)
Gliadin
Neuropeptide Y
CHLAMIDIAOM3 - Chlamydia mol. mimicry - heart disease
Arginine-rich (re. Sospedra et al., 2005 - molecular mimicry in MS)
Opsin
Cyclin kinase
Proxisome (diabetes steroid receptor)
Vasopressin
BDNF factor (brain-derived neurotrophic factor)
prepro-orexin
Collagen helix repeat
GIP receptor
Neurotensin
Prion
CD36 antigen (insulin resistance deficiency, atherosclerosis)
Calcitonin
Prostanoid
GABA receptor (principal inhibitory neurotransmitter in brain)
Arginine deaminase
Opioid, growth factor receptor
Galanin
Plexin/semaphorin
NURR (rat orphan nuclear hormone receptor)
Brain derived neurotrophin factor (BDN)
Collagenase + endostatin
Aerolysin
Myelin proteolipid
Serotonin
Muscarinic receptor
Melanin-concentrating hormone receptor
Sjögren's syndrome/scleroderma auto-antigen p27
Plexin/semaphorin/integrin type repeat signature
Male specific protein
Gastrin
Collagen
Collagenase metalloprotease
(B) aa sequence alignments
DomainSweep employs a variety of search methods to scan the following
protein family databases: BLOCKS, PFAMA, PRINTS, PRODOM, PROSITE,
SMART, SUPERFAMILY, TIGRFAMS
OPSIN
gbCsCt38.4ikn.2.154
OPSINRH3RH4_3: domain 1 of 1, from 46 to 56: score 8.4, E = 5.1
*->iynsFhrGfAlg<-*
y sFhrG+A
-YESFHRGHAAF 56
zc55s.B4.18dek.281
OPSINRH3RH4_3: domain 1 of 1, from 19 to 29: score 8.4, E = 5.1
*->iynsFhrGfAlg<-*
y sFhrG+A
zc55s.B4.1 19 -YESFHRGHAAF 29
rheu.cd.215rev.1.736
OPSINRH3RH4_7: domain 1 of 1, from 665 to 683: score 7.8, E = 5.3
*->RlELqKRlPWLelnEKave<-*
R+ +q+RlPW+ + ++
rheu.cd.21 665 RFGVQQRLPWVHSSQETQS 683
OPSINRH3RH4_7: domain 1 of 1, from 23 to 41: score 8.2, E = 4.4
*->RlELqKRlPWLelnEKave<-*
R+ +q+RlPW+ + ++
zc3r11.B4. 23 RFRVQQRLPWVHSSQETQS 41
gc; OPSINRH3RH4
gx; PR00577
gn; COMPOUND (7)
ga; 11-SEP-1996; UPDATE 07-JUN-1999
gt; Opsin RH3/RH4 signature
gp; PRINTS; PR00237 GPCRRHODOPSN; PR00247 GPCRCAMP; PR00248 GPCRMGR
gp; PRINTS; PR00249 GPCRSECRETIN; PR00250 GPCRSTE2; PR00899 GPCRSTE3
gp; PRINTS; PR00251 BACTRLOPSIN
gp; PRINTS; PR00238 OPSIN; PR00574 OPSINBLUE; PR00575 OPSINREDGRN
gp; PRINTS; PR00576 OPSINRH1RH2; PR00578 OPSINLTRLEYE; PR01244 PEROPSIN
gp; PRINTS; PR00666 PINOPSIN; PR00579 RHODOPSIN; PR00239 RHODOPSNTAIL
gp; PRINTS; PR00667 RPERETINALR
gp; INTERPRO; IPR000856
gr; 1. APPLEBURY, M.L. AND HARGRAVE, P.A.
gr; Molecular biology of the visual pigments.
gr; VISION RES. 26(12) 1881-1895 (1986).
gr; 2. FRYXELL, K.J. AND MEYEROWITZ, E.M.
gr; The evolution of rhodopsins and neurotransmitter receptors.
gr; J. MOL. EVOL. 33(4) 367-378 (1991).
gr; 3. ATTWOOD, T.K. AND FINDLAY, J.B.C.
gr; Design of a discriminating fingerprint for G protein-coupled receptors.
gr; PROTEIN ENG. 6(2) 167-176 (1993).
gr; 4. ATTWOOD, T.K. AND FINDLAY, J.B.C.
gr; Fingerprinting G protein-coupled receptors.
gr; PROTEIN ENG. 7(2) 195-203 (1994).
gr; 5. FRYXELL, K.J. AND MEYEROWITZ, E.M.
gr; An opsin gene that is expressed only in the R7 photoreceptor cell of
gr; Drosophila.
gr; EMBO J. 6(2) 443-451 (1987).
gr; 6. ZUKER, C.S., MONTELL, C., JONES, K., LAVERTY, T. AND RUBIN, G.M.J.
gr; A rhodopsin gene expressed in photoreceptor cell R7 of the Drosophila
gr; eye-homologies with other signal-transducing molecules.
gr; NEUROSCIENCE 7(5) 1550-1557 (1987).
gr; 7. MONTELL, C., JONES, K., ZUKER, C.S. AND RUBIN, G.M.J.
gr; A second opsin gene expressed in the ultraviolet-sensitive R7
gr; photoreceptor cells of Drosophila melanogaster.
gr; NEUROSCIENCE 7(5) 1558-1566 (1987).
gd; Opsins, the light-absorbing molecules that mediate vision [1,2], are
gd; integral membrane proteins that belong to a superfamily of G protein-
gd; coupled receptors (GPCRs). The activating ligands of the different
gd; superfamily members vary widely in structure and character, yet the
gd; proteins appear faithfully to have conserved a basic structural
gd; framework, believed to consist of 7 transmembrane (TM) helices. Although
gd; the sequences of these proteins are very diverse, reflecting to some
gd; extent this broad range of activating ligands, nevertheless, motifs
gd; have been identified in the TM regions that are characteristic of
gd; virtually the entire superfamily [3,4]. Amongst the exceptions are the
gd; olfactory receptors, which cluster together in a subfamily, which lacks
gd; significant matches with domains 2, 4 and 6. Interestingly, the opsins
gd; also seem to be emerging as increasingly atypical of the superfamily,
gd; clustering most strongly, in phylogenetic analyses, with the olfactory
gd; receptors [4]. The visual pigments comprise an apoprotein (opsin),
gd; covalently linked to the chromophore 11-cis-retinal. The covalent link
gd; is in the form of a protonated Schiff base between the retinal and a
gd; lysine residue located in TM domain 7. Vision is effected through the
gd; absorption of a photon by the chromophore, which is isomerised to the
gd; all-trans form, promoting a conformational change in the protein.
gd; By contrast with vertebrate rhodopsin, which is found in rod cells,
gd; insect photoreceptors are found in the ommatidia that comprise the
gd; compound eyes. Each Drosophila eye has 800 ommatidia, each of which
gd; contains 8 photo-receptor cells (designated R1-R8): R1-R6 are outer
gd; cells, while R7 and R8 are inner cells. Opsins RH3 and RH4 are sensitive
gd; to UV light [5-7]. OPSINRH3RH4 is a 7-element fingerprint that provides
gd; a signature for the RH3 and RH4 opsins. The fingerprint was derived from
gd; an initial alignment of 5 sequences: the motifs were drawn from conserved
gd; sections within either loop or N- and C-terminal regions, focusing on
gd; those areas of the alignment that characterise the RH3/RH4 opsins but
gd; distinguish them from the rest of the rhodopsin-like superfamily-
gd; motifs 1 and 2 lie at the N-terminus; motif 3 spans the first external
gd; loop; motif 4 lies in the second external loop; motif 5 spans the C-
gd; terminal half of TM domain 5; motif 6 lies in the third cytoplasmic
gd; loop; and motif 7 lies at the C-terminus. A single iteration on OWL28.1
gd; was required to reach convergence, no further sequences being identified
gd; beyond the starting set.
gd;
c; OPSINRH3RH43
il; 12
it; Opsin RH3/RH4 motif III-1
id; IFNSFHRGFAIY OPS4_DROME 109 52
id; IYNSFHRGFALG OPS4_DROPS 112 54
id; IYNSFHRGFALG OPS4_DROVI 115 54
id; IYNSFHQGYALG OPS3_DROME 115 54
id; IYNSFHQGYALG OPS3_DROPS 114 54
bb;
fc; OPSINRH3RH43
fl; 12
ft; Opsin RH3/RH4 motif III-2
fd; IYNSFHRGFALG OPS4_DROVI 115 54
fd; IYNSFHQGYALG OPS3_DROME 115 54
fd; IYNSFHRGFALG OPS4_DROPS 112 54
fd; IYNSFHQGYALG OPS3_DROPS 114 54
fd; IFNSFHRGFAIY OPS4_DROME 109 52
fd; IYNSFHTGFATG O61474 105 54
fd; IYNSFNTGFATG O61473 106 54
fd; IYNSFNTGFALG OPSV_APIME 105 54
fc; OPSINRH3RH47
fl; 19
ft; Opsin RH3/RH4 motif VII-2
fd; RMELQKRCPWLAIDEKAPE OPS4_DROVI 346 62
fd; RMELQKRCPWLALNEKAPE OPS3_DROME 346 62
fd; RMELQKRCPWLGVNEKSGE OPS4_DROPS 343 62
fd; RMELQKRCPWLAISEKAPE OPS3_DROPS 345 62
fd; RLELQKRCPWLGVNEKSGE OPS4_DROME 342 62
fd; RLELQKRLPWLELQEKPVA O61474 336 62
fd; RLELQKRLPWLELQEKPIE O61473 337 62
fd;
RLELQKRLPWLELQEKPIS OPSV_APIME 336 62
ARG RICH
PROSITE-PROFILES
ARG RICH Arginine-rich region
NLS_BP Bipartite nuclear lo
PFSCAN using sequence gbCsCt38.2ikn.1.726 and profile(s) PRFDIR:prosite.prf,
Command Line Parameters used: -CUTLEV = -1
Score    Raw  seq-f seq-t  prf-f prf-t  Name      Description
30.1607  170  4- 67        1- 2         ARG_RICH  Arginine-rich region
4.0000   4    10- 26       1- 17        NLS_BP    Bipartite nuclear lo
4.0000   4    32- 46       1- 17        NLS_BP    Bipartite nuclear lo
5.0000   5    52- 66       1- 17        NLS_BP    Bipartite nuclear lo
PFSCAN using sequence gbDhDi43.4rp.1.765 and profile(s) PRFDIR:prosite.prf,
October 15, 2010 15:31
Command Line Parameters used: -CUTLEV = -1
Score    Raw  seq-f seq-t  prf-f prf-t  Name      Description
33.0880  187  9- 73        1- 2         ARG_RICH  Arginine-rich region
PFSCAN using sequence zpr5.B4.12dk.209
Command Line Parameters used: -CUTLEV = -1
Score    Raw  seq-f seq-t  prf-f prf-t  Name      Description
30.1607  170  4- 67        1- 2         ARG_RICH  Arginine-rich region
PFSCAN using sequence zc55s.B4.18dek.117 and profile(s) PRFDIR:prosite.prf,
Command Line Parameters used: -CUTLEV = -1
Score    Raw  seq-f seq-t  prf-f prf-t  Name      Description
18.7959  104  4- 85        1- 2         ARG_RICH  Arginine-rich region
PFSCAN using sequence zc37.B9.2de.p1
Command Line Parameters used: -CUTLEV = -1
Score    Raw  seq-f seq-t  prf-f prf-t  Name      Description
24.3061  136  7- 86        1- 2         ARG_RICH  Arginine-rich region
Protamine 1 and Protamine 2
BLKPROB Version 5/21/00.1
Database = /gcg/husar/gcgdata/gcgblimps/blocksplus.dat
Query = gbCsCt38.2ikn.1.726 Length: Size = 726 Amino Acids
Family                       Strand  Blocks  Combined E-value
IPB000221 Protamine P1       1       1 of 1  1.3e-09
  HSP1_CHICK|P15340 1 ARYRRSRTRSRSPRSRRRRRRSGRRRSPRRRRRY
IPB000492 Protamine 2, PRM2  1       1 of 2  2.2e-09
  HSP2_PIG|P19757 55 HTRRRRSCRRRRRRACRHRRHRRGCRRIRRRRRCR
Query = gbDhDi43.4rp.1.765 Length: 765
Family                       Strand  Blocks  Combined E-value
IPB000221 Protamine P1       1       1 of 1  1.2e-11
  HSP1_DIDMA|P35305 1 ARYRRRSRSRSRSRYGRRRRRSRSRRRRSRRRRR
IPB000492 Protamine 2, PRM2  1       1 of 2  2.8e-10
  HSP2_CALJA|Q28337 69 RRRSRSCRRRRRRSCRYRRRPRRGCRSRRRRRCRR
Query = rheu.ef.242.746 Length: 746
Family                       Strand  Blocks  Combined E-value
IPB000492 Protamine 2, PRM2  1       1 of 2  1.4e-08
  HSP2_CALJA|Q28337 69 RRRSRSCRRRRRRSCRYRRRPRRGCRSRRRRRCRR
IPB000221 Protamine P1       1       1 of 1  1.5e-07
  HSP1_DIDMA|P35305 1 ARYRRRSRSRSRSRYGRRRRRSRSRRRRSRRRRR
Query = uro705rev.1a.74 Length: 74
IPB000221 Protamine P1 1/1 blocks Combined E-value = 2.8e-12
  HSP1_DIDMA|P35305 1 ARYRRRSRSRSRSRYGRRRRRSRSRRRRSRRRRR
IPB000492 Protamine 2, PRM2 1/2 blocks Combined E-value = 2.3e-10
  HSP2_CALJA|Q28337 69 RRRSRSCRRRRRRSCRYRRRPRRGCRSRRRRRCRR
Query = zpr5.B4.12dk Length: 209
IPB000221 Protamine P1       1       1 of 1  4.1e-10
  HSP1_CHICK|P15340 1 ARYRRSRTRSRSPRSRRRRRRSGRRRSPRRRRRY
IPB000492 Protamine 2, PRM2  1       1/2     7.1e-10
  HSP2_PIG|P19757 55 HTRRRRSCRRRRRRACRHRRHRRGCRRIRRRRRCR
Query = zc55s.B4.18dek.117 length: 117
Family                       Strand  Blocks  Combined E-value
IPB000492 Protamine 2, PRM2  1       1 of 2  3.4e-05
  Q91V94|Q91V94_MESAU 63 HRRRRSCRRRRRHSCRHRRRHRRGCRRSRRRRRCR
IPB000221 Protamine P1       1       1 of 1  0.0013
  HSP1_MOUSE|P02319 1 ARYRCCRSKSRSRCRRRRRRCRRRRRRCCRRRRR
Query = zc37.B9.2de.p1 length: 918
Family                       Strand  Blocks  Combined E-value
IPB000492 Protamine 2, PRM2  1       1 of 2  2.8e-05
  HSP2_ERYPA|Q9GKM0 69 RRRHRSCRRRRRRSCRHRRRHRRGCRTRRRRCRRY
IPB000221 Protamine P1       1       1 of 1  0.0001
  HSP1_CAVPO|P35304 1 ARYRCCRSPSRSRCRRRRRRFYRRRRRCHRRRRR
Sequences presented as examples:
Full-length genomes (TTV) of:
gbCsCt38.2ikn.1.726 (TTV-HD15, ORF1 = 726aa)
gbDhDi43.4rp.1.765 (TTV-HD16, ORF1 = 765aa)
rheu.ef.242.746 (TTV-HD19, ORF1 = 746aa)
uro705rev.1a.74 (TTV-HD18, ORF1a = 74aa)
Full-length genome (μTTV) of:
zpr5.B4.12dk (μTTV-HD15, ORF = 208aa)
Transcripts (from -):
zc55s.B4.18dek.117 (TTV-HD15, ORF = 117aa)
zc37.B9.2de.p1 (TTV-HD20, ORF = 109aa)
GALANIN:
HMMER 2.3.2 (Oct 2003) Copyright
(C) 1992-2003 HHMI/Washington University School of Medicine
Freely distributed under the GNU General Public License (GPL)
---------------------------------------------------------------------------
HMM file: smart.hmm
Sequence file: gbDhDi33.33ik.1c.417
galanin: domain 1 of 1, from 264 to 367: score -22.9, E = 6.5
*->atlGLgsPvkekrGWtLnsAGYLLGPHAidnHRsFsdKhGLtgKREL
t L P + r + s LGP ++ ++G+ +KR +
gbDhDi33.3 264 STHELPDPDRHPRMLQV-SDPTKLGPKT--AFHKWDWRRGMLSKRSI 307
e..pEdearpGsfdrplses.nivrtiiefLsfLhLkeaGaLdrLpglPa
++ Ed +++pl+ ++n t + L+ L +
gbDhDi33.3 308 KrvQEDSTDDEYVAGPLPRKrNKFDTRVQGPPTPEKESYTLLQALQESGQ 357
aasseDlers<-*
sseD e++
gbDhDi33.3 358 ESSSEDQEQA 367
gbDfDg33.48ikn.1b.179
galanin: domain 1 of 1, from 26 to 129: score -21.0, E = 3.9
*->atlGLgsPvkekrGWtLnsAGYLLGPHAidnHRsFsdKhGLtgKREL
t L P + r + s LGP + ++ ++G+ +KR +
gbDfDg33.4 26 STHELPDPDRHPRMLQV-SDPTKLGPKTV--FHKWDWRRGMLSKRSI 69
e..pEdearpGsfdrplses.nivrtiiefLsfLhLkeaGaLdrLpglPa
++ Ed +++pl+ ++n t + L+ L +
gbDfDg33.4 70 KrvQEDSTDDEYVAGPLPRKrNKFDTRVQGPPTPEKESYTLLQALQESGQ 119
aasseDlers<-*
sseD e++
gbDfDg33.4 120 ESSSEDQEQA 129
HMM file: smart.hmm
Sequence file: gbDhDi33.32ikn.1.648
galanin: domain 1 of 1, from 495 to 598: score -24.5, E = 9.7
*->atlGLgsPvkekrGWtLnsAGYLLGPHAidnHRsFsdKhGLtgKREL
t L P + r + s LGP + ++ ++G+ +KR +
gbDhDi33.3 495 STHELPDPDRHPRMLQV-SDPTKLGPKTV--FHKWDWRRGMLSKRSI 538
.epEdearpGs.fdrplses.nivrtiiefLsfLhLkeaGaLdrLpglPa
++ + G +++pl+ ++n t + L+ L +
gbDhDi33.3 539 kRVQGDSTDGEyVAGPLPRKrNKFDTRVQGPPTPEKESYTLLQALQESGQ 588
aasseDlers<-*
sseD e++
gbDhDi33.3 589 ESSSEDQEQA 598
gbDfDg33.45ikn.1b.210
galanin: domain 1 of 1, from 57 to 160: score -23.1, E = 6.8
*->atlGLgsPvkekrGWtLnsAGYLLGPHAidnHRsFsdKhGLtgKREL
t L P + r + s LGP + ++ +G+ +KR +
gbDfDg33.4 57 STHELPDPDRHPRMLQV-SDPTKLGPKTV--FHKWDWGRGMLSKRSI 100
e..pEdearpGsfdrplses.nivrtiiefLsfLhLkeaGaLdrLpglPa
++ Ed +++pl+ ++n t + L+ L +
gbDfDg33.4 101 KrvQEDSTDDEYVAGPLPRKrNKFDTRVQGPPTPEKESYTLLQALQESGQ 150
aasseDlers<-*
sseD e++
gbDfDg33.4 151 ESSSEDQEQA 160
PLEXIN/SEMAPHORIN/INTEGRIN TYPE REPEAT SIGNATURES HMMER 2.3.2 (Oct
2003) Copyright © 1992-2003 HHMI/Washington University
School of Medicine Freely distributed under the GNU General Public
License (GPL)
---------------------------------------------------------------------------
--------------------------------- HMM file: smart.hmm Sequence
file: gbDhDi33.32ikn.1.648 psinew7: domain 1 of 1, from 341 to 394:
score -16.8, E = 3.9 *->rCsqygv . . . tsCseCllardpyg . . .
CgWCssegrCtrg.erC Cs +++ +t+ s C+l++ p + C W + +Ct ++++ gbDhDi33.3
341 WCSEKSSkldTTKSKCILRDFPLWamaygyCDWVV---KCTGVsSAW 384
derrgsrqnwssgpssqCp<-* + +r+ + Cp gbDhDi33.3 385
TDMRI----AI-----ICP 394 Interpro: IPR003659
Plexin/semaphorin/integrin ##STR00001## Integrin beta-4 subunit
(matches 9 proteins) IPR020707 Tyrosine-protein kinase, hepatocyte
growth factor receptor (matches 82 proteins) IPR020739
Tyrosine-protein kinase, MSP receptor (matches 18 proteins)
Abstract This is a domain that has been found in plexins,
semaphorins and integrins. Plexin is involved in the development of
neural and epithelial tissues; semaphorins induce the collapse and
paralysis of neuronal growth cones; and integrins may mediate
adhesive or migratory functions of epithelial cells. Examples
---------------------------------------------------------------------------
--------------------------------- HMM file: smart.hmm Sequence
file: gbDhDi33.31ikn.1.712 psinew7: domain 1 of 1, from 341 to 378:
score -14.4, E = 2.3 *->rCsqygv . . .
tsCseCllardpygCgWCssegrCtrgerCderrgsr Cs +++ +t+ s C+l++p W +++++Cd
gbDhDi33.3 341 WCSEKSSkldTTKSKCILRDFP---LWA------MAYGHCD------ 372
qnwssgpssqCp<-* w+ +C+ gbDhDi33.3 373 --WVV----KCT 378 GASTRIN
HMMER 2.3.2 (Oct 2003) Copyright © 1992-2003
HHMI/Washington University School of Medicine Freely distributed
under the GNU General Public License (GPL)
---------------------------------------------------------------------------
--------------------------------- HMM file: prints.hmm Sequence
file: gbDhDi33.32ikn.1.648 GASTRINR_8: domain 1 of 1, from 541 to
559: *->vaGEDsDGCyvq..LPRsR<-* v G+ DG yv ++LPR R gbDhDi33.3
541 VQGDSTDGEYVAgpLPRKR 559 gc; GASTRINR gx; PR00527 gn; COMPOUND
(9) ga; 03-JUN-1996; UPDATE 10-JUN-1999 gt; Gastrin receptor
signature gp; PRINTS; PR00237 GPCRRHODOPSN; PR00247 GPCRCAMP;
PR00248 GPCRMGR gp; PRINTS; PR00249 GPCRSECRETIN; PR00250 GPCRSTE2;
PR00899 GPCRSTE3 gp; PRINTS; PR00251 BACTRLOPSIN gp; PRINTS;
PR01822 CCYSTOKININR; PR00524 CCYSTOKNINAR gp; INTERPRO; IPR000314
gr; 1. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr; Fingerprinting G
protein-coupled receptors. gr; PROTEIN ENG. 7(2) 195-203 (1994).
Gastrins and cholecystokinins (CCKs) are naturally-occurring
peptides that gd; share a common C-terminal sequence, GWMDF; full
biological activity gd; resides in this region [6]. The principal
physiological role of gastrin is gd; to stimulate acid secretion in
the stomach; it also has trophic effects on gd; gastric mucosa [6].
Gastrin is produced from a single gene transcript, and gd; is found
predominantly in the stomach and intestine, but also in vagal gd;
nerves. The CCKB receptor has a widespread distribution in the CNS
and gd; has been implicated in the pathogenesis of panic-anxiety
attacks caused gd; by CCK-related peptides [6]. It has a more
limited distribution in the gd; periphery, where it is found in
smooth muscle and secretory glands. gd; GASTRINR is a 9-element
fingerprint that provides a signature for the gd; gastrin (CCKB)
receptors. The fingerprint was derived from an initial gd;
alignment of 5 sequences: the motifs were drawn from conserved
sections gd; within either loop or N- and C-terminal regions,
focusing on those areas gd; of the alignment that characterise the
gastrin receptors but distinguish gd; them from the rest of the
rhodopsin-like superfamily - motifs 1 and 2 lie gd; at the
N-terminus; motif 3 spans the first external loop; motif 4 spans
gd; the second cytoplasmic loop; motifs 5 and 6 span the second
external loop; gd; motifs 7 and 8 span the third cytoplasmic loop;
and motif 9 lies at the gd; C-terminus. Two iterations on OWL28.0
were required to reach convergence, gd; at which point a true set
which may comprise 7 sequences was identified. gd; Several partial
matches were also found, all of which are either gastrin gd;
fragments, or members of the cholecystokinin type A receptor
family. fc; GASTRINR8 fl; 17 ft; Gastrin receptor motif VIII-2 fd;
LAGEDGDGCYVQLPRSR GASR_RABIT 288 31 fd; VAGEDNDGCYVQLPRSR
GASR_PRANA 289 30 fd; LAGEDGDGCYVQLPRSR GASR_BOVIN 290 31 fd;
AVGEDSDGCYVQLPRSR GASR_HUMAN 285 26 fd; LAGEDGDGCYVQLPRSR
GASR_CANFA 289 29 fd; LTGEDSDGCYVQLPRSR GASR_MOUSE 291 32 fd;
VAGEDSDGCCVQLPRSR GASR_RAT 290 31 COLLAGENASE HMMER 2.3.2 (Oct
2003) Copyright © 1992-2003 HHMI/Washington University
School of Medicine Freely distributed under the GNU General Public
License (GPL)
---------------------------------------------------------------------------
--------------------------------- HMM file: pfam.hmm Sequence file:
rheu.ef.241.736 Peptidase_M9: domain 1 of 1, from 125 to 412: score
-152.5, E = 7.5
*->msrlaelyllGdsiKgrhDnlWLaaaemlsYyApegkselgidicqa l ly r n W +
+el+ g+ + rheu.ef.24 125
--TLRILYDEF----TRFMNFWTVSNEDLDLCRYVGCKLIF--FKHP 163
klelaakVlPy..lyeCsgpaa.irsqdltdgqaAsaCdilrnkekdfhq + + + ++++
+++aa+i + ++ +l h+ rheu.ef.24 164
TVDFIVQINTQppFLDTHLTAAsIHPGIMMLSKRRILIPSLKTRPSRKHR 213
vkytGktPVaDDgntrveVgvfvseedykrYSafaSKEVkaqFgrvtdNG v+ V ++ + d + +S
fa t + rheu.ef.24 214
VVVR----VGAPRLFQDKWYPQSDLCDTVLLSIFA-----------TACD 248
GmYLEGNPsdagNqvrF..iAYEeaklnadlsigNlehEYthY . . . LDgR +Y G P +
v+F+ ++k ++s N+e + thY+++L + rheu.ef.24 249
LQYPFGSPLTENPCVNFqiLGPHYKKHL-SISSTNDETNKTHYesnLFNK 297
fdtYGtFsrnleeshivWWeEGfAEYvhYkqgGvPyqaApeligqgskly +Y tF ++ + e G+
v v ++ + ++g + rheu.ef.24 298
TELYNTFQTIAQ-----LKETGRTSGVNPNWTSVQNTTPLNQAGNN---A 339
lsdvftTTeeGyAElFAGShDtdRIyRWGYLA.vrf . . . mletnHnr ++ + t++ G + d
I ++++rf++ + ++l n + rheu.ef.24 340
QNSRDTWY---K-----GNTYNDNISKLAEITrQRFksatisALP-NYPT 380
dvesllvhsRyGnsfafyaylvkllgymYnnefgiw<-* + ++l ++ +G y+ ++ +g Y
g++ rheu.ef.24 381 IMSTDLYEYHSG----IYSSIFLSAGRSYFETTGAY 412
# = GF
ID Peptidase_M9 # = GF AC PF01752.9 # = GF DE Collagenase # = GF AU
Bateman A # = GF SE SWISS-PROT # = GF RM 7582017 # = GF RT
Molecular analysis of an extracellular protease gene from Vibrio #
= GF RT parahaemolyticus. # = GF RA Lee CY, Su SC, Liaw RB; # = GF
RL Microbiology 1995; 141: 2569-2576. # = GF RM 8282691 # = GF RT
Purification and characterization of Clostridium perfringens # = GF
RT 120-kilodalton collagenase and nucleotide sequence of the # = GF
RT corresponding gene. # = GF RA Matsushita O, Yoshihara K,
Katayama S, Minami J, Okabe A; # = GF RL J Bacteriol 1994; 176:
149-156. # = GF DR INTERPRO; IPR013510; # = GF DR MEROPS; M9; # =
GF CC This family of enzymes breaks down collagens. COLLAGEN HELIX
REPEAT BLKPROB Version 5/21/00.1
==========================================================================-
================================= Database =
/gcg/husar/gcgdata/gcgblimps/blocksplus.dat Copyright ©
1992-6 by the Fred Hutchinson Cancer Research Center If you use
BLOCKS in your research, please cite: Steven Henikoff and Jorja G.
Henikoff, Protein Family Classification Based on Searching a
Database of Blocks, Genomics 19: 97-107 (1994).
==========================================================================-
================================= Each numbered result consists of
one or more blocks from a PROSITE or PRINTS
==========================================================================-
================================= gbDhDi33.35ikn.2.128.pep Combined
Family Strand Blocks E-value IPB008161 Collagen helix repeat 1 1 of
1 0.0077 >IPB008161 1/1 blocks Combined E-value = 0.0077:
Collagen helix repeat Block Frame Location (aa) Block E-value
IPB008161 0 49-91 0.007 Other reported alignments: ##STR00002##
Query = rheu.ef.241.148 Length: 148 Type: P >IPB008161 1/1
blocks Combined E-value = 0.0075: Collagen helix repeat Block Frame
Location (aa) Block E-value IPB008161 0 67-109 0.0068 Other
reported alignments: ##STR00003## Query =
rheu.ef.238rev.148_2774.sreformat Length: 148 >IPB008161 1/1
blocks Combined E-value = 0.0075: Collagen helix repeat Block Frame
Location (aa) Block E-value IPB008161 0 67-109 0.0068 Other
reported alignments: ##STR00004## HMMER 2.3.2 (Oct 2003) Copyright
© 1992-2003 HHMI/Washington University School of Medicine
Freely distributed under the GNU General Public License (GPL)
---------------------------------------------------------------------------
--------------------------------- HMM file: pfam.hmm Sequence file:
rheu.ef.241.148 Collagen: domain 1 of 1, from 73 to 133: score
-74.8, E = 3.5
*->GppGppGppGppGppGppGppGpaGapGppGppGe.pGpPGppGppG G+p +pGppG p
p + p + ++G+pG++ +G+ G++ + G rheu.ef.24 73
GRPPRPGPPGGPRTPQIRNLPALPAPQGEPGDRATwRGASGADAAGG 119
ppGppGapGapGpp<-* G++Ga+G rheu.ef.24 120 DGGERGADGGDPGD 133
rheu.ef.238rev.148 Collagen Collagen triple helix repeat (20 copies)
Collagen: domain 1 of 1, from 73 to 133: score -74.8, E = 3.5
*->GppGppGppGppGppGppGppGpaGapGppGppGe.pGpPGppGppG G+p +pGppG p
p + p + ++G+pG++ +G+ G++ + G rheu.ef.23 73
GRPPRPGPPGGPRTPQIRNLPALPAPQGEPGDRATwRGASGADAAGG 119
ppGppGapGapGpp<-* G++Ga+G rheu.ef.23 120 DGGERGADGGDPGD 133 # =
GF ID Collagen # = GF AC PF01391.10 # = GF DE Collagen triple helix
repeat (20 copies) # = GF AU Bateman A, Eddy SR # = GF SE Swissprot
# = GF TP Repeat # = GF BM hmmbuild -F --prior PRIORHMM_ls.ann
SEED.ann # = GF BM hmmcalibrate --seed 0 HMM_ls # = GF BM hmmbuild
-f -F --prior PRIORHMM_fs.ann SEED.ann # = GF BM hmmcalibrate
--seed 0 HMM_fs # = GF AM byscore # = GF RM 8240831 # = GF RT New
members of the collagen superfamily # = GF RA Mayne R, Brewton RG;
# = GF RL Curr Opin Cell Biol 1993;5:883-890. # = GF DR INTERPRO;
IPR008160; # = GF DR SCOP; 1a9a; fa; # = GF DR MIM; 240400; # = GF
DC Scurvy is associated with collagens. # = GF CC Members of this
family belong to the collagen superfamily [1]. # = GF CC Collagens
are generally extracellular structural proteins # = GF CC involved
in formation of connective tissue structure. The # = GF CC
alignment contains 20 copies of the G-X-Y repeat that forms a # =
GF CC triple helix. The first position of the repeat is glycine,
the # = GF CC second and third positions may be any residue but are
frequently # = GF CC proline and hydroxyproline. Collagens are
post-translationally # = GF CC modified by proline hydroxylase to form
the hydroxyproline # = GF CC residues. Defective hydroxylation is
the cause of scurvy. Some # = GF CC members of the collagen
superfamily are not involved in # = GF CC connective tissue
structure but share the same triple helical # = GF CC structure.
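The G-X-Y repeat described in the Pfam annotation above can be illustrated with a short Python sketch. It is not part of the HMMER or Pfam analysis reported here; the function name longest_gxy_run is hypothetical, and the strict rule used (glycine at every third position, any residue at X and Y) is an assumption that is far cruder than the profile HMM scoring used for the Collagen domain hits.

```python
def longest_gxy_run(seq):
    """Find the longest run of consecutive G-X-Y triplets in a protein sequence.

    Returns (start, count): the 0-based start index of the run and the number
    of triplets in it, or (-1, 0) if the sequence contains no glycine-led
    triplet. X and Y may be any residue (illustrative assumption).
    """
    seq = seq.upper()
    best_start, best_count = -1, 0
    for start in range(len(seq)):
        count = 0
        pos = start
        # Extend the run while a complete triplet begins with glycine.
        while pos + 3 <= len(seq) and seq[pos] == "G":
            count += 1
            pos += 3
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_count


if __name__ == "__main__":
    # Toy input: three perfect G-P-P triplets.
    print(longest_gxy_run("GPPGPPGPP"))
```

A run of only a few triplets can occur by chance in glycine-rich sequences, so a count of this kind is at most a quick screen and does not replace the E-values reported by the profile search.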
MALE SPECIFIC SPERM PROTEIN HMMER 2.3.2 (Oct 2003) Copyright
© 1992-2003 HHMI/Washington University School of Medicine
Freely distributed under the GNU General Public License (GPL)
---------------------------------------------------------------------------
--------------------------------- HMM file: pfam.hmm Sequence file:
gbDhDi33.34ik.2.128
---------------------------------------------------------------------------
--------------------------------- MSSP: domain 1 of 1, from 59 to
116: score -9.5, E = 8.9
*->vgGPCgpCGPCggpcCGsccsPCg.gpCgPCgpCGpCGPccggCGPC P gp GP g+p+
P ++p P p CG ++ g gbDhDi33.3 59
QLNPEGPAGPGGPPAIL----PALpAPADPE-PAPRCGGRADGGAAA 100
GpCGPCCGttekycGl<-* G t + l gbDhDi33.3 101 GAAADADHTGYEEGDL 116
# = GF ID MSSP # = GF AC PF03940.5 # = GF DE Male specific sperm
protein # = GF CC This family of Drosophila proteins is typified by the
repetitive motif C-G-P. MICROBIAL COLLAGENASE METALLOPROTEASE (M9)
SIGNATURE HMMER 2.3.2 (Oct 2003) Copyright © 1992-2003
HHMI/Washington University School of Medicine Freely distributed
under the GNU General Public License (GPL)
---------------------------------------------------------------------------
--------------------------------- HMM file: prints.hmm Sequence
file: gbDhDi43.4rp.1.765 MICOLLPTASE_1: domain 1 of 1, from 311 to
328: score 5.3, E = 5.7 *->gletLveflRAGYYvrfyn<-* le+ +++ RA
Y f++ gbDhDi43.4 311 TLEN-ILYTRASYWNSFHA 328 gc; MICOLLPTASE gx;
PR00931 gn; COMPOUND (5) ga; 09-SEP-1998; UPDATE 07-JUN-1999 gt;
Microbial collagenase metalloprotease (M9) signature gp; PRINTS;
PR00756 ALADIPTASE; PR00791 PEPDIPTASEA; PR00730 THERMOLYSIN gp;
PRINTS; PR00787 NEUTRALPTASE; PR00782 LSHMANOLYSIN; PR00997
FRAGILYSIN gp; PRINTS; PR00786 NEPRILYSIN; PR00765 CRBOXYPTASEA;
PR00932 AMINO1PTASE gp; PRINTS; PR00789 OSIALOPTASE; PR00933
BLYTICPTASE; PR00934 XHISDIPTASE gp; PRINTS; PR00919 THERMOPTASE;
PR00998 CRBOXYPTASET; PR00768 DEUTEROLYSIN gp; PRINTS; PR00999
FUNGALYSIN; PR01000 SREBPS2PTASE gp; INTERPRO; IPR002169 gp;
PROSITE; PS00142 ZINC_PROTEASE gp; PFAM; PF00099 gr; 1. RAWLINGS,
N.D. AND BARRETT, A.J. gr; Evolutionary families of
metallopeptidases. gr; METHODS ENZYMOL. 248 183-228 (1995). gr; 2.
RAWLINGS, N.D. AND BARRETT, A.J. gr; MEROPS - Peptidase Database
gr; http://www.bi.bbsrc.ac.uk/merops/merops.htm gr; 3. RAWLINGS,
N.D. AND BARRETT, A.J. gr; Family M9 - Clan MA - Microbial
collagenase gr; http://www.bi.bbsrc.ac.uk/merops/famcards/m9.htm
gr; 4. BARRETT, A.J., RAWLINGS, N.D. AND WOESSNER, J.F. gr; Vibrio
collagenase. gr; IN HANDBOOK OF PROTEOLYTIC ENZYMES, ACADEMIC
PRESS, 1998, PP. 1096-1098. gr; 5. BARRETT, A.J., RAWLINGS, N.D.
AND WOESSNER, J.F. gr; Clostridium collagenases. gr; IN HANDBOOK OF
PROTEOLYTIC ENZYMES, ACADEMIC PRESS, 1998, PP. 1098-1102. gr; 6.
MATSUSHITA, O., YOSHIHARA, K., KATAYAMA, S., MINAMI, J. AND OKABE,
A. gr; Purification and characterization of Clostridium perfringens
120- gr; kilodalton collagenase and nucleotide sequence of the
corresponding gene. gr; J. BACTERIOL. 176 149-156 (1994). gd;
Metalloproteases are the most diverse of the four main types of
protease, gd; with more than 30 families identified to date [1]. Of
these, around gd; half contain the HEXXH motif, which has been
shown in crystallographic gd; studies to form part of the
metal-binding site [1]. The HEXXH motif is gd; relatively common,
but may be more stringently defined for metallo- gd; proteases as
abXHEbbHbc, where a is most often valine or threonine and gd; forms
part of the S1' subsite in thermolysin and neprilysin, b is an gd;
uncharged residue, and c a hydrophobic residue. Proline is never
found gd; in this site, possibly because it would break the helical
structure gd; adopted by this motif in metalloproteases [1]. gd;
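As an illustration of the HEXXH motif discussed in the preceding description lines, the following Python sketch lists every position at which the pattern His-Glu-Xaa-Xaa-His occurs in a protein sequence. It is not part of the PRINTS fingerprint search reported here; the function name find_hexxh is hypothetical, and the sketch implements only the loose HEXXH pattern, not the more stringent abXHEbbHbc definition, whose residue classes are not fully specified in the text.

```python
import re

# Zero-width lookahead so that overlapping occurrences are all reported.
_HEXXH = re.compile(r"(?=HE[A-Z]{2}H)")


def find_hexxh(seq):
    """Return the 0-based start positions of every HEXXH occurrence in seq."""
    return [match.start() for match in _HEXXH.finditer(seq.upper())]


if __name__ == "__main__":
    # Toy input with a single HEXXH occurrence starting at index 2.
    print(find_hexxh("AAHEAAHKK"))
```

Because the text notes that HEXXH is relatively common, a hit from a scan of this kind is only a starting point and would need support from the other fingerprint elements.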
Metalloproteases may be split into five groups on the basis of
their metal- gd; binding residues: the first three contain the
HEXXH motif, the other two gd; do not [1]. In the first group, a
glutamic acid completes the active site- gd; these are termed
HEXXH+E: all families in this group show some sequence gd;
relationship and have been assigned to clan MA [1]. The second
group, which gd; have a third histidine as the extra metal-binding
residue, are termed gd; HEXXH+H and are grouped into clan MB on the
basis of their inter-relation- gd; ship[1]. In the third group, the
additional metal-binding residues are gd; unidentified. The fourth
group is diverse - the metal-binding residues are gd; known but do
not form the HEXXH motif. And the fifth group may comprise the gd;
remaining families where the metal-binding residues are as yet
unknown
gd; [1,2]. Microbial collagenases have been identified from
bacteria of both the gd; Vibrio and Clostridium genera. They are
zinc-containing metallopeptidases gd; that belong to the M9
protease family, which forms part of the MA clan gd; [1,3].
Collagenase is used during bacterial attack to degrade the collagen
gd; barrier of the host during invasion. Vibrio bacteria are
non-pathogenic, and gd; are sometimes used in hospitals to remove
dead tissue from burns and ulcers gd; [4]. Clostridium histolyticum
is a pathogen that causes gas gangrene; gd; nevertheless, the
isolated collagenase has been used to treat bed sores [5]. gd;
Collagen cleavage occurs at Xaa+Gly bonds in Vibrio bacteria and at
Yaa+Gly gd; bonds in Clostridium collagenases [4,5]. gd; Analysis
of the primary structure of the gene product from Clostridium gd;
perfringens has revealed that the enzyme is produced with a stretch
of 86 gd; residues that contain a putative signal sequence [6].
Within this stretch gd; is found PLGP, an amino acid sequence
typical of collagenase substrates. gd; This sequence may thus be
implicated in self-processing of the collagenase [6]. gd;
MICOLLPTASE is a 5-element fingerprint that provides a signature
for gd; microbial collagenase zinc metallopeptidases (M9). The
fingerprint was gd; derived from an initial alignment of 4
sequences: the motifs were drawn from gd; conserved regions
spanning virtually the full alignment length - motif 4 gd; includes
the region encoded by the PROSITE pattern ZINC PROTEASE (PS00142),
gd; which describes the HEXXH active site; and motif 5 contains the
active site gd; glutamate. Two iterations on OWL31.1 were required
to reach convergence, gd; at which point a true set which may
comprise 8 sequences was identified. tp; COLA_CLOPE O54108
COLA_VIBAL Q46085 tp; COLA_VIBPA sn; Codes involving 4 elements st;
O86030 tt; COLA_CLOPE MICROBIAL COLLAGENASE PRECURSOR (EC 3.4.24.3)
(120 KD COLLAGENASE-CLOSTRIDIUM tt; O54108 PUTATIVE SECRETED
PROTEASE - STREPTOMYCES COELICOLOR. tt; COLA_VIBAL MICROBIAL
COLLAGENASE PRECURSOR (EC 3.4.24.3) - VIBRIO ALGINOLYTICUS. tt;
Q46085 COLLAGENASE PRECURSOR - CLOSTRIDIUM HISTOLYTICUM. tt;
COLA_VIBPA MICROBIAL COLLAGENASE PRECURSOR (EC 3.4.24.3) - VIBRIO
PARAHAEMOLYTICUS. tt; O86030 COLLAGENASE - VIBRIO CHOLERAE. ic;
MICOLLPTASE1 il; 19 it; Microbial collagenase motif I-1 id;
GIPTLVEFLRAGYYLGFYN COLA_CLOPE 159 159 id; ELETLFLYLRAGYYAEFYN
COLA_VIBAL 144 144 id; VLENLGEFVRAAYYVRYNA COLA_VIBPA 97 97 id;
RLENYGEFIRAAYYVRYNA AF080248 97 97 bb; MIC1 microneme protein
signature HMMER 2.3.2 (Oct 2003) Copyright © 1992-2003
HHMI/Washington University School of Medicine Freely distributed
under the GNU General Public License (GPL)
---------------------------------------------------------------------------
--------------------------------- HMM file: prints.hmm Sequence
file: rheu.ef.242.746 MIC1MICRNEME_5: domain 1 of 1, from 448 to
463: score 6.6, E = 4.4 *->TyiStkLdVaVGSCHk<-* T t+L Va GSC
rheu.ef.24 448 TKADTQLIVAGGSCKA 463 gc; MIC1MICRNEME gx; PR01744
gn; COMPOUND (7) ga; 03-JUL-2002 gt; MIC1 microneme protein
signature gr; 1. SIBLEY, L.D., MORDUE, D. AND HOWE, K. gr;
Experimental approaches to understanding virulence in
toxoplasmosis. gr; IMMUNOBIOL. 201 210-224 (1999). gr; 2.
CARRUTHERS, V.B. gr; Armed and dangerous: Toxoplasma gondii uses an
arsenal of secretory gr; proteins to infect host cells. gr;
PARASITOL.INT. 48 1-10 (1999). gr; 3. FOURMAUX, M.N., ACHBAROU, A.,
MERCEREAU-PUIJALON, O., BIDERRE, C., gr; BRICHE, I., LOYENS, A.,
ODBERG-FERRAGUT, C., CAMUS, D. AND DUBREMETZ, J.F. gr; The MIC1
microneme protein of Toxoplasma gondii contains a duplicated gr;
receptor-like domain and binds to host cell surface. gr;
MOL.BIOCHEM.PARASITOL. 20 201-210 (1996). gr; 4. LOURENCO, E.V.,
PEREIRA, S.R., FACA, V.M., COELHO-CASTELO, A.A., gr; MINEO, J.R.,
ROQUE-BARREIRA, M.C., GREENE, L.J. AND PANUNTO-CASTELO, A. gr;
Toxoplasma gondii micronemal protein MIC1 is a lactose-binding
lectin. gr; GLYCOBIOL. 11 541-547 (2001). gr; 5. KELLER, N.,
NAGULESWARAN, A., CANNAS, A., VONLAUFEN, N., BIENZ, M., gr;
BJORKMAN, C., BOHNE, W. AND HEMPHILL, A. gr; Identification of a
Neospora caninum microneme protein (NcMIC1) which gr; interacts
with sulphated host cell surface glycosaminoglycans. gr;
INFECT.IMMUN. 70 187-198 (2002). gd; Toxoplasma gondii is an
obligate intracellular apicomplexan protozoan gd; parasite, with a
complex lifestyle involving varied hosts [1]. It has two gd; phases
of growth: an intestinal phase in feline hosts, and an extra- gd;
intestinal phase in other mammals. Oocysts from infected cats
develop gd; into tachyzoites, and eventually, bradyzoites and
zoitocysts in the gd; extraintestinal host [1]. Transmission of the
parasite occurs through gd; contact with infected cats or
raw/undercooked meat; in immunocompromised gd; individuals, it may
cause severe and often lethal toxoplasmosis. Acute gd; infection in
healthy humans may sometimes also cause tissue damage [1]. gd; The
protozoan utilises a variety of secretory and antigenic proteins to
gd; invade a host and gain access to the intracellular environment
[2]. These gd; originate from distinct organelles in the T. gondii
cell termed micronemes, gd; rhoptries, and dense granules. They are
released at specific times during gd; invasion to ensure the
proteins are allocated to their correct target gd; destinations
[2]. gd; MIC1, a protein secreted from the microneme, is a
456-residue moiety gd; involved in host cell recognition by the
parasite [3]. The protein is gd; released from the apical pole of
T.gondii during infection, and attaches to gd; host-specific
receptors [4]. Recent studies have demonstrated that Mic1 is gd; a
lactose-binding lectin, and utilises this to enhance its binding to
host gd; endothelial cells [4]. A homologue of Mic1 found in
Neospora caninum gd; interacts with sulphated host cell-surface
glycosaminoglycans [5]. gd; MIC1MICRNEME is a 7-element fingerprint
that provides a signature for the gd; MIC1 microneme proteins. The
fingerprint was derived from an initial gd; alignment of 2
sequences: the motifs were drawn from conserved regions gd;
spanning the C-terminal portion of the alignment (~380 amino
acids). A gd; single iteration on SPTR40_20f was required to reach
convergence, no gd; further sequences being identified beyond the
starting set. bb; ic; MIC1MICRNEME5 il; 16 it; MIC1 microneme
protein motif V-1 id; TFISTKLDVAVGSCHS O00834 341 133 id;
TYSSPQLHVSVGSCHK Q8WRS0 344 138 AUTOIMMUNE REGULATOR (AIRE)
SIGNATURE HMMER 2.3.2 (Oct 2003) Copyright © 1992-2003
HHMI/Washington University School of Medicine Freely distributed
under the GNU General Public License (GPL) HMM file: prints.hmm
Sequence file: rheu.ef.241.736 AIREGULATOR_4: domain 1 of 1, from
138 to 152: score 6.4, E = 9.2 *->DFWRvLFKDYnLERY<-* FW v D L
RY rheu.ef.24 138 NFWTVSNEDLDLCRY 152 rheu.ef.234rev.628
AIREGULATOR_4: domain 1 of 1, from 30 to 44: score 6.4, E = 9.2
*->DFWRvLFKDYnLERY<-* FW v D L RY rheu.ef.23 30
NFWTVSNEDLDLCRY 44 rheu.cd.215rev.1.736 AIREGULATOR_4: domain 1 of
1, from 138 to 152: score 6.4, E = 9.2 *->DFWRvLFKDYnLERY<-*
FW v D L RY rheu.cd.21 138 NFWTVSNEDLDLCRY 152 gc; AIREGULATOR gx;
PR01711 gn; COMPOUND (8) ga; 13-MAR-2002 gt; Autoimmune regulator
(AIRE) signature gr; 1. The Finnish-German APECED Consortium. gr;
An autoimmune disease, APECED, caused by mutations in a novel gene
gr; featuring two PHD-type zinc-finger domains. gr; NAT.GENET. 17
399-403 (1997). gr; 2. MITTAZ, L., ROSSIER, C., HEINO, M.,
PETERSON, P., KROHN, K.J.E., GOS, A., gr; MORRIS, M.A., KUDOH, J.,
SHIMIZU, N., ANTONARAKIS, S.E. AND SCOTT, H.S. gr; Isolation and
characterisation of the mouse Aire gene. gr;
BIOCHEM.BIOPHYS.RES.COMMUN. 255 483-490 (1999). gr; 3. PETERSON,
H.M., KUDOH, J., NAGAMINE, K., LAGERSTEDT, A., OVOD, V., gr; RANKI,
A., RANTALA, I., NIEMINEN, M., TUUKKANEN, J., SCOTT, H.S., gr;
ANTONARAKIS, S.E., SHIMIZU, N. AND KROHN, K. gr; Autoimmune
regulator is expressed in the cells regulating immune tolerance gr;
in thymus medulla. gr; BIOCHEM. BIOPHYS. RES. COMMUN. 257 821-825
(1999). gr; 4. KUMAR, P.G., LALORAYA, M., WANG, C.Y., RUAN, Q.G.,
SEMIROMI, A.D., gr; KAO. K.J. AND SHE, J.X. gr; The autoimmune
regulator (AIRE) is a DNA-binding protein. gr; J. BIOL. CHEM. 276
41357-41364 (2001). gd; AIRE (AutoImmune REgulator) is the
predicted protein responsible for a rare gd; autosomal recessively
inherited disease termed APECED. APECED, also gd; called Autoimmune
Polyglandular Syndrome type I (APS 1), is the only gd; described
autoimmune disease with established monogenic background, being gd;
localised outside the major histocompatibility complex region. It
is gd; characterised by the presence of two of the three major
clinical entities, gd; chronic mucocutaneus candidiasis,
hypoparathyroidism and Addison's disease. gd; Other immunologically
mediated phenotypes, including insulin-dependent gd; diabetes
mellitus (IDDM), gonadal failure, chronic gastritis, vitiligo, gd;
autoimmune thyroid disease, enamel hypoplasia, and alopecia may
also gd; be present. Immunologically, APECED patients have
deficient T cell gd; responses towards Candida antigens, and
clinical symptoms both within and gd; outside the endocrine system,
mainly as a result of autoimmunity against gd; organ-specific
autoantigens [1,2]. gd; AIRE has motifs suggestive of a
transcriptional regulator protein. It gd; harbours two zinc fingers
of the plant homeodomain (PHD) type. A putative DNA-binding domain,
termed SAND, as well as four nuclear receptor binding LXXLL gd;
motifs, an inverted LXXLL domain, and a variant of the latter
(FXXLL), hint gd; that this protein functions as a transcription
coactivator. Furthermore, a gd; highly conserved N-terminal
100-amino acid domain in AIRE shows significant gd; similarity to
the homogeneously staining (HSR) domain of Sp100 and Sp140 gd;
proteins, which has been shown to function as a dimerisation domain
in gd; several Sp-100 related proteins [2-4]. gd; AIRE has a dual
subcellular location. It is not only expressed in multiple gd;
immunologically relevant tissues, such as the thymus, spleen, lymph
nodes gd; and bone marrow, but it has also been detected in various
other tissues, gd; such as kidney, testis, adrenal glands, liver
and ovary, suggesting that gd; APECED proteins might also have a
function outside the immune system. gd; However, AIRE is not
expressed in the target organs of autoimmune
gd; destruction. At the subcellular level, AIRE may be found in the
cell nucleus gd; in a speckled pattern in domains resembling
promyeolocytic leukaemia nuclear gd; bodies, also known as ND10,
nuclear dots or potential oncogenic domains gd; associated with the
AIRE homologous nuclear proteins Sp100, Sp140, and gd; Lysp100. The
nuclear localisation of AIRE, in keeping with its predicted gd;
protein domains, suggests that it may regulate the mechanisms
involved in the gd; induction and maintenance of immune tolerance
[3,4]. gd; AIREGULATOR is an 8-element fingerprint that provides a
signature for the gd; AIRE autoimmune regulators. The fingerprint
was derived from an initial gd; alignment of 6 sequences: the
motifs were drawn from conserved regions gd; largely spanning the
N-terminal and central portions of the alignment, gd; focusing on
those sections that characterise the autoregulators but gd;
distinguish them from those possessing SAND and PHD domains. Two
iterations gd; on SPTR39_17f were required to reach convergence, at
which point a true set gd; which may comprise 14 sequences was
identified. fc; AIREGULATOR4 fl; 15 ft; Autoimmune regulator (AIRE)
motif IV-1 fd; DFWRILFKDYNLERY Q9JLW0 77 18 fd; DFWRILFKDYNLERY
Q9Z0E3 77 18 fd; DFWRILFKDYNLERY Q9JLX0 77 18 fd; DFWRILFKDYNLERY
Q9JLW9 77 18 fd; DFWRILFKDYNLERY Q9JLW8 77 18 fd; DFWRILFKDYNLERY
Q9JLW7 77 18 fd; DFWRILFKDYNLERY Q9JLW6 77 18 fd; DFWRILFKDYNLERY
Q9JLW5 77 18 fd; DFWRILFKDYNLERY Q9JLW4 77 18 fd; DFWRILFKDYNLERY
Q9JLW3 77 18 fd; DFWRILFKDYNLERY Q9JLW2 77 18 fd; DFWRILFKDYNLERY
Q9JLW1 77 18 fd; DFWRVLFKDYNLERY AIRE_HUMAN 76 18 fd;
DFWRVLFKDYNLERY O75745 76 18 GLIADIN HMMER 2.3.2 (Oct 2003)
Copyright © 1992-2003 HHMI/Washington University School of
Medicine Freely distributed under the GNU General Public License
(GPL) HMM file: prints.hmm Sequence file: rheu.ef.241.736
GLIADIN_7: domain 1 of 1, from 688 to 708: score 17.7, E = 0.056
*->PqaqGsvqPqqLPqFeEiRnL<-* qaqGsvq q L q E R L rheu.ef.24
688 TQAQGSVQEQLLLQLREQRVL 708 rheu.ef.234rev.628 GLIADIN_7: domain
1 of 1, from 580 to 600: score 17.7, E = 0.056
*->PqaqGsvqPqqLPqFeEiRnL<-* qaqGsvq q L q E R L rheu.ef.23
580 TQAQGSVQEQLLLQLREQRVL 600 rheu.cd.215rev.1.736 GLIADIN_7:
domain 1 of 1, from 688 to 708: score 18.3, E = 0.037
*->PqaqGsvqPqqLPqFeEiRnL<-* qaqGsvq q L q E R L rheu.cd.21
688 TQAQGSVQDQLLLQLREQRVL 708 GLIADIN_7: domain 1 of 1, from 46 to
66: score 18.3, E = 0.037 *->PqaqGsvqPqqLPqFeEiRnL<-* qaqGsvq
q L q E R L zc3r11.B4. 46 TQAQGSVQDQLLLQLREQRVL 66 gc; GLIADIN gx;
PR00209 gn; COMPOUND (9) ga; 21-OCT-1992; UPDATE 19-JUN-1999 gt;
Alpha/beta gliadin family signature gp; PRINTS; PR00208
GLIADGLUTEN; PR00211 GLUTELIN; PR00210 GLUTENIN gp; INTERPRO;
IPR001376 gr; 1. SHEWRY, P. AND MORGAN, M. gr; Gluten - proteins
that put the springiness into bread and are implicated gr; in food
intolerance syndromes such as coeliac disease. gr; IN PROTEIN POWER
AFRC NEWS SUPPLEMENT (1992). gr; 2. OKITA T.W., CHEESBROUGH V. AND
REEVES C.D. gr; Evolution and heterogeneity of the alpha-type,
beta-type, and gamma-type gr; gliadin DNA sequences. gr; J. BIOL.
CHEM. 260 (13) 8203-8213 (1985). gr; 3. RAFALSKI J.A. gr; Structure
of wheat gamma-gliadin genes. gr; GENE 43 (3) 221-229 (1986). gd;
Gluten is the protein component of wheat flour. It consists of
numerous gd; proteins, which are of 2 different types responsible
for different physical gd; properties of dough [1]: the glutenins,
which are primarily responsible for gd; the elasticity, and the
gliadins, which contribute to the extensibility. gd; The gliadins
themselves are of different types (e.g., alpha/beta or gamma) gd;
and, like the glutenins, contain repetitive sequences [2] that form
loose gd; helical structures, but they are usually associated with
more extensive gd; non-repetitive regions, which are compact and
globular [3]. gd; GLIADIN is a 9-element fingerprint that provides
a signature for the gd; alpha/beta gliadins. The fingerprint was
derived from an initial align- gd; ment of 5 sequences: motifs 2
and 3 encode the Gln/Pro-rich tandem repeats. gd; Two iterations on
OWL18.0 were required to reach convergence, at which gd; point a
true set which may comprise 14 sequences was identified. Several
gd; partial matches were also found: 3 of these are alpha/beta
gliadin gd; fragments: GDA1_WHEAT and B22364 both lack the
C-terminal part of the gd; sequence bearing the last 2 motifs, and
GDA8_WHEAT lacks the N-terminal gd; part of the sequence bearing
the first 3 motifs. gd; In addition to the alpha/beta gliadin
fragments, a number of other partial gd; matches were identified:
these included gamma-gliadins, low molecular gd; weight glutenins,
avenins, secalins, and so on. Most of these fail to gd; match, or
at least match only poorly, those motifs that encode the tandem gd;
repeats - clearly they are characterised by their own distinctive
gd; signatures in this region. The fingerprint thus provides
reasonable gd; discrimination between the alpha/beta type gliadins
and the gamma type and gd; related proteins. fc; GLIADIN7 fl; 21 ft;
Gliadin motif VII-2 fd; PQAQGSVQPQQLPQFEEIRNL GDA9_WHEAT 259 6 fd;
PQAQGSVQPQQLPQFEEIRNL GDA6_WHEAT 246 6 fd; PQAQGSVQPQQLPQFEEIRNL
Q41509 239 6 fd; PQAQGSVQPQQLPQFEEIRNL Q41531 241 6 fd;
PQAQGSVQPQQLPQFEEIRNL GDA0_WHEAT 238 6 fd; PQAQGSVQPQQLPQFAEIRNL
GDA7_WHEAT 263 6 fd; PQAQGSVQPQQLPQFAEIRNL Q41546 263 6 fd;
PQAQGSFQPQQLPQFEEIRNL GDA2_WHEAT 243 6 fd; PQAQGSVQPQQLPQFEEIRNL
Q41632 246 6 fd; PQAQGSVQPQQLPQFEEIRNL Q41530 240 6 fd;
PQAQGSVQPQQLPQFAEIRNL Q41529 263 6 fd; PQAQGSVQPQQLPQFAEIRNL
GDA5_WHEAT 269 6 fd; PQAQGSVQPQQLPQFAEIRNL Q41545 268 6 fd;
PQTQGSVQPQQLPQFEEIRNL Q41528 239 6 fd; PQAQGSVQPQQLPQFEEIRNL
GDA4_WHEAT 249 6 fd; PQAQGSVQPQQLPQFQEIRNL GDA3_WHEAT 232 6
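The HMMER 2.3.2 domain summaries reproduced in this listing (for example, "GLIADIN_7: domain 1 of 1, from 688 to 708: score 17.7, E = 0.056") follow a fixed pattern and may be extracted programmatically. The following is a minimal sketch in Python, assuming the listing is available as plain text in which a summary may be wrapped across lines. The names DomainHit and parse_hits are hypothetical and are not part of HMMER.

```python
import re
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    """One 'domain i of n' summary line (hypothetical container)."""
    model: str
    domain_index: int
    domain_count: int
    start: int
    end: int
    score: float
    evalue: float


# Pattern for summaries such as:
#   GLIADIN_7: domain 1 of 1, from 688 to 708: score 17.7, E = 0.056
_HIT = re.compile(
    r"(?P<model>\S+): domain (?P<i>\d+) of (?P<n>\d+), "
    r"from (?P<start>\d+) to (?P<end>\d+): "
    r"score (?P<score>-?\d+(?:\.\d+)?), "
    r"E = (?P<evalue>\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)"
)


def parse_hits(text: str) -> List[DomainHit]:
    """Return every domain summary found in a (possibly line-wrapped) listing."""
    # Collapse all whitespace so that summaries split across lines still match.
    flat = " ".join(text.split())
    return [
        DomainHit(
            model=m["model"],
            domain_index=int(m["i"]),
            domain_count=int(m["n"]),
            start=int(m["start"]),
            end=int(m["end"]),
            score=float(m["score"]),
            evalue=float(m["evalue"]),
        )
        for m in _HIT.finditer(flat)
    ]


if __name__ == "__main__":
    sample = (
        "GLIADIN_7: domain 1 of 1, from 688 to 708: score 17.7, E = 0.056 "
        "rheu.ef.234rev.628 GLIADIN_7: domain\n"
        "1 of 1, from 580 to 600: score 17.7, E = 0.056"
    )
    for hit in parse_hits(sample):
        print(hit.model, hit.start, hit.end, hit.score, hit.evalue)
```

Whitespace is collapsed before matching because the summaries in this listing are frequently broken across lines. The sketch reads only the summary fields and leaves the alignment rows that follow each summary untouched.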
NEUROPEPTIDE Y2 RECEPTOR SIGNATURE HMMER 2.3.2 (Oct 2003) Copyright
(C) 1992-2003 HHMI/Washington University School of Medicine
Freely distributed under the GNU General Public License (GPL) HMM
file: prints.hmm Sequence file: rheu.ef.241.736 NRPEPTIDEY2R_9:
domain 1 of 1, from 664 to 677: score 8.9, E = 3.1
*->AFLsAFRCEqRLDAiHs<-* sAFR qR+ +Hs rheu.ef.24 664
---SAFRVQQRVPWVHS 677 rheu.ef.234rev.628 NRPEPTIDEY2R_9: domain 1
of 1, from 556 to 569: score 8.9, E = 3.1
*->AFLsAFRCEqRLDAiHs<-* sAFR qR+ +Hs rheu.ef.23 556
---SAFRVQQRVPWVHS 569 NRPEPTIDEY2R_9: domain 1 of 1, from 22 to 35:
score 7.2, E = 6.3 *->AFLsAFRCEqRLDAiHs<-* s FR qRL +Hs
zc3r11.B4. 22 ---SRFRVQQRLPWVHS 35 gc; NRPEPTIDEY2R gx; PR01014 gn;
COMPOUND (11) ga; 30-NOV-1998; UPDATE 07-JUN-1999 gt; Neuropeptide
Y2 receptor signature gp; PRINTS; PR00237 GPCRRHODOPSN; PR00247
GPCRCAMP; PR00248 GPCRMGR gp; PRINTS; PR00249 GPCRSECRETIN; PR00250
GPCRSTE2; PR00899 GPCRSTE3 gp; PRINTS; PR00251 BACTRLOPSIN gp;
PRINTS; PR01012 NRPEPTIDEYR; PR01013 NRPEPTIDEY1R; PR01015
NRPEPTIDEY4R gp; PRINTS; PR01016 NRPEPTIDEY5R; PR01017 NRPEPTIDEY6R
gp; INTERPRO; IPR001358 gr; 1. ATTWOOD, T.K. AND FINDLAY, J.B.C.
gr; Fingerprinting G protein-coupled receptors. gr; PROTEIN ENG. 7
(2) 195-203 (1994). gr; 2. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr; G
protein-coupled receptor fingerprints. gr; 7TM, VOLUME 2, EDS. G.
VRIEND AND B. BYWATER (1993). gr; 3. BIRNBAUMER, L. gr; G proteins
in signal transduction. gr; ANNU. REV. PHARMACOL. TOXICOL. 30
675-705 (1990). gr; 4. CASEY, P.J. AND GILMAN, A.G. gr; G protein
involvement in receptor-effector coupling. gr; J. BIOL. CHEM. 263
(6) 2577-2580 (1988). gr; 5. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr;
Design of a discriminating fingerprint for G protein-coupled
receptors. gr; PROTEIN ENG. 6 (2) 167-176 (1993). gr; 6. WATSON, S.
AND ARKINSTALL, S. gr; Neuropeptide Y. gr; IN THE G PROTEIN-LINKED
RECEPTOR FACTSBOOK, ACADEMIC PRESS, 1994, PP. 194-198. gd; G
protein-coupled receptors (GPCRs) constitute a vast protein family
that gd; encompasses a wide range of functions (including various
autocrine, para- gd; crine and endocrine processes). They show
considerable diversity at the gd; sequence level, on the basis of
which they may be separated into distinct gd; groups. Applicants
use the term clan to describe the GPCRs, as they embrace gd; a
group of families for which there are indications of evolutionary
gd; relationship, but between which there is no statistically
significant gd; similarity in sequence [1,2]. The currently known
clan members include the gd; rhodopsin-like GPCRs, the
secretin-like GPCRs, the cAMP receptors, the gd; fungal mating
pheromone receptors, and the metabotropic glutamate receptor gd;
family. The rhodopsin-like GPCRs themselves represent a widespread
protein gd; family that includes hormone, neurotransmitter and
light receptors, all of gd; which transduce extracellular signals
through interaction with guanine gd; nucleotide-binding (G)
proteins. Although their activating ligands vary gd; widely in
structure and character, the amino acid sequences of the gd;
receptors are very similar and are believed to adopt a common
structural gd; framework which may comprise 7 transmembrane (TM)
helices [3-5]. gd; Neuropeptide Y (NPY) is one of the most abundant
peptides in mammalian gd; brain, inducing a variety of behavioural
effects (e.g., stimulation of food gd; intake, anxiety,
facilitation of learning and memory, and regulation of the gd;
cardiovascular and neuroendocrine systems) [6]. In the periphery,
NPY gd; stimulates vascular smooth muscle contraction and modulates
hormone gd; secretion. NPY has been implicated in the
pathophysiology of hypertension, gd; congestive heart failure,
affective disorders and appetite regulation [6]. gd; Several
pharmacologically distinct neuropeptide Y receptors have been gd;
characterised, designated NPY Y1-Y6. High densities of Y2 receptors
are gd; present in rat hippocampus and are also found in high
levels in superficial gd; layers of cortex, certain thalamic
nuclei, lateral septum, and anterior gd; olfactory nuclei; lower
levels are found in striatum [6]. The
receptors are gd; found in high levels in smooth muscle (e.g., vas
deferens and intestine), gd; kidney proximal tubules and in cell
lines [6]. They are believed to have a gd; predominantly
presynaptic location, and are involved in inhibition of gd;
adenylyl cyclase and voltage dependent calcium channels via a
pertussis- gd; toxin-sensitive G protein, probably of the G0/Gi
class [6]. gd; NRPEPTIDEY2R is an 11-element fingerprint that
provides a signature for gd; neuropeptide Y2 receptors. The
fingerprint was derived from an initial gd; alignment of 2
sequences: the motifs were drawn from conserved sections gd; within
either loop or TM regions, focusing on those areas of the alignment
gd; that characterise the Y2 receptors but distinguish them from
the rest of gd; the neuropeptide Y family - motifs 1-3 span the
N-terminus, leading into gd; TM domain 1; motifs 4 and 5 span the
C-terminus of TM domain 4 and the gd; second external loop; motifs
6 and 7 span the C-terminus of TM domain 5 gd; and the third
cytoplasmic loop; motif 8 spans the C-terminus of TM domain 6 gd;
and the third external loop; and motifs 9-11 reside at the
C-terminus. Two gd; iterations on OWL30.2 were required to reach
convergence, at which point gd; a true set which may comprise 5
sequences was identified. Two partial gd; matches were also found:
OAU83458 is an ovine neuropeptide Y2 receptor gd; fragment that
matches motifs 4-6; and AF054870 is a rat neuropeptide Y2 gd;
receptor fragment that matches motifs 5 and 6. fc; NRPEPTIDEY2R9
fl; 17 ft; Neuropeptide Y2 receptor motif IX-2 fd;
AFLSAFRCEQRLDAIHS NY2R_HUMAN 335 29 fd; AFLSAFRCEQRLDAIHS
NY2R_BOVIN 338 29 fd; AFLSAFRCEQRLDAIHS NY2R_MOUSE 339 29 fd;
AFLSAFRCEQRLDAIHS NY2R_PIG 337 29 AEROLYSIN HMMER 2.3.2 (Oct 2003)
Copyright (C) 1992-2003 HHMI/Washington University School of
Medicine Freely distributed under the GNU General Public License
(GPL)
HMM file: prints.hmm Sequence
file: rheu.ef.241.736 AEROLYSIN_7: domain 1 of 1, from 602 to 621:
score 3.4, E = 9.3 *->wDKRYiPGEvKWWDWnWtiq<-* +D +Y+ Ev W W
rheu.ef.24 602 VDPKYVTPEVTWHSWDIRRG 621 rheu.ef.234rev.628
AEROLYSIN_7: domain 1 of 1, from 494 to 513: score 3.4, E = 9.3
*->wDKRYiPGEvKWWDWnWtiq<-* +D +Y+ Ev W W rheu.ef.23 494
VDPKYVTPEVTWHSWDIRRG 513 HMM file: prints.hmm Sequence file:
uro742rev.109r AEROLYSIN_7: domain 1 of 1, from 65 to 84: score
3.6, E = 8.6 *->wDKRYiPGEvKWWDWnWtiq<-* + G K W WnW+ +
uro742rev. 65 FAWVLASGTAKCWSWNWSAR 84 AEROLYSIN_7: domain 1 of 1,
from 65 to 84: score 3.6, E = 8.6 *->wDKRYiPGEvKWWDWnWtiq<-*
+ G K W WnW+ + zc37.B9.2d 65 FAWVLASGTAKCWSWNWSAR 84 gc; AEROLYSIN
gx; PR00754 gn; COMPOUND (9) ga; 25-AUG-1997; UPDATE 06-JUN-1999
gt; Aerolysin signature gp; INTERPRO; IPR001776 gp; PROSITE;
PS00274 AEROLYSIN gp; PFAM; PF01117 Aerolysin gr; 1. PARKER, M.W.,
BUCKLEY, J.T., POSTMA, J.P., TUCKER, A.D., LEONARD, K., gr; PATTUS,
F. AND TSERNOGLOU, D. gr; Structure of the aeromonas toxin
proaerolysin in its water-soluble and gr; membrane-channel states.
gr; NATURE 367 292-295 (1994). gd; Aerolysin is responsible for the
pathogenicity of Aeromonas hydrophila, a gd; bacterium associated
with diarrhoeal diseases and deep wound infections [1]. gd; In
common with other microbial toxins, the protein changes in a
multi-step gd; process from a water-soluble form to produce a
transmembrane channel that gd; destroys sensitive cells by breaking
their permeability barriers [1]. gd; The structure of proaerolysin
has been determined to 2.8 Å resolution and gd; shows the protoxin
to adopt a novel fold [1]. Images of an aerolysin gd; oligomer
derived from electron microscopy have helped to construct a gd;
model of the protein and to outline a mechanism by which it might
insert gd; into lipid bilayers to form ion channels [1]. gd;
AEROLYSIN is a 9-element fingerprint that provides a signature for
the gd; aerolysins. The fingerprint was derived from an initial
alignment of 10 gd; sequences: the motifs were drawn from conserved
regions spanning virtually gd; the full alignment length. A single
iteration on OWL29.4 was required to gd; reach convergence, no
further sequences being identified beyond the gd; starting set. A
single partial match was found, CLOALPTOX, a related gd;
alpha-toxin from Clostridium septicum that matches motifs 4 and 6.
gd; fc; AEROLYSIN7 fl; 20 ft; Aerolysin motif VII-2 fd;
WDKRYIPGEVKWWDWNWTIQ ERA_AERHY 382 21 fd; WDKRYIPGEVKWWDWNWTIQ
Q4063 382 21 fd; WDKRYIPGEVKWWDWNWTIQ AER3_AERHY 382 21 fd;
WDKRYIPGEVKWWDWNWTIQ AER5_AERHY 382 21 fd; WDKRYIPGEVKWWDWNWTIQ
AER4_AERHY 382 21 fd; WDKRYIPGEVKWWDWNWTIQ P94128 382 21 fd;
WDKRYLPGEMKWWDWNWAIQ AERA_AERTR 382 21 fd; WDKRYLPGEMKWWDWNWAIQ
O85370 382 21 fd; VDKRYIPGEVKWWDWNWTIS AERA_AERSA 383 21 fd;
VDKRYIPGEVKWWDWNWTIS AERA_AERSO 382 OREXIN: HMMER 2.3.2 (Oct 2003)
Copyright (C) 1992-2003 HHMI/Washington University School of
Medicine Freely distributed under the GNU General Public License
(GPL)
HMM file: pfam.hmm Sequence file:
rheu.ef.241.148 Orexin: domain 1 of 1, from 10 to 122: score -38.9,
E = 4.1 *->mnlPsaKvsWAavtlLLLLLLLPPAlLslGvdAqPLPDCCRqKtCsC + v A
LL + PP +G++ C R C rheu.ef.24 10
RKVLLQTVRAAKKARRLLGMWQPPVHNVPGIERNWYESCFRSHAAVC 56
RLYELLHGAGnHAAGiLtLGK.RRPGPPGLqGRLqRLLqAsGnHAAGiLt + + G nH A tLG++
RPGPPG G i rheu.ef.24 57
GCGDFV-GHINHLAT--TLGRpPRPGPPG------------GPRTPQI- 89
mGRRAGAElePrlCPGRRClaAaAsalAPrGrsrv<-* R A ++P+ PG R As G+ +
rheu.ef.24 90 --RNLPALPAPQGEPGDRATWRGASGADAAGGDGG 122
rheu.ef.238rev.148 Orexin: domain 1 of 1, from 10 to 122: score
-38.9, E = 4.1
*->mnlPsaKvsWAavtlLLLLLLLPPAlLslGvdAqPLPDCCRqKtCsC + v A LL + PP
+G++ C R C rheu.ef.23 10
RKVLLQTVRAAKKARRLLGMWQPPVHNVPGIERNWYESCFRSHAAVC 56
RLYELLHGAGnHAAGiLtLGK.RRPGPPGLqGRLqRLLqAsGnHAAGiLt + + G nH A tLG++
RPGPPG G i rheu.ef.23 57
GCGDFV-GHINHLAT--TLGRpPRPGPPG------------GPRTPQI- 89
mGRRAGAElePrlCPGRRClaAaAsalAPrGrsrv<-* R A ++P+ PG R As G+ +
rheu.ef.23 90 --RNLPALPAPQGEPGDRATWRGASGADAAGGDGG 122 #=GF ID
Orexin #=GF AC PF02072.7 #=GF DE Prepro-orexin #=GF AU Mian
N, Bateman A #=GF SE IPR001704 #=GF TP Family OREX_HUMAN/1-131
MNLPSTKVSWAAVTLLLLLLLLPPALLSSGAAAQPLPDCCRQKTCSCRLYELLHGAGN
HAAGILTLGKRRSGPPGLQGRLQRLLQASGNHAAGILTMGRRAGAEPAPRPCLGRRC
SAPAAASVAPGGQSGI GIP RECEPTOR HMMER 2.3.2 (Oct 2003) Copyright
(C) 1992-2003 HHMI/Washington University School of Medicine
Freely distributed under the GNU General Public License (GPL)
HMM file: prints.hmm Sequence
file: rheu.ef.241.148 GIPRECEPTOR_7: domain 1 of 1, from 76 to 97:
score 7.9, E = 3.7 *->PrlGPYlGdqtltLwnq.ALAA<-* Pr+GP G +t+
++n +AL A rheu.ef.24 76 PRPGPPGGPRTPQIRNLpALPA 97 rheu.ef.238rev
GIPRECEPTOR_7: domain 1 of 1, from 76 to 97: score 7.9, E = 3.7
*->PrlGPYlGdqtltLwnq.ALAA<-* Pr+GP G +t+ ++n +AL A rheu.ef.23
76 PRPGPPGGPRTPQIRNLpALPA 97 gc; GIPRECEPTOR gx; PR01129 gn; COMPOUND
(11) ga; 22-MAY-1999 gt; Gastric inhibitory polypeptide receptor
precursor signature gp; PRINTS; PR00237 GPCRRHODOPSN; PR00247
GPCRCAMP; PR00248 GPCRMGR gp; PRINTS; PR00249 GPCRSECRETIN; PR00250
GPCRSTE2; PR00899 GPCRSTE3 gp; PRINTS; PR00251 BACTRLOPSIN gp;
INTERPRO; IPR001749 gr; 1. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr;
Fingerprinting G protein-coupled receptors. gr; PROTEIN ENG. 7 (2)
195-203 (1994). gr; 2. ISHIHARA T., NAKAMURA S., KAZIRO, Y.,
TAKAHASHI, T., TAKAHASHI, K. gr; AND NAGATA, S. gr; Molecular
cloning and expression of a cDNA encoding the secretin receptor. gr;
EMBO J. 10 1635-1641 (1991). gr; 3. LIN, H.Y., HARRIS, T.L.,
FLANNERY, M.S., ARUFFO, A., KAJI, E.H., gr; GORN, A., KOLAKOWSKI,
L.F., LODISH, H.F. AND GOLDRING, S.R. gr; Expression cloning of
adenylate cyclase-coupled calcitonin receptor. gr; SCIENCE 254
1022-1024 (1991). gr; 4. JUEPPNER, H., ABOU-SAMRA, A.-B., FREEMAN,
M., KONG, X.F., gr; SCHIPANI, E., RICHARDS, J., KOLAKOWSKI, L.F.,
HOCK, J., POTTS, J.T., gr; KRONENBERG, H.M. AND SEGRE, G.E. gr; A G
protein linked receptor for parathyroid hormone and parathyroid gr;
hormone-related peptide. gr; SCIENCE 254 1024-1026 (1991). gr; 5.
ISHIHARA, T., SHIGEMOTO, R., MORI, K., TAKAHASHI, K. AND NAGATA, S.
gr; Functional expression and tissue distribution of a novel
receptor for gr; vasoactive intestinal polypeptide. gr; NEURON 8
(4) 811-819 (1992). gr; 6. VOLZ, A., GOKE, R., LANKAT-BUTTGEREIT,
B., FEHMANN, H.C., BODE, H.P. gr; AND GOKE, B. gr; Molecular
cloning, functional expression, and signal transduction of the gr;
GIP-receptor cloned from a human insulinoma. gr; FEBS LETT. 373 (1)
23-9 (1995). gd; G protein-coupled receptors (GPCRs) constitute a
vast protein family that gd; encompasses a wide range of functions
(including various autocrine, para- gd; crine and endocrine
processes). They show considerable diversity at the gd; sequence
level, on the basis of which they may be separated into distinct
gd; groups. Applicants use the term clan to describe the GPCRs, as
they embrace gd; a group of families for which there are
indications of evolutionary gd; relationship, but between which
there is no statistically significant gd; similarity in sequence
[1]. The currently known clan members include the gd;
rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors,
the gd; fungal mating pheromone receptors, and the metabotropic
glutamate receptor gd; family. The secretin-like GPCRs include
secretin [2], calcitonin [3], gd; parathyroid hormone/parathyroid
hormone-related peptides [4] and vasoactive gd; intestinal peptide
[5], all of which activate adenylyl cyclase and the gd;
phosphatidyl-inositol-calcium pathway. The amino acid sequences of
the
gd; receptors contain high proportions of hydrophobic residues
grouped into 7 gd; domains, in a manner reminiscent of the
rhodopsins and other receptors gd; believed to interact with G
proteins. However, while a similar 3D framework gd; has been
proposed to account for this, there is no significant sequence gd;
similarity between these families: the secretin-like receptors thus
bear gd; their own unique `7TM` signature. gd; Glucose-dependent
insulinotropic polypeptide (GIP) plays an important role gd; in the
regulation of postprandial insulin secretion and proinsulin gene
gd; expression of pancreatic beta-cells [6]. The human GIP-receptor
encodes a gd; 7TM protein that is similar to the human
glucagon-like peptide 1 (GLP-1) gd; receptor. It is hoped that an
understanding of GIP-receptor regulation and gd; signal
transduction will shed light on the hormone's failure to exert its
gd; biological action at the pancreatic B-cell in type II diabetes
mellitus. gd; GIPRECEPTOR is an 11-element fingerprint that
provides a signature for gd; gastric inhibitory polypeptide
receptors. The fingerprint was derived from gd; an initial
alignment of 3 sequences: the motifs were drawn from conserved gd;
regions spanning the full alignment length, focusing on those
sections gd; that characterise the gastric inhibitory polypeptide
receptors but gd; distinguish them from the rest of the
secretin-like superfamily - motifs 1-6 gd; span the N-terminal
domain; motif 7 resides in the loop between TM domains 2| gd; and
3; motif 8 spans the loop between TM domains 3 and 4; motif 9 spans
the C-terminal portion of TM domain 6 and gd; loop between TM
domains 4 and 5; and motifs 10 and 11 reside at the gd; C-terminus.
A single iteration on SPTR37_9f was required to reach gd;
convergence, no further sequences being identified beyond the
starting set. gd; Two partial matches were also found, secretin and
glucagon receptors gd; that match motifs 1, 8 and 9. bb; fc;
GIPRECEPTOR7 fl; 21 ft; Gastric inhibitory polypeptide receptor
precursor motif VII-1 fd; PTLGPYPGDRTLTLRNQALAA GIPR_MESAU 92 56
fd; PPLGPYTGNQTPTLWNQALAA GIPR_RAT 192 56 fd; PRPGPYLGDQALALWNQALAA
GIPR_HUMAN 195 56 PRION HMMER 2.3.2 (Oct 2003) Copyright (C)
1992-2003 HHMI/Washington University School of Medicine Freely
distributed under the GNU General Public License (GPL) HMM file:
prints.hmm Sequence file: rheu.ef.241.148 PRION_2: domain 1 of 1,
from 68 to 89: score 5.4, E = 8.6
*->sngggsrypgqGSPGGNRYPpq<-* + r+p +G PGG R P rheu.ef.24 68
LATTLGRPPRPGPPGGPRTPQI 89 rheu.ef.238rev.148 PRION_2: domain 1 of
1, from 68 to 89: score 5.4, E = 8.6
*->sngggsrypgqGSPGGNRYPpq<-* + r+p +G PGG R P rheu.ef.23 68
LATTLGRPPRPGPPGGPRTPQI 89 gc; PRION gx; PR00341 gn; COMPOUND (8)
ga; 19-OCT-1992; UPDATE 07-JUN-1999 gt; Prion protein signature gp;
INTERPRO; IPR000817 gp; PROSITE; PS00291 PRION_1; PS00706 PRION_2
gp; PFAM; PF00377 prion gr; 1. STAHL, N. AND PRUSINER, S.B. gr;
Prions and prion proteins. gr; FASEB J. 5 2799-2807 (1991). gr; 2.
BRUNORI, M., CHIARA SILVESTRINI, M. AND POCCHIARI, M. gr; The
scrapie agent and the prion hypothesis. gr; TRENDS BIOCHEM. SCI. 13
309-313 (1988). gr; 3. PRUSINER, S.B. gr; Scrapie prions. gr; ANNU.
REV. MICROBIOL. 43 345-374 (1989). gd; Prion protein (PrP) is a
small glycoprotein found in high quantity in the gd; brain of
animals infected with certain degenerative neurological diseases,
gd; such as sheep scrapie and bovine spongiform encephalopathy
(BSE), and the gd; human dementias Creutzfeldt-Jakob disease (CJD)
and Gerstmann-Straussler gd; syndrome (GSS). PrP is encoded in the
host genome and is expressed both in gd; normal and infected cells.
During infection, however, the PrP molecules gd; become altered and
polymerise, yielding fibrils of modified PrP protein. gd; PrP
molecules have been found on the outer surface of plasma membranes
of gd; nerve cells, to which they are anchored through a
covalent-linked gd; glycolipid, suggesting a role as a membrane
receptor. PrP is also expressed gd; in other tissues, indicating
that it may have different functions depending gd; on its location.
gd; The primary sequences of PrP's from different sources are
highly similar: gd; all bear an N-terminal domain containing
multiple tandem repeats of a gd; Pro/Gly rich octapeptide; sites of
Asn-linked glycosylation; an essential gd; disulphide bond; and 3
hydrophobic segments. These sequences show some gd; similarity to a
chicken glycoprotein, thought to be an acetylcholine gd;
receptor-inducing activity (ARIA) molecule. It has been suggested
that gd; changes in the octapeptide repeat region may indicate a
predisposition to gd; disease, but it is not known for certain
whether the repeat may gd; meaningfully be used as a fingerprint to
indicate susceptibility. gd; PRION is an 8-element fingerprint that
provides a signature for the prion gd; proteins. The fingerprint
was derived from an initial alignment of 5 gd; sequences: the
motifs were drawn from conserved regions spanning virtually gd; the
full alignment length, including the 3 hydrophobic domains and the
gd; octapeptide repeats (WGQPHGGG). Two iterations on OWL18.0 were
required gd; to reach convergence, at which point a true set which
may comprise 9 gd; sequences was identified. Several partial
matches were also found: these gd; include a fragment (PRIO_RAT)
lacking part of the sequence bearing the first gd; motif, and the
PrP homologue found in chicken - this matches well with only gd; 2
of the 3 hydrophobic motifs (1 and 5) and one of the other
conserved gd; regions (6), but has an N-terminal signature based on
a sextapeptide repeat gd; (YPHNPG) rather than the characteristic
PrP octapeptide. fc; PRION2 fl; 22 ft; Prion protein motif II-2 fd;
WNTGGSRYPGQGSPGGNRYPPQ PRIO_COLGU 31 8 fd; WNTGGSRYPGQGSPGGNRYPPQ
PRIO_MACFA 31 8 fd; WNTGGSRYPGQGSPGGNRYPPQ PRIO_CEREL 34 9 fd;
WNTGGSRYPGQGSPGGNRYPPQ PRIO_ODOHE 34 9 fd; WNTGGSRYPGQGSPGGNRYPPQ
PRIO_GORGO 31 8 fd; WNTGGSRYPGQGSPGGNRYPPQ PRIO_PANTR 31 fd;
WNTGGSRYPGQGSPGGNRYPPQ PRIO_HUMAN 31 8 fd; WNTGGSRYPGQGSPGGNRYPPQ
O46648 34 9 fd; WNTGGSRYPGQGSPGGNRYPPQ PRIO_SHEEP 34 9 fd;
WNTGGSRYPGQGSPGGNRYPPQ PRIO_CALJA 31 8 fd; WNTGGSRYPGQGSPGGNRYPPQ
PRIO_BOVIN 34 9 fd; WNTGGSRYPGQGSPGGNRYPPQ PRP2_BOVIN 34 9 fd;
WNTGGSRYPGQGSPGGNRYPPQ PRIO_ATEPA 31 8 fd; WNTGGSRYPGQGSPGGNRYPPQ
PRIO_SAISC 31 8 fd; WNTGGSRYPGQGSPGGNRYPPQ PRIO_PREFR 31 8 fd;
WNTGGSRYPGQGSPGGNRYPPQ PRIO_PONPY 31 8 fd; WNTGGSRYPGQGSPGGNRYPPQ
O75942 31 8 fd; WNTGGSRYPGQGSPGGNRYPPQ PRIO_CAPHI 34 9 fd;
WNTGGSRYPGQGSPGGNLYPPQ PRIO_CEBAP 31 8 fd; WNTGGSRYPGQGSPGGNRYPPQ
PRIO_CAMDR 34 9 fd; WNTGGSRYPGQGSPGGNRYPPQ PRIO_FELCA 34 9 fd;
WNTGGSRYPGQGSPGGNRYPSQ PRP1_TRAST 34 9 fd; WNTGGSRYPGQSSPGGNRYPPQ
PRIO_RABIT 32 9 fd; WNTGGSRYPGQGSPGGNRYPPQ PRP2_TRAST 34 9 fd;
WNTGGSRYPGQGSPGGNRYPPQ PRIO_PIG 34 9 fd; WNTGGSRYPGQGSPGGNRYPPQ
PRIO_CANFA 34 9 fd; WNTGGSRYPGQGSPGGNRYPPQ PRIO_CRIGR 31 8 fd;
WNTGGSRYPGQGSPGGNRYPPQ PRIO_CRIMI 31 8 fd; WNTGGSRYPGQGSPGGNRYPPQ
Q15216 31 8 fd; WNTGGSRYPGQGSPGGNRYPPQ PRIO_RAT 31 8 fd;
WNTGGSRYPGQGSPGGNRYPPQ PRIO_CERAE 31 8 fd; WNTGGSRYPGQGSPGGNRYPPQ
PRIO_MUSPF 34 9 fd; WNTGGSRYPGQGSPGGNRYPPQ PRIO_MUSVI 34 9 fd;
WNTGGSRYPGQGSPGGNRYPPQ PRIO_MESAU 31 8 fd; WNTGGSRYPGQGSPGGNRYPPQ
PRIO_MOUSE 31 8 fd; NTGGGSRYPGQGSPGGNRYPPQ O46593 34 9 fd;
SGGSNRYPGQPGSPGGNRYPGW PRIO_TRIVU 37 12 bb; NEUROTENSIN HMMER 2.3.2
(Oct 2003) Copyright (C) 1992-2003 HHMI/Washington University
School of Medicine Freely distributed under the GNU General Public
License (GPL) HMM file: prints.hmm Sequence file: rheu.ef.241.148
NEUROTENSN2R_1: domain 1 of 1, from 68 to 80: score 6.8, E = 8.7
*->mEtsspwPPRPsp<-* + t +PPRP p rheu.ef.24 68 LATTLGRPPRPGP
80 rheu.ef.238rev.148 NEUROTENSN2R_1: domain 1 of 1, from 68 to 80:
score 6.8, E = 8.7 *->mEtsspwPPRPsp<-* + t +PPRP p rheu.ef.23
68 LATTLGRPPRPGP 80 gc; NEUROTENSN2R gx; PR01481 gn; COMPOUND (6)
ga; 12-MAR-2001 gt; Neurotensin type 2 receptor signature gp;
PRINTS; PR00237 GPCRRHODOPSN; PR00247 GPCRCAMP; PR00248 GPCRMGR gp;
PRINTS; PR00249 GPCRSECRETIN; PR00250 GPCRSTE2; PR00899 GPCRSTE3
gp; PRINTS; PR00251 BACTRLOPSIN gp; PRINTS; PR01479 NEUROTENSINR;
PR01480 NEUROTENSN1R gr; 1. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr;
Fingerprinting G protein-coupled receptors. gr; PROTEIN ENG. 7 (2)
195-203 (1994). gr; 2. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr; G
protein-coupled receptor fingerprints. gr; 7TM, VOLUME 2, EDS. G.
VRIEND AND B. BYWATER (1993). gr; 3. BIRNBAUMER, L. gr; G proteins
in signal transduction. gr; ANNU. REV. PHARMACOL. TOXICOL. 30
675-705 (1990). gr; 4. CASEY, P.J. AND GILMAN, A.G. gr; G protein
involvement in receptor-effector coupling. gr; J. BIOL. CHEM. 263
(6) 2577-2580 (1988). gr; 5. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr;
Design of a discriminating fingerprint for G protein-coupled
receptors. gr; PROTEIN ENG. 6 (2) 167-176 (1993). gr; 6. WATSON, S.
AND ARKINSTALL, S. gr; Neurotensin. gr; IN THE G PROTEIN-LINKED
RECEPTOR FACTSBOOK, ACADEMIC PRESS, 1994, PP. 199-201. gr; 7.
VINCENT, J-P., MAZELLA, J. AND KITABGI, P. gr; Neurotensin and
neurotensin receptors. gr; TRENDS PHARMACOL. SCI. 20 (7) 302-309
(1999). gr; 8. VITA, N., OURY-DONAT, F., CHALON, P., GUILLEMOT, M.,
KAGHAD, M., BACHY, gr; A., THURNEYSSEN, O., GARCIA, S.,
POINOT-CHAZEL, C., CASELLAS, P., KEANE, P., gr; LE FUR, G.,
MAFFRAND, J.P., SOUBRIE, P., CAPUT, D. AND FERRARA, P. gr;
Neurotensin is an antagonist of the human neurotensin NT2 receptor
expressed gr; in Chinese hamster ovary cells. gr; EUR. J.
PHARMACOL. 360 (2-3) 265-272 (1998). gr; 9. YAMADA, M., YAMADA, M.,
LOMBET, A., FORGEZ, P. AND ROSTENE, W. gr; Distinct functional
characteristics of levocabastine sensitive rat gr; neurotensin NT2
receptor expressed in Chinese hamster ovary cells. gr; LIFE SCI. 62
(23) PL 375-380 (1998). gd; G protein-coupled receptors (GPCRs)
constitute a vast protein family that gd; encompasses a wide range
of functions (including various autocrine,
gd; paracrine and endocrine processes). They show considerable
diversity at the gd; sequence level, on the basis of which they may
be separated into distinct gd; groups. Applicants use the term clan
to describe the GPCRs, as they embrace gd; a group of families for
which there are indications of evolutionary gd; relationship, but
between which there is no statistically significant gd; similarity
in sequence [1,2]. The currently known clan members include the gd;
rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors,
the fungal gd; mating pheromone receptors, and the metabotropic
glutamate receptor family. gd; The rhodopsin-like GPCRs themselves
represent a widespread protein family gd; that includes hormone,
neurotransmitter and light receptors, all of gd; which transduce
extracellular signals through interaction with guanine gd;
nucleotide-binding (G) proteins. Although their activating ligands
vary gd; widely in structure and character, the amino acid
sequences of the gd; receptors are very similar and are believed to
adopt a common structural gd; framework which may comprise 7
transmembrane (TM) helices [3-5]. gd; Neurotensin is a 13-residue
peptide transmitter, sharing significant gd; similarity in its 6
C-terminal amino acids with several other neuropeptides, gd;
including neuromedin N. This region is responsible for the
biological gd; activity, the N-terminal portion having a modulatory
role. Neurotensin is gd; distributed throughout the central nervous
system, with highest levels in gd; the hypothalamus, amygdala and
nucleus accumbens. It induces a variety of gd; effects, including:
analgesia, hypothermia and increased locomotor activity. gd; It is
also involved in regulation of dopamine pathways. In the periphery,
gd; neurotensin is found in endocrine cells of the small intestine,
where it gd; leads to secretion and smooth muscle contraction [6].
gd; The existence of 2 neurotensin receptor subtypes, with
differing affinities gd; for neurotensin and differing
sensitivities to the antihistamine gd; levocabastine, was
originally demonstrated by binding studies in rodent gd; brain. Two
neurotensin receptors (NT1 and NT2) with such properties have gd;
since been cloned and have been found to be G protein-coupled
receptor gd; family members [7]. gd; The NT2 receptor was cloned
from rat, mouse and human brains based on its gd; similarity to the
NT1 receptor. The receptor was found to be a low affinity, gd;
levocabastine sensitive receptor for neurotensin. Unlike the high
affinity, gd; NT1 receptor, NT2 is insensitive to guanosine
triphosphate and has low gd; sensitivity to sodium ions [7].
Highest levels of expression of the receptor gd; are found in the
brain, in regions including: the olfactory system, cerebral gd; and
cerebellar cortices, hippocampus and hypothalamic nuclei. The gd;
distribution is distinct from that of the NT1 receptor, with only a
few gd; areas (diagonal band of Broca, medial septal nucleus and
suprachiasmatic gd; nuclei) expressing both receptor subtypes [7].
The receptor has also been gd; found at lower levels in the kidney,
uterus, heart and lung [8]. Activation gd; of the NT2 receptor by
non-peptide agonists suggests that the receptor may gd; couple to
phospholipase C, phospholipase A2 and MAP kinase. A functional gd;
response to neurotensin, however, is weak [9] or absent, and
neurotensin gd; appears to act as an antagonist of the receptor
[8]. It has been suggested gd; that a substance other than
neurotensin may act as the natural ligand for gd; this receptor
[8]. gd; NEUROTENSN2R is a 6-element fingerprint that provides a
signature for the gd; neurotensin type 2 receptors. The fingerprint
was derived from an initial gd; alignment of 3 sequences: the
motifs were drawn from conserved sections gd; within the N-terminus
and loop regions, focusing on those areas of the gd; alignment that
characterise the neurotensin type 2 receptors but distinguish gd;
them from the rest of the neurotensin receptor family - motifs 1 and 2
span the gd; N-terminus; motifs 3 and 4 span the second external
loop; and motifs 5 and 6 gd; span the third cytoplasmic loop. A
single iteration on SPTR39_15f was gd; required to reach
convergence, no further sequences being identified beyond gd; the
starting set. bb; fc; NEUROTENSN2R1 fl; 13 ft; Neurotensin type 2
receptor motif I-1 fd; METSSPWPPRPSP NTR2_RAT 1 1 fd; METSSLWPPRPSP
NTR2_MOUSE 1 1 fd; METSSPRPPRPSS NTR2_HUMAN 1 1 ORPHAN NUCLEAR
RECEPTOR (4A NUCLEAR RECEPTOR) FAMILY SIGNATURE HMMER 2.3.2 (Oct
2003) Copyright (C) 1992-2003 HHMI/Washington University
School of Medicine Freely distributed under the GNU General Public
License (GPL) HMM file: prints.hmm Sequence file: uro742rev.1.780
NUCLEARECPTR_5: domain 1 of 1, from 326 to 341: score 7.2, E = 5
*->PvnLlnaLVRAhvDStP<-* + + n++VRAh+D+ uro742rev. 326
-TFITNSMVRAHIDADK 341 gc; NUCLEARECPTR gx; PR01284 gn; COMPOUND
(11) ga; 16-FEB-2000 gt; Orphan nuclear receptor (4A nuclear
receptor) family signature gp; PRINTS; PR00398 STRDHORMONER;
PR00047 STROIDFINGER gp; PRINTS; PR01285 HMRNUCRECPTR; PR01286
NORNUCRECPTR; PR01287 NURRNUCRCPTR gr; 1. NUCLEAR RECEPTORS
NOMENCLATURE COMMITTEE gr; A unified nomenclature system for the
nuclear receptor superfamily. gr; CELL 97 161-163 (1999). gr; 2.
NISHIKAWA, J-I., KITAURA, M., IMAGAWA, M. AND NISHIHARA, T. gr;
Vitamin D receptor contains multiple dimerisation interfaces that
gr; are functionally different. gr; NUCLEIC ACIDS RES. 23 (4)
606-611 (1995). gr; 3. DE VOS, P., SCHMITT, J., VERHOEVEN, G. AND
STUNNENBERG, G. gr; Human androgen receptor expressed in HeLa cells
activates transcription gr; in vitro. gr; NUCLEIC ACIDS RES. 22 (7)
1161-1166 (1994). gr; 4. OHKURA, N., HIJIKURO, M., YAMAMOTO, A. AND
MIKI, K. gr; Molecular cloning of a novel thyroid/steroid receptor
superfamily gene from gr; cultured rat neuronal cells. gr; BIOCHEM.
BIOPHYS. RES. COMMUN. 205 1959-1965 (1994). gr; 5. LAW, S.W.,
CONNEELY, O.M., DEMAYO, F.J. AND O'MALLEY, B.W. gr; Identification
of a new brain-specific transcription factor, NURR1. gr; MOL.
ENDOCRINOL. 2129-2135 (1992). gr; 6. WILSON, T.E., PAULSEN, R.E.,
PADGETT, K.A. AND MILBRANDT, J. gr; Participation of non-zinc
finger residues in DNA binding by two nuclear gr; orphan receptors.
gr; SCIENCE 256 107-110 (1992). gr; 7. CLARK, J., BENJAMIN, H.,
GILL, S., SIDHAR, S., GOODWIN, G., CREW, J., gr; GUSTERSON, B.A.,
SHIPLEY, J. AND COOPER, C.S. gr; Fusion of the EWS gene to CHN, a
member of the steroid/thyroid receptor gr; gene superfamily, in a
human myxoid chondrosarcoma. gr; ONCOGENE 12 229-235 (1996). gd;
Steroid or nuclear hormone receptors (NRs) constitute an important
gd; superfamily of transcription regulators that are involved in
widely diverse gd; physiological functions, including control of
embryonic development, cell gd; differentiation and homeostasis
[1]. Members of the superfamily include the gd; steroid hormone
receptors and receptors for thyroid hormone, retinoids, gd;
1,25-dihydroxy-vitamin D3 and a variety of other ligands. The
proteins gd; function as dimeric molecules in nuclei to regulate
the transcription of gd; target genes in a ligand-responsive manner
[2,3]. In addition to C-terminal gd; ligand-binding domains, these
nuclear receptors contain a highly-conserved, gd; N-terminal
zinc-finger that mediates specific binding to target DNA gd;
sequences, termed ligand-responsive elements. In the absence of
ligand, gd; steroid hormone receptors are thought to be weakly
associated with nuclear gd; components; hormone binding greatly
increases receptor affinity. gd; NRs are extremely important in
medical research, a large number of them gd; being implicated in
diseases such as cancer, diabetes, hormone resistance gd;
syndromes, etc. [1]. While several NRs act as ligand-inducible
transcription gd; factors, many do not yet have a defined ligand
and are accordingly termed gd; "orphan" receptors. During the last
decade, more than 300 NRs have been gd; described, many of which
are orphans, which cannot easily be named due to gd; current
nomenclature confusions in the literature. However, a new system
gd; has recently been introduced in an attempt to rationalise the
increasingly gd; complex set of names used to describe superfamily
members [1]. gd; Novel members of the steroid receptor superfamily
designated NOR-1 (neuron gd; derived orphan receptor) [4], Nurr1
(Nur-related factor 1) [5], and NGFI-B gd; [6] have been identified
from forebrain neuronal cells undergoing apoptosis, gd; from brain
cortex, and from lung, superior cervical ganglia and adrenal gd;
tissue respectively. The NOR-1 protein binds to the B1a
response-element, gd; which has been identified as the target
sequence of the Nur77 family, gd; suggesting that three members of
the Nur77 family may transactivate common gd; target gene(s) in
different situations [4]. Ewing's sarcoma is characterised gd; by
chromosomal translocations that involve the NOR protein [7]. gd;
NUCLEARECPTR is an 11-element fingerprint that provides a signature
for the gd; orphan nuclear receptor family. The fingerprint was
derived from an initial gd; alignment of 11 sequences: the motifs
were drawn from conserved regions gd; spanning virtually the full
alignment length, focusing on those sections gd; that characterise
members of the nuclear receptor family but distinguish gd; them
from the rest of the steroid hormone receptor superfamily - motifs
1-3 gd; lie N-terminal to the zinc finger domain; motifs 4 and 5
lie between the gd; zinc fingers and putative ligand-binding
domain; motifs 6 and 7 encode the gd; N- and C-terminal extremities
of the ligand-binding domain; and motifs 8-11 gd; reside at the
C-terminus. A single iteration on SPTR37_10f was required to gd;
reach convergence, no further sequences being identified beyond the
starting gd; set. Several partial matches were found, all of which
appear to be N- or gd; C-terminally truncated homologues. fc;
NUCLEARECPTR5 fl; 17 ft; Orphan nuclear receptor family motif V-1
fd; PANLLTSLVRAHLDSGP NR41_HUMAN 361 6 fd; PANLLTSLVRAHLDSGP
NR41_CANFA 361 6 fd; PVSLISALVRAHVDSNP NR42_RAT 361 10 fd;
PVSLISALVRAHVDSNP NR42_MOUSE 361 10 fd; PVSLISALVRAHVDSNP
NR42_HUMAN 361 10
fd; PTNLLTSLIRAHLDSGP NR41_RAT 360 6 fd; PTNLLTSLIRAHLDSGP
NR41_MOUSE 364 6 fd; PVDLINSLVRAHIDSIP NR42_XENLA 340 6 fd;
PVCMMNALVRALTDSTP O97726 412 15 fd; PICMMNALVRALTDSTP NR43_HUMAN
395 15 fd; PICMMNALVRALTDATP NR43_RAT 397 15 BRAIN DERIVED
NEUROTROPHIC FACTOR SIGNATURE (BDN) HMMER 2.3.2 (Oct 2003)
Copyright © 1992-2003 HHMI/Washington University School of
Medicine Freely distributed under the GNU General Public License
(GPL) HMM file: prints.hmm Sequence file: uro742rev.1.780
BDNFACTOR_3: domain 1 of 2, from 496 to 512: score 3.1, E = 42
*->PLLFLLEEYKnYLDAAn<-* PL LL Y YL+ uro742rev. 496
PLWALLNGYVDYLETQI 512 BDNFACTOR_3: domain 2 of 2, from 690 to 706:
score 7.7, E = 5.7 *->PLLFLLEEYKnYLDAAn<-* PLLFL EY+ AA
uro742rev. 690 PLLFLPSEYQREDGAAE 706 gc; BDNFACTOR gx; PR01912 gn;
COMPOUND (5) ga; 29-AUG-2008 gt; Brain derived neurotrophic factor
signature gp; PRINTS; PR00268 NGF; PR01913 NGFBETA; PR01914
NEUROTROPHN3 gp; PRINTS; PR01915 NEUROTROPHN4; PR01916 NEUROTROPHN6
gp; PDB; 1BND; 1B8M gp; SCOP; 1BND; 1B8M gp; CATH; 1BND; 1B8M gp;
MIM; 113505 gr; 1. HOFER, M., PAGLIUSI, S.R., HOHN, A., LEIBROCK,
J. AND BARDE, Y.A. gr; Regional distribution of brain-derived
neurotrophic factor messenger RNA in gr; the adult mouse brain. gr;
EMBO J. 9 (8) 2459-2464 (1990). gr; 2. KOYAMA, J.I., INOUE, S.,
IKEDA, K. AND HAYASHI, K. gr; Purification and amino acid sequence
of a nerve growth factor from the gr; venom of Vipera russelli
russelli. gr; BIOCHIM. BIOPHYS. ACTA 1160 287-292 (1992). gr; 3.
INOUE, S., ODA, T., KOYAMA, J., IKEDA, K. AND HAYASHI, K. gr; Amino
acid sequences of nerve growth factors derived from cobra venoms.
gr; FEBS LETT. 279 (1) 38-40 (1991). gr; 4. BARDE, Y., EDGAR, D.
AND THOENEN, H. gr; Purification of a new neurotrophic factor from
mammalian brain. gr; EMBO J. 1 549-553 (1982). gr; 5. HIBBERT, A.,
KRAMER, B., MILLER, F. AND KAPLAN, D. gr; The localization,
trafficking and retrograde transport of BDNF bound to gr; p75NTR in
sympathetic neurons. gr; MOL. CELL. NEUROSCI. 32 387-402 (2006).
gr; 6. LINNARSSON, S., BJORKLUND, A. AND ERNFORS, P. gr; Learning
deficit in BDNF mutant mice. gr; EUR. J. NEUROSCI. 9 2581-2587
(1997). gr; 7. LEBRUN, B., BARIOHAY, B., MOYSE, E. AND JEAN, A. gr;
Brain-derived neurotrophic factor (BDNF) and food intake
regulation: a gr; minireview. gr; AUTON. NEUROSCI. 126-127 30-38
(2006). gr; 8. KOZISEK, M., MIDDLEMAS, D. AND BYLUND, D. gr;
Brain-derived neurotrophic factor and its receptor
tropomyosin-related gr; kinase B in the mechanism of action of
antidepressant therapies. gr; PHARMACOL. THER. 117 30-51 (2008).
gd; During the development of the vertebrate nervous system, many
neurons gd; become redundant (because they have died, failed to
connect to target gd; cells, etc.) and are eliminated. At the same
time, developing neurons send gd; out axon outgrowths that contact
their target cells [1]. Such cells control gd; their degree of
innervation (the number of axon connections) by the gd; secretion
of various specific neurotrophic factors that are essential for gd;
neuron survival. One of these is nerve growth factor (NGF), which
is gd; involved in the survival of some classes of embryonic neuron
(e.g., gd; peripheral sympathetic neurons) [1]. NGF is mostly
found outside the central gd; nervous system (CNS), but slight
traces have been detected in adult CNS gd; tissues, although a
physiological role for this is unknown [1]; it has also gd; been
found in several snake venoms [2,3]. Proteins similar to NGF
include gd; brain-derived neurotrophic factor (BDNF) and
neurotrophins 3 to 7, all of gd; which demonstrate neuron survival
and outgrowth activities. gd; Originally purified from pig brain
[4], the neurotrophin BDNF is expressed gd; in a range of tissues
and cell types in the CNS and periphery. It exerts gd; its effects
by binding to neurotrophic tyrosine kinase receptor type 2 gd;
(NTRK2; also called TrkB) and the low affinity nerve growth factor
receptor, gd; p75NTR. While the former receptor mediates the
neurotrophin's prosurvival gd; functions, activation of p75NTR by
BDNF has been shown to promote apoptosis gd; and to inhibit axonal
growth [5]. gd; BDNF is a key regulator of synaptic plasticity, and
plays an important role gd; in learning and memory [6]. Several
lines of evidence suggest that it is gd; also involved in the
control of food intake and body weight [7]. A number gd; of
clinical studies have demonstrated an association between aberrant
BDNF gd; levels and disorders and disease states, such as
depression, epilepsy, gd; bipolar disorder, Parkinson's disease and
Alzheimer's disease [8]. gd; BDNFACTOR is a 5-element fingerprint
that provides a signature for gd; brain-derived neurotrophic
factor. The fingerprint was derived from an initial gd; alignment
of 33 sequences: the motifs were drawn from conserved regions gd;
spanning virtually the full alignment length - motif 1 includes
part of the gd; signal sequence. Three iterations on SPTR55_38f
were required to reach gd; convergence, at which point a true set
which may comprise 47 sequences was gd; identified. A single
partial match was also found, Q6YNR1_HUMAN, a human gd; BDNF splice
variant that fails to match motifs 4 and 5. fc; BDNFACTOR3 fl; 17
ft; Brain derived neurotrophic factor motif III-3 fd;
PLLFLLEEYKNYLDAAN A2AII2_MOUSE 115 31 fd; PLLFLLEEYKNYLDAAN
Q8CCH9_MOUSE 107 31 fd; PLLFLLEEYKNYLDAAN Q6YNR3_HUMAN 113 31 fd;
PLLFLLEEYKNYLDAAN Q6YNR2_HUMAN 120 31 fd; PLLFLLEEYKNYLDAAN
Q598Q1_HUMAN 105 31 fd; PLLFLLEEYKNYLDAAN Q541P3_MOUSE 107 31 fd;
PLLFLLEEYKNYLDAAN BDNF_URSML 105 31 fd; PLLFLLEEYKNYLDAAN
BDNF_URSAR 105 31 fd; PLLFLLEEYKNYLDAAN BDNF_SPECI 105 31 fd;
PLLFLLEEYKNYLDAAN BDNF_SELTH 105 31 fd; PLLFLLEEYKNYLDAAN BDNF_RAT
107 31 fd; PLLFLLEEYKNYLDAAN BDNF_PROLO 105 31 fd;
PLLFLLEEYKNYLDAAN BDNF_PIG 110 31 fd; PLLFLLEEYKNYLDAAN BDNF_PANTR
105 31 fd; PLLFLLEEYKNYLDAAN BDNF_MOUSE 107 31 fd;
PLLFLLEEYKNYLDAAN BDNF_HUMAN 105 31 fd; PLLFLLEEYKNYLDAAN
BDNF_FELCA 105 31 fd; PLLFLLEEYKNYLDAAN BDNF_CANFA 105 31 fd;
PLLFLLEEYKNYLDAAN BDNF_BOVIN 108 31 fd; PLLFLLEEYKNYLDAAN
BDNF_AILME 105 31 fd; PLLFLLEEYKNYLDAAN BDNF_AILFU 105 31 fd;
PLLFLLEEYKNYLDAAN A7LA92_HUMAN 187 31 fd; PLLFLLEEYKNYLDAAN
A7LA85_HUMAN 134 31 fd; PLLFLLEEYKNYLDAAN BDNF_CAVPO 113 31 fd;
PLLFLLEEYKNYLDAAN BDNF_HORSE 105 31 fd; PLLFLLEEYKNYLDAAN
Q8VHH4_MOUSE 107 31 fd; PLLFLLEEYKNYLDAAN Q6DN19_HUMAN 105 31 fd;
PLLFLLEEYKNYLDAAN BDNF_LIPVE 106 31 fd; PLLFLLEEYKNYLDAAN
BDNF_CHICK 104 30 fd; PLLFLLEEYKNYLDAAN Q8AV78_NIPNI 104 30 fd;
PLLFLLEEYKNYLDAAN Q4JHT7_POEGU 104 30 fd; PLLFLLEEYKNYLDAAN
A4L7M3_BOMOR 105 30 fd; PLLFLLEEYKNYLDAAN Q63ZM5_XENLA 105 30 fd;
PLLFLLEEYKNYLDAAN A3FPG9_XENTR 105 30 fd; PLLFLLEEYKNYLDAAN
Q8QG75_9SAUR 104 30 fd; PLLFLLEEYKNYLDAAN Q8QG76_9SAUR 104 30 fd;
PLLFLLEEYKNYLDAAN A4L7M4_9SALA 105 30 fd; PLLFLLEEYKNYLDAAN
A4L7M5_SALSL 105 30 fd; PLLFLLEEYKNYLDAAN A2ICR4_AMBME 105 30 fd;
PLLFLLEEYKNYLDAAN Q8QG77_9SALA 105 30 fd; PLLFLLEEYKNYLDAAN
Q6NZO1_DANRE 128 47 fd; PLLFLLEEYKNYLDAAN Q9YH42_DANRE 128 47 fd;
PLLFLLEEYKNYLDAAN Q8JGW4_PAROL 127 48 fd; PLLFLLEEYKNYLDAAN
Q06B76_DICLA 127 48 fd; PLLFLLEEYKNYLDAAN BDNF_CYPCA 128 47 fd;
PLLFLLEEYKNYLDAAN Q8QG74_9SAUR 104 30 fd; PLLFLLEEYKNYLDAAN
BDNF_XIPMA 127 48 CALCITONIN HMMER 2.3.2 (Oct 2003) Copyright
© 1992-2003 HHMI/Washington University School of Medicine
Freely distributed under the GNU General Public License (GPL) HMM
file: prints.hmm Sequence file: uro742rev.154 CALCITONINR_2: domain
1 of 1, from 91 to 108: score 6.0, E = 9.4
*->kCYDRmqqLPpYeGEGpY<-* R+ LP+Y GEGp uro742rev. 91
TPVRRLLPLPSYPGEGPQ 108 CALCITONINR_2: domain 1 of 1, from 72 to 89:
score 6.0, E = 9.4 *->kCYDRmqqLPpYeGEGpY<-* R+ LP+Y GEGp
zc37.B9.2d 72 TPVRRLLPLPSYPGEGPQ 89 gc; CALCITONINR gx; PR00361 gn;
COMPOUND (6) ga; 15-APR-1995; UPDATE 06-JUN-1999 gt; Calcitonin
receptor signature gp; PRINTS; PR00237 GPCRRHODOPSN; PR00247
GPCRCAMP; PR00248 GPCRMGR gp; PRINTS; PR00249 GPCRSECRETIN; PR00250
GPCRSTE2; PR00899 GPCRSTE3 gp; PRINTS; PR00251 BACTRLOPSIN gp;
PRINTS; PR01350 CTRFAMILY; PR01351 CGRPRECEPTOR gp; INTERPRO;
IPR001688 gr; 1. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr;
Fingerprinting G protein-coupled receptors. gr; PROTEIN ENG. 7 (2)
195-203 (1994). gr; 2. ISHIHARA, T., NAKAMURA, S., KAZIRO, Y.,
TAKAHASHI, T., TAKAHASHI, K. gr; AND NAGATA, S. gr; Molecular
cloning and expression of a cDNA encoding the secretin receptor.
gr; EMBO J. 10 1635-1641 (1991). gr; 3. LIN, H.Y., HARRIS, T.L.,
FLANNERY, M.S., ARUFFO, A., KAJI, E.H., gr; GORN, A., KOLAKOWSKI,
L.F., LODISH, H.F. AND GOLDRING, S.R. gr; Expression cloning of
adenylate cyclase-coupled calcitonin receptor. gr; SCIENCE 254
1022-1024 (1991). gr; 4. JUEPPNER, H., ABOU-SAMRA, A.-B., FREEMAN,
M., KONG, X.F., gr; SCHIPANI, E., RICHARDS, J., KOLAKOWSKI, L.F.,
HOCK, J., POTTS, J.T., gr; KRONENBERG, H.M. AND SEGRE, G.E. gr; A G
protein linked receptor for parathyroid hormone and parathyroid gr;
hormone-related peptide. gr; SCIENCE 254 1024-1026 (1991). gr; 5.
ISHIHARA, T., SHIGEMOTO, R., MORI, K., TAKAHASHI, K. AND NAGATA, S.
gr; Functional expression and tissue distribution of a novel
receptor for gr; vasoactive intestinal polypeptide. gr; NEURON 8
(4) 811-819 (1992). gr; 6. WATSON, S. AND ARKINSTALL, S. gr;
Calcitonin. gr; IN THE G PROTEIN-LINKED RECEPTOR FACTSBOOK,
ACADEMIC PRESS, 1994, PP. 74-76. gr; 7. NJUKI, F., NICHOLL, C.G.,
HOWARD, A., MAK, J.C., BARNES, P.J., gr; GIRGIS, S.I. AND LEGON,
S.A. gr; A new calcitonin-receptor-like sequence in rat pulmonary
blood vessels. gr; CLIN. SCI. 85 (4) 385-388 (1993). gd; G
protein-coupled receptors (GPCRs) constitute a vast protein family
that gd; encompasses a wide range of functions (including various
autocrine, gd; paracrine and endocrine processes). They show
considerable diversity at the gd; sequence level, on the basis of
which they may be separated into distinct gd; groups. Applicants
use the term clan to describe the GPCRs, as they embrace a gd;
group of families for which there are indications of evolutionary
gd; relationship, but between which there is no statistically
significant gd; similarity in sequence [1]. The currently known
clan members include the
gd; rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP
receptors, the fungal gd; mating pheromone receptors, and the
metabotropic glutamate receptor family. gd; The secretin-like GPCRs
include secretin [2], calcitonin [3], parathyroid gd;
hormone/parathyroid hormone-related peptides [4] and vasoactive
intestinal gd; peptide [5], all of which activate adenylyl cyclase
and the gd; phosphatidylinositol-calcium pathway. The amino acid
sequences of the receptors contain gd; high proportions of
hydrophobic residues grouped into 7 domains, in a manner gd;
reminiscent of the rhodopsins and other receptors believed to
interact with gd; G proteins. However, while a similar 3D framework
has been proposed to gd; account for this, there is no significant
sequence identity between these gd; families: the secretin-like
receptors thus bear their own unique '7TM' gd; signature. gd; The
major physiological role of calcitonin is to inhibit bone
resorption gd; thereby leading to a reduction in plasma Ca++ [6].
Further, it enhances gd; excretion of ions in the kidney, prevents
absorption of ions in the gd; intestine, and inhibits secretion in
endocrine cells (e.g. pancreas and gd; pituitary). In the CNS,
calcitonin has been reported to be analgesic gd; and to suppress
feeding and gastric acid secretion. It is used to treat gd; Paget's
disease of the bone. Calcitonin receptors are found predominantly
gd; on osteoclasts or on immortal cell lines derived from these
cells. They are gd; found in lower amounts in the brain (e.g. in
hypothalamus and pituitary gd; tissues) and in peripheral tissues
(e.g. testes, kidney, liver and gd; lymphocytes). They have also been
described in lung and breast cancer cell gd; lines. The predominant
signalling pathway is activation of adenylyl cyclase gd; through
Gs, but calcitonin has also been described to have both stimulatory
gd; and inhibitory actions on the phosphoinositide pathway. gd;
CALCITONINR is a 6-element fingerprint that provides a signature
for the gd; calcitonin receptors. The fingerprint was derived from
an initial alignment gd; of 6 sequences: the motifs were drawn from
conserved sections within either gd; loop or TM regions, focusing
on those areas of the alignment that gd; characterise the
calcitonin receptors but distinguish them from the rest gd; of the
secretin-like family - motifs 1-3 were drawn from the N-terminal
gd; region leading into the first TM domain; motif 4 lies at the
C-terminus of gd; the second TM domain following into the loop
region; motif 5 is N-terminal gd; to the seventh TM region; and
motif 6 was drawn from the C-terminus. Two gd; iterations on
OWL25.2 were required to reach convergence, at which point a gd;
true set which may comprise 9 sequences was identified. A single
partial gd; match was also found, RNCLR, a new calcitonin-like
receptor from rat gd; pulmonary blood vessels [7]. fc; CALCITONINR2
fl; 18 ft; Calcitonin receptor motif II-2 fd; KCYDRIQQLPPYEGEGPY
CALR_RAT 54 1 fd; KCYDRMEQLPPYQGEGPY CALR_RABIT 54 1 fd;
KCYDRMQQLPAYQGEGPY CALR_HUMAN 54 1 fd; KCYDRIHQLPSYEGEGLY
CALR_MOUSE 54 1 fd; RCYDRMQQLPPYEGEGPY CALR_CAVPO 54 1 fd;
RCYDRMQKLPPYQGEGLY CALR_PIG 55 1 LEUKOTRIENE B4 TYPE 1 RECEPTOR
BLKPROB Version 5/21/00.1 Database =
/gcg/husar/gcgdata/gcgblimps/blocksplus.dat Copyright ©
1992-6 by the Fred Hutchinson Cancer Research Center If you use
BLOCKS in your research, please cite: Steven Henikoff and Jorja G.
Henikoff, Protein Family Classification Based on Searching a
Database of Blocks, Genomics 19: 97-107 (1994). Each numbered
result consists of one or more blocks from a PROSITE or PRINTS
group found in the query sequence. One set of the highest-scoring
blocks that are in the correct order and separated by distances
comparable to the BLOCKS database is selected for analysis. If this
set includes multiple blocks the probability that the lower scoring
blocks support the highest scoring block is reported. Maps of the
database blocks and query sequence are shown: < indicates the
sequence has been truncated to fit the page : indicates the minimum
distance between blocks in the database . indicates the maximum
distance between blocks in the database The maps are aligned on the
highest scoring block. The alignment of the query sequence with the
sequence closest to it in the BLOCKS database is shown. Upper case
in the query sequence indicates at least one occurrence of the
residue in that column of the block. Query = uro705rev.1a.74
Length: 74 Type: P C Size = 74 Amino Acids Blocks Searched = 29068
Alignments Done = 2896529 Cutoff combined expected value for hits =
0 Cutoff block expected value for repeats/other = 0
Combined Family Strand Blocks
E-value IPB003983 Leukotriene B4 type 1 receptor sign 1 1 of 6
0.0042 >IPB003983 1/6 blocks Combined E-value = 0.0042:
Leukotriene B4 type 1 receptor signature Block Frame Location (aa)
Block E-value IPB003983C 0 25-41 0.0046 Other reported alignments:
##STR00005## ##STR00006## rheu.cd.215rev.1.736 >IPB003983 1/6
blocks Combined E-value = 0.0094: Leukotriene B4 type 1 receptor
signature Block Frame Location (aa) Block E-value IPB003983C 0
28-44 0.0096 Other reported alignments: ##STR00007## ##STR00008##
zpr5.B4.12dk.209 Length: 209 Type: P Combined Family Strand Blocks
E-value IPB003983 Leukotriene B4 type 1 receptor sign 1 1 of 6
0.0078 zpr5.B4.12dk >IPB003983 1/6 blocks Combined E-value =
0.0078: Leukotriene B4 type 1 receptor signature Block Frame
Location (aa) Block E-value IPB003983C 0 32-48 0.0081 Other
reported alignments: ##STR00009## ##STR00010## SJOGREN'S
SYNDROME/SCLERODERMA AUTOANTIGEN 1 (AUTOANTIGEN P27) HMMER 2.3.2
(Oct 2003) Copyright © 1992-2003 HHMI/Washington University
School of Medicine Freely distributed under the GNU General Public
License (GPL) HMM file: pfam.hmm Sequence file: rheu.cd.211rev.164
Auto_anti-p27: domain 1 of 1, from 117 to 156: score -12.1, E = 4.6
*->eiskkmaelLlkGatMLdehCpkCGtPLFrlKdGkvfCPiCe<-* + ++ +++l +
L++ +kC + +r + Gk fC +Ce rheu.cd.21 117
HT-AVKGQFGLGTGRALGKALKKCAFAGLR-RKGKCFCKVCE 156 # = GF ID
Auto_anti-p27 # = GF AC PF06677.4 # = GF DE Sjogren's
syndrome/scleroderma autoantigen 1 (Autoantigen p27) # = GF AU
Moxon SJ # = GF SE Pfam-B_21881 (release 10.0) # = GF TP Family # =
GF RN [1] # = GF RM 9486406 # = GF RT cDNA cloning of a novel
autoantigen targeted by a minor subset # = GF RT of anti-centromere
antibodies. # = GF RA Muro Y, Yamada T, Himeno M, Sugimoto K; # =
GF RL Clin Exp Immunol 1998; 111: 372-376. # = GF DR INTERPRO;
IPR009563; # = GF CC This family consists of several Sjogren's
syndrome/scleroderma # = GF CC autoantigen 1 (Autoantigen p27)
sequences. It is thought that # = GF CC the potential association
of anti-p27 with anti-centromere # = GF CC antibodies suggests that
autoantigen p27 might play a role in # = GF CC mitosis [1].
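For illustration only (this is not part of the reported analysis), the per-domain HMMER hit headers in these listings follow a regular pattern and could be tabulated with a short script. The sketch below assumes the HMMER 2.3.2 header layout exactly as it appears in this text; the function name `parse_domain_hits` and the dictionary field names are hypothetical.

```python
import re

# Per-domain header as printed in the listings above, e.g.
# "Auto_anti-p27: domain 1 of 1, from 117 to 156: score -12.1, E = 4.6"
HIT_RE = re.compile(
    r"(?P<model>\S+): domain (?P<index>\d+) of (?P<total>\d+), "
    r"from (?P<start>\d+) to (?P<end>\d+): "
    r"score (?P<score>-?\d+(?:\.\d+)?), "
    r"E = (?P<evalue>\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def parse_domain_hits(text):
    """Return one dict per domain hit header found in `text`.

    Line wraps are collapsed to single spaces first, because the
    listings are hard-wrapped in the middle of hit headers.
    """
    flat = " ".join(text.split())
    hits = []
    for m in HIT_RE.finditer(flat):
        start, end = int(m.group("start")), int(m.group("end"))
        hits.append({
            "model": m.group("model"),
            "index": int(m.group("index")),
            "total": int(m.group("total")),
            "start": start,
            "end": end,
            "length": end - start + 1,  # inclusive residue coordinates
            "score": float(m.group("score")),
            "evalue": float(m.group("evalue")),
        })
    return hits
```

Applied to the headers quoted in this section, such a helper would give one record per hit, which makes it easier to compare scores and E-values across query sequences.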
VASOPRESSIN HMMER 2.3.2 (Oct 2003) Copyright © 1992-2003
HHMI/Washington University School of Medicine Freely distributed
under the GNU General Public License (GPL) HMM file: prints.hmm
Sequence file: uro742rp.132 VASOPRSNV2R_6: domain 1 of 1, from 7 to
26: score 7.4, E = 9.1 *->RaGgrRrGrRtGsPsEGArv<-* R rRrG t s
sE A uro742rp.1 7 RNASRRRGSSTASTSEEASL 26 VASOPRSNV2R_6: domain 1
of 1, from 7 to 26: score 7.4, E = 9.1
*->RaGgrRrGrRtGsPsEGArv<-* R rRrG t s sE A zc37.B8.10 7
RNASRRRGSSTASTSEEASL 26 VASOPRSNV1BR_4: domain 1 of 1, from 130 to
149: score 3.0, E = 7.1 *->TQAgRverrGWRTWDksSsS<-* Q + +e R
WD++ zc35s.B2.9 130 AQDWAEEYTACRYWDRPPRT 149 gc; VASOPRSNV2R gx;
PR00898 gn; COMPOUND (8) ga; 15-APR-1998; UPDATE 07-JUN-1999 gt;
Vasopressin V2 receptor signature gp; PRINTS; PR00237 GPCRRHODOPSN;
PR00247 GPCRCAMP; PR00248 GPCRMGR gp; PRINTS; PR00249 GPCRSECRETIN;
PR00250 GPCRSTE2; PR00899 GPCRSTE3 gp; PRINTS; PR00251 BACTRLOPSIN
gp; PRINTS; PR00896 VASOPRESSINR gp; PRINTS; PR00752 VASOPRSNV1AR;
PR00897 VASOPRSNV1BR; PR00665 OXYTOCINR gp; INTERPRO; IPR000161 gr;
1. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr; Fingerprinting G
protein-coupled receptors. gr; PROTEIN ENG. 7 (2) 195-203 (1994).
gr; 2. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr; G protein-coupled
receptor fingerprints. gr; 7TM, VOLUME 2, EDS. G. VRIEND AND B.
BYWATER (1993). gr; 3. BIRNBAUMER, L. gr; G proteins in signal
transduction. gr; ANNU. REV. PHARMACOL. TOXICOL. 30 675-705 (1990).
gr; 4. CASEY, P.J. AND GILMAN, A.G. gr; G protein involvement in
receptor-effector coupling. gr; J. BIOL. CHEM. 263 (6) 2577-2580
(1988). gr; 5. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr; Design of a
discriminating fingerprint for G protein-coupled receptors. gr;
PROTEIN ENG. 6 (2) 167-176 (1993). gr; 6. WATSON, S. AND
ARKINSTALL, S. gr; Vasopressin and oxytocin. gr; IN THE G
PROTEIN-LINKED RECEPTOR FACTSBOOK, ACADEMIC PRESS, 1994, PP. 284-291.
gd; G protein-coupled receptors (GPCRs) constitute a vast
protein family gd; that encompasses a wide range of functions
(including various autocrine, gd; paracrine and endocrine
processes). They show considerable diversity at the gd; sequence
level, on the basis of which they may be separated into distinct
gd; groups. Applicants use the term clan to describe the GPCRs, as
they embrace gd; a group of families for which there are
indications of evolutionary gd; relationship, but between which
there is no statistically significant gd; similarity in sequence
[1,2]. The currently known clan members include the gd;
rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors,
the fungal gd; mating pheromone receptors, and the metabotropic
glutamate receptor
family. gd; The rhodopsin-like GPCRs themselves represent a
widespread protein family gd; that includes hormone,
neurotransmitter and light receptors, all of gd; which transduce
extracellular signals through interaction with guanine gd;
nucleotide-binding (G) proteins. Although their activating ligands
vary gd; widely in structure and character, the amino acid
sequences of the gd; receptors are very similar and are believed to
adopt a common structural gd; framework which may comprise 7
transmembrane (TM) helices [3-5]. gd; Vasopressin and oxytocin are
members of the neurohypophyseal hormone family gd; found in all
mammalian species [6]. They are present in high levels in the gd;
posterior pituitary. Vasopressin has an essential role in the
control of gd; the water content of the body, acting in the kidney
to increase water and gd; sodium absorption [6]. In higher
concentrations, vasopressin stimulates gd; contraction of vascular
smooth muscle, stimulates glycogen breakdown in the gd; liver,
induces platelet activation, and evokes release of corticotrophin
gd; from the anterior pituitary [6]. Vasopressin and its analogues
are used gd; clinically to treat diabetes insipidus [6]. gd; The V2
receptor is found in high levels in the osmoregulatory epithelia of
gd; the terminal urinary tract, where it stimulates water
reabsorption [6]. It gd; is also present in lower levels in the
endothelium and blood vessels of some gd; species, where it induces
vasodilation [6]. In the CNS, binding sites are gd; found in the
subiculum, with lower levels in caudate-putamen and islands gd; of
Calleja [6]. The receptor is involved in an effector pathway that
forms gd; cAMP through activation of Gs [6]. gd; VASOPRSNV2R is an
8-element fingerprint that provides a signature for gd; vasopressin
V2 receptors. The fingerprint was derived from an initial gd;
alignment of 4 sequences: the motifs were drawn from short
conserved gd; sections spanning the full alignment length, focusing
on those regions gd; that characterise the vasopressin V2 receptors
but distinguish them from gd; the rest of the vasopressin family -
motifs 1 and 2 reside at the N-terminus; gd; motif 3 spans the
first cytoplasmic loop; motif 4 spans the second gd; cytoplasmic
loop; motifs 5 and 6 span the third cytoplasmic loop; and gd;
motifs 7 and 8 reside at the C-terminus. A single iteration on
OWL30.1 was gd; required to reach convergence, no further sequences
being identified gd; beyond the starting set. fc; VASOPRSNV2R6 fl;
20 ft; Vasopressin V2 receptor motif VI-2 fd; RAGRRRRGHRTGSPSEGAHV
O88721 243 2 fd; RAGRRRRGRRTGSPSEGAHV V2R_RAT 243 2 fd;
RAGGHRGGRRAGSPREGARV V2R_PIG 242 2 fd; RPGGRRRGRRTGSPGEGAHV
V2R_HUMAN 243 2 fd; RAGGCRGGHRTGSPSEGARV O77808 242 2 fd;
RAGGPRRGCRPGSPAEGARV V2R_BOVIN 242 2 gc; VASOPRSNV1BR gx; PR00897
gn; COMPOUND (9) ga; 15-APR-1998; UPDATE 07-JUN-1999 gt;
Vasopressin V1B receptor signature gp; PRINTS; PR00237
GPCRRHODOPSN; PR00247 GPCRCAMP; PR00248 GPCRMGR gp; PRINTS; PR00249
GPCRSECRETIN; PR00250 GPCRSTE2; PR00899 GPCRSTE3 gp; PRINTS;
PR00251 BACTRLOPSIN gp; PRINTS; PR00896 VASOPRESSINR gp; PRINTS;
PR00752 VASOPRSNV1AR; PR00898 VASOPRSNV2R; PR00665 OXYTOCINR gp;
INTERPRO; IPR000628 gr; 1. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr;
Fingerprinting G protein-coupled receptors. gr; PROTEIN ENG. 7 (2)
195-203 (1994). gr; 2. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr; G
protein-coupled receptor fingerprints. gr; 7TM, VOLUME 2, EDS. G.
VRIEND AND B. BYWATER (1993). gr; 3. BIRNBAUMER, L. gr; G proteins
in signal transduction. gr; ANNU. REV. PHARMACOL. TOXICOL. 30
675-705 (1990). gr; 4. CASEY, P.J. AND GILMAN, A.G. gr; G protein
involvement in receptor-effector coupling. gr; J. BIOL. CHEM. 263
(6) 2577-2580 (1988). gr; 5. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr;
Design of a discriminating fingerprint for G protein-coupled
receptors. gr; PROTEIN ENG. 6 (2) 167-176 (1993). gr; 6. WATSON, S.
AND ARKINSTALL, S. gr; Vasopressin and oxytocin. gr; IN THE G
PROTEIN-LINKED RECEPTOR FACTSBOOK, ACADEMIC PRESS, 1994, PP.
284-291. gd; VASOPRSNV1BR is a 9-element fingerprint that provides
a signature for gd; vasopressin V1B receptors. The fingerprint was
derived from an initial gd; alignment of 3 sequences: the motifs
were drawn from short conserved gd; sections spanning the full
alignment length, focusing on those regions gd; that characterise
the vasopressin V1B receptors but distinguish them from gd; the
rest of the vasopressin family - motif 1 lies at the N-terminus;
motif gd; 2 lies in the second cytoplasmic loop; motif 3 lies in
the second external gd; loop; motifs 4 and 5 span the third
cytoplasmic loop; motif 6 lies in the gd; third external loop; and
motifs 7-9 reside in the C-terminal domain. A gd; single iteration
on OWL30.1 was required to reach convergence, no further gd;
sequences being identified beyond the starting set. fc;
VASOPRSNV1BR4 fl; 20 ft; Vasopressin V1B receptor motif IV-2 fd;
TQAWRVGGGGWRTWDRPSPS V1BR_HUMAN 234 48 fd; TQAGREERRGWRTWDKSSSS
V1BR_RAT 234 48 MELANIN-CONCENTRATING HORMONE 2 RECEPTOR SIGNATURE
HMMER 2.3.2 (Oct 2003) Copyright © 1992-2003
HHMI/Washington University School of Medicine Freely distributed
under the GNU General Public License (GPL) HMM file: prints.hmm
Sequence file: uro742rp.133 MCH2RECEPTOR_5: domain 1 of 1, from 69
to 86: score 5.9, E = 7.1 *->LvqPFRLtrWRtRYKtiRin<-* F +t+WRt
+ + n uro742rp.1 69 --RPFCITKWRTSFLFFKNN 86 gc; MCH2RECEPTOR gx;
PR01784 gn; COMPOUND (9) ga; 25-SEP-2002 gt; Melanin-concentrating
hormone 2 receptor signature gp; PRINTS; PR00237 GPCRRHODOPSN;
PR00247 GPCRCAMP; PR00248 GPCRMGR gp; PRINTS; PR00249 GPCRSECRETIN;
PR00250 GPCRSTE2; PR00899 GPCRSTE3 gp; PRINTS; PR00251 BACTRLOPSIN
gp; PRINTS; PR01507 MCH1RECEPTOR; PR01783 MCHRECEPTOR gr; 1.
ATTWOOD, T.K. AND FINDLAY, J.B.C. gr; Fingerprinting G
protein-coupled receptors. gr; PROTEIN ENG. 7 (2) 195-203 (1994).
gr; 2. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr; G protein-coupled
receptor fingerprints. gr; 7TM, VOLUME 2, EDS. G. VRIEND AND B.
BYWATER (1993). gr; 3. BIRNBAUMER, L. gr; G proteins in signal
transduction. gr; ANNU. REV. PHARMACOL. TOXICOL. 30 675-705 (1990).
gr; 4. CASEY, P.J. AND GILMAN, A.G. gr; G protein involvement in
receptor-effector coupling. gr; J. BIOL. CHEM. 263 (6) 2577-2580
(1988). gr; 5. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr; Design of a
discriminating fingerprint for G protein-coupled receptors. gr;
PROTEIN ENG. 6 (2) 167-176 (1993). gr; 6. CHAMBERS, J., AMES, R.S.,
BERGSMA, D., MUIR, A., FITZGERALD, L.R., gr; HERVIEU, G., DYTKO,
G.M., FOLEY, J.J., MARTIN, J., LIU, W.S., PARK, J., gr; ELLIS, C.,
GANGULY, S., KONCHAR, S., CLUDERAY, J., LESLIE, R., WILSON, S. gr;
AND SARAU, H.M. gr; Melanin-concentrating hormone is the cognate
ligand for the orphan G gr; protein-coupled receptor SLC-1. gr;
NATURE 400 261-265 (1999). gr; 7. SAITO, Y., NOTHACKER, H.-P.,
WANG, Z., LIN, S.H.S., LESLIE, F. AND gr; CIVELLI, O. gr; Molecular
characterization of the melanin-concentrating-hormone receptor. gr;
NATURE 400 265-269 (1999). gr; 8. SAITO, Y., NOTHACKER, H.-P. AND
CIVELLI, O. gr; Melanin-concentrating hormone receptor: an orphan
receptor fits the key. gr; TRENDS ENDOCRINOL. METAB. 11 (8) 299-303
(2000). gr; 9. HILL, J., DUCKWORTH, M., MURDOCK, P., RENNIE, G.,
SABIDO-DAVID, C., AMES, gr; R.S., SZEKERES, P., WILSON, S.,
BERGSMA, D.J., GLOGER, I.S., LEVY, D.S., gr; CHAMBERS, J.K. AND
MUIR, A.I. gr; Molecular cloning and functional characterization of
MCH2, a novel human MCH gr; receptor. gr; J. BIOL. CHEM. 276 (23)
20125-20129 (2001). gd; G protein-coupled receptors (GPCRs)
constitute a vast protein family that gd; encompasses a wide range
of functions (including various autocrine, gd; paracrine and
endocrine processes). They show considerable diversity at the gd;
sequence level, on the basis of which they may be separated into
distinct gd; groups. Applicants use the term clan to describe the
GPCRs, as they embrace gd; a group of families for which there are
indications of evolutionary gd; relationship, but between which
there is no statistically significant gd; similarity in sequence
[1,2]. The currently known clan members include the gd;
rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors,
the fungal gd; mating pheromone receptors, and the metabotropic
glutamate receptor family. gd; The rhodopsin-like GPCRs themselves
represent a widespread protein family gd; that includes hormone,
neurotransmitter and light receptors, all of gd; which transduce
extracellular signals through interaction with guanine gd;
nucleotide-binding (G) proteins. Although their activating ligands
vary gd; widely in structure and character, the amino acid
sequences of the gd; receptors are very similar and are believed to
adopt a common structural gd; framework which may comprise 7
transmembrane (TM) helices [3-5]. gd; Melanin-concentrating hormone
(MCH) is a cyclic peptide originally gd; identified in teleost fish
[6,7]. In fish, MCH is released from the gd; pituitary and causes
lightening of skin pigment cells through pigment gd; aggregation
[6,8]. In mammals, MCH is predominantly expressed in the gd;
hypothalamus, and functions as a neurotransmitter in the control of
a range gd; of functions [8]. A major role of MCH is thought to be
in the regulation of gd; feeding: injection of MCH into rat brains
stimulates feeding; expression of gd; MCH is upregulated in the
hypothalamus of obese and fasting mice; and mice gd; lacking MCH
are lean and eat less [6]. MCH and alpha melanocyte-stimulating gd;
hormone (alpha-MSH) have antagonistic effects on a number of
physiological gd; functions. Alpha-MSH darkens pigmentation in fish
and reduces feeding in gd; mammals, whereas MCH increases feeding
[6,8]. gd; Two G protein-coupled receptors, MCH1 and MCH2, have
recently been gd; identified as receptors for the hormone. gd; The
expression profile of MCH2 is similar to that of MCH1, with highest
gd; levels being found in the brain. However, expression of MCH2 is
gd; significantly lower than MCH1 in the pituitary, hypothalamus,
locus gd; coeruleus, medulla oblongata, and cerebellum [9]. Binding
of MCH to the gd; receptor causes a pertussis toxin-insensitive
increase in intracellular gd; calcium, suggesting coupling to Gq
proteins [9]. gd; MCH2RECEPTOR is a 9-element fingerprint that
provides a signature for the gd; melanin-concentrating hormone 2
receptor. The fingerprint was derived from gd; an initial alignment
of 5 sequences: the motifs were drawn from conserved gd; sections
within N- and C-terminal and loop regions, focusing on those
areas gd; of the alignment that characterise the MCH2 receptors but
distinguish them gd; from the rest of the MCH receptor family -
motifs 1 and 2 span the gd; N-terminus; motif 3 encodes the first
cytoplasmic loop; motif 4 lies in the gd; first external loop;
motif 5 spans the second cytoplasmic loop, leading into gd; TM
domain 4; motif 6 resides in the second external loop; motif 7
spans the gd; third cytoplasmic loop; motif 8 is located at the
N-terminus of TM domain 7; gd; and motif 9 encodes the C-terminus.
Two iterations on SPTR40_22f were gd; required to reach
convergence, at which point a true set which may comprise gd; 6
sequences was identified. fc; MCH2RECEPTOR5 fl; 20 ft;
Melanin-concentrating hormone 2 receptor motif V-2 fd;
LVQPFRLTSWRTRYKTIRIN Q8MJ88 135 29 fd; LVQPFRLTRWRTRYKTIRIN Q969V1
135 29 fd; LVQPFRLTRWRTRYKTIRIN Q9BXA8 135 29 fd;
LVQPFRLTSWRTRYKTIRIN Q8SQ54 135 29 fd; LVQPFRLTSWRTRYKTIRIN Q8MIN7
135 29 fd; LVQPFRLTSWRTRYKTIRIN Q8MIP5 135 29 PROSTANOID EP1
RECEPTOR SIGNATURE HMMER 2.3.2 (Oct 2003) Copyright (C)
1992-2003 HHMI/Washington University School of Medicine Freely
distributed under the GNU General Public License (GPL)
HMM file: prints.hmm Sequence file: uro742rev.107r
PRSTNOIDEP1R_4: domain 1 of 1, from 1 to 18: score 8.4, E = 4.7
*->isLGPpGGWRqAL.LAGL<-* ++LGP GG R+ L +AG uro742rev. 1
MGLGPSGGNRKTLfIAGK 18 PRSTNOIDEP1R_4: domain 1 of 1, from 1 to 18:
score 8.4, E = 4.7 *->isLGPpGGWRqAL.LAGL<-* ++LGP GG R+ L +AG
zc37.B8.10 1 MGLGPSGGNRKTLfIAGK 18 gc; PRSTNOIDEP1R gx; PR00580 gn;
COMPOUND (7) ga; 25-SEP-1996; UPDATE 07-JUN-1999 gt; Prostanoid EP1
receptor signature gp; PRINTS; PR00237 GPCRRHODOPSN; PR00247
GPCRCAMP; PR00248 GPCRMGR gp; PRINTS; PR00249 GPCRSECRETIN; PR00250
GPCRSTE2; PR00899 GPCRSTE3 gp; PRINTS; PR00251 BACTRLOPSIN gp;
PRINTS; PR00428 PROSTAGLNDNR; PR00581 PRSTNOIDEP2R; PR00582
PRSTNOIDEP3R gp; PRINTS; PR00583 PRSTNOIDE31R; PR00584
PRSTNOIDE32R; PR00585 PRSTNOIDE33R gp; PRINTS; PR00586
PRSTNOIDEP4R; PR00854 PRSTNOIDDPR; PR00855 PRSTNOIDFPR gp; PRINTS;
PR00856 PRSTNOIDIPR gp; INTERPRO; IPR000708 gr; 1. ATTWOOD, T.K.
AND FINDLAY, J.B.C. gr; Fingerprinting G protein-coupled receptors.
gr; PROTEIN ENG. 7 (2) 195-203 (1994). gr; 2. ATTWOOD, T.K. AND
FINDLAY, J.B.C. gr; G protein-coupled receptor fingerprints. gr;
7TM, VOLUME 2, EDS. G. VRIEND AND B. BYWATER (1993). gr; 3.
BIRNBAUMER, L. gr; G proteins in signal transduction. gr; ANNU.
REV. PHARMACOL. TOXICOL. 30 675-705 (1990). gr; 4. CASEY, P.J. AND
GILMAN, A.G. gr; G protein involvement in receptor-effector
coupling. gr; J. BIOL. CHEM. 263 (6) 2577-2580 (1988). gr; 5.
ATTWOOD, T.K. AND FINDLAY, J.B.C. gr; Design of a discriminating
fingerprint for G protein-coupled receptors. gr; PROTEIN ENG. 6 (2)
167-176 (1993). gr; 6. WATSON, S. AND ARKINSTALL, S. gr;
Prostanoids. gr; IN THE G PROTEIN-LINKED RECEPTOR FACTSBOOK,
ACADEMIC PRESS, 1994, PP. 239-251. gd; G protein-coupled receptors
(GPCRs) constitute a vast protein family that gd; encompasses a
wide range of functions (including various autocrine, para- gd;
crine and endocrine processes). They show considerable diversity at
the gd; sequence level, on the basis of which they may be separated
into distinct gd; groups. Applicants use the term clan to describe
the GPCRs, as they embrace gd; a group of families for which there
are indications of evolutionary gd; relationship, but between which
there is no statistically significant gd; similarity in sequence
[1,2]. The currently known clan members include the gd;
rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors,
the fungal gd; mating pheromone receptors, and the metabotropic
glutamate receptor family. gd; The rhodopsin-like GPCRs themselves
represent a widespread protein family gd; that includes hormone,
neurotransmitter and light receptors, all of gd; which transduce
extracellular signals through interaction with guanine gd;
nucleotide-binding (G) proteins. Although their activating ligands
vary gd; widely in structure and character, the amino acid
sequences of the gd; receptors are very similar and are believed to
adopt a common structural gd; framework which may comprise 7
transmembrane (TM) helices [3-5]. gd; Prostanoids (prostaglandins
(PG) and thromboxanes (TX)) mediate a wide gd; variety of actions
and play important physiological roles in the cardio- gd; vascular
and immune systems, and in pain sensation in peripheral systems gd;
[6]. PGI2 and TXA2 have opposing actions, involving regulation of
the gd; interaction of platelets with the vascular endothelium,
while PGE2, PGI2 gd; and PGD2 are powerful vasodilators and
potentiate the action of various gd; autocoids to induce plasma
extravasation and pain sensation. To date, gd; evidence for at
least 5 classes of prostanoid receptor has been obtained. gd;
However, identification of subtypes and their distribution is
hampered by gd; expression of more than one receptor within a
tissue, coupled with poor gd; selectivity of available agonists and
antagonists. gd; EP1 receptors mediate contraction of
gastrointestinal smooth muscles in gd; various species, and
relaxation of airway and uterine smooth muscles, gd; especially in
rodents [6]. The receptors activate the phosphoinositide gd;
pathway via a pertussis-toxin-insensitive G protein, probably of
the gd; Gq/G11 class [6]. gd; PRSTNOIDEP1R is a 7-element
fingerprint that provides a signature for the gd; prostanoid EP1
receptors. The fingerprint was derived from an initial gd;
alignment of 2 sequences: the motifs were drawn from conserved
sections gd; within either loop or N- and C-terminal regions,
focusing on those areas of gd; the alignment that characterise the
prostanoid EP1 receptors but distinguish gd; them from the rest of
the rhodopsin-like superfamily - motif 1 lies at the gd;
N-terminus; motif 2 spans the first cytoplasmic loop; motif 3 spans
the gd; first external loop; motif 4 lies in the second external
loop; motif 5 lies gd; in the third cytoplasmic loop; and motifs 6
and 7 span the C-terminus. A gd; single iteration on OWL28.2 was
required to reach convergence, no further gd; sequences being
identified beyond the starting set. gd; fc; PRSTNOIDEP1R4 fl; 17
ft; Prostanoid EP1 receptor motif IV-2 fd; ISLGPRGGWRQALLAGL
PE21_MOUSE 192 73 fd; ISLGPPGGWRQALLAGL PE21_RAT 192 73 fd;
IGLGPPGGWRQALLAGL PE21_HUMAN 190 73 CYCLINKINASE HMMER 2.3.2 (Oct
2003) Copyright (C) 1992-2003 HHMI/Washington University
School of Medicine Freely distributed under the GNU General Public
License (GPL) HMM file: prints.hmm Sequence file:
rheu.cd.215rev.1.736 CYCLINKINASE_3: domain 1 of 1, from 662 to
676: score 9.3, E = 3.1 *->EWRslGvqqslGWvh<-* E + Gvqq l Wvh
rheu.cd.21 662 ESSRFGVQQRLPWVH 676 gc; CYCLINKINASE gx; PR00296 gn;
COMPOUND (4) ga; 07-OCT-1994; UPDATE 07-JUN-1999 gt;
Cyclin-dependent kinase regulatory subunit signature gp; INTERPRO;
IPR000789 gp; PROSITE; PS00944 CKS_1; PS00945 CKS_2 gp; PFAM;
PF01111 CKS gr; 1. BRIZUELA, L., DRAETTA, G. AND BEACH, D. gr;
p13suc1 acts in the fission yeast cell division cycle as a
component of the gr; p34cdc2 protein kinase. gr; EMBO J. 6
3507-3514 (1987). gr; 2. PARGE, H.E., ARVAI, A.S., MURTARI, D.J.,
REED, S.I. AND TAINER, J.A. gr; Human CksHs2 atomic structure: a
role for its hexameric assembly in cell gr; cycle control. gr;
SCIENCE 262 387-395 (1993). gr; 3. TANG, Y. AND REED, S.I. gr; The
Cdk-associated protein Cks1 functions both in G1 and G2 in
Saccharomyces gr; cerevisiae. gr; GENES DEV. 7 822-832 (1993). gd;
In eukaryotes, cyclin-dependent protein kinases interact with
cyclins to gd; regulate cell cycle progression, and are required
for the G1 and G2 stages gd; of cell division [1]. The proteins
bind to a regulatory subunit (cyclin- gd; dependent kinase
regulatory subunit, or CKS), which is essential for their gd;
function [2]. The regulatory subunits exist as hexamers, formed by
the gd; symmetrical assembly of 3 interlocked homodimers, creating
an unusual gd; 12-stranded beta-barrel structure [2]. Through the
barrel centre runs a gd; 12A diameter tunnel, lined by 6 exposed
helix pairs [3]. Six kinase units gd; may be modelled to bind the
hexameric structure, which may thus act as a gd; hub for
cyclin-dependent protein kinase multimerisation [2,3]. gd;
CYCLINKINASE is a 4-element fingerprint that provides a signature
for gd; cyclin-dependent kinase regulatory subunits. The
fingerprint was derived gd; from an initial alignment of 4
sequences: the motifs were drawn from gd; conserved regions
encompassing virtually the full alignment length, motifs gd; 1, 2
and 4 spanning the regions encoded by PROSITE patterns CKS_1
(PS00944) gd; and CKS_2 (PS00945). Two iterations on OWL24.0 were
required to reach gd; convergence, at which point a true set which
may comprise 5 sequences was gd; identified. fc; CYCLINKINASE3 fl;
15 ft; Cyclin-dependent kinase regulatory subunit motif III-2 fd;
EWRRLGVQQSLGWVH CKS2_XENLA 42 7 fd; EWRNLGVQQSQGWVH CKS1_HUMAN 42 7
fd; EWRRLGVQQSLGWVH CKS2_HUMAN 42 7 fd; EWRRLGVQQSLGWVH CKS2_MOUSE
42 7 fd; EWRSIGVQQSHGWIH CKS1_PATVU 42 7 fd; EWRSIGVQQSRGWIH
CKS1_DROME 41 7 fd; EWRGLGVQQSQGWVH CKS1_PHYPO 42 7 fd;
EWRQLGVQQSQGWVH CKS1_LEIME 67 7 fd; EWRAIGVQQSRGWVH O23249 40 7 fd;
EWRGLGITQSLGWQH O60191 73 16 fd; EWRGLGITQSLGWEM CKS1_SCHPO 69 16
fd; EWRGLGITQSLGWEH CKS1_YEAST 73 16 fd; EWRSLGIQQSPGWMH CKS1_CAEEL
44 7 PEROXISOME PROLIFERATOR-ACTIVATED RECEPTOR (1C NUCLEAR
RECEPTOR) SIGNATURE HMMER 2.3.2 (Oct 2003) Copyright (C)
1992-2003 HHMI/Washington University School of Medicine Freely
distributed under the GNU General Public License (GPL) HMM file:
prints.hmm Sequence file: rheu.cd.215rev.1.736 PROXISOMEPAR_7:
domain 1 of 1, from 721 to 733: score 8.0, E = 5.7
*->KtEtdasLHPLLq<-*
K + sLHPLL rheu.cd.21 721 KVQAGHSLHPLLS 733 gc; PROXISOMEPAR gx;
PR01288 gn; COMPOUND (7) ga; 19-FEB-2000 gt; Peroxisome
proliferator-activated receptor (1C nuclear receptor) signature gp;
PRINTS; PR00398 STRDHORMONER; PR00047 STROIDFINGER gp; PRINTS;
PR01289 PROXISOMPAAR; PR01290 PROXISOMPABR; PR01291 PROXISOMPAGR
gr; 1. NUCLEAR RECEPTORS NOMENCLATURE COMMITTEE gr; A unified
nomenclature system for the nuclear receptor superfamily. gr; CELL
97 161-163 (1999). gr; 2. NISHIKAWA, J-I., KITAURA, M., IMAGAWA, M.
AND NISHIHARA, T. gr; Vitamin D receptor contains multiple
dimerisation interfaces that gr; are functionally different. gr;
NUCLEIC ACIDS RES. 23 (4) 606-611 (1995). gr; 3. DE VOS, P.,
SCHMITT, J., VERHOEVEN, G. AND STUNNENBERG, G. gr; Human androgen
receptor expressed in HeLa cells activates transcription gr; in
vitro. gr; NUCLEIC ACIDS RES. 22 (7) 1161-1166 (1994). gr; 4. KREY,
G., KELLER, H., MAHFOUDI, A., MEDIN, J., OZATO, K., DREYER, C. gr;
AND WAHLI, W. gr; Xenopus peroxisome proliferator activated
receptors: genomic organization, gr; response element recognition,
heterodimer formation with retinoid X receptor gr; and activation
by fatty acids. gr; J. STEROID BIOCHEM. MOL. BIOL. 47 65-73 (1993).
gr; 5. DREYER, C., KREY, G., KELLER, H., GIVEL, F., HELFTENBEIN, G.
gr; AND WAHLI, W. gr; Control of the peroxisomal beta-oxidation
pathway by a novel family gr; of nuclear hormone receptors. gr;
CELL 68 879-887 (1992). gd; Steroid or nuclear hormone receptors
(NRs) constitute an important super- gd; family of transcription
regulators that are involved in widely diverse gd; physiological
functions, including control of embryonic development, cell gd;
differentiation and homeostasis [1]. Members of the superfamily
include the gd; steroid hormone receptors and receptors for thyroid
hormone, retinoids, gd; 1,25-dihydroxy-vitamin D3 and a variety of
other ligands. The proteins gd; function as dimeric molecules in
nuclei to regulate the transcription of gd; target genes in a
ligand-responsive manner [2,3]. In addition to C-terminal gd;
ligand-binding domains, these nuclear receptors contain a
highly-conserved, gd; N-terminal zinc-finger that mediates specific
binding to target DNA gd; sequences, termed ligand-responsive
elements. In the absence of ligand, gd; steroid hormone receptors
are thought to be weakly associated with nuclear gd; components;
hormone binding greatly increases receptor affinity. gd; NRs are
extremely important in medical research, a large number of them gd;
being implicated in diseases such as cancer, diabetes, hormone
resistance gd; syndromes, etc. [1]. While several NRs act as
ligand-inducible transcription gd; factors, many do not yet have a
defined ligand and are accordingly termed gd; "orphan" receptors.
During the last decade, more than 300 NRs have been gd; described,
many of which are orphans, which cannot easily be named due to gd;
current nomenclature confusions in the literature. However, a new
system gd; has recently been introduced in an attempt to
rationalise the increasingly gd; complex set of names used to
describe superfamily members [1]. gd; Peroxisome
proliferator-activated receptors (PPAR) are ligand-activated gd;
transcription factors that belong to the nuclear hormone receptor
gd; superfamily. Three cDNAs encoding PPARs have been isolated from
Xenopus gd; laevis: xPPAR alpha, beta and gamma [4]. All three
xPPARs appear to be gd; activated by both synthetic peroxisome
proliferators and naturally occurring gd; fatty acids, suggesting a
common mode of action for all members of this gd; subfamily of
receptors [4]. Furthermore, the multiplicity of the receptors gd;
suggests the existence of hitherto unknown cellular signalling
pathways for gd; xenobiotics and putative endogenous ligands [5].
gd; PROXISOMEPAR is a 7-element fingerprint that provides a
signature for gd; peroxisome proliferator-activated receptors. The
fingerprint was derived gd; from an initial alignment of 11
sequences: the motifs were drawn from gd; conserved regions
spanning virtually the full alignment length, focusing on gd; those
sections that characterise the PPAR family but distinguish it from
the gd; rest of the steroid hormone receptor superfamily - motifs 1
and 2 lie gd; C-terminal to the zinc finger domain; and motifs 3-7
span the putative gd; ligand-binding domain. Three iterations on
SPTR37_10f were required to gd; reach convergence, at which point a
true set which may comprise 19 sequences gd; was identified. A
single partial match was also found, the Xenopus beta gd;
peroxisome proliferator activated receptor, PPAS_XENLA, which fails
to gd; match the first motif. fc; PROXISOMEPAR7 fl; 13 ft;
Peroxisome proliferator-activated receptor motif VII-3 fd;
KTETDMSLHPLLQ O18924 486 16 fd; KTETDMSLHPLLQ Q15832 486 16 fd;
KTETDMSLHPLLQ PPAT_HUMAN 456 16 fd; KTETDMSLHPLLQ O62807 485 16 fd;
KTETDMSLHPLLQ O18971 486 16 fd; KTETDMSLHPLLQ PPAT_RABIT 456 16 fd;
KTETDMSLHPLLQ O77815 485 16 fd; KTETDMSLHPLLQ O88275 456 16 fd;
KTETDMSLHPLLQ PPAT_MOUSE 456 16 fd; KTETDMSLHPLLQ Q15180 487 16 fd;
KTEADMCLHPLLQ PPAT_XENLA 458 16 fd; KTETDAALHPLLQ PPAR_XENLA 455 16
fd; KTESDAALHPLLQ PPAR_HUMAN 449 16 fd; KTESDAALHPLLQ PPAR_RAT 449
16 fd; KTESDAALHPLLQ PPAR_MOUSE 449 16 fd; KTETETSLHPLLQ PPAS_HUMAN
422 16 fd; KTESDAALHPLLQ PPAR_CAVPO 448 15 fd; KTESETLLHPLLQ
PPAS_MOUSE 421 16 fd; KTESETLLHPLLQ Q62879 421 16 MUSCARINIC M1
RECEPTOR SIGNATURE HMMER 2.3.2 (Oct 2003) Copyright (C)
1992-2003 HHMI/Washington University School of Medicine Freely
distributed under the GNU General Public License (GPL) HMM file:
prints.hmm Sequence file: rheu.cd.215rev.1.736 MUSCRINICM1R_4:
domain 1 of 2, from 161 to 177: score 0.9, E = 98
*->KmPmvDpEAqAPtKqPPk<-* K P vD q t qPP rheu.cd.21 161
KHPTVDFMVQINT-QPPF 177 gc; MUSCRINICM1R gx; PR00538 gn; COMPOUND
(6) ga; 01-JUN-1996; UPDATE 07-JUN-1999 gt; Muscarinic M1 receptor
signature gp; PRINTS; PR00237 GPCRRHODOPSN; PR00247 GPCRCAMP;
PR00248 GPCRMGR gp; PRINTS; PR00249 GPCRSECRETIN; PR00250 GPCRSTE2;
PR00899 GPCRSTE3 gp; PRINTS; PR00251 BACTRLOPSIN gp; PRINTS;
PR00243 MUSCARINICR; PR00539 MUSCRINICM2R; PR00540 MUSCRINICM3R gp;
PRINTS; PR00541 MUSCRINICM4R; PR00542 MUSCRINICM5R gp; INTERPRO;
IPR002228 gr; 1. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr;
Fingerprinting G protein-coupled receptors. gr; PROTEIN ENG. 7 (2)
195-203 (1994). gr; 2. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr; G
protein-coupled receptor fingerprints. gr; 7TM, VOLUME 2, EDS. G.
VRIEND AND B. BYWATER (1993). gr; 3. BIRNBAUMER, L. gr; G proteins
in signal transduction. gr; ANNU. REV. PHARMACOL. TOXICOL. 30
675-705 (1990). gr; 4. CASEY, P.J. AND GILMAN, A.G. gr; G protein
involvement in receptor-effector coupling. gr; J. BIOL. CHEM. 263
(6) 2577-2580 (1988). gr; 5. ATTWOOD, T.K. AND FINDLAY, J.B.C. gr;
Design of a discriminating fingerprint for G protein-coupled
receptors. gr; PROTEIN ENG. 6 (2) 167-176 (1993). gr; 6. KERLAVAGE,
A.R., FRASER, C.M., CHUNG, F-Z. AND VENTER, J.C. gr; Molecular
structure and evolution of adrenergic and cholinergic receptors.
gr; PROTEINS 1 287-301 (1986). gr; 7. WATSON, S. AND ARKINSTALL, S.
gr; Acetylcholine. gr; IN THE G PROTEIN-LINKED RECEPTOR FACTSBOOK,
ACADEMIC PRESS, 1994, PP. 7-18. gd; G protein-coupled receptors
(GPCRs) constitute a vast protein family that gd; encompasses a
wide range of functions (including various autocrine, para- gd;
crine and endocrine processes). They show considerable diversity at
the gd; sequence level, on the basis of which they may be separated
into distinct gd; groups. Applicants use the term clan to describe
the GPCRs, as they embrace gd; a group of families for which there
are indications of evolutionary gd; relationship, but between which
there is no statistically significant gd; similarity in sequence
[1,2]. The currently known clan members include the gd;
rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors,
the fungal gd; mating pheromone receptors, and the metabotropic
glutamate receptor family. gd; The rhodopsin-like GPCRs themselves
represent a widespread protein family gd; that includes hormone,
neurotransmitter and light receptors, all of gd; which transduce
extracellular signals through interaction with guanine gd;
nucleotide-binding (G) proteins. Although their activating ligands
vary gd; widely in structure and character, the amino acid
sequences of the gd; receptors are very similar and are believed to
adopt a common structural gd; framework which may comprise 7
transmembrane (TM) helices [3-5]. gd; The muscarinic acetylcholine
receptors, present in the central nervous gd; system, spinal cord
motoneurons and autonomic preganglia, modulate a gd; variety of
physiological functions, including airway, eye and intestinal gd;
smooth muscle contractions; heart rate; and glandular secretions.
The gd; receptors mediate adenylate cyclase attenuation, calcium
and potassium gd; channel activation, and phosphatidyl inositol
turnover [6]. This diversity gd; may result from the occurrence of
multiple receptor subtypes (of which 5 gd; are currently known,
designated M1 to M5), which have been classified gd; based on
observed differences in ligand binding to receptors in membranes
gd; from several tissues. gd; The M1 receptor is found in high
levels in neuronal cells of the CNS; it gd; is particularly
abundant in the cerebral cortex and hippocampus [7]. Its gd;
distribution largely overlaps with that of M3 and M4 subtypes. In
the gd; periphery, M1 receptors are found in autonomic ganglia and
certain gd; secretory glands, and they are also found in cell
lines. No truly selective gd; agonist has been described [7]. gd;
MUSCRINICM1R is a 6-element fingerprint that provides a signature
for the gd; muscarinic M1 receptors. The fingerprint was derived
from an initial gd; alignment of 4 sequences: the motifs were drawn
from conserved sections gd; within either loop or N- and C-terminal
regions, focusing on those areas gd; of the alignment that
characterise the M1 receptors but distinguish them gd; from the
rest of the muscarinic receptor family - motif 1 lies at the N- gd;
terminus; motifs 2-5 span the third cytoplasmic loop; and motif 6
lies gd; at the C-terminus. A single iteration on OWL28.0 was required
to reach gd; convergence, no further sequences being identified
beyond the starting set. fc; MUSCRINICM1R4 fl; 18 ft; Muscarinic M1
receptor motif IV-2 fd; KMPMVDPEAQAPTKQPPR ACM1_HUMAN 303 3 fd;
KMPMVDPEAQAPTKQPPK ACM1_MOUSE 303 3 fd; KMPMVDSEAQAPTKQPPK ACM1_RAT
303 3 fd; KMPMVDPEAQAPTKQPPR ACM1_MACMU 303 3 fd;
KMPMVDPEAQAPAKQPPR ACM1_PIG 303 3 METABOTROPIC GAMMA-AMINOBUTYRIC
ACID (GABA) TYPE B2 RECEPTOR SIGNATURE transcript zc35s.B3.3e.172:
GABAB2RECPTR_1: domain 1 of 1, from 111 to 129: score 5.9, E = 6.4
*->LAPGAWGWaRGAPRPPPss<-* + P W + P+PPPs+ zc35s.B3.3 111
VGPEQWLFPERKPKPPPSA 129 gc; GABAB2RECPTR gx; PR01178 gn; COMPOUND
(13) ga; 18-SEP-1999 gt; Metabotropic gamma-aminobutyric acid type
B2 receptor signature gp; PRINTS; PR00237 GPCRRHODOPSN; PR00247
GPCRCAMP; PR00249 GPCRSECRETIN gp; PRINTS; PR00250 GPCRSTE2;
PR00899 GPCRSTE3; PR00251 BACTRLOPSIN gp; PRINTS; PR00592
CASENSINGR; PR00593 MTABOTROPICR gp; PRINTS; PR01176 GABABRECEPTR;
PR01177 GABAB1RECPTR gp; INTERPRO; IPR002457 gr; 1. KAUPMANN, K.,
HUGGEL, K., HEID, J., FLOR, P.J., BISCHOFF, S., MICKEL, gr; S.J.,
MCMASTER, G., ANGST, C., BITTIGER, H., FROESTL, W. AND BETTLER, B.
gr; Expression cloning of GABA(B) receptors uncovers similarity to
metabotropic gr; glutamate receptors. gr; NATURE 386 239-246
(1997). gr; 2. KAUPMANN, K., SCHULER, V., MOSBACHER, J., BISCHOFF,
S., BITTIGER, H., gr; HEID, J., FROESTL, W., LEONHARD, S., PFAFF,
T., KARSCHIN, A. AND BETTLER, gr; B. Human gamma-aminobutyric acid
type B receptors are differentially gr; expressed and regulate
inwardly rectifying K+ channels. gr; PROC. NATL. ACAD. SCI. U.S.A.
95 (25) 14991-14996 (1998). gr; 3. WHITE, J.H., WISE, A., MAIN,
M.J., GREEN, A., FRASER, N.J., DISNEY, G.H., gr; BARNES, A.A.,
EMSON, P., FOORD, S.M. AND MARSHALL, F.H. gr; Heterodimerization is
required for the formation of a functional GABA(B) gr; receptor.
gr; NATURE 396 679-82 (1998). gd; GABA (gamma-amino-butyric acid)
is the principal inhibitory neurotransmitter gd; in the brain, and
signals through ionotropic (GABA(A)/GABA(C)) and gd; metabotropic
(GABA(B)) receptor systems [1]. The GABA(B) receptors have gd; been
cloned, and photoaffinity labelling experiments suggest that they
gd; correspond to two highly conserved receptor forms in the
vertebrate nervous gd; system [1]. gd; GABA(B) receptors are
involved in the fine tuning of inhibitory synaptic gd; transmission
[2]. Presynaptic receptors inhibit neurotransmitter release by gd;
down-regulating high-voltage activated Ca2+ channels, while
postsynaptic gd; receptors decrease neuronal excitability by
activating a prominent inwardly gd; rectifying K+ (Kir) conductance
that underlies the late inhibitory post- gd; synaptic potentials
[2]. GABA(B) receptors negatively couple to adenylyl gd; cyclase
and show sequence similarity to the metabotropic receptors for the
gd; excitatory neurotransmitter L-glutamate. gd; A new subtype of
the GABA(B) receptor (GABA(B)R2) has been identified by gd; EST
database mining [3]. Yeast two-hybrid screening has shown that the
new gd; subtype forms heterodimers with GABA(B)R1 via an
interaction at their gd; intracellular C-terminal tails [3]. On
expression with GABA(B)R2 in HEK293T gd; cells, GABA(B)R1 is
terminally glycosylated and expressed at the cell gd; surface.
Co-expression of the receptors produces a fully functional GABA(B)
gd; receptor at the cell surface; this receptor binds GABA with a
high affinity gd; equivalent to that of the endogenous brain
receptor [3]. Such results gd; indicate that, in vivo, functional
brain GABA(B) receptors may be hetero- gd; dimers of GABA(B)R1 and
GABA(B)R2. gd; GABAB2RECPTR is a 13-element fingerprint that
provides a signature for gd; type 2 GABA(B) receptors. The
fingerprint was derived from an initial gd; alignment of 2
sequences: the motifs were drawn from conserved regions gd;
spanning virtually the full alignment length, focusing on those
sections gd; that characterise the type 2 receptors but distinguish
them from the rest gd; of the GABA(B) receptor family. A single
iteration on SPTR37_10f was gd; required to reach convergence, no
further sequences being identified gd; beyond the starting set. fc;
GABAB2RECPTR1 fl; 19 ft; GABAB2 receptor motif I-1 fd;
LAPGAWGWARGAPRPPPSS O75899 35 35 fd; LAPGAWGWTRGAPRPPPSS O88871 34
34 ARGININE DEIMINASE SIGNATURE ARGDEIMINASE_6: domain 1 of 1, from
57 to 75: score 8.0, E = 6.8 *->seLsrGrggprcmsmplvR<-* s L+rG
g pr s p++ zc35s.B3.3 57 SPLGRGAGEPRRTSTPVAA 75 gc; ARGDEIMINASE
gx; PR01466 gn; COMPOUND (6) ga; 08-JAN-2001 gt; Bacterial arginine
deiminase signature gp; PRINTS; PR00102 OTCASE gp; PFAM; PF02726
Arg_deiminase gp; INTERPRO; IPR003876 gr; 1. BROWN, D.M., UPCROFT,
J.A., EDWARDS, M.R. AND UPCROFT, P. gr; Anaerobic bacterial
metabolism in the ancient eukaryote Giardia duodenalis. gr; INT. J.
PARASITOL. 28 149-64 (1998). gr; 2. HARASAWA, R., KOSHIMIZU, K.,
KITAGAWA, M., ASADA, K. AND KATO, I. gr; Nucleotide sequence of the
arginine deiminase gene of Mycoplasma hominis. gr; MICROBIOL.
IMMUNOL. 36 661-665 (1992). gr; 3. KANAOKA, M., KAWANAKA, C.,
NEGORO, T., FUKITA, Y., TAYA, K. AND AGUI, H. gr; Cloning and
expression of the antitumor glycoprotein gene of Streptococcus gr;
pyogenes Su in Escherichia coli. gr; AGRIC. BIOL. CHEM. 51
2641-2648 (1987). gr; 4. DEGNAN, B.A., PALMER, J.M., ROBSON, T.,
JONES, C.E., FISCHER, M., gr; GLANVILLE, M., MELLOR, G.D., DIAMOND,
A.G., KEHOE, M.A. AND GOODACRE, J.A. gr; Inhibition of human
peripheral blood mononuclear cell proliferation by gr;
Streptococcus pyogenes cell extract is associated with arginine
deiminase gr; activity. gr; INFECT. IMMUN. 66 3050-3058 (1998). gd;
The arginine dihydrolase (AD) pathway is found in many prokaryotes
and some gd; primitive eukaryotes, an example of the latter being
Giardia [1}. The three- gd; enzyme anaerobic pathway breaks down
L-arginine to form 1 mol of ATP, carbon gd; dioxide and ammonia. In
simpler bacteria, the first enzyme, arginine gd; deiminase, may
account for up to 10% of total cell protein [1]. gd; Arginine
deiminase catalyses the conversion of L-arginine to L-citrulline
gd; and ammonia. As well as producing energy via ATP, the ammonia
also serves gd; to protect the bacteria against acid damage, and
the citrulline generated gd; may be used in other biosynthetic
pathways [2]. A streptococcal acid gd; glycoprotein (SAGP) has also
been shown to function as an arginine gd; deiminase [3]. gd;
Recently, another function of this enzyme has been discovered [4].
It has a gd; potent anti-tumour effect, and may inhibit antigen,
superantigen, or mitogen- gd; stimulated human peripheral blood
mononuclear cell proliferation [4]. gd; Another function of the
protein may be to inhibit cell proliferation by gd; cell cycle
arrest and apoptosis induction. It has thus been hypothesized gd;
that recombinant arginine deiminase could be used as a novel
anti-tumour gd; agent [4]. gd; ARGDEIMINASE is a 6-element
fingerprint that provides a signature for gd; the bacterial
arginine deiminase protein family. The fingerprint was gd; derived
from an initial alignment of 4 sequences: the motifs were drawn
from gd; conserved regions spanning the full alignment length (~430
amino acids). Two gd; iterations on SPTR37_10f were required to
reach convergence, at which point gd; a true set which may comprise
13 sequences was identified. Three partial gd; matches were also
found: P75475 and P75474 are Mycoplasma pneumoniae arginine gd;
deiminases that match the first three and the last three motifs
respectively; and Q48294 is a Halobacterium salinarium arginine
deiminase that matches motifs 2 and 6. bb; fc; ARGDEIMINASE6 fl; 19
ft; Bacterial arginine deiminase motif VI-2 fd; SELSRGRGGPRCMSMPLIR
O51896 388 8 fd; SELSRGRGGPRCMSMPLIR Q46254 392 8 fd;
SELVRGRGGPRCMSMPFER SAGP_STRPY 389 8 fd; SELSRGRGGPRCMSMSLVR O51781
389 8 fd; GELSRGRGGPRCMSMPLYR O86131 391 8 fd; SELSRGRGGPRCMSMPLVR
O53088 388 8 fd; SELGRGRGGGHCMTCPIVR ARCA_PSEAE 394 8 fd;
NQLSLGMGNARCMSMPLSR ARCA_MYCHO 385 8 fd; SELGRGRGGGHCMTCPIWR O31017
387 8 fd; NQLSLGMGNARCMSMPLSR ARCA_MYCAR 386 8 fd;
GELGRGRGGGHCMTCPIVR ARCA_PSEPU 397 8 fd; SELGTGRGGPRCMSCPAAR O05585
381 8 fd; SELSRGPSGPLEMVCSLWR ARCA_MYCPN 419 8 OPIOID GROWTH FACTOR
RECEPTOR REPEAT HMMER 2.3.2 (Oct 2003) Copyright (C)
1992-2003 HHMI/Washington University School of Medicine Freely
distributed under the GNU General Public License (GPL) HMM file:
pfam.hmm Sequence file: zc37.B9.2de.p2 OGFr_III: domain 1 of 1,
from 186 to 207: score 8.2, E = 3.6
*->sPsEtPGPrPA..GParDEPAE<-* + tP P PA +GP+r +P E zc37.B9.2d
186 RAASTPVPTPAlrGPTRQDPGE 207 #=GF ID OGFr_III #=GF AC
PF04680.5 #=GF DE Opioid growth factor receptor repeat #=GF PI
OGFr_repeat; #=GF AU Waterfield DI, Finn RD #=GF SE Pfam-B_4529
(release 7.5) #=GF GA 33.30 0.00; 25.00 25.00; #=GF TC 40.70
0.30; 28.20 35.60; #=GF NC 30.90 18.10; 17.10 16.10; #=GF TP
Repeat #=GF BM hmmbuild -F HMM_ls.ann SEED.ann #=GF BM
hmmcalibrate --seed 0 HMM_ls #=GF BM hmmbuild -f -F HMM_fs.ann
SEED.ann #=GF BM hmmcalibrate --seed 0 HMM_fs #=GF AM
globalfirst #=GF RN [1] #=GF RM 11890982 #=GF RT The biology
of the opioid growth factor receptor (OGFr). #=GF RA Zagon IS,
Verderame MF, McLaughlin PJ; #=GF RL Brain Res Brain Res Rev
2002; 38: 351-376. #=GF DR INTERPRO; IPR006770; #=GF CC
Proline-rich repeat found only in a human opioid growth factor
#=GF CC receptor [1]. ADHESION MOLECULE CD36 SIGNATURE HMMER 2.3.2
(Oct 2003) Copyright (C) 1992-2003 HHMI/Washington University
School of Medicine Freely distributed under the GNU General Public
License (GPL) HMM file: prints.hmm Sequence file: zc3r11.B4.10d.p1
CD36ANTIGEN_3: domain 1 of 1, from 11 to 29: score 6.3, E = 7.7
*->WiFDvqnPdevaknsskikvkqR<-*
vq P+e ss+ +v+qR zc3r11.B4. 11 ---NVQDPEE-QNESSRFRVQQR 29 gc;
CD36ANTIGEN gx; PR01610 gn; COMPOUND (13) ga; 23-DEC-2001 gt;
Adhesion molecule CD36 signature gp; PRINTS; PR01609 CD36FAMILY;
PR01611 LIMPII gp; MIM; 173510 gr; 1. OKUMURA, T. AND JAMIESON,
G.A. gr; Platelet glycocalicin. Orientation of glycoproteins on the
human platelet gr; surface. gr; J. BIOL. CHEM. 251 5944-5949
(1976). gr; 2. NICHOLSON, A.C., FEBBRAIO, M., HAN, J., SILVERSTEIN,
R.L. AND gr; HAJJAR, D.P. gr; CD36 in atherosclerosis. The role of
a class B macrophage scavenger receptor. gr; ANN. N.Y. ACAD. SCI.
902 128-131 (2000). gr; 3. SILVERSTEIN, R.L. AND FEBBRAIO, M. gr;
CD36 and atherosclerosis. gr; CURR. OPIN. LIPIDOL. 11 483-491
(2000). gr; 4. SAVILL, J., HOGG, N., REN, Y. AND HASLETT, C. gr;
Thrombospondin cooperates with CD36 and the vitronectin receptor
gr; in macrophage recognition of neutrophils undergoing apoptosis.
gr; J. CLIN. INVEST. 90 1513-1522 (1989). gr; 5. TANDON, NN.,
KRALISZ, U. AND JAMIESON, GA. gr; Identification of glycoprotein IV
(CD36) as a primary receptor gr; for platelet-collagen adhesion.
gr; J. BIOL. CHEM. 264 7576-7583 (1989). gr; 6. MCGREGOR, J.L.,
CATIMEL, B., PARMENTIER, S., CLEZARDIN, P., gr; DECHAVANNE, M. AND
LEUNG, L.L. gr; Rapid purification and partial characterization of
human platelet gr; glycoprotein IIIb. Interaction with
thrombospondin and its role in platelet gr; aggregation. gr; J.
BIOL. CHEM. 264 501-506 (1989). gr; 7. BARNWELL, J.W., ASCH, A.S.,
NACHMAN, R.L., YAMAYA, M., AIKAWA, M. AND gr; INGRAVALLO, P. gr; A
human 88-KD membrane glycoprotein (CD36) functions in vitro as a
receptor gr; for a cytoadherence ligand on Plasmodium
falciparum-infected erythrocytes. gr; J. CLIN. INVEST. 84 765-772
(1989). gr; 8. BULL, H.A., BRICKELL, P.M. AND DOWD, P.M. gr;
Src-related protein tyrosine kinases are physically associated with
the gr; surface antigen CD36 in human dermal microvascular
endothelial cells. gr; FEBS LETT. 351 41-44 (1994). gr; 9. MIYAOKA,
K., KUWASAKO, T., HIRANO, K., NOZAKI, S., YAMASHITA, S. gr; AND
MATSUZAWA, Y. gr; CD36 deficiency associated with insulin
resistance. gr; LANCET 357 686-687 (2001). gd; CD36 is a
transmembrane, highly glycosylated, 88kDa glycoprotein [1] gd;
expressed by monocytes, macrophages, platelets, microvascular
endothelial gd; cells and adipose tissue [2]. It is a
multifunctional receptor that binds gd; to oxidised LDL (OxLDL),
long chain fatty acids, anionic phospholipids, gd; apoptotic cells,
thrombospondin (TSP), collagen and Plasmodium falciparum- gd;
infected erythrocytes [2]. gd; CD36 has numerous cellular
functions. It is a type B scavenger receptor, gd; playing a major
role in the uptake of OxLDL by macrophages [3]. The lipid- gd; rich
macrophages are then differentiated into foam cells and contribute
to gd; the formation of atherosclerotic lesions [3]. In addition,
CD36 of macro- gd; phages, together with TSP and the integrin
alphav beta3, may phagocytose gd; apoptotic neutrophils [4].
Furthermore, the protein is one of the receptors gd; of collagen in
platelet adhesion and aggregation [5,6]. CD36 may also gd; mediate
cytoadherence of Plasmodium falciparum-infected erythrocytes to the
gd; endothelium of post-capillary venules of different organs [7].
Moreover, gd; cytoplasmic CD36 plays an important role in signal
transduction by inter- gd; acting with Src family tyrosine kinases
[8]. Deficiency in CD36 in Asian gd; and African populations has
been associated with insulin resistance [9]. gd; CD36 is a
13-element fingerprint that provides a signature for the CD36 gd;
adhesion molecules. The fingerprint was derived from an initial
alignment gd; of 4 sequences, focusing on those sections that
characterise CD36 adhesion gd; molecules but distinguish them from
the rest of the CD36 family: motif 1 gd; spans the first putative,
N-terminal TM domain; motifs 2-12 reside in the gd; extracellular
domain; and motif 13 spans the second putative, C-terminal gd; TM
domain. Two iterations on SPTR40_18f were required to reach
convergence, gd; at which point a true set which may comprise 6
sequences was identified. bb; fc; CD36ANTIGEN3 fl; 23 ft; Adhesion
molecule CD36 motif III-2 fd; WVFDVQNPEEVAKNSSKIKVIQR CD36_RAT 65
18 fd; WIFDVQNPDDVAKNSSKIKVKQR CD36_MOUSE 65 18 fd;
WIFDVQNPDEVTVNSSKIKVKQR CD36_BOVIN 65 18 fd;
WIFDVQNPQEVMMNSSNIQVKQR CD36_HUMAN 65 18 fd;
WIFDVQNPDEVAVNSSKIKVKQR CD36_MESAU 65 18 fd;
WIFDVQNPEEVAKNSSKIKVKQR O35754 66 18 MYELIN PROTEOLIPID PROTEIN
(PLP) SIGNATURE HMMER 2.3.2 (Oct 2003) Copyright (C)
1992-2003 HHMI/Washington University School of Medicine Freely
distributed under the GNU General Public License (GPL)
---------------------------------------------------------------------------
------- HMM file: prints.hmm Sequence file:
zc312.B11.20d.trrev4_8009.sreformat MYELINP0_5: domain 1 of 1, from
70 to 91: score -0.1, E = 9.3 *->GVVlGAiIGGvLGvVLLlvlllYLv<-*
lG iIGGv G VLL + +l + zc312.B11. 70 --MLGRIIGGV-GCVLLELXGLGVR 91
gc; MYELINPLP gx; PR00214 gn; COMPOUND (7) ga; 11-JUL-1994; UPDATE
07-JUN-1999 gt; Myelin proteolipid protein (PLP) signature gp;
INTERPRO; IPR001614 gp; PROSITE; PS00575 MYELIN_PLP_1; PS01004
MYELIN_PLP_2 gp; BLOCKS; BL00575 gp; PFAM; PF01275 Myelin_PLP gr;
1. SAKAMOTO, Y., KITAMURA, K., YOSHIMURA, K., NISHIJIMA, T. AND
UYEMURA, K. gr; Complete amino acid sequence of P0 protein in
bovine peripheral nerve gr; myelin. gr; J. BIOL. CHEM. 262
4208-4214 (1987). gr; 2. SHAW, S.Y., LAURSEN, R.A. AND LEES, M.B.
gr; Identification of thiol groups and a disulfide crosslink site
in bovine gr; myelin proteolipid protein. gr; FEBS LETT. 250
306-310 (1989). gr; 3. DIEHL, H.J., SCHAICH, M., BUDZINSKI, R.M.
AND STOFFEL, W. gr; Individual exons encode the integral membrane
domains of human myelin gr; proteolipid protein. gr; PROC. NATL.
ACAD. SCI. U.S.A. 83 9807-9811 (1986). gd; The myelin sheath is a
multi-layered membrane, unique to the nervous system, gd; that
functions as an insulator to greatly increase the velocity of
axonal gd; impulse conduction [1]. Myelin proteolipid protein (PLP)
is the major gd; protein found in the sheath of central nervous
system nerves [2]. It spans gd; the membrane 4 times [3] and is
thought to play a role in the formation or gd; maintenance of the
multi-lamellar structure. The protein contains several gd; cysteine
residues, some involved in the formation of disulphide bonds, gd;
others being palmitoylated [2]. Mutations in PLP result in
neurological gd; disorders, such as Pelizaeus-Merzbacher disease in
humans, `jimpy` in gd; mice, and `shaking pup` in dogs. gd;
MYELINPLP is a 7-element fingerprint that provides a signature for
myelin gd; proteolipid proteins. The fingerprint was derived from
an initial alignment gd; of 4 sequences: motifs 1, 2, 5 and 7
encode the 4 transmembrane (TM) gd; domains - motif 4 includes the
region encoded by PROSITE pattern MYELIN_PLP_1 gd; (PS00575), which
is located between the second and third TM segments gd; and
contains 2 Cys residues that are palmitoylated; motif 7 includes
part gd; of the region encoded by PROSITE pattern MYELIN_PLP_2
(PS01004). Two gd; iterations on OWL23.2 were required to reach
convergence, at which point a gd; true set which may comprise 9
sequences was identified. Several partial gd; matches were also
found, all of which are either deletion mutants or myelin gd; PLP
fragments. gd; An update on SPTR37_9f identified a true set of 8
sequences, and 9 gd; partial matches. CHLAMIDIAOM HMMER 2.3.2 (Oct
2003) Copyright (C) 1992-2003 HHMI/Washington University
School of Medicine Freely distributed under the GNU General Public
License (GPL)
---------------------------------------------------------------------------
------- HMM file: prints.hmm Sequence file:
rheu.cd.212rp.365_22305.sreformat CHLAMIDIAOM3_3: domain 1 of 1,
from 88 to 100: score 4.6, E = 9.7 *->CgsYvPsCskpcG<-* C +Y+
C k G rheu.cd.21 88 CTGYTEFCAKYTG 100 gr; 3. BACHMAIER, K., NEU, N.
DE LA MAZA, L.M., PAL, S., HESSEL, A. AND gr; PENNINGER, J.M. gr;
Chlamydia infections and heart disease linked through antigenic
mimicry. gr; SCIENCE 283 1335-1339 (1999). bb; bb; gd; Three
cysteine-rich proteins (also believed to be lipoproteins) make up
the gd; extracellular matrix of the Chlamydial outer membrane [1].
They are involved gd; in the essential structural integrity of both
the elementary body (EB) and gd; recticulate body (RB) phase. As
these bacteria lack the peptidoglycan layer gd; common to most
Gram-negative microbes, such proteins are highly important gd; in
the pathogenicity of the organism. gd; gd; The largest of these is
the major outer membrane protein (momp), and gd; constitutes around
60% of the total protein for the membrane [2]. CMP2 gd; is the
second largest, with a molecular mass of 58kDa, while the CMP3 gd;
protein is ~15kDa [1]. MOMP is believed to elicit the strongest
immune gd; response, and has recently been linked to heart disease
through its sequence gd; similarity to a murine heart-muscle
specific alpha myosin [3]. gd; gd; The CMP3 family plays a
structural role in the outer membrane during gd; the EB stage of
the Chlamydial cell, and different biovars show a small, yet gd;
highly significant, change at peptide charge level [1]. Members of
this gd; family include C. trachomatis, C. pneumoniae, and C.
psittaci. gd; gd; CHLAMIDIAOM3 is a 3-element fingerprint that
provides a signature for gd; the Chlamydial cysteine-rich outer
membrane 3 protein (CMP3) family. gd; The fingerprint was derived
from an initial alignment of 3 sequences: the gd; motifs were drawn
from conserved regions spanning the full alignment length gd; (~90
amino acids). Two iterations on SPTR37_10f were required to reach
gd; convergence, at which point a true set comprising 8 sequences
was gd; identified. ; ; ; ; indicates data missing or illegible
when filed
[0115] The present invention also relates to an oligonucleotide
primer which may comprise or consist of part of a polynucleic
acid as defined above, with said primer being able to act as primer
for specifically sequencing or specifically amplifying TT virus HCR
polynucleic acid of the invention and attached cellular (host) DNA
sequences.
[0116] The term "primer" refers to a single stranded DNA
oligonucleotide sequence capable of acting as a point of initiation
for synthesis of a primer extension product which is complementary
to the nucleic acid strand to be copied. The length and the
sequence of the primer must be such that they allow priming the
synthesis of the extension products. Preferably the primer is about
5-50 nucleotides long. Specific length and sequence will depend on the
complexity of the required DNA or RNA targets, as well as on the
conditions of primer use such as temperature and ionic
strength.
[0117] The fact that amplification primers do not have to match
exactly with corresponding template sequence to warrant proper
amplification is amply documented in the literature. The
amplification method used may be polymerase chain reaction (PCR),
ligase chain reaction (LCR), nucleic acid sequence-based
amplification (NASBA), transcription-based amplification system
(TAS), strand displacement amplification (SDA) or amplification by
means of Q.beta. replicase or any other suitable method to amplify
nucleic acid molecules using primer extension. During
amplification, the amplified products may be conveniently labelled
either using labelled primers or by incorporating labelled
nucleotides.
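The tolerance for imperfect primer-template matches noted in paragraph [0117] can be illustrated with a minimal Python sketch that scans a template for sites matching a primer with at most a given number of mismatches. The function names `hamming` and `find_primer_sites`, the `max_mismatches` parameter and the toy template are assumptions made purely for illustration. The two primer sequences are jt34f-5s and t3pb-1s as listed in Table 2, which differ at a single position. Real priming also depends on where a mismatch lies, on temperature and on ionic strength, none of which is modelled here.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_primer_sites(template: str, primer: str, max_mismatches: int = 1):
    """Return (0-based start, mismatch count) for every window within the limit."""
    k = len(primer)
    hits = []
    for start in range(len(template) - k + 1):
        d = hamming(template[start:start + k], primer)
        if d <= max_mismatches:
            hits.append((start, d))
    return hits


JT34F_5S = "CAATTCGGGCTCGGGACT"  # Table 2, primer jt34f-5s
T3PB_1S = "CAATTCGGGCACGGGACT"   # Table 2, primer t3pb-1s

# Hypothetical template: the t3pb-1s site embedded in arbitrary flanking bases.
template = "TTTT" + T3PB_1S + "GGGG"

exact_only = find_primer_sites(template, JT34F_5S, max_mismatches=0)
one_allowed = find_primer_sites(template, JT34F_5S, max_mismatches=1)
```

With no mismatches allowed, jt34f-5s finds no site on this toy template; allowing one mismatch recovers the embedded t3pb-1s site.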
[0118] Labels may be isotopic (32P, 35S, etc.) or non-isotopic
(biotin, digoxigenin, etc.). The amplification reaction is repeated
between 20 and 70 times, advantageously between 25 and 45
times.
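As a rough guide to what the cycle numbers in paragraph [0118] imply, the following Python sketch computes the idealized copy number after a given number of cycles. The function `ideal_copies` and its `efficiency` parameter are assumptions introduced for illustration; the model assumes each cycle multiplies the template by (1 + efficiency) and ignores the plateau that real reactions reach long before the upper cycle numbers.

```python
def ideal_copies(start_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number: each cycle multiplies the template by (1 + efficiency)."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return start_copies * (1.0 + efficiency) ** cycles


# Cycle numbers taken from the ranges stated in the text (20-70, preferably 25-45).
table = {n: ideal_copies(1, n) for n in (20, 25, 45, 70)}
```

Under perfect doubling a single template yields about a million copies after 20 cycles, which is why the theoretical values for 45 or 70 cycles are never reached in practice.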
[0119] Any of a variety of sequencing reactions known in the art
may be used to directly sequence the viral genetic information and
determine the open reading frame (ORF) by translating the sequence of the sample into
the corresponding amino acid sequence. Exemplary sequencing
reactions include those based on techniques developed by Sanger or
Maxam and Gilbert. It is also contemplated that a variety of
automated sequencing procedures may be utilized when performing the
subject assays including sequencing by mass spectrometry (see, for
example: PCT publication WO 94/16101). It will be evident to one
skilled in the art that, for example, the occurrence of only two or
three nucleic bases needs to be determined in the sequencing
reaction.
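The translation step mentioned in paragraph [0119] can be sketched in Python as follows. This is a minimal illustration under stated assumptions: it uses the standard genetic code, scans only the three forward reading frames, treats an ORF as a stretch from the first methionine to the next stop codon or the end of the sequence, and the names `translate` and `longest_orf` as well as the example sequence are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in T, C, A, G order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, frame: int = 0) -> str:
    """Translate one forward reading frame; unrecognized codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def longest_orf(dna: str):
    """Return (frame, peptide) for the longest Met-initiated stretch in frames 0-2."""
    best = (0, "")
    for frame in range(3):
        for segment in translate(dna, frame).split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best[1]):
                best = (frame, segment[start:])
    return best


# Hypothetical input sequence, used only to exercise the functions.
frame, peptide = longest_orf("CCATGGCCAAATAGG")
```

A complete analysis would also scan the reverse strand and, for circular genomes, reading frames that span the origin.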
[0120] Preferably, these primers are about 5 to 50 nucleotides
long, more preferably from about 10 to 25 nucleotides. Most
preferred are primers having a length of at least 13 bases.
[0121] In a preferred embodiment, a primer of the present invention
has a nucleotide sequence as shown in Table 2.
TABLE-US-00002
TABLE 2
Primers used to generate complete TTV-HD genomes and .mu.TTV-HD
subviral genomes by long distance PCR amplification

TTV                   Primer     Nucleotide  Sequence
                                 number
TTV-jt34f             jt34f-1s   223-247     5'-GGCCGGGCCA TGGGCAAGGC TCTTA-3'
(acc no AB064607)     jt34f-2as  195-222     5'-AGTCAAGGGG CAATTCGGGC TCGGGACT-3'
                      jt34f-5s   205-222     5'-CAATTCGGGC TCGGGACT-3'
                      jt34f-6as  186-204     5'-ACACACCGCA GTCAAGGGG-3'
                      jt34f-7s   205-223     5'-CAATTCGGGC TCGGGACTG-3'
                      jt34f-8as  181-204     5'-AGTTTACACA CCGCAGTCAA GGGG-3'
TTV-HD1               th25-1s    126-156     5'-CCGCAGCGAG AACGCCACGG AGGGAGATCC T-3'
(acc no AJ620222)     tth25-2as  95-125      5'-ACTTCCGAAT GGCTGAGTTT TCCACGCCCG T-3'
TTV-HD3               tth8-1s    133-164     5'-AGAGGAGCCA CGGCAGGGGA TCCGAACGTC CT-3'
(acc no AJ620231)     tth8-2as   102-132     5'-CTTACCGACT CAAAAACGAC GGGCAGGCGC C-3'
TTV-HD4               tth4-1s    129-156     5'-CAGCGAGAAC GCCACGGAGG GAGATCCT-3'
(acc no AJ620226)     tth4-2as   101-128     5'-GAATGGCTGA GTTTTCCACG CCCGTCCG-3'
TTV-t3pb              t3pb-1s    209-226     5'-CAATTCGGGC ACGGGACT-3' *
(acc. no AF247138)    t3pb-2as   185-208     5'-AGTTTACACA CCGAAGTCAA GGGG-3'

* A - TTV-t3pb sequence has a T at this position
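The primer data of Table 2 can be checked for internal consistency with a short Python sketch. It assumes that the "Nucleotide number" column gives an inclusive, 1-based position range, so that each primer length should equal last minus first plus one; that reading, and the helper name `inconsistent_primers`, are assumptions for illustration. Names, positions and sequences are transcribed from Table 2 as printed.

```python
# (primer name, first position, last position, sequence as printed in Table 2)
PRIMERS = [
    ("jt34f-1s", 223, 247, "GGCCGGGCCA TGGGCAAGGC TCTTA"),
    ("jt34f-2as", 195, 222, "AGTCAAGGGG CAATTCGGGC TCGGGACT"),
    ("jt34f-5s", 205, 222, "CAATTCGGGC TCGGGACT"),
    ("jt34f-6as", 186, 204, "ACACACCGCA GTCAAGGGG"),
    ("jt34f-7s", 205, 223, "CAATTCGGGC TCGGGACTG"),
    ("jt34f-8as", 181, 204, "AGTTTACACA CCGCAGTCAA GGGG"),
    ("th25-1s", 126, 156, "CCGCAGCGAG AACGCCACGG AGGGAGATCC T"),
    ("tth25-2as", 95, 125, "ACTTCCGAAT GGCTGAGTTT TCCACGCCCG T"),
    ("tth8-1s", 133, 164, "AGAGGAGCCA CGGCAGGGGA TCCGAACGTC CT"),
    ("tth8-2as", 102, 132, "CTTACCGACT CAAAAACGAC GGGCAGGCGC C"),
    ("tth4-1s", 129, 156, "CAGCGAGAAC GCCACGGAGG GAGATCCT"),
    ("tth4-2as", 101, 128, "GAATGGCTGA GTTTTCCACG CCCGTCCG"),
    ("t3pb-1s", 209, 226, "CAATTCGGGC ACGGGACT"),
    ("t3pb-2as", 185, 208, "AGTTTACACA CCGAAGTCAA GGGG"),
]


def primer_length(sequence: str) -> int:
    """Length of a primer once the spacing used in the printed table is removed."""
    return len(sequence.replace(" ", ""))


def inconsistent_primers(primers):
    """Names of primers whose length differs from the inclusive position range."""
    return [name for name, first, last, seq in primers
            if primer_length(seq) != last - first + 1]


lengths = {name: primer_length(seq) for name, _, _, seq in PRIMERS}
mismatched = inconsistent_primers(PRIMERS)
```

Under that assumption every listed primer is consistent with its stated positions, and all lengths fall within the 5 to 50 nucleotide range and above the 13-base minimum given in paragraph [0120].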
[0122] The present invention also relates to an oligonucleotide
probe which may comprise or consist of part of a rearranged TT
virus polynucleic acid as defined above, with said probe being able
to act as a hybridization probe for specific detection of a TTV
nucleic acid according to the invention.
[0123] The term "probe" refers to single stranded sequence-specific
oligonucleotides which have a sequence which is complementary to
the target sequence of the rearranged TTV polynucleic acid to be
detected.
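The complementarity requirement of paragraph [0123] can be illustrated with a minimal Python sketch. The function names `reverse_complement` and `probe_binds` and the toy target strand are assumptions for illustration; the check covers only perfect Watson-Crick complementarity and says nothing about hybridization conditions.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_binds(target_strand: str, probe: str) -> bool:
    """True if the probe is perfectly complementary to some stretch of the target."""
    return reverse_complement(probe).upper() in target_strand.upper()


# Hypothetical example: a target strand built around the complement of a probe.
probe = "CAATTCGGGCTCGGGACT"  # sequence of primer jt34f-5s in Table 2, reused as a probe
target = "GG" + reverse_complement(probe) + "TT"
binds = probe_binds(target, probe)
```

A probe with the same sequence as the target strand, rather than its reverse complement, would not be reported as binding by this check.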
[0124] Preferably, these probes are about 5 to 50 nucleotides long,
more preferably from about 10 to 25 nucleotides. Most preferred are
probes having a length of at least 13 bases.
[0125] The probe may be labelled or attached to a solid
support.
[0126] The term "solid support" may refer to any substrate to which
an oligonucleotide probe may be coupled, provided that it retains
its hybridization characteristics and provided that the background
level of hybridization remains low. Usually the solid substrate
will be a microtiter plate, a membrane (e.g. nylon or
nitrocellulose) or a microsphere (bead). Prior to application to
the membrane or fixation it may be convenient to modify the nucleic
acid probe in order to facilitate fixation or improve the
hybridization efficiency. Such modifications may encompass
homopolymer tailing, coupling with different reactive groups such
as aliphatic groups, NH.sub.2 groups, SH groups, carboxylic groups,
or coupling with biotin or haptens.
[0127] The oligonucleotides according to the present invention,
used as primers or probes may also contain or consist of nucleotide
analogues such as phosphorothioates, alkylphosphorothioates or peptide
nucleic acids or may contain intercalating agents. These
modifications will necessitate adaptations with respect to the
conditions under which the oligonucleotide should be used to obtain
the required specificity and sensitivity. However, the eventual
results will be essentially the same as those obtained with the
unmodified oligonucleotides.
[0128] The introduction of these modifications may be advantageous
in order to positively influence characteristics such as
hybridization kinetics, reversibility of the hybrid-formation,
biological stability of the oligonucleotide molecules, etc.
[0129] The polynucleic acids of the invention may be comprised in a
composition of any kind. Said composition may be for diagnostic,
therapeutic or prophylactic use.
[0130] Also included within the present invention are sequence
variants of the polynucleic acids as selected from any of the
nucleotide sequences with said sequence variants containing either
deletions and/or insertions of one or more nucleotides, especially
insertions or deletions of 1 or more codons, mainly at the
extremities of oligonucleotides (either 3' or 5'), or substitutions
of some non-essential nucleotides by others (including modified
nucleotides and/or inosine).
[0131] Rearranged TTV polynucleic acid sequences according to the
present invention which are similar to the sequences as shown in
FIG. 1 may be characterized and isolated according to any of the
techniques known in the art, such as amplification by means of
sequence-specific primers, hybridization with sequence-specific
probes under more or less stringent conditions, sequence
determination of the genetic information of TTV, etc.
[0132] The present invention also relates to a recombinant
expression vector which may comprise a rearranged TTV polynucleic
acid of the invention as defined above operably linked to
prokaryotic, eukaryotic or viral transcription and translation
control elements.
[0133] The term "vector" may comprise a plasmid, a cosmid, an
artificial chromosome, a phage, or a virus or a transgenic
non-human animal. Particularly useful for vaccine development may
be TT virus recombinant molecules, BCG or adenoviral vectors, as
well as avipox recombinant viruses.
[0134] The term "recombinantly expressed" used within the context
of the present invention refers to the fact that the polypeptides
of the present invention are produced by recombinant expression
methods be it in prokaryotes, or lower or higher eukaryotes as
discussed in detail below.
[0135] The term "lower eukaryote" refers to host cells such as
yeast, fungi and the like. Lower eukaryotes are generally (but not
necessarily) unicellular. Preferred lower eukaryotes are yeasts,
particularly species within Saccharomyces, Schizosaccharomyces,
Kluyveromyces, Pichia (e. g. Pichia pastoris), Hansenula (e. g.
Hansenula polymorpha), Schwanniomyces, Yarrowia,
Zygosaccharomyces and the like. Saccharomyces cerevisiae, S.
carlsbergensis and K. lactis are the most commonly used yeast
hosts, and are convenient fungal hosts.
[0136] The term "higher eukaryote" refers to host cells derived
from higher animals, such as mammals, reptiles, insects, and the
like. Presently preferred higher eukaryote host cells are derived
from Chinese hamster (e. g. CHO), monkey (e. g. COS and Vero
cells), baby hamster kidney (BHK), pig kidney (PK15), rabbit kidney
13 cells (RK13), the human osteosarcoma cell line 143 B, the human
cell line HeLa and human hepatoma cell lines like Hep G2, and
insect cell lines (e.g. Spodoptera frugiperda). The host cells may
be provided in suspension or flask cultures, tissue cultures, organ
cultures and the like. Alternatively the host cells may also be
transgenic non-human animals.
[0137] The term "prokaryotes" refers to hosts such as E. coli,
Lactobacillus, Lactococcus, Salmonella, Streptococcus, Bacillus
subtilis or Streptomyces. Also these hosts are contemplated within
the present invention.
[0138] The term "host cell" refers to cells which may be, or have
been, used as recipients for a recombinant vector or other transfer
polynucleotide, and includes the progeny of the original cell which
has been transfected.
[0139] It is understood that the progeny of a single parental cell
may not necessarily be completely identical in morphology or in
genomic or total DNA complement to the original parent, due to
natural, accidental, or deliberate mutation or recombination.
[0140] The term "replicon" is any genetic element, e. g., a
plasmid, a chromosome, a virus, a cosmid, etc., that behaves as an
autonomous unit of polynucleotide replication within a cell, i. e.,
capable of replication under its own control.
[0141] The term "vector" refers to a replicon which may further comprise
sequences providing replication and/or expression of a desired open
reading frame.
[0142] The term "control element" refers to polynucleotide
sequences which are necessary to effect the expression of coding
sequences to which they are ligated. The nature of such control
sequences differs depending upon the host organism; in prokaryotes,
such control sequences generally include promoter, ribosomal
binding site, splicing sites and terminators; in eukaryotes,
generally, such control sequences include promoters, splicing
sites, terminators and, in some instances, enhancers. The term
"control elements" is intended to include, at a minimum, all
components whose presence is necessary for expression, and may also
include additional components whose presence is advantageous, for
example, leader sequences which govern secretion.
[0143] The term "promoter" is a nucleotide sequence which is
comprised of consensus sequences which allow the binding of RNA
polymerase to the DNA template in a manner such that mRNA
production initiates at the normal transcription initiation site
for the adjacent structural gene.
[0144] The expression "operably linked" refers to a juxtaposition
wherein the components so described are in a relationship
permitting them to function in their intended manner. A control
sequence "operably linked" to a coding sequence is ligated in such
a way that expression of the coding sequence is achieved under
conditions compatible with the control sequences.
[0145] The segment of the rearranged TTV DNA encoding the desired
sequence inserted into the vector sequence may be attached to a
signal sequence. Said signal sequence may be that from a non-TTV
source, but particularly preferred constructs according to the
present invention contain signal sequences appearing in the TTV
genome before the respective start points of the proteins.
[0146] Higher eukaryotes may be transformed with vectors, or may be
infected with a recombinant virus, for example a recombinant
vaccinia virus. Techniques and vectors for the insertion of foreign
DNA into vaccinia virus are well known in the art, and utilize, for
example homologous recombination. A wide variety of viral promoter
sequences, possibly terminator sequences and poly(A)-addition
sequences, possibly enhancer sequences and possibly amplification
sequences, all required for mammalian expression, are available
in the art. Vaccinia is particularly preferred since vaccinia halts
the expression of host cell proteins. For vaccination of humans,
avipox and Modified Vaccinia virus Ankara (MVA) are particularly useful
vectors.
[0147] Also known are insect expression transfer vectors derived
from baculovirus Autographa californica nuclear polyhedrosis virus
(AcNPV), which is a helper-independent viral expression vector.
Expression vectors derived from this system usually use the strong
viral polyhedrin gene promoter to drive the expression of
heterologous genes. Different vectors as well as methods for the
introduction of heterologous DNA into the desired site of
baculovirus are available to the man skilled in the art for
baculovirus expression. Also different signals for
posttranslational modification recognized by insect cells are known
in the art.
[0148] The present invention also relates to a host cell as defined
above transformed with a recombinant vector as defined above.
[0149] The present invention also relates to a polypeptide having
an amino acid sequence encoded by a rearranged TTV polynucleic acid
as defined above, or a part or an analogue thereof being
substantially similar and biologically equivalent. Preferably, this
polypeptide is encoded by the nucleotide sequence which encodes the
protein containing a signature motif of a mammalian protein.
[0150] The term "polypeptide" refers to a polymer of amino acids
and does not refer to a specific length of the product. Thus,
peptides, oligopeptides, and proteins are included within the
definition of polypeptide. This term also does not refer to or
exclude post-expression modifications of the polypeptide, for
example, glycosylations, acetylations, phosphorylations and the
like. Included within the definition are, for example, polypeptides
containing one or more analogues of an amino acid (including, for
example, unnatural amino acids, peptide nucleic acid (PNA), etc.),
polypeptides with substituted linkages, as well as other
modifications known in the art, both naturally occurring and
non-naturally occurring.
[0151] By "biologically equivalent" as used throughout the
specification and claims, it is meant that the compositions are
immunogenically equivalent to the polypeptides of the invention as
defined above and below.
[0152] By "substantially homologous" as used throughout the
specification and claims to describe polypeptides, it is meant a
degree of homology in the amino acid sequence to the polypeptides
of the invention. Preferably the degree of homology is in excess of
70%, preferably in excess of 80%, with a particularly preferred
group of proteins being in excess of 90% or even 95% homologous
with the polypeptides of the invention.
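The thresholds of paragraph [0152] can be made concrete with a small Python sketch. The specification does not state how the degree of homology is to be computed; the sketch assumes simple percent identity over two ungapped sequences of equal length, and the function name `percent_identity` is hypothetical. The two example sequences are the CD36_RAT and CD36_MOUSE motif entries from the CD36ANTIGEN3 listing reproduced earlier in this document.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two equal-length, ungapped sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


CD36_RAT = "WVFDVQNPEEVAKNSSKIKVIQR"    # CD36ANTIGEN3 motif, CD36_RAT
CD36_MOUSE = "WIFDVQNPDDVAKNSSKIKVKQR"  # CD36ANTIGEN3 motif, CD36_MOUSE

identity = percent_identity(CD36_RAT, CD36_MOUSE)

# Thresholds named in the text: in excess of 70%, 80%, 90% or 95%.
passed = [t for t in (70, 80, 90, 95) if identity > t]
```

These two motifs share 19 of 23 positions, so they exceed the 70% and 80% thresholds but not the 90% threshold under this simplified measure. Sequences of unequal length would first need to be aligned, which this sketch does not attempt.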
[0153] The term "analogue" as used throughout the specification to
describe the polypeptides of the present invention, includes any
polypeptide having an amino acid residue sequence substantially
identical to a sequence specifically shown herein in which one or
more residues have been conservatively substituted with a
biologically equivalent residue. Examples of conservative
substitutions include the substitution of one nonpolar
(hydrophobic) residue such as isoleucine, valine, leucine or
methionine for another, the substitution of one polar
(hydrophilic) residue for another such as between arginine and
lysine, between glutamine and asparagine, between glycine and
serine, the substitution of one basic residue such as lysine,
arginine or histidine for another, or the substitution of one
acidic residue, such as aspartic acid or glutamic acid for
another.
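The examples of conservative substitution listed in paragraph [0153] can be encoded as a simple lookup, sketched below in Python. The groups are taken directly from the examples in the text and are not an exhaustive definition; the names `CONSERVATIVE_GROUPS` and `is_conservative` are assumptions for illustration.

```python
# One-letter amino acid codes grouped according to the examples given in the text.
CONSERVATIVE_GROUPS = [
    set("IVLM"),  # nonpolar (hydrophobic): isoleucine, valine, leucine, methionine
    set("RK"),    # polar: arginine and lysine
    set("QN"),    # polar: glutamine and asparagine
    set("GS"),    # polar: glycine and serine
    set("KRH"),   # basic: lysine, arginine, histidine
    set("DE"),    # acidic: aspartic acid, glutamic acid
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues share at least one of the listed groups."""
    a, b = original.upper(), replacement.upper()
    return a != b and any(a in group and b in group for group in CONSERVATIVE_GROUPS)


examples = {
    ("I", "V"): is_conservative("I", "V"),
    ("K", "H"): is_conservative("K", "H"),
    ("D", "K"): is_conservative("D", "K"),
}
```

Substitution scoring matrices give a graded measure of similarity instead of the yes/no answer returned here.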
[0154] The phrase "conservative substitution" also includes the use
of a chemically derivatized residue in place of a non-derivatized
residue provided that the resulting protein or peptide is
biologically equivalent to the protein or peptide of the
invention.
[0155] "Chemical derivative" refers to a protein or peptide having
one or more residues chemically derivatized by reaction of a
functional side group. Examples of such derivatized molecules
include but are not limited to, those molecules in which free amino
groups have been derivatized to form amine hydrochlorides,
p-toluene sulfonyl groups, carbobenzoxy groups, t-butyloxycarbonyl
groups, chloroacetyl groups or formyl groups. Free carboxyl groups
may be derivatized to form salts, methyl and ethyl esters or other
types of esters or hydrazides. Free hydroxyl groups may be
derivatized to form O-acyl or O-alkyl derivatives. The imidazole
nitrogen of histidine may be derivatized to form
N-im-benzylhistidine. Those proteins or peptides are also included
as chemical derivatives which contain one or more
naturally-occurring amino acid derivatives of the twenty standard
amino acids. For example: 4-hydroxyproline may be substituted for
proline; 5-hydroxylysine may be substituted for lysine;
3-methylhistidine may be substituted for histidine; homoserine may
be substituted for serine; and ornithine may be substituted for
lysine. The polypeptides of the present invention also include any
polypeptide having one or more additions and/or deletions of
residues relative to the sequence of a polypeptide whose sequence
is shown herein, so long as the polypeptide is biologically
equivalent to the polypeptides of the invention.
[0156] The polypeptides according to the present invention
preferably contain at least 3, preferably 4 or 5, or 6 or 7,
contiguous amino acids, more preferably at least 8 contiguous
amino acids, for example at least 10 or at least 15.
[0157] The polypeptides of the invention may be prepared by
classical chemical synthesis. The synthesis may be carried out in
homogeneous solution or in solid phase. For instance, the synthesis
technique in homogeneous solution which may be used is the one
described in Houben-Weyl, "Methoden der
organischen Chemie" (Methods of Organic Chemistry), edited by E.
Wünsch, vol. 15-I and II, Thieme, Stuttgart, 1974.
[0158] The polypeptides of the invention may also be prepared in
solid phase according to for example the methods described by
Atherton and Sheppard in their book entitled "Solid phase peptide
synthesis" (IRL Press, Oxford, 1989).
[0159] The polypeptides according to this invention may also be
prepared by means of recombinant DNA techniques as for example
described by Maniatis et al., Molecular Cloning: A Laboratory
Manual, New York, Cold Spring Harbor Laboratory, 1982.
[0160] The present invention also relates to a method for
production of a recombinant polypeptide as defined above, which may
comprise: (a) transformation of an appropriate cellular host with a
recombinant vector, in which a polynucleic acid or a part thereof
as defined above has been inserted under the control of the
appropriate regulatory elements, (b) culturing said transformed
cellular host under conditions enabling the expression of said
insert, and (c) harvesting said polypeptide.
[0161] The present invention also relates to an antibody raised
upon immunization with at least one polypeptide as defined above,
with said antibody being specifically reactive with any of said
polypeptides, and with said antibody being preferably a monoclonal
antibody. The term "antibody", preferably, relates to antibodies
which consist essentially of pooled monoclonal antibodies with
different epitopic specificities, as well as distinct monoclonal
antibody preparations. Monoclonal antibodies are made from an
antigen containing, e.g., a polypeptide encoded by the TTV
polynucleic acid of the invention or a fragment thereof by methods
well known to those skilled in the art. As used herein, the term
"antibody" (Ab) or "monoclonal antibody" (Mab) is meant to include
intact molecules as well as antibody fragments (such as, for
example, Fab and F(ab')2 fragments) which are capable of
specifically binding to protein. Fab and F(ab')2 fragments lack the
Fc fragment of intact antibody, clear more rapidly from the
circulation, and may have less non-specific tissue binding than an
intact antibody. Thus, these fragments are preferred, as well as
the products of a Fab or other immunoglobulin expression library.
Moreover, antibodies useful for the purposes of the present
invention include chimeric, single chain, and humanized
antibodies.
[0162] Preferably, the antibody or antigen binding fragment thereof
carries a detectable label. The antibody/fragment may be directly
or indirectly detectably labeled, for example, with a radioisotope,
a fluorescent compound, a bioluminescent compound, a
chemiluminescent compound, a metal chelator or an enzyme. Those of
ordinary skill in the art will know of other suitable labels for
binding to the antibody, or will be able to ascertain such, using
routine experimentation.
[0163] The present invention also relates to a diagnostic kit for
use in determining the presence of a TT virus polynucleic acid or
polypeptide of the invention, which kit may comprise a primer,
a probe, and/or an antibody of the invention.
[0164] Alternatively, the present invention also relates to a
method for the detection of a rearranged TTV polynucleic acid
according to the invention present in a biological sample, which
may comprise: (a) optionally extracting sample polynucleic acid,
(b) amplifying the polynucleic acid as described above with at
least one primer as defined above, optionally a labelled primer,
and (c) detecting the amplified polynucleic acids.
[0165] The term "polynucleic acid" may also be referred to as
analyte strand and corresponds to a single- or double-stranded
polynucleic acid molecule.
[0166] The term "labelled" refers to the use of labelled nucleic
acids. This may include the use of labelled nucleotides
incorporated during the polymerase step of the amplification or
labelled primers, or by any other method known to the person
skilled in the art.
[0167] The present invention also relates to a method for the
detection of a rearranged TTV polynucleic acid according to the
invention present in a biological sample, which may comprise: (a)
optionally extracting sample polynucleic acid, (b) hybridizing the
polynucleic acid as described above with at least one probe as
defined above, and (c) detecting the hybridized polynucleic
acids.
[0168] The hybridization and washing conditions are to be
understood as stringent and are generally known in the art (e.g.
Maniatis et al., Molecular Cloning: A Laboratory Manual, New York,
Cold Spring Harbor Laboratory, 1982).
[0169] According to the hybridization solution (SSC, SSPE, etc.),
these probes should be stringently hybridized at their appropriate
temperature in order to attain sufficient specificity. However, by
slightly modifying the DNA probes, either by adding or deleting one
or a few nucleotides at their extremities (either 3' or 5'), or
substituting some non-essential nucleotides (i. e. nucleotides not
essential to discriminate between types) by others (including
modified nucleotides or inosine) these probes or variants thereof
may be caused to hybridize specifically at the same hybridization
conditions (i. e. the same temperature and the same hybridization
solution). Also changing the amount (concentration) of probe used
may be beneficial to obtain more specific hybridization results. It
should be noted in this context, that probes of the same length,
regardless of their GC content, will hybridize specifically at
approximately the same temperature in TMACl (tetramethylammonium
chloride) solutions.
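As a rough illustration of why probe composition matters in conventional salt buffers, the following sketch applies the Wallace rule of thumb (about 2 degrees C per A/T and 4 degrees C per G/C for short oligonucleotides). The rule is a common textbook estimate and is not a method stated in this application; the two probe sequences are hypothetical and were invented only for this example.

```python
def wallace_tm(probe: str) -> int:
    """Estimate the melting temperature (deg C) of a short DNA probe
    with the Wallace rule: 2 * (A + T) + 4 * (G + C)."""
    seq = probe.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("probe must contain only A, C, G and T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 20-mers of equal length but different GC content
# (assumptions for illustration, not probes of the invention).
at_rich_probe = "ATTATAATTTAAGATATTCA"
gc_rich_probe = "GCGGCCGCTGGCACGGCGTC"

for name, probe in (("AT-rich", at_rich_probe), ("GC-rich", gc_rich_probe)):
    print(name, len(probe), "nt, estimated Tm:", wallace_tm(probe), "deg C")
```

Under this estimate, equal-length probes can differ widely in melting temperature in SSC-type buffers, which is the dependence on GC content that the TMACl remark above refers to.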
[0170] Suitable assay methods for purposes of the present invention
to detect hybrids formed between the oligonucleotide probes and the
polynucleic acid sequences in a sample may comprise any of the
assay formats known in the art, such as the conventional dot-blot
format, sandwich hybridization or reverse hybridization. For
example, the detection may be accomplished using a dot blot format,
the unlabeled amplified sample being bound to a membrane, the
membrane being incubated with at least one labelled probe under
suitable hybridization and wash conditions, and the presence of
bound probe being monitored.
[0171] An alternative and preferred method is a "reverse" dot-blot
format, in which the amplified sequence contains a label. In this
format, the unlabeled oligonucleotide probes are bound to a solid
support and exposed to the labelled sample under appropriate
stringent hybridization and subsequent washing conditions. It is to
be understood that also any other assay method which relies on the
formation of a hybrid between the polynucleic acids of the sample
and the oligonucleotide probes according to the present invention
may be used.
[0172] The present invention also relates to a method for detecting
a polypeptide encoded by a rearranged TTV polynucleic acid of the
present invention or an antibody against said polypeptide present
in a biological sample, which may comprise: (a) contacting the
biological sample to be tested for the presence of such polypeptide
or antibody with an antibody or polypeptide, respectively, as
defined above, and (b) detecting the immunological complex
formed between said antibody and said polypeptide.
[0173] The immunoassay methods according to the present invention
may utilize antigens from different domains of the new and unique
polypeptide sequences of the present invention. It is within the
scope of the invention to use for instance single or specific
oligomeric antigens, dimeric antigens, as well as combinations of
single or specific oligomeric antigens. The TTV antigens of the
present invention may be employed in virtually any assay format
that employs a known antigen to detect antibodies. Of course, a
format that denatures the TTV conformational epitope should be
avoided or adapted. A common feature of all of these assays is that
the antigen is contacted with the body component suspected of
containing TTV antibodies under conditions that permit the antigen
to bind to any such antibody present in the component. Such
conditions will typically be physiologic temperature, pH and ionic
strength using an excess of antigen. The incubation of the antigen
with the specimen is followed by detection of immune complexes
comprised of the antigen.
[0174] Design of the immunoassays is subject to a great deal of
variation, and many formats are known in the art. Protocols may,
for example, use solid supports, or immunoprecipitation. Most
assays involve the use of labeled antibody or polypeptide; the
labels may be, for example, enzymatic, fluorescent,
chemiluminescent, radioactive, or dye molecules. Assays which
amplify the signals from the immune complex are also known;
examples of which are assays which utilize biotin and avidin or
streptavidin, and enzyme-labeled and mediated immunoassays, such as
ELISA assays.
[0175] The immunoassay may be in a heterogeneous or in a
homogeneous format, and of a standard or competitive type. In a
heterogeneous format, the polypeptide is typically bound to a solid
matrix or support to facilitate separation of the sample from the
polypeptide after incubation. Examples of solid supports that may
be used are nitrocellulose (e. g., in membrane or microtiter well
form), polyvinyl chloride (e. g., in sheets or microtiter wells),
polystyrene latex (e. g., in beads or microtiter plates),
polyvinylidene fluoride (known as Immunolon), diazotized paper,
nylon membranes, activated beads, and Protein A beads. The solid
support containing the antigenic polypeptides is typically washed
after separating it from the test sample, and prior to detection of
bound antibodies. Both standard and competitive formats are known
in the art.
[0176] In a homogeneous format, the test sample is incubated with
the combination of antigens in solution. For example, it may be
under conditions that will precipitate any antigen-antibody
complexes which are formed. Both standard and competitive formats
for these assays are known in the art.
[0177] In a standard format, the amount of TTV antibodies in the
antibody-antigen complexes is directly monitored. This may be
accomplished by determining whether (labelled) anti-xenogeneic (e.
g. anti-human) antibodies which recognize an epitope on anti-TTV
antibodies will bind due to complex formation. In a competitive
format, the amount of TTV antibodies in the sample is deduced by
monitoring the competitive effect on the binding of a known amount
of labeled antibody (or other competing ligand) in the complex.
[0178] Complexes formed which may comprise anti-TTV antibody (or in
the case of competitive assays, the amount of competing antibody)
are detected by any of a number of known techniques, depending on
the format. For example, unlabeled TTV antibodies in the complex
may be detected using a conjugate of anti-xenogeneic Ig complexed
with a label (e. g. an enzyme label).
[0179] In an immunoprecipitation or agglutination assay format the
reaction between the TTV antigens and the antibody forms a network
that precipitates from the solution or suspension and forms a
visible layer or film of precipitate. If no anti-TTV antibody is
present in the test specimen, no visible precipitate is formed.
[0180] There currently exist three specific types of particle
agglutination (PA) assays. These assays are used for the detection
of antibodies to various antigens when coated to a support. One
type of this assay is the hemagglutination assay using red blood
cells (RBCs) that are sensitized by passively adsorbing antigen (or
antibody) to the RBC. The addition of specific antigen/antibodies
present in the body component, if any, causes the RBCs coated with
the purified antigen to agglutinate.
[0181] To eliminate potential non-specific reactions in the
hemagglutination assay, two artificial carriers may be used instead
of RBC in the PA. The most common of these are latex particles.
[0182] The solid phase selected may include polymeric or glass
beads, nitrocellulose, microparticles, microwells of a reaction
tray, test tubes and magnetic beads. The signal generating compound
may include an enzyme, a luminescent compound, a chromogen, a
radioactive element and a chemiluminescent compound. Examples of
enzymes include alkaline phosphatase, horseradish peroxidase and
beta-galactosidase. Examples of enhancer compounds include biotin,
anti-biotin and avidin. Examples of enhancer compound binding
members include biotin, anti-biotin and avidin.
[0183] The above methods are useful for evaluating the risk of
developing diseases like cancer or an autoimmune disease due to the
deleterious effects of the presence of a (subgenomic) TTV
polynucleotide sequence linked to a particular host gene or gene
fragment within the patient's cells and allow taking appropriate
counter measures.
[0184] The present invention also relates to an antisense
oligonucleotide or iRNA specific for a rearranged TT virus
polynucleic acid of the invention.
[0185] The generation of suitable antisense oligonucleotides or
iRNAs includes determination of a site or sites within the
rearranged TT virus polynucleic acid for the antisense interaction
to occur such that the desired effect, e.g., inhibition of
expression of the polypeptide, will result. A preferred intragenic
site is (a) the region encompassing the translation initiation or
termination codon of the open reading frame (ORF) of the gene or
(b) a region of the mRNA which is a "loop" or "bulge", i.e., not
part of a secondary structure. Once one or more target sites have
been identified, oligonucleotides are chosen which are sufficiently
complementary to the target, i.e., hybridize sufficiently well and
with sufficient specificity, to give the desired effect. In the
context of this invention, "hybridization" means hydrogen bonding,
which may be Watson-Crick, Hoogsteen or reversed Hoogsteen hydrogen
bonding, between complementary nucleoside or nucleotide bases.
"Complementary" as used herein, refers to the capacity for precise
pairing between two nucleotides. For example, if a nucleotide at a
certain position of an oligonucleotide is capable of hydrogen
bonding with a nucleotide at the same position of a DNA or RNA
molecule, then the oligonucleotide and the DNA or RNA are
considered to be complementary to each other at that position. The
oligonucleotide and the DNA or RNA are complementary to each other
when a sufficient number of corresponding positions in each
molecule are occupied by nucleotides which may hydrogen bond with
each other. Thus, "specifically hybridizable" and "complementary"
are terms which are used to indicate a sufficient degree of
complementarity or precise pairing such that stable and specific
binding occurs between the oligonucleotide and the DNA or RNA
target. It is understood in the art that the sequence of an
antisense compound does not need to be 100% complementary to that
of its target nucleic acid to be specifically hybridizable. An
antisense compound is specifically hybridizable when binding of the
compound to the target DNA or RNA molecule interferes with the
normal function of the target DNA or RNA to cause a loss of
utility, and there is a sufficient degree of complementarity to
avoid non-specific binding of the antisense compound to non-target
sequences under conditions in which specific binding is desired,
i.e., in the case of therapeutic treatment.
[0186] "Oligonucleotide" (in the context of antisense compounds)
refers to an oligomer or polymer of ribonucleic acid (RNA) or
deoxyribonucleic acid (DNA) or mimetics thereof. This term includes
oligonucleotides composed of naturally-occurring nucleobases,
sugars and covalent internucleoside (backbone) linkages as well as
oligonucleotides having non-naturally-occurring portions which
function similarly. Such modified or substituted oligonucleotides
are often preferred over native forms because of desirable
properties such as, for example, enhanced cellular uptake, enhanced
affinity for nucleic acid target and increased stability in the
presence of nucleases. While antisense oligonucleotides are a
preferred form of the antisense compound, the present invention
comprehends other oligomeric antisense compounds, including but not
limited to oligonucleotide mimetics such as are described below.
The antisense compounds in accordance with this invention comprise
from about 8 to about 50 nucleobases (i.e. from about 8 to about 50
linked nucleosides). Particularly preferred antisense compounds are
antisense oligonucleotides, even more preferably those which may
comprise from about 15 to about 25 nucleobases. Antisense compounds
include ribozymes, external guide sequences (EGS), oligonucleotides
(oligozymes), and other short catalytic RNAs or catalytic
oligonucleotides which hybridize to the target nucleic acid and
inhibit its expression. The antisense compounds also include an
iRNA which may comprise a sense sequence and an antisense sequence,
wherein the sense and antisense sequences form an RNA duplex and
wherein the antisense sequence may comprise a nucleotide sequence
sufficiently complementary to the nucleotide sequence of the TT
virus polynucleic acid of the present invention.
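The notions of complementarity and target-site selection described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: it considers Watson-Crick pairing only, takes the first AUG in the input as the translation initiation codon, and uses a made-up target sequence; the function names are hypothetical and are not part of the application.

```python
# Watson-Crick partners; the antisense strand is written as DNA.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def antisense_dna(target: str) -> str:
    """Return the DNA antisense strand (5'->3') of an RNA or DNA
    target site given 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(target.upper()))


def complementarity(oligo: str, target_site: str) -> float:
    """Fraction of positions at which the oligo pairs with the target
    site; 1.0 means fully complementary."""
    if len(oligo) != len(target_site):
        raise ValueError("oligo and target site must have equal length")
    perfect = antisense_dna(target_site)
    matches = sum(a == b for a, b in zip(oligo.upper(), perfect))
    return matches / len(oligo)


def candidates_spanning_start_codon(mrna: str, length: int = 20):
    """List (offset, oligo) pairs for every window of the given length
    that fully covers the first AUG of the mRNA."""
    if not 8 <= length <= 50:
        raise ValueError("length outside the 8 to 50 nucleobase range")
    start = mrna.upper().find("AUG")
    if start < 0:
        return []
    first = max(0, start + 3 - length)
    last = min(start, len(mrna) - length)
    return [(i, antisense_dna(mrna[i:i + length]))
            for i in range(first, last + 1)]


# Hypothetical mRNA fragment, invented for illustration only.
mrna = "GGCACCGGAUGGCUUACCGAAGCUUCGGAACC"
for offset, oligo in candidates_spanning_start_codon(mrna, length=20):
    print(offset, oligo)
```

A real design would additionally screen candidates against non-target sequences and accessible mRNA regions, as the text notes; exact pairing is used here only to keep the sketch short.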
[0187] Alternatively, the invention provides a vector allowing
transcription of an antisense oligonucleotide of the invention,
e.g., in a mammalian host. Preferably, such a vector is a vector
useful for
gene therapy. Preferred vectors useful for gene therapy are viral
vectors, e.g. adenovirus, herpes virus, vaccinia, or, more
preferably, an RNA virus such as a retrovirus. Even more
preferably, the retroviral vector is a derivative of a murine or
avian retrovirus. Examples of such retroviral vectors which may be
used in the present invention are: Moloney murine leukemia virus
(MoMuLV), Harvey murine sarcoma virus (HaMuSV), murine mammary
tumor virus (MuMTV) and Rous sarcoma virus (RSV). Most preferably,
a non-human primate retroviral vector is employed, such as the
gibbon ape leukemia virus (GaLV), providing a broader host range
compared to murine vectors. Since recombinant retroviruses are
defective, assistance is required in order to produce infectious
particles. Such assistance may be provided, e.g., by using helper
cell lines that contain plasmids encoding all of the structural
genes of the retrovirus under the control of regulatory sequences
within the LTR. Suitable helper cell lines are well known to those
skilled in the art. Said vectors may additionally contain a gene
encoding a selectable marker so that the transduced cells may be
identified. Moreover, the retroviral vectors may be modified in
such a way that they become target specific. This may be achieved,
e.g., by inserting a polynucleotide encoding a sugar, a glycolipid,
or a protein, preferably an antibody. Those skilled in the art know
additional methods for generating target specific vectors. Further
suitable vectors and methods for in vitro- or in vivo-gene therapy
are described in the literature and are known to the persons
skilled in the art; see, e.g., WO 94/29469 or WO 97/00957.
[0188] In order to achieve expression only in the target organ, the
DNA sequences for transcription of the antisense oligonucleotides
may be linked to a tissue specific promoter and used for gene
therapy. Such promoters are well known to those skilled in the
art.
[0189] Within an oligonucleotide structure, the phosphate groups
are commonly referred to as forming the internucleoside backbone of
the oligonucleotide. The normal linkage or backbone of RNA and DNA
is a 3' to 5' phosphodiester linkage. Specific examples of
preferred antisense compounds useful in the present invention
include oligonucleotides containing modified backbones or
non-natural internucleoside linkages. Oligonucleotides having
modified backbones include those that retain a phosphorus atom in
the backbone and those that do not have a phosphorus atom in the
backbone. Modified oligonucleotide backbones which may result in
increased stability are known to the person skilled in the art;
preferably, such a modification is a phosphorothioate linkage.
[0190] A preferred oligonucleotide mimetic is an oligonucleotide
mimetic that has been shown to have excellent hybridization
properties, and is referred to as a peptide nucleic acid (PNA). In
PNA compounds, the sugar-backbone of an oligonucleotide is replaced
with an amide containing backbone, in particular an
aminoethylglycine backbone. The nucleobases are retained and are
bound directly or indirectly to aza nitrogen atoms of the amide
portion of the backbone.
[0191] Modified oligonucleotides may also contain one or more
substituted or modified sugar moieties. Preferred oligonucleotides
comprise one of the following at the 2' position: OH; F; O-, S-,
or N-alkyl; O-, S-, or N-alkenyl; O-, S- or N-alkynyl; or
O-alkyl-O-alkyl, wherein the alkyl, alkenyl and alkynyl may be
substituted or unsubstituted C1 to C10 alkyl or C2
to C10 alkenyl and alkynyl. A particularly preferred modified
sugar moiety is a 2'-O-methoxyethyl sugar moiety.
[0192] Antisense oligonucleotides of the invention may also include
nucleobase modifications or substitutions. Modified nucleobases
include other synthetic and natural nucleobases such as
5-methylcytosine (5-me-C), 5-hydroxymethyl cytosine, xanthine,
hypoxanthine, 2-aminoadenine, 6-methyl and other alkyl derivatives
of adenine and guanine, 2-propyl and other alkyl derivatives of
adenine and guanine, 2-thiouracil, 2-thiothymine and 2-thiocytosine
etc., with 5-methylcytosine substitutions being preferred since
these modifications have been shown to increase nucleic acid duplex
stability.
[0193] Another modification of the oligonucleotides of the
invention involves chemically linking to the oligonucleotide one or
more moieties or conjugates which enhance the activity, cellular
distribution or cellular uptake of the oligonucleotide. Such
moieties include lipid moieties such as a cholesterol moiety,
cholic acid, a thioether, a thiocholesterol, an aliphatic chain,
e.g., dodecandiol or undecyl residues, a phospholipid, a polyamine
or a polyethylene glycol chain, or adamantane acetic acid, a
palmityl moiety, or an octadecylamine or
hexylamino-carbonyl-oxycholesterol moiety.
[0194] The present invention also includes antisense compounds
which are chimeric compounds. "Chimeric" antisense compounds or
"chimeras," in the context of this invention, are antisense
compounds, particularly oligonucleotides, which contain two or more
chemically distinct regions, each made up of at least one monomer
unit, i.e., a nucleotide in the case of an oligonucleotide
compound. These oligonucleotides typically contain at least one
region wherein the oligonucleotide is modified so as to confer upon
the oligonucleotide increased resistance to nuclease degradation,
increased cellular uptake, and/or increased binding affinity for
the target nucleic acid. An additional region of the
oligonucleotide may serve as a substrate for enzymes capable of
cleaving RNA:DNA or RNA:RNA hybrids. By way of example, RNase H is
a cellular endonuclease which cleaves the RNA strand of an RNA:DNA
duplex. Activation of RNase H, therefore, results in cleavage of
the RNA target, thereby greatly enhancing the efficiency of
oligonucleotide inhibition of gene expression. Consequently,
comparable results may often be obtained with shorter
oligonucleotides when chimeric oligonucleotides are used, compared
to phosphorothioate deoxyoligonucleotides hybridizing to the same
target region. Chimeric antisense compounds of the invention may be
formed as composite structures of two or more oligonucleotides,
modified oligonucleotides, oligonucleosides and/or oligonucleotide
mimetics as described above. Such compounds have also been referred
to in the art as hybrids or gapmers.
[0195] The present invention also relates to a pharmaceutical
composition which may comprise an antibody or antisense
oligonucleotide of the invention and a suitable excipient, diluent
or carrier. Preferably, in a pharmaceutical composition, such
compound as described above is combined with a pharmaceutically
acceptable carrier. "Pharmaceutically acceptable" is meant to
encompass any carrier, which does not interfere with the
effectiveness of the biological activity of the active ingredient
and that is not toxic to the host to which it is administered.
Examples of suitable pharmaceutical carriers are well known in the
art and include phosphate buffered saline solutions, water,
emulsions, such as oil/water emulsions, various types of wetting
agents, sterile solutions etc. Such carriers may be formulated by
conventional methods and the active compound may be administered to
the subject at an effective dose.
[0196] An "effective dose" refers to an amount of the active
ingredient that is sufficient to prevent the disease or to affect
the course and the severity of the disease, leading to the
reduction or remission of such pathology. An "effective dose"
useful for treating and/or preventing these diseases or disorders
may be determined using methods known to one skilled in the
art.
[0197] Administration of the suitable compositions may be effected
by different ways, e.g. by intravenous, intraperitoneal,
subcutaneous, intramuscular, topical or intradermal administration.
The route of administration, of course, depends on the kind of
therapy and the kind of compound contained in the pharmaceutical
composition. The dosage regimen will be determined by the attending
physician and other clinical factors. As is well known in the
medical arts, dosages for any one patient depend on many factors,
including the patient's size, body surface area, age, sex, the
particular compound to be administered, time and route of
administration, the kind of therapy, general health and other drugs
being administered concurrently.
[0198] In a preferred embodiment of the present invention, the
disease that may be prevented/treated is an autoimmune disease (or
an early stage thereof) such as multiple sclerosis (MS) or any
other neurological disease, asthma, polyarthritis, diabetes, lupus
erythematosus, celiac disease, colitis ulcerosa, or Crohn's
disease. The term "autoimmune disease" also may comprise as yet
unknown autoimmune diseases.
[0199] The present invention also provides
[0200] (a) a method for the generation of a database for
determining the risk of developing cancer or an autoimmune disease,
which may comprise the following steps:
[0201] (i) determining the nucleotide sequence of a genomic host
cell DNA linked to rearranged TT virus polynucleic acids according
to the invention and being preferably present in episomal form, if
present, in a sample from a patient suffering from at least one of
said diseases; and
[0202] (ii) compiling sequences determined in step (i) associated
with said diseases in a database; as well as
[0203] (b) a method for evaluating the risk of developing cancer or
an autoimmune disease in a patient suspected of being at risk of
developing such disease, which may comprise the following steps:
[0204] (i) determining the nucleotide sequence of a genomic host
cell DNA linked to a rearranged TT virus polynucleic acid according
to the invention and being preferably present in episomal form, if
present, in a sample from said patient; and
[0205] (ii) comparing sequences determined in step (i) with the
sequences compiled in the database generated according to the
method described above, wherein the absence of a genomic host cell
DNA linked to a TT virus polynucleic acid or the presence only of
host cell DNA linked to a TT virus polynucleic acid not represented
in said database indicates that the risk of developing such disease
is decreased or absent.
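The two methods above amount to building a lookup of disease-associated host sequences and then querying it with sequences from a new sample. The sketch below shows that logic only; it assumes exact string matching of sequences (a real comparison would more plausibly use sequence alignment), and the function names, disease labels and sequences are hypothetical.

```python
from collections import defaultdict


def compile_database(patient_records):
    """Build a mapping from each host sequence found linked to a
    rearranged TTV polynucleic acid to the set of diseases of the
    patients in whom it was observed.

    patient_records: iterable of (disease, [linked host sequences]).
    """
    database = defaultdict(set)
    for disease, linked_sequences in patient_records:
        for sequence in linked_sequences:
            database[sequence.upper()].add(disease)
    return dict(database)


def evaluate_sample(linked_sequences, database):
    """Return the database entries matched by the sample. An empty
    result corresponds to the case described in the text: no linked
    host DNA, or only linked host DNA not represented in the
    database."""
    hits = {}
    for sequence in linked_sequences:
        key = sequence.upper()
        if key in database:
            hits[key] = database[key]
    return hits


# Hypothetical records, invented for illustration only.
records = [
    ("multiple sclerosis", ["ACGTTGCA", "GGATCCAT"]),
    ("cancer", ["GGATCCAT", "TTGACCGA"]),
]
db = compile_database(records)

print(evaluate_sample(["ggatccat", "CCCCAAAA"], db))  # one database hit
print(evaluate_sample([], db))                        # no linked host DNA
```

Keeping the disease labels alongside each sequence lets a hit report which conditions the sequence was previously associated with, rather than only the fact of a match.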
[0206] Finally, the present invention also provides a process for
the in vitro replication and propagation of Torque teno viruses
(TTV), preferably a rearranged TTV according to the present
invention, which may comprise the following steps:
[0207] (a) transfecting linearized TTV DNA into 293TT cells
expressing high levels of SV40 large T antigen, preferably at least
levels as reported in Buck et al. (2004);
[0209] (b) harvesting the cells and isolating cells showing the
presence of TTV DNA;
[0210] (c) culturing the cells obtained in step (b) for at least
three days, preferably at least one week or longer, depending on
experimental conditions and TTV type concerned; and
[0211] (d) harvesting the cells of step (c).
[0212] Although the present invention and its advantages have been
described in detail, it should be understood that various changes,
substitutions and alterations may be made herein without departing
from the spirit and scope of the invention as defined in the
appended claims.
[0213] The present invention will be further illustrated in the
following Examples which are given for illustration purposes only
and are not intended to limit the invention in any way.
EXAMPLE 1
Materials and Methods
[0214] (A) TT Virus Isolation and Characterization
[0215] The isolation of TT virus isolates TTV-HD3a (tth8, accession
no AJ620231) and TTV-HD1a (tth25, acc. no AJ620222) was previously
described (Jelcic et al., 2004). Full-length genomic sequences of
both TTV-HD3a and TTV-HD1a were cloned into the vector pUC18 using
restriction enzymes SalI (Leppik et al., 2007) and EcoRI,
respectively. Additional TTV sequences were identified in human
samples by DNA nested amplification using primers NG472/NG352 and
NG473/NG351 as previously described (Peng et al., 2002; Leppik et
al., 2007). The limited availability of DNA for a number of biopsy
and serum samples required prior amplification using rolling circle
amplification with a TempliPhi Kit (GE Healthcare). All amplified
products were cloned and sequenced (Leppik et al., 2007). Samples
harbouring TT virus DNA were subsequently subjected to long
distance-PCR amplification using TaKaRa LA Taq enzyme (TAKARA BIO
INC., Japan) and respective primers which had been designed based
on the initially identified TTV DNA sequences. These back-to-back
primers included the following combinations: tth25-1s and
tth25-2as, jt34f-1s and jt34f-2as, jt34f-7s and jt34f-8as, jt34f-5s
and jt34f-6as, tth4-1s and tth4-2as, t3pb-1s and t3pb-2as, as well
as tth8-1s and tth-2as (Table 2). Long-PCR amplification was
performed using a touchdown stepwise reaction as described
previously (Leppik et al., 2007) with the exception of primer
combinations t3pb-1/2, jt34f-5/6 and tth4. PCR conditions for PCR
amplification with t3pb-1/2 and jt34f-5/6-primers were an initial
denaturation at 94 °C for 1 min, followed by 30 cycles of
94 °C for 30 sec, annealing at 65 °C for 1 min and
elongation at 72 °C for 4 min with a final elongation at
72 °C for 10 min. PCR conditions for amplification with
tth4 primers were similar except that annealing was performed at
68 °C. All obtained amplicons in the range of 3.8 kb were
eluted and purified after gel electrophoresis, cloned into vector
pCR2.1 (TA-Cloning-Kit, Invitrogen) and propagated in NovaBlue
Singles Competent Cells (Merck Chemicals, UK). All full-length
genomes were sequenced through both strands. A total of 53
full-length genomes was obtained.
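The cycling conditions stated above for the t3pb-1/2, jt34f-5/6 and tth4 primer combinations can be written down as a small data structure, which makes the two variants easy to compare. This is a sketch for illustration: the class and function names are hypothetical, and ramp times between temperatures are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


def long_pcr_program(anneal_c: int = 65, cycles: int = 30):
    """Return (initial step, per-cycle steps, cycle count, final step)
    for the long-PCR conditions given in the text."""
    initial = Step("initial denaturation", 94, 60)
    cycle = [
        Step("denaturation", 94, 30),
        Step("annealing", anneal_c, 60),
        Step("elongation", 72, 240),
    ]
    final = Step("final elongation", 72, 600)
    return initial, cycle, cycles, final


def programmed_seconds(program) -> int:
    """Sum of all programmed hold times, excluding ramping."""
    initial, cycle, cycles, final = program
    return (initial.seconds
            + cycles * sum(step.seconds for step in cycle)
            + final.seconds)


# Annealing at 65 deg C for t3pb-1/2 and jt34f-5/6; 68 deg C for tth4.
t3pb_jt34f_program = long_pcr_program(anneal_c=65)
tth4_program = long_pcr_program(anneal_c=68)

print(programmed_seconds(t3pb_jt34f_program) / 60, "min of hold time")
```

Summing the hold times gives 176 min per run for either variant, since the tth4 program differs only in annealing temperature.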
[0216] (B) Sequence Analyses and Phylogeny
[0217] DNA sequences were compared to TTV sequences available in
all databanks using the HUSAR software package (Jelcic et al.,
2004). The ICTV recently classified TT viruses into the family
Anelloviridae based on the DNA sequence of large open reading frame
1 (ORF1) (Biagini and de Micco, 2010). Characterizing the genomes
of the isolates obtained revealed rearrangement of sequences in the
ORF1 region. The full-length genomes of the genus Alphatorquevirus
and the isolates were therefore subjected to phylogenetic analyses
as previously described (Jelcic et al., 2004). The phylogenetic
tree (FIG. 4) was displayed using the Treeview program of the
University of Glasgow. Translated ORFs were analyzed for homologous
proteins and functional domains by using ProtSweep (del Val et al.,
2004).
[0218] (C) Cell Culture and Transfection
[0219] The human embryonic kidney cell line 293TT (Buck et al.,
2004) was maintained in DMEM supplemented with 10% fetal calf
serum, 1% Glutamax, 1% non-essential amino acids (both Invitrogen,
Karlsruhe, Germany) and 400 µg/ml Hygromycin B (Roche
Diagnostics, Mannheim). Linearized virus DNA (2 µg per well on
6-well plates) was transfected into cells grown without Hygromycin
B using Lipofectamine reagent (Invitrogen) according to the
manufacturer's instructions (Fei et al., 2005). Culture medium (2
ml) was supplemented with 800 µl Opti-MEM prior to incubation
for 4 hours at 37 °C. Transfected cultures were subsequently
incubated with fresh medium containing Hygromycin B and propagated
when confluency was reached. Full-length genomes of 12 TTV isolates
were transfected, maintained and harvested in parallel at all
times. TT virus genomes included TTV-HD14a, TTV-HD14b, TTV-HD14c,
TTV-HD14e, TTV-HD15a, TTV-HD16a, TTV-HD20a, TTV-HD3a, TTV-HD1a,
TTV-HD23a, TTV-HD23b and TTV-HD23d (Table 3).
TABLE-US-00003 TABLE 3
TT full-length genomes (3.8 kb)   subviral genomes
tth25           HD1a              µTTV-HD1 - zpr9.B1.6 (621 nt)
tth3            HD1b
tth9            HD1c
tth16           HD1d
tth17           HD1e
tth26           HD1f
tth27           HD1g
tth31           HD1h
tth5            HD2a
tth14           HD2b
tth29           HD2c
tth8            HD3a
tth7            HD3b
tth13           HD3c
tth19           HD3d
tth22g4         HD3e
tth23           HD3f
tth4            HD4
tth10           HD5a
tth11g2         HD5b
tth18           HD5c
tth21           HD5d
tth6            HD6a
tth20           HD6b
tt32c2          HD7
tt32b8          HD8
sle1957         HD9
sle1931         HD10a
sle1932         HD10b
sle2045         HD10c
sle2037         HD11
sle2065         HD12a
sle2057         HD12b
sle2058         HD12c
sle2061         HD12d
sle2072         HD12e
gB20.33         HD13a
gB20.58         HD13b
gB21.51         HD13c
gbDhDi33.32     HD14a             µTTV-HD14.1 - zpr4.B5.20 (719 nt)
gbCuCv33.2      HD14b             µTTV-HD14.2 - zpr4.B6.125 (1224 nt)
gbDhDi33.31     HD14c
gbDhDi33.33     HD14d
gbDhDi33.35     HD14e
gbDhDi32.36     HD14f
gbDfDg33.45     HD14g
gbDfDg33.48     HD14h
gbDfDg33.49     HD14i
gbCsCt38.1      HD15b
gbCsCt38.2      HD15a             µTTV-HD15 - zpr5.B4.12 (913 nt)
gbCsCt38.4      HD15c
gbCsCt38.6      HD15d
gbCsCt43.2      HD16a
gbCsCt43.1      HD16b
gbCsCt43.3      HD16c
gbCsCt43.5      HD16d
gbCsCt43.6      HD16e
gbCuCv43.1      HD16f
gbCuCv43.4      HD16g
gbDhDi43.1      HD16h
gbDhDi43.4      HD16i
gbDhDi43.6      HD16j
gbDhDi43.7      HD16k
gbDhDi43.22     HD16l
uro702          HD17
uro703          HD18a
uro705          HD18b
rheu242         HD19
uro960          HD20a
uro742          HD20b
uro745          HD20c
uro746          HD20d
uro953          HD20e
uro958          HD20f
rheu111         HD21
rheu112         HD22
rheu215         HD23a
rheu210         HD23b             µTTV-HD23.1 - zpr12.B2.22 (401 nt)
rheu211         HD23c             µTTV-HD23.2 - zpr12.B5.24 (642 nt)
rheu212         HD23d
rheu213         HD23e
rheu214         HD23f
rheu231         HD24b
rheu232         HD24a
rheu234         HD24c
rheu236         HD24d
rheu238         HD24e
rheu241         HD24f
[0220] Virus DNA was released from the vector prior to
transfection. Controls included transfection with vector alone and
cells transfected with 1× TE. Transfected cells and culture
medium were frozen at -80 °C and samples for DNA and RNA
extraction taken at each time point during propagation. DNA was
extracted with phenol-chloroform-isoamylalcohol and RNA using the
RNeasy Mini Kit (Qiagen, Hilden, Germany). Replication of virus DNA
was monitored and demonstrated by long-PCR amplification as
described above. All transfection experiments were performed 3
times, with 6-week intervals between primary transfections. Frozen
cells or purified virus preparations were passaged 4 to 6
times.
[0221] (D) Virus Propagation, Purification and
Electronmicroscopy
[0222] Transfected cells were harvested from flasks by shaking
followed by centrifugation for 10 min at 200 g. Cell pellets were
resuspended in DPBS-Mg (Invitrogen) and separated on 27-33-39%
Optiprep (Sigma, St. Louis, Mo.) step gradients for 3.5 hr at
234,000 g (Buck et al., 2005). Gradients were fractionated and
screened for the presence of virus DNA by gel electrophoresis of
lysed aliquots. Aliquots were lysed with proteinase K, 0.25 mM EDTA
and 0.5% SDS for 10 min at 56.degree. C. immediately prior to
loading onto the gel. The supernatant of the re-suspended cells
was alternatively filtered through a 0.22 .mu.m filter. Aliquots
of gradient fractions, as well as filtered supernatants were frozen
at -80.degree. C. for use as inoculum. Filtered aliquots were
pelleted. Pellets were subjected to negative staining and
visualized by electron microscopy. Cloned subviral .mu.TTV genomes
were transfected into 293TT in the same way as the full-length
genomes. The cultures were propagated over several weeks. Cells
were partially removed by scraping off part of the monolayer cells
while allowing outgrowth of the remaining cells. Removed cells were
pelleted and the supernatant was filtered through a 0.22 .mu.m filter
before visualization by electron microscopy. Cell pellets were
treated as described above prior to centrifugation and separation
through Optiprep gradients. Aliquots were lysed and the DNA
visualized after gel electrophoresis.
[0223] (E) Transcription Analyses
[0224] Transcripts of TTV-HD full-length genomes were analysed
using two different approaches. 5'- and 3'-RACE products were
generated from single- as well as double-stranded cDNA.
Single-stranded 5'-RACE-Ready and 3'-RACE-Ready cDNAs were
respectively synthesized from 1 .mu.g purified total RNA in a 10
.mu.l reaction mix using the SMARTer.TM. RACE cDNA Amplification
Kit (Clontech cat #634923) in which RNA is reverse transcribed by
SMARTScribe.TM. Reverse Transcriptase at 42.degree. C. for 90 min.
3'RACE-CDS primer A was used for the synthesis of 3'RACE-Ready
cDNA, whereas the 5'RACE-CDS primer A and SMARTer IIA
oligonucleotide were used for the synthesis of 5'-RACE-Ready cDNA.
Double-stranded cDNA was concomitantly synthesized. Here
full-length single stranded cDNA was initially synthesized using
the SMARTer.TM. PCR cDNA Synthesis Kit (Clontech cat #634925)
according to the manufacturer's protocol. Purified total RNA (1
.mu.g) was transcribed using SMARTScribe.TM. Reverse Transcriptase
and primers 3'SMART CDS PrimerIIA and SMARTer IIA Oligonucleotide.
These primers both contain a non-template nucleotide stretch
thereby creating an extended template. Second-strand cDNA
amplification was obtained by long distance PCR amplification (LD
PCR) with 5'PCR Primer IIA and the Advantage 2 polymerase mix
(Clontech cat #639201). PCR amplification was performed as follows:
15 sec at 95.degree. C., 30 sec at 65.degree. C. and 3 min at
68.degree. C. per cycle, with the number of cycles varied in order to
determine optimal conditions.
[0225] 5'- and 3'-RACE PCR amplification was performed using
5'-RACE-Ready or 3'-RACE-Ready cDNA, respectively, or
double-stranded cDNA template in both cases. RACE-PCR was performed
using Advantage 2 polymerase mix, a universal primer A mix (UPM)
from the SMARTer.TM. RACE cDNA Amplification Kit and forward and
reverse primers fitting to the respective TTV types (Table 4).
TABLE-US-00004 TABLE 4 Nucleotide positions of primers used for PCR
amplification in RACE
TTV          primer   Nucleotide number   transcript
TTV-HD14b    1-f1     716-743             +
             1-f3     2886-2912           +
             1-r1     757-730             +
             1-r2     3521-3492           +
TTV-HD14c    2-f1     716-743             +
             2-f2     3054-3082           +
             2-r3     2912-2885           +
TTV-HD14a    3-f1     717-744             +
             3-f2     2890-2917           +
             3-f3     3496-3521           +
             3-r1     745-720             -
             3-r2     2914-2887           +
TTV-HD14e    4-f1     2887-2914           +
             4-f2     3494-3519           +
             4-f3     3053-3080           +
             4-r1     757-730             +
             4-r2     2911-2884           +
TTV-HD15a    5-f1     125-149             +
             5-f2     2807-2834           +
             5-f3     3388-3415           +
             5-r1     224-197             +
             5-r2     3014-2987           +
             5-r3     3425-3398           -
TTV-HD16a    6-f1     100-127             +
             6-f2     3145-3172           -
             6-f3     3564-3591           -
             6-r1     3204-3182           +
             6-r2     3443-3418           -
TTV-HD20a    7-f1     314-341             +
             7-f2     3025-3052           +
             7-r1     227-200             +
             7-r2     743-716             +
             7-r3     3332-3305           -
TTV-HD23b    10-f1    113-139             -
             10-f3    3121-3148           +
TTV-HD23d    11-f1    126-148             +
             11-f2    354-381             +
             11-f3    3397-3422           -
             11-r1    226-199             +
             11-r2    3653-3626           +
             11-r3    3327-3302           +
TTV-HD23a    12-f1    126-148             +
             12-f2    354-381             +
             12-r2    3177-3150           +
             12-r3    3326-3301           +
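As an illustration only, the nucleotide positions in Table 4 can be turned into primer sequences once a genome sequence is at hand. The minimal Python sketch below assumes that positions are 1-based and inclusive, and that a range written high-to-low (e.g. 757-730) denotes a reverse primer, i.e. the reverse complement of that stretch; neither convention is stated in the text. The helper names and the toy sequence are hypothetical and do not represent a TTV genome.

```python
# Hypothetical helper: derive a primer sequence from genome coordinates.
# Assumptions (not stated in the source): coordinates are 1-based and
# inclusive; a high-to-low range marks a reverse primer.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_from_coords(genome: str, start: int, end: int) -> str:
    """Extract the primer covering start..end (1-based, inclusive)."""
    lo, hi = min(start, end), max(start, end)
    if lo < 1 or hi > len(genome):
        raise ValueError("coordinates lie outside the genome")
    stretch = genome[lo - 1:hi]
    # Forward primers read as written; reverse primers are reverse-complemented.
    return stretch if start <= end else reverse_complement(stretch)


toy_genome = "ATGCGTACCGGTTAGC"  # made-up sequence for demonstration only
forward = primer_from_coords(toy_genome, 3, 8)
reverse = primer_from_coords(toy_genome, 8, 3)
print(forward, reverse)
```

Under these assumptions a range such as 716-743 yields a 28-nucleotide primer, which is consistent with the primer lengths implied by the table.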
[0226] Conditions for amplification were: 29 cycles of 30 sec at
94.degree. C., annealing for 30 sec at 68.degree. C. and elongation
for 3 min at 72.degree. C., with a final extension for 15 min at
72.degree. C. All products were analysed by gel electrophoresis,
purified after gel elution, cloned into vector pCR2.1 (Invitrogen
cat #K2020-40) and sequenced. Two additional controls were
performed in order to control for non-specific amplification. In
one control amplification was performed using only one TTV-specific
primer and in the second using the UPM primer alone. No products
were detected in either of these.
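For orientation, the programmed hold time of the cycling conditions given above can be computed directly. The following Python sketch is a simple arithmetic aid, not part of the described method; it ignores ramp times and any initial denaturation step, which the text does not specify, so the result is a lower bound on the actual run time. The function name is hypothetical.

```python
def programmed_seconds(cycles, step_seconds, final_extension_seconds=0):
    """Sum of programmed hold times; ramping between steps is ignored."""
    return cycles * sum(step_seconds) + final_extension_seconds


# RACE-PCR: 29 cycles of 30 sec / 30 sec / 3 min, final extension 15 min.
race_total = programmed_seconds(29, (30, 30, 180), final_extension_seconds=15 * 60)

# LD PCR: 15 sec / 30 sec / 3 min per cycle (cycle number was varied).
ld_per_cycle = programmed_seconds(1, (15, 30, 180))

print(race_total, race_total / 60, ld_per_cycle)
```

With these inputs the RACE-PCR program amounts to 7860 seconds (131 minutes) of hold time, and each LD PCR cycle to 225 seconds.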
EXAMPLE 2
[0227] Demonstration of the Persistence of TTV DNA in Cells from
Tissue Culture Lines Derived from Malignant Tumors
[0228] Cell lines derived from malignant tumors possess one
advantage over primary tumor biopsy material. They commonly
represent pure preparations of cancer cells, whereas primary
materials are commonly contaminated by normal mesenchymal cells, by
cells of the hematopoietic system and normal epithelial cells. On
the other hand, one disadvantage of tissue culture lines may arise
from the selection of specific clones growing under tissue culture
conditions and the acquisition of secondary genetic modifications
in the course of long-term cultivation. In addition, fetal calf
sera may pose a risk due to the introduction of cattle viruses
which survive serum inactivation procedures (e.g. bovine
polyomavirus); see Table 5 summarizing these
advantages/disadvantages.
TABLE-US-00005 TABLE 5 Analysis of primary tumor biopsies vs
established cell lines for TTV-related sequences
Biopsies
  Advantage:     Authentic materials
  Disadvantage:  Contaminated by admixture of normal cells
                 Search for TTV sequences clouded by the uniform
                 presence of TTV in the peripheral blood
                 Availability limited
Cell lines
  Advantage:     Pure preparations of cancer cells
                 Available in unlimited amounts
  Disadvantage:  Selection of specific clones adapted to tissue
                 culture conditions
                 Secondary genetic changes during long-term cultivation
                 Use of fetal calf serum poses the risk of
                 contaminations with cattle viruses
[0229] Attempts to find TTV DNA in human primary tumor materials
suffer from one disadvantage: the plurality of TTV genotypes in
human material. This renders it virtually impossible to identify a
specific genotype as an etiologic agent for a human cancer type.
For these reasons studies on the persistence of TTV DNA sequences
in cells derived from cancer tissue culture lines were initiated.
Thus far the results have been extremely surprising: PCR primers
used to detect regions of the TTV large open reading frame have
been entirely unsuccessful. However, other primer combinations,
targeting exclusively a short GC-rich regulatory region of the
TTV genome of about 71 bases, detected this sequence in a larger
number of cell lines (FIG. 1). This regulatory region is highly
conserved among different TTV genotypes and is not present in the
human genome data bank.
[0230] In a first series of experiments the same sequence was
discovered in a number of additional cell lines. These included the
following lines:
[0231] MCF7 (breast cancer line);
[0232] HAK-1, KMH-2, L1236 (all Epstein-Barr virus negative Hodgkin's
lymphoma lines);
[0233] Y69 (Epstein-Barr virus negative B-lymphoma);
[0234] HSB-2 (acute lymphocytic leukemia);
[0235] P3HR-1 (Epstein-Barr virus-positive Burkitt's lymphoma);
[0236] BJAB (Epstein-Barr virus negative Burkitt's lymphoma);
[0237] Ng (EBV-immortalized B lymphoblasts from a patient with
multiple sclerosis).
[0238] In contrast to these 9 positive lines, two melanoma cell lines
(IGL and KR, FIG. 1) and human placenta DNA were negative in initial
experiments. Interestingly, after removal of spooled DNA from L1236
cells and RNase treatment of the remaining solution, two faint bands
of similar size became visible in addition to mitochondrial DNA,
banding between positions 4.3 and 6.6 kb (double-stranded DNA size
marker) in the agarose gels (FIG. 2). Analysis of these sequences
revealed again the presence of the TTV regulatory region. Mung-bean
nuclease, digesting selectively single-stranded DNA, completely
abolished the cellular DNA-containing bands from four multiple
sclerosis biopsies in contrast to double-stranded control DNA,
underlining the single-stranded nature of the former. Similar
studies are presently being conducted for isolates from tumor DNA.
EXAMPLE 3
Analyses of Chimeric TTV/Truncated Host Cell DNA Sequences
[0239] Initially, all attempts to use primers in outward
orientation, starting within the regulatory region, in order to find
flanking TT viral DNA surrounding this region failed. Instead,
human cellular DNA was invariably demonstrated in the respective
clones (FIG. 3).
[0240] The human genes in these clones and their arrangements
within the single-stranded episomal DNA, obviously controlled by
the TTV 71 base region, are presently being analyzed. The available
data indicate a substantial variation in the uptake of commonly
truncated host cell genes. Their possible conversion into
growth-stimulating oncogenes or into functions interfering with
tumor suppressor genes requires functional tests, which are presently
being performed.
[0241] The same applies to rearranged TTV virus sequences. Some
of the available data are presented in FIGS. 7, 8, 9, and 11 to
13.
EXAMPLE 4
Identification and Characterization of TTV Genomes
[0242] Initial amplification of the short conserved GC-rich region
of TT viruses in serum and biopsy samples led to the identification
of TTV DNA in the majority of cases. Subsequent amplification of
the complete genome is necessary to identify specific TTV types as
many share exact DNA homology in the amplified 72 bp lying in the
control region, but differ as much as 60-80% in sequence identity
in the rest of their genomes. A number of back-to-back primer
combinations was designed on sequences obtained during the course
of the investigations (Table 2). Long distance PCR amplification
was performed on TTV DNA positive samples. Amplicons ranging
from 3 to 4 kb were cloned and sequenced. TTV DNA positive
samples originated from healthy subjects as well as patients with
leukaemia, multiple sclerosis, rheumatoid arthritis and kidney
disease. Part of these data has previously been described (Leppik
et al., 2007; Sospedra et al., 2005; de Villiers et al., 2009).
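The reason back-to-back primers yield a full-length amplicon on a circular template can be made explicit with a small calculation. The Python sketch below is illustrative only: it assumes a circular genome of length L, a forward primer whose 5' end lies at position f and extends towards increasing coordinates, and a reverse primer whose 5' end lies at position r and extends towards decreasing coordinates (all 1-based). The function name and the example positions are hypothetical; the 3800-nucleotide length merely approximates the 3.8 kb genome size mentioned elsewhere in the text.

```python
def circular_amplicon_length(genome_length, forward_5prime, reverse_5prime):
    """Length of the product spanning forward_5prime -> reverse_5prime,
    walking in the direction of increasing coordinates on a circle."""
    if not (1 <= forward_5prime <= genome_length
            and 1 <= reverse_5prime <= genome_length):
        raise ValueError("primer positions must lie within the genome")
    return (reverse_5prime - forward_5prime) % genome_length + 1


# Conventional, inward-facing pair: only the stretch between the primers.
inward = circular_amplicon_length(3800, 100, 600)

# Back-to-back pair: 5' ends on adjacent positions, pointing away from
# each other, so extension runs around the whole circle.
back_to_back = circular_amplicon_length(3800, 101, 100)

print(inward, back_to_back)
```

In this toy setting the inward-facing pair gives a 501-nucleotide product, whereas the back-to-back pair recovers all 3800 nucleotides, in line with the 3 to 4 kb amplicons described above.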
[0243] A total of 53 full-length DNA genomes were characterized. As
many as 12 distinct full-length isolates were identified after
sequencing 19 genomes from a single biopsy. The genome organization
of different isolates of one TTV type varied despite low diversity
of nucleotides (ranging from 1-4%). Although the large open reading
frame ORF1 was mainly involved, differences within the noncoding
region and other genes were also noted. These data confirmed
earlier observations (Jelcic et al., 2004; Leppik et al., 2007; de
Villiers et al., 2009). Modifications in the ORF1 included
premature stop codons leading to separate smaller ORFs in this
region, considerable sequence diversity in the hypervariable region
(Nishizawa et al., 1999; Jelcic et al., 2004) or absence of a stop
codon resulting in a larger ORF1 than present in the prototype
(FIG. 16). The official classification of the family Anelloviridae
is based on comparisons of the ORF1 DNA sequences (Biagini and de
Micco, 2010). Due to the ORF1 modifications in the isolates
obtained, the full-length genomic sequences were included in the
phylogenetic analyses presented here. The aim of this analysis was
to gain an overview of the TTV-HD isolates in relation to
established TTV species (FIG. 18). All previous isolates are
included in this tree as well (Jelcic et al., 2004; Leppik et al.,
2007; de Villiers et al., 2009).
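How a premature stop codon splits one open reading frame into two smaller ones can be shown with a minimal ORF scan. The Python sketch below is a simplified illustration and not the analysis pipeline used for the isolates: it scans a single reading frame of a linear sequence for ATG-to-stop stretches using the standard stop codons, ignores genome circularity, and does not report ORFs that lack a stop codon. The toy sequences are invented for demonstration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def orf_lengths_in_frame(seq, frame=0):
    """Return lengths (in amino acids) of ATG-to-stop ORFs in one frame."""
    seq = seq.upper()
    lengths = []
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i                             # open a new ORF
        elif start is not None and codon in STOP_CODONS:
            lengths.append((i - start) // 3)      # codons before the stop
            start = None                          # close it
    return lengths


# Toy "intact" ORF: ATG GCT GCT GCT ATG GCT GCT TAA
intact = "ATGGCTGCTGCTATGGCTGCTTAA"
# Same sequence with the fourth codon mutated to a stop (GCT -> TAG).
interrupted = "ATGGCTGCTTAGATGGCTGCTTAA"

print(orf_lengths_in_frame(intact), orf_lengths_in_frame(interrupted))
```

The intact toy sequence yields a single ORF of 7 codons, while the interrupted one yields two ORFs of 3 codons each, the second starting at the next in-frame ATG.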
EXAMPLE 5
In Vitro Replication of TTV-HD
[0244] Attempts to associate torque teno virus infection with the
pathogenesis of a specific disease have repeatedly been reported in
the past. Samples from a large range of diseases have been
analysed. In vitro investigations were hampered by unsuccessful
attempts to identify a cell culture system in which these viruses
may readily be propagated over longer time periods. Virus particles
were initially characterized with the help of density gradients and
immunoglobulin aggregates (reviewed in Okamoto, 2009) and later
visualized from sera and feces (Itoh et al., 2000). Torque teno
viruses occur predominantly in cells of the hematopoietic system
(Okamoto, 2009). The first isolates were obtained from the spleen
of a patient with Hodgkin's lymphoma (Jelcic et al., 2004).
Therefore, the L428 cell line was used in initial attempts to
demonstrate in vitro replication and transcription of TTV-HD3a.
Replication of the full-length genome for up to 7 days after
transfection of the linearized virus DNA was achieved (Leppik et
al., 2007). In order to extend this period of replication,
full-length TTV genomes were transfected into the human embryonic
kidney cell line 293TT which was engineered to express high levels
of SV40 large T antigen (Buck et al., 2004). Secondly, it was
decided to include 12 full-length isolates in this study in order
to determine whether 1) variations in the ORF1 would influence
replication and formation of virus particles and 2) divergent TTV
types vary in their mode of replication. Great care was taken in
propagating all 12 isolates in parallel in order to exclude
variation as far as possible which may occur during handling.
[0245] The following isolates were chosen for transfection and
propagation: TTV-HD3a (Leppik et al., 2007) and TTV-HD1a (Jelcic et
al., 2004). TTV-HD1a is most closely related to species TTV3 (hel32) and
TTV-HD3a to species TTV12 (ct44f) (FIG. 4). TTV-HD16a (species
TTV22-related), TTV-HD15a (species TTV12-related), TTV-HD14a,
TTV-HD14b, TTV-HD14c and TTV-HD14e (species TTV29-related) were all
isolated from brain biopsies from patients with multiple sclerosis.
TTV-HD20a (species TTV13-related) originated from kidney tissue and
TTV-HD23a, TTV-HD23b and TTV-HD23d (species TTV3-related) were
amplified from serum taken from patients with rheumatoid arthritis.
The sequences of TTV-HD14a, TTV-HD14b, TTV-HD14c and TTV-HD14e vary
by 1-2% across their full-length genomes. The prototype is
TTV-HD14a with an intact ORF1 of 648 amino acids (aa) in size. The
ORF1 of TTV-HD14b is 660aa in size with only 554aa sharing identity
to TTV-HD14a ORF1, whereas the rest of the ORF indicates fusion to
ORF4 (after de Schmidt and Noteborn, 2009). Similarly, TTV-HD14c
ORF1 is 712aa and constitutes an ORF1 (first 645aa) fused to ORF5.
TTV-HD14e ORF1 is interrupted resulting in 2 ORFs of 467aa and
179aa in size. The TTV-HD23b, TTV-HD23d and TTV-HD23a genomes vary
by only 1-3% in sequence, but their ORF1 genes differ
as follows: TTV-HD23a ORF1 as prototype is 736aa in size; the TTV-HD23b
ORF1 DNA sequence varies from that of TTV-HD23a in the
hypervariable region by 18.4% (34.2% in amino acids). TTV-HD23b and
TTV-HD23d DNA sequences differ by only 1% overall, but the
TTV-HD23d ORF1 is interrupted, resulting in 2 ORFs of 307aa and 365aa
in size (FIG. 16).
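Percentage differences of the kind quoted above are typically derived from aligned sequences. The text does not state how the values were calculated, so the following Python sketch is only one plausible convention: it takes two pre-aligned sequences of equal length, skips columns that are gaps in both, and counts every remaining mismatch (including a gap against a base) as a difference. The function name and example sequences are hypothetical.

```python
def percent_difference(aligned_a, aligned_b):
    """Percentage of differing columns between two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column uninformative for this pair
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * mismatches / compared


# Toy alignment: one substitution in ten columns.
difference = percent_difference("ACGTACGTAC", "ACGTACGTAT")
print(difference)
```

Other conventions, for example excluding gapped columns altogether, would give different values for the same alignment, so the chosen rule should be stated alongside any reported percentage.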
[0246] Transfections were performed on semi-confluent 293TT cells.
The nature of this cell line with its many rounded cells attached
to the monolayer does not permit a clear-cut identification of
cytopathic effects. Cells were passaged when confluent or when
cells started to detach from the surface. Flasks were shaken to
loosen all cells. Cells were centrifuged and aliquots frozen, as
well as used for DNA and RNA extraction and electron microscopic
analyses. Frozen infected cells were initially used to re-infect
new 293TT cultures as re-infection failed if cells had previously
been trypsinized at the time of harvest. Virus replication was
monitored by performing long-distance PCR on DNA extracted from
infected cells. Periods between re-infection and cell harvest
varied between 3 to 7 days, depending on culture density. No
obvious morphological differences were noted between cultures of
different TTV isolates. Re-infection during the course of one
experiment was performed several times using frozen cell aliquots.
In vitro propagation of TT viruses has not been described
before. Restriction enzyme digestion was performed on cellular DNA
obtained from the initially transfected samples to remove any
residual bacteria-generated virus DNA. Long PCR amplification
results indicated de novo replication of virus DNA. Examples of
these TTV DNA amplicons using infected cellular DNA as template are
presented in FIG. 19.
[0247] Long distance PCR amplification of the full-length DNA
molecules indicated considerable differences between cultures.
Second round amplifications (using the same primers as in the first
round) were necessary on all cultures infected with isolates from
brain biopsies, i.e. TTV-HD16a, HD15a and the 4 individual TTV-HD14
isolates (FIG. 19A), despite their divergence (45-50% nucleotide
homology) according to the phylogenetic analyses (FIG. 18).
Modifications in ORF1 did not seem to influence amplification or
propagation as visualized in the amplification of the full-length
DNA (FIG. 21A a-c). Additional DNA amplicons varying in size were
observed in HD15a-infected cultures. The occurrence of these
molecules increased during subsequent propagation with a
concomitant reduction in the full-length genome (FIG. 21A a-c lane
5). Applicants previously reported subviral molecules of a similar
nature in human serum samples (Leppik et al., 2007). Similar
off-sized amplicons were also occasionally noted in
TTV-HD16a-infected cultures (lane 6) and rarely in TTV-HD14
cultures (lanes 1-4).
[0248] Large differences were noted in the behaviour of the other 6
isolates. This variation was also evident between experiments and
passages (FIG. 19B b1, b2, b3) reflecting an apparent high
sensitivity to very minor modifications in culturing conditions.
The initially replicating full-length genome (3.8 kb) was lost
during propagation (FIG. 19B a-c) in concurrence with prominent
subgenomic amplicons ranging in size in TTV-HD20a-, TTV-HD3a- and
TTV-HD1a-infected cells (lanes 7-9, FIG. 19B). Amounts of input DNA
used for long-distance PCR amplification, as well as of amplicons
loaded onto gels were the same for all cultures. The high level of
DNA amplicons of isolates TTV-HD23b, TTV-HD23d and TTV-HD23a after
a single round of long-distance PCR may therefore indicate a
stronger replication potential during early passages.
[0249] Due to the differences observed between the two groups of
isolates, it was investigated whether variations could be observed
during serial sampling. Equivalent passages of TTV-HD14e and
TTV-HD23b were propagated in parallel and samples were taken daily.
Long-distance amplification indicated a constant replication of
TTV-HD14e (visible after two rounds of DNA amplification) in
contrast to the decreasing replication of TTV-HD23b (visible
already after a single round of DNA amplification) which was lost
after 10 days in culture (FIG. 19C). These cultures were not
passaged and morphological differences between cultures were not
noticeable.
EXAMPLE 6
In Vitro Formation, Replication and Characterization of .mu.TTV
Subviral Molecules
[0250] The appearance of smaller DNA amplicons of a constant size
in cultures from isolates TTV-HD14b, TTV-HD14c, TTV-HD14d and
TTV-HD14e, as well as TTV-HD1a and the 3 TTV-HD23 isolates, was
already noted early after transfection and was maintained during
passages (FIGS. 19A and B). They were cloned and characterized.
These subviral DNA molecules (.mu.TTV-HD14, 719 bases in size) from
TTV-HD14b and the 3 TTV-HD14 isolates were all identical in DNA
sequence and represented circular subgenomic rearranged molecules
originating from the parental TTV-HD14 genome (FIG. 20A).
Similarly, a rearranged subviral DNA molecule (.mu.TTV-HD1, 621
bases) originated from the parental TTV-HD1a genome (FIG. 20B).
Interestingly, replication of .mu.TTV-HD1 was maintained during
passages, despite the disappearance of the full-length TTV-HD1a
genome. The presence or absence of the subviral molecules in
TTV-HD23 cultures indicates a possible influence of culturing
conditions. Here these molecules ranged from 400 to 900 bases in
size with an increased level of 642 and 401 base molecules.
Characterization of the cloned molecules indicated an apparently
evolutionarily preferred maturation process, as a segment of the 401
base subviral molecule (.mu.TTV-HD23.1) was duplicated in the 642
base subviral DNA (.mu.TTV-HD23.2; FIG. 20C). Multiple versions of
this segment were present in larger molecules. Subviral genomes
originating from TTV-HD23b, TTV-HD23d as well as TTV-HD23a
cultures, were all identical in DNA sequence. Transfection of these
subviral rearranged molecules in 293TT cells resulted in
replication of their genomes (FIG. 21) as visualized after PCR
amplification. Interestingly, the respective .mu.TTV behaved
in exactly the same way as the parental genomes, i.e. initial
replication of genomic .mu.TTV-HD15 DNA was strong, but was
subsequently visualized only after nested PCR amplification (FIG.
21). Small protein-like structures 10 nm in size were visible by
electron microscopy after filtration (0.22 .mu.m) of the culture
medium from these cell cultures (FIG. 22).
EXAMPLE 7
Purification of Virus-Like Particles (Complete Genomes and
.mu.TTV)
[0251] Attempts to purify virus particles were initiated after
second round re-infections. Crude cell extracts were centrifuged on
27-33-39% Opti-prep step gradients (Buck et al., 2005). Aliquots of
gradient fractions were lysed prior to separation by gel
electrophoresis. Gradient fractions indicating virus DNA were
frozen at -80.degree. C. and used for further re-infections. Two
DNA bands at the 2 kb and 1.0 kb level of the double-stranded DNA
size marker were clearly visible (FIG. 8A). The exact sizes of
these DNA molecules could not be determined as suitable
single-stranded DNA markers are not available. Cell suspensions
were, in addition, filtered through a 0.22 .mu.m filter prior to
gradient centrifugation. Negative staining of these samples
indicated virus-like particles of approximately 30 nm in size (FIG.
8). Similarly, protein structures (ca. 10 nm in size) were seen
after filtration of the culture medium after propagation of the
.mu.TTV-HD genomes (FIG. 22). These filtrates were lysed and the
DNA separated on agarose gels (FIG. 22).
EXAMPLE 8
In Vitro Transcription
[0252] Detailed transcription patterns of TTV have been reported
for the isolates TTV-P1C1 (Muller et al., 2008), TTV-HEL32 (Qiu et
al., 2005; Kakkola et al., 2009) and TTV-HD3a (Leppik et al.,
2007). Three main mRNA species (1.0, 1.2 and 3.0 kb) had earlier
been reported in bone marrow cells (Okamoto et al., 2000a) and in
COS1 cells (Kamahora et al., 2000). Predictions for use of
initiation codons according to Kozak rules (Jelcic et al., 2004) in
combination with use of alternative splice acceptor and donor sites
(Leppik et al., 2007) indicated the involvement of non-conserved
mechanisms during transcription of torque teno viruses. The
transcription of the isolates was investigated by using single- as
well as double-stranded cDNA as templates for 3'- and 5'-RACE
mapping. Double-stranded cDNA reduces the possibility for the
formation of non-specific hybrids. In addition, primers (forward
and reverse) were selected which were located within the intergenic
regions, instead of commonly used gene-specific primers. This was
done with the aim of covering the expression of any unpredicted genes in
the TTV genome. RNA from all cultures was extracted on day 7 after
transfection. RNA from control transfections with vector alone was
included to control for false positive amplification. The
transcription analyses were repeated to control for a suitable time
point for harvesting mRNA by extracting RNA 48 hours after
transfection in the case of isolate TTV-HD14e. Transcription
patterns observed did not differ between day 2 and day 7. All
results obtained in the transcription analyses are presented in
FIG. 17.
[0253] Abundant transcripts were isolated from TTV-HD23 infected
cultures. Their transcription patterns, as well as those for
TTV-HD20a, TTV-HD15a, TTV-HD16a were in general similar to
previously described transcription patterns (reviewed in Kakkola et
al., 2009). An exception is the absence of a full-length ORF1
transcript from all of the isolates. This is surprising in view of
the fact that virus-like particles are concomitantly being
produced. Transcripts covering sections of the ORF1 gene (either
the 5'- or the 3'-ends) and which could code for smaller proteins,
were present (examples in FIG. 17). In silico analyses for putative
proteins revealed information beyond what has been reported to
date. Examples are splicing (fusions) between either ORF2
or ORF2a and ORF1 or ORF5 in TTV-HD16a (6.3s.2, 6.3s.3,
6.3s.9). Splicing between ORF1 and ORF5 is another possibility
(6.3.7). Short transcripts covering the region of ORF2 in TTV-HD20a
may also be expressed as a smaller ORF1 protein (7.3.5, 7.3.4,
7.5.13) (FIG. 17). Transcripts were in addition obtained using
primers (forward or reverse) located in the control region. Two
observations were made. Reverse primers resulted in spliced or
non-spliced transcripts covering extended regions of the genome
(12.5.19, 12.5.20, 12.5.21, 5.5s.16, 5.5s.17, 5.5s.18, 5.5s.19) or
transcripts varying in length which did not have any coding
capacity (5.5s.12, 5.5s.13, 5.5s.14, 5.5s.15, 11.5.7, 11.5.8,
11.5.9). Amplification with forward primers in this region resulted
in other short non-coding transcripts or spliced transcripts with
coding capacity even as distant as ORF5 (4.3.4, 3.3.1, 3.3.2) (FIG.
17).
LIST OF REFERENCES
[0254] 1. Belotserkovskii, B. P., Liu, R., Tornaletti, S.,
Krasilnikova, M. M., Mirkin, S. M. and Hanawalt, P. C. 2010.
Mechanisms and implications of transcription blockage by
guanine-rich DNA sequences. Proc. Natl. Acad. Sci. U.S.A.
107:12816-12821. [0255] 2. Biagini, P., and P. de Micco. 2010. La
famille des Anelloviridae: virus TTV et genres apparentes.
Virologie 14:3-16. [0256] 3. Biagini, P., Charrel, R. N., de Micco,
P., and X. de Lamballerie. 2003. Association of TT virus primary
infection with rhinitis in a newborn. Clin. Infect. Dis.
36:128-129. [0257] 4. Buck, C. B., Pastrana, D. V., Lowy, D. R.,
and J. T. Schiller. 2004. Efficient intracellular assembly of
papillomaviral vectors. J. Virol. 78:751-757. [0258] 5. Buck, C.
B., Pastrana, D. V., Lowy, D. R., and J. T. Schiller. 2005.
Generation of HPV pseudovirions using transfection and their use in
neutralization assays. Methods Mol. Med. 119:445-462. [0259] 6. Del
Val, C., Mehrle, A., Falkenhahn, M., Seiler, M., Glatting, K-H.,
Poustka, A., Suhai, S., and S. Wiemann. 2004. High-throughput
protein analysis integrating bioinformatics and experimental
assays. Nucleic Acids Res. 32:742-748. [0260] 7. de Schmidt, M. H.,
and M. H. M. Noteborn. 2009. Apoptosis-inducing proteins in chicken
anemia virus and TT virus. Curr. Topics Microbiol. Immunol.
331:131-149. [0261] 8. de Villiers, E-M., Kimmel, R., Leppik, L.,
and K. Gunst. 2009. Intragenomic rearrangement in TT viruses: a
possible role in the pathogenesis of disease. Curr. Topics
Microbiol. Immunol. 331:91-107. [0262] 9. de Villiers, E-M.,
Schmidt, R., Delius, H., and H. zur Hausen. 2002. Heterogeneity of
TT virus related sequences isolated from human tumor biopsy
specimens. J. Mol. Med. 80:44-50. [0263] 10. Fei, J-W., Wei, Q-X.,
Angel, P., and E-M. de Villiers. 2005. Differential enhancement of
a cutaneous HPV promoter by p63, Jun and mutant p53. Cell Cycle
4:689-696. [0264] 11. Garbuglia, A. R., Iezzi, T., Capobianchi, M.
R., Pignoloni, P., Pulsoni, A., Sourdis, J., Pescarmona, E.,
Vitolo, D., and F. Mandelli. 2003. Detection of TT virus in lymph
node biopsies of B-cell lymphoma and Hodgkin's disease, and its
association with EBV infection. Int. J. Immunopathol. Pharmacol.
16:109-118. [0265] 12. Itoh, Y., Takahashi, M., Fukuda, M.,
Shibayama, T., Ishikawa, T., Tsuda, F., Tanaka, T., Nishizawa, T.,
and H. Okamoto. 2000. Visualization of TT virus particles recovered
from the sera and feces of infected humans. Biochem. Biophys. Res.
Commun. 279:718-724. [0266] 13. Jelcic, I., Hotz-Wagenblatt, A.,
Hunziker, A., zur Hausen, H., and E-M. de Villiers. 2004. Isolation
of multiple TT virus genotypes from spleen biopsy tissue from a
Hodgkin's disease patient: Genome reorganization and diversity in
the hypervariable region. J. Virol. 78:7498-7507. [0267] 14. Jeske,
H. 2009. Geminiviruses. Curr. Topics Microbiol. Immunol. 331:185-226.
[0268] 15. Kakkola, L., Bonden, H., Hedman, L., Kivi, N., Moisala,
S., Julin, J., Yla-Liedenpohja, Miettinen, S., Kantola, K., Hedman,
K., and M. Soderlund-Venermo. 2008. Expression of all six human
Torque teno virus (TTV) proteins in bacteria and in insect cells,
and analysis of their IgG responses. Virology 382:182-189. [0269]
16. Kakkola, L., Hedman, K., Qiu, J., Pintel, D., and M.
Soderlund-Venermo. 2009. Replication of and protein synthesis by TT
viruses. Curr. Topics Microbiol. Immunol. 331:53-64. [0270] 17.
Kakkola, L., Tommiska, J., Boele, L. C. L., Miettinen, S., Blom,
T., Kekarainen, T., Qiu, J., Pintel, D., Hoeben, R. C., Hedman, K.,
and M. Soderlund-Venermo. 2007. Construction and biological
activity of a full-length molecular clone of human Torque teno
virus (TTV) genotype 6. FEBS J. 274:4719-4730. [0271] 18. Kamada,
K., Kamahora, T., Kabat, P., and S. Hino. 2004. Transcriptional
regulation of TT virus: promoter and enhancer regions in the 1.2-kb
noncoding region. Virology 321:341-348. [0272] 19. Kamahora, T.,
Hino, S., and H. Miyata. 2000. Three spliced mRNAs of TT virus
transcribed from a plasmid containing the entire genome in COS1
cells. J. Virol. 74:9980-9986. [0273] 20. Kanda, Y., Tanaka, Y.,
Kami, M., Saito, T., Asai, T., Izutsu, K., Yuji, S., Ogawa, S.,
Honda, H., Mitani, K., Ciba, S., Yasaki, Y., and H. Hirai. 1999. TT
virus in bone marrow transplant recipients. Blood 93:2485-2490.
[0274] 21. Kazi, A., Miyata, H., Kurokawa, K., Khan, M. A.,
Kamahora, T., Katamine, S., and S. Hino. 2000. High frequency of
postnatal transmission of TT virus in infancy. Arch. Virol.
145:535-540. [0275] 22. Kovacs, E., Tompa, P., Liliom, K., and L.
Kalmar. 2010. Dual coding in alternative reading frames correlates
with intrinsic protein disorder. Proc. Natl. Acad. Sci. U.S.A.
107:5429-5434 [0276] 23. Leppik, L., Gunst, K., Lehtinen, M.,
Dillner, J., Streker, K., and E-M. de Villiers. 2007. In vivo and
in vitro intragenomic rearrangement of TT viruses. J Virol
81:9346-9356. [0277] 24. Maggi, F., Andreoli, E., Riente, L.,
Meschi, S., Rocchi, J., Delle Sedie, A., Vatteroni, M L.,
Ceccherini-Nelli, L., Specter, S., and M. Bendinelli. 2007.
Torquetenovirus in patients with arthritis. Rheumatology
46:885-886. [0278] 25. Maggi, F., Focosi, D., Albani, M., Lanini,
L., Vatteroni, M L, Petrini, M., Ceccherini-Nelli, L., Pistello,
M., and M Bendinelli. 2010. Role of hematopoietic cells in the
maintenance of chronic human torquetenovirus plasma viremia. J.
Virol. 84:6891-6893. [0279] 26. Maggi, F., Fornai, C., Vatteroni, M
L., Siciliano, G., Menichetti, F., Tascini, C., Specter, S.,
Pistello, M., and M. Bendinelli. 2001a. Low prevalence of TT virus
in the cerebrospinal fluid of viremic patients with central nervous
system disorders. J. Med. Virol. 65:418-422. [0280] 27. Maggi, F.,
Fornai, C., Zaccaro, L., Morrica, A., Vatteroni, M. L., Isola, P.,
Marchi, S., Ricchiuti, A., Pistello, M., and M. Bendinelli. 2001b.
TT virus (TTV) loads associated with different peripheral blood
cell types and evidence for TT replication in activated mononuclear
cells. J. Med. Virol. 64:190-194. [0281] 28. Maggi, F., Pifferi,
M., Fornai, C., Andreoli, A., Tempestini, E., Vatteroni, M.,
Presciuttini, S., Marchi, S., Pietrobelli, A., Boner, A., Pistello,
M., and M. Bendinelli. 2003a. TT virus in the nasal secretions of
children with acute respiratory disease: relations to viremia and
disease severity. J. Virol. 77:2418-2425. [0282] 29. Maggi, F.,
Pifferi, M., Tempestini, E., Fornai, C., Lanini, L., Andreoli, E.,
Vatteroni, M., Presciuttini, S., Pietrobelli, A., Boner, A.,
Pistello, M., and M. Bendinelli. 2003b. TT virus loads and
lymphocyte subpopulations in children with acute respiratory
diseases. J. Virol. 77:9081-9083. [0283] 30. Mariscal, L. F.,
Lopez-Alcorocho, J. M., Rodriguez-Inigo, E., Ortiz-Movilla, N., de
Lucas, S., Bartolome, J., and V. Carreno. 2002. TT virus replicates
in stimulated but not in nonstimulated peripheral blood mononuclear
cells. Virology 301:121-129. [0284] 31. Muller, B., Marz, A.,
Doberstein, K., Finsterbusch, T., and A. Mankertz. 2008. Gene
expression of the human Torque Teno Virus isolate P/1C1. Virology
381:36-45. [0285] 32. Nawaz-ul-Rehman, M. S., and C. M. Fauquet.
2009. Evolution of geminiviruses and their satellites. FEBS Lett.
583:1825-1832. [0286] 33. Nishizawa, T., Okamoto, K., Konishi, H.,
Yoshikawa, H., Miyakawa, Y., and M. Mayumi. 1997. A novel DNA virus
(TTV) associated with elevated transaminase levels in
posttransfusion hepatitis of unknown etiology. Biochem. Biophys.
Res. Commun. 241:92-97. [0287] 34. Ninomiya, M., Nishizawa, T.,
Takahashi, M., Lorenzo, F. R., Shimosegawa, T., and H. Okamoto.
2007. Identification and genomic characterization of a novel human
torque teno virus of 3.2 kb. J. Gen. Virol. 88:1939-1944. [0288]
35. Ninomiya, M., Takahashi, M., Nishizawa, T., Shimosegawa, T.,
and H. Okamoto. 2008. Development of PCR assays with nested primers
specific for differential detection of three human anelloviruses
and early acquisition of dual or triple infection during infancy.
J. Clin. Microbiol. 46:507-514. [0289] 36. Okamoto, H. 2009.
History of discoveries and pathogenicity of TT viruses. Curr. Top.
Microbiol. Immunol. 331:1-20. [0290] 37. Okamoto, H., Nishizawa,
T., Tawara, A., Takahashi, M., Kishimoto, J., Sai, T., and Y.
Sugai. 2000a. TT virus mRNAs detected in the bone marrow cells from
an infected individual. Biochem. Biophys. Res. Commun. 279:700-707.
[0291] 38. Okamoto, H., Takahashi, M., Kato, N., Fukuda, M.,
Tawara, A., Fukuda, S., Tanaka, T., Miyakawa, Y., and M. Mayumi.
2000b. Sequestration of TT virus of restricted genotypes in
peripheral blood mononuclear cells. J. Virol. 74:10236-10239.
[0292] 39. Okamoto, H., Takahashi, M., Nishizawa, T., Tawara, A.,
Sugai, Y., Sai, T., Tanaka, T., and F. Tsuda. 2000c. Replicative
forms of TT virus DNA in bone marrow cells. Biochem. Biophys. Res.
Commun. 270:657-662. [0293] 40. Okamoto, H., Ukita, M., Nishizawa,
T., Kishimoto, J., Hoshi, Y., Mizuo, H., Tanaka, T., Miyakawa, Y.,
and M. Mayumi. 2000d. Circular double-stranded forms of TT virus
DNA in the liver. J. Virol. 74:5161-5167. [0294] 41. Paprotka, T.,
Metzler, V., and H. Jeske. 2010. The first DNA 1-like α satellite
in association with New World begomovirus in natural infections.
Virology 404:148-157. [0295] 42. Patil, B. L., and C. M. Fauquet.
2010. Differential interaction between cassava mosaic geminivirus
and geminivirus satellites. J. Gen. Virol. 91:1871-1882. [0296] 43.
Peng, Y. H., Nishizawa, T., Takahashi, T., Ishikawa, T., Yoshikawa,
A., and H. Okamoto. 2002. Analysis of the entire genomes of
thirteen TT virus variants classifiable into the fourth and fifth
genetic groups, isolated from viremic infants. Arch. Virol.
147:21-41. [0297] 44. Pifferi, M., Maggi, F., Andreoli, E., Lanini,
L., Marco, E. D., Fornai, C., Vatteroni, M. L., Pistello, M.,
Ragazzo, V., Macchia, P., Boner, A., and M. Bendinelli. 2005.
Associations between nasal torquetenovirus load and spirometric
indices in children with asthma. J. Infect. Dis. 192:1141-1148.
[0298] 45. Qiu, J., Kakkola, L., Cheng, F., Ye, C.,
Soderlund-Venermo, M., Hedman, K., and D. J. Pintel. 2005.
Circovirus TT virus genotype 6 expresses six proteins following
transfection of a full-length clone. J. Virol. 79:6506-6510. [0299]
46. Ryabova, L. A., Pooggin, M., and T. Hohn. 2006. Translation
reinitiation and leaky scanning in plant viruses. Virus Res.
119:52-62. [0300] 47. Saunders, K., Bedford, I. D., Briddon, R. W.,
Markham, P. G., Wong, S. M., and J. Stanley. 2000. A unique virus
complex causes Ageratum yellow vein disease. Proc. Natl. Acad. Sci.
USA 97:6890-6895. [0301] 48. Shiramizu, B., Yu, Q., Hu, N.,
Yanagihara, R., and V. R. Nerurkar. 2002. Investigation of TT virus
in the etiology of pediatric acute lymphoblastic leukaemia.
Pediatr. Hematol. Oncol. 19:543-551. [0302] 49. Sospedra, M., Zhao,
Y., zur Hausen, H., Muraro, P. A., Hamashin, C., de Villiers, E.
M., Pinilla, C., and R. Martin. 2005. Recognition of conserved
amino acid motifs of common viruses and its role in autoimmunity.
PLoS Pathog. 1:e41. [0303] 50. Stanley, J. 2004. Subviral DNAs
associated with geminivirus disease complexes. Vet. Microbiol.
98:121-129. [0304] 51. Takahashi, M., Asabe, S., Gotanda, Y.,
Kishimoto, J., Tsuda, F., and H. Okamoto. 2002. TT virus is
distributed in various leukocyte subpopulations at distinct levels,
with the highest viral load in granulocytes. Biochem. Biophys. Res.
Commun. 290:242-248. [0305] 52. Takahashi, K., Iwasa, Y., Hijikata,
M., and S. Mishiro. 2000. Identification of a new human DNA virus
(TTV-like mini virus, TLMV) intermediately related to TT virus and
chicken anemia virus. Arch. Virol. 145:979-993. [0306] 53. Zhong,
S., Yeo, W., Tang, M., Liu, C., Lin, X. R., Ho, W. M., Hui, P., and
P. J. Johnson. 2002. Frequent detection of the replicative form of
TT virus DNA in peripheral blood mononuclear cells and in bone
marrow cells in cancer patients. J. Med. Virol. 66:428-434. [0307]
54. zur Hausen, H., and E-M. de Villiers. 2005. Virus target cell
conditioning model to explain some epidemiologic characteristics of
childhood leukemias and lymphomas. Int. J. Cancer 115:1-5.
[0308] The invention is further described by the following numbered
paragraphs:
[0309] 1. A rearranged TT virus polynucleic acid comprising
[0310] (a) a nucleotide sequence shown in FIG. 6;
[0311] (b) a nucleotide sequence which shows at least 70% identity
to a nucleotide sequence of (a) and is capable of replicating
autonomously and/or inducing autonomous replication;
[0312] (c) a fragment of a nucleotide sequence of (a) or (b) which
is capable of replicating autonomously;
[0313] (d) a nucleotide sequence which is the complement of the
nucleotide sequence of (a), (b), or (c); or
[0314] (e) a nucleotide sequence which is redundant as a result of
the degeneracy of the genetic code compared to any of the
above-given nucleotide sequences.
[0315] 2. The rearranged TT virus polynucleic acid of paragraph 1
consisting of
[0316] (a) a nucleotide sequence shown in FIG. 6;
[0317] (b) a nucleotide sequence which shows at least 70% identity
to a nucleotide sequence of (a) and is capable of replicating
autonomously and/or inducing autonomous replication;
[0318] (c) a fragment of a nucleotide sequence of (a) or (b) which
is capable of replicating autonomously;
[0319] (d) a nucleotide sequence which is the complement of the
nucleotide sequence of (a), (b), or (c); or
[0320] (e) a nucleotide sequence which is redundant as a result of
the degeneracy of the genetic code compared to any of the
above-given nucleotide sequences.
[0321] 3. The rearranged TT virus polynucleic acid of paragraph 1
or 2, wherein said nucleotide sequence of (a), (b), (c), (d) or (e)
is linked to a polynucleic acid encoding a polypeptide containing a
signature motif of a mammalian protein or allergen being associated
with cancer or an autoimmune disease.
[0322] 4. The rearranged TT virus polynucleic acid of any one of
paragraphs 1 to 3 which is present as a single- or double-stranded
extrachromosomal episome.
[0323] 5. The rearranged TT virus polynucleic acid of any one of
paragraphs 1 to 4 which is a single-stranded DNA.
[0324] 6. The rearranged TT virus polynucleic acid of any one of
paragraphs 1 to 5 which is linked to a host cell DNA.
[0325] 7. The rearranged TT virus polynucleic acid of paragraph 6
having at least one of the following properties:
[0326] (a) growth-stimulation;
[0327] (b) oncogene function;
[0328] (c) tumor suppressor gene-like function; or
[0329] (d) stimulation of autoimmune reactions.
[0330] 8. The rearranged TT virus polynucleic acid of any one of
paragraphs 1 to 7 comprising a nucleotide sequence being selected
from the group of nucleotide sequences shown in FIGS. 8, 9 and 11 to 13.
[0331] 9. The rearranged TT virus polynucleic acid of any one of
paragraphs 1 to 8, wherein said polypeptide is a polypeptide as
shown in Table 1.
[0332] 10. An oligonucleotide primer comprising part of a
polynucleic acid according to any one of paragraphs 1 to 7, with
said primer being able to act as primer for specifically sequencing
or specifically amplifying said polynucleic acid.
[0333] 11. The oligonucleotide primer of paragraph 10 having a
nucleotide sequence being selected from the group consisting of the
nucleotide sequences shown in Table 2 and FIG. 10.
[0334] 12. An oligonucleotide probe comprising part of a
polynucleic acid according to any one of paragraphs 1 to 9, wherein
said probe can specifically hybridize to said polynucleic acid.
[0335] 13. The oligonucleotide probe of paragraph 12 having a
nucleotide sequence being selected from the group consisting of the
nucleotide sequences shown in Table 2 and FIG. 10.
[0336] 14. The oligonucleotide probe of paragraph 12 or 13, which
is detectably labelled or attached to a solid support.
[0337] 15. The oligonucleotide primer of paragraph 10 or 11 or the
oligonucleotide probe of any one of paragraphs 12 to 14 having a
length of at least 13 bases.
[0338] 16. An expression vector comprising a rearranged TT virus
polynucleic acid of any one of paragraphs 1 to 9 operably linked to
prokaryotic, eukaryotic or viral transcription and translation
control elements.
[0339] 17. The expression vector of paragraph 16 which is an
artificial chromosome.
[0340] 18. A host cell transformed with an expression vector
according to paragraph 16 or 17.
[0341] 19. A polypeptide being encoded by a rearranged TT virus
polynucleic acid of any one of paragraphs 1 to 9.
[0342] 20. An antibody or fragment thereof specifically binding to
a polypeptide of paragraph 19.
[0343] 21. The antibody or fragment thereof of paragraph 20,
wherein said antibody or fragment is detectably labelled.
[0344] 22. A diagnostic kit for use in determining the presence of
a rearranged TT virus polynucleic acid of any one of paragraphs 1
to 9, or a polypeptide of paragraph 19, said kit comprising a
primer according to paragraph 10, 11 or 15, a probe according to
any one of paragraphs 12 to 15, or an antibody according to
paragraph 20 or 21.
[0345] 23. Use of a primer according to paragraph 10, 11 or 15, a
probe according to any one of paragraphs 12 to 15, a polypeptide of
paragraph 19, or an antibody according to paragraph 20 or 21 for
the preparation of a diagnostic composition for the diagnosis of a
predisposition or an early stage of cancer or an autoimmune
disease.
[0346] 24. A method for the detection of a rearranged TTV
polynucleic acid according to any one of paragraphs 1 to 9 in a
biological sample, comprising: (a) optionally extracting sample
polynucleic acid, (b) amplifying the polynucleic acid as described
above with at least one primer according to paragraph 10 or 11,
optionally a labelled primer, and (c) detecting the amplified
polynucleic acid.
[0347] 25. A method for the detection of a rearranged TTV
polynucleic acid according to any one of paragraphs 1 to 9 in a
biological sample, comprising: (a) optionally extracting sample
polynucleic acid, (b) hybridizing the polynucleic acid as described
above with at least one probe according to any one of paragraphs 12
to 15, optionally a labelled probe, and (c) detecting the
hybridized polynucleic acid.
[0348] 26. A method for detecting a polypeptide of paragraph 19 or
an antibody of paragraph 20 or 21 present in a biological sample,
comprising: (a) contacting the biological sample for the presence
of such polypeptide or antibody as defined above, and (b) detecting
the immunological complex formed between said antibody and said
polypeptide.
[0349] 27. An antisense oligonucleotide reducing or inhibiting the
expression of a rearranged TT virus polynucleic acid of any one of
paragraphs 1 to 9.
[0350] 28. The antisense oligonucleotide of paragraph 27, which is
an iRNA comprising a sense sequence and an antisense sequence,
wherein the sense and antisense sequences form an RNA duplex and
wherein the antisense sequence comprises a nucleotide sequence
sufficiently complementary to the nucleotide sequence of the
rearranged TT virus polynucleic acid of any one of paragraphs 1 to
9.
[0351] 29. A pharmaceutical composition comprising the antibody of
paragraph 20 or 21, or the antisense oligonucleotide of paragraph
27 or 28 and a suitable pharmaceutical carrier.
[0352] 30. Use of the antibody of paragraph 20 or 21, or the
antisense oligonucleotide of paragraph 27 or 28 for the preparation
of a pharmaceutical composition for the prevention or treatment of
cancer or an autoimmune disease or early stages thereof.
[0353] 31. The antibody of paragraph 20 or 21 or the antisense
oligonucleotide of paragraph 27 or 28 for use in a method of
preventing or treating cancer or an autoimmune disease or early
stages thereof.
[0354] 32. Use according to paragraph 30 or 31, wherein said
autoimmune disease is multiple sclerosis (MS), asthma,
polyarthritis, diabetes, lupus erythematodes, celiac disease,
colitis ulcerosa, or Crohn's disease.
[0355] 33. Use according to paragraph 30 or 31, wherein said cancer
is breast cancer, colorectal cancer, pancreatic cancer, cervical
cancer, Hodgkin's lymphoma, B-lymphoma, acute lymphocytic
leukaemia, or Burkitt's lymphoma.
[0356] 34. A vaccine comprising a rearranged TT virus polynucleic
acid of any one of paragraphs 1 to 9, or a polypeptide according to
paragraph 19.
[0357] 35. The rearranged TT virus polynucleic acid of any one of
paragraphs 1 to 9, or the polypeptide of paragraph 19 for use in a
method of immunizing a mammal against a TT virus infection.
[0358] 36. A method for the generation of a database for
determining the risk to develop cancer or an autoimmune disease,
comprising the following steps
[0359] (a) determining the nucleotide sequence of a host cell DNA
linked to a rearranged TT virus polynucleic acid according to any
one of paragraphs 1 to 9 and being present in episomal form, if
present, in a sample from a patient suffering from at least one of
said diseases; and
[0360] (b) compiling sequences determined in step (a) associated
with said diseases in a database.
[0361] 37. A method for evaluating the risk to develop cancer or an
autoimmune disease of a patient suspected of being at risk of
developing such disease, comprising the following steps
[0362] (a) determining the nucleotide sequence of genomic host cell
DNA linked to a rearranged TT virus polynucleic acid according to
any one of paragraphs 1 to 9 and being present in episomal form, if
present, in a sample from said patient; and
[0363] (b) comparing sequences determined in step (a) with the
sequences compiled in the database generated according to the method of
paragraph 36,
[0364] wherein the absence of a host cell DNA linked to a TT virus
polynucleic acid or the presence only of genomic host cell DNA
linked to a TT virus polynucleic acid not represented in said
database indicates that the risk of developing such disease is
decreased or absent.
[0365] 38. A process for the in vitro replication and propagation
of Torque teno viruses (TTV) comprising the following steps:
[0366] (a) transfecting linearized TTV DNA into 293TT cells
expressing high levels of SV40 large T antigen;
[0367] (b) harvesting the cells and isolating cells showing the
presence of TTV DNA;
[0368] (c) culturing the cells obtained in step (b) for at least
three days; and
[0369] (d) harvesting the cells of step (c).
[0370] 39. The process of paragraph 38, wherein the TTV is a
rearranged TTV according to any one of paragraphs 1 to 9.
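The database logic of paragraphs 36 and 37 can be sketched in code. The following Python sketch is illustrative only and is not part of the described method. It assumes that the host cell sequences found linked to a rearranged TTV polynucleic acid are available as plain nucleotide strings, and it uses exact string matching where a practical comparison would rely on sequence alignment. The function names `normalize`, `build_database` and `risk_indicated` are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def normalize(seq: str) -> str:
    """Upper-case a nucleotide string and remove all whitespace."""
    return "".join(seq.split()).upper()


def build_database(samples: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Paragraph 36: compile (disease, host-linked sequence) pairs by disease."""
    database: Dict[str, Set[str]] = {}
    for disease, seq in samples:
        database.setdefault(disease, set()).add(normalize(seq))
    return database


def risk_indicated(patient_seqs: List[str],
                   database: Dict[str, Set[str]]) -> List[str]:
    """Paragraph 37: list diseases whose compiled sequences match the patient's.

    An empty result corresponds to the "decreased or absent" case: either no
    host cell DNA linked to TTV was found, or none of it is represented in
    the database. Exact matching is an assumption of this sketch.
    """
    found = {normalize(s) for s in patient_seqs}
    return sorted(d for d, seqs in database.items() if seqs & found)


# Hypothetical toy data, invented for illustration only.
db = build_database([("cancer", "ACGTTGCA"),
                     ("autoimmune disease", "GGCCTTAA")])
print(risk_indicated(["acgt tgca"], db))  # ['cancer']
print(risk_indicated([], db))             # []
```

Grouping the compiled sequences by disease keeps step (b) of paragraph 37 a simple set intersection per disease.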
[0371] Having thus described in detail preferred embodiments of the
present invention, it is to be understood that the invention
defined by the above paragraphs is not to be limited to particular
details set forth in the above description as many apparent
variations thereof are possible without departing from the spirit
or scope of the present invention.
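Items (b) and (d) of paragraphs 1 and 2 rely on two simple sequence operations: checking a percent-identity threshold and forming a complement. The following Python sketch illustrates both under stated assumptions and is not the applicants' procedure. It assumes the two sequences have already been aligned to equal length (gaps written as `-`) and counts identity as matching columns divided by alignment length; other conventions for percent identity exist. All function names are hypothetical.

```python
# Translation table for base-wise complementation of DNA.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3' (item (d))."""
    return seq.translate(COMPLEMENT)[::-1]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Columns containing a gap ('-') never count as matches. Dividing by the
    full alignment length is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """Check the "at least 70% identity" criterion of item (b)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy sequences, invented for illustration only.
print(reverse_complement("ATGC"))                            # GCAT
print(percent_identity("ACGTACGTAC", "ACGTACGTTT"))          # 80.0
print(meets_identity_threshold("ACGTACGTAC", "ACGTACGTTT"))  # True
```

The threshold is a parameter, so the same check can be reused for other identity cut-offs.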
Sequence CWU 1
1
280119PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized domain" /organism="Artificial Sequence" 1Arg Phe
Gly Val Gln Gln Arg Leu Pro Trp Val His Ser Ser Gln Glu 1 5 10 15
Thr Gln Ser 219PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized domain" /organism="Artificial Sequence" 2Arg Phe
Arg Val Gln Gln Arg Leu Pro Trp Val His Ser Ser Gln Glu 1 5 10 15
Thr Gln Ser 312PRTArtificial SequenceSOURCE1..12/mol_type="protein"
/note="synthesized opsin motif" /organism="Artificial Sequence"
3Ile Phe Asn Ser Phe His Arg Gly Phe Ala Ile Tyr 1 5 10
412PRTArtificial SequenceSOURCE1..12/mol_type="protein"
/note="synthesized opsin motif" /organism="Artificial Sequence"
4Ile Tyr Asn Ser Phe His Arg Gly Phe Ala Leu Gly 1 5 10
512PRTArtificial SequenceSOURCE1..12/mol_type="protein"
/note="synthesized opsin motif" /organism="Artificial Sequence"
5Ile Tyr Asn Ser Phe His Gln Gly Tyr Ala Leu Gly 1 5 10
612PRTArtificial SequenceSOURCE1..12/mol_type="protein"
/note="synthesized opsin motif" /organism="Artificial Sequence"
6Ile Tyr Asn Ser Phe His Thr Gly Phe Ala Thr Gly 1 5 10
712PRTArtificial SequenceSOURCE1..12/mol_type="protein"
/note="synthesized opsin motif" /organism="Artificial Sequence"
7Ile Tyr Asn Ser Phe Asn Thr Gly Phe Ala Thr Gly 1 5 10
812PRTArtificial SequenceSOURCE1..12/mol_type="protein"
/note="synthesized opsin motif" /organism="Artificial Sequence"
8Ile Tyr Asn Ser Phe Asn Thr Gly Phe Ala Leu Gly 1 5 10
919PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized opsin motif" /organism="Artificial Sequence"
9Arg Met Glu Leu Gln Lys Arg Cys Pro Trp Leu Ala Ile Asp Glu Lys 1
5 10 15 Ala Pro Glu 1019PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized opsin
motif" /organism="Artificial Sequence" 10Arg Met Glu Leu Gln Lys
Arg Cys Pro Trp Leu Ala Leu Asn Glu Lys 1 5 10 15 Ala Pro Glu
1119PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized opsin motif" /organism="Artificial Sequence"
11Arg Met Glu Leu Gln Lys Arg Cys Pro Trp Leu Gly Val Asn Glu Lys 1
5 10 15 Ser Gly Glu 1219PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized opsin
motif" /organism="Artificial Sequence" 12Arg Met Glu Leu Gln Lys
Arg Cys Pro Trp Leu Ala Ile Ser Glu Lys 1 5 10 15 Ala Pro Glu
1319PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized opsin motif" /organism="Artificial Sequence"
13Arg Leu Glu Leu Gln Lys Arg Cys Pro Trp Leu Gly Val Asn Glu Lys 1
5 10 15 Ser Gly Glu 1419PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized opsin
motif" /organism="Artificial Sequence" 14Arg Leu Glu Leu Gln Lys
Arg Leu Pro Trp Leu Glu Leu Gln Glu Lys 1 5 10 15 Pro Val Ala
1519PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized opsin motif" /organism="Artificial Sequence"
15Arg Leu Glu Leu Gln Lys Arg Leu Pro Trp Leu Glu Leu Gln Glu Lys 1
5 10 15 Pro Ile Glu 1619PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized opsin
motif" /organism="Artificial Sequence" 16Arg Leu Glu Leu Gln Lys
Arg Leu Pro Trp Leu Glu Leu Gln Glu Lys 1 5 10 15 Pro Ile Ser
1734PRTArtificial SequenceSOURCE1..34/mol_type="protein"
/note="synthesized protamine P1" /organism="Artificial Sequence"
17Ala Arg Tyr Arg Arg Ser Arg Thr Arg Ser Arg Ser Pro Arg Ser Arg 1
5 10 15 Arg Arg Arg Arg Arg Ser Gly Arg Arg Arg Ser Pro Arg Arg Arg
Arg 20 25 30 Arg Tyr 1835PRTArtificial
SequenceSOURCE1..35/mol_type="protein" /note="synthesized protamine
2" /organism="Artificial Sequence" 18His Thr Arg Arg Arg Arg Ser
Cys Arg Arg Arg Arg Arg Arg Ala Cys 1 5 10 15 Arg His Arg Arg His
Arg Arg Gly Cys Arg Arg Ile Arg Arg Arg Arg 20 25 30 Arg Cys Arg
351934PRTArtificial SequenceSOURCE1..34/mol_type="protein"
/note="synthesized protamine P1" /organism="Artificial Sequence"
19Ala Arg Tyr Arg Arg Arg Ser Arg Ser Arg Ser Arg Ser Arg Tyr Gly 1
5 10 15 Arg Arg Arg Arg Arg Ser Arg Ser Arg Arg Arg Arg Ser Arg Arg
Arg 20 25 30 Arg Arg 2035PRTArtificial
SequenceSOURCE1..35/mol_type="protein" /note="synthesized protamine
2" /organism="Artificial Sequence" 20Arg Arg Arg Ser Arg Ser Cys
Arg Arg Arg Arg Arg Arg Ser Cys Arg 1 5 10 15 Tyr Arg Arg Arg Pro
Arg Arg Gly Cys Arg Ser Arg Arg Arg Arg Arg 20 25 30 Cys Arg Arg
352135PRTArtificial SequenceSOURCE1..35/mol_type="protein"
/note="synthesized protamine 2" /organism="Artificial Sequence"
21His Arg Arg Arg Arg Ser Cys Arg Arg Arg Arg Arg His Ser Cys Arg 1
5 10 15 His Arg Arg Arg His Arg Arg Gly Cys Arg Arg Ser Arg Arg Arg
Arg 20 25 30 Arg Cys Arg 352234PRTArtificial
SequenceSOURCE1..34/mol_type="protein" /note="synthesized
HSP1_mouse" /organism="Artificial Sequence" 22Ala Arg Tyr Arg Cys
Cys Arg Ser Lys Ser Arg Ser Arg Cys Arg Arg 1 5 10 15 Arg Arg Arg
Arg Cys Arg Arg Arg Arg Arg Arg Cys Cys Arg Arg Arg 20 25 30 Arg
Arg 2335PRTArtificial SequenceSOURCE1..35/mol_type="protein"
/note="synthesized HSP2_erypa" /organism="Artificial Sequence"
23Arg Arg Arg His Arg Ser Cys Arg Arg Arg Arg Arg Arg Ser Cys Arg 1
5 10 15 His Arg Arg Arg His Arg Arg Gly Cys Arg Thr Arg Arg Arg Arg
Cys 20 25 30 Arg Arg Tyr 352434PRTArtificial
SequenceSOURCE1..34/mol_type="protein" /note="synthesized
hsp1_cavpo" /organism="Artificial Sequence" 24Ala Arg Tyr Arg Cys
Cys Arg Ser Pro Ser Arg Ser Arg Cys Arg Arg 1 5 10 15 Arg Arg Arg
Arg Phe Tyr Arg Arg Arg Arg Arg Cys His Arg Arg Arg 20 25 30 Arg
Arg 25104PRTArtificial SequenceSOURCE1..104/mol_type="protein"
/note="synthesized gbDhDi33.3" /organism="Artificial Sequence"
25Ser Thr His Glu Leu Pro Asp Pro Asp Arg His Pro Arg Met Leu Gln 1
5 10 15 Val Ser Asp Pro Thr Lys Leu Gly Pro Lys Thr Ala Phe His Lys
Trp 20 25 30 Asp Trp Arg Arg Gly Met Leu Ser Lys Arg Ser Ile Lys
Arg Val Gln 35 40 45 Glu Asp Ser Thr Asp Asp Glu Tyr Val Ala Gly
Pro Leu Pro Arg Lys 50 55 60 Arg Asn Lys Phe Asp Thr Arg Val Gln
Gly Pro Pro Thr Pro Glu Lys 65 70 75 80Glu Ser Tyr Thr Leu Leu Gln
Ala Leu Gln Glu Ser Gly Gln Glu Ser 85 90 95 Ser Ser Glu Asp Gln
Glu Gln Ala 100 26104PRTArtificial
SequenceSOURCE1..104/mol_type="protein" /note="synthesized
gbDhDi33.4" /organism="Artificial Sequence" 26Ser Thr His Glu Leu
Pro Asp Pro Asp Arg His Pro Arg Met Leu Gln 1 5 10 15 Val Ser Asp
Pro Thr Lys Leu Gly Pro Lys Thr Val Phe His Lys Trp 20 25 30 Asp
Trp Arg Arg Gly Met Leu Ser Lys Arg Ser Ile Lys Arg Val Gln 35 40
45 Glu Asp Ser Thr Asp Asp Glu Tyr Val Ala Gly Pro Leu Pro Arg Lys
50 55 60 Arg Asn Lys Phe Asp Thr Arg Val Gln Gly Pro Pro Thr Pro
Glu Lys 65 70 75 80Glu Ser Tyr Thr Leu Leu Gln Ala Leu Gln Glu Ser
Gly Gln Glu Ser 85 90 95 Ser Ser Glu Asp Gln Glu Gln Ala 100
27104PRTArtificial SequenceSOURCE1..104/mol_type="protein"
/note="synthesized gbDhDi33.3" /organism="Artificial Sequence"
27Ser Thr His Glu Leu Pro Asp Pro Asp Arg His Pro Arg Met Leu Gln 1
5 10 15 Val Ser Asp Pro Thr Lys Leu Gly Pro Lys Thr Val Phe His Lys
Trp 20 25 30 Asp Trp Arg Arg Gly Met Leu Ser Lys Arg Ser Ile Lys
Arg Val Gln 35 40 45 Gly Asp Ser Thr Asp Gly Glu Tyr Val Ala Gly
Pro Leu Pro Arg Lys 50 55 60 Arg Asn Lys Phe Asp Thr Arg Val Gln
Gly Pro Pro Thr Pro Glu Lys 65 70 75 80Glu Ser Tyr Thr Leu Leu Gln
Ala Leu Gln Glu Ser Gly Gln Glu Ser 85 90 95 Ser Ser Glu Asp Gln
Glu Gln Ala 100 28104PRTArtificial
SequenceSOURCE1..104/mol_type="protein" /note="synthesized
gbDfDg33.4" /organism="Artificial Sequence" 28Ser Thr His Glu Leu
Pro Asp Pro Asp Arg His Pro Arg Met Leu Gln 1 5 10 15 Val Ser Asp
Pro Thr Lys Leu Gly Pro Lys Thr Val Phe His Lys Trp 20 25 30 Asp
Trp Gly Arg Gly Met Leu Ser Lys Arg Ser Ile Lys Arg Val Gln 35 40
45 Glu Asp Ser Thr Asp Asp Glu Tyr Val Ala Gly Pro Leu Pro Arg Lys
50 55 60 Arg Asn Lys Phe Asp Thr Arg Val Gln Gly Pro Pro Thr Pro
Glu Lys 65 70 75 80Glu Ser Tyr Thr Leu Leu Gln Ala Leu Gln Glu Ser
Gly Gln Glu Ser 85 90 95 Ser Ser Glu Asp Gln Glu Gln Ala 100
2954PRTArtificial SequenceSOURCE1..54/mol_type="protein"
/note="synthesized gbDhDi33.3" /organism="Artificial Sequence"
29Trp Cys Ser Glu Lys Ser Ser Lys Leu Asp Thr Thr Lys Ser Lys Cys 1
5 10 15 Ile Leu Arg Asp Phe Pro Leu Trp Ala Met Ala Tyr Gly Tyr Cys
Asp 20 25 30 Trp Val Val Lys Cys Thr Gly Val Ser Ser Ala Trp Thr
Asp Met Arg 35 40 45 Ile Ala Ile Ile Cys Pro 50 3038PRTArtificial
SequenceSOURCE1..38/mol_type="protein" /note="synthesized
gbDhDi33.3" /organism="Artificial Sequence" 30Trp Cys Ser Glu Lys
Ser Ser Lys Leu Asp Thr Thr Lys Ser Lys Cys 1 5 10 15 Ile Leu Arg
Asp Phe Pro Leu Trp Ala Met Ala Tyr Gly His Cys Asp 20 25 30 Trp
Val Val Lys Cys Thr 35 31104PRTArtificial
SequenceSOURCE1..104/mol_type="protein" /note="synthesized galanin"
/organism="Artificial Sequence" 31Ala Thr Leu Gly Leu Gly Ser Pro
Val Lys Glu Lys Arg Gly Trp Thr 1 5 10 15 Leu Asn Ser Ala Gly Tyr
Leu Leu Gly Pro His Ala Ile Asp Asn His 20 25 30 Arg Ser Phe Ser
Asp Lys His Gly Leu Thr Gly Lys Arg Glu Leu Glu 35 40 45 Pro Glu
Asp Glu Ala Arg Pro Gly Ser Phe Asp Arg Pro Leu Ser Glu 50 55 60
Ser Asn Ile Val Arg Thr Ile Ile Glu Phe Leu Ser Phe Leu His Leu 65
70 75 80Lys Glu Ala Gly Ala Leu Asp Arg Leu Pro Gly Leu Pro Ala Ala
Ala 85 90 95 Ser Ser Glu Asp Leu Glu Arg Ser 100 3212PRTArtificial
SequenceSOURCE1..12/mol_type="protein" /note="synthesized
opsinrhrrh4_3" /organism="Artificial Sequence" 32Ile Tyr Asn Ser
Phe His Arg Gly Phe Ala Leu Gly 1 5 10 3319PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized
opsinrhrrh4_7" /organism="Artificial Sequence" 33Arg Leu Glu Leu
Gln Lys Arg Leu Pro Trp Leu Glu Leu Asn Glu Lys 1 5 10 15 Ala Val
Glu 3456PRTArtificial SequenceSOURCE1..56/mol_type="protein"
/note="synthesized psinew7" /organism="Artificial Sequence" 34Arg
Cys Ser Gln Tyr Gly Val Thr Ser Cys Ser Glu Cys Leu Leu Ala 1 5 10
15 Arg Asp Pro Tyr Gly Cys Gly Trp Cys Ser Ser Glu Gly Arg Cys Thr
20 25 30 Arg Gly Glu Arg Cys Asp Glu Arg Arg Gly Ser Arg Gln Asn
Trp Ser 35 40 45 Ser Gly Pro Ser Ser Gln Cys Pro 50 55
3554PRTArtificial SequenceSOURCE1..54/mol_type="protein"
/note="synthesized gbDhDi33.3" /organism="Artificial Sequence"
35Trp Cys Ser Glu Lys Ser Ser Lys Leu Asp Thr Thr Lys Ser Lys Cys 1
5 10 15 Ile Leu Arg Asp Phe Pro Leu Trp Ala Met Ala Tyr Gly Tyr Cys
Asp 20 25 30 Trp Val Val Lys Cys Thr Gly Val Ser Ser Ala Trp Thr
Asp Met Arg 35 40 45 Ile Ala Ile Ile Cys Pro 50 3682PRTArtificial
SequenceSOURCE1..82/mol_type="protein" /note="synthesized
gbDhdi33.3" /organism="Artificial Sequence" 36Arg Cys Ser Gln Tyr
Gly Val Thr Ser Cys Ser Glu Cys Leu Leu Ala 1 5 10 15 Arg Asp Pro
Tyr Gly Cys Gly Trp Cys Ser Ser Glu Gly Arg Cys Thr 20 25 30 Arg
Gly Glu Arg Cys Asp Glu Arg Arg Gly Ser Arg Trp Cys Ser Glu 35 40
45 Lys Ser Ser Lys Leu Asp Thr Thr Lys Ser Lys Cys Ile Leu Arg Asp
50 55 60 Phe Pro Leu Trp Ala Met Ala Tyr Gly His Cys Asp Trp Val
Val Lys 65 70 75 80Cys Thr 3717PRTArtificial
SequenceSOURCE1..17/mol_type="protein" /note="synthesized
gastrin_8" /organism="Artificial Sequence" 37Val Ala Gly Glu Asp
Ser Asp Gly Cys Tyr Val Gln Leu Pro Arg Ser 1 5 10 15 Arg
3819PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized gbDhDi33.3" /organism="Artificial Sequence"
38Val Gln Gly Asp Ser Thr Asp Gly Glu Tyr Val Ala Gly Pro Leu Pro 1
5 10 15 Arg Lys Arg 3917PRTArtificial
SequenceSOURCE1..17/mol_type="protein" /note="synthesized
gasr_rabit" /organism="Artificial Sequence" 39Leu Ala Gly Glu Asp
Gly Asp Gly Cys Tyr Val Gln Leu Pro Arg Ser 1 5 10 15 Arg
4017PRTArtificial SequenceSOURCE1..17/mol_type="protein"
/note="synthesized gasr_prana" /organism="Artificial Sequence"
40Val Ala Gly Glu Asp Asn Asp Gly Cys Tyr Val Gln Leu Pro Arg Ser 1
5 10 15 Arg 4117PRTHomo sapiensSOURCE1..17/mol_type="protein"
/note="gasr_human" /organism="homo sapiens" 41Ala Val Gly Glu Asp
Ser Asp Gly Cys Tyr Val Gln Leu Pro Arg Ser 1 5 10 15 Arg
4217PRTMus musculusSOURCE1..17/mol_type="protein"
/note="gasr_mouse" /organism="mus musculus" 42Leu Thr Gly Glu Asp
Ser Asp Gly Cys Tyr Val Gln Leu Pro Arg Ser 1 5 10 15 Arg
4317PRTRattusSOURCE1..17/mol_type="protein" /note="gasr_rat"
/organism="rattus" 43Val Ala Gly Glu Asp Ser Asp Gly Cys Cys Val
Gln Leu Pro Arg Ser 1 5 10 15 Arg 44288PRTArtificial
SequenceSOURCE1..288/mol_type="protein" /note="synthesized
rheu.ef.24" /organism="Artificial Sequence" 44Thr Leu Arg Ile Leu
Tyr Asp Glu Phe Thr Arg Phe Met Asn Phe Trp 1 5 10 15 Thr Val Ser
Asn Glu Asp Leu Asp Leu Cys Arg Tyr Val Gly Cys Lys 20
25 30 Leu Ile Phe Phe Lys His Pro Thr Val Asp Phe Ile Val Gln Ile
Asn 35 40 45 Thr Gln Pro Pro Phe Leu Asp Thr His Leu Thr Ala Ala
Ser Ile His 50 55 60 Pro Gly Ile Met Met Leu Ser Lys Arg Arg Ile
Leu Ile Pro Ser Leu 65 70 75 80Lys Thr Arg Pro Ser Arg Lys His Arg
Val Val Val Arg Val Gly Ala 85 90 95 Pro Arg Leu Phe Gln Asp Lys
Trp Tyr Pro Gln Ser Asp Leu Cys Asp 100 105 110 Thr Val Leu Leu Ser
Ile Phe Ala Thr Ala Cys Asp Leu Gln Tyr Pro 115 120 125 Phe Gly Ser
Pro Leu Thr Glu Asn Pro Cys Val Asn Phe Gln Ile Leu 130 135 140 Gly
Pro His Tyr Lys Lys His Leu Ser Ile Ser Ser Thr Asn Asp Glu 145 150
155 160Thr Asn Lys Thr His Tyr Glu Ser Asn Leu Phe Asn Lys Thr Glu
Leu 165 170 175 Tyr Asn Thr Phe Gln Thr Ile Ala Gln Leu Lys Glu Thr
Gly Arg Thr 180 185 190 Ser Gly Val Asn Pro Asn Trp Thr Ser Val Gln
Asn Thr Thr Pro Leu 195 200 205 Asn Gln Ala Gly Asn Asn Ala Gln Asn
Ser Arg Asp Thr Trp Tyr Lys 210 215 220 Gly Asn Thr Tyr Asn Asp Asn
Ile Ser Lys Leu Ala Glu Ile Thr Arg 225 230 235 240Gln Arg Phe Lys
Ser Ala Thr Ile Ser Ala Leu Pro Asn Tyr Pro Thr 245 250 255 Ile Met
Ser Thr Asp Leu Tyr Glu Tyr His Ser Gly Ile Tyr Ser Ser 260 265 270
Ile Phe Leu Ser Ala Gly Arg Ser Tyr Phe Glu Thr Thr Gly Ala Tyr 275
280 285 45318PRTArtificial SequenceSOURCE1..318/mol_type="protein"
/note="synthesized peptidase_m9" /organism="Artificial Sequence"
45Met Ser Arg Leu Ala Glu Leu Tyr Leu Leu Gly Asp Ser Ile Lys Gly 1
5 10 15 Arg His Asp Asn Leu Trp Leu Ala Ala Ala Glu Met Leu Ser Tyr
Tyr 20 25 30 Ala Pro Glu Gly Lys Ser Glu Leu Gly Ile Asp Ile Cys
Gln Ala Lys 35 40 45 Leu Glu Leu Ala Ala Lys Val Leu Pro Tyr Leu
Tyr Glu Cys Ser Gly 50 55 60 Pro Ala Ala Ile Arg Ser Gln Asp Leu
Thr Asp Gly Gln Ala Ala Ser 65 70 75 80Ala Cys Asp Ile Leu Arg Asn
Lys Glu Lys Asp Phe His Gln Val Lys 85 90 95 Tyr Thr Gly Lys Thr
Pro Val Ala Asp Asp Gly Asn Thr Arg Val Glu 100 105 110 Val Gly Val
Phe Val Ser Glu Glu Asp Tyr Lys Arg Tyr Ser Ala Phe 115 120 125 Ala
Ser Lys Glu Val Lys Ala Gln Phe Gly Arg Val Thr Asp Asn Gly 130 135
140 Gly Met Tyr Leu Glu Gly Asn Pro Ser Asp Ala Gly Asn Gln Val Arg
145 150 155 160Phe Ile Ala Tyr Glu Glu Ala Lys Leu Asn Ala Asp Leu
Ser Ile Gly 165 170 175 Asn Leu Glu His Glu Tyr Thr His Tyr Leu Asp
Gly Arg Phe Asp Thr 180 185 190 Tyr Gly Thr Phe Ser Arg Asn Leu Glu
Glu Ser His Ile Val Trp Trp 195 200 205 Glu Glu Gly Phe Ala Glu Tyr
Val His Tyr Lys Gln Gly Gly Val Pro 210 215 220 Tyr Gln Ala Ala Pro
Glu Leu Ile Gly Gln Gly Ser Lys Leu Tyr Leu 225 230 235 240Ser Asp
Val Phe Thr Thr Thr Glu Glu Gly Tyr Ala Glu Leu Phe Ala 245 250 255
Gly Ser His Asp Thr Asp Arg Ile Tyr Arg Trp Gly Tyr Leu Ala Val 260
265 270 Arg Phe Met Leu Glu Thr Asn His Asn Arg Asp Val Glu Ser Leu
Leu 275 280 285 Val His Ser Arg Tyr Gly Asn Ser Phe Ala Phe Tyr Ala
Tyr Leu Val 290 295 300 Lys Leu Leu Gly Tyr Met Tyr Asn Asn Glu Phe
Gly Ile Trp 305 310 315 4643PRTArtificial
SequenceSOURCE1..43/mol_type="protein" /note="synthesized caeel143"
/organism="Artificial Sequence" 46Gly Ala Pro Gly Pro Pro Gly Leu
Pro Gly Pro Lys Gly Pro Arg Gly 1 5 10 15 Pro Ala Gly Ile Glu Gly
Lys Pro Gly Arg Leu Gly Glu Asp Asn Arg 20 25 30 Pro Gly Pro Pro
Gly Pro Pro Gly Val Arg Gly 35 40 4743PRTArtificial
SequenceSOURCE1..43/mol_type="protein" /note="synthesized
gbdhdi33.55ikn.2.1" /organism="Artificial Sequence" 47Gly Pro Pro
Arg Pro Pro Pro Gly Leu Asp Gln Leu Asn Pro Glu Gly 1 5 10 15 Pro
Ala Gly Pro Gly Gly Pro Pro Ala Ile Leu Pro Ala Leu Pro Ala 20 25
30 Pro Ala Asp Pro Glu Pro Ala Pro Arg Arg Gly 35 40
4843PRTArtificial SequenceSOURCE1..43/mol_type="protein"
/note="synthesized steap" /organism="Artificial Sequence" 48Gly Lys
Pro Ala Glu Pro Gly Lys Pro Ala Glu Pro Gly Lys Pro Ala 1 5 10 15
Glu Pro Gly Thr Pro Ala Glu Pro Gly Lys Pro Ala Glu Pro Gly Thr 20
25 30 Pro Ala Glu Pro Gly Lys Pro Ala Glu Pro Gly 35 40
4943PRTArtificial SequenceSOURCE1..43/mol_type="protein"
/note="synthesized rheu.ef.241.148" /organism="Artificial Sequence"
49His Leu Ala Thr Thr Leu Gly Arg Pro Pro Arg Pro Gly Pro Pro Gly 1
5 10 15 Gly Pro Arg Thr Pro Gln Ile Arg Asn Leu Pro Ala Leu Pro Ala
Pro 20 25 30 Gln Gly Glu Pro Gly Asp Arg Ala Thr Trp Arg 35 40
5043PRTArtificial SequenceSOURCE1..43/mol_type="protein"
/note="synthesized rheu.ef.238rev.148" /organism="Artificial
Sequence" 50His Leu Ala Thr Thr Leu Gly Arg Pro Pro Arg Pro Gly Pro
Pro Gly 1 5 10 15 Gly Pro Arg Thr Pro Gln Ile Arg Asn Leu Pro Ala
Leu Pro Ala Pro 20 25 30 Gln Gly Glu Pro Gly Asp Arg Ala Thr Trp
Arg 35 40 5160PRTArtificial SequenceSOURCE1..60/mol_type="protein"
/note="synthesized collagen" /organism="Artificial Sequence" 51Gly
Pro Pro Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly 1 5 10
15 Pro Pro Gly Pro Pro Gly Pro Ala Gly Ala Pro Gly Pro Pro Gly Pro
20 25 30 Pro Gly Glu Pro Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly
Pro Pro 35 40 45 Gly Pro Pro Gly Ala Pro Gly Ala Pro Gly Pro Pro 50
55 605261PRTArtificial SequenceSOURCE1..61/mol_type="protein"
/note="synthesized rheu.ef.24" /organism="Artificial Sequence"
52Gly Arg Pro Pro Arg Pro Gly Pro Pro Gly Gly Pro Arg Thr Pro Gln 1
5 10 15 Ile Arg Asn Leu Pro Ala Leu Pro Ala Pro Gln Gly Glu Pro Gly
Asp 20 25 30 Arg Ala Thr Trp Arg Gly Ala Ser Gly Ala Asp Ala Ala
Gly Gly Asp 35 40 45 Gly Gly Glu Arg Gly Ala Asp Gly Gly Asp Pro
Gly Asp 50 55 60 5362PRTArtificial
SequenceSOURCE1..62/mol_type="protein" /note="synthesized mssp"
/organism="Artificial Sequence" 53Val Gly Gly Pro Cys Gly Pro Cys
Gly Pro Cys Gly Gly Pro Cys Cys 1 5 10 15 Gly Ser Cys Cys Ser Pro
Cys Gly Gly Pro Cys Gly Pro Cys Gly Pro 20 25 30 Cys Gly Pro Cys
Gly Pro Cys Cys Gly Gly Cys Gly Pro Cys Gly Pro 35 40 45 Cys Gly
Pro Cys Cys Gly Thr Thr Glu Lys Tyr Cys Gly Leu 50 55 60
5458PRTArtificial SequenceSOURCE1..58/mol_type="protein"
/note="synthesized gbDhdi33.3" /organism="Artificial Sequence"
54Gln Leu Asn Pro Glu Gly Pro Ala Gly Pro Gly Gly Pro Pro Ala Ile 1
5 10 15 Leu Pro Ala Leu Pro Ala Pro Ala Asp Pro Glu Pro Ala Pro Arg
Cys 20 25 30 Gly Gly Arg Ala Asp Gly Gly Ala Ala Ala Gly Ala Ala
Ala Asp Ala 35 40 45 Asp His Thr Gly Tyr Glu Glu Gly Asp Leu 50 55
5519PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized micollptase_1" /organism="Artificial Sequence"
55Gly Leu Glu Thr Leu Val Glu Phe Leu Arg Ala Gly Tyr Tyr Val Arg 1
5 10 15 Phe Tyr Asn 5618PRTArtificial
SequenceSOURCE1..18/mol_type="protein" /note="synthesized
gbDhDi43.4" /organism="Artificial Sequence" 56Thr Leu Glu Asn Ile
Leu Tyr Thr Arg Ala Ser Tyr Trp Asn Ser Phe 1 5 10 15 His Ala
5719PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized cola_clope" /organism="Artificial Sequence"
57Gly Ile Pro Thr Leu Val Glu Phe Leu Arg Ala Gly Tyr Tyr Leu Gly 1
5 10 15 Phe Tyr Asn 5819PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized
cola_vibal" /organism="Artificial Sequence" 58Glu Leu Glu Thr Leu
Phe Leu Tyr Leu Arg Ala Gly Tyr Tyr Ala Glu 1 5 10 15 Phe Tyr Asn
5919PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized cola_vibpa" /organism="Artificial Sequence"
59Val Leu Glu Asn Leu Gly Glu Phe Val Arg Ala Ala Tyr Tyr Val Arg 1
5 10 15 Tyr Asn Ala 6019PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized af080248"
/organism="Artificial Sequence" 60Arg Leu Glu Asn Tyr Gly Glu Phe
Ile Arg Ala Ala Tyr Tyr Val Arg 1 5 10 15 Tyr Asn Ala
6116PRTArtificial SequenceSOURCE1..16/mol_type="protein"
/note="synthesized mic1micrneme_5" /organism="Artificial Sequence"
61Thr Tyr Ile Ser Thr Lys Leu Asp Val Ala Val Gly Ser Cys His Lys 1
5 10 15 6216PRTArtificial SequenceSOURCE1..16/mol_type="protein"
/note="synthesized rheu.ef.24" /organism="Artificial Sequence"
62Thr Lys Ala Asp Thr Gln Leu Ile Val Ala Gly Gly Ser Cys Lys Ala 1
5 10 15 6316PRTArtificial SequenceSOURCE1..16/mol_type="protein"
/note="synthesized o00834" /organism="Artificial Sequence" 63Thr
Phe Ile Ser Thr Lys Leu Asp Val Ala Val Gly Ser Cys His Ser 1 5 10
15 6416PRTArtificial SequenceSOURCE1..16/mol_type="protein"
/note="synthesized q8wrs0" /organism="Artificial Sequence" 64Thr
Tyr Ser Ser Pro Gln Leu His Val Ser Val Gly Ser Cys His Lys 1 5 10
15 6515PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized airegulator_4" /organism="Artificial Sequence"
65Asp Phe Trp Arg Val Leu Phe Lys Asp Tyr Asn Leu Glu Arg Tyr 1 5
10 156615PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized rheu.ef.24" /organism="Artificial Sequence"
66Asn Phe Trp Thr Val Ser Asn Glu Asp Leu Asp Leu Cys Arg Tyr 1 5
10 156715PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized rheu.ef.23" /organism="Artificial Sequence"
67Asn Phe Trp Thr Val Ser Asn Glu Asp Leu Asp Leu Cys Arg Tyr 1 5
10 156815PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized rheu.cd.21" /organism="Artificial Sequence"
68Asn Phe Trp Thr Val Ser Asn Glu Asp Leu Asp Leu Cys Arg Tyr 1 5
10 156915PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized q9jlw0" /organism="Artificial Sequence" 69Asp
Phe Trp Arg Ile Leu Phe Lys Asp Tyr Asn Leu Glu Arg Tyr 1 5 10
157015PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized aire_human" /organism="Artificial Sequence"
70Asp Phe Trp Arg Val Leu Phe Lys Asp Tyr Asn Leu Glu Arg Tyr 1 5
10 157121PRTArtificial SequenceSOURCE1..21/mol_type="protein"
/note="synthesized gliadin_7" /organism="Artificial Sequence" 71Pro
Gln Ala Gln Gly Ser Val Gln Pro Gln Gln Leu Pro Gln Phe Glu 1 5 10
15 Glu Ile Arg Asn Leu 20 7221PRTArtificial
SequenceSOURCE1..21/mol_type="protein" /note="synthesized
rheu.ef.24" /organism="Artificial Sequence" 72Thr Gln Ala Gln Gly
Ser Val Gln Glu Gln Leu Leu Leu Gln Leu Arg 1 5 10 15 Glu Gln Arg
Val Leu 20 7321PRTArtificial SequenceSOURCE1..21/mol_type="protein"
/note="synthesized rheu.cd.21" /organism="Artificial Sequence"
73Thr Gln Ala Gln Gly Ser Val Gln Asp Gln Leu Leu Leu Gln Leu Arg 1
5 10 15 Glu Gln Arg Val Leu 20 7421PRTArtificial
SequenceSOURCE1..21/mol_type="protein" /note="synthesized
gda9_wheat" /organism="Artificial Sequence" 74Pro Gln Ala Gln Gly
Ser Val Gln Pro Gln Gln Leu Pro Gln Phe Glu 1 5 10 15 Glu Ile Arg
Asn Leu 20 7521PRTArtificial SequenceSOURCE1..21/mol_type="protein"
/note="synthesized gda7_wheat" /organism="Artificial Sequence"
75Pro Gln Ala Gln Gly Ser Val Gln Pro Gln Gln Leu Pro Gln Phe Ala 1
5 10 15 Glu Ile Arg Asn Leu 20 7621PRTArtificial
SequenceSOURCE1..21/mol_type="protein" /note="synthesized
gda2_wheat" /organism="Artificial Sequence" 76Pro Gln Ala Gln Gly
Ser Phe Gln Pro Gln Gln Leu Pro Gln Phe Glu 1 5 10 15 Glu Ile Arg
Asn Leu 20 7721PRTArtificial SequenceSOURCE1..21/mol_type="protein"
/note="synthesized gda3_wheat" /organism="Artificial Sequence"
77Pro Gln Ala Gln Gly Ser Val Gln Pro Gln Gln Leu Pro Gln Phe Gln 1
5 10 15 Glu Ile Arg Asn Leu 20 7817PRTArtificial
SequenceSOURCE1..17/mol_type="protein" /note="synthesized
nrpeptidey2r_9" /organism="Artificial Sequence" 78Ala Phe Leu Ser
Ala Phe Arg Cys Glu Gln Arg Leu Asp Ala Ile His 1 5 10 15 Ser
7914PRTArtificial SequenceSOURCE1..14/mol_type="protein"
/note="synthesized rheu.ef.24" /organism="Artificial Sequence"
79Ser Ala Phe Arg Val Gln Gln Arg Val Pro Trp Val His Ser 1 5 10
8014PRTArtificial SequenceSOURCE1..14/mol_type="protein"
/note="synthesized zc3r11.B4" /organism="Artificial Sequence" 80Ser
Arg Phe Arg Val Gln Gln Arg Leu Pro Trp Val His Ser 1 5 10
8117PRTArtificial SequenceSOURCE1..17/mol_type="protein"
/note="synthesized ny2r" /organism="Artificial Sequence" 81Ala Phe
Leu Ser Ala Phe Arg Cys Glu Gln Arg Leu Asp Ala Ile His 1 5 10 15
Ser 8220PRTArtificial SequenceSOURCE1..20/mol_type="protein"
/note="synthesized aerolysin_7" /organism="Artificial Sequence"
82Trp Asp Lys Arg Tyr Ile Pro Gly Glu Val Lys Trp Trp Asp Trp Asn 1
5 10 15 Trp Thr Ile Gln 208320PRTArtificial
SequenceSOURCE1..20/mol_type="protein" /note="synthesized
rheu.ef.24" /organism="Artificial Sequence" 83Val Asp Pro Lys Tyr
Val Thr Pro Glu Val Thr Trp His Ser Trp Asp 1 5 10 15 Ile Arg Arg
Gly 208420PRTArtificial SequenceSOURCE1..20/mol_type="protein"
/note="synthesized uro742rev" /organism="Artificial Sequence" 84Phe
Ala Trp Val Leu Ala Ser Gly Thr Ala Lys Cys Trp Ser Trp Asn 1 5 10
15 Trp Ser Ala Arg 208520PRTArtificial
SequenceSOURCE1..20/mol_type="protein" /note="synthesized
aera_aerhy" /organism="Artificial Sequence" 85Trp Asp Lys Arg Tyr
Ile Pro Gly Glu Val Lys Trp Trp Asp Trp Asn 1 5 10
15 Trp Thr Ile Gln 208620PRTArtificial
SequenceSOURCE1..20/mol_type="protein" /note="synthesized
aera_aertr" /organism="Artificial Sequence" 86Trp Asp Lys Arg Tyr
Leu Pro Gly Glu Met Lys Trp Trp Asp Trp Asn 1 5 10 15 Trp Ala Ile
Gln 208720PRTArtificial SequenceSOURCE1..20/mol_type="protein"
/note="synthesized aera_aersa" /organism="Artificial Sequence"
87Val Asp Lys Arg Tyr Ile Pro Gly Glu Val Lys Trp Trp Asp Trp Asn 1
5 10 15 Trp Thr Ile Ser 2088131PRTArtificial
SequenceSOURCE1..131/mol_type="protein" /note="synthesized orexin"
/organism="Artificial Sequence" 88Met Asn Leu Pro Ser Ala Lys Val
Ser Trp Ala Ala Val Thr Leu Leu 1 5 10 15 Leu Leu Leu Leu Leu Leu
Pro Pro Ala Leu Leu Ser Leu Gly Val Asp 20 25 30 Ala Gln Pro Leu
Pro Asp Cys Cys Arg Gln Lys Thr Cys Ser Cys Arg 35 40 45 Leu Tyr
Glu Leu Leu His Gly Ala Gly Asn His Ala Ala Gly Ile Leu 50 55 60
Thr Leu Gly Lys Arg Arg Pro Gly Pro Pro Gly Leu Gln Gly Arg Leu 65
70 75 80Gln Arg Leu Leu Gln Ala Ser Gly Asn His Ala Ala Gly Ile Leu
Thr 85 90 95 Met Gly Arg Arg Ala Gly Ala Glu Leu Glu Pro Arg Leu
Cys Pro Gly 100 105 110 Arg Arg Cys Leu Ala Ala Ala Ala Ser Ala Leu
Ala Pro Arg Gly Arg 115 120 125 Ser Arg Val 130 89113PRTArtificial
SequenceSOURCE1..113/mol_type="protein" /note="synthesized
rheu.ef.24" /organism="Artificial Sequence" 89Arg Lys Val Leu Leu
Gln Thr Val Arg Ala Ala Lys Lys Ala Arg Arg 1 5 10 15 Leu Leu Gly
Met Trp Gln Pro Pro Val His Asn Val Pro Gly Ile Glu 20 25 30 Arg
Asn Trp Tyr Glu Ser Cys Phe Arg Ser His Ala Ala Val Cys Gly 35 40
45 Cys Gly Asp Phe Val Gly His Ile Asn His Leu Ala Thr Thr Leu Gly
50 55 60 Arg Pro Pro Arg Pro Gly Pro Pro Gly Gly Pro Arg Thr Pro
Gln Ile 65 70 75 80Arg Asn Leu Pro Ala Leu Pro Ala Pro Gln Gly Glu
Pro Gly Asp Arg 85 90 95 Ala Thr Trp Arg Gly Ala Ser Gly Ala Asp
Ala Ala Gly Gly Asp Gly 100 105 110 Gly 90131PRTHomo
sapiensSOURCE1..131/mol_type="protein" /note="orex" /organism="Homo
sapiens" 90Met Asn Leu Pro Ser Thr Lys Val Ser Trp Ala Ala Val Thr
Leu Leu 1 5 10 15 Leu Leu Leu Leu Leu Leu Pro Pro Ala Leu Leu Ser
Ser Gly Ala Ala 20 25 30 Ala Gln Pro Leu Pro Asp Cys Cys Arg Gln
Lys Thr Cys Ser Cys Arg 35 40 45 Leu Tyr Glu Leu Leu His Gly Ala
Gly Asn His Ala Ala Gly Ile Leu 50 55 60 Thr Leu Gly Lys Arg Arg
Ser Gly Pro Pro Gly Leu Gln Gly Arg Leu 65 70 75 80Gln Arg Leu Leu
Gln Ala Ser Gly Asn His Ala Ala Gly Ile Leu Thr 85 90 95 Met Gly
Arg Arg Ala Gly Ala Glu Pro Ala Pro Arg Pro Cys Leu Gly 100 105 110
Arg Arg Cys Ser Ala Pro Ala Ala Ala Ser Val Ala Pro Gly Gly Gln 115
120 125 Ser Gly Ile 130 9121PRTArtificial
SequenceSOURCE1..21/mol_type="protein" /note="synthesized
gipreceptor_7" /organism="Artificial Sequence" 91Pro Arg Leu Gly
Pro Tyr Leu Gly Asp Gln Thr Leu Thr Leu Trp Asn 1 5 10 15 Gln Ala
Leu Ala Ala 20 9222PRTArtificial
SequenceSOURCE1..22/mol_type="protein" /note="synthesized
rheu.ef.24" /organism="Artificial Sequence" 92Pro Arg Pro Gly Pro
Pro Gly Gly Pro Arg Thr Pro Gln Ile Arg Asn 1 5 10 15 Leu Pro Ala
Leu Pro Ala 20 9321PRTArtificial
SequenceSOURCE1..21/mol_type="protein" /note="synthesized
gipr_mesau" /organism="Artificial Sequence" 93Pro Thr Leu Gly Pro
Tyr Pro Gly Asp Arg Thr Leu Thr Leu Arg Asn 1 5 10 15 Gln Ala Leu
Ala Ala 20 9421PRTRattusSOURCE1..21/mol_type="protein"
/note="gipr_rat" /organism="Rattus" 94Pro Pro Leu Gly Pro Tyr Thr
Gly Asn Gln Thr Pro Thr Leu Trp Asn 1 5 10 15 Gln Ala Leu Ala Ala
20 9521PRTHomo sapiensSOURCE1..21/mol_type="protein"
/note="gipr_hu" /organism="Homo sapiens" 95Pro Arg Pro Gly Pro Tyr
Leu Gly Asp Gln Ala Leu Ala Leu Trp Asn 1 5 10 15 Gln Ala Leu Ala
Ala 20 9622PRTArtificial SequenceSOURCE1..22/mol_type="protein"
/note="synthesized prion_2" /organism="Artificial Sequence" 96Ser
Asn Gly Gly Gly Ser Arg Tyr Pro Gly Gln Gly Ser Pro Gly Gly 1 5 10
15 Asn Arg Tyr Pro Pro Gln 20 9722PRTArtificial
SequenceSOURCE1..22/mol_type="protein" /note="synthesized
rheu.ef.24" /organism="Artificial Sequence" 97Leu Ala Thr Thr Leu
Gly Arg Pro Pro Arg Pro Gly Pro Pro Gly Gly 1 5 10 15 Pro Arg Thr
Pro Gln Ile 20 9822PRTArtificial
SequenceSOURCE1..22/mol_type="protein" /note="synthesized
prio_colgu" /organism="Artificial Sequence" 98Trp Asn Thr Gly Gly
Ser Arg Tyr Pro Gly Gln Gly Ser Pro Gly Gly 1 5 10 15 Asn Arg Tyr
Pro Pro Gln 20 9922PRTArtificial
SequenceSOURCE1..22/mol_type="protein" /note="synthesized
prio_cebap" /organism="Artificial Sequence" 99Trp Asn Thr Gly Gly
Ser Arg Tyr Pro Gly Gln Gly Ser Pro Gly Gly 1 5 10 15 Asn Leu Tyr
Pro Pro Gln 20 10022PRTArtificial
SequenceSOURCE1..22/mol_type="protein" /note="synthesized
prp1_trast" /organism="Artificial Sequence" 100Trp Asn Thr Gly Gly
Ser Arg Tyr Pro Gly Gln Gly Ser Pro Gly Gly 1 5 10 15 Asn Arg Tyr
Pro Ser Gln 20 10122PRTArtificial
SequenceSOURCE1..22/mol_type="protein" /note="synthesized
prio_rabit" /organism="Artificial Sequence" 101Trp Asn Thr Gly Gly
Ser Arg Tyr Pro Gly Gln Ser Ser Pro Gly Gly 1 5 10 15 Asn Arg Tyr
Pro Pro Gln 20 10222PRTArtificial
SequenceSOURCE1..22/mol_type="protein" /note="synthesized o46593"
/organism="Artificial Sequence" 102Asn Thr Gly Gly Gly Ser Arg Tyr
Pro Gly Gln Gly Ser Pro Gly Gly 1 5 10 15 Asn Arg Tyr Pro Pro Gln
20 10322PRTArtificial SequenceSOURCE1..22/mol_type="protein"
/note="synthesized trivu" /organism="Artificial Sequence" 103Ser
Gly Gly Ser Asn Arg Tyr Pro Gly Gln Pro Gly Ser Pro Gly Gly 1 5 10
15 Asn Arg Tyr Pro Gly Trp 20 10413PRTArtificial
SequenceSOURCE1..13/mol_type="protein" /note="synthesized
neurotensn2r_1" /organism="Artificial Sequence" 104Met Glu Thr Ser
Ser Pro Trp Pro Pro Arg Pro Ser Pro 1 5 10 10513PRTArtificial
SequenceSOURCE1..13/mol_type="protein" /note="synthesized
rheu.ef.24" /organism="Artificial Sequence" 105Leu Ala Thr Thr Leu
Gly Arg Pro Pro Arg Pro Gly Pro 1 5 10 10613PRTArtificial
SequenceSOURCE1..13/mol_type="protein" /note="synthesized ntr2
motif" /organism="Artificial Sequence" 106Met Glu Thr Ser Ser Pro
Trp Pro Pro Arg Pro Ser Pro 1 5 10 10713PRTArtificial
SequenceSOURCE1..13/mol_type="protein" /note="synthesized ntr2
motif" /organism="Artificial Sequence" 107Met Glu Thr Ser Ser Leu
Trp Pro Pro Arg Pro Ser Pro 1 5 10 10813PRTArtificial
SequenceSOURCE1..13/mol_type="protein" /note="synthesized ntr2
motif" /organism="Artificial Sequence" 108Met Glu Thr Ser Ser Pro
Arg Pro Pro Arg Pro Ser Ser 1 5 10 10917PRTArtificial
SequenceSOURCE1..17/mol_type="protein" /note="synthesized
nuclearecptr_5" /organism="Artificial Sequence" 109Pro Val Asn Leu
Leu Asn Ala Leu Val Arg Ala His Val Asp Ser Thr 1 5 10 15 Pro
11016PRTArtificial SequenceSOURCE1..16/mol_type="protein"
/note="synthesized ur742rev" /organism="Artificial Sequence" 110Thr
Phe Ile Thr Asn Ser Met Val Arg Ala His Ile Asp Ala Asp Lys 1 5 10
15 11117PRTArtificial SequenceSOURCE1..17/mol_type="protein"
/note="synthesized nr41 motif" /organism="Artificial Sequence"
111Pro Ala Asn Leu Leu Thr Ser Leu Val Arg Ala His Leu Asp Ser Gly
1 5 10 15 Pro 11217PRTArtificial
SequenceSOURCE1..17/mol_type="protein" /note="synthesized nr42
motif" /organism="Artificial Sequence" 112Pro Val Ser Leu Ile Ser
Ala Leu Val Arg Ala His Val Asp Ser Asn 1 5 10 15 Pro
11317PRTArtificial SequenceSOURCE1..17/mol_type="protein"
/note="synthesized nr41 motif" /organism="Artificial Sequence"
113Pro Thr Asn Leu Leu Thr Ser Leu Ile Arg Ala His Leu Asp Ser Gly
1 5 10 15 Pro 11417PRTArtificial
SequenceSOURCE1..17/mol_type="protein" /note="synthesized nr42
motif" /organism="Artificial Sequence" 114Pro Val Asp Leu Ile Asn
Ser Leu Val Arg Ala His Ile Asp Ser Ile 1 5 10 15 Pro
11517PRTArtificial SequenceSOURCE1..17/mol_type="protein"
/note="synthesized o97726" /organism="Artificial Sequence" 115Pro
Val Cys Met Met Asn Ala Leu Val Arg Ala Leu Thr Asp Ser Thr 1 5 10
15 Pro 11617PRTArtificial SequenceSOURCE1..17/mol_type="protein"
/note="synthesized nr43 motif" /organism="Artificial Sequence"
116Pro Ile Cys Met Met Asn Ala Leu Val Arg Ala Leu Thr Asp Ser Thr
1 5 10 15 Pro 11717PRTArtificial
SequenceSOURCE1..17/mol_type="protein" /note="synthesized nr43
motif" /organism="Artificial Sequence" 117Pro Ile Cys Met Met Asn
Ala Leu Val Arg Ala Leu Thr Asp Ala Thr 1 5 10 15 Pro
11817PRTArtificial SequenceSOURCE1..17/mol_type="protein"
/note="synthesized bdnfactor_3" /organism="Artificial Sequence"
118Pro Leu Leu Phe Leu Leu Glu Glu Tyr Lys Asn Tyr Leu Asp Ala Ala
1 5 10 15 Asn 11917PRTArtificial
SequenceSOURCE1..17/mol_type="protein" /note="synthesized
uro742rev" /organism="Artificial Sequence" 119Pro Leu Trp Ala Leu
Leu Asn Gly Tyr Val Asp Tyr Leu Glu Thr Gln 1 5 10 15 Ile
12017PRTArtificial SequenceSOURCE1..17/mol_type="protein"
/note="synthesized uro742rev" /organism="Artificial Sequence"
120Pro Leu Leu Phe Leu Pro Ser Glu Tyr Gln Arg Glu Asp Gly Ala Ala
1 5 10 15 Glu 12118PRTArtificial
SequenceSOURCE1..18/mol_type="protein" /note="synthesized
calcitonin" /organism="Artificial Sequence" 121Lys Cys Tyr Asp Arg
Met Gln Gln Leu Pro Pro Tyr Glu Gly Glu Gly 1 5 10 15 Pro Tyr
12218PRTArtificial SequenceSOURCE1..18/mol_type="protein"
/note="synthesized uro742rev" /organism="Artificial Sequence"
122Thr Pro Val Arg Arg Leu Leu Pro Leu Pro Ser Tyr Pro Gly Glu Gly
1 5 10 15 Pro Gln 12318PRTArtificial
SequenceSOURCE1..18/mol_type="protein" /note="synthesized calr
motif" /organism="Artificial Sequence" 123Lys Cys Tyr Asp Arg Ile
Gln Gln Leu Pro Pro Tyr Glu Gly Glu Gly 1 5 10 15 Pro Tyr
12418PRTArtificial SequenceSOURCE1..18/mol_type="protein"
/note="synthesized calr motif" /organism="Artificial Sequence"
124Lys Cys Tyr Asp Arg Met Glu Gln Leu Pro Pro Tyr Gln Gly Glu Gly
1 5 10 15 Pro Tyr 12518PRTArtificial
SequenceSOURCE1..18/mol_type="protein" /note="synthesized calr
motif" /organism="Artificial Sequence" 125Lys Cys Tyr Asp Arg Met
Gln Gln Leu Pro Ala Tyr Gln Gly Glu Gly 1 5 10 15 Pro Tyr
12618PRTArtificial SequenceSOURCE1..18/mol_type="protein"
/note="synthesized calr motif" /organism="Artificial Sequence"
126Lys Cys Tyr Asp Arg Ile His Gln Leu Pro Ser Tyr Glu Gly Glu Gly
1 5 10 15 Leu Tyr 12718PRTArtificial
SequenceSOURCE1..18/mol_type="protein" /note="synthesized calr
motif" /organism="Artificial Sequence" 127Arg Cys Tyr Asp Arg Met
Gln Gln Leu Pro Pro Tyr Glu Gly Glu Gly 1 5 10 15 Pro Tyr
12818PRTArtificial SequenceSOURCE1..18/mol_type="protein"
/note="synthesized calr_motif" /organism="Artificial Sequence"
128Arg Cys Tyr Asp Arg Met Gln Lys Leu Pro Pro Tyr Gln Gly Glu Gly
1 5 10 15 Leu Tyr 12917PRTArtificial
SequenceSOURCE1..17/mol_type="protein" /note="synthesized cavpo207"
/organism="Artificial Sequence" 129Ser Arg Arg Leu Arg Val Arg Arg
Phe His Arg Arg Arg Arg Thr Gly 1 5 10 15 Arg 13017PRTArtificial
SequenceSOURCE1..17/mol_type="protein" /note="synthesized
uro705rev" /organism="Artificial Sequence" 130Leu Arg Arg Arg Arg
Pro Arg Arg Pro Leu Arg Arg Arg Arg Arg Gly 1 5 10 15 Arg
13117PRTArtificial SequenceSOURCE1..17/mol_type="protein"
/note="synthesized lt4r1" /organism="Artificial Sequence" 131Gly
Arg Arg Leu Gln Ala Arg Arg Phe Arg Arg Ser Arg Arg Thr Gly 1 5 10
15 Arg 13217PRTArtificial SequenceSOURCE1..17/mol_type="protein"
/note="synthesized rheu.cd.215rev.1.7" /organism="Artificial
Sequence" 132Arg Arg Arg Arg Pro Ala Arg Arg Phe Arg Ala Arg Arg
Arg Val Arg 1 5 10 15 Arg 13317PRTArtificial
SequenceSOURCE1..17/mol_type="protein" /note="synthesized
zpr5.b4.12dk.209_2" /organism="Artificial Sequence" 133Arg Arg Arg
Pro Arg Arg Arg Arg Val Arg Arg Arg Arg Arg Trp Arg 1 5 10 15 Arg
13442PRTArtificial SequenceSOURCE1..42/mol_type="protein"
/note="synthesized auto_anti-p27" /organism="Artificial Sequence"
134Glu Ile Ser Lys Lys Met Ala Glu Leu Leu Leu Lys Gly Ala Thr Met
1 5 10 15 Leu Asp Glu His Cys Pro Lys Cys Gly Thr Pro Leu Phe Arg
Leu Lys 20 25 30 Asp Gly Lys Val Phe Cys Pro Ile Cys Glu 35 40
13540PRTArtificial SequenceSOURCE1..40/mol_type="protein"
/note="synthesized rheu.cd.21" /organism="Artificial Sequence"
135His Thr Ala Val Lys Gly Gln Phe Gly Leu Gly Thr Gly Arg Ala Leu
1 5 10 15 Gly Lys Ala Leu Lys Lys Cys Ala Phe Ala Gly Leu Arg Arg
Lys Gly 20 25 30 Lys Cys Phe Cys Lys Val Cys Glu 35
4013620PRTArtificial SequenceSOURCE1..20/mol_type="protein"
/note="synthesized vasoprsnv2r_6" /organism="Artificial Sequence"
136Arg Ala Gly Gly Arg Arg Arg Gly Arg Arg Thr Gly Ser Pro Ser Glu
1 5 10 15 Gly Ala Arg Val 2013720PRTArtificial
SequenceSOURCE1..20/mol_type="protein" /note="synthesized
uro742rp.1" /organism="Artificial Sequence" 137Arg Asn Ala Ser Arg
Arg Arg Gly Ser Ser Thr Ala Ser Thr Ser Glu 1 5 10 15 Glu Ala Ser Leu
2013820PRTArtificial SequenceSOURCE1..20/mol_type="protein"
/note="synthesized vasoprsnvibr_4" /organism="Artificial Sequence"
138Thr Gln Ala Gly Arg Val Glu Arg Arg Gly Trp Arg Thr Trp Asp Lys
1 5 10 15 Ser Ser Ser Ser 2013920PRTArtificial
SequenceSOURCE1..20/mol_type="protein" /note="synthesized
zc35s.b2.9" /organism="Artificial Sequence" 139Ala Gln Asp Trp Ala
Glu Glu Tyr Thr Ala Cys Arg Tyr Trp Asp Arg 1 5 10 15 Pro Pro Arg
Thr 2014020PRTArtificial SequenceSOURCE1..20/mol_type="protein"
/note="synthesized v2 motif" /organism="Artificial Sequence" 140Arg
Ala Gly Arg Arg Arg Arg Gly His Arg Thr Gly Ser Pro Ser Glu 1 5 10
15 Gly Ala His Val 2014120PRTArtificial
SequenceSOURCE1..20/mol_type="protein" /note="synthesized v2 motif"
/organism="Artificial Sequence" 141Arg Ala Gly Arg Arg Arg Arg Gly
Arg Arg Thr Gly Ser Pro Ser Glu 1 5 10 15 Gly Ala His Val
2014220PRTArtificial SequenceSOURCE1..20/mol_type="protein"
/note="synthesized v2 motif" /organism="Artificial Sequence" 142Arg
Ala Gly Gly His Arg Gly Gly Arg Arg Ala Gly Ser Pro Arg Glu 1 5 10
15 Gly Ala Arg Val 2014320PRTArtificial
SequenceSOURCE1..20/mol_type="protein" /note="synthesized v2 motif"
/organism="Artificial Sequence" 143Arg Pro Gly Gly Arg Arg Arg Gly
Arg Arg Thr Gly Ser Pro Gly Glu 1 5 10 15 Gly Ala His Val
2014420PRTArtificial SequenceSOURCE1..20/mol_type="protein"
/note="synthesized v2 motif" /organism="Artificial Sequence" 144Arg
Ala Gly Gly Cys Arg Gly Gly His Arg Thr Gly Ser Pro Ser Glu 1 5 10
15 Gly Ala Arg Val 2014520PRTArtificial
SequenceSOURCE1..20/mol_type="protein" /note="synthesized v2 motif"
/organism="Artificial Sequence" 145Arg Ala Gly Gly Pro Arg Arg Gly
Cys Arg Pro Gly Ser Pro Ala Glu 1 5 10 15 Gly Ala Arg Val
2014620PRTArtificial SequenceSOURCE1..20/mol_type="protein"
/note="synthesized v1b motif" /organism="Artificial Sequence"
146Thr Gln Ala Trp Arg Val Gly Gly Gly Gly Trp Arg Thr Trp Asp Arg
1 5 10 15 Pro Ser Pro Ser 2014720PRTArtificial
SequenceSOURCE1..20/mol_type="protein" /note="synthesized v1b
motif" /organism="Artificial Sequence" 147Thr Gln Ala Gly Arg Glu
Glu Arg Arg Gly Trp Arg Thr Trp Asp Lys 1 5 10 15 Ser Ser Ser Ser
2014820PRTArtificial SequenceSOURCE1..20/mol_type="protein"
/note="synthesized mch2receptor_5" /organism="Artificial Sequence"
148Leu Val Gln Pro Phe Arg Leu Thr Arg Trp Arg Thr Arg Tyr Lys Thr
1 5 10 15 Ile Arg Ile Asn 2014918PRTArtificial
SequenceSOURCE1..18/mol_type="protein" /note="synthesized
uro742rp.1" /organism="Artificial Sequence" 149Arg Pro Phe Cys Ile
Thr Lys Trp Arg Thr Ser Phe Leu Phe Phe Lys 1 5 10 15 Asn Asn
15020PRTArtificial SequenceSOURCE1..20/mol_type="protein"
/note="synthesized receptor motif" /organism="Artificial Sequence"
150Leu Val Gln Pro Phe Arg Leu Thr Ser Trp Arg Thr Arg Tyr Lys Thr
1 5 10 15 Ile Arg Ile Asn 2015117PRTArtificial
SequenceSOURCE1..17/mol_type="protein" /note="synthesized
prstnoidep1r_4" /organism="Artificial Sequence" 151Ile Ser Leu Gly
Pro Pro Gly Gly Trp Arg Gln Ala Leu Leu Ala Gly 1 5 10 15 Leu
15218PRTArtificial SequenceSOURCE1..18/mol_type="protein"
/note="synthesized uro742rev" /organism="Artificial Sequence"
152Met Gly Leu Gly Pro Ser Gly Gly Asn Arg Lys Thr Leu Phe Ile Ala
1 5 10 15 Gly Lys 15317PRTArtificial
SequenceSOURCE1..17/mol_type="protein" /note="synthesized receptor
motif" /organism="Artificial Sequence" 153Ile Ser Leu Gly Pro Arg
Gly Gly Trp Arg Gln Ala Leu Leu Ala Gly 1 5 10 15 Leu
15417PRTArtificial SequenceSOURCE1..17/mol_type="protein"
/note="synthesized receptor motif" /organism="Artificial Sequence"
154Ile Gly Leu Gly Pro Pro Gly Gly Trp Arg Gln Ala Leu Leu Ala Gly
1 5 10 15 Leu 15512PRTArtificial
SequenceSOURCE1..12/mol_type="protein" /note="synthesized
opsinrhrrh4_3" /organism="Artificial Sequence" 155Ile Tyr Asn Ser
Phe His Arg Gly Phe Ala Leu Gly 1 5 10 15619PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized
opsinrh3rh4_7" /organism="Artificial Sequence" 156Arg Leu Glu Leu
Gln Lys Arg Leu Pro Trp Leu Glu Leu Asn Glu Lys 1 5 10 15 Ala Val
Glu 15715PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized cyclinkinase_3" /organism="Artificial Sequence"
157Glu Trp Arg Ser Leu Gly Val Gln Gln Ser Leu Gly Trp Val His 1 5
10 1515815PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized rheu.cd.21" /organism="Artificial Sequence"
158Glu Ser Ser Arg Phe Gly Val Gln Gln Arg Leu Pro Trp Val His 1 5
10 1515915PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized cks2 motif" /organism="Artificial Sequence"
159Glu Trp Arg Arg Leu Gly Val Gln Gln Ser Leu Gly Trp Val His 1 5
10 1516015PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized cks1 motif" /organism="Artificial Sequence"
160Glu Trp Arg Asn Leu Gly Val Gln Gln Ser Gln Gly Trp Val His 1 5
10 1516115PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized cks1 motif" /organism="Artificial Sequence"
161Glu Trp Arg Ser Ile Gly Val Gln Gln Ser His Gly Trp Ile His 1 5
10 1516215PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized cks1 motif" /organism="Artificial Sequence"
162Glu Trp Arg Ser Ile Gly Val Gln Gln Ser Arg Gly Trp Ile His 1 5
10 1516315PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized cks1 motif" /organism="Artificial Sequence"
163Glu Trp Arg Gly Leu Gly Val Gln Gln Ser Gln Gly Trp Val His 1 5
10 1516415PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized cks1 motif" /organism="Artificial Sequence"
164Glu Trp Arg Gln Leu Gly Val Gln Gln Ser Gln Gly Trp Val His 1 5
10 1516515PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized o23249" /organism="Artificial Sequence" 165Glu
Trp Arg Ala Ile Gly Val Gln Gln Ser Arg Gly Trp Val His 1 5 10
1516615PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized o60191" /organism="Artificial Sequence" 166Glu
Trp Arg Gly Leu Gly Ile Thr Gln Ser Leu Gly Trp Gln His 1 5 10
1516715PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized cks1 motif" /organism="Artificial Sequence"
167Glu Trp Arg Gly Leu Gly Ile Thr Gln Ser Leu Gly Trp Glu Met 1 5
10 1516815PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized cks1 motif" /organism="Artificial Sequence"
168Glu Trp Arg Gly Leu Gly Ile Thr Gln Ser Leu Gly Trp Glu His 1 5
10 1516915PRTArtificial SequenceSOURCE1..15/mol_type="protein"
/note="synthesized cks1 motif" /organism="Artificial Sequence"
169Glu Trp Arg Ser Leu Gly Ile Gln Gln Ser Pro Gly Trp Met His 1 5
10 1517013PRTArtificial SequenceSOURCE1..13/mol_type="protein"
/note="synthesized peroxisomepar_7" /organism="Artificial Sequence"
170Lys Thr Glu Thr Asp Ala Ser Leu His Pro Leu Leu Gln 1 5 10
17113PRTArtificial SequenceSOURCE1..13/mol_type="protein"
/note="synthesized rheu.cd.21" /organism="Artificial Sequence"
171Lys Val Gln Ala Gly His Ser Leu His Pro Leu Leu Ser 1 5 10
17213PRTArtificial SequenceSOURCE1..13/mol_type="protein"
/note="synthesized ppat motif" /organism="Artificial Sequence"
172Lys Thr Glu Thr Asp Met Ser Leu His Pro Leu Leu Gln 1 5 10
17313PRTArtificial SequenceSOURCE1..13/mol_type="protein"
/note="synthesized ppat motif" /organism="Artificial Sequence"
173Lys Thr Glu Ala Asp Met Cys Leu His Pro Leu Leu Gln 1 5 10
17413PRTArtificial SequenceSOURCE1..13/mol_type="protein"
/note="synthesized ppar motif" /organism="Artificial Sequence"
174Lys Thr Glu Thr Asp Ala Ala Leu His Pro Leu Leu Gln 1 5 10
17513PRTArtificial SequenceSOURCE1..13/mol_type="protein"
/note="synthesized ppar motif" /organism="Artificial Sequence"
175Lys Thr Glu Ser Asp Ala Ala Leu His Pro Leu Leu Gln 1 5 10
17613PRTArtificial SequenceSOURCE1..13/mol_type="protein"
/note="synthesized ppas motif" /organism="Artificial Sequence"
176Lys Thr Glu Thr Glu Thr Ser Leu His Pro Leu Leu Gln 1 5 10
17713PRTArtificial SequenceSOURCE1..13/mol_type="protein"
/note="synthesized ppar motif" /organism="Artificial Sequence"
177Lys Thr Glu Ser Asp Ala Ala Leu His Pro Leu Leu Gln 1 5 10
17813PRTArtificial SequenceSOURCE1..13/mol_type="protein"
/note="synthesized ppas motif" /organism="Artificial Sequence"
178Lys Thr Glu Ser Glu Thr Leu Leu His Pro Leu Leu Gln 1 5 10
17918PRTArtificial SequenceSOURCE1..18/mol_type="protein"
/note="synthesized muscrinicm1r_4" /organism="Artificial Sequence"
179Lys Met Pro Met Val Asp Pro Glu Ala Gln Ala Pro Thr Lys Gln Pro
1 5 10 15 Pro Lys 18017PRTArtificial
SequenceSOURCE1..17/mol_type="protein" /note="synthesized
rheu.cd.21" /organism="Artificial Sequence" 180Lys His Pro Thr Val
Asp Phe Met Val Gln Ile Asn Thr Gln Pro Pro 1 5 10 15 Phe
18118PRTArtificial SequenceSOURCE1..18/mol_type="protein"
/note="synthesized acm1 motif" /organism="Artificial Sequence"
181Lys Met Pro Met Val Asp Pro Glu Ala Gln Ala Pro Thr Lys Gln Pro
1 5 10 15 Pro Arg 18218PRTArtificial
SequenceSOURCE1..18/mol_type="protein" /note="synthesized acm1
motif" /organism="Artificial Sequence" 182Lys Met Pro Met Val Asp
Pro Glu Ala Gln Ala Pro Thr Lys Gln Pro 1 5 10 15 Pro Lys
18318PRTArtificial SequenceSOURCE1..18/mol_type="protein"
/note="synthesized acm1 motif" /organism="Artificial Sequence"
183Lys Met Pro Met Val Asp Ser Glu Ala Gln Ala Pro Thr Lys Gln Pro
1 5 10 15 Pro Lys 18418PRTArtificial
SequenceSOURCE1..18/mol_type="protein" /note="synthesized acm1
motif" /organism="Artificial Sequence" 184Lys Met Pro Met Val Asp
Pro Glu Ala Gln Ala Pro Ala Lys Gln Pro 1 5 10 15 Pro Arg
18519PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized gabab2receptr_1" /organism="Artificial Sequence"
185Leu Ala Pro Gly Ala Trp Gly Trp Ala Arg Gly Ala Pro Arg Pro Pro
1 5 10 15 Pro Ser Ser 18619PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized
zc35s.B3.3" /organism="Artificial Sequence" 186Val Gly Pro Glu Gln
Trp Leu Phe Pro Glu Arg Lys Pro Lys Pro Pro 1 5 10 15 Pro Ser Ala
18719PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized gabab2 motif" /organism="Artificial Sequence"
187Leu Ala Pro Gly Ala Trp Gly Trp Ala Arg Gly Ala Pro Arg Pro Pro
1 5 10 15 Pro Ser Ser 18819PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized
gabab2receptr1" /organism="Artificial Sequence" 188Leu Ala Pro Gly
Ala Trp Gly Trp Thr Arg Gly Ala Pro Arg Pro Pro 1 5 10 15 Pro Ser
Ser 18919PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized zc35s.b3.3" /organism="Artificial Sequence"
189Ser Glu Leu Ser Arg Gly Arg Gly Gly Pro Arg Cys Met Ser Met Pro
1 5 10 15 Leu Val Arg 19019PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized o51896"
/organism="Artificial Sequence" 190Ser Glu Leu Ser Arg Gly Arg Gly
Gly Pro Arg Cys Met Ser Met Pro 1 5 10 15 Leu Ile Arg
19119PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized sagp" /organism="Artificial Sequence" 191Ser Glu
Leu Val Arg Gly Arg Gly Gly Pro Arg Cys Met Ser Met Pro 1 5 10 15
Phe Glu Arg 19219PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized o51781"
/organism="Artificial Sequence" 192Ser Glu Leu Ser Arg Gly Arg Gly
Gly Pro Arg Cys Met Ser Met Ser 1 5 10 15 Leu Val Arg
19319PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized o86131" /organism="Artificial Sequence" 193Gly
Glu Leu Ser Arg Gly Arg Gly Gly Pro Arg Cys Met Ser Met Pro 1 5 10
15 Leu Tyr Arg 19419PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized arca"
/organism="Artificial Sequence" 194Ser Glu Leu Gly Arg Gly Arg Gly
Gly Gly His Cys Met Thr Cys Pro 1 5 10 15 Ile Val Arg
19519PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized arca" /organism="Artificial Sequence" 195Asn Gln
Leu Ser Leu Gly Met Gly Asn Ala Arg Cys Met Ser Met Pro 1 5 10 15
Leu Ser Arg 19619PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized o31017"
/organism="Artificial Sequence" 196Ser Glu Leu Gly Arg Gly Arg Gly
Gly Gly His Cys Met Thr Cys Pro 1 5 10 15 Ile Trp Arg
19719PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized arca" /organism="Artificial Sequence" 197Gly Glu
Leu Gly Arg Gly Arg Gly Gly Gly His Cys Met Thr Cys Pro 1 5 10 15
Ile Val Arg 19819PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized arca"
/organism="Artificial Sequence" 198Ser Glu Leu Gly Thr Gly Arg Gly
Gly Pro Arg Cys Met Ser Cys Pro 1 5 10 15 Ala Ala Arg
19919PRTArtificial SequenceSOURCE1..19/mol_type="protein"
/note="synthesized arca" /organism="Artificial Sequence" 199Ser Glu
Leu Ser Arg Gly Pro Ser Gly Pro Leu Glu Met Val Cys Ser 1 5 10 15
Leu Trp Arg 20020PRTArtificial
SequenceSOURCE1..20/mol_type="protein" /note="synthesized ogfr_III"
/organism="Artificial Sequence" 200Ser Pro Ser Glu Thr Pro Gly Pro
Arg Pro Ala Gly Pro Ala Arg Asp 1 5 10 15 Glu Pro Ala Glu
2020122PRTArtificial SequenceSOURCE1..22/mol_type="protein"
/note="synthesized zc37.b9.2d" /organism="Artificial Sequence"
201Arg Ala Ala Ser Thr Pro Val Pro Thr Pro Ala Leu Arg Gly Pro Thr
1 5 10 15 Arg Gln Asp Pro Gly Glu 20 20223PRTArtificial
SequenceSOURCE1..23/mol_type="protein" /note="synthesized
cd3antigen_3" /organism="Artificial Sequence" 202Trp Ile Phe Asp
Val Gln Asn Pro Asp Glu Val Ala Lys Asn Ser Ser 1 5 10 15 Lys Ile
Lys Val Lys Gln Arg 20 20319PRTArtificial
SequenceSOURCE1..19/mol_type="protein" /note="synthesized
zc3r11.b4" /organism="Artificial Sequence" 203Asn Val Gln Asp Pro
Glu Glu Gln Asn Glu Ser Ser Arg Phe Arg Val 1 5 10 15 Gln Gln Arg
20423PRTArtificial SequenceSOURCE1..23/mol_type="protein"
/note="synthesized cd3 motif" /organism="Artificial Sequence"
204Trp Val Phe Asp Val Gln Asn Pro Glu Glu Val Ala Lys Asn Ser Ser
1 5 10 15 Lys Ile Lys Val Ile Gln Arg 20 20523PRTArtificial
SequenceSOURCE1..23/mol_type="protein" /note="synthesized cd36
motif" /organism="Artificial Sequence" 205Trp Ile Phe Asp Val Gln
Asn Pro Asp Asp Val Ala Lys Asn Ser Ser 1 5 10 15 Lys Ile Lys Val
Lys Gln Arg 20 20623PRTArtificial
SequenceSOURCE1..23/mol_type="protein" /note="synthesized cd36
motif" /organism="Artificial Sequence" 206Trp Ile Phe Asp Val Gln
Asn Pro Asp Glu Val Thr Val Asn Ser Ser 1 5 10 15 Lys Ile Lys Val
Lys Gln Arg 20 20723PRTArtificial
SequenceSOURCE1..23/mol_type="protein" /note="synthesized cd36
motif" /organism="Artificial Sequence" 207Trp Ile Phe Asp Val Gln
Asn Pro Gln Glu Val Met Met Asn Ser Ser 1 5 10 15 Asn Ile Gln Val
Lys Gln Arg 20 20823PRTArtificial
SequenceSOURCE1..23/mol_type="protein" /note="synthesized cd36
motif" /organism="Artificial Sequence" 208Trp Ile Phe Asp Val Gln
Asn Pro Asp Glu Val Ala Val Asn Ser Ser 1 5 10 15 Lys Ile Lys Val
Lys Gln Arg 20 20923PRTArtificial
SequenceSOURCE1..23/mol_type="protein" /note="synthesized cd36
motif" /organism="Artificial Sequence" 209Trp Ile Phe Asp Val Gln
Asn Pro Glu Glu Val Ala Lys Asn Ser Ser 1 5 10 15 Lys Ile Lys Val
Lys Gln Arg 20 21025PRTArtificial
SequenceSOURCE1..25/mol_type="protein" /note="synthesized
myelinp0_5" /organism="Artificial Sequence" 210Gly Val Val Leu Gly
Ala Ile Leu Gly Gly Val Leu Gly Val Val Leu 1 5 10 15 Leu Leu Val
Leu Leu Leu Tyr Leu Val 20 2521122PRTArtificial
SequenceSOURCE1..22/mol_type="protein" /note="synthesized
zc312.b11" /organism="Artificial Sequence" 211Met Leu Gly Arg Ile
Ile Gly Gly Val Gly Cys Val Leu Leu Glu Leu 1 5 10 15 Xaa Gly Leu
Gly Val Arg 20 21213PRTArtificial
SequenceSOURCE1..13/mol_type="protein" /note="synthesized
chlamidiaom3_3" /organism="Artificial Sequence" 212Cys Gly Ser Tyr
Val Pro Ser Cys Ser Lys Pro Cys Gly 1 5 10 21313PRTArtificial
SequenceSOURCE1..13/mol_type="protein" /note="synthesized
rheu.cd.21" /organism="Artificial Sequence" 213Cys Thr Gly Tyr Thr
Glu Phe Cys Ala Lys Tyr Thr Gly 1 5 10 21425DNAArtificial
Sequencesource1..25/mol_type="DNA" /note="synthesized primer"
/organism="Artificial Sequence" 214ggccgggcca tgggcaaggc tctta
2521528DNAArtificial Sequencesource1..28/mol_type="DNA"
/note="synthesized primer" /organism="Artificial Sequence"
215agtcaagggg caattcgggc tcgggact 2821618DNAArtificial
Sequencesource1..18/mol_type="DNA" /note="synthesized primer"
/organism="Artificial Sequence" 216caattcgggc tcgggact
1821719DNAArtificial Sequencesource1..19/mol_type="DNA"
/note="synthesized primer" /organism="Artificial Sequence"
217acacaccgca gtcaagggg 1921819DNAArtificial
Sequencesource1..19/mol_type="DNA" /note="synthesized primer"
/organism="Artificial Sequence" 218caattcgggc tcgggactg
1921924DNAArtificial Sequencesource1..24/mol_type="DNA"
/note="synthesized primer" /organism="Artificial Sequence"
219agtttacaca ccgcagtcaa gggg 2422031DNAArtificial
Sequencesource1..31/mol_type="DNA" /note="synthesized primer"
/organism="Artificial Sequence" 220ccgcagcgag aacgccacgg agggagatcc
t 3122131DNAArtificial Sequencesource1..31/mol_type="DNA"
/note="synthesized primer" /organism="Artificial Sequence"
221acttccgaat ggctgagttt tccacgcccg t 3122232DNAArtificial
Sequencesource1..32/mol_type="DNA" /note="synthesized primer"
/organism="Artificial Sequence" 222agaggagcca cggcagggga tccgaacgtc
ct 3222331DNAArtificial Sequencesource1..31/mol_type="DNA"
/note="synthesized primer" /organism="Artificial Sequence"
223cttaccgact caaaaacgac gggcaggcgc c 3122428DNAArtificial
Sequencesource1..28/mol_type="DNA" /note="synthesized primer"
/organism="Artificial Sequence" 224cagcgagaac gccacggagg gagatcct
2822528DNAArtificial Sequencesource1..28/mol_type="DNA"
/note="synthesized primer" /organism="Artificial Sequence"
225gaatggctga gttttccacg cccgtccg 2822618DNAArtificial
Sequencesource1..18/mol_type="DNA" /note="synthesized primer"
/organism="Artificial Sequence" 226caattcgggc acgggact
1822724DNAArtificial Sequencesource1..24/mol_type="DNA"
/note="synthesized primer" /organism="Artificial Sequence"
227agtttacaca ccgaagtcaa gggg 2422871DNAArtificial
Sequencesource1..71/mol_type="DNA" /note="synthesized zyb2"
/organism="Artificial Sequence" 228cgggtgccga aggtgagttt acacaccgca
gtcaaggggc aattcgggct cgggactggc 60cgggccatgg g
7122971DNAArtificial Sequencesource1..71/mol_type="DNA"
/note="synthesized zyb9" /organism="Artificial Sequence"
229cgggtgccga aggtgagttt acacaccgca gtcaaggggc aattcgggct
cgggactggc 60cgggctatgg g 7123071DNAArtificial
Sequencesource1..71/mol_type="DNA" /note="synthesized zkb5"
/organism="Artificial Sequence" 230cgggtgccgt aggtgagttt acacaccgca
gtcaaggggc aattcgggct cgggactggc 60cgggctatgg g
7123171DNAArtificial Sequencesource1..71/mol_type="DNA"
/note="synthesized zkb69" /organism="Artificial Sequence"
231cgggtgccgg aggtgagttt acacaccgca gtcaaggggc aattcgggct
cgggactggc 60cgggctatgg g 712322227DNAArtificial
Sequencesource1..2227/mol_type="DNA" /note="synthesized chimeric
TTV, WV13038 clone 6" /organism="Artificial Sequence" 232caattcgtgc
acgggactac aaggaaaggg gttgaccccc accctccccc gccatgccca 60ggagggtgca
gacacaactg ggaaggtgct agagaccccg gggggaggct gggccagcac
120caggcattgg ggggcaggtt cccgtctcta caccccagcc ccaggcggac
agcgcgtgcc 180cctcccgctg ccccacctgt cacccacctg ctggccccgg
gctgtctctg ctcctggctc 240ccctcccagc tgcgtcccca gctgcctctc
cagggaggag tgacagctgg cctgtgccac 300accctcgagc ccccccggac
taccccctcc ctggggcagg acccctgcct gtggcacaac 360caaggggcct
gctgatgggg gctcatgtga gcagtgcccc agctgtgggt gtgggtgctg
420ccagctgcca ccgcctttgc cctggtttcc cagatagacc ccgacccaca
ctccgaagct 480gtatcatgaa cgctgtggtg ggcggctggt ggggagcggg
gttgccgtcc cactaccctc 540tggaagcctc agccatgaag ggcccctgtg
ggcacctttt cccggcacac ggtgctgtgt 600ttctccactc ttgggctctg
cagtgacttg aggggtcaag tctatgatcc cacgggaggc 660tgggctaatg
aggggaccag agacctcagt gctgtgcagg gagtcctgaa ccaccctggt
720ggaaggccca gcccaactcc ccagtcctcc cgccagctcc ctgtggtgtc
caggagacct 780gtggtcaggc ctggaggaga agctcctcct cccctcgaca
tcctccctgc agcccttgct 840cttcaccaga gcctcctgac tccccaggac
cccagagagg actgaccctc tccagccgac 900ctctgggctc aggacagctg
ggcggggcag ccacaggagc tgcctgtagg gagcagagtc 960aggacgggga
ccgagccgga cacccattct ggaagtgtct gcacttccag gcaggggaag
1020gacggcagtg ggtagctggg agtgctgggc cgaagatggg cattgtcagg
ccctcagtgg 1080ggactgggag gtagaggtgg ggaggtctgt ggaggaagga
gaagaagggc cagtgtcccg 1140agttgggggt ggttggcagt ggacgaggcc
gacaggaaca gacctgagct tggggagctc 1200cactcagaac gaggcatcct
tcagggttct gtgcatactg gtgtccctgg ctgggggccg 1260ggccccgaag
tggagcctgg gactgtgagg gtgggggggg tgtgctgggg tgggaggtgg
1320atggagcccc ccctccaccg cctggccgct tgggctgaac cttggacttc
ggagccggaa 1380cagacatagg aaatggccta actgcatttg cgcaggaaca
ccaaatccct cgcagctgca 1440cggggctgag ccagggccac gggcggggtc
ggccatccca gagtcctgac agctccgtgg 1500tgtatgccaa ggggcctggg
ccgctgaccg aggggcgcct ttcccaggcc agaggccccc 1560accccacccc
aggagagctg cccccctttc agttcccaga acggagcccg gctgtggaat
1620agtgatgcgg tgaggtcatg gggagggggc ccgcatgact catatcctgg
ggtaggggaa 1680agggaggaga cggagaaggg gcccagaggc ctccacgtcc
tcagctctgc tgggtcagag 1740gccaggggct ggcggggctt ctccccagca
ctgggtttta ggggagacac caggagatgc 1800ttactctgca tccccactct
gtcccccagg cccctagcca gggagagctc agtcagagtg 1860atcctccagg
ggcccagctc tgcatggatg atgttcccag agtacacacc tgggcctcgt
1920gccagggccg gcaccgccgt tgtcagggct atggcaaggc aaacagtcaa
tgtttgcctc 1980actaaagtga ggctgcagca ccctgaaggg atccctggag
ggggacgtgg tccccttgtt 2040cccaagcttg tctgcacatg cacgtggatg
tcaagggttc ccgtgtgtga gcacatgcat 2100atttgtatgt gcatggggtg
cgggcatgtg tgcctgtgtg gccggagcgt gggctcgtgg 2160agaatgtgtg
tgagttgggt gtgcacctgc atgtgcccca ggcctaggga gtcccgtgcc 2220cgaattg
2227233883DNAArtificial Sequencesource1..883/mol_type="DNA"
/note="synthesized chimeric TTV, gb40.27" /organism="Artificial
Sequence" 233cgggactggc cgggctatgc cccagacaca ctcacgtagg ggtgtccggc
ctggcagccc 60aggaccatgg tctgcagggt ttcctctcgg ccattcagga caaccctagt
ctccagggaa 120tagcgctggt gtcgcctatc agccgtgaag gtctcctgca
ggaggaggct ctgcgggatg 180ggcaggtgca atgggtgcct ggtgtgcaga
gggaaaaaca ggccaaagcc attaaagcag 240ctggcagtgc caggggacaa
ttgtgcccca cggtctcagc ctgggcctgt cacgagcttg 300cagagttaag
actctgccac agagaagaga acatcaggac acctggcagc cctatgcttt
360acaatgtggc atccagaacc cttcaccacc tcactgtgcc agagaagtgg
gcatggctgg 420ggtccccgtc gccatttgac agcaaagacc caagaggata
gatgacacac agcatctggt 480gtcacacaga ctgggattag aatccaggca
cggtctttca ctagctgtgt gaccttggga 540aaaggacttg actgttctgt
gcctcagttt ccccatctgt aaaacggagg ctaaaataat 600actgatcgga
cacagtggtc agggttagag ataacataca tgaaacgacc acaagctccc
660caagggcaaa ggtttctgac attccggttc tctgccattt tccatgtgcc
cagaagagca 720cttggtccat agtatgtgct caatgaatgt aaatgggata
aaaacacgaa cgaacactct 780gccaacgatg ctgctgttcc tttgtcatca
ctgcttctgt ttaggctgta gctgacttat 840ctaaggccat acagctgctc
aatgcatagc ccggccagtc ccg 883234291DNAArtificial
Sequencesource1..291/mol_type="DNA" /note="synthesized chimeric
TTV, gb43.30" /organism="Artificial Sequence" 234ccccttgact
tcggtgtgta aacttgtggt atagaacatg atgttttaag atacatgtac 60attgtggaat
ggcttgatca tgctaattaa catatgaatt acctcactta gctatctttt
120ttatggtgaa agcacttaaa atctaccctc agcagttttc aagtacacaa
tacatttcta 180ttaactatag tcaccatgtt gtacaataaa tctcttgaat
ttattcctcc tgcctaactg 240acattttgta tcctttgact gatctctctc
cccagtcccg tgcccgaatt g 291235293DNAHomo
sapienssource1..293/mol_type="DNA" /organism="Homo sapiens"
235taattgacaa aacgtgtata aacttgtggt atagaacatg atgttttaag
atacatgtac 60attgtggaat ggcttgatca tgctaattaa catatgaatt acctcactta
gctatctttt 120ttatggtgaa agcacttaaa atctaccctc agcagttttc
aagtacacaa tacatttcta 180ttaactatag tcaccatgtt gtacaataaa
tctcttgaat ttattcctcc tgcctaactg 240acattttgta tcctttgact
gatctctctc cccagtcccg tgaccagtgc cct 293236278DNAArtificial
Sequencesource1..278/mol_type="DNA" /note="synthesized
gbDhDi43.30.sequence" /organism="Artificial Sequence" 236gtgtgtaaac
ttgtggtata gaacatgatg ttttaagata catgtacatt gtggaatggc 60ttgatcatgc
taattaacat atgaattacc tcacttagct atctttttta tggtgaaagc
120acttaaaatc taccctcagc agttttcaag tacacaatac atttctatta
actatagtca 180ccatgttgta caataaatct cttgaattta ttcctcctgc
ctaactgaca ttttgtatcc 240tttgactgat ctctctcccc agtcccgtgc ccgaattg
278237269DNAArtificial Sequencesource1..269/mol_type="DNA"
/note="synthesized query 14" /organism="Artificial Sequence"
237gtgtgtaaac ttgtggtata gaacatgatg ttttaagata catgtacatt
gtggaatggc 60ttgatcatgc taattaacat atgaattacc tcacttagct atctttttta
tggtgaaagc 120acttaaaatc taccctcagc agttttcaag tacacaatac
atttctatta actatagtca 180ccatgttgta caataaatct cttgaattta
ttcctcctgc ctaactgaca ttttgtatcc 240tttgactgat ctctctcccc agtcccgtg
269238272DNAHomo sapienssource1..272/mol_type="DNA" /organism="Homo
sapiens" 238gtgtataaac ttgtggtata gaacatgatg ttttaagata catgtacatt
gtggaatggc 60ttgatcatgc taattaacat atgaattacc tcacttagct atctttttta
tggtgaaagc 120acttaaaatc taccctcagc agttttcaag tacacaatac
atttctatta actatagtca 180ccatgttgta caataaatct cttgaattta
ttcctcctcc tgcctaactg acattttgta 240tcctttgact gatctctctc
cccagtcccg tg 27223949PRTArtificial
SequenceSOURCE1..49/mol_type="protein" /note="synthesized FASTA of
gbDhDi43.30" /organism="Artificial Sequence" 239Met Phe Tyr Thr Thr
Ser Leu His Thr Glu Val Lys Gly Gln Phe Gly 1 5 10 15 His Gly Thr
Gly Glu Arg Asp Gln Ser Lys Asp Thr Lys Cys Gln Leu 20 25 30 Gly
Arg Arg Asn Lys Phe Lys Arg Phe Ile Val Gln His Gly Asp Tyr 35 40
45 Ser 240119PRTArtificial SequenceSOURCE1..119/mol_type="protein"
/note="synthesized TT virus variant" /organism="Artificial
Sequence" 240Ala Gln Thr Gln Arg Arg Val Ile Pro Ala Ser Arg Gly
Arg Val Pro 1 5 10 15 Glu Val Ser Leu His Thr Xaa Val Lys Gly Gln
Phe Gly Leu Gly Thr 20 25 30 Gly Arg Ala Met Gly Lys Ala Leu Lys
Lys Asp Met Phe Leu Gly Lys 35 40 45 Leu Tyr Lys Lys Lys Arg Ala
Leu Ser Leu His Gly Leu Arg Thr Pro 50 55 60 Glu Ala Lys Pro Pro
Ala Met Ser Trp Arg Pro Pro Val His Asn Pro 65 70 75 80Asn Arg Ile
Glu Arg Asn Leu Trp Glu Ala Phe Phe Arg Ile His Ala 85 90 95 Ser
Ser Cys Gly Cys Gly His Leu Val Gly His Leu Thr Val Leu Ala 100 105
110 Arg Arg Tyr Gly Ala Pro Pro 115 241150PRTTorque teno
virusSITE23..23Xaa can be any naturally occurring amino acid 241Ala
Gln Thr Gln Arg Arg Val Ile Pro Ala Ser Arg Gly Arg Val Pro 1 5 10
15 Glu Val Ser Leu His Thr Xaa Val Lys Gly Gln Phe Gly Leu Gly Thr
20 25 30 Gly Arg Ala Met Gly Lys Ala Leu Lys Lys Asp Met Phe Leu
Gly Lys 35 40 45 Leu Tyr Lys Lys Lys Arg Ala Leu Ser Leu His Gly
Leu Arg Thr Pro 50 55 60 Glu Ala Lys Pro Pro Ala Met Ser Trp Arg
Pro Pro Val His Asn Pro 65 70 75 80Asn Arg Ile Glu Arg Asn Leu Trp
Glu Ala Phe Phe Arg Ile His Ala 85 90 95 Ser Ser Cys Gly Cys Gly
His Leu Val Gly His Leu Thr Val Leu Ala 100 105 110 Arg Arg Tyr Gly
Ala Pro Pro Arg Pro Pro Ala Pro Gly Ala Pro Arg 115 120 125 Pro Ala
Leu Lys Arg Gln Leu Ala Leu Pro Ala Pro Pro Ala Asp Pro 130 135 140
Gln Gln Ala Asn Pro Thr 145 150242639DNAArtificial
Sequencesource1..639/mol_type="DNA" /note="synthesized hod11"
/organism="Artificial Sequence" 242ccccttgact gcggtgtgta aagcgcccca
gcctgtgcct gcacagtgcc tgtgtggtgt 60gaacccatga ccaggcctct ggagggaagg
aaggttaggc ttagtggaca ccagctttcc 120taaggtgggt cttagaccaa
ctcattaaaa tggcaggatg ggcttttgtg ctgtatttct 180tgggattttc
aagatgcccc acacagcaga agggatgtgc atttttttct ctgccctgag
240ttgtttgata aaaatcagtg acctcgttct ccacttagaa ctcccctgaa
ctgcactcgg 300tgtctaggac tgttggggaa ggaagtgaag agccagcatg
tagtctcctc tggactctta 360caggatctgt ccacctctgg gctctttatg
taggggaagg tgtgagctcc tgggagtact 420cctgatagag gactgtttcc
ctgaaaacct cagcagtgtt tgaggcccta gcagggggaa 480cccagacccc
gcctgccaaa gcccctaatc cctcagggct attatcagca gcctaagcgc
540cttagggtgg ccagagtcca gcccagcaag cagcaaagtc agcagcctcc
tcgccctatc 600ctctccatgc cccggggcac tccagtcccg accgaattg
639243358DNAArtificial Sequencesource1..358/mol_type="DNA"
/note="synthesized hodL.VvWw.1.sequence" /organism="Artificial
Sequence" 243tcccctgaac tgcactcggt gtctaggact gttggggaag gaagtgaaga
gccagcatgt 60agtctcctct ggactcttac aggatctgtc cacctctggg ctctttatgt
aggggaaggt 120gtgagctcct gggagtactc ctgatagagg actgtttccc
tgaaaacctc agcagtgttt 180gaggccctag cagggggaac ccagaccccg
cctgccaaag cccctaatcc ctcagggcta 240ttatcagcag cctaagcgcc
ttagggtggc cagagtccag cccagcaagc agcaaagtca 300gcagcctcct
cgccctatcc tctccatgcc ccggggcact ccagtcccga ccgaattg
358244360DNAHomo sapienssource1..360/mol_type="DNA" /organism="Homo
sapiens" 244tcccctgaac tgcactcggt gtctaggact gttggggaag gaagtgaaga
gccagcatgt 60agtctcctct ggactcttac aggatctgtc cacctctggg ctctttatgt
aggggaaggt 120gtgagctcct gggagtactc ctgatagagg actgtttccc
tgaaaacctc agcagtgttt 180gaggccctag cagggggaac ccagaccccg
cctgccaaag cccctaatcc ctcagggcta 240ttatcagcag cctaagcgcc
ttagggtggc cagagtccag cccagcaagc agcaaagtca 300gcagcctcct
cgccctatcc tctccatgcc ccggggcact ccagtcccag ctggctgatc
360245288DNAArtificial Sequencesource1..288/mol_type="DNA"
/note="synthesized VvWw.1.sequence" /organism="Artificial Sequence"
245ccccttgact gcggtgtgta aagcgcccca gcctgtgcct gcacagtgcc
tgtgtggtgt 60gaacccatga ccaggcctct ggagggaagg aaggttaggc ttagtggaca
ccagctttcc 120taaggtgggt cttagaccaa ctcattaaaa tggcaggatg
ggcttttgtg ctgtatttct 180tgggattttc aagatgcccc acacagcaga
agggatgtgc annnnnnnct ctgccctgag 240ttgtttgata aaaatcagtg
acctcgttct ccacttagaa ctcccctg 288246289DNAHomo
sapienssource1..289/mol_type="DNA" /organism="Homo sapiens"
246ttcagttagc tgctgtgtgt aaagcgcccc agcctgtgcc tgcacagtgc
ctgtgtggtg 60tgaacccatg accaggcctc tggagggaag gaaggttagg cttagtggac
accagctttc 120ctaaggtggg tcttagacca actcattaaa atggcaggat
gggcttttgt gctgtatttc 180ttgggatttt caagatgccc cacacagcag
aagggatgtg catttttttc tctgccctga 240gttgtttgat aaaaatcagt
gacctcgttc tccacttaga actcccctg 2892473387DNAArtificial
Sequencesource1..3387/mol_type="DNA" /note="synthesized hoht33"
/organism="Artificial Sequence" 247aattcggtcg ggactggcag agtgacgctc
aggtcagcct gacagcaggg tgattgaagg 60ggccagatac cccagcaggg cctgaggcca
gaacacagca taggctggct ctgatgggtg 120gaggaggtgg ccaggcatca
tctggagctt ggagttgaga acatctgtga ctcctccttc 180aggagggtgc
tctaggagtt gagagcatcc taggtaggac catacatcta cccccatcct
240agttccctcc agcctctctt ttcagctcca ggtctacctt aagggaccta
ggacacctgg 300gctggggcat aacaggactt ggttttatgt aaaggagctg
ggaagagact gagataacag 360agggctgcaa ggagagagac agagagagaa
gaacctgcca gaagaagctc ctcagcaatc 420cactaagccc tgatctttgc
ctcactgcct gtcccttccc atccgctctt ctgctctctc 480aatctctgcc
ttcaagaaat ttggtgcata ttggaatagg gaggaataga agcaccctgg
540gtggagctct gggcttggct gtgcacgagc tttcagtggg tggtttgctg
gtctccaaag 600atgaccctcc attagtcatg cttctcggtg tttgtcctca
ggtagtctca tcccatcttg 660agtctgggct tgccctgtga ctcactttaa
ccacaagaat gtggcagaaa ggatgttgtg 720ccagttctag aactaagcct
tcagaaagcc tagcaccttc tgcttttagg agcactgagc 780ccccatgtta
gaagtccact tttatactct gctctggaga ctagcagaat tagaaatgca
840ctgctgaatg ctgctcgaga gactaatgga gaggccatgt gaataaggag
gcctgaaact 900acatggagat agagggccag ccaccccagc accacggctc
agctgtgcct cccagccatc 960tctgccagtc ctccagggct atgagtgaac
catcttggat gttctagctc ggtggagccc 1020ccaggtgatt gcagcctcag
ccaccatctg actgtagctg catgagaggc ccccagtggg 1080accagcagga
ctgccaagct gagccctgcc cacccacaga actgtgagaa ataaaaaaat
1140ggttgtttcc ttaagccatt aagttttgga atgatttgtt actcacaatt
gataactgat 1200acagtctgtc tttagggaaa acaagggata actctgggct
ccaggtgtct tctataggat 1260gaatgggact tggttgctga caagctgaca
agtttgagca tgaaactctt tttttttttt 1320ggagaaggaa ttttgctctt
gttatccagg ctggaataca gtggtgcgat ctcggcccaa 1380ggcaacctct
gcctcctggg ttcaagcaat tctcctgcct cagcctcctg agtagctggg
1440attacaggca cccaccacta cacctggctc tttttttttt ttgtattttt
agtagagaca 1500ggttttcatt atgttggcct ggtcaggttt tgaactcctg
acctcaggtg atccacctgc 1560cttggcctcc taaaatgctg ggattacagg
tgtgagccac cgtgcctggc ctgagcatga 1620aacttttatg ctcaaacatt
aaagtgtaaa cactcaccag ctcagctgaa taagaacttc 1680tgggggcaag
gcccaggaat ctacagttta gtaagtgccc ccaccactgg accctgggaa
1740agtggactgc attttgaaaa actctagatc agttgatacc caggagtcct
cataacacta 1800agttgtaata cctcagtgtg aattagtctg atgcagctct
tcttagaggt cattgacaga 1860gggcaagaca tttccaaaag gaaggaatag
ccaatatgga atgacaggtg gattggatga 1920ccctctatta tttagtttca
acctgccctt cctgccttcc ctcccacaaa ttccctttca 1980gatcctccgt
cctaatcctc ttcgatagtt cattgttctt ctgcagacag agcagcgaag
2040tgttatctgt tgtacccact atgactagtt gatggtgcat ggcttccatg
gagcagtgct 2100gtgatccatt agtcatggag cagtgctgtg atccattgtc
atgtctgcca tgaacactgg 2160aaggggcagt ggtaatgaca gcctcttaca
tttgccaact ctgcccaaca ttcttcccag 2220tgttgggaaa gcctttgctt
attccattcc ttcttggaaa gctttgttcc tccatttcac 2280atttttaatt
tttctcattt ttatggtgca ccatggatac cacctgtcca tatagctggc
2340ttctgatttt tccagatgaa agtaatcctt cctctcctaa cctcccatga
cacctaacct 2400ggcactcatt tacggtgttc agctccttct cctgtacgtt
ctcattgttc tcctctcatc 2460ttctccccag gaatggattc cccgccaagg
gaggtaccag gtcagtttct tctttgtgca 2520acagggtgtc cctgatgagc
acaaacctgg aacaagtgtt tgtagggctg gtgggcatct 2580ggttcctctg
ggtgttgtgt agcctgagcc ggggggcaaa tgggtgtttg tttttctgaa
2640gaaggcaggc gttctgtggc agatgtgggt ggagggggtt ggggagtagt
atcatggaga 2700ggctgggatc ctatctatct ccttcccctg cttgaagggc
aacttgggag aagctcaaga 2760gggaggagtt gactgcagaa gctgggatac
ctgcataact ctcaggttca agcatcactg 2820ctttagggcc ctgggggcct
atgtgtgagt caagaaaggg agatagagag agaagagaga 2880gagaggagag
agagagagag agagagaaga cagaggagag agagagagaa gaaagaggag
2940agagagagaa gagagagcag agagagagag catgctgtca gtgaggtggc
cctaagccct 3000cttggaaata acttggaggc actgtggggt ggctctgagg
tgctgaggta tacctgtagt 3060ggggctagga cctttccaac ctgggtctga
aggttgaggc aaccttgggt gtacctgctg 3120gtgagctgag agccctgggg
acctttggca gacattccca cccctgcagc ctggagggtt 3180tgcatgcagt
gaggctgtcc tgctcatcac tacgtcctct gggacagcac attgcctgtg
3240ctgaacaggc attcagttgc gatttgtgga atcagtgttg gtgaggaggg
caagtggcaa 3300cagaaatggg ggtgtgctcc ccccagttcc tcagctacaa
tctccatgac cttctacact 3360gccctgggcc cagtcccgac cgaattg
33872481790DNAArtificial Sequencesource1..1790/mol_type="DNA"
/note="synthesized hoht22" /organism="Artificial Sequence"
248caattcggtc gggactgggg agctgtgaga aagagaagag aaggtcagat
caggaacatt 60acacagaagt cggcaaaact ggaacgagga gggaaagaaa tgagcgagtc
tgacactcag 120tccatcctag ttcctatcac acagggaggg acattgccat
gcacatcccc acagagatgc 180accgtgtaag gggtcgaggc agatcctgtc
cactattgcc agctctgagg tgatcaaatt 240gtgtctgccc agggtaaccc
ggttgaccta aaccaaccca ctcccttgca catcttaggt 300gttcctgagt
cagcaaggct gaggaagcca ctccagccaa aatcccttgt gcgatcttca
360agccccaatc acaggcaatg acaaggccat gtctggctgg cctcatgggg
actgccctcc 420cctcaccaga cctagaacac aggcaatgct cagcagcgtt
ctgagaagag ctgaggtcaa 480gaactccaac cccacgcaac ccagacctga
tacaaacaga cacccatttg cactcctaac 540ccttgagcct ctatttccag
acctcctcac tgggtctcag ctgagaaccc acttttagcc 600aagcatcttt
agttcagagt tcctcgcagt gaggggatcc ctcccctgcc ttgctgtctg
660tgctgcatcc attataccct cacaccgtgc tactcagcag gggagaaatg
gagccctggg 720gagccggcac ttttctcttc tgcctcttcc ttgccttgcc
tcaggaaggg gaaaaactct 780gggttgtttt agtttgatcc cctgtcctaa
gtgaccacag gaacactagg cagtgagtac 840atatggattc ttagcagaga
gctgacaagt cttcagaaac atagaaaaca tagaagcttt 900gagtgaggag
atcagaatgt aattaggagt ttcttttgga gcaaacccca ccccaagaga
960gtgagcccaa gttcttgaag gcccacctga gcagatgaca ccagcgtctt
cactatggcc 1020acagttgtgg gtgagccagc cattgtgggg gcagctccac
aggtaggact cgtgtcctga 1080gcagcgcaca tcatccagga caatgggtcc
tgagccctgg ccaaactggg catttcctgg 1140ggctgacatg gcccagccac
agcccggctg cctgcagacc acattggcat cattggtgtc 1200ccagtagtca
tcacacacgg tgccccagga gcctcggtat aggacctcca ctcggcctcg
1260acacctgtcg cctccattca ccagcctcag ggccaaactg gattcagatc
ctacagggga 1320acacaagaac ctttcatcca tccctatcat gaggtcaaga
atctaaggta agttccacac 1380tcagggtact tcctaatgaa ctaagtcacc
taggcaggca gtcacctttg catatgacta 1440cagactaggc ttcatcaccg
tgaaagtagc actgataacc tactctgccc aggtctatgg 1500gtgctcaact
tttggggaag cacctgtgac cccagtggat gtgatgggaa tggatgcccc
1560actccccagt tgggtacaca gaggatggag ctgctcagct ccagatggca
ggcccagacc 1620cctcccttat tcaggagcat ggtcctatct gggatctgac
tggcagagta ccagagatgg 1680cagggatgag gtccccatag gattagggag
acccccaggg cttgttctga gcccatagat 1740aaggatcttt tctgaccact
tggaacagga tcccagtccc gaccgaattg 179024918DNAArtificial
Sequencesource1..18/mol_type="DNA" /note="synthesized DhDi primer
forward" /organism="Artificial Sequence" 249caattcgggc acgggact
1825024DNAArtificial Sequencesource1..24/mol_type="DNA"
/note="synthesized DhDi primer reverse" /organism="Artificial
Sequence" 250ccccttgact tcggtgtgta aact 2425128DNAArtificial
Sequencesource1..28/mol_type="DNA" /note="synthesized cd, primer
forward" /organism="Artificial Sequence" 251cagcgagaac gccacggagg
gagatcct 2825228DNAArtificial Sequencesource1..28/mol_type="DNA"
/note="synthesized cd; primer reverse" /organism="Artificial
Sequence" 252cggacgggcg tggaaaactc agccattc 2825318DNAArtificial
Sequencesource1..18/mol_type="DNA" /note="synthesized DfDg primer
forward" /organism="Artificial Sequence" 253cgggactggc cgggctat
1825419DNAArtificial Sequencesource1..19/mol_type="DNA"
/note="synthesized DfDg primer reverse" /organism="Artificial
Sequence" 254agcccgaatt gccccttga 192553725DNAArtificial
Sequencesource1..3725/mol_type="DNA" /note="synthesized ttgb33.35"
/organism="Artificial Sequence" 255attttgtgca gcccgccaat ttctgttcaa
acagaccaat caggaccttc tacgtgcact 60tcctggggcg tgtctacgag gtctatataa
gcaacagcgg tgacgaatgg tagagttttt 120cttcgcccgt ccgcggcgag
agcgcgagcg aagcgagcga tcgagcgtcc cgtgggcggg 180tgccgtaggt
gagtttacac accgaagtca aggggcaatt cgggcacggg actggccggg
240ctatgggcaa ggctcttaaa aaattccccc gctctgctct ccggcaggac
acaaagtcat 300gccgtggaga ccgccggtcc ataacgtgcc aggtagagag
aatcaatggt ttgcagcgtt 360ctttcacggt catgctgctt tctgcgggtg
tggtgaccct gttgggcatc ttaacggcat 420tgctcctcgc tttcctaacg
ccggtccacc gagaccacct ccagggctag accagcttaa 480tcccgagggc
ccggcaggtc ccggagggcc ccccgccatc ttgccagctc tgccggcccc
540ggcagaccct gaaccggcac cacggcgtgg tggtggggca gatggaggcg
ccgccgctgg 600ggccgccgcc gacgcagacc ataccgggta cgaagaagga
gacctagaag atcttttcgc 660cgccgcggcc gaggacgata tgtgagtagg
cggaggcgcc gccgctacta caggcgcaga 720ctgagacggg gcagacgcag
agggcgacga aagagacaca gacagactct agtagtgagg 780cagtggcaac
ctgacgttgt taaaaagtgt aaaataacag gatggatgcc tcttataatc
840tgtggctctg gaagcacaca gatgaacttt ataactcaca tggacgatac
tccccctatg 900ggatacacct acgggggcaa ctttgtaaat gtaactttca
gtctagaggc catctatgaa 960caattcctgt accacagaaa caggtggtcc
aggtctaacc atgacttaga cctggccaga 1020taccaaggaa ccactctaaa
actttacaga caccaaaccg tggactatat agttagctac 1080aacagaacag
gcccctttac tataagtgaa atgacttaca tgagcacaca cccggctctc
1140atgctactac aaaaacatag aatagttgta cccagcttca gaaccaagcc
aaaaggcaaa 1200agagccataa aaattagaat aagggcccca aaactaatgc
tcaccaagtg gtactttaca 1260aaagacattt gctccatggg cctctttcaa
ctaatggcaa cagctgcaga acttacaaac 1320ccatggctca gagacaccac
aaaaagccca gtaattggct tcagagtctt aaaaaacagc 1380ttatacacat
gcctttccaa cttaaaagac caagcaatac aaggtgaaag aaagactgta
1440caaaatagat tacacccaga aaacctacat ggcacaggac ctaatgctaa
aggctgggaa 1500tacacataca caaaactaat ggcatctaca tactactcag
ccaacagaaa cagcacctac 1560aactggcaaa actatcaaac taactatgca
aacacatata caaaatttaa agaaaaaaga 1620acagcaaact taaacttaat
taaagcagaa tacctatatc attaccctaa caatgtcaca 1680caatctgact
ttatattaga ctacacacta acacccgact ggggcatata cagcccctac
1740tacctaacac ccaccagaat tagcctagac tgggacacac catggacata
tgtaagatac 1800aacccactat cagacaaagg cataggtaac agaatatatg
cacagtggtg ctcagaaaaa 1860tctagtaaat tagacaccac aaagagcaag
tgcatactaa gagacttccc actgtgggcc 1920atggcctatg gctactgtga
ctgggtggtg aagtgcacag gagtgtccag tgcttggaca 1980gacatgagaa
tagccattat atgtccctac acagaaccag cacttatagg gtcaacagaa
2040gacgtaggct tcattccagt aagtgacacc ttttgcaacg gagacatgcc
gtttcttgca 2100ccatacatac ctattacatg gtggattaag tggtacccca
tgattacaca ccaaaaggaa 2160gttcttgagg caatagttaa ctgtggaccg
tttgtacccc gagaccaaac ttccccagct 2220tgggaataac catgggttac
aaaatggatt ggaaatgggg cggctctccc ctgccttcac 2280aggcaatcga
cgacccctgc cagaagtcca cccacgaact tcccgacccc gatagacacc
2340ctcgcatgtt acaagtctct gacccgacaa agctcggacc gaagacagtt
tttcacaaat 2400gggactggag acgtgggatg cttagcaaaa gaagtattaa
aagagtccaa gaagactcaa 2460cagacgatga atatgttgca ggacccttac
caagaaaaag aaacaagttc gatactcgag 2520tccaaggccc tccaacccca
gaaaaagaaa gttacacttt actccaagcc ctccaagagt 2580cggggcaaga
gagcagctca gaggaccaag aacaagcacc ccaagaaaaa gaggaccaga
2640aggaagcgct catggagcag ctccagctcc agaaacacca ccagcgagtc
ctcaagcgag 2700gcctcaaact cctcctcgga gacgtgctcc gactccggag
aggagtccac tgggaccccc 2760tcctgtccta attcaaggtc ccagtatccc
agacctgctt ttccctaaca cacaaaaaaa 2820aaaacgattt tccaactacg
actgggtgtg cgagtacgag ctggccaaat ggatggatcg 2880gcccttgcgg
cactacccat cagacccccc tcactacccc tggctaccaa aaaagcctcc
2940tacccctcct acatgtagag taagtttcaa attaaagctc aatgactaaa
attcaaggcc 3000gtgggtgttt cacttcatcg gtgtctacct ctaaaagtca
ctaagcactc cgagcgtaag 3060cgaggagtgc gacccccctg cccggtagca
acttcctcgg ggtccggcgc tacgccttcg 3120gctgcgccgg gcgcctcgga
ccccccctcg acccgaatcg ctcgcgcgat tcggacctgc 3180ggcctcgggg
gggtcggggg ctttactaaa cagactctga ggtgccgttg gacactgagg
3240gggtgaacag caacgaaagt gagtggggcc aaacttcgcc ataaggcctt
taactttggg 3300tcgcttgtca gcagcttccg ggtccgcctg gaggccgcca
ttttacattc ggccgccatt 3360ttaggccctc gcgggcctcc atagtcgcac
atcagtgacg tcacggcagc catcttggct 3420gtgacgtcaa cgtcacgtgg
ggaggacggc gtgtaacccg gaagtcatcc tcatcacgcg 3480acctgacgtc
acggccgcca ttttgtgctg tccgccatct tgtgacttcc ttccgctttt
3540tgtaaaaaaa agaggaagtg tgacgtagcg gcgggggggn nnnnnnnnnn
nnnnnnncgc 3600caccaggggg cgctacgcgc ccccccccgc gcatgtgcgg
gtcccccccc tcgggggggg 3660ctccgccccc ccggcccccc cccgggctaa
atacaccgcg catgcgcggc cacgcccccg 3720ccgcc 3725256719DNAArtificial
Sequencesource1..719/mol_type="DNA" /note="synthesized zpr4.20"
/organism="Artificial Sequence" 256caattcgggc acgggactgg ccgggctatg
ggcaaggctc ttaaaaaatt cccccgctct 60gctctccggc aggacacaaa gtcatgccgt
ggagaccgcc ggtccataac gtgccaggta 120gagagaatca atggtttgca
gcgttctttc acggtcatgc tgctttctgc gggtgtggtg 180accctgttgg
gcatcttaac ggcattgctc ctcgctttcc taacgccggt ccaccgagac
240cacctccagg gctagaccag cttaatcccg agggcccggc aggtcccgga
gggccccccg 300ccatcttgcc agctctgccg gccccggcag accctgaacc
ggcaccacgg cgtggtggtg 360gggcagatgg aggcgccgcc gctggggccg
ccgccgacgc agaccatacc gggtacgaag 420aaggagacct cggggggggc
tccgcccccc cggccccccc ccgggctaaa tacaccgcgc 480atgcgcggcc
acgcccccgc cgccattttg tgcagcccgc caatttctgt tcaaacagac
540caatcaggac cttctacgtg cacttcctgg ggcgtgtcta cgaggtctat
ataagcaaca 600gcggtgacga atggtagagt ttttcttcgc ccgtccgcgg
cgagagcgcg agcgaagcga 660gcgatcgagc gtcccgtggg cgggtgccgt
aggtgagttt acacaccgaa gtcaagggg 7192573758DNAArtificial
Sequencesource1..3758/mol_type="DNA" /note="synthesized tth25"
/organism="Artificial Sequence" 257aagtacgtca ctaaccacgt gactcccgca
ggccaaccag agtctacgtc gtgcacttcc 60tgggcatggt ctacatcata atataagaac
gtgcacttcc gaatggctga gttttccacg 120cccgtccgca gcgagaacgc
cacggaggga gatcctcgcg tcccgagggc gggtgccgga 180ggtgagttta
cacaccgcag tcaaggggca attcgggctc gggactggcc gggccccggg
240caaggctctt aaaaaatgcg ttttcgcagg gttgcccaga aaaggaaagt
gcttttgcaa 300actgtgccag ctgcaaagaa ggctaggcgg cttctaggta
tgtggcagcc ccccacgcac 360aatgtcccgg gcatcgagag aaactggtac
gagagctgtt ttagatccca cgctgctgtt 420tgtggctgtg gcgattttgt
tggccatctt aatcatctgg caactactct gggtcgtcct 480ccgcgtcctg
ggcccccagg cggaccccgc acgccgcaaa taagaaacct gccagcgctc
540ccggcgcccc agggcgagcc cggtgacaga gcgccatggc atggggcttc
tggggccgac 600gccgccggtg gagacgatgg agagcgcggc gcagacggtg
gagaccccgc agacgtagga 660gacgacgccc tactcgccgc tttcgagctc
gtcgaagagt aaggaggcgc ggggggaggt 720ggcgcagacg ctacagaaaa
tggcgacggg gcagacgcag acggactcat agaaaaaaga 780tagtcataaa
acagtggcaa ccaaacttta taagacgctg ctacgtcata gggtacttac
840cacttatatt ctgcggcgaa aatacaaccg cccagaactt tgccactcac
tcggacgaca 900tgataagcaa aggaccgtac ggggggggca tgactaccac
caaattcact ctgagaatac 960tgtacgacga gtttaccagg tttatgaact
tttggactgt cagtaacgaa gacctagacc 1020tgtgtagata cgtgggctgc
aaactaatat tttttaaaca ccccacggtg gactttatag 1080tacagataaa
cactcagcct cctttcttag acacgcacct caccgcggcc agcatacacc
1140cgggcatcat gatgctcagc aagagacaca tactaatacc ctctctaaag
acccggccca 1200gcagaaaaca cagggtggtc gtcagggtgg gcgccccaag
actttttcag gacaagtggt 1260acccccagtc agacctgtgt gacacagttc
tgctttccat atttgcaacc gcctgcgact 1320tgcaatatcc gttcggctca
ccactaactg acaacccttg cgtcaacttc cagatcctgg 1380ggccccagta
caaaaaacac cttagtatta gctccactat ggatcaaact aacgaaaacc
1440attataaaga aaacttattt aacaaaactg aactatacaa cacctttcaa
accatagctc 1500agcttaaaga gacaggacac atttcaggca ttagtcctac
ttggaatgaa gtccagaatt 1560caacaacact tactaaagga ggtgacaatg
ccactcagag tagagacact tggtataaag 1620gaaatacata caacgagaag
atatgcgagt tagcacaaat aaccagaaac agatttaaaa 1680atgcaaccaa
aggagcacta ccaaactacc ccacaataat gtccacagac ctatatgaat
1740accactcagg catacactcc agcatatatc tatcagctgg caggagctac
tttgaaacca
1800ccggggccta ctctgacatt atatacaacc ctttcacaga caaaggcaca
ggcaacataa 1860tctggataga ctacctcaca aaagaagaca ccatttttgt
gaaaaacaaa agcaaatgcg 1920agataatgga catgcccctg tgggcggcct
gcacaggata cacagagttt tgtgcaaagt 1980atacaggcga ctctgccatt
atctacaatg caagaatact cataagatgc ccatacactg 2040agcccatgtt
aatagaccac tcagacccaa acaaaagctt cgttccctac tcatttaact
2100ttggcaacgg aaagatgccc ggaggcagct ccaacgtgcc cataagaatg
agagccaagt 2160ggtacgtgaa catattccac caaaaagaag tattagagag
catagtacag tccggaccgt 2220ttgggtacaa gggcgacata agatcagctg
tactagccat gaaatacaga tttcactgga 2280agtggggcgg aaaccctata
tccaaacagg tcgtcaggaa tccctgctcc aactccagct 2340cctccgcggc
ccatagagga cctcgcagcg tacaagcggt tgacccgaaa tacaataccc
2400cagaggtcac gtggcactcg tgggacatta gacgaggact ctttggcaaa
gcaggtatta 2460aaagaatgca acaggaatca gatgctcttt acattcctcc
aggaccaatc aagagacctc 2520gcagggacac caacgcccaa gacccagaag
agcaaaacga aagctcaggt ttcagagtcc 2580agcagcgact cccgtgggtc
cactccagcc aagagacgca aagctcccaa gaagagacgg 2640aggcgcaggg
gtcggtacaa gaccaactac tcctccagct ccgagagcag cgagttctcc
2700gactccagct ccagcaactc gcaacccaag tcctcaaagt ccaagcaggg
cacagcctac 2760accccctatt atcttcccaa gcataaacaa agcctttatg
tttgagcccc agggtcctaa 2820acccatacag gggtacaacg actggctaga
agagtacact gcttgcaaat tctgggacag 2880accccccaga aagctacaca
cagacatacc cttctacccc tgggcaccaa aaccccaaca 2940gcaagtcagg
gtgtccttta aactcaactt tcaataaaaa ttctaggccg tgggagtttc
3000acttgtcggt gtctgcttct taaggtcgcc aagcactccg agcgccagcg
aggagtgcga 3060ccccccctcc ggtagcaacg ccttcggagc cgcgcgctac
gccttcggct gcgcgcggca 3120cctcagaccc cccctccacc cgaaacgctt
gcgcgtttcg gaccttcggc gtcggggggg 3180tcgggagctt tattaaacag
actccgagtt gccattggac actggagctg tgaatcagta 3240acgaaagtga
gtggggccag acttcgccat agggccttta tcttctcgcc attggatagt
3300gtccggggtc gccgtaggct tcggcctcgt ttttaggcct tccggactac
aaaaatggcg 3360gttttagtga cgtcacggcc gccattttaa gtaaggcgga
agcagctcca ctttctcaca 3420aaatggcggc ggagcacttc cggcttgccc
aaaatggcgg gcaagctctt ccgggtaaag 3480ggtcagcagc tacgtcacaa
gtcacctgac tggggagggg tcacaacccg gaagccctcc 3540tcagtcacgt
ggctgttcac gtggttgcta cgtcatcggc gccatcttgt gtcgcaaaat
3600ggcggacaac ttccgctttt ttaaaaaaag gcgcgaaaaa acggcggcgg
cggcgcgcgc 3660gctgtgcgcg cgcgccgggg gggcgccagc gccccccccc
ccgcgcatgc gcgggtcccc 3720ccccccgcgg ggggctccgc cccccggccc cccccccg
3758
258 621 DNA Artificial Sequence source 1..621 /mol_type="DNA"
/note="synthesized zpr9.6" /organism="Artificial Sequence"
258 ccgcagcgag aacgccacgg agggagatcc tcgcgtcccg agggcgggtg
ccggaggtga 60gtttacacac cgcagtcaag gggcaattcg ggctcgggac tggccgggcc
ccgggcaagg 120ctcttaaaaa atgcgttttc gcagggttgc ccagaaaagg
aaagtgcttt tgcaaactgt 180gccagctgca aagaaggcta ggcggcttct
aggtatgtgg cagcccccca cgcacaatgt 240cccgggcatc gagagaaact
ggtacgagag ctgttttaga tcccacgctg ctgtttgtgg 300ctgtggcgat
tttgttggcc atcttaatca tctggcaact actctgggtc gtcctccgcg
360tcctgggccc ccaggcggac cccgcacgcc gcaaataaga aacctgccag
cgctcccggc 420gccccagggc gagcccggtg acagagcgcc atggcatggg
gcttctgggg ccgacgccgc 480cggtggagac gatggagagc gcggcgcaga
cggtggagac cccgcaggcc aaccagagtc 540tacgtcgtgc acttcctggg
catggtctac atcataatat aagaacgtgc acttccgaat 600ggctgagttt
tccacgcccg t 621
259 3758 DNA Artificial Sequence source 1..3758 /mol_type="DNA" /note="synthesized ttrh215"
/organism="Artificial Sequence" 259 aaagtacgtc actaaccacg tgactcccac
aggccaacca cagtctacgt cgtgcatttc 60ctgggcatgg tctacatcat aatataagaa
ggcgcacttc cgaatggctg agttttccac 120gcccgtccgc agcgagaacg
ccacggaggg agatcctcgc gtcccgaggg cgggtgccgg 180aggtgagttt
acacaccgca gtcaaggggc aattcgggct cgggactggc cgggccctgg
240gcaaggctct taaaaaatgc gctttcgcag ggttgcggag aaaaggaaag
tgcttctgca 300aactctgcga gctgcaaagc aggctaggcg gcttctaggt
atgtggcagc cccccgcgca 360caatgtcccc ggcatcgaga gaaactggta
cgagagctgc ttcaggtctc acgctgctgt 420ttgtggctgt ggcgactttg
ttggccatat taatcatttg gcaactactc tgggtcgtcc 480tccgcgtcct
gggcccccag gcggaccccg cacgccgcaa ataagaaacc tgccagcgct
540cccggcgccc cagggcgagc ccggtgacag agcgccatgg cgtggggttt
ctggggccga 600cgccgccggt ggagacggtg gagagcgcgg cgcagacggt
ggagaccccg gagacgtagg 660agacgacgcc ctgctcgccg ctttcgagct
cgtcgaagag taaggagacg cggggggagg 720tggcgcagac gctacagaaa
atggcgacgg ggcagacgca gacggactca cagaaaaaag 780ataattataa
aacagtggca accaaacttt attagacgct gctacataat aggatgccta
840cctctcgttt tctgtggcga aaatacaacc gcccagaact atgccactca
ctcagacgat 900atgataagca aaggaccgta cggggggggc atgactacca
cgaaattcac tctgagaata 960ctgtacgacg agtttaccag gtttatgaac
ttttggactg tcagtaacga agacctagac 1020ctgtgtagat acgtgggctg
caaactgata ttttttaaac accccacggt ggactttatg 1080gtacagataa
acactcagcc tcctttctta gacacaagcc tcaccgcggc cagcatacac
1140ccgggcatca tgatgctcag caagagacgc atattaatac cctctctaaa
gacccggccg 1200agcagaaaac acagggtggt cgtcagggtg ggcgccccaa
gactttttca ggacaagtgg 1260tacccccagt cagacctatg tgacacagtt
ctgctttcca tatttgcaac cgcccgcgac 1320ttgcaatatc cgttcggctc
accactaact gacaaccctt gcgtcaactt ccagatcctg 1380gggccccagt
acaaaaaaca ccttagtatt agctccacta tggatgatac taacaaacag
1440cactataaca gcaacttatt taataaaact gcactataca acacctttca
aaccatagcc 1500cggcttaaag agacaggaca aactgcaaac attagtccaa
gttggagtga agtacaaaac 1560acaaaactac tagatcacac aggtgctaat
gcaactgcca gcagagacac ttggtacaag 1620ggaaacacat acaatgacta
catacaacag ttagcagaga aaacaagaga aaggtttaaa 1680aaagcaacaa
tgtcagcact accaaactac cccacaataa tgtccacaga cttatacgaa
1740taccactcag gcatatactc cagcatattt ctatcagctg gcaggagcta
ctttgaaacc 1800actggggcct actctgacat tatatacaac cctttgacag
acaaaggcac aggcaacata 1860atctggatag actaccttac aaaagacgac
acaatctttg taaaaaacaa aagcaaatgt 1920gagataatgg acatgcccct
gtgggcggcc ggcacaggat acacagagtt ttgtgcaaag 1980tacacaggag
actctgccat tatttacaat gccagaatac tcataagatg cccatacact
2040gaacccatgc taatagacca ctcagaccca aacaaaggct ttgtaccgta
ctcatttaac 2100tttggcaacg gaaagatgcc gggaggcagc tccaacgtgc
ccataagaat gagagccaag 2160tggtacgtaa acatattcca ccaaaaagaa
gtattggaga gcatagtaca gtccggaccg 2220ttcgggtaca ggggcgacat
aaaatcagct gtactgtcca tgaaatacag atttcactgg 2280aaatggggcg
gaaaccctat atccaaacag gtcgtcagga atccctgctc caactccagc
2340acctccgcgg cccatagagg acctcgcagc gtacaagcgg ttgacccgaa
atacaatacc 2400ccagaagtca cttggcactc gtgggacatc agacgaggac
tctttggcaa agcaggtatt 2460aaaagaatgc aacaagaatc agatgctctt
tacgttcctg caggaccact caagaggcct 2520cgcagagaca ccaacgccca
agacccggaa aagcaaaacg aaagctcacg tttcggagtc 2580cagcagcgac
tcccgtgggt ccactccagc caagagacgc aaagctccga agaagagacg
2640caggcgcagg ggtcggtaca agaccaacta ctcctccagc tccgagagca
gcgagtactc 2700cgactccagc tccaacaact cgcaccccaa gtcctcaaag
ttcaagcagg acacagccta 2760caccccctat tatcctccca agcataaaca
aagcctatat gtttgaaccc cagggtccta 2820aacccataca ggggtacaac
gattggctag aggagtacac tagttgcaag ttccgggaca 2880gacccccgag
aatgctacac acagacttac ccttttaccc ctgggcacca aaaccccaag
2940accaagtcag ggtaaccttt aaactcaact ttcaataaaa attctaggcc
gtgggacttt 3000cacttgtcgg tgtctgcttc ttaaggtcgc caagcactcc
gagcgtcagc gaggagtgcg 3060accccccccc tcggtagcaa cgccttcgga
gccgcgcgct acgccttcgg ctgcgcgcgg 3120cacctcagac cccccctcca
cccgaaacgc ttgcgcgttt cggaccttcg gcgtcggggg 3180ggtcgggagc
tttattaaac agactccgag ttgccattgg acactggagc tgtgaatcag
3240taacgaaagt gagtggggcc agacttcgcc atagggcctt tatcttctcg
ccattggata 3300gtgtccgggg ttgccgtagg cttcggcctc gtttttaggc
cttccggact acaaaaatgg 3360cggattttgt gacgtcacgg ccgccatttt
aagtaaggcg gaagcagctc caccctctca 3420cataatggcg gcggagcact
cccggcttgc ccaaaatggc gggcaagctc ttccgggtca 3480aaggttggca
gctacgtcac aagtcacctg actggggagg agttacatcc cggaagttct
3540cctcggtcac gtgactgtac acgtgactgc tacgtcattg acgccatctt
gtgtcacaaa 3600atggcggtgc acttccgctt ttttgaaaaa aggcgcgaaa
aaacggcggc ggcggcgcgc 3660gcgctgcgcg cgcgcgccgg gggggcgcca
gcgccccccc ccccgcgcat gcacgggtcc 3720ccccccccac ggggggctcc
gccccccggc cccccccc 3758
260 642 DNA Artificial Sequence source 1..642 /mol_type="DNA" /note="synthesized zpr12.24"
/organism="Artificial Sequence" 260 cagcgagaac gccacggagg gagatcctcg
cgtcccgagg gcgggtgccg gaggtgagtt 60tacacaccgc agtcaagggg caattcgggc
tcgggactgg ccgggccccg ggcaaggctc 120ttaaaaaatg cgctttcgca
gggttgctga gaaaaggaaa gtgcttctgc aaactgtgcg 180agctacacag
aagactaggc ggcttctaag ccgcccacag gggcatgtct acatgcttcc
240gcagcgagaa cgccacggag ggagatcctc gcgtcccgag ggcgggtgcc
ggaggtgagt 300ttacacaccg cagtcaaagg gcaattcggg ctcgggactg
gccgggcccc gggcaaggct 360cttaaaaaat gcgctttcgc ggggttgctg
agaaaaggaa agtgcttctg caaactgtgc 420gagctacaca gaagactagg
cggcttctag gtatgtggca gccccccgtg cacaatgtcc 480ccggcatctt
attagtactc tggcgttgta gataatggca gagtctccag tgtactttgc
540acagaactct gtgtatcctg tgcaggccgc ccacaggggc atgtctacat
cataatataa 600taaggcgcac ttccgaatgg ctgagttttc cacgcccgtc cg
642
261 23 PRT Artificial Sequence SOURCE 1..23 /mol_type="protein"
/note="synthesized zyb2.1. peptide" /organism="Artificial Sequence"
261 Arg Val Pro Lys Val Ser Leu His Thr Ala Val Lys Gly Gln Phe Gly
1 5 10 15 Leu Gly Thr Gly Arg Ala Met 20
262 23 PRT Artificial Sequence SOURCE 1..23 /mol_type="protein"
/note="synthesized zyb9.1 peptide" /organism="Artificial Sequence"
262 Arg Val Pro Lys Val Ser Leu His Thr Ala Val Lys Gly Gln Phe Gly
1 5 10 15 Leu Gly Thr Gly Arg Ala Met 20
263 23 PRT Artificial Sequence SOURCE 1..23 /mol_type="protein"
/note="synthesized zyb69.1 peptide" /organism="Artificial Sequence"
263 Arg Val Pro Glu Val Ser Leu His Thr Ala Val Lys Gly Gln Phe Gly
1 5 10 15 Leu Gly Thr Gly Arg Ala Met 20
264 23 PRT Artificial Sequence SOURCE 1..23 /mol_type="protein"
/note="synthesized zyb2.3. peptide" /organism="Artificial Sequence"
264 Gly Ala Glu Gly Glu Phe Thr His Arg Ser Gln Gly Ala Ile Arg Ala
1 5 10 15 Arg Asp Trp Pro Gly His Gly 20
265 23 PRT Artificial Sequence SOURCE 1..23 /mol_type="protein"
/note="synthesized zyb9.3. peptide" /organism="Artificial Sequence"
265 Gly Ala Glu Gly Glu Phe Thr His Arg Ser Gln Gly Ala Ile Arg Ala
1 5 10 15 Arg Asp Trp Pro Gly Tyr Gly 20
266 23 PRT Artificial Sequence SOURCE 1..23 /mol_type="protein"
/note="synthesized zyb5.3. peptide" /organism="Artificial Sequence"
266 Gly Ala Val Gly Glu Phe Thr His Arg Ser Gln Gly Ala Ile Arg Ala
1 5 10 15 Arg Asp Trp Pro Gly Tyr Gly 20
267 23 PRT Artificial Sequence SOURCE 1..23 /mol_type="protein"
/note="synthesized zkb69.3 peptide" /organism="Artificial Sequence"
267 Gly Ala Gly Gly Glu Phe Thr His Arg Ser Gln Gly Ala Ile Arg Ala
1 5 10 15 Arg Asp Trp Pro Gly Tyr Gly 20
268 23 PRT Artificial Sequence SOURCE 1..23 /mol_type="protein"
/note="synthesized Q9WB09_9Viru" /organism="Artificial Sequence"
268 Arg Val Pro Lys Val Ser Leu His Thr Glu Val Lys Gly Gln Phe Gly
1 5 10 15 Leu Gly Thr Gly Arg Ala Met 20
269 23 PRT Artificial Sequence SOURCE 1..23 /mol_type="protein"
/note="synthesized q9wb09_9viru" /organism="Artificial Sequence"
269 Arg Val Pro Lys Val Ser Leu His Thr Glu Val Lys Gly Gln Phe Gly
1 5 10 15 Leu Gly Thr Gly Arg Ala Met 20
270 204 PRT Artificial Sequence SOURCE 1..204 /mol_type="protein"
/note="synthesized torque teno virus" /organism="Artificial Sequence"
270 Cys Thr Ser Glu Trp
Leu Ser Phe Pro Arg Pro Ser Ala Ala Ala Xaa 1 5 10 15 Pro Arg Arg
Val Ile Pro Ala Ser Arg Trp Arg Val Pro Lys Val Ser 20 25 30 Leu
His Thr Ala Val Lys Gly Gln Phe Gly Leu Gly Thr Gly Arg Ala 35 40
45 Met Gly Lys Ala Leu Lys Val Phe Ile Leu Lys Met His Phe Ser Arg
50 55 60 Ile Ser Arg Ser Lys Arg Lys Val Leu Leu Pro Ala Leu Pro
Ala Pro 65 70 75 80Pro Pro Pro Arg Gln Leu Leu Met Trp Gln Pro Pro
Ile Gln Asn Gly 85 90 95 Thr Gln Leu Asp Arg His Trp Phe Glu Ser
Val Trp Arg Ser His Ala 100 105 110 Ala Tyr Cys Gly Cys Gly Asp Cys
Val Gly His Leu Gln His Leu Ala 115 120 125 Ala Asn Leu Gly Arg Pro
Pro His Pro Gln Pro Pro Arg Glu Gln His 130 135 140 Pro Pro Gln Ile
Arg Gly Leu Pro Ala Leu Pro Ala Pro Pro Ser Asn 145 150 155 160Arg
Asn Ser Trp Pro Gly Thr Gly Gly Asp Ala Ala Gly Glu Gln Ala 165 170
175 Gly Gly Ser Arg Gly Ala Gly Asp Gly Gly Asp Gly Glu Leu Ala Asp
180 185 190 Asp Asp Leu Xaa Asp Ala Ala Ala Leu Val Glu Glu 195 200
271 138 PRT Artificial Sequence SOURCE 1..138 /mol_type="protein"
/note="synthesized torque teno virus" /organism="Artificial Sequence"
271 Ala Val Lys Pro Arg Arg Glu Ile Ser Ala Ser Arg Gly
Arg Val Pro 1 5 10 15 Lys Val Ser Leu His Thr Glu Val Lys Gly Gln
Phe Gly Leu Gly Thr 20 25 30 Gly Arg Ala Met Gly Lys Ala Leu Lys
Lys Ser Met Phe Ile Gly Arg 35 40 45 His Tyr Arg Lys Lys Arg Ala
Leu Ser Leu Cys Ala Val Arg Thr Thr 50 55 60 Lys Lys Ala Cys Lys
Leu Leu Ile Val Met Trp Thr Pro Pro Arg Asn 65 70 75 80Asp Gln Gln
Tyr Leu Asn Trp Gln Trp Tyr Ser Ser Val Leu Ser Ser 85 90 95 His
Ala Ala Met Cys Gly Cys Pro Asp Ala Ile Ala His Leu Ser His 100 105
110 Leu Ala Phe Val Phe Arg Ala Pro Gln Asn Pro Pro Pro Pro Gly Pro
115 120 125 Gln Arg Asn Leu Pro Leu Arg Arg Leu Pro 130 135
272 202 PRT Artificial Sequence SOURCE 1..202 /mol_type="protein"
/note="synthesized torque teno virus" /organism="Artificial Sequence"
272 Met Ala Glu Phe Ser Thr Pro Val Arg Ser Gly Glu Ala
Thr Glu Gly 1 5 10 15 Asp His Arg Val Pro Arg Ala Gly Ala Glu Gly
Glu Phe Thr His Arg 20 25 30 Ser Gln Gly Ala Ile Arg Ala Arg Asp
Trp Pro Gly Tyr Gly Gln Gly 35 40 45 Ser Glu Lys Ser Met Phe Ile
Gly Arg His Tyr Arg Lys Lys Arg Ala 50 55 60 Leu Ser Leu Cys Ala
Val Arg Thr Thr Lys Lys Ala Cys Lys Leu Leu 65 70 75 80Ile Val Met
Trp Thr Pro Pro Arg Asn Asp Gln Gln Tyr Leu Asn Trp 85 90 95 Gln
Trp Tyr Ser Ser Val Leu Ser Ser His Ala Ser Met Cys Gly Cys 100 105
110 Pro Asp Ala Val Ala His Leu Ile Asn Leu Ala Ser Val Leu Arg Ala
115 120 125 Pro Gln Asn Pro Pro Pro Pro Gly Pro Gln Arg Asn Leu Pro
Leu Arg 130 135 140 Arg Leu Pro Ala Leu Pro Ala Ala Pro Glu Ala Pro
Gly Asp Arg Ala 145 150 155 160Pro Trp Pro Met Ala Gly Gly Ala Glu
Gly Glu Asn Gly Gly Ala Gly 165 170 175 Gly Asp Ala Asp His Gly Gly
Ala Ala Gly Gly Pro Glu Asp Ala Asn 180 185 190 Leu Leu Asp Ala Val
Ala Ala Ala Glu Thr 195 200
273 23 PRT Artificial Sequence SOURCE 1..23 /mol_type="protein"
/note="synthesized q9wb0_9viru" /organism="Artificial Sequence"
273 Arg Val Pro Lys Val Ser Leu His Thr Glu Val Lys Gly Gln Phe Gly
1 5 10 15 Leu Gly Thr Gly Arg Ala Met 20
274 204 PRT Artificial Sequence SOURCE 1..204 /mol_type="protein"
/note="synthesized torque teno virus" /organism="Artificial Sequence"
274 Cys Thr Ser Glu Trp
Leu Ser Phe Pro Arg Pro Ser Ala Ala Ala Xaa 1 5 10 15 Pro Arg Arg
Val Ile Pro Ala Ser Arg Trp Arg Val Pro Lys Val Ser 20 25 30 Leu
His Thr Ala Val Lys Gly Gln Phe Gly Leu Gly Thr Gly Arg Ala 35 40
45 Met Gly Lys Ala Leu Lys Val Phe Ile Leu Lys Met His Phe Ser Arg
50 55 60 Ile Ser Arg Ser Lys Arg Lys Val Leu Leu Pro Ala Leu Pro
Ala Pro 65 70 75 80Pro Pro Pro Arg Gln Leu Leu Met Trp Gln Pro Pro
Ile Gln Asn Gly 85 90 95 Thr Gln Leu Asp Arg His Trp Phe Glu Ser
Val Trp Arg Ser His Ala 100 105 110 Ala Tyr Cys Gly Cys Gly Asp Cys
Val Gly His Leu Gln His Leu Ala 115 120 125 Ala Asn Leu Gly Arg Pro
Pro His
Pro Gln Pro Pro Arg Glu Gln His 130 135 140 Pro Pro Gln Ile Arg Gly
Leu Pro Ala Leu Pro Ala Pro Pro Ser Asn 145 150 155 160Arg Asn Ser
Trp Pro Gly Thr Gly Gly Asp Ala Ala Gly Glu Gln Ala 165 170 175 Gly
Gly Ser Arg Gly Ala Gly Asp Gly Gly Asp Gly Glu Leu Ala Asp 180 185
190 Asp Asp Leu Xaa Asp Ala Ala Ala Leu Val Glu Glu 195 200
275 138 PRT Artificial Sequence SOURCE 1..138 /mol_type="protein"
/note="synthesized torque teno virus" /organism="Artificial Sequence"
275 Ala Val Lys Pro Arg Arg Glu Ile Ser Ala Ser Arg Gly
Arg Val Pro 1 5 10 15 Lys Val Ser Leu His Thr Glu Val Lys Gly Gln
Phe Gly Leu Gly Thr 20 25 30 Gly Arg Ala Met Gly Lys Ala Leu Lys
Lys Ser Met Phe Ile Gly Arg 35 40 45 His Tyr Arg Lys Lys Arg Ala
Leu Ser Leu Cys Ala Val Arg Thr Thr 50 55 60 Lys Lys Ala Cys Lys
Leu Leu Ile Val Met Trp Thr Pro Pro Arg Asn 65 70 75 80Asp Gln Gln
Tyr Leu Asn Trp Gln Trp Tyr Ser Ser Val Leu Ser Ser 85 90 95 His
Ala Ala Met Cys Gly Cys Pro Asp Ala Ile Ala His Leu Ser His 100 105
110 Leu Ala Phe Val Phe Arg Ala Pro Gln Asn Pro Pro Pro Pro Gly Pro
115 120 125 Gln Arg Asn Leu Pro Leu Arg Arg Leu Pro 130 135
276 138 PRT Artificial Sequence SOURCE 1..138 /mol_type="protein"
/note="synthesized torque teno virus" /organism="Artificial Sequence"
276 Ser Gly Glu Ala Thr Glu Gly Asp Leu Arg Val Pro Arg
Ala Gly Ala 1 5 10 15 Glu Gly Glu Phe Thr His Arg Ser Gln Gly Ala
Ile Arg Ala Arg Asp 20 25 30 Trp Pro Gly Tyr Gly Gln Gly Ser Glu
Lys Ser Met Phe Ile Gly Arg 35 40 45 His Tyr Arg Lys Lys Arg Ala
Leu Ser Leu Cys Ala Val Arg Thr Thr 50 55 60 Lys Lys Ala Cys Lys
Leu Leu Ile Val Met Trp Thr Pro Pro Arg Asn 65 70 75 80Asp Gln Gln
Tyr Leu Asn Trp Gln Trp Tyr Ser Ser Val Leu Ser Ser 85 90 95 His
Ala Ala Met Cys Gly Cys Pro Asp Ala Val Ala His Phe Asn His 100 105
110 Leu Ala Ala Val Leu Arg Ala Pro Gln Asn Pro Pro Pro Pro Gly Pro
115 120 125 Gln Arg Asn Leu Pro Leu Arg Arg Leu Pro 130 135
277 23 PRT Artificial Sequence SOURCE 1..23 /mol_type="protein"
/note="synthesized peptide of subject 24" /organism="Artificial Sequence"
277 Gly Ala Glu Gly Glu Phe Thr His Arg Ser Gln Gly Ala Ile Arg Ala
1 5 10 15 Arg Asp Trp Pro Gly Tyr Gly 20
278 202 PRT Artificial Sequence SOURCE 1..202 /mol_type="protein"
/note="synthesized torque teno virus" /organism="Artificial Sequence"
278 Met Ala Glu Phe Ser Thr Pro Val Arg Ser Gly Glu Ala
Thr Glu Gly 1 5 10 15 Asp His Arg Val Pro Arg Ala Gly Ala Glu Gly
Glu Phe Thr His Arg 20 25 30 Ser Gln Gly Ala Ile Arg Ala Arg Asp
Trp Pro Gly Tyr Gly Gln Gly 35 40 45 Ser Glu Lys Ser Met Phe Ile
Gly Arg His Tyr Arg Lys Lys Arg Ala 50 55 60 Leu Ser Leu Cys Ala
Val Arg Thr Thr Lys Lys Ala Cys Lys Leu Leu 65 70 75 80Ile Val Met
Trp Thr Pro Pro Arg Asn Asp Gln Gln Tyr Leu Asn Trp 85 90 95 Gln
Trp Tyr Ser Ser Val Leu Ser Ser His Ala Ser Met Cys Gly Cys 100 105
110 Pro Asp Ala Val Ala His Leu Ile Asn Leu Ala Ser Val Leu Arg Ala
115 120 125 Pro Gln Asn Pro Pro Pro Pro Gly Pro Gln Arg Asn Leu Pro
Leu Arg 130 135 140 Arg Leu Pro Ala Leu Pro Ala Ala Pro Glu Ala Pro
Gly Asp Arg Ala 145 150 155 160Pro Trp Pro Met Ala Gly Gly Ala Glu
Gly Glu Asn Gly Gly Ala Gly 165 170 175 Gly Asp Ala Asp His Gly Gly
Ala Ala Gly Gly Pro Glu Asp Ala Asn 180 185 190 Leu Leu Asp Ala Val
Ala Ala Ala Glu Thr 195 200
279 152 PRT Artificial Sequence SOURCE 1..152 /mol_type="protein"
/note="synthesized torque teno virus" /organism="Artificial Sequence"
279 Ala Arg Thr Pro Arg
Arg Gly Val Arg Ala Ser Arg Gly Arg Val Pro 1 5 10 15 Glu Val Ser
Leu His Thr Ala Val Lys Gly Gln Phe Gly Leu Gly Thr 20 25 30 Gly
Arg Ala Met Gly Lys Ala Leu Lys Lys Ala Met Phe Leu Gly Arg 35 40
45 Ile Tyr Arg Lys Lys Arg Arg Leu Pro Leu Ser Pro Leu His Ser Pro
50 55 60 Pro Lys Ala Arg Lys Leu Leu Arg Gly Met Trp Arg Pro Pro
Thr Gln 65 70 75 80Asn Val Ser Gly Gln Glu Arg Ser Trp Tyr Asp Ser
Val Phe Tyr Ser 85 90 95 His Ala Ala Phe Cys Gly Cys Gly Asp Cys
Val Gly His Leu Ser Tyr 100 105 110 Leu Ala Thr His Leu Gly Arg Pro
Pro Ser Ala Gln Pro Pro Pro Gln 115 120 125 Leu Gln Pro Pro Val Ile
Arg Arg Leu Pro Ala Leu Pro Ala Pro Pro 130 135 140 Asn Pro Ser Gly
Asp Arg Ala Ala 145 150
280 49 PRT Artificial Sequence SOURCE 1..49 /mol_type="protein"
/note="synthesized torque teno virus" /organism="Artificial Sequence"
280 Met Ala Glu Phe Ser
Thr Pro Val Arg Ser Glu Gly Ala Thr Glu Gly 1 5 10 15 Ile Pro Asn
Val Pro Arg Ala Gly Ala Gly Gly Glu Phe Thr His Arg 20 25 30 Ser
Gln Gly Ala Ile Arg Ala Arg Asp Trp Pro Gly Tyr Gly Gln Gly 35 40
45 Ser
* * * * *
References