U.S. patent application number 14/330535 was filed with the patent office on 2014-07-14 and published on 2014-11-27 for a method for the in vitro diagnosis or prognosis of testicular cancer.
The applicant listed for this patent is BIOMERIEUX. Invention is credited to Juliette GIMENEZ, Francois MALLET, Cecile MONTGIRAUD.
Application Number | 20140349875 14/330535 |
Document ID | / |
Family ID | 39855745 |
Filed Date | 2014-07-14 |
United States Patent Application | 20140349875 |
Kind Code | A1 |
GIMENEZ; Juliette; et al. | November 27, 2014 |
METHOD FOR THE IN VITRO DIAGNOSIS OR PROGNOSIS OF TESTICULAR CANCER
Abstract
A method for in vitro diagnosis of testicular cancer includes
(i) obtaining a biological sample from a patient suspected of
having testicular cancer, (ii) performing an assay to determine the
methylation status of CpG dinucleotides in a genomic DNA target
sequence, the DNA target sequence being the 5' LTR U3 promoter
sequence of the ERVWE1 locus, optionally further including an
activator sequence directly upstream of the 5' LTR U3 promoter
sequence, and (iii) diagnosing the patient with testicular cancer
when the DNA target sequence is hypomethylated as compared to a
methylation status indicative of the absence of testicular
cancer.
Inventors: | GIMENEZ; Juliette (Caluire et Cuire, FR); MONTGIRAUD; Cecile (Lyon, FR); MALLET; Francois (Villeurbanne, FR) |
Applicant: |
Name | City | State | Country | Type |
BIOMERIEUX | Marcy L'Etoile | | FR | |
Family ID: | 39855745 |
Appl. No.: | 14/330535 |
Filed: | July 14, 2014 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
12918126 | Aug 18, 2010 | 8815506 |
PCT/FR2009/050388 | Mar 10, 2009 | |
14330535 | | |
Current U.S. Class: | 506/9; 435/6.11 |
Current CPC Class: | C12Q 2535/125 20130101; C12Q 2600/154 20130101; C12Q 1/6886 20130101; C12Q 2600/158 20130101; C12Q 1/702 20130101; C12Q 1/686 20130101 |
Class at Publication: | 506/9; 435/6.11 |
International Class: | C12Q 1/68 20060101 C12Q001/68 |
Foreign Application Data
Date | Code | Application Number |
Mar 12, 2008 | FR | 0851621 |
Claims
1. A method for in vitro diagnosis of testicular cancer,
comprising: obtaining a biological sample from a patient suspected
of having testicular cancer; performing an assay to determine the
methylation status of CpG dinucleotides in a genomic DNA target
sequence, the DNA target sequence being at least one sequence
selected from the group consisting of sequences having at least 99%
sequence identity with the full-length sequence of SEQ ID NO: 6 or
7 and the sequences fully complementary thereto; and diagnosing the
patient with testicular cancer when the DNA target sequence is
hypomethylated as compared to a methylation status indicative of
the absence of testicular cancer, wherein the assay comprises:
extracting genomic DNA from the biological sample; treating the
extracted genomic DNA to convert cytosine bases of CpG
dinucleotides that are nonmethylated at position 5 into uracil
bases; amplifying the treated genomic DNA target sequence; and
determining the methylation status of the CpG dinucleotides in the
genomic DNA target sequence from the amplified genomic DNA target
sequence.
2. The method of claim 1, wherein the extracted genomic DNA is
treated using hydrogen sulfite, disulfite, bisulfite, or a
combination thereof.
3. The method of claim 1, wherein the treated genomic DNA target
sequence is amplified using at least one primer comprising a
sequence selected from the group consisting of the full-length
sequences of SEQ ID NOS: 46-49.
4. The method of claim 1, wherein the biological sample is a
testicular tissue extract or a biological fluid.
5. The method of claim 1, wherein the biological sample is blood,
serum, plasma, urine, or seminal fluid.
6. The method of claim 1, wherein the DNA target sequence is
hypomethylated if 60% or less of the CpG dinucleotides are
methylated.
7. The method of claim 1, wherein the DNA target sequence is
hypomethylated if 30% or less of the CpG dinucleotides are
methylated.
8. A method for in vitro diagnosis of testicular cancer,
comprising: obtaining a biological sample from a patient suspected
of having testicular cancer; performing an assay to determine the
methylation status of CpG dinucleotides in a genomic DNA target
sequence, the DNA target sequence being the 5' LTR U3 promoter
sequence of the ERVWE1 locus, optionally further including an
activator sequence directly upstream of the 5' LTR U3 promoter
sequence; and diagnosing the patient with testicular cancer when
the DNA target sequence is hypomethylated as compared to a
methylation status indicative of the absence of testicular cancer,
wherein the assay comprises: extracting genomic DNA from the
biological sample; treating the extracted genomic DNA to convert
cytosine bases of CpG dinucleotides that are nonmethylated at
position 5 into uracil bases; amplifying the treated genomic DNA
target sequence; and determining the methylation status of the CpG
dinucleotides in the genomic DNA target sequence from the amplified
genomic DNA target sequence.
9. The method of claim 8, wherein the extracted genomic DNA is
treated using hydrogen sulfite, disulfite, bisulfite, or a
combination thereof.
10. The method of claim 8, wherein the treated genomic DNA target
sequence is amplified using at least one primer comprising a
sequence selected from the group consisting of the full-length
sequences of SEQ ID NOS: 46-49.
11. The method of claim 8, wherein the biological sample is a
testicular tissue extract or a biological fluid.
12. The method of claim 8, wherein the biological sample is blood,
serum, plasma, urine, or seminal fluid.
13. The method of claim 8, wherein the DNA target sequence is
hypomethylated if 60% or less of the CpG dinucleotides are
methylated.
14. The method of claim 8, wherein the DNA target sequence is
hypomethylated if 30% or less of the CpG dinucleotides are
methylated.
Description
[0001] This is a Division of application Ser. No. 12/918,126 filed
Aug. 18, 2010, which is a National Stage entry of PCT/FR2009/050388
filed Mar. 10, 2009, which claims priority to FR 0851621 filed Mar.
12, 2008. The disclosure of the prior applications is hereby
incorporated by reference herein in its entirety.
[0002] Testicular cancer represents 1 to 2% of cancers in men, and
3.5% of urological tumors. It is the most common tumor in young
men, and rare before 15 years of age and after 50 years of age. The
risk is highest in patients who are seropositive for HIV. Seminoma
is the most common form of testicular cancer (40%), but many other
types of cancer exist, among which are embryonic carcinoma (20%),
teratocarcinoma (30%) and choriocarcinoma (1%).
[0003] The diagnosis of testicular cancer is first clinical: it
often presents in the form of a hard and irregular swelling of the
testicle. An ultrasound confirms the intratesticular tumor and
Doppler ultrasound demonstrates the increase in vascularization in
the tumor. In some cases, a magnetic resonance examination
(testicular MRI) can be useful. A thoracic, abdominal and pelvic
scan makes it possible to investigate whether there is any lymph
node involvement of the cancer. A blood sample for assaying tumor
markers is virtually systematic. It makes it possible to orient the
diagnosis of the type of tumor. Two main tumor markers are used and
assayed in the blood: β-HCG and α-foetoprotein. However,
these markers are not very specific and, furthermore, if the
concentration of these markers is at physiological levels, this
does not mean that there is an absence of tumor. At the current
time, the final diagnosis and final prognosis are given after
ablation of the affected testicle (orchidectomy), which constitutes
the first stage of treatment. Next, depending on the type of cancer
and on its stage, a complementary treatment by radiotherapy or
chemotherapy is applied. There is therefore a real need for having
markers which are specific for testicular cancer and which, in
addition, make it possible to establish as early a diagnosis and
prognosis as possible.
[0004] The rare event represented by the infection of a germline
cell by an exogenous provirus results in the integration, into the
host's genome, of a proviral DNA or provirus, which becomes an
integral part of the genetic inheritance of the host. This
endogenous provirus (HERV) is therefore transmissible to the next
generation in Mendelian fashion. It is estimated that there are
approximately a hundred HERV families representing
approximately 8% of the human genome. Each of the families has from
several tens to thousands of loci, which are the result of
intracellular retrotranspositions of transcriptionally active
copies. The loci of the contemporary HERV families are all
replication-defective, which signifies loss of the infectious
properties and therefore implies an exclusively vertical
(Mendelian) transmission mode.
[0005] HERV expression has been particularly studied in three
specific contexts, placentation, autoimmunity and cancer, which are
associated with cell differentiation or with the modulation of
immunity. It has thus been shown that the envelope glycoprotein of
the ERVWE1 locus of the HERV-W family is involved in the fusion
process resulting in syncytiotrophoblast formation. It has,
moreover, been suggested that the Rec protein, which is a splice
variant of the env gene of HERV-K, could be involved in the
testicular tumorigenesis process. However, the following question
has not yet been answered: are HERVs players or markers in
pathological contexts?
[0006] The present inventors have now discovered and demonstrated
that nucleic acid sequences belonging to loci of the HERV-W family
are associated with testicular cancer and that these sequences are
molecular markers for the pathological condition. The sequences
identified are U3 retroviral promoter sequences of 5' LTRs (Long
Terminal Repeats) which are hypomethylated in a cancerous
biological sample.
[0007] In mammals, DNA can be methylated on the cytosines preceding
a guanine (CpG doublet). This involves the transfer of a methyl
group from S-adenosyl methionine to a cytosine residue so as to
form 5-methylcytosine. The methylation of CpG doublets located in a
promoter sequence generally results in an underexpression, or even
a lack of expression, of the associated gene. Conversely, if the
CpG doublets contained in a promoter sequence are hypomethylated,
the expression of the associated gene is favored. The role of
methylation in carcinogenesis has been recently studied. Thus,
hypermethylation on the CpG doublets can result in the
underexpression of a tumor suppressor gene, whereas, conversely,
hypomethylation of CpG doublets can cause the activation of
protooncogenes.
[0008] The subject of the present invention is therefore a method
for in vitro diagnosis or prognosis of testicular cancer, in a
biological sample from a patient suspected of suffering from
testicular cancer, characterized in that it comprises a step of
detecting the presence or absence of methylation of CpG
dinucleotides in at least one genomic DNA target sequence of the
sample, the target sequence being selected from at least one of the
sequences identified in SEQ ID Nos. 1 to 7 or from at least one
sequence which exhibits at least 99% identity, preferably at least
99.5% identity, and advantageously at least 99.6% identity, with
one of the sequences identified in SEQ ID Nos. 1 to 7 and the
sequences complementary thereto.
[0009] The percentage identity described above has been determined
while taking into consideration the nucleotide diversity in the
genome. It is known that nucleotide variability is higher in the
regions of the genome that are rich in repeat sequences than in the
regions which do not contain repeat sequences. By way of example,
D. A. Nickerson et al. [1] have shown a diversity of
approximately 0.3% (0.32%) in regions containing repeat
sequences.
[0010] The sequences SEQ ID Nos. 1 to 6 correspond, respectively,
to the sequences of the U3 retroviral promoters of the HW4TT,
HW2TT, HW13TT, HWXTT, HW21TT and ERVWE1 loci, and SEQ ID No. 7
corresponds to the sequence of the activator plus the sequence of
the U3 region of ERVWE1.
[0011] The sample from the patient will generally comprise cells
(such as the testicular cells). They may be present in a tissue
sample (such as the testicular tissue) or be found in the
circulation. In general, the sample is a testicular tissue extract
or a biological fluid, such as blood, serum, plasma, urine or else
seminal fluid.
[0012] More particularly, the method comprises:
(i) extraction of the genomic DNA to be analyzed from the sample,
(ii) treatment of the extracted genomic DNA with one or more
reagents so as to convert the cytosine bases, of the CpG
dinucleotides, which are nonmethylated at position 5, into uracil,
(iii) at least one amplification of the treated DNA by bringing
into contact with at least two primers,
(iv) determination, on the basis of the presence or absence of
methylation of the cytosines of the CpG dinucleotides, of a
methylation state of said target sequence or of a value which
reflects the methylation state of the target sequence, for example
the ratio of the number of methylated cytosines of the CpG
dinucleotides/total number of cytosines of the CpG dinucleotides.
In particular, if the ratio, corresponding to a percentage
methylation, is less than or equal to 80%, preferably less than or
equal to 60%, and advantageously less than or equal to 30%, this
can be correlated with a presumption of testicular cancer.
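The ratio of step (iv) and its comparison with the three levels cited above can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and the counts passed in the example are arbitrary rather than taken from a sample.

```python
from typing import Optional

# Levels quoted in the description: <= 80 %, preferably <= 60 %,
# advantageously <= 30 %. Ordered from strictest to loosest.
THRESHOLDS = (30.0, 60.0, 80.0)


def methylation_percentage(methylated: int, total: int) -> float:
    """Methylated CpG cytosines / total CpG cytosines, as a percentage."""
    if total <= 0 or not 0 <= methylated <= total:
        raise ValueError("need 0 <= methylated <= total and total > 0")
    return 100.0 * methylated / total


def strictest_threshold_met(percentage: float) -> Optional[float]:
    """Return the strictest quoted level satisfied, or None if above 80 %."""
    for level in THRESHOLDS:
        if percentage <= level:
            return level
    return None


pct = methylation_percentage(3, 10)
print(pct, strictest_threshold_met(pct))  # → 30.0 30.0
```

Returning the strictest level that is satisfied, rather than a yes/no answer, keeps the distinction the text draws between the general, preferred and advantageous levels.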
[0013] If necessary, the method comprises a second amplification
step after the amplification step described in (iii), which
consists in bringing the amplicons obtained in (iii) into contact
with at least two primers in order to amplify the target
sequence.
[0014] The term "target sequence" is intended to mean a sequence or
the sequences of a set of clones.
[0015] The determination, in the DNA, of the degree of methylation
is carried out by any suitable technique. The methylation state or
status of a DNA sequence can be established by methods using
methylation-sensitive restriction enzymes or by methods involving a
chemical modification of the DNA with sodium bisulfite, hydrogen
sulfite or disulfite, preferably with a solution of sodium
bisulfite, which converts the nonmethylated cytosines into uracils
while at the same time not modifying the 5-methylcytosines. The
analysis of the methylation can be carried out by conventional
methods, such as sequencing, hybridization or PCR. Several methods
of analysis use the ammonium bisulfite conversion technique, such
as bisulfite sequencing PCR (conversion with ammonium bisulfite,
amplification of the sequence of interest and sequencing), MSP
(Methylation Specific PCR) and MSO (Methylation Specific
Oligonucleotide Microarray) using DNA chips specific for the
modified DNA. All these methods are well known to those skilled in
the art and mention may be made, by way of illustration, of S. E.
Cottrell [2].
[0016] Thus, in step (ii) of the abovementioned method, the
treatment of the genomic DNA comprises the use of a solution
selected from the group consisting of hydrogen sulfite, disulfite
and bisulfite, and combinations thereof; preferably, a solution of
sodium bisulfite.
[0017] In one embodiment of the invention, the method for in vitro
diagnosis and/or prognosis of testicular cancer comprises:
(i) extraction of the DNA to be analyzed from the sample from the
patient,
(ii) determination, in the DNA to be analyzed, of the degree
(percentage) of methylation of the cytosines of the CpG
dinucleotides included in at least one of the DNA sequences
identified in SEQ ID Nos. 1 to 7 or in at least one sequence which
exhibits at least 99% identity, preferably at least 99.5%,
advantageously at least 99.6% identity, with a sequence identified
in SEQ ID Nos. 1 to 7, and
(iii) comparison of the degree (percentage) of methylation of the
cytosines in one or more DNA sequences as defined in (ii) with the
degree (percentage) of methylation of said cytosines of said
sequence(s) present in the DNA extracted from a noncancerous
biological sample; if the degree of methylation in the DNA to be
analyzed is determined as being less than the degree of methylation
in the DNA extracted from the noncancerous biological sample, this
can be correlated with the diagnosis or prognosis of a testicular
cancer.
[0018] The term "hypomethylated sequence" is therefore intended to
mean a DNA sequence comprising one or more CpG doublets, in which a
cytosine of at least one CpG dinucleotide or doublet is not
methylated at position 5 (i.e. which does not contain a CH3
radical at the fifth position of the cytosine base) in comparison
with the same DNA sequence derived from the same type of
noncancerous sample. In order to determine the methylation state or
status of a target sequence, the following ratio can be
calculated:
number of methylated cytosines of the CpG dinucleotides/total
number of cytosines of the CpG dinucleotides. If the ratio,
corresponding to a percentage methylation, is less than or equal to
80%, preferably less than or equal to 60%, and advantageously less
than or equal to 30%, this can be correlated with a presumption of
testicular cancer.
[0019] The subject of the invention is also an isolated nucleic
acid sequence consisting of at least one DNA sequence selected from
the sequences identified in SEQ ID Nos. 1 to 7 or from at least one
sequence which exhibits at least 99% identity (preferably at least
99.5% or 99.6% identity) with one of the sequences identified in
SEQ ID Nos. 1 to 7 and the sequences complementary thereto. The
abovementioned sequences which are associated with testicular
cancer are used as molecular markers for testicular cancer.
FIGURES
[0020] FIG. 1 represents the principle of the WTA method for
amplifying RNAs.
[0021] FIG. 2 represents a synoptic scheme of the nature and the
sequence of the various steps for preprocessing DNA-chip data
according to the RMA method.
[0022] FIG. 3 illustrates the nomenclature, the position and the
structure of the HERV-W loci overexpressed and exhibiting a loss of
methylation in the tumoral testicle.
[0023] FIG. 4 is a histogram representing the increase in
expression of five loci (HW4TT, HW2TT, HW13TT, HWXTT and HW21TT),
respectively, in three pairs of testicular samples (testicle 1,
testicle 2 and testicle 3), based on a comparative tumor
sample/healthy sample quantification. The loci are represented
along the x-axis and the factors of increase of expression between
tumor tissue and healthy tissue are represented along the
y-axis.
[0024] FIGS. 5 to 10 represent the methylation status of the U3
region of unique LTR or of the 5' LTR of the various loci,
respectively HW4TT, HW2TT, HW13TT, HWXTT, HW21TT and ERVWE1 in the
healthy testicle (normal) and in the tumoral testicle derived from
the same patient, after amplification and analysis of the sequences
obtained.
EXAMPLES
Example 1
Identification of HERV-W loci Expressed in Cancerous Tissues
[0025] Method:
[0026] The identification of expressed HERV-W loci is based on the
design of a high-density DNA chip in the GeneChip format proposed
by the company Affymetrix. It is a specially developed, custom-made
chip, the probes of which correspond to HERV-W loci. The sequences
of the HERV-W family were identified from the GenBank nucleic
databank using the Blast algorithm (Altschul et al., 1990) with the
sequence of the ERVWE1 locus, located on chromosome 7 at 7q21.2 and
encoding the protein called syncytin. The sequences homologous to
HERV-W were compared to a library containing reference sequences of
the HERV-W family (ERVWE1) cut up into functional regions (LTR,
gag, pol and env), using the RepeatMasker software (A. F. A. Smit
and P. Green). These elements constitute the HERVgDB bank.
[0027] The probes making up the high-density chip were defined on a
criterion of uniqueness of their sequences in the HERVgDB bank. The
HERV-W proviral and solitary LTRs contained in the HERVgDB bank
were extracted. Each of these sequences was broken down into a set
of sequences of 25 nucleotides (25-mers) constituting it, i.e. as
many potential probes. The evaluation of the uniqueness of each
probe was carried out by means of a similarity search with all the
25-mers generated for all the LTRs of the family under
consideration. This made it possible to identify all the 25-mers of
unique occurrence for each family of HERV. Next, some of these
25-mers were retained as probes. For each U3 or U5 target region, a
set of probes was formed on the basis of the probes identified as
unique.
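The breakdown into 25-mers and the selection of those of unique occurrence can be sketched as follows. This is a simplified illustration which assumes exact-match counting in place of the similarity search described above; the function names and the toy sequences are hypothetical, and a short k is used in the example only to keep it readable.

```python
from collections import Counter
from typing import Dict, List


def kmers(seq: str, k: int = 25) -> List[str]:
    """All overlapping k-mers of a sequence, in order of position."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def unique_kmers(sequences: Dict[str, str], k: int = 25) -> Dict[str, List[str]]:
    """For each sequence, keep the k-mers that occur once in the whole set."""
    counts: Counter = Counter()
    for seq in sequences.values():
        counts.update(kmers(seq, k))
    return {
        name: [m for m in kmers(seq, k) if counts[m] == 1]
        for name, seq in sequences.items()
    }


# Toy example with k = 4; real probes would use k = 25.
toy = {"ltr_a": "ACGTAC", "ltr_b": "CGTACG"}
print(unique_kmers(toy, k=4))  # → {'ltr_a': ['ACGT'], 'ltr_b': ['TACG']}
```

Counting over the pooled k-mers means that a k-mer repeated within a single sequence is also excluded, which is consistent with the requirement of unique occurrence.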
[0028] The samples analyzed using the HERV high-density chip
correspond to RNAs extracted from tumors and to RNAs extracted from
the healthy tissues adjacent to these tumors. The tissues analyzed
are: uterus, colon, lung, breast, testicle, prostate and ovary.
Placental RNAs (healthy tissue only) were also analyzed. For each
sample, 400 ng of total RNA were amplified by means of an unbiased
transcriptional method known as WTA. The principle of WTA
amplification is the following: primers (RP-T7) comprising a random
sequence and a T7 promoter sequence are hybridized to the
transcripts; double-stranded cDNAs are synthesized and serve as a
template for transcriptional amplification by the T7 RNA
polymerase; the antisense RNAs generated are converted to
double-stranded cDNAs which are then fragmented and labeled by
introducing biotinylated nucleotide analogs at the 3'OH ends using
terminal transferase (TdT) (cf. FIG. 1).
[0029] For each sample, 16 µg of biotin-labeled amplification
products were hybridized to a DNA chip according to the protocol
recommended by the company Affymetrix. The chips were then washed
and labeled, according to the recommended protocol. Finally, the
chips were read by a scanner in order to acquire the image of their
fluorescence. The image analysis carried out using the GCOS
software makes it possible to obtain numerical values of
fluorescence intensity which are preprocessed according to the RMA
method (cf.: FIG. 2) before being able to carry out a statistical
analysis in order to identify the HERV loci specifically expressed
in certain samples.
[0030] Comparison of the means of more than two classes of samples
was carried out by the SAM procedure applied to a Fisher test.
[0031] Results:
[0032] The processing of the data generated by the analysis on DNA
chip using this method made it possible to identify six sets of
probes corresponding to an overexpression in just one sample: the
tumoral testicle. These six sets of probes are specific for six
precise loci referenced HW4TT, HW2TT, HW13TT, HWXTT, HW21TT and
ERVWE1 (cf.: FIG. 3). The information relating to the
abovementioned loci is summarized in Table 1 below.
TABLE 1
Locus | SEQ ID No: | Chromosome | Position*
HW4TT | 8 | 4 | 41982184:41989670
HW2TT | 9 | 2 | 17383689:17391462
HW13TT | 10 | 13 | 68693759:68699228
HWXTT | 11 | X | 113026618:113027400
HW21TT | 12 | 21 | 27148627:27156168
ERVWE1 | 13 | 7 | 91935221:91945670
*Position given in relation to Ensembl version No. 39 (June 2006) (NCBI No. 36) http://www.ensembl.org/Homo_sapiens/index.html
[0033] The HW13TT locus is a chimeric provirus of HERV-W/L type
resulting from the recombination of an HERV-W provirus and an
HERV-L provirus. This chimera is such that the 5' region made up of
the sequence starting from the beginning of the 5' LTR to the end
of the determined gag fragment is of W type and the 3' region made
up of the sequence starting from the subsequent pol fragment to the
end of the 3' LTR (U3-R only) is of L type. This results in a
fusion of the 3' gag W-5' pol L regions.
Example 2
Validation of the loci Overexpressed in the Tumoral Testicle and
Determination of the Associated Induction Factor
[0034] Principle:
[0035] The six loci identified as overexpressed in the tumoral
testicle by means of the high-density HERV chip were validated by
real-time RT-PCR on three pairs of testicular samples. The
specificity of this overexpression is evaluated by analyzing
samples originating from other tissues. To this end, specific
amplification systems were developed and used for the loci
identified, as described in Table 2 below.
TABLE 2
Locus | Sense primer (SEQ ID No:) | Antisense primer (SEQ ID No:)
G6PD gene | TGCAGATGCTGTGTCTGG (14) | CGTACTGGCCCAGGACC (15)
HW4TT | GGTTCGTGCTAATTGAGCTG (16) | ATGGTGGCAAGCTTCTTGTT (17)
HW2TT | TGAGCTTTCCCTCACTGTCC (18) | TGTTCGGCTTGATTAGGATG (19)
HW13TT | CATGGCCCAATATTCCATTC (20) | GGTCCTTGTTCACAGAACTCC (21)
HWXTT | CCGCTCCTGATTGGACTAAA (22) | CGTGGGTCAAGGAAGAGAAC (23)
HW21TT | ATGACCCGCAGCTTCTAACAG (24) | CTCCGCTCACAGAGCTCCTA (25)
[0036] The expression of these loci is standardized with respect to
that of a suitable housekeeping gene: G6PD. This quantification of
expression was carried out using an Mx3005P real-time RT-PCR
machine, marketed by the company Stratagene.
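The application does not state how the standardized expression values were converted into factors of increase. One common approach for real-time RT-PCR data, shown here purely as an assumption, is the 2^-ΔΔCt calculation, which supposes that each amplification doubles the product at every cycle. The function name and the Ct values below are hypothetical.

```python
def fold_change(
    ct_target_tumor: float,
    ct_ref_tumor: float,
    ct_target_healthy: float,
    ct_ref_healthy: float,
) -> float:
    """Tumor/healthy expression factor by the 2^-ddCt calculation.

    Each target Ct is first standardized against the housekeeping gene
    (G6PD in the text) measured in the same sample.
    """
    delta_tumor = ct_target_tumor - ct_ref_tumor
    delta_healthy = ct_target_healthy - ct_ref_healthy
    return 2.0 ** -(delta_tumor - delta_healthy)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier
# in the tumor once both samples are standardized against G6PD.
print(fold_change(24.0, 20.0, 27.0, 20.0))  # → 8.0
```

A value above 1 indicates higher expression in the tumoral sample than in the healthy sample of the same pair.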
[0037] Results:
[0038] The study of the three pairs of testicular samples indicates
that all the putative loci identified, with the exception of HWXTT,
the expression of which could not be quantified in the second
testicular RNA pair, are overexpressed in the tumoral testicle
compared with the healthy tissue (cf.: FIG. 4).
[0039] The analysis of pairs of samples originating from other
tissues (colon, uterus, breast, ovary, lung and prostate) shows
that the overexpression phenomenon is restricted to the tumoral
testicle. Consequently, the expression of the five identified loci
assumes the nature of a marker specific for testicular cancer.
Example 3
Epigenetic Control of Transcription
[0040] Principle:
[0041] DNA methylation is an epigenetic modification which takes
place, in eukaryotes, by the addition of a methyl group to the
cytosines of 5'-CpG dinucleotides, and results in transcriptional
repression when this modification occurs within the nucleotide
sequence of a promoter. Apart from a few exceptions, human
endogenous sequences of retroviral origin are restricted, owing to
this methylation process, to a silent transcriptional state in the
cells of the organism under physiological conditions.
[0042] In order to analyze the methylation status of the unique LTR
or of the 5' LTR of the five loci, the "bisulfite sequencing PCR"
method was used. This method makes it possible, on the basis of
sequencing a representative sample of the population, to identify
the methylation state of each CG dinucleotide on each of the
sequences within the tissue studied.
[0043] Since the methylation information is lost during the
amplification steps, it is advisable to translate the methylation
information actually within the nucleotide sequence by means of the
method of treating the genomic DNA with sodium bisulfite. The
action of the bisulfite (sulfonation), followed by hydrolytic
deamination and then alkaline desulfonation, in fact makes it
possible to convert all the cytosines contained in the genomic DNA
into uracil. The speed of deamination of sulfonated cytosines (C)
is, however, much higher than that of the sulfonated 5-methyl-Cs.
It is therefore possible, by limiting the reaction time to 16
hours, to convert strictly the non-methylated cytosines to uracil
(U), while at the same time preserving the cytosines which have a
methyl group. After the sodium bisulfite treatment, the sequence of
interest is amplified from the genomic DNA derived from the tumoral
testicular section and from that derived from the adjacent healthy
testicular section, by polymerase chain reaction (PCR) in two
stages. The first PCR enables a specific selection of the sequence
of interest, the second, "nested", PCR makes it possible to amplify
this sequence.
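The way in which the bisulfite treatment writes the methylation information into the nucleotide sequence can be sketched in silico as follows. The sketch assumes complete conversion of the nonmethylated cytosines, models a single strand only, and represents uracil as T, which is how it is read after PCR amplification. The function name and the toy sequence are hypothetical.

```python
from typing import Iterable


def bisulfite_convert(seq: str, methylated_positions: Iterable[int] = ()) -> str:
    """Sequence as read after bisulfite treatment followed by PCR.

    Nonmethylated C becomes T (uracil amplified as thymine);
    C at a 0-based position listed as methylated is preserved.
    """
    protected = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in protected:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


toy = "ACGTCCGA"  # cytosines at 0-based positions 1, 4 and 5
print(bisulfite_convert(toy))          # → ATGTTTGA
print(bisulfite_convert(toy, {1, 5}))  # → ACGTTCGA
```

After conversion, a C that remains at a CpG position therefore indicates a methylated cytosine in the original DNA, whereas a T indicates a nonmethylated one.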
[0044] Since the DNA sequence had been modified by the bisulfite,
the design of the primers took into account the code change (C to
U), and the primers were selected so as to hybridize to a region
containing no CpG (their methylation state, and therefore their
conversion state, being a priori unknown).
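The requirement that a primer should hybridize to a region containing no CpG can be checked with a simple test, sketched here for the ERVWE1 primers listed in Table 8 (SEQ ID Nos. 46 to 49). The function and variable names are hypothetical; the check looks only for the CG dinucleotide in the primer as written and does not assess any other design criterion.

```python
def contains_cpg(primer: str) -> bool:
    """True when the primer sequence contains a CG dinucleotide."""
    return "CG" in primer.upper()


# ERVWE1 primers as listed in Table 8, keyed by SEQ ID No.
ERVWE1_PRIMERS = {
    46: "AATTCATTCAACATCCATTC",
    47: "GGTTTAATATTATTTATTATTTTGGA",
    48: "CTCTTACCTTCCTATACTCTCTAAA",
    49: "AGAGTGTAGTTGTAAGATTTAATAGAGT",
}

cpg_free = {seq_id: not contains_cpg(p) for seq_id, p in ERVWE1_PRIMERS.items()}
print(cpg_free)  # → {46: True, 47: True, 48: True, 49: True}
```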
[0045] The sequences of the primers used are described in Tables 3
to 8 below.
TABLE 3 HW4TT locus
Amplification | Sense primer 5' → 3' (SEQ ID No.:) | Antisense primer 5' → 3' (SEQ ID No.:)
First PCR | CCAACATCACTAACACAACCT (26) | GGGAGTTAGTAAGGGGTTTG (27)
Nested PCR | CAACCTATTAAACAAAACTAAATT (28) | AGATTTAATAGAGTGAAAATAGAGTTT (29)
TABLE 4 HW2TT locus
Amplification | Sense primer 5' → 3' (SEQ ID No.:) | Antisense primer 5' → 3' (SEQ ID No.:)
First PCR | TTATTAGTTTAGGGGATAGTTG (30) | ACACAATAAACAACCTACTAAAT (31)
Nested PCR | GAGGGTAAGTGGTGATAAA (32) | AACCTACTAAATCCAAAAAAA (33)
TABLE 5 HW13TT locus
Amplification | Sense primer 5' → 3' (SEQ ID No.:) | Antisense primer 5' → 3' (SEQ ID No.:)
First PCR | TAGGATTTTAGGTTTATTGTTA (34) | AAAAATAAAATATTAAACC (35)
Nested PCR | ATATGTGGGAGTGAGAGATA (36) | CAACAACAAACAATAATAATAA (37)
TABLE 6 HWXTT locus
Amplification | Sense primer 5' → 3' (SEQ ID No.:) | Antisense primer 5' → 3' (SEQ ID No.:)
First PCR | TTGAGTTTTTTTATTGATAGTG (38) | TCTAAATCCTATTTTCCTACT (39)
Nested PCR | GTTTTTTTATTGATAGTGAGAGAT (40) | TAACAAACCTTTAATCCAAT (41)
TABLE 7 HW21TT locus
Amplification | Sense primer 5' → 3' (SEQ ID No.:) | Antisense primer 5' → 3' (SEQ ID No.:)
First PCR | TTTAGTGAGGATGATGTAATAT (42) | CAACTTAATAAAAATAAACCCA (43)
Nested PCR | ATAATGTTTTAGTAAGTGTTGGAT (44) | ACAATTACAAACCTTTAACC (45)
TABLE 8 ERVWE1 locus
Amplification | Sense primer 5' → 3' (SEQ ID No.:) | Antisense primer 5' → 3' (SEQ ID No.:)
First PCR | AATTCATTCAACATCCATTC (46) | GGTTTAATATTATTTATTATTTTGGA (47)
Nested PCR | CTCTTACCTTCCTATACTCTCTAAA (48) | AGAGTGTAGTTGTAAGATTTAATAGAGT (49)
[0046] After extraction on a gel and purification, the amplicons
are cloned into plasmids, and the latter are used to transform
competent bacteria. About twelve plasmid DNA mini preparations are
carried out using the transformed bacteria and the amplicons
contained in the plasmids are sequenced. The sequences obtained are
then analyzed (cf.: FIGS. 5 to 10).
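The analysis of a sequenced clone can be sketched as follows, on the assumption that the clone sequence has already been aligned without gaps to the unconverted reference sequence of the same strand. At each CpG of the reference, a retained CG is read as methylated, a TG as nonmethylated, and anything else as mutated. The function names, the state labels and the toy sequences are hypothetical.

```python
from typing import Dict, List


def cpg_positions(reference: str) -> List[int]:
    """0-based positions of the C of each CpG in the reference."""
    ref = reference.upper()
    return [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]


def call_cpg_states(reference: str, read: str) -> Dict[int, str]:
    """Methylation call at every reference CpG for one bisulfite clone."""
    if len(reference) != len(read):
        raise ValueError("read must be aligned to the reference without gaps")
    read = read.upper()
    states = {}
    for pos in cpg_positions(reference):
        dinucleotide = read[pos:pos + 2]
        if dinucleotide == "CG":
            states[pos] = "methylated"
        elif dinucleotide == "TG":
            states[pos] = "nonmethylated"
        else:
            states[pos] = "mutated"
    return states


reference = "ACGTTCGA"  # CpGs at positions 1 and 5
print(call_cpg_states(reference, "ACGTTTGA"))
# → {1: 'methylated', 5: 'nonmethylated'}
```

Keeping a separate state for mutated sites avoids counting a sequence change at a CpG as a loss of methylation.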
[0047] Results:
[0048] The analysis of the 5' region of the transcripts of the loci
identified was carried out by means of the 5' Race technique. It in
particular made it possible to show that the transcription is
started at the beginning of the R region of the proviral 5' LTR.
This reflects the existence of a promoter role for the U3 region of
the proviral 5' LTR.
[0049] 1. Methylation state of the U3 sequences of the 5' LTR of
the HW4TT locus:
[0050] The U3 sequence of the 5' LTR of the HW4TT locus of
reference contains 5 CpG sites:
[0051] a) in the sample of healthy testicular tissue: out of 12
sequences analyzed, 9 are completely methylated. The other 3 each
time exhibit 1 CpG nonmethylated out of the 5 contained in the U3
region. This therefore represents an overall methylation of the U3
region of the 5' LTR of the HW4TT locus amounting to 95% in the
healthy testicular sample;
[0052] b) in the sample of tumoral testicular tissue: out of 12
sequences analyzed, 5 (i.e. 41.66% of the sequences) are completely
demethylated, 3 sequences have 4 CpGs out of 5 nonmethylated, 2
sequences have 2 CpGs out of 5 nonmethylated, 1 sequence has 1 CpG
out of 5 nonmethylated, and 1 sequence remains completely
methylated. This therefore represents an overall methylation of the
U3 region of the 5' LTR of the HW4TT locus amounting to 30% in the
tumoral testicular sample.
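The two overall percentages given above for the HW4TT locus can be reproduced from the clone counts stated in the text (12 sequences of 5 CpG sites each). The function name is hypothetical; each list entry is the number of nonmethylated CpGs in one sequence.

```python
from typing import List


def overall_methylation(nonmethylated_per_clone: List[int], cpgs_per_clone: int) -> float:
    """Percentage of methylated CpGs over all clones analyzed."""
    total = len(nonmethylated_per_clone) * cpgs_per_clone
    methylated = total - sum(nonmethylated_per_clone)
    return 100.0 * methylated / total


# Healthy tissue: 9 sequences fully methylated, 3 with 1 nonmethylated CpG.
healthy = [0] * 9 + [1] * 3
# Tumoral tissue: 5 fully demethylated, 3 with 4 nonmethylated CpGs,
# 2 with 2, 1 with 1, and 1 fully methylated.
tumoral = [5] * 5 + [4] * 3 + [2] * 2 + [1] + [0]

print(overall_methylation(healthy, 5))  # → 95.0
print(overall_methylation(tumoral, 5))  # → 30.0
```

In other words, 57 of the 60 CpG positions are methylated in the healthy sample, against 18 of 60 in the tumoral sample.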
[0053] 2. Methylation state of the U3 sequences of the 5' LTR of
the HW2TT locus:
[0054] The U3 sequence of the 5' LTR of the HW2TT locus of
reference contains 5 CpG sites:
[0055] a) in the sample of healthy testicular tissue: out of 12
sequences analyzed, 9 are completely methylated, 1 has its 2nd
CpG nonmethylated, 1 has the CpG at position 4 nonmethylated, 1 has
the CpGs at positions 4 and 5 nonmethylated, and 3 sequences have
point mutations on one or two CpGs (one in position 3, one in
position 5 and one in positions 4 and 5), very probably reflecting
PCR artifacts. This therefore represents an overall methylation of
the U3 region of the 5' LTR of the HW2TT locus amounting to 92.9%
in the healthy testicular sample;
[0056] b) in the sample of tumoral testicular tissue: out of 12
sequences analyzed, 6 are completely demethylated, 5 sequences have
one or two methylated CpG(s) (1 at position 1, 1 other at position
5, 1 on positions 1 and 5, 2 at positions 4 and 5 and 1 at position
3). Finally, one sequence has 4 CpGs methylated out of 5 (positions
1, 2, 4 and 5). This corresponds to an overall methylation of the
U3 region of the 5' LTR of the HW2TT locus amounting to 20% in the
tumoral testicular sample.
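The figure of 92.9% given for the healthy HW2TT sample can be reproduced if the CpG positions carrying point mutations are left out of the count. The Python sketch below shows this calculation. The exclusion of mutated positions from the denominator is an assumption made here because it matches the stated value; the function name is hypothetical.

```python
def methylation_excluding_mutated(n_sequences, cpg_per_sequence,
                                  n_nonmethylated, n_mutated):
    """Percentage of methylated CpGs, ignoring mutated CpG positions."""
    assessable = n_sequences * cpg_per_sequence - n_mutated
    methylated = assessable - n_nonmethylated
    return 100.0 * methylated / assessable


# HW2TT healthy sample, 12 clones x 5 CpGs (counts taken from the text):
# nonmethylated CpGs: 1 (2nd CpG) + 1 (position 4) + 2 (positions 4 and 5)
# mutated CpGs:       1 (position 3) + 1 (position 5) + 2 (positions 4 and 5)
pct = methylation_excluding_mutated(12, 5, n_nonmethylated=4, n_mutated=4)
print(round(pct, 1))  # 92.9
```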
[0057] 3. Methylation state of the U3 sequences of the 5' LTR of
the HW13TT locus:
[0058] The U3 sequence of the 5' LTR of the HW13TT locus of
reference contains 3 CpG sites:
[0059] a) in the sample of healthy testicular tissue: an additional
CpG, compared with the reference sequence, is found in 4 of the 10
clones studied for this locus. It is located between CpGs 2 and 3
and is methylated. In the other 6 clones, this site is mutated
compared with the reference sequence. The other 3 CpGs of the U3
region are methylated in the 10 sequences analyzed. This therefore
represents an overall methylation of the U3 region of the 5' LTR of
the HW13TT locus amounting to 100% in the healthy testicular
sample;
[0060] b) in the sample of tumoral testicular tissue: the
additional CpG indicated above is also found. It is demethylated in
4 of the 10 sequences analyzed, mutated in 3 other sequences, and
its methylation state is indeterminate in the last 3 sequences. 7
sequences out of 10 are completely demethylated and the other 3 are
methylated on the 2nd and on the 3rd CpG. This
corresponds to an overall methylation of the U3 region of the 5'
LTR of the HW13TT locus amounting to 20% in the tumoral testicular
sample.
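The same tumoral HW13TT data can also be read site by site. The Python sketch below encodes each clone as a string over the 3 reference CpGs ("1" methylated, "0" nonmethylated) and derives both the per-site and the overall methylation. The string encoding and the function name are illustrative assumptions, and the additional CpG is left out, which is consistent with the stated value of 20%.

```python
def per_site_methylation(clones):
    """Percentage of clones methylated at each CpG position."""
    n_clones = len(clones)
    n_sites = len(clones[0])
    return [100.0 * sum(c[i] == "1" for c in clones) / n_clones
            for i in range(n_sites)]


# HW13TT tumoral sample: 7 clones fully demethylated,
# 3 clones methylated on the 2nd and 3rd CpG (from the text)
tumoral_hw13tt = ["000"] * 7 + ["011"] * 3

per_site = per_site_methylation(tumoral_hw13tt)
overall = 100.0 * sum(c.count("1") for c in tumoral_hw13tt) / (
    len(tumoral_hw13tt) * 3)

print(per_site)  # [0.0, 30.0, 30.0]
print(overall)   # 20.0
```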
[0061] 4. Methylation state of the U3 sequences of the solitary LTR
of the HWXTT locus:
[0062] The U3 sequence of the solitary LTR of the HWXTT locus of
reference contains 6 CpG sites:
[0063] a) in the sample of healthy testicular tissue: the 8
sequences analyzed are completely methylated, which corresponds to
a methylation percentage of 100% in the healthy testicular
sample;
[0064] b) in the sample of tumoral testicular tissue: the 9
sequences analyzed are completely demethylated, which corresponds
to a methylation percentage of 0%.
[0065] 5. Methylation state of the U3 sequences of the 5' LTR of
the HW21TT locus:
[0066] The U3 sequence of the 5' LTR of the HW21TT locus of
reference contains 7 CpG sites:
[0067] a) in the sample of healthy testicular tissue: the 10
sequences analyzed all have 6 CpGs methylated out of 7; for 6 of
the sequences, the 1st CpG is nonmethylated and for the other
4 sequences, the 4th CpG is nonmethylated. This therefore
represents an overall methylation of the U3 region of the 5' LTR of
the HW21TT locus amounting to 85.7% in the healthy testicular
sample;
[0068] b) in the sample of tumoral testicular tissue: out of 8
sequences analyzed, 6 are completely demethylated, 2 others exhibit
a profile identical to one of those found in the healthy testicular
tissue, namely 6 CpGs methylated and the 1st CpG
nonmethylated. This corresponds to an overall methylation of the U3
region of the 5' LTR of the HW21TT locus amounting to 21.4% in the
tumoral testicular sample.
[0069] 6. Methylation state of the activator and U3 sequences of
the 5' LTR of the ERVWE1 locus:
[0070] The ERVWE1 locus comprises, in addition to its U3 promoter
region, a known activator located directly upstream of the 5' LTR,
and which contains two CpG sites (CpG 1 and 2). The U3 sequence of
the 5' LTR of the ERVWE1 locus of reference contains, for its part,
5 CpG sites (CpGs 3 to 7):
[0071] a) in the sample of healthy testicular tissue: out of 10
sequences analyzed, 5 sequences have CpGs 1 and 2 (activator) and
CpG 5 (U3) nonmethylated, 1 sequence has CpGs 2 and 5 nonmethylated,
2 sequences have CpG 1 (activator) and CpG 7 (U3) nonmethylated, 1
sequence has only CpG 7 nonmethylated and, finally, 1 is completely
methylated for the 7 CpGs. In total, this corresponds to a
methylation percentage of 68.57% in the healthy testicular
sample;
[0072] b) in the sample of tumoral testicular tissue: out of the 10
sequences analyzed, only 3 sequences each exhibit a single
methylated CpG (CpG 4, CpG 5 or CpG 6); the other 7 sequences are
completely demethylated, which corresponds to a methylation
percentage of 4.29%.
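Because the ERVWE1 target combines two regions, the clone data above can also be split between the activator (CpGs 1 and 2) and the U3 promoter (CpGs 3 to 7). The Python sketch below does this. The per-region figures it prints are derived here from the clone counts and are not stated in the text; the names `ACTIVATOR`, `U3` and `region_methylation` are hypothetical. For the tumoral sample, CpG 4 is used as a placeholder for the single methylated site, since the choice among CpGs 4, 5 and 6 does not change the regional totals.

```python
ACTIVATOR = {1, 2}
U3 = {3, 4, 5, 6, 7}
ALL_SITES = ACTIVATOR | U3

# (number of clones, set of nonmethylated CpGs), taken from the text
healthy_ervwe1 = [(5, {1, 2, 5}), (1, {2, 5}), (2, {1, 7}),
                  (1, {7}), (1, set())]
# Placeholder: the single methylated CpG is taken to be CpG 4 in all 3 clones
tumoral_ervwe1 = [(7, ALL_SITES), (3, ALL_SITES - {4})]


def region_methylation(clone_groups, region):
    """Percentage of methylated CpGs within the given set of positions."""
    total = sum(n for n, _ in clone_groups) * len(region)
    nonmethylated = sum(n * len(sites & region) for n, sites in clone_groups)
    return 100.0 * (total - nonmethylated) / total


for label, data in (("healthy", healthy_ervwe1), ("tumoral", tumoral_ervwe1)):
    print(label,
          round(region_methylation(data, ACTIVATOR), 2),
          round(region_methylation(data, U3), 2),
          round(region_methylation(data, ALL_SITES), 2))
# healthy 35.0 82.0 68.57
# tumoral 0.0 6.0 4.29
```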
[0073] The very high level of methylation of the U3 retroviral
promoters of the loci considered, in the healthy tissue, indicates
a repression of the transcriptional expression by an epigenetic
mechanism. On the other hand, the low level of methylation of these
same promoters in the tumoral tissue reflects a lifting of
transcriptional inhibition, the result of which is the
significantly higher expression demonstrated by means of the
high-density HERV DNA chip and by means of the real-time RT-PCR.
Thus, the U3 retroviral promoters of the loci considered appear to
be specific markers for the tumoral nature of the testicle.
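The contrast between healthy and tumoral tissue can be summarized by tabulating the overall methylation percentages reported above for the six loci. The Python sketch below is purely illustrative: the dictionary layout is an assumption, and no decision threshold is implied by the results described.

```python
# Overall methylation (%) reported above: (healthy, tumoral)
reported = {
    "HW4TT": (95.0, 30.0),
    "HW2TT": (92.9, 20.0),
    "HW13TT": (100.0, 20.0),
    "HWXTT": (100.0, 0.0),
    "HW21TT": (85.7, 21.4),
    "ERVWE1": (68.57, 4.29),
}

# Drop in methylation, in percentage points, from healthy to tumoral tissue
drops = {locus: round(h - t, 2) for locus, (h, t) in reported.items()}

for locus, (h, t) in reported.items():
    print(f"{locus:7s} healthy {h:6.2f}%  tumoral {t:6.2f}%  "
          f"drop {drops[locus]:6.2f} points")
```

With these values, every locus shows a drop of more than 60 percentage points between the healthy and the tumoral sample.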
LITERATURE REFERENCES
[0074] [1] Nickerson D. A. et al., DNA sequence diversity in a
9.7-kb region of the human lipoprotein lipase gene, Nature
Genetics, Vol. 19, pp. 233-240 (1998).
[2] Cottrell S. E., Molecular diagnostic applications of DNA
methylation technology, Clinical Biochemistry, Vol. 37, pp. 595-604
(2004).
Sequence CWU 1
1
491255DNAHomo sapiens 1tgagaaacag gactagttag atttcctagg ccaactaaga
atccctaagc ctagctggga 60aggtgatcgc atccaccttt aaacacgggc ttgcaactta
gctcacacct gaccaatcag 120gtagtaaaga gagctcacta aaatgctaat
taggcaaaaa caggaggtaa agaaatagcc 180aatcatctat tgcctgacac
cacacgggga gggacaatga ttgggatata aacccaggaa 240ttcgagctgg caacg
2552247DNAHomo sapiens 2tgagacacag gactagctgg atttcctagg ccgactaaga
atccctaagc ctagctggga 60aggtgaccac atccaccttt aaacacgggg tttacaactt
agctcacacc cagccaatca 120gagagctcac taaaatgcta attaggcaaa
aacaggaggt aaagaaatag ccaatcatct 180attgcctgag agcacagcgg
gagggacaag gattgggata taaacccagg cattcgagct 240ggcaacg
2473241DNAHomo sapiens 3tgagagacag ctggatttcc taggccgact aagaatccct
aagcctagct gggaaggtga 60ccgcatccac ctttaaacac agggcttgca acttagctca
cacccaacca atcagagagc 120tcactaaaat gctaattagg caaaaacagg
aggtaaagaa atagcaagtc atctattgcc 180tgagagcaca gtgggaggga
caaggaccag gatataaacc caggcatttg agccagcaac 240g 2414248DNAHomo
sapiens 4tgagagacag gactaactgg atttcctagg ccgactaaga atccctaagc
ctagctggga 60aggtgaccgc atccatcttt aaacacgggg cttgaaactt agctcacacc
taaccagtca 120gagagctcac taaaatgcta attaggcaaa aaacaggagg
taaagaaata gccaatcatc 180tattgcctga gagcacagcg ggagggacaa
ggatcgggat ataaacccag gcattcgagc 240cagcaatg 2485255DNAHomo sapiens
5tgagagacag gactagctgg atttcctagg ccgactaaga atccctaagc ctagctggga
60aggtgaccgc ttccaccttt aaacacgggg cttgcaactt agctcacacc cgaccaatca
120gatagtaaag agagcacact aaaatgctaa ttaggcaaaa acaggaggta
aagaaatagc 180caatcatcta ttgcctgaga gcaaagcggg agggacaatg
atcgggatag aaacccaggc 240attcaagccg gaatg 2556247DNAHomo sapiens
6tgagagacag gactagctgg atttcctagg ccgactaaga atccctaagc ctagctggga
60aggtgaccac gtccaccttt aaacacgggg cttgcaactt agctcacacc tgaccaatca
120gagagctcac taaaatgcta attaggcaaa gacaggaggt aaagaaatag
ccaatcatct 180attgcctgag agcacagcag gagggacaac aatcgggata
taaacccagg cattcgagct 240ggcaaca 2477314DNAHomo sapiens 7ccctggggcg
ggcttccttt ctgggatgag ggcaaaacgc ctggagatac agcaattatc 60ttgcaactga
gagacaggac tagctggatt tcctaggccg actaagaatc cctaagccta
120gctgggaagg tgaccacgtc cacctttaaa cacggggctt gcaacttagc
tcacacctga 180ccaatcagag agctcactaa aatgctaatt aggcaaagac
aggaggtaaa gaaatagcca 240atcatctatt gcctgagagc acagcaggag
ggacaacaat cgggatataa acccaggcat 300tcgagctggc aaca 31487774DNAHomo
sapiens 8tgagacacag gactagctgg atttcctagg ccgactaaga atccctaagc
ctagctggga 60aggtgaccac atccaccttt aaacacgggg tttacaactt agctcacacc
cagccaatca 120gagagctcac taaaatgcta attaggcaaa aacaggaggt
aaagaaatag ccaatcatct 180attgcctgag agcacagcgg gagggacaag
gattgggata taaacccagg cattcgagct 240ggcaacggca accccctttg
ggtctcctcc ctttgcatag gagctctgtt ttcactctat 300taagtcttgc
aactgcactc ttctggtccg tgtttcttac cgcttgagct gagctttccc
360tcactgtcca ccactgctgt tttgccaccg tcacaggccc accgctgact
tccattcttc 420tggatctagc aggctgtcca ctgtgctcct gatccagcga
ggcgcccatt gccgctcccg 480attgggctaa aggcttgcca ttgttcctgc
atggctacgt gcctgggttc atcctaatca 540agccgaacac tagtcactgg
gttccacggt tctcttccat gacccacgac ttctaataga 600actataacac
tcacctcatg gcccaagatt ccattccttg gaatccatga ggccaagaac
660cccaggtcag agaacacgag gcttgccacc atcttggaag tggccccacc
accatcttgg 720gagctctggg agcaaggacc cccggtaaca ttttggcgac
cacaaaggga catccaaagt 780ggtgagtaat attggaccac tttcacttgc
tattctgttc tatccttcct tagaactgga 840ggaaaatacc aggcacaggc
acctgtcagc cagttaaaaa caattagcgt cgccgccaca 900cttaagactc
aggtgtgagg ctatctgggg aaagactttc taacaacccc caacccatct
960agtggggatg ttggtctgcc tggagacagc ttccactttc aattttcttg
gggaagccga 1020gggctcacta gaggcagaca gctgttgtcc caaactccgg
gcagtagccg gttgagatca 1080tggtgcagcc aggagtctct actcagcagt
cgccgatgca tgtgccccta ccttcccttc 1140tgacccatac atcctgagtc
ccgactgtga ctttcttgaa agtgtagccc caaaattctc 1200cttacctctg
aatctacttc ctctgatccc tgcctcctgg gtactaatga ttcagacttt
1260catttcctct agcaagttgt gtctccaaag ggatctaagg aggctctacg
ctgcatcctt 1320aggcacctag gctataaccc aaggagtctt atccctggtg
tccctcccga tttgggtata 1380caactctcaa catgggcagt tatgtaggac
ccattcccca ccacacttgc cagggcccca 1440agtttgtaat ggctaagaga
gagacacaga gagagagaga gagatggaga gagagacaag 1500gagggagtca
aagagaaaaa gaaagaaaaa gaaatagtag aaaaaaaagt gtgccctatt
1560cctttaaaag ccagggtaaa tttaaaacct gtaattgata attgaaggtc
ttctccgtga 1620ccctgtaaca ctccaatgcc attttgttgt cagtgtaaat
aagggcatag cccaaaagca 1680ctgaggtcac tgacaacccg tagctttccc
atcaaaaatc cttaacccag taatccgcgg 1740atgggccaaa tgcattcagt
cggtagcagc aaccgctttg ctaaaagtag aaaagtaact 1800tttagaggaa
acctcattgt gagcgcacac ctcaccagtt cagaattatt ctaagtcaaa
1860aaaaaaaaaa gcaaaaaggt aacttactaa ctcaaaaatc ttaaagtata
ggtctatcat 1920attagaaaag ggtaatgtaa ctccaaccac tgataattcc
cttaacccag cagatttcct 1980aacaggggat ttaaaactta attaccatac
aaaggtccca ccagacctag gaggaactcc 2040cttcaggaca ggacgataaa
cggttcctcc caggtgattg aggaaaaaaa ccacaatggg 2100tattcagtaa
ttgatacaga gactcatgtg gaagcagtta gaaaaattgc ctaataattg
2160gtctcctcaa acgtgtaagc tgtttgcact cagccaagcc ttaaagtact
tacagaatca 2220aaaagactct gaatcctgac tcaaaaggtt tgctacaccc
tctgtgaaac aaatttgcat 2280aagaactgtt gtttatggga aggcatcttg
atggggcagc tgggttgtta tgaaatactc 2340aggaccccag cccggctcta
ggactcaccc ctgagcgcaa aaggcaatgt tgggcacgct 2400ggtaaaggac
cactagaatc cagcagcccg gacccctttc tttgtggtca agagaggcgg
2460gaaaacaggt gcaggactgc tacatcagtg agcataacta atccagtaag
cagaggtcca 2520tgggtggtta tgcaccctgg aaaagaatac gcattaggcc
cttagaggat gctctaggac 2580taatgctcat cggaaaatga ctaggggtgc
tgacatccct atgttctttt ttcagatggg 2640aaacgttcct cccaccccaa
ggcaaaaaac acccctaaga tgtattttgg agaattagga 2700ccaatttgac
cctcagacac taagaaagaa atgacttaca ttcttctgca gtaccatgat
2760atcctcttca agggggagaa acctggcctc ctgagagaag tataaattat
aacaccatct 2820tacagtgaga cctcttctgt agaaaggagg gcaaatggag
tgaagtgcaa actttccttt 2880cattaagaga caactcgcaa ttatgtaaaa
agtgtgattt atgccctaca gaaagccctc 2940agtctacctc cctatcccag
ggtccccccg attcctttcc caactaataa ggacccccct 3000tttacccaaa
tggtccaaag gagatagatg aagggataaa caatgaacca aacagtgcca
3060atattccctg attatgcccc ctccaggcag tgggaggagg agaattcggc
ccagccagag 3120tgcatgtacc tttttttttc tctcagactt aaagcaaatt
aaaatagacc taggtaaatt 3180ctcagataac cctgatggct atattgatgt
tttacaaggg ttaggacaat cctttgctct 3240gacatggaga gatataatgt
tactgctaaa tcagacacta accccaaatg agagaagtgt 3300caccatagct
gcagcccaag agtttggcaa tctctggtat ctcagtcagg tcaatgatag
3360gatgacaaca gaggaaaggg aatgattccc cacaggccag caggcagttc
tcagtgtaga 3420ccctcactgg gacacagaat aagaacatgg agatcggtgc
cgcagatatt tgctaacttg 3480cgtgctagga ctaaggaaaa ctaggaagaa
gcctatgaat tattcagtga tgtccactat 3540aacacaggga aaggaagaaa
atcatactgc ctttccggaa atactaaggg aggcattgag 3600gaagcatacc
tctctgtcac ctgactgtat tgaagtccaa ctaatcttaa aggatatgtt
3660tatcactcag tcagctgcag acattagaaa aaacttcaaa agtccacctt
aggcccagag 3720caaaacttag aaaccctatt gaacttgtta acctcagttt
tttataatag agatcaggag 3780gagcaggcgg aacaggacaa acaggattaa
aaaaagacca ccgctttagt catggccctc 3840aggcaagtgg actttggaag
ctctggaaaa gggaaaagct gggcaaattg aatgcctaat 3900agggcttgct
tccagtgtgg tctacaagga cacttaaaaa aagattgtcc aagtagaaat
3960aagctgcccc ttcgtccatg cctcttatgt caagggaatc actggaaggc
ccattgcccc 4020aggggaggaa ggtcctctga gtcagaagcc actaaccaga
tgatccagca gcaggactaa 4080gggtgcccag ggcaagcccc agcccatgcc
atcaccctca cagagccccg ggtatgcttg 4140accattgagg gccaggaggt
taactgtctc ctgaacactg gcacagcctt ctcagtctta 4200ctttcctgtc
ccggacaact gtcctccaga tctgtcacta tctgagcggt cctaggacag
4260ccagtcacta gatatttctc ccagccacta agttgtgact ggggaacttt
actcttttca 4320catgcttttc taattatgcc tgaaagcccc actcctttgt
tagggagaga cattctagca 4380aaagcagggg ccattataca tctgaacata
ggagaaggaa cacccgtttg ttgtcacctg 4440cttgaggaag gaattaatgc
tgaagtctgg gcaacagaag gacaatatgg atgagcaaag 4500aatgcccatc
ctgttcaagt taaattaaag gattccgcct cctttcccta ccaaaggcaa
4560taccccctta gacccgaggc ccaacaagga ctccaaaaga ttgttaagga
cctaaaagcc 4620caaggcctag taaaaccatg caatagcccc tgccatactc
caattttagg agtaaggaaa 4680cccaacggac agtggaggtt agtgcaagaa
ctcaggatta tcaatgaggc tgttgttcct 4740ctatacccag ctgtacctaa
cccttataca gtgctttccc aaataccaga ggaagcagag 4800tggtttacag
tcctggacct taaggatgcc tttttctgca tccctgtacg tcctgactct
4860caattcttgt ttgcctttga agatcctttg aacccaacgt ctcaactcac
ctggactgtt 4920ttaccccaag ggttcaagga tagcccccat ctatttggcc
aggcattagc ccaagacttg 4980agccaattct catacctgga cactcttatc
cttcggtatg gggatgattt aattttagct 5040acccattcag aaacgttgtg
ccatcaagcc acccaagtgc tcttaaattt cctcgctacc 5100tgtggctaca
ggtttccaaa cgaaaggctc agctctgctc acagcaggtt aaatacttag
5160ggctaaaatt atccaaaggc accagggccc tcagtgagga acgtatccag
cctatactgg 5220cttattctca tcccaaaacc ctaaagcaac taagagcatt
ccttggcata acaggctgct 5280gctgaatatg gattcccagg tacagtgaaa
tagccaggcc attatacaca ctaattaagg 5340aaactcagaa agccaatacc
catttagtaa gatggacacc ttaagcagaa gcggctttcc 5400aggccttaaa
gaaggcccta acccaagccc cagtggtaag cttgccaaca gggcaagact
5460tttctttata tgtcacagaa gaaacaggaa tagctctagg agtccttaca
caggtctgag 5520ggatgagctt gcaacccatg gcatacctga gtaaggaaac
tgatgtagtg gcaaagggtt 5580ggcctcattg tttacgggta gtggcagcag
tagcagtctt agtatctgaa gtagttaaaa 5640taatacaggg aagagatctt
actgtgtgaa catctcatga tgtgaatggc atagtcactg 5700ctaaaggaga
cttgtggctg tcagacaact gtttacttaa ataccaggct ctattacttg
5760aagggccagt gctgcgactg tgcacttgtg caactcttaa cccagacaca
tttcttccag 5820acaatgaaga aaagatagaa cataactgcc aacaagtaat
tgctcaaacc tatgccactc 5880gaggggacct tttagaggtt cccttgactg
atcccaacct caacttgtat actgatggaa 5940gttcctctgt agaaaaagga
ctttgaaaag tggggtatgc agtggtcagt gataatggaa 6000tacttgaaag
taatcccctc actccaggaa ctagtgctca gctggcagaa ctaatagccc
6060tcactcgggc actagaatta ggagaagaga aaagggtaaa tatatacaga
ctctaagtat 6120gcttacctag tcctccatgc ccatgcagca atatggagag
aaagggaatt cctaatttcc 6180aagggaacac ctatccaaca tcaggaagcc
attaggagat tactattggc tgtacagaaa 6240cataaagagg tggcaatctt
acactgccgg tgtcaccaga aaggaaagga aagggaaata 6300gaaaggaacc
accaagcgga tattgaagcc aaaagagccg caaggcagga ccctccatta
6360gaaatgctta tagaaggacc cctagtatgg ggtaatcccc tccaggaaac
caagccccag 6420tactcagaag aagaaataga atgaggaacc tcacaagcac
atagtttcct cccctcagga 6480tggctagcca ctgaagaagg aaaaatactt
ttgcctgcag ctaaccaatg gaaattactt 6540aaaacccttc accaaacatt
tcccttaggc attgatagca cccatcagat ggccaaatta 6600ttatttactg
gaccaggcct tttcaaaact atcaagcaga tagtcagggc ctgtaaagtg
6660tgccaaacaa gtaatcccct gcactgcagg ccatacattt caatccctgt
atctttaacc 6720tccttgttaa gtttgtctct tccagaatca aagctgtaaa
actacaaata gttcttcaaa 6780tggagcccca gatgtagtcc atgactaaga
tctaccgcgg acccctggac aagcctgcta 6840gcccatgctc tgatgttaat
gacatggaag gcacccctcc cgaggaaatc gcaactgcac 6900aacccctatt
acaccccaat tcagcaggaa gcagttagag cattcatcag ccaacctccc
6960caacagcact tgggttttcc tattgagagg gggtactgag agacaggact
agctggatgt 7020cctaggctga ctaagaatcc ctaagcctag ctgggaaggt
gaccacatcc acctttaaat 7080acggggcttg caacctagct cacacccaac
agatcagaga gctcgttaaa atgctaatta 7140ggcaaaaaca ggaggtaaag
aaatagccaa tcatctattg cctgagagca cagcaggagg 7200gacaaggatt
gggatataat cccaggcatt cgagctggca acagcaaccc cctttgggtc
7260ccctcccttt gtatgggagc tgttttcact ctatttcact ctattaaatc
ttgcaactgc 7320actcttctgg tgcatgtttg ttactgcttg agctgaactt
tcactcgcca tctaccactg 7380ctgttttgcc gccgtcgcag acccactgct
gacttccatt cttctggatc cagcagggtg 7440tccactgtgc tcctgatcca
gtgaggcacc cattgccgct cccgatctgg ctaaaggctt 7500gccattgttc
ctgcatcgct aagtgcctgg gttcgtccta atcaagctga acactagtca
7560ctgggttcca cagttctctt ccatgaccca cgacttctaa tagagctata
acactcacct 7620tatggcccaa gattccattc cttggaatcc atgaggccaa
aaaccccagg tcagagaaca 7680tgagacttgc caccatgttg aagtggcctg
ctgccatttt ggaagtggcc caccaccatc 7740ttgggagctc tgggagcaag
gacccctggt aaca 777497487DNAHomo sapiens 9tgagaaacag gactagttag
atttcctagg ccaactaaga atccctaagc ctagctggga 60aggtgatcgc atccaccttt
aaacacgggc ttgcaactta gctcacacct gaccaatcag 120gtagtaaaga
gagctcacta aaatgctaat taggcaaaaa caggaggtaa agaaatagcc
180aatcatctat tgcctgacac cacacgggga gggacaatga ttgggatata
aacccaggaa 240ttcgagctgg caacggcaac tccctttggg tctcctctca
ttgtatggga gctctgtttt 300cactctatta aatcttgcaa ctgcacactc
ttctggtctg tgtttgttat ggcttgagct 360gagcttttgc tggctgtcca
ccactgctgt ttgctgccgt cgcagacccc ttgctgactc 420ccacccctgc
ggatctggca gggtgtctgc tgcgctcctg atccagccag gcacccactg
480ctgctcccaa tcaggctaaa ggcttgccat tgttcctgca tggctaagtg
cccgggttcg 540tgctaattga gctgaacact agtcgctggg ttccacagtt
ctcttccgtg acccacagct 600tctaatagag ctataacact cactgcatgg
cccaacattc cattccttgg aatctgtgag 660gccaagaacc cccggtcaga
gaacaagaag cttgccacca tcttggaagc agcccgccac 720cattttggga
gctctaagaa caaggacccc ccagtaacat tttggtgacc acgaagggac
780ctccaaagca gtgagtaata ttgaaccact tccgcttgct attctgtcct
aaccttcctt 840agaattggag gaaaataccg ggcacctgtc ggccagttaa
gaacgattag cgtggccgcc 900agacttaaga ctctggtgtg aggctgtctg
ggaaagggct ttctaacaac ccccaaccct 960tccgggttgg gagctttggt
ctgcctggaa ccagcttcca ctttcaattt tcctggggaa 1020tccaagggct
gactagaggc agaaagctgt catcccgaac tcctggcatt agacagttga
1080gatcgtggcg cagccagaag tctctactca acagtcaccc atgcgtgcac
ccctaccttt 1140ccttctaacc catacctccc gggtcccaac catgactttc
ttgaaagtgt agcccctaaa 1200ttctctttac ctctaaatct acttcttctc
atccctgctt cctaggtact aatggttcag 1260actttcattt cctctagcaa
gttctatctc cagagggatc taaggaaggg atctatgctg 1320tgtccttagg
cccctaggct atgaacccag agagtcttct ccctgttatc tctccccatt
1380taggcataca gctctcaaca tggacagtta tgtgggaccc attccctacc
acccttgcca 1440gggccccaag ttttcaaagg gctagaagaa aaaagagaga
aagagagaga gaggcagagg 1500ggagagaaag agagagagac aaagagggag
tcaaagagag atagaaagag aaagatagaa 1560ctagtaaaga aaaaaagtat
gccccattcc tttaaaagcc agggtaaatt taaaacctat 1620aattgataat
tgaaggtctt ctccatgacc ctataacact ccaataccac cttgttttca
1680gtgtaaacaa gggtgtagcc cgaaaacact gagaccactg acaacccata
gccttcctat 1740caaaaatcct taacccagga acccatggat ggcccaaatg
cattcaatct gtagcagcaa 1800ctgctttgct aacagaagaa agtagaaaag
taacttttag agaaaacctc attgtgagca 1860cacctcacca gttcagaatt
attctaagtc aaaaaagcaa aaaggtagct tactaactca 1920aaaatcttaa
agtatggggt tattctgtta gaaaaaggtg atttaacatt aaccactgaa
1980aattccctta acccagcagg tttcctaatg ggatttaaat cttcattacc
atacaaaggt 2040ccgaccagac ccagcaggaa ctccctttag gacaggatga
tagatggttc ctcctgggtg 2100attgaggggg tgaaaaacca caatgggtgt
tcagtaattg atagggagac tcttgtggaa 2160ggagagttag gaaaattgcc
taataattgg tctgctcaaa tgtgcgagct gtttgcactc 2220agccaagcct
taaagtactt acagaatcaa aaagactcta tctcaatcct gactcaaaat
2280gttacctaca ccatctctga catgaatttg cataagaact gttgtttatg
ggaatgcatc 2340ttgatggggc agctgggttg ttatgaaata ctcaggaacc
cagcccaggt ctagaattca 2400cctctgagcg caaaggcaat gttggccatg
ctggtaaagg accactagaa tccaggagcc 2460tggacccctt tctttgtggt
caagaaaggc gggaaaacag gtgcaggact gctacatcag 2520agagcataac
aaatccgata agcagagttc catgagtggt taagcaccct ggaaaggaac
2580tcacctctga gtgcaaaggc aatgttaggc acaccagtaa aggaccacta
gaatccagca 2640gcccagaccc ctttctttgt gatcaagaaa ggcgggaaaa
ggggtgcagg actgctacat 2700cagtgagcgt aactaatctg ataagcagaa
gtccatgggt ggttacgcac cctggaaagg 2760aataagcatt aggaccacag
aggacactct aagactaatg ctcattggaa aatgactagg 2820ggtgctggca
tccctatgtt tttttttcag atgggaaaca ttccccccaa ggcaaaaacg
2880cccataagat atattctgga gaattcggcc cagagtgtat gtatcttttt
tccctgtcag 2940acttgaagca aacctaggta aattatcaga tagccctgat
ggctatattg atgctttaca 3000agggttagga caatcctttg atctaacatg
gagagatata ctgttactgc tagatcagac 3060actaatccca aatgaaagaa
gtgccaccat aactgcagcc agagagtttg atgatctctg 3120gtatctcagt
caggtcaatg ataggatgac aacagaagaa agaaaacaat tccccacagg
3180ccagcaggca gttcccagcg tagaccttca ttgggacaca gaatcagaac
atggagattg 3240gtgccgcaga catttactaa cttgcgcgct agaagcacta
aggaaaacta ggaagaagcc 3300tatgaattat tcaatgatgt ccactataac
acagggaaag gaagaaaatc ctactgcctt 3360tctggagaga ctaagggagg
cattgagaaa gcatacctct ctgtcacctg actctattga 3420aggccaacta
atcttaaagg ataagttttc cactcagtca gctgcagaca ttagaaaaaa
3480acttcaaaag tctgcgttag gccgggagca aaacttagaa accctattga
acttggcaac 3540ctcagttttt tatgatagag atcaggagga tcaggtggaa
tggacaaatg agattttaaa 3600aaaaggccac cactttagtc atggccctca
ggcaagcaga ctttggacac tctggaaaag 3660ggaaaagctg ggcaaatcga
atgcctaata agacttgctt ccagtgtggt ctacaaggac 3720actttaaaaa
agattgtcca aatagaaata agccaccccc tcgtccatgc tccttatgtc
3780aagggaatca ctggaaggcc tactgcccca ggggatgaag gtcctctgag
tcagaagcca 3840ctaaccagat gattcagccc caggactcag ggtgcccagg
gcaagcgcca gcctatgcca 3900tcaccctcac agagccctgg gtatgcttga
ccattgaggg tcaggaggtt aactatctcc 3960tggacactgg cgtggccttc
tcagtcttac tctcctgtcc cggacaactg tcctccagat 4020ctgtcactat
ccgagggttt ctacgacagc cagccactag atacttctcc cagccactaa
4080gttgtgactg gggaactcta ctcttttcac atgtttttct aattatgcct
gaaagcccca 4140ctcctttgtt agggaaagac attctagcaa aagcaggggc
cattatacac ctgaacatag 4200gagaaggaac acctgtttgt tgtcccctgc
ttgaagaagg aattaatcct gaagtctgga 4260caacagaagg acaatacaga
tgagcaacaa atgcctgtcc tgttcaagtt aaactaaagg 4320attatgcctc
ctttccctac caaaggcagt acccccttag acccgaggcc caacaaggac
4380tccaaaagat tgttaaggac ctaaaagctc aaagcctagc aaaaccatgc
agtagcccct 4440gcaatactcc aattttagga gtacagaaaa ccaacagaca
gtggaggtta gtgcaagatc 4500tcaggattat caatgaggct gttgttccta
acccttatac tctgctttcc caaataccag 4560aagaagcaga gtggtttaca
gtcctggacc ttaaggatgg ctttttctgc atccctgtac 4620atcctgactc
tcaattcttg tttgcctttg gagatccttc gaacccaatg tctcaactca
4680gcttgactgt tttaccccaa gggttcaggg atagccccca tctagttggc
caagcattag 4740ccgagccagt tctcctacct ggacactctt gtcctctggt
acatggatga tttattttta 4800gctgcccgtt cagaaacctt gtgccatcaa
gccacccaag tgctcttaaa tttcctcgcc 4860acctgtggct acaaggtttc
caaaccaaag gctcagctct gctcacagca ggttaaatac
4920ttagggctaa aattatccaa aggcaccagg gccctcagtg aggaatgtat
ccagcctgta 4980ttggcttatc ctcatcccaa aaccctaaag caactaagag
ggttccttgg cataacaggt 5040ttctgccaaa tgtggattcc caggtacggt
gaaatagcca ggccattata taccctaatt 5100aaggaaactc agaaagccaa
cacccattta ttaagatgga cacctgaagc agaagcagct 5160ttccaggccc
taaagaaggc cctaacccaa gccccagtgt taagcttgcc aacggggaag
5220acttttcttt atatgtcaca gaaaaaacag gaatagctct aggagtcctt
agacaggtcc 5280aagggatgag cttgcaacct gtggcatacc tgagtaagga
aattgatgta gttgcaaagg 5340gttgacctca ttgtttacag gtagtggcgg
cagtagcagt cttagtatct gaagcagtta 5400aaataataca gggaagagat
cttactgtgt ggacatctca tgatgtaaac ggcgtactca 5460cttctaaagg
agacttgtgg ctgtcagaca accgtttact taaatatcag gctctattac
5520ttgaagggcc agtgctgcga ctgcccactt gttcaactct taacccagcc
acatttcttt 5580cagacaatga agaaaagata gaacataact gtcaacaggt
gattgctcaa acctacggcg 5640ctcgagggga ccttctagag gttcccttga
ctgatcccaa cctcaacttg tatactgatg 5700gaagctcctt tgtagaaaaa
ggactttgaa aggtggggta tgcagtggtc agtgataatg 5760gaatacttga
aagtaattcc ttcactccag gaactagtgc tcagctggca gaactaatag
5820ccctcactca ggcactagaa ttaggagaag gaaaaagggt aaatatatat
gcagactcta 5880agtatgctta cccagtcctc cacgcccaca cagcaatatg
gagagatagg aaattcctaa 5940cttctgaggg aacaccgatc aaacatcagg
aagccattag gagattatta ttggctgtac 6000agaaacctaa agaggtggca
gtcttacact gctggggtca tcagaaagga aaggaaaagg 6060aaatagaaag
gaaccaccaa gtggatattg aagccaaaag agccacaagg caggccctcc
6120attagaaatg cttatagaag gatccctagt atggggtaat cccctccggg
aaaccaagcc 6180ccagtactca gcaggagaaa tagacacgag gacatagttt
cctcccctca ggatggctag 6240ccaccgaaaa agggaaaata cttttgcctg
cagctaatca atggaaatta cttaaaaccc 6300ttcaccaaac ctttcacttg
ggcatggata gcatctatca gatggccaat ttattattta 6360ctggaccagg
ccttttcaaa actatcaagc agatagtcag ggcctgtgaa atgtgccaaa
6420gaaataatcc cctgcacttc aagccataca tttcaatccc tgtatcttta
acctcctgtt 6480gtttgtctct tccagactca aagctgtaaa actgcaaatg
gttcctcata tggagcccca 6540gatgcagtcc atgactaaga tctaccacag
agccctagac cggcctgtta gcccatgctc 6600cgatgttgat gacatcaaag
gcacaccttc cgaggaaatc tcaactgcac gacccctact 6660aagccccaat
tcagcaggaa gcagttaaga gcagtcgttg gctaacatcc ccaatagtat
6720gtgggttttc ctgttgagag gggggactga gagacaggac tagctggatt
tcctaggcca 6780actaagaatc cctaagccta gttgggaagg tgaccgcatc
cacctttaaa cacggggctt 6840gcaacttagc tcacacccga ccaatcaggt
agtaaagaga gctcactaaa atgctaatta 6900ggcaaaaaca agaggtaaag
aaatagccaa tcatctatcg cctgagagca cagtggggag 6960ggacaatgat
cgggatataa acccaggcat tcgggccggc aacggcaacc cccattgcgt
7020cccctcccat tgtatgggag ctctgttttc attctattaa atcttgcaac
tgcacactct 7080tctggtctat gtttgttatg gctcgagctg agctttcgct
cgctgtccac cactgctgtt 7140tgccgccatc gcagacccac cactgacttc
cacctctgca gatctggcag ggtgtccgct 7200gtgctcctga cccagcgagc
cacccattgc tgctcccaat caggctaaag gcttgccatt 7260gttcctgcat
ggctaagagc ccagggttcg tcctaatcga gctgaacgct agtagctggg
7320ttccacagtt ctcttccgtg acccacggct cctaatagag ctataacact
caccacatgg 7380cccaaggttc cattcattgg aatccgtgag gccaagaacc
cccggtcaga gaacaagaag 7440cttgccacca tcttggaagc tctaaaaaca
gagacacccc agtaaca 7487105470DNAHomo sapiens 10tgagagacag
ctggatttcc taggccgact aagaatccct aagcctagct gggaaggtga 60ccgcatccac
ctttaaacac agggcttgca acttagctca cacccaacca atcagagagc
120tcactaaaat gctaattagg caaaaacagg aggtaaagaa atagcaagtc
atctattgcc 180tgagagcaca gtgggaggga caaggaccag gatataaacc
caggcatttg agccagcaac 240ggcaacctcc tttgagtccc ctccctttgt
ataggagctc tgttttcact gtgtttcact 300ctattaaatc ttgcaattgc
actcttctgg tccatatttg tcacggcttg agctgagctt 360tcacttgccg
tccaccacta ctgtttgctg ctgtcacaga cccgccgctg actcccatcc
420cgctgctgac tcccatccct ccggatccgg cagggtgtcc gctgtgctcc
tgatccagca 480agactcccat tgccactccc gatagtgcta aaggcttgcc
attgttcctg catggctaag 540tgcctgggtt cgtcctaatc cagctgaaca
ctagtcactg ggttccacgg ttctcttcca 600tgacccgcgg cttctaatag
agctataaca ctcaccacat ggcccaatat tccattcctt 660ggaatccgtg
aggccaagaa ccccaggtca gagaacacga ggcttgccac catcttggaa
720gcagcctgcc accatcttgg aagtggctca ccaccgtctt gggagttctg
tgaacaagga 780cccctggtaa cattttggcg accacgaagg gacatccaaa
gctgtgagta atattggacc 840actttcgctt gctattctgt tctatcctta
gaactggagg aaaatactgg gcacctgtcg 900ccagttaaaa atgattagca
tggccgccgg acttaagact caggtgtgag gctatctggg 960aaagggcttt
ctaacaaccc ccaagccttc tgttgggaac tttggtctgc ctggagccag
1020cttccacttt caattttctt ggggaagcca agggctgact ggaggcagaa
agctgttgtc 1080ccgaactccc ggcagtagcc ggttgagatc atggcgcagc
cagaagtctc tactcggcag 1140tcgcccatgc gtgcgccctt acctttcctt
ctgaattata cctccggggt cccgactccg 1200actttcttga gagtttagcc
ccaaaattct ccttacctct gaatctactt cctttgatcc 1260ctgcctcctg
cctcctaggt actaatagtt cagactttca tttcctctag caagttgtgt
1320ctccaaaggg atctaaggag gctctatgct gtgtccttag gcacctaggc
tataacccag 1380ggagtcttat ccctggtatc cctcccgatt taggtataca
gctcttgaca tgggcagtta 1440tgtgggacct gttccccacc acccttgtga
gggccccaag tttgtaatgg ctaagaaaga 1500gagacggaga gagagagaga
cggagaaaga gacaaagagg gagtcaaaga gaaaaagaaa 1560gaaaaagata
gaaatagtta aaaaaaaaaa aaagtgtgcc ctattccttt aaaagccagg
1620gtaaatttaa aacctgtaat tgataattgc cactttgttg tcagtgtaaa
taagggcgta 1680gcaaatcctt aacccagtaa cccgcggata ggccaaatgc
attcagtcgg tagcggcaac 1740agctttgcta aaagtagaaa agtaactttt
agaggaaacc tcattgtgag cacacctcac 1800cagttcagag ttattctaag
taaaaaaaaa aaaaaaaaaa aaagcaaaaa ggtagcttac 1860taactcaata
atcttaaagt atggggctac tatgctagaa aagggtaatg taactccaac
1920cactgataac tcccttaacc cagcagattt cctaacaggg gatttaaatc
ttaattacca 1980cacgaaggtc cgaccagacc taggaggaac tcccttcagc
acaggacgat agatggttcc 2040tcccaggtga ctgaggaaaa aactacaatg
ggtattcagt aattggtatg gagactcttg 2100tggaagcaga gttaaaaatt
tgcctaataa ttggtctcct caaatgtgcg agctgtttgc 2160actcagccaa
gccttaaagt acttacagaa tcaaaagact atctcaatcc tgactcaaaa
2220ggttagctac acagtctctg aaatgaattt gcagaagaac tgttgtttat
gggaatgcat 2280cttgatgggg cagctgggtt gttatgaaat actcaggaac
ccagcccagc tctaggactc 2340accgctgagc gcaaaggcaa tgttgggcac
gctggtaaag gaccactaga atccagcagc 2400ccaggcccct ttctttgtgg
tcaagaaagg caggaaaagg agtgcagaac tgctacattg 2460gtgagcgtaa
ctaatccaat aagcagaggt ccatgagtgg ttatgcacgc tggaaaagaa
2520taagcattag gcccttagag gatgctctag gactaatgct catcggaaaa
tgactagggg 2580tgctggcatc cttatgttct ttcttcagat gggaaacgtt
ccccccaagg caaaagcgcc 2640cctaagatgt attctggaga attagaacca
atttgaccct cagatgtcaa gaaagaaacg 2700acttatattc ttctgcagta
ctgcctggcc acgatatcct cttcaagggg gagaaacctg 2760gcctcctgag
ggaagtacaa attataacac catcttacag ctagacctct tttgtagaaa
2820agaaggcaaa tggagtgaag tgccatatgt gcaaactttc ttttcattaa
gagacaactc 2880acaattatgt aaaaagtgtg gtttatgtct tacaggaagc
cctcagagtc tacctcccta 2940tcccagcatt cccccgactc cttccccaac
taataagcac cacccttgaa cccaaacagt 3000ccaaaaggag atagacaaac
aggtaaacaa tgaaccaaag agtgtcagta ttccccgatt 3060atgccccttc
caagcagtgg gaggaggaga attcggccca gccagagtgc atgtaccttt
3120ttctctctca gacttaacgc aaattaaaat agacttaggt aaattctcag
ataaccctga 3180tggctacatt gatgttttac aagggttagg gcaatccttt
gatctgacat ggagagatat 3240aatgttactg ctaaatcaga cactaacccc
aaatgagaga agtgccgccg taactgcagc 3300ccgagagttt ggtgatctct
ggtatctcag tcaggtcaat gataggatga caacagagaa 3360aagagaacga
ttccccacag gccagcaggc agtttccagt gtagaccctc attaggacac
3420agaatcagaa catggagatt ggtgccacag atatttgcta acttgagtgc
tagaaggact 3480aaggaaaact aggaagaagc ctatgaatta ttcagtgatg
tccactataa cacaaggaaa 3540ggaagaaaat cctactgcct ttctggagag
agtaagggag gcattaagga agcatacctc 3600cctgtcacct gactctattg
aaggccaact aatcttaaag gataagtttg tcactcagtt 3660agctgcagac
attagaaaaa aacttcaaaa gtccgactta ggcctggagt acggctgagt
3720gcccaatttg gcagcaggca agaccaacac tgagcccttc atatggcacc
atgctttgtg 3780gtgatcagcc aactacttga tggcaggttg attatattgg
acatctttca tcagagaaat 3840ggcagtggtt tgtccttcct ggaatagaca
cttattctcg atatgggttt gtctatcctg 3900caggcaatgc ttctgccagg
agtaccatct gtggactcat ggaaagcctt atccaccatc 3960atggcattcc
acacagcatt gcctctaaac aaggcactta ttttatagct aaggaagtgt
4020ggcagtgggc tcatgctcat ggaattcact gattgtatct tgttgcccat
tatcttaaag 4080cagctggatt gatagaacag tggaaaggcc atttgaaatc
acaattacac caccaactag 4140gtgacaatac tttgcagggc tcggcaaagt
tctcttgaag gctgagtatg tcctgaatca 4200gcatccaata tatggtactg
tttccctcat agccagcatt cacaggccta agaatcaagg 4260ggtagaagta
gaagtggcac cactcaccat cactcctagt gacccactag caaaaatttt
4320acttccagtt cccccaacat tatgttctgc tggccttagt tccagaggga
agaattctgc 4380caccagtcga cacaagaatg ataccattaa actgaaagtt
aaaattgcca cctggccact 4440ttgggctcct cccacctcta agtcaacagg
tcaagaaagg agttacagtg ttgacttggg 4500tgattgacct ggactatcaa
gatgaaatca ggttactact ccacagtgga ggtaaggaag 4560aatatgtgtg
gaatacagga gatcccttag gccgtctttt agtactacca tgccctgtga
4620ttaaggtcag tggaaaacta caacaatcca atctaggcag gactacaaat
ggcccagact 4680cttcaggaat gaagggttgg gtgacttcac caggtaaaaa
aataacagcc tgctgaggtg 4740ctagctgaag gcaaagggaa tacagaatgg
ttagtagaaa aaggtagtca tcaataccag 4800ctatgaccac aagaccagtt
gcagaaatga gacctgtaat tgtcatgtgg atttcctcct 4860tacatgtttg
tgcatgtata cacttctact aagaaaatac ctttatttat ttcctttgct
4920tttcccttat caagtgacat tattaacttc atatcagcag ttaagtgtta
ttaactttat 4980gtaatagcat ttcggttaat aattcacttc tggttgtatg
aaggatagcc gtattaagtt 5040aggtgtaatt atgacatcat tattgtcttt
atttgaagat tatgtgtaat ttcaggagat 5100gtgtatgggt tcaagttgac
aagggatgga cttgtgatgg ctaatgttga gtgtcaactt 5160gactgaggat
gcaaagtatt gttcctgggt gtgtctgtga gggtgttgcc aaaggagatt
5220aacatttgtg tcagtgaact gggagatgca gacccacccg caatctgggt
gagcaccatg 5280taatcagctg ccagagcagc tagaataaag caagcagaag
aaggtggaag gagctgactt 5340gctgagtctt ctagtattct tcgttcttct
atgctggttg cttcctgccc ccaaacatca 5400gtctgcaagt tcttctgctt
ttggactctt ggacttacac cagtggtttg ccagggactc 5460tcgggccttc
5470
SEQ ID NO: 11 (783 nt, DNA, Homo sapiens)
tgagagacag gactaactgg atttcctagg
ccgactaaga atccctaagc ctagctggga 60aggtgaccgc atccatcttt aaacacgggg
cttgaaactt agctcacacc taaccagtca 120gagagctcac taaaatgcta
attaggcaaa aaacaggagg taaagaaata gccaatcatc 180tattgcctga
gagcacagcg ggagggacaa ggatcgggat ataaacccag gcattcgagc
240cagcaatggc aacccccttt gggtcccctt cccttgtatg ggagctctgt
tttcactcta 300tttcactcta ttaaatcttg caactgcact cttctggtcc
atgtttgtta cggctcgagc 360tgagctttgg ctcgccatcc accactgctg
tttgccgccg tcgcacacct gctgctgact 420cccatccctc cggatccagc
agggtgtgtc cgctgtgctc ctgatccagc gaggtgccca 480ttgccgctcc
tgattggact aaaggcttgc cattgttcct gcacggctaa gtgcccgggt
540tcgtcctaat ccagctgaac actagtcact gggttccacg gttctcttcc
ttgacccacg 600gcttctaata gagctataac actcaccgca tggcccaaga
ttccattcct tggaatctgt 660gaggccaaga accccaggtc agagaacacg
aggcttgcca ccatcttgga agcggcctgc 720caacatcttg gaagtggctc
gccaccatct tgggagctct gtgagcaagg acccctggta 780
aca 783
SEQ ID NO: 12 (7542 nt, DNA, Homo sapiens)
tgagagacag gactagctgg atttcctagg ccgactaaga atccctaagc
ctagctggga 60aggtgaccgc ttccaccttt aaacacgggg cttgcaactt agctcacacc
cgaccaatca 120gatagtaaag agagcacact aaaatgctaa ttaggcaaaa
acaggaggta aagaaatagc 180caatcatcta ttgcctgaga gcaaagcggg
agggacaatg atcgggatag aaacccaggc 240attcaagccg gaatggctac
cctctttggg tcccctccct ttgtatggga gctctgtttt 300cactctattc
aatcttgcaa ctgcactctt ctggtccgtg tttgttacag cttgagctga
360gctttcgctc gccttccacc actgctgttt gccgccatcg cagacctgcc
gtgctgactt 420ccatccctct agatctggca gggtgtccgc tgtgctcttg
atccagcgag gcgcccattg 480ccgctcccga ttgggctaaa ggcttgcaat
tgttcctgca cgctaagtgc ctgggttcat 540cctcatcaag ctgggttcca
cggttctctt catgacccgc agcttctaac agagctataa 600aactctgtgc
atggcccaag attccattcc ttggaatctg tgaggccaag aaccccaggt
660cagagaacag gaggcttgcc accatcttgg aagtggctcg ccaccatctt
aggagctctg 720tgagcggaga cccccacccc ccggtaacat tttggcgacc
acgaagggac ctccaaagcg 780gtgagtaata ttggatcact ttcgcttgct
attctgtcct atccttcttt agaattggag 840gaaaatactg ggcacctgtc
ggccagttaa aaacaattag cgtggctgcc cgacttaaga 900ctcaggtgtg
aggctatctg gggaagggct ttctaacaac ccccaaccct tctgggttgg
960ggacgttggt ctgccccttc cactttcaat tttcttgggg aagccaaggg
tcgactagag 1020gcagaaagct gtcgtccgga actcctggca gtagccggtt
gagatcatgg cgcagccaga 1080agtctctact caacagtcgc ccatgcgtgc
gctcctacct ttcctcctga cccatacctc 1140ctgggtcccg acgatgactt
tcttgaaagt gtagccccaa aattctgctt acctctgaat 1200ctacttcccc
tgatccctgg ctcctaggta ctaatggttc agtttcattt cctctagcaa
1260gttgtatctc caaagggatc taaggaagct ctacgctgcg tccttaggca
tctaggctat 1320aaacccagga agtcttgtcc ctggtgtccc tcccgattta
ggcatacagc tctcgacatg 1380ggcagttatg tgggacccgt tccccatcac
ccttgtcaag gccccaagtt tgtaatggct 1440aagaggagag agagagaaag
agagagagac ggaggggaga gagagagaga gagatggagg 1500ggagagagag
agagagagac ggaggggaga gagagagaga gagagagacg gaggggagag
1560agagagagac ggaggggaga gagagagaga tggaggagag aaagacaaag
ggagtcaaag 1620agaaaaagaa agagaaagac agaaatggta aaacaaacaa
aaaacagcgt gccctattcc 1680tttaaaagcc ggggtaaatt taaaacctat
aattgataat tgaaggtctt ctccatgacc 1740ctataatact ccaatactac
cttgttgtca gtgtaaacaa gggcgtagcc tgaaaacact 1800gagaccactg
acaacctgca gctttcctat caaaaaatcc ttaacccagt aaccggcaga
1860tgcattcaat ctgtagcagc aactgttttg ctaacagaag aaagtagaaa
agtaactttt 1920agaggaaacc tcattgtgag cacaccttac cagttcagaa
ttattctaag tcaaaaaagc 1980aaaaaggtag cttactaact caaaaatctt
aaagtatggg gctattgtgt ttaaaaaaaa 2040aaaaaggtaa tttaacacca
accactgata attctcttaa cccagcaggt ttcctaacag 2100gggatttaaa
tcttaattac catacaaagg tctgaccaca cctaggagga actcccttca
2160ggacaggact atagagggtt cctcccaggt gattgaggaa aaaaccacag
tgggtattca 2220gtaattgata gggagactct tgtggaagca gagttagaaa
aattgcctaa taaatggtgt 2280cctcaaaagt gtgagctgtt tgcactcagc
caagccttaa agtacttaca gaatcgtaaa 2340aactatctca atcctgactc
aaaagtttac ttacaccctc tctgaaatga atttacataa 2400gaactgcttt
tttgggaatg catcttgatg gggcagctgg gtggttatga aatactcagg
2460aaaccagccc agctctagga cacatccctg agcacaaagg caatgttggg
cacgctggta 2520aaggaccact agaatccagc agcctggact cctttctttg
tggtcaagaa aggcaggaaa 2580acaggtgcag gactgctaca tcagtgagca
taactaatct gataagcaga gggccttggg 2640tggttacaca ccctggaaag
gaattcaact ctgagcgcaa aggcaatgtt gggcacattg 2700gtaaaggacc
actagaatcc agcagcccag gcccctttct ttatggtcaa gaaaggcggg
2760aaaaggggtg caggactgtt acctcggtga gcgtaactaa tccgataagc
agaggtccat 2820gggtgattac gcaccctgaa aagaataagc attaggccct
taaaggatgc tctaggacta 2880atgctcattg gaaaatgact aggggtgctg
gcatccctat gttcttttct cagacgggaa 2940atgttctcca ccctccccaa
ggcaaaaaca cccctaagat gtattctgga gaattgggac 3000caatttgacc
cccagacgct aagaaagaga tgacttatgt tcttctgcag taccacctgg
3060ccacgatatc ctcttcaagg gggagaaacc tggcctcctg agggaagtat
aaattataac 3120accatcttac agctagacct cttctgtaga aaggagggca
aatggagtga agtgccatat 3180gtgcaaactt tcttttcatt aagagacaac
ttgcaattat gtaagaagtg tgatttatgc 3240cctacaggaa gccctcagag
tctacctccc taccccagca tccccctgac tccttctcca 3300actaataagg
aacccccttc aacccaaacg gtccaaaagg agatagacaa aggggtaaac
3360aatgaaccaa agcgtgccaa tgttccctga ttatgccccc tctaagcagt
gggaggagga 3420gaatttggcc cagccagtgt gcatgtgcct ttttctctct
cagacttaaa gcaaattaaa 3480atagacctag gtaaattctc agataaccct
gatggctata ttgatgtttt ataagggtta 3540ggataatcct ttgatctgac
atggagagat ataatgttac tgctagatca gacactaacc 3600ccaaatgaga
caagtgccgc cataactgca gcctgagagt ttggcgatct ctggtatctc
3660actcgggtca atgataggag gacaacagag gaaagagaat gattccccac
agaccagcag 3720gcagttccca gtgtagaccc tcactgggac acagaatcag
aacatggaca ttggtgctgc 3780agacatttgc taacttacat gctagaagga
ctaaggaaaa ctaggaagaa gcctacgaat 3840tattcaatga tgtccactat
aacacaggga aaggaagaaa atcctactgc ctttctggag 3900cgactaaggg
aggcattgag gaagcatact tccctgtcac ctgactctat tgaaggccaa
3960ctaatcttaa aggataagtt tatcactcag tcagctgaag acattaggaa
aaaacttcaa 4020aagtctgcct taggcccaga gcaaaactta gaaaccccat
tgaacttggc aacctcggtt 4080ttttataata gagatcagga ggagcaggcg
gaacaggaca aacggggtaa aaaaaaggcc 4140accgctttag ttatggccct
caggcaagtg gactttggag gctctggaaa agggaaaagc 4200tgggcaaatc
gaatgcctac tagggcttgc ttccagagtg gtctacaagg acactttgaa
4260aaagattgtc caagtagaaa taagtcgccc cttcgtccat gccccttata
tcaagggaat 4320cactggaagg cccactatcc caggggacaa atgtcctctg
agtcagaagc cactaaccag 4380atgatccagc agcaggactg agggtgccca
gggcaagcac tagcccatgc cgtcaccctc 4440acagagcccc aggtatgctt
gaccattgag ggccaggagg ttaactgtct cctggacact 4500agcacggcct
tctcagtctt actctccttt cccggacaac tgtcctccag atctgtcact
4560atccgagggt tcctaggaca gtcagtcact agatacttat cccagtcact
aagttgtgac 4620tggtgaactt tactcttttc acatgctttt ctaattatcc
ctgaaagcac cactcccttg 4680ttagggcgag acattctagc aaaagcaggg
gccattatac acctgaacat aggagaagga 4740acacctgttt gttgtcccct
gcttgaggaa ggaattaatc ccgaagtctg ggcaacagaa 4800ggacaatacg
gacgagcaaa gaatgcctgt gctgttcaag ttaaactaaa ggattccgcc
4860tcctttccct accaaaggca gtaccccctt agacctgagg cccaacaagg
actccaaaag 4920attgttaagg acctaaaagc ccatggccta gtaaaaccat
gcaatagccc ctgcaatact 4980ccaattttag gagtacagaa acccaacaga
cagtggaggt tagtgcaaga tctcaggatt 5040atcattgagg ctgttgttcc
tgtatagcca gctgtaccta acccttatac tctgctttcc 5100caaataccac
aggaagcaga ggggtttaca gtccggggcc ttaaggacac ctttttctgc
5160atccctgtat atcctgactc tcaattcttg tttgcctttg aagatccttc
aaactcaacg 5220tctcaactca cctggaatgt tttaccccaa gggttcaggg
atagccccca ttagcccaag 5280acttgagcca gttcttatac ctggacactc
ttgtcctttg gtacgtggat gatttacttt 5340tagccacctg ttcagaaacc
ttgtgccatc aagccaccca agcactcttt aatttcctcg 5400ccacctgtgg
ctacaggttt ccaaaccaaa ggctcagctc tgctcacagc aatttaaatg
5460cttagggcta aaattatcca aaggcaccag ggccctcagt gaggaaagta
tccggcctat 5520actggcttat cctcatccca aaaccctaaa gcaactaaga
gtgttccttg gcataacggg 5580tttctgccga atatggattc ccaggtacag
cgaaatagcc agaccattat atacactaat 5640taaggaaact cagaaagcca
atacccattt ggtaagatgg acacctgaag cagaagcaga 5700tttccaggcc
ctaaagaagg ccctgaccca agccccagtg ttaagcttgc caatggggca
5760agacttttct ttatatgtca cagaaaaaac aggaatagct ccaggagtcc
ttacgcagat 5820ccaagggacg agcctgcaac ccatggcata cctgagtaag
gaaattagtg gcaaagggtt 5880ggcctcattg tttatgggta gtggcagcag
tcacagtctt agtaactgaa gcagttaaaa 5940tgatacaagg aagagatctt
actgtgtgga catctcatga tgtgaatggc atactcactg 6000ctaaaggaga
cttgtgactg tcagacaact gtttacttaa atatcaggct ctattacttg
6060aagggccagt gttgcgactg tgcacttgtg caactcttaa cccagccaca
ttgcttccag 6120acaatgaaga aaagatagaa cataactgtc aacaaataat
tgctcaaacc tacactgctc 6180gaggggacct tttagaagtt cccttgactg
atcccgatct caacttgtat actgatggaa 6240gttcctttgc agaaaaagga
cttcaaaagg cggtgtatgc agtagtcctt caaaatcgaa 6300gagctttaga
attgctaatc actgagagag ggggaacgtt tttattttta ggggaagaat
6360gctgttatta tgttaatcaa ttcggaatca tcaccaagaa agttaaagaa
attcaagatc 6420gaatacaacg tagaacagag gagcttaaaa aacactggac
cctggggcct cctcagccaa 6480tggatgccct ggattctccc cttcttagga
cctctagcag ctatatttct actcctcttt 6540ggaccctgta tctttaacct
ccgtgttaag tttgtctctt ccagaatcga agatgtaaaa 6600ctacaaatcg
ttcttcaaat ggacccccag atgcagtcca tgactaagat ctactgagga
6660cccctggacc agcctgctag cccatgctcc aatgttaatg acattgaagg
cacccctccc 6720aaggaaatct caactgcaca acccctacta tgctccaatt
cagcaggaag cagttacagt 6780ggtcctcggc caacctcccc aacagcattt
gtattttcct gttgggaggg ggcactgaga 6840gacaggacta gctggatttc
ctaggctgac tgagaatccc taagcctagc tgggaaggtg 6900accacttcca
cctttaaaca cagggcttgc aacttagctc acaccctacc aattggatag
6960taaagagagg tcactaaaat gctaattagg caaaaacagg aggtaaagaa
atagccaatc 7020atccattgcc tgagagcaca gcgggaggga caatgaccag
gatataaacc caggcattcc 7080agcctgcaac ggcaaccccc tttgggtccc
ctctctttgt atgggagctc tgttttcact 7140ctattcaatc ttgcaactgc
actcttctgg tccgtgtttg ttacggctca agctgagctt 7200ttgctcacca
tccaccactg ctgtttgccg ccgttgcaga cccgtcgctg acttccatcc
7260ctccagatct ggcagggtgt ccactgtgct cctgatccag cgaggcaccc
attgccactc 7320ccgatcaggc taaaggcttg ccattgttcc tgcacagcta
agtgcctggg ttcgtcctaa 7380tcaagctgaa cactagtcac tgggttccat
ggttctcttc catgacccat ggcttctaat 7440agagctataa cactcaccgc
atggcccaag attccattcc ttggaatccg tgaggccaag 7500aaccccaggt
cagagaacac gaggctgccg ccatcttgga ag 7542
SEQ ID NO: 13 (10288 nt, DNA, Homo sapiens)
cctggggcgg gcttcctttc tgggatgagg gcaaaacgcc tggagataca gcaattatct
60tgcaactgag agacaggact agctggattt cctaggccga ctaagaatcc ctaagcctag
120ctgggaaggt gaccacgtcc acctttaaac acggggcttg caacttagct
cacacctgac 180caatcagaga gctcactaaa atgctaatta ggcaaagaca
ggaggtaaag aaatagccaa 240tcatctattg cctgagagca cagcaggagg
gacaacaatc gggatataaa cccaggcatt 300cgagctggca acagcagccc
ccctttgggt cccttccctt tgtatgggag ctgttttcat 360gctatttcac
tctattaaat cttgcaactg cactcttctg gtccatgttt cttacggctc
420gagctgagct tttgctcacc gtccaccact gctgtttgcc accaccgcag
acctgccgct 480gactcccatc cctctggatc ctgcagggtg tccgctgtgc
tcctgatcca gcgaggcgcc 540cattgccgct cccaattggg ctaaaggctt
gccattgttc ctgcacggct aagtgcctgg 600gtttgttcta attgagctga
acactagtca ctgggttcca tggttctctt ctgtgaccca 660cggcttctaa
tagaactata acacttacca catggcccaa gattccattc cttggaatcc
720gtgaggccaa gaactccagg tcagagaata cgaggcttgc caccatcttg
gaagcggcct 780gctaccatct tggaagtggt tcaccaccat cttgggagct
ctgtgagcaa ggaccccccg 840gtaacatttt ggcaaccacg aacggacatc
caaagtggtg agtaatattg gaccactttc 900acttgctatt ctgtcctatc
cttccttaga attggaggaa aataccgggc acttgtcggc 960cagttaaaaa
cgattagtgt ggccaccgga cttaagactc aggtgtgagg ctatctgggg
1020aagggctttc taacaacccc caacccttct gggttgggga cttggtttgc
ctcaagccag 1080cttccacttt cagttttctt ggggaagccg agggccgact
agaggcagaa agctgtcgtc 1140ctgaactccc ggcagtagcc ggttgagatc
atggtgtagc cagaagtctc aacagtcgcc 1200catgcatgca cccctatctt
tccttctgac ccatacctcc tgggtcccaa ccacaacttt 1260cttcaaagtg
tagccccaaa attctcctta cctctgaata tacttcctct gatccctgcc
1320tcctaggtac tattggttca gacttccatt tcctctagca agttgtatct
ccaaagggat 1380ctaaggaagc tctgcgctgc gtccttaggc acctaggcta
taacccaggg agtcttatcc 1440ctggtgtccc tcccaattta ggcatacagc
tcttgacatg ggcagttatg taggacccac 1500tccccaccac ccttgccagg
gccccaagtt tgtaaatggc tgagggaaaa gagagacaga 1560ggagagagag
agaaatggag gagaaagaga gagagacaga gaggagagag agacagtgag
1620agagacagaa gagagagaga gacaaagagg agagagagag agtcaaagag
agaaagaaag 1680agaaagaaat agtaaaaaac agtgtgccct attcctttaa
aagccagggt aaatttaaaa 1740cctgtacttg ataattgaag gtcttctctg
tgaccctata gcactccaat ccactttgtg 1800gtcagtgtaa ataagagcat
aggccgaaag cactgaggcc attgacaacc cgtagcttcc 1860ctatcaaaaa
tccttaaccc agtaacccgc agatggacca aatgcattca gtcggtagcg
1920caactgcttt gctaaaagta gaaaagtaac ttttagagga aacctcattg
tgagcacacc 1980tcacctgttc agaattattc taataaaaaa agcaaaaagg
tagcttacta actcaaaaat 2040cttaaagtat ggggctattc tgttagaaaa
aggtaatgta actccaacca ctgataattc 2100ccttaaccca gcagatttcc
taacgggatt taaatcttaa ttaccataca aaggtccgac 2160cagacctagg
cggaactccc ttcaggacag gacgatagat ggttcctccc aggtgattga
2220ggaaaaaaac cacaatgggt attcagtaat tgatacgggg actcttgtgg
aagcagagtt 2280agaaaaattg cctaataact ggtctcctca aacgtgtgag
ctgtttgcac tcagccaagc 2340cttaaagtac ttacagaatc aaaagactat
ctcaatcctg attcaaaagg ttagctacac 2400cctctctgta atgcatttgc
ataagaactt gtttatggga atgcatcttg atggggcagc 2460tgggttgtta
taaaatagga acccagccca gctctaggac tcacccctga gcgcaaaggc
2520aatgttgggc atgctggtaa aggaccacta gaatccagca gcccagaccc
ctttctttgt 2580ggtcaagaaa ggcgggaaaa ggggtgcagg actgctacat
cggtaagcat aactaatccg 2640ataaacagag gtccatgggt ggttacgcac
cctggaaagg aactcacccc tgagcacaaa 2700ggcaatgttg ggcacgctgg
taaaggacca ctagaatcca gcagcctgga cccctttctt 2760tgtggtcaag
agaggcagga aaacaggtgc aggactgcaa catcagtgag cataactaat
2820tcgataagca gaggtccatg ggtggtgatg caccctggaa agaataagca
ttaggaccat 2880agaggacact ccaggactaa agctcatcgg aaaatgacta
gggttgctgg catccctatg 2940ttcttttttc agatgggaaa cgttccccgc
aagacaaaaa cgcccctaag acgtattctg 3000gagaattggg accaatttga
ccctcagaca ctaagaaaga aacgacttat attcttctgc 3060agtgccgcct
ggcactcctg agggaagtat aaattataac accatcttac agctagacct
3120cttttgtaga aaaggcaaat ggagtgaagt gccataagta caaactttct
tttcattaag 3180agacaactca caattatgta aaaagtgtga tttatgccct
acaggaagcc ttcagagtct 3240acctccctat cccagcatcc ccgactcctt
ccccaactaa taaggacccc ccttcaaccc 3300aaatggtcca aaaggagata
gacaaaaggg taaacagtga accaaagagt gccaatattc 3360cccaattatg
acccctccaa gcagtgggag gaagagaatt cggcccagcc agagtgcatg
3420tgcctttttc tctcccagac ttaaagcaaa taaaaacaga cttaggtaaa
ttctcagata 3480accctgatgg ctatattgat gttttacaag ggttaggaca
attctttgat ctgacatgga 3540gagatataat gtcactgcta aatcagacac
taaccccaaa tgagagaagt gccaccataa 3600ctgcagcctg agagtttggc
gatctctggt atctcagtca ggtcaatgat aggatgacaa 3660cagaggaaag
agaatgattc cccacaggcc agcaggcagt tcccagtcta gaccctcatt
3720gggacacaga atcagaacat ggagattggt gctgcagaca tttgctaact
tgtgtgctag 3780aaggactaag gaaaactagg aagaagtcta tgaattactc
aatgatgtcc accataacac 3840agggaaggga agaaaatcct actgcctttc
tggagagact aagggaggca ttgaggaagc 3900gtgcctctct gtcacctgac
tcttctgaag gccaactaat cttaaagcgt aagtttatca 3960ctcagtcagc
tgcagacatt agaaaaaaac ttcaaaagtc tgccgtaggc ccggagcaaa
4020acttagaaac cctattgaac ttggcaacct cggtttttta taatagagat
caggaggagc 4080aggcggaaca ggacaaacgg gattaaaaaa aaggccaccg
ctttagtcat gaccctcagg 4140caagtggact ttggaggctc tggaaaaggg
aaaagctggg caaattgaat gcctaatagg 4200gcttgcttcc agtgcggtct
acaaggacac tttaaaaaag attgtccaag tagaagtaag 4260ccgccccctc
gtccatgccc cttatttcaa gggaatcact ggaaggccca ctgccccagg
4320ggacaaaggt cctctgagtc agaagccact aaccagatga tccagcagca
ggactgaggg 4380tgcctggggc aagcgccatc ccatgccatc accctcacag
agccctgggt atgcttgacc 4440attgagggcc aggaggttgt ctcctggaca
ctggtgcggt cttcttagtc ttactcttct 4500gtcccggaca actgtcctcc
agatctgtca ctatctgagg gggtcctaag acgggcagtc 4560actagatact
tctcccagcc actaagttat gactggggag ctttattctt ttcacatgct
4620tttctaatta tgcttgaaag ccccactacc ttgttaggga gagacattct
agcaaaagca 4680ggggccatta tacacctgaa cataggagaa ggaacacccg
tttgttgtcc cctgcttgag 4740gaaggaatta atcctgaagt ctgggcaaca
gaaggacaat atggacgagc aaagaatgcc 4800cgtcctgttc aagttaaact
aaaggattcc acctcctttc cctaccaaag gcagtacccc 4860ctcagaccca
aggcccaaca aggactccaa aagattgtta aggacctaaa agcccaaggc
4920ctagtaaaac catgcagtaa cccctgcagt actccaattt taggagtaca
gaaacccaac 4980agacagtgga ggttagtgca agatctcagg attatcaatg
aggctgttgt tcctctatag 5040ccagctgtac ctagccctta tactctgctt
tcccaaatac cagaggaagc agagtggttt 5100acagtcctgg accttcagga
tgccttcttc tgcatccctg tacatcctga ctctcaattc 5160ttgtttgcct
ttgaagatac ttcaaaccca acatctcaac tcacctggac tattttaccc
5220caagggttca gggatagtcc ccatctattt ggccaggcat tagcccaaga
cttgagccaa 5280tcctcatacc tggacacttg tccttcggta ggtggatgat
ttacttttgg ccgcccattc 5340agaaaccttg tgccatcaag ccacccaagc
gctcttcaat ttcctcgcta cctgtggcta 5400catggtttcc aaaccaaagg
ctcaactctg ctcacagcag gttacttagg gctaaaatta 5460tccaaaggca
ccagggccct cagtgaggaa cacatccagc ctatactggc ttatcctcat
5520cccaaaaccc taaagcaact aaggggattc cttggcgtaa taggtttctg
ccgaaaatgg 5580attcccaggt atggcgaaat agccaggtca ttaaatacac
taattaagga aactcagaaa 5640gccaataccc atttagtaag atggacaact
gaagtagaag tggctttcca ggccctaacc 5700caagccccag tgttaagttt
gccaacaggg caagactttt cttcatatgt cacagaaaaa 5760acaggaatag
ctctaggagt ccttacacag atccgaggga tgagcttgca acctgtggca
5820tacctgacta aggaaattga tgtagtggca aagggttgac ctcattgttt
acgggtagtg 5880gtggcagtag cagtcttagt atctgaagca gttaaaataa
tacagggaag agatcttact 5940gtgtggacat ctcatgatgt gaatggcata
ctcactgcta aaggagactt gtggctgtca 6000gacaactgtt tacttaaatg
tcaggctcta ttacttgaag ggccagtgct gcgactgtgc 6060acttgtgcaa
ctcttaaccc agccacattt cttccagaca atgaagaaaa gataaaacat
6120aactgtcaac aagtaatttc tcaaacctat gccactcgag gggacctttt
agaggttcct 6180ttgactgatc ccgacctcaa cttgtatact gatggaagtt
cctttgtaga aaaaggactt 6240cgaaaagtgg ggtatgcagt ggtcagtgat
aatggaatac ttgaaagtaa tcccctcact 6300ccaggaacta gtgctcagct
agcagaacta atagccctca cttgggcact agaattagga 6360gaagaaaaaa
gggcaaatat atatacagac tctaaatatg cttacctagt cctccatgcc
6420catgcagcaa tatggaaaga aagggaattc ctaacttctg agagaacacc
tatcaaacat 6480caggaagcca ttaggaaatt attattggct gtacagaaac
ctaaagaggt ggcagtctta 6540cactgccggg gtcatcagaa aggaaaggaa
agggaaatag aagagaactg ccaagcagat 6600attgaagcca aaagagctgc
aaggcaggac cctccattag aaatgcttat aaaacaaccc 6660ctagtatagg
gtaatcccct ccgggaaacc aagccccagt actcagcagg agaaacagaa
6720tggggaacct cacgaggaca gttttctccc ctcgggacgg ctagccactg
aagaagggaa 6780aatacttttg cctgcaacta tccaatggaa attacttaaa
acccttcatc aaacctttca 6840cttaggcatc gatagcaccc atcagatggc
caaatcatta tttactggac caggcctttt 6900caaaactatc aagcagatag
tcagggcctg tgaagtgtgc cagagaaata atcccctgcc 6960ttatcgccaa
gctccttcag gagaacaaag aacaggccat taccctggag aagactggca
7020actgatttta cccacaagcc caaacctcag ggatttcagt atctactagt
ctgggtagat 7080actttcacgg gttgggcaga ggccttcccc tgtaggacag
aaaaggccca agaggtaata 7140aaggcactag ttcatgaaat aattcccaga
ttcggacttc cccgaggctt acagagtgac 7200aatagccctg ctttccaggc
cacagtaacc cagggagtat cccaggcgtt aggtatacga 7260tatcacttac
actgcgcctg aaggccacag tcctcaggga aggtcgagaa aatgaatgaa
7320acactcaaag gacatctaaa aaagcaaacc caggaaaccc acctcacatg
gcctgctctg 7380ttgcctatag ccttaaaaag aatctgcaac tttccccaaa
aagcaggact tagcccatac 7440gaaatgctgt atggaaggcc cttcataacc
aatgaccttg tgcttgaccc aagacagcca 7500acttagttgc agacatcacc
tccttagcca aatatcaaca agttcttaaa acattacaag 7560gaacctatcc
ctgagaagag ggaaaagaac tattccaccc ttgtgacatg gtattagtca
7620agtcccttcc ctctaattcc ccatccctag atacatcctg ggaaggaccc
tacccagtca 7680ttttatctac cccaactgcg gttaaagtgg ctggagtgga
gtcttggata catcacactt 7740gagtcaaatc ctggatactg ccaaaggaac
ctgaaaatcc aggagacaac gctagctatt 7800cctgtgaacc tctagaggat
ttgcgcctgc tcttcaaaca acaaccagga ggaaagtaac 7860taaaatcata
aatccccatg gccctccctt atcatatttt tctctttact gttcttttac
7920cctctttcac tctcactgca ccccctccat gccgctgtat gaccagtagc
tccccttacc 7980aagagtttct atggagaatg cagcgtcccg gaaatattga
tgccccatcg tataggagtc 8040tttctaaggg aacccccacc ttcactgccc
acacccatat gccccgcaac tgctatcact 8100ctgccactct ttgcatgcat
gcaaatactc attattggac aggaaaaatg attaatccta 8160gttgtcctgg
aggacttgga gtcactgtct gttggactta cttcacccaa actggtatgt
8220ctgatggggg tggagttcaa gatcaggcaa gagaaaaaca tgtaaaagaa
gtaatctccc 8280aactcacccg ggtacatggc acctctagcc cctacaaagg
actagatctc tcaaaactac 8340atgaaaccct ccgtacccat actcgcctgg
taagcctatt taataccacc ctcactgggc 8400tccatgaggt ctcggcccaa
aaccctacta actgttggat atgcctcccc ctgaacttca 8460ggccatatgt
ttcaatccct gtacctgaac aatggaacaa cttcagcaca gaaataaaca
8520ccacttccgt tttagtagga cctcttgttt ccaatctgga aataacccat
acctcaaacc 8580tcacctgtgt aaaatttagc aatactacat acacaaccaa
ctcccaatgc atcaggtggg 8640taactcctcc cacacaaata gtctgcctac
cctcaggaat attttttgtc tgtggtacct 8700cagcctatcg ttgtttgaat
ggctcttcag aatctatgtg cttcctctca ttcttagtgc 8760cccctatgac
catctacact gaacaagatt tatacagtta tgtcatatct aagccccgca
8820acaaaagagt acccattctt ccttttgtta taggagcagg agtgctaggt
gcactaggta 8880ctggcattgg cggtatcaca acctctactc agttctacta
caaactatct caagaactaa 8940atggggacat ggaacgggtc gccgactccc
tggtcacctt gcaagatcaa cttaactccc 9000tagcagcagt agtccttcaa
aatcgaagag ctttagactt gctaaccgct gaaagagggg 9060gaacctgttt
atttttaggg gaagaatgct gttattatgt taatcaatcc ggaatcgtca
9120ctgagaaagt taaagaaatt cgagatcgaa tacaacgtag agcagaggag
cttcgaaaca 9180ctggaccctg gggcctcctc agccaatgga tgccctggat
tctccccttc ttaggacctc 9240tagcagctat aatattgcta ctcctctttg
gaccctgtat ctttaacctc cttgttaact 9300ttgtctcttc cagaatcgaa
gctgtaaaac tacaaatgga gcccaagatg cagtccaaga 9360ctaagatcta
ccgcagaccc ctggaccggc ctgctagccc acgatctgat gttaatgaca
9420tcaaaggcac ccctcctgag gaaatctcag ctgcacaacc tctactacgc
cccaattcag 9480caggaagcag ttagagcggt cgtcggccaa cctccccaac
agcacttagg ttttcctgtt 9540gagatggggg actgagagac aggactagct
ggatttccta ggctgactaa gaatccctaa 9600gcctagctgg gaaggtgacc
acatccacct ttaaacacgg ggcttgcaac ttagctcaca 9660cctgaccaat
cagagagctc actaaaatgc taattaggca aagacaggag gtaaagaaat
9720agccaatcat ctattgcctg agagcacagc aggagggaca atgatcggga
tataaaccca 9780agtcttcgag ccggcaacgg caaccccctt tgggtcccct
ccctttgtat gggagctctg 9840ttttcatgct atttcactct attaaatctt
gcaactgcac tcttctggtc catgtttctt 9900acggcttgag ctgagctttc
gctcgccatc caccactgct gtttgccgcc accgcagacc 9960cgccgctgac
tcccatccct ctggatcatg cagggtgtcc gctgtgctcc tgatccagcg
10020aggcacccat tgccgctccc aatcgggcta aaggcttgcc attgttcctg
catggctaag 10080tgcctgggtt catcctaatt gagctgaaca ctagtcactg
ggttccatgg ttctcttctg 10140tgacccacag cttctaatag agctataaca
ctcaccgcat ggcccaaggt tccattcctt 10200gaatccataa ggccaagaac
cccaggtcag agaacacgag gcttgccacc atcttgggag 10260ctctgtgagc
aaggaccccc aagtaaca 10288
SEQ ID NO: 14 (18 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
tgcagatgct gtgtctgg 18
SEQ ID NO: 15 (17 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
cgtactggcc caggacc 17
SEQ ID NO: 16 (20 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
ggttcgtgct aattgagctg 20
SEQ ID NO: 17 (20 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
atggtggcaa gcttcttgtt 20
SEQ ID NO: 18 (20 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
tgagctttcc ctcactgtcc 20
SEQ ID NO: 19 (20 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
tgttcggctt gattaggatg 20
SEQ ID NO: 20 (20 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
catggcccaa tattccattc 20
SEQ ID NO: 21 (21 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
ggtccttgtt cacagaactc c 21
SEQ ID NO: 22 (20 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
ccgctcctga ttggactaaa 20
SEQ ID NO: 23 (20 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
cgtgggtcaa ggaagagaac 20
SEQ ID NO: 24 (21 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
atgacccgca gcttctaaca g 21
SEQ ID NO: 25 (20 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
ctccgctcac agagctccta 20
SEQ ID NO: 26 (21 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
ccaacatcac taacacaacc t 21
SEQ ID NO: 27 (20 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
gggagttagt aaggggtttg 20
SEQ ID NO: 28 (24 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
caacctatta aacaaaacta aatt 24
SEQ ID NO: 29 (27 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
agatttaata gagtgaaaat agagttt 27
SEQ ID NO: 30 (22 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
ttattagttt aggggatagt tg 22
SEQ ID NO: 31 (23 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
acacaataaa caacctacta aat 23
SEQ ID NO: 32 (19 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
gagggtaagt ggtgataaa 19
SEQ ID NO: 33 (21 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
aacctactaa atccaaaaaa a 21
SEQ ID NO: 34 (22 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
taggatttta ggtttattgt ta 22
SEQ ID NO: 35 (19 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
aaaaataaaa tattaaacc 19
SEQ ID NO: 36 (20 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
atatgtggga gtgagagata 20
SEQ ID NO: 37 (22 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
caacaacaaa caataataat aa 22
SEQ ID NO: 38 (22 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
ttgagttttt ttattgatag tg 22
SEQ ID NO: 39 (21 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
tctaaatcct attttcctac t 21
SEQ ID NO: 40 (24 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
gtttttttat tgatagtgag agat 24
SEQ ID NO: 41 (20 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
taacaaacct ttaatccaat 20
SEQ ID NO: 42 (22 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
tttagtgagg atgatgtaat at 22
SEQ ID NO: 43 (22 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
caacttaata aaaataaacc ca 22
SEQ ID NO: 44 (24 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
ataatgtttt agtaagtgtt ggat 24
SEQ ID NO: 45 (20 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
acaattacaa acctttaacc 20
SEQ ID NO: 46 (20 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
aattcattca acatccattc 20
SEQ ID NO: 47 (26 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
ggtttaatat tatttattat tttgga 26
SEQ ID NO: 48 (25 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
ctcttacctt cctatactct ctaaa 25
SEQ ID NO: 49 (28 nt, DNA, Artificial sequence; Synthetic Construct - amplification primer)
agagtgtagt tgtaagattt aatagagt 28
* * * * *
References