Gene Expression Profiling in Biopsied Tumor Tissues

Baker; Joffre B. ;   et al.

Patent Application Summary

U.S. patent application number 12/616039 was filed with the patent office on 2010-08-19 for gene expression profiling in biopsied tumor tissues. Invention is credited to Joffre B. Baker, Maureen T. Cronin, Michael C. Kiefer, Steve Shak, Michael Graham Walker.

Application Number 20100209920 12/616039
Document ID /
Family ID 28045456
Filed Date 2010-08-19

United States Patent Application 20100209920
Kind Code A1
Baker; Joffre B. ;   et al. August 19, 2010

Gene Expression Profiling in Biopsied Tumor Tissues

Abstract

The invention concerns sensitive methods to measure mRNA levels in biopsied tumor tissues, including archived paraffin-embedded biopsy material. The invention also concerns breast cancer gene sets important in the diagnosis and treatment of breast cancer, and methods for assigning optimal treatment options to breast cancer patients based upon knowledge derived from gene expression studies.


Inventors: Baker; Joffre B.; (Montara, CA) ; Cronin; Maureen T.; (Los Altos, CA) ; Kiefer; Michael C.; (Clayton, CA) ; Shak; Steve; (Hillsborough, CA) ; Walker; Michael Graham; (Sunnyvale, CA)
Correspondence Address:
    Genomic Health, Inc. (Bozicevic, Field & Francis);c/o Kathleen Determann
    301 Penobscot Road
    Redwood City
    CA
    94063
    US
Family ID: 28045456
Appl. No.: 12/616039
Filed: November 10, 2009

Related U.S. Patent Documents

Application Number Filing Date Patent Number
11450973 Jun 9, 2006
12616039
10388360 Mar 12, 2003 7081340
11450973
60412049 Sep 18, 2002
60364890 Mar 13, 2002

Current U.S. Class: 435/6.16
Current CPC Class: C12Q 2600/106 20130101; G01N 33/57415 20130101; G01N 33/57484 20130101; C12Q 2600/118 20130101; C12N 15/1003 20130101; C12Q 2600/16 20130101; C12Q 1/6886 20130101; C12Q 2600/158 20130101
Class at Publication: 435/6
International Class: C12Q 1/68 20060101 C12Q001/68

Claims



1.-45. (canceled)

46. A method comprising: assaying a level of a RNA transcript of CEGP1 in a tissue sample obtained from a primary ductal or lobular breast tumor of a human patient; normalizing said level against a level of at least one reference RNA transcript in said tissue sample to provide a normalized CEGP1 expression level; and predicting the likelihood of long-term survival of said patient without recurrence of breast cancer by comparing said normalized CEGP1 expression level to CEGP1 expression data obtained from reference breast cancer samples, wherein an increased normalized CEGP1 expression level is positively correlated with an increased likelihood of long-term survival without breast cancer recurrence in said patients.

47. The method of claim 46 further comprising assaying a level of a RNA transcript of one or more genes selected from the group consisting of: STK15, Ki-67, PR, GSTM3, ESR1, HNF3A, BIRC5, BAG1, BCL2, CCNB1, and GSTM1 in said tissue sample; normalizing the level of the RNA transcript of the one or more genes against a level of at least one reference RNA transcript in said tissue sample to provide a normalized level of said one or more genes; and comparing said normalized level of said one or more genes to gene expression data from said one or more genes obtained from reference breast cancer samples, wherein increased expression of one or more of BIRC5, CCNB1, STK15 and Ki-67, negatively correlates with an increased likelihood of long-term survival without breast cancer recurrence, and increased expression of one or more of BAG1, BCL2, PR, GSTM1, GSTM3, ESR1 and HNF3A positively correlates with an increased likelihood of long-term survival without breast cancer recurrence.

48. The method of claim 46 wherein the breast tumor is an invasive breast tumor, and said method further comprises assaying a level of a RNA transcript of one or more genes selected from the group consisting of: FOXM1, PRAME, BCL2, STK15, Ki-67, PR, BBC3, NME1, BIRC5, GATA3, TFRC, YB-1, DPYD, CA9, Contig51037, RPS6K1 and Her2 in said tissue sample.

49. The method of claim 46 wherein said breast tumor is an estrogen receptor (ER) positive breast tumor.

50. The method of claim 49 further comprising assaying a level of a RNA transcript of one or more genes selected from the group consisting of: PRAME, BCL2, FOXM1, DIABLO, EPHX1, HIF1A, VEGFC, Ki-67, IGF1R, VDR, NME1, GSTM3, Contig51037, CDC25B, CTSB, p27, CDH1, and IGFBP3 in said tissue sample.

51. The method of claim 47 wherein the levels of 2 or more RNA transcripts are assayed.

52. The method of claim 46, wherein said tissue sample is a fixed, wax-embedded breast cancer tissue specimen of said patient.

53. The method of claim 46, wherein said tissue sample is from a fine needle biopsy.

54. The method of claim 46, further comprising creating a report based upon the normalized CEGP1 expression level.

55. The method of claim 54, wherein said report includes a prediction of the likelihood of long term survival of said patient without the recurrence of breast cancer.

56. The method of claim 55, wherein said report comprises information concerning a recommendation for a treatment modality of said patient.

57. The method of claim 46, wherein said gene expression data is produced using a multivariate analysis using the Cox Proportional Hazards model.

58. The method of claim 46 wherein said assaying is done by reverse transcriptase polymerase chain reaction (RT-PCR).

59. The method of claim 46, wherein said assaying is done after a primary ductal carcinoma has been surgically removed from a breast of said patient.

60. The method of claim 59, wherein said primary ductal carcinoma is an invasive ductal carcinoma.

61. The method of claim 46, wherein said assaying is done after a primary lobular carcinoma has been surgically removed from a breast of said patient.

62. The method of claim 61, wherein said primary lobular carcinoma is an invasive lobular carcinoma.
Description



CROSS-REFERENCE

[0001] This application claims the benefit under 35 U.S.C. 119(e) of provisional application Ser. Nos. 60/412,049, filed Sep. 18, 2002 and 60/364,890, filed Mar. 13, 2002, the entire disclosures of which are hereby incorporated by reference.

BACKGROUND OF THE INVENTION

Field of the Invention

[0002] The present invention relates to gene expression profiling in biopsied tumor tissues. In particular, the present invention concerns sensitive methods to measure mRNA levels in biopsied tumor tissues, including archived paraffin-embedded biopsy material. In addition, the invention provides a set of genes the expression of which is important in the diagnosis and treatment of breast cancer.

[0003] Oncologists have a number of treatment options available to them, including different combinations of chemotherapeutic drugs that are characterized as "standard of care," and a number of drugs that do not carry a label claim for a particular cancer, but for which there is evidence of efficacy in that cancer. The best likelihood of a good treatment outcome requires that patients be assigned to the optimal available cancer treatment, and that this assignment be made as quickly as possible following diagnosis.

[0004] Currently, diagnostic tests used in clinical practice are single analyte, and therefore do not capture the potential value of knowing relationships between dozens of different markers. Moreover, diagnostic tests are frequently not quantitative, relying on immunohistochemistry. This method often yields different results in different laboratories, in part because the reagents are not standardized, and in part because the interpretations are subjective and cannot be easily quantified. RNA-based tests have not often been used because of the problem of RNA degradation over time and the fact that it is difficult to obtain fresh tissue samples from patients for analysis. Fixed paraffin-embedded tissue is more readily available and methods have been established to detect RNA in fixed tissue. However, these methods typically do not allow for the study of large numbers of genes (DNA or RNA) from small amounts of material. Thus, traditionally fixed tissue has been rarely used other than for immunohistochemistry detection of proteins.

[0005] Recently, several groups have published studies concerning the classification of various cancer types by microarray gene expression analysis (see, e.g. Golub et al., Science 286:531-537 (1999); Bhattacharjee et al., Proc. Natl. Acad. Sci. USA 98:13790-13795 (2001); Chen-Hsiang et al., Bioinformatics 17 (Suppl. 1):S316-S322 (2001); Ramaswamy et al., Proc. Natl. Acad. Sci. USA 98:15149-15154 (2001)). Certain classifications of human breast cancers based on gene expression patterns have also been reported (Martin et al., Cancer Res. 60:2232-2238 (2000); West et al., Proc. Natl. Acad. Sci. USA 98:11462-11467 (2001); Sorlie et al., Proc. Natl. Acad. Sci. USA 98:10869-10874 (2001); Yan et al., Cancer Res. 61:8375-8380 (2001)). However, these studies mostly focus on improving and refining the already established classification of various types of cancer, including breast cancer, and generally do not provide new insights into the relationships of the differentially expressed genes, and do not link the findings to treatment strategies in order to improve the clinical outcome of cancer therapy.

[0006] Although modern molecular biology and biochemistry have revealed more than 100 genes whose activities influence the behavior of tumor cells, the state of their differentiation, and their sensitivity or resistance to certain therapeutic drugs, with a few exceptions, the status of these genes has not been exploited for the purpose of routinely making clinical decisions about drug treatments. One notable exception is the use of estrogen receptor (ER) protein expression in breast carcinomas to select patients for treatment with anti-estrogen drugs, such as tamoxifen. Another exception is the use of ErbB2 (Her2) protein expression in breast carcinomas to select patients for treatment with the Her2 antagonist drug Herceptin® (Genentech, Inc., South San Francisco, Calif.).

[0007] Despite recent advances, the challenge of cancer treatment remains to target specific treatment regimens to pathogenically distinct tumor types, and ultimately personalize tumor treatment in order to maximize outcome. Hence, a need exists for tests that simultaneously provide predictive information about patient responses to the variety of treatment options. This is particularly true for breast cancer, the biology of which is poorly understood. It is clear that the classification of breast cancer into a few subgroups, such as the ErbB2+ subgroup, and subgroups characterized by low to absent gene expression of the estrogen receptor (ER) and a few additional transcriptional factors (Perou et al. Nature 406:747-752 (2000)) does not reflect the cellular and molecular heterogeneity of breast cancer, and does not allow the design of treatment strategies maximizing patient response.

SUMMARY OF THE INVENTION

[0008] The present invention provides (1) sensitive methods to measure mRNA levels in biopsied tumor tissue, (2) a set of approximately 190 genes, the expression of which is important in the diagnosis of breast cancer, and (3) the significance of abnormally low or high expression for the genes identified and included in the gene set, through activation or disruption of biochemical regulatory pathways that influence patient response to particular drugs used or potentially useful in the treatment of breast cancer. These results permit assessment of genomic evidence of the efficacy of more than a dozen relevant drugs.

[0009] The present invention accommodates the use of archived paraffin-embedded biopsy material for assay of all markers in the set, and therefore is compatible with the most widely available type of biopsy material. The invention presents an efficient method for extraction of RNA from wax-embedded, fixed tissues, which reduces the cost of the mass production process for acquiring this information without sacrificing the quality of the analysis. In addition, the invention describes a novel, highly effective method for amplifying mRNA copy number, which permits increased assay sensitivity and the ability to monitor expression of large numbers of different genes given the limited amounts of biopsy material. The invention also captures the predictive significance of relationships between the expression levels of certain markers in the breast cancer marker set. Finally, for each member of the gene set, the invention specifies the oligonucleotide sequences to be used in the test.

[0010] In one aspect, the invention concerns a method for predicting clinical outcome for a patient diagnosed with cancer, comprising

[0011] determining the expression level of one or more genes, or their expression products, selected from the group consisting of p53BP2, cathepsin B, cathepsin L, Ki67/MiB1, and thymidine kinase in a cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference cancer tissue set,

[0012] wherein a poor outcome is predicted if:

[0013] (a) the expression level of p53BP2 is in the lower 10th percentile; or

[0014] (b) the expression level of either cathepsin B or cathepsin L is in the upper 10th percentile; or

[0015] (c) the expression level of either Ki67/MiB1 or thymidine kinase is in the upper 10th percentile.

[0016] Poor clinical outcome can be measured, for example, in terms of shortened survival or increased risk of cancer recurrence, e.g. following surgical removal of the cancer.
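By way of illustration only, criteria (a) to (c) above can be sketched in Python as follows. This is a minimal sketch, not the method as practiced: the function names, the dictionary layout, and the reading of "lower 10th percentile" as a rank below 10 and "upper 10th percentile" as a rank of 90 or above within the reference tissue set are all assumptions made for the example. Input values are assumed to be already normalized against the control gene or genes.

```python
from bisect import bisect_left

def percentile_rank(value, reference):
    """Percent of reference values strictly below `value` (0 to 100).

    One of several common percentile conventions; chosen here as an assumption.
    """
    ordered = sorted(reference)
    return 100.0 * bisect_left(ordered, value) / len(ordered)

def poor_outcome(sample, reference_set):
    """Return True if any of criteria (a)-(c) is met.

    sample:        hypothetical dict, gene name -> normalized expression (one patient)
    reference_set: hypothetical dict, gene name -> list of normalized values
                   measured in the reference cancer tissue set
    Genes absent from `sample` are skipped.
    """
    low_genes = ["p53BP2"]                                   # criterion (a)
    high_genes = ["cathepsin B", "cathepsin L",              # criterion (b)
                  "Ki67/MiB1", "thymidine kinase"]           # criterion (c)
    for gene in low_genes:
        if gene in sample and percentile_rank(sample[gene], reference_set[gene]) < 10.0:
            return True
    for gene in high_genes:
        if gene in sample and percentile_rank(sample[gene], reference_set[gene]) >= 90.0:
            return True
    return False
```

The cutoffs are kept as literals so that they can be read directly against criteria (a) to (c); a different percentile convention would change only `percentile_rank`.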

[0017] In another embodiment, the invention concerns a method of predicting the likelihood of the recurrence of cancer, following treatment, in a cancer patient, comprising determining the expression level of p27, or its expression product, in a cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference cancer tissue set, wherein an expression level in the upper 10th percentile indicates decreased risk of recurrence following treatment.

[0018] In another aspect, the invention concerns a method for classifying cancer comprising, determining the expression level of two or more genes selected from the group consisting of Bcl2, hepatocyte nuclear factor 3, ER, ErbB2, and Grb7, or their expression products, in a cancer tissue, normalized against a control gene or genes, and compared to the amount found in a reference cancer tissue set, wherein (i) tumors expressing at least one of Bcl2, hepatocyte nuclear factor 3, and ER, or their expression products, above the mean expression level in the reference tissue set are classified as having a good prognosis for disease free and overall patient survival following treatment; and (ii) tumors expressing elevated levels of ErbB2 and Grb7, or their expression products, at levels ten-fold or more above the mean expression level in the reference tissue set are classified as having poor prognosis of disease free and overall patient survival following treatment.
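The classification rule of the preceding paragraph can be sketched, purely as an example, in Python. The sketch assumes that normalized expression values are on a linear scale (so that "ten-fold above the mean" is a simple multiplication), that both ErbB2 and Grb7 must reach the ten-fold level for rule (ii), and that rule (ii) takes precedence when both rules apply. None of these points is stated in the text, and all names are hypothetical.

```python
from statistics import mean

GOOD_MARKERS = ("Bcl2", "hepatocyte nuclear factor 3", "ER")
POOR_MARKERS = ("ErbB2", "Grb7")

def classify_tumor(sample, reference_set, fold=10.0):
    """Classify one tumor from normalized, linear-scale expression values.

    sample:        hypothetical dict, gene name -> normalized expression
    reference_set: hypothetical dict, gene name -> list of reference values
    """
    ref_mean = {gene: mean(values) for gene, values in reference_set.items()}
    # Rule (ii): ErbB2 and Grb7 both at least `fold` times the reference mean.
    poor = all(sample[g] >= fold * ref_mean[g] for g in POOR_MARKERS)
    # Rule (i): at least one of Bcl2, HNF3, ER above the reference mean.
    good = any(sample[g] > ref_mean[g] for g in GOOD_MARKERS)
    if poor:          # precedence of (ii) over (i) is an assumption
        return "poor prognosis"
    if good:
        return "good prognosis"
    return "unclassified"
```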

[0019] All types of cancer are included, such as, for example, breast cancer, colon cancer, lung cancer, prostate cancer, hepatocellular cancer, gastric cancer, pancreatic cancer, cervical cancer, ovarian cancer, liver cancer, bladder cancer, cancer of the urinary tract, thyroid cancer, renal cancer, carcinoma, melanoma, and brain cancer. The foregoing methods are particularly suitable for prognosis/classification of breast cancer.

[0020] In all previous aspects, in a specific embodiment, the expression level is determined using RNA obtained from a formalin-fixed, paraffin-embedded tissue sample. While all techniques of gene expression profiling, as well as proteomics techniques, are suitable for use in performing the foregoing aspects of the invention, the gene expression levels are often determined by reverse transcription polymerase chain reaction (RT-PCR).

[0021] If the source of the tissue is a formalin-fixed, paraffin embedded tissue sample, the RNA is often fragmented.

[0022] The expression data can be further subjected to multivariate analysis, for example using the Cox Proportional Hazards model.
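For orientation, the Cox Proportional Hazards model in its standard textbook form is shown below. Treating the covariates x_1, ..., x_p as normalized expression levels of individual genes is an assumption made for illustration; the text does not specify which covariates enter the model.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p\right)
```

Here h_0(t) is the baseline hazard and each coefficient \beta_i is estimated from the data. A positive \beta_i means that higher expression of gene i is associated with a higher hazard (for example, earlier recurrence), and a negative \beta_i with a lower hazard.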

[0023] In a further aspect, the invention concerns a method for the preparation of nucleic acid from a fixed, wax-embedded tissue specimen, comprising:

[0024] (a) incubating a section of the fixed, wax-embedded tissue specimen at a temperature of about 56 °C to 70 °C in a lysis buffer, in the presence of a protease, without prior dewaxing, to form a lysis solution;

[0025] (b) cooling the lysis solution to a temperature where the wax solidifies; and

[0026] (c) isolating the nucleic acid from the lysis solution.

[0027] The lysis buffer may comprise urea, such as 4M urea. In a particular embodiment, incubation in step (a) of the foregoing method is performed at about 65 °C.

[0028] In another particular embodiment, the protease used in the foregoing method is proteinase K.

[0029] In another embodiment, the cooling in step (b) is performed at room temperature.

[0030] In a further embodiment, the nucleic acid is isolated after protein removal with 2.5 M NH4OAc.

[0031] The nucleic acid can, for example, be total nucleic acid present in the fixed, wax-embedded tissue specimen.

[0032] In yet another embodiment, the total nucleic acid is isolated by precipitation from the lysis solution, following protein removal, with 2.5 M NH4OAc. The precipitation may, for example, be performed with isopropanol.

[0033] The method described above may further comprise the step of removing DNA from the total nucleic acid, for example by DNAse treatment.

[0034] The tissue specimen may, for example, be obtained from a tumor, and the RNA may be obtained from a microdissected portion of the tissue specimen enriched for tumor cells.

[0035] All types of tumor are included, such as, without limitation, breast cancer, colon cancer, lung cancer, prostate cancer, hepatocellular cancer, gastric cancer, pancreatic cancer, cervical cancer, ovarian cancer, liver cancer, bladder cancer, cancer of the urinary tract, thyroid cancer, renal cancer, carcinoma, melanoma, and brain cancer, in particular breast cancer.

[0036] The method described above may further comprise the step of subjecting the RNA to gene expression profiling. Thus, the gene expression profile may be completed for a set of genes comprising at least two of the genes listed in Table 1.

[0037] Although all methods of gene expression profiling are contemplated, in a particular embodiment, gene expression profiling is performed by RT-PCR which may be preceded by an amplification step.

[0038] In another aspect, the invention concerns a method for preparing fragmented RNA for gene expression analysis, comprising the steps of:

[0039] (a) mixing the RNA with at least one gene-specific, single-stranded DNA scaffold under conditions such that fragments of the RNA complementary to the DNA scaffold hybridize with the DNA scaffold;

[0040] (b) extending the hybridized RNA fragments with a DNA polymerase to form a DNA-DNA duplex; and

[0041] (c) removing the DNA scaffold from the duplex.

[0042] In a specific embodiment, in step (a) of this method, the RNA may be mixed with a mixture of single-stranded DNA templates specific for each gene of interest.

[0043] The method can further comprise the step of heat-denaturing and reannealing the duplexed DNA to the DNA scaffold, with or without additional overlapping scaffolds, and further extending the duplexed sense strand with DNA polymerase prior to removal of the scaffold in step (c).

[0044] The DNA templates may be, but do not need to be, fully complementary to the gene of interest.

[0045] In a particular embodiment, at least one of the DNA templates is complementary to a specific segment of the gene of interest.

[0046] In another embodiment, the DNA templates include sequences complementary to polymorphic variants of the same gene.

[0047] The DNA template may include one or more dUTP or rNTP sites. In this case, in step (c) the DNA template may be removed by fragmenting the DNA template present in the DNA-DNA duplex formed in step (b) at the dUTP or rNTP sites.

[0048] In an important embodiment, the RNA is extracted from fixed, wax-embedded tissue specimens, and purified sufficiently to act as a substrate in an enzyme assay. The RNA purification may, but does not need to, include an oligo-dT based step.

[0049] In a further aspect, the invention concerns a method for amplifying RNA fragments in a sample comprising fragmented RNA representing at least one gene of interest, comprising the steps of:

[0050] (a) contacting the sample with a pool of single-stranded DNA scaffolds comprising an RNA polymerase promoter at the 5' end under conditions such that the RNA fragments complementary to the DNA scaffolds hybridize with the DNA scaffolds;

[0051] (b) extending the hybridized RNA fragments with a DNA polymerase along the DNA scaffolds to form DNA-DNA duplexes;

[0052] (c) amplifying the gene or genes of interest by in vitro transcription; and

[0053] (d) removing the DNA scaffolds from the duplexes.

[0054] An exemplary promoter is the T7 RNA polymerase promoter, while an exemplary DNA polymerase is DNA polymerase I.

[0055] In step (d) the DNA scaffolds may be removed, for example, by treatment with DNase I.

[0056] In a further embodiment, the pool of single-stranded DNA scaffolds comprises partial or complete gene sequences of interest, such as a library of cDNA clones.

[0057] In a specific embodiment, the sample represents a whole genome or a fraction thereof. In a preferred embodiment, the genome is the human genome.

[0058] In another aspect, the invention concerns a method of preparing a personalized genomics profile for a patient, comprising the steps of:

[0059] (a) subjecting RNA extracted from a tissue obtained from the patient to gene expression analysis;

[0060] (b) determining the expression level in such tissue of at least two genes selected from the gene set listed in Table 1, wherein the expression level is normalized against a control gene or genes, and is compared to the amount found in a cancer tissue reference set; and

[0061] (c) creating a report summarizing the data obtained by the gene expression analysis.

[0062] The tissue obtained from the patient may, but does not have to, comprise cancer cells. Just as before, the cancer can, for example, be breast cancer, colon cancer, lung cancer, prostate cancer, hepatocellular cancer, gastric cancer, pancreatic cancer, cervical cancer, ovarian cancer, liver cancer, bladder cancer, cancer of the urinary tract, thyroid cancer, renal cancer, carcinoma, melanoma, or brain cancer, breast cancer being particularly preferred.

[0063] In a particular embodiment, the RNA is obtained from a microdissected portion of breast cancer tissue enriched for cancer cells. The control gene set may, for example, comprise β-actin and ribosomal protein LPO.

[0064] The report prepared for the use of the patient or the patient's physician, may include the identification of at least one drug potentially beneficial in the treatment of the patient.
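As one possible illustration of steps (a) to (c), the following Python sketch normalizes raw measurements against control genes and assembles a plain-text report. It assumes, beyond what is stated above, that the raw data are RT-PCR cycle threshold (Ct) values and that normalization is a simple delta-Ct (mean control Ct minus gene Ct, so that larger values mean higher expression). The gene keys, function names, and report layout are hypothetical.

```python
from statistics import mean

def normalize(ct_values, control_genes):
    """Delta-Ct normalization (an assumed scheme): control mean Ct minus gene Ct.

    Returns normalized values for every non-control gene, on a log2-like scale.
    """
    control_mean = mean(ct_values[g] for g in control_genes)
    return {g: control_mean - ct
            for g, ct in ct_values.items() if g not in control_genes}

def build_report(patient_id, ct_values, control_genes, reference_set):
    """Summarize each gene's normalized level and its rank in the reference set.

    reference_set: hypothetical dict, gene name -> list of normalized values
                   from the cancer tissue reference set.
    """
    normalized = normalize(ct_values, control_genes)
    lines = [f"Patient: {patient_id}"]
    for gene in sorted(normalized):
        ref = reference_set[gene]
        below = sum(1 for r in ref if r < normalized[gene])
        pct = 100.0 * below / len(ref)   # percent of reference values below the sample
        lines.append(
            f"{gene}: normalized={normalized[gene]:.2f}, reference percentile={pct:.0f}"
        )
    return "\n".join(lines)
```

A drug recommendation of the kind mentioned above would be layered on top of such a summary; it is left out here because the applicable rules depend on the specific genes measured.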

[0065] Step (b) of the foregoing method may comprise the step of determining the expression level of a gene specifically influencing cellular sensitivity to a drug, where the gene can, for example, be selected from the group consisting of aldehyde dehydrogenase 1A1, aldehyde dehydrogenase 1A3, amphiregulin, ARG, BRK, BCRP, CD9, CD31, CD82/KAI-1, COX2, c-abl, c-kit, c-kit L, CYP1B1, CYP2C9, DHFR, dihydropyrimidine dehydrogenase, EGF, epiregulin, ER-alpha, ErbB-1, ErbB-2, ErbB-3, ErbB-4, ER-beta, farnesyl pyrophosphate synthetase, gamma-GCS (glutamyl cysteine synthetase), GATA3, geranyl geranyl pyrophosphate synthetase, Grb7, GST-alpha, GST-pi, HB-EGF, hsp 27, human chorionic gonadotropin/CGA, IGF-1, IGF-2, IGF1R, KDR, LIV1, Lung Resistance Protein/MVP, Lot1, MDR-1, microsomal epoxide hydrolase, MMP9, MRP1, MRP2, MRP3, MRP4, PAI1, PDGF-A, PDGF-B, PDGF-C, PDGF-D, PDGFR-alpha, PDGFR-beta, PLAGa (pleiomorphic adenoma 1), PREP prolyl endopeptidase, progesterone receptor, pS2/trefoil factor 1, PTEN, PTB1b, RAR-alpha, RAR-beta2, Reduced Folate Carrier, SXR, TGF-alpha, thymidine phosphorylase, thymidine synthase, topoisomerase II-alpha, topoisomerase II-beta, VEGF, XIST, and YB-1.

[0066] In another embodiment, step (b) of the foregoing process includes determining the expression level of multidrug resistance factors, such as, for example, gamma-glutamyl-cysteine synthetase (GCS), GST-α, GST-π, MDR-1, MRP1-4, breast cancer resistance protein (BCRP), lung cancer resistance protein (MVP), SXR, or YB-1.

[0067] In another embodiment, step (b) of the foregoing process comprises determination of the expression level of eukaryotic translation initiation factor 4E (EIF4E).

[0068] In yet another embodiment, step (b) of the foregoing process comprises determination of the expression level of a DNA repair enzyme.

[0069] In a further embodiment, step (b) of the foregoing process comprises determination of the expression level of a cell cycle regulator, such as, for example, c-MYC, c-Src, Cyclin D1, Ha-Ras, mdm2, p14ARF, p21WAF1/CI, p16INK4a/p14, p23, p27, p53, PI3K, PKC-epsilon, or PKC-delta.

[0070] In a still further embodiment, step (b) of the foregoing process comprises determination of the expression level of a tumor suppressor or a related protein, such as, for example, APC or E-cadherin.

[0071] In another embodiment, step (b) of the foregoing method comprises determination of the expression level of a gene regulating apoptosis, such as, for example, p53, Bcl2, Bcl-x 1, Bak, Bax, and related factors, NFκB, CIAP1, CIAP2, survivin, and related factors, p53BP1/ASPP1, or p53BP2/ASPP2.

[0072] In yet another embodiment, step (b) of the foregoing process comprises determination of the expression level of a factor that controls cell invasion or angiogenesis, such as, for example, uPA, PAI1, cathepsin B, C, and L, scatter factor (HGF), c-met, KDR, VEGF, or CD31.

[0073] In a different embodiment, step (b) of the foregoing method comprises determination of the expression level of a marker for immune or inflammatory cells or processes, such as, for example, Ig light chain λ, CD18, CD3, CD68, Fas (CD95), or Fas Ligand.

[0074] In a further embodiment, step (b) of the foregoing process comprises determination of the expression level of a cell proliferation marker, such as, for example, Ki67/MiB1, PCNA, Pin1, or thymidine kinase.

[0075] In a still further embodiment, step (b) of the foregoing process comprises determination of the expression level of a growth factor or growth factor receptor, such as, for example, IGF1, IGF2, IGFBP3, IGF1R, FGF2, CSF-1, CSF-1R/fms, SCF-1, IL6 or IL8.

[0076] In another embodiment, step (b) of the foregoing process comprises determination of the expression level of a gene marker that defines a subclass of breast cancer, where the gene marker can, for example, be GRO1 oncogene alpha, Grb7, cytokeratins 5 and 17, retinol binding protein 4, hepatocyte nuclear factor 3, integrin subunit alpha 7, or lipoprotein lipase.

[0077] In a still further aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to 5-fluorouracil (5-FU) or an analog thereof, comprising the steps of:

[0078] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis;

[0079] (b) determining the expression level in the tissue of thymidylate synthase mRNA, wherein the expression level is normalized against a control gene or genes, and is compared to the amount found in a reference breast cancer tissue set; and

[0080] (c) predicting patient response based on the normalized thymidylate synthase mRNA level.

[0081] Step (b) of the foregoing method can further comprise determining the expression level of dihydropyrimidine phosphorylase.

[0082] In another embodiment, step (b) of the method can further comprise determining the expression level of thymidine phosphorylase.

[0083] In yet another embodiment, a positive response to 5-FU or an analog thereof is predicted if: (i) normalized thymidylate synthase mRNA level determined in step (b) is at or below the 15th percentile; or (ii) the sum of normalized expression levels of thymidylate synthase and dihydropyrimidine phosphorylase determined in step (b) is at or below the 25th percentile; or (iii) the sum of normalized expression levels of thymidylate synthase, dihydropyrimidine phosphorylase, plus thymidine phosphorylase determined in step (b) is at or below the 20th percentile.
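Conditions (i) to (iii) can be illustrated with the following Python sketch. It is a sketch under stated assumptions only: each reference specimen is represented as a dictionary of normalized values, the percentile of a sum is taken against the same sum computed in every reference specimen, and the percentile rank is the percent of reference scores strictly below the patient's score. These conventions and all names are hypothetical.

```python
TS = "thymidylate synthase"
DPD = "dihydropyrimidine phosphorylase"
TP = "thymidine phosphorylase"

# (genes whose normalized levels are summed, percentile cutoff) for (i)-(iii)
RULES = [((TS,), 15.0), ((TS, DPD), 25.0), ((TS, DPD, TP), 20.0)]

def percentile_rank(value, reference_values):
    """Percent of reference values strictly below `value` (assumed convention)."""
    below = sum(1 for r in reference_values if r < value)
    return 100.0 * below / len(reference_values)

def predicts_5fu_response(sample, reference_samples):
    """Return True if any of conditions (i)-(iii) is satisfied.

    sample:            hypothetical dict, gene name -> normalized expression
    reference_samples: hypothetical list of dicts, one per reference specimen
    Rules whose genes were not all measured in `sample` are skipped.
    """
    for genes, cutoff in RULES:
        if not all(g in sample for g in genes):
            continue
        score = sum(sample[g] for g in genes)
        ref_scores = [sum(r[g] for g in genes) for r in reference_samples]
        if percentile_rank(score, ref_scores) <= cutoff:
            return True
    return False
```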

[0084] In a further embodiment, in step (b) of the foregoing method the expression level of c-myc and wild-type p53 is determined. In this case, a positive response to 5-FU or an analog thereof is predicted, if the normalized expression level of c-myc relative to the normalized expression level of wild-type p53 is in the upper 15th percentile.

[0085] In a still further embodiment, in step (b) of the foregoing method, expression level of NFκB and cIAP2 is determined. In this particular embodiment, resistance to 5-FU or an analog thereof is typically predicted if the normalized expression level of NFκB and cIAP2 is at or above the 10th percentile.

[0086] In another aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to methotrexate or an analog thereof, comprising the steps of:

[0087] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and

[0088] (b) predicting decreased patient sensitivity to methotrexate or analog if (i) DHFR levels are more than tenfold higher than the average expression level of DHFR in the control gene set, or (ii) the normalized expression levels of members of the reduced folate carrier (RFC) family are below the 10th percentile.

[0089] In yet another aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to an anthracycline or an analog thereof, comprising the steps of:

[0090] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and

[0091] (b) predicting patient resistance or decreased sensitivity to the anthracycline or analog if (i) the normalized expression level of topoisomerase IIα is below the 10th percentile, or (ii) the normalized expression level of topoisomerase IIβ is below the 10th percentile, or (iii) the combined normalized topoisomerase IIα or IIβ expression levels are below the 10th percentile.

[0092] In a different aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to docetaxol, comprising the steps of:

[0093] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and

[0094] (b) predicting reduced sensitivity to docetaxol if the normalized expression level of CYP1B1 is in the upper 10th percentile.

[0095] The invention further concerns a method for predicting the response of a patient diagnosed with breast cancer to cyclophosphamide or an analog thereof, comprising

[0096] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and

[0097] (b) predicting reduced sensitivity to the cyclophosphamide or analog if the sum of the expression levels of aldehyde dehydrogenase 1A1 and 1A3 is more than tenfold higher than the average of their combined expression levels in the reference tissue set.

[0098] In a further aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to anti-estrogen therapy, comprising

[0099] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set that contains both specimens negative for and positive for estrogen receptor-.alpha. (ER.alpha.) and progesterone receptor-.alpha. (PR.alpha.); and

[0100] (b) predicting patient response based upon the normalized expression levels of ER.alpha. or PR.alpha., and at least one of microsomal epoxide hydrolase, pS2/trefoil factor 1, GATA3 and human chorionic gonadotropin.

[0101] In a specific embodiment, lack of response or decreased responsiveness is predicted if (i) the normalized expression level of microsomal epoxide hydrolase is in the upper 10.sup.th percentile; or (ii) the normalized expression level of pS2/trefoil factor 1, GATA3 or human chorionic gonadotropin is at or below the corresponding average expression level in said breast cancer tissue set, regardless of the expression level of ER.alpha. or PR.alpha. in the breast cancer tissue obtained from the patient.

[0102] In another aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to a taxane, comprising the steps of:

[0103] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and

[0104] (b) predicting reduced sensitivity to taxane if (i) no or minimal XIST expression is detected; or (ii) the normalized expression level of GST-.pi. or prolyl endopeptidase (PREP) is in the upper 10.sup.th percentile; or (iii) the normalized expression level of PLAG1 is in the upper 10.sup.th percentile.

[0105] The invention also concerns a method for predicting the response of a patient diagnosed with breast cancer to cisplatin or an analog thereof, comprising the steps of:

[0106] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and

[0107] (b) predicting resistance or reduced sensitivity if the normalized expression level of ERCC1 is in the upper 10.sup.th percentile.

[0108] The invention further concerns a method for predicting the response of a patient diagnosed with breast cancer to an ErbB2 or EGFR antagonist, comprising the steps of:

[0109] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and

[0110] (b) predicting patient response based on the normalized expression levels of at least one of Grb7, IGF1R, IGF1 and IGF2.

[0111] In a particular embodiment, a positive response is predicted if the normalized expression level of Grb7 is in the upper 10.sup.th percentile, and the expression of IGF1R, IGF1 and IGF2 is not elevated above the 90.sup.th percentile.

[0112] In a further particular embodiment, a decreased responsiveness is predicted if the expression level of at least one of IGF1R, IGF1 and IGF2 is elevated.
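The two embodiments above amount to a small decision rule, sketched here in Python. The sketch assumes that each gene has already been converted to a percentile rank (0 to 100) within the reference tissue set, that "elevated" in the second embodiment means above the 90.sup.th percentile as in the first, and that the IGF check takes precedence; the function name and return strings are hypothetical.

```python
def predict_erbb2_egfr_antagonist_response(grb7_pct, igf_pcts, cutoff=90.0):
    """Classify response from percentile ranks within the reference set.

    grb7_pct : percentile rank of normalized Grb7 expression
    igf_pcts : dict of percentile ranks for IGF1R, IGF1 and IGF2
    """
    # Any elevated IGF-axis gene predicts decreased responsiveness.
    if any(pct > cutoff for pct in igf_pcts.values()):
        return "decreased responsiveness"
    # Grb7 in the upper tenth, with no IGF-axis gene elevated.
    if grb7_pct >= cutoff:
        return "positive response"
    # The text makes no prediction for the remaining cases.
    return "no prediction"


print(predict_erbb2_egfr_antagonist_response(
    95.0, {"IGF1R": 40.0, "IGF1": 50.0, "IGF2": 60.0}))
```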

[0113] In another aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to a bis-phosphonate drug, comprising the steps of:

[0114] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and

[0115] (b) predicting a positive response if the breast cancer tissue obtained from the patient expresses mutant Ha-Ras and additionally expresses farnesyl pyrophosphate synthetase or geranyl pyrophosphate synthetase at a normalized expression level at or above the 90.sup.th percentile.

[0116] In yet another aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to treatment with a cyclooxygenase 2 inhibitor, comprising the steps of:

[0117] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and

[0118] (b) predicting a positive response if the normalized expression level of COX2 in the breast cancer tissue obtained from the patient is at or above the 90.sup.th percentile.

[0119] The invention further concerns a method for predicting the response of a patient diagnosed with breast cancer to an EGF receptor (EGFR) antagonist, comprising the steps of:

[0120] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and

[0121] (b) predicting a positive response to an EGFR antagonist, if (i) the normalized expression level of EGFR is at or above the 10.sup.th percentile, and (ii) the normalized expression level of at least one of epiregulin, TGF-.alpha., amphiregulin, ErbB3, BRK, CD9, MMP9, CD82, and Lot1 is above the 90.sup.th percentile.

[0122] In another aspect, the invention concerns a method for monitoring the response of a patient diagnosed with breast cancer to treatment with an EGFR antagonist, comprising monitoring the expression level of a gene selected from the group consisting of epiregulin, TGF-.alpha., amphiregulin, ErbB3, BRK, CD9, MMP9, CD82, and Lot1 in the patient during treatment, wherein reduction in the expression level is indicative of positive response to such treatment.

[0123] In yet another aspect, the invention concerns a method for predicting the response of a patient diagnosed with breast cancer to a drug targeting a tyrosine kinase selected from the group consisting of abl, c-kit, PDGFR-.alpha., PDGFR-.beta. and ARG, comprising the steps of:

[0124] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set;

[0125] (b) determining the normalized expression level of a tyrosine kinase selected from the group consisting of abl, c-kit, PDGFR-.alpha., PDGFR-.beta. and ARG, and the cognate ligand of the tyrosine kinase, and if the normalized expression level of the tyrosine kinase is in the upper 10.sup.th percentile,

[0126] (c) determining whether the sequence of the tyrosine kinase contains any mutation,

[0127] wherein a positive response is predicted if (i) the normalized expression level of the tyrosine kinase is in the upper 10.sup.th percentile, (ii) the sequence of the tyrosine kinase contains an activating mutation, or (iii) the normalized expression level of the tyrosine kinase is normal and the expression level of the ligand is in the upper 10.sup.th percentile.
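The three alternative criteria of the wherein clause can be sketched as follows. This is an illustrative reading only: inputs are assumed to be percentile ranks within the reference tissue set, "normal" kinase expression is assumed to mean anything outside the upper tenth, and the function name is hypothetical.

```python
def kinase_drug_criteria_met(kinase_pct, ligand_pct, activating_mutation, upper=90.0):
    """Return the criteria satisfied; a non-empty list means a positive prediction."""
    met = []
    if kinase_pct >= upper:
        met.append("i")    # kinase expression in the upper 10th percentile
    if activating_mutation:
        met.append("ii")   # kinase sequence carries an activating mutation
    if kinase_pct < upper and ligand_pct >= upper:
        met.append("iii")  # kinase not elevated, cognate ligand in the upper tenth
    return met


print(kinase_drug_criteria_met(50.0, 95.0, False))
```

Returning the satisfied criteria rather than a bare boolean keeps the basis of the prediction visible.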

[0128] Another aspect of the invention is a method for predicting the response of a patient diagnosed with breast cancer to treatment with an anti-angiogenic drug, comprising the steps of:

[0129] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and

[0130] (b) predicting a positive response if (i) the normalized expression level of VEGF is in the upper 10.sup.th percentile and (ii) the normalized expression level of KDR or CD31 is in the upper 20.sup.th percentile.

[0131] A further aspect of the invention is a method for predicting the likelihood that a patient diagnosed with breast cancer develops resistance to a drug interacting with the MRP-1 gene coding for the multidrug resistance protein P-glycoprotein, comprising the steps of:

[0132] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis to determine the expression level of PTP1b, wherein the expression level is normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and

[0133] (b) concluding that the patient is likely to develop resistance to said drug if the normalized expression level of the MRP-1 gene is above the 90.sup.th percentile.

[0134] The invention further relates to a method for predicting the likelihood that a patient diagnosed with breast cancer develops resistance to a chemotherapeutic drug or toxin used in cancer treatment, comprising the steps of:

[0135] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and

[0136] (b) determining the normalized expression levels of at least one of the following genes: MDR1, SGT.alpha., GST-.pi., SXR, BCRP, YB-1, and LRP/MVP, wherein the finding of a normalized expression level in the upper 4.sup.th percentile is an indication that the patient is likely to develop resistance to the drug.

[0137] Also included herein is a method for measuring the translational efficiency of VEGF mRNA in a breast cancer tissue sample, comprising determining the expression levels of the VEGF and EIF4E mRNA in the sample, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein a higher normalized EIF4E expression level for the same VEGF expression level is indicative of relatively higher translational efficiency for VEGF.

[0138] In another aspect, the invention provides a method for predicting the response of a patient diagnosed with breast cancer to a VEGF antagonist, comprising determining the expression level of VEGF and EIF4E mRNA normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein a VEGF expression level above the 90.sup.th percentile and an EIF4E expression level above the 50.sup.th percentile is a predictor of good patient response.

[0139] The invention further provides a method for predicting the likelihood of the recurrence of breast cancer in a patient diagnosed with breast cancer, comprising determining the ratio of p53:p21 mRNA expression or p53:mdm2 mRNA expression in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein an above normal ratio is indicative of a higher risk of recurrence. Typically, a higher risk of recurrence is indicated if the ratio is in the upper 10.sup.th percentile.
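A ratio-based test of this kind can be sketched as below. The sketch assumes normalized, linear-scale expression values, a reference set supplied as (p53, partner) pairs, and that "upper 10.sup.th percentile" means the patient's ratio exceeds at least 90% of the reference ratios; all names are hypothetical.

```python
def ratio_in_upper_tenth(p53, partner, reference_pairs):
    """Check whether the p53:partner ratio (partner = p21 or mdm2) is in the upper tenth.

    reference_pairs : iterable of (p53, partner) levels from the reference set
    """
    ratio = p53 / partner
    reference_ratios = [a / b for a, b in reference_pairs]
    below = sum(1 for r in reference_ratios if r < ratio)
    return 100.0 * below / len(reference_ratios) >= 90.0


# Hypothetical reference set with ratios 1.0, 2.0, ..., 10.0
reference_pairs = [(float(i), 1.0) for i in range(1, 11)]
print(ratio_in_upper_tenth(20.0, 2.0, reference_pairs))
print(ratio_in_upper_tenth(5.0, 1.0, reference_pairs))
```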

[0140] In yet another aspect, the invention concerns a method for predicting the likelihood of the recurrence of breast cancer in a breast cancer patient following surgery, comprising determining the expression level of cyclin D1 in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein an expression level in the upper 10.sup.th percentile indicates increased risk of recurrence following surgery. In a particular embodiment of this method, the patient is subjected to adjuvant chemotherapy, if the expression level is in the upper 10.sup.th percentile.

[0141] Another aspect of the invention is a method for predicting the likelihood of the recurrence of breast cancer in a breast cancer patient following surgery, comprising determining the expression level of APC or E-cadherin in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein an expression level in the upper 5.sup.th percentile indicates high risk of recurrence following surgery, and heightened risk of shortened survival.

[0142] A further aspect of the invention is a method for predicting the response of a patient diagnosed with breast cancer to treatment with a proapoptotic drug comprising determining the expression levels of Bcl2 and c-MYC in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein (i) a Bcl2 expression level in the upper 10.sup.th percentile in the absence of elevated expression of c-MYC indicates good response, and (ii) a good response is not indicated if the expression level of c-MYC is elevated, regardless of the expression level of Bcl2.

[0143] A still further aspect of the invention is a method for predicting treatment outcome for a patient diagnosed with breast cancer, comprising the steps of:

[0144] (a) subjecting RNA extracted from a breast cancer tissue obtained from the patient to gene expression analysis, wherein gene expression levels are normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set; and

[0145] (b) determining the normalized expression levels of NF.kappa.B and at least one gene selected from the group consisting of cIAP1, cIAP2, XIAP, and Survivin,

[0146] wherein a poor prognosis is indicated if the expression levels for NF.kappa.B and at least one of the genes selected from the group consisting of cIAP1, cIAP2, XIAP, and Survivin is in the upper 5.sup.th percentile.

[0147] The invention further concerns a method for predicting treatment outcome for a patient diagnosed with breast cancer, comprising determining the expression levels of p53BP1 and p53BP2 in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein a poor outcome is predicted if the expression level of either p53BP1 or p53BP2 is in the lower 10.sup.th percentile.

[0148] The invention additionally concerns a method for predicting treatment outcome for a patient diagnosed with breast cancer, comprising determining the expression levels of uPA and PAI1 in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein (i) a poor outcome is predicted if the expression levels of uPA and PAI1 are in the upper 20.sup.th percentile, and (ii) a decreased risk of recurrence is predicted if the expression levels of uPA and PAI1 are not elevated above the mean observed in the breast cancer reference set. In a particular embodiment, poor outcome is measured in terms of shortened survival or increased risk of cancer recurrence following surgery. In another particular embodiment, uPA and PAI1 are expressed at normal levels, and the patient is subjected to adjuvant chemotherapy following surgery.

[0149] Another aspect of the invention is a method for predicting treatment outcome in a patient diagnosed with breast cancer, comprising determining the expression levels of cathepsin B and cathepsin L in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein a poor outcome is predicted if the expression level of either cathepsin B or cathepsin L is in the upper 10.sup.th percentile. Just as before, poor treatment outcome may be measured, for example, in terms of shortened survival or increased risk of cancer recurrence.

[0150] A further aspect of the invention is a method for devising the treatment of a patient diagnosed with breast cancer, comprising the steps of

[0151] (a) determining the expression levels of scatter factor and c-met in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, and

[0152] (b) suggesting prompt aggressive chemotherapeutic treatment if the expression levels of scatter factor and c-met or the combination of both, are above the 90.sup.th percentile.

[0153] A still further aspect of the invention is a method for predicting treatment outcome for a patient diagnosed with breast cancer, comprising determining the expression levels of VEGF, CD31, and KDR in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein a poor treatment outcome is predicted if the expression level of any of VEGF, CD31, and KDR is in the upper 10.sup.th percentile.

[0154] Yet another aspect of the invention is a method for predicting treatment outcome for a patient diagnosed with breast cancer, comprising determining the expression levels of Ki67/MiB1, PCNA, Pin1, and thymidine kinase in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein a poor treatment outcome is predicted if the expression level of any of Ki67/MiB1, PCNA, Pin1, and thymidine kinase is in the upper 10.sup.th percentile.

[0155] The invention further concerns a method for predicting treatment outcome for a patient diagnosed with breast cancer, comprising determining the expression level of soluble and full length CD95 in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein the presence of soluble CD95 correlates with poor patient survival.

[0156] The invention also concerns a method for predicting treatment outcome for a patient diagnosed with breast cancer, comprising determining the expression levels of IGF1, IGF1R and IGFBP3 in a breast cancer tissue obtained from the patient, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein a poor treatment outcome is predicted if the sum of the expression levels of IGF1, IGF1R and IGFBP3 is in the upper 10.sup.th percentile.

[0157] The invention additionally concerns a method for classifying breast cancer, comprising determining the expression level of two or more genes selected from the group consisting of Bcl12, hepatocyte nuclear factor 3, LIV1, ER, lipoprotein lipase, retinol binding protein 4, integrin .alpha.7, cytokeratin 5, cytokeratin 17, GRO oncogene, ErbB2 and Grb7, in a breast cancer tissue, normalized against a control gene or genes, and compared to the amount found in a reference breast cancer tissue set, wherein (i) tumors expressing at least one of Bcl1, hepatocyte nuclear factor 3, LIV1, and ER above the mean expression level in the reference tissue set are classified as having a good prognosis for disease free and overall patient survival following surgical removal; (ii) tumors characterized by elevated expression of at least one of lipoprotein lipase, retinol binding protein 4, and integrin .alpha.7 compared to the reference tissue set are classified as having intermediate prognosis of disease free and overall patient survival following surgical removal; and (iii) tumors expressing either elevated levels of cytokeratins 5 and 17, and GRO oncogene at levels four-fold or greater above the mean expression level in the reference tissue set, or ErbB2 and Grb7 at levels ten-fold or more above the mean expression level in the reference tissue set are classified as having poor prognosis of disease free and overall patient survival following surgical removal.
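The three-way classification can be illustrated with a short Python sketch. Several points are assumptions rather than statements of the text: expression is supplied as a fold value relative to the reference-set mean, "elevated" is read as a fold value above 1, the cytokeratin and GRO markers are all required at four-fold or more, the poor-prognosis test is applied first when several criteria match, and only a subset of the listed markers is used.

```python
def classify_tumor(fold):
    """Classify a tumor from fold-expression values relative to the reference mean.

    `fold` maps a marker name to (patient level / reference-set mean level).
    Missing markers are treated as not elevated.
    """
    def level(marker):
        return fold.get(marker, 0.0)

    # (iii) poor: cytokeratins 5 and 17 and GRO at >= 4-fold, or ErbB2 and Grb7 at >= 10-fold
    basal = all(level(m) >= 4.0 for m in ("cytokeratin 5", "cytokeratin 17", "GRO"))
    erbb2 = all(level(m) >= 10.0 for m in ("ErbB2", "Grb7"))
    if basal or erbb2:
        return "poor"
    # (ii) intermediate: any of these markers above the reference mean
    intermediate = ("lipoprotein lipase", "retinol binding protein 4", "integrin alpha7")
    if any(level(m) > 1.0 for m in intermediate):
        return "intermediate"
    # (i) good: any of these markers above the reference mean
    if any(level(m) > 1.0 for m in ("hepatocyte nuclear factor 3", "LIV1", "ER")):
        return "good"
    return "unclassified"


print(classify_tumor({"ErbB2": 12.0, "Grb7": 15.0, "ER": 2.0}))
```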

[0158] Another aspect of the invention is a panel of two or more gene specific primers selected from the group consisting of the forward and reverse primers listed in Table 2.

[0159] Yet another aspect of the invention is a method for reverse transcription of a fragmented RNA population in RT-PCR amplification, comprising using a multiplicity of gene specific primers as the reverse primers in the amplification reaction. In a particular embodiment, the method uses between two and about 40,000 gene specific primers in the same amplification reaction. In another embodiment, the gene specific primers are about 18 to 24 bases, such as about 20 bases in length. In another embodiment, the Tm of the primers is about 58-60.degree. C. The primers can, for example, be selected from the group consisting of the forward and reverse primers listed in Table 2.

[0160] The invention also concerns a method of reverse transcriptase driven first strand cDNA synthesis, comprising using a gene specific primer of about 18 to 24 bases in length and having a Tm optimum between about 58.degree. C. and about 60.degree. C. In a particular embodiment, the first strand cDNA synthesis is followed by PCR DNA amplification, and the primer serves as the reverse primer that drives the PCR amplification. In another embodiment, the method uses a plurality of gene specific primers in the same first strand cDNA synthesis reaction mixture. The number of the gene specific primers can, for example, be between 2 and about 40,000.
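The primer criteria stated above (about 18 to 24 bases, Tm of about 58 to 60.degree. C.) can be checked programmatically. The sketch below is an assumption-laden illustration: the text does not say how Tm is calculated, so the crude Wallace rule (2 degrees per A/T, 4 degrees per G/C) is substituted here, the "about" in the ranges is treated as a hard bound, and the example sequences are invented rather than taken from Table 2.

```python
def wallace_tm(primer):
    """Estimate Tm in degrees C with the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def acceptable_primer(primer, min_len=18, max_len=24, tm_min=58, tm_max=60):
    """Check the length and Tm windows described for gene specific primers."""
    length_ok = min_len <= len(primer) <= max_len
    tm_ok = tm_min <= wallace_tm(primer) <= tm_max
    return length_ok and tm_ok


# Invented candidate sequences, for illustration only
candidates = ["ATGCATGCATGCATGCATGC", "ATATATATATATATATATAT", "ATGCATGCATGC"]
print([p for p in candidates if acceptable_primer(p)])
```

A nearest-neighbor thermodynamic model would give more realistic Tm values; the Wallace rule is used only to keep the sketch self-contained.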

[0161] In a different aspect, the invention concerns a method of predicting the likelihood of long-term survival of a breast cancer patient without the recurrence of breast cancer, following surgical removal of the primary tumor, comprising determining the expression level of one or more prognostic RNA transcripts or their products in a breast cancer tissue sample obtained from said patient, normalized against the expression level of all RNA transcripts or their products in said breast cancer tissue sample, or of a reference set of RNA transcripts or their products, wherein the prognostic transcript is the transcript of one or more genes selected from the group consisting of: FOXM1, PRAME, Bcl2, STK15, CEGP1, Ki-67, GSTM1, CA9, PR, BBC3, NME1, SURV, GATA3, TFRC, YB-1, DPYD, GSTM3, RPS6KB1, Src, Chk1, ID1, EstR1, p27, CCNB1, XIAP, Chk2, CDC25B, IGF1R, AK055699, PI3KC2A, TGFB3, BAGI1, CYP3A4, EpCAM, VEGFC, pS2, hENT1, WISP1, HNF3A, NFKBp65, BRCA2, EGFR, TK1, VDR, Contig51037, pENT1, EPHX1, IF1A, DIABLO, CDH1, HIF1.alpha., IGFBP3, CTSB, and Her2, wherein overexpression of one or more of FOXM1, PRAME, STK15, Ki-67, CA9, NME1, SURV, TFRC, YB-1, RPS6KB1, Src, Chk1, CCNB1, Chk2, CDC25B, CYP3A4, EpCAM, VEGFC, hENT1, BRCA2, EGFR, TK1, VDR, EPHX1, IF1A, Contig51037, CDH1, HIF1.alpha., IGFBP3, CTSB, Her2, and pENT1 indicates a decreased likelihood of long-term survival without breast cancer recurrence, and the overexpression of one or more of Bcl2, CEGP1, GSTM1, PR, BBC3, GATA3, DPYD, GSTM3, ID1, EstR1, p27, XIAP, IGF1R, AK055699, PI3KC2A, TGFB3, BAGI1, pS2, WISP1, HNF3A, NFKBp65, and DIABLO indicates an increased likelihood of long-term survival without breast cancer recurrence.
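The direction of effect assigned to each gene can be represented as two lookup sets. The sketch below uses only a subset of the genes named above, takes the list of overexpressed genes as given (the text does not define an overexpression threshold in this paragraph), and deliberately reports the two groups separately rather than combining them into a score, since no scoring scheme is stated here. The function and variable names are hypothetical.

```python
# Subsets of the two gene groups named in the text; not the full lists.
DECREASED_SURVIVAL = {"FOXM1", "PRAME", "STK15", "Ki-67", "CA9", "SURV", "Her2"}
INCREASED_SURVIVAL = {"Bcl2", "CEGP1", "GSTM1", "PR", "BBC3", "GATA3", "EstR1"}


def summarize_prognostic_markers(overexpressed):
    """Group overexpressed genes by the direction of effect stated in the text."""
    over = set(overexpressed)
    return {
        "decreased_likelihood": sorted(over & DECREASED_SURVIVAL),
        "increased_likelihood": sorted(over & INCREASED_SURVIVAL),
        "not_in_panel": sorted(over - DECREASED_SURVIVAL - INCREASED_SURVIVAL),
    }


print(summarize_prognostic_markers(["PRAME", "FOXM1", "CEGP1"]))
```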

[0162] In a particular embodiment of this method, the expression level of at least 2, preferably at least 5, more preferably at least 10, most preferably at least 15 prognostic transcripts or their expression products is determined.

[0163] When the breast cancer is invasive breast carcinoma, including both estrogen receptor (ER) overexpressing (ER positive) and ER negative tumors, the analysis includes determination of the expression levels of the transcripts of at least two of the following genes, or their expression products: FOXM1, PRAME, Bcl2, STK15, CEGP1, Ki-67, GSTM1, PR, BBC3, NME1, SURV, GATA3, TFRC, YB-1, DPYD, Src, CA9, Contig51037, RPS6K1 and Her2.

[0164] When the breast cancer is ER positive invasive breast carcinoma, the analysis includes determination of the expression levels of the transcripts of at least two of the following genes, or their expression products: PRAME, Bcl2, FOXM1, DIABLO, EPHX1, HIF1A, VEGFC, Ki-67, IGF1R, VDR, NME1, GSTM3, Contig51037, CDC25B, CTSB, p27, CDH1, and IGFBP3.

[0165] Just as before, it is preferred to determine the expression levels of at least 5, more preferably at least 10, most preferably at least 15 genes, or their respective expression products.

[0166] In a particular embodiment, the expression level of one or more prognostic RNA transcripts is determined, where RNA may, for example, be obtained from a fixed, wax-embedded breast cancer tissue specimen of the patient. The isolation of RNA can, for example, be carried out following any of the procedures described above or throughout the application, or by any other method known in the art.

[0167] In yet another aspect, the invention concerns an array comprising polynucleotides hybridizing to the following genes: FOXM1, PRAME, Bcl2, STK15, CEGP1, Ki-67, GSTM1, PR, BBC3, NME1, SURV, GATA3, TFRC, YB-1, DPYD, CA9, Contig51037, RPS6K1 and Her2, immobilized on a solid surface.

[0168] In a particular embodiment, the array comprises polynucleotides hybridizing to the following genes: FOXM1, PRAME, Bcl2, STK15, CEGP1, Ki-67, GSTM1, CA9, PR, BBC3, NME1, SURV, GATA3, TFRC, YB-1, DPYD, GSTM3, RPS6KB1, Src, Chk1, ID1, EstR1, p27, CCNB1, XIAP, Chk2, CDC25B, IGF1R, AK055699, PI3KC2A, TGFB3, BAGI1, CYP3A4, EpCAM, VEGFC, pS2, hENT1, WISP1, HNF3A, NFKBp65, BRCA2, EGFR, TK1, VDR, Contig51037, pENT1, EPHX1, IF1A, CDH1, HIF1.alpha., IGFBP3, CTSB, Her2 and DIABLO.

[0169] In a further aspect, the invention concerns a method of predicting the likelihood of long-term survival of a patient diagnosed with invasive breast cancer, without the recurrence of breast cancer, following surgical removal of the primary tumor, comprising the steps of:

[0170] (1) determining the expression levels of the RNA transcripts or the expression products of genes of a gene set selected from the group consisting of

[0171] (a) Bcl2, cyclinG1, NFKBp65, NME1, EPHX1, TOP2B, DR5, TERC, Src, DIABLO;

[0172] (b) Ki67, XIAP, hENT1, TS, CD9, p27, cyclinG1, pS2, NFKBp65, CYP3A4;

[0173] (c) GSTM1, XIAP, Ki67, TS, cyclinG1, p27, CYP3A4, pS2, NFKBp65, ErbB3;

[0174] (d) PR, NME1, XIAP, upa, cyclinG1, Contig51037, TERC, EPHX1, ALDH1A3, CTSL;

[0175] (e) CA9, NME1, TERC, cyclinG1, EPHX1, DPYD, Src, TOP2B, NFKBp65, VEGFC;

[0176] (f) TFRC, XIAP, Ki67, TS, cyclinG1, p27, CYP3A4, pS2, ErbB3, NFKBp65;

[0177] (g) Bcl2, PRAME, cyclinG1, FOXM1, NFKBp65, TS, XIAP, Ki67, CYP3A4, p27;

[0178] (h) FOXM1, cyclinG1, XIAP, Contig51037, PRAME, TS, Ki67, PDGFRa, p27, NFKBp65;

[0179] (i) PRAME, FOXM1, cyclinG1, XIAP, Contig51037, TS, Ki67, PDGFRa, p27, NFKBp65;

[0180] (j) Ki67, XIAP, PRAME, hENT1, Contig51037, TS, CD9, p27, ErbB3, cyclinG1;

[0181] (k) STK15, XIAP, PRAME, PLAUR, p27, CTSL, CD18, PREP, p53, RPS6KB1;

[0182] (l) GSTM1, XIAP, PRAME, p27, Contig51037, ErbB3, GSTp, EREG, ID1, PLAUR;

[0183] (m) PR, PRAME, NME1, XIAP, PLAUR, cyclinG1, Contig51037, TERC, EPHX1, DR5;

[0184] (n) CA9, FOXM1, cyclinG1, XIAP, TS, Ki67, NFKBp65, CYP3A4, GSTM3, p27;

[0185] (o) TFRC, XIAP, PRAME, p27, Contig51037, ErbB3, DPYD, TERC, NME1, VEGFC; and

[0186] (p) CEGP1, PRAME, hENT1, XIAP, Contig51037, ErbB3, DPYD, NFKBp65, ID1, TS, in a breast cancer tissue sample obtained from said patient, normalized against the expression levels of all RNA transcripts or their products in said breast cancer tissue sample, or of a reference set of RNA transcripts or their products;

[0187] (2) subjecting the data obtained in step (1) to statistical analysis; and

[0188] (3) determining whether the likelihood of said long-term survival has increased or decreased.

[0189] In a still further aspect, the invention concerns a method of predicting the likelihood of long-term survival of a patient diagnosed with estrogen receptor (ER)-positive invasive breast cancer, without the recurrence of breast cancer, following surgical removal of the primary tumor, comprising the steps of:

[0190] (1) determining the expression levels of the RNA transcripts or the expression products of genes of a gene set selected from the group consisting of

[0191] (a) PRAME, p27, IGFBP2, HIF1A, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;

[0192] (b) Contig51037, EPHX1, Ki67, TIMP2, cyclinG1, DPYD, CYP3A4, TP, AIB1, CYP2C8;

[0193] (c) Bcl2, hENT1, FOXM1, Contig51037, cyclinG1, Contig46653, PTEN, CYP3A4, TIMP2, AREG;

[0194] (d) HIF1A, PRAME, p27, IGFBP2, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;

[0195] (e) IGF1R, PRAME, EPHX1, Contig51037, cyclinG1, Bcl2, NME1, PTEN, TBP, TIMP2;

[0196] (f) FOXM1, Contig51037, VEGFC, TBP, HIF1A, DPYD, RAD51C, DCR3, cyclinG1, BAG1;

[0197] (g) EPHX1, Contig51037, Ki67, TIMP2, cyclinG1, DPYD, CYP3A4, TP, AIB1, CYP2C8;

[0198] (h) Ki67, VEGFC, VDR, GSTM3, p27, upa, ITGA7, rhoC, TERC, Pin1;

[0199] (i) CDC25B, Contig51037, hENT1, Bcl2, HLAG, TERC, NME1, upa, ID1, CYP;

[0200] (j) VEGFC, Ki67, VDR, GSTM3, p27, upa, ITGA7, rhoC, TERC, Pin1;

[0201] (k) CTSB, PRAME, p27, IGFBP2, EPHX1, CTSL, BAD, DR5, DCR3, XIAP;

[0202] (l) DIABLO, Ki67, hENT1, TIMP2, ID1, p27, KRT19, IGFBP2, TS, PDGFB;

[0203] (m) p27, PRAME, IGFBP2, HIF1A, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;

[0204] (n) CDH1, PRAME, VEGFC, HIF1A, DPYD, TIMP2, CYP3A4, EstR1, RBP4, p27;

[0205] (o) IGFBP3, PRAME, p27, Bcl2, XIAP, EstR1, Ki67, TS, Src, VEGF;

[0206] (p) GSTM3, PRAME, p27, IGFBP3, XIAP, FGF2, hENT1, PTEN, EstR1, APC;

[0207] (q) hENT1, Bcl2, FOXM1, Contig51037, cyclinG1, Contig46653, PTEN, CYP3A4, TIMP2, AREG;

[0208] (r) STK15, VEGFC, PRAME, p27, GCLC, hENT1, ID1, TIMP2, EstR1, MCP1;

[0209] (s) NME1, PRAME, p27, IGFBP3, XIAP, PTEN, hENT1, Bcl2, CYP3A4, HLAG;

[0210] (t) VDR, Bcl2, p27, hENT1, p53, PI3KC2A, EIF4E, TFRC, MCM3, ID1;

[0211] (u) EIF4E, Contig51037, EPHX1, cyclinG1, Bcl2, DR5, TBP, PTEN, NME1, HER2;

[0212] (v) CCNB1, PRAME, VEGFC, HIF1A, hENT1, GCLC, TIMP2, ID1, p27, upa;

[0213] (w) ID1, PRAME, DIABLO, hENT1, p27, PDGFRa, NME1, BIN1, BRCA1, TP;

[0214] (x) FBXO5, PRAME, IGFBP3, p27, GSTM3, hENT1, XIAP, FGF2, TS, PTEN;

[0215] (y) GUS, HIF1A, VEGFC, GSTM3, DPYD, hENT1, FBXO5, CA9, CYP, KRT18; and

[0216] (z) Bclx, Bcl2, hENT1, Contig51037, HLAG, CD9, ID1, BRCA1, BIN1, HBEGF;

[0217] (2) subjecting the data obtained in step (1) to statistical analysis; and

[0218] (3) determining whether the likelihood of said long-term survival has increased or decreased.

[0219] In a different aspect, the invention concerns an array comprising polynucleotides hybridizing to a gene set selected from the group consisting of

[0220] (a) Bcl2, cyclinG1, NFKBp65, NME1, EPHX1, TOP2B, DR5, TERC, Src, DIABLO;

[0221] (b) Ki67, XIAP, hENT1, TS, CD9, p27, cyclinG1, pS2, NFKBp65, CYP3A4;

[0222] (c) GSTM1, XIAP, Ki67, TS, cyclinG1, p27, CYP3A4, pS2, NFKBp65, ErbB3;

[0223] (d) PR, NME1, XIAP, upa, cyclinG1, Contig51037, TERC, EPHX1, ALDH1A3, CTSL;

[0224] (e) CA9, NME1, TERC, cyclinG1, EPHX1, DPYD, Src, TOP2B, NFKBp65, VEGFC;

[0225] (f) TFRC, XIAP, Ki67, TS, cyclinG1, p27, CYP3A4, pS2, ErbB3, NFKBp65;

[0226] (g) Bcl2, PRAME, cyclinG1, FOXM1, NFKBp65, TS, XIAP, Ki67, CYP3A4, p27;

[0227] (h) FOXM1, cyclinG1, XIAP, Contig51037, PRAME, TS, Ki67, PDGFRa, p27, NFKBp65;

[0228] (i) PRAME, FOXM1, cyclinG1, XIAP, Contig51037, TS, Ki67, PDGFRa, p27, NFKBp65;

[0229] (j) Ki67, XIAP, PRAME, hENT1, Contig51037, TS, CD9, p27, ErbB3, cyclinG1;

[0230] (k) STK15, XIAP, PRAME, PLAUR, p27, CTSL, CD18, PREP, p53, RPS6KB1;

[0231] (l) GSTM1, XIAP, PRAME, p27, Contig51037, ErbB3, GSTp, EREG, ID1, PLAUR;

[0232] (m) PR, PRAME, NME1, XIAP, PLAUR, cyclinG1, Contig51037, TERC, EPHX1, DR5;

[0233] (n) CA9, FOXM1, cyclinG1, XIAP, TS, Ki67, NFKBp65, CYP3A4, GSTM3, p27;

[0234] (o) TFRC, XIAP, PRAME, p27, Contig51037, ErbB3, DPYD, TERC, NME1, VEGFC; and

[0235] (p) CEGP1, PRAME, hENT1, XIAP, Contig51037, ErbB3, DPYD, NFKBp65, ID1, TS, immobilized on a solid surface.

[0236] In an additional aspect, the invention concerns an array comprising polynucleotides hybridizing to a gene set selected from the group consisting of:
[0237] (a) PRAME, p27, IGFBP2, HIF1A, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;
[0238] (b) Contig51037, EPHX1, Ki67, TIMP2, cyclinG1, DPYD, CYP3A4, TP, AIB1, CYP2C8;
[0239] (c) Bcl2, hENT1, FOXM1, Contig51037, cyclinG1, Contig46653, PTEN, CYP3A4, TIMP2, AREG;
[0240] (d) HIF1A, PRAME, p27, IGFBP2, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;
[0241] (e) IGF1R, PRAME, EPHX1, Contig51037, cyclinG1, Bcl2, NME1, PTEN, TBP, TIMP2;
[0242] (f) FOXM1, Contig51037, VEGFC, TBP, HIF1A, DPYD, RAD51C, DCR3, cyclinG1, BAG1;
[0243] (g) EPHX1, Contig51037, Ki67, TIMP2, cyclinG1, DPYD, CYP3A4, TP, AIB1, CYP2C8;
[0244] (h) Ki67, VEGFC, VDR, GSTM3, p27, upa, ITGA7, rhoC, TERC, Pin1;
[0245] (i) CDC25B, Contig51037, hENT1, Bcl2, HLAG, TERC, NME1, upa, ID1, CYP;
[0246] (j) VEGFC, Ki67, VDR, GSTM3, p27, upa, ITGA7, rhoC, TERC, Pin1;
[0247] (k) CTSB, PRAME, p27, IGFBP2, EPHX1, CTSL, BAD, DR5, DCR3, XIAP;
[0248] (l) DIABLO, Ki67, hENT1, TIMP2, ID1, p27, KRT19, IGFBP2, TS, PDGFB;
[0249] (m) p27, PRAME, IGFBP2, HIF1A, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;
[0250] (n) CDH1, PRAME, VEGFC, HIF1A, DPYD, TIMP2, CYP3A4, EstR1, RBP4, p27;
[0251] (o) IGFBP3, PRAME, p27, Bcl2, XIAP, EstR1, Ki67, TS, Src, VEGF;
[0252] (p) GSTM3, PRAME, p27, IGFBP3, XIAP, FGF2, hENT1, PTEN, EstR1, APC;
[0253] (q) hENT1, Bcl2, FOXM1, Contig51037, cyclinG1, Contig46653, PTEN, CYP3A4, TIMP2, AREG;
[0254] (r) STK15, VEGFC, PRAME, p27, GCLC, hENT1, ID1, TIMP2, EstR1, MCP1;
[0255] (s) NME1, PRAME, p27, IGFBP3, XIAP, PTEN, hENT1, Bcl2, CYP3A4, HLAG;
[0256] (t) VDR, Bcl2, p27, hENT1, p53, PI3KC2A, EIF4E, TFRC, MCM3, ID1;
[0257] (u) EIF4E, Contig51037, EPHX1, cyclinG1, Bcl2, DR5, TBP, PTEN, NME1, HER2;
[0258] (v) CCNB1, PRAME, VEGFC, HIF1A, hENT1, GCLC, TIMP2, ID1, p27, upa;
[0259] (w) ID1, PRAME, DIABLO, hENT1, p27, PDGFRa, NME1, BIN1, BRCA1, TP;
[0260] (x) FBXO5, PRAME, IGFBP3, p27, GSTM3, hENT1, XIAP, FGF2, TS, PTEN;
[0261] (y) GUS, HIF1A, VEGFC, GSTM3, DPYD, hENT1, FBXO5, CA9, CYP, KRT18; and
[0262] (z) Bclx, Bcl2, hENT1, Contig51037, HLAG, CD9, ID1, BRCA1, BIN1, HBEGF,
immobilized on a solid surface.

[0263] In all aspects, the polynucleotides can be cDNAs ("cDNA arrays") that are typically about 500 to 5000 bases long, although shorter or longer cDNAs can also be used and are within the scope of this invention. Alternatively, the polynucleotides can be oligonucleotides (DNA microarrays), which are typically about 20 to 80 bases long, although shorter and longer oligonucleotides are also suitable and are within the scope of the invention. The solid surface can, for example, be glass or nylon, or any other solid surface typically used in preparing arrays, such as microarrays, and is typically glass.

BRIEF DESCRIPTION OF THE DRAWINGS

[0264] FIG. 1 is a chart illustrating the overall workflow of the process of the invention for measurement of gene expression. In the Figure, "FPET" stands for "fixed paraffin-embedded tissue," and "RT-PCR" stands for "reverse transcriptase PCR." RNA concentration is determined by using the commercial RiboGreen™ RNA Quantitation Reagent and Protocol.

[0265] FIG. 2 is a flow chart showing the steps of an RNA extraction method according to the invention alongside a flow chart of a representative commercial method.

[0266] FIG. 3 is a scheme illustrating the steps of an improved method for preparing fragmented mRNA for expression profiling analysis.

[0267] FIG. 4 illustrates methods for amplification of RNA prior to RT-PCR.

[0268] FIG. 5 illustrates an alternative scheme for repair and amplification of fragmented mRNA.

[0269] FIG. 6 shows the measurement of estrogen receptor mRNA levels in 40 FPE breast cancer specimens via RT-PCR. Three 10 micron sections were used for each measurement. Each data point represents the average of triplicate measurements.

[0270] FIG. 7 shows the results of the measurement of progesterone receptor mRNA levels in 40 FPE breast cancer specimens via RT-PCR performed as described in the legend of FIG. 6 above.

[0271] FIG. 8 shows results from an IVT/RT-PCR experiment.

[0272] FIG. 9 is a representation of the expression of 92 genes across 70 FPE breast cancer specimens. The y-axis shows expression as threshold cycle (Ct) values. These genes are a subset of the genes listed in Table 1.

[0273] Table 1 shows a breast cancer gene list.

[0274] Table 2 sets forth amplicon and primer sequences used for amplification of fragmented mRNA.

[0275] Table 3 shows the Accession Nos. and SEQ ID NOS of the breast cancer genes examined.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

A. Definitions

[0276] Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Singleton et al., Dictionary of Microbiology and Molecular Biology 2nd ed., J. Wiley & Sons (New York, N.Y. 1994), and March, Advanced Organic Chemistry Reactions, Mechanisms and Structure 4th ed., John Wiley & Sons (New York, N.Y. 1992), provide one skilled in the art with a general guide to many of the terms used in the present application.

[0277] One skilled in the art will recognize many methods and materials similar or equivalent to those described herein, which could be used in the practice of the present invention. Indeed, the present invention is in no way limited to the methods and materials described. For purposes of the present invention, the following terms are defined below.

[0278] The term "microarray" refers to an ordered arrangement of hybridizable array elements, preferably polynucleotide probes, on a substrate.

[0279] The term "polynucleotide," when used in singular or plural, generally refers to any polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA or modified RNA or DNA. Thus, for instance, polynucleotides as defined herein include, without limitation, single- and double-stranded DNA, DNA including single- and double-stranded regions, single- and double-stranded RNA, and RNA including single- and double-stranded regions, hybrid molecules comprising DNA and RNA that may be single-stranded or, more typically, double-stranded or include single- and double-stranded regions. In addition, the term "polynucleotide" as used herein refers to triple-stranded regions comprising RNA or DNA or both RNA and DNA. The strands in such regions may be from the same molecule or from different molecules. The regions may include all of one or more of the molecules, but more typically involve only a region of some of the molecules. One of the molecules of a triple-helical region often is an oligonucleotide. The term "polynucleotide" specifically includes DNAs and RNAs that contain one or more modified bases. Thus, DNAs or RNAs with backbones modified for stability or for other reasons are "polynucleotides" as that term is intended herein. Moreover, DNAs or RNAs comprising unusual bases, such as inosine, or modified bases, such as tritiated bases, are included within the term "polynucleotides" as defined herein. In general, the term "polynucleotide" embraces all chemically, enzymatically and/or metabolically modified forms of unmodified polynucleotides, as well as the chemical forms of DNA and RNA characteristic of viruses and cells, including simple and complex cells.

[0280] The term "oligonucleotide" refers to a relatively short polynucleotide, including, without limitation, single-stranded deoxyribonucleotides, single- or double-stranded ribonucleotides, RNA:DNA hybrids and double-stranded DNAs. Oligonucleotides, such as single-stranded DNA probe oligonucleotides, are often synthesized by chemical methods, for example using automated oligonucleotide synthesizers that are commercially available. However, oligonucleotides can be made by a variety of other methods, including in vitro recombinant DNA-mediated techniques and by expression of DNAs in cells and organisms.

[0281] The terms "differentially expressed gene," "differential gene expression" and their synonyms, which are used interchangeably, refer to a gene whose expression is activated to a higher or lower level in a subject suffering from a disease, specifically cancer, such as breast cancer, relative to its expression in a normal or control subject. The terms also include genes whose expression is activated to a higher or lower level at different stages of the same disease. It is also understood that a differentially expressed gene may be either activated or inhibited at the nucleic acid level or protein level, or may be subject to alternative splicing to result in a different polypeptide product. Such differences may be evidenced by a change in mRNA levels, surface expression, secretion or other partitioning of a polypeptide, for example. Differential gene expression may include a comparison of expression between two or more genes, or a comparison of the ratios of the expression between two or more genes, or even a comparison of two differently processed products of the same gene, which differ between normal subjects and subjects suffering from a disease, specifically cancer, or between various stages of the same disease. Differential expression includes both quantitative, as well as qualitative, differences in the temporal or cellular expression pattern in a gene or its expression products among, for example, normal and diseased cells, or among cells which have undergone different disease events or disease stages. For the purpose of this invention, "differential gene expression" is considered to be present when there is at least an about two-fold, preferably at least about four-fold, more preferably at least about six-fold, most preferably at least about ten-fold difference between the expression of a given gene in normal and diseased subjects, or in various stages of disease development in a diseased subject.
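The fold-difference tiers in the definition above can be illustrated with a minimal sketch. It assumes expression levels are positive, linear-scale values, and it treats each "about N-fold" figure as a hard cutoff, which is a simplification. The function names and tier labels are hypothetical and are not part of the disclosure.

```python
def fold_difference(expr_a: float, expr_b: float) -> float:
    """Return the fold difference (always >= 1) between two positive expression levels."""
    if expr_a <= 0 or expr_b <= 0:
        raise ValueError("expression levels must be positive")
    ratio = expr_a / expr_b
    return ratio if ratio >= 1 else 1 / ratio


# Cutoffs taken from the definition above, checked from highest to lowest.
# Treating "about" as an exact cutoff is an assumption of this sketch.
TIERS = [
    (10.0, "at least about ten-fold"),
    (6.0, "at least about six-fold"),
    (4.0, "at least about four-fold"),
    (2.0, "at least about two-fold"),
]


def classify_differential(expr_normal: float, expr_diseased: float) -> str:
    """Label a gene by the highest fold-difference tier it reaches."""
    fold = fold_difference(expr_diseased, expr_normal)
    for cutoff, label in TIERS:
        if fold >= cutoff:
            return label
    return "not differentially expressed"
```

Because the fold difference is symmetric, a gene that is eight-fold lower in diseased tissue lands in the same tier as one that is eight-fold higher.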

[0282] The phrase "gene amplification" refers to a process by which multiple copies of a gene or gene fragment are formed in a particular cell or cell line. The duplicated region (a stretch of amplified DNA) is often referred to as an "amplicon." Usually, the amount of the messenger RNA (mRNA) produced, i.e., the level of gene expression, also increases in proportion to the number of copies made of the particular gene expressed.

[0283] The term "prognosis" is used herein to refer to the prediction of the likelihood of cancer-attributable death or progression, including recurrence, metastatic spread, and drug resistance, of a neoplastic disease, such as breast cancer. The term "prediction" is used herein to refer to the likelihood that a patient will respond either favorably or unfavorably to a drug or set of drugs, and also the extent of those responses. The predictive methods of the present invention can be used clinically to make treatment decisions by choosing the most appropriate treatment modalities for any particular patient. The predictive methods of the present invention are valuable tools in predicting if a patient is likely to respond favorably to a treatment regimen, such as surgical intervention, chemotherapy with a given drug or drug combination, and/or radiation therapy.

[0284] The term "increased resistance" to a particular drug or treatment option, when used in accordance with the present invention, means decreased response to a standard dose of the drug or to a standard treatment protocol.

[0285] The term "decreased sensitivity" to a particular drug or treatment option, when used in accordance with the present invention, means decreased response to a standard dose of the drug or to a standard treatment protocol, where decreased response can be compensated for (at least partially) by increasing the dose of drug, or the intensity of treatment.

[0286] "Patient response" can be assessed using any endpoint indicating a benefit to the patient, including, without limitation, (1) inhibition, to some extent, of tumor growth, including slowing down and complete growth arrest; (2) reduction in the number of tumor cells; (3) reduction in tumor size; (4) inhibition (i.e., reduction, slowing down or complete stopping) of tumor cell infiltration into adjacent peripheral organs and/or tissues; (5) inhibition (i.e. reduction, slowing down or complete stopping) of metastasis; (6) enhancement of anti-tumor immune response, which may, but does not have to, result in the regression or rejection of the tumor; (7) relief, to some extent, of one or more symptoms associated with the tumor; (8) increase in the length of survival following treatment; and/or (9) decreased mortality at a given point of time following treatment.

[0287] The term "treatment" refers to both therapeutic treatment and prophylactic or preventative measures, wherein the object is to prevent or slow down (lessen) the targeted pathologic condition or disorder. Those in need of treatment include those already with the disorder as well as those prone to have the disorder or those in whom the disorder is to be prevented. In tumor (e.g., cancer) treatment, a therapeutic agent may directly decrease the pathology of tumor cells, or render the tumor cells more susceptible to treatment by other therapeutic agents, e.g., radiation and/or chemotherapy.

[0288] The term "tumor," as used herein, refers to all neoplastic cell growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous cells and tissues.

[0289] The terms "cancer" and "cancerous" refer to or describe the physiological condition in mammals that is typically characterized by unregulated cell growth. Examples of cancer include but are not limited to, breast cancer, colon cancer, lung cancer, prostate cancer, hepatocellular cancer, gastric cancer, pancreatic cancer, cervical cancer, ovarian cancer, liver cancer, bladder cancer, cancer of the urinary tract, thyroid cancer, renal cancer, carcinoma, melanoma, and brain cancer.

[0290] The "pathology" of cancer includes all phenomena that compromise the well-being of the patient. This includes, without limitation, abnormal or uncontrollable cell growth, metastasis, interference with the normal functioning of neighboring cells, release of cytokines or other secretory products at abnormal levels, suppression or aggravation of inflammatory or immunological response, neoplasia, premalignancy, malignancy, invasion of surrounding or distant tissues or organs, such as lymph nodes, etc.

[0291] "Stringency" of hybridization reactions is readily determinable by one of ordinary skill in the art, and generally is an empirical calculation dependent upon probe length, washing temperature, and salt concentration. In general, longer probes require higher temperatures for proper annealing, while shorter probes need lower temperatures. Hybridization generally depends on the ability of denatured DNA to reanneal when complementary strands are present in an environment below their melting temperature. The higher the degree of desired homology between the probe and hybridizable sequence, the higher the relative temperature which can be used. As a result, it follows that higher relative temperatures would tend to make the reaction conditions more stringent, while lower temperatures less so. For additional details and explanation of stringency of hybridization reactions, see Ausubel et al., Current Protocols in Molecular Biology, Wiley Interscience Publishers, (1995).

[0292] "Stringent conditions" or "high stringency conditions", as defined herein, typically: (1) employ low ionic strength and high temperature for washing, for example 0.015 M sodium chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate at 50° C.; (2) employ during hybridization a denaturing agent, such as formamide, for example, 50% (v/v) formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM sodium chloride, 75 mM sodium citrate at 42° C.; or (3) employ 50% formamide, 5×SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5×Denhardt's solution, sonicated salmon sperm DNA (50 µg/ml), 0.1% SDS, and 10% dextran sulfate at 42° C., with washes at 42° C. in 0.2×SSC (sodium chloride/sodium citrate) and 50% formamide at 55° C., followed by a high-stringency wash consisting of 0.1×SSC containing EDTA at 55° C.

[0293] "Moderately stringent conditions" may be identified as described by Sambrook et al., Molecular Cloning: A Laboratory Manual, New York: Cold Spring Harbor Press, 1989, and include the use of washing solution and hybridization conditions (e.g., temperature, ionic strength and % SDS) less stringent than those described above. An example of moderately stringent conditions is overnight incubation at 37° C. in a solution comprising: 20% formamide, 5×SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5×Denhardt's solution, 10% dextran sulfate, and 20 mg/ml denatured sheared salmon sperm DNA, followed by washing the filters in 1×SSC at about 37-50° C. The skilled artisan will recognize how to adjust the temperature, ionic strength, etc. as necessary to accommodate factors such as probe length and the like. In the context of the present invention, reference to "at least one," "at least two," "at least five," etc. of the genes listed in any particular gene set means any one or any and all combinations of the genes listed.

[0294] The terms "splicing" and "RNA splicing" are used interchangeably and refer to RNA processing that removes introns and joins exons to produce mature mRNA with continuous coding sequence that moves into the cytoplasm of a eukaryotic cell.

[0295] In theory, the term "exon" refers to any segment of an interrupted gene that is represented in the mature RNA product (B. Lewin. Genes IV Cell Press, Cambridge Mass. 1990). In theory the term "intron" refers to any segment of DNA that is transcribed but removed from within the transcript by splicing together the exons on either side of it. Operationally, exon sequences occur in the mRNA sequence of a gene as defined by Ref. Seq ID numbers. Operationally, intron sequences are the intervening sequences within the genomic DNA of a gene, bracketed by exon sequences and having GT and AG splice consensus sequences at their 5' and 3' boundaries.

B. Detailed Description

[0296] The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, and biochemistry, which are within the skill of the art. Such techniques are explained fully in the literature, such as, "Molecular Cloning: A Laboratory Manual", 2nd edition (Sambrook et al., 1989); "Oligonucleotide Synthesis" (M. J. Gait, ed., 1984); "Animal Cell Culture" (R. I. Freshney, ed., 1987); "Methods in Enzymology" (Academic Press, Inc.); "Handbook of Experimental Immunology", 4th edition (D. M. Weir & C. C. Blackwell, eds., Blackwell Science Inc., 1987); "Gene Transfer Vectors for Mammalian Cells" (J. M. Miller & M. P. Calos, eds., 1987); "Current Protocols in Molecular Biology" (F. M. Ausubel et al., eds., 1987); and "PCR: The Polymerase Chain Reaction", (Mullis et al., eds., 1994).

[0297] 1. Gene Expression Profiling

[0298] In general, methods of gene expression profiling can be divided into two large groups: methods based on hybridization analysis of polynucleotides, and methods based on sequencing of polynucleotides. The most commonly used methods known in the art for the quantification of mRNA expression in a sample include northern blotting and in situ hybridization (Parker & Barnes, Methods in Molecular Biology 106:247-283 (1999)); RNAse protection assays (Hod, Biotechniques 13:852-854 (1992)); and reverse transcription polymerase chain reaction (RT-PCR) (Weis et al., Trends in Genetics 8:263-264 (1992)). Alternatively, antibodies may be employed that can recognize specific duplexes, including DNA duplexes, RNA duplexes, and DNA-RNA hybrid duplexes or DNA-protein duplexes. Representative methods for sequencing-based gene expression analysis include Serial Analysis of Gene Expression (SAGE), and gene expression analysis by massively parallel signature sequencing (MPSS).

[0299] 2. Reverse Transcriptase PCR (RT-PCR)

[0300] Of the techniques listed above, the most sensitive and most flexible quantitative method is RT-PCR, which can be used to compare mRNA levels in different sample populations, in normal and tumor tissues, with or without drug treatment, to characterize patterns of gene expression, to discriminate between closely related mRNAs, and to analyze RNA structure.

[0301] The first step is the isolation of mRNA from a target sample. The starting material is typically total RNA isolated from human tumors or tumor cell lines, and corresponding normal tissues or cell lines, respectively. Thus RNA can be isolated from a variety of primary tumors, including breast, lung, colon, prostate, brain, liver, kidney, pancreas, spleen, thymus, testis, ovary, uterus, etc., tumor, or tumor cell lines, with pooled DNA from healthy donors. If the source of mRNA is a primary tumor, mRNA can be extracted, for example, from frozen or archived paraffin-embedded and fixed (e.g. formalin-fixed) tissue samples.

[0302] General methods for mRNA extraction are well known in the art and are disclosed in standard textbooks of molecular biology, including Ausubel et al., Current Protocols of Molecular Biology, John Wiley and Sons (1997). Methods for RNA extraction from paraffin embedded tissues are disclosed, for example, in Rupp and Locker, Lab Invest. 56:A67 (1987), and De Andres et al., BioTechniques 18:42-44 (1995). In particular, RNA isolation can be performed using a purification kit, buffer set and protease from commercial manufacturers, such as Qiagen, according to the manufacturer's instructions. For example, total RNA from cells in culture can be isolated using Qiagen RNeasy mini-columns. Other commercially available RNA isolation kits include MasterPure™ Complete DNA and RNA Purification Kit (EPICENTRE®, Madison, Wis.), and Paraffin Block RNA Isolation Kit (Ambion, Inc.). Total RNA from tissue samples can be isolated using RNA Stat-60 (Tel-Test). RNA prepared from tumor can be isolated, for example, by cesium chloride density gradient centrifugation.

[0303] As RNA cannot serve as a template for PCR, the first step in gene expression profiling by RT-PCR is the reverse transcription of the RNA template into cDNA, followed by its exponential amplification in a PCR reaction. The two most commonly used reverse transcriptases are avian myeloblastosis virus reverse transcriptase (AMV-RT) and Moloney murine leukemia virus reverse transcriptase (MMLV-RT). The reverse transcription step is typically primed using specific primers, random hexamers, or oligo-dT primers, depending on the circumstances and the goal of expression profiling. For example, extracted RNA can be reverse-transcribed using a GeneAmp RNA PCR kit (Perkin Elmer, Calif., USA), following the manufacturer's instructions. The derived cDNA can then be used as a template in the subsequent PCR reaction.

[0304] Although the PCR step can use a variety of thermostable DNA-dependent DNA polymerases, it typically employs the Taq DNA polymerase, which has a 5'-3' nuclease activity but lacks a 3'-5' proofreading exonuclease activity. Thus, TaqMan® PCR typically utilizes the 5'-nuclease activity of Taq or Tth polymerase to hydrolyze a hybridization probe bound to its target amplicon, but any enzyme with equivalent 5' nuclease activity can be used. Two oligonucleotide primers are used to generate an amplicon typical of a PCR reaction. A third oligonucleotide, or probe, is designed to detect nucleotide sequence located between the two PCR primers. The probe is non-extendible by Taq DNA polymerase enzyme, and is labeled with a reporter fluorescent dye and a quencher fluorescent dye. Any laser-induced emission from the reporter dye is quenched by the quenching dye when the two dyes are located close together as they are on the probe. During the amplification reaction, the Taq DNA polymerase enzyme cleaves the probe in a template-dependent manner. The resultant probe fragments dissociate in solution, and signal from the released reporter dye is free from the quenching effect of the second fluorophore. One molecule of reporter dye is liberated for each new molecule synthesized, and detection of the unquenched reporter dye provides the basis for quantitative interpretation of the data.

[0305] TaqMan® RT-PCR can be performed using commercially available equipment, such as, for example, ABI PRISM 7700™ Sequence Detection System™ (Perkin-Elmer-Applied Biosystems, Foster City, Calif., USA), or Lightcycler (Roche Molecular Biochemicals, Mannheim, Germany). In a preferred embodiment, the 5' nuclease procedure is run on a real-time quantitative PCR device such as the ABI PRISM 7700™ Sequence Detection System™. The system consists of a thermocycler, laser, charge-coupled device (CCD) camera, and computer. The system amplifies samples in a 96-well format on a thermocycler. During amplification, laser-induced fluorescent signal is collected in real-time through fiber optics cables for all 96 wells, and detected at the CCD. The system includes software for running the instrument and for analyzing the data.

[0306] 5'-Nuclease assay data are initially expressed as Ct, or the threshold cycle. As discussed above, fluorescence values are recorded during every cycle and represent the amount of product amplified to that point in the amplification reaction. The point when the fluorescent signal is first recorded as statistically significant is the threshold cycle (Ct).
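As an illustration of how a threshold cycle might be read from per-cycle fluorescence values, the following is a minimal sketch. It assumes that "statistically significant" means exceeding the mean baseline fluorescence by a fixed number of standard deviations. The baseline window (cycles 3 to 15) and the multiplier (10) are illustrative assumptions and are not taken from this disclosure or from any particular instrument's software.

```python
from statistics import mean, stdev


def threshold_cycle(fluorescence, baseline_cycles=(3, 15), n_sd=10.0):
    """Return the first cycle (1-indexed) whose signal exceeds the threshold.

    The threshold is the baseline mean plus n_sd baseline standard deviations.
    Returns None if the signal never crosses the threshold.
    """
    start, end = baseline_cycles
    baseline = fluorescence[start - 1:end]  # convert 1-indexed cycles to a slice
    threshold = mean(baseline) + n_sd * stdev(baseline)
    for cycle, signal in enumerate(fluorescence, start=1):
        # Only cycles after the baseline window are eligible.
        if cycle > end and signal > threshold:
            return cycle
    return None


# Hypothetical curve: 15 noisy baseline cycles followed by rising signal.
example_curve = [1.0 if i % 2 == 0 else 1.1 for i in range(15)]
example_curve += [1.2, 1.4, 1.8, 2.6, 4.2]
example_ct = threshold_cycle(example_curve)
```

A lower Ct means the signal crossed the threshold earlier, which corresponds to more starting template.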

[0307] To minimize errors and the effect of sample-to-sample variation, RT-PCR is usually performed using an internal standard. The ideal internal standard is expressed at a constant level among different tissues, and is unaffected by the experimental treatment. RNAs most frequently used to normalize patterns of gene expression are mRNAs for the housekeeping genes glyceraldehyde-3-phosphate-dehydrogenase (GAPDH) and β-actin.
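Normalization against an internal standard can be sketched as a Ct difference. The sketch below assumes that the reference level is the mean Ct of one or more housekeeping genes and that amplification doubles the product each cycle (an efficiency of 2.0). Both are common simplifications rather than requirements stated here, and the function names are hypothetical.

```python
from statistics import mean


def normalized_ct(ct_target, ct_references):
    """Reference-normalized expression on the Ct (log2) scale.

    Computed as mean reference Ct minus target Ct, so larger values
    mean higher expression of the target relative to the references.
    """
    if not ct_references:
        raise ValueError("at least one reference Ct is required")
    return mean(ct_references) - ct_target


def relative_quantity(delta_ct, efficiency=2.0):
    """Convert a Ct difference to a linear-scale ratio.

    Assumes a constant per-cycle amplification factor (2.0 = perfect doubling).
    """
    return efficiency ** delta_ct


# Hypothetical Ct values: target gene at 28.0; GAPDH and beta-actin at 24.0 and 26.0.
delta = normalized_ct(28.0, [24.0, 26.0])
ratio = relative_quantity(delta)
```

Averaging several reference genes reduces the influence of any single housekeeping gene that happens to vary in a given sample.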

[0308] A more recent variation of the RT-PCR technique is the real time quantitative PCR, which measures PCR product accumulation through a dual-labeled fluorogenic probe (i.e., TaqMan® probe). Real time PCR is compatible both with quantitative competitive PCR, where an internal competitor for each target sequence is used for normalization, and with quantitative comparative PCR using a normalization gene contained within the sample, or a housekeeping gene for RT-PCR. For further details see, e.g., Heid et al., Genome Research 6:986-994 (1996).

[0309] 3. Microarrays

[0310] Differential gene expression can also be identified, or confirmed using the microarray technique. Thus, the expression profile of breast cancer-associated genes can be measured in either fresh or paraffin-embedded tumor tissue, using microarray technology. In this method, polynucleotide sequences of interest are plated, or arrayed, on a microchip substrate. The arrayed sequences are then hybridized with specific DNA probes from cells or tissues of interest. Just as in the RT-PCR method, the source of mRNA typically is total RNA isolated from human tumors or tumor cell lines, and corresponding normal tissues or cell lines. Thus RNA can be isolated from a variety of primary tumors or tumor cell lines. If the source of mRNA is a primary tumor, mRNA can be extracted, for example, from frozen or archived paraffin-embedded and fixed (e.g. formalin-fixed) tissue samples, which are routinely prepared and preserved in everyday clinical practice.

[0311] In a specific embodiment of the microarray technique, PCR amplified inserts of cDNA clones are applied to a substrate in a dense array. Preferably at least 10,000 nucleotide sequences are applied to the substrate. The microarrayed genes, immobilized on the microchip at 10,000 elements each, are suitable for hybridization under stringent conditions. Fluorescently labeled cDNA probes may be generated through incorporation of fluorescent nucleotides by reverse transcription of RNA extracted from tissues of interest. Labeled cDNA probes applied to the chip hybridize with specificity to each spot of DNA on the array. After stringent washing to remove non-specifically bound probes, the chip is scanned by confocal laser microscopy or by another detection method, such as a CCD camera. Quantitation of hybridization of each arrayed element allows for assessment of corresponding mRNA abundance. With dual color fluorescence, separately labeled cDNA probes generated from two sources of RNA are hybridized pairwise to the array. The relative abundance of the transcripts from the two sources corresponding to each specified gene is thus determined simultaneously. The miniaturized scale of the hybridization affords a convenient and rapid evaluation of the expression pattern for large numbers of genes. Such methods have been shown to have the sensitivity required to detect rare transcripts, which are expressed at a few copies per cell, and to reproducibly detect at least approximately two-fold differences in the expression levels (Schena et al., Proc. Natl. Acad. Sci. USA 93(20):10614-10619 (1996)). Microarray analysis can be performed by commercially available equipment, following manufacturer's protocols, such as by using the Affymetrix GeneChip technology, or Incyte's microarray technology.
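The dual color comparison described above can be illustrated with a minimal sketch that converts paired spot intensities to log2 ratios and flags spots that differ by at least two-fold. It assumes the intensities are already background-corrected and positive. The spot names, intensity values, and function names are hypothetical.

```python
import math


def log2_ratio(sample_signal, reference_signal):
    """Log2 ratio of two positive channel intensities for one array element."""
    if sample_signal <= 0 or reference_signal <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(sample_signal / reference_signal)


def call_spots(spots, min_fold=2.0):
    """Label each spot 'up', 'down', or 'unchanged' relative to the reference channel."""
    cutoff = math.log2(min_fold)
    calls = {}
    for name, (sample_signal, reference_signal) in spots.items():
        ratio = log2_ratio(sample_signal, reference_signal)
        if ratio >= cutoff:
            calls[name] = "up"
        elif ratio <= -cutoff:
            calls[name] = "down"
        else:
            calls[name] = "unchanged"
    return calls


# Hypothetical (sample, reference) intensities for three array elements.
example_spots = {
    "SPOT_A": (800.0, 200.0),
    "SPOT_B": (150.0, 600.0),
    "SPOT_C": (300.0, 250.0),
}
example_calls = call_spots(example_spots)
```

Working on the log2 scale makes a two-fold increase and a two-fold decrease symmetric around zero, so one cutoff serves both directions.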

[0312] The development of microarray methods for large-scale analysis of gene expression makes it possible to search systematically for molecular markers of cancer classification and outcome prediction in a variety of tumor types.

[0313] 4. Serial Analysis of Gene Expression (SAGE)

[0314] Serial analysis of gene expression (SAGE) is a method that allows the simultaneous and quantitative analysis of a large number of gene transcripts, without the need of providing an individual hybridization probe for each transcript. First, a short sequence tag (about 10-14 bp) is generated that contains sufficient information to uniquely identify a transcript, provided that the tag is obtained from a unique position within each transcript. Then, many tags are linked together to form long serial molecules that can be sequenced, revealing the identity of the multiple tags simultaneously. The expression pattern of any population of transcripts can be quantitatively evaluated by determining the abundance of individual tags, and identifying the gene corresponding to each tag. For more details see, e.g. Velculescu et al., Science 270:484-487 (1995); and Velculescu et al., Cell 88:243-51 (1997).
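The tag-counting step of SAGE can be sketched as follows. This is a deliberately simplified illustration. It assumes each sequenced serial molecule consists of fixed-length 10 bp tags separated by a four-base anchoring site (CATG is used here), and it ignores the ditag structure and sequencing errors handled in real SAGE analysis. The tag sequences and gene labels are hypothetical.

```python
from collections import Counter

ANCHOR = "CATG"  # assumed anchoring-enzyme site separating tags in this sketch


def extract_tags(serial_molecule, tag_length=10):
    """Split a sequenced serial molecule at anchor sites and keep full-length tags."""
    return [piece for piece in serial_molecule.split(ANCHOR) if len(piece) == tag_length]


def tag_abundance(serial_molecules, tag_to_gene):
    """Count how often each gene's tag occurs across all sequenced molecules."""
    counts = Counter()
    for molecule in serial_molecules:
        for tag in extract_tags(molecule):
            counts[tag_to_gene.get(tag, "unknown")] += 1
    return counts


# Hypothetical tags and tag-to-gene lookup.
TAG_A, TAG_B, TAG_C = "A" * 10, "G" * 10, "T" * 10
lookup = {TAG_A: "gene_A", TAG_B: "gene_B"}
reads = [
    ANCHOR + TAG_A + ANCHOR + TAG_B + ANCHOR + TAG_A + ANCHOR,
    ANCHOR + TAG_C + ANCHOR + TAG_A + ANCHOR,
]
example_counts = tag_abundance(reads, lookup)
```

The resulting counts are proportional to transcript abundance only to the extent that each tag maps to a single gene, which is the uniqueness condition stated in the text.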

[0315] 5. Gene Expression Analysis by Massively Parallel Signature Sequencing (MPSS)

[0316] This method, described by Brenner et al., Nature Biotechnology 18:630-634 (2000), is a sequencing approach that combines non-gel-based signature sequencing with in vitro cloning of millions of templates on separate 5 µm diameter microbeads. First, a microbead library of DNA templates is constructed by in vitro cloning. This is followed by the assembly of a planar array of the template-containing microbeads in a flow cell at a high density (typically greater than 3×10⁶ microbeads/cm²). The free ends of the cloned templates on each microbead are analyzed simultaneously, using a fluorescence-based signature sequencing method that does not require DNA fragment separation. This method has been shown to simultaneously and accurately provide, in a single operation, hundreds of thousands of gene signature sequences from a yeast cDNA library.

[0317] 6. General Description of the mRNA Isolation, Purification and Amplification Methods of the Invention

[0318] The steps of a representative protocol of the invention, including mRNA isolation, purification, primer extension and amplification are illustrated in FIG. 1. As shown in FIG. 1, this representative process starts with cutting about 10 µm thick sections of paraffin-embedded tumor tissue samples. The RNA is then extracted, and protein and DNA are removed, following the method of the invention described below. After analysis of the RNA concentration, RNA repair and/or amplification steps may be included, if necessary, and RNA is reverse transcribed using gene specific primers followed by RT-PCR. Finally, the data are analyzed to identify the best treatment option(s) available to the patient on the basis of the characteristic gene expression pattern identified in the tumor sample examined. The individual steps of this protocol will be discussed in greater detail below.

[0319] 7. Improved Method for Isolation of Nucleic Acid from Archived Tissue Specimens

[0320] As discussed above, in the first step of the method of the invention, total RNA is extracted from the source material of interest, including fixed, paraffin-embedded tissue specimens, and purified sufficiently to act as a substrate in an enzyme assay. Despite the availability of commercial products, and the extensive knowledge available concerning the isolation of nucleic acid, such as RNA, from tissues, isolation of nucleic acid (RNA) from fixed, paraffin-embedded tissue specimens (FPET) is not without difficulty.

[0321] In one aspect, the present invention concerns an improved method for the isolation of nucleic acid from archived, e.g. FPET tissue specimens. Measured levels of mRNA species are useful for defining the physiological or pathological status of cells and tissues. RT-PCR (which is discussed above) is one of the most sensitive, reproducible and quantitative methods for this "gene expression profiling". Paraffin-embedded, formalin-fixed tissue is the most widely available material for such studies. Several laboratories have demonstrated that it is possible to successfully use fixed-paraffin-embedded tissue (FPET) as a source of RNA for RT-PCR (Stanta et al., Biotechniques 11:304-308 (1991); Stanta et al., Methods Mol. Biol. 86:23-26 (1998); Jackson et al., Lancet 1:1391 (1989); Jackson et al., J. Clin. Pathol. 43:499-504 (1999); Finke et al., Biotechniques 14:448-453 (1993); Goldsworthy et al., Mol. Carcinog. 25:86-91 (1999); Stanta and Bonin, Biotechniques 24:271-276 (1998); Godfrey et al., J. Mol. Diagnostics 2:84 (2000); Specht et al., J. Mol. Med. 78:B27 (2000); Specht et al., Am. J. Pathol. 158:419-429 (2001)). This allows gene expression profiling to be carried out on the most commonly available source of human biopsy specimens, and therefore potentially to create new valuable diagnostic and therapeutic information.

[0322] The most widely used protocols utilize hazardous organic solvents, such as xylene or octane (Finke et al., supra), to dewax the tissue in the paraffin blocks before nucleic acid (RNA and/or DNA) extraction. Obligatory organic solvent removal (e.g. with ethanol) and rehydration steps follow, which necessitate multiple manipulations and add substantial time to the protocol, which can take up to several days. Commercial kits and protocols for RNA extraction from FPET [MasterPure™ Complete DNA and RNA Purification Kit (EPICENTRE®, Madison, Wis.); Paraffin Block RNA Isolation Kit (Ambion, Inc.) and RNeasy™ Mini kit (Qiagen, Chatsworth, Calif.)] use xylene for deparaffinization, in procedures which typically require multiple centrifugations, ethanol buffer changes, and further incubations after the incubation with xylene.

[0323] The present invention provides an improved nucleic acid extraction protocol that produces nucleic acid, in particular RNA, sufficiently intact for gene expression measurements. The key step in the nucleic acid extraction protocol herein is the performance of dewaxing without the use of any organic solvent, thereby eliminating the need for multiple manipulations associated with the removal of the organic solvent, and substantially reducing the total time of the protocol. According to the invention, wax, e.g. paraffin, is removed from wax-embedded tissue samples by incubation at 65-75 °C in a lysis buffer that solubilizes the tissue and hydrolyzes the protein, followed by cooling to solidify the wax.

[0324] FIG. 2 shows a flow chart of an RNA extraction protocol of the present invention in comparison with a representative commercial method, using xylene to remove wax. The times required for individual steps in the processes and for the overall processes are shown in the chart. As shown, the commercial process requires approximately 50% more time than the process of the invention.

[0325] The lysis buffer can be any buffer known for cell lysis. It is, however, preferred that oligo-dT-based methods of selectively purifying polyadenylated mRNA not be used to isolate RNA for the present invention, since the bulk of the mRNA molecules are expected to be fragmented and therefore will not have an intact polyadenylated tail, and will not be recovered or available for subsequent analytical assays. Otherwise, any number of standard nucleic acid purification schemes can be used. These include chaotrope and organic solvent extractions, extraction using glass beads or filters, salting out and precipitation based methods, or any of the purification methods known in the art to recover total RNA or total nucleic acids from a biological source.

[0326] Lysis buffers are commercially available, such as, for example, from Qiagen, Epicentre, or Ambion. A preferred group of lysis buffers typically contains urea, and Proteinase K or other protease. Proteinase K is very useful in the isolation of high quality, undamaged DNA or RNA, since most mammalian DNases and RNases are rapidly inactivated by this enzyme, especially in the presence of 0.5-1% sodium dodecyl sulfate (SDS). This is particularly important in the case of RNA, which is more susceptible to degradation than DNA. While DNases require metal ions for activity, and can therefore be easily inactivated by chelating agents, such as EDTA, there is no similar co-factor requirement for RNases.

[0327] Cooling and resultant solidification of the wax permits easy separation of the wax from the total nucleic acid, which can be conveniently precipitated, e.g. by isopropanol. Further processing depends on the intended purpose. If the proposed method of RNA analysis is subject to bias by contaminating DNA in an extract, the RNA extract can be further treated, e.g. by DNase, post purification to specifically remove DNA while preserving RNA. For example, if the goal is to isolate high quality RNA for subsequent RT-PCR amplification, nucleic acid precipitation is followed by the removal of DNA, usually by DNase treatment. However, DNA can be removed at various stages of nucleic acid isolation, by DNase or other techniques well known in the art.

[0328] While the advantages of the nucleic acid extraction protocol of the invention are most apparent for the isolation of RNA from archived, paraffin embedded tissue samples, the wax removal step of the present invention, which does not involve the use of an organic solvent, can also be included in any conventional protocol for the extraction of total nucleic acid (RNA and DNA) or DNA only. All of these aspects are specifically within the scope of the invention.

[0329] By using heat followed by cooling to remove paraffin, the process of the present invention saves valuable processing time, and eliminates a series of manipulations, thereby potentially increasing the yield of nucleic acid. Indeed, experimental evidence presented in the examples below, demonstrates that the method of the present invention does not compromise RNA yield.

[0330] 8. 5'-Multiplexed Gene Specific Priming of Reverse Transcription

[0331] RT-PCR requires reverse transcription of the test RNA population as a first step. The most commonly used primer for reverse transcription is oligo-dT, which works well when RNA is intact. However, this primer will not be effective when RNA is highly fragmented as is the case in FPE tissues.

[0332] The present invention includes the use of gene specific primers, which are roughly 20 bases in length with a Tm optimum between about 58 °C and 60 °C. These primers will also serve as the reverse primers that drive PCR DNA amplification.
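As a rough illustration of screening candidate primers against the Tm window stated above, the following sketch uses a simple GC-content approximation, Tm = 64.9 + 41(G + C - 16.4)/N. This formula is an assumption chosen for brevity and is not the calculation specified by the invention; primer design software typically uses nearest-neighbor thermodynamic models, which give different values. The function names and the test sequences are hypothetical.

```python
def estimate_tm(primer):
    """Approximate melting temperature (deg C) from GC content.

    Assumed formula: 64.9 + 41 * (G + C - 16.4) / N, a coarse
    estimate for oligonucleotides longer than about 13 bases.
    """
    seq = primer.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)

def in_tm_window(primer, low=58.0, high=60.0):
    """True if the estimated Tm lies within the preferred window."""
    return low <= estimate_tm(primer) <= high
```

Under this approximation a 20-base primer needs about 14 G or C bases to fall inside the 58-60 °C window.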

[0333] Another aspect of the invention is the inclusion of multiple gene-specific primers in the same reaction mixture. The number of such different primers can vary greatly and can be as low as two and as high as 40,000 or more. Table 2 displays examples of reverse primers that can be successfully used in carrying out the methods of the invention. FIG. 9 shows expression data obtained using this multiplexed gene-specific priming strategy. Specifically, FIG. 9 is a representation of the expression of 92 genes (a subset of genes listed in Table 1) across 70 FPE breast cancer specimens. The y-axis shows expression as cycle threshold times.

[0334] An alternative approach is based on the use of random hexamers as primers for cDNA synthesis. However, we have experimentally demonstrated that the method of using a multiplicity of gene-specific primers is superior to the known approach using random hexamers.

[0335] 9. Preparation of Fragmented mRNA for Expression Profiling Assays

[0336] It is of interest to analyze the abundance of specific mRNA species in biological samples, since this expression profile provides an index of the physiological state of that sample. mRNA is notoriously difficult to extract and maintain in its native state; consequently, mRNA recovered from biological sources is often fragmented or somewhat degraded. This is especially true of human tissue specimens that have been chemically fixed and stored for extended periods of time.

[0337] In one aspect, the present invention provides a means of preparing the mRNA extracted from various sources, including archived tissue specimens, for expression profiling in a way that its relative abundance is preserved and the mRNAs of interest can be successfully measured. This method is useful as a means of preparing mRNA for analysis by any of the known expression profiling methods, including RT-PCR coupled with 5' exonuclease cleavage of reporter probes (TaqMan® type assays), as discussed above, flap endonuclease assays (Cleavase® and Invader® type assays), oligonucleotide hybridization arrays, cDNA hybridization arrays, oligonucleotide ligation assays, 3' single nucleotide extension assays and other assays designed to assess the abundance of specific mRNA sequences in a biological sample.

[0338] According to the method of the invention, total RNA is extracted from the source material and sufficiently purified to act as a substrate in an enzyme assay. The extraction procedure, including a new and improved way of removing the wax (e.g. paraffin) used for embedding the tissue samples, has been discussed above. It has also been noted that it is preferred that oligo-dT based methods of selectively purifying polyadenylated mRNA not be used to isolate RNA for this invention since the bulk of the mRNA is expected to be fragmented, will not be polyadenylated and, therefore, will not be recovered and available for subsequent analytical assays if an oligo-dT based method is used.

[0339] A diagram of an improved method for repairing fragmented RNA is shown in FIG. 3. The fragmented RNA purified from the tissue sample is mixed with universal or gene-specific, single-stranded, DNA templates for each mRNA species of interest. These templates may be full length DNA copies of the mRNA derived from cloned gene sources, fragments of the gene representing only the segment of the gene to be assayed, or a series of long oligonucleotides representing either the full length gene or the specific segment(s) of interest. The template can represent either a single consensus sequence or be a mixture of polymorphic variants of the gene. This DNA template, or scaffold, will preferably include one or more dUTP or rNTP sites in its length. This will provide a means of removing the template prior to carrying out subsequent analytical steps to avoid its acting as a substrate or target in later analysis assays. This removal is accomplished by treating the sample with uracil-DNA glycosylase (UDG) and heating it to cause strand breaks where UDG has generated abasic sites. In the case of rNTPs, the sample can be heated in the presence of a basic buffer (pH ~10) to induce strand breaks where rNTPs are located in the template.

[0340] The single stranded DNA template is mixed with the purified RNA, the mixture is denatured and annealed so that the RNA fragments complementary to the DNA template effectively become primers that can be extended along the single stranded DNA templates. DNA polymerase I requires a primer for extension but will efficiently use either a DNA or an RNA primer. Therefore in the presence of DNA polymerase I and dNTP's, the fragmented RNA can be extended along the complementary DNA templates. In order to increase the efficiency of the extension, this reaction can be thermally cycled, allowing overlapping templates and extension products to hybridize and extend until the overall population of fragmented RNA becomes represented as double stranded DNA extended from RNA fragment primers.

[0341] Following the generation of this "repaired" RNA, the sample should be treated with UDG or heat-treated in a mildly basic solution to fragment the DNA template (scaffold) and prevent it from participating in subsequent analytical reactions.

[0342] The product resulting from this enzyme extension can then be used as a template in a standard enzyme profiling assay that includes amplification and detectable signal generation such as fluorescent, chemiluminescent, colorimetric or other common read outs from enzyme based assays. For example, for TaqMan® type assays, this double stranded DNA product is added as the template in a standard assay; and, for array hybridization, this product acts as the cDNA template for the cRNA labeling reaction typically used to generate single-stranded, labeled RNA for array hybridization.

[0343] This method of preparing template has the advantage of recovering information from mRNA fragments too short to effectively act as templates in standard cDNA generation schemes. In addition, this method acts to preserve the specific locations in mRNA sequences targeted by specific analysis assays. For example, TaqMan® assays rely on a single contiguous sequence in a cDNA copy of mRNA to act as a PCR amplification template targeted by a labeled reporter probe. If mRNA strand breaks occur in this sequence, the assay will not detect that template and will underestimate the quantity of that mRNA in the original sample. This target preparation method minimizes that effect from RNA fragmentation.

[0344] The extension product formed in the RNA primer extension assay can be controlled by controlling the input quantity of the single stranded DNA template and by doing limited cycling of the extension reaction. This is important in preserving the relative abundance of the mRNA sequences targeted for analysis.

[0345] This method has the added advantage of not requiring parallel preparation for each target sequence since it is easily multiplexed. It is also possible to use large pools of random sequence long oligonucleotides or full libraries of cloned sequences to extend the entire population of mRNA sequences in the sample extract for whole expressed genome analysis rather than targeted gene specific analysis.

[0346] 10. Amplification of mRNA Species Prior to RT-PCR

[0347] Due to the limited amount and poor quality of mRNA that can be isolated from FPET, a new procedure that could accurately amplify mRNAs of interest would be very useful, particularly for real time quantitation of gene expression (TaqMan®) and especially for quantitation of a large number of genes (>50 to 10,000).

[0348] Current protocols (e.g. Eberwine, Biotechniques 20:584-91 (1996)) are optimized for mRNA amplification from small amounts of total or poly A⁺ RNA, mainly for microarray analysis. The present invention provides a protocol optimized for amplification of small amounts of fragmented total RNA (average size about 60-150 bps), utilizing gene-specific sequences as primers, as illustrated in FIG. 4.

[0349] The amplification procedure of the invention uses a very large number, typically as many as 100-190,000 gene specific primers (GSPs) in one reverse transcription run. Each GSP contains an RNA polymerase promoter, e.g. a T7 DNA-dependent RNA polymerase promoter, at the 5' end for subsequent RNA amplification. GSPs are preferred as primers because of the small size of the RNA. Current protocols utilize dT primers, which would not adequately represent all reverse transcripts of mRNAs due to the small size of the FPET RNA. GSPs can be designed by optimizing usual parameters, such as length, Tm, etc. For example, GSPs can be designed using the Primer Express® (Applied Biosystems), or Primer 3 (MIT) software program. Typically at least 3 sets per gene are designed, and the ones giving the lowest Ct on FPET RNA (best performers) are selected.
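The selection of the best-performing primer set per gene, as described above, might be sketched as follows. The data layout, gene names and set labels are hypothetical, and the sketch assumes each candidate set has one representative Ct value measured on FPET RNA.

```python
def select_best_primer_sets(ct_by_gene):
    """Pick, for each gene, the candidate primer set with the lowest Ct.

    ct_by_gene -- hypothetical mapping: gene -> {primer set label: Ct}
    """
    best = {}
    for gene, ct_by_set in ct_by_gene.items():
        # A lower Ct means the signal crossed threshold earlier.
        best[gene] = min(ct_by_set, key=ct_by_set.get)
    return best
```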

[0350] Second strand cDNA synthesis is performed by standard procedures (see FIG. 4, Method 1), or by GSP_f primers and Taq pol under PCR conditions (e.g., 95 °C, 10 min (Taq activation) then 60 °C, 45 sec). The advantages of the latter method are that the second gene specific primer, GSP_f, adds additional specificity (and potentially more efficient second strand synthesis) and provides the option of performing several cycles of PCR, if more starting DNA is necessary for RNA amplification by T7 RNA polymerase. RNA amplification is then performed under standard conditions to generate multiple copies of cRNA, which is then used in a standard TaqMan® reaction.

[0351] Although this process is illustrated by using T7-based RNA amplification, a person skilled in the art will understand that other RNA polymerase promoters that do not require a primer, such as T3 or Sp6 can also be used, and are within the scope of the invention.

[0352] 11. A Method of Elongation of Fragmented RNA and Subsequent Amplification

[0353] This method, which combines and modifies the inventions described in sections 9 and 10 above, is illustrated in FIG. 5. The procedure begins with elongation of fragmented mRNA. This occurs as described above except that the scaffold DNAs are tagged with the T7 RNA polymerase promoter sequence at their 5' ends, leading to double-stranded DNA extended from RNA fragments. The template sequences need to be removed after in vitro transcription. These templates can include dUTP or rNTP nucleotides, enabling enzymatic removal of the templates as described in section 9, or the templates can be removed by DNaseI treatment.

[0354] The template DNA can be a population representing different mRNAs of any number. A high sequence complexity source of DNA templates (scaffolds) can be generated by pooling RNA from a variety of cells or tissues. In one embodiment, these RNAs are converted into double stranded DNA and cloned into phagemids. Single stranded DNA can then be rescued by phagemid growth and single stranded DNA isolation from purified phagemids.

[0355] This invention is useful because it increases gene expression profile signals two different ways: both by increasing test mRNA polynucleotide sequence length and by in vitro transcription amplification. An additional advantage is that it eliminates the need to carry out reverse transcription optimization with gene specific primers tagged with the T7 RNA polymerase promoter sequence, and thus, is comparatively fast and economical.

[0356] This invention can be used with a variety of different methods to profile gene expression, e.g., RT-PCR or a variety of DNA array methods. Just as in the previous protocol, this approach is illustrated by using a T7 promoter but the invention is not so limited. A person skilled in the art will appreciate, however, that other RNA polymerase promoters, such as T3 or Sp6 can also be used.

[0357] 12. Breast Cancer Gene Set, Assayed Gene Subsequences, and Clinical Application of Gene Expression Data

[0358] An important aspect of the present invention is to use the measured expression of certain genes by breast cancer tissue to match patients to best drugs or drug combinations, and to provide prognostic information. For this purpose it is necessary to correct for (normalize away) both differences in the amount of RNA assayed and variability in the quality of the RNA used. Therefore, the assay measures and incorporates the expression of certain normalizing genes, including well known housekeeping genes, such as GAPDH and Cyp1. Alternatively, normalization can be based on the mean or median signal (Ct) of all of the assayed genes or a large subset thereof (global normalization approach). On a gene-by-gene basis, measured normalized amount of a patient tumor mRNA is compared to the amount found in a breast cancer tissue reference set. The number (N) of breast cancer tissues in this reference set should be sufficiently high to ensure that different reference sets (as a whole) behave essentially the same way. If this condition is met, the identity of the individual breast cancer tissues present in a particular set will have no significant impact on the relative amounts of the genes assayed. Usually, the breast cancer tissue reference set consists of at least about 30, preferably at least about 40 different FPE breast cancer tissue specimens. Unless noted otherwise, normalized expression levels for each mRNA/tested tumor/patient will be expressed as a percentage of the expression level measured in the reference set. More specifically, the reference set of a sufficiently high number (e.g. 40) tumors yields a distribution of normalized levels of each mRNA species. The level measured in a particular tumor sample to be analyzed falls at some percentile within this range, which can be determined by methods well known in the art. 
Below, unless noted otherwise, references to expression levels of a gene assume normalized expression relative to the reference set, although this is not always explicitly stated.
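The normalization and reference-set comparison described above can be illustrated with a minimal sketch. It assumes a simple delta-Ct scheme (mean Ct of the normalizing genes minus the Ct of the gene of interest) and a simple rank-based percentile; neither is stated to be the exact computation used in the invention, and the function names and example values are hypothetical.

```python
from statistics import mean

def normalize_ct(ct, gene, reference_genes):
    """Reference-normalized expression on a log2-like scale.

    Assumed delta-Ct scheme: mean Ct of the normalizing genes minus
    the Ct of the gene of interest, so higher values mean more mRNA.
    """
    reference_ct = mean(ct[g] for g in reference_genes)
    return reference_ct - ct[gene]

def percentile_in_reference(value, reference_values):
    """Percent of reference-set tumors at or below the given value."""
    at_or_below = sum(1 for v in reference_values if v <= value)
    return 100.0 * at_or_below / len(reference_values)
```

In practice the reference list would hold the normalized values of the same gene measured in the 30 to 40 or more reference specimens.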

[0359] The breast cancer gene set is shown in Table 1. The gene Accession Numbers, and the SEQ ID NOs for the forward primer, reverse primer and amplicon sequences that can be used for gene amplification, are listed in Table 2. The basis for inclusion of markers, as well as the clinical significance of mRNA level variations with respect to the reference set, is indicated below. Genes are grouped into subsets based on the type of clinical significance indicated by their expression levels: A. Prediction of patient response to drugs used in breast cancer treatment, or to drugs that are approved for other indications and could be used off-label in the treatment of breast cancer. B. Prognostic for survival or recurrence of cancer.

C. Prediction of Patient Response to Therapeutic Drugs

[0360] 1. Molecules that Specifically Influence Cellular Sensitivity to Drugs

[0361] Table 1 lists 74 genes (shown in italics) that specifically influence cellular sensitivity to potent drugs, which are also listed. Most of the drugs shown are approved and already used to treat breast cancer (e.g., anthracyclines; cyclophosphamide; methotrexate; 5-FU and analogues). Several of the drugs are used to treat breast cancer off-label or are in clinical development phase (e.g., bisphosphonates and anti-VEGF mAb). Several of the drugs have not been widely used to treat breast cancer but are used in other cancers in which the indicated target is expressed (e.g., Celebrex is used to treat familial colon cancer; cisplatin is used to treat ovarian and other cancers.)

[0362] Patient response to 5-FU is indicated if normalized thymidylate synthase mRNA amount is at or below the 15th percentile, or the sum of expression of thymidylate synthase plus dihydropyrimidine phosphorylase is at or below the 25th percentile, or the sum of expression of these mRNAs plus thymidine phosphorylase is at or below the 20th percentile. Patients with dihydropyrimidine dehydrogenase below the 5th percentile are at risk of adverse response to 5-FU, or analogs such as Xeloda.
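The percentile cutoffs in the preceding paragraph can be expressed as a small rule check. This is a sketch only: the function name and dictionary keys are hypothetical labels for the quantities named in the text, and the input percentiles are assumed to have been computed beforehand against the reference set.

```python
def five_fu_assessment(pct):
    """Apply the stated 5-FU percentile cutoffs.

    pct -- hypothetical mapping from marker label to percentile (0-100)
           within the breast cancer tissue reference set.
    """
    response_indicated = (
        pct["thymidylate_synthase"] <= 15
        or pct["ts_plus_dihydropyrimidine_phosphorylase"] <= 25
        or pct["ts_dp_plus_thymidine_phosphorylase"] <= 20
    )
    # A separate flag: very low dihydropyrimidine dehydrogenase.
    adverse_risk = pct["dihydropyrimidine_dehydrogenase"] < 5
    return {"response_indicated": response_indicated,
            "adverse_risk": adverse_risk}
```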

[0363] When levels of thymidylate synthase and dihydropyrimidine dehydrogenase are within the acceptable range as defined in the preceding paragraph, amplification of c-myc mRNA in the upper 15%, against a background of wild-type p53 [as defined below], predicts a beneficial response to 5-FU (see D. Arango et al., Cancer Res. 61:4910-4915 (2001)). In the presence of normal levels of thymidylate synthase and dihydropyrimidine dehydrogenase, levels of NFκB and cIAP2 in the upper 10% indicate resistance of breast tumors to the chemotherapeutic drug 5-FU.

[0364] Patient resistance to anthracyclines is indicated if the normalized mRNA level of topoisomerase IIα is below the 10th percentile, or if the topoisomerase IIβ normalized mRNA level is below the 10th percentile, or if the combined normalized topoisomerase IIα and β signals are below the 10th percentile.

[0365] Patient sensitivity to methotrexate is compromised if DHFR levels are more than tenfold higher than the average reference set level for this mRNA species, or if reduced folate carrier levels are below the 10th percentile.

[0366] Patients whose tumors express CYP1B1 in the upper 10% have reduced likelihood of responding to docetaxel.

[0367] The sum of signals for aldehyde dehydrogenase 1A1 and 1A3, when more than tenfold higher than the reference set average, indicates reduced likelihood of response to cyclophosphamide.

[0368] Currently, estrogen and progesterone receptor expression as measured by immunohistochemistry is used to select patients for anti-estrogen therapy. We have demonstrated RT-PCR assays for estrogen and progesterone receptor mRNA levels that predict levels of these proteins as determined by standard clinical diagnostic tests, with a high degree of concordance (FIGS. 6 and 7).

[0369] Patients whose tumors express ERα or PR mRNA in the upper 70% are likely to respond to tamoxifen or other anti-estrogens (thus, operationally, lower levels of ERα than this are to be defined as ERα-negative). However, when the signal for microsomal epoxide hydrolase is in the upper 10% or when mRNAs for pS2/trefoil factor, GATA3 or human chorionic gonadotropin are at or below average levels found in ERα-negative tumors, anti-estrogen therapy will not be beneficial.

[0370] Absence of XIST signal compromises the likelihood of response to taxanes, as does elevation of the GST-π or prolyl endopeptidase [PREP] signal in the upper 10%. Elevation of PLAG1 in the upper 10% decreases sensitivity to taxanes.

[0371] Expression of ERCC1 mRNA in the upper 10% indicates significant risk of resistance to cisplatin or analogs.

[0372] An RT-PCR assay of Her2 mRNA expression predicts Her2 overexpression as measured by a standard diagnostic test, with a high degree of concordance (data not shown). Patients whose tumors express Her2 (normalized to Cyp1) in the upper 10% have increased likelihood of beneficial response to treatment with Herceptin or other ErbB2 antagonists. Measurement of expression of Grb7 mRNA serves as a test for HER2 gene amplification, because the Grb7 gene is closely linked to Her2. When Her2 expression is high as defined above in this paragraph, similarly elevated Grb7 indicates Her2 gene amplification. Overexpression of IGF1R and/or IGF1 or IGF2 decreases likelihood of beneficial response to Herceptin and also to EGFR antagonists.

[0373] Patients whose tumors express mutant Ha-Ras, and also express farnesyl pyrophosphate synthetase or geranyl pyrophosphate synthetase mRNAs at levels above the tenth percentile, comprise a group that is especially likely to exhibit a beneficial response to bisphosphonate drugs.

[0374] Cox2 is a key control enzyme in the synthesis of prostaglandins. It is frequently expressed at elevated levels in subsets of various types of carcinomas including carcinoma of the breast. Expression of this gene is controlled at the transcriptional level, so RT-PCR serves as a valid indicator of the cellular enzyme activity. Nonclinical research has shown that cox2 promotes tumor angiogenesis, suggesting that this enzyme is a promising drug target in solid tumors. Several Cox2 antagonists are marketed products for use in inflammatory conditions. Treatment of familial adenomatous polyposis patients with the cox2 inhibitor Celebrex significantly decreased the number and size of neoplastic polyps. No cox2 inhibitor has yet been approved for treatment of breast cancer, but generally this class of drugs is safe and could be prescribed off-label in breast cancers in which cox2 is over-expressed. Tumors expressing COX2 at levels in the upper ten percentile have increased chance of beneficial response to Celebrex or other cyclooxygenase 2 inhibitors.

[0375] The tyrosine kinases ErbB1 [EGFR], ErbB3 [Her3] and ErbB4 [Her4]; also the ligands TGFalpha, amphiregulin, heparin-binding EGF-like growth factor, and epiregulin; also BRK, a non-receptor kinase. Several drugs in clinical development block the EGF receptor. ErbB2-4, the indicated ligands, and BRK also increase the activity of the EGFR pathway. Breast cancer patients whose tumors express high levels of EGFR or EGFR and abnormally high levels of the other indicated activators of the EGFR pathway are potential candidates for treatment with an EGFR antagonist.

[0376] Patients whose tumors express less than 10% of the average level of EGFR mRNA observed in the reference panel are relatively less likely to respond to EGFR antagonists [such as Iressa, or ImClone 225]. In cases in which the EGFR is above this low range, the additional presence of epiregulin, TGFα, amphiregulin, or ErbB3, or BRK, CD9, MMP9, or Lot1 at levels above the 90th percentile predisposes to response to EGFR antagonists. Epiregulin gene expression, in particular, is a good surrogate marker for EGFR activation, and can be used not only to predict response to EGFR antagonists, but also to monitor response to EGFR antagonists [taking fine needle biopsies to provide tumor tissue during treatment]. Levels of CD82 above the 90th percentile suggest poorer efficacy from EGFR antagonists.
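The EGFR criteria in the preceding paragraph combine a cutoff relative to the reference-set average with percentile cutoffs, and may be sketched as follows. The function name, argument names and returned keys are hypothetical. The sketch assumes the EGFR level is supplied on a linear scale as a fraction of the reference-panel average and that the other markers are supplied as percentiles within the reference set.

```python
ACTIVATORS = ("epiregulin", "TGFalpha", "amphiregulin", "ErbB3",
              "BRK", "CD9", "MMP9", "Lot1")

def egfr_antagonist_flags(egfr_fraction_of_average, pct):
    """Summarize the stated EGFR-antagonist indicators.

    egfr_fraction_of_average -- EGFR mRNA level divided by the reference
                                panel average (assumed linear scale)
    pct -- hypothetical mapping from marker name to percentile (0-100);
           markers that were not measured may be omitted
    """
    low_egfr = egfr_fraction_of_average < 0.10
    # Activators only count when EGFR is above the low range.
    activator_support = (not low_egfr) and any(
        pct.get(marker, 0.0) > 90 for marker in ACTIVATORS
    )
    cd82_high = pct.get("CD82", 0.0) > 90
    return {"low_egfr": low_egfr,
            "activator_support": activator_support,
            "cd82_high": cd82_high}
```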

[0377] The tyrosine kinases abl, c-kit, PDGFRalpha, PDGFbeta, and ARG; also, the signal transmitting ligands c-kit ligand, PDGFA, B, C and D. The listed tyrosine kinases are all targets of the drug Gleevec™ (imatinib mesylate, Novartis), and the listed ligands stimulate one or more of the listed tyrosine kinases. In the two indications for which Gleevec™ is approved, tyrosine kinase targets (bcr-abl and ckit) are overexpressed and also contain activating mutations. A finding that one of the Gleevec™ target tyrosine kinases is expressed in breast cancer tissue will prompt a second stage of analysis wherein the gene will be sequenced to determine whether it is mutated. That a mutation found is an activating mutation can be proved by methods known in the art, such as, for example, by measuring kinase enzyme activity or by measuring phosphorylation status of the particular kinase, relative to the corresponding wild-type kinase. Breast cancer patients whose tumors express high levels of mRNAs encoding Gleevec™ target tyrosine kinases, specifically, in the upper ten percentile, or mRNAs for Gleevec™ target tyrosine kinases in the average range and mRNAs for their cognate growth stimulating ligands in the upper ten percentile, are particularly good candidates for treatment with Gleevec™.

[0378] VEGF is a potent and pathologically important angiogenic factor. (See below under Prognostic Indicators.) When VEGF mRNA levels are in the upper ten percentile, aggressive treatment is warranted. Such levels particularly suggest the value of treatment with anti-angiogenic drugs, including VEGF antagonists, such as anti-VEGF antibodies. Additionally, KDR or CD31 mRNA level in the upper 20 percentile further increases likelihood of benefit from VEGF antagonists.

[0379] Farnesyl pyrophosphatase synthetase and geranyl geranyl pyrophosphatase synthetase. These enzymes are targets of commercialized bisphosphonate drugs, which were developed originally for treatment of osteoporosis but which physicians have recently begun to prescribe off-label in breast cancer. Elevated levels of mRNAs encoding these enzymes in breast cancer tissue, above the 90.sup.th percentile, suggest use of bisphosphonates as a treatment option.

[0380] 2. Multidrug Resistance Factors

[0381] These factors include 10 Genes: gamma glutamyl cysteine synthetase [GCS]; GST-.alpha.; GST-.pi.; MDR-1; MRP1-4; breast cancer resistance protein [BCRP]; lung resistance protein [MVP]; SXR; YB-1.

[0382] GCS and both GST-.alpha. and GST-.pi. regulate glutathione levels, which decrease cellular sensitivity to chemotherapeutic drugs and other toxins by reductive derivatization. Glutathione is a necessary cofactor for multi-drug resistant pumps, MDR-1 and the MRPs. MDR1 and MRPs function to actively transport out of cells several important chemotherapeutic drugs used in breast cancer.

[0383] GSTs, MDR-1, and MRP-1 have all been studied extensively to determine whether they have prognostic or predictive significance in human cancer. However, a great deal of disagreement exists in the literature with respect to these questions. Recently, new members of the MRP family have been identified: MRP-2, MRP-3, MRP-4, BCRP, and lung resistance protein [major vault protein]. These have substrate specificities that overlap with those of MDR-1 and MRP-1. The incorporation of all of these relevant ABC family members as well as glutathione synthetic enzymes into the present invention captures the contribution of this family to drug resistance, in a way that single or double analyte assays cannot.

[0384] MRP-1, the gene coding for the multidrug resistance protein.

[0385] P-glycoprotein, is not regulated primarily at the transcriptional level. However, p-glycoprotein stimulates the transcription of PTP1b. An embodiment of the present invention is the use of the level of the mRNA for the phosphatase PTP1b as a surrogate measure of MRP-1/p-glycoprotein activity.

[0386] The gene SXR is also an activator of multidrug resistance, as it stimulates transcription of certain multidrug resistance factors.

[0387] The impact of multidrug resistance factors with respect to chemotherapeutic agents used in breast cancer is as follows. Beneficial response to doxorubicin is compromised when the mRNA levels of either MDR1, GST.alpha., GST.pi., SXR, BCRP, YB-1, or LRP/MVP are in the upper four percentile. Beneficial response to methotrexate is inhibited if mRNA levels of any of MRP1, MRP2, MRP3, or MRP4 or gamma-glutamyl cysteine synthetase are in the upper four percentile.

[0388] 3. Eukaryotic Translation Initiation Factor 4E [EIF4E]

[0389] EIF4E mRNA levels provide evidence of protein expression and so expand the capability of RT-PCR to indicate variation in gene expression. Thus, one claim of the present invention is the use of EIF4E as an added indicator of gene expression of certain genes [e.g., cyclinD1, mdm2, VEGF, and others]. For example, in two tissue specimens containing the same amount of normalized VEGF mRNA, it is likely that the tissue containing the higher normalized level of EIF4E exhibits the greater level of VEGF gene expression.

[0390] The background is as follows. A key point in the regulation of mRNA translation is selection of mRNAs by the EIF4G complex to bind to the 43S ribosomal subunit. The protein EIF4E [the m7G CAP-binding protein] is often limiting because more mRNAs than EIF4E copies exist in cells. Highly structured 5'UTRs or highly GC-rich ones are inefficiently translated, and these often code for genes that carry out functions relevant to cancer [e.g., cyclinD1, mdm2, and VEGF]. EIF4E is itself regulated at the transcriptional/mRNA level. Thus, expression of EIF4E provides added indication of increased activity of a number of proteins.

[0391] It is also noteworthy that overexpression of EIF4E transforms cultured cells, and hence is an oncogene. Overexpression of EIF4E occurs in several different types of carcinomas but is particularly significant in breast cancer. EIF4E is typically expressed at very low levels in normal breast tissue.

D. Prognostic Indicators

[0392] 1. DNA Repair Enzymes

[0393] Loss of BRCA1 or BRCA2 activity via mutation represents the critical oncogenic step in the most common type[s] of familial breast cancer. The levels of mRNAs of these important enzymes are abnormal in subsets of sporadic breast cancer as well. Loss of signals from either [to within the lower ten percentile] heightens risk of short survival.

[0394] 2. Cell Cycle Regulators

[0395] Cell cycle regulators include 14 genes: c-MYC; c-Src; Cyclin D1; Ha-Ras; mdm2; p14ARF; p21WAF1/CIP; p16INK4a/p14; p23; p27; p53; PI3K; PKC-epsilon; PKC-delta.

[0396] The gene for p53 [TP53] is mutated in a large fraction of breast cancers. Frequently p53 levels are elevated when loss of function mutation occurs. When the mutation is dominant-negative, it creates survival value for the cancer cell because growth is promoted and apoptosis is inhibited. Thousands of different p53 mutations have been found in human cancer, and the functional consequences of many of them are not clear. A large body of academic literature addresses the prognostic and predictive significance of mutated p53 and the results are highly conflicting. The present invention provides a functional genomic measure of p53 activity, as follows. The activated wild type p53 molecule triggers transcription of the cell cycle inhibitor p21. Thus, the ratio of p53 to p21 should be low when p53 is wild-type and activated. When p53 is detectable and the ratio of p53 to p21 is elevated in tumors relative to normal breast, it signifies nonfunctional or dominant negative p53. The cancer literature provides evidence for this as borne out by poor prognosis.

[0397] Mdm2 is an important p53 regulator. Activated wildtype p53 stimulates transcription of mdm2. The mdm2 protein binds p53 and promotes its proteolytic destruction. Thus, abnormally low levels of mdm2 in the presence of normal or higher levels of p53 indicate that p53 is mutated and inactivated.

[0398] One aspect of the present invention is the use of ratios of mRNA levels p53:p21 and p53:mdm2 to provide a picture of p53 status. Evidence for dominant negative mutation of p53 (as indicated by high p53:p21 and/or high p53:mdm2 mRNA ratios--specifically in the upper ten percentile) presages higher risk of recurrence in breast cancer and therefore weights toward a decision to use chemotherapy in node negative post surgery breast cancer.
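The ratio test of paragraphs [0396]-[0398] can be illustrated with a minimal sketch. It assumes that mRNA levels are available as relative quantities on a linear scale and that a reference set of ratios from other tumors is at hand; the function names, the strict "below" definition of percentile rank, and the 90.0 cutoff parameter are illustrative assumptions rather than the method of the disclosure.

```python
# Hypothetical sketch: flag tumors whose p53:p21 or p53:mdm2 mRNA ratio
# falls in the upper ten percentile of a reference set of ratios.

def percentile_rank(value, reference):
    """Percentage of reference values strictly below `value`."""
    below = sum(1 for r in reference if r < value)
    return 100.0 * below / len(reference)


def p53_status_flag(p53, p21, mdm2, ref_p53_p21, ref_p53_mdm2, cutoff=90.0):
    """Return True when either ratio is at or above the cutoff percentile.

    p53, p21, mdm2: relative mRNA levels (linear scale, assumed non-zero).
    ref_p53_p21, ref_p53_mdm2: reference ratios from a comparison cohort.
    """
    ratio_p21 = p53 / p21
    ratio_mdm2 = p53 / mdm2
    high_p21 = percentile_rank(ratio_p21, ref_p53_p21) >= cutoff
    high_mdm2 = percentile_rank(ratio_mdm2, ref_p53_mdm2) >= cutoff
    return high_p21 or high_mdm2
```

If expression is instead held as threshold-cycle values, the ratios would first have to be converted to a linear scale; that conversion is outside this sketch.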

[0399] Another important cell cycle regulator is p27, which in the activated form blocks cell cycle progression at the level of cdk4. The protein is regulated primarily via phosphorylation/dephosphorylation, rather than at the transcriptional level. However, levels of p27 mRNAs do vary. Therefore a level of p27 mRNA in the upper ten percentile indicates reduced risk of recurrence of breast cancer post surgery.

[0400] Cyclin D1 is a principal positive regulator of entry into S phase of the cell cycle. The gene for cyclin D1 is amplified in about 20% of breast cancer patients, and therefore promotes tumor growth in those cases. One aspect of the present invention is use of cyclin D1 mRNA levels for diagnostic purposes in breast cancer. A level of cyclin D1 mRNA in the upper ten percentile suggests high risk of recurrence in breast cancer following surgery and suggests particular benefit of adjuvant chemotherapy.

[0401] 3. Other Tumor Suppressors and Related Proteins

[0402] These include APC and E-cadherin. It has long been known that the tumor suppressor APC is lost in about 50% of colon cancers, with concomitant transcriptional upregulation of E-cadherin, an important cell adhesion molecule and growth suppressor. Recently, it has been found that the APC gene is silenced in 15-40% of breast cancers. Likewise, the E-cadherin gene is silenced [via CpG island methylation] in about 30% of breast cancers. An abnormally low level of APC and/or E-cadherin mRNA in the lower 5 percentile suggests high risk of recurrence in breast cancer following surgery and heightened risk of shortened survival.

[0403] 4. Regulators of Apoptosis

[0404] These include Bcl/Bax family members Bcl2, Bcl-xl, Bak, Bax and related factors, NF.kappa.-B and related factors, and also p53BP1/ASPP1 and p53BP2/ASPP2.

[0405] Bax and Bak are pro-apoptotic and Bcl2 and Bcl-xl are anti-apoptotic. Therefore, the ratios of these factors influence the resistance or sensitivity of a cell to toxic (pro-apoptotic) drugs. In breast cancer, unlike other cancers, elevated level of Bcl2 (in the upper ten percentile) correlates with good outcome. This reflects the fact that Bcl2 has growth inhibitory activity as well as anti-apoptotic activity, and in breast cancer the significance of the former activity outweighs the significance of the latter. The impact of Bcl2 is in turn dependent on the status of the growth stimulating transcription factor c-MYC. The gene for c-MYC is amplified in about 20% of breast cancers. When c-MYC message levels are abnormally elevated relative to Bcl2 (such that this ratio is in the upper ten percentile), then elevated level of Bcl2 mRNA is no longer a positive indicator.

[0406] NF.kappa.-B is another important anti-apoptotic factor. Originally recognized as a pro-inflammatory transcription factor, it is now clear that it prevents programmed cell death in response to several extracellular toxic factors [such as tumor necrosis factor]. The activity of this transcription factor is regulated principally via phosphorylation/dephosphorylation events. However, levels of NF.kappa.-B nevertheless do vary from cell to cell, and elevated levels should correlate with increased resistance to apoptosis. Importantly for present purposes, NF.kappa.-B exerts its anti-apoptotic activity largely through its stimulation of transcription of mRNAs encoding certain members of the IAP [inhibitor of apoptosis] family of proteins, specifically cIAP1, cIAP2, XIAP, and Survivin. Thus, abnormally elevated levels of mRNAs for these IAPs and for NF.kappa.-B [any in the upper 5 percentile] signify activation of the NF.kappa.-B anti-apoptotic pathway. This suggests high risk of recurrence in breast cancer following chemotherapy and therefore poor prognosis. One embodiment of the present invention is the inclusion in the gene set of the above apoptotic regulators, and the above-outlined use of combinations and ratios of the levels of their mRNAs for prognosis in breast cancer.

[0407] The proteins p53BP1 and 2 bind to p53 and promote transcriptional activation of pro-apoptotic genes. The levels of p53BP1 and 2 are suppressed in a significant fraction of breast cancers, correlating with poor prognosis. When either is expressed in the lower tenth percentile, poor prognosis is indicated.

[0408] 5. Factors that Control Cell Invasion and Angiogenesis

[0409] These include uPA, PAI1, cathepsins B, G and L, scatter factor [HGF], c-met, KDR, VEGF, and CD31. The plasminogen activator uPA and its serpin regulator PAI1 promote breakdown of extracellular matrices and tumor cell invasion. Abnormally elevated levels of both mRNAs in malignant breast tumors (in the upper twenty percentile) signify an increased risk of shortened survival, increased recurrence in breast cancer patients post surgery, and increased importance of receiving adjuvant chemotherapy. On the other hand, node negative patients whose tumors do not express elevated levels of these mRNA species are less likely to have recurrence of this cancer and could more seriously consider whether the benefits of standard chemotherapy justify the associated toxicity.

[0410] Cathepsins B or L, when expressed in the upper ten percentile, predict poor disease-free and overall survival. In particular, cathepsin L predicts short survival in node positive patients.

[0411] Scatter factor and its cognate receptor c-met promote cell motility and invasion, cell growth, and angiogenesis. In breast cancer elevated levels of mRNAs encoding these factors should prompt aggressive treatment with chemotherapeutic drugs, when expression of either, or the combination, is above the 90.sup.th percentile.

[0412] VEGF is a central positive regulator of angiogenesis, and elevated levels in solid tumors predict short survival [note many references showing that elevated level of VEGF predicts short survival]. Inhibitors of VEGF therefore slow the growth of solid tumors in animals and humans. VEGF activity is controlled at the level of transcription. VEGF mRNA levels in the upper ten percentile indicate significantly worse than average prognosis. Other markers of vascularization, CD31 [PECAM], and KDR indicate high vessel density in tumors and that the tumor will be particularly malignant and aggressive, and hence that an aggressive therapeutic strategy is warranted.

[0413] 6. Markers for Immune and Inflammatory Cells and Processes

[0414] These markers include the genes for Immunoglobulin light chain .lamda., CD18, CD3, CD68, Fas [CD95], and Fas Ligand.

[0415] Several lines of evidence suggest that the mechanisms of action of certain drugs used in breast cancer entail activation of the host immune/inflammatory response (For example, Herceptin.RTM.). One aspect of the present invention is the inclusion in the gene set of markers for inflammatory and immune cells, and markers that predict tumor resistance to immune surveillance. Immunoglobulin light chain lambda is a marker for immunoglobulin producing cells. CD18 is a marker for all white cells. CD3 is a marker for T-cells. CD68 is a marker for macrophages.

[0416] CD95 and Fas ligand are a receptor: ligand pair that mediate one of two major pathways by which cytotoxic T cells and NK cells kill targeted cells. Decreased expression of CD95 and increased expression of Fas Ligand indicates poor prognosis in breast cancer. Both CD95 and Fas Ligand are transmembrane proteins, and need to be membrane anchored to trigger cell death. Certain tumor cells produce a truncated soluble variant of CD95, created as a result of alternative splicing of the CD95 mRNA. This blocks NK cell and cytotoxic T cell Fas Ligand-mediated killing of the tumors cells. Presence of soluble CD95 correlates with poor survival in breast cancer. The gene set includes both soluble and full-length variants of CD95.

[0417] 7. Cell Proliferation Markers

[0418] The gene set includes the cell proliferation markers Ki67/MiB1, PCNA, Pin1, and thymidine kinase. High levels of expression of proliferation markers associate with high histologic grade and short survival. High levels of thymidine kinase in the upper ten percentile suggest increased risk of short survival. Pin1 is a prolyl isomerase that stimulates cell growth, in part through the transcriptional activation of the cyclin D1 gene, and levels in the upper ten percentile contribute to a negative prognostic profile.

[0419] 8. Other Growth Factors and Receptors

[0420] This gene set includes IGF1, IGF2, IGFBP3, IGF1R, FGF2, FGFR1, CSF-1R/fms, CSF-1, IL6 and IL8. All of these proteins are expressed in breast cancer. Most stimulate tumor growth. However, expression of the growth factor FGF2 correlates with good outcome. Some have anti-apoptotic activity, prominently IGF1. Activation of the IGF1 axis via elevated IGF1, IGF1R, or IGFBP3 (as indicated by the sum of these signals in the upper ten percentile) inhibits tumor cell death and strongly contributes to a poor prognostic profile.

[0421] 9. Gene Expression Markers that Define Subclasses of Breast Cancer

[0422] These include: GRO1 oncogene alpha, Grb7, cytokeratins 5 and 17, retinol binding protein 4, hepatocyte nuclear factor 3, integrin alpha 7, and lipoprotein lipase. These markers subset breast cancer into different cell types that are phenotypically different at the level of gene expression. Tumors expressing signals for Bcl2, hepatocyte nuclear factor 3, LIV1 and ER above the mean have the best prognosis for disease free and overall survival following surgical removal of the cancer. Another category of breast cancer tumor type, characterized by elevated expression of lipoprotein lipase, retinol binding protein 4, and integrin .alpha.7, carries intermediate prognosis. Tumors expressing either elevated levels of cytokeratins 5 and 17, GRO oncogene at levels four-fold or greater above the mean, or ErbB2 and Grb7 at levels ten-fold or more above the mean, have the worst prognosis.

[0423] Although throughout the present description, including the Examples below, various aspects of the invention are explained with reference to gene expression studies, the invention can be performed in a similar manner, and similar results can be reached by applying proteomics techniques that are well known in the art. The proteome is the totality of the proteins present in a sample (e.g. tissue, organism, or cell culture) at a certain point of time. Proteomics includes, among other things, study of the global changes of protein expression in a sample (also referred to as "expression proteomics"). Proteomics typically includes the following steps: (1) separation of individual proteins in a sample by 2-D gel electrophoresis (2-D PAGE); (2) identification of the individual proteins recovered from the gel, e.g. by mass spectrometry and/or N-terminal sequencing, and (3) analysis of the data using bioinformatics. Proteomics methods are valuable supplements to other methods of gene expression profiling, and can be used, alone or in combination with other methods of the present invention, to detect the products of the gene markers of the present invention.

[0424] Further details of the invention will be described in the following non-limiting Examples.

Example 1

Isolation of RNA from Formalin-Fixed, Paraffin-Embedded (FPET) Tissue Specimens

[0425] A. Protocols

[0426] I. EPICENTRE.RTM. Xylene Protocol

[0427] RNA Isolation

[0428] (1) Cut 1-6 sections (each 10 .mu.m thick) of paraffin-embedded tissue per sample using a clean microtome blade and place into a 1.5 ml eppendorf tube.

[0429] (2) To extract paraffin, add 1 ml of xylene and invert the tubes for 10 minutes by rocking on a nutator.

[0430] (3) Pellet the sections by centrifugation for 10 minutes at 14,000.times.g in an eppendorf microcentrifuge.

[0431] (4) Remove the xylene, leaving some in the bottom to avoid dislodging the pellet.

[0432] (5) Repeat steps 2-4.

[0433] (6) Add 1 ml of 100% ethanol and invert for 3 minutes by rocking on the nutator.

[0434] (7) Pellet the debris by centrifugation for 10 minutes at 14,000.times.g in an eppendorf microcentrifuge.

[0435] (8) Remove the ethanol, leaving some at the bottom to avoid dislodging the pellet.

[0436] (9) Repeat steps 6-8 twice.

[0437] (10) Remove all of the remaining ethanol.

[0438] (11) For each sample, add 2 .mu.l of 50 .mu.g/.mu.l Proteinase K to 300 .mu.l of Tissue and Cell Lysis Solution.

[0439] (12) Add 300 .mu.l of Tissue and Cell Lysis Solution containing the Proteinase K to each sample and mix thoroughly.

[0440] (13) Incubate at 65.degree. C. for 90 minutes (vortex mixing every 5 minutes). Visually monitor the remaining tissue fragment. If still visible after 30 minutes, add an additional 2 .mu.l of 50 .mu.g/.mu.l Proteinase K and continue incubating at 65.degree. C. until fragment dissolves.

[0441] (14) Place the samples on ice for 3-5 minutes and proceed with protein removal and total nucleic acid precipitation.

[0442] Protein Removal and Precipitation of Total Nucleic Acid

[0443] (1) Add 150 .mu.l of MPC Protein Precipitation Reagent to each lysed sample and vortex vigorously for 10 seconds.

[0444] (2) Pellet the debris by centrifugation for 10 minutes at 14,000.times.g in an eppendorf microcentrifuge.

[0445] (3) Transfer the supernatant into clean eppendorf tubes and discard the pellet.

[0446] (4) Add 500 .mu.l of isopropanol to the recovered supernatant and thoroughly mix by rocking on the nutator for 3 minutes.

[0447] (5) Pellet the RNA/DNA by centrifugation at 4.degree. C. for 10 minutes at 14,000.times.g in an eppendorf microcentrifuge.

[0448] (6) Remove all of the isopropanol with a pipet, being careful not to dislodge the pellet.

[0449] Removal of Contaminating DNA from RNA Preparations

[0450] (1) Prepare 200 .mu.l of DNase I solution for each sample by adding 5 .mu.l of RNase-Free DNase I (1 U/.mu.l) to 195 .mu.l of 1.times.DNase Buffer.

[0451] (2) Completely resuspend the pelleted RNA in 200 .mu.l of DNase I solution by vortexing.

[0452] (3) Incubate the samples at 37.degree. C. for 60 minutes.

[0453] (4) Add 200 .mu.l of 2.times. T and C Lysis Solution to each sample and vortex for 5 seconds.

[0454] (5) Add 200 .mu.l of MPC Protein Precipitation Reagent, mix by vortexing for 10 seconds and place on ice for 3-5 minutes.

[0455] (6) Pellet the debris by centrifugation for 10 minutes at 14,000.times.g in an eppendorf microcentrifuge.

[0456] (7) Transfer the supernatant containing the RNA to clean eppendorf tubes and discard the pellet. (Be careful to avoid transferring the pellet.)

[0457] (8) Add 500 .mu.l of isopropanol to each supernatant and rock samples on the nutator for 3 minutes.

[0458] (9) Pellet the RNA by centrifugation at 4.degree. C. for 10 minutes at 14,000.times.g in an eppendorf microcentrifuge.

[0459] (10) Remove the isopropanol, leaving some at the bottom to avoid dislodging the pellet.

[0460] (11) Rinse twice with 1 ml of 75% ethanol. Centrifuge briefly if the RNA pellet is dislodged.

[0461] (12) Remove ethanol carefully.

[0462] (13) Set under fume hood for about 3 minutes to remove residual ethanol.

[0463] (14) Resuspend the RNA in 30 .mu.l of TE Buffer and store at -30.degree. C.

[0464] II. Hot Wax/Urea Protocol of the Invention

[0465] RNA Isolation

[0466] (1) Cut 3 sections (each 10 .mu.m thick) of paraffin-embedded tissue using a clean microtome blade and place into a 1.5 ml eppendorf tube.

[0467] (2) Add 300 .mu.l of lysis buffer (10 mM Tris 7.5, 0.5% sodium lauroyl sarcosine, 0.1 mM EDTA pH 7.5, 4M Urea) containing 330 .mu.g/ml Proteinase K (added freshly from a 50 .mu.g/.mu.l stock solution) and vortex briefly.

[0468] (3) Incubate at 65.degree. C. for 90 minutes (vortex mixing every 5 minutes). Visually monitor the tissue fragment. If still visible after 30 minutes, add an additional 2 .mu.l of 50 .mu.g/.mu.l Proteinase K and continue incubating at 65.degree. C. until fragment dissolves.

[0469] (4) Centrifuge for 5 minutes at 14,000.times.g and transfer upper aqueous phase to new tube, being careful not to disrupt the paraffin seal.

[0470] (5) Place the samples on ice for 3-5 minutes and proceed with protein removal and total nucleic acid precipitation.
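As a worked check of step (2) of this protocol, the volume of Proteinase K stock needed to reach 330 .mu.g/ml in 300 .mu.l of lysis buffer can be computed as below. The helper name is hypothetical, and the sketch assumes the small volume of added stock is neglected when computing the final concentration.

```python
# Hypothetical helper: stock volume needed for a target final concentration.
# Assumes the added stock volume is negligible relative to the final volume.

def stock_volume_ul(final_conc_ug_per_ml, final_volume_ul, stock_conc_ug_per_ul):
    """Volume of stock (in microliters) needed to reach the final concentration."""
    final_volume_ml = final_volume_ul / 1000.0
    mass_ug = final_conc_ug_per_ml * final_volume_ml  # total enzyme mass required
    return mass_ug / stock_conc_ug_per_ul


# 330 ug/ml in 300 ul of lysis buffer, from a 50 ug/ul stock
volume = stock_volume_ul(330.0, 300.0, 50.0)
print(f"{volume:.2f} ul of Proteinase K stock per sample")
```

The result, roughly 2 .mu.l per sample, matches the 2 .mu.l of 50 .mu.g/.mu.l Proteinase K per 300 .mu.l used in the EPICENTRE.RTM. xylene protocol above.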

[0471] Protein Removal and Precipitation of Total Nucleic Acid

[0472] (1) Add 150 .mu.l of 7.5M NH.sub.4OAc to each lysed sample and vortex vigorously for 10 seconds.

[0473] (2) Pellet the debris by centrifugation for 10 minutes at 14,000.times.g in an eppendorf microcentrifuge.

[0474] (3) Transfer the supernatant into clean eppendorf tubes and discard the pellet.

[0475] (4) Add 500 .mu.l of isopropanol to the recovered supernatant and thoroughly mix by rocking on the nutator for 3 minutes.

[0476] (5) Pellet the RNA/DNA by centrifugation at 4.degree. C. for 10 minutes at 14,000.times.g in an eppendorf microcentrifuge.

[0477] (6) Remove all of the isopropanol with a pipet, being careful not to dislodge the pellet.

[0478] Removal of Contaminating DNA from RNA Preparations

[0479] (1) Add 45 .mu.l of 1.times.DNase I buffer (10 mM Tris-Cl, pH 7.5, 2.5 mM MgCl.sub.2, 0.1 mM CaCl.sub.2) and 5 .mu.l of RNase-Free DNase I (2 U/.mu.l, Ambion) to each sample.

[0480] (2) Incubate the samples at 37.degree. C. for 60 minutes. Inactivate the DNaseI by heating at 70.degree. C. for 5 minutes.

[0481] B. Results

[0482] Experimental evidence demonstrates that the hot RNA extraction protocol of the invention does not compromise RNA yield. Using 19 FPE breast cancer specimens, extracting RNA from three adjacent sections in the same specimens, RNA yields were measured via capillary electrophoresis with fluorescence detection (Agilent Bioanalyzer). Average RNA yields in nanograms and standard deviations with the invented and commercial methods, respectively, were: 139+/-21 versus 141+/-34.

[0483] Also, it was found that the urea-containing lysis buffer of the present invention can be substituted for the EPICENTRE.RTM. T&C lysis buffer, and the 7.5 M NH.sub.4OAc reagent used for protein precipitation in accordance with the present invention can be substituted for the EPICENTRE.RTM. MPC protein precipitation solution with neither significant compromise of RNA yield nor TaqMan.RTM. efficiency.

Example 2

Amplification of mRNA Species Prior to RT-PCR

[0484] The method described in section 10 above was used with RNA isolated from fixed, paraffin-embedded breast cancer tissue. TaqMan.RTM. analyses were performed with first strand cDNA generated with the T7-GSP primer (unamplified (T7-GSPr)), T7 amplified RNA (amplified (T7-GSPr)). RNA was amplified according to step 2 of FIG. 4. As a control, TaqMan.RTM. was also performed with cDNA generated with an unmodified GSPr (amplified (GSPr)). An equivalent amount of initial template (1 ng/well) was used in each TaqMan.RTM. reaction.

[0485] The results are shown in FIG. 8. In vitro transcription increased RT-PCR signal intensity by more than 10 fold, and for certain genes by more than 100 fold relative to controls in which the RT-PCR primers were the same primers used in method 2 for the generation of double-stranded DNA for in vitro transcription (GSP-T7.sub.r and GSP.sub.f). Also shown in FIG. 8 are RT-PCR data generated when standard optimized RT-PCR primers (i.e., lacking T7 tails) were used. As shown, compared to this control, the new method yielded substantial increases in RT-PCR signal (from 4 to 64 fold in this experiment).
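To relate the fold increases reported above to threshold-cycle (CT) shifts, a common approximation treats each PCR cycle as a doubling of product, so that a signal increase of N fold corresponds to a CT decrease of log2(N) cycles. The sketch below assumes this idealized doubling; the text does not state how the fold values were derived, and the function names are illustrative.

```python
# Hypothetical sketch: convert between CT shifts and fold changes,
# assuming ideal amplification (product doubles each cycle).
import math


def fold_change(ct_control, ct_test, efficiency=2.0):
    """Fold increase in signal of the test reaction relative to the control."""
    return efficiency ** (ct_control - ct_test)


def cycles_for_fold(fold, efficiency=2.0):
    """CT decrease that corresponds to a given fold increase."""
    return math.log2(fold) / math.log2(efficiency)


# Under this assumption, the 4- to 64-fold range spans 2 to 6 cycles.
print(cycles_for_fold(4.0), cycles_for_fold(64.0))
```

Real amplification efficiencies are often below perfect doubling, in which case a value of `efficiency` smaller than 2.0 would be the more appropriate assumption.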

[0486] The new method requires that each T7-GSP sequence be optimized so that the increase in the RT-PCR signal is the same for each gene, relative to the standard optimized RT-PCR (with non-T7 tailed primers).

Example 3

A Study of Gene Expression in Premalignant and Malignant Breast Tumors

[0487] A gene expression study was designed and conducted with the primary goal to molecularly characterize gene expression in paraffin-embedded, fixed tissue samples of invasive breast ductal carcinoma, and to explore the correlation between such molecular profiles and disease-free survival. A further objective of the study was to compare the molecular profiles in tissue samples of invasive breast cancer with the molecular profiles obtained in ductal carcinoma in situ. The study was further designed to obtain data on the molecular profiles in lobular carcinoma in situ and in paraffin-embedded, fixed tissue samples of invasive lobular carcinoma.

[0488] Molecular assays were performed on paraffin-embedded, formalin-fixed primary breast tumor tissues obtained from 202 individual patients diagnosed with breast cancer. All patients underwent surgery with diagnosis of invasive ductal carcinoma of the breast, pure ductal carcinoma in situ (DCIS), lobular carcinoma of the breast, or pure lobular carcinoma in situ (LCIS). Patients were included in the study only if histopathologic assessment, performed as described in the Materials and Methods section, indicated adequate amounts of tumor tissue and homogeneous pathology.

[0489] The individuals participating in the study were divided into the following groups:

[0490] Group 1: Pure ductal carcinoma in situ (DCIS); n=18

[0491] Group 2: Invasive ductal carcinoma n=130

[0492] Group 3: Pure lobular carcinoma in situ (LCIS); n=7

[0493] Group 4: Invasive lobular carcinoma n=16

Materials and Methods

[0494] Each representative tumor block was characterized by standard histopathology for diagnosis, semi-quantitative assessment of amount of tumor, and tumor grade. A total of 6 sections (10 microns in thickness each) were prepared and placed in two Costar Brand Microcentrifuge Tubes (Polypropylene, 1.7 mL tubes, clear; 3 sections in each tube). If the tumor constituted less than 30% of the total specimen area, the sample may have been crudely dissected by the pathologist, using gross microdissection, putting the tumor tissue directly into the Costar tube.

[0495] If more than one tumor block was obtained as part of the surgical procedure, all tumor blocks were subjected to the same characterization, as described above, and the block most representative of the pathology was used for analysis.

Gene Expression Analysis

[0496] mRNA was extracted and purified from fixed, paraffin-embedded tissue samples, and prepared for gene expression analysis as described in chapters 7-11 above. Molecular assays of quantitative gene expression were performed by RT-PCR, using the ABI PRISM 7900.TM. Sequence Detection System.TM. (Perkin-Elmer-Applied Biosystems, Foster City, Calif., USA). ABI PRISM 7900.TM. consists of a thermocycler, laser, charge-coupled device (CCD) camera, and computer. The system amplifies samples in a 384-well format on a thermocycler. During amplification, laser-induced fluorescent signal is collected in real-time through fiber optics cables for all 384 wells, and detected at the CCD. The system includes software for running the instrument and for analyzing the data.

Analysis and Results

[0497] Tumor tissue was analyzed for 185 cancer-related genes and 7 reference genes. The threshold cycle (CT) values for each patient were normalized based on the median of all genes for that particular patient. Clinical outcome data were available for all patients from a review of registry data and selected patient charts. Outcomes were classified as:

0: died due to breast cancer or to unknown cause, or alive with breast cancer recurrence;

1: alive without breast cancer recurrence, or died due to a cause other than breast cancer
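The per-patient normalization described in paragraph [0497] can be sketched as follows. The text states only that CT values were normalized based on the median of all genes for the patient; subtracting that median, as done here, is an assumption, and the function and variable names are illustrative.

```python
# Hypothetical sketch of per-patient median normalization of CT values.
# Subtracting the median is an assumed reading of "normalized based on
# the median of all genes for that particular patient".
from statistics import median


def normalize_ct(ct_by_gene):
    """Return CT values centered on the patient's median CT.

    ct_by_gene: dict mapping gene name -> raw CT for one patient.
    """
    patient_median = median(ct_by_gene.values())
    return {gene: ct - patient_median for gene, ct in ct_by_gene.items()}


example = {"geneA": 30.0, "geneB": 28.0, "geneC": 32.0}
print(normalize_ct(example))
```

Values reported on the original CT scale, such as the means in the tables of this Example, would additionally require a constant offset that this sketch does not attempt to reproduce.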

[0498] Analysis was performed by:

1. Analysis of the relationship between normalized gene expression and the binary outcomes of 0 or 1.

2. Analysis of the relationship between normalized gene expression and the time to outcome (0 or 1 as defined above) where patients who were alive without breast cancer recurrence or who died due to a cause other than breast cancer were censored. This approach was used to evaluate the prognostic impact of individual genes and also sets of multiple genes.

Analysis of 146 Patients with Invasive Breast Carcinoma by Binary Approach

[0499] In the first (binary) approach, analysis was performed on all 146 patients with invasive breast carcinoma. A t-test was performed on the group of patients classified as 0 or 1 and the p-values for the differences between the groups for each gene were calculated.

[0500] The following Table 4 lists the 45 genes for which the p-value for the differences between the groups was <0.05.

TABLE-US-00001 TABLE 4
Gene/SEQ ID NO:  Mean CT Alive  Mean CT Deceased  t-value  Degrees of freedom  p
FOXM1            33.66          32.52              3.92    144                 0.0001
PRAME            35.45          33.84              3.71    144                 0.0003
Bcl2             28.52          29.32             -3.53    144                 0.0006
STK15            30.82          30.10              3.49    144                 0.0006
CEGP1            29.12          30.86             -3.39    144                 0.0009
Ki-67            30.57          29.62              3.34    144                 0.0011
GSTM1            30.62          31.63             -3.27    144                 0.0014
CA9              34.96          33.54              3.18    144                 0.0018
PR               29.56          31.22             -3.16    144                 0.0019
BBC3             31.54          32.10             -3.10    144                 0.0023
NME1             27.31          26.68              3.04    144                 0.0028
SURV             31.64          30.68              2.92    144                 0.0041
GATA3            26.06          26.99             -2.91    144                 0.0042
TFRC             28.96          28.48              2.87    144                 0.0047
YB-1             26.72          26.41              2.79    144                 0.0060
DPYD             28.51          28.84             -2.67    144                 0.0084
GSTM3            28.21          29.03             -2.63    144                 0.0095
RPS6KB1          31.18          30.61              2.61    144                 0.0099
Src              27.97          27.69              2.59    144                 0.0105
Chk1             32.63          31.99              2.57    144                 0.0113
ID1              28.73          29.13             -2.48    144                 0.0141
EstR1            24.22          25.40             -2.44    144                 0.0160
p27              27.15          27.51             -2.41    144                 0.0174
CCNB1            31.63          30.87              2.40    144                 0.0176
XIAP             30.27          30.51             -2.40    144                 0.0178
Chk2             31.48          31.11              2.39    144                 0.0179
CDC25B           29.75          29.39              2.37    144                 0.0193
IGF1R            28.85          29.44             -2.34    144                 0.0209
AK055699         33.23          34.11             -2.28    144                 0.0242
PI3KC2A          31.07          31.42             -2.25    144                 0.0257
TGFB3            28.42          28.85             -2.25    144                 0.0258
BAGI1            28.40          28.75             -2.24    144                 0.0269
CYP3A4           35.70          35.32              2.17    144                 0.0317
EpCAM            28.73          28.34              2.16    144                 0.0321
VEGFC            32.28          31.82              2.16    144                 0.0326
pS2              28.96          30.60             -2.14    144                 0.0341
hENT1            27.19          26.91              2.12    144                 0.0357
WISP1            31.20          31.64             -2.10    144                 0.0377
HNF3A            27.89          28.64             -2.09    144                 0.0384
NFKBp65          33.22          33.80             -2.08    144                 0.0396
BRCA2            33.06          32.62              2.08    144                 0.0397
EGFR             30.68          30.13              2.06    144                 0.0414
TK1              32.27          31.72              2.02    144                 0.0453
VDR              30.08          29.73              1.99    144                 0.0488

[0501] In the foregoing Table 4, lower (negative) t-values indicate higher expression (or lower CTs), associated with better outcomes, and, inversely, higher (positive) t-values indicate higher expression (lower CTs) associated with worse outcomes. Thus, for example, elevated expression of the FOXM1 gene (t-value=3.92, CT mean alive>CT mean deceased) indicates a reduced likelihood of disease free survival. Similarly, elevated expression of the CEGP1 gene (t-value=-3.39; CT mean alive<CT mean deceased) indicates an increased likelihood of disease free survival.
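By way of illustration only, the group comparison underlying Table 4 can be sketched in Python as follows. The sketch assumes a pooled-variance two-sample t-test, which is consistent with the 144 degrees of freedom reported for 146 patients but is not stated explicitly in the text. The function names and the CT values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, variance


def pooled_t_test(ct_alive, ct_deceased):
    """Return (t, degrees_of_freedom) for a pooled-variance two-sample t-test.

    The sign follows Table 4: t is positive when the mean CT of the alive
    group exceeds the mean CT of the deceased group.
    """
    n1, n2 = len(ct_alive), len(ct_deceased)
    df = n1 + n2 - 2
    # Pooled sample variance across the two outcome groups (an assumption).
    pooled_var = ((n1 - 1) * variance(ct_alive)
                  + (n2 - 1) * variance(ct_deceased)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(ct_alive) - mean(ct_deceased)) / standard_error
    return t, df


def direction(t):
    """Interpret the sign of t on the CT scale (lower CT = higher expression)."""
    if t > 0:
        # Alive group has higher CT, i.e. lower expression (as for FOXM1).
        return "higher expression associated with worse outcome"
    # Alive group has lower CT, i.e. higher expression (as for CEGP1).
    return "higher expression associated with better outcome"


# Hypothetical CT values, for illustration only.
alive = [33.9, 33.5, 34.1, 33.2]
deceased = [32.4, 32.8, 32.1]
t, df = pooled_t_test(alive, deceased)
print(round(t, 2), df, direction(t))
```

A p-value would then be obtained from the t distribution with the returned degrees of freedom, for example with a statistics package; that step is omitted here to keep the sketch free of external dependencies.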

[0502] Based on the data set forth in Table 4, the overexpression of any of the following genes in breast cancer indicates a reduced likelihood of survival without cancer recurrence following surgery: FOXM1; PRAME; STK15; Ki-67; CA9; NME1; SURV; TFRC; YB-1; RPS6KB1; Src; Chk1; CCNB1; Chk2; CDC25B; CYP3A4; EpCAM; VEGFC; hENT1; BRCA2; EGFR; TK1; VDR.

[0503] Based on the data set forth in Table 4, the overexpression of any of the following genes in breast cancer indicates a better prognosis for survival without cancer recurrence following surgery: Bcl2; CEGP1; GSTM1; PR; BBC3; GATA3; DPYD; GSTM3; ID1; EstR1; p27; XIAP; IGF1R; AK055699; PI3KC2A; TGFB3; BAGI1; pS2; WISP1; HNF3A; NFKBp65.

[0504] Analysis of 108 ER Positive Patients by Binary Approach

[0505] 108 patients with normalized CT for estrogen receptor (ER)<25.2 (i.e., ER positive patients) were subjected to separate analysis. A t-test was performed on the groups of patients classified as 0 or 1 and the p-values for the differences between the groups for each gene were calculated. The following Table 5 lists the 12 genes where the p-value for the differences between the groups was <0.05.

TABLE 5

Gene/SEQ ID NO:  Mean CT Alive  Mean CT Deceased  t-value  Degrees of freedom  p
PRAME            35.54          33.88             3.03     106                 0.0031
Bcl2             28.24          28.87             -2.70    106                 0.0082
FOXM1            33.82          32.85             2.66     106                 0.089
DIABLO           30.33          30.71             -2.47    106                 0.0153
EPHX1            28.62          28.03             2.44     106                 0.0163
HIF1A            29.37          28.88             2.40     106                 0.0180
VEGFC            32.39          31.69             2.39     106                 0.0187
Ki-67            30.73          29.82             2.38     106                 0.0191
IGF1R            28.60          29.18             -2.37    106                 0.0194
VDR              30.14          29.60             2.17     106                 0.0322
NME1             27.34          26.80             2.03     106                 0.0452
GSTM3            28.08          28.92             -2.00    106                 0.0485

[0506] For each gene, a classification algorithm was utilized to identify the best threshold value (CT) for using each gene alone in predicting clinical outcome.
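The classification algorithm used to select the threshold is not specified above. Purely as a hypothetical sketch, one simple possibility is an exhaustive search over candidate cut points that maximizes the fraction of correctly classified patients. The function name, the choice of accuracy as the criterion, and the CT values below are all assumptions made for illustration.

```python
def best_ct_threshold(cts, outcomes):
    """Return (threshold, label_above_threshold, accuracy) with highest accuracy.

    cts: normalized CT values for one gene, one per patient.
    outcomes: binary outcome (0 or 1) per patient.
    Returns None when fewer than two distinct CT values are supplied.
    """
    values = sorted(set(cts))
    # Candidate cut points: midpoints between consecutive distinct CT values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for threshold in candidates:
        for label_above in (0, 1):
            # Patients with CT above the threshold are assigned label_above,
            # all other patients are assigned the opposite label.
            correct = sum(
                (label_above if ct > threshold else 1 - label_above) == outcome
                for ct, outcome in zip(cts, outcomes)
            )
            accuracy = correct / len(cts)
            if best is None or accuracy > best[2]:
                best = (threshold, label_above, accuracy)
    return best


# Hypothetical CT values and outcomes, for illustration only.
example_cts = [30.1, 30.4, 31.0, 32.2, 32.5, 33.0]
example_outcomes = [1, 1, 1, 0, 0, 0]
print(best_ct_threshold(example_cts, example_outcomes))
```

Both directions are tried for each cut point because, depending on the gene, high CT (low expression) may be associated with either outcome class.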

[0507] Based on the data set forth in Table 5, overexpression of the following genes in ER-positive cancer is indicative of a reduced likelihood of survival without cancer recurrence following surgery: PRAME; FOXM1; EPHX1; HIF1A; VEGFC; Ki-67; VDR; NME1. Some of these genes (PRAME; FOXM1; VEGFC; Ki-67; VDR; and NME1) were also identified as indicators of poor prognosis in the previous analysis, not limited to ER-positive breast cancer. The overexpression of the remaining genes (EPHX1 and HIF1A) appears to be a negative indicator of disease-free survival in ER-positive breast cancer only. Based on the data set forth in Table 5, overexpression of the following genes in ER-positive cancer is indicative of a better prognosis for survival without cancer recurrence following surgery: Bcl-2; DIABLO; IGF1R; GSTM3. Of the latter genes, Bcl-2; IGF1R; and GSTM3 have also been identified as indicators of good prognosis in the previous analysis, not limited to ER-positive breast cancer. The overexpression of DIABLO appears to be a positive indicator of disease-free survival in ER-positive breast cancer only.

[0508] Analysis of Multiple Genes and Indicators of Outcome

[0509] Two approaches were taken in order to determine whether using multiple genes would provide better discrimination between outcomes.

[0510] First, a discrimination analysis was performed using a forward stepwise approach. Models were generated that classified outcome with greater discrimination than was obtained with any single gene alone.

[0511] According to a second approach (time-to-event approach), for each gene a Cox Proportional Hazards model (see, e.g. Cox, D. R., and Oakes, D. (1984), Analysis of Survival Data, Chapman and Hall, London, N.Y.) was defined with time to recurrence or death as the dependent variable, and the expression level of the gene as the independent variable. The genes that have a p-value<0.05 in the Cox model were identified. For each gene, the Cox model provides the relative risk (RR) of recurrence or death for a unit change in the expression of the gene. One can choose to partition the patients into subgroups at any threshold value of the measured expression (on the CT scale), where all patients with expression values above the threshold have higher risk, and all patients with expression values below the threshold have lower risk, or vice versa, depending on whether the gene is an indicator of good (RR>1.01) or poor (RR<1.01) prognosis. Thus, any threshold value will define subgroups of patients with respectively increased or decreased risk. The results are summarized in the following Tables 6 and 7.

TABLE 6
Cox Model Results for 146 Patients with Invasive Breast Cancer

Gene         Relative Risk (RR)  SE Relative Risk  p value
FOXM1        0.58                0.15              0.0002
STK15        0.51                0.20              0.0006
PRAME        0.78                0.07              0.0007
Bcl2         1.66                0.15              0.0009
CEGP1        1.25                0.07              0.0014
GSTM1        1.40                0.11              0.0014
Ki67         0.62                0.15              0.0016
PR           1.23                0.07              0.0017
Contig51037  0.81                0.07              0.0022
NME1         0.64                0.15              0.0023
YB-1         0.39                0.32              0.0033
TFRC         0.53                0.21              0.0035
BBC3         1.72                0.19              0.0036
GATA3        1.32                0.10              0.0039
CA9          0.81                0.07              0.0049
SURV         0.69                0.13              0.0049
DPYD         2.58                0.34              0.0052
RPS6KB1      0.60                0.18              0.0055
GSTM3        1.36                0.12              0.0078
Src.2        0.39                0.36              0.0094
TGFB3        1.61                0.19              0.0109
CDC25B       0.54                0.25              0.0122
XIAP         3.20                0.47              0.0126
CCNB1        0.68                0.16              0.0151
IGF1R        1.42                0.15              0.0153
Chk1         0.68                0.16              0.0155
ID1          1.80                0.25              0.0164
p27          1.69                0.22              0.0168
Chk2         0.52                0.27              0.0175
EstR1        1.17                0.07              0.0196
HNF3A        1.21                0.08              0.206
pS2          1.12                0.05              0.0230
BAGI1        1.88                0.29              0.0266
AK055699     1.24                0.10              0.0276
hENT1        0.51                0.31              0.0293
EpCAM        0.62                0.22              0.0310
WISP1        1.39                0.16              0.0338
VEGFC        0.62                0.23              0.0364
TK1          0.73                0.15              0.0382
NFKBp65      1.32                0.14              0.0384
BRCA2        0.66                0.20              0.0404
CYP3A4       0.60                0.25              0.0417
EGFR         0.72                0.16              0.0436

TABLE 7
Cox Model Results for 108 Patients with ER+ Invasive Breast Cancer

Gene         Relative Risk (RR)  SE Relative Risk  p-value
PRAME        0.75                0.10              0.0045
Contig51037  0.75                0.11              0.0060
Bcl2         2.11                0.28              0.0075
HIF1A        0.42                0.34              0.0117
IGF1R        1.92                0.26              0.0117
FOXM1        0.54                0.24              0.0119
EPHX1        0.43                0.33              0.0120
Ki67         0.60                0.21              0.0160
CDC25B       0.41                0.38              0.0200
VEGFC        0.45                0.37              0.0288
CTSB         0.32                0.53              0.0328
DIABLO       2.91                0.50              0.0328
p27          1.83                0.28              0.0341
CDH1         0.57                0.27              0.0352
IGFBP3       0.45                0.40              0.0499
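For illustration, a single-gene Cox Proportional Hazards fit of the kind summarized in Tables 6 and 7 can be sketched as follows. This is a minimal sketch, assuming Newton-Raphson maximization of the partial likelihood with Breslow handling of tied event times; the software actually used is not identified in the text. The function names and the example data are hypothetical.

```python
import math


def score_and_information(beta, times, events, x):
    """Return the first derivative and observed information of the
    Cox partial log-likelihood for a single covariate x."""
    grad, info = 0.0, 0.0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients contribute only through risk sets
        # Risk set: patients still under observation at this event time.
        risk = [j for j in range(n) if times[j] >= times[i]]
        weights = [math.exp(beta * x[j]) for j in risk]
        s0 = sum(weights)
        s1 = sum(w * x[j] for w, j in zip(weights, risk))
        s2 = sum(w * x[j] ** 2 for w, j in zip(weights, risk))
        grad += x[i] - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return grad, info


def fit_univariate_cox(times, events, x, max_iter=50, tol=1e-10):
    """Return (beta, relative_risk, standard_error_of_beta).

    relative_risk = exp(beta) is the risk ratio for a unit change in x.
    """
    beta = 0.0
    for _ in range(max_iter):
        grad, info = score_and_information(beta, times, events, x)
        step = grad / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_information(beta, times, events, x)
    return beta, math.exp(beta), 1 / math.sqrt(info)


# Hypothetical follow-up times, event flags (1 = recurrence or death,
# 0 = censored) and expression values, for illustration only.
times = [1, 2, 3, 4, 5, 6, 7]
events = [1, 1, 1, 1, 1, 1, 0]
x = [1.0, 0.0, 2.0, 0.5, 1.5, 0.0, 1.0]
beta, rr, se = fit_univariate_cox(times, events, x)
print(round(rr, 3), round(se, 3))
```

The sketch does not guard against data in which the covariate perfectly orders the event times, in which case the estimate diverges; a production analysis would normally rely on an established survival-analysis package.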

[0512] The binary and time-to-event analyses, with few exceptions, identified the same genes as prognostic markers. For example, comparison of Tables 4 and 6 shows that, with the exception of a single gene, the two analyses generated the same list of top 15 markers (as defined by the smallest p values). Furthermore, when both analyses identified the same gene, they were concordant with respect to the direction (positive or negative sign) of the correlation with survival/recurrence. Overall, these results strengthen the conclusion that the identified markers have significant prognostic value.

[0513] For Cox models comprising more than two genes (multivariate models), stepwise entry of each individual gene into the model is performed, where the first gene entered is pre-selected from among those genes having significant univariate p-values, and the gene selected for entry into the model at each subsequent step is the gene that best improves the fit of the model to the data. This analysis can be performed with any total number of genes. In the analysis the results of which are shown below, stepwise entry was performed for up to 10 genes.
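The stepwise entry procedure described above can be sketched, under stated assumptions, as a greedy forward selection. In the sketch below, `fit_score` is a hypothetical caller-supplied function returning a goodness-of-fit value for a list of genes (for example, the maximized partial log-likelihood of a multivariate Cox model); the toy scoring function and gene names are placeholders and do not reproduce the analysis.

```python
def forward_stepwise(candidates, first_gene, fit_score, max_genes=10):
    """Greedy forward selection of genes.

    candidates: all genes available for entry.
    first_gene: the pre-selected first gene (e.g. one with a significant
        univariate p-value).
    fit_score: callable taking a list of genes and returning a number,
        where a larger value means a better fit of the model to the data.
    max_genes: total number of genes at which entry stops.
    """
    selected = [first_gene]
    remaining = [g for g in candidates if g != first_gene]
    while remaining and len(selected) < max_genes:
        # Enter the gene that gives the best-fitting enlarged model.
        best_gene = max(remaining, key=lambda g: fit_score(selected + [g]))
        selected.append(best_gene)
        remaining.remove(best_gene)
    return selected


# Toy additive score, for illustration only; a real analysis would refit
# the multivariate model for every candidate set.
toy_gain = {"geneA": 1.0, "geneB": 0.2, "geneC": 0.7, "geneD": 0.5}


def toy_score(genes):
    return sum(toy_gain[g] for g in genes)


print(forward_stepwise(list(toy_gain), "geneA", toy_score, max_genes=3))
```

Repeating the procedure with a different pre-selected first gene yields a different gene set, which is one way the several ten-gene sets listed in this section could arise.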

[0514] Multivariate analysis is performed using the following equation:

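$$\mathrm{RR} = \exp\left[\mathrm{coef}(\mathrm{geneA}) \times \mathrm{Ct}(\mathrm{geneA}) + \mathrm{coef}(\mathrm{geneB}) \times \mathrm{Ct}(\mathrm{geneB}) + \mathrm{coef}(\mathrm{geneC}) \times \mathrm{Ct}(\mathrm{geneC}) + \cdots\right].$$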

[0515] In this equation, coefficients for genes that are predictors of beneficial outcome are positive numbers and coefficients for genes that are predictors of unfavorable outcome are negative numbers. The "Ct" values in the equation are ΔCts, i.e. reflect the difference between the average normalized Ct value for a population and the normalized Ct measured for the patient in question. The convention used in the present analysis has been that ΔCts below and above the population average have positive signs and negative signs, respectively (reflecting greater or lesser mRNA abundance). The relative risk (RR) calculated by solving this equation will indicate if the patient has an enhanced or reduced chance of long-term survival without cancer recurrence.
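As a worked illustration of the equation and sign convention above, the calculation can be sketched as follows. The coefficients, population averages and patient Ct values are hypothetical and chosen only to make the arithmetic easy to follow; the function names are likewise illustrative.

```python
import math


def delta_ct(population_mean_ct, patient_ct):
    """Delta-Ct under the stated convention: a patient Ct below the
    population average (greater mRNA abundance) gives a positive value."""
    return population_mean_ct - patient_ct


def relative_risk(coefficients, delta_cts):
    """RR = exp(sum over genes of coef(gene) x delta-Ct(gene))."""
    return math.exp(sum(coefficients[g] * delta_cts[g] for g in coefficients))


# Hypothetical values, for illustration only.
coefficients = {"geneA": 0.5, "geneB": -0.3}
population_mean = {"geneA": 29.0, "geneB": 31.0}
patient = {"geneA": 28.0, "geneB": 29.0}

delta_cts = {g: delta_ct(population_mean[g], patient[g]) for g in coefficients}
rr = relative_risk(coefficients, delta_cts)
print(delta_cts, round(rr, 4))
```

Here the ΔCts are 1.0 and 2.0, so the exponent is 0.5 × 1.0 + (-0.3) × 2.0 = -0.1 and RR = exp(-0.1).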

[0516] Multivariate Gene Analysis of 147 Patients with Invasive Breast Carcinoma

[0517] (a) A multivariate stepwise analysis, using the Cox Proportional Hazards Model, was performed on the gene expression data obtained for all 147 patients with invasive breast carcinoma. Genes CEGP1, FOXM1, STK15 and PRAME were excluded from this analysis. The following ten-gene sets have been identified by this analysis as having particularly strong predictive value of patient survival without cancer recurrence following surgical removal of primary tumor.

[0518] 1. Bcl2, cyclinG1, NFKBp65, NME1, EPHX1, TOP2B, DR5, TERC, Src, DIABLO;

[0519] 2. Ki67, XIAP, hENT1, TS, CD9, p27, cyclinG1, pS2, NFKBp65, CYP3A4;

[0520] 3. GSTM1, XIAP, Ki67, TS, cyclinG1, p27, CYP3A4, pS2, NFKBp65, ErbB3;

[0521] 4. PR, NME1, XIAP, upa, cyclinG1, Contig51037, TERC, EPHX1, ALDH1A3, CTSL;

[0522] 5. CA9, NME1, TERC, cyclinG1, EPHX1, DPYD, Src, TOP2B, NFKBp65, VEGFC;

[0523] 6. TFRC, XIAP, Ki67, TS, cyclinG1, p27, CYP3A4, pS2, ErbB3, NFKBp65.

[0524] (b) A multivariate stepwise analysis, using the Cox Proportional Hazards Model, was performed on the gene expression data obtained for all 147 patients with invasive breast carcinoma, using an interrogation set including a reduced number of genes. The following ten-gene sets have been identified by this analysis as having particularly strong predictive value of patient survival without cancer recurrence following surgical removal of primary tumor.

[0525] 1. Bcl2, PRAME, cyclinG1, FOXM1, NFKBp65, TS, XIAP, Ki67, CYP3A4, p27;

[0526] 2. FOXM1, cyclinG1, XIAP, Contig51037, PRAME, TS, Ki67, PDGFRa, p27, NFKBp65;

[0527] 3. PRAME, FOXM1, cyclinG1, XIAP, Contig51037, TS, Ki67, PDGFRa, p27, NFKBp65;

[0528] 4. Ki67, XIAP, PRAME, hENT1, Contig51037, TS, CD9, p27, ErbB3, cyclinG1;

[0529] 5. STK15, XIAP, PRAME, PLAUR, p27, CTSL, CD18, PREP, p53, RPS6KB1;

[0530] 6. GSTM1, XIAP, PRAME, p27, Contig51037, ErbB3, GSTp, EREG, ID1, PLAUR;

[0531] 7. PR, PRAME, NME1, XIAP, PLAUR, cyclinG1, Contig51037, TERC, EPHX1, DR5;

[0532] 8. CA9, FOXM1, cyclinG1, XIAP, TS, Ki67, NFKBp65, CYP3A4, GSTM3, p27;

[0533] 9. TFRC, XIAP, PRAME, p27, Contig51037, ErbB3, DPYD, TERC, NME1, VEGFC;

[0534] 10. CEGP1, PRAME, hENT1, XIAP, Contig51037, ErbB3, DPYD, NFKBp65, ID1, TS.

[0535] Multivariate Analysis of Patients with ER Positive Invasive Breast Carcinoma

[0536] A multivariate stepwise analysis, using the Cox Proportional Hazards Model, was performed on the gene expression data obtained for patients with ER positive invasive breast carcinoma. The following ten-gene sets have been identified by this analysis as having particularly strong predictive value of patient survival without cancer recurrence following surgical removal of primary tumor.

[0537] 1. PRAME, p27, IGFBP2, HIF1A, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;

[0538] 2. Contig51037, EPHX1, Ki67, TIMP2, cyclinG1, DPYD, CYP3A4, TP, AIB1, CYP2C8;

[0539] 3. Bcl2, hENT1, FOXM1, Contig51037, cyclinG1, Contig46653, PTEN, CYP3A4, TIMP2, AREG;

[0540] 4. HIF1A, PRAME, p27, IGFBP2, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;

[0541] 5. IGF1R, PRAME, EPHX1, Contig51037, cyclinG1, Bcl2, NME1, PTEN, TBP, TIMP2;

[0542] 6. FOXM1, Contig51037, VEGFC, TBP, HIF1A, DPYD, RAD51C, DCR3, cyclinG1, BAG1;

[0543] 7. EPHX1, Contig51037, Ki67, TIMP2, cyclinG1, DPYD, CYP3A4, TP, AIB1, CYP2C8;

[0544] 8. Ki67, VEGFC, VDR, GSTM3, p27, upa, ITGA7, rhoC, TERC, Pin1;

[0545] 9. CDC25B, Contig51037, hENT1, Bcl2, HLAG, TERC, NME1, upa, ID1, CYP;

[0546] 10. VEGFC, Ki67, VDR, GSTM3, p27, upa, ITGA7, rhoC, TERC, Pin1;

[0547] 11. CTSB, PRAME, p27, IGFBP2, EPHX1, CTSL, BAD, DR5, DCR3, XIAP;

[0548] 12. DIABLO, Ki67, hENT1, TIMP2, ILT2, p27, KRT19, IGFBP2, TS, PDGFB;

[0549] 13. p27, PRAME, IGFBP2, HIF1A, TIMP2, ILT2, CYP3A4, ID1, EstR1, DIABLO;

[0550] 14. CDH1, PRAME, VEGFC, HIF1A, DPYD, TIMP2, CYP3A4, EstR1, RBP4, p27;

[0551] 15. IGFBP3, PRAME, p27, Bcl2, XIAP, EstR1, Ki67, TS, Src, VEGF;

[0552] 16. GSTM3, PRAME, p27, IGFBP3, XIAP, FGF2, hENT1, PTEN, EstR1, APC;

[0553] 17. hENT1, Bcl2, FOXM1, Contig51037, cyclinG1, Contig46653, PTEN, CYP3A4, TIMP2, AREG;

[0554] 18. STK15, VEGFC, PRAME, p27, GCLC, hENT1, ID1, TIMP2, EstR1, MCP1;

[0555] 19. NME1, PRAME, p27, IGFBP3, XIAP, PTEN, hENT1, Bcl2, CYP3A4, HLAG;

[0556] 20. VDR, Bcl2, p27, hENT1, p53, PI3KC2A, EIF4E, TFRC, MCM3, ID1;

[0557] 21. EIF4E, Contig51037, EPHX1, cyclinG1, Bcl2, DR5, TBP, PTEN, NME1, HER2;

[0558] 22. CCNB1, PRAME, VEGFC, HIF1A, hENT1, GCLC, TIMP2, ID1, p27, upa;

[0559] 23. ID1, PRAME, DIABLO, hENT1, p27, PDGFRa, NME1, BIN1, BRCA1, TP;

[0560] 24. FBXO5, PRAME, IGFBP3, p27, GSTM3, hENT1, XIAP, FGF2, TS, PTEN;

[0561] 25. GUS, HIF1A, VEGFC, GSTM3, DPYD, hENT1, FBXO5, CA9, CYP, KRT18;

[0562] 26. Bclx, Bcl2, hENT1, Contig51037, HLAG, CD9, ID1, BRCA1, BIN1, HBEGF.

[0563] It is noteworthy that many of the foregoing gene sets include genes that alone did not have sufficient predictive value to qualify as prognostic markers under the standards discussed above, but in combination with other genes, their presence provides valuable information about the likelihood of long-term patient survival without cancer recurrence.

[0564] All references cited throughout the disclosure are hereby expressly incorporated by reference.

[0565] While the present invention has been described with reference to what are considered to be the specific embodiments, it is to be understood that the invention is not limited to such embodiments. To the contrary, the invention is intended to cover various modifications and equivalents included within the spirit and scope of the appended claims. For example, while the disclosure focuses on the identification of various breast cancer associated genes and gene sets, and on the diagnosis and treatment of breast cancer, similar genes, gene sets and methods concerning other types of cancer are specifically within the scope herein.

TABLE 1

1. ADD3 (adducin 3 gamma)*
2. AKT1/Protein Kinase B
3. AKT 2
4. AKT 3
5. Aldehyde dehydrogenase 1A1
6. Aldehyde dehydrogenase 1A3
7. amphiregulin
8. APC
9. ARG
10. ATM
11. Bak
12. Bax
13. Bcl2
14. Bcl-xl
15. BRK
16. BCRP
17. BRCA-1
18. BRCA-2
19. Caspase-3
20. Cathepsin B
21. Cathepsin G
22. Cathepsin L
23. CD3
24. CD9
25. CD18
26. CD31
27. CD44^
28. CD68
29. CD82/KAI-1
30. Cdc25A
31. Cdc25B
32. CGA
33. COX2
34. CSF-1
35. CSF-1R/fms
36. cIAP1
37. cIAP2
38. c-abl
39. c-kit
40. c-kit L
41. c-met
42. c-myc
43. cN-1
44. cryptochrome1*
45. c-Src
46. Cyclin D1
47. CYP1B1
48. CYP2C9*
49. Cytokeratin 5^
50. Cytokeratin 17^
51. Cytokeratin 18^
52. DAP-Kinase-1
53. DHFR
54. DIABLO
55. Dihydropyrimidine dehydrogenase
56. EGF
57. ECadherin/CDH1^
58. ELF 3*
59. Endothelin
60. Epiregulin
61. ER-alpha^
62. ErbB-1
63. ErbB-2^
64. ErbB-3
65. ErbB-4
66. ER-Beta
67. Eukaryotic Translation Initiation Factor 4B*(EIF4B)
68. EIF4E
69. farnesyl pyrophosphate synthetase
70. FAS (CD95)
71. FasL
72. FGF R 1*
73. FGF2 [bFGF]
74. 53BP1
75. 53BP2
76. GALC (galactosylceramidase)*
77. Gamma-GCS (glutamyl cysteine synthetase)
78. GATA3^
79. geranyl geranyl pyrophosphate synthetase
80. G-CSF
81. GPC3
82. gravin* [AK AP258]
83. GRO1 oncogene alpha^
84. Grb7^
85. GST-alpha
86. GST-pi^
87. Ha-Ras
88. HB-EGF
89. HE4-extracellular Proteinase Inhibitor Homologue*
90. hepatocyte nuclear factor 3^
91. HER-2
92. HGF/Scatter factor
93. hIAP1
94. hIAP2
95. HIF-1
96. human kallikrein 10
97. MLH1
98. hsp 27
99. human chorionic gonadotropin/CGA
100. Human Extracellular Protein S1-5
101. Id-1
102. Id-2
103. Id-3
104. IGF-1
105. IGF2
106. IGF1R
107. IGFBP3
108. interstitial integrin alpha 7
109. IL6
110. IL8
111. IRF-2*
112. IRF9 Protein
113. Kallikrein 5
114. Kallikrein 6
115. KDR
116. Ki-67/MiB1
117. lipoprotein lipase^
118. LIV1
119. Lung Resistance Protein/MVP
120. Lot1
121. Maspin
122. MCM2
123. MCM3
124. MCM7
125. MCP-1
126. microtubule-associated protein 4
127. MCJ
128. mdm2
129. MDR-1
130. microsomal epoxide hydrolase
131. MMP9
132. MRP1
133. MRP2
134. MRP3
135. MRP4
136. MSN (Moesin)*
137. mTOR
138. Muc1/CA 15-3
139. NF-kB
140. P14ARF
141. P16INK4a/p14
142. p21wAF1/CIP1
143. p23
144. p27
145. p311*
146. p53
147. PAI1
148. PCNA
149. PDGF-A
150. PDGF-B
151. PDGF-C
152. PDGF-D
153. PDGFR-α
154. PDGFR-β
155. PI3K
156. Pin1
157. PKC-ε
158. Pkc-δ
159. PLAG1 (pleiomorphic aden 1)*
160. PREP prolyl endopeptidase
161. Progesterone receptor
162. pS2/trefoil factor 1
163. PTEN
164. PTP1b
165. RAR-alpha
166. RAR-beta2
167. RCP
168. Reduced Folate Carrier
169. Retinol binding protein 4^
170. STK15/BTAK
171. Survivin
172. SXR
173. Syk
174. TGD (thymine-DNA glycosylase)*
175. TGFalpha
176. Thymidine Kinase
177. Thymidine phosphorylase
178. Thymidylate Synthase
179. Topoisomerase II-α
180. Topoisomerase II-β
181. TRAMP
182. UPA
183. VEGF
184. Vimentin
185. WTH3
186. XAF1
187. XIAP
188. XIST
189. XPA
190. YB-1

*NCI 60 drug Sens./Resist Marker
^In Cluster Defining tumor subclass
Jan. 19, 2002

TABLE 2

                          Forward Primer  Reverse Primer  Amplicon
Gene       Accession No.  SEQ ID NO.      SEQ ID NO.      SEQ ID NO.
ABCB1      NM_000927      1               2               3
ABCC1      NM_004996      4               5               6
ABCC2      NM_000392      7               8               9
ABCC3      NM_003786      10              11              12
ABCC4      NM_005845      13              14              15
ABL1       NM_005157      16              17              18
ABL2       NM_005158      19              20              21
ACTB       NM_001101      22              23              24
AKT1       NM_005163      25              26              27
AKT3       NM_005465      28              29              30
ALDH1      NM_000689      31              32              33
ALDH1A3    NM_000693      34              35              36
APC        NM_000038      37              38              39
AREG       NM_001657      40              41              42
B2M        NM_004048      43              44              45
BAK1       NM_001188      46              47              48
BAX        NM_004324      49              50              51
BCL2       NM_000633      52              53              54
BCL2L1     NM_001191      55              56              57
BIRC3      NM_001165      58              59              60
BIRC4      NM_001167      61              62              63
BIRC5      NM_001168      64              65              66
BRCA1      NM_007295      67              68              69
BRCA2      NM_000059      70              71              72
CCND1      NM_001758      73              74              75
CD3Z       NM_000734      76              77              78
CD68       NM_001251      79              80              81
CDC25A     NM_001789      82              83              84
CDH1       NM_004360      85              86              87
CDKN1A     NM_000389      88              89              90
CDKN1B     NM_004064      91              92              93
CDKN2A     NM_000077      94              95              96
CYP1B1     NM_000104      97              98              99
DHFR       NM_000791      100             101             102
DPYD       NM_000110      103             104             105
ECGF1      NM_001953      106             107             108
EGFR       NM_005228      109             110             111
EIF4E      NM_001968      112             113             114
ERBB2      NM_004448      115             116             117
ERBB3      NM_001982      118             119             120
ESR1       NM_000125      121             122             123
ESR2       NM_001437      124             125             126
GAPD       NM_002046      127             128             129
GATA3      NM_002051      130             131             132
GRB7       NM_005310      133             134             135
GRO1       NM_001511      136             137             138
GSTP1      NM_000852      139             140             141
GUSB       NM_000181      142             143             144
hHGF       M29145         145             146             147
HNF3A      NM_004496      148             149             150
ID2        NM_002166      151             152             153
IGF1       NM_000618      154             155             156
IGFBP3     NM_000598      157             158             159
ITGA7      NM_002206      160             161             162
ITGB2      NM_000211      163             164             165
KDR        NM_002253      166             167             168
KIT        NM_000222      169             170             171
KITLG      NM_000899      172             173             174
KRT17      NM_000422      175             176             177
KRT5       NM_000424      178             179             180
LPL        NM_000237      181             182             183
MET        NM_000245      184             185             186
MKI67      NM_002417      187             188             189
MVP        NM_017458      190             191             192
MYC        NM_002467      193             194             195
PDGFA      NM_002607      196             197             198
PDGFB      NM_002608      199             200             201
PDGFC      NM_016205      202             203             204
PDGFRA     NM_006206      205             206             207
PDGFRB     NM_002609      208             209             210
PGK1       NM_000291      211             212             213
PGR        NM_000926      214             215             216
PIN1       NM_006221      217             218             219
PLAU       NM_002658      220             221             222
PPIH       NM_006347      223             224             225
PTEN       NM_000314      226             227             228
PTGS2      NM_000963      229             230             231
RBP4       NM_006744      232             233             234
RELA       NM_021975      235             236             237
RPL19      NM_000981      238             239             240
RPLP0      NM_001002      241             242             243
SCDGF-B    NM_025208      244             245             246
SERPINE1   NM_000602      247             248             249
SLC19A1    NM_003056      250             251             252
TBP        NM_003194      253             254             255
TFF1       NM_003225      256             257             258
TFRC       NM_003234      259             260             261
TK1        NM_003258      262             263             264
TNFRSF6    NM_000043      265             266             267
TNFSF6     NM_000639      268             269             270
TOP2A      NM_001067      271             272             273
TOP2B      NM_001068      274             275             276
TP53       NM_000546      277             278             279
TYMS       NM_001071      280             281             282
VEGF       NM_003376      283             284             285

TABLE 3

GENE         ACCESSION NO.  SEQ ID NO:
AK055699     AK055699       286
BAG1         NM_004323      287
BBC3         NM_014417      288
Bcl2         NM_000633      289
BRCA2        NM_000059      290
CA9          NM_001216      291
CCNB1        NM_031966      292
CDC25B       NM_021874      293
CEGP1        NM_020974      294
Chk1         NM_001274      295
Chk2         NM_007194      296
CYP3A4       NM_017460      297
DIABLO       NM_019887      298
DPYD         NM_000110      299
EGFR         NM_005228      300
EpCAM        NM_002354      301
EPHX1        NM_000120      302
EstR1        NM_000125      303
FOXM1        NM_021953      304
GATA3        NM_002051      305
GSTM1        NM_000561      306
GSTM3        NM_000849      307
hENT1        NM_004955      308
HIF1A        NM_001530      309
HNF3A        NM_004496      310
ID1          NM_002165      311
IGF1R        NM_000875      312
Ki-67        NM_002417      313
NFKBp65      NM_021975      314
NME1         NM_000269      315
p27          NM_004064      316
PI3KC2A      NM_002645      317
PR           NM_000926      318
PRAME        NM_006115      319
pS2          NM_003225      320
RPS6KB1      NM_003161      321
Src          NM_004383      322
STK15        NM_003600      323
SURV         NM_001168      324
TFRC         NM_003234      325
TGFB3        NM_003239      326
TK1          NM_003258      327
VDR          NM_000376      328
VEGFC        NM_005429      329
WISP1        NM_003882      330
XIAP         NM_001167      331
YB-1         NM_004559      332
ITGA7        NM_002206      333
PDGFB        NM_002608      334
Upa          NM_002658      335
TBP          NM_003194      336
PDGFRa       NM_006206      337
Pin1         NM_006221      338
CYP          NM_006347      339
RBP4         NM_006744      340
BRCA1        NM_007295      341
APC          NM_000038      342
GUS          NM_000181      343
CD18         NM_000211      344
PTEN         NM_000314      345
P53          NM_000546      346
ALDH1A3      NM_000693      347
GSTp         NM_000852      348
TOP2B        NM_001068      349
TS           NM_001071      350
Bclx         NM_001191      351
AREG         NM_001657      352
TP           NM_001953      353
EIF4E        NM_001968      354
ErbB3        NM_001982      355
EREG         NM_001432      356
GCLC         NM_001498      357
CD9          NM_001769      358
HB-EGF       NM_001945      359
IGFBP2       NM_000597      360
CTSL         NM_001912      361
PREP         NM_002726      362
CYP3A4       NM_017460      363
ILT-2        NM_006669      364
MCM3         NM_002388      365
KRT19        NM_002276      366
KRT18        NM_000224      367
TIMP2        NM_003255      368
BAD          NM_004322      369
CYP2C8       NM_030878      370
DCR3         NM_016434      371
PLAUR        NM_002659      372
PI3KC2A      NM_002645      373
FGF2         NM_002006      374
HLA-G        NM_002127      375
AIB1         NM_006534      376
MCP1         NM_002982      377
Contig46653  Contig46653    378
RhoC         NM_005167      379
DR5          NM_003842      380
RAD51C       NM_058216      381
BIN1         NM_004305      382
VDR          NM_000376      383
TERC         U86046         384

Sequence CWU 1

1

384118DNAHomo sapiens 1gtcccaggag cccatcct 18219DNAHomo sapiens 2cccggctgtt gtctccata 19368DNAHomo sapiens 3gtcccaggag cccatcctgt ttgactgcag cattgctgag aacattgcct atggagacaa 60cagccggg 68418DNAHomo sapiens 4tcatggtgcc cgtcaatg 18523DNAHomo sapiens 5cgattgtctt tgctcttcat gtg 23679DNAHomo sapiens 6tcatggtgcc cgtcaatgct gtgatggcga tgaagaccaa gacgtatcag gtggcccaca 60tgaagagcaa agacaatcg 79720DNAHomo sapiens 7aggggatgac ttggacacat 20820DNAHomo sapiens 8aaaactgcat ggctttgtca 20965DNAHomo sapiens 9aggggatgac ttggacacat ctgccattcg acatgactgc aattttgaca aagccatgca 60gtttt 651022DNAHomo sapiens 10tcatcctggc gatctacttc ct 221120DNAHomo sapiens 11ccgttgagtg gaatcagcaa 201291DNAHomo sapiens 12tcatcctggc gatctacttc ctctggcaga acctaggtcc ctctgtcctg gctggagtcg 60ctttcatggt cttgctgatt ccactcaacg g 911320DNAHomo sapiens 13agcgcctgga atctacaact 201420DNAHomo sapiens 14agagcccctg gagagaagat 201566DNAHomo sapiens 15agcgcctgga atctacaact cggagtccag tgttttccca cttgtcatct tctctccagg 60ggctct 661624DNAHomo sapiens 16gcccagagaa ggtctatgaa ctca 241722DNAHomo sapiens 17gtttcaaagg cttggtggat tt 221894DNAHomo sapiens 18gcccagagaa ggtctatgaa ctcatgcgag catgttggca gtggaatccc tctgaccggc 60cctcctttgc tgaaatccac caagcctttg aaac 941921DNAHomo sapiens 19cgcagtgcag ctgagtatct g 212021DNAHomo sapiens 20tgcccagggc tactctcact t 212180DNAHomo sapiens 21cgcagtgcag ctgagtatct gctcagcagt ctaatcaatg gcagcttcct ggtgcgagaa 60agtgagagta gccctgggca 802221DNAHomo sapiens 22cagcagatgt ggatcagcaa g 212318DNAHomo sapiens 23gcatttgcgg tggacgat 182466DNAHomo sapiens 24cagcagatgt ggatcagcaa gcaggagtat gacgagtccg gcccctccat cgtccaccgc 60aaatgc 662520DNAHomo sapiens 25cgcttctatg gcgctgagat 202620DNAHomo sapiens 26tcccggtaca ccacgttctt 202771DNAHomo sapiens 27cgcttctatg gcgctgagat tgtgtcagcc ctggactacc tgcactcgga gaagaacgtg 60gtgtaccggg a 712825DNAHomo sapiens 28ttgtctctgc cttggactat ctaca 252924DNAHomo sapiens 29ccagcattag attctccaac ttga 243075DNAHomo sapiens 30ttgtctctgc cttggactat ctacattccg gaaagattgt gtaccgtgat 
ctcaagttgg 60agaatctaat gctgg 753125DNAHomo sapiens 31gaaggagata aggaggatgt tgaca 253218DNAHomo sapiens 32cgccacggag atccaatc 183374DNAHomo sapiens 33gaaggagata aggaggatgt tgacaaggca gtgaaggccg caagacaggc ttttcagatt 60ggatctccgt ggcg 743421DNAHomo sapiens 34tggtgaacat tgtgccagga t 213522DNAHomo sapiens 35gaaggcgatc ttgttgatct ga 223680DNAHomo sapiens 36tggtgaacat tgtgccagga ttcgggccca cagtgggagc agcaatttct tctcaccctc 60agatcaacaa gatcgccttc 803720DNAHomo sapiens 37ggacagcagg aatgtgtttc 203820DNAHomo sapiens 38acccactcga tttgtttctg 203969DNAHomo sapiens 39ggacagcagg aatgtgtttc tccatacagg tcacggggag ccaatggttc agaaacaaat 60cgagtgggt 694027DNAHomo sapiens 40tgtgagtgaa atgccttcta gtagtga 274127DNAHomo sapiens 41ttgtggttcg ttatcatact cttctga 274282DNAHomo sapiens 42tgtgagtgaa atgccttcta gtagtgaacc gtcctcggga gccgactatg actactcaga 60agagtatgat aacgaaccac aa 824319DNAHomo sapiens 43gtctcgctcc gtggcctta 194424DNAHomo sapiens 44cgtgagtaaa cctgaatctt tgga 244593DNAHomo sapiens 45gtctcgctcc gtggccttag ctgtgctcgc gctactctct ctttctggcc tggaggctat 60ccagcgtact ccaaagattc aggtttactc acg 934620DNAHomo sapiens 46ccattcccac cattctacct 204720DNAHomo sapiens 47gggaacatag acccaccaat 204866DNAHomo sapiens 48ccattcccac cattctacct gaggccagga cgtctggggt gtggggattg gtgggtctat 60gttccc 664918DNAHomo sapiens 49ccgccgtgga cacagact 185021DNAHomo sapiens 50ttgccgtcag aaaacatgtc a 215170DNAHomo sapiens 51ccgccgtgga cacagactcc ccccgagagg tctttttccg agtggcagct gacatgtttt 60ctgacggcaa 705225DNAHomo sapiens 52cagatggacc tagtacccac tgaga 255324DNAHomo sapiens 53cctatgattt aagggcattt ttcc 245473DNAHomo sapiens 54cagatggacc tagtacccac tgagatttcc acgccgaagg acagcgatgg gaaaaatgcc 60cttaaatcat agg 735524DNAHomo sapiens 55cttttgtgga actctatggg aaca 245619DNAHomo sapiens 56cagcggttga agcgttcct 195770DNAHomo sapiens 57cttttgtgga actctatggg aacaatgcag cagccgagag ccgaaagggc caggaacgct 60tcaaccgctg 705824DNAHomo sapiens 58ggatatttcc gtggctctta ttca 245925DNAHomo sapiens 59cttctcatca aggcagaaaa atctt 
256086DNAHomo sapiens 60ggatatttcc gtggctctta ttcaaactct ccatcaaatc ctgtaaactc cagagcaaat 60caagattttt ctgccttgat gagaag 866123DNAHomo sapiens 61gcagttggaa gacacaggaa agt 236221DNAHomo sapiens 62tgcgtggcac tattttcaag a 216377DNAHomo sapiens 63gcagttggaa gacacaggaa agtatcccca aattgcagat ttatcaacgg cttttatctt 60gaaaatagtg ccacgca 776420DNAHomo sapiens 64tgttttgatt cccgggctta 206524DNAHomo sapiens 65caaagctgtc agctctagca aaag 246680DNAHomo sapiens 66tgttttgatt cccgggctta ccaggtgaga agtgagggag gaagaaggca gtgtcccttt 60tgctagagct gacagctttg 806720DNAHomo sapiens 67tcagggggct agaaatctgt 206820DNAHomo sapiens 68ccattccagt tgatctgtgg 206965DNAHomo sapiens 69tcagggggct agaaatctgt tgctatgggc ccttcaccaa catgcccaca gatcaactgg 60aatgg 657020DNAHomo sapiens 70agttcgtgct ttgcaagatg 207120DNAHomo sapiens 71aaggtaagct gggtctgctg 207270DNAHomo sapiens 72agttcgtgct ttgcaagatg gtgcagagct ttatgaagca gtgaagaatg cagcagaccc 60agcttacctt 707321DNAHomo sapiens 73gcatgttcgt ggcctctaag a 217422DNAHomo sapiens 74cggtgtagat gcacagcttc tc 227569DNAHomo sapiens 75gcatgttcgt ggcctctaag atgaaggaga ccatccccct gacggccgag aagctgtgca 60tctacaccg 697620DNAHomo sapiens 76agatgaagtg gaaggcgctt 207721DNAHomo sapiens 77tgcctctgta atcggcaact g 217865DNAHomo sapiens 78agatgaagtg gaaggcgctt ttcaccgcgg ccatcctgca ggcacagttg ccgattacag 60aggca 657918DNAHomo sapiens 79tggttcccag ccctgtgt 188019DNAHomo sapiens 80ctcctccacc ctgggttgt 198174DNAHomo sapiens 81tggttcccag ccctgtgtcc acctccaagc ccagattcag attcgagtca tgtacacaac 60ccagggtgga ggag 748220DNAHomo sapiens 82tcttgctggc tacgcctctt 208321DNAHomo sapiens 83ctgcattgtg gcacagttct g 218471DNAHomo sapiens 84tcttgctggc tacgcctctt ctgtccctgt tagacgtcct ccgtccatat cagaactgtg 60ccacaatgca g 718521DNAHomo sapiens 85tgagtgtccc ccggtatctt c 218621DNAHomo sapiens 86cagccgcttt cagattttca t 218781DNAHomo sapiens 87tgagtgtccc ccggtatctt ccccgccctg ccaatcccga tgaaattgga aattttattg 60atgaaaatct gaaagcggct g 818821DNAHomo sapiens 88tggagactct cagggtcgaa a 218922DNAHomo sapiens 
89ggcgtttgga gtggtagaaa tc 229065DNAHomo sapiens 90tggagactct cagggtcgaa aacggcggca gaccagcatg acagatttct accactccaa 60acgcc 659121DNAHomo sapiens 91cggtggacca cgaagagtta a 219219DNAHomo sapiens 92ggctcgcctc ttccatgtc 199366DNAHomo sapiens 93cggtggacca cgaagagtta acccgggact tggagaagca ctgcagagac atggaagagg 60cgagcc 669419DNAHomo sapiens 94gcggaaggtc cctcagaca 199523DNAHomo sapiens 95tctaagtttc ccgaggtttc tca 239670DNAHomo sapiens 96gcggaaggtc cctcagacat ccccgattga aagaaccaga gaggctctga gaaacctcgg 60gaaacttaga 709722DNAHomo sapiens 97ccagctttgt gcctgtcact at 229820DNAHomo sapiens 98gggaatgtgg tagcccaaga 209971DNAHomo sapiens 99ccagctttgt gcctgtcact attcctcatg ccaccactgc caacacctct gtcttgggct 60accacattcc c 7110027DNAHomo sapiens 100ttgctataac taagtgcttc tccaaga 2710122DNAHomo sapiens 101gtggaatggc agctcactgt ag 2210273DNAHomo sapiens 102ttgctataac taagtgcttc tccaagaccc caactgagtc cccagcacct gctacagtga 60gctgccattc cac 7310319DNAHomo sapiens 103aggacgcaag gagggtttg 1910421DNAHomo sapiens 104gatgtccgcc gagtccttac t 2110587DNAHomo sapiens 105aggacgcaag gagggtttgt cactggcaga ctcgagactg taggcactgc catggcccct 60gtgctcagta aggactcggc ggacatc 8710624DNAHomo sapiens 106ctatatgcag ccagagatgt gaca 2410724DNAHomo sapiens 107ccacgagttt cttactgaga atgg 2410882DNAHomo sapiens 108ctatatgcag ccagagatgt gacagccacc gtggacagcc tgccactcat cacagcctcc 60attctcagta agaaactcgt gg 8210920DNAHomo sapiens 109tgtcgatgga cttccagaac 2011019DNAHomo sapiens 110attgggacag cttggatca 1911162DNAHomo sapiens 111tgtcgatgga cttccagaac cacctgggca gctgccaaaa gtgtgatcca agctgtccca 60at 6211223DNAHomo sapiens 112gatctaagat ggcgactgtc gaa 2311325DNAHomo sapiens 113ttagattccg ttttctcctc ttctg 2511482DNAHomo sapiens 114gatctaagat ggcgactgtc gaaccggaaa ccacccctac tcctaatccc ccgactacag 60aagaggagaa aacggaatct aa 8211520DNAHomo sapiens 115cggtgtgaga agtgcagcaa 2011619DNAHomo sapiens 116cctctcgcaa gtgctccat 1911770DNAHomo sapiens 117cggtgtgaga agtgcagcaa gccctgtgcc cgagtgtgct atggtctggg catggagcac 60ttgcgagagg 
7011823DNAHomo sapiens 118cggttatgtc atgccagata cac 2311924DNAHomo sapiens 119gaactgagac ccactgaaga aagg 2412081DNAHomo sapiens 120cggttatgtc atgccagata cacacctcaa aggtactccc tcctcccggg aaggcaccct 60ttcttcagtg ggtctcagtt c 8112119DNAHomo sapiens 121cgtggtgccc ctctatgac 1912219DNAHomo sapiens 122ggctagtggg cgcatgtag 1912368DNAHomo sapiens 123cgtggtgccc ctctatgacc tgctgctgga gatgctggac gcccaccgcc tacatgcgcc 60cactagcc 6812420DNAHomo sapiens 124tggtccatcg ccagttatca 2012523DNAHomo sapiens 125tgttctagcg atcttgcttc aca 2312676DNAHomo sapiens 126tggtccatcg ccagttatca catctgtatg cggaacctca aaagagtccc tggtgtgaag 60caagatcgct agaaca 7612724DNAHomo sapiens 127catccatgac aactttggta tcgt 2412821DNAHomo sapiens 128cagtcttctg ggtggcagtg a 2112974DNAHomo sapiens 129catccatgac aactttggta tcgtggaagg actcatgacc acagtccatg ccatcactgc 60cacccagaag actg 7413023DNAHomo sapiens 130caaaggagct cactgtggtg tct 2313126DNAHomo sapiens 131gagtcagaat ggcttattca cagatg 2613275DNAHomo sapiens 132caaaggagct cactgtggtg tctgtgttcc aaccactgaa tctggacccc atctgtgaat 60aagccattct gactc 7513320DNAHomo sapiens 133ccatctgcat ccatcttgtt 2013420DNAHomo sapiens 134ggccaccagg gtattatctg 2013567DNAHomo sapiens 135ccatctgcat ccatcttgtt tgggctcccc acccttgaga agtgcctcag ataataccct 60ggtggcc 6713623DNAHomo sapiens 136cgaaaagatg ctgaacagtg aca 2313720DNAHomo sapiens 137tcaggaacag ccaccagtga 2013873DNAHomo sapiens 138cgaaaagatg ctgaacagtg acaaatccaa ctgaccagaa gggaggagga agctcactgg 60tggctgttcc tga 7313920DNAHomo sapiens 139gagaccctgc tgtcccagaa 2014023DNAHomo sapiens 140ggttgtagtc agcgaaggag atc 2314176DNAHomo sapiens 141gagaccctgc tgtcccagaa ccagggaggc aagaccttca ttgtgggaga ccagatctcc 60ttcgctgact acaacc 7614220DNAHomo sapiens 142cccactcagt agccaagtca 2014320DNAHomo sapiens 143cacgcaggtg gtatcagtct 2014473DNAHomo sapiens 144cccactcagt agccaagtca caatgtttgg aaaacagccc gtttacttga gcaagactga 60taccacctgc gtg 7314524DNAHomo sapiens 145catcaaatgt cagccctgga gttc 2414626DNAHomo sapiens 146ttcctgtagg tctttacccc gatagc 
2614785DNAHomo sapiens 147catcaaatgt cagccctgga gttccatgat accacacgaa cacagctttt tgccttcgag 60ctatcggggt aaagacctac aggaa 8514824DNAHomo sapiens 148tccaggatgt taggaactgt gaag 2414922DNAHomo sapiens 149gcgtgtctgc gtagtagctg tt 2215073DNAHomo sapiens 150tccaggatgt taggaactgt gaagatggaa gggcatgaaa ccagcgactg gaacagctac 60tacgcagaca cgc 7315123DNAHomo sapiens 151aacgactgct actccaagct caa 2315222DNAHomo sapiens
152ggatttccat cttgctcacc tt 2215376DNAHomo sapiens 153aacgactgct actccaagct caaggagctg gtgcccagca tcccccagaa caagaaggtg 60agcaagatgg aaatcc 7615421DNAHomo sapiens 154tccggagctg tgatctaagg a 2115520DNAHomo sapiens 155cggacagagc gagctgactt 2015676DNAHomo sapiens 156tccggagctg tgatctaagg aggctggaga tgtattgcgc acccctcaag cctgccaagt 60cagctcgctc tgtccg 7615717DNAHomo sapiens 157acgcaccggg tgtctga 1715824DNAHomo sapiens 158tgccctttct tgatgatgat tatc 2415968DNAHomo sapiens 159acgcaccggg tgtctgatcc caagttccac cccctccatt caaagataat catcatcaag 60aaagggca 6816022DNAHomo sapiens 160ccattcaccc tgtgtaacag ga 2216121DNAHomo sapiens 161ccgaccctct aggttaaggc a 2116268DNAHomo sapiens 162ccattcaccc tgtgtaacag gaccccaagg acctgcctcc ccggaagtgc cttaacctag 60agggtcgg 6816320DNAHomo sapiens 163cgtcaggacc caccatgtct 2016424DNAHomo sapiens 164ggttaattgg tgacatcctc aaga 2416581DNAHomo sapiens 165cgtcaggacc caccatgtct gccccatcac gcggccgaga catggcttgg ccacagctct 60tgaggatgtc accaattaac c 8116623DNAHomo sapiens 166caaacgctga catgtacggt cta 2316718DNAHomo sapiens 167gctcgttggc gcactctt 1816888DNAHomo sapiens 168caaacgctga catgtacggt ctatgccatt cctcccccgc atcacatcca ctggtattgg 60cagttggagg aagagtgcgc caacgagc 8816925DNAHomo sapiens 169gaggcaactg cttatggctt aatta 2517018DNAHomo sapiens 170ggcactcggc ttgagcat 1817175DNAHomo sapiens 171gaggcaactg cttatggctt aattaagtca gatgcggcca tgactgtcgc tgtaaagatg 60ctcaagccga gtgcc 7517218DNAHomo sapiens 172gtccccggga tggatgtt 1817325DNAHomo sapiens 173gatcagtcaa gctgtctgac aattg 2517479DNAHomo sapiens 174gtccccggga tggatgtttt gccaagtcat tgttggataa gcgagatggt agtacaattg 60tcagacagct tgactgatc 7917521DNAHomo sapiens 175cgaggattgg ttcttcagca a 2117622DNAHomo sapiens 176actctgcacc agctcactgt tg 2217773DNAHomo sapiens 177cgaggattgg ttcttcagca agacagagga actgaaccgc gaggtggcca ccaacagtga 60gctggtgcag agt 7317820DNAHomo sapiens 178tcagtggaga aggagttgga 2017920DNAHomo sapiens 179tgccatatcc agaggaaaca 2018069DNAHomo sapiens 180tcagtggaga aggagttgga ccagtcaaca tctctgttgt 
cacaagcagt gtttcctctg 60gatatggca 6918126DNAHomo sapiens 181gtacaagaga gaaccagact ccaatg 2618218DNAHomo sapiens 182gtgtagcccg cggacact 1818387DNAHomo sapiens 183gtacaagaga gaaccagact ccaatgtcat tgtggtggac tggctgtcac gggctcagga 60gcattaccca gtgtccgcgg gctacac 8718422DNAHomo sapiens 184gacatttcca gtcctgcagt ca 2218520DNAHomo sapiens 185ctccgatcgc acacatttgt 2018686DNAHomo sapiens 186gacatttcca gtcctgcagt caatgcctct ctgccccacc ctttgttcag tgtggctggt 60gccacgacaa atgtgtgcga tcggag 8618724DNAHomo sapiens 187gttttggagg aaatgtgttc ttca 2418826DNAHomo sapiens 188ttctctaata cactgccgtc ttaagg 26189101DNAHomo sapiens 189gttttggagg aaatgtgttc ttcagtgcac agaatgcagc aaaacagcca tctgataaat 60gctctgcaag ccctccctta agacggcagt gtattagaga a 10119022DNAHomo sapiens 190acgagaacga gggcatctat gt 2219122DNAHomo sapiens 191gcatgtaggt gcttccaatc ac 2219275DNAHomo sapiens 192acgagaacga gggcatctat gtgcaggatg tcaagaccgg aaaggtgcgc gctgtgattg 60gaagcaccta catgc 7519321DNAHomo sapiens 193tccctccact cggaaggact a 2119422DNAHomo sapiens 194cggttgttgc tgatctgtct ca 2219584DNAHomo sapiens 195tccctccact cggaaggact atcctgctgc caagagggtc aagttggaca gtgtcagagt 60cctgagacag atcagcaaca accg 8419619DNAHomo sapiens 196ttgttggtgt gccctggtg 1919721DNAHomo sapiens 197tgggttctgt ccaaacactg g 2119867DNAHomo sapiens 198ttgttggtgt gccctggtgc cgtggtggcg gtcactccct ctgctgccag tgtttggaca 60gaaccca 6719920DNAHomo sapiens 199actgaaggag acccttggag 2020020DNAHomo sapiens 200taaataaccc tgcccacaca 2020162DNAHomo sapiens 201actgaaggag acccttggag cctaggggca tcggcaggag agtgtgtggg cagggttatt 60ta 6220228DNAHomo sapiens 202agttactaaa aaataccacg aggtcctt 2820321DNAHomo sapiens 203gtcggtgagt gatttgtgca a 2120479DNAHomo sapiens 204agttactaaa aaataccacg aggtccttca gttgagacca aagaccggtg tcaggggatt 60gcacaaatca ctcaccgac 7920520DNAHomo sapiens 205gggagtttcc aagagatgga 2020620DNAHomo sapiens 206cttcaaccac cttcccaaac 2020772DNAHomo sapiens 207gggagtttcc aagagatgga ctagtgcttg gtcgggtctt ggggtctgga gcgtttggga 60aggtggttga ag 7220823DNAHomo 
sapiens 208aggtgtcatc catcaacgtc tct 2320920DNAHomo sapiens 209tcccgatcac aatgcacatg 2021090DNAHomo sapiens 210aggtgtcatc catcaacgtc tctgtgaacg cagtgcagac tgtggtccgc cagggtgaga 60acatcaccct catgtgcatt gtgatcggga 9021124DNAHomo sapiens 211agagccagtt gctgtagaac tcaa 2421221DNAHomo sapiens 212ctgggcctac acagtccttc a 2121374DNAHomo sapiens 213agagccagtt gctgtagaac tcaaatctct gctgggcaag gatgttctgt tcttgaagga 60ctgtgtaggc ccag 7421426DNAHomo sapiens 214gaaatgactg catcgttgat aaaatc 2621519DNAHomo sapiens 215tgccagcctg acagcactt 1921678DNAHomo sapiens 216gaaatgactg catcgttgat aaaatccgca gaaaaaactg cccagcatgt cgccttagaa 60agtgctgtca ggctggca 7821720DNAHomo sapiens 217gatcaacggc tacatccaga 2021820DNAHomo sapiens 218tgaactgtga ggccagagac 2021968DNAHomo sapiens 219gatcaacggc tacatccaga agatcaagtc gggagaggag gactttgagt ctctggcctc 60acagttca 6822019DNAHomo sapiens 220gtggatgtgc cctgaagga 1922120DNAHomo sapiens 221ctgcggatcc agggtaagaa 2022270DNAHomo sapiens 222gtggatgtgc cctgaaggac aagccaggcg tctacacgag agtctcacac ttcttaccct 60ggatccgcag 7022327DNAHomo sapiens 223tggacttcta gtgatgagaa agattga 2722422DNAHomo sapiens 224cactgcgaga tcaccacagg ta 2222584DNAHomo sapiens 225tggacttcta gtgatgagaa agattgagaa tgttcccaca ggccccaaca ataagcccaa 60gctacctgtg gtgatctcgc agtg 8422625DNAHomo sapiens 226tggctaagtg aagatgacaa tcatg 2522725DNAHomo sapiens 227tgcacatatc attacaccag ttcgt 2522881DNAHomo sapiens 228tggctaagtg aagatgacaa tcatgttgca gcaattcact gtaaagctgg aaagggacga 60actggtgtaa tgatatgtgc a 8122923DNAHomo sapiens 229tctgcagagt tggaagcact cta 2323021DNAHomo sapiens 230gccgaggctt ttctaccaga a 2123179DNAHomo sapiens 231tctgcagagt tggaagcact ctatggtgac atcgatgctg tggagctgta tcctgccctt 60ctggtagaaa agcctcggc 7923224DNAHomo sapiens 232acgacacgta tgccgtacag tact 2423318DNAHomo sapiens 233ccgggaaaac acgaagga 1823486DNAHomo sapiens 234acgacacgta tgccgtacag tactcctgcc gcctcctgaa cctcgatggc acctgtgctg 60acagctactc cttcgtgttt tcccgg 8623519DNAHomo sapiens 235ctgccgggat ggcttctat 1923622DNAHomo 
sapiens 236ccaggttctg gaaactgtgg at 2223768DNAHomo sapiens 237ctgccgggat ggcttctatg aggctgagct ctgcccggac cgctgcatcc acagtttcca 60gaacctgg 6823820DNAHomo sapiens 238ccacaagctg aaggcagaca 2023921DNAHomo sapiens 239gcgtgcttcc ttggtcttag a 2124085DNAHomo sapiens 240ccacaagctg aaggcagaca aggcccgcaa gaagctcctg gctgaccagg ctgaggcccg 60caggtctaag accaaggaag cacgc 8524124DNAHomo sapiens 241ccattctatc atcaacgggt acaa 2424223DNAHomo sapiens 242tcagcaagtg ggaaggtgta atc 2324375DNAHomo sapiens 243ccattctatc atcaacgggt acaaacgagt cctggccttg tctgtggaga cggattacac 60cttcccactt gctga 7524420DNAHomo sapiens 244tatcgaggca ggtcatacca 2024520DNAHomo sapiens 245taacgcttgg catcatcatt 2024674DNAHomo sapiens 246tatcgaggca ggtcatacca tgaccggaag tcaaaagttg acctggatag gctcaatgat 60gatgccaagc gtta 7424719DNAHomo sapiens 247ccgcaacgtg gttttctca 1924821DNAHomo sapiens 248tgctgggttt ctcctcctgt t 2124981DNAHomo sapiens 249ccgcaacgtg gttttctcac cctatggggt ggcctcggtg ttggccatgc tccagctgac 60aacaggagga gaaacccagc a 8125025DNAHomo sapiens 250tcaagaccat catcactttc attgt 2525127DNAHomo sapiens 251ggatcaggaa gtacacggag tataact 2725296DNAHomo sapiens 252tcaagaccat catcactttc attgtctcgg acgtgcgggg cctgggcctc ccggtccgca 60agcagttcca gttatactcc gtgtacttcc tgatcc 9625319DNAHomo sapiens 253gcccgaaacg ccgaatata 1925423DNAHomo sapiens 254cgtggctctc ttatcctcat gat 2325565DNAHomo sapiens 255gcccgaaacg ccgaatataa tcccaagcgg tttgctgcgg taatcatgag gataagagag 60ccacg 6525619DNAHomo sapiens 256gccctcccag tgtgcaaat 1925725DNAHomo sapiens 257cgtcgatggt attaggatag aagca 2525886DNAHomo sapiens 258gccctcccag tgtgcaaata agggctgctg tttcgacgac accgttcgtg gggtcccctg 60gtgcttctat cctaatacca tcgacg 8625927DNAHomo sapiens 259caagctagat cagcattctc taacttg 2726025DNAHomo sapiens 260cacatgactg ttatcgccat ctact 2526199DNAHomo sapiens 261caagctagat cagcattctc taacttgttt ggtggagaac cattgtcata tacccggttc 60agcctggctc ggcaagtaga tggcgataac agtcatgtg 9926222DNAHomo sapiens 262cacaggaaca acagcatctt tc 2226320DNAHomo sapiens 263agataagccc 
ctgggatcca 2026475DNAHomo sapiens 264cacaggaaca acagcatctt tcaccaagat gggtggcacc aaccttgctg ggacttggat 60cccaggggct tatct 7526521DNAHomo sapiens 265ggattgctca acaaccatgc t 2126624DNAHomo sapiens 266ggcattaaca cttttggacg ataa 2426791DNAHomo sapiens 267ggattgctca acaaccatgc tgggcatctg gaccctccta cctctggttc ttacgtctgt 60tgctagatta tcgtccaaaa gtgttaatgc c 9126824DNAHomo sapiens 268gcactttggg attctttcca ttat 2426924DNAHomo sapiens 269gcatgtaaga agaccctcac tgaa 2427080DNAHomo sapiens 270gcactttggg attctttcca ttatgattct ttgttacagg caccgagaat gttgtattca 60gtgagggtct tcttacatgc 8027120DNAHomo sapiens 271aatccaaggg ggagagtgat 2027220DNAHomo sapiens 272gtacagattt tgcccgagga 2027372DNAHomo sapiens 273aatccaaggg ggagagtgat gacttccata tggactttga ctcagctgtg gctcctcggg 60caaaatctgt ac 7227421DNAHomo sapiens 274tgtggacatc ttcccctcag a 2127518DNAHomo sapiens 275ctagcccgac cggttcgt 1827666DNAHomo sapiens 276tgtggacatc ttcccctcag acttccctac tgagccacct tctctgccac gaaccggtcg 60ggctag 6627720DNAHomo sapiens 277ctttgaaccc ttgcttgcaa 2027818DNAHomo sapiens 278cccgggacaa agcaaatg 1827968DNAHomo sapiens 279ctttgaaccc ttgcttgcaa taggtgtgcg tcagaagcac ccaggacttc catttgcttt 60gtcccggg 6828018DNAHomo sapiens 280gcctcggtgt gcctttca 1828119DNAHomo sapiens 281cgtgatgtgc gcaatcatg 1928265DNAHomo sapiens 282gcctcggtgt gcctttcaac atcgccagct acgccctgct cacgtacatg attgcgcaca 60tcacg 6528320DNAHomo sapiens 283ctgctgtctt gggtgcattg 2028418DNAHomo sapiens 284gcagcctggg accacttg 1828571DNAHomo sapiens 285ctgctgtctt gggtgcattg gagccttgcc ttgctgctct acctccacca tgccaagtgg 60tcccaggctg c 712861947DNAHomo sapiens 286ttttccccag atatggggtt ctattcagcc atagataatc tagacagagg atttcagaat 60gaaaggaaaa atgtgtggag attagtccta gttcattctg agggccgact aagtggctca 120gccagcttct tactccatct gcagttcata ctgccaaaga gctcccactt ccaaatcccc 180agtgacttta tggagaagat tctgcattaa attgtctttc gaatgatggg gaagcaaggc 240ataatatgcg atgatgagga gaaagtagac cagtgaggtg attgcaagac taacaaggag 300actcaatggg aagtttttct ttcttttaga tattgctttt gaagtagatg 
gtaaaatttt 360tgtcatcctt cttgtatttt ttgtacccca agttacaatt tttcttcttc cttgtaaata 420atttaaacag tatttatttt tgtaaggcat aactagaaac taaaatatat tctaaaaaat 480tcattattct gaacaaagtg atcaaattag aatacatatt tttcaacagt ggtagagctt 540ttaatatatg tttattgaaa gttatctata atacttgcac cagtgttgaa aaaagttaac 600atgtaggcaa gagcaatatg tttgtctcaa ggatttttcc atggtttcct cagtgatggt 660gtcctggaat tattcaggtg gtgaccatca ctggtctaag tttgtgtgca gggttttcag 720acgtgttttt gtgaaacttg gtagaaccat ggctaataaa gaggacagtg ttgtcagggt 780ccatctgccc tccatagaaa aatgtctctg gctcataaaa tgagactccc tcagggacta 840aatatgaact gacagcagta actctgatac agaataatct aaattgcatc aaatggcctt 900aattcagagt ttgttaggct tatcagtatg ttgcttttaa ttggggtggg aaagtagagg 960gagagaaagc aagacattta ttaagcacct cgtatgtgcc aggcactatg ctaagcactt 1020tacataagtt aggattaatc cctgcaagaa tcctataaag aatgttacta gcatttacac 1080ttcccaaatg aaggtaccaa agctcaaacg caatgttgtg aagctgtttc cttcagattt 1140aggttatgtg ggatgatgtg ggattgaaga ggaaagaaag gtgggattat ccccctagga 1200agactttcag gcctgacttc ataggaattc atccatctta tcatgtggag tttatctcac 1260cctgctgttg caggatgcta tttgcatgtg tccccaggtg atgttttttc tttggggagt 1320aggggtttgg cttcctcatt catccctctt gctaaaagag gagatagttg atgttgcatc 1380taaagatgct ataagacaat gaaagtttga tgttgtacat acctacaagt accatttttg 1440tgcatgatta cactccactg acatcttcca agtactgcat gtgattgaat aagaaacaag 1500aaagtgacca caccaaagcc tccctggctg gtgtacaggg atcaggtcca cagtggtaca 1560gattcaacca ccacccaggg agtgcttgca gactctgcat agatgttgct gcatgcgtcc 1620catgtgcctg tcagaatggc
agtgtttaat tctcttgaaa gaaagttatt tgctcactat 1680ccccagcctc aaggagccaa ggaagagtca ttcacatgga aggtccgggt ctggtcagcc 1740actctgactt ttctaccaca ttaaattctc cattacatct cactattggt aatggcttaa 1800gtgtaaagag ccatgatgtg tatattaagc tatgtgccac atatttattt ttagactctc 1860cacagcattc atgtcaatat gggattaatg cctaaacttt gtaaatattg tacagtttgt 1920aaatcaatga ataaaggttt tgagtgt 19472871311DNAHomo sapiens 287tagtcgggcg gggttgtgag acgccgcgct cagcttccat cgctgggcgg tcaacaagtg 60cgggcctggc tcagcgcggg ggggcgcgga gaccgcgagg cgaccgggag cggctgggtt 120cccggctgcg cgcccttcgg ccaggccggg agccgcgcca gtcggagccc ccggcccagc 180gtggtccgcc tccctctcgg cgtccacctg cccggagtac tgccagcggg catgaccgac 240ccaccagggg cgccgccgcc ggcgctcgca ggccgcggat gaagaagaaa acccggcgcc 300gctcgacccg gagcgaggag ttgacccgga gcgaggagtt gaccctgagt gaggaagcga 360cctggagtga agaggcgacc cagagtgagg aggcgaccca gggcgaagag atgaatcgga 420gccaggaggt gacccgggac gaggagtcga cccggagcga ggaggtgacc agggaggaaa 480tggcggcagc tgggctcacc gtgactgtca cccacagcaa tgagaagcac gaccttcatg 540ttacctccca gcagggcagc agtgaaccag ttgtccaaga cctggcccag gttgttgaag 600aggtcatagg ggttccacag tcttttcaga aactcatatt taagggaaaa tctctgaagg 660aaatggaaac accgttgtca gcacttggaa tacaagatgg ttgccgggtc atgttaattg 720ggaaaaagaa cagtccacag gaagaggttg aactaaagaa gttgaaacat ttggagaagt 780ctgtggagaa gatagctgac cagctggaag agttgaataa agagcttact ggaatccagc 840agggttttct gcccaaggat ttgcaagctg aagctctctg caaacttgat aggagagtaa 900aagccacaat agagcagttt atgaagatct tggaggagat tgacacactg atcctgccag 960aaaatttcaa agacagtaga ttgaaaagga aaggcttggt aaaaaaggtt caggcattcc 1020tagccgagtg tgacacagtg gagcagaaca tctgccagga gactgagcgg ctgcagtcta 1080caaactttgc cctggccgag tgaggtgtag cagaaaaagg ctgtgctgcc ctgaagaatg 1140gcgccaccag ctctgccgtc tctggatcgg aatttacctg atttcttcag ggctgctggg 1200ggcaactggc catttgccaa ttttcctact ctcacactgg ttctcaatga aaaatagtgt 1260ctttgtgatt tgagtaaagc tcctattctg tttttcacaa aaaaaaaaaa a 1311288582DNAHomo sapiens 288atggcccgcg cacgccagga gggcagctcc ccggagcccg tagagggcct ggcccgcgac 
60ggcccgcgcc ccttcccgct cggccgcctg gtgccctcgg cagtgtcctg cggcctctgc 120gagcccggcc tggctgccgc ccccgccgcc cccaccctgc tgcccgctgc ctacctctgc 180gcccccaccg ccccacccgc cgtcaccgcc gccctggggg gttcccgctg gcctgggggt 240ccccgcagcc ggccccgagg cccgcgcccg gacggtcctc agccctcgct ctcgctggcg 300gagcagcacc tggagtcgcc cgtgcccagc gccccggggg ctctggcggg cggtcccacc 360caggcggccc cgggagtccg cggggaggag gaacagtggg cccgggagat cggggcccag 420ctgcggcgga tggcggacga cctcaacgca cagtacgagc ggcggagaca agaggagcag 480cagcggcacc gcccctcacc ctggagggtc ctgtacaatc tcatcatggg actcctgccc 540ttacccaggg gccacagagc ccccgagatg gagcccaatt ag 5822896030DNAHomo sapiens 289gttggccccc gttacttttc ctctgggaaa tatggcgcac gctgggagaa cagggtacga 60taaccgggag atagtgatga agtacatcca ttataagctg tcgcagaggg gctacgagtg 120ggatgcggga gatgtgggcg ccgcgccccc gggggccgcc cccgcgccgg gcatcttctc 180ctcgcagccc gggcacacgc cccatacagc cgcatcccgg gacccggtcg ccaggacctc 240gccgctgcag accccggctg cccccggcgc cgccgcgggg cctgcgctca gcccggtgcc 300acctgtggtc cacctgaccc tccgccaggc cggcgacgac ttctcccgcc gctaccgccg 360cgacttcgcc gagatgtcca ggcagctgca cctgacgccc ttcaccgcgc ggggacgctt 420tgccacggtg gtggaggagc tcttcaggga cggggtgaac tgggggagga ttgtggcctt 480ctttgagttc ggtggggtca tgtgtgtgga gagcgtcaac cgggagatgt cgcccctggt 540ggacaacatc gccctgtgga tgactgagta cctgaaccgg cacctgcaca cctggatcca 600ggataacgga ggctgggatg cctttgtgga actgtacggc cccagcatgc ggcctctgtt 660tgatttctcc tggctgtctc tgaagactct gctcagtttg gccctggtgg gagcttgcat 720caccctgggt gcctatctgg gccacaagtg aagtcaacat gcctgcccca aacaaatatg 780caaaaggttc actaaagcag tagaaataat atgcattgtc agtgatgttc catgaaacaa 840agctgcaggc tgtttaagaa aaaataacac acatataaac atcacacaca cagacagaca 900cacacacaca caacaattaa cagtcttcag gcaaaacgtc gaatcagcta tttactgcca 960aagggaaata tcatttattt tttacattat taagaaaaaa agatttattt atttaagaca 1020gtcccatcaa aactcctgtc tttggaaatc cgaccactaa ttgccaagca ccgcttcgtg 1080tggctccacc tggatgttct gtgcctgtaa acatagattc gctttccatg ttgttggccg 1140gatcaccatc tgaagagcag acggatggaa aaaggacctg atcattgggg 
aagctggctt 1200tctggctgct ggaggctggg gagaaggtgt tcattcactt gcatttcttt gccctggggg 1260ctgtgatatt aacagaggga gggttcctgt ggggggaagt ccatgcctcc ctggcctgaa 1320gaagagactc tttgcatatg actcacatga tgcatacctg gtgggaggaa aagagttggg 1380aacttcagat ggacctagta cccactgaga tttccacgcc gaaggacagc gatgggaaaa 1440atgcccttaa atcataggaa agtatttttt taagctacca attgtgccga gaaaagcatt 1500ttagcaattt atacaatatc atccagtacc ttaagccctg attgtgtata ttcatatatt 1560ttggatacgc accccccaac tcccaatact ggctctgtct gagtaagaaa cagaatcctc 1620tggaacttga ggaagtgaac atttcggtga cttccgcatc aggaaggcta gagttaccca 1680gagcatcagg ccgccacaag tgcctgcttt taggagaccg aagtccgcag aacctgcctg 1740tgtcccagct tggaggcctg gtcctggaac tgagccgggg ccctcactgg cctcctccag 1800ggatgatcaa cagggcagtg tggtctccga atgtctggaa gctgatggag ctcagaattc 1860cactgtcaag aaagagcagt agaggggtgt ggctgggcct gtcaccctgg ggccctccag 1920gtaggcccgt tttcacgtgg agcatgggag ccacgaccct tcttaagaca tgtatcactg 1980tagagggaag gaacagaggc cctgggccct tcctatcaga aggacatggt gaaggctggg 2040aacgtgagga gaggcaatgg ccacggccca ttttggctgt agcacatggc acgttggctg 2100tgtggccttg gcccacctgt gagtttaaag caaggcttta aatgactttg gagagggtca 2160caaatcctaa aagaagcatt gaagtgaggt gtcatggatt aattgacccc tgtctatgga 2220attacatgta aaacattatc ttgtcactgt agtttggttt tatttgaaaa cctgacaaaa 2280aaaaagttcc aggtgtggaa tatgggggtt atctgtacat cctggggcat taaaaaaaaa 2340atcaatggtg gggaactata aagaagtaac aaaagaagtg acatcttcag caaataaact 2400aggaaatttt tttttcttcc agtttagaat cagccttgaa acattgatgg aataactctg 2460tggcattatt gcattatata ccatttatct gtattaactt tggaatgtac tctgttcaat 2520gtttaatgct gtggttgata tttcgaaagc tgctttaaaa aaatacatgc atctcagcgt 2580ttttttgttt ttaattgtat ttagttatgg cctatacact atttgtgagc aaaggtgatc 2640gttttctgtt tgagattttt atctcttgat tcttcaaaag cattctgaga aggtgagata 2700agccctgagt ctcagctacc taagaaaaac ctggatgtca ctggccactg aggagctttg 2760tttcaaccaa gtcatgtgca tttccacgtc aacagaattg tttattgtga cagttatatc 2820tgttgtccct ttgaccttgt ttcttgaagg tttcctcgtc cctgggcaat tccgcattta 2880attcatggta ttcaggatta 
catgcatgtt tggttaaacc catgagattc attcagttaa 2940aaatccagat ggcaaatgac cagcagattc aaatctatgg tggtttgacc tttagagagt 3000tgctttacgt ggcctgtttc aacacagacc cacccagagc cctcctgccc tccttccgcg 3060ggggctttct catggctgtc cttcagggtc ttcctgaaat gcagtggtgc ttacgctcca 3120ccaagaaagc aggaaacctg tggtatgaag ccagacctcc ccggcgggcc tcagggaaca 3180gaatgatcag acctttgaat gattctaatt tttaagcaaa atattatttt atgaaaggtt 3240tacattgtca aagtgatgaa tatggaatat ccaatcctgt gctgctatcc tgccaaaatc 3300attttaatgg agtcagtttg cagtatgctc cacgtggtaa gatcctccaa gctgctttag 3360aagtaacaat gaagaacgtg gacgctttta atataaagcc tgttttgtct tctgttgttg 3420ttcaaacggg attcacagag tatttgaaaa atgtatatat attaagaggt cacgggggct 3480aattgctggc tggctgcctt ttgctgtggg gttttgttac ctggttttaa taacagtaaa 3540tgtgcccagc ctcttggccc cagaactgta cagtattgtg gctgcacttg ctctaagagt 3600agttgatgtt gcattttcct tattgttaaa aacatgttag aagcaatgaa tgtatataaa 3660agcctcaact agtcattttt ttctcctctt cttttttttc attatatcta attattttgc 3720agttgggcaa cagagaacca tccctatttt gtattgaaga gggattcaca tctgcatctt 3780aactgctctt tatgaatgaa aaaacagtcc tctgtatgta ctcctcttta cactggccag 3840ggtcagagtt aaatagagta tatgcacttt ccaaattggg gacaagggct ctaaaaaaag 3900ccccaaaagg agaagaacat ctgagaacct cctcggccct cccagtccct cgctgcacaa 3960atactccgca agagaggcca gaatgacagc tgacagggtc tatggccatc gggtcgtctc 4020cgaagatttg gcaggggcag aaaactctgg caggcttaag atttggaata aagtcacaga 4080atcaaggaag cacctcaatt tagttcaaac aagacgccaa cattctctcc acagctcact 4140tacctctctg tgttcagatg tggccttcca tttatatgtg atctttgttt tattagtaaa 4200tgcttatcat ctaaagatgt agctctggcc cagtgggaaa aattaggaag tgattataaa 4260tcgagaggag ttataataat caagattaaa tgtaaataat cagggcaatc ccaacacatg 4320tctagctttc acctccagga tctattgagt gaacagaatt gcaaatagtc tctatttgta 4380attgaactta tcctaaaaca aatagtttat aaatgtgaac ttaaactcta attaattcca 4440actgtacttt taaggcagtg gctgttttta gactttctta tcacttatag ttagtaatgt 4500acacctactc tatcagagaa aaacaggaaa ggctcgaaat acaagccatt ctaaggaaat 4560tagggagtca gttgaaattc tattctgatc ttattctgtg gtgtcttttg 
cagcccagac 4620aaatgtggtt acacactttt taagaaatac aattctacat tgtcaagctt atgaaggttc 4680caatcagatc tttattgtta ttcaatttgg atctttcagg gatttttttt ttaaattatt 4740atgggacaaa ggacatttgt tggaggggtg ggagggagga acaattttta aatataaaac 4800attcccaagt ttggatcagg gagttggaag ttttcagaat aaccagaact aagggtatga 4860aggacctgta ttggggtcga tgtgatgcct ctgcgaagaa ccttgtgtga caaatgagaa 4920acattttgaa gtttgtggta cgacctttag attccagaga catcagcatg gctcaaagtg 4980cagctccgtt tggcagtgca atggtataaa tttcaagctg gatatgtcta atgggtattt 5040aaacaataaa tgtgcagttt taactaacag gatatttaat gacaaccttc tggttggtag 5100ggacatctgt ttctaaatgt ttattatgta caatacagaa aaaaatttta taaaattaag 5160caatgtgaaa ctgaattgga gagtgataat acaagtcctt tagtcttacc cagtgaatca 5220ttctgttcca tgtctttgga caaccatgac cttggacaat catgaaatat gcatctcact 5280ggatgcaaag aaaatcagat ggagcatgaa tggtactgta ccggttcatc tggactgccc 5340cagaaaaata acttcaagca aacatcctat caacaacaag gttgttctgc ataccaagct 5400gagcacagaa gatgggaaca ctggtggagg atggaaaggc tcgctcaatc aagaaaattc 5460tgagactatt aataaataag actgtagtgt agatactgag taaatccatg cacctaaacc 5520ttttggaaaa tctgccgtgg gccctccaga tagctcattt cattaagttt ttccctccaa 5580ggtagaattt gcaagagtga cagtggattg catttctttt ggggaagctt tcttttggtg 5640gttttgttta ttataccttc ttaagttttc aaccaaggtt tgcttttgtt ttgagttact 5700ggggttattt ttgttttaaa taaaaataag tgtacaataa gtgtttttgt attgaaagct 5760tttgttatca agattttcat acttttacct tccatggctc tttttaagat tgatactttt 5820aagaggtggc tgatattctg caacactgta cacataaaaa atacggtaag gatactttac 5880atggttaagg taaagtaagt ctccagttgg ccaccattag ctataatggc actttgtttg 5940tgttgttgga aaaagtcaca ttgccattaa actttccttg tctgtctagt taatattgtg 6000aagaaaaata aagtacagtg tgagatactg 603029010987DNAHomo sapiens 290ggtggcgcga gcttctgaaa ctaggcggca gaggcggagc cgctgtggca ctgctgcgcc 60tctgctgcgc ctcgggtgtc ttttgcggcg gtgggtcgcc gccgggagaa gcgtgagggg 120acagatttgt gaccggcgcg gtttttgtca gcttactccg gccaaaaaag aactgcacct 180ctggagcgga cttatttacc aagcattgga ggaatatcgt aggtaaaaat gcctattgga 240tccaaagaga ggccaacatt ttttgaaatt 
tttaagacac gctgcaacaa agcagattta 300ggaccaataa gtcttaattg gtttgaagaa ctttcttcag aagctccacc ctataattct 360gaacctgcag aagaatctga acataaaaac aacaattacg aaccaaacct atttaaaact 420ccacaaagga aaccatctta taatcagctg gcttcaactc caataatatt caaagagcaa 480gggctgactc tgccgctgta ccaatctcct gtaaaagaat tagataaatt caaattagac 540ttaggaagga atgttcccaa tagtagacat aaaagtcttc gcacagtgaa aactaaaatg 600gatcaagcag atgatgtttc ctgtccactt ctaaattctt gtcttagtga aagtcctgtt 660gttctacaat gtacacatgt aacaccacaa agagataagt cagtggtatg tgggagtttg 720tttcatacac caaagtttgt gaagggtcgt cagacaccaa aacatatttc tgaaagtcta 780ggagctgagg tggatcctga tatgtcttgg tcaagttctt tagctacacc acccaccctt 840agttctactg tgctcatagt cagaaatgaa gaagcatctg aaactgtatt tcctcatgat 900actactgcta atgtgaaaag ctatttttcc aatcatgatg aaagtctgaa gaaaaatgat 960agatttatcg cttctgtgac agacagtgaa aacacaaatc aaagagaagc tgcaagtcat 1020ggatttggaa aaacatcagg gaattcattt aaagtaaata gctgcaaaga ccacattgga 1080aagtcaatgc caaatgtcct agaagatgaa gtatatgaaa cagttgtaga tacctctgaa 1140gaagatagtt tttcattatg tttttctaaa tgtagaacaa aaaatctaca aaaagtaaga 1200actagcaaga ctaggaaaaa aattttccat gaagcaaacg ctgatgaatg tgaaaaatct 1260aaaaaccaag tgaaagaaaa atactcattt gtatctgaag tggaaccaaa tgatactgat 1320ccattagatt caaatgtagc acatcagaag ccctttgaga gtggaagtga caaaatctcc 1380aaggaagttg taccgtcttt ggcctgtgaa tggtctcaac taaccctttc aggtctaaat 1440ggagcccaga tggagaaaat acccctattg catatttctt catgtgacca aaatatttca 1500gaaaaagacc tattagacac agagaacaaa agaaagaaag attttcttac ttcagagaat 1560tctttgccac gtatttctag cctaccaaaa tcagagaagc cattaaatga ggaaacagtg 1620gtaaataaga gagatgaaga gcagcatctt gaatctcata cagactgcat tcttgcagta 1680aagcaggcaa tatctggaac ttctccagtg gcttcttcat ttcagggtat caaaaagtct 1740atattcagaa taagagaatc acctaaagag actttcaatg caagtttttc aggtcatatg 1800actgatccaa actttaaaaa agaaactgaa gcctctgaaa gtggactgga aatacatact 1860gtttgctcac agaaggagga ctccttatgt ccaaatttaa ttgataatgg aagctggcca 1920gccaccacca cacagaattc tgtagctttg aagaatgcag gtttaatatc cactttgaaa 1980aagaaaacaa 
ataagtttat ttatgctata catgatgaaa cattttataa aggaaaaaaa 2040ataccgaaag accaaaaatc agaactaatt aactgttcag cccagtttga agcaaatgct 2100tttgaagcac cacttacatt tgcaaatgct gattcaggtt tattgcattc ttctgtgaaa 2160agaagctgtt cacagaatga ttctgaagaa ccaactttgt ccttaactag ctcttttggg 2220acaattctga ggaaatgttc tagaaatgaa acatgttcta ataatacagt aatctctcag 2280gatcttgatt ataaagaagc aaaatgtaat aaggaaaaac tacagttatt tattacccca 2340gaagctgatt ctctgtcatg cctgcaggaa ggacagtgtg aaaatgatcc aaaaagcaaa 2400aaagtttcag atataaaaga agaggtcttg gctgcagcat gtcacccagt acaacattca 2460aaagtggaat acagtgatac tgactttcaa tcccagaaaa gtcttttata tgatcatgaa 2520aatgccagca ctcttatttt aactcctact tccaaggatg ttctgtcaaa cctagtcatg 2580atttctagag gcaaagaatc atacaaaatg tcagacaagc tcaaaggtaa caattatgaa 2640tctgatgttg aattaaccaa aaatattccc atggaaaaga atcaagatgt atgtgcttta 2700aatgaaaatt ataaaaacgt tgagctgttg ccacctgaaa aatacatgag agtagcatca 2760ccttcaagaa aggtacaatt caaccaaaac acaaatctaa gagtaatcca aaaaaatcaa 2820gaagaaacta cttcaatttc aaaaataact gtcaatccag actctgaaga acttttctca 2880gacaatgaga ataattttgt cttccaagta gctaatgaaa ggaataatct tgctttagga 2940aatactaagg aacttcatga aacagacttg acttgtgtaa acgaacccat tttcaagaac 3000tctaccatgg ttttatatgg agacacaggt gataaacaag caacccaagt gtcaattaaa 3060aaagatttgg tttatgttct tgcagaggag aacaaaaata gtgtaaagca gcatataaaa 3120atgactctag gtcaagattt aaaatcggac atctccttga atatagataa aataccagaa 3180aaaaataatg attacatgaa caaatgggca ggactcttag gtccaatttc aaatcacagt 3240tttggaggta gcttcagaac agcttcaaat aaggaaatca agctctctga acataacatt 3300aagaagagca aaatgttctt caaagatatt gaagaacaat atcctactag tttagcttgt 3360gttgaaattg taaatacctt ggcattagat aatcaaaaga aactgagcaa gcctcagtca 3420attaatactg tatctgcaca tttacagagt agtgtagttg tttctgattg taaaaatagt 3480catataaccc ctcagatgtt attttccaag caggatttta attcaaacca taatttaaca 3540cctagccaaa aggcagaaat tacagaactt tctactatat tagaagaatc aggaagtcag 3600tttgaattta ctcagtttag aaaaccaagc tacatattgc agaagagtac atttgaagtg 3660cctgaaaacc agatgactat cttaaagacc acttctgagg 
aatgcagaga tgctgatctt 3720catgtcataa tgaatgcccc atcgattggt caggtagaca gcagcaagca atttgaaggt 3780acagttgaaa ttaaacggaa gtttgctggc ctgttgaaaa atgactgtaa caaaagtgct 3840tctggttatt taacagatga aaatgaagtg gggtttaggg gcttttattc tgctcatggc 3900acaaaactga atgtttctac tgaagctctg caaaaagctg tgaaactgtt tagtgatatt 3960gagaatatta gtgaggaaac ttctgcagag gtacatccaa taagtttatc ttcaagtaaa 4020tgtcatgatt ctgttgtttc aatgtttaag atagaaaatc ataatgataa aactgtaagt 4080gaaaaaaata ataaatgcca actgatatta caaaataata ttgaaatgac tactggcact 4140tttgttgaag aaattactga aaattacaag agaaatactg aaaatgaaga taacaaatat 4200actgctgcca gtagaaattc tcataactta gaatttgatg gcagtgattc aagtaaaaat 4260gatactgttt gtattcataa agatgaaacg gacttgctat ttactgatca gcacaacata 4320tgtcttaaat tatctggcca gtttatgaag gagggaaaca ctcagattaa agaagatttg 4380tcagatttaa cttttttgga agttgcgaaa gctcaagaag catgtcatgg taatacttca 4440aataaagaac agttaactgc tactaaaacg gagcaaaata taaaagattt tgagacttct 4500gatacatttt ttcagactgc aagtgggaaa aatattagtg tcgccaaaga gtcatttaat 4560aaaattgtaa atttctttga tcagaaacca gaagaattgc ataacttttc cttaaattct 4620gaattacatt ctgacataag aaagaacaaa atggacattc taagttatga ggaaacagac 4680atagttaaac acaaaatact gaaagaaagt gtcccagttg gtactggaaa tcaactagtg 4740accttccagg gacaacccga acgtgatgaa aagatcaaag aacctactct gttgggtttt 4800catacagcta gcgggaaaaa agttaaaatt gcaaaggaat ctttggacaa agtgaaaaac 4860ctttttgatg aaaaagagca aggtactagt gaaatcacca gttttagcca tcaatgggca 4920aagaccctaa agtacagaga ggcctgtaaa gaccttgaat tagcatgtga gaccattgag 4980atcacagctg ccccaaagtg taaagaaatg cagaattctc tcaataatga taaaaacctt 5040gtttctattg agactgtggt gccacctaag ctcttaagtg ataatttatg tagacaaact 5100gaaaatctca aaacatcaaa aagtatcttt ttgaaagtta aagtacatga aaatgtagaa 5160aaagaaacag caaaaagtcc tgcaacttgt tacacaaatc agtcccctta ttcagtcatt 5220gaaaattcag ccttagcttt ttacacaagt tgtagtagaa aaacttctgt gagtcagact 5280tcattacttg aagcaaaaaa atggcttaga gaaggaatat ttgatggtca accagaaaga 5340ataaatactg cagattatgt aggaaattat ttgtatgaaa ataattcaaa cagtactata 5400gctgaaaatg 
acaaaaatca tctctccgaa aaacaagata cttatttaag taacagtagc 5460atgtctaaca gctattccta ccattctgat gaggtatata atgattcagg atatctctca 5520aaaaataaac ttgattctgg tattgagcca gtattgaaga atgttgaaga tcaaaaaaac 5580actagttttt ccaaagtaat atccaatgta aaagatgcaa atgcataccc acaaactgta 5640aatgaagata tttgcgttga ggaacttgtg actagctctt caccctgcaa aaataaaaat 5700gcagccatta aattgtccat atctaatagt aataattttg aggtagggcc acctgcattt 5760aggatagcca gtggtaaaat cgtttgtgtt tcacatgaaa caattaaaaa agtgaaagac 5820atatttacag acagtttcag taaagtaatt aaggaaaaca acgagaataa atcaaaaatt 5880tgccaaacga aaattatggc aggttgttac gaggcattgg atgattcaga ggatattctt 5940cataactctc tagataatga tgaatgtagc acgcattcac ataaggtttt tgctgacatt 6000cagagtgaag aaattttaca acataaccaa aatatgtctg gattggagaa agtttctaaa 6060atatcacctt gtgatgttag tttggaaact tcagatatat gtaaatgtag tatagggaag 6120cttcataagt cagtctcatc tgcaaatact tgtgggattt ttagcacagc aagtggaaaa 6180tctgtccagg tatcagatgc ttcattacaa aacgcaagac aagtgttttc tgaaatagaa 6240gatagtacca agcaagtctt ttccaaagta ttgtttaaaa gtaacgaaca ttcagaccag 6300ctcacaagag aagaaaatac tgctatacgt actccagaac atttaatatc ccaaaaaggc 6360ttttcatata atgtggtaaa ttcatctgct ttctctggat ttagtacagc aagtggaaag 6420caagtttcca ttttagaaag ttccttacac aaagttaagg gagtgttaga ggaatttgat 6480ttaatcagaa ctgagcatag tcttcactat tcacctacgt ctagacaaaa tgtatcaaaa 6540atacttcctc gtgttgataa gagaaaccca gagcactgtg taaactcaga aatggaaaaa 6600acctgcagta aagaatttaa attatcaaat
aacttaaatg ttgaaggtgg ttcttcagaa 6660aataatcact ctattaaagt ttctccatat ctctctcaat ttcaacaaga caaacaacag 6720ttggtattag gaaccaaagt ctcacttgtt gagaacattc atgttttggg aaaagaacag 6780gcttcaccta aaaacgtaaa aatggaaatt ggtaaaactg aaactttttc tgatgttcct 6840gtgaaaacaa atatagaagt ttgttctact tactccaaag attcagaaaa ctactttgaa 6900acagaagcag tagaaattgc taaagctttt atggaagatg atgaactgac agattctaaa 6960ctgccaagtc atgccacaca ttctcttttt acatgtcccg aaaatgagga aatggttttg 7020tcaaattcaa gaattggaaa aagaagagga gagcccctta tcttagtggg agaaccctca 7080atcaaaagaa acttattaaa tgaatttgac aggataatag aaaatcaaga aaaatcctta 7140aaggcttcaa aaagcactcc agatggcaca ataaaagatc gaagattgtt tatgcatcat 7200gtttctttag agccgattac ctgtgtaccc tttcgcacaa ctaaggaacg tcaagagata 7260cagaatccaa attttaccgc acctggtcaa gaatttctgt ctaaatctca tttgtatgaa 7320catctgactt tggaaaaatc ttcaagcaat ttagcagttt caggacatcc attttatcaa 7380gtttctgcta caagaaatga aaaaatgaga cacttgatta ctacaggcag accaaccaaa 7440gtctttgttc caccttttaa aactaaatca cattttcaca gagttgaaca gtgtgttagg 7500aatattaact tggaggaaaa cagacaaaag caaaacattg atggacatgg ctctgatgat 7560agtaaaaata agattaatga caatgagatt catcagttta acaaaaacaa ctccaatcaa 7620gcagcagctg taactttcac aaagtgtgaa gaagaacctt tagatttaat tacaagtctt 7680cagaatgcca gagatataca ggatatgcga attaagaaga aacaaaggca acgcgtcttt 7740ccacagccag gcagtctgta tcttgcaaaa acatccactc tgcctcgaat ctctctgaaa 7800gcagcagtag gaggccaagt tccctctgcg tgttctcata aacagctgta tacgtatggc 7860gtttctaaac attgcataaa aattaacagc aaaaatgcag agtcttttca gtttcacact 7920gaagattatt ttggtaagga aagtttatgg actggaaaag gaatacagtt ggctgatggt 7980ggatggctca taccctccaa tgatggaaag gctggaaaag aagaatttta tagggctctg 8040tgtgacactc caggtgtgga tccaaagctt atttctagaa tttgggttta taatcactat 8100agatggatca tatggaaact ggcagctatg gaatgtgcct ttcctaagga atttgctaat 8160agatgcctaa gcccagaaag ggtgcttctt caactaaaat acagatatga tacggaaatt 8220gatagaagca gaagatcggc tataaaaaag ataatggaaa gggatgacac agctgcaaaa 8280acacttgttc tctgtgtttc tgacataatt tcattgagcg caaatatatc tgaaacttct 
8340agcaataaaa ctagtagtgc agatacccaa aaagtggcca ttattgaact tacagatggg 8400tggtatgctg ttaaggccca gttagatcct cccctcttag ctgtcttaaa gaatggcaga 8460ctgacagttg gtcagaagat tattcttcat ggagcagaac tggtgggctc tcctgatgcc 8520tgtacacctc ttgaagcccc agaatctctt atgttaaaga tttctgctaa cagtactcgg 8580cctgctcgct ggtataccaa acttggattc tttcctgacc ctagaccttt tcctctgccc 8640ttatcatcgc ttttcagtga tggaggaaat gttggttgtg ttgatgtaat tattcaaaga 8700gcatacccta tacagtggat ggagaagaca tcatctggat tatacatatt tcgcaatgaa 8760agagaggaag aaaaggaagc agcaaaatat gtggaggccc aacaaaagag actagaagcc 8820ttattcacta aaattcagga ggaatttgaa gaacatgaag aaaacacaac aaaaccatat 8880ttaccatcac gtgcactaac aagacagcaa gttcgtgctt tgcaagatgg tgcagagctt 8940tatgaagcag tgaagaatgc agcagaccca gcttaccttg agggttattt cagtgaagag 9000cagttaagag ccttgaataa tcacaggcaa atgttgaatg ataagaaaca agctcagatc 9060cagttggaaa ttaggaaggc catggaatct gctgaacaaa aggaacaagg tttatcaagg 9120gatgtcacaa ccgtgtggaa gttgcgtatt gtaagctatt caaaaaaaga aaaagattca 9180gttatactga gtatttggcg tccatcatca gatttatatt ctctgttaac agaaggaaag 9240agatacagaa tttatcatct tgcaacttca aaatctaaaa gtaaatctga aagagctaac 9300atacagttag cagcgacaaa aaaaactcag tatcaacaac taccggtttc agatgaaatt 9360ttatttcaga tttaccagcc acgggagccc cttcacttca gcaaattttt agatccagac 9420tttcagccat cttgttctga ggtggaccta ataggatttg tcgtttctgt tgtgaaaaaa 9480acaggacttg cccctttcgt ctatttgtca gacgaatgtt acaatttact ggcaataaag 9540ttttggatag accttaatga ggacattatt aagcctcata tgttaattgc tgcaagcaac 9600ctccagtggc gaccagaatc caaatcaggc cttcttactt tatttgctgg agatttttct 9660gtgttttctg ctagtccaaa agagggccac tttcaagaga cattcaacaa aatgaaaaat 9720actgttgaga atattgacat actttgcaat gaagcagaaa acaagcttat gcatatactg 9780catgcaaatg atcccaagtg gtccacccca actaaagact gtacttcagg gccgtacact 9840gctcaaatca ttcctggtac aggaaacaag cttctgatgt cttctcctaa ttgtgagata 9900tattatcaaa gtcctttatc actttgtatg gccaaaagga agtctgtttc cacacctgtc 9960tcagcccaga tgacttcaaa gtcttgtaaa ggggagaaag agattgatga ccaaaagaac 10020tgcaaaaaga gaagagcctt ggatttcttg 
agtagactgc ctttacctcc acctgttagt 10080cccatttgta catttgtttc tccggctgca cagaaggcat ttcagccacc aaggagttgt 10140ggcaccaaat acgaaacacc cataaagaaa aaagaactga attctcctca gatgactcca 10200tttaaaaaat tcaatgaaat ttctcttttg gaaagtaatt caatagctga cgaagaactt 10260gcattgataa atacccaagc tcttttgtct ggttcaacag gagaaaaaca atttatatct 10320gtcagtgaat ccactaggac tgctcccacc agttcagaag attatctcag actgaaacga 10380cgttgtacta catctctgat caaagaacag gagagttccc aggccagtac ggaagaatgt 10440gagaaaaata agcaggacac aattacaact aaaaaatata tctaagcatt tgcaaaggcg 10500acaataaatt attgacgctt aacctttcca gtttataaga ctggaatata atttcaaacc 10560acacattagt acttatgttg cacaatgaga aaagaaatta gtttcaaatt tacctcagcg 10620tttgtgtatc gggcaaaaat cgttttgccc gattccgtat tggtatactt ttgcttcagt 10680tgcatatctt aaaactaaat gtaatttatt aactaatcaa gaaaaacatc tttggctgag 10740ctcggtggct catgcctgta atcccaacac tttgagaagc tgaggtggga ggagtgcttg 10800aggccaggag ttcaagacca gcctgggcaa catagggaga cccccatctt tacgaagaaa 10860aaaaaaaagg ggaaaagaaa atcttttaaa tctttggatt tgatcactac aagtattatt 10920ttacaatcaa caaaatggtc atccaaactc aaacttgaga aaatatcttg ctttcaaatt 10980gacacta 109872911552DNAHomo sapiens 291gcccgtacac accgtgtgct gggacacccc acagtcagcc gcatggctcc cctgtgcccc 60agcccctggc tccctctgtt gatcccggcc cctgctccag gcctcactgt gcaactgctg 120ctgtcactgc tgcttctgat gcctgtccat ccccagaggt tgccccggat gcaggaggat 180tcccccttgg gaggaggctc ttctggggaa gatgacccac tgggcgagga ggatctgccc 240agtgaagagg attcacccag agaggaggat ccacccggag aggaggatct acctggagag 300gaggatctac ctggagagga ggatctacct gaagttaagc ctaaatcaga agaagagggc 360tccctgaagt tagaggatct acctactgtt gaggctcctg gagatcctca agaaccccag 420aataatgccc acagggacaa agaaggggat gaccagagtc attggcgcta tggaggcgac 480ccgccctggc cccgggtgtc cccagcctgc gcgggccgct tccagtcccc ggtggatatc 540cgcccccagc tcgccgcctt ctgcccggcc ctgcgccccc tggaactcct gggcttccag 600ctcccgccgc tcccagaact gcgcctgcgc aacaatggcc acagtgtgca actgaccctg 660cctcctgggc tagagatggc tctgggtccc gggcgggagt accgggctct gcagctgcat 720ctgcactggg gggctgcagg 
tcgtccgggc tcggagcaca ctgtggaagg ccaccgtttc 780cctgccgaga tccacgtggt tcacctcagc accgcctttg ccagagttga cgaggccttg 840gggcgcccgg gaggcctggc cgtgttggcc gcctttctgg aggagggccc ggaagaaaac 900agtgcctatg agcagttgct gtctcgcttg gaagaaatcg ctgaggaagg ctcagagact 960caggtcccag gactggacat atctgcactc ctgccctctg acttcagccg ctacttccaa 1020tatgaggggt ctctgactac accgccctgt gcccagggtg tcatctggac tgtgtttaac 1080cagacagtga tgctgagtgc taagcagctc cacaccctct ctgacaccct gtggggacct 1140ggtgactctc ggctacagct gaacttccga gcgacgcagc ctttgaatgg gcgagtgatt 1200gaggcctcct tccctgctgg agtggacagc agtcctcggg ctgctgagcc agtccagctg 1260aattcctgcc tggctgctgg tgacatccta gccctggttt ttggcctcct ttttgctgtc 1320accagcgtcg cgttccttgt gcagatgaga aggcagcaca gaaggggaac caaagggggt 1380gtgagctacc gcccagcaga ggtagccgag actggagcct agaggctgga tcttggagaa 1440tgtgagaagc cagccagagg catctgaggg ggagccggta actgtcctgt cctgctcatt 1500atgccacttc cttttaactg ccaagaaatt ttttaaaata aatatttata at 15522921578DNAHomo sapiens 292acgaacaggc caataaggag ggagcagtgc ggggtttaaa tctgaggcta ggctggctct 60tctcggcgtg ctgcggcgga acggctgttg gtttctgctg gttgtaggtc cttggctggt 120cgggcctccg gtgttctgct tctccccgct gagctgctgc ctggtgaaga ggaagccatg 180gcgctccgag tcaccaggaa ctcgaaaatt aatgctgaaa ataaggcgaa gatcaacatg 240gcaggcgcaa agcgcgttcc tacggcccct gctgcaacct ccaagcccgg actgaggcca 300agaacagctc ttggggacat tggtaacaaa gtcagtgaac aactgcaggc caaaatgcct 360atgaagaagg aagcaaaacc ttcagctact ggaaaagtca ttgataaaaa actaccaaaa 420cctcttgaaa aggtacctat gctggtgcca gtgccagtgt ctgagccagt gccagagcca 480gaacctgagc cagaacctga gcctgttaaa gaagaaaaac tttcgcctga gcctattttg 540gttgatactg cctctccaag cccaatggaa acatctggat gtgcccctgc agaagaagac 600ctgtgtcagg ctttctctga tgtaattctt gcagtaaatg atgtggatgc agaagatgga 660gctgatccaa acctttgtag tgaatatgtg aaagatattt atgcttatct gagacaactt 720gaggaagagc aagcagtcag accaaaatac ctactgggtc gggaagtcac tggaaacatg 780agagccatcc taattgactg gctagtacag gttcaaatga aattcaggtt gttgcaggag 840accatgtaca tgactgtctc cattattgat cggttcatgc agaataattg 
tgtgcccaag 900aagatgctgc agctggttgg tgtcactgcc atgtttattg caagcaaata tgaagaaatg 960taccctccag aaattggtga ctttgctttt gtgactgaca acacttatac taagcaccaa 1020atcagacaga tggaaatgaa gattctaaga gctttaaact ttggtctggg tcggcctcta 1080cctttgcact tccttcggag agcatctaag attggagagg ttgatgtcga gcaacatact 1140ttggccaaat acctgatgga actaactatg ttggactatg acatggtgca ctttcctcct 1200tctcaaattg cagcaggagc tttttgctta gcactgaaaa ttctggataa tggtgaatgg 1260acaccaactc tacaacatta cctgtcatat actgaagaat ctcttcttcc agttatgcag 1320cacctggcta agaatgtagt catggtaaat caaggactta caaagcacat gactgtcaag 1380aacaagtatg ccacatcgaa gcatgctaag atcagcactc taccacagct gaattctgca 1440ctagttcaag atttagccaa ggctgtggca aaggtgtaac ttgtaaactt gagttggagt 1500actatattta caaataaaat tggcaccatg tgccatctgt aaaaaaaaaa aaaaaaaaaa 1560aaaaaaaaaa aaaaaaaa 15782933195DNAHomo sapiens 293agaggcttcc ctggctggtg cctgagcccg gcgtccctcg ccccccgccc tccccgcatc 60cctctcctcc ctcgcgcctg gccctgtggc tcttcctccc tccctccttc cccccccccc 120cacccctcgc ccgctgcctc cctcggccca gccagctgtg ccggcgtttg ttggctgccc 180tgcgcccggc cctccagcca gccttctgcc ggccccgccg cgatggaggt gccccagccg 240gagcccgcgc caggctcggc tctcagtcca gcaggcgtgt gcggtggcgc ccagcgtccg 300ggccacctcc cgggcctcct gctgggatct catggcctcc tggggtcccc ggtgcgggcg 360gccgcttcct cgccggtcac caccctcacc cagaccatgc acgacctcgc cgggctcggc 420agccgcagcc gcctgacgca cctatccctg tctcgacggg catccgaatc ctccctgtcg 480tctgaatcct ccgaatcttc tgatgcaggt ctctgcatgg attcccccag ccctatggac 540ccccacatgg cggagcagac gtttgaacag gccatccagg cagccagccg gatcattcga 600aacgagcagt ttgccatcag acgcttccag tctatgccgg tgaggctgct gggccacagc 660cccgtgcttc ggaacatcac caactcccag gcgcccgacg gccggaggaa gagcgaggcg 720ggcagtggag ctgccagcag ctctggggaa gacaaggaga atgtgcgctt ctggaaggcc 780ggggtgggag ctctccggga agaggagggg gcatgctggg gtggttccct ggcatgtgag 840gaccctcctc tcccatcttg gctgcaggat ggatttgtct tcaagatgcc atggaagccc 900acacatccca gctccaccca tgctctggca gagtgggcca gccgcaggga agcctttgcc 960cagagaccca gctcggcccc cgacctgatg tgtctcagtc ctgaccggaa 
gatggaagtg 1020gaggagctca gccccctggc cctaggtcgc ttctctctga cccctgcaga gggggatact 1080gaggaagatg atggatttgt ggacatccta gagagtgact taaaggatga tgatgcagtt 1140cccccaggca tggagagtct cattagtgcc ccactggtca agaccttgga aaaggaagag 1200gaaaaggacc tcgtcatgta cagcaagtgc cagcggctct tccgctctcc gtccatgccc 1260tgcagcgtga tccggcccat cctcaagagg ctggagcggc cccaggacag ggacacgccc 1320gtgcagaata agcggaggcg gagcgtgacc cctcctgagg agcagcagga ggctgaggaa 1380cctaaagccc gcgtcctccg ctcaaaatca ctgtgtcacg atgagatcga gaacctcctg 1440gacagtgacc accgagagct gattggagat tactctaagg ccttcctcct acagacagta 1500gacggaaagc accaagacct caagtacatc tcaccagaaa cgatggtggc cctattgacg 1560ggcaagttca gcaacatcgt ggataagttt gtgattgtag actgcagata cccctatgaa 1620tatgaaggcg ggcacatcaa gactgcggtg aacttgcccc tggaacgcga cgccgagagc 1680ttcctactga agagccccat cgcgccctgt agcctggaca agagagtcat cctcattttc 1740cactgtgaat tctcatctga gcgtgggccc cgcatgtgcc gtttcatcag ggaacgagac 1800cgtgctgtca acgactaccc cagcctctac taccctgaga tgtatatcct gaaaggcggc 1860tacaaggagt tcttccctca gcacccgaac ttctgtgaac cccaggacta ccggcccatg 1920aaccacgagg ccttcaagga tgagctaaag accttccgcc tcaagactcg cagctgggct 1980ggggagcgga gccggcggga gctctgtagc cggctgcagg accagtgagg ggcctgcgcc 2040agtcctgcta cctcccttgc ctttcgaggc ctgaagccag ctgccctatg ggcctgccgg 2100gctgagggcc tgctggaggc ctcaggtgct gtccatggga aagatggtgt ggtgtcctgc 2160ctgtctgccc cagcccagat tcccctgtgt catcccatca ttttccatat cctggtgccc 2220cccacccctg gaagagccca gtctgttgag ttagttaagt tgggttaata ccagcttaaa 2280ggcagtattt tgtgtcctcc aggagcttct tgtttccttg ttagggttaa cccttcatct 2340tcctgtgtcc tgaaacgctc ctttgtgtgt gtgtcagctg aggctgggga gagccgtggt 2400ccctgaggat gggtcagagc taaactcctt cctggcctga gagtcagctc tctgccctgt 2460gtacttcccg ggccagggct gcccctaatc tctgtaggaa ccgtggtatg tctgccatgt 2520tgcccctttc tcttttcccc tttcctgtcc caccatacga gcacctccag cctgaacaga 2580agctcttact ctttcctatt tcagtgttac ctgtgtgctt ggtctgtttg actttacgcc 2640catctcagga cacttccgta gactgtttag gttcccctgt caaatatcag ttacccactc 2700ggtcccagtt ttgttgcccc 
agaaagggat gttattatcc ttgggggctc ccagggcaag 2760ggttaaggcc tgaatcatga gcctgctgga agcccagccc ctactgctgt gaaccctggg 2820gcctgactgc tcagaacttg ctgctgtctt gttgcggatg gatggaaggt tggatggatg 2880ggtggatggc cgtggatggc cgtggatgcg cagtgccttg catacccaaa ccaggtggga 2940gcgttttgtt gagcatgaca cctgcagcag gaatatatgt gtgcctattt gtgtggacaa 3000aaatatttac acttagggtt tggagctatt caagaggaaa tgtcacagaa gcagctaaac 3060caaggactga gcaccctctg gattctgaat ctcaagatgg gggcagggct gtgcttgaag 3120gccctgctga gtcatctgtt agggccttgg ttcaataaag cactgagcaa gttgagaaaa 3180aaaaaaaaaa aaaaa 31952943737DNAHomo sapiens 294ggcgtccgcg cacacctccc cgcgccgccg ccgccaccgc ccgcactccg ccgcctctgc 60ccgcaaccgc tgagccatcc atgggggtcg cgggccgcaa ccgtcccggg gcggcctggg 120cggtgctgct gctgctgctg ctgctgccgc cactgctgct gctggcgggg gccgtcccgc 180cgggtcgggg ccgtgccgcg gggccgcagg aggatgtaga tgagtgtgcc caagggctag 240atgactgcca tgccgacgcc ctgtgtcaga acacacccac ctcctacaag tgctcctgca 300agcctggcta ccaaggggaa ggcaggcagt gtgaggacat cgatgaatgt ggaaatgagc 360tcaatggagg ctgtgtccat gactgtttga atattccagg caattatcgt tgcacttgtt 420ttgatggctt catgttggct catgacggtc ataattgtct tgatgtggac gagtgcctgg 480agaacaatgg cggctgccag catacctgtg tcaacgtcat ggggagctat gagtgctgct 540gcaaggaggg gtttttcctg agtgacaatc agcacacctg cattcaccgc tcggaagagg 600gcctgagctg catgaataag gatcacggct gtagtcacat ctgcaaggag gccccaaggg 660gcagcgtcgc ctgtgagtgc aggcctggtt ttgagctggc caagaaccag agagactgca 720tcttgacctg taaccatggg aacggtgggt gccagcactc ctgtgacgat acagccgatg 780gcccagagtg cagctgccat ccacagtaca agatgcacac agatgggagg agctgccttg 840agcgagagga cactgtcctg gaggtgacag agagcaacac cacatcagtg gtggatgggg 900ataaacgggt gaaacggcgg ctgctcatgg aaacgtgtgc tgtcaacaat ggaggctgtg 960accgcacctg taaggatact tcgacaggtg tccactgcag ttgtcctgtt ggattcactc 1020tccagttgga tgggaagaca tgtaaagata ttgatgagtg ccagacccgc aatggaggtt 1080gtgatcattt ctgcaaaaac atcgtgggca gttttgactg cggctgcaag aaaggattta 1140aattattaac agatgagaag tcttgccaag atgtggatga gtgctctttg gataggacct 1200gtgaccacag ctgcatcaac 
caccctggca catttgcttg tgcttgcaac cgagggtaca 1260ccctgtatgg cttcacccac tgtggagaca ccaatgagtg cagcatcaac aacggaggct 1320gtcagcaggt ctgtgtgaac acagtgggca gctatgaatg ccagtgccac cctgggtaca 1380agctccactg gaataaaaaa gactgtgtgg aagtgaaggg gctcctgccc acaagtgtgt 1440caccccgtgt gtccctgcac tgcggtaaga gtggtggagg agacgggtgc ttcctcagat 1500gtcactctgg cattcacctc tcttcagatg tcaccaccat caggacaagt gtaaccttta 1560agctaaatga aggcaagtgt agtttgaaaa atgctgagct gtttcccgag ggtctgcgac 1620cagcactacc agagaagcac agctcagtaa aagagagctt ccgctacgta aaccttacat 1680gcagctctgg caagcaagtc ccaggagccc ctggccgacc aagcacccct aaggaaatgt 1740ttatcactgt tgagtttgag cttgaaacta accaaaagga ggtgacagct tcttgtgacc 1800tgagctgcat cgtaaagcga accgagaagc ggctccgtaa agccatccgc acgctcagaa 1860aggccgtcca cagggagcag tttcacctcc agctctcagg catgaacctc gacgtggcta 1920aaaagcctcc cagaacatct gaacgccagg cagagtcctg tggagtgggc cagggtcatg 1980cagaaaacca atgtgtcagt tgcagggctg ggacctatta tgatggagca cgagaacgct 2040gcattttatg tccaaatgga accttccaaa atgaggaagg acaaatgact tgtgaaccat 2100gcccaagacc aggaaattct ggggccctga agaccccaga agcttggaat atgtctgaat 2160gtggaggtct gtgtcaacct ggtgaatatt ctgcagatgg ctttgcacct tgccagctct 2220gtgccctggg cacgttccag cctgaagctg gtcgaacttc ctgcttcccc tgtggaggag 2280gccttgccac caaacatcag ggagctactt cctttcagga ctgtgaaacc agagttcaat 2340gttcacctgg acatttctac aacaccacca ctcaccgatg tattcgttgc ccagtgggaa 2400cataccagcc tgaatttgga aaaaataatt gtgtttcttg cccaggaaat actacgactg 2460actttgatgg ctccacaaac ataacccagt gtaaaaacag aagatgtgga ggggagctgg 2520gagatttcac tgggtacatt gaatccccaa actacccagg caattaccca gccaacaccg 2580agtgtacgtg gaccatcaac ccacccccca agcgccgcat cctgatcgtg gtccctgaga 2640tcttcctgcc catagaggac gactgtgggg actatctggt gatgcggaaa acctcttcat 2700ccaattctgt gacaacatat gaaacctgcc agacctacga acgccccatc gccttcacct 2760ccaggtcaaa gaagctgtgg attcagttca agtccaatga agggaacagc gctagagggt 2820tccaggtccc atacgtgaca tatgatgagg actaccagga actcattgaa gacatagttc 2880gagatggcag gctctatgca tctgagaacc atcaggaaat acttaaggat 
aagaaactta 2940tcaaggctct gtttgatgtc ctggcccatc cccagaacta tttcaagtac acagcccagg 3000agtcccgaga gatgtttcca agatcgttca tccgattgct acgttccaaa gtgtccaggt 3060ttttgagacc ttacaaatga ctcagcccac gtgccactca atacaaatgt tctgctatag 3120ggttggtggg acagagctgt cttccttctg catgtcagca cagtcgggta ttgctgcctc 3180ccgtatcagt gactcattag agttcaattt ttatagataa tacagatatt ttggtaaatt 3240gaacttggtt tttctttccc agcatcgtgg atgtagactg agaatggctt tgagtggcat 3300cagcttctca ctgctgtggg cggatgtctt ggatagatca cgggctggct gagctggact 3360ttggtcagcc taggtgagac tcacctgtcc ttctggggtc ttactcctcc tcaaggagtc 3420tgtagtggaa aggaggccac agaataagct gcttattctg aaacttcagc ttcctctagc 3480ccggccctct ctaagggagc cctctgcact cgtgtgcagg ctctgaccag gcagaacagg 3540caagagggga gggaaggaga cccctgcagg ctccctccac ccaccttgag acctgggagg 3600actcagtttc tccacagcct tctccagcct gtgtgataca agtttgatcc caggaacttg 3660agttctaagc agtgctcgtg aaaaaaaaaa gcagaaagaa ttagaaataa ataaaaacta 3720agcacttctg gagacat 37372952042DNAHomo sapiens 295ggggccagtc gttcgccgga aagcatttgt ctcccacctc atcataacaa caattaattt 60cctctggggc ctgaggaggg cagaatttca accttcggtg tgcttgggag tggcgattgt 120gatttacacg acaaaatgcc gaggtgctcg gtggagtcat ggcagtgccc tttgtggaag 180actgggactt ggtgcaaacc ctgggagaag gtgcctatgg agaagttcaa cttgctgtga 240atagagtaac tgaagaagca gtcgcagtga agattgtaga tatgaagcgt gccgtagact 300gtccagaaaa tattaagaaa gagatctgta tcaataaaat
gctaaatcat gaaaatgtag 360taaaattcta tggtcacagg agagaaggca atatccaata tttatttctg gagtactgta 420gtggaggaga gctttttgac agaatagagc cagacatagg catgcctgaa ccagatgctc 480agagattctt ccatcaactc atggcagggg tggtttatct gcatggtatt ggaataactc 540acagggatat taaaccagaa aatcttctgt tggatgaaag ggataacctc aaaatctcag 600actttggctt ggcaacagta tttcggtata ataatcgtga gcgtttgttg aacaagatgt 660gtggtacttt accatatgtt gctccagaac ttctgaagag aagagaattt catgcagaac 720cagttgatgt ttggtcctgt ggaatagtac ttactgcaat gctcgctgga gaattgccat 780gggaccaacc cagtgacagc tgtcaggagt attctgactg gaaagaaaaa aaaacatacc 840tcaacccttg gaaaaaaatc gattctgctc ctctagctct gctgcataaa atcttagttg 900agaatccatc agcaagaatt accattccag acatcaaaaa agatagatgg tacaacaaac 960ccctcaagaa aggggcaaaa aggccccgag tcacttcagg tggtgtgtca gagtctccca 1020gtggattttc taagcacatt caatccaatt tggacttctc tccagtaaac agtgcttcta 1080gtgaagaaaa tgtgaagtac tccagttctc agccagaacc ccgcacaggt ctttccttat 1140gggataccag cccctcatac attgataaat tggtacaagg gatcagcttt tcccagccca 1200catgtcctga tcatatgctt ttgaatagtc agttacttgg caccccagga tcctcacaga 1260acccctggca gcggttggtc aaaagaatga cacgattctt taccaaattg gatgcagaca 1320aatcttatca atgcctgaaa gagacttgtg agaagttggg ctatcaatgg aagaaaagtt 1380gtatgaatca ggttactata tcaacaactg ataggagaaa caataaactc attttcaaag 1440tgaatttgtt agaaatggat gataaaatat tggttgactt ccggctttct aagggtgatg 1500gattggagtt caagagacac ttcctgaaga ttaaagggaa gctgattgat attgtgagca 1560gccagaaggt ttggcttcct gccacatgat cggaccatcg gctctgggga atcctggtga 1620atatagtgct gctatgttga cattattctt cctagagaag attatcctgt cctgcaaact 1680gcaaatagta gttcctgaag tgttcacttc cctgtttatc caaacatctt ccaatttatt 1740ttgtttgttc ggcatacaaa taatacctat atcttaattg taagcaaaac tttggggaaa 1800ggatgaatag aattcatttg attatttctt catgtgtgtt tagtatctga atttgaaact 1860catctggtgg aaaccaagtt tcaggggaca tgagttttcc agcttttata cacacgtatc 1920tcatttttat caaaacattt tgtttaattc aaaaagtaca tatttcttcc atgttgattt 1980aattctaaga tgaaccaata aagacataat tcttgcaaaa aaaaaaaaaa aaaaaaaaaa 2040aa 20422962547DNAHomo 
sapiens 296cttacaaggt acagtcctct gctcaggggg gccaggaggg tcttataggc atcattcacc 60agggtcgaat gcttctctga gaagtccttt tcagtctgag acctctggct gaagaaatct 120gggtggacaa gacgctgcag ttgctggtac ctgtgctgga gcttcgctgt atcaactctg 180aaggaacggt tgcagtccat aaggctgaag tagtctcgag tggggtcagg tgcctgcagc 240gctcggcact gtgggcagaa gaacctgtcc tcccgcccgg ggccccatgg gccgccgcag 300ttccaacagc ggggataatt gcttcccgcc tgcgacgcag catcgcagct tagcggtctc 360cttctgggaa cccctgtcgg ccaaaacccc cacacccgga gcaaagcccc ggctctcccc 420cgccacatct ggccggcggc ctatctagcc gtggtcactc gtggggaaaa gcaaagagag 480cgtctaacca gactaatgtt gctgattggc tggggagtcg agggggcggg atcacccgag 540gggaacccgg gttctaagtt ccgctctccc ttctaaacta caactcccag gaggcattga 600ggcggcgcct gacggccaca tctgctgctc ctcattggtc cggcggcagg ggagggggtt 660ttgattggct gagggtggag tttgtatctg caggtttagc gccactctgc tggctgaggc 720tgcggagagt gtgcggctcc aggtgggctc acgcggtcgt gatgtctcgg gagtcggatg 780ttgaggctca gcagtctcat ggcagcagtg cctgttcaca gccccatggc agcgttaccc 840agtcccaagg ctcctcctca cagtcccagg gcatatccag ctcctctacc agcacgatgc 900caaactccag ccagtcctct cactccagct ctgggacact gagctcctta gagacagtgt 960ccactcagga actctattct attcctgagg accaagaacc tgaggaccaa gaacctgagg 1020agcctacccc tgccccctgg gctcgattat gggcccttca ggatggattt gccaatcttg 1080aatgtgtgaa tgacaactac tggtttggga gggacaaaag ctgtgaatat tgctttgatg 1140aaccactgct gaaaagaaca gataaatacc gaacatacag caagaaacac tttcggattt 1200tcagggaagt gggtcctaaa aactcttaca ttgcatacat agaagatcac agtggcaatg 1260gaacctttgt aaatacagag cttgtaggga aaggaaaacg ccgtcctttg aataacaatt 1320ctgaaattgc actgtcacta agcagaaata aagtttttgt cttttttgat ctgactgtag 1380atgatcagtc agtttatcct aaggcattaa gagatgaata catcatgtca aaaactcttg 1440gaagtggtgc ctgtggagag gtaaagctgg ctttcgagag gaaaacatgt aagaaagtag 1500ccataaagat catcagcaaa aggaagtttg ctattggttc agcaagagag gcagacccag 1560ctctcaatgt tgaaacagaa atagaaattt tgaaaaagct aaatcatcct tgcatcatca 1620agattaaaaa cttttttgat gcagaagatt attatattgt tttggaattg atggaagggg 1680gagagctgtt tgacaaagtg gtggggaata aacgcctgaa 
agaagctacc tgcaagctct 1740atttttacca gatgctcttg gctgtgcagt accttcatga aaacggtatt atacaccgtg 1800acttaaagcc agagaatgtt ttactgtcat ctcaagaaga ggactgtctt ataaagatta 1860ctgattttgg gcactccaag attttgggag agacctctct catgagaacc ttatgtggaa 1920cccccaccta cttggcgcct gaagttcttg tttctgttgg gactgctggg tataaccgtg 1980ctgtggactg ctggagttta ggagttattc tttttatctg ccttagtggg tatccacctt 2040tctctgagca taggactcaa gtgtcactga aggatcagat caccagtgga aaatacaact 2100tcattcctga agtctgggca gaagtctcag agaaagctct ggaccttgtc aagaagttgt 2160tggtagtgga tccaaaggca cgttttacga cagaagaagc cttaagacac ccgtggcttc 2220aggatgaaga catgaagaga aagtttcaag atcttctgtc tgaggaaaat gaatccacag 2280ctctacccca ggttctagcc cagccttcta ctagtcgaaa gcggccccgt gaaggggaag 2340ccgagggtgc cgagaccaca aagcgcccag ctgtgtgtgc tgctgtgttg tgaactccgt 2400ggtttgaaca cgaaagaaat gtaccttctt tcactctgtc atctttcttt tctttgagtc 2460tgttttttta tagtttgtat tttaattatg ggaataattg ctttttcaca gtcactgatg 2520tacaattaaa aacctgatgg aacctgg 25472972768DNAHomo sapiens 297cactgctgtg cagggcagga aagctccatg cacatagccc agcaaagagc aacacagagc 60tgaaaggaag actcagagga gagagataag taaggaaagt agtgatggct ctcatcccag 120acttggccat ggaaacctgg cttctcctgg ctgtcagcct ggtgctcctc tatctatatg 180gaacccattc acatggactt tttaagaagc ttggaattcc agggcccaca cctctgcctt 240ttttgggaaa tattttgtcc taccataagg gcttttgtat gtttgacatg gaatgtcata 300aaaagtatgg aaaagtgtgg ggcttttatg atggtcaaca gcctgtgctg gctatcacag 360atcctgacat gatcaaaaca gtgctagtga aagaatgtta ttctgtcttc acaaaccgga 420ggccttttgg tccagtggga tttatgaaaa gtgccatctc tatagctgag gatgaagaat 480ggaagagatt acgatcattg ctgtctccaa ccttcaccag tggaaaactc aaggagatgg 540tccctatcat tgcccagtat ggagatgtgt tggtgagaaa tctgaggcgg gaagcagaga 600caggcaagcc tgtcaccttg aaagacgtct ttggggccta cagcatggat gtgatcacta 660gcacatcatt tggagtgaac atcgactctc tcaacaatcc acaagacccc tttgtggaaa 720acaccaagaa gcttttaaga tttgattttt tggatccatt ctttctctca ataacagtct 780ttccattcct catcccaatt cttgaagtat taaatatctg tgtgtttcca agagaagtta 840caaatttttt aagaaaatct gtaaaaagga 
tgaaagaaag tcgcctcgaa gatacacaaa 900agcaccgagt ggatttcctt cagctgatga ttgactctca gaattcaaaa gaaactgagt 960cccacaaagc tctgtccgat ctggagctcg tggcccaatc aattatcttt atttttgctg 1020gctatgaaac cacgagcagt gttctctcct tcattatgta tgaactggcc actcaccctg 1080atgtccagca gaaactgcag gaggaaattg atgcagtttt acccaataag gcaccaccca 1140cctatgatac tgtgctacag atggagtatc ttgacatggt ggtgaatgaa acgctcagat 1200tattcccaat tgctatgaga cttgagaggg tctgcaaaaa agatgttgag atcaatggga 1260tgttcattcc caaaggggtg gtggtgatga ttccaagcta tgctcttcac cgtgacccaa 1320agtactggac agagcctgag aagttcctcc ctgaaagatt cagcaagaag aacaaggaca 1380acatagatcc ttacatatac acaccctttg gaagtggacc cagaaactgc attggcatga 1440ggtttgctct catgaacatg aaacttgctc taatcagagt ccttcagaac ttctccttca 1500aaccttgtaa agaaacacag atccccctga aattaagctt aggaggactt cttcaaccag 1560aaaaacccgt tgttctaaag gttgagtcaa gggatggcac cgtaagtgga gcctgaattt 1620tcctaaggac ttctgctttg ctcttcaaga aatctgtgcc tgagaacacc agagacctca 1680aattactttg tgaatagaac tctgaaatga agatgggctt catccaatgg actgcataaa 1740taaccgggga ttctgtacat gcattgagct ctctcattgt ctgtgtagag tgttatactt 1800gggaatataa aggaggtgac caaatcagtg tgaggaggta gatttggctc ctctgcttct 1860cacgggacta tttccaccac ccccagttag caccattaac tcctcctgag ctctgataag 1920agaatcaaca tttctcaata atttcctcca caaattatta atgaaaataa gaattatttt 1980gatggctcta acaatgacat ttatatcaca tgttttctct ggagtattct ataagtttta 2040tgttaaatca ataaagacca ctttacaaaa gtattatcag atgctttcct gcacattaag 2100gagaaatcta tagaactgaa tgagaaccaa caagtaaata tttttggtca ttgtaatcac 2160tgttggcgtg gggcctttgt cagaactaga atttgattat taacataggt gaaagttaat 2220ccactgtgac tttgcccatt gtttagaaag aatattcata gtttaattat gccttttttg 2280atcaggcaca gtggctcacg cctgtaatcc tagcagtttg ggaggctgag ccgggtggat 2340cgcctgaggt caggagttca agacaagcct ggcctacatg gttgaaaccc catctctact 2400aaaaatacac aaattagcta ggcatggtgg actcgcctgt aatctcacta cacaggaggc 2460tgaggcagga gaatcacttg aacctgggag gcggatgttg aagtgagctg agattgcacc 2520actgcactcc agtctgggtg agagtgagac tcagtcttaa aaaaatatgc ctttttgaag 
2580cacgtacatt ttgtaacaaa gaactgaagc tcttattata ttattagttt tgatttaatg 2640ttttcagccc atctcctttc atatttctgg gagacagaaa acatgtttcc ctacacctct 2700tgcattccat cctcaacacc caactgtctc gatgcaatga acacttaata aaaaacagtc 2760gattggtc 27682981358DNAHomo sapiens 298ggcgtccgcg cgctgcacaa tggcggctct gaagagttgg ctgtcgcgca gcgtaacttc 60attcttcagg tacagacagt gtttgtgtgt tcctgttgtg gctaacttta agaagcggtg 120tttctcagaa ttgataagac catggcacaa aactgtgacg attggctttg gagtaaccct 180gtgtgcggtt cctattgcac agaaatcaga gcctcattcc cttagtagtg aagcattgat 240gaggagagca gtgtctttgg taacagatag cacctctacc tttctctctc agaccacata 300tgcgttgatt gaagctatta ctgaatatac taaggctgtt tataccttaa cttctcttta 360ccgacaatat acaagtttac ttgggaaaat gaattcagag gaggaagatg aagtgtggca 420ggtgatcata ggagccagag ctgagatgac ttcaaaacac caagagtact tgaagctgga 480aaccacttgg atgactgcag ttggtctttc agagatggca gcagaagctg catatcaaac 540tggcgcagat caggcctcta taaccgccag gaatcacatt cagctggtga aactgcaggt 600ggaagaggtg caccagctct cccggaaagc agaaaccaag ctggcagaag cacagataga 660agagctccgt cagaaaacac aggaggaagg ggaggagcgg gctgagtcgg agcaggaggc 720ctacctgcgt gaggattgag ggcctgagca cactgccctg tctccccact cagtggggaa 780agcaggggca gatgccaccc tgcccagggt tggcatgact gtctgtgcac cgagaagagg 840cggcaggtcc tgccctggcc aatcaggcga gacgcctttg tgagctgtga gtgcctcctg 900tggtctcagg cttgcgctgg acctggttct tagcccttgg gcactgcacc ctgtttaaca 960tttcacccca ctctgtacag ctgctcttac ccattttttt tacctcacac ccaaagcatt 1020ttgcctacct gggtcagaga gaggagtcct ttttgtcatg cccttaagtt cagcaactgt 1080ttaacctgtt ttcagtctta tttacgtcgt caaaaatgat ttagtacttg ttccctctgt 1140tgggatgcca gttgtggcag ggggagggga acctgtccag tttgtacgat ttctttgtat 1200gtatttctga tgtgttctct gatctgcccc cactgtcctg tgaggacagc tgaggccaag 1260gagtgaaaaa cctattacta ctaagagaag gggtgcagag tgtttacctg gtgctctcaa 1320caggacttaa catcaacagg acttaacaca gaaaaaaa 13582994407DNAHomo sapiens 299tttcgactcg cgctccggct gctgtcactt ggctctctgg ctggagcttg aggacgcaag 60gagggtttgt cactggcaga ctcgagactg taggcactgc catggcccct gtgctcagta 120aggactcggc 
ggacatcgag agtatcctgg ctttaaatcc tcgaacacaa actcatgcaa 180ctctgtgttc cacttcggcc aagaaattag acaagaaaca ttggaaaaga aatcctgata 240agaactgctt taattgtgag aagctggaga ataattttga tgacatcaag cacacgactc 300ttggtgagcg aggagctctc cgagaagcaa tgagatgcct gaaatgtgca gatgccccgt 360gtcagaagag ctgtccaact aatcttgata ttaaatcatt catcacaagt attgcaaaca 420agaactatta tggagctgct aagatgatat tttctgacaa cccacttggt ctgacttgtg 480gaatggtatg tccaacctct gatctatgtg taggtggatg caatttatat gccactgaag 540agggacccat taatattggt ggattgcagc aatttgctac tgaggtattc aaagcaatga 600gtatcccaca gatcagaaat ccttcgctgc ctcccccaga aaaaatgtct gaagcctatt 660ctgcaaagat tgctcttttt ggtgctgggc ctgcaagtat aagttgtgct tcctttttgg 720ctcgattggg gtactctgac atcactatat ttgaaaaaca agaatatgtt ggtggtttaa 780gtacttctga aattcctcag ttccggctgc cgtatgatgt agtgaatttt gagattgagc 840taatgaagga ccttggtgta aagataattt gcggtaaaag cctttcagtg aatgaaatga 900ctcttagcac tttgaaagaa aaaggctaca aagctgcttt cattggaata ggtttgccag 960aacccaataa agatgccatc ttccaaggcc tgacgcagga ccaggggttt tatacatcca 1020aagacttttt gccacttgta gccaaaggca gtaaagcagg aatgtgcgcc tgtcactctc 1080cattgccatc gatacgggga gtcgtgattg tacttggagc tggagacact gccttcgact 1140gtgcaacatc tgctctacgt tgtggagctc gccgagtgtt catcgtcttc agaaaaggct 1200ttgttaatat aagagctgtc cctgaggaga tggagcttgc taaggaagaa aagtgtgaat 1260ttctgccatt cctgtcccca cggaaggtta tagtaaaagg tgggagaatt gttgctatgc 1320agtttgttcg gacagagcaa gatgaaactg gaaaatggaa tgaagatgaa gatcagatgg 1380tccatctgaa agccgatgtg gtcatcagtg cctttggttc agttctgagt gatcctaaag 1440taaaagaagc cttgagccct ataaaattta acagatgggg tctcccagaa gtagatccag 1500aaactatgca aactagtgaa gcatgggtat ttgcaggtgg tgatgtcgtt ggtttggcta 1560acactacagt ggaatcggtg aatgatggaa agcaagcttc ttggtacatt cacaaatacg 1620tacagtcaca atatggagct tccgtttctg ccaagcctga actacccctc ttttacactc 1680ctattgatct ggtggacatt agtgtagaaa tggccggatt gaagtttata aatccttttg 1740gtcttgctag cgcaactcca gccaccagca catcaatgat tcgaagagct tttgaagctg 1800gatggggttt tgccctcacc aaaactttct ctcttgataa ggacattgtg 
acaaatgttt 1860cccccagaat catccgggga accacctctg gccccatgta tggccctgga caaagctcct 1920ttctgaatat tgagctcatc agtgagaaaa cggctgcata ttggtgtcaa agtgtcactg 1980aactaaaggc tgacttccca gacaacattg tgattgctag cattatgtgc agttacaata 2040aaaatgactg gacggaactt gccaagaagt ctgaggattc tggagcagat gccctggagt 2100taaatttatc atgtccacat ggcatgggag aaagaggaat gggcctggcc tgtgggcagg 2160atccagagct ggtgcggaac atctgccgct gggttaggca agctgttcag attccttttt 2220ttgccaagct gaccccaaat gtcactgata ttgtgagcat cgcaagagct gcaaaggaag 2280gtggtgccaa tggcgttaca gccaccaaca ctgtctcagg tctgatggga ttaaaatctg 2340atggcacacc ttggccagca gtggggattg caaagcgaac tacatatgga ggagtgtctg 2400ggacagcaat cagacctatt gctttgagag ctgtgacctc cattgctcgt gctctgcctg 2460gatttcccat tttggctact ggtggaattg actctgctga aagtggtctt cagtttctcc 2520atagtggtgc ttccgtcctc caggtatgca gtgccattca gaatcaggat ttcactgtga 2580tcgaagacta ctgcactggc ctcaaagccc tgctttatct gaaaagcatt gaagaactac 2640aagactggga tggacagagt ccagctactg tgagtcacca gaaagggaaa ccagttccac 2700gtatagctga actcatggac aagaaactgc caagttttgg accttatctg gaacagcgca 2760agaaaatcat agcagaaaac aagattagac tgaaagaaca aaatgtagct ttttcaccac 2820ttaagagaag ctgttttatc cccaaaaggc ctattcctac catcaaggat gtaataggaa 2880aagcactgca gtaccttgga acatttggtg aattgagcaa cgtagagcaa gttgtggcta 2940tgattgatga agaaatgtgt atcaactgtg gtaaatgcta catgacctgt aatgattctg 3000gctaccaggc tatacagttt gatccagaaa cccacctgcc caccataacc gacacttgta 3060caggctgtac tctgtgtctc agtgtttgcc ctattgtcga ctgcatcaaa atggtttcca 3120ggacaacacc ttatgaacca aagagaggcg tacccttatc tgtgaatccg gtgtgttaag 3180gtgatttgtg aaacagttgc tgtgaacttt catgtcacct acatatgctg atctcttaaa 3240atcatgatcc ttgtgttcag ctctttccaa attaaaacaa atatacattt tctaaataaa 3300aatatgtaat ttcaaaatac atttgtaagt gtaaaaaatg tctcatgtca atgaccattc 3360aattagtggc ataaaataga ataattcttt tctgaggata gtagttaaat aactgtgtgg 3420cagttaattg gatgttcact gccagttgtc ttatgtgaaa aattaacttt ttgtgtggca 3480attagtgtga cagtttccaa attgccctat gctgtgctcc atatttgatt tctaattgta 3540agtgaaatta agcattttga 
aacaaagtac tctttaacat acaagaaaat gtatccaagg 3600aaacatttta tcaataaaaa ttacctttaa ttttaatgct gtttctaaga aaatgtagtt 3660agctccataa agtacaaatg aagaaagtca aaaattattt gctatggcag gataagaaag 3720cctaaaattg agtttgtgga ctttattaag taaaatcccc ttcgctgaaa ttgcttattt 3780ttggtgttgg atagaggata gggagaatat ttactaacta aataccattc actactcatg 3840cgtgagatgg gtgtacaaac tcatcctctt ttaatggcat ttctctttaa actatgttcc 3900taaccaaatg agatgatagg atagatcctg gttaccactc ttttactgtg cacatatggg 3960ccccggaatt ctttaatagt caccttcatg attatagcaa ctaatgtttg aacaaagctc 4020aaagtatgca atgcttcatt attcaagaat gaaaaatata atgttgataa tatatattaa 4080gtgtgccaaa tcagtttgac tactctctgt tttagtgttt atgtttaaaa gaaatatatt 4140ttttgttatt attagataat atttttgtat ttctctattt tcataatcag taaatagtgt 4200catataaact catttatctc ctcttcatgg catcttcaat atgaatctat aagtagtaaa 4260tcagaaagta acaatctatg gcttatttct atgacaaatt caagagctag aaaaataaaa 4320tgtttcatta tgcactttta gaaatgcata tttgccacaa aacctgtatt actgaataat 4380atcaaataaa atatcataaa gcatttt 44073005532DNAHomo sapiens 300gccgcgctgc gccggagtcc cgagctagcc ccggcgccgc cgccgcccag accggacgac 60aggccacctc gtcggcgtcc gcccgagtcc ccgcctcgcc gccaacgcca caaccaccgc 120gcacggcccc ctgactccgt ccagtattga tcgggagagc cggagcgagc tcttcgggga 180gcagcgatgc gaccctccgg gacggccggg gcagcgctcc tggcgctgct ggctgcgctc 240tgcccggcga gtcgggctct ggaggaaaag aaagtttgcc aaggcacgag taacaagctc 300acgcagttgg gcacttttga agatcatttt ctcagcctcc agaggatgtt caataactgt 360gaggtggtcc ttgggaattt ggaaattacc tatgtgcaga ggaattatga tctttccttc 420ttaaagacca tccaggaggt ggctggttat gtcctcattg ccctcaacac agtggagcga 480attcctttgg aaaacctgca gatcatcaga ggaaatatgt actacgaaaa ttcctatgcc 540ttagcagtct tatctaacta tgatgcaaat aaaaccggac tgaaggagct gcccatgaga 600aatttacagg aaatcctgca tggcgccgtg cggttcagca acaaccctgc cctgtgcaac 660gtggagagca tccagtggcg ggacatagtc agcagtgact ttctcagcaa catgtcgatg 720gacttccaga accacctggg cagctgccaa aagtgtgatc caagctgtcc caatgggagc 780tgctggggtg caggagagga gaactgccag aaactgacca aaatcatctg tgcccagcag 840tgctccgggc 
gctgccgtgg caagtccccc agtgactgct gccacaacca gtgtgctgca 900ggctgcacag gcccccggga gagcgactgc ctggtctgcc gcaaattccg agacgaagcc 960acgtgcaagg acacctgccc cccactcatg ctctacaacc ccaccacgta ccagatggat 1020gtgaaccccg agggcaaata cagctttggt gccacctgcg tgaagaagtg tccccgtaat 1080tatgtggtga cagatcacgg ctcgtgcgtc cgagcctgtg gggccgacag ctatgagatg 1140gaggaagacg gcgtccgcaa gtgtaagaag tgcgaagggc cttgccgcaa agtgtgtaac 1200ggaataggta ttggtgaatt taaagactca ctctccataa atgctacgaa tattaaacac 1260ttcaaaaact gcacctccat cagtggcgat ctccacatcc tgccggtggc atttaggggt 1320gactccttca cacatactcc tcctctggat ccacaggaac tggatattct gaaaaccgta 1380aaggaaatca cagggttttt gctgattcag gcttggcctg aaaacaggac ggacctccat 1440gcctttgaga acctagaaat catacgcggc aggaccaagc aacatggtca gttttctctt 1500gcagtcgtca gcctgaacat aacatccttg ggattacgct ccctcaagga gataagtgat 1560ggagatgtga taatttcagg aaacaaaaat ttgtgctatg caaatacaat aaactggaaa 1620aaactgtttg ggacctccgg tcagaaaacc aaaattataa gcaacagagg tgaaaacagc 1680tgcaaggcca caggccaggt ctgccatgcc ttgtgctccc ccgagggctg ctggggcccg 1740gagcccaggg actgcgtctc ttgccggaat gtcagccgag gcagggaatg cgtggacaag 1800tgcaagcttc tggagggtga gccaagggag tttgtggaga actctgagtg catacagtgc 1860cacccagagt gcctgcctca ggccatgaac atcacctgca caggacgggg accagacaac 1920tgtatccagt gtgcccacta cattgacggc ccccactgcg tcaagacctg cccggcagga
1980gtcatgggag aaaacaacac cctggtctgg aagtacgcag acgccggcca tgtgtgccac 2040ctgtgccatc caaactgcac ctacggatgc actgggccag gtcttgaagg ctgtccaacg 2100aatgggccta agatcccgtc catcgccact gggatggtgg gggccctcct cttgctgctg 2160gtggtggccc tggggatcgg cctcttcatg cgaaggcgcc acatcgttcg gaagcgcacg 2220ctgcggaggc tgctgcagga gagggagctt gtggagcctc ttacacccag tggagaagct 2280cccaaccaag ctctcttgag gatcttgaag gaaactgaat tcaaaaagat caaagtgctg 2340ggctccggtg cgttcggcac ggtgtataag ggactctgga tcccagaagg tgagaaagtt 2400aaaattcccg tcgctatcaa ggaattaaga gaagcaacat ctccgaaagc caacaaggaa 2460atcctcgatg aagcctacgt gatggccagc gtggacaacc cccacgtgtg ccgcctgctg 2520ggcatctgcc tcacctccac cgtgcaactc atcacgcagc tcatgccctt cggctgcctc 2580ctggactatg tccgggaaca caaagacaat attggctccc agtacctgct caactggtgt 2640gtgcagatcg caaagggcat gaactacttg gaggaccgtc gcttggtgca ccgcgacctg 2700gcagccagga acgtactggt gaaaacaccg cagcatgtca agatcacaga ttttgggctg 2760gccaaactgc tgggtgcgga agagaaagaa taccatgcag aaggaggcaa agtgcctatc 2820aagtggatgg cattggaatc aattttacac agaatctata cccaccagag tgatgtctgg 2880agctacgggg tgaccgtttg ggagttgatg acctttggat ccaagccata tgacggaatc 2940cctgccagcg agatctcctc catcctggag aaaggagaac gcctccctca gccacccata 3000tgtaccatcg atgtctacat gatcatggtc aagtgctgga tgatagacgc agatagtcgc 3060ccaaagttcc gtgagttgat catcgaattc tccaaaatgg cccgagaccc ccagcgctac 3120cttgtcattc agggggatga aagaatgcat ttgccaagtc ctacagactc caacttctac 3180cgtgccctga tggatgaaga agacatggac gacgtggtgg atgccgacga gtacctcatc 3240ccacagcagg gcttcttcag cagcccctcc acgtcacgga ctcccctcct gagctctctg 3300agtgcaacca gcaacaattc caccgtggct tgcattgata gaaatgggct gcaaagctgt 3360cccatcaagg aagacagctt cttgcagcga tacagctcag accccacagg cgccttgact 3420gaggacagca tagacgacac cttcctccca gtgcctgaat acataaacca gtccgttccc 3480aaaaggcccg ctggctctgt gcagaatcct gtctatcaca atcagcctct gaaccccgcg 3540cccagcagag acccacacta ccaggacccc cacagcactg cagtgggcaa ccccgagtat 3600ctcaacactg tccagcccac ctgtgtcaac agcacattcg acagccctgc ccactgggcc 3660cagaaaggca gccaccaaat tagcctggac 
aaccctgact accagcagga cttctttccc 3720aaggaagcca agccaaatgg catctttaag ggctccacag ctgaaaatgc agaataccta 3780agggtcgcgc cacaaagcag tgaatttatt ggagcatgac cacggaggat agtatgagcc 3840ctaaaaatcc agactctttc gatacccagg accaagccac agcaggtcct ccatcccaac 3900agccatgccc gcattagctc ttagacccac agactggttt tgcaacgttt acaccgacta 3960gccaggaagt acttccacct cgggcacatt ttgggaagtt gcattccttt gtcttcaaac 4020tgtgaagcat ttacagaaac gcatccagca agaatattgt ccctttgagc agaaatttat 4080ctttcaaaga ggtatatttg aaaaaaaaaa aaaaagtata tgtgaggatt tttattgatt 4140ggggatcttg gagtttttca ttgtcgctat tgatttttac ttcaatgggc tcttccaaca 4200aggaagaagc ttgctggtag cacttgctac cctgagttca tccaggccca actgtgagca 4260aggagcacaa gccacaagtc ttccagagga tgcttgattc cagtggttct gcttcaaggc 4320ttccactgca aaacactaaa gatccaagaa ggccttcatg gccccagcag gccggatcgg 4380tactgtatca agtcatggca ggtacagtag gataagccac tctgtccctt cctgggcaaa 4440gaagaaacgg aggggatgaa ttcttcctta gacttacttt tgtaaaaatg tccccacggt 4500acttactccc cactgatgga ccagtggttt ccagtcatga gcgttagact gacttgtttg 4560tcttccattc cattgttttg aaactcagta tgccgcccct gtcttgctgt catgaaatca 4620gcaagagagg atgacacatc aaataataac tcggattcca gcccacattg gattcatcag 4680catttggacc aatagcccac agctgagaat gtggaatacc taaggataac accgcttttg 4740ttctcgcaaa aacgtatctc ctaatttgag gctcagatga aatgcatcag gtcctttggg 4800gcatagatca gaagactaca aaaatgaagc tgctctgaaa tctcctttag ccatcacccc 4860aaccccccaa aattagtttg tgttacttat ggaagatagt tttctccttt tacttcactt 4920caaaagcttt ttactcaaag agtatatgtt ccctccaggt cagctgcccc caaaccccct 4980ccttacgctt tgtcacacaa aaagtgtctc tgccttgagt catctattca agcacttaca 5040gctctggcca caacagggca ttttacaggt gcgaatgaca gtagcattat gagtagtgtg 5100aattcaggta gtaaatatga aactagggtt tgaaattgat aatgctttca caacatttgc 5160agatgtttta gaaggaaaaa agttccttcc taaaataatt tctctacaat tggaagattg 5220gaagattcag ctagttagga gcccattttt tcctaatctg tgtgtgccct gtaacctgac 5280tggttaacag cagtcctttg taaacagtgt tttaaactct cctagtcaat atccacccca 5340tccaatttat caaggaagaa atggttcaga aaatattttc agcctacagt tatgttcagt 
5400cacacacaca tacaaaatgt tccttttgct tttaaagtaa tttttgactc ccagatcagt 5460cagagcccct acagcattgt taagaaagta tttgattttt gtctcaatga aaataaaact 5520atattcattt cc 55323011528DNAHomo sapiens 301cggcgagcga gcaccttcga cgcggtccgg ggaccccctc gtcgctgtcc tcccgacgcg 60gacccgcgtg ccccaggcct cgcgctgccc ggccggctcc tcgtgtccca ctcccggcgc 120acgccctccc gcgagtcccg ggcccctccc gcgcccctct tctcggcgcg cgcgcagcat 180ggcgcccccg caggtcctcg cgttcgggct tctgcttgcc gcggcgacgg cgacttttgc 240cgcagctcag gaagaatgtg tctgtgaaaa ctacaagctg gccgtaaact gctttgtgaa 300taataatcgt caatgccagt gtacttcagt tggtgcacaa aatactgtca tttgctcaaa 360gctggctgcc aaatgtttgg tgatgaaggc agaaatgaat ggctcaaaac ttgggagaag 420agcaaaacct gaaggggccc tccagaacaa tgatgggctt tatgatcctg actgcgatga 480gagcgggctc tttaaggcca agcagtgcaa cggcacctcc acgtgctggt gtgtgaacac 540tgctggggtc agaagaacag acaaggacac tgaaataacc tgctctgagc gagtgagaac 600ctactggatc atcattgaac taaaacacaa agcaagagaa aaaccttatg atagtaaaag 660tttgcggact gcacttcaga aggagatcac aacgcgttat caactggatc caaaatttat 720cacgagtatt ttgtatgaga ataatgttat cactattgat ctggttcaaa attcttctca 780aaaaactcag aatgatgtgg acatagctga tgtggcttat tattttgaaa aagatgttaa 840aggtgaatcc ttgtttcatt ctaagaaaat ggacctgaca gtaaatgggg aacaactgga 900tctggatcct ggtcaaactt taatttatta tgttgatgaa aaagcacctg aattctcaat 960gcagggtcta aaagctggtg ttattgctgt tattgtggtt gtggtgatag cagttgttgc 1020tggaattgtt gtgctggtta tttccagaaa gaagagaatg gcaaagtatg agaaggctga 1080gataaaggag atgggtgaga tgcataggga actcaatgca taactatata atttgaagat 1140tatagaagaa gggaaatagc aaatggacac aaattacaaa tgtgtgtgcg tgggacgaag 1200acatctttga aggtcatgag tttgttagtt taacatcata tatttgtaat agtgaaacct 1260gtactcaaaa tataagcagc ttgaaactgg ctttaccaat cttgaaattt gaccacaagt 1320gtcttatata tgcagatcta atgtaaaatc cagaacttgg actccatcgt taaaattatt 1380tatgtgtaac attcaaatgt gtgcattaaa tatgcttcca cagtaaaatc tgaaaaactg 1440atttgtgatt gaaagctgcc tttctattta cttgagtctt gtacatacat acttttttat 1500gagctatgaa ataaaacatt ttaaactg 15283021856DNAHomo sapiens 302ctgacttggc 
aggactgtgc aattgtcaga aggccgtggg gagtgggggc cagtgcctgc 60agcctgccct gcctctctca caggccctta gagcatcgcc aggtgcagag ctccacagct 120ctctttccca aggagtaatc agagggtgag aacgtggagc ctggtggaca ggtgaaagca 180ctgggatctt tctgcccaga aaggggaaag ttgcacattt atatcctaga gggaagcgac 240agcagtgctt ctccctgtgc tgaggtacag gagccatgtg gctagaaatc ctcctcactt 300cagtgctggg ctttgccatc tactggttca tctcccggga caaagaggaa actttgccac 360ttgaagatgg gtggtggggg ccaggcacga ggtccgcagc cagggaggac gacagcatcc 420gccctttcaa ggtggaaacg tcagatgagg agatccacga cttacaccag aggatcgata 480agttccgttt caccccacct ttggaggaca gctgcttcca ctatggcttc aactccaact 540acctgaagaa agtcatctcc tactggcgga atgaatttga ctggaagaag caggtggaga 600ttctcaacag ataccctcac ttcaagacta agattgaagg gctggacatc cacttcatcc 660acgtgaagcc cccccagctg cccgcaggcc ataccccgaa gcccttgctg atggtgcacg 720gctggcccgg ctctttctac gagttttata agatcatccc actcctgact gaccccaaga 780accatggcct gagcgatgag cacgtttttg aagtcatctg cccttccatc cctggctatg 840gcttctcaga ggcatcctcc aagaaggggt tcaactcggt ggccaccgcc aggatctttt 900acaagctgat gctgcggctg ggcttccagg aattctacat tcaaggaggg gactgggggt 960ccctgatctg cactaatatg gcccagctgg tgcccagcca cgtgaaaggc ctgcacttga 1020acatggcttt ggttttaagc aacttctcta ccctgaccct cctcctggga cagcgtttcg 1080ggaggtttct tggcctcact gagagggatg tggagctgct gtaccccgtc aaggagaagg 1140tattctacag cctgatgagg gagagcggct acatgcacat ccagtgcacc aagcctgaca 1200ccgtaggctc tgctctgaat gactctcctg tgggtctggc tgcctatatt ctagagaagt 1260tttccacctg gaccaatacg gaattccgat acctggagga tggaggcctg gaaaggaagt 1320tctccctgga cgacctgctg accaacgtca tgctctactg gacaacaggc accatcatct 1380cctcccagcg cttctacaag gagaacctgg gacagggctg gatgacccag aagcatgagc 1440ggatgaaggt ctatgtgccc actggcttct ctgccttccc ttttgagcta ttgcacacgc 1500ctgaaaagtg ggtgaggttc aagtacccaa agctcatctc ctattcctac atggttcgtg 1560ggggccactt tgcggccttt gaggagccgg agctgctcgc ccaggacatc cgcaagttcc 1620tgtcggtgct ggagcggcaa tgacccaccc ctctcccccc gcctgccacc tccccccaca 1680agtgccctcc aggcttttct tggggaagat accccttttc tgaggaatga gtttgcctcc 
1740gtcccctgcc catgctggga gcccacgctc accccctcac ccctccaagc tcactcccca 1800acccccaact ccgtgtggta agcaacatgg ctttgatgat aaacgacttt actcta 18563036450DNAHomo sapiens 303gagttgtgcc tggagtgatg tttaagccaa tgtcagggca aggcaacagt ccctggccgt 60cctccagcac ctttgtaatg catatgagct cgggagacca gtacttaaag ttggaggccc 120gggagcccag gagctggcgg agggcgttcg tcctgggagc tgcacttgct ccgtcgggtc 180gccggcttca ccggaccgca ggctcccggg gcagggccgg ggccagagct cgcgtgtcgg 240cgggacatgc gctgcgtcgc ctctaacctc gggctgtgct ctttttccag gtggcccgcc 300ggtttctgag ccttctgccc tgcggggaca cggtctgcac cctgcccgcg gccacggacc 360atgaccatga ccctccacac caaagcatct gggatggccc tactgcatca gatccaaggg 420aacgagctgg agcccctgaa ccgtccgcag ctcaagatcc ccctggagcg gcccctgggc 480gaggtgtacc tggacagcag caagcccgcc gtgtacaact accccgaggg cgccgcctac 540gagttcaacg ccgcggccgc cgccaacgcg caggtctacg gtcagaccgg cctcccctac 600ggccccgggt ctgaggctgc ggcgttcggc tccaacggcc tggggggttt ccccccactc 660aacagcgtgt ctccgagccc gctgatgcta ctgcacccgc cgccgcagct gtcgcctttc 720ctgcagcccc acggccagca ggtgccctac tacctggaga acgagcccag cggctacacg 780gtgcgcgagg ccggcccgcc ggcattctac aggccaaatt cagataatcg acgccagggt 840ggcagagaaa gattggccag taccaatgac aagggaagta tggctatgga atctgccaag 900gagactcgct actgtgcagt gtgcaatgac tatgcttcag gctaccatta tggagtctgg 960tcctgtgagg gctgcaaggc cttcttcaag agaagtattc aaggacataa cgactatatg 1020tgtccagcca ccaaccagtg caccattgat aaaaacagga ggaagagctg ccaggcctgc 1080cggctccgca aatgctacga agtgggaatg atgaaaggtg ggatacgaaa agaccgaaga 1140ggagggagaa tgttgaaaca caagcgccag agagatgatg gggagggcag gggtgaagtg 1200gggtctgctg gagacatgag agctgccaac ctttggccaa gcccgctcat gatcaaacgc 1260tctaagaaga acagcctggc cttgtccctg acggccgacc agatggtcag tgccttgttg 1320gatgctgagc cccccatact ctattccgag tatgatccta ccagaccctt cagtgaagct 1380tcgatgatgg gcttactgac caacctggca gacagggagc tggttcacat gatcaactgg 1440gcgaagaggg tgccaggctt tgtggatttg accctccatg atcaggtcca ccttctagaa 1500tgtgcctggc tagagatcct gatgattggt ctcgtctggc gctccatgga gcacccagtg 1560aagctactgt ttgctcctaa 
cttgctcttg gacaggaacc agggaaaatg tgtagagggc 1620atggtggaga tcttcgacat gctgctggct acatcatctc ggttccgcat gatgaatctg 1680cagggagagg agtttgtgtg cctcaaatct attattttgc ttaattctgg agtgtacaca 1740tttctgtcca gcaccctgaa gtctctggaa gagaaggacc atatccaccg agtcctggac 1800aagatcacag acactttgat ccacctgatg gccaaggcag gcctgaccct gcagcagcag 1860caccagcggc tggcccagct cctcctcatc ctctcccaca tcaggcacat gagtaacaaa 1920ggcatggagc atctgtacag catgaagtgc aagaacgtgg tgcccctcta tgacctgctg 1980ctggagatgc tggacgccca ccgcctacat gcgcccacta gccgtggagg ggcatccgtg 2040gaggagacgg accaaagcca cttggccact gcgggctcta cttcatcgca ttccttgcaa 2100aagtattaca tcacggggga ggcagagggt ttccctgcca cagtctgaga gctccctggc 2160tcccacacgg ttcagataat ccctgctgca ttttaccctc atcatgcacc actttagcca 2220aattctgtct cctgcataca ctccggcatg catccaacac caatggcttt ctagatgagt 2280ggccattcat ttgcttgctc agttcttagt ggcacatctt ctgtcttctg ttgggaacag 2340ccaaagggat tccaaggcta aatctttgta acagctctct ttcccccttg ctatgttact 2400aagcgtgagg attcccgtag ctcttcacag ctgaactcag tctatgggtt ggggctcaga 2460taactctgtg catttaagct acttgtagag acccaggcct ggagagtaga cattttgcct 2520ctgataagca ctttttaaat ggctctaaga ataagccaca gcaaagaatt taaagtggct 2580cctttaattg gtgacttgga gaaagctagg tcaagggttt attatagcac cctcttgtat 2640tcctatggca atgcatcctt ttatgaaagt ggtacacctt aaagctttta tatgactgta 2700gcagagtatc tggtgattgt caattcactt ccccctatag gaatacaagg ggccacacag 2760ggaaggcaga tcccctagtt ggccaagact tattttaact tgatacactg cagattcaga 2820gtgtcctgaa gctctgcctc tggctttccg gtcatgggtt ccagttaatt catgcctccc 2880atggacctat ggagagcaac aagttgatct tagttaagtc tccctatatg agggataagt 2940tcctgatttt tgtttttatt tttgtgttac aaaagaaagc cctccctccc tgaacttgca 3000gtaaggtcag cttcaggacc tgttccagtg ggcactgtac ttggatcttc ccggcgtgtg 3060tgtgccttac acaggggtga actgttcact gtggtgatgc atgatgaggg taaatggtag 3120ttgaaaggag caggggccct ggtgttgcat ttagccctgg ggcatggagc tgaacagtac 3180ttgtgcagga ttgttgtggc tactagagaa caagagggaa agtagggcag aaactggata 3240cagttctgag cacagccaga cttgctcagg tggccctgca caggctgcag 
ctacctagga 3300acattccttg cagaccccgc attgcctttg ggggtgccct gggatccctg gggtagtcca 3360gctcttattc atttcccagc gtggccctgg ttggaagaag cagctgtcaa gttgtagaca 3420gctgtgttcc tacaattggc ccagcaccct ggggcacggg agaagggtgg ggaccgttgc 3480tgtcactact caggctgact ggggcctggt cagattacgt atgcccttgg tggtttagag 3540ataatccaaa atcagggttt ggtttgggga agaaaatcct cccccttcct cccccgcccc 3600gttccctacc gcctccactc ctgccagctc atttccttca atttcctttg acctataggc 3660taaaaaagaa aggctcattc cagccacagg gcagccttcc ctgggccttt gcttctctag 3720cacaattatg ggttacttcc tttttcttaa caaaaaagaa tgtttgattt cctctgggtg 3780accttattgt ctgtaattga aaccctattg agaggtgatg tctgtgttag ccaatgaccc 3840aggtagctgc tcgggcttct cttggtatgt cttgtttgga aaagtggatt tcattcattt 3900ctgattgtcc agttaagtga tcaccaaagg actgagaatc tgggagggca aaaaaaaaaa 3960aaaaagtttt tatgtgcact taaatttggg gacaatttta tgtatctgtg ttaaggatat 4020gcttaagaac ataattcttt tgttgctgtt tgtttaagaa gcaccttagt ttgtttaaga 4080agcaccttat atagtataat atatattttt ttgaaattac attgcttgtt tatcagacaa 4140ttgaatgtag taattctgtt ctggatttaa tttgactggg ttaacatgca aaaaccaagg 4200aaaaatattt agtttttttt tttttttttg tatacttttc aagctacctt gtcatgtata 4260cagtcattta tgcctaaagc ctggtgatta ttcatttaaa tgaagatcac atttcatatc 4320aacttttgta tccacagtag acaaaatagc actaatccag atgcctattg ttggatattg 4380aatgacagac aatcttatgt agcaaagatt atgcctgaaa aggaaaatta ttcagggcag 4440ctaattttgc ttttaccaaa atatcagtag taatattttt ggacagtagc taatgggtca 4500gtgggttctt tttaatgttt atacttagat tttcttttaa aaaaattaaa ataaaacaaa 4560aaaaatttct aggactagac gatgtaatac cagctaaagc caaacaatta tacagtggaa 4620ggttttacat tattcatcca atgtgtttct attcatgtta agatactact acatttgaag 4680tgggcagaga acatcagatg attgaaatgt tcgcccaggg gtctccagca actttggaaa 4740tctctttgta tttttacttg aagtgccact aatggacagc agatattttc tggctgatgt 4800tggtattggg tgtaggaaca tgatttaaaa aaaaaactct tgcctctgct ttcccccact 4860ctgaggcaag ttaaaatgta aaagatgtga tttatctggg gggctcaggt atggtgggga 4920agtggattca ggaatctggg gaatggcaaa tatattaaga agagtattga aagtatttgg 4980aggaaaatgg ttaattctgg 
gtgtgcacca aggttcagta gagtccactt ctgccctgga 5040gaccacaaat caactagctc catttacagc catttctaaa atggcagctt cagttctaga 5100gaagaaagaa caacatcagc agtaaagtcc atggaatagc tagtggtctg tgtttctttt 5160cgccattgcc tagcttgccg taatgattct ataatgccat catgcagcaa ttatgagagg 5220ctaggtcatc caaagagaag accctatcaa tgtaggttgc aaaatctaac ccctaaggaa 5280gtgcagtctt tgatttgatt tccctagtaa ccttgcagat atgtttaacc aagccatagc 5340ccatgccttt tgagggctga acaaataagg gacttactga taatttactt ttgatcacat 5400taaggtgttc tcaccttgaa atcttataca ctgaaatggc cattgattta ggccactggc 5460ttagagtact ccttcccctg catgacactg attacaaata ctttcctatt catactttcc 5520aattatgaga tggactgtgg gtactgggag tgatcactaa caccatagta atgtctaata 5580ttcacaggca gatctgcttg gggaagctag ttatgtgaaa ggcaaataaa gtcatacagt 5640agctcaaaag gcaaccataa ttctctttgg tgcaagtctt gggagcgtga tctagattac 5700actgcaccat tcccaagtta atcccctgaa aacttactct caactggagc aaatgaactt 5760tggtcccaaa tatccatctt ttcagtagcg ttaattatgc tctgtttcca actgcatttc 5820ctttccaatt gaattaaagt gtggcctcgt ttttagtcat ttaaaattgt tttctaagta 5880attgctgcct ctattatggc acttcaattt tgcactgtct tttgagattc aagaaaaatt 5940tctattcatt tttttgcatc caattgtgcc tgaactttta aaatatgtaa atgctgccat 6000gttccaaacc catcgtcagt gtgtgtgttt agagctgtgc accctagaaa caacatactt 6060gtcccatgag caggtgcctg agacacagac ccctttgcat tcacagagag gtcattggtt 6120atagagactt gaattaataa gtgacattat gccagtttct gttctctcac aggtgataaa 6180caatgctttt tgtgcactac atactcttca gtgtagagct cttgttttat gggaaaaggc 6240tcaaatgcca aattgtgttt gatggattaa tatgcccttt tgccgatgca tactattact 6300gatgtgactc ggttttgtcg cagctttgct ttgtttaatg aaacacactt gtaaacctct 6360tttgcacttt gaaaaagaat ccagcgggat gctcgagcac ctgtaaacaa ttttctcaac 6420ctatttgatg ttcaaataaa gaattaaact 64503043336DNAHomo sapiensunsure(0)...(0)n = A, T, C or G 304cggcggcgac tgcagtctgg agggtccaca cttgtgattc tcaatggaga gtgaaaacgc 60agattcataa tgaaagctag cccccgtcgg ccactgattc tcaaaagacg gaggctgccc 120cttcctgttc aaaatgcccc aagtgaaaca tcagaggagg aacctaagag atcccctgcc 180caacaggagt ctaatcaagc agaggcctcc 
aaggaagtgg cggagtccaa ctcttgcaag 240tttccagctg ggatcaagat tattaaccac cccaccatgc ccaacacgca agtagtggcc 300atccccaaca atgctaatat tcacagcatc atcacagcac tgactgccaa gggaaaagag 360agtggcagta gtgggcccaa caaattcatc ctcatcagct gtgggggagc cccaactcag 420cctccaggac tccggcctca aacccaaacc agctatgatg ccaaaaggac agaagtgacc 480ctggagacct tgggaccaaa acctgcagct agggatgtga atcttcctag accacctgga 540gccctttgcg agcagaaacg ggagacctgt gcagatggtg aggcagcagg ctgcactatc 600aacaatagcc tatccaacat ccagtggctt cgaaagatga gttctgatgg actgggctcc 660cgcagcatca agcaagagat ggaggaaaag gagaattgtc acctggagca gcgacaggtt 720aaggttgagg agccttcgag accatcagcg tcctggcaga actctgtgtc tgagcggcca 780ccctactctt acatggccat gatacaattc gccatcaaca gcactgagag gaagcgcatg 840actttgaaag acatctatac gtggattgag gaccactttc cctactttaa gcacattgcc 900aagccaggct ggaagaactc catccgccac aacctttccc tgcacgacat gtttgtccgg 960gagacgtctg ccaatggcaa ggtctccttc tggaccattc accccagtgc caaccgctac 1020ttgacattgg accaggtgtt taagccactg gacccagggt ctccacaatt gcccgagcac 1080ttggaatcac agcagaaacg accgaatcca gagctccgcc ggaacatgac catcaaaacc 1140gaactccccc tgggcgcacg gcggaagatg aagccactgc taccacgggt cagctcatac 1200ctggtaccta tccagttccc ggtgaaccag tcactggtgt tgcagccctc ggtgaaggtg 1260ccattgcccc tggcggcttc cctcatgagc tcagagcttg cccgccatag caagcgagtc 1320cgcattgccc ccaaggtgct gctagctgag gaggggatag ctcctctttc ttctgcagga 1380ccagggaaag aggagaaact cctgtttgga gaagggtttt ctcctttgct tccagttcag 1440actatcaagg
aggaagaaat ccagcctggg gaggaaatgc cacacttagc gagacccatc 1500aaagtggaga gccctccctt ggaagagtgg ccctccccgg ccccatcttt caaagaggaa 1560tcatctcact cctgggagga ttcgtcccaa tctcccaccc caagacccaa gaagtcctac 1620agtgggctta ggtccccaac ccggtgtgtc tcggaaatgc ttgtgattca acacagggag 1680aggagggaga ggagccggtc tcggaggaaa cagcatctac tgcctccctg tgtggatgag 1740ccggagctgc tcttctcaga ggggcccagt acttcccgct gggccgcaga gctcccgttc 1800ccagcagact cctctgaccc tgcctcccag ctcagctact cccaggaagt gggaggacct 1860tttaagacac ccattaagga aacgctgccc atctcctcca ccccgagcaa atctgtcctc 1920cccagaaccc ctgaatcctg gaggctcacg cccccagcca aagtaggggg actggatttc 1980agcccagtac aaacctccca gggtgcctct gaccccttgc ctgaccccct ggggctgatg 2040gatctcagca ccactccctt gcaaagtgct cccccccttg aatcaccgca aaggctcctc 2100agttcagaac ccttagacct catctccgtc ccctttggca actcttctcc ctcagatata 2160gacgtcccca agccaggctc cccggagcca caggtttctg gccttgcagc caatcgttct 2220ctgacagaag gcctggtcct ggacacaatg aatgacagcc tcagcaagat cctgctggac 2280atcagctttc ctggcctgga cgaggaccca ctgggccctg acaacatcaa ctggtcccag 2340tttattcctg agctacagta gagccctgcc cttgcccctg tgctcaagct gtccaccatc 2400ccgggcactc caaggctcag tgcaccccaa gcctctgagt gaggacagca ggcagggact 2460gttctgctcc tcatagctcc ctgctgcctg attatgcaaa agtagcagtc acaccctagc 2520cactgctggg accttgtgtt ccccaagagt atctgattcc tctgctgtcc ctgccaggag 2580ctgaagggtg ggaacaacaa aggcaatggt gaaaagagat taggaacccc ccagcctgtt 2640tccattctct gcccagcagt ctcttacctt ccctgatctt tgcagggtgg tccgtgtaaa 2700tagtataaat tctccaaatt atcctctaat tataaatgta agcttatttc cttagatcat 2760tatccagaga ctgccagaag gtgggtagga tgacctgggg tttcaattga cttctgttcc 2820ttgcttttag ttttgataga agggaagacc tgcagtgcac ggtttcttcc aggctgaggt 2880acctggatct tgggttcttc actgcaggga cccagacaag tggatctgct tgccagagtc 2940ctttttgccc ctccctgcca cctccccgtg tttccaagtc agctttcctg caagaagaaa 3000tcctggttaa aaaagtcttt tgtattgggt caggagttga atttggggtg ggaggatgga 3060tgcaactgaa gcagagtgtg ggtgcccaga tgtgcgctat tagatgtttc tctgataatg 3120tccccaatca taccagggag actggcattg acgagaactc 
aggtggaggc ttgagaaggc 3180cgaaagggcc cctgacctgc ctggcttcct tagcttgccc ctcagctttg caaagagcca 3240ccctaggccc cagctgaccg catgggtgtg agccagcttg agaacactaa ctactcaata 3300aaagcgaagg tggaccnaaa aaaaaaaaaa aaaaaa 33363052365DNAHomo sapiens 305tcccagcctt cccatccccc caccgaaagc aaatcattca acgacccccg accctccgac 60ggcaggagcc ccccgacctc ccaggcggac cgcccttccc tccccgcgcg ggttccgggc 120ccggcgagag ggcgcgacga cagccgaggc catggaggtg acggcggacc agccgcgctg 180ggtgagccac caccaccccg ccgtgctcaa cgggcagcac ccggacacgc accacccggg 240cctcagccac tcctacatgg acgcggcgca gtacccgctg ccggaggagg tggatgtgct 300ttttaacatc gacggtcaag gcaaccacgt cccgccctac tacggaaact cggtcagggc 360cacggtgcag aggtaccctc cgacccacca cgggagccag gtgtgccgcc cgcctctgct 420tcatggatcc ctaccctggc tggacggcgg caaagccctg ggcagccacc acaccgcctc 480cccctggaat ctcagcccct tctccaagac gtccatccac cacggctccc cggggcccct 540ctccgtctac cccccggcct cgtcctcctc cttgtcgggg ggccacgcca gcccgcacct 600cttcaccttc ccgcccaccc cgccgaagga cgtctccccg gacccatcgc tgtccacccc 660aggctcggcc ggctcggccc ggcaggacga gaaagagtgc ctcaagtacc aggtgcccct 720gcccgacagc atgaagctgg agtcgtccca ctcccgtggc agcatgaccg ccctgggtgg 780agcctcctcg tcgacccacc accccatcac cacctacccg ccctacgtgc ccgagtacag 840ctccggactc ttccccccca gcagcctgct gggcggctcc cccaccggct tcggatgcaa 900gtccaggccc aaggcccggt ccagcacagg cagggagtgt gtgaactgtg gggcaacctc 960gaccccactg tggcggcgag atggcacggg acactacctg tgcaacgcct gcgggctcta 1020tcacaaaatg aacggacaga accggcccct cattaagccc aagcgaaggc tgtctgcagc 1080caggagagca gggacgtcct gtgcgaactg tcagaccacc acaaccacac tctggaggag 1140gaatgccaat ggggaccctg tctgcaatgc ctgtgggctc tactacaagc ttcacaatat 1200taacagaccc ctgactatga agaaggaagg catccagacc agaaaccgaa aaatgtctag 1260caaatccaaa aagtgcaaaa aagtgcatga ctcactggag gacttcccca agaacagctc 1320gtttaacccg gccgccctct ccagacacat gtcctccctg agccacatct cgcccttcag 1380ccactccagc cacatgctga ccacgcccac gccgatgcac ccgccatcca gcctgtcctt 1440tggaccacac cacccctcca gcatggtcac cgccatgggt tagagccctg ctcgatgctc 1500acagggcccc cagcgagagt 
ccctgcagtc cctttcgact tgcatttttg caggagcagt 1560atcatgaagc ctaaacgcga tggatatatg tttttgaagg cagaaagcaa aattatgttt 1620gccactttgc aaaggagctc actgtggtgt ctgtgttcca accactgaat ctggacccca 1680tctgtgaata agccattctg actcatatcc cctatttaac agggtctcta gtgctgtgaa 1740aaaaaaaaat cctgaacatt gcatataact tatattgtaa gaaatactgt acaatgactt 1800tattgcatct gggtagctgt aaggcatgaa ggatgccaag aagtttaagg aatatgggag 1860aaatagtgtg gaaattaaga agaaactagg tctgatattc aaatggacaa actgccagtt 1920ttgtttcctt tcactggcca cagttgtttg atgcattaaa agaaaataaa aaaaagaaaa 1980aagagaaaag aaaaaaaaag aaaaaagttg taggcgaatc atttgttcaa agctgttggc 2040cctctgcaaa ggaaatacca gttctgggca atcagtgtta ccgttcacca gttgccattg 2100agggtttcag agagcctttt tctaggccta catgctttgt gaacaagtcc ctgtaattgt 2160tgtttgtatg tataattcaa agcaccaaaa taagaaaaga tgtagattta tttcatcata 2220ttatacagac cgaactgttg tataaattta tttactgcta gtcttaagaa ctgctttctt 2280tcgtttgttt gtttcaatat tttccttctc tctcaatttt cggttgaata aactagatta 2340cattcagttg gcaaaaaaaa aaaaa 23653061117DNAHomo sapiens 306gcaccaacca gcaccatgcc catgatactg gggtactggg acatccgcgg gctggcccac 60gccatccgcc tgctcctgga atacacagac tcaagctatg aggaaaagaa gtacacgatg 120ggggacgctc ctgattatga cagaagccag tggctgaatg aaaaattcaa gctgggcctg 180gactttccca atctgcccta cttgattgat ggggctcaca agatcaccca gagcaacgcc 240atcttgtgct acattgcccg caagcacaac ctgtgtgggg agacagaaga ggagaagatt 300cgtgtggaca ttttggagaa ccagaccatg gacaaccata tgcagctggg catgatctgc 360tacaatccag aatttgagaa actgaagcca aagtacttgg aggaactccc tgaaaagcta 420aagctctact cagagtttct ggggaagcgg ccatggtttg caggaaacaa gatcactttt 480gtagattttc tcgtctatga tgtccttgac ctccaccgta tatttgagcc caactgcttg 540gacgccttcc caaatctgaa ggacttcatc tcccgctttg agggcttgga gaagatctct 600gcctacatga agtccagccg cttcctccca agacctgtgt tctcaaagat ggctgtctgg 660ggcaacaagt agggccttga aggcaggagg tgggagtgag gagcccatac tcagcctgct 720gcccaggctg tgcagcgcag ctggactctg catcccagca cctgcctcct cgttcctttc 780tcctgtttat tcccatcttt actcccaaga cttcattgtc cctcttcact ccccctaaac 840ccctgtccca 
tgcaggccct ttgaagcctc agctacccac tatccttcgt gaacatcccc 900tcccatcatt acccttccct gcactaaagc cagcctgacc ttccttcctg ttagtggttg 960tgtctgcttt aaagcctgcc tggcccctcg cctgtggagc tcagccccga gctgtccccg 1020tgttgcatga aggagcagca ttgactggtt tacaggccct gctcctgcag catggtccct 1080gcctaggcct acctgatgga agtaaagcct caaccac 11173071266DNAHomo sapiens 307ctcggaagcc cgtcaccatg tcgtgcgagt cgtctatggt tctcgggtac tgggatattc 60gtgggctggc gcacgccatc cgcctgctcc tggagttcac ggatacctct tatgaggaga 120aacggtacac gtgcggggaa gctcctgact atgatcgaag ccaatggctg gatgtgaaat 180tcaagctaga cctggacttt cctaatctgc cctacctcct ggatgggaag aacaagatca 240cccagagcaa tgccatcttg cgctacatcg ctcgcaagca caacatgtgt ggtgagactg 300aagaagaaaa gattcgagtg gacatcatag agaaccaagt aatggatttc cgcacacaac 360tgataaggct ctgttacagc tctgaccacg aaaaactgaa gcctcagtac ttggaagagc 420tacctggaca actgaaacaa ttctccatgt ttctgtggaa attctcatgg tttgccgggg 480aaaagctcac ctttgtggat tttctcacct atgatatctt ggatcagaac cgtatatttg 540accccaagtg cctggatgag ttcccaaacc tgaaggcttt catgtgccgt tttgaggctt 600tggagaaaat cgctgcctac ttacagtctg atcagttctg caagatgccc atcaacaaca 660agatggccca gtggggcaac aagcctgtat gctgagcagg aggcagactt gcagagcttg 720ttttgtttca tcctgtccgt aaggggtcag cgctcttgct ttgctctttt caatgaatag 780cacttatgtt actggtgtcc agctgagttt ctcttgggta taaaggctaa aagggaaaaa 840ggatatgtgg agaatcatca agatatgaat tgaatcgctg cgatactgtg gcatttccct 900actccccaac tgagttcaag ggctgtaggt tcatgcccaa gccctgagag tgggtactag 960aaaaaacgag attgcacagt tggagagagc aggtgtgtta aatggactgg agtccctgtg 1020aagactgggt gaggataaca caagtaaaac tgtggtactg atggacttaa ccggagttcg 1080gaaaccgtcc tgtgtacaca tgggagttta gtgtgataaa ggcagtattt cagactggtg 1140ggctagccaa tagagttggc aattgcttat tgaaactcat taaaaataat agagccccac 1200ttgacactat tcactaaaat taatctggaa tttaaggccc aacattaaac acaaagctgt 1260attgat 12663082162DNAHomo sapiens 308gggctgcgct gtccagctgt ggctatggcc ccagccccga gatgaggagg gagagaacta 60ggggcccgca ggcctgggaa tttccgtccc ccaccaagtc cggatgctca ctccaaagtc 120tcagcaggcc cctgagggag ggagctgtca 
gccagggaaa accgagaaca ccatcaccat 180gacaaccagt caccagcctc aggacagata caaagctgtc tggcttatct tcttcatgct 240gggtctggga acgctgctcc cgtggaattt tttcatgacg gccactcagt atttcacaaa 300ccgcctggac atgtcccaga atgtgtcctt ggtcactgct gaactgagca aggacgccca 360ggcgtcagcc gcccctgcag cacccttgcc tgagcggaac tctctcagtg ccatcttcaa 420caatgtcatg accctatgtg ccatgctgcc cctgctgtta ttcacctacc tcaactcctt 480cctgcatcag aggatccccc agtccgtacg gatcctgggc agcctggtgg ccatcctgct 540ggtgtttctg atcactgcca tcctggtgaa ggtgcagctg gatgctctgc ccttctttgt 600catcaccatg atcaagatcg tgctcattaa ttcatttggt gccatcctgc agggcagcct 660gtttggtctg gctggccttc tgcctgccag ctacacggcc cccatcatga gtggccaggg 720cctagcaggc ttctttgcct ccgtggccat gatctgcgct attgccagtg gctcggaact 780atcagaaagt gccttcggct actttatcac agcctgtgct gttatcattt tgaccatcat 840ctgttacctg ggcctgcccc gcctggaatt ctaccgctac taccagcagc tcaagcttga 900aggacccggg gagcaggaga ccaagttgga cctcattagc aaaggagagg agccaagagc 960aggcaaagag gaatctggag tttcagtctc caactctcag cccaccaatg aaagccactc 1020tatcaaagcc atcctgaaaa atatctcagt cctggctttc tctgtctgct tcatcttcac 1080tatcaccatt gggatgtttc cagccgtgac tgttgaggtc aagtccagca tcgcaggcag 1140cagcacctgg gaacgttact tcattcctgt gtcctgtttc ttgactttca atatctttga 1200ctggttgggc cggagcctca cagctgtatt catgtggcct gggaaggaca gccgctggct 1260gccaagcctg gtgctggccc ggctggtgtt tgtgccactg ctgctgctgt gcaacattaa 1320gccccgccgc tacctgactg tggtcttcga gcacgatgcc tggttcatct tcttcatggc 1380tgcctttgcc ttctccaacg gctacctcgc cagcctctgc atgtgcttcg ggcccaagaa 1440agtgaagcca gctgaggcag agaccgcagg agccatcatg gccttcttcc tgtgtctggg 1500tctggcactg ggggctgttt tctccttcct gttccgggca attgtgtgac aaaggatgga 1560cagaaggact gcctgcctcc ctccctgtct gcctcctgcc ccttccttct gccaggggtg 1620atcctgagtg gtctggcggt tttttcttct aactgacttc tgctttccac ggcgtgtgct 1680gggcccggat ctccaggccc tggggaggga gcctctggac ggacagtggg gacattgtgg 1740gtttggggct cagagtcgag ggacggggtg tagcctcggc atttgcttga gtttctccac 1800tcttggctct gactgatccc tgcttgtgca ggccagtgga ggctcttggg cttggagaac 1860acgtgtgtct 
ctgtgtatgt gtctgtgtgt ctgcgtccgt gtctgtcaga ctgtctgcct 1920gtcctggggt ggctaggagc tgggtctgac cgttgtatgg tttgacctga tatactccat 1980tctcccctgc gcctcctcct ctgtgttttt tccatgtccc cctcccaact ccccatgccc 2040agtttttacc catcatgcac cctgtacagt tgccacgtta ctgccttttt taaaaatata 2100tttgacagaa accaggtgcc ttcagaggct ctctgattta aataaacctt tcttgttttt 2160tt 21623093933DNAHomo sapiens 309cacgaggcag cactctcttc gtcgcttcgg ccagtgtgtc gggctgggcc ctgacaagcc 60acctgaggag aggctcggag ccgggcccgg accccggcga ttgccgcccg cttctctcta 120gtctcacgag gggtttcccg cctcgcaccc ccacctctgg acttgccttt ccttctcttc 180tccgcgtgtg gagggagcca gcgcttaggc cggagcgagc ctgggggccg cccgccgtga 240agacatcgcg gggaccgatt caccatggag ggcgccggcg gcgcgaacga caagaaaaag 300ataagttctg aacgtcgaaa agaaaagtct cgagatgcag ccagatctcg gcgaagtaaa 360gaatctgaag ttttttatga gcttgctcat cagttgccac ttccacataa tgtgagttcg 420catcttgata aggcctctgt gatgaggctt accatcagct atttgcgtgt gaggaaactt 480ctggatgctg gtgatttgga tattgaagat gacatgaaag cacagatgaa ttgcttttat 540ttgaaagcct tggatggttt tgttatggtt ctcacagatg atggtgacat gatttacatt 600tctgataatg tgaacaaata catgggatta actcagtttg aactaactgg acacagtgtg 660tttgatttta ctcatccatg tgaccatgag gaaatgagag aaatgcttac acacagaaat 720ggccttgtga aaaagggtaa agaacaaaac acacagcgaa gcttttttct cagaatgaag 780tgtaccctaa ctagccgagg aagaactatg aacataaagt ctgcaacatg gaaggtattg 840cactgcacag gccacattca cgtatatgat accaacagta accaacctca gtgtgggtat 900aagaaaccac ctatgacctg cttggtgctg atttgtgaac ccattcctca cccatcaaat 960attgaaattc ctttagatag caagactttc ctcagtcgac acagcctgga tatgaaattt 1020tcttattgtg atgaaagaat taccgaattg atgggatatg agccagaaga acttttaggc 1080cgctcaattt atgaatatta tcatgctttg gactctgatc atctgaccaa aactcatcat 1140gatatgttta ctaaaggaca agtcaccaca ggacagtaca ggatgcttgc caaaagaggt 1200ggatatgtct gggttgaaac tcaagcaact gtcatatata acaccaagaa ttctcaacca 1260cagtgcattg tatgtgtgaa ttacgttgtg agtggtatta ttcagcacga cttgattttc 1320tcccttcaac aaacagaatg tgtccttaaa ccggttgaat cttcagatat gaaaatgact 1380cagctattca ccaaagttga atcagaagat 
acaagtagcc tctttgacaa acttaagaag 1440gaacctgatg ctttaacttt gctggcccca gccgctggag acacaatcat atctttagat 1500tttggcagca acgacacaga aactgatgac cagcaacttg aggaagtacc attatataat 1560gatgtaatgc tcccctcacc caacgaaaaa ttacagaata taaatttggc aatgtctcca 1620ttacccaccg ctgaaacgcc aaagccactt cgaagtagtg ctgaccctgc actcaatcaa 1680gaagttgcat taaaattaga accaaatcca gagtcactgg aactttcttt taccatgccc 1740cagattcagg atcagacacc tagtccttcc gatggaagca ctagacaaag ttcacctgag 1800cctaatagtc ccagtgaata ttgtttttat gtggatagtg atatggtcaa tgaattcaag 1860ttggaattgg tagaaaaact ttttgctgaa gacacagaag caaagaaccc attttctact 1920caggacacag atttagactt ggagatgtta gctccctata tcccaatgga tgatgacttc 1980cagttacgtt ccttcgatca gttgtcacca ttagaaagca gttccgcaag ccctgaaagc 2040gcaagtcctc aaagcacagt tacagtattc cagcagactc aaatacaaga acctactgct 2100aatgccacca ctaccactgc caccactgat gaattaaaaa cagtgacaaa agaccgtatg 2160gaagacatta aaatattgat tgcatctcca tctcctaccc acatacataa agaaactact 2220agtgccacat catcaccata tagagatact caaagtcgga cagcctcacc aaacagagca 2280ggaaaaggag tcatagaaca gacagaaaaa tctcatccaa gaagccctaa cgtgttatct 2340gtcgctttga gtcaaagaac tacagttcct gaggaagaac taaatccaaa gatactagct 2400ttgcagaatg ctcagagaaa gcgaaaaatg gaacatgatg gttcactttt tcaagcagta 2460ggaattggaa cattattaca gcagccagac gatcatgcag ctactacatc actttcttgg 2520aaacgtgtaa aaggatgcaa atctagtgaa cagaatggaa tggagcaaaa gacaattatt 2580ttaataccct ctgatttagc atgtagactg ctggggcaat caatggatga aagtggatta 2640ccacagctga ccagttatga ttgtgaagtt aatgctccta tacaaggcag cagaaaccta 2700ctgcagggtg aagaattact cagagctttg gatcaagtta actgagcttt ttcttaattt 2760cattcctttt tttggacact ggtggctcac tacctaaagc agtctattta tattttctac 2820atctaatttt agaagcctgg ctacaatact gcacaaactt ggttagttca atttttgatc 2880ccctttctac ttaatttaca ttaatgctct tttttagtat gttctttaat gctggatcac 2940agacagctca ttttctcagt tttttggtat ttaaaccatt gcattgcagt agcatcattt 3000taaaaaatgc acctttttat ttatttattt ttggctaggg agtttatccc tttttcgaat 3060tatttttaag aagatgccaa tataattttt gtaagaaggc agtaaccttt catcatgatc 
3120ataggcagtt gaaaaatttt tacacctttt ttttcacatt ttacataaat aataatgctt 3180tgccagcagt acgtggtagc cacaattgca caatatattt tcttaaaaaa taccagcagt 3240tactcatgga atatattctg cgtttataaa actagttttt aagaagaaat tttttttggc 3300ctatgaaatt gttaaacctg gaacatgaca ttgttaatca tataataatg attcttaaat 3360gctgtatggt ttattattta aatgggtaaa gccatttaca taatatagaa agatatgcat 3420atatctagaa ggtatgtggc atttatttgg ataaaattct caattcagag aaatcatctg 3480atgtttctat agtcactttg ccagctcaaa agaaaacaat accctatgta gttgtggaag 3540tttatgctaa tattgtgtaa ctgatattaa acctaaatgt tctgcctacc ctgttggtat 3600aaagatattt tgagcagact gtaaacaaga aaaaaaaaat catgcattct tagcaaaatt 3660gcctagtatg ttaatttgct caaaatacaa tgtttgattt tatgcacttt gtcgctatta 3720acatcctttt tttcatgtag atttcaataa ttgagtaatt ttagaagcat tattttagga 3780atatatagtt gtcacagtaa atatcttgtt ttttctatgt acattgtaca aatttttcat 3840tccttttgct ctttgtggtt ggatctaaca ctaactgtat tgttttgtta catcaaataa 3900acatcttctg tggaaaaaaa aaaaaaaaaa aaa 39333102872DNAHomo sapiens 310tccaggaatc gatagtgcat tcgtgcgcgc ggccgcccgt cgcttcgcac agggctggat 60ggttgtattg ggcagggtgg ctccaggatg ttaggaactg tgaagatgga agggcatgaa 120accagcgact ggaacagcta ctacgcagac acgcaggagg cctactcctc ggtcccggtc 180agcaacatga actcaggcct gggctccatg aactccatga acacctacat gaccatgaac 240accatgacta cgagcggcaa catgaccccg gcgtccttca acatgtccta tgccaacccg 300gccttagggg ccggcctgag tcccggcgca gtagccggca tgccgggggg ctcggcgggc 360gccatgaaca gcatgactgc ggccggcgtg acggccatgg gtacggcgct gagcccgagc 420ggcatgggcg ccatgggtgc gcagcaggcg gcctccatga tgaatggcct gggcccctac 480gcggccgcca tgaacccgtg catgagcccc atggcgtacg cgccgtccaa cctgggccgc 540agccgcgcgg gcggcggcgg cgacgccaag acgttcaagc gcagttaccc gcacgccaag 600ccgccctact cgtacatctc gctcatcacc atggccatcc agcgggcgcc cagcaagatg 660ctcacgctga gcgagatcta ccagtggatc atggacctct tcccctatta ccggcagaac 720cagcagcgct ggcagaactc catccgccac tcgctgtcct tcaatgactg cttcgtcaag 780gtggcacgct ccccggacaa gccgggcaag ggctcctact ggacgctgca cccggactcc 840ggcaacatgt tcgagaacgg ctgctacttg cgccgccaga 
agcgcttcaa gtgcgagaag 900cagccggggg ccggcggcgg gggcgggagc ggaagcgggg gcagcggcgc caagggcggc 960cctgagagcc gcaaggaccc ctctggcgcc tctaacccca gcgccgactc gcccctccat 1020cggggtgtgc acgggaagac cggccagcta gagggcgcgc cggccccggg cccggccgcc 1080agcccccaga ctctggacca cagtggggcg acggcgacag ggggcgcctc ggagttgaag 1140actccagcct cctcaactgc gccccccata agctccgggc ccggggcgct ggcctctgtg 1200cccgcctctc acccggcaca cggcttggca ccccacgagt cccagctgca cctgaaaggg 1260gacccccact actccttcaa ccacccgttc tccatcaaca acctcatgtc ctcctcggag 1320cagcagcata agctggactt caaggcatac gaacaggcac tgcaatactc gccttacggc 1380tctacgttgc ccgccagcct gcctctaggc agcgcctcgg tgaccaccag gagccccatc 1440gagccctcag ccctggagcc ggcgtactac caaggtgtgt attccagacc cgtcctaaac 1500acttcctagc tcccgggact ggggggtttg tctggcatag ccatgctggt agcaagagag 1560aaaaaatcaa cagcaaacaa aaccacacaa accaaaccgt caacagcata ataaaatcca 1620acaactattt ttatttcatt tttcatgcac aaccttgccc ccagtgcaaa agactgttac 1680tttattattg tattcaaaat tcattgtgta tattactaca aagacggccc caaaccaatt 1740tttttcctgc gaagtttaat gatccacaag tgtatatatg aaattctcct ccttccttgc 1800ccccctctct ttcttccctc ttggccctcc agacattcta gtttgtggag ggttatttaa 1860aaaacaaaaa ggaagatggt caagtttgta aaatatttgt ttgtgctttt cccccctcct 1920tacctgaccc cctacgagtt tacaggcttg tggcaatact cttaaccata agaattgaaa

1980tggtgaagaa acaagtatac actagaggct cttaaaagta ttgaaaagac aatactgctg 2040ttatatagca agacataaac agattataaa catcagagcc atttgcttct cagtttacat 2100ttctgataca tgcagatagc agatgtcttt aaatgaaata catgtatatt gtgtatggac 2160ttaattatgc acatgctcag atgtgtagac atcctccgta tatttacata acatatagag 2220gtaatagata ggtgatatac gtgatacgtt ctcaagagtt gcttgaccga aagttacaag 2280gaccccaacc cctttgctct ctacccacag atggccctgg gaacaatcct caggaattgc 2340cctcaagaac tcgcttcttt gctttgagag tgccatggtc atgtcattct gaggtacata 2400acacataaat tagtttctat gagtgtatac catttaaaga ttttttcagt aaagggaata 2460ttacatgttg ggaggaggag ataagttata gggagctgga tttcaaacgg tggtccaaga 2520ttcaaaaatc ctattgatag tggccatttt aatcattgcc atcgtgtgct tgtttcatcc 2580agtgttatgc actttccaca gttggtgtta gtatagccag agggtttcat tattatttct 2640ctttgctttc tcaatgttaa tttattgcat ggtttattct ttttctttac agctgaaatt 2700gctttaaatg atggttaaaa ttacaaatta aattgggaat ttttatcaat gtgattgtaa 2760ttaaaaatat tttgatttaa ataacaaaaa taataccaga ttttaagccg cggaaaatgt 2820tcttgatcat ttgcagttaa ggactttaaa taaatcaaat gttaacaaaa aa 2872311926DNAHomo sapiens 311ggggcccatt ctgtttcagc cagtcgccaa gaatcatgaa agtcgccagt ggcagcaccg 60ccaccgccgc cgcgggcccc agctgcgcgc tgaaggccgg caagacagcg agcggtgcgg 120gcgaggtggt gcgctgtctg tctgagcaga gcgtggccat ctcgcgctgc cggggcgccg 180gggcgcgcct gcctgccctg ctggacgagc agcaggtaaa cgtgctgctc tacgacatga 240acggctgtta ctcacgcctc aaggagctgg tgcccaccct gccccagaac cgcaaggtga 300gcaaggtgga gattctccag cacgtcatcg actacatcag ggaccttcag ttggagctga 360actcggaatc cgaagttggg acccccgggg gccgagggct gccggtccgg gctccgctca 420gcaccctcaa cggcgagatc agcgccctga cggccgaggc ggcatgcgtt cctgcggacg 480atcgcatctt gtgtcgctga agcgcctccc ccagggaccg gcggacccca gccatccagg 540gggcaagagg aattacgtgc tctgtgggtc tcccccaacg cgcctcgccg gatctgaggg 600agaacaagac cgatcggcgg ccactgcgcc cttaactgca tccagcctgg ggctgaggct 660gaggcactgg cgaggagagg gcgctcctct ctgcacacct actagtcacc agagacttta 720gggggtggga ttccactcgt gtgtttctat tttttgaaaa gcagacattt taaaaaatgg 780tcacgtttgg tgcttctcag atttctgagg 
aaattgcttt gtattgtata ttacaatgat 840caccgactga gaatattgtt ttacaatagt tctgtggggc tgtttttttg ttattaaaca 900aataatttag atggtgaaaa aaaaaa 9263124989DNAHomo sapiens 312tttttttttt ttttgagaaa gggaatttca tcccaaataa aaggaatgaa gtctggctcc 60ggaggagggt ccccgacctc gctgtggggg ctcctgtttc tctccgccgc gctctcgctc 120tggccgacga gtggagaaat ctgcgggcca ggcatcgaca tccgcaacga ctatcagcag 180ctgaagcgcc tggagaactg cacggtgatc gagggctacc tccacatcct gctcatctcc 240aaggccgagg actaccgcag ctaccgcttc cccaagctca cggtcattac cgagtacttg 300ctgctgttcc gagtggctgg cctcgagagc ctcggagacc tcttccccaa cctcacggtc 360atccgcggct ggaaactctt ctacaactac gccctggtca tcttcgagat gaccaatctc 420aaggatattg ggctttacaa cctgaggaac attactcggg gggccatcag gattgagaaa 480aatgctgacc tctgttacct ctccactgtg gactggtccc tgatcctgga tgcggtgtcc 540aataactaca ttgtggggaa taagccccca aaggaatgtg gggacctgtg tccagggacc 600atggaggaga agccgatgtg tgagaagacc accatcaaca atgagtacaa ctaccgctgc 660tggaccacaa accgctgcca gaaaatgtgc ccaagcacgt gtgggaagcg ggcgtgcacc 720gagaacaatg agtgctgcca ccccgagtgc ctgggcagct gcagcgcgcc tgacaacgac 780acggcctgtg tagcttgccg ccactactac tatgccggtg tctgtgtgcc tgcctgcccg 840cccaacacct acaggtttga gggctggcgc tgtgtggacc gtgacttctg cgccaacatc 900ctcagcgccg agagcagcga ctccgagggg tttgtgatcc acgacggcga gtgcatgcag 960gagtgcccct cgggcttcat ccgcaacggc agccagagca tgtactgcat cccttgtgaa 1020ggtccttgcc cgaaggtctg tgaggaagaa aagaaaacaa agaccattga ttctgttact 1080tctgctcaga tgctccaagg atgcaccatc ttcaagggca atttgctcat taacatccga 1140cgggggaata acattgcttc agagctggag aacttcatgg ggctcatcga ggtggtgacg 1200ggctacgtga agatccgcca ttctcatgcc ttggtctcct tgtccttcct aaaaaacctt 1260cgcctcatcc taggagagga gcagctagaa gggaattact ccttctacgt cctcgacaac 1320cagaacttgc agcaactgtg ggactgggac caccgcaacc tgaccatcaa agcagggaaa 1380atgtactttg ctttcaatcc caaattatgt gtttccgaaa tttaccgcat ggaggaagtg 1440acggggacta aagggcgcca aagcaaaggg gacataaaca ccaggaacaa cggggagaga 1500gcctcctgtg aaagtgacgt cctgcatttc acctccacca ccacgtcgaa gaatcgcatc 1560atcataacct ggcaccggta ccggccccct 
gactacaggg atctcatcag cttcaccgtt 1620tactacaagg aagcaccctt taagaatgtc acagagtatg atgggcagga tgcctgcggc 1680tccaacagct ggaacatggt ggacgtggac ctcccgccca acaaggacgt ggagcccggc 1740atcttactac atgggctgaa gccctggact cagtacgccg tttacgtcaa ggctgtgacc 1800ctcaccatgg tggagaacga ccatatccgt ggggccaaga gtgagatctt gtacattcgc 1860accaatgctt cagttccttc cattcccttg gacgttcttt cagcatcgaa ctcctcttct 1920cagttaatcg tgaagtggaa ccctccctct ctgcccaacg gcaacctgag ttactacatt 1980gtgcgctggc agcggcagcc tcaggacggc tacctttacc ggcacaatta ctgctccaaa 2040gacaaaatcc ccatcaggaa gtatgccgac ggcaccatcg acattgagga ggtcacagag 2100aaccccaaga ctgaggtgtg tggtggggag aaagggcctt gctgcgcctg ccccaaaact 2160gaagccgaga agcaggccga gaaggaggag gctgaatacc gcaaagtctt tgagaatttc 2220ctgcacaact ccatcttcgt gcccagacct gaaaggaagc ggagagatgt catgcaagtg 2280gccaacacca ccatgtccag ccgaagcagg aacaccacgg ccgcagacac ctacaacatc 2340accgacccgg aagagctgga gacagagtac cctttctttg agagcagagt ggataacaag 2400gagagaactg tcatttctaa ccttcggcct ttcacattgt accgcatcga tatccacagc 2460tgcaaccacg aggctgagaa gctgggctgc agcgcctcca acttcgtctt tgcaaggact 2520atgcccgcag aaggagcaga tgacattcct gggccagtga cctgggagcc aaggcctgaa 2580aactccatct ttttaaagtg gccggaacct gagaatccca atggattgat tctaatgtat 2640gaaataaaat acggatcaca agttgaggat cagcgagaat gtgtgtccag acaggaatac 2700aggaagtatg gaggggccaa gctaaaccgg ctaaacccgg ggaactacac agcccggatt 2760caggccacat ctctctctgg gaatgggtcg tggacagatc ctgtgttctt ctatgtccag 2820gccaaaacag gatatgaaaa cttcatccat ctgatcatcg ctctgcccgt cgctgtcctg 2880ttgatcgtgg gagggttggt gattatgctg tacgtcttcc atagaaagag aaataacagc 2940aggctgggga atggagtgct gtatgcctct gtgaacccgg agtacttcag cgctgctgat 3000gtgtacgttc ctgatgagtg ggaggtggct cgggagaaga tcaccatgag ccgggaactt 3060gggcaggggt cgtttgggat ggtctatgaa ggagttgcca agggtgtggt gaaagatgaa 3120cctgaaacca gagtggccat taaaacagtg aacgaggccg caagcatgcg tgagaggatt 3180gagtttctca acgaagcttc tgtgatgaag gagttcaatt gtcaccatgt ggtgcgattg 3240ctgggtgtgg tgtcccaagg ccagccaaca ctggtcatca tggaactgat gacacggggc 
3300gatctcaaaa gttatctccg gtctctgagg ccagaaatgg agaataatcc agtcctagca 3360cctccaagcc tgagcaagat gattcagatg gccggagaga ttgcagacgg catggcatac 3420ctcaacgcca ataagttcgt ccacagagac cttgctgccc ggaattgcat ggtagccgaa 3480gatttcacag tcaaaatcgg agattttggt atgacgcgag atatctatga gacagactat 3540taccggaaag gaggcaaagg gctgctgccc gtgcgctgga tgtctcctga gtccctcaag 3600gatggagtct tcaccactta ctcggacgtc tggtccttcg gggtcgtcct ctgggagatc 3660gccacactgg ccgagcagcc ctaccagggc ttgtccaacg agcaagtcct tcgcttcgtc 3720atggagggcg gccttctgga caagccagac aactgtcctg acatgctgtt tgaactgatg 3780cgcatgtgct ggcagtataa ccccaagatg aggccttcct tcctggagat catcagcagc 3840atcaaagagg agatggagcc tggcttccgg gaggtctcct tctactacag cgaggagaac 3900aagctgcccg agccggagga gctggacctg gagccagaga acatggagag cgtccccctg 3960gacccctcgg cctcctcgtc ctccctgcca ctgcccgaca gacactcagg acacaaggcc 4020gagaacggcc ccggccctgg ggtgctggtc ctccgcgcca gcttcgacga gagacagcct 4080tacgcccaca tgaacggggg ccgcaagaac gagcgggcct tgccgctgcc ccagtcttcg 4140acctgctgat ccttggatcc tgaatctgtg caaacagtaa cgtgtgcgca cgcgcagcgg 4200ggtggggggg gagagagagt tttaacaatc cattcacaag cctcctgtac ctcagtggat 4260cttcagttct gcccttgctg cccgcgggag acagcttctc tgcagtaaaa cacatttggg 4320atgttccttt tttcaatatg caagcagctt tttattccct gcccaaaccc ttaactgaca 4380tgggccttta agaaccttaa tgacaacact taatagcaac agagcacttg agaaccagtc 4440tcctcactct gtccctgtcc ttccctgttc tccctttctc tctcctctct gcttcataac 4500ggaaaaataa ttgccacaag tccagctggg aagccctttt tatcagtttg aggaagtggc 4560tgtccctgtg gccccatcca accactgtac acacccgcct gacaccgtgg gtcattacaa 4620aaaaacacgt ggagatggaa atttttacct ttatctttca cctttctagg gacatgaaat 4680ttacaaaggg ccatcgttca tccaaggctg ttaccatttt aacgctgcct aattttgcca 4740aaatcctgaa ctttctccct catcggcccg gcgctgattc ctcgtgtccg gaggcatggg 4800tgagcatggc agctggttgc tccatttgag agacacgctg gcgacacact ccgtccatcc 4860gactgcccct gctgtgctgc tcaaggccac aggcacacag gtctcattgc ttctgactag 4920attattattt gggggaactg gacacaatag gtctttctct cagtgaaggt ggggagaagc 4980tgaaccggc 498931312515DNAHomo 
sapiens 313ctaccgggcg gaggtgagcg cggcgccggc tcctcctgcg gcggactttg ggtgcgactt 60gacgagcggt ggttcgacaa gtggccttgc gggccggatc gtcccagtgg aagagttgta 120aatttgcttc tggccttccc ctacggatta tacctggcct tcccctacgg attatactca 180acttactgtt tagaaaatgt ggcccacgag acgcctggtt actatcaaaa ggagcggggt 240cgacggtccc cactttcccc tgagcctcag cacctgcttg tttggaaggg gtattgaatg 300tgacatccgt atccagcttc ctgttgtgtc aaaacaacat tgcaaaattg aaatccatga 360gcaggaggca atattacata atttcagttc cacaaatcca acacaagtaa atgggtctgt 420tattgatgag cctgtacggc taaaacatgg agatgtaata actattattg atcgttcctt 480caggtatgaa aatgaaagtc ttcagaatgg aaggaagtca actgaatttc caagaaaaat 540acgtgaacag gagccagcac gtcgtgtctc aagatctagc ttctcttctg accctgatga 600gaaagctcaa gattccaagg cctattcaaa aatcactgaa ggaaaagttt caggaaatcc 660tcaggtacat atcaagaatg tcaaagaaga cagtaccgca gatgactcaa aagacagtgt 720tgctcaggga acaactaatg ttcattcctc agaacatgct ggacgtaatg gcagaaatgc 780agctgatccc atttctgggg attttaaaga aatttccagc gttaaattag tgagccgtta 840tggagaattg aagtctgttc ccactacaca atgtcttgac aatagcaaaa aaaatgaatc 900tcccttttgg aagctttatg agtcagtgaa gaaagagttg gatgtaaaat cacaaaaaga 960aaatgtccta cagtattgta gaaaatctgg attacaaact gattacgcaa cagagaaaga 1020aagtgctgat ggtttacagg gggagaccca actgttggtc tcgcgtaagt caagaccaaa 1080atctggtggg agcggccacg ctgtggcaga gcctgcttca cctgaacaag agcttgacca 1140gaacaagggg aagggaagag acgtggagtc tgttcagact cccagcaagg ctgtgggcgc 1200cagctttcct ctctatgagc cggctaaaat gaagacccct gtacaatatt cacagcaaca 1260aaattctcca caaaaacata agaacaaaga cctgtatact actggtagaa gagaatctgt 1320gaatctgggt aaaagtgaag gcttcaaggc tggtgataaa actcttactc ccaggaagct 1380ttcaactaga aatcgaacac cagctaaagt tgaagatgca gctgactctg ccactaagcc 1440agaaaatctc tcttccaaaa ccagaggaag tattcctaca gatgtggaag ttctgcctac 1500ggaaactgaa attcacaatg agccattttt aactctgtgg ctcactcaag ttgagaggaa 1560gatccaaaag gattccctca gcaagcctga gaaattgggc actacagctg gacagatgtg 1620ctctgggtta cctggtctta gttcagttga tatcaacaac tttggtgatt ccattaatga 1680gagtgaggga atacctttga aaagaaggcg tgtgtccttt 
ggtgggcacc taagacctga 1740actatttgat gaaaacttgc ctcctaatac gcctctcaaa aggggagaag ccccaaccaa 1800aagaaagtct ctggtaatgc acactccacc tgtcctgaag aaaatcatca aggaacagcc 1860tcaaccatca ggaaaacaag agtcaggttc agaaatccat gtggaagtga aggcacaaag 1920cttggttata agccctccag ctcctagtcc taggaaaact ccagttgcca gtgatcaacg 1980ccgtaggtcc tgcaaaacag cccctgcttc cagcagcaaa tctcagacag aggttcctaa 2040gagaggagga gaaagagtgg caacctgcct tcaaaagaga gtgtctatca gccgaagtca 2100acatgatatt ttacagatga tatgttccaa aagaagaagt ggtgcttcgg aagcaaatct 2160gattgttgca aaatcatggg cagatgtagt aaaacttggt gcaaaacaaa cacaaactaa 2220agtcataaaa catggtcctc aaaggtcaat gaacaaaagg caaagaagac ctgctactcc 2280aaagaagcct gtgggcgaag ttcacagtca atttagtaca ggccacgcaa actctccttg 2340taccataata atagggaaag ctcatactga aaaagtacat gtgcctgctc gaccctacag 2400agtgctcaac aacttcattt ccaaccaaaa aatggacttt aaggaagatc tttcaggaat 2460agctgaaatg ttcaagaccc cagtgaagga gcaaccgcag ttgacaagca catgtcacat 2520cgctatttca aattcagaga atttgcttgg aaaacagttt caaggaactg attcaggaga 2580agaacctctg ctccccacct cagagagttt tggaggaaat gtgttcttca gtgcacagaa 2640tgcagcaaaa cagccatctg ataaatgctc tgcaagccct cccttaagac ggcagtgtat 2700tagagaaaat ggaaacgtag caaaaacgcc caggaacacc tacaaaatga cttctctgga 2760gacaaaaact tcagatactg agacagagcc ttcaaaaaca gtatccactg taaacaggtc 2820aggaaggtct acagagttca ggaatataca gaagctacct gtggaaagta agagtgaaga 2880aacaaataca gaaattgttg agtgcatcct aaaaagaggt cagaaggcaa cactactaca 2940acaaaggaga gaaggagaga tgaaggaaat agaaagacct tttgagacat ataaggaaaa 3000tattgaatta aaagaaaacg atgaaaagat gaaagcaatg aagagatcaa gaacttgggg 3060gcagaaatgt gcaccaatgt ctgacctgac agacctcaag agcttgcctg atacagaact 3120catgaaagac acggcacgtg gccagaatct cctccaaacc caagatcatg ccaaggcacc 3180aaagagtgag aaaggcaaaa tcactaaaat gccctgccag tcattacaac cagaaccaat 3240aaacacccca acacacacaa aacaacagtt gaaggcatcc ctggggaaag taggtgtgaa 3300agaagagctc ctagcagtcg gcaagttcac acggacgtca ggggagacca cgcacacgca 3360cagagagcca gcaggagatg gcaagagcat cagaacgttt aaggagtctc caaagcagat 3420cctggaccca 
gcagcccgtg taactggaat gaagaagtgg ccaagaacgc ctaaggaaga 3480ggcccagtca ctagaagacc tggctggctt caaagagctc ttccagacac caggtccctc 3540tgaggaatca atgactgatg agaaaactac caaaatagcc tgcaaatctc caccaccaga 3600atcagtggac actccaacaa gcacaaagca atggcctaag agaagtctca ggaaagcaga 3660tgtagaggaa gaattcttag cactcaggaa actaacacca tcagcaggga aagccatgct 3720tacgcccaaa ccagcaggag gtgatgagaa agacattaaa gcatttatgg gaactccagt 3780gcagaaactg gacctggcag gaactttacc tggcagcaaa agacagctac agactcctaa 3840ggaaaaggcc caggctctag aagacctggc tggctttaaa gagctcttcc agactcctgg 3900tcacaccgag gaattagtgg ctgctggtaa aaccactaaa ataccctgcg actctccaca 3960gtcagaccca gtggacaccc caacaagcac aaagcaacga cccaagagaa gtatcaggaa 4020agcagatgta gagggagaac tcttagcgtg caggaatcta atgccatcag caggcaaagc 4080catgcacacg cctaaaccat cagtaggtga agagaaagac atcatcatat ttgtgggaac 4140tccagtgcag aaactggacc tgacagagaa cttaaccggc agcaagagac ggccacaaac 4200tcctaaggaa gaggcccagg ctctggaaga cctgactggc tttaaagagc tcttccagac 4260ccctggtcat actgaagaag cagtggctgc tggcaaaact actaaaatgc cctgcgaatc 4320ttctccacca gaatcagcag acaccccaac aagcacaaga aggcagccca agacaccttt 4380ggagaaaagg gacgtacaga aggagctctc agccctgaag aagctcacac agacatcagg 4440ggaaaccaca cacacagata aagtaccagg aggtgaggat aaaagcatca acgcgtttag 4500ggaaactgca aaacagaaac tggacccagc agcaagtgta actggtagca agaggcaccc 4560aaaaactaag gaaaaggccc aacccctaga agacctggct ggctggaaag agctcttcca 4620gacaccagta tgcactgaca agcccacgac tcacgagaaa actaccaaaa tagcctgcag 4680atcacaacca gacccagtgg acacaccaac aagctccaag ccacagtcca agagaagtct 4740caggaaagtg gacgtagaag aagaattctt cgcactcagg aaacgaacac catcagcagg 4800caaagccatg cacacaccca aaccagcagt aagtggtgag aaaaacatct acgcatttat 4860gggaactcca gtgcagaaac tggacctgac agagaactta actggcagca agagacggct 4920acaaactcct aaggaaaagg cccaggctct agaagacctg gctggcttta aagagctctt 4980ccagacacga ggtcacactg aggaatcaat gactaacgat aaaactgcca aagtagcctg 5040caaatcttca caaccagacc tagacaaaaa cccagcaagc tccaagcgac ggctcaagac 5100atccctgggg aaagtgggcg tgaaagaaga gctcctagca 
gttggcaagc tcacacagac 5160atcaggagag actacacaca cacacacaga gccaacagga gatggtaaga gcatgaaagc 5220atttatggag tctccaaagc agatcttaga ctcagcagca agtctaactg gcagcaagag 5280gcagctgaga actcctaagg gaaagtctga agtccctgaa gacctggccg gcttcatcga 5340gctcttccag acaccaagtc acactaagga atcaatgact aatgaaaaaa ctaccaaagt 5400atcctacaga gcttcacagc cagacctagt ggacacccca acaagctcca agccacagcc 5460caagagaagt ctcaggaaag cagacactga agaagaattt ttagcattta ggaaacaaac 5520gccatcagca ggcaaagcca tgcacacacc caaaccagca gtaggtgaag agaaagacat 5580caacacgttt ttgggaactc cagtgcagaa actggaccag ccaggaaatt tacctggcag 5640caatagacgg ctacaaactc gtaaggaaaa ggcccaggct ctagaagaac tgactggctt 5700cagagagctt ttccagacac catgcactga taaccccaca gctgatgaga aaactaccaa 5760aaaaatactc tgcaaatctc cgcaatcaga cccagcggac accccaacaa acacaaagca 5820acggcccaag agaagcctca agaaagcaga cgtagaggaa gaatttttag cattcaggaa 5880actaacacca tcagcaggca aagccatgca cacgcctaaa gcagcagtag gtgaagagaa 5940agacatcaac acatttgtgg ggactccagt ggagaaactg gacctgctag gaaatttacc 6000tggcagcaag agacggccac aaactcctaa agaaaaggcc aaggctctag aagatctggc 6060tggcttcaaa gagctcttcc agacaccagg tcacactgag gaatcaatga ccgatgacaa 6120aatcacagaa gtatcctgca aatctccaca accagaccca gtcaaaaccc caacaagctc 6180caagcaacga ctcaagatat ccttggggaa agtaggtgtg aaagaagagg tcctaccagt 6240cggcaagctc acacagacgt cagggaagac cacacagaca cacagagaga cagcaggaga 6300tggaaagagc atcaaagcgt ttaaggaatc tgcaaagcag atgctggacc cagcaaacta 6360tggaactggg atggagaggt ggccaagaac acctaaggaa gaggcccaat cactagaaga 6420cctggccggc ttcaaagagc tcttccagac accagaccac actgaggaat caacaactga 6480tgacaaaact accaaaatag cctgcaaatc tccaccacca gaatcaatgg acactccaac 6540aagcacaagg aggcggccca aaacaccttt ggggaaaagg gatatagtgg aagagctctc 6600agccctgaag cagctcacac agaccacaca cacagacaaa gtaccaggag atgaggataa 6660aggcatcaac gtgttcaggg aaactgcaaa acagaaactg gacccagcag caagtgtaac 6720tggtagcaag aggcagccaa gaactcctaa gggaaaagcc caacccctag aagacttggc 6780tggcttgaaa gagctcttcc agacaccagt atgcactgac aagcccacga ctcacgagaa 6840aactaccaaa 
atagcctgca gatctccaca accagaccca gtgggtaccc caacaatctt 6900caagccacag tccaagagaa gtctcaggaa agcagacgta gaggaagaat ccttagcact 6960caggaaacga acaccatcag tagggaaagc tatggacaca cccaaaccag caggaggtga 7020tgagaaagac atgaaagcat ttatgggaac tccagtgcag aaattggacc tgccaggaaa 7080tttacctggc agcaaaagat ggccacaaac tcctaaggaa aaggcccagg ctctagaaga 7140cctggctggc ttcaaagagc tcttccagac accaggcact gacaagccca cgactgatga 7200gaaaactacc aaaatagcct gcaaatctcc acaaccagac ccagtggaca ccccagcaag 7260cacaaagcaa cggcccaaga gaaacctcag gaaagcagac gtagaggaag aatttttagc 7320actcaggaaa cgaacaccat cagcaggcaa agccatggac accccaaaac cagcagtaag 7380tgatgagaaa aatatcaaca catttgtgga aactccagtg cagaaactgg acctgctagg 7440aaatttacct ggcagcaaga gacagccaca gactcctaag gaaaaggctg aggctctaga 7500ggacctggtt ggcttcaaag aactcttcca gacaccaggt cacactgagg aatcaatgac 7560tgatgacaaa atcacagaag tatcctgtaa atctccacag ccagagtcat tcaaaacctc 7620aagaagctcc aagcaaaggc tcaagatacc cctggtgaaa gtggacatga aagaagagcc 7680cctagcagtc agcaagctca cacggacatc aggggagact acgcaaacac acacagagcc 7740aacaggagat agtaagagca tcaaagcgtt taaggagtct ccaaagcaga tcctggaccc 7800agcagcaagt gtaactggta gcaggaggca gctgagaact cgtaaggaaa aggcccgtgc 7860tctagaagac ctggttgact tcaaagagct cttctcagca ccaggtcaca ctgaagagtc 7920aatgactatt gacaaaaaca caaaaattcc ctgcaaatct cccccaccag aactaacaga 7980cactgccacg agcacaaaga gatgccccaa gacacgtccc aggaaagaag taaaagagga 8040gctctcagca gttgagaggc tcacgcaaac atcagggcaa agcacacaca cacacaaaga

8100accagcaagc ggtgatgagg gcatcaaagt attgaagcaa cgtgcaaaga agaaaccaaa 8160cccagtagaa gaggaaccca gcaggagaag gccaagagca cctaaggaaa aggcccaacc 8220cctggaagac ctggccggct tcacagagct ctctgaaaca tcaggtcaca ctcaggaatc 8280actgactgct ggcaaagcca ctaaaatacc ctgcgaatct cccccactag aagtggtaga 8340caccacagca agcacaaaga ggcatctcag gacacgtgtg cagaaggtac aagtaaaaga 8400agagccttca gcagtcaagt tcacacaaac atcaggggaa accacggatg cagacaaaga 8460accagcaggt gaagataaag gcatcaaagc attgaaggaa tctgcaaaac agacaccggc 8520tccagcagca agtgtaactg gcagcaggag acggccaaga gcacccaggg aaagtgccca 8580agccatagaa gacctagctg gcttcaaaga cccagcagca ggtcacactg aagaatcaat 8640gactgatgac aaaaccacta aaataccctg caaatcatca ccagaactag aagacaccgc 8700aacaagctca aagagacggc ccaggacacg tgcccagaaa gtagaagtga aggaggagct 8760gttagcagtt ggcaagctca cacaaacctc aggggagacc acgcacaccg acaaagagcc 8820ggtaggtgag ggcaaaggca cgaaagcatt taagcaacct gcaaagcgga acgtggacgc 8880agaagatgta attggcagca ggagacagcc aagagcacct aaggaaaagg cccaacccct 8940ggaagacctg gccagcttcc aagagctctc tcaaacacca ggccacactg aggaactggc 9000aaatggtgct gctgatagct ttacaagcgc tccaaagcaa acacctgaca gtggaaaacc 9060tctaaaaata tccagaagag ttcttcgggc ccctaaagta gaacccgtgg gagacgtggt 9120aagcaccaga gaccctgtaa aatcacaaag caaaagcaac acttccctgc ccccactgcc 9180cttcaagagg ggaggtggca aagatggaag cgtcacggga accaagaggc tgcgctgcat 9240gccagcacca gaggaaattg tggaggagct gccagccagc aagaagcaga gggttgctcc 9300cagggcaaga ggcaaatcat ccgaacccgt ggtcatcatg aagagaagtt tgaggacttc 9360tgcaaaaaga attgaacctg cggaagagct gaacagcaac gacatgaaaa ccaacaaaga 9420ggaacacaaa ttacaagact cggtccctga aaataaggga atatccctgc gctccagacg 9480ccaagataag actgaggcag aacagcaaat aactgaggtc tttgtattag cagaaagaat 9540agaaataaac agaaatgaaa agaagcccat gaagacctcc ccagagatgg acattcagaa 9600tccagatgat ggagcccgga aacccatacc tagagacaaa gtcactgaga acaaaaggtg 9660cttgaggtct gctagacaga atgagagctc ccagcctaag gtggcagagg agagcggagg 9720gcagaagagt gcgaaggttc tcatgcagaa tcagaaaggg aaaggagaag caggaaattc 9780agactccatg tgcctgagat caagaaagac 
aaaaagccag cctgcagcaa gcactttgga 9840gagcaaatct gtgcagagag taacgcggag tgtcaagagg tgtgcagaaa atccaaagaa 9900ggctgaggac aatgtgtgtg tcaagaaaat aacaaccaga agtcataggg acagtgaaga 9960tatttgacag aaaaatcgaa ctgggaaaaa tataataaag ttagttttgt gataagttct 10020agtgcagttt ttgtcataaa ttacaagtga attctgtaag taaggctgtc agtctgctta 10080agggaagaaa actttggatt tgctgggtct gaatcggctt cataaactcc actgggagca 10140ctgctgggct cctggactga gaatagttga acaccggggg ctttgtgaag gagtctgggc 10200caaggtttgc cctcagcttt gcagaatgaa gccttgaggt ctgtcaccac ccacagccac 10260cctacagcag ccttaactgt gacacttgcc acactgtgtc gtcgtttgtt tgcctatgtt 10320ctccagggca cggtggcagg aacaactatc ctcgtctgtc ccaacactga gcaggcactc 10380ggtaaacacg aatgaatgga taagcgcacg gatgaatgga gcttacaaga tctgtctttc 10440caatggccgg gggcatttgg tccccaaatt aaggctattg gacatctgca caggacagtc 10500ctatttttga tgtcctttcc tttctgaaaa taaagttttg tgctttggag aatgactcgt 10560gagcacatct ttagggacca agagtgactt tctgtaagga gtgactcgtg gcttgccttg 10620gtctcttggg aatacttttc taactagggt tgctctcacc tgagacattc tccacccgcg 10680gaatctcagg gtcccaggct gtgggccatc acgacctcaa actggctcct aatctccagc 10740tttcctgtca ttgaaagctt cggaagttta ctggctctgc tcccgcctgt tttctttctg 10800actctatctg gcagcccgat gccacccagt acaggaagtg acaccagtac tctgtaaagc 10860atcatcatcc ttggagagac tgagcactca gcaccttcag ccacgatttc aggatcgctt 10920ccttgtgagc cgctgcctcc gaaatctcct ttgaagccca gacatctttc tccagcttca 10980gacttgtaga tataactcgt tcatcttcat ttactttcca ctttgccccc tgtcctctct 11040gtgttcccca aatcagagaa tagcccgcca tcccccagat cacctgtctg gattcctccc 11100cattcaccca ccttgccagg tgcaggtgag gatggtgcac cagacagggt agctgtcccc 11160caaaatgtgc cctgtgcggg cagtgccctg tctccacgtt tgtttcccca gtgtctggcg 11220gggagccagg tgacatcata aatacttgct gaatgaatgc agaaatcagc ggtactgact 11280tgtactatat tggctgccat gatagggttc tcacagcgtc atccatgatc gtaagggaga 11340atgacattct gcttgaggga gggaatagaa aggggcaggg aggggacatc tgagggcttc 11400acagggctgc aaagggtaca gggattgcac cagggcagaa caggggaggg tgttcaagga 11460agagtggctc ttagcagagg cactttggaa ggtgtgaggc 
ataaatgctt ccttctacgt 11520aggccaacct caaaactttc agtaggaatg ttgctatgat caagttgttc taacacttta 11580gacttagtag taattatgaa cctcacatag aaaaatttca tccagccata tgcctgtgga 11640gtggaatatt ctgtttagta gaaaaatcct ttagagttca gctctaacca gaaatcttgc 11700tgaagtatgt cagcaccttt tctcaccctg gtaagtacag tatttcaaga gcacgctaag 11760ggtggttttc attttacagg gctgttgatg atgggttaaa aatgttcatt taagggctac 11820ccccgtgttt aatagatgaa caccacttct acacaaccct ccttggtact gggggaggga 11880gagatctgac aaatactgcc cattccccta ggctgactgg atttgagaac aaatacccac 11940ccatttccac catggtatgg taacttctct gagcttcagt ttccaagtga atttccatgt 12000aataggacat tcccattaaa tacaagctgt ttttactttt tcgcctccca gggcctgtgc 12060gatctggtcc cccagcctct cttgggcttt cttacactaa ctctgtacct accatctcct 12120gcctccctta ggcaggcacc tccaaccacc acacactccc tgctgttttc cctgcctgga 12180actttcccac cagccccacc aagatcattt catccagtcc tgagctcagc ttaagggagg 12240cttcttgcct gtgggttccc tcacccccat gcctgtcctc caggctgggg caggttctta 12300gtttgcctgg aattgttctg tacctctttg tagcacgtag tgttgtgaaa ctaagccact 12360aattgagttt ctggctcccc tcctggggtt gtaagttttg ttcattcatg agggccgact 12420gtatttcctg gttactgtat cccagtgacc agccacagga gatgtccaat aaagtatgtg 12480atgaaatggt cttaaaaaaa aaaaaaaaaa aaaaa 125153142444DNAHomo sapiens 314ggcacgaggc ggggccgggt cgcagctggg cccgcggcat ggacgaactg ttccccctca 60tcttcccggc agagcagccc aagcagcggg gcatgcgctt ccgctacaag tgcgaggggc 120gctccgcggg cagcatccca ggcgagagga gcacagatac caccaagacc caccccacca 180tcaagatcaa tggctacaca ggaccaggga cagtgcgcat ctccctggtc accaaggacc 240ctcctcaccg gcctcacccc cacgagcttg taggaaagga ctgccgggat ggcttctatg 300aggctgagct ctgcccggac cgctgcatcc acagtttcca gaacctggga atccagtgtg 360tgaagaagcg ggacctggag caggctatca gtcagcgcat ccagaccaac aacaacccct 420tccaagttcc tatagaagag cagcgtgggg actacgacct gaatgctgtg cggctctgct 480tccaggtgac agtgcgggac ccatcaggca ggcccctccg cctgccgcct gtcctttctc 540atcccatctt tgacaatcgt gcccccaaca ctgccgagct caagatctgc cgagtgaacc 600gaaactctgg cagctgcctc ggtggggatg agatcttcct actgtgtgac aaggtgcaga 
660aagaggacat tgaggtgtat ttcacgggac caggctggga ggcccgaggc tccttttcgc 720aagctgatgt gcaccgacaa gtggccattg tgttccggac ccctccctac gcagacccca 780gcctgcaggc tcctgtgcgt gtctccatgc agctgcggcg gccttccgac cgggagctca 840gtgagcccat ggaattccag tacctgccag atacagacga tcgtcaccgg attgaggaga 900aacgtaaaag gacatatgag accttcaaga gcatcatgaa gaagagtcct ttcagcggac 960ccaccgaccc ccggcctcca cctcgacgca ttgctgtgcc ttcccgcagc tcagcttctg 1020tccccaagcc agcaccccag ccctatccct ttacgtcatc cctgagcacc atcaactatg 1080atgagtttcc caccatggtg tttccttctg ggcagatcag ccaggcctcg gccttggccc 1140cggcccctcc ccaagtcctg ccccaggctc cagcccctgc ccctgctcca gccatggtat 1200cagctctggc ccaggcccca gcccctgtcc cagtcctagc cccaggccct cctcaggctg 1260tggccccacc tgcccccaag cccacccagg ctggggaagg aacgctgtca gaggccctgc 1320tgcagctgca gtttgatgat gaagacctgg gggccttgct tggcaacagc acagacccag 1380ctgtgttcac agacctggca tccgtcgaca actccgagtt tcagcagctg ctgaaccagg 1440gcatacctgt ggccccccac acaactgagc ccatgctgat ggagtaccct gaggctataa 1500ctcgcctagt gacagcccag aggccccccg acccagctcc tgctccactg ggggccccgg 1560ggctccccaa tggcctcctt tcaggagatg aagacttctc ctccattgcg gacatggact 1620tctcagccct gctgagtcag atcagctcct aagggggtga cgcctgccct ccccagagca 1680ctggttgcag gggattgaag ccctccaaaa gcacttacgg attctggtgg ggtgtgttcc 1740aactgccccc aactttgtgg atgtcttcct tggagggggg agccatattt tattctttta 1800ttgtcagtat ctgtatctct ctctcttttt ggaggtgctt aagcagaagc attaacttct 1860ctggaaaggg gggagctggg gaaactcaaa cttttcccct gtcctgatgg tcagctccct 1920tctctgtagg gaactgtggg gtcccccatc cccatcctcc agcttctggt actctcctag 1980agacagaagc aggctggagg taaggccttt gagcccacaa agccttatca agtgtcttcc 2040atcatggatt cattacagct taatcaaaat aacgccccag ataccagccc ctgtatggca 2100ctggcattgt ccctgtgcct aacaccagcg tttgaggggc tgccttcctg ccctacagag 2160gtctctgccg gctctttcct tgctcaacca tggctgaagg aaacagtgca acagcactgg 2220ctctctccag gatccagaag gggtttggtc tggacttcct tgctctcccc tcttctcaag 2280tgccttaata gtagggtaag ttgttaagag tgggggagag caggctggca gctctccagt 2340caggaggcat agtttttagt gaacaatcaa 
agcacttgga ctcttgctct ttctactctg 2400aactaataaa gctgttgcca agctggacgg cacgagctcg tgcc 2444315732DNAHomo sapiens 315tgctgcgaac cacgtgggtc ccgggcgcgt ttcgggtgct ggcggctgca gccggagttc 60aaacctaagc agctggaagg aaccatggcc aactgtgagc gtaccttcat tgcgatcaaa 120ccagatgggg tccagcgggg tcttgtggga gagattatca agcgttttga gcagaaagga 180ttccgccttg ttggtctgaa attcatgcaa gcttccgaag atcttctcaa ggaacactac 240gttgacctga aggaccgtcc attctttgcc ggcctggtga aatacatgca ctcagggccg 300gtagttgcca tggtctggga ggggctgaat gtggtgaaga cgggccgagt catgctcggg 360gagaccaacc ctgcagactc caagcctggg accatccgtg gagacttctg catacaagtt 420ggcaggaaca ttatacatgg cagtgattct gtggagagtg cagagaagga gatcggcttg 480tggtttcacc ctgaggaact ggtagattac acgagctgtg ctcagaactg gatctatgaa 540tgacaggagg gcagaccaca ttgcttttca catccatttc ccctccttcc catgggcaga 600ggaccaggct gtaggaaatc tagttattta caggaacttc atcataattt ggagggaagc 660tcttggagct gtgagttctc cctgtacagt gttaccatcc ccgaccatct gattaaaatg 720cttcctccca gc 7323162422DNAHomo sapiens 316gtcagcctcc cttccaccgc catattgggc cactaaaaaa agggggctcg tcttttcggg 60gtgtttttct ccccctcccc tgtccccgct tgctcacggc tctgcgactc cgacgccggc 120aaggtttgga gagcggctgg gttcgcggga cccgcgggct tgcacccgcc cagactcgga 180cgggctttgc caccctctcc gcttgcctgg tcccctctcc tctccgccct cccgctcgcc 240agtccatttg atcagcggag actcggcggc cgggccgggg cttccccgca gcccctgcgc 300gctcctagag ctcgggccgt ggctcgtcgg ggtctgtgtc ttttggctcc gagggcagtc 360gctgggcttc cgagaggggt tcgggccgcg taggggcgct ttgttttgtt cggttttgtt 420tttttgagag tgcgagagag gcggtcgtgc agacccggga gaaagatgtc aaacgtgcga 480gtgtctaacg ggagccctag cctggagcgg atggacgcca ggcaggcgga gcaccccaag 540ccctcggcct gcaggaacct cttcggcccg gtggaccacg aagagttaac ccgggacttg 600gagaagcact gcagagacat ggaagaggcg agccagcgca agtggaattt cgattttcag 660aatcacaaac ccctagaggg caagtacgag tggcaagagg tggagaaggg cagcttgccc 720gagttctact acagaccccc gcggcccccc aaaggtgcct gcaaggtgcc ggcgcaggag 780agccaggatg tcagcgggag ccgcccggcg gcgcctttaa ttggggctcc ggctaactct 840gaggacacgc atttggtgga cccaaagact gatccgtcgg 
acagccagac ggggttagcg 900gagcaatgcg caggaataag gaagcgacct gcaaccgacg attcttctac tcaaaacaaa 960agagccaaca gaacagaaga aaatgtttca gacggttccc caaatgccgg ttctgtggag 1020cagacgccca agaagcctgg cctcagaaga cgtcaaacgt aaacagctcg aattaagaat 1080atgtttcctt gtttatcaga tacatcactg cttgatgaag caaggaagat atacatgaaa 1140attttaaaaa tacatatcgc tgacttcatg gaatggacat cctgtataag cactgaaaaa 1200caacaacaca ataacactaa aattttaggc actcttaaat gatctgcctc taaaagcgtt 1260ggatgtagca ttatgcaatt aggtttttcc ttatttgctt cattgtacta cctgtgtata 1320tagtttttac cttttatgta gcacataaac tttggggaag ggagggcagg gtggggctga 1380ggaactgacg tggagcgggg tatgaagagc ttgctttgat ttacagcaag tagataaata 1440tttgacttgc atgaagagaa gcaattttgg ggaagggttt gaattgtttt ctttaaagat 1500gtaatgtccc tttcagagac agctgatact tcatttaaaa aaatcacaaa aatttgaaca 1560ctggctaaag ataattgcta tttattttta caagaagttt attctcattt gggagatctg 1620gtgatctccc aagctatcta aagtttgtta gatagctgca tgtggctttt ttaaaaaagc 1680aacagaaacc tatcctcact gccctcccca gtctctctta aagttggaat ttaccagtta 1740attactcagc agaatggtga tcactccagg tagtttgggg caaaaatccg aggtgcttgg 1800gagttttgaa tgttaagaat tgaccatctg cttttattaa atttgttgac aaaattttct 1860cattttcttt tcacttcggg ctgtgtaaac acagtcaaaa taattctaaa tccctcgata 1920tttttaaaga tctgtaagta acttcacatt aaaaaatgaa atatttttta atttaaagct 1980tactctgtcc atttatccac aggaaagtgt tatttttaaa ggaaggttca tgtagagaaa 2040agcacacttg taggataagt gaaatggata ctacatcttt aaacagtatt tcattgcctg 2100tgtatggaaa aaccatttga agtgtacctg tgtacataac tctgtaaaaa cactgaaaaa 2160ttatactaac ttatttatgt taaaagattt tttttaatct agacaatata caagccaaag 2220tggcatgttt tgtgcatttg taaatgctgt gttgggtaga ataggttttc ccctcttttg 2280ttaaataata tggctatgct taaaaggttg catactgagc caagtataat tttttgtaat 2340gtgtgaaaaa gatgccaatt attgttacac attaagtaat caataaagaa aacttccata 2400gctaaaaaaa aaaaaaaaaa aa 24223175061DNAHomo sapiens 317atggctcaga tatttagcaa cagcggattt aaagaatgtc cattttcaca tccggaacca 60acaagagcaa aagatgtgga caaagaagaa gcattacaga tggaagcaga ggctttagca 120aaactgcaaa aggatagaca agtgactgac 
aatcagagag gctttgagtt gtcaagcagc 180accagaaaaa aagcacaggt ttataacaag caggattatg atctcatggt gtttcctgaa 240tcagattccc aaaaaagagc attagatatt gatgtagaaa agctcaccca agctgaactt 300gagaaactat tgctggatga cagtttcgag actaaaaaaa cacctgtatt accagttact 360cctattctga gcccttcctt ttcagcacag ctctatttta gacctactat tcagagagga 420cagtggccac ctggattacc tgggccttcc acttatgctt taccttctat ttatccttct 480acttacagta aacaggctgc attccaaaat ggcttcaatc caagaatgcc cacttttcca 540tctacagaac ctatatattt aagtcttccg ggacaatctc catatttctc atatcctttg 600acacctgcca caccctttca tccacaagga agcttaccta tctatcgtcc agtagtcagt 660actgacatgg caaaactatt tgacaaaata gctagtacat cagaattttt aaaaaatggg 720aaagcaagga ctgatttgga gataacagat tcaaaagtca gcaatctaca ggtatctcca 780aagtctgagg atatcagtaa atttgactgg ttagacttgg atcctctaag taagcctaag 840gtggataatg tggaggtatt agaccatgag gaagagaaaa atgtttcaag tttgctagca 900aaggatcctt gggatgctgt tcttcttgaa gagagatcga cagcaaattg tcatcttgaa 960agaaaggtga atggaaaatc cctttctgtg gcaactgtta caagaagcca gtctttaaat 1020attcgaacaa ctcagcttgc aaaagcccag ggccatatat ctcagaaaga cccaaatggg 1080accagtagtt tgccaactgg aagttctctt cttcaagaag ttgaagtaca gaatgaggag 1140atggcagctt tttgtcgatc cattacaaaa ttgaagacca aatttccata taccaatcac 1200cgcacaaacc caggctattt gttaagtcca gtcacagcgc aaagaaacat atgcggagaa 1260aatgctagtg tgaaggtctc cattgacatt gaaggatttc agctaccagt tacttttacg 1320tgtgatgtga gttctactgt agaaatcatt ataatgcaag ccctttgctg ggtacatgat 1380gacttgaatc aagtagatgt tggcagctat gttctaaaag tttgtggtca agaggaagtg 1440ctgcagaata atcattgcct tggaagtcat gagcatattc aaaactgtcg aaaatgggac 1500acagaaatta gactacaact cttgaccttc agtgcaatgt gtcaaaatct ggcccgaaca 1560gcagaagatg atgaaacacc cgtggattta aacaaacacc tgtatcaaat agaaaaacct 1620tgcaaagaag ccatgacgag acaccctgtt gaagaactct tagattctta tcacaaccaa 1680gtagaactgg ctcttcaaat tgaaaaccaa caccgagcag tagatcaagt aattaaagct 1740gtaagaaaaa tctgtagtgc tttagatggt gtcgagactc ttgccattac agaatcagta 1800aagaagctaa agagagcagt taatcttcca aggagtaaaa ctgctgatgt gacttctttg 1860tttggaggag 
aagacactag caggagttca actaggggct cacttaatcc tgaaaatcct 1920gttcaagtaa gcataaacca attaactgca gcaatttatg atcttctcag actccatgca 1980aattctggta ggagtcctac agactgtgcc caaagtagca agagtgtcaa ggaagcatgg 2040actacaacag agcagctcca gtttactatt tttgctgctc atggaatttc aagtaattgg 2100gtatcaaatt atgaaaaata ctacttgata tgttcactgt ctcacaatgg aaaggatctt 2160tttaaaccta ttcaatcaaa gaaggttggc acttacaaga atttcttcta tcttattaaa 2220tgggatgaac taatcatttt tcctatccag atatcacaat tgccattaga atcagttctt 2280caccttactc tttttggaat tttaaatcag agcagtggaa gttcccctga ttctaataag 2340cagagaaagg gaccagaagc tttgggcaaa gtttctttac ctctttgtga ctttagacgg 2400tttttaacat gtggaactaa acttctatat ctttggactt catcacatac aaattctgtt 2460cctggaacag ttaccaaaaa aggatatgtc atggaaagaa tagtgctaca ggttgatttt 2520ccttctcctg catttgatat tatttataca actcctcaag ttgacagaag cattatacag 2580caacataact tagaaacact agagaatgat ataaaaggga aacttcttga tattcttcat 2640aaagactcat cacttggact ttctaaagaa gataaagctt ttttatggga gaaacgttat 2700tattgcttca aacacccaaa ttgtcttcct aaaatattag caagcgcccc aaactggaaa 2760tggggtaatc ttgccaaaac ttactcattg cttcaccagt ggcctgcatt gtacccacta 2820attgcattgg aacttcttga ttcaaaattt gctgatcagg aagtaagatc cctagctgtg 2880acctggattg aggccattag tgatgatgag ctaacagatc ttcttccaca gtttgtacaa 2940gctttgaaat atgaaattta cttgaatagt tcattagtgc aattcctttt gtccagggca 3000ttgggaaata tccagatagc acacaattta tattggcttc tcaaagatgc cctgcatgat 3060gtacagttta gtacccgata cgaacatgtt ttgggtgctc tcctgtcagt aggaggaaaa 3120cgacttagag aagaacttct aaaacagacg aaacttgtac agcttttagg aggagtagca 3180gaaaaagtaa ggcaggctag tggatcagcc agacaggttg ttctccaaag aagtatggaa 3240cgagtacagt ccttttttca gaaaaataaa tgccgtctcc ctctcaagcc aagtctagtg 3300gcaaaagaat taaatattaa gtcgtgttcc ttcttcagtt ctaatgctgt ccccctaaaa 3360gtcacaatgg tgaatgctga ccctctggga gaagaaatta atgtcatgtt taaggttggt 3420gaagatcttc ggcaagatat gttagcttta cagatgataa agattatgga taagatctgg 3480cttaaagaag gactagatct gaggatggta attttcaaat gtctctcaac tggcagagat 3540cgaggcatgg tggagctggt tcctgcttcc gataccctca 
ggaaaatcca agtggaatat 3600ggtgtgacag gatcctttaa agataaacca cttgcagagt ggctaaggaa atacaatccc 3660tctgaagaag aatatgaaaa ggcttcagag aactttatct attcctgtgc tggatgctgt 3720gtagccacct atgttttagg catctgtgat cgacacaatg acaatataat gcttcgaagc 3780acgggacaca tgtttcacat tgactttgga aagtttttgg gacatgcaca gatgtttggc 3840agcttcaaaa gggatcgggc tccttttgtg ctgacctctg atatggcata tgtcattaat 3900gggggtgaaa agcccaccat tcgttttcag ttgtttgtgg acctctgctg tcaggcctac 3960aacttgataa gaaagcagac aaaccttttt cttaacctcc tttcactgat gattccttca 4020gggttaccag aacttacaag tattcaagat ttgaaatacg ttagagatgc acttcaaccc 4080caaactacag acgcagaagc tacaattttc tttactaggc ttattgaatc aagtttggga 4140agcattgcca caaagtttaa cttcttcatt cacaaccttg ctcagcttcg tttttctggt 4200cttccttcta atgatgagcc catcctttca ttttcaccta aaacatactc ctttagacaa 4260gatggtcgaa tcaaggaagt ctctgttttt acatatcata agaaatacaa cccagataaa 4320cattatattt atgtagtccg aattttgtgg gaaggacaga ttgaaccatc atttgtcttc 4380cgaacatttg tcgaatttca ggaacttcac aataagctca gtattatttt tccactttgg 4440aagttaccag gctttcctaa taggatggtt ctaggaagaa cacacataaa agatgtagca 4500gccaaaagga aaattgagtt aaacagttac ttacagagtt tgatgaatgc ttcaacggat 4560gtagcagagt gtgatcttgt ttgtactttc ttccaccctt tacttcgtga tgagaaagct 4620gaagggatag ctaggtctgc agatgcaggt tccttcagtc ctactccagg ccaaatagga 4680ggagctgtga aattatccat ctcttaccga aatggtactc ttttcatcat ggtgatgcat 4740atcaaagatc ttgttactga agatggagct gacccaaatc catatgtcaa aacataccta 4800cttccagata accacaaaac
atccaaacgt aaaaccaaaa tttcacgaaa aacgaggaat 4860ccgacattca atgaaatgct tgtatacagt ggatatagca aagaaaccct aagacagcga 4920gaacttcaac taagtgtact cagtgcagaa tctctgcggg agaatttttt cttgggtgga 4980gtaaccctgc ctttgaaaga tttcaacttg agcaaagaga cggttaaatg gtatcagctg 5040actgcggcaa catacttgta a 50613183014DNAHomo sapiens 318ctgaccagcg ccgccctccc ccgcccccga cccaggaggt ggagatccct ccggtccagc 60cacattcaac acccactttc tcctccctct gcccctatat tcccgaaacc ccctcctcct 120tcccttttcc ctcctccctg gagacggggg aggagaaaag gggagtccag tcgtcatgac 180tgagctgaag gcaaagggtc cccgggctcc ccacgtggcg ggcggcccgc cctcccccga 240ggtcggatcc ccactgctgt gtcgcccagc cgcaggtccg ttcccgggga gccagacctc 300ggacaccttg cctgaagttt cggccatacc tatctccctg gacgggctac tcttccctcg 360gccctgccag ggacaggacc cctccgacga aaagacgcag gaccagcagt cgctgtcgga 420cgtggagggc gcatattcca gagctgaagc tacaaggggt gctggaggca gcagttctag 480tcccccagaa aaggacagcg gactgctgga cagtgtcttg gacactctgt tggcgccctc 540aggtcccggg cagagccaac ccagccctcc cgcctgcgag gtcaccagct cttggtgcct 600gtttggcccc gaacttcccg aagatccacc ggctgccccc gccacccagc gggtgttgtc 660cccgctcatg agccggtccg ggtgcaaggt tggagacagc tccgggacgg cagctgccca 720taaagtgctg ccccggggcc tgtcaccagc ccggcagctg ctgctcccgg cctctgagag 780ccctcactgg tccggggccc cagtgaagcc gtctccgcag gccgctgcgg tggaggttga 840ggaggaggat ggctctgagt ccgaggagtc tgcgggtccg cttctgaagg gcaaacctcg 900ggctctgggt ggcgcggcgg ctggaggagg agccgcggct gtcccgccgg gggcggcagc 960aggaggcgtc gccctggtcc ccaaggaaga ttcccgcttc tcagcgccca gggtcgccct 1020ggtggagcag gacgcgccga tggcgcccgg gcgctccccg ctggccacca cggtgatgga 1080tttcatccac gtgcctatcc tgcctctcaa tcacgcctta ttggcagccc gcactcggca 1140gctgctggaa gacgaaagtt acgacggcgg ggccggggct gccagcgcct ttgccccgcc 1200gcggagttca ccctgtgcct cgtccacccc ggtcgctgta ggcgacttcc ccgactgcgc 1260gtacccgccc gacgccgagc ccaaggacga cgcgtaccct ctctatagcg acttccagcc 1320gcccgctcta aagataaagg aggaggagga aggcgcggag gcctccgcgc gctccccgcg 1380ttcctacctt gtggccggtg ccaaccccgc agccttcccg gatttcccgt tggggccacc 1440gcccccgctg ccgccgcgag 
cgaccccatc cagacccggg gaagcggcgg tgacggccgc 1500acccgccagt gcctcagtct cgtctgcgtc ctcctcgggg tcgaccctgg agtgcatcct 1560gtacaaagcg gagggcgcgc cgccccagca gggcccgttc gcgccgccgc cctgcaaggc 1620gccgggcgcg agcggctgcc tgctcccgcg ggacggcctg ccctccacct ccgcctctgc 1680cgccgccgcc ggggcggccc ccgcgctcta ccctgcactc ggcctcaacg ggctcccgca 1740gctcggctac caggccgccg tgctcaagga gggcctgccg caggtctacc cgccctatct 1800caactacctg aggccggatt cagaagccag ccagagccca caatacagct tcgagtcatt 1860acctcagaag atttgtttaa tctgtgggga tgaagcatca ggctgtcatt atggtgtcct 1920tacctgtggg agctgtaagg tcttctttaa gagggcaatg gaagggcagc acaactactt 1980atgtgctgga agaaatgact gcatcgttga taaaatccgc agaaaaaact gcccagcatg 2040tcgccttaga aagtgctgtc aggctggcat ggtccttgga ggtcgaaaat ttaaaaagtt 2100caataaagtc agagttgtga gagcactgga tgctgttgct ctcccacagc cagtgggcgt 2160tccaaatgaa agccaagccc taagccagag attcactttt tcaccaggtc aagacataca 2220gttgattcca ccactgatca acctgttaat gagcattgaa ccagatgtga tctatgcagg 2280acatgacaac acaaaacctg acacctccag ttctttgctg acaagtctta atcaactagg 2340cgagaggcaa cttctttcag tagtcaagtg gtctaaatca ttgccaggtt ttcgaaactt 2400acatattgat gaccagataa ctctcattca gtattcttgg atgagcttaa tggtgtttgg 2460tctaggatgg agatcctaca aacacgtcag tgggcagatg ctgtattttg cacctgatct 2520aatactaaat gaacagcgga tgaaagaatc atcattctat tcattatgcc ttaccatgtg 2580gcagatccca caggagtttg tcaagcttca agttagccaa gaagagttcc tctgtatgaa 2640agtattgtta cttcttaata caattccttt ggaagggcta cgaagtcaaa cccagtttga 2700ggagatgagg tcaagctaca ttagagagct catcaaggca attggtttga ggcaaaaagg 2760agttgtgtcg agctcacagc gtttctatca acttacaaaa cttcttgata acttgcatga 2820tcttgtcaaa caacttcatc tgtactgctt gaatacattt atccagtccc gggcactgag 2880tgttgaattt ccagaaatga tgtctgaagt tattgctgca caattaccca agatattggc 2940agggatggtg aaaccccttc tctttcataa aaagtgaatg tcatcttttt cttttaaaga 3000attaaatttt gtgg 30143192148DNAHomo sapiens 319gcttcagggt acagctcccc cgcagccaga agccgggcct gcagcgcctc agcaccgctc 60cgggacaccc cacccgcttc ccaggcgtga cctgtcaaca gcaacttcgc ggtgtggtga 120actctctgag 
gaaaaaccat tttgattatt actctcagac gtgcgtggca acaagtgact 180gagacctaga aatccaagcg ttggaggtcc tgaggccagc ctaagtcgct tcaaaatgga 240acgaaggcgt ttgtggggtt ccattcagag ccgatacatc agcatgagtg tgtggacaag 300cccacggaga cttgtggagc tggcagggca gagcctgctg aaggatgagg ccctggccat 360tgccgccctg gagttgctgc ccagggagct cttcccgcca ctcttcatgg cagcctttga 420cgggagacac agccagaccc tgaaggcaat ggtgcaggcc tggcccttca cctgcctccc 480tctgggagtg ctgatgaagg gacaacatct tcacctggag accttcaaag ctgtgcttga 540tggacttgat gtgctccttg cccaggaggt tcgccccagg aggtggaaac ttcaagtgct 600ggatttacgg aagaactctc atcaggactt ctggactgta tggtctggaa acagggccag 660tctgtactca tttccagagc cagaagcagc tcagcccatg acaaagaagc gaaaagtaga 720tggtttgagc acagaggcag agcagccctt cattccagta gaggtgctcg tagacctgtt 780cctcaaggaa ggtgcctgtg atgaattgtt ctcctacctc attgagaaag tgaagcgaaa 840gaaaaatgta ctacgcctgt gctgtaagaa gctgaagatt tttgcaatgc ccatgcagga 900tatcaagatg atcctgaaaa tggtgcagct ggactctatt gaagatttgg aagtgacttg 960tacctggaag ctacccacct tggcgaaatt ttctccttac ctgggccaga tgattaatct 1020gcgtagactc ctcctctccc acatccatgc atcttcctac atttccccgg agaaggaaga 1080gcagtatatc gcccagttca cctctcagtt cctcagtctg cagtgcctgc aggctctcta 1140tgtggactct ttatttttcc ttagaggccg cctggatcag ttgctcaggc acgtgatgaa 1200ccccttggaa accctctcaa taactaactg ccggctttcg gaaggggatg tgatgcatct 1260gtcccagagt cccagcgtca gtcagctaag tgtcctgagt ctaagtgggg tcatgctgac 1320cgatgtaagt cccgagcccc tccaagctct gctggagaga gcctctgcca ccctccagga 1380cctggtcttt gatgagtgtg ggatcacgga tgatcagctc cttgccctcc tgccttccct 1440gagccactgc tcccagctta caaccttaag cttctacggg aattccatct ccatatctgc 1500cttgcagagt ctcctgcagc acctcatcgg gctgagcaat ctgacccacg tgctgtatcc 1560tgtccccctg gagagttatg aggacatcca tggtaccctc cacctggaga ggcttgccta 1620tctgcatgcc aggctcaggg agttgctgtg tgagttgggg cggcccagca tggtctggct 1680tagtgccaac ccctgtcctc actgtgggga cagaaccttc tatgacccgg agcccatcct 1740gtgcccctgt ttcatgccta actagctggg tgcacatatc aaatgcttca ttctgcatac 1800ttggacacta aagccaggat gtgcatgcat cttgaagcaa caaagcagcc 
acagtttcag 1860acaaatgttc agtgtgagtg aggaaaacat gttcagtgag gaaaaaacat tcagacaaat 1920gttcagtgag gaaaaaaagg ggaagttggg gataggcaga tgttgacttg aggagttaat 1980gtgatctttg gggagataca tcttatagag ttagaaatag aatctgaatt tctaaaggga 2040gattctggct tgggaagtac atgtaggagt taatccctgt gtagactgtt gtaaagaaac 2100tgttgaaaat aaagagaagc aatgtgaagc aaaaaaaaaa aaaaaaaa 2148320540DNAHomo sapiens 320atccctgact cggggtcgcc tttggagcag agaggaggca atggccacca tggagaacaa 60ggtgatctgc gccctggtcc tggtgtccat gctggccctc ggcaccctgg ccgaggccca 120gacagagacg tgtacagtgg ccccccgtga aagacagaat tgtggttttc ctggtgtcac 180gccctcccag tgtgcaaata agggctgctg tttcgacgac accgttcgtg gggtcccctg 240gtgcttctat cctaatacca tcgacgtccc tccagaagag gagtgtgaat tttagacact 300tctgcaggga tctgcctgca tcctgacggg gtgccgtccc cagcacggtg attagtccca 360gagctcggct gccacctcca ccggacacct cagacacgct tctgcagctg tgcctcggct 420cacaacacag attgactgct ctgactttga ctactcaaaa ttggcctaaa aattaaaaga 480gatcgatatt aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 5403212346DNAHomo sapiens 321gcacgaggct gcggcgggtc cgggcccatg aggcgacgaa ggaggcggga cggcttttac 60ccagccccgg acttccgaga cagggaagct gaggacatgg caggagtgtt tgacatagac 120ctggaccagc cagaggacgc gggctctgag gatgagctgg aggagggggg tcagttaaat 180gaaagcatgg accatggggg agttggacca tatgaacttg gcatggaaca ttgtgagaaa 240tttgaaatct cagaaactag tgtgaacaga gggccagaaa aaatcagacc agaatgtttt 300gagctacttc gggtacttgg taaagggggc tatggaaagg tttttcaagt acgaaaagta 360acaggagcaa atactgggaa aatatttgcc atgaaggtgc ttaaaaaggc aatgatagta 420agaaatgcta aagatacagc tcatacaaaa gcagaacgga atattctgga ggaagtaaag 480catcccttca tcgtggattt aatttatgcc tttcagactg gtggaaaact ctacctcatc 540cttgagtatc tcagtggagg agaactattt atgcagttag aaagagaggg aatatttatg 600gaagacactg cctgctttta cttggcagaa atctccatgg ctttggggca tttacatcaa 660aaggggatca tctacagaga cctgaagccg gagaatatca tgcttaatca ccaaggtcat 720gtgaaactaa cagactttgg actatgcaaa gaatctattc atgatggaac agtcacacac 780acattttgtg gaacaataga atacatggcc cctgaaatct tgatgagaag tggccacaat 840cgtgctgtgg 
attggtggag tttgggagca ttaatgtatg acatgctgac tggagcaccc 900ccattcactg gggagaatag aaagaaaaca attgacaaaa tcctcaaatg taaactcaat 960ttgcctccct acctcacaca agaagccaga gatctgctta aaaagctgct gaaaagaaat 1020gctgcttctc gtctgggagc tggtcctggg gacgctggag aagttcaagc tcatccattc 1080tttagacaca ttaactggga agaacttctg gctcgaaagg tggagccccc ctttaaacct 1140ctgttgcaat ctgaagagga tgtaagtcag tttgattcca agtttacacg tcagacacct 1200gtcgacagcc cagatgactc aactctcagt gaaagtgcca atcaggtctt tctgggtttt 1260acatatgtgg ctccatctgt acttgaaagt gtgaaagaaa agttttcctt tgaaccaaaa 1320atccgatcac ctcgaagatt tattggcagc ccacgaacac ctgtcagccc agtcaaattt 1380tctcctgggg atttctgggg aagaggtgct tcggccagca cagcaaatcc tcagacacct 1440gtggaatacc caatggaaac aagtggcata gagcagatgg atgtgacaat gagtggggaa 1500gcatcggcac cacttccaat acgacagccg aactctgggc catacaaaaa acaagctttt 1560cccatgatct ccaaacggcc agagcacctg cgtatgaatc tatgacagag caatgctttt 1620aatgaattta aggcaaaaag gtggagaggg agatgtgtga gcatcctgca aggtgaaaca 1680agactcaaaa tgacagtttc agagagtcaa tgtcattaca tagaacactt cggacacagg 1740aaaaataaac gtggatttta aaaaatcaat caatggtgca aaaaaaaact taaagcaaaa 1800tagtattgct gaactcttag gcacatcaat taattgattc ctcgcgacat ctttctcaac 1860cttatcaagg attttcatgt tgatgactcg aaactgacag tattaagggt aggatgttgc 1920tctgaatcac tgtgagtctg atgtgtgaag aagggtatcc tttcattagg caagtacaaa 1980ttgcctataa tacttgcaac taaggacaaa ttagcatgca agcttggtca aacttttccc 2040aggcaaaatg ggaaggcaaa gacaaaagaa acttaccaat tgatgtttta cgtgcaaaca 2100acctgaatct tttttttata taaatatata tttttcaaat agatttttga ttcagctcat 2160tatgaaaaac atcccaaact ttaaaatgcg aaattattgg ttggtgtgaa gaaagccaga 2220caacttctgt ttcttctctt ggtgaaataa taaaatgcaa atgaatcatt gttaacacag 2280ctgtggctcg tttgagggat tggggtggac ctggggttta ttttcagtaa cccagctgcg 2340gagcct 23463222420DNAHomo sapiens 322tccggggcgg cccccggcag ccagcgcgac gttccaaaat cgaacctcag tggcggcgct 60cggaagcgga actctgccgg ggccgcgccg gctacattgt ttcctccccc cgactccctc 120ccgccccctt cccccgcctt tcttccctcc gcgacccggg ccgtgcgtcc gtccccctgc 180ctctgcctgg 
cggtccctcc tcccctctcc ttgcacccat acctctttgt accgcacccc 240ctggggaccc ctgcgcccct cccctccccc ctgaccgcat ggaccgtccc gcaggccgct 300gatgccgccc gcggcgaggt ggcccggacc gcagtgcccc aagagagctc taatggtacc 360aagtgacagg ttggctttac tgtgactcgg ggacgccaga gctcctgaga agatgtcagc 420aatacaggcc gcctggccat ccggtacaga atgtattgcc aagtacaact tccacggcac 480tgccgagcag gacctgccct tctgcaaagg agacgtgctc accattgtgg ccgtcaccaa 540ggaccccaac tggtacaaag ccaaaaacaa ggtgggccgt gagggcatca tcccagccaa 600ctacgtccag aagcgggagg gcgtgaaggc gggtaccaaa ctcagcctca tgccttggtt 660ccacggcaag atcacacggg agcaggctga gcggcttctg tacccgccgg agacaggcct 720gttcctggtg cgggagagca ccaactaccc cggagactac acgctgtgcg tgagctgcga 780cggcaaggtg gagcactacc gcatcatgta ccatgccagc aagctcagca tcgacgagga 840ggtgtacttt gagaacctca tgcagctggt ggagcactac acctcagacg cagatggact 900ctgtacgcgc ctcattaaac caaaggtcat ggagggcaca gtggcggccc aggatgagtt 960ctaccgcagc ggctgggccc tgaacatgaa ggagctgaag ctgctgcaga ccatcgggaa 1020gggggagttc ggagacgtga tgctgggcga ttaccgaggg aacaaagtcg ccgtcaagtg 1080cattaagaac gacgccactg cccaggcctt cctggctgaa gcctcagtca tgacgcaact 1140gcggcatagc aacctggtgc agctcctggg cgtgatcgtg gaggagaagg gcgggctcta 1200catcgtcact gagtacatgg ccaaggggag ccttgtggac tacctgcggt ctaggggtcg 1260gtcagtgctg ggcggagact gtctcctcaa gttctcgcta gatgtctgcg aggccatgga 1320atacctggag ggcaacaatt tcgtgcatcg agacctggct gcccgcaatg tgctggtgtc 1380tgaggacaac gtggccaagg tcagcgactt tggtctcacc aaggaggcgt ccagcaccca 1440ggacacgggc aagctgccag tcaagtggac agcccctgag gccctgagag agaagaaatt 1500ctccactaag tctgacgtgt ggagtttcgg aatccttctc tgggaaatct actcctttgg 1560gcgagtgcct tatccaagaa ttcccctgaa ggacgtcgtc cctcgggtgg agaagggcta 1620caagatggat gcccccgacg gctgcccgcc cgcagtctat gaagtcatga agaactgctg 1680gcacctggac gccgccatgc ggccctcctt cctacagctc cgagagcagc ttgagcacat 1740caaaacccac gagctgcacc tgtgacggct ggcctccgcc tgggtcatgg gcctgtgggg 1800actgaacctg gaagatcatg gacctggtgc ccctgctcac tgggcccgag cctgaactga 1860gccccagcgg gctggcgggc ctttttcctg cgtcccagcc tgcacccctc 
cggccccgtc 1920tctcttggac ccacctgtgg ggcctgggga gcccactgag gggccaggga ggaaggaggc 1980cacggagcgg gcggcagcgc cccaccacgt cgggcttccc tggcctcccg ccactcgcct 2040tcttagagtt ttattccttt ccttttttga gatttttttt ccgtgtgttt attttttatt 2100atttttcaag ataaggagaa agaaagtacc cagcaaatgg gcattttaca agaagtacga 2160atcttatttt tcctgtcctg cccgtgaggt gggggggacc gggcccctct ctagggaccc 2220ctcgccccag cctcattccc cattctgtgt cccatgtccc gtgtctcctc ggtcgccccg 2280tgtttgcgct tgaccatgtt gcactgtttg catgcgcccg aggcagacgt ctgtcagggg 2340cttggatttc gtgtgccgct gccacccgcc cacccgcctt gtgagatgga atcgtaataa 2400accacgccat gaggaaaaaa 24203232253DNAHomo sapiens 323ggaagacttg ggtccttggg tcgcaggtgg gagccgacgg gtgggtagac cgtgggggat 60atctcagtgg cggacgagga cggcggggac aaggggcggc tggtcggagt ggcggagcgt 120caagtcccct gtcggttcct ccgtccctga gtgtccttgg cgctgccttg tgcccgccca 180gcgcctttgc atccgctcct gggcaccgag gcgccctgta ggatactgct tgttacttat 240tacagctaga ggcatcatgg accgatctaa agaaaactgc atttcaggac ctgttaaggc 300tacagctcca gttggaggtc caaaacgtgt tctcgtgact cagcaaattc cttgtcagaa 360tccattacct gtaaatagtg gccaggctca gcgggtcttg tgtccttcaa attcttccca 420gcgcgttcct ttgcaagcac aaaagcttgt ctccagtcac aagccggttc agaatcagaa 480gcagaagcaa ttgcaggcaa ccagtgtacc tcatcctgtc tccaggccac tgaataacac 540ccaaaagagc aagcagcccc tgccatcggc acctgaaaat aatcctgagg aggaactggc 600atcaaaacag aaaaatgaag aatcaaaaaa gaggcagtgg gctttggaag actttgaaat 660tggtcgccct ctgggtaaag gaaagtttgg taatgtttat ttggcaagag aaaagcaaag 720caagtttatt ctggctctta aagtgttatt taaagctcag ctggagaaag ccggagtgga 780gcatcagctc agaagagaag tagaaataca gtcccacctt cggcatccta atattcttag 840actgtatggt tatttccatg atgctaccag agtctaccta attctggaat atgcaccact 900tggaacagtt tatagagaac ttcagaaact ttcaaagttt gatgagcaga gaactgctac 960ttatataaca gaattggcaa atgccctgtc ttactgtcat tcgaagagag ttattcatag 1020agacattaag ccagagaact tacttcttgg atcagctgga gagcttaaaa ttgcagattt 1080tgggtggtca gtacatgctc catcttccag gaggaccact ctctgtggca ccctggacta 1140cctgccccct gaaatgattg aaggtcggat gcatgatgag aaggtggatc 
tctggagcct 1200tggagttctt tgctatgaat ttttagttgg gaagcctcct tttgaggcaa acacatacca 1260agagacctac aaaagaatat cacgggttga attcacattc cctgactttg taacagaggg 1320agccagggac ctcatttcaa gactgttgaa gcataatccc agccagaggc caatgctcag 1380agaagtactt gaacacccct ggatcacagc aaattcatca aaaccatcaa attgccaaaa 1440caaagaatca gctagcaaac agtcttagga atcgtgcagg gggagaaatc cttgagccag 1500ggctgccata taacctgaca ggaacatgct actgaagttt attttaccat tgactgctgc 1560cctcaatcta gaacgctaca caagaaatat ttgttttact cagcaggtgt gccttaacct 1620ccctattcag aaagctccac atcaataaac atgacactct gaagtgaaag tagccacgag 1680aattgtgcta cttatactgg ttcataatct ggaggcaagg ttcgactgca gccgccccgt 1740cagcctgtgc taggcatggt gtcttcacag gaggcaaatc cagagcctgg ctgtggggaa 1800agtgaccact ctgccctgac cccgatcagt taaggagctg tgcaataacc ttcctagtac 1860ctgagtgagt gtgtaactta ttgggttggc gaagcctggt aaagctgttg gaatgagtat 1920gtgattcttt ttaagtatga aaataaagat atatgtacag acttgtattt tttctctggt 1980ggcattcctt taggaatgct gtgtgtctgt ccggcacccc ggtaggcctg attgggtttc 2040tagtcctcct taaccactta tctcccatat gagagtgtga aaaataggaa cacgtgctct 2100acctccattt agggatttgc ttgggataca gaagaggcca tgtgtctcag agctgttaag 2160ggcttatttt tttaaaacat tggagtcata gcatgtgtgt aaactttaaa tatgcaaata 2220aataagtatc tatgtctaaa aaaaaaaaaa aaa 22533241619DNAHomo sapiens 324ccgccagatt tgaatcgcgg gacccgttgg cagaggtggc ggcggcggca tgggtgcccc 60gacgttgccc cctgcctggc agccctttct caaggaccac cgcatctcta cattcaagaa 120ctggcccttc ttggagggct gcgcctgcac cccggagcgg atggccgagg ctggcttcat 180ccactgcccc actgagaacg agccagactt ggcccagtgt ttcttctgct tcaaggagct 240ggaaggctgg gagccagatg acgaccccat agaggaacat aaaaagcatt cgtccggttg 300cgctttcctt tctgtcaaga agcagtttga agaattaacc cttggtgaat ttttgaaact 360ggacagagaa agagccaaga acaaaattgc aaaggaaacc aacaataaga agaaagaatt 420tgaggaaact gcgaagaaag tgcgccgtgc catcgagcag ctggctgcca tggattgagg 480cctctggccg gagctgcctg gtcccagagt ggctgcacca cttccagggt ttattccctg 540gtgccaccag ccttcctgtg ggccccttag caatgtctta ggaaaggaga tcaacatttt 600caaattagat gtttcaactg tgctcctgtt 
ttgtcttgaa agtggcacca gaggtgcttc 660tgcctgtgca gcgggtgctg ctggtaacag tggctgcttc tctctctctc tctctttttt 720gggggctcat ttttgctgtt ttgattcccg ggcttaccag gtgagaagtg agggaggaag 780aaggcagtgt cccttttgct agagctgaca gctttgttcg cgtgggcaga gccttccaca 840gtgaatgtgt ctggacctca tgttgttgag gctgtcacag tcctgagtgt ggacttggca 900ggtgcctgtt gaatctgagc tgcaggttcc ttatctgtca cacctgtgcc tcctcagagg 960acagtttttt tgttgttgtg tttttttgtt tttttttttt ggtagatgca tgacttgtgt 1020gtgatgagag aatggagaca gagtccctgg ctcctctact gtttaacaac atggctttct 1080tattttgttt gaattgttaa ttcacagaat agcacaaact acaattaaaa ctaagcacaa 1140agccattcta agtcattggg gaaacggggt gaacttcagg tggatgagga gacagaatag 1200agtgatagga agcgtctggc agatactcct tttgccactg ctgtgtgatt agacaggccc 1260agtgagccgc ggggcacatg ctggccgctc ctccctcaga aaaaggcagt ggcctaaatc 1320ctttttaaat gacttggctc gatgctgtgg gggactggct gggctgctgc aggccgtgtg 1380tctgtcagcc caaccttcac atctgtcacg ttctccacac gggggagaga cgcagtccgc 1440ccaggtcccc gctttctttg gaggcagcag ctcccgcagg gctgaagtct ggcgtaagat 1500gatggatttg attcgccctc ctccctgtca tagagctgca gggtggattg ttacagcttc 1560gctggaaacc tctggaggtc atctcggctg ttcctgagaa ataaaaagcc tgtcatttc 16193255010DNAHomo sapiens 325ggcggctcgg gacggaggac gcgctagtgt gagtgcgggc ttctagaact acaccgaccc 60tcgtgtcctc
ccttcatcct gcggggctgg ctggagcggc cgctccggtg ctgtccagca 120gccataggga gccgcacggg gagcgggaaa gcggtcgcgg ccccaggcgg ggcggccggg 180atggagcggg gccgcgagcc tgtggggaag gggctgtggc ggcgcctcga gcggctgcag 240gttcttctgt gtggcagttc agaatgatgg atcaagctag atcagcattc tctaacttgt 300ttggtggaga accattgtca tatacccggt tcagcctggc tcggcaagta gatggcgata 360acagtcatgt ggagatgaaa cttgctgtag atgaagaaga aaatgctgac aataacacaa 420aggccaatgt cacaaaacca aaaaggtgta gtggaagtat ctgctatggg actattgctg 480tgatcgtctt tttcttgatt ggatttatga ttggctactt gggctattgt aaaggggtag 540aaccaaaaac tgagtgtgag agactggcag gaaccgagtc tccagtgagg gaggagccag 600gagaggactt ccctgcagca cgtcgcttat attgggatga cctgaagaga aagttgtcgg 660agaaactgga cagcacagac ttcaccagca ccatcaagct gctgaatgaa aattcatatg 720tccctcgtga ggctggatct caaaaagatg aaaatcttgc gttgtatgtt gaaaatcaat 780ttcgtgaatt taaactcagc aaagtctggc gtgatcaaca ttttgttaag attcaggtca 840aagacagcgc tcaaaactcg gtgatcatag ttgataagaa cggtagactt gtttacctgg 900tggagaatcc tgggggttat gtggcgtata gtaaggctgc aacagttact ggtaaactgg 960tccatgctaa ttttggtact aaaaaagatt ttgaggattt atacactcct gtgaatggat 1020ctatagtgat tgtcagagca gggaaaatca cctttgcaga aaaggttgca aatgctgaaa 1080gcttaaatgc aattggtgtg ttgatataca tggaccagac taaatttccc attgttaacg 1140cagaactttc attctttgga catgctcatc tggggacagg tgacccttac acacctggat 1200tcccttcctt caatcacact cagtttccac catctcggtc atcaggattg cctaatatac 1260ctgtccagac aatctccaga gctgctgcag aaaagctgtt tgggaatatg gaaggagact 1320gtccctctga ctggaaaaca gactctacat gtaggatggt aacctcagaa agcaagaatg 1380tgaagctcac tgtgagcaat gtgctgaaag agataaaaat tcttaacatc tttggagtta 1440ttaaaggctt tgtagaacca gatcactatg ttgtagttgg ggcccagaga gatgcatggg 1500gccctggagc tgcaaaatcc ggtgtaggca cagctctcct attgaaactt gcccagatgt 1560tctcagatat ggtcttaaaa gatgggtttc agcccagcag aagcattatc tttgccagtt 1620ggagtgctgg agactttgga tcggttggtg ccactgaatg gctagaggga tacctttcgt 1680ccctgcattt aaaggctttc acttatatta atctggataa agcggttctt ggtaccagca 1740acttcaaggt ttctgccagc ccactgttgt atacgcttat tgagaaaaca atgcaaaatg 
1800tgaagcatcc ggttactggg caatttctat atcaggacag caactgggcc agcaaagttg 1860agaaactcac tttagacaat gctgctttcc ctttccttgc atattctgga atcccagcag 1920tttctttctg tttttgcgag gacacagatt atccttattt gggtaccacc atggacacct 1980ataaggaact gattgagagg attcctgagt tgaacaaagt ggcacgagca gctgcagagg 2040tcgctggtca gttcgtgatt aaactaaccc atgatgttga attgaacctg gactatgaga 2100ggtacaacag ccaactgctt tcatttgtga gggatctgaa ccaatacaga gcagacataa 2160aggaaatggg cctgagttta cagtggctgt attctgctcg tggagacttc ttccgtgcta 2220cttccagact aacaacagat ttcgggaatg ctgagaaaac agacagattt gtcatgaaga 2280aactcaatga tcgtgtcatg agagtggagt atcacttcct ctctccctac gtatctccaa 2340aagagtctcc tttccgacat gtcttctggg gctccggctc tcacacgctg ccagctttac 2400tggagaactt gaaactgcgt aaacaaaata acggtgcttt taatgaaacg ctgttcagaa 2460accagttggc tctagctact tggactattc agggagctgc aaatgccctc tctggtgacg 2520tttgggacat tgacaatgag ttttaaatgt gatacccata gcttccatga gaacagcagg 2580gtagtctggt ttctagactt gtgctgatcg tgctaaattt tcagtagggc tacaaaacct 2640gatgttaaaa ttccatccca tcatcttggt actactagat gtctttaggc agcagctttt 2700aatacagggt agataacctg tacttcaagt taaagtgaat aaccacttaa aaaatgtcca 2760tgatggaata ttcccctatc tctagaattt taagtgcttt gtaatgggaa ctgcctcttt 2820cctgttgttg ttaatgaaaa tgtcagaaac cagttatgtg aatgatctct ctgaatccta 2880agggctggtc tctgctgaag gttgtaagtg gttcgcttac tttgagtgat cctccaactt 2940catttgatgc taaataggag ataccaggtt gaaagacctc tccaaatgag atctaagcct 3000ttccataagg aatgtagcag gtttcctcat tcctgaaaga aacagttaac tttcagaaga 3060gatgggcttg ttttcttgcc aatgaggtct gaaatggagg tccttctgct ggataaaatg 3120aggttcaact gttgattgca ggaataaggc cttaatatgt taacctcagt gtcatttatg 3180aaaagagggg accagaagcc aaagacttag tatattttct tttcctctgt cccttccccc 3240ataagcctcc atttagttct ttgttatttt tgtttcttcc aaagcacatt gaaagagaac 3300cagtttcagg tgtttagttg cagactcagt ttgtcagact ttaaagaata atatgctgcc 3360aaattttggc caaagtgtta atcttagggg agagctttct gtccttttgg cactgagata 3420tttattgttt atttatcagt gacagagttc actataaatg gtgttttttt aatagaatat 3480aattatcgga agcagtgcct tccataatta 
tgacagttat actgtcggtt ttttttaaat 3540aaaagcagca tctgctaata aaacccaaca gatactggaa gttttgcatt tatggtcaac 3600acttaagggt tttagaaaac agccgtcagc caaatgtaat tgaataaagt tgaagctaag 3660atttagagat gaattaaatt taattagggg ttgctaagaa gcgagcactg accagataag 3720aatgctggtt ttcctaaatg cagtgaattg tgaccaagtt ataaatcaat gtcacttaaa 3780ggctgtggta gtactcctgc aaaattttat agctcagttt atccaaggtg taactctaat 3840tcccatttgc aaaatttcca gtacctttgt cacaatccta acacattatc gggagcagtg 3900tcttccataa tgtataaaga acaaggtagt ttttacctac cacagtgtct gtatcggaga 3960cagtgatctc catatgttac actaagggtg taagtaatta tcgggaacag tgtttcccat 4020aattttcttc atgcaatgac atcttcaaag cttgaagatc gttagtatct aacatgtatc 4080ccaactccta taattcccta tcttttagtt ttagttgcag aaacattttg tggtcattaa 4140gcattgggtg ggtaaattca accactgtaa aatgaaatta ctacaaaatt tgaaatttag 4200cttgggtttt tgttaccttt atggtttctc caggtcctct acttaatgag atagcagcat 4260acatttataa tgtttgctat tgacaagtca ttttaattta tcacattatt tgcatgttac 4320ctcctataaa cttagtgcgg acaagtttta atccagaatt gaccttttga cttaaagcag 4380agggactttg tatagaaggt ttgggggctg tggggaagga gagtcccctg aaggtctgac 4440acgtctgcct acccattcgt ggtgatcaat taaatgtagg tatgaataag ttcgaagctc 4500cgtgagtgaa ccatcatata aacgtgtagt acagctgttt gtcatagggc agttggaaac 4560ggcctcctag ggaaaagttc atagggtctc ttcaggttct tagtgtcact tacctagatt 4620tacagcctca cttgaatgtg tcactactca cagtctcttt aatcttcagt tttatcttta 4680atctcctctt ttatcttgga ctgacattta gcgtagctaa gtgaaaaggt catagctgag 4740attcctggtt cgggtgttac gcacacgtac ttaaatgaaa gcatgtggca tgttcatcgt 4800ataacacaat atgaatacag ggcatgcatt ttgcagcagt gagtctcttc agaaaaccct 4860tttctacagt tagggttgag ttacttccta tcaagccagt acgtgctaac aggctcaata 4920ttcctgaatg aaatatcaga ctagtgacaa gctcctggtc ttgagatgtc ttctcgttaa 4980ggagtagggc cttttggagg taaaggtata 50103262574DNAHomo sapiens 326cctgtttaga cacatggaca acaatcccag cgctacaagg cacacagtcc gcttcttcgt 60cctcagggtt gccagcgctt cctggaagtc ctgaagctct cgcagtgcag tgagttcatg 120caccttcttg ccaagcctca gtctttggga tctggggagg ccgcctggtt ttcctccctc 180cttctgcacg 
tctgctgggg tctcttcctc tccaggcctt gccgtccccc tggcctctct 240tcccagctca cacatgaaga tgcacttgca aagggctctg gtggtcctgg ccctgctgaa 300ctttgccacg gtcagcctct ctctgtccac ttgcaccacc ttggacttcg gccacatcaa 360gaagaagagg gtggaagcca ttaggggaca gatcttgagc aagctcaggc tcaccagccc 420ccctgagcca acggtgatga cccacgtccc ctatcaggtc ctggcccttt acaacagcac 480ccgggagctg ctggaggaga tgcatgggga gagggaggaa ggctgcaccc aggaaaacac 540cgagtcggaa tactatgcca aagaaatcca taaattcgac atgatccagg ggctggcgga 600gcacaacgaa ctggctgtct gccctaaagg aattacctcc aaggttttcc gcttcaatgt 660gtcctcagtg gagaaaaata gaaccaacct attccgagca gaattccggg tcttgcgggt 720gcccaacccc agctctaagc ggaatgagca gaggatcgag ctcttccaga tccttcggcc 780agatgagcac attgccaaac agcgctatat cggtggcaag aatctgccca cacggggcac 840tgccgagtgg ctgtcctttg atgtcactga cactgtgcgt gagtggctgt tgagaagaga 900gtccaactta ggtctagaaa tcagcattca ctgtccatgt cacacctttc agcccaatgg 960agatatcctg gaaaacattc acgaggtgat ggaaatcaaa ttcaaaggcg tggacaatga 1020ggatgaccat ggccgtggag atctggggcg cctcaagaag cagaaggatc accacaaccc 1080tcatctaatc ctcatgatga ttcccccaca ccggctcgac aacccgggcc aggggggtca 1140gaggaagaag cgggctttgg acaccaatta ctgcttccgc aacttggagg agaactgctg 1200tgtgcgcccc ctctacattg acttccgaca ggatctgggc tggaagtggg tccatgaacc 1260taagggctac tatgccaact tctgctcagg cccttgccca tacctccgca gtgcagacac 1320aacccacagc acggtgctgg gactgtacaa cactctgaac cctgaagcat ctgcctcgcc 1380ttgctgcgtg ccccaggacc tggagcccct gaccatcctg tactatgttg ggaggacccc 1440caaagtggag cagctctcca acatggtggt gaagtcttgt aaatgtagct gagaccccac 1500gtgcgacaga gagaggggag agagaaccac cactgcctga ctgcccgctc ctcgggaaac 1560acacaagcaa caaacctcac tgagaggcct ggagcccaca accttcggct ccgggcaaat 1620ggctgagatg gaggtttcct tttggaacat ttctttcttg ctggctctga gaatcacggt 1680ggtaaagaaa gtgtgggttt ggttagagga aggctgaact cttcagaaca cacagacttt 1740ctgtgacgca gacagagggg atggggatag aggaaaggga tggtaagttg agatgttgtg 1800tggcaatggg atttgggcta ccctaaaggg agaaggaagg gcagagaatg gctgggtcag 1860ggccagactg gaagacactt cagatctgag gttggatttg ctcattgctg 
taccacatct 1920gctctaggga atctggatta tgttatacaa ggcaagcatt ttttttttta aagacaggtt 1980acgaagacaa agtcccagaa ttgtatctca tactgtctgg gattaagggc aaatctatta 2040cttttgcaaa ctgtcctcta catcaattaa catcgtgggt cactacaggg agaaaatcca 2100ggtcatgcag ttcctggccc atcaactgta ttgggccttt tggatatgct gaacgcagaa 2160gaaagggtgg aaatcaaccc tctcctgtct gccctctggg tccctcctct cacctctccc 2220tcgatcatat ttccccttgg acacttggtt agacgccttc caggtcagga tgcacatttc 2280tggattgtgg ttccatgcag ccttggggca ttatgggtct tcccccactt cccctccaag 2340accctgtgtt catttggtgt tcctggaagc aggtgctaca acatgtgagg cattcgggga 2400agctgcacat gtgccacaca gtgacttggc cccagacgca tagactgagg tataaagaca 2460agtatgaata ttactctcaa aatctttgta taaataaata tttttggggc atcctggatg 2520atttcatctt ctggaatatt gtttctagaa cagtaaaagc cttattctaa ggtg 25743271421DNAHomo sapiens 327acttactgcg ggacggcctt ggagagtact cgggttcgtg aacttcccgg aggcgcaatg 60agctgcatta acctgcccac tgtgctgccc ggctccccca gcaagacccg ggggcagatc 120caggtgattc tcgggccgat gttctcagga aaaagcacag agttgatgag acgcgtccgt 180cgcttccaga ttgctcagta caagtgcctg gtgatcaagt atgccaaaga cactcgctac 240agcagcagct tctgcacaca tgaccggaac accatggagg cgctgcccgc ctgcctgctc 300cgagacgtgg cccaggaggc cctgggcgtg gctgtcatag gcatcgacga ggggcagttt 360ttccctgaca tcatggagtt ctgcgaggcc atggccaacg ccgggaagac cgtaattgtg 420gctgcactgg atgggacctt ccagaggaag ccatttgggg ccatcctgaa cctggtgccg 480ctggccgaga gcgtggtgaa gctgacggcg gtgtgcatgg agtgcttccg ggaagccgcc 540tataccaaga ggctcggcac agagaaggag gtcgaggtga ttgggggagc agacaagtac 600cactccgtgt gtcggctctg ctacttcaag aaggcctcag gccagcctgc cgggccggac 660aacaaagaga actgcccagt gccaggaaag ccaggggaag ccgtggctgc caggaagctc 720tttgccccac agcagattct gcaatgcagc cctgccaact gagggacctg caagggccgc 780ccgctccctt cctgccactg ccgcctactg gacgctgccc tgcatgctgc ccagccactc 840caggaggaag tcgggaggcg tggagggtga ccacaccttg gccttctggg aactctcctt 900tgtgtggctg ccccacctgc cgcatgctcc ctcctctcct acccactggt ctgcttaaag 960cttccctctc agctgctggg acgatcgccc aggctggagc tggccccgct tggtggcctg 1020ggatctggca cactccctct 
ccttggggtg agggacagag ccccacgctg ttgacatcag 1080cctgcttctt cccctctgcg gctttcactg ctgagtttct gttctccctg ggaagcctgt 1140gccagcacct ttgagccttg gcccacactg aggcttaggc ctctctgcct gggatgggct 1200cccaccctcc cctgaggatg gcctggattc acgccctctt gtttcctttt gggctcaaag 1260cccttcctac ctctggtgat ggtttccaca ggaacaacag catctttcac caagatgggt 1320ggcaccaacc ttgctgggac ttggatccca ggggcttatc tcttcaagtg tggagagggc 1380agggtccacg cctctgctgt agcttatgaa attaactaat t 14213284604DNAHomo sapiens 328ggaacagctt gtccacccgc cggccggacc agaagccttt gggtctgaag tgtctgtgag 60acctcacaga agagcacccc tgggctccac ttacctgccc cctgctcctt cagggatgga 120ggcaatggcg gccagcactt ccctgcctga ccctggagac tttgaccgga acgtgccccg 180gatctgtggg gtgtgtggag accgagccac tggctttcac ttcaatgcta tgacctgtga 240aggctgcaaa ggcttcttca ggcgaagcat gaagcggaag gcactattca cctgcccctt 300caacggggac tgccgcatca ccaaggacaa ccgacgccac tgccaggcct gccggctcaa 360acgctgtgtg gacatcggca tgatgaagga gttcattctg acagatgagg aagtgcagag 420gaagcgggag atgatcctga agcggaagga ggaggaggcc ttgaaggaca gtctgcggcc 480caagctgtct gaggagcagc agcgcatcat tgccatactg ctggacgccc accataagac 540ctacgacccc acctactccg acttctgcca gttccggcct ccagttcgtg tgaatgatgg 600tggagggagc catccttcca ggcccaactc cagacacact cccagcttct ctggggactc 660ctcctcctcc tgctcagatc actgtatcac ctcttcagac atgatggact cgtccagctt 720ctccaatctg gatctgagtg aagaagattc agatgaccct tctgtgaccc tagagctgtc 780ccagctctcc atgctgcccc acctggctga cctggtcagt tacagcatcc aaaaggtcat 840tggctttgct aagatgatac caggattcag agacctcacc tctgaggacc agatcgtact 900gctgaagtca agtgccattg aggtcatcat gttgcgctcc aatgagtcct tcaccatgga 960cgacatgtcc tggacctgtg gcaaccaaga ctacaagtac cgcgtcagtg acgtgaccaa 1020agccggacac agcctggagc tgattgagcc cctcatcaag ttccaggtgg gactgaagaa 1080gctgaacttg catgaggagg agcatgtcct gctcatggcc atctgcatcg tctccccaga 1140tcgtcctggg gtgcaggacg ccgcgctgat tgaggccatc caggaccgcc tgtccaacac 1200actgcagacg tacatccgct gccgccaccc gcccccgggc agccacctgc tctatgccaa 1260gatgatccag aagctagccg acctgcgcag cctcaatgag gagcactcca agcagtaccg 
1320ctgcctctcc ttccagcctg agtgcagcat gaagctaacg ccccttgtgc tcgaagtgtt 1380tggcaatgag atctcctgac taggacagcc tgtgcggtgc ctgggtgggg ctgctcctcc 1440agggccacgt gccaggcccg gggctggcgg ctactcagca gccctcctca cccgtctggg 1500gttcagcccc tcctctgcca cctcccctat ccacccagcc cattctctct cctgtccaac 1560ctaacccctt tcctgcgggc ttttccccgg tcccttgaga cctcagccat gaggagttgc 1620tgtttgtttg acaaagaaac ccaagtgggg gcagagggca gaggctggag gcaggccttg 1680cccagagatg cctccaccgc tgcctaagtg gctgctgact gatgttgagg gaacagacag 1740gagaaatgca tccattcctc agggacagag acacctgcac ctccccccac tgcaggcccc 1800gcttgtccag cgcctagtgg ggtctccctc tcctgcctta ctcacgataa ataatcggcc 1860cacagctccc accccacccc cttcagtgcc caccaacatc ccattgccct ggttatattc 1920tcacgggcag tagctgtggt gaggtgggtt ttcttcccat cactggagca ccaggcacga 1980acccacctgc tgagagaccc aaggaggaaa aacagacaaa aacagcctca cagaagaata 2040tgacagctgt ccctgtcacc aagctcacag ttcctcgccc tgggtctaag gggttggttg 2100aggtggaagc cctccttcca cggatccatg tagcaggact gaattgtccc cagtttgcag 2160aaaagcacct gccgacctcg tcctccccct gccagtgcct tacctcctgc ccaggagagc 2220cagccctccc tgtcctcctc ggatcaccga gagtagccga gagcctgctc ccccaccccc 2280tccccagggg agagggtctg gagaagcagt gagccgcatc ttctccatct ggcagggtgg 2340gatggaggag aagaattttc agaccccagc ggctgagtca tgatctccct gccgcctcaa 2400tgtggttgca aggccgctgt tcaccacagg gctaagagct aggctgccgc accccagagt 2460gtgggaaggg agagcggggc agtctcgggt ggctagtcag agagagtgtt tgggggttcc 2520gtgatgtagg gtaaggtgcc ttcttattct cactccacca cccaaaagtc aaaaggtgcc 2580tgtgaggcag gggcggagtg atacaacttc aagtgcatgc tctctgcagg tcgagcccag 2640cccagctggt gggaagcgtc tgtccgttta ctccaaggtg ggtctttgtg agagtgagct 2700gtaggtgtgc gggaccggta cagaaaggcg ttcttcgagg tggatcacag aggcttcttc 2760agatcaatgc ttgagtttgg aatcggccgc attccctgag tcaccaggaa tgttaaagtc 2820agtgggaacg tgactgcccc aactcctgga agctgtgtcc ttgcacctgc atccgtagtt 2880ccctgaaaac ccagagagga atcagacttc acactgcaag agccttggtg tccacctggc 2940cccatgtctc tcagaattct tcaggtggaa aaacatctga aagccacgtt ccttactgca 3000gaatagcata tatatcgctt aatcttaaat 
ttattagata tgagttgttt tcagactcag 3060actccatttg tattatagtc taatatacag ggtagcaggt accactgatt tggagatatt 3120tatgggggga gaacttacat tgtgaaactt ctgtacatta attattattg ctgttgttat 3180tttacaaggg tctagggaga gacccttgtt tgattttagc tgcagaactg tattggtcca 3240gcttgctctt cagtgggaga aaaacacttg taagttgcta aacgagtcaa tcccctcatt 3300caggaaaact gacagaggag ggcgtgactc acccaagcca tatataacta gctagaagtg 3360ggccaggaca ggccgggcgc ggtggctcac gcctgtaatc ccagcagttt gggaggtcga 3420ggtaggtgga tcacctgagg tcgggagttc gagaccaacc tgaccaacat ggagaaaccc 3480tgtctctatt aaaaatacaa aaaaaaaaaa aaaaaaaaat agccgggcat ggtggcgcaa 3540gcctgtaatc ccagctactc aggaggctga ggcagaagaa ttgaacccag gaggtggagg 3600ttgcagtgag ctgagatcgt gccgttactc tccaacctgg acaacaagag cgaaactccg 3660tcttagaagt ggaccaggac aggaccagat tttggagtca tggtccggtg tccttttcac 3720tacaccatgt ttgagctcag acccccactc tcattcccca ggtggctgac ccagtccctg 3780ggggaagccc tggatttcag aaagagccaa gtctggatct gggacccttt ccttccttcc 3840ctggcttgta actccaccaa gcccatcaga aggagaagga aggagactca cctctgcctc 3900aatgtgaatc agaccctacc ccaccacgat gtgccctggc tgctgggctc tccacctcag 3960gccttggata atgctgttgc ctcatctata acatgcattt gtctttgtaa tgtcaccacc 4020ttcccagctc tccctctggc cctgcttctt cggggaactc ctgaaatatc agttactcag 4080ccctgggccc caccacctag gccactcctc caaaggaagt ctaggagctg ggaggaaaag 4140aaaagagggg aaaatgagtt tttatggggc tgaacgggga gaaaaggtca tcatcgattc 4200tactttagaa tgagagtgtg aaatagacat ttgtaaatgt aaaactttta aggtatatca 4260ttataactga aggagaaggt gccccaaaat gcaagatttt ccacaagatt cccagagaca 4320ggaaaatcct ctggctggct aactggaagc atgtaggaga atccaagcga ggtcaacaga 4380gaaggcagga atgtgtggca gatttagtga aagctagaga tatggcagcg aaaggatgta 4440aacagtgcct gctgaatgat ttccaaagag aaaaaaagtt tgccagaagt ttgtcaagtc 4500aaccaatgta gaaagctttg cttatggtaa taaaaatggc tcatacttat atagcactta 4560ctttgtttgc aagtactgct gtaaataaat gctttatgca aacc 46043292076DNAHomo sapiens 329cggggaaggg gagggaggag ggggacgagg gctctggcgg gtttggaggg gctgaacatc 60gcggggtgtt ctggtgtccc ccgccccgcc tctccaaaaa gctacaccga cgcggaccgc 
120ggcggcgtcc tccctcgccc tcgcttcacc tcgcgggctc cgaatgcggg gagctcggat 180gtccggtttc ctgtgaggct tttacctgac acccgccgcc tttccccggc actggctggg 240agggcgccct gcaaagttgg gaacgcggag ccccggaccc gctcccgccg cctccggctc 300gcccaggggg ggtcgccggg aggagcccgg gggagaggga ccaggagggg cccgcggcct 360cgcaggggcg cccgcgcccc cacccctgcc cccgccagcg gaccggtccc ccacccccgg 420tccttccacc atgcacttgc tgggcttctt ctctgtggcg tgttctctgc tcgccgctgc 480gctgctcccg ggtcctcgcg aggcgcccgc cgccgccgcc gccttcgagt ccggactcga 540cctctcggac gcggagcccg acgcgggcga ggccacggct tatgcaagca aagatctgga 600ggagcagtta cggtctgtgt ccagtgtaga tgaactcatg actgtactct acccagaata 660ttggaaaatg tacaagtgtc agctaaggaa aggaggctgg caacataaca gagaacaggc 720caacctcaac tcaaggacag aagagactat aaaatttgct gcagcacatt ataatacaga 780gatcttgaaa agtattgata atgagtggag aaagactcaa tgcatgccac gggaggtgtg 840tatagatgtg gggaaggagt ttggagtcgc gacaaacacc ttctttaaac ctccatgtgt 900gtccgtctac agatgtgggg gttgctgcaa tagtgagggg ctgcagtgca tgaacaccag 960cacgagctac ctcagcaaga cgttatttga aattacagtg cctctctctc aaggccccaa 1020accagtaaca atcagttttg ccaatcacac ttcctgccga tgcatgtcta aactggatgt 1080ttacagacaa gttcattcca ttattagacg ttccctgcca gcaacactac cacagtgtca 1140ggcagcgaac aagacctgcc ccaccaatta catgtggaat aatcacatct gcagatgcct 1200ggctcaggaa gattttatgt tttcctcgga tgctggagat gactcaacag atggattcca 1260tgacatctgt ggaccaaaca aggagctgga tgaagagacc tgtcagtgtg tctgcagagc 1320ggggcttcgg
cctgccagct gtggacccca caaagaacta gacagaaact catgccagtg 1380tgtctgtaaa aacaaactct tccccagcca atgtggggcc aaccgagaat ttgatgaaaa 1440cacatgccag tgtgtatgta aaagaacctg ccccagaaat caacccctaa atcctggaaa 1500atgtgcctgt gaatgtacag aaagtccaca gaaatgcttg ttaaaaggaa agaagttcca 1560ccaccaaaca tgcagctgtt acagacggcc atgtacgaac cgccagaagg cttgtgagcc 1620aggattttca tatagtgaag aagtgtgtcg ttgtgtccct tcatattgga aaagaccaca 1680aatgagctaa gattgtactg ttttccagtt catcgatttt ctattatgga aaactgtgtt 1740gccacagtag aactgtctgt gaacagagag acccttgtgg gtccatgcta acaaagacaa 1800aagtctgtct ttcctgaacc atgtggataa ctttacagaa atggactgga gctcatctgc 1860aaaaggcctc ttgtaaagac tggttttctg ccaatgacca aacagccaag attttcctct 1920tgtgatttct ttaaaagaat gactatataa tttatttcca ctaaaaatat tgtttctgca 1980ttcattttta tagcaacaac aattggtaaa actcactgtg atcaatattt ttatatcatg 2040caaaatatgt ttaaaataaa atgaaaattg tattat 20763302819DNAHomo sapiens 330ctgggcccag ctcccccgag aggtggtcgg atcctctggg ctgctcggtc gatgcctgtg 60ccactgacgt ccaggcatga ggtggttcct gccctggacg ctggcagcag tgacagcagc 120agccgccagc accgtcctgg ccacggccct ctctccagcc cctacgacca tggactttac 180cccagctcca ctggaggaca cctcctcacg cccccaattc tgcaagtggc catgtgagtg 240cccgccatcc ccaccccgct gcccgctggg ggtcagcctc atcacagatg gctgtgagtg 300ctgtaagatg tgcgctcagc agcttgggga caactgcacg gaggctgcca tctgtgaccc 360ccaccggggc ctctactgtg actacagcgg ggaccgcccg aggtacgcaa taggagtgtg 420tgcacaggtg gtcggtgtgg gctgcgtcct ggatggggtg cgctacaaca acggccagtc 480cttccagcct aactgcaagt acaactgcac gtgcatcgac ggcgcggtgg gctgcacacc 540actgtgcctc cgagtgcgcc ccccgcgtct ctggtgcccc cacccgcggc gcgtgagcat 600acctggccac tgctgtgagc agtgggtatg tgaggacgac gccaagaggc cacgcaagac 660cgcaccccgt gacacaggag ccttcgatgc tgtgggtgag gtggaggcat ggcacaggaa 720ctgcatagcc tacacaagcc cctggagccc ttgctccacc agctgcggcc tgggggtctc 780cactcggatc tccaatgtta acgcccagtg ctggcctgag caagagagcc gcctctgcaa 840cttgcggcca tgcgatgtgg acatccatac actcattaag gcagggaaga agtgtctggc 900tgtgtaccag ccagaggcat ccatgaactt cacacttgcg ggctgcatca gcacacgctc 
960ctatcaaccc aagtactgtg gagtttgcat ggacaatagg tgctgcatcc cctacaagtc 1020taagactatc gacgtgtcct tccagtgtcc tgatgggctt ggcttctccc gccaggtcct 1080atggattaat gcctgcttct gtaacctgag ctgtaggaat cccaatgaca tctttgctga 1140cttggaatcc taccctgact tctcagaaat tgccaactag gcaggcacaa atcttgggtc 1200ttggggacta acccaatgcc tgtgaagcag tcagccctta tggccaataa cttttcacca 1260atgagcctta gttaccctga tctggaccct tggcctccat ttctgtctct aaccattcaa 1320atgacgcctg atggtgctgc tcaggcccat gctatgagtt ttctccttga tatcattcag 1380catctactct aaagaaaaat gcctgtctct agctgttctg gactacaccc aagcctgatc 1440cagcctttcc aagtcactag aagtcctgct ggatcttgcc taaatcccaa gaaatggaat 1500caggtagact tttaatatca ctaatttctt ctttagatgc caaaccacaa gactctttgg 1560gtccattcag atgaatagat ggaatttgga acaatagaat aatctattat ttggagcctg 1620ccaagaggta ctgtaatggg taattctgac gtcagcgcac caaaactatc ctgattccaa 1680atatgtatgc acctcaaggt catcaaacat ttgccaagtg agttgaatag ttgcttaatt 1740ttgattttta atggaaagtt gtatccatta acctgggcat tgttgaggtt aagtttctct 1800tcacccctac actgtgaagg gtacagatta ggtttgtccc agtcagaaat aaaatttgat 1860aaacattcct gttgatggga aaagccccca gttaatactc cagagacagg gaaaggtcag 1920cccgtttcag aaggaccaat tgactctcac actgaatcag ctgctgactg gcagggcttt 1980gggcagttgg ccaggctctt ccttgaatct tctcccttgt cctgcttggg gttcatagga 2040attggtaagg cctctggact ggcctgtctg gcccctgaga gtggtgccct ggaacactcc 2100tctactctta cagagccttg agagacccag ctgcagacca tgccagaccc actgaaatga 2160ccaagacagg ttcaggtagg ggtgtgggtc aaaccaagaa gtgggtgccc ttggtagcag 2220cctggggtga cctctagagc tggaggctgt gggactccag gggcccccgt gttcaggaca 2280catctattgc agagactcat ttcacagcct ttcgttctgc tgaccaaatg gccagttttc 2340tggtaggaag atggaggttt accggttgtt tagaaacaga aatagactta ataaaggttt 2400aaagctgaag aggttgaagc taaaaggaaa aggttgttgt taatgaatat caggctatta 2460tttattgtat taggaaaata taatatttac tgttagaatt cttttattta gggccttttc 2520tgtgccagac attgctctca gtgctttgca tgtattagct cactgaatct tcacgacaat 2580gttgagaagt tcccattatt atttctgttc ttacaaatgt gaaacggaag ctcatagagg 2640tgagaaaact caaccagagt cacccagttg 
gtgactggga aagttaggat tcagatcgaa 2700attggactgt ctttataacc catattttcc ccctgttttt agagcttcca aatgtgtcag 2760aataggaaaa cattgcaata aatggcttga ttttttaaaa aaaaaaaaaa aaaaaaaaa 28193312540DNAHomo sapiens 331gaaaaggtgg acaagtccta ttttcaagag aagatgactt ttaacagttt tgaaggatct 60aaaacttgtg tacctgcaga catcaataag gaagaagaat ttgtagaaga gtttaataga 120ttaaaaactt ttgctaattt tccaagtggt agtcctgttt cagcatcaac actggcacga 180gcagggtttc tttatactgg tgaaggagat accgtgcggt gctttagttg tcatgcagct 240gtagatagat ggcaatatgg agactcagca gttggaagac acaggaaagt atccccaaat 300tgcagattta tcaacggctt ttatcttgaa aatagtgcca cgcagtctac aaattctggt 360atccagaatg gtcagtacaa agttgaaaac tatctgggaa gcagagatca ttttgcctta 420gacaggccat ctgagacaca tgcagactat cttttgagaa ctgggcaggt tgtagatata 480tcagacacca tatacccgag gaaccctgcc atgtattgtg aagaagctag attaaagtcc 540tttcagaact ggccagacta tgctcaccta accccaagag agttagcaag tgctggactc 600tactacacag gtattggtga ccaagtgcag tgcttttgtt gtggtggaaa actgaaaaat 660tgggaacctt gtgatcgtgc ctggtcagaa cacaggcgac actttcctaa ttgcttcttt 720gttttgggcc ggaatcttaa tattcgaagt gaatctgatg ctgtgagttc tgataggaat 780ttcccaaatt caacaaatct tccaagaaat ccatccatgg cagattatga agcacggatc 840tttacttttg ggacatggat atactcagtt aacaaggagc agcttgcaag agctggattt 900tatgctttag gtgaaggtga taaagtaaag tgctttcact gtggaggagg gctaactgat 960tggaagccca gtgaagaccc ttgggaacaa catgctaaat ggtatccagg gtgcaaatat 1020ctgttagaac agaagggaca agaatatata aacaatattc atttaactca ttcacttgag 1080gagtgtctgg taagaactac tgagaaaaca ccatcactaa ctagaagaat tgatgatacc 1140atcttccaaa atcctatggt acaagaagct atacgaatgg ggttcagttt caaggacatt 1200aagaaaataa tggaggaaaa aattcagata tctgggagca actataaatc acttgaggtt 1260ctggttgcag atctagtgaa tgctcagaaa gacagtatgc aagatgagtc aagtcagact 1320tcattacaga aagagattag tactgaagag cagctaaggc gcctgcaaga ggagaagctt 1380tgcaaaatct gtatggatag aaatattgct atcgtttttg ttccttgtgg acatctagtc 1440acttgtaaac aatgtgctga agcagttgac aagtgtccca tgtgctacac agtcattact 1500ttcaagcaaa aaatttttat gtcttaatct aactctatag taggcatgtt 
atgttgttct 1560tattaccctg attgaatgtg tgatgtgaac tgactttaag taatcaggat tgaattccat 1620tagcatttgc taccaagtag gaaaaaaaat gtacatggca gtgttttagt tggcaatata 1680atctttgaat ttcttgattt ttcagggtat tagctgtatt atccattttt tttactgtta 1740tttaattgaa accatagact aagaataaga agcatcatac tataactgaa cacaatgtgt 1800attcatagta tactgattta atttctaagt gtaagtgaat taatcatctg gattttttat 1860tcttttcaga taggcttaac aaatggagct ttctgtatat aaatgtggag attagagtta 1920atctccccaa tcacataatt tgttttgtgt gaaaaaggaa taaattgttc catgctggtg 1980gaaagataga gattgttttt agaggttggt tgttgtgttt taggattctg tccattttct 2040tgtaaaggga taaacacgga cgtgtgcgaa atatgtttgt aaagtgattt gccattgttg 2100aaagcgtatt taatgataga atactatcga gccaacatgt actgacatgg aaagatgtca 2160gagatatgtt aagtgtaaaa tgcaagtggc gggacactat gtatagtctg agccagatca 2220aagtatgtat gttgttaata tgcatagaac gagagatttg gaaagatata caccaaactg 2280ttaaatgtgg tttctcttcg gggagggggg gattggggga ggggccccag aggggtttta 2340gaggggcctt ttcactttcg acttttttca ttttgttctg ttcggatttt ttataagtat 2400gtagaccccg aagggtttta tgggaactaa catcagtaac ctaacccccg tgactatcct 2460gtgctcttcc tagggagctg tgttgtttcc cacccaccac ccttccctct gaacaaatgc 2520ctgagtgctg gggcactttg 25403321474DNAHomo sapiens 332aaaaagaaat caagaatgca attttattta caatagtcac gccggaaata cctagaaata 60aatttaactg aggatgtaaa agacctctac aaggagagtt caatgcgtag cgggagcgga 120gagctgaccc cagagagccc tgggcagccc cacctccgcc gccggcctag ttaccatcac 180accccggaga gcccgcagct gccgcagccg gccccagtca ccatcaccgc aaccatgagc 240agcgaggccg agacccagca gccgcccgcc gccccccccg ccgcccccgc cctcagcgcc 300gccgacacca agcccggcac taccggagcg gcgcagggag cggtggcccg ggcggctcac 360atcggcggcg ctggcgcggg cgacaagaag gtcatcgcaa cgaaggtttt gggaacagta 420aaatggttca atgtaaggaa cggatatggt ttcatcaaca ggaatgacac caaggaagat 480gtatttgtac accagactgc cataaagaag aataacccca ggaagtacct tcgcagtgta 540ggagatggag agactgtgga gtttgatgtt gttgaaggag aaaagggtgc ggaggcagca 600aatgttacag gtcctggtgg tgttccagtt caaggcagta aatatgcagc agaccgtaac 660cattatagac gctatccacg tcgtaggggt cctccacgca attaccagca 
aaattaccag 720aatagtgaga gtggggaaaa gaacgaggga tcggagagtg ctcccgaagc caggcccaac 780aacgccggcc ctacgcaggc gaaggttccc accttactac atgcggagac ctatgggcgt 840cgaccacagt attccaaccc tcctgtgcag ggagaagtga tggagggtgc tgacaaccag 900ggtgcaggag aacaaggtag accagtgagg cagatatgta tcggggatat agaccacgat 960tccgcagggg ccctcctcgc caaaagacag cctagagagg acggcaatga agaagataaa 1020gaaaatcaag gagatgagac ccaaggtcag cagccacctc aagctcggta ccgccgcaac 1080ttcaattacc gacgcagacg cccagaaaac cctaaaccac aagatggcaa agagacaaaa 1140gcagccgatc caccagctga gaattcgtcc gctcccgagg ctgagcaggg cggggctgag 1200taaatgccgg cttaccatct ctaccatcat ccggtttagt catccaacaa gaagaaatat 1260gaaattccag caataagaaa tgaacaaaag attggagctg aagacctaaa gtgcttgctt 1320tttgcccgtt gaccagataa atagaactat ctgcattatc tatgcagcat ggggttttta 1380ttatgtttta cctaaagacg tctctttttg gtaataacaa accgtgtttt ttaaaaaagc 1440ctggtttttc tcaatacgcc tttaaaggaa ttcc 14743334079DNAHomo sapiens 333ggagcggcgg gcgggcggga gggctggcgg ggcgaacgtc tgggagacgt ctgaaagacc 60aacgagactt tggagaccag agacgcgcct ggggggacct ggggcttggg gcgtgcgaga 120tttcccttgc attcgctggg agctcgcgca gggatcgtcc catggccggg gctcggagcc 180gcgacccttg gggggcctcc gggatttgct acctttttgg ctccctgctc gtcgaactgc 240tcttctcacg ggctgtcgcc ttcaatctgg acgtgatggg tgccttgcgc aaggagggcg 300agccaggcag cctcttcggc ttctctgtgg ccctgcaccg gcagttgcag ccccgacccc 360agagctggct gctggtgggt gctccccagg ccctggctct tcctgggcag caggcgaatc 420gcactggagg cctcttcgct tgcccgttga gcctggagga gactgactgc tacagagtgg 480acatcgacca gggagctgat atgcaaaagg aaagcaagga gaaccagtgg ttgggagtca 540gtgttcggag ccaggggcct gggggcaaga ttgttacctg tgcacaccga tatgaggcaa 600ggcagcgagt ggaccagatc ctggagacgc gggatatgat tggtcgctgc tttgtgctca 660gccaggacct ggccatccgg gatgagttgg atggtgggga atggaagttc tgtgagggac 720gcccccaagg ccatgaacaa tttgggttct gccagcaggg cacagctgcc gccttctccc 780ctgatagcca ctacctcctc tttggggccc caggaaccta taattggaag gggttgcttt 840ttgtgaccaa cattgatagc tcagaccccg accagctggt gtataaaact ttggaccctg 900ctgaccggct cccaggacca gccggagact tggccctcaa 
tagctactta ggcttctcta 960ttgactcggg gaaaggtctg gtgcgtgcag aagagctgag ctttgtggct ggagcccccc 1020gcgccaacca caagggtgct gtggttatcc tgcgcaagga cagcgccagt cgcctggtgc 1080ccgaggttat gctgtctggg gagcgcctga cctccggctt tggctactca ctggctgtgg 1140ctgacctcaa cagtgatggc tggccagacc tgatagtggg tgccccctac ttctttgagc 1200gccaagaaga gctggggggt gctgtgtatg tgtacttgaa ccaggggggt cactgggctg 1260ggatctcccc tctccggctc tgcggctccc ctgactccat gttcgggatc agcctggctg 1320tcctggggga cctcaaccaa gatggctttc cagatattgc agtgggtgcc ccctttgatg 1380gtgatgggaa agtcttcatc taccatggga gcagcctggg ggttgtcgcc aaaccttcac 1440aggtgctgga gggcgaggct gtgggcatca agagcttcgg ctactccctg tcaggcagct 1500tggatatgga tgggaaccaa taccctgacc tgctggtggg ctccctggct gacaccgcag 1560tgctcttcag ggccagaccc atcctccatg tctcccatga ggtctctatt gctccacgaa 1620gcatcgacct ggagcagccc aactgtgctg gcggccactc ggtctgtgtg gacctaaggg 1680tctgtttcag ctacattgca gtccccagca gctatagccc tactgtggcc ctggactatg 1740tgttagatgc ggacacagac cggaggctcc ggggccaggt tccccgtgtg acgttcctga 1800gccgtaacct ggaagaaccc aagcaccagg cctcgggcac cgtgtggctg aagcaccagc 1860atgaccgagt ctgtggagac gccatgttcc agctccagga aaatgtcaaa gacaagcttc 1920gggccattgt agtgaccttg tcctacagtc tccagacccc tcggctccgg cgacaggctc 1980ctggccaggg gctgcctcca gtggccccca tcctcaatgc ccaccagccc agcacccagc 2040gggcagagat ccacttcctg aagcaaggct gtggtgaaga caagatctgc cagagcaatc 2100tgcagctggt ccacgcccgc ttctgtaccc gggtcagcga cacggaattc caacctctgc 2160ccatggatgt ggatggaaca acagccctgt ttgcactgag tgggcagcca gtcattggcc 2220tggagctgat ggtcaccaac ctgccatcgg acccagccca gccccaggct gatggggatg 2280atgcccatga agcccagctc ctggtcatgc ttcctgactc actgcactac tcaggggtcc 2340gggccctgga ccctgcggag aagccactct gcctgtccaa tgagaatgcc tcccatgttg 2400agtgtgagct ggggaacccc atgaagagag gtgcccaggt caccttctac ctcatcctta 2460gcacctccgg gatcagcatt gagaccacgg aactggaggt agagctgctg ttggccacga 2520tcagtgagca ggagctgcat ccagtctctg cacgagcccg tgtcttcatt gagctgccac 2580tgtccattgc aggaatggcc attccccagc aactcttctt ctctggtgtg gtgaggggcg 2640agagagccat 
gcagtctgag cgggatgtgg gcagcaaggt caagtatgag gtcacggttt 2700ccaaccaagg ccagtcgctc agaaccctgg gctctgcctt cctcaacatc atgtggcctc 2760atgagattgc caatgggaag tggttgctgt acccaatgca ggttgagctg gagggcgggc 2820aggggcctgg gcagaaaggg ctttgctctc ccaggcccaa catcctccac ctggatgtgg 2880acagtaggga taggaggcgg cgggagctgg agccacctga gcagcaggag cctggtgagc 2940ggcaggagcc cagcatgtcc tggtggccag tgtcctctgc tgagaagaag aaaaacatca 3000ccctggactg cgcccggggc acggccaact gtgtggtgtt cagctgccca ctctacagct 3060ttgaccgcgc ggctgtgctg catgtctggg gccgtctctg gaacagcacc tttctggagg 3120agtactcagc tgtgaagtcc ctggaagtga ttgtccgggc caacatcaca gtgaagtcct 3180ccataaagaa cttgatgctc cgagatgcct ccacagtgat cccagtgatg gtatacttgg 3240accccatggc tgtggtggca gaaggagtgc cctggtgggt catcctcctg gctgtactgg 3300ctgggctgct ggtgctagca ctgctggtgc tgctcctgtg gaagatggga ttcttcaaac 3360gggcgaagca ccccgaggcc accgtgcccc agtaccatgc ggtgaagatt cctcgggaag 3420accgacagca gttcaaggag gagaagacgg gcaccatcct gaggaacaac tggggcagcc 3480cccggcggga gggcccggat gcacacccca tcctggctgc tgacgggcat cccgagctgg 3540gccccgatgg gcatccaggg ccaggcaccg cctaggttcc catgtcccag cctggcctgt 3600ggctgccctc catcccttcc ccagagatgg ctccttggga tgaagagggt agagtgggct 3660gctggtgtcg catcaagatt tggcaggatc ggcttcctca ggggcacaga cctctcccac 3720ccacaagaac tcctcccacc caacttcccc ttagagtgct gtgagatgag agtgggtaaa 3780tcagggacag ggccatgggg tagggtgaga agggcagggg tgtcctgatg caaaggtggg 3840gagaagggat cctaatccct tcctctccca ttcaccctgt gtaacaggac cccaaggacc 3900tgcctccccg gaagtgcctt aacctagagg gtcggggagg aggttgtgtc actgactcag 3960gctgctcctt ctctagtttc ccctctcatc tgaccttagt ttgctgccat cagtctagtg 4020gtttcgtggt ttcgtctatt tattaaaaaa tatttgagaa caaaaaaaaa aaaaaaaaa 40793343373DNAHomo sapiens 334ggtggcaact tctcctcctg cggccgggag cggcctgcct gcctccctgc gcacccgcag 60cctcccccgc tgcctcccta gggctcccct ccggccgcca gcgcccattt ttcattccct 120agatagagat actttgcgcg cacacacata catacgcgcg caaaaaggaa aaaaaaaaaa 180aaaagcccac cctccagcct cgctgcaaag agaaaaccgg agcagccgca gctcgcagct 240cgcagctcgc agcccgcagc 
ccgcagagga cgcccagagc ggcgagcagg cgggcagacg 300gaccgacgga ctcgcgccgc gtccacctgt cggccgggcc cagccgagcg cgcagcgggc 360acgccgcgcg cgcggagcag ccgtgcccgc cgcccgggcc cgccgccagg gcgcacacgc 420tcccgccccc ctacccggcc cgggcgggag tttgcacctc tccctgcccg ggtgctcgag 480ctgccgttgc aaagccaact ttggaaaaag ttttttgggg gagacttggg ccttgaggtg 540cccagctccg cgctttccga ttttgggggc ctttccagaa aatgttgcaa aaaagctaag 600ccggcgggca gaggaaaacg cctgtagccg gcgagtgaag acgaaccatc gactgccgtg 660ttccttttcc tcttggaggt tggagtcccc tgggcgcccc cacacggcta gacgcctcgg 720ctggttcgcg acgcagcccc ccggccgtgg atgctgcact cgggctcggg atccgcccag 780gtagccggcc tcggacccag gtcctgcgcc caggtcctcc cctgcccccc agcgacggag 840ccggggccgg gggcggcggc gccgggggca tgcgggtgag ccgcggctgc agaggcctga 900gcgcctgatc gccgcggacc tgagccgagc ccacccccct ccccagcccc ccaccctggc 960cgcgggggcg gcgcgctcga tctacgcgtc cggggccccg cggggccggg cccggagtcg 1020gcatgaatcg ctgctgggcg ctcttcctgt ctctctgctg ctacctgcgt ctggtcagcg 1080ccgaggggga ccccattccc gaggagcttt atgagatgct gagtgaccac tcgatccgct 1140cctttgatga tctccaacgc ctgctgcacg gagaccccgg agaggaagat ggggccgagt 1200tggacctgaa catgacccgc tcccactctg gaggcgagct ggagagcttg gctcgtggaa 1260gaaggagcct gggttccctg accattgctg agccggccat gatcgccgag tgcaagacgc 1320gcaccgaggt gttcgagatc tcccggcgcc tcatagaccg caccaacgcc aacttcctgg 1380tgtggccgcc ctgtgtggag gtgcagcgct gctccggctg ctgcaacaac cgcaacgtgc 1440agtgccgccc cacccaggtg cagctgcgac ctgtccaggt gagaaagatc gagattgtgc 1500ggaagaagcc aatctttaag aaggccacgg tgacgctgga agaccacctg gcatgcaagt 1560gtgagacagt ggcagctgca cggcctgtga cccgaagccc ggggggttcc caggagcagc 1620gagccaaaac gccccaaact cgggtgacca ttcggacggt gcgagtccgc cggcccccca 1680agggcaagca ccggaaattc aagcacacgc atgacaagac ggcactgaag gagacccttg 1740gagcctaggg gcatcggcag gagagtgtgt gggcagggtt atttaatatg gtatttgctg 1800tattgccccc atggggtcct tggagtgata atattgtttc cctcgtccgt ctgtctcgat 1860gcctgattcg gacggccaat ggtgcttccc ccacccctcc acgtgtccgt ccacccttcc 1920atcagcgggt ctcctcccag cggcctccgg tcttgcccag cagctcaaag aagaaaaaga 
1980aggactgaac tccatcgcca tcttcttccc ttaactccaa gaacttggga taagagtgtg 2040agagagactg atggggtcgc tctttggggg aaacgggttc cttcccctgc acctggcctg 2100ggccacacct gagcgctgtg gactgtcctg aggagccctg aggacctctc agcatagcct 2160gcctgatccc tgaacccctg gccagctctg aggggaggca cctccaggca ggccaggctg 2220cctcggactc catggctaag accacagacg ggcacacaga ctggagaaaa cccctcccac 2280ggtgcccaaa caccagtcac ctcgtctccc tggtgcctct gtgcacagtg gcttcttttc 2340gttttcgttt tgaagacgtg gactcctctt ggtgggtgtg gccagcacac caagtggctg 2400ggtgccctct caggtgggtt agagatggag tttgctgttg aggtggtgta gatggtgacc 2460tgggtatccc ctgcctcctg ccaccccttc ctccccatac tccactctga ttcacctctt 2520cctctggttc ctttcatctc tctacctcca ccctgcattt tcctcttgtc ctggcccttc 2580agtctgctcc accaaggggc tcttgaaccc cttattaagg ccccagatga ccccagtcac 2640tcctctctag ggcagaagac tagaggccag ggcagcaagg gacctgctca tcatattcca 2700acccagccac gactgccatg taaggttgtg cagggtgtgt actgcacaag gacattgtat 2760gcagggagca ctgttcacat catagataaa gctgatttgt atatttatta tgacaatttc 2820tggcagatgt aggtaaagag gaaaaggatc cttttcctaa ttcacacaaa gactccttgt 2880ggactggctg tgcccctgat gcagcctgtg gctggagtgg ccaaatagga gggagactgt 2940ggtaggggca gggaggcaac actgctgtcc acatgacctc catttcccaa agtcctctgc 3000tccagcaact gcccttccag gtgggtgtgg gacacctggg agaaggtctc caagggaggg 3060tgcagccctc ttgcccgcac ccctccctgc ttgcacactt ccccatcttt gatccttctg 3120agctccacct ctggtggctc ctcctaggaa accagctcgt gggctgggaa tgggggagag
3180aagggaaaag atccccaaga ccccctgggg tgggatctga gctcccacct cccttcccac 3240ctactgcact ttcccccttc ccgccttcca aaacctgctt ccttcagttt gtaaagtcgg 3300tgattatatt tttgggggct ttccttttat tttttaaatg taaaatttat ttatattccg 3360tatttaaagt tgt 33733352304DNAHomo sapiens 335gtccccgcag cgccgtcgcg ccctcctgcc gcaggccacc gaggccgccg ccgtctagcg 60ccccgacctc gccaccatga gagccctgct ggcgcgcctg cttctctgcg tcctggtcgt 120gagcgactcc aaaggcagca atgaacttca tcaagttcca tcgaactgtg actgtctaaa 180tggaggaaca tgtgtgtcca acaagtactt ctccaacatt cactggtgca actgcccaaa 240gaaattcgga gggcagcact gtgaaataga taagtcaaaa acctgctatg aggggaatgg 300tcacttttac cgaggaaagg ccagcactga caccatgggc cggccctgcc tgccctggaa 360ctctgccact gtccttcagc aaacgtacca tgcccacaga tctgatgctc ttcagctggg 420cctggggaaa cataattact gcaggaaccc agacaaccgg aggcgaccct ggtgctatgt 480gcaggtgggc ctaaagccgc ttgtccaaga gtgcatggtg catgactgcg cagatggaaa 540aaagccctcc tctcctccag aagaattaaa atttcagtgt ggccaaaaga ctctgaggcc 600ccgctttaag attattgggg gagaattcac caccatcgag aaccagccct ggtttgcggc 660catctacagg aggcaccggg ggggctctgt cacctacgtg tgtggaggca gcctcatcag 720cccttgctgg gtgatcagcg ccacacactg cttcattgat tacccaaaga aggaggacta 780catcgtctac ctgggtcgct caaggcttaa ctccaacacg caaggggaga tgaagtttga 840ggtggaaaac ctcatcctac acaaggacta cagcgctgac acgcttgctc accacaacga 900cattgccttg ctgaagatcc gttccaagga gggcaggtgt gcgcagccat cccggactat 960acagaccatc tgcctgccct cgatgtataa cgatccccag tttggcacaa gctgtgagat 1020cactggcttt ggaaaagaga attctaccga ctatctctat ccggagcagc tgaaaatgac 1080tgttgtgaag ctgatttccc accgggagtg tcagcagccc cactactacg gctctgaagt 1140caccaccaaa atgctatgtg ctgctgaccc ccaatggaaa acagattcct gccagggaga 1200ctcaggggga cccctcgtct gttccctcca aggccgcatg actttgactg gaattgtgag 1260ctggggccgt ggatgtgccc tgaaggacaa gccaggcgtc tacacgagag tctcacactt 1320cttaccctgg atccgcagtc acaccaagga agagaatggc ctggccctct gagggtcccc 1380agggaggaaa cgggcaccac ccgctttctt gctggttgtc atttttgcag tagagtcatc 1440tccatcagct gtaagaagag actgggaaga taggctctgc acagatggat ttgcctgtgg 
1500caccaccagg gtgaacgaca atagctttac cctcacggat aggcctgggt gctggctgcc 1560cagaccctct ggccaggatg gaggggtggt cctgactcaa catgttactg accagcaact 1620tgtctttttc tggactgaag cctgcaggag ttaaaaaggg cagggcatct cctgtgcatg 1680ggctcgaagg gagagccagc tcccccgacc ggtgggcatt tgtgaggccc atggttgaga 1740aatgaataat ttcccaatta ggaagtgtaa gcagctgagg tctcttgagg gagcttagcc 1800aatgtgggag cagcggtttg gggagcagag acactaacga cttcagggca gggctctgat 1860attccatgaa tgtatcagga aatatatatg tgtgtgtatg tttgcacact tgttgtgtgg 1920gctgtgagtg taagtgtgag taagagctgg tgtctgattg ttaagtctaa atatttcctt 1980aaactgtgtg gactgtgatg ccacacagag tggtctttct ggagaggtta taggtcactc 2040ctggggcctc ttgggtcccc cacgtgacag tgcctgggaa tgtacttatt ctgcagcatg 2100acctgtgacc agcactgtct cagtttcact ttcacataga tgtccctttc ttggccagtt 2160atcccttcct tttagcctag ttcatccaat cctcactggg tggggtgagg accactcctt 2220acactgaata tttatatttc actattttta tttatatttt tgtaatttta aataaaagtg 2280atcaataaaa tgtgattttt ctga 23043361876DNAHomo sapiens 336cgcggccgcg gttcgctgtg gcgggcgcct gggccgccgg ctgtttaact tcgcttccgc 60tggcccatag tgatctttgc agtgacccag cagcatcact gtttcttggc gtgtgaagat 120aacccaagga attgaggaag ttgctgagaa gagtgtgctg gagatgctct aggaaaaaat 180tgaatagtga gacgagttcc agcgcaaggg tttctggttt gccaagaaga aagtgaacat 240catggatcag aacaacagcc tgccacctta cgctcagggc ttggcctccc ctcagggtgc 300catgactccc ggaatcccta tctttagtcc aatgatgcct tatggcactg gactgacccc 360acagcctatt cagaacacca atagtctgtc tattttggaa gagcaacaaa ggcagcagca 420gcaacaacaa cagcagcagc agcagcagca gcagcagcaa cagcaacagc agcagcagca 480gcagcagcag cagcagcagc agcagcagca gcagcagcag caacaggcag tggcagctgc 540agccgttcag cagtcaacgt cccagcaggc aacacaggga acctcaggcc aggcaccaca 600gctcttccac tcacagactc tcacaactgc acccttgccg ggcaccactc cactgtatcc 660ctcccccatg actcccatga cccccatcac tcctgccacg ccagcttcgg agagttctgg 720gattgtaccg cagctgcaaa atattgtatc cacagtgaat cttggttgta aacttgacct 780aaagaccatt gcacttcgtg cccgaaacgc cgaatataat cccaagcggt ttgctgcggt 840aatcatgagg ataagagagc cacgaaccac ggcactgatt ttcagttctg 
ggaaaatggt 900gtgcacagga gccaagagtg aagaacagtc cagactggca gcaagaaaat atgctagagt 960tgtacagaag ttgggttttc cagctaagtt cttggacttc aagattcaga acatggtggg 1020gagctgtgat gtgaagtttc ctataaggtt agaaggcctt gtgctcaccc accaacaatt 1080tagtagttat gagccagagt tatttcctgg tttaatctac agaatgatca aacccagaat 1140tgttctcctt atttttgttt ctggaaaagt tgtattaaca ggtgctaaag tcagagcaga 1200aatttatgaa gcatttgaaa acatctaccc tattctaaag ggattcagga agacgacgta 1260atggctctca tgtacccttg cctcccccac ccccttcttt tttttttttt aaacaaatca 1320gtttgttttg gtacctttaa atggtggtgt tgtgagaaga tggatgttga gttgcagggt 1380gtggcaccag gtgatgccct tctgtaagtg cccaccgcgg gatgccggga aggggcatta 1440tttgtgcact gagaacaccg cgcagcgtga ctgtgagttg ctcataccgt gctgctatct 1500gggcagcgct gcccatttat ttatatgtag attttaaaca ctgctgttga caagttggtt 1560tgagggagaa aactttaagt gttaaagcca cctctataat tgattggact ttttaatttt 1620aatgtttttc cccatgaacc acagttttta tatttctacc agaaaagtaa aaatcttttt 1680taaaagtgtt gtttttctaa tttataactc ctaggggtta tttctgtgcc agacacattc 1740cacctctcca gtattgcagg acggaatata tgtgttaatg aaaatgaatg gctgtacata 1800tttttttctt tcttcagagt actctgtaca ataaatgcag tttataaaag tgttaaaaaa 1860aaaaaaaaaa aaaaaa 18763376633DNAHomo sapiens 337ttctccccgc cccccagttg ttgtcgaagt ctgggggttg ggactggacc ccctgattgc 60gtaagagcaa aaagcgaagg cgcaatctgg acactgggag attcggagcg cagggagttt 120gagagaaact tttattttga agagaccaag gttgaggggg ggcttatttc ctgacagcta 180tttacttaga gcaaatgatt agttttagaa ggatggacta taacattgaa tcaattacaa 240aacgcggttt ttgagcccat tactgttgga gctacaggga gagaaacagg aggagactgc 300aagagatcat ttgggaaggc cgtgggcacg ctctttactc catgtgtggg acattcattg 360cggaataaca tcggaggaga agtttcccag agctatgggg acttcccatc cggcgttcct 420ggtcttaggc tgtcttctca cagggctgag cctaatcctc tgccagcttt cattaccctc 480tatccttcca aatgaaaatg aaaaggttgt gcagctgaat tcatcctttt ctctgagatg 540ctttggggag agtgaagtga gctggcagta ccccatgtct gaagaagaga gctccgatgt 600ggaaatcaga aatgaagaaa acaacagcgg cctttttgtg acggtcttgg aagtgagcag 660tgcctcggcg gcccacacag ggttgtacac ttgctattac aaccacactc 
agacagaaga 720gaatgagctt gaaggcaggc acatttacat ctatgtgcca gacccagatg tagcctttgt 780acctctagga atgacggatt atttagtcat cgtggaggat gatgattctg ccattatacc 840ttgtcgcaca actgatcccg agactcctgt aaccttacac aacagtgagg gggtggtacc 900tgcctcctac gacagcagac agggctttaa tgggaccttc actgtagggc cctatatctg 960tgaggccacc gtcaaaggaa agaagttcca gaccatccca tttaatgttt atgctttaaa 1020agcaacatca gagctggatc tagaaatgga agctcttaaa accgtgtata agtcagggga 1080aacgattgtg gtcacctgtg ctgtttttaa caatgaggtg gttgaccttc aatggactta 1140ccctggagaa gtgaaaggca aaggcatcac aatgctggaa gaaatcaaag tcccatccat 1200caaattggtg tacactttga cggtccccga ggccacggtg aaagacagtg gagattacga 1260atgtgctgcc cgccaggcta ccagggaggt caaagaaatg aagaaagtca ctatttctgt 1320ccatgagaaa ggtttcattg aaatcaaacc caccttcagc cagttggaag ctgtcaacct 1380gcatgaagtc aaacattttg ttgtagaggt gcgggcctac ccacctccca ggatatcctg 1440gctgaaaaac aatctgactc tgattgaaaa tctcactgag atcaccactg atgtggaaaa 1500gattcaggaa ataaggtatc gaagcaaatt aaagctgatc cgtgctaagg aagaagacag 1560tggccattat actattgtag ctcaaaatga agatgctgtg aagagctata cttttgaact 1620gttaactcaa gttccttcat ccattctgga cttggtcgat gatcaccatg gctcaactgg 1680gggacagacg gtgaggtgca cagctgaagg cacgccgctt cctgatattg agtggatgat 1740atgcaaagat attaagaaat gtaataatga aacttcctgg actattttgg ccaacaatgt 1800ctcaaacatc atcacggaga tccactcccg agacaggagt accgtggagg gccgtgtgac 1860tttcgccaaa gtggaggaga ccatcgccgt gcgatgcctg gctaagaatc tccttggagc 1920tgagaaccga gagctgaagc tggtggctcc caccctgcgt tctgaactca cggtggctgc 1980tgcagtcctg gtgctgttgg tgattgtgat catctcactt attgtcctgg ttgtcatttg 2040gaaacagaaa ccgaggtatg aaattcgctg gagggtcatt gaatcaatca gcccggatgg 2100acatgaatat atttatgtgg acccgatgca gctgccttat gactcaagat gggagtttcc 2160aagagatgga ctagtgcttg gtcgggtctt ggggtctgga gcgtttggga aggtggttga 2220aggaacagcc tatggattaa gccggtccca acctgtcatg aaagttgcag tgaagatgct 2280aaaacccacg gccagatcca gtgaaaaaca agctctcatg tctgaactga agataatgac 2340tcacctgggg ccacatttga acattgtaaa cttgctggga gcctgcacca agtcaggccc 2400catttacatc atcacagagt 
attgcttcta tggagatttg gtcaactatt tgcataagaa 2460tagggatagc ttcctgagcc accacccaga gaagccaaag aaagagctgg atatctttgg 2520attgaaccct gctgatgaaa gcacacggag ctatgttatt ttatcttttg aaaacaatgg 2580tgactacatg gacatgaagc aggctgatac tacacagtat gtccccatgc tagaaaggaa 2640agaggtttct aaatattccg acatccagag atcactctat gatcgtccag cctcatataa 2700gaagaaatct atgttagact cagaagtcaa aaacctcctt tcagatgata actcagaagg 2760ccttacttta ttggatttgt tgagcttcac ctatcaagtt gcccgaggaa tggagttttt 2820ggcttcaaaa aattgtgtcc accgtgatct ggctgctcgc aacgtcctcc tggcacaagg 2880aaaaattgtg aagatctgtg actttggcct ggccagagac atcatgcatg attcgaacta 2940tgtgtcgaaa ggcagtacct ttctgcccgt gaagtggatg gctcctgaga gcatctttga 3000caacctctac accacactga gtgatgtctg gtcttatggc attctgctct gggagatctt 3060ttcccttggt ggcacccctt accccggcat gatggtggat tctactttct acaataagat 3120caagagtggg taccggatgg ccaagcctga ccacgctacc agtgaagtct acgagatcat 3180ggtgaaatgc tggaacagtg agccggagaa gagaccctcc ttttaccacc tgagtgagat 3240tgtggagaat ctgctgcctg gacaatataa aaagagttat gaaaaaattc acctggactt 3300cctgaagagt gaccatcctg ctgtggcacg catgcgtgtg gactcagaca atgcatacat 3360tggtgtcacc tacaaaaacg aggaagacaa gctgaaggac tgggagggtg gtctggatga 3420gcagagactg agcgctgaca gtggctacat cattcctctg cctgacattg accctgtccc 3480tgaggaggag gacctgggca agaggaacag acacagctcg cagacctctg aagagagtgc 3540cattgagacg ggttccagca gttccacctt catcaagaga gaggacgaga ccattgaaga 3600catcgacatg atggacgaca tcggcataga ctcttcagac ctggtggaag acagcttcct 3660gtaactggcg gattcgaggg gttccttcca cttctggggc cacctctgga tcccgttcag 3720aaaaccactt tattgcaatg cggaggttga gaggaggact tggttgatgt ttaaagagaa 3780gttcccagcc aagggcctcg gggagcgttc taaatatgaa tgaatgggat attttgaaat 3840gaactttgtc agtgttgcct ctcgcaatgc ctcagtagca tctcagtggt gtgtgaagtt 3900tggagataga tggataaggg aataataggc cacagaaggt gaactttgtg cttcaaggac 3960attggtgaga gtccaacaga cacaatttat actgcgacag aacttcagca ttgtaattat 4020gtaaataact ctaaccaagg ctgtgtttag attgtattaa ctatcttctt tggacttctg 4080aagagaccac tcaatccatc catgtacttc cctcttgaaa cctgatgtca 
gctgctgttg 4140aactttttaa agaagtgcat gaaaaaccat ttttgaacct taaaaggtac tggtactata 4200gcattttgct atctttttta gtgttaagag ataaagaata ataattaacc aaccttgttt 4260aatagatttg ggtcatttag aagcctgaca actcattttc atattgtaat ctatgtttat 4320aatactacta ctgttatcag taatgctaaa tgtgtaataa tgtaacatga tttccctcca 4380gagaaagcac aatttaaaac aatccttact aagtaggtga tgagtttgac agtttttgac 4440atttatatta aataacatgt ttctctataa agtatggtaa tagctttagt gaattaaatt 4500tagttgagca tagagaacaa agtaaaagta gtgttgtcca ggaagtcaga atttttaact 4560gtactgaata ggttccccaa tccatcgtat taaaaaacaa ttaactgccc tctgaaataa 4620tgggattaga aacaaacaaa actcttaagt cctaaaagtt ctcaatgtag aggcataaac 4680ctgtgctgaa cataacttct catgtatatt acccaatgga aaatataatg atcagcaaaa 4740agactggatt tgcagaagtt tttttttttt ttcttcatgc ctgatgaaag ctttggcaac 4800cccaatatat gtattttttg aatctatgaa cctgaaaagg gtcagaagga tgcccagaca 4860tcagcctcct tctttcaccc cttaccccaa agagaaagag tttgaaactc gagaccataa 4920agatattctt tagtggaggc tggatgtgca ttagcctgga tcctcagttc tcaaatgtgt 4980gtggcagcca ggatgactag atcctgggtt tccatccttg agattctgaa gtatgaagtc 5040tgagggaaac cagagtctgt atttttctaa actccctggc tgttctgatc ggccagtttt 5100cggaaacact gacttaggtt tcaggaagtt gccatgggaa acaaataatt tgaactttgg 5160aacagggttg gaattcaacc acgcaggaag cctactattt aaatccttgg cttcaggtta 5220gtgacattta atgccatcta gctagcaatt gcgaccttaa tttaactttc cagtcttagc 5280tgaggctgag aaagctaaag tttggttttg acaggttttc caaaagtaaa gatgctactt 5340cccactgtat gggggagatt gaactttccc cgtctcccgt cttctgcctc ccactccata 5400ccccgccaag gaaaggcatg tacaaaaatt atgcaattca gtgttccaag tctctgtgta 5460accagctcag tgttttggtg gaaaaaacat tttaagtttt actgataatt tgaggttaga 5520tgggaggatg aattgtcaca tctatccaca ctgtcaaaca ggttggtgtg ggttcattgg 5580cattctttgc aatactgctt aattgctgat accatatgaa tgaaacatgg gctgtgatta 5640ctgcaatcac tgtgctatcg gcagatgatg ctttggaaga tgcagaagca ataataaagt 5700acttgactac ctactggtgt aatctcaatg caagccccaa ctttcttatc caactttttc 5760atagtaagtg cgaagactga gccagattgg ccaattaaaa acgaaaacct gactaggttc 5820tgtagagcca attagacttg 
aaatacgttt gtgtttctag aatcacagct caagcattct 5880gtttatcgct cactctccct tgtacagcct tattttgttg gtgctttgca ttttgatatt 5940gctgtgagcc ttgcatgaca tcatgaggcc ggatgaaact tctcagtcca gcagtttcca 6000gtcctaacaa atgctcccac ctgaatttgt atatgactgc atttgtgggt gtgtgtgtgt 6060tttcagcaaa ttccagattt gtttcctttt ggcctcctgc aaagtctcca gaagaaaatt 6120tgccaatctt tcctactttc tatttttatg atgacaatca aagccggcct gagaaacact 6180atttgtgact ttttaaacga ttagtgatgt ccttaaaatg tggtctgcca atctgtacaa 6240aatggtccta tttttgtgaa gagggacata agataaaatg atgttataca tcaatatgta 6300tatatgtatt tctatataga cttggagaat actgccaaaa catttatgac aagctgtatc 6360actgccttcg tttatatttt tttaactgtg ataatcccca caggcacatt aactgttgca 6420cttttgaatg tccaaaattt atattttaga aataataaaa agaaagatac ttacatgttc 6480ccaaaacaat ggtgtggtga atgtgtgaga aaaactaact tgatagggtc taccaataca 6540aaatgtatta cgaatgcccc tgttcatgtt tttgttttaa aacgtgtaaa tgaagatctt 6600tatatttcaa taaatgatat ataatttaaa gtt 6633338994DNAHomo sapiens 338tgctggccag cacctcgagg gaagatggcg gacgaggaga agctgccgcc cggctgggag 60aagcgcatga gccgcagctc aggccgagtg tactacttca accacatcac taacgccagc 120cagtgggagc ggcccagcgg caacagcagc agtggtggca aaaacgggca gggggagcct 180gccagggtcc gctgctcgca cctgctggtg aagcacagcc agtcacggcg gccctcgtcc 240tggcggcagg agaagatcac ccggaccaag gaggaggccc tggagctgat caacggctac 300atccagaaga tcaagtcggg agaggaggac tttgagtctc tggcctcaca gttcagcgac 360tgcagctcag ccaaggccag gggagacctg ggtgccttca gcagaggtca gatgcagaag 420ccatttgaag acgcctcgtt tgcgctgcgg acgggggaga tgagcgggcc cgtgttcacg 480gattccggca tccacatcat cctccgcact gagtgagggt ggggagccca ggcctggcct 540cggggcaggg cagggcggct aggccggcca gctccccctt gcccgccagc cagtggccga 600accccccact ccctgccacc gtcacacagt atttattgtt cccacaatgg ctgggagggg 660gcccttccag attgggggcc ctggggtccc cactccctgt ccatccccag ttggggctgc 720gaccgccaga ttctccctta aggaattgac ttcagcaggg gtgggaggct cccagaccca 780gggcagtgtg gtgggagggg tgttccaaag agaaggcctg gtcagcagag ccgccccgtg 840tccccccagg tgctggaggc agactcgagg gccgaattgt ttctagttag gccacgctcc 900tctgttcagt 
cgcaaaggtg aacactcatg cggcagccat gggccctctg agcaactgtg 960cagacccttt cacccccaat taaacccaga acca 994339772DNAHomo sapiens 339agctcgtgcc gaattcggca cgagccgggt cggagccatg gcggtggcaa attcaagtcc 60tgttaacccc gtggtgttct ttgatgtcag tattggcggt caggaagttg gccgcatgaa 120gatcgagctc tttgcagacg ttgtgcctaa gacggccgag aactttaggc agttctgcac 180cggagaattc aggaaagatg gggttccaat aggatacaaa ggaagcacct tccacagggt 240cataaaggat ttcatgattc agggtggaga ttttgttaat ggagatggta ctggagtcgc 300cagtatttac cgggggccat ttgcagatga aaattttaaa cttagacact cagctccagg 360cctgctttcc atggcgaaca gtggtccaag tacaaatggc tgtcagttct ttatcacctg 420ctctaagtgc gattggctgg atgggaagca tgtggtgttt ggaaaaatca tcgatggact 480tctagtgatg agaaagattg agaatgttcc cacaggcccc aacaataagc ccaagctacc 540tgtggtgatc tcgcagtgtg gggagatgta gtccagacaa agactgaatc aggccttccc 600ttcttcttgg tggtgttctt gagtaagata atctggactg gcccccgtct ttgcttccct 660gcctgctgct gccccatttg atcaagagac catggaagtg tcagagattc agaatccaag 720attgtcttta agttttcaac tgtaaataaa gtttttttgt atgcgtaaaa aa 772340919DNAHomo sapiens 340cgctcgcctc cctcgctcca cgcgcgcccg gacgcggcgg ccaggcttgc gcgtggttcc 60cctcccggtg ggcggattcc tgggcaagat gaagtgggtg tgggcgctct tgctgttggc 120ggcgtgggca gcggccgagc gcgactgccg agtgagcagc ttccgagtca aggagaactt 180cgacaaggct cgcttctctg ggacctggta cgccatggcc aagaaggacc ccgagggcct 240ctttctgcag gacaacatcg tcgcggagtt ctcggtggac gagaccggcc agatgagcgc 300cacagccaag ggccgagtcc gtcttttgaa taactgggac gtgtgcgcag acatggtggg 360caccttcaca gacaccgagg accctgccaa gttcaagatg aagtactggg gcgtagcctc 420ctttctgcag aaaggaaatg atgaccactg gatcgtcgac acagactacg acacgtatgc 480cgtacagtac tcctgccgcc tcctgaacct cgatggcacc tgtgctgaca gctactcctt 540cgtgttttcc cgggacccca acggcctgcc cccagaagcg cagaagattg taaggcagcg 600gcaggaggag ctgtgcctgg ccaggcagta caggctgatc gtccacaacg gttactgcga 660tggcagatca gaaagaaacc ttttgtagca atatcaagaa tctagtttca tctgagaact 720tctgattagc tctcagtctt cagctctatt tatcttagga gtttaatttg cccttctctc 780cccatcttcc ctcagttccc ataaaacctt cattacacat aaagatacac gtgggggtca 
840gtgaatctgc ttgcctttcc tgaaagtttc tggggcttaa gattccagac tctgattcat 900taaactatag tcacccgtg 9193417365DNAHomo sapiens 341ggcagtttgt aggtcgcgag ggaagcgctg aggatcagga agggggcact gagtgtccgt 60gggggaatcc tcgtgatagg aactggaata tgccttgagg gggacactat gtctttaaaa 120acgtcggctg gtcatgaggt caggagttcc agaccagcct gaccaacgtg gtgaaactcc 180gtctctacta aaaatacaaa aattagccgg gcgtggtgcc gctccagcta ctcaggaggc 240tgaggcagga gaatcgctag aacccgggag gcggaggttg cagtgagccg agatcgcgcc 300attgcactcc agcctgggcg acagagcgag actgtctcaa aacaaaacaa aacaaaacaa 360aacaaaaaac accggctgtt cattggaaca gaaagaaatg gatttatctg ctcttcgcgt 420tgaagaagta caaaatgtca ttaatgctat gcagaaaatc ttagagtgtc ccatctgtct 480ggagttgatc aaggaacctg tctccacaaa gtgtgaccac atattttgca aattttgcat 540gctgaaactt ctcaaccaga agaaagggcc ttcacagtgt cctttatgta agaatgatat 600aaccaaaagg agcctacaag aaagtacgag atttagtcaa cttgttgaag agctattgaa 660aatcatttgt gcttttcagc ttgacacagg tttggagtat gcaaacagct ataattttgc 720aaaaaaggaa aataactctc ctgaacatct aaaagatgaa gtttctatca tccaaagtat 780gggctacaga aaccgtgcca aaagacttct acagagtgaa cccgaaaatc cttccttgca 840ggaaaccagt ctcagtgtcc aactctctaa ccttggaact gtgagaactc tgaggacaaa 900gcagcggata caacctcaaa agacgtctgt ctacattgaa ttgggatctg attcttctga 960agataccgtt aataaggcaa cttattgcag tgtgggagat
caagaattgt tacaaatcac 1020ccctcaagga accagggatg aaatcagttt ggattctgca aaaaaggctg cttgtgaatt 1080ttctgagacg gatgtaacaa atactgaaca tcatcaaccc agtaataatg atttgaacac 1140cactgagaag cgtgcagctg agaggcatcc agaaaagtat cagggtagtt ctgtttcaaa 1200cttgcatgtg gagccatgtg gcacaaatac tcatgccagc tcattacagc atgagaacag 1260cagtttatta ctcactaaag acagaatgaa tgtagaaaag gctgaattct gtaataaaag 1320caaacagcct ggcttagcaa ggagccaaca taacagatgg gctggaagta aggaaacatg 1380taatgatagg cggactccca gcacagaaaa aaaggtagat ctgaatgctg atcccctgtg 1440tgagagaaaa gaatggaata agcagaaact gccatgctca gagaatccta gagatactga 1500agatgttcct tggataacac taaatagcag cattcagaaa gttaatgagt ggttttccag 1560aagtgatgaa ctgttaggtt ctgatgactc acatgatggg gagtctgaat caaatgccaa 1620agtagctgat gtattggacg ttctaaatga ggtagatgaa tattctggtt cttcagagaa 1680aatagactta ctggccagtg atcctcatga ggctttaata tgtaaaagtg aaagagttca 1740ctccaaatca gtagagagta atattgaaga caaaatattt gggaaaacct atcggaagaa 1800ggcaagcctc cccaacttaa gccatgtaac tgaaaatcta attataggag catttgttac 1860tgagccacag ataatacaag agcgtcccct cacaaataaa ttaaagcgta aaaggagacc 1920tacatcaggc cttcatcctg aggattttat caagaaagca gatttggcag ttcaaaagac 1980tcctgaaatg ataaatcagg gaactaacca aacggagcag aatggtcaag tgatgaatat 2040tactaatagt ggtcatgaga ataaaacaaa aggtgattct attcagaatg agaaaaatcc 2100taacccaata gaatcactcg aaaaagaatc tgctttcaaa acgaaagctg aacctataag 2160cagcagtata agcaatatgg aactcgaatt aaatatccac aattcaaaag cacctaaaaa 2220gaataggctg aggaggaagt cttctaccag gcatattcat gcgcttgaac tagtagtcag 2280tagaaatcta agcccaccta attgtactga attgcaaatt gatagttgtt ctagcagtga 2340agagataaag aaaaaaaagt acaaccaaat gccagtcagg cacagcagaa acctacaact 2400catggaaggt aaagaacctg caactggagc caagaagagt aacaagccaa atgaacagac 2460aagtaaaaga catgacagcg atactttccc agagctgaag ttaacaaatg cacctggttc 2520ttttactaag tgttcaaata ccagtgaact taaagaattt gtcaatccta gccttccaag 2580agaagaaaaa gaagagaaac tagaaacagt taaagtgtct aataatgctg aagaccccaa 2640agatctcatg ttaagtggag aaagggtttt gcaaactgaa agatctgtag agagtagcag 2700tatttcattg 
gtacctggta ctgattatgg cactcaggaa agtatctcgt tactggaagt 2760tagcactcta gggaaggcaa aaacagaacc aaataaatgt gtgagtcagt gtgcagcatt 2820tgaaaacccc aagggactaa ttcatggttg ttccaaagat aatagaaatg acacagaagg 2880ctttaagtat ccattgggac atgaagttaa ccacagtcgg gaaacaagca tagaaatgga 2940agaaagtgaa cttgatgctc agtatttgca gaatacattc aaggtttcaa agcgccagtc 3000atttgctccg ttttcaaatc caggaaatgc agaagaggaa tgtgcaacat tctctgccca 3060ctctgggtcc ttaaagaaac aaagtccaaa agtcactttt gaatgtgaac aaaaggaaga 3120aaatcaagga aagaatgagt ctaatatcaa gcctgtacag acagttaata tcactgcagg 3180ctttcctgtg gttggtcaga aagataagcc agttgataat gccaaatgta gtatcaaagg 3240aggctctagg ttttgtctat catctcagtt cagaggcaac gaaactggac tcattactcc 3300aaataaacat ggacttttac aaaacccata tcgtatacca ccactttttc ccatcaagtc 3360atttgttaaa actaaatgta agaaaaatct gctagaggaa aactttgagg aacattcaat 3420gtcacctgaa agagaaatgg gaaatgagaa cattccaagt acagtgagca caattagccg 3480taataacatt agagaaaatg tttttaaaga agccagctca agcaatatta atgaagtagg 3540ttccagtact aatgaagtgg gctccagtat taatgaaata ggttccagtg atgaaaacat 3600tcaagcagaa ctaggtagaa acagagggcc aaaattgaat gctatgctta gattaggggt 3660tttgcaacct gaggtctata aacaaagtct tcctggaagt aattgtaagc atcctgaaat 3720aaaaaagcaa gaatatgaag aagtagttca gactgttaat acagatttct ctccatatct 3780gatttcagat aacttagaac agcctatggg aagtagtcat gcatctcagg tttgttctga 3840gacacctgat gacctgttag atgatggtga aataaaggaa gatactagtt ttgctgaaaa 3900tgacattaag gaaagttctg ctgtttttag caaaagcgtc cagaaaggag agcttagcag 3960gagtcctagc cctttcaccc atacacattt ggctcagggt taccgaagag gggccaagaa 4020attagagtcc tcagaagaga acttatctag tgaggatgaa gagcttccct gcttccaaca 4080cttgttattt ggtaaagtaa acaatatacc ttctcagtct actaggcata gcaccgttgc 4140taccgagtgt ctgtctaaga acacagagga gaatttatta tcattgaaga atagcttaaa 4200tgactgcagt aaccaggtaa tattggcaaa ggcatctcag gaacatcacc ttagtgagga 4260aacaaaatgt tctgctagct tgttttcttc acagtgcagt gaattggaag acttgactgc 4320aaatacaaac acccaggatc ctttcttgat tggttcttcc aaacaaatga ggcatcagtc 4380tgaaagccag ggagttggtc tgagtgacaa ggaattggtt 
tcagatgatg aagaaagagg 4440aacgggcttg gaagaaaata atcaagaaga gcaaagcatg gattcaaact taggtgaagc 4500agcatctggg tgtgagagtg aaacaagcgt ctctgaagac tgctcagggc tatcctctca 4560gagtgacatt ttaaccactc agcagaggga taccatgcaa cataacctga taaagctcca 4620gcaggaaatg gctgaactag aagctgtgtt agaacagcat gggagccagc cttctaacag 4680ctacccttcc atcataagtg actcttctgc ccttgaggac ctgcgaaatc cagaacaaag 4740cacatcagaa aaagcagtat taacttcaca gaaaagtagt gaatacccta taagccagaa 4800tccagaaggc ctttctgctg acaagtttga ggtgtctgca gatagttcta ccagtaaaaa 4860taaagaacca ggagtggaaa ggtcatcccc ttctaaatgc ccatcattag atgataggtg 4920gtacatgcac agttgctctg ggagtcttca gaatagaaac tacccatctc aagaggagct 4980cattaaggtt gttgatgtgg aggagcaaca gctggaagag tctgggccac acgatttgac 5040ggaaacatct tacttgccaa ggcaagatct agagggaacc ccttacctgg aatctggaat 5100cagcctcttc tctgatgacc ctgaatctga tccttctgaa gacagagccc cagagtcagc 5160tcgtgttggc aacataccat cttcaacctc tgcattgaaa gttccccaat tgaaagttgc 5220agaatctgcc cagagtccag ctgctgctca tactactgat actgctgggt ataatgcaat 5280ggaagaaagt gtgagcaggg agaagccaga attgacagct tcaacagaaa gggtcaacaa 5340aagaatgtcc atggtggtgt ctggcctgac cccagaagaa tttatgctcg tgtacaagtt 5400tgccagaaaa caccacatca ctttaactaa tctaattact gaagagacta ctcatgttgt 5460tatgaaaaca gatgctgagt ttgtgtgtga acggacactg aaatattttc taggaattgc 5520gggaggaaaa tgggtagtta gctatttctg ggtgacccag tctattaaag aaagaaaaat 5580gctgaatgag catgattttg aagtcagagg agatgtggtc aatggaagaa accaccaagg 5640tccaaagcga gcaagagaat cccaggacag aaagatcttc agggggctag aaatctgttg 5700ctatgggccc ttcaccaaca tgcccacaga tcaactggaa tggatggtac agctgtgtgg 5760tgcttctgtg gtgaaggagc tttcatcatt cacccttggc acaggtgtcc acccaattgt 5820ggttgtgcag ccagatgcct ggacagagga caatggcttc catgcaattg ggcagatgtg 5880tgaggcacct gtggtgaccc gagagtgggt gttggacagt gtagcactct accagtgcca 5940ggagctggac acctacctga taccccagat cccccacagc cactactgac tgcagccagc 6000cacaggtaca gagccacagg accccaagaa tgagcttaca aagtggcctt tccaggccct 6060gggagctcct ctcactcttc agtccttcta ctgtcctggc tactaaatat tttatgtaca 6120tcagcctgaa 
aaggacttct ggctatgcaa gggtccctta aagattttct gcttgaagtc 6180tcccttggaa atctgccatg agcacaaaat tatggtaatt tttcacctga gaagatttta 6240aaaccattta aacgccacca attgagcaag atgctgattc attatttatc agccctattc 6300tttctattca ggctgttgtt ggcttagggc tggaagcaca gagtggcttg gcctcaagag 6360aatagctggt ttccctaagt ttacttctct aaaaccctgt gttcacaaag gcagagagtc 6420agacccttca atggaaggag agtgcttggg atcgattatg tgacttaaag tcagaatagt 6480ccttgggcag ttctcaaatg ttggagtgga acattgggga ggaaattctg aggcaggtat 6540tagaaatgaa aaggaaactt gaaacctggg catggtggct cacgcctgta atcccagcac 6600tttgggaggc caaggtgggc agatcactgg aggtcaggag ttcgaaacca gcctggccaa 6660catggtgaaa ccccatctct actaaaaata cagaaattag ccggtcatgg tggtggacac 6720ctgtaatccc agctactcag gtggctaagg caggagaatc acttcagccc gggaggtgga 6780ggttgcagtg agccaagatc ataccacggc actccagcct gggtgacagt gagactgtgg 6840ctcaaaaaaa aaaaaaaaaa aggaaaatga aactaggaaa ggtttcttaa agtctgagat 6900atatttgcta gatttctaaa gaatgtgttc taaaacagca gaagattttc aagaaccggt 6960ttccaaagac agtcttctaa ttcctcatta gtaataagta aaatgtttat tgttgtagct 7020ctggtatata atccattcct cttaaaatat aagacctctg gcatgaatat ttcatatcta 7080taaaatgaca gatcccacca ggaaggaagc tgttgctttc tttgaggtga tttttttcct 7140ttgctccctg ttgctgaaac catacagctt cataaataat tttgcttgct gaaggaagaa 7200aaagtgtttt tcataaaccc attatccagg actgtttata gctgttggaa ggactaggtc 7260ttccctagcc cccccagtgt gcaagggcag tgaagacttg attgtacaaa atacgttttg 7320taaatgttgt gctgttaaca ctgcaaataa acttggtagc aaaca 736534210386DNAHomo sapiensunsure(0)...(0)n = a, t, c or g 342attgaggact cggaaatgag gtccaagggt agccaaggat ggctgcagct tcatatgatc 60agttgttaaa gcaagttgag gcactgaaga tggagaactc aaatcttcga caagagctag 120aagataattc caatcatctt acaaaactgg aaactgaggc atctaatatg aaggaagtac 180ttaaacaact acaaggaagt attgaagatg aagctatggc ttcttctgga cagattgatt 240tattagagcg tcttaaagag cttaacttag atagcagtaa tttccctgga gtaaaactgc 300ggtcaaaaat gtccctccgt tcttatggaa gccgggaagg atctgtatca agccgttctg 360gagagtgcag tcctgttcct atgggttcat ttccaagaag agggtttgta aatggaagca 420gagaaagtac 
tggatattta gaagaacttg agaaagagag gtcattgctt cttgctgatc 480ttgacaaaga agaaaaggaa aaagactggt attacgctca acttcagaat ctcactaaaa 540gaatagatag tcttccttta actgaaaatt tttccttaca aacagatatg accagaaggc 600aattggaata tgaagcaagg caaatcagag ttgcgatgga agaacaacta ggtacctgcc 660aggatatgga aaaacgagca cagcgaagaa tagccagaat tcagcaaatc gaaaaggaca 720tacttcgtat acgacagctt ttacagtccc aagcaacaga agcagagagg tcatctcaga 780acaagcatga aaccggctca catgatgctg agcggcagaa tgaaggtcaa ggagtgggag 840aaatcaacat ggcaacttct ggtaatggtc agggttcaac tacacgaatg gaccatgaaa 900cagccagtgt tttgagttct agtagcacac actctgcacc tcgaaggctg acaagtcatc 960tgggaaccaa ggtggaaatg gtgtattcat tgttgtcaat gcttggtact catgataagg 1020atgatatgtc gcgaactttg ctagctatgt ctagctccca agacagctgt atatccatgc 1080gacagtctgg atgtcttcct ctcctcatcc agcttttaca tggcaatgac aaagactctg 1140tattgttggg aaattcccgg ggcagtaaag aggctcgggc cagggccagt gcagcactcc 1200acaacatcat tcactcacag cctgatgaca agagaggcag gcgtgaaatc cgagtccttc 1260atcttttgga acagatacgc gcttactgtg aaacctgttg ggagtggcag gaagctcatg 1320aaccaggcat ggaccaggac aaaaatccaa tgccagctcc tgttgaacat cagatctgtc 1380ctgctgtgtg tgttctaatg aaactttcat ttgatgaaga gcatagacat gcaatgaatg 1440aactaggggg actacaggcc attgcagaat tattgcaagt ggactgtgaa atgtacgggc 1500ttactaatga ccactacagt attacactaa gacgatatgc tggaatggct ttgacaaact 1560tgacttttgg agatgtagcc aacaaggcta cgctatgctc tatgaaaggc tgcatgagag 1620cacttgtggc ccaactaaaa tctgaaagtg aagacttaca gcaggttatt gcaagtgttt 1680tgaggaattt gtcttggcga gcagatgtaa atagtaaaaa gacgttgcga gaagttggaa 1740gtgtgaaagc attgatggaa tgtgctttag aagttaaaaa ggaatcaacc ctcaaaagcg 1800tattgagtgc cttatggaat ttgtcagcac attgcactga gaataaagct gatatatgtg 1860ctgtagatgg tgcacttgca tttttggttg gcactcttac ttaccggagc cagacaaaca 1920ctttagccat tattgaaagt ggaggtggga tattacggaa tgtgtccagc ttgatagcta 1980caaatgagga ccacaggcaa atcctaagag agaacaactg tctacaaact ttattacaac 2040acttaaaatc tcatagtttg acaatagtca gtaatgcatg tggaactttg tggaatctct 2100cagcaagaaa tcctaaagac caggaagcat tatgggacat gggggcagtt 
agcatgctca 2160agaacctcat tcattcaaag cacaaaatga ttgctatggg aagtgctgca gctttaagga 2220atctcatggc aaataggcct gcgaagtaca aggatgccaa tattatgtct cctggctcaa 2280gcttgccatc tcttcatgtt aggaaacaaa aagccctaga agcagaatta gatgctcagc 2340acttatcaga aacttttgac aatatagaca atttaagtcc caaggcatct catcgtagta 2400agcagagaca caagcaaagt ctctatggtg attatgtttt tgacaccaat cgacatgatg 2460ataataggtc agacaatttt aatactggca acatgactgt cctttcacca tatttgaata 2520ctacagtgtt acccagctcc tcttcatcaa gaggaagctt agatagttct cgttctgaaa 2580aagatagaag tttggagaga gaacgcggaa ttggtctagg caactaccat ccagcaacag 2640aaaatccagg aacttcttca aagcgaggtt tgcagatctc caccactgca gcccagattg 2700ccaaagtcat ggaagaagtg tcagccattc atacctctca ggaagacaga agttctgggt 2760ctaccactga attacattgt gtgacagatg agagaaatgc acttagaaga agctctgctg 2820cccatacaca ttcaaacact tacaatttca ctaagtcgga aaattcaaat aggacatgtt 2880ctatgcctta tgccaaatta gaatacaaga gatcttcaaa tgatagttta aatagtgtca 2940gtagtagtga tggttatggt aaaagaggtc aaatgaaacc ctcgattgaa tcctattctg 3000aagatgatga aagtaagttt tgcagttatg gtcaataccc agccgaccta gcccataaaa 3060tacatagtgc aaatcatatg gatgataatg atggagaact agatacacca ataaattata 3120gtcttaaata ttcagatgag cagttgaact ctggaaggca aagtccttca cagaatgaaa 3180gatgggcaag acccaaacac ataatagaag atgaaataaa acaaagtgag caaagacaat 3240caaggaatca aagtacaact tatcctgttt atactgagag cactgatgat aaacacctca 3300agttccaacc acattttgga cagcaggaat gtgtttctcc atacaggtca cggggagcca 3360atggttcaga aacaaatcga gtgggttcta atcatggaat taatcaaaat gtaagccagt 3420ctttgtgtca agaagatgac tatgaagatg ataagcctac caattatagt gaacgttact 3480ctgaagaaga acagcatgaa gaagaagaga gaccaacaaa ttatagcata aaatataatg 3540aagagaaacg tcatgtggat cagcctattg attatagttt aaaatatgcc acagatattc 3600cttcatcaca gaaacagtca ttttcattct caaagagttc atctggacaa agcagtaaaa 3660ccgaacatat gtcttcaagc agtgagaata cgtccacacc ttcatctaat gccaagaggc 3720agaatcagct ccatccaagt tctgcacaga gtagaagtgg tcagcctcaa aaggctgcca 3780cttgcaaagt ttcttctatt aaccaagaaa caatacagac ttattgtgta gaagatactc 3840caatatgttt ttcaagatgt 
agttcattat catctttgtc atcagctgaa gatgaaatag 3900gatgtaatca gacgacacag gaagcagatt ctgctaatac cctgcaaata gcagaaataa 3960aagaaaagat tggaactagg tcagctgaag atcctgtgag cgaagttcca gcagtgtcac 4020agcaccctag aaccaaatcc agcagactgc agggttctag tttatcttca gaatcagcca 4080ggcacaaagc tgttgaattt tcttcaggag cgaaatctcc ctccaaaagt ggtgctcaga 4140cacccaaaag tccacctgaa cactatgttc aggagacccc actcatgttt agcagatgta 4200cttctgtcag ttcacttgat agttttgaga gtcgttcgat tgccagctcc gttcagagtg 4260aaccatgcag tggaatggta agtggcatta taagccccag tgatcttcca gatagccctg 4320gacaaaccat gccaccaagc agaagtaaaa cacctccacc acctcctcaa acagctcaaa 4380ccaagcgaga agtacctaaa aataaagcac ctactgctga aaagagagag agtggaccta 4440agcaagctgc agtaaatgct gcagttcaga gggtccaggt tcttccagat gctgatactt 4500tattacattt tgccacggaa agtactccag atggattttc ttgttcatcc agcctgagtg 4560ctctgagcct cgatgagcca tttatacaga aagatgtgga attaagaata atgcctccag 4620ttcaggaaaa tgacaatggg aatgaaacag aatcagagca gcctaaagaa tcaaatgaaa 4680accaagagaa agaggcagaa aaaactattg attctgaaaa ggacctatta gatgattcag 4740atgatgatga tattgaaata ctagaagaat gtattatttc tgccatgcca acaaagtcat 4800cacgtaaagc aaaaaagcca gcccagactg cttcaaaatt acctccacct gtggcaagga 4860aaccaagtca gctgcctgtg tacaaacttc taccatcaca aaacaggttg caaccccaaa 4920agcatgttag ttttacaccg ggggatgata tgccacgggt gtattgtgtt gaagggacac 4980ctataaactt ttccacagct acatctctaa gtgatctaac aatcgaatcc cctccaaatg 5040agttagctgc tggagaagga gttagaggag gagcacagtc aggtgaattt gaaaaacgag 5100ataccattcc tacagaaggc agaagtacag atgaggctca aggaggaaaa acctcatctg 5160taaccatacc tgaattggat gacaataaag cagaggaagg tgatattctt gcagaatgca 5220ttaattctgc tatgcccaaa gggaaaagtc acaagccttt ccgtgtgaaa aagataatgg 5280accaggtcca gcaagcatct gcgtcgtctt ctgcacccaa caaaaatcag ttagatggta 5340agaaaaagaa accaacttca ccagtaaaac ctataccaca aaatactgaa tataggacac 5400gtgtaagaaa aaatgcagac tcaaaaaata atttaaatgc tgagagagtt ttctcagaca 5460acaaagattc aaagaaacag aatttgaaaa ataattccaa ggacttcaat gataagctcc 5520caaataatga agatagagtc agaggaagtt ttgcttttga ttcacctcat 
cattacacgc 5580ctattgaagg aactccttac tgtttttcac gaaatgattc tttgagttct ctagattttg 5640atgatgatga tgttgacctt tccagggaaa aggctgaatt aagaaaggca aaagaaaata 5700aggaatcaga ggctaaagtt accagccaca cagaactaac ctccaaccaa caatcagcta 5760ataagacaca agctattgca aagcagccaa taaatcgagg tcagcctaaa cccatacttc 5820agaaacaatc cacttttccc cagtcatcca aagacatacc agacagaggg gcagcaactg 5880atgaaaagtt acagaatttt gctattgaaa atactccagt ttgcttttct cataattcct 5940ctctgagttc tctcagtgac attgaccaag aaaacaacaa taaagaaaat gaacctatca 6000aagagactga gccccctgac tcacagggag aaccaagtaa acctcaagca tcaggctatg 6060ctcctaaatc atttcatgtt gaagataccc cagtttgttt ctcaagaaac agttctctca 6120gttctcttag tattgactct gaagatgacc tgttgcagga atgtataagc tccgcaatgc 6180caaaaaagaa aaagccttca agactcaagg gtgataatga aaaacatagt cccagaaata 6240tgggtggcat attaggtgaa gatctgacac ttgatttgaa agatatacag agaccagatt 6300cagaacatgg tctatcccct gattcagaaa attttgattg gaaagctatt caggaaggtg 6360caaattccat agtaagtagt ttacatcaag ctgctgctgc tgcatgttta tctagacaag 6420cttcgtctga ttcagattcc atcctttccc tgaaatcagg aatctctctg ggatcaccat 6480ttcatcttac acctgatcaa gaagaaaaac cctttacaag taataaaggc ccacgaattc 6540taaaaccagg ggagaaaagt acattggaaa ctaaaaagat agaatctgaa agtaaaggaa 6600tcaaaggagg aaaaaaagtt tataaaagtt tgattactgg aaaagttcga tctaattcag 6660aaatttcagg ccaaatgaaa cagccccttc aagcaaacat gccttcaatc tctcgaggca 6720ggacaatgat tcatattcca ggagttcgaa atagctcctc aagtacaagt cctgtttcta 6780aaaaaggccc accccttaag actccagcct ccaaaagccc tagtgaaggt caaacagcca 6840ccacttctcc tagaggagcc aagccatctg tgaaatcaga attaagccct gttgccaggc 6900agacatccca aataggtggg tcaagtaaag caccttctag atcaggatct agagattcga 6960ccccttcaag acctgcccag caaccattaa gtagacctat acagtctcct ggccgaaact 7020caatttcccc tggtagaaat ggaataagtc ctcctaacaa attatctcaa cttccaagga 7080catcatcccc tagtactgct tcaactaagt cctcaggttc tggaaaaatg tcatatacat 7140ctccaggtag acagatgagc caacagaacc ttaccaaaca aacaggttta tccaagaatg 7200ccagtagtat tccaagaagt gagtctgcct ccaaaggact aaatcagatg aataatggta 7260atggagccaa taaaaaggta 
gaactttcta gaatgtcttc aactaaatca agtggaagtg 7320aatctgatag atcagaaaga cctgtattag tacgccagtc aactttcatc aaagaagctc 7380caagcccaac cttaagaaga aaattggagg aatctgcttc atttgaatct ctttctccat 7440catctagacc agcttctccc actaggtccc aggcacaaac tccagtttta agtccttccc 7500ttcctgatat gtctctatcc acacattcgt ctgttcaggc tggtggatgg cgaaaactcc 7560cacctaatct cagtcccact atagagtata atgatggaag accagcaaag cgccatgata 7620ttgcacggtc tcattctgaa agtccttcta gacttccaat caataggtca ggaacctgga 7680aacgtgagca cagcaaacat tcatcatccc ttcctcgagt aagcacttgg agaagaactg 7740gaagttcatc ttcaattctt tctgcttcat cagaatccag tgaaaaagca aaaagtgagg 7800atgaaaaaca tgtgaactct atttcaggaa ccaaacaaag taaagaaaac caagtatccg 7860caaaaggaac atggagaaaa ataaaagaaa atgaattttc tcccacaaat agtacttctc 7920agaccgtttc ctcaggtgct acaaatggtg ctgaatcaaa gactctaatt tatcaaatgg 7980cacctgctgt ttctaaaaca gaggatgttt gggtgagaat tgaggactgt cccattaaca 8040atcctagatc tggaagatct cccacaggta atactccccc ggtgattgac agtgtttcag 8100aaaaggcaaa tccaaacatt aaagattcaa aagataatca ggcaaaacaa aatgtgggta 8160atggcagtgt tcccatgcgt accgtgggtt tggaaaatcg cctgaactcc tttattcagg 8220tggatgcccc tgaccaaaaa ggaactgaga taaaaccagg acaaaataat cctgtccctg 8280tatcagagac taatgaaagt tctatagtgg aacgtacccc attcagttct agcagctcaa 8340gcaaacacag ttcacctagt gggactgttg ctgccagagt gactcctttt aattacaacc 8400caagccctag gaaaagcagc gcagatagca cttcagctcg gccatctcag atcccaactc 8460cagtgaataa caacacaaag aagcgagatt ccaaaactga cagcacagaa tccagtggaa 8520cccaaagtcc taagcgccat tctgggtctt accttgtgac atctgtttaa aagagaggaa 8580gaatgaaact aagaaaattc tatgttaatt acaactgcta tatagacatt
ttgtttcaaa 8640tgaaacttta aaagactgaa aaattttgta aataggtttg attcttgtta gagggttttt 8700gttctggaag ccatatttga tagtatactt tgtcttcact ggtcttattt tgggaggcac 8760tcttgatggt taggaaaaaa atagtaaagc caagtatgtt tgtacagtat gttttacatg 8820tatttaaagt agcatcccat cccaacttcc tttaattatt gcttgtctta aaataatgaa 8880cactacagat agaaaatatg atatattgct gttatcaatc atttctagat tataaactga 8940ctaaacttac atcagggaaa aattggtatt tatgcaaaaa aaaatgtttt tgtccttgtg 9000agtccatcta acatcataat taatcatgtg gctgtgaaat tcacagtaat atggttcccg 9060atgaacaagc tttacccagc ctgtttgctt tactgcatga atgaaactga tggttcaatt 9120tcagaagtaa tgattaacag ttatgtggtc acatgatgtg catagagata gctacagtgt 9180aataatttac actattttgt gctccaaaca aaacaaaaat ctgtgtaact gtaaaacatt 9240gaatgaaact attttacctg aactagattt tatctgaaag taggtagaat ttttgctatg 9300ctgtaatttg ttgtatattc tggtatttga ggtgagatgg ctgctctttt attaatgaga 9360catgaattgt gtctcaacag aaactaaatg aacatttcag aataaattat tgctgtatgt 9420aaactgttac tgaaattggt atttgtttga agggtcttgt ttcacatttg tattaataat 9480tgtttaaaat gcctctttta aaagcttata taaatttttt ncttcagctt ctatgcatta 9540agagtaaaat tcctcttact gtaataaaaa caattgaaga agactgttgc cacttaacca 9600ttccatgcgt tggcacttat ctattcctga aattctttta tgtgattagc tcatcttgat 9660ttttaacatt tttccactta aacttttttt tcttactcca ctggagctca gtaaaagtaa 9720attcatgtaa tagcaatgca agcagcctag cacagactaa gcattgagca taataggccc 9780acataatttc ctctttctta atattataga aattctgtac ttgaaattga ttcttagaca 9840ttgcagtctc ttcgaggctt tacagtgtaa actgtcttgc cccttcatct tcttgttgca 9900actgggtctg acatgaacac tttttatcac cctgtatgtt agggcaagat ctcagcagtg 9960aagtataatc agcactttgc catgctcaga aaattcaaat cacatggaac tttagaggta 10020gatttaatac gattaagata ttcagaagta tattttagaa tccctgcctg ttaaggaaac 10080tttatttgtg gtaggtacag ttctggggta catgttaagt gtccccttat acagtggagg 10140gaagtcttcc ttcctgaagg aaaataaact gacacttatt aactaagata atttacttaa 10200tatatcttcc ctgatttgtt ttaaaagatc agagggtgac tgatgataca tgcatacata 10260tttgttgaat aaatgaaaat ttatttttag tgataagatt catacactct gtatttgggg 10320agagaaaacc 
tttttaagca tggtggggca ctcagatagg agtgaataca cctacctggt 10380ggtcat 103863432191DNAHomo sapiens 343ggtggccgag cgggggaccg ggaagcatgg cccgggggtc ggcggttgcc tgggcggcgc 60tcgggccgtt gttgtggggc tgcgcgctgg ggctgcaggg cgggatgctg tacccccagg 120agagcccgtc gcgggagtgc aaggagctgg acggcctctg gagcttccgc gccgacttct 180ctgacaaccg acgccggggc ttcgaggagc agtggtaccg gcggccgctg tgggagtcag 240gccccaccgt ggacatgcca gttccctcca gcttcaatga catcagccag gactggcgtc 300tgcggcattt tgtcggctgg gtgtggtacg aacgggaggt gatcctgccg gagcgatgga 360cccaggacct gcgcacaaga gtggtgctga ggattggcag tgcccattcc tatgccatcg 420tgtgggtgaa tggggtcgac acgctagagc atgagggggg ctacctcccc ttcgaggccg 480acatcagcaa cctggtccag gtggggcccc tgccctcccg gctccgaatc actatcgcca 540tcaacaacac actcaccccc accaccctgc caccagggac catccaatac ctgactgaca 600cctccaagta tcccaagggt tactttgtcc agaacacata ttttgacttt ttcaactacg 660ctggactgca gcggtctgta cttctgtaca cgacacccac cacctacatc gatgacatca 720ccgtcaccac cagcgtggag caagacagtg ggctggtgaa ttaccagatc tctgtcaagg 780gcagtaacct gttcaagttg gaagtgcgtc ttttggatgc agaaaacaaa gtcgtggcga 840atgggactgg gacccagggc caacttaagg tgccaggtgt cagcctctgg tggccgtacc 900tgatgcacga acgccctgcc tatctgtatt cattggaggt gcagctgact gcacagacgt 960cactggggcc tgtgtctgac ttctacacac tccctgtggg gatccgcact gtggctgtca 1020ccaagagcca gttcctcatc aatgggaaac ctttctattt ccacggtgtc aacaagcatg 1080aggatgcgga catccgaggg aagggcttcg actggccgct gctggtgaag gacttcaacc 1140tgcttcgctg gcttggtgcc aacgctttcc gtaccagcca ctacccctat gcagaggaag 1200tgatgcagat gtgtgaccgc tatgggattg tggtcatcga tgagtgtccc ggcgtgggcc 1260tggcgctgcc gcagttcttc aacaacgttt ctctgcatca ccacatgcag gtgatggaag 1320aagtggtgcg tagggacaag aaccaccccg cggtcgtgat gtggtctgtg gccaacgagc 1380ctgcgtccca cctagaatct gctggctact acttgaagat ggtgatcgct cacaccaaat 1440ccttggaccc ctcccggcct gtgacctttg tgagcaactc taactatgca gcagacaagg 1500gggctccgta tgtggatgtg atctgtttga acagctacta ctcttggtat cacgactacg 1560ggcacctgga gttgattcag ctgcagctgg ccacccagtt tgagaactgg tataagaagt 1620atcagaagcc cattattcag 
agcgagtatg gagcagaaac gattgcaggg tttcaccagg 1680atccacctct gatgttcact gaagagtacc agaaaagtct gctagagcag taccatctgg 1740gtctggatca aaaacgcaga aaatatgtgg ttggagagct catttggaat tttgccgatt 1800tcatgactga acagtcaccg acgagagtgc tggggaataa aaaggggatc ttcactcggc 1860agagacaacc aaaaagtgca gcgttccttt tgcgagagag atactggaag attgccaatg 1920aaaccaggta tccccactca gtagccaagt cacaatgttt ggaaaacagc ccgtttactt 1980gagcaagact gataccacct gcgtgtccct tcctccccga gtcagggcga cttccacagc 2040agcagaacaa gtgcctcctg gactgttcac ggcagaccag aacgtttctg gcctgggttt 2100tgtggtcatc tattctagca gggaacacta aaggtggaaa taaaagattt tctattatgg 2160aaataaagag ttggcatgaa agtcgctact g 21913442776DNAHomo sapiens 344cagggcagac tggtagcaaa gcccccacgc ccagccagga gcaccgccgc ggactccagc 60acaccgaggg acatgctggg cctgcgcccc ccactgctcg ccctggtggg gctgctctcc 120ctcgggtgcg tcctctctca ggagtgcacg aagttcaagg tcagcagctg ccgggaatgc 180atcgagtcgg ggcccggctg cacctggtgc cagaagctga acttcacagg gccgggggat 240cctgactcca ttcgctgcga cacccggcca cagctgctca tgaggggctg tgcggctgac 300gacatcatgg accccacaag cctcgctgaa acccaggaag accacaatgg gggccagaag 360cagctgtccc cacaaaaagt gacgctttac ctgcgaccag gccaggcagc agcgttcaac 420gtgaccttcc ggcgggccaa gggctacccc atcgacctgt actatctgat ggacctctcc 480tactccatgc ttgatgacct caggaatgtc aagaagctag gtggcgacct gctccgggcc 540ctcaacgaga tcaccgagtc cggccgcatt ggcttcgggt ccttcgtgga caagaccgtg 600ctgccgttcg tgaacacgca ccctgataag ctgcgaaacc catgccccaa caaggagaaa 660gagtgccagc ccccgtttgc cttcaggcac gtgctgaagc tgaccaacaa ctccaaccag 720tttcagaccg aggtcgggaa gcagctgatt tccggaaacc tggatgcacc cgagggtggg 780ctggacgcca tgatgcaggt cgccgcctgc ccggaggaaa tcggctggcg caacgtcacg 840cggctgctgg tgtttgccac tgatgacggc ttccatttcg cgggcgacgg aaagctgggc 900gccatcctga cccccaacga cggccgctgt cacctggagg acaacttgta caagaggagc 960aacgaattcg actacccatc ggtgggccag ctggcgcaca agctggctga aaacaacatc 1020cagcccatct tcgcggtgac cagtaggatg gtgaagacct acgagaaact caccgagatc 1080atccccaagt cagccgtggg ggagctgtct gaggactcca gcaatgtggt ccatctcatt 1140aagaatgctt 
acaataaact ctcctccagg gtcttcctgg atcacaacgc cctccccgac 1200accctgaaag tcacctacga ctccttctgc agcaatggag tgacgcacag gaaccagccc 1260agaggtgact gtgatggcgt gcagatcaat gtcccgatca ccttccaggt gaaggtcacg 1320gccacagagt gcatccagga gcagtcgttt gtcatccggg cgctgggctt cacggacata 1380gtgaccgtgc aggttcttcc ccagtgtgag tgccggtgcc gggaccagag cagagaccgc 1440agcctctgcc atggcaaggg cttcttggag tgcggcatct gcaggtgtga cactggctac 1500attgggaaaa actgtgagtg ccagacacag ggccggagca gccaggagct ggaaggaagc 1560tgccggaagg acaacaactc catcatctgc tcagggctgg gggactgtgt ctgcgggcag 1620tgcctgtgcc acaccagcga cgtccccggc aagctgatat acgggcagta ctgcgagtgt 1680gacaccatca actgtgagcg ctacaacggc caggtctgcg gcggcccggg gagggggctc 1740tgcttctgcg ggaagtgccg ctgccacccg ggctttgagg gctcagcgtg ccagtgcgag 1800aggaccactg agggctgcct gaacccgcgg cgtgttgagt gtagtggtcg tggccggtgc 1860cgctgcaacg tatgcgagtg ccattcaggc taccagctgc ctctgtgcca ggagtgcccc 1920ggctgcccct caccctgtgg caagtacatc tcctgcgccg agtgcctgaa gttcgaaaag 1980ggcccctttg ggaagaactg cagcgcggcg tgtccgggcc tgcagctgtc gaacaacccc 2040gtgaagggca ggacctgcaa ggagagggac tcagagggct gctgggtggc ctacacgctg 2100gagcagcagg acgggatgga ccgctacctc atctatgtgg atgagagccg agagtgtgtg 2160gcaggcccca acatcgccgc catcgtcggg ggcaccgtgg caggcatcgt gctgatcggc 2220attctcctgc tggtcatctg gaaggctctg atccacctga gcgacctccg ggagtacagg 2280cgctttgaga aggagaagct caagtcccag tggaacaatg ataatcccct tttcaagagc 2340gccaccacga cggtcatgaa ccccaagttt gctgagagtt aggagcactt ggtgaagaca 2400aggccgtcag gacccaccat gtctgcccca tcacgcggcc gagacatggc ttggccacag 2460ctcttgagga tgtcaccaat taaccagaaa tccagttatt ttccgccctc aaaatgacag 2520ccatggccgg ccggtgcttc tgggggctcg tcggggggac agctccactc tgactggcac 2580agtctttgca tggagacttg aggagggctt gaggttggtg aggttaggtg cgtgtttcct 2640gtgcaagtca ggacatcagt ctgattaaag gtggtgccaa tttatttaca tttaaacttg 2700tcagggtata aaatgacatc ccattaatta tattgttaat caatcacgtg tatagaaaaa 2760aaaataaaac ttcaat 27763453160DNAHomo sapiens 345cctcccctcg cccggcgcgg tcccgtccgc ctctcgctcg cctcccgcct cccctcggtc 
60ttccgaggcg cccgggctcc cggcgcggcg gcggaggggg cgggcaggcc ggcgggcggt 120gatgtggcag gactctttat gcgctgcggc aggatacgcg ctcggcgctg ggacgcgact 180gcgctcagtt ctctcctctc ggaagctgca gccatgatgg aagtttgaga gttgagccgc 240tgtgaggcga ggccgggctc aggcgaggga gatgagagac ggcggcggcc gcggcccgga 300gcccctctca gcgcctgtga gcagccgcgg gggcagcgcc ctcggggagc cggccggcct 360gcggcggcgg cagcggcggc gtttctcgcc tcctcttcgt cttttctaac cgtgcagcct 420cttcctcggc ttctcctgaa agggaaggtg gaagccgtgg gctcgggcgg gagccggctg 480aggcgcggcg gcggcggcgg cggcacctcc cgctcctgga gcggggggga gaagcggcgg 540cggcggcggc cgcggcggct gcagctccag ggagggggtc tgagtcgcct gtcaccattt 600ccagggctgg gaacgccgga gagttggtct ctccccttct actgcctcca acacggcggc 660ggcggcggcg gcacatccag ggacccgggc cggttttaaa cctcccgtcc gccgccgccg 720caccccccgt ggcccgggct ccggaggccg ccggcggagg cagccgttcg gaggattatt 780cgtcttctcc ccattccgct gccgccgctg ccaggcctct ggctgctgag gagaagcagg 840cccagtcgct gcaaccatcc agcagccgcc gcagcagcca ttacccggct gcggtccaga 900gccaagcggc ggcagagcga ggggcatcag ctaccgccaa gtccagagcc atttccatcc 960tgcagaagaa gccccgccac cagcagcttc tgccatctct ctcctccttt ttcttcagcc 1020acaggctccc agacatgaca gccatcatca aagagatcgt tagcagaaac aaaaggagat 1080atcaagagga tggattcgac ttagacttga cctatattta tccaaacatt attgctatgg 1140gatttcctgc agaaagactt gaaggcgtat acaggaacaa tattgatgat gtagtaaggt 1200ttttggattc aaagcataaa aaccattaca agatatacaa tctttgtgct gaaagacatt 1260atgacaccgc caaatttaat tgcagagttg cacaatatcc ttttgaagac cataacccac 1320cacagctaga acttatcaaa cccttttgtg aagatcttga ccaatggcta agtgaagatg 1380acaatcatgt tgcagcaatt cactgtaaag ctggaaaggg acgaactggt gtaatgatat 1440gtgcatattt attacatcgg ggcaaatttt taaaggcaca agaggcccta gatttctatg 1500gggaagtaag gaccagagac aaaaagggag taactattcc cagtcagagg cgctatgtgt 1560attattatag ctacctgtta aagaatcatc tggattatag accagtggca ctgttgtttc 1620acaagatgat gtttgaaact attccaatgt tcagtggcgg aacttgcaat cctcagtttg 1680tggtctgcca gctaaaggtg aagatatatt cctccaattc aggacccaca cgacgggaag 1740acaagttcat gtactttgag ttccctcagc cgttacctgt 
gtgtggtgat atcaaagtag 1800agttcttcca caaacagaac aagatgctaa aaaaggacaa aatgtttcac ttttgggtaa 1860atacattctt cataccagga ccagaggaaa cctcagaaaa agtagaaaat ggaagtctat 1920gtgatcaaga aatcgatagc atttgcagta tagagcgtgc agataatgac aaggaatatc 1980tagtacttac tttaacaaaa aatgatcttg acaaagcaaa taaagacaaa gccaaccgat 2040acttttctcc aaattttaag gtgaagctgt acttcacaaa aacagtagag gagccgtcaa 2100atccagaggc tagcagttca acttctgtaa caccagatgt tagtgacaat gaacctgatc 2160attatagata ttctgacacc actgactctg atccagagaa tgaacctttt gatgaagatc 2220agcatacaca aattacaaaa gtctgaattt ttttttatca agagggataa aacaccatga 2280aaataaactt gaataaactg aaaatggacc tttttttttt taatggcaat aggacattgt 2340gtcagattac cagttatagg aacaattctc ttttcctgac caatcttgtt ttaccctata 2400catccacagg gttttgacac ttgttgtcca gttgaaaaaa ggttgtgtag ctgtgtcatg 2460tatatacctt tttgtgtcaa aaggacattt aaaattcaat taggattaat aaagatggca 2520ctttcccgtt ttattccagt tttataaaaa gtggagacag actgatgtgt atacgtagga 2580attttttcct tttgtgttct gtcaccaact gaagtggcta aagagctttg tgatatactg 2640gttcacatcc tacccctttg cacttgtggc aacagataag tttgcagttg gctaagagag 2700gtttccgaaa ggttttgcta ccattctaat gcatgtattc gggttagggc aatggagggg 2760aatgctcaga aaggaaataa ttttatgctg gactctggac catataccat ctccagctat 2820ttacacacac ctttctttag catgctacag ttattaatct ggacattcga ggaattggcc 2880gctgtcactg cttgttgttt gcgcattttt ttttaaagca tattggtgct agaaaaggca 2940gctaaaggaa gtgaatctgt attggggtac aggaatgaac cttctgcaac atcttaagat 3000ccacaaatga agggatataa aaataatgtc ataggtaaga aacacagcaa caatgactta 3060accatataaa tgtggaggct atcaacaaag aatgggcttg aaacattata aaaattgaca 3120atgatttatt aaatatgttt tctcaattgt aaaaaaaaaa 31603462629DNAHomo sapiens 346acttgtcatg gcgactgtcc agctttgtgc caggagcctc gcaggggttg atgggattgg 60ggttttcccc tcccatgtgc tcaagactgg cgctaaaagt tttgagcttc tcaaaagtct 120agagccaccg tccagggagc aggtagctgc tgggctccgg ggacactttg cgttcgggct 180gggagcgtgc tttccacgac ggtgacacgc ttccctggat tggcagccag actgccttcc 240gggtcactgc catggaggag ccgcagtcag atcctagcgt cgagccccct ctgagtcagg 300aaacattttc 
agacctatgg aaactacttc ctgaaaacaa cgttctgtcc cccttgccgt 360cccaagcaat ggatgatttg atgctgtccc cggacgatat tgaacaatgg ttcactgaag 420acccaggtcc agatgaagct cccagaatgc cagaggctgc tccccgcgtg gcccctgcac 480cagcagctcc tacaccggcg gcccctgcac cagccccctc ctggcccctg tcatcttctg 540tcccttccca gaaaacctac cagggcagct acggtttccg tctgggcttc ttgcattctg 600ggacagccaa gtctgtgact tgcacgtact cccctgccct caacaagatg ttttgccaac 660tggccaagac ctgccctgtg cagctgtggg ttgattccac acccccgccc ggcacccgcg 720tccgcgccat ggccatctac aagcagtcac agcacatgac ggaggttgtg aggcgctgcc 780cccaccatga gcgctgctca gatagcgatg gtctggcccc tcctcagcat cttatccgag 840tggaaggaaa tttgcgtgtg gagtatttgg atgacagaaa cacttttcga catagtgtgg 900tggtgcccta tgagccgcct gaggttggct ctgactgtac caccatccac tacaactaca 960tgtgtaacag ttcctgcatg ggcggcatga accggaggcc catcctcacc atcatcacac 1020tggaagactc cagtggtaat ctactgggac ggaacagctt tgaggtgcgt gtttgtgcct 1080gtcctgggag agaccggcgc acagaggaag agaatctccg caagaaaggg gagcctcacc 1140acgagctgcc cccagggagc actaagcgag cactgcccaa caacaccagc tcctctcccc 1200agccaaagaa gaaaccactg gatggagaat atttcaccct tcagatccgt gggcgtgagc 1260gcttcgagat gttccgagag ctgaatgagg ccttggaact caaggatgcc caggctggga 1320aggagccagg ggggagcagg gctcactcca gccacctgaa gtccaaaaag ggtcagtcta 1380cctcccgcca taaaaaactc atgttcaaga cagaagggcc tgactcagac tgacattctc 1440cacttcttgt tccccactga cagcctccca cccccatctc tccctcccct gccattttgg 1500gttttgggtc tttgaaccct tgcttgcaat aggtgtgcgt cagaagcacc caggacttcc 1560atttgctttg tcccggggct ccactgaaca agttggcctg cactggtgtt ttgttgtggg 1620gaggaggatg gggagtagga cataccagct tagattttaa ggtttttact gtgagggatg 1680tttgggagat gtaagaaatg ttcttgcagt taagggttag tttacaatca gccacattct 1740aggtaggtag gggcccactt caccgtacta accagggaag ctgtccctca tgttgaattt 1800tctctaactt caaggcccat atctgtgaaa tgctggcatt tgcacctacc tcacagagtg 1860cattgtgagg gttaatgaaa taatgtacat ctggccttga aaccaccttt tattacatgg 1920ggtctaaaac ttgaccccct tgagggtgcc tgttccctct ccctctccct gttggctggt 1980gggttggtag tttctacagt tgggcagctg gttaggtaga gggagttgtc 
aagtcttgct 2040ggcccagcca aaccctgtct gacaacctct tggtcgacct tagtacctaa aaggaaatct 2100caccccatcc cacaccctgg aggatttcat ctcttgtata tgatgatctg gatccaccaa 2160gacttgtttt atgctcaggg tcaatttctt ttttcttttt tttttttttt tttctttttc 2220tttgagactg ggtctcgctt tgttgcccag gctggagtgg agtggcgtga tcttggctta 2280ctgcagcctt tgcctccccg gctcgagcag tcctgcctca gcctccggag tagctgggac 2340cacaggttca tgccaccatg gccagccaac ttttgcatgt tttgtagaga tggggtctca 2400cagtgttgcc caggctggtc tcaaactcct gggctcaggc gatccacctg tctcagcctc 2460ccagagtgct gggattacaa ttgtgagcca ccacgtggag ctggaagggt caacatcttt 2520tacattctgc aagcacatct gcattttcac cccacccttc ccctccttct ccctttttat 2580atcccatttt tatatcgatc tcttatttta caataaaact ttgctgcca 26293473442DNAHomo sapiens 347agccggtgcg ccgcagacta gggcgcctcg ggccagggag cgcggaggag ccatggccac 60cgctaacggg gccgtggaaa acgggcagcc ggacgggaag ccgccggccc tgccgcgccc 120catccgcaac ctggaggtca agttcaccaa gatatttatc aacaatgaat ggcacgaatc 180caagagtggg aaaaagtttg ctacatgtaa cccttcaact cgggagcaaa tatgtgaagt 240ggaagaagga gataagcccg acgtggacaa ggctgtggag gctgcacagg ttgccttcca 300gaggggctcg ccatggcgcc ggctggatgc cctgagtcgt gggcggctgc tgcaccagct 360ggctgacctg gtggagaggg accgcgccac cttggccgcc ctggagacga tggatacagg 420gaagccattt cttcatgctt ttttcatcga cctggagggc tgtattagaa ccctcagata 480ctttgcaggg tgggcagaca aaatccaggg caagaccatc cccacagatg acaacgtcgt 540atgcttcacc aggcatgagc ccattggtgt ctgtggggcc atcactccat ggaacttccc 600cctgctgatg ctggtgtgga agctggcacc cgccctctgc tgtgggaaca ccatggtcct 660gaagcctgcg gagcagacac ctctcaccgc cctttatctc ggctctctga tcaaagaggc 720cgggttccct ccaggagtgg tgaacattgt gccaggattc gggcccacag tgggagcagc 780aatttcttct caccctcaga tcaacaagat cgccttcacc ggctccacag aggttggaaa 840actggttaaa gaagctgcgt cccggagcaa tctgaagcgg gtgacgctgg agctgggggg 900gaagaacccc tgcatcgtgt gtgcggacgc tgacttggac ttggcagtgg agtgtgccca 960tcagggagtg ttcttcaacc aaggccagtg ttgcacggca gcctccaggg tgttcgtgga 1020ggagcaggtc tactctgagt ttgtcaggcg gagcgtggag tatgccaaga aacggcccgt 1080gggagacccc ttcgatgtca 
aaacagaaca ggggcctcag attgatcaaa agcagttcga 1140caaaatctta gagctgatcg agagtgggaa gaaggaaggg gccaagctgg aatgcggggg 1200ctcagccatg gaagacaagg ggctcttcat caaacccact gtcttctcag aagtcacaga 1260caacatgcgg attgccaaag aggagatttt cgggccagtg caaccaatac tgaagttcaa 1320aagtatcgaa gaagtgataa aaagagcgaa tagcaccgac tatggactca cagcagccgt 1380gttcacaaaa aatctcgaca aagccctgaa gttggcttct gccttagagt ctggaacggt 1440ctggatcaac tgctacaacg ccctctatgc acaggctcca tttggtggct ttaaaatgtc 1500aggaaatggc agagaactag gtgaatacgc tttggccgaa tacacagaag tgaaaactgt 1560caccatcaaa cttggcgaca agaacccctg aaggaaaggc ggggctcctt cctcaaacat 1620cggacggcgg aatgtggcag atgaaatgtg ctggaggaaa aaaatgacat ttctgacctt 1680cccgggacac attcttctgg aggctttaca tctactggag ttgaatgatt gctgttttcc 1740tctcactctc ctgtttattc accagactgg ggatgcctat aggttgtctg tgaaatcgca 1800gtcctgcctg gggagggagc tgttggccat ttctgtgttt ccctttaaac cagatcctgg 1860agacagtgag atactcaggg cgttgttaac agggagtggt atttgaagtg tccagcagtt 1920gcttgaaatg ctttgccgaa tctgactcca gtaagaatgt gggaaaaccc cctgtgtgtt 1980ctgcaagcag ggctcttgca ccagcggtct cctcagggtg gacctgctta cagagcaagc 2040cacgcctctt tccgaggtga aggtgggacc attccttggg aaaggattca cagtaaggtt 2100ttttggtttt tgttttttgt tttcttgttt ttaaaaaaag gatttcacag tgagaaagtt 2160ttggttagtg cataccgtgg aagggcgcca gggtctttgt ggattgcatg ttgacattga 2220ccgtgagatt cggcttcaaa ccaatactgc ctttggaata tgacagaatc aatagcccag
2280agagcttagt caaagacgat atcacggtct accttaacca aggcactttc ttaagcagaa 2340aatattgttg aggttacctt tgctgctaaa gatccaatct tctaacgcca caacagcata 2400gcaaatccta ggataattca cctcctcatt tgacaaatca gagctgtaat tcactttaac 2460aaattacgca tttctatcac gttcactaac agcttatgat aagtctgtgt agtcttcctt 2520ttctccagtt ctgttaccca atttagatta gtaaagcgta cacaactgga aagactgctg 2580taataacaca gccttgttat ttttaagtcc tattttgata ttaatttctg attagttagt 2640aaataacacc tggattctat ggaggacctc ggtcttcatc caagtggcct gagtatttca 2700ctggcaggtt gtgaattttt cttttcctct ttgggaatcc aaatgatgat gtgcaatttc 2760atgttttaac ttgggaaact gaaagtgttc ccatatagct tcaaaaacaa aaacaaatgt 2820gttatccgac ggatactttt atggttacta actagtactt tcctaattgg gaaagtagtg 2880cttaagtttg caaattaagt tggggagggc aataataaaa tgagggcccg taacagaacc 2940agtgtgtgta taacgaaaac catgtataaa atgggcctat cacccttgtc agagatataa 3000attaccacat ttggcttccc ttcatcagct aacacttatc acttatacta ccaataactt 3060gttaaatcag gatttggctt catacactga attttcagta ttttatctca agtagatata 3120gacactaacc ttgatagtga tacgttagag ggttcctatt cttccattgt acgataatgt 3180ctttaatatg aaatgctaca ttatttataa ttggtagagt tattgtatct ttttatagtt 3240gtaagtacac agaggtggta tatttaaact tctgtaatat actgtattta gaaatggaaa 3300tatatatagt gttaggtttc acttctttta aggtttaccc ctgtggtgtg gtttaaaaat 3360ctataggcct gggaattccg atcctagctg cagatcgcat cccacaatgc gagaatgata 3420aaataaaatt ggatatttga ga 3442348737DNAHomo sapiens 348ggagtttcgc cgccgcagtc ttcgccacca tgccgcccta caccgtggtc tatttcccag 60ttcgaggccg ctgcgcggcc ctgcgcatgc tgctggcaga tcagggccag agctggaagg 120aggaggtggt gaccgtggag acgtggcagg agggctcact caaagcctcc tgcctatacg 180ggcagctccc caagttccag gacggagacc tcaccctgta ccagtccaat accatcctgc 240gtcacctggg ccgcaccctt gggctctatg ggaaggacca gcaggaggca gccctggtgg 300acatggtgaa tgacggcgtg gaggacctcc gctgcaaata catctccctc atctacacca 360actatgaggc gggcaaggat gactatgtga aggcactgcc cgggcaactg aagccttttg 420agaccctgct gtcccagaac cagggaggca agaccttcat tgtgggagac cagatctcct 480tcgctgacta caacctgctg gacttgctgc tgatccatga ggtcctagcc 
cctggctgcc 540tggatgcgtt ccccctgctc tcagcatatg tggggcgcct cagcgcccgg cccaagctca 600aggccttcct ggcctcccct gagtacgtga acctccccat caatggcaac gggaaacagt 660gagggttggg gggactctga gcgggaggca gagtttgcct tcctttctcc aggaccaata 720aaatttctaa gagagct 7373495189DNAHomo sapiens 349atggccaagt cgggtggctg cggcgcggga gccggcgtgg gcggcggcaa cggggcactg 60acctgggtga acaatgctgc aaaaaaagaa gagtcagaaa ctgccaacaa aaatgattct 120tcaaagaagt tgtctgttga gagagtgtat cagaagaaga cacaacttga acacattctt 180cttcgtcctg atacatatat tgggtcagtg gagccattga cgcagttcat gtgggtgtat 240gatgaagatg taggaatgaa ttgcagggag gttacctttg tgccaggttt atacaagatc 300tttgatgaaa ttttggttaa tgctgctgac aataaacaga gggataagaa catgacttgt 360attaaagttt ctattgatcc tgaatctaac attataagca tttggaataa tgggaaaggc 420attccagtag tagaacacaa ggtagagaaa gtttatgttc ctgctttaat ttttggacag 480cttttaacat ccagtaacta tgatgatgat gagaaaaaag ttacaggtgg tcgtaatggt 540tatggtgcaa aactttgtaa tattttcagt acaaagttta cagtagaaac agcttgcaaa 600gaatacaaac acagttttaa gcagacatgg atgaataata tgatgaagac ttctgaagcc 660aaaattaaac attttgatgg tgaagattac acatgcataa cattccaacc agatctgtcc 720aaatttaaga tggaaaaact tgacaaggat attgtggccc tcatgactag aagggcatat 780gatttggctg gttcgtgtag aggggtcaag gtcatgttta atggaaagaa attgcctgta 840aatggatttc gcagttatgt agatctttat gtgaaagaca aattggatga aactggggtg 900gccctgaaag ttattcatga gcttgcaaat gaaagatggg atgtttgtct cacattgagt 960gaaaaaggat tccagcaaat cagctttgta aatagtattg caactacaaa aggtggacgg 1020cacgtggatt atgtggtaga tcaagttgtt ggtaaactga ttgaagtagt taagaaaaag 1080aacaaagctg gtgtatcagt gaaaccattt caagtaaaaa accatatatg ggtttttatt 1140aattgcctta ttgaaaatcc aacttttgat tctcagacta aggaaaacat gactctgcag 1200cccaaaagtt ttgggtctaa atgccagctg tcagaaaaat tttttaaagc agcctctaat 1260tgtggcattg tagaaagtat cctgaactgg gtgaaattta aggctcagac tcagctgaat 1320aagaagtgtt catcagtaaa atacagtaaa atcaaaggta ttcccaaact ggatgatgct 1380aatgatgctg gtggtaaaca ttccctggag tgtacactga tattaacaga gggagactct 1440gccaaatcac tggctgtgtc tggattaggt gtgattggac gagacagata cggagttttt 
1500ccactcaggg gcaaaattct taatgtacgg gaagcttctc ataaacagat catggaaaat 1560gctgaaataa ataatattat taaaatagtt ggtctacaat ataagaaaag ttacgatgat 1620gcagaatctc tgaaaacctt acgctatgga aagattatga ttatgaccga tcaggatcaa 1680gatggttctc acataaaagg cctgcttatt aatttcatcc atcacaattg gccatcactt 1740ttgaagcatg gttttcttga agagttcatt actcctattg taaaggcaag caaaaataag 1800caggaacttt ccttctacag tattcctgaa tttgacgaat ggaaaaaaca tatagaaaac 1860cagaaagcct ggaaaataaa gtactataaa ggattgggta ctagtacagc taaagaagca 1920aaggaatatt ttgctgatat ggaaaggcat cgcatcttgt ttagatatgc tggtcctgaa 1980gatgatgctg ccattacctt ggcatttagt aagaagaaga ttgatgacag aaaagaatgg 2040ttaacaaatt ttatggaaga ccggagacag cgtaggctac atggcttacc agagcaattt 2100ttatatggta ctgcaacaaa gcatttgact tataatgatt tcatcaacaa ggaattgatt 2160ctcttctcaa actcagacaa tgaaagatct ataccatctc ttgttgatgg ctttaaacct 2220ggccagcgga aagttttatt tacctgtttc aagaggaatg ataaacgtga agtaaaagtt 2280gcccagttgg ctggctctgt tgctgagatg tcggcttatc atcatggaga acaagcattg 2340atgatgacta ttgtgaattt ggctcagaac tttgtgggaa gtaacaacat taacttgctt 2400cagcctattg gtcagtttgg aactcggctt catggtggca aagatgctgc aagccctcgt 2460tatattttca caatgttaag cactttagca aggctacttt ttcctgctgt ggatgacaac 2520ctccttaagt tcctttatga tgataatcaa cgtgtagagc ctgagtggta tattcctata 2580attcccatgg ttttaataaa tggtgctgag ggcattggta ctggatgggc ttgtaaacta 2640cccaactatg atgctaggga aattgtgaac aatgtcagac gaatgctaga tggcctggat 2700cctcatccca tgcttccaaa ctacaaaaac tttaaaggca cgattcaaga acttggtcaa 2760aaccagtatg cagtcagtgg tgaaatattt gtagtggaca gaaacacagt agaaattaca 2820gagcttccag ttagaacttg gacacaggta tataaagaac aggttttaga acctatgcta 2880aatggaacag ataaaacacc agcattaatt tctgattata aagaatatca tactgacaca 2940actgtgaaat ttgtggtgaa aatgactgaa gagaaactag cacaagcaga agctgctgga 3000ctgcataaag tttttaaact tcaaactact cttacttgta attccatggt actttttgat 3060catatgggat gtctgaagaa atatgaaact gtgcaagaca ttctgaaaga attctttgat 3120ttacgattaa gttattacgg tttacgtaag gagtggcttg tgggaatgtt gggagcagaa 3180tctacaaagc ttaacaatca agcccgtttc 
attttagaga agatacaagg gaaaattact 3240atagagaata ggtcaaagaa agatttgatt caaatgttag tccagagagg ttatgaatct 3300gacccagtga aagcctggaa agaagcacaa gaaaaggcag cagaagagga tgaaacacaa 3360aaccagcatg atgatagttc ctccgattca ggaactcctt caggcccaga ttttaattat 3420attttaaata tgtctctgtg gtctcttact aaagaaaaag ttgaagaact gattaaacag 3480agagatgcaa aagggcgaga ggtcaatgat cttaaaagaa aatctccttc agatctttgg 3540aaagaggatt tagcggcatt tgttgaagaa ctggataaag tggaatctca agaacgagaa 3600gatgttctgg ctggaatgtc tggaaaagca attaaaggta aagttggcaa acctaaggtg 3660aagaaactcc agttggaaga gacaatgccc tcaccttatg gcagaagaat aattcctgaa 3720attacagcta tgaaggcaga tgccagcaaa aagttgctga agaagaagaa gggtgatctt 3780gatactgcag cagtaaaagt ggaatttgat gaagaattca gtggagcacc agtagaaggt 3840gcaggagaag aggcattgac tccatcagtt cctataaata aaggtcccaa acctaagagg 3900gagaagaagg agcctggtac cagagtgaga aaaacaccta catcatctgg taaacctagt 3960gcaaagaaag tgaagaaacg gaatccttgg tcagatgatg aatccaagtc agaaagtgat 4020ttggaagaaa cagaacctgt ggttattcca agagattctt tgcttaggag agcagcagcc 4080gaaagaccta aatacacatt tgatttctca gaagaagagg atgatgatgc tgatgatgat 4140gatgatgaca ataatgattt agaggaattg aaagttaaag catctcccat aacaaatgat 4200ggggaagatg aatttgttcc ttcagatggg ttagataaag atgaatatac attttcacca 4260ggcaaatcaa aagccactcc agaaaaatct ttgcatgaca aaaaaagtca ggattttgga 4320aatctcttct catttccttc atattctcag aagtcagaag atgattcagc taaatttgac 4380agtaatgaag aagattctgc ttctgttttt tcaccatcat ttggtctgaa acagacagat 4440aaagttccaa gtaaaacggt agctgctaaa aagggaaaac cgtcttcaga tacagtccct 4500aagcccaaga gagccccaaa acagaagaaa gtagtagagg ctgtaaactc tgactcggat 4560tcagaatttg gcattccaaa gaagactaca acaccaaaag gtaaaggccg aggggcaaag 4620aaaaggaaag catctggctc tgaaaatgaa ggcgattata accctggcag gaaaacatcc 4680aaaacaacaa gcaagaaacc gaagaagaca tcttttgatc aggattcaga tgtggacatc 4740ttcccctcag acttccctac tgagccacct tctctgccac gaaccggtcg ggctaggaaa 4800gaagtaaaat attttgcaga gtctgatgaa gaagaagatg atgttgattt tgcaatgttt 4860aattaagtgc ccaaagagca caaacatttt tcaacaaata tcttgtgttg tccttttgtc 
4920ttctctgtct cagacttttg tacatctggc ttattttaat gtgatgatgt aattgacggt 4980tttttattat tgtggtaggc cttttaacat tttgttctta cacatacagt tttatgctct 5040tttttactca ttgaaatgtc acgtactgtc tgattggctt gtagaattgt tatagactgc 5100cgtgcattag cacagatttt aattgtcatg gttacaaact acagacctgc tttttgaaat 5160gaaatttaaa cattaaaaat ggaactgtg 51893501536DNAHomo sapiens 350gggggggggg ggaccacttg gcctgcctcc gtcccgccgc gccacttggc ctgcctccgt 60cccgccgcgc cacttcgcct gcctccgtcc cccgcccgcc gcgccatgcc tgtggccggc 120tcggagctgc cgcgccggcc cttgcccccc gccgcacagg agcgggacgc cgagccgcgt 180ccgccgcacg gggagctgca gtacctgggg cagatccaac acatcctccg ctgcggcgtc 240aggaaggacg accgcacggg caccggcacc ctgtcggtat tcggcatgca ggcgcgctac 300agcctgagag atgaattccc tctgctgaca accaaacgtg tgttctggaa gggtgttttg 360gaggagttgc tgtggtttat caagggatcc acaaatgcta aagagctgtc ttccaaggga 420gtgaaaatct gggatgccaa tggatcccga gactttttgg acagcctggg attctccacc 480agagaagaag gggacttggg cccagtttat ggcttccagt ggaggcattt tggggcagaa 540tacagagata tggaatcaga ttattcagga cagggagttg accaactgca aagagtgatt 600gacaccatca aaaccaaccc tgacgacaga agaatcatca tgtgcgcttg gaatccaaga 660gatcttcctc tgatggcgct gcctccatgc catgccctct gccagttcta tgtggtgaac 720agtgagctgt cctgccagct gtaccagaga tcgggagaca tgggcctcgg tgtgcctttc 780aacatcgcca gctacgccct gctcacgtac atgattgcgc acatcacggg cctgaagcca 840ggtgacttta tacacacttt gggagatgca catatttacc tgaatcacat cgagccactg 900aaaattcagc ttcagcgaga acccagacct ttcccaaagc tcaggattct tcgaaaagtt 960gagaaaattg atgacttcaa agctgaagac tttcagattg aagggtacaa tccgcatcca 1020actattaaaa tggaaatggc tgtttagggt gctttcaaag gagcttgaag gatattgtca 1080gtctttaggg gttgggctgg atgccgaggt aaaagttctt tttgctctaa aagaaaaagg 1140aactaggtca aaaatctgtc cgtgacctat cagttattaa tttttaagga tgttgccact 1200ggcaaatgta actgtgccag ttctttccat aataaaaggc tttgagttaa ctcactgagg 1260gtatctgaca atgctgaggt tatgaacaaa gtgaggagaa tgaaatgtat gtgctcttag 1320caaaaacatg tatgtgcatt tcaatcccac gtacttataa agaaggttgg tgaatttcac 1380aagctatttt tggaatattt ttagaatatt ttaagaattt cacaagctat 
tccctcaaat 1440ctgagggagc tgagtaacac catcgatcat gatgtagagt gtggttatga actttatagt 1500tgttttatat gttgctataa taaagaagtg ttctgc 15363512386DNAHomo sapiens 351ggaggaggaa gcaagcgagg gggctggttc ctgagcttcg caattcctgt gtcgccttct 60gggctcccag cctgccgggt cgcatgatcc ctccggccgg agctggtttt tttgccagcc 120accgcgaggc cggctgagtt accggcatcc ccgcagccac ctcctctccc gacctgtgat 180acaaaagatc ttccgggggc tgcacctgcc tgcctttgcc taaggcggat ttgaatctct 240ttctctccct tcagaatctt atcttggctt tggatcttag aagagaatca ctaaccagag 300acgagactca gtgagtgagc aggtgttttg gacaatggac tggttgagcc catccctatt 360ataaaaatgt ctcagagcaa ccgggagctg gtggttgact ttctctccta caagctttcc 420cagaaaggat acagctggag tcagtttagt gatgtggaag agaacaggac tgaggcccca 480gaagggactg aatcggagat ggagaccccc agtgccatca atggcaaccc atcctggcac 540ctggcagaca gccccgcggt gaatggagcc actggccaca gcagcagttt ggatgcccgg 600gaggtgatcc ccatggcagc agtaaagcaa gcgctgaggg aggcaggcga cgagtttgaa 660ctgcggtacc ggcgggcatt cagtgacctg acatcccagc tccacatcac cccagggaca 720gcatatcaga gctttgaaca ggatactttt gtggaactct atgggaacaa tgcagcagcc 780gagagccgaa agggccagga acgcttcaac cgctggttcc tgacgggcat gactgtggcc 840ggcgtggttc tgctgggctc actcttcagt cggaaatgac cagacactga ccatccactc 900taccctccca cccccttctc tgctccacca catcctccgt ccagccgcca ttgccaccag 960gagaaccact acatgcagcc catgcccacc tgcccatcac agggttgggc ccagatctgg 1020tcccttgcag ctagttttct agaatttatc acacttctgt gagaccccca cacctcagtt 1080cccttggcct cagaattcac aaaatttcca caaaatctgt ccaaaggagg ctggcaggta 1140tggaagggtt tgtggctggg ggcaggaggg ccctacctga ttggtgcaac ccttacccct 1200tagcctccct gaaaatgttt ttctgccagg gagcttgaaa gttttcagaa cctcttcccc 1260agaaaggaga ctagattgcc tttgttttga tgtttgtggc ctcagaattg atcattttcc 1320ccccactctc cccacactaa cctgggttcc ctttccttcc atccctaccc cctaagagcc 1380atttaggggc cacttttgac tagggattca ggctgcttgg gataaagatg caaggaccag 1440gactccctcc tcacctctgg actggctaga gtcctcactc ccagtccaaa tgtcctccag 1500aagcctctgg ctagaggcca gccccaccca ggagggaggg ggctatagct acaggaagca 1560ccccatgcca aagctagggt ggcccttgca 
gttcagcacc accctagtcc cttcccctcc 1620ctggctccca tgaccatact gagggaccaa ctgggcccaa gacagatgcc ccagagctgt 1680ttatggcctc agctgcctca cttcctacaa gagcagcctg tggcatcttt gccttgggct 1740gctcctcatg gtgggttcag gggactcagc cctgaggtga aagggagcta tcaggaacag 1800ctatgggagc cccagggtct tccctacctc aggcaggaag ggcaggaagg agagcctgct 1860gcatggggtg gggtagggct gactagaagg gccagtcctg cctggccagg cagatctgtg 1920ccccatgcct gtccagcctg ggcagccagg ctgccaaggc cagagtggcc tggccaggag 1980ctcttcaggc ctccctctct cttctgctcc acccttggcc tgtctcatcc ccaggggtcc 2040cagccacccc gggctctctg ctgtacatat ttgagactag tttttattcc ttgtgaagat 2100gatatactat ttttgttaag cgtgtctgta tttatgtgtg aggagctgct ggcttgcagt 2160gcgcgtgcac gtggagagct ggtgcccgga gattggacgg cctgatgctc cctcccctgc 2220cctggtccag ggaagctggc cgagggtcct ggctcctgag gggcatctgc ccctccccca 2280acccccaccc cacacttgtt ccagctcttt gaaatagtct gtgtgaaggt gaaagtgcag 2340ttcagtaata aactgtgttt actcagtgaa aaaaaaaaaa aaaaaa 23863521270DNAHomo sapiens 352agacgttcgc acacctgggt gccagcgccc cagaggtccc gggacagccc gaggcgccgc 60gcccgccgcc ccgagctccc caagccttcg agagcggcgc acactcccgg tctccactcg 120ctcttccaac acccgctcgt tttggcggca gctcgtgtcc cagagaccga gttgccccag 180agaccgagac gccgccgctg cgaaggacca atgagagccc cgctgctacc gccggcgccg 240gtggtgctgt cgctcttgat actcggctca ggccattatg ctgctggatt ggacctcaat 300gacacctact ctgggaagcg tgaaccattt tctggggacc acagtgctga tggatttgag 360gttacctcaa gaagtgagat gtcttcaggg agtgagattt cccctgtgag tgaaatgcct 420tctagtagtg aaccgtcctc gggagccgac tatgactact cagaagagta tgataacgaa 480ccacaaatac ctggctatat tgtcgatgat tcagtcagag ttgaacaggt agttaagccc 540ccccaaaaca agacggaaag tgaaaatact tcagataaac ccaaaagaaa gaaaaaggga 600ggcaaaaatg gaaaaaatag aagaaacaga aagaagaaaa atccatgtaa tgcagaattt 660caaaatttct gcattcacgg agaatgcaaa tatatagagc acctggaagc agtaacatgc 720aaatgtcagc aagaatattt cggtgaacgg tgtggggaaa agtccatgaa aactcacagc 780atgattgaca gtagtttatc aaaaattgca ttagcagcca tagctgcctt tatgtctgct 840gtgatcctca cagctgttgc tgttattaca gtccagctta gaagacaata cgtcaggaaa 
900tatgaaggag aagctgagga acgaaagaaa cttcgacaag agaatggaaa tgtacatgct 960atagcataac tgaagataaa attacaggat atcacattgg agtcactgcc aagtcatagc 1020cataaatgat gagtcggtcc tctttccagt ggatcataag acaatggacc ctttttgtta 1080tgatggtttt aaactttcaa ttgtcacttt ttatgctatt tctgtatata aaggtgcacg 1140aaggtaaaaa gtattttttc aagttgtaaa taatttattt aatatttaat ggaagtgtat 1200ttattttaca gctcattaaa cttttttaac caaacagaaa aaaaaaaaaa aaaaaaaaaa 1260aaaaaaaaaa 12703531600DNAHomo sapiens 353gccccgccgc cggcagtgga ccgctgtgcg cgaaccctga accctacggt cccgacccgc 60gggcgaggcc gggtacctgg gctgggatcc ggagcaagcg ggcgagggca gcgccctaag 120caggcccgga gcgatggcag ccttgatgac cccgggaacc ggggccccac ccgcgcctgg 180tgacttctcc ggggaaggga gccagggact tcccgaccct tcgccagagc ccaagcagct 240cccggagctg atccgcatga agcgagacgg aggccgcctg agcgaagcgg acatcagggg 300cttcgtggcc gctgtggtga atgggagcgc gcagggcgca cagatcgggg ccatgctgat 360ggccatccga cttcggggca tggatctgga ggagacctcg gtgctgaccc aggccctggc 420tcagtcggga cagcagctgg agtggccaga ggcctggcgc cagcagcttg tggacaagca 480ttccacaggg ggtgtgggtg acaaggtcag cctggtcctc gcacctgccc tggcggcatg 540tggctgcaag gtgccaatga tcagcggacg tggtctgggg cacacaggag gcaccttgga 600taagctggag tctattcctg gattcaatgt catccagagc ccagagcaga tgcaagtgct 660gctggaccag gcgggctgct gtatcgtggg tcagagtgag cagctggttc ctgcggacgg 720aatcctatat gcagccagag atgtgacagc caccgtggac agcctgccac tcatcacagc 780ctccattctc agtaagaaac tcgtggaggg gctgtccgct ctggtggtgg acgttaagtt 840cggaggggcc gccgtcttcc ccaaccagga gcaggcccgg gagctggcaa agacgctggt 900tggcgtggga gccagcctag ggcttcgggt cgcggcagcg ctgaccgcca tggacaagcc 960cctgggtcgc tgcgtgggcc acgccctgga ggtggaggag gcgctgctct gcatggacgg 1020cgcaggcccg ccagacttaa gggacctggt caccacgctc gggggcgccc tgctctggct 1080cagcggacac gcggggactc aggctcaggg cgctgcccgg gtggccgcgg cgctggacga 1140cggctcggcc cttggccgct tcgagcggat gctggcggcg cagggcgtgg atcccggtct 1200ggcccgagcc ctgtgctcgg gaagtcccgc agaacgccgg cagctgctgc ctcgcgcccg 1260ggagcaggag gagctgctgg cgcccgcaga tggcaccgtg gagctggtcc gggcgctgcc 1320gctggcgctg 
gtgctgcacg agctcggggc cgggcgcagc cgcgctgggg agccgctccg 1380cctgggggtg ggcgcagagc tgctggtcga cgtgggtcag aggctgcgcc gtgggacccc 1440ctggctccgc gtgcaccggg acggccccgc gctcagcggc ccgcagagcc gcgccctgca 1500ggaggcgctc gtactctccg accgcgcgcc attcgccgcc ccctcgccct tcgcagagct 1560cgttctgccg ccgcagcaat aaagctcctt tgccgcgaaa 16003541842DNAHomo sapiens 354cgatcagatc gatctaagat ggcgactgtc gaaccggaaa ccacccctac tcctaatccc 60ccgactacag aagaggagaa aacggaatct aatcaggagg ttgctaaccc agaacactat 120attaaacatc ccctacagaa cagatgggca ctctggtttt ttaaaaatga taaaagcaaa 180acttggcaag caaacctgcg gctgatctcc aagtttgata ctgttgaaga cttttgggct 240ctgtacaacc atatccagtt gtctagtaat ttaatgcctg gctgtgacta ctcacttttt 300aaggatggta ttgagcctat gtgggaagat gagaaaaaca aacggggagg acgatggcta 360attacattga acaaacagca gagacgaagt gacctcgatc gcttttggct agagacactt 420ctgtgcctta ttggagaatc ttttgatgac tacagtgatg atgtatgtgg cgctgttgtt 480aatgttagag ctaaaggtga taagatagca atatggacta ctgaatgtga aaacagagaa 540gctgttacac atatagggag ggtatacaag gaaaggttag gacttcctcc aaagatagtg 600attggttatc agtcccacgc agacacagct actaagagcg gctccaccac taaaaatagg 660tttgttgttt aagaagacac cttctgagta ttctcatagg agactgcgtc aagcaatcga 720gatttgggag ctgaaccaaa gcctcttcaa aaagcagagt ggactgcatt taaatttgat 780ttccatctta atgttactca gatataagag
aagtctcatt cgcctttgtc ttgtacttct 840gtgttcattt tttttttttt tttttggcta gagtttccac tatcccaatc aaagaattac 900agtacacatc cccagaatcc ataaatgtgt tcctggccca ctctgtaata gttcagtaga 960attaccatta attacataca gattttacct atccacaata gtcagaaaac aacttggcat 1020ttctatactt tacaggaaaa aaaattctgt tgttccattt tatgcagaag catattttgc 1080tggtttgaaa gattatgatg catacagttt tctagcaatt ttctttgttt ctttttacag 1140cattgtcttt gctgtactct tgctgatggc tgctagattt taatttattt gtttccctac 1200ttgataatat tagtgattct gatttcagtt tttcatttgt tttgcttaaa tttttttttt 1260ttttttcctc atgtaacatt ggtgaaggat ccaggaatat gacacaaagg tggaataaac 1320attaattttg tgcattcttt ggtaattttt tttgtttttt gtaactacaa agctttgcta 1380caaatttatg catttcattc aaatcagtga tctatgtttg tgtgatttcc taaacataat 1440tgtggattat aaaaaatgta acatcataat tacattccta actagaatta gtatgtctgt 1500ttttgtatct ttatgctgta ttttaacact ttgtattact taggttattt tgctttggtt 1560aaaaatggct caagtagaaa agcagtccca ttcatattaa gacagtgtac aaaactgtaa 1620ataaaatgtg tacagtgaat tgtcttttag acaactagat ttgtcctttt atttctccat 1680ctttatagaa ggaatttgta cttcttattg caggcaagtc tctatattat gtcctctttt 1740gtggtgtctt ccatgtgaac agcataagtt tggagcacta gtttgattat tatgtttatt 1800acaattttta ataaattgaa taggtagtat catatatatg ga 18423554975DNAHomo sapiens 355ctctcacaca cacacacccc tcccctgcca tccctccccg gactccggct ccggctccga 60ttgcaatttg caacctccgc tgccgtcgcc gcagcagcca ccaattcgcc agcggttcag 120gtggctcttg cctcgatgtc ctagcctagg ggcccccggg ccggacttgg ctgggctccc 180ttcaccctct gcggagtcat gagggcgaac gacgctctgc aggtgctggg cttgcttttc 240agcctggccc ggggctccga ggtgggcaac tctcaggcag tgtgtcctgg gactctgaat 300ggcctgagtg tgaccggcga tgctgagaac caataccaga cactgtacaa gctctacgag 360aggtgtgagg tggtgatggg gaaccttgag attgtgctca cgggacacaa tgccgacctc 420tccttcctgc agtggattcg agaagtgaca ggctatgtcc tcgtggccat gaatgaattc 480tctactctac cattgcccaa cctccgcgtg gtgcgaggga cccaggtcta cgatgggaag 540tttgccatct tcgtcatgtt gaactataac accaactcca gccacgctct gcgccagctc 600cgcttgactc agctcaccga gattctgtca gggggtgttt atattgagaa gaacgataag 660ctttgtcaca 
tggacacaat tgactggagg gacatcgtga gggaccgaga tgctgagata 720gtggtgaagg acaatggcag aagctgtccc ccctgtcatg aggtttgcaa ggggcgatgc 780tggggtcctg gatcagaaga ctgccagaca ttgaccaaga ccatctgtgc tcctcagtgt 840aatggtcact gctttgggcc caaccccaac cagtgctgcc atgatgagtg tgccgggggc 900tgctcaggcc ctcaggacac agactgcttt gcctgccggc acttcaatga cagtggagcc 960tgtgtacctc gctgtccaca gcctcttgtc tacaacaagc taactttcca gctggaaccc 1020aatccccaca ccaagtatca gtatggagga gtttgtgtag ccagctgtcc ccataacttt 1080gtggtggatc aaacatcctg tgtcagggcc tgtcctcctg acaagatgga agtagataaa 1140aatgggctca agatgtgtga gccttgtggg ggactatgtc ccaaagcctg tgagggaaca 1200ggctctggga gccgcttcca gactgtggac tcgagcaaca ttgatggatt tgtgaactgc 1260accaagatcc tgggcaacct ggactttctg atcaccggcc tcaatggaga cccctggcac 1320aagatccctg ccctggaccc agagaagctc aatgtcttcc ggacagtacg ggagatcaca 1380ggttacctga acatccagtc ctggccgccc cacatgcaca acttcagtgt tttttccaat 1440ttgacaacca ttggaggcag aagcctctac aaccggggct tctcattgtt gatcatgaag 1500aacttgaatg tcacatctct gggcttccga tccctgaagg aaattagtgc tgggcgtatc 1560tatataagtg ccaataggca gctctgctac caccactctt tgaactggac caaggtgctt 1620cgggggccta cggaagagcg actagacatc aagcataatc ggccgcgcag agactgcgtg 1680gcagagggca aagtgtgtga cccactgtgc tcctctgggg gatgctgggg cccaggccct 1740ggtcagtgct tgtcctgtcg aaattatagc cgaggaggtg tctgtgtgac ccactgcaac 1800tttctgaatg gggagcctcg agaatttgcc catgaggccg aatgcttctc ctgccacccg 1860gaatgccaac ccatgggggg cactgccaca tgcaatggct cgggctctga tacttgtgct 1920caatgtgccc attttcgaga tgggccccac tgtgtgagca gctgccccca tggagtccta 1980ggtgccaagg gcccaatcta caagtaccca gatgttcaga atgaatgtcg gccctgccat 2040gagaactgca cccaggggtg taaaggacca gagcttcaag actgtttagg acaaacactg 2100gtgctgatcg gcaaaaccca tctgacaatg gctttgacag tgatagcagg attggtagtg 2160attttcatga tgctgggcgg cacttttctc tactggcgtg ggcgccggat tcagaataaa 2220agggctatga ggcgatactt ggaacggggt gagagcatag agcctctgga ccccagtgag 2280aaggctaaca aagtcttggc cagaatcttc aaagagacag agctaaggaa gcttaaagtg 2340cttggctcgg gtgtctttgg aactgtgcac aaaggagtgt ggatccctga 
gggtgaatca 2400atcaagattc cagtctgcat taaagtcatt gaggacaaga gtggacggca gagttttcaa 2460gctgtgacag atcatatgct ggccattggc agcctggacc atgcccacat tgtaaggctg 2520ctgggactat gcccagggtc atctctgcag cttgtcactc aatatttgcc tctgggttct 2580ctgctggatc atgtgagaca acaccggggg gcactggggc cacagctgct gctcaactgg 2640ggagtacaaa ttgccaaggg aatgtactac cttgaggaac atggtatggt gcatagaaac 2700ctggctgccc gaaacgtgct actcaagtca cccagtcagg ttcaggtggc agattttggt 2760gtggctgacc tgctgcctcc tgatgataag cagctgctat acagtgaggc caagactcca 2820attaagtgga tggcccttga gagtatccac tttgggaaat acacacacca gagtgatgtc 2880tggagctatg gtgtgacagt ttgggagttg atgaccttcg gggcagagcc ctatgcaggg 2940ctacgattgg ctgaagtacc agacctgcta gagaaggggg agcggttggc acagccccag 3000atctgcacaa ttgatgtcta catggtgatg gtcaagtgtt ggatgattga tgagaacatt 3060cgcccaacct ttaaagaact agccaatgag ttcaccagga tggcccgaga cccaccacgg 3120tatctggtca taaagagaga gagtgggcct ggaatagccc ctgggccaga gccccatggt 3180ctgacaaaca agaagctaga ggaagtagag ctggagccag aactagacct agacctagac 3240ttggaagcag aggaggacaa cctggcaacc accacactgg gctccgccct cagcctacca 3300gttggaacac ttaatcggcc acgtgggagc cagagccttt taagtccatc atctggatac 3360atgcccatga accagggtaa tcttgggggg tcttgccagg agtctgcagt ttctgggagc 3420agtgaacggt gcccccgtcc agtctctcta cacccaatgc cacggggatg cctggcatca 3480gagtcatcag aggggcatgt aacaggctct gaggctgagc tccaggagaa agtgtcaatg 3540tgtagaagcc ggagcaggag ccggagccca cggccacgcg gagatagcgc ctaccattcc 3600cagcgccaca gtctgctgac tcctgttacc ccactctccc cacccgggtt agaggaagag 3660gatgtcaacg gttatgtcat gccagataca cacctcaaag gtactccctc ctcccgggaa 3720ggcacccttt cttcagtggg tctcagttct gtcctgggta ctgaagaaga agatgaagat 3780gaggagtatg aatacatgaa ccggaggaga aggcacagtc cacctcatcc ccctaggcca 3840agttcccttg aggagctggg ttatgagtac atggatgtgg ggtcagacct cagtgcctct 3900ctgggcagca cacagagttg cccactccac cctgtaccca tcatgcccac tgcaggcaca 3960actccagatg aagactatga atatatgaat cggcaacgag atggaggtgg tcctgggggt 4020gattatgcag ccatgggggc ctgcccagca tctgagcaag ggtatgaaga gatgagagct 4080tttcaggggc ctggacatca 
ggccccccat gtccattatg cccgcctaaa aactctacgt 4140agcttagagg ctacagactc tgcctttgat aaccctgatt actggcatag caggcttttc 4200cccaaggcta atgcccagag aacgtaactc ctgctccctg tggcactcag ggagcattta 4260atggcagcta gtgcctttag agggtaccgt cttctcccta ttccctctct ctcccaggtc 4320ccagcccctt ttccccagtc ccagacaatt ccattcaatc tttggaggct tttaaacatt 4380ttgacacaaa attcttatgg tatgtagcca gctgtgcact ttcttctctt tcccaacccc 4440aggaaaggtt ttccttattt tgtgtgcttt cccagtccca ttcctcagct tcttcacagg 4500cactcctgga gatatgaagg attactctcc atatcccttc ctctcaggct cttgactact 4560tggaactagg ctcttatgtg tgcctttgtt tcccatcaga ctgtcaagaa gaggaaaggg 4620aggaaaccta gcagaggaaa gtgtaatttt ggtttatgac tcttaacccc ctagaaagac 4680agaagcttaa aatctgtgaa gaaagaggtt aggagtagat attgattact atcataattc 4740agcacttaac tatgagccag gcatcatact aaacttcacc tacattatct cacttagtcc 4800tttatcatcc ttaaaacaat tctgtgacat acatattatc tcattttaca caaagggaag 4860tcgggcatgg tggctcatgc ctgtaatctc agcactttgg gaggctgagg cagaaggatt 4920acctgaggca aggagtttga gaccagctta gccaacatag taagaccccc atctc 49753564627DNAHomo sapiens 356tcacttgcct gatatttcca gtgtcagagg gacacagcca acgtggggtc ccttctaggc 60tgacagccgc tctccagcca ctgccgcgag cccgtctgct cccgccctgc ccgtgcactc 120tccgcagccg ccctccgcca agccccagcg cccgctccca tcgccgatga ccgcggggag 180gaggatggag atgctctgtg ccggcagggt ccctgcgctg ctgctctgcc tgggtttcca 240tcttctacag gcagtcctca gtacaactgt gattccatca tgtatcccag gagagtccag 300tgataactgc acagctttag ttcagacaga agacaatcca cgtgtggctc aagtgtcaat 360aacaaagtgt agctctgaca tgaatggcta ttgtttgcat ggacagtgca tctatctggt 420ggacatgagt caaaactact gcaggtgtga agtgggttat actggtgtcc gatgtgaaca 480cttcttttta accgtccacc aacctttaag caaagagtat gtggctttga ccgtgattct 540tattattttg tttcttatca cagtcgtcgg ttccacatat tatttctgca gatggtacag 600aaatcgaaaa agtaaagaac caaagaagga atatgagaga gttacctcag gggatccaga 660gttgccgcaa gtctgaatgg cgccatcaaa cttatgggca gggataacag tgtgcctggt 720taatattaat attccatttt attaataata tttatgttgg gtcaagtgtt aggtcaataa 780cactgtattt taatgtactt gaaaaatgtt tttatttttg ttttattttt 
gacagactat 840ttgctaatgt ataatgtgca gaaaatattt aatatcaaaa gaaaattgat atttttatac 900aagtaatttc ctgagctaaa tgcttcattg aaagcttcaa agtttatatg cctggtgcac 960agtgcttaga agtaagcaat tcccaggtca tagctcaaga attgttagca aatgacagat 1020ttctgtaagc ctatatatat agtcaaatcg atttagtaag tatgtttttt atgttcctca 1080aatcagtgat aattggtttg actgtaccat ggtttgatat gtagttggca ccatggtatc 1140atatattaaa acaataatgc aattagaatt tgggagaagc aaatataggt cctgtgttaa 1200acactacaca tttgaaacaa gctaaccctg gggagtctat ggtctcttca ctcaggtctc 1260agctataatt ctgttatatg aggggcagtg gacagttccc tatgccaact cacgactcct 1320acaggtacta gtcactcatc taccagattc tgcctatgta aaatgaattg aaaaacaatt 1380ttctgtaatc ttttatttaa gtagtgggca tttcatagct tcacaatgtt ccttttttgt 1440atattacaac atttatgtga ggtaattatt gctcaacaga caattagaaa aaagtccaca 1500cttgaagcct aaatttgtgc tttttaagaa tatttttaga ctatttcttt ttataggggc 1560tttgctgaat tctaacatta aatcacagcc caaaatttga tggactaatt attattttaa 1620aatatatgaa gacaataatt ctacatgttg tcttaagatg gaaatacagt tatttcatct 1680tttattcaag gaagttttaa ctttaataca gctcagtaaa tggcttcttc tagaatgtaa 1740agttatgtat ttaaagttgt atcttgacac aggaaatggg aaaaaactta aaaattaata 1800tggtgtattt ttccaaatga aaaatctcaa ttgaaagctt ttaaaatgta gaaacttaaa 1860cacaccttcc tgtggaggct gagatgaaaa ctagggctca ttttcctgac atttgtttat 1920tttttggaag agacaaagat ttcttctgca ctctgagccc ataggtctca gagagttaat 1980aggagtattt ttgggctatt gcataaggag ccactgctgc caccactttt ggattttatg 2040ggaggctcct tcatcgaatg ctaaaccttt gagtagagtc tccctggatc acataccagg 2100tcagggagga tctgttcttc ctctacgttt atcctggcat gtgctagggt aaacgaaggc 2160ataataagcc atggctgacc tctggagcac caggtgccag gacttgtctc catgtgtatc 2220catgcattat ataccctggt gcaatcacac gactgtcatc taaagtcctg gccctggccc 2280ttactattag gaaaataaac agacaaaaac aagtaaatat atatggtcct atacatattg 2340tatatatatt catatacaaa catgtatgta tacatgacct taatggatca tagaattgca 2400gtcatttggt gctctgctaa ccatttatat aaaacttaaa aacaagagaa aagaaaaatc 2460aattagatct aaacagttat ttctgtttcc tatttaatat agctgaagtc aaaatatgta 2520agaacacatt ttaaatactc 
tacttacagt tggccctctg tggttagttc cacatctgtg 2580gattcaacca accaaggacg gaaaatgctt aaaaaataat acaacaacaa caaaaaatac 2640attataacaa ctatttactt tttttttttt ctttttgaga tggagtctcg ctctgttgcc 2700caggttggag tgcagtggca cgatctcggc tcactgcaac ctcacctccc gggttcaaga 2760gatcctcctg cctcagcctc ctgagcagct gggactacag gcgcatgcca ccatgcccag 2820ctaatttttg tatttttagt agaggcgggg tttcaccatg ttggccagga tggtctcaat 2880ctcctaacct tgagatccac cctccacagc ctcccaaact gctgggatta caggcgtgag 2940ccaccgcacg tagcatttac attaggtatt acaagtaatg taaagatgat ttaagtatac 3000aggaggatgt gaataggtta tatgcaagca ctatgccctt ttatataagt gacttgaaca 3060tctgtgcccg attttagtat gtgcaggggg gcgatctggg aatcagtccc ctgtggatac 3120caaggtacaa ctgtatttat taacgcttac tagatgtgag gagagtctga atattttcag 3180tgatcttggc tgtttcaaaa aaatctattg acttttcaat aaatcagctg caatccattt 3240atttcattta caaaagattt attgtaagcc tctcaatctt ggtttttcag ttgatcttaa 3300gcatgtcaat tcataaaaac aagtcatttt tgtatttttc atctttaaga atgcttaaaa 3360aagctaatcc ctaaaatagt tagatctttg taaatgcata ttaaataata aagtatgacc 3420cacattactt tttatgggtg aaaataagac aaaaataata gttttagtga ggatggtgct 3480gagtaaacat aaaaactgat ttgctctcag ctgatgtgtc ctgtacacag tgggaagatt 3540ttagttcaca cttagtctaa ctcccccatt ttacagattt ctcactatat atatttctag 3600aaggggctat gcatattcaa tgtattgaga accaaagcaa ccacaaatgc ataaatgcat 3660aatttatggt cttcaaccaa ggccacataa taacccagtt aacttactct ttaaccagga 3720atattaagtt ctataactag tactcaaggt ttaaccttaa aattaagatt tccttaacct 3780taaccttaaa attgatatta tattaaacat acataataca atgtaactcc actgttctcc 3840tgaatatttt ttgctctaat ctctctgccg aaagtcaaag tgatgggaga attggtatac 3900tggtatgact acgtcttaag tcagattttt atttatgagt ctttgagact aaattcaatc 3960accaccaggt atcaaatcaa cttttatgca gcaaatatat gattctagtg tctgactttt 4020gttaaattca gtaatgcagt ttttaaaaac ctgtatctga cccactttgt aatttttgct 4080ccaatatcca ttctgtagac ttttgaaaaa aaagttttta atttgatgcc caatatattc 4140tgaccgttaa aaaattcttg ttcatatggg agaaggggga gtaatgactt gtacaaacag 4200tatttctggt gtatatttta atgtttttaa aaagagtaat ttcatttaaa 
tatctgttat 4260tcaaatttga tgatgttaaa tgtaatataa tgtattttct ttttattttg cactctgtaa 4320ttgcactttt taagtttgaa gagccatttt ggtaaacggt ttttattaaa gatgctatgg 4380aacataaagt tgtattgcat gcaatttaaa gtaacttatt tgactatgaa tattatcgga 4440ttactgaatt gtatcaattt gtttgtgttc aatatcagct ttgataattg tgtaccttaa 4500gatattgaag gagaaaatag ataatttaca agatattatt aatttttatt tatttttctt 4560gggaattgaa aaaaattgaa ataaataaaa atgcattgaa catcttgcat tcaaaatctt 4620cactgac 46273572634DNAHomo sapiens 357ggcacgaggc tgagtgtccg tctcgcgccc ggaagcgggc gaccgccgtc agcccggagg 60aggaggagga ggaggaggag gagggggcgg ccatggggct gctgtcccag ggctcgccgc 120tgagctggga ggaaaccaag cgccatgccg accacgtgcg gcggcacggg atcctccagt 180tcctgcacat ctaccacgcc gtcaaggacc ggcacaagga cgttctcaag tggggcgatg 240aggtggaata catgttggta tcttttgatc atgaaaataa aaaagtccgg ttggtcctgt 300ctggggagaa agttcttgaa actctgcaag agaaggggga aaggacaaac ccaaaccatc 360ctaccctttg gagaccagag tatgggagtt acatgattga agggacacca ggacagccct 420acggaggaac aatgtccgag ttcaatacag ttgaggccaa catgcgaaaa cgccggaagg 480aggctacttc tatattagaa gaaaatcagg ctctttgcac aataacttca tttcccagat 540taggctgtcc tgggttcaca ctgcccgagg tcaaacccaa cccagtggaa ggaggagctt 600ccaagtccct cttctttcca gatgaagcaa taaacaagca ccctcgcttc agtaccttaa 660caagaaatat ccgacatagg agaggagaaa aggttgtcat caatgtacca atatttaagg 720acaagaatac accatctcca tttatagaaa catttactga ggatgatgaa gcttcaaggg 780cttctaagcc ggatcatatt tacatggatg ccatgggatt tggaatgggc aattgctgtc 840tccaggtgac attccaagcc tgcagtatat ctgaggccag atacctttat gatcagttgg 900ctactatctg tccaattgtt atggctttga gtgctgcatc tcccttttac cgaggctatg 960tgtcagacat tgattgtcgc tggggagtga tttctgcatc tgtagatgat agaactcggg 1020aggagcgagg actggagcca ttgaagaaca ataactatag gatcagtaaa tcccgatatg 1080actcaataga cagctattta tctaagtgtg gtgagaaata taatgacatc gacttgacga 1140tagataaaga gatctacgaa cagctgttgc aggaaggcat tgatcatctc ctggcccagc 1200atgttgctca tctctttatt agagacccac tgacactgtt tgaagagaaa atacacctgg 1260atgatgctaa tgagtctgac cattttgaga atattcagtc cacaaattgg cagacaatga 
1320gatttaagcc ccctcctcca aactcagaca ttggatggag agtagaattt cgacccatgg 1380aggtgcaatt aacagacttt gagaactctg cctatgtggt gtttgtggta ctgctcacca 1440gagtgatcct ttcctacaaa ttggattttc tcattccact gtcaaaggtt gatgagaaca 1500tgaaggtagc acagaaaaga gatgctgtct tgcagggaat gttttatttc aggaaagata 1560tttgcaaagg tggcaatgca gtggtggatg gttgtggcaa ggcccagaac agcacggagc 1620tcgctgcaga ggagtacacc ctcatgagca tagacaccat catcaatggg aaggaaggtg 1680tgtttcctgg actgatccca attctgaact cttaccttga aaacatggaa gtggatgtgg 1740acaccagatg tagtattctg aactacctaa agctaattaa gaagagagca tctggagaac 1800taatgacagt tgccagatgg atgagggagt ttatcgcaaa ccatcctgac tacaagcaag 1860acagtgtcat aactgatgaa atgaattata gccttatttt gaagtgtaac caaattgcaa 1920atgaattatg tgaatgccca gagttacttg gatcagcatt taggaaagta aaatatagtg 1980gaagtaaaac tgactcatcc aactagacat tctacagaaa gaaaaatgca ttattgacga 2040actggctaca gtaccatgcc tctcagcccg tgtgtataat atgaagacca aatgatagaa 2100ctgtactgtt ttctgggcca gtgagccaga aattgattaa ggctttcttt ggtaggtaaa 2160tctagagttt atacagtgta catgtacata gtaaagtatt tttgattaac aatgtatttt 2220aataacatat ctaaagtcat catgaactgg cttgtacatt tttaaattct tactctggag 2280caacctactg tctaagcagt tttgtaaatg tactggtaat tgtacaatac ttgcattcca 2340gagttaaaat gtttactgta aatttttgtt cttttaaaga ctacctggga cctgatttat 2400tgaaattttt ctctttaaaa acattttctc tcgttaattt tcctttgtca tttcctttgt 2460tgtctacatt aaatcacttg aatccattga aagtgcttca agggtaatct tgggtttcta 2520gcaccttatc tatgatgttt cttttgcaat tggaataatc acttggtcac cttgccccaa 2580gctttcccct ctgaataaat acccattgaa ctctgaaaaa aaaaaaaaaa aaaa 26343581246DNAHomo sapiens 358gaccagccta cagccgcctg catctgtatc cagcgccagg tcccgccagt cccagctgcg 60cgcgcccccc agtcccgcac ccgttcggcc caggctaagt tagccctcac catgccggtc 120aaaggaggca ccaagtgcat caaatacctg ctgttcggat ttaacttcat cttctggctt 180gccgggattg ctgtccttgc cattggacta tggctccgat tcgactctca gaccaagagc 240atcttcgagc aagaaactaa taataataat tccagcttct acacaggagt ctatattctg 300atcggagccg gcgccctcat gatgctggtg ggcttcctgg gctgctgcgg ggctgtgcag 360gagtcccagt gcatgctggg 
actgttcttc ggcttcctct tggtgatatt cgccattgaa 420atagctgcgg ccatctgggg atattcccac aaggatgagg tgattaagga agtccaggag 480ttttacaagg acacctacaa caagctgaaa accaaggatg agccccagcg ggaaacgctg 540aaagccatcc actatgcgtt gaactgctgt ggtttggctg ggggcgtgga acagtttatc 600tcagacatct gccccaagaa ggacgtactc gaaaccttca ccgtgaagtc ctgtcctgat 660gccatcaaag aggtcttcga caataaattc cacatcatcg gcgcagtggg catcggcatt 720gccgtggtca tgatatttgg catgatcttc agtatgatct tgtgctgtgc tatccgcagg 780aaccgcgaga tggtctagag tcagcttaca tccctgagca ggaaagttta cccatgaaga 840ttggtgggat tttttgtttg tttgttttgt tttgtttgtt gtttgttgtt tgtttttttg 900ccactaattt tagtattcat tctgcattgc tagataaaag ctgaagttac tttatgtttg 960tcttttaatg cttcattcaa tattgacatt tgtagttgag cggggggttt ggtttgcttt 1020ggtttatatt ttttcagttg tttgtttttg cttgttatat taagcagaaa tcctgcaatg 1080aaaggtacta tatttgctag actctagaca agatattgta cataaaagaa tttttttgtc 1140tttaaataga tacaaatgtc tatcaacttt aatcaagttg taacttatat tgaagacaat 1200ttgatacata ataaaaaatt atgacaatgt caaaaaaaaa aaaaaa 12463592360DNAHomo sapiens 359gctacgcggg ccacgctgct ggctggcctg acctaggcgc gcggggtcgg gcggccgcgc 60gggcgggctg agtgagcaag acaagacact caagaagagc gagctgcgcc tgggtcccgg 120ccaggcttgc acgcagaggc gggcggcaga cggtgcccgg cggaatctcc tgagctccgc 180cgcccagctc tggtgccagc gcccagtggc cgccgcttcg aaagtgactg gtgcctcgcc 240gcctcctctc ggtgcgggac catgaagctg ctgccgtcgg tggtgctgaa gctctttctg 300gctgcagttc tctcggcact ggtgactggc gagagcctgg
agcggcttcg gagagggcta 360gctgctggaa ccagcaaccc ggaccctccc actgtatcca cggaccagct gctaccccta 420ggaggcggcc gggaccggaa agtccgtgac ttgcaagagg cagatctgga ccttttgaga 480gtcactttat cctccaagcc acaagcactg gccacaccaa acaaggagga gcacgggaaa 540agaaagaaga aaggcaaggg gctagggaag aagagggacc catgtcttcg gaaatacaag 600gacttctgca tccatggaga atgcaaatat gtgaaggagc tccgggctcc ctcctgcatc 660tgccacccgg gttaccatgg agagaggtgt catgggctga gcctcccagt ggaaaatcgc 720ttatatacct atgaccacac aaccatcctg gccgtggtgg ctgtggtgct gtcatctgtc 780tgtctgctgg tcatcgtggg gcttctcatg tttaggtacc ataggagagg aggttatgat 840gtggaaaatg aagagaaagt gaagttgggc atgactaatt cccactgaga gagacttgtg 900ctcaaggaat cggctgggga ctgctacctc tgagaagaca caaggtgatt tcagactgca 960gaggggaaag acttccatct agtcacaaag actccttcgt ccccagttgc cgtctaggat 1020tgggcctccc ataattgctt tgccaaaata ccagagcctt caagtgccaa acagagtatg 1080tccgatggta tctgggtaag aagaaagcaa aagcaaggga ccttcatgcc cttctgattc 1140ccctccacca aaccccactt cccctcataa gtttgtttaa acacttatct tctggattag 1200aatgccggtt aaattccata tgctccagga tctttgactg aaaaaaaaaa agaagaagaa 1260gaaggagagc aagaaggaaa gatttgtgaa ctggaagaaa gcaacaaaga ttgagaagcc 1320atgtactcaa gtaccaccaa gggatctgcc attgggaccc tccagtgctg gatttgatga 1380gttaactgtg aaataccaca agcctgagaa ctgaattttg ggacttctac ccagatggaa 1440aaataacaac tatttttgtt gttgttgttt gtaaatgcct cttaaattat atatttattt 1500tattctatgt atgttaattt atttagtttt taacaatcta acaataatat ttcaagtgcc 1560tagactgtta ctttggcaat ttcctggccc tccactcctc atccccacaa tctggcttag 1620tgccacccac ctttgccaca aagctaggat ggttctgtga cccatctgta gtaatttatt 1680gtctgtctac atttctgcag atcttccgtg gtcagagtgc cactgcggga gctctgtatg 1740gtcaggatgt aggggttaac ttggtcagag ccactctatg agttggactt cagtcttgcc 1800taggcgattt tgtctaccat ttgtgttttg aaagcccaag gtgctgatgt caaagtgtaa 1860cagatatcag tgtctccccg tgtcctctcc ctgccaagtc tcagaagagg ttgggcttcc 1920atgcctgtag ctttcctggt ccctcacccc catggcccca ggccacagcg tgggaactca 1980ctttcccttg tgtcaagaca tttctctaac tcctgccatt cttctggtgc tactccatgc 2040aggggtcagt gcagcagagg 
acagtctgga gaaggtatta gcaaagcaaa aggctgagaa 2100ggaacaggga acattggagc tgactgttct tggtaactga ttacctgcca attgctaccg 2160agaaggttgg aggtggggaa ggctttgtat aatcccaccc acctcaccaa aacgatgaag 2220gtatgctgtc atggtccttt ctggaagttt ctggtgccat ttctgaactg ttacaacttg 2280tatttccaaa cctggttcat atttatactt tgcaatccaa ataaagataa cccttattcc 2340ataaaaaaaa aaaaaaaaaa 23603601433DNAHomo sapiens 360attcggggcg agggaggagg aagaagcgga ggaggcggct cccgctcgca gggccgtgca 60cctgcccgcc cgcccgctcg ctcgctcgcc cgccgcgccg cgctgccgac cgccagcatg 120ctgccgagag tgggctgccc cgcgctgccg ctgccgccgc cgccgctgct gccgctgctg 180ccgctgctgc tgctgctact gggcgcgagt ggcggcggcg gcggggcgcg cgcggaggtg 240ctgttccgct gcccgccctg cacacccgag cgcctggccg cctgcgggcc cccgccggtt 300gcgccgcccg ccgcggtggc cgcagtggcc ggaggcgccc gcatgccatg cgcggagctc 360gtccgggagc cgggctgcgg ctgctgctcg gtgtgcgccc ggctggaggg cgaggcgtgc 420ggcgtctaca ccccgcgctg cggccagggg ctgcgctgct atccccaccc gggctccgag 480ctgcccctgc aggcgctggt catgggcgag ggcacttgtg agaagcgccg ggacgccgag 540tatggcgcca gcccggagca ggttgcagac aatggcgatg accactcaga aggaggcctg 600gtggagaacc acgtggacag caccatgaac atgttgggcg ggggaggcag tgctggccgg 660aagcccctca agtcgggtat gaaggagctg gccgtgttcc gggagaaggt cactgagcag 720caccggcaga tgggcaaggg tggcaagcat caccttggcc tggaggagcc caagaagctg 780cgaccacccc ctgccaggac tccctgccaa caggaactgg accaggtcct ggagcggatc 840tccaccatgc gccttccgga tgagcggggc cctctggagc acctctactc cctgcacatc 900cccaactgtg acaagcatgg cctgtacaac ctcaaacagt gcaagatgtc tctgaacggg 960cagcgtgggg agtgctggtg tgtgaacccc aacaccggga agctgatcca gggagccccc 1020accatccggg gggaccccga gtgtcatctc ttctacaatg agcagcagga ggcttgcggg 1080gtgcacaccc agcggatgca gtagaccgca gccagccggt gcctggcgcc cctgcccccc 1140gcccctctcc aaacaccggc agaaaacgga gagtgcttgg gtggtgggtg ctggaggatt 1200ttccagttct gacacacgta tttatatttg gaaagagacc agcaccgagc tcggcacctc 1260cccggcctct ctcttcccag ctgcagatgc cacacctgct ccttcttgct ttccccgggg 1320gaggaagggg gttgtggtcg gggagctggg gtacaggttt ggggaggggg aagagaaatt 1380tttatttttg aacccctgtg 
tcccttttgc ataagattaa aggaaggaaa agt 14333611632DNAHomo sapiens 361gccggccgaa cccagacccg aggttttaga agcagagtca ggcgaagctg ggccagaacc 60gcgacctccg caaccttgag cggcatccgt ggagtgcgcc tgcgcagcta cgaccgcagc 120aggaaagcgc cgccggccag gcccagctgt ggccggacag ggactggaag agaggacgcg 180gtcgagtagg tgtgcaccag ccctggcaac gagagcgtct accccgaact ctgctggcct 240tgaggtgggg aagccgggga gggcagttga ggaccccgcg gaggcgcgtg actggttgag 300cgggcaggcc agcctccgag ccgggtggac acaggtttta aaacatgaat cctacactca 360tccttgctgc cttttgcctg ggaattgcct cagctactct aacatttgat cacagtttag 420aggcacagtg gaccaagtgg aaggcgatgc acaacagatt atacggcatg aatgaagaag 480gatggaggag agcagtgtgg gagaagaaca tgaagatgat tgaactgcac aatcaggaat 540acagggaagg gaaacacagc ttcacaatgg ccatgaacgc ctttggagac atgaccagtg 600aagaattcag gcaggtgatg aatggctttc aaaaccgtaa gcccaggaag gggaaagtgt 660tccaggaacc tctgttttat gaggccccca gatctgtgga ttggagagag aaaggctacg 720tgactcctgt gaagaatcag ggtcagtgtg gttcttgttg ggcttttagt gctactggtg 780ctcttgaagg acagatgttc cggaaaactg ggaggcttat ctcactgagt gagcagaatc 840tggtagactg ctctgggcct caaggcaatg aaggctgcaa tggtggccta atggattatg 900ctttccagta tgttcaggat aatggaggcc tggactctga ggaatcctat ccatatgagg 960caacagaaga atcctgtaag tacaatccca agtattctgt tgctaatgac accggctttg 1020tggacatccc taagcaggag aaggccctga tgaaggcagt tgcaactgtg gggcccattt 1080ctgttgctat tgatgcaggt catgagtcct tcctgttcta taaagaaggc atttattttg 1140agccagactg tagcagtgaa gacatggatc atggtgtgct ggtggttggc tacggatttg 1200aaagcacaga atcagataac aataaatatt ggctggtgaa gaacagctgg ggtgaagaat 1260ggggcatggg tggctacgta aagatggcca aagaccggag aaaccattgt ggaattgcct 1320cagcagccag ctaccccact gtgtgagctg gtggacggtg atgaggaagg acttgactgg 1380ggatggcgca tgcatgggag gaattcatct tcagtctacc agcccccgct gtgtcggata 1440cacactcgaa tcattgaaga tccgagtgtg atttgaattc tgtgatattt tcacactggt 1500aaatgttacc tctattttaa ttactgctat aaataggttt atattattga ttcacttact 1560gactttgcat tttcgttttt aaaaggatgt ataaattttt acctgtttaa ataaaattta 1620atttcaaatg ta 16323622756DNAHomo sapiens 362atgctgtcct 
tccagtaccc cgacgtgtac cgcgacgaga ccgccgtaca ggattatcat 60ggtcataaaa tttgtgaccc ttacgcctgg cttgaagacc ccgacagtga acagactaag 120gcctttgtgg aggcccagaa taagattact gtgccatttc ttgagcagtg tcccatcaga 180ggtttataca aagagagaat gactgaacta tatgattatc ccaagtatag ttgccacttc 240aagaaaggaa aacggtattt ttatttttac aatacaggtt tgcagaacca gcgagtatta 300tatgtacagg attccttaga gggtgaggcc agagtgttcc tggaccccaa catactgtct 360gacgatggca cagtggcact ccgaggttat gcgttcagcg aagatggtga atattttgcc 420tatggtctga gtgccagtgg ctcagactgg gtgacaatca agttcatgaa agttgatggt 480gccaaagagc ttccagatgt gcttgaaaga gtcaagttca gctgtatggc ctggacccat 540gatgggaagg gaatgttcta caactcatac cctcaacagg atggaaaaag tgatggcaca 600gagacatcta ccaatctcca ccaaaagctc tactaccatg tcttgggaac cgatcagtca 660gaagatattt tgtgtgctga gtttcctgat gaacctaaat ggatgggtgg agctgagtta 720tctgatgatg gccgctatgt cttgttatca ataagggaag gatgtgatcc agtaaaccga 780ctctggtact gtgacctaca gcaggaatcc agtggcatcg cgggaatcct gaagtgggta 840aaactgattg acaactttga aggggaatat gactacgtga ccaatgaggg ggcggtgttc 900acattcaaga cgaatcgcca gtctcccaac tatcgcgtga tcaacattga cttcagggat 960cctgaagagt ctaagtggaa agtacttgtt cctgagcatg agaaagatgt cttagaatgg 1020atagcttgtg tcaggtccaa cttcttggtc ttatgctacc tccatgacgt caagaacatt 1080ctgcagctcc atgacctgac tactggtgct ctccttaaga ccttcccgct cgatgtcggc 1140agcattgtag ggtacagcgg tcagaagaag gacactgaaa tcttctatca gtttacttcc 1200tttttatctc caggtatcat ttatcactgt gatcttacca aagaggagct ggagccaaga 1260gttttccgag aggtgaccgt aaaaggaatt gatgcttctg attaccagac agtccagatt 1320ttctacccta gcaaggatgg tacgaagatt ccaatgttca ttgtgcataa aaaaagcata 1380aaattggatg gctctcatcc agctttctta tatggctatg gcggcttcaa catatccatc 1440acacccaact acagtgtttc caggcttatt tttgtgagac acatgggtgg tatcctggca 1500gtggccaaca tcagaggagg tggcgaatat ggagagacgt ggcataaagg tggtatcttg 1560gccaacaaac aaaactgctt tgatgacttt cagtgtgctg ctgagtatct gatcaaggaa 1620ggttacacat ctcccaagag gctgactatt aatggaggtt caaatggagg cctcttagtg 1680gctgcttgtg caaatcagag acctgacctc tttggttgtg ttattgccca agttggagta 
1740atggacatgc tgaagtttca taaatatacc atcggccatg cttggaccac tgattatggg 1800tgctcggaca gcaaacaaca ctttgaatgg cttgtcaaat actctccatt gcataatgtg 1860aagttaccag aagcagatga catccagtac ccgtccatgc tgctcctcac tgctgaccat 1920gatgaccgcg tggtcccgct tcactccctg aagttcattg ccacccttca gtacatcgtg 1980ggccgcagca ggaagcaaag caaccccctg cttatccacg tggacaccaa ggcgggccac 2040ggggcgggga agcccacagc caaagtgata gaggaagtct cagacatgtt tgcgttcatc 2100gcgcggtgcc tgaacgtcga ctggattcca taaacagttt tcgtgcttcc tcctgacagc 2160gacagaaaac ctcaagggct ttcccacgtt gacaccaaga aaccactggg cataatgctt 2220ccccacggga acattattcc tggactgaca ggctacagtt gaacagaact gccgtgggaa 2280ttttatcttt tttaggcttc tcctttttag caaggccttg gtgtttcttt ttccaccctg 2340tctaggcaca tgtggttttt tggtgttttt tttaagggca tgttgggata aatagctaaa 2400tggcaacaaa cacattgtga atattagatt gctgaattaa ggatcatagt cgggcatact 2460tatctatatc cataacctct atatctttaa ataaatgtga gaactgttct catggagaag 2520acttctttgc aacaataata aatgttattt aagaatgaca gggatttact tccggtttct 2580tcatattgag gggcaactcc agaagtggag ttttctgtga gaataaagca tttcaccttt 2640ctgcaacaag ttagttttca agcagttaag tcatagaatg tttgttagct gtgaaaataa 2700gttgttcatc caaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaag gaattc 27563632768DNAHomo sapiens 363cactgctgtg cagggcagga aagctccatg cacatagccc agcaaagagc aacacagagc 60tgaaaggaag actcagagga gagagataag taaggaaagt agtgatggct ctcatcccag 120acttggccat ggaaacctgg cttctcctgg ctgtcagcct ggtgctcctc tatctatatg 180gaacccattc acatggactt tttaagaagc ttggaattcc agggcccaca cctctgcctt 240ttttgggaaa tattttgtcc taccataagg gcttttgtat gtttgacatg gaatgtcata 300aaaagtatgg aaaagtgtgg ggcttttatg atggtcaaca gcctgtgctg gctatcacag 360atcctgacat gatcaaaaca gtgctagtga aagaatgtta ttctgtcttc acaaaccgga 420ggccttttgg tccagtggga tttatgaaaa gtgccatctc tatagctgag gatgaagaat 480ggaagagatt acgatcattg ctgtctccaa ccttcaccag tggaaaactc aaggagatgg 540tccctatcat tgcccagtat ggagatgtgt tggtgagaaa tctgaggcgg gaagcagaga 600caggcaagcc tgtcaccttg aaagacgtct ttggggccta cagcatggat gtgatcacta 660gcacatcatt tggagtgaac 
atcgactctc tcaacaatcc acaagacccc tttgtggaaa 720acaccaagaa gcttttaaga tttgattttt tggatccatt ctttctctca ataacagtct 780ttccattcct catcccaatt cttgaagtat taaatatctg tgtgtttcca agagaagtta 840caaatttttt aagaaaatct gtaaaaagga tgaaagaaag tcgcctcgaa gatacacaaa 900agcaccgagt ggatttcctt cagctgatga ttgactctca gaattcaaaa gaaactgagt 960cccacaaagc tctgtccgat ctggagctcg tggcccaatc aattatcttt atttttgctg 1020gctatgaaac cacgagcagt gttctctcct tcattatgta tgaactggcc actcaccctg 1080atgtccagca gaaactgcag gaggaaattg atgcagtttt acccaataag gcaccaccca 1140cctatgatac tgtgctacag atggagtatc ttgacatggt ggtgaatgaa acgctcagat 1200tattcccaat tgctatgaga cttgagaggg tctgcaaaaa agatgttgag atcaatggga 1260tgttcattcc caaaggggtg gtggtgatga ttccaagcta tgctcttcac cgtgacccaa 1320agtactggac agagcctgag aagttcctcc ctgaaagatt cagcaagaag aacaaggaca 1380acatagatcc ttacatatac acaccctttg gaagtggacc cagaaactgc attggcatga 1440ggtttgctct catgaacatg aaacttgctc taatcagagt ccttcagaac ttctccttca 1500aaccttgtaa agaaacacag atccccctga aattaagctt aggaggactt cttcaaccag 1560aaaaacccgt tgttctaaag gttgagtcaa gggatggcac cgtaagtgga gcctgaattt 1620tcctaaggac ttctgctttg ctcttcaaga aatctgtgcc tgagaacacc agagacctca 1680aattactttg tgaatagaac tctgaaatga agatgggctt catccaatgg actgcataaa 1740taaccgggga ttctgtacat gcattgagct ctctcattgt ctgtgtagag tgttatactt 1800gggaatataa aggaggtgac caaatcagtg tgaggaggta gatttggctc ctctgcttct 1860cacgggacta tttccaccac ccccagttag caccattaac tcctcctgag ctctgataag 1920agaatcaaca tttctcaata atttcctcca caaattatta atgaaaataa gaattatttt 1980gatggctcta acaatgacat ttatatcaca tgttttctct ggagtattct ataagtttta 2040tgttaaatca ataaagacca ctttacaaaa gtattatcag atgctttcct gcacattaag 2100gagaaatcta tagaactgaa tgagaaccaa caagtaaata tttttggtca ttgtaatcac 2160tgttggcgtg gggcctttgt cagaactaga atttgattat taacataggt gaaagttaat 2220ccactgtgac tttgcccatt gtttagaaag aatattcata gtttaattat gccttttttg 2280atcaggcaca gtggctcacg cctgtaatcc tagcagtttg ggaggctgag ccgggtggat 2340cgcctgaggt caggagttca agacaagcct ggcctacatg gttgaaaccc catctctact 
2400aaaaatacac aaattagcta ggcatggtgg actcgcctgt aatctcacta cacaggaggc 2460tgaggcagga gaatcacttg aacctgggag gcggatgttg aagtgagctg agattgcacc 2520actgcactcc agtctgggtg agagtgagac tcagtcttaa aaaaatatgc ctttttgaag 2580cacgtacatt ttgtaacaaa gaactgaagc tcttattata ttattagttt tgatttaatg 2640ttttcagccc atctcctttc atatttctgg gagacagaaa acatgtttcc ctacacctct 2700tgcattccat cctcaacacc caactgtctc gatgcaatga acacttaata aaaaacagtc 2760gattggtc 27683642984DNAHomo sapiens 364gaggaggaac agaaaagaaa agaaaagaaa aagtgggaaa caaataatct aagaatgagg 60agaaagcaag aagagtgacc cccttgtggg cactccattg gttttatggc gcctctactt 120tctggagttt gtgtaaaaca aaaatattat ggtctttgtg cacatttaca tcaagctcag 180cctgggcggc acagccagat gcgagatgcg tctctgctga tctgagtctg cctgcagcat 240ggacctgggt cttccctgaa gcatctccag ggctggaggg acgactgcca tgcaccgagg 300gctcatccat ccacagagca gggcagtggg aggagacgcc atgaccccca tcctcacggt 360cctgatctgt ctcgggctga gtctgggccc ccggacccac gtgcaggcag ggcacctccc 420caagcccacc ctctgggctg aaccaggctc tgtgatcacc caggggagtc ctgtgaccct 480caggtgtcag gggggccagg agacccagga gtaccgtcta tatagagaaa agaaaacagc 540accctggatt acacggatcc cacaggagct tgtgaagaag ggccagttcc ccatcccatc 600catcacctgg gaacatgcag ggcggtatcg ctgttactat ggtagcgaca ctgcaggccg 660ctcagagagc agtgaccccc tggagctggt ggtgacagga gcctacatca aacccaccct 720ctcagcccag cccagccccg tggtgaactc aggagggaat gtaaccctcc agtgtgactc 780acaggtggca tttgatggct tcattctgtg taaggaagga gaagatgaac acccacaatg 840cctgaactcc cagccccatg cccgtgggtc gtcccgcgcc atcttctccg tgggccccgt 900gagcccgagt cgcaggtggt ggtacaggtg ctatgcttat gactcgaact ctccctatga 960gtggtctcta cccagtgatc tcctggagct cctggtccta ggtgtttcta agaagccatc 1020actctcagtg cagccaggtc ctatcgtggc ccctgaggag accctgactc tgcagtgtgg 1080ctctgatgct ggctacaaca gatttgttct gtataaggac ggggaacgtg acttccttca 1140gctcgctggc gcacagcccc aggctgggct ctcccaggcc aacttcaccc tgggccctgt 1200gagccgctcc tacgggggcc agtacagatg ctacggtgca cacaacctct cctccgagtg 1260gtcggccccc agcgaccccc tggacatcct gatcgcagga cagttctatg acagagtctc 1320cctctcggtg 
cagccgggcc ccacggtggc ctcaggagag aacgtgaccc tgctgtgtca 1380gtcacaggga tggatgcaaa ctttccttct gaccaaggag ggggcagctg atgacccatg 1440gcgtctaaga tcaacgtacc aatctcaaaa ataccaggct gaattcccca tgggtcctgt 1500gacctcagcc catgcgggga cctacaggtg ctacggctca cagagctcca aaccctacct 1560gctgactcac cccagtgacc ccctggagct cgtggtctca ggaccgtctg ggggccccag 1620ctccccgaca acaggcccca cctccacatc tggccctgag gaccagcccc tcacccccac 1680cgggtcggat ccccagagtg gtctgggaag gcacctgggg gttgtgatcg gcatcttggt 1740ggccgtcatc ctactgctcc tcctcctcct cctcctcttc ctcatcctcc gacatcgacg 1800tcagggcaaa cactggacat cgacccagag aaaggctgat ttccaacatc ctgcaggggc 1860tgtggggcca gagcccacag acagaggcct gcagtggagg tccagcccag ctgccgatgc 1920ccaggaagaa aacctctatg ctgccgtgaa gcacacacag cctgaggatg gggtggagat 1980ggacactcgg agcccacacg atgaagaccc ccaggcagtg acgtatgccg aggtgaaaca 2040ctccagacct aggagagaaa tggcctctcc tccttcccca ctgtctgggg aattcctgga 2100cacaaaggac agacaggcgg aagaggacag gcagatggac actgaggctg ctgcatctga 2160agccccccag gatgtgacct acgcccagct gcacagcttg acccttagac ggaaggcaac 2220tgagcctcct ccatcccagg aagggccctc tccagctgtg cccagcatct acgccactct 2280ggccatccac tagcccaggg ggggacgcag accccacact ccatggagtc tggaatgcat 2340gggagctgcc cccccagtgg acaccattgg accccaccca gcctggatct accccaggag 2400actctgggaa cttttagggg tcactcaatt ctgcagtata aataactaat gtctctacaa 2460ttttgaaata aagcaacaga cttctcaata atcaatgaag tagctgagaa aactaagtca 2520gaaagtgcat taaactgaat cacaatgtaa atattacaca tcaagcgatg aaactggaaa 2580actacaagcc acgaatgaat gaattaggaa agaaaaaaag taggaaatga atgatcttgg 2640ctttcctata agaaatttag ggcagggcac ggtggctcac gcctgtaatt ccagcacttt 2700gggaggccga ggcgggcaga tcacgagttc aggagatcga gaccatcttg gccaacatgg 2760tgaaaccctg tctctcctaa aaatacaaaa attagctgga tgtggtggca gtgcctgtaa 2820tcccagctat ttgggaggct gaggcaggag aatcgcttga accagggagt cagaggtttc 2880agtgagccaa gatcgcacca ctgctctcca gcctggcgac agagggagac tccatctcaa 2940attaaaaaaa aaaaaaaaaa agaaagaaaa aaaaaaaaaa aaaa 29843653061DNAHomo sapiens 365cggcacgagg cgactttggt ggaggtagtt 
ctttggcagc gggcatggcg ggtaccgtgg 60tgctggacga tgtggagctg cgggaggctc agagagatta cctggacttc ctggacgacg 120aggaagacca gggaatttat cagagcaaag ttcgggagct gatcagtgac aaccaatacc 180ggctgattgt caatgtgaat gacctgcgca ggaaaaacga gaagagggct aaccggcttc 240tgaacaatgc ctttgaggag ctggttgcct tccagcgggc cttaaaggat tttgtggcct 300ccattgatgc tacctatgcc aagcagtatg aggagttcta cgtaggactg gaaggcagct 360ttggctccaa gcacgtctcc ccgcggactc ttacctcctg cttcctcagc tgtgtggtct 420gtgtggaggg cattgtcact aaatgttctc tagttcgtcc caaagtcgtc cgcagtgtcc 480actactgtcc tgctactaag aagaccatag agcgacgtta ttctgatctc accaccctgg 540tggcctttcc ctccagctct gtctatccta ccaaggatga ggagaacaat ccccttgaga 600cagaatatgg cctttctgtc tacaaggatc accagaccat caccatccag gagatgccgg 660agaaggcccc agccggccag ctcccccgct ctgtggacgt cattctggat gatgacttgg 720tggataaagc gaagcctggt gaccgggttc aggtggtggg aacctaccgt tgccttcctg 780gaaagaaggg aggctacacc tctgggacct tcaggactgt cctgattgcc tgtaatgtta 840agcagatgag caaggatgct cagccctctt tctctgctga ggatatagcc aagatcaaga 900agttcagtaa aacccgatcc aaggatatct ttgaccagct ggccaagtca ttggccccaa 960gtatccatgg gcatgactat gtcaagaaag caatcctctg cttgctcttg ggaggggtgg 1020aacgagacct agaaaatggc agccacatcc gtggggacat caatattctt ctaataggag 1080acccatccgt tgccaagtct cagcttctgc ggtatgtgct ttgcactgca ccccgagcta 1140tccccaccac tggccggggc tcctctggag
tgggtctgac ggctgctgtc accacagacc 1200aggaaacagg agagcgccgt ctggaagcag gggccatggt cctggctgac cgaggcgtgg 1260tttgcattga tgaatttgac aaaatgtctg acatggatcg cacagccatc catgaagtga 1320tggagcaggg tcgagtgacc attgccaagg ctggcatcca tgctcggctg aatgcccgct 1380gcagtgtttt ggcagctgcc aaccctgtct acggcaggta tgaccagtat aagactccaa 1440tggagaacat tgggctacag gactcactgc tgtcacgatt tgacttgctc ttcatcatgc 1500tggatcagat ggatcctgag caggatcggg agatctcaga ccatgtcctt cggatgcacc 1560gttacagagc acctggggag caggatggcg atgctatgcc cttgggtagt gctgtggata 1620tcctggccac agatgatccc aactttagcc aggaagatca gcaggacacc cagatttatg 1680agaagcatga caaccttcta catgggacca agaagaaaaa ggagaagatg gtgagtgcag 1740cattcatgaa gaagtacatc catgtggcca aaatcatcaa gcctgtcctg acacaggagt 1800cggccaccta cattgcagaa gagtattcac gcctgcgcag ccaggatagc atgagctcag 1860acaccgccag gacatctcca gttacagccc gaacactgga aactctgatt cgactggcca 1920cagcccatgc gaaggcccgc atgagcaaga ctgtggacct gcaggatgca gaggaagctg 1980tggagttggt ccagtatgct tactttaaga aggttctgga gaaggagaag aaacgtaaga 2040agcgaagtga ggatgaatca gagacagaag atgaagagga gaaaagccaa gaggaccagg 2100agcagaagag gaagagaagg aagactcgcc agccagatgc caaagatggg gattcatacg 2160acccctatga cttcagtgac acagaggagg aaatgcctca agtacacact ccaaagacgg 2220cagactcaca ggagaccaag gaatcccaga aagtggagtt gagtgaatcc aggttgaagg 2280cattcaaggt ggccctcttg gatgtgttcc gggaagctca tgcgcagtca atcggcatga 2340atcgcctcac agaatccatc aaccgggaca gcgaagagcc cttctcttca gttgagatcc 2400aggctgctct gagcaagatg caggatgaca atcaggtcat ggtgtctgag ggcatcatct 2460tcctcatctg aggaggcctc gtctctgaac ttgggttgtg ccgagagagt ttgttctgtg 2520tttcccaccc tctccctgac ccaagtcttt gcctctactc ccttaacagt gttgaattca 2580actgaaggcg aggaatgttg gtgatgaagc tgagttcagg actcggtgga ccctttggga 2640atgggtcatg aaagctgcca tggggtgagg aaagaggaga cagtgggaga ggacaatgac 2700tattgcatct tcattgcaaa agcactggct catccgccct acttcccatc ccacacaaac 2760ccaattgtaa ataacatatg acttctgagt acttttgggg gcacaactgt tttctgtttg 2820ctgttttttt gttttgtttt ttttctccag agcactttgg tctagactag gctttgggtg 
2880gttccaattg gtggagagaa gctctgaggc acgtcatgca ggtcaagaaa gctttctttg 2940cagtagcacc agttaaggtg aatatgtatt gtatcacaaa acaaacccaa tatccagatg 3000aatatccgag atgttgaata aacttagcca tttcgtacaa aaaaaggggg gcccggtaaa 3060c 30613661360DNAHomo sapiens 366cgggggttgc tccgtccgtg ctccgcctcg ccatgacttc ctacagctat cgccagtcgt 60cggccacgtc gtccttcgga ggcctgggcg gcggctccgt gcgttttggg ccgggggtcg 120cttttcgcgc gcccagcatt cacgggggct ccggcggccg cggcgtatcc gtgtcctccg 180cccgctttgt gtcctcgtcc tcctcggggg gctacggcgg cggctacggc ggcgtcctga 240ccgcgtccga cgggctgctg gcgggcaacg agaagctaac catgcagaac ctcaacgacc 300gcctggcctc ctacctggac aaggtgcgcg ccctggaggc ggccaacggc gagctagagg 360tgaagatccg cgactggtac cagaagcagg ggcctgggcc ctcccgcgac tacagccact 420actacacgac catccaggac ctgcgggaca agattcttgg tgccaccatt gagaactcca 480ggattgtcct gcagatcgac aacgcccgtc tggctgcaga tgacttccga accaagtttg 540agacggaaca ggctctgcgc atgagcgtgg aggccgacat caacggcctg cgcagggtgc 600tggatgagct gaccctggcc aggaccgacc tggagatgca gatcgaaggc ctgaaggaag 660agctggccta cctgaagaag aaccatgagg aggaaatcag tacgctgagg ggccaagtgg 720gaggccaggt cagtgtggag gtggattccg ctccgggcac cgatctcgcc aagatcctga 780gtgacatgcg aagccaatat gaggtcatgg ccgagcagaa ccggaaggat gctgaagcct 840ggttcaccag ccggactgaa gaattgaacc gggaggtcgc tggccacacg gagcagctcc 900agatgagcag gtccgaggtt actgacctgc ggcgcaccct tcagggtctt gagattgagc 960tgcagtcaca gctgagcatg aaagctgcct tggaagacac actggcagaa acggaggcgc 1020gctttggagc ccagctggcg catatccagg cgctgatcag cggtattgaa gcccagctgg 1080cggatgtgcg agctgatagt gagcggcaga atcaggagta ccagcggctc atggacatca 1140agtcgcggct ggagcaggag attgccacct accgcagcct gctcgaggga caggaagatc 1200actacaacaa tttgtctgcc tccaaggtcc tctgaggcag caggctctgg ggcttctgct 1260gtcctttgga gggtgtcttc tgggtagagg gatgggaagg aagggaccct tacccccggc 1320tcttctcctg acctgccaat aaaaatttat ggtccaaggg 13603671412DNAHomo sapiens 367cggggtcgtc cgcaaagcct gagtcctgtc ctttctctct ccccggacag catgagcttc 60accactcgct ccaccttctc caccaactac cggtccctgg gctctgtcca ggcgcccagc 120tacggcgccc 
ggccggtcag cagcgcggcc agcgtctatg caggcgctgg gggctctggt 180tcccggatct ccgtgtcccg ctccaccagc ttcaggggcg gcatggggtc cgggggcctg 240gccaccggga tagccggggg tctggcagga atgggaggca tccagaacga gaaggagacc 300atgcaaagcc tgaacgaccg cctggcctct tacctggaca gagtgaggag cctggagacc 360gagaaccgga ggctggagag caaaatccgg gagcacttgg agaagaaggg accccaggtc 420agagactgga gccattactt caagatcatc gaggacctga gggctcagat cttcgcaaat 480actgtggaca atgcccgcat cgttctgcag attgacaatg cccgtcttgc tgctgatgac 540tttagagtca agtatgagac agagctggcc atgcgccagt ctgtggagaa cgacatccat 600gggctccgca aggtcattga tgacaccaat atcacacgac tgcagctgga gacagagatc 660gaggctctca aggaggagct gctcttcatg aagaagaacc acgaagagga agtaaaaggc 720ctacaagccc agattgccag ctctgggttg accgtggagg tagatgcccc caaatctcag 780gacctcgcca agatcatggc agacatccgg gcccaatatg acgagctggc tcggaagaac 840cgagaggagc tagacaagta ctggtctcag cagattgagg agagcaccac agtggtcacc 900acacagtctg ctgaggttgg agctgctgag acgacgctca cagagctgag acgtacagtc 960cagtccttgg agatcgacct ggactccatg agaaatctga aggccagctt ggagaacagc 1020ctgagggagg tggaggcccg ctacgcccta cagatggagc agctcaacgg gatcctgctg 1080caccttgagt cagagctggc acagacccgg gcagagggac agcgccaggc ccaggagtat 1140gaggccctgc tgaacatcaa ggtcaagctg gaggctgaga tcgccaccta ccgccgcctg 1200ctggaagatg gcgaggactt taatcttggt gatgccttgg acagcagcaa ctccatgcaa 1260accatccaaa agaccaccac ccgccggata gtggatggca aagtggtgtc tgagaccaat 1320gacaccaaag ttctgaggca ttaagccagc agaagcaggg taccctttgg ggagcaggag 1380gccaataaaa agttcagagt tcattggatg tc 14123681075DNAHomo sapiens 368cgcagcaaac acatccgtag aaggcagcgc ggccgccgag agccgcagcg ccgctcgccc 60gccgcccccc accccgccgc cccgcccggc gaattgcgcc ccgcgcccct cccctcgcgc 120ccccgagaca aagaggagag aaagtttgcg cggccgagcg gggcaggtga ggagggtgag 180ccgcgcggga ggggcccgcc tcggccccgg ctcagccccc gcccgcgccc ccagcccgcc 240gccgcgagca gcgcccggac cccccagcgg cggcccccgc ccgcccagcc ccccggcccg 300ccatgggcgc cgcggcccgc accctgcggc tggcgctcgg cctcctgctg ctggcgacgc 360tgcttcgccc ggccgacgcc tgcagctgct ccccggtgca cccgcaacag gcgttttgca 
420atgcagatgt agtgatcagg gccaaagcgg tcagtgagaa ggaagtggac tctggaaacg 480acatttatgg caaccctatc aagaggatcc agtatgagat caagcagata aagatgttca 540aagggcctga gaaggatata gagtttatct acacggcccc ctcctcggca gtgtgtgggg 600tctcgctgga cgttggagga aagaaggaat atctcattgc aggaaaggcc gagggggacg 660gcaagatgca catcaccctc tgtgacttca tcgtgccctg ggacaccctg agcaccaccc 720agaagaagag cctgaaccac aggtaccaga tgggctgcga gtgcaagatc acgcgctgcc 780ccatgatccc gtgctacatc tcctccccgg acgagtgcct ctggatggac tgggtcacag 840agaagaacat caacgggcac caggccaagt tcttcgcctg catcaagaga agtgacggct 900cctgtgcgtg gtaccgcggc gcggcgcccc ccaagcagga gtttctcgac atcgaggacc 960cataagcagg cctccaacgc ccctgtggcc aactgcaaaa aaagcctcca agggtttcga 1020ctggtccagc tctgacatcc cttcctggaa acagcatgaa taaaacactc atccc 10753691127DNAHomo sapiens 369cacgggcggg gcggggcctg ggtccaccgg ggttctgagg ggagactgag gtcctgagcc 60gacagcctca gctccctgcc aggccagacc cggcagacag atgagggccc aggaggcctg 120gcgggcctgg gggcgctacg gtgggagagg aagccagggg tacctgcctc tgccttccag 180ggccaccgtt ggccccagct gtgccttgac tacgtaacat cttgtcctca cagcccagag 240catgttccag atcccagagt ttgagccgag tgagcaggaa gactccagct ctgcagagag 300gggcctgggc cccagccccg caggggacgg gccctcaggc tccggcaagc atcatcgcca 360ggccccaggc ctcctgtggg acgccagtca ccagcaggag cagccaacca gcagcagcca 420tcatggaggc gctggggctg tggagatccg gagtcgccac agctcctacc ccgcggggac 480ggaggacgac gaagggatgg gggaggagcc cagccccttt cggggccgct cgcgctcggc 540gccccccaac ctctgggcag cacagcgcta tggccgcgag ctccggagga tgagtgacga 600gtttgtggac tcctttaaga agggacttcc tcgcccgaag agcgcgggca cagcaacgca 660gatgcggcaa agctccagct ggacgcgagt cttccagtcc tggtgggatc ggaacttggg 720caggggaagc tccgccccct cccagtgacc ttcgctccac atcccgaaac tccacccgtt 780cccactgccc tgggcagcca tcttgaatat gggcggaagt acttccctca ggcctatgca 840aaaagaggat ccgtgctgtc tcctttggag ggagggctga cccagattcc cttccggtgc 900gtgtgaagcc acggaaggct tggtcccatc ggaagttttg ggttttccgc ccacagccgc 960cggaagtggc tccgtggccc cgccctcagg ctccgggctt tcccccaggc gcctgcgcta 1020agtcgcgagc caggtttaac cgttgcgtca 
ccgggacccg agcccccgcg atgccctggg 1080ggccgtgctc actaccaaat gttaataaag cccgcgtctg tgccgcc 11273701890DNAHomo sapiens 370cttaataaga agagaaggct tcaatggaac cttttgtggt cctggtgctg tgtctctctt 60ttatgcttct cttttcactc tggagacaga gctgtaggag aaggaagctc cctcctggcc 120ccactcctct tcctattatt ggaaatatgc tacagataga tgttaaggac atctgcaaat 180ctttcaccaa tttctcaaaa gtctatggtc ctgtgttcac cgtgtatttt ggcatgaatc 240ccatagtggt gtttcatgga tatgaggcag tgaaggaagc cctgattgat aatggagagg 300agttttctgg aagaggcaat tccccaatat ctcaaagaat tactaaagga cttggaatca 360tttccagcaa tggaaagaga tggaaggaga tccggcgttt ctccctcaca aacttgcgga 420attttgggat ggggaagagg agcattgagg accgtgttca agaggaagct cactgccttg 480tggaggagtt gagaaaaacc aaggcttcac cctgtgatcc cactttcatc ctgggctgtg 540ctccctgcaa tgtgatctgc tccgttgttt tccagaaacg atttgattat aaagatcaga 600attttctcac cctgatgaaa agattcaatg aaaacttcag gattctgaac tccccatgga 660tccaggtctg caataatttc cctctactca ttgattgttt cccaggaact cacaacaaag 720tgcttaaaaa tgttgctctt acacgaagtt acattaggga gaaagtaaaa gaacaccaag 780catcactgga tgttaacaat cctcgggact ttatggattg cttcctgatc aaaatggagc 840aggaaaagga caaccaaaag tcagaattca atattgaaaa cttggttggc actgtagctg 900atctatttgt tgctggaaca gagacaacaa gcaccactct gagatatgga ctcctgctcc 960tgctgaagca cccagaggtc acagctaaag tccaggaaga gattgatcat gtaattggca 1020gacacaggag cccctgcatg caggatagga gccacatgcc ttacactgat gctgtagtgc 1080acgagatcca gagatacagt gaccttgtcc ccaccggtgt gccccatgca gtgaccactg 1140atactaagtt cagaaactac ctcatcccca agagctttga taacaagata atgctggctg 1200cataaaacta gggcacaacc ataatggcat tactgacttc cgtgctacat gatgacaaag 1260aatttcctaa tccaaatatc tttgaccctg gccactttct agataagaat ggcaacttta 1320agaaaagtga ctacttcatg cctttctcag caggaaaacg aatttgtgca ggagaaggac 1380ttgcccgcat ggagctattt ttatttctaa ccacaatttt acagaacttt aacctgaaat 1440ctgttgatga tttaaagaac ctcaatacta ctgcagttac caaagggatt gtttctctgc 1500caccctcata ccagatctgc ttcatccctg tctgaagaat gctagcccat ctggctgctg 1560atctgctatc acctgcaact ctttttttat caaggacatt cccactatta tgtcttctct 
1620gacctctcat caaatcttcc cattcactca atatcccata agcatccaaa ctccattaag 1680gagagttgtt caggtcactg cacaaatata tctgcaatta ttcatactct gtaacacttg 1740tattaattgc tgcatatgct aatacttttc taatgctgac tttttaatat gttatcactg 1800taaaacacag aaaagtgatt aatgaatgat aatttagtcc atttcttttg tgaatgtgct 1860aaataaaaag tgttattaat tgctggttca 18903714946DNAHomo sapiens 371agtcagccct gctgccagcc agtgccgggt gctggggact cagggaggcc cgccgggacc 60actgcgggac agtgagccga gcagaagctg gaacgcagga gaggaaggag agggggcggt 120cagggctctc aggagccggg tcctgggcaa ggcgcagccg ttttcaaatt ttcaggaaag 180cggtcggctc acactcgagc agtaaaaaga tgcctctggg gaggaggccc gtgcagctct 240ccgggcaatg gtggtggctc ggcctagaga ggcggtagtg gaacgcagac cctggtgggg 300gaatgacatc aagggaggag acgggcggga ccccagattt ctgcctgtgg gcgatggaag 360tgaggttcac tggccagcgg agccggacac agaacgcgca aaacgccgtg taggcctgga 420ggagccgaag agcaggcgga ccccctccgc gggggaacag tttccgccgg gagcacaaag 480caacggaccg gaagtggggg gcggaagtgc agtgggctca gcgccgactg cgcgcctctg 540cccgcgaaaa ctctgagctg gctgacagct ggggacgggt ggcggccctc gactggagtc 600ggttgagttc ctgagggacc ccggttctgg aaggttcgcc gcggagacaa gtgagcagtc 660tgtgccatag ggattctcga agagaacagc gttgtgtccc agtgcacatg ctcgcatcgc 720ttaccaggag tgcccgagac cctaagatgt tcggagtggt tttttcgcac agacccgaat 780agcctgcccc tcagccacgc tctgtgccct tctgagaaca ggctgatatg cccaagatag 840tcctgaatgg tgtgaccgta gacttccctt tccagcccta caaatgccaa caggagtaca 900tgaccaaggt cctggaatgt ctgcagcaga aggtgaatgg catcctggag agccctacgg 960gtacagggaa gacgctgtgc ctgctgtgca ccacgctggc ctggcgagaa cacctccgag 1020acggcatctc tgcccgcaag attgccgaga gggcgcaagg agagcttttc ccggatcggg 1080ccttgtcatc ctggggcaac gctgctgctg ctgctggaga ccccatagct tgctacacgg 1140acatcccaaa gattatttac gcctccagga cccactcgca actcacacag gtcatcaacg 1200agcttcggaa cacctcctac cggcctaagg tgtgtgtgct gggctcccgg gagcagctgt 1260gcatccatcc tgaggtgaag aaacaagaga gtaaccatct acagatccac ttgtgccgta 1320agaaggtggc aagtcgctcc tgtcatttct acaacaacgt agaagaaaaa agcctggagc 1380aggagctggc cagccccatc ctggacattg aggacttggt caagagcgga 
agcaagcaca 1440gggtgtgccc ttactacctg tcccggaacc tgaagcagca agccgacatc atattcatgc 1500cgtacaatta cttgttggat gccaagagcc gcagagcaca caacattgac ctgaagggga 1560cagtcgtgat ctttgacgaa gctcacaacg tggagaagat gtgtgaagaa tcggcatcct 1620ttgacctgac tccccatgac ctggcttcag gactggacgt catagaccag gtgctggagg 1680agcagaccaa ggcagcgcag cagggtgagc cccacccgga gttcagcgcg gactccccca 1740gcccagggct gaacatggag ctggaagaca ttgcaaagct gaagatgatc ctgctgcgcc 1800tggagggggc catcgatgct gttgagctgc ctggagacga cagcggtgtc accaagccag 1860ggagctacat ctttgagctg tttgctgaag cccagatcac gtttcagacc aagggctgca 1920tcctggactc gctggaccag atcatccagc acctggcagg acgtgctgga gtgttcacca 1980acacggccgg actgcagaag ctggcggaca ttatccagat tgtgttcagt gtggacccct 2040ccgagggcag ccctggttcc ccagcagggc tgggggcctt acagtcctat aaggtgcaca 2100tccatcctga tgctggtcac cggaggacgg ctcagcggtc tgatgcctgg agcaccactg 2160cagccagaaa gcgagggaag gtgctgagct actggtgctt cagtcccggc cacagcatgc 2220acgagctggt ccgccagggc gtccgctccc tcatccttac cagcggcacg ctggccccgg 2280tgtcctcctt tgctctggag atgcagatcc ctttcccagt ctgcctggag aacccacaca 2340tcatcgacaa gcaccagatc tgggtggggg tcgtccccag aggccccgat ggagcccagt 2400tgagctccgc gtttgacaga cggttttccg aggagtgctt atcctccctg gggaaggctc 2460tgggcaacat cgcccgcgtg gtgccctatg ggctcctgat cttcttccct tcctatcctg 2520tcatggagaa gagcctggag ttctggcggg cccgcgactt ggccaggaag atggaggcgc 2580tgaagccgct gtttgtggag cccaggagca aaggcagctt ctccgagacc atcagtgctt 2640actatgcaag ggttgccgcc cctgggtcca ccggcgccac cttcctggcg gtctgccggg 2700gcaaggccag cgaggggctg gacttctcag acacgaatgg ccgtggtgtg attgtcacgg 2760gcctcccgta ccccccacgc atggaccccc gggttgtcct caagatgcag ttcctggatg 2820agatgaaggg ccagggtggg gctgggggcc agttcctctc tgggcaggag tggtaccggc 2880agcaggcgtc cagggctgtg aaccaggcca tcgggcgagt gatccggcac cgccaggact 2940acggagctgt cttcctctgt gaccacaggt tcgcctttgc cgacgcaaga gcccaactgc 3000cctcctgggt gcgtccccac gtcagggtgt atgacaactt tggccatgtc atccgagacg 3060tggcccagtt cttccgtgtt gccgagcgaa ctatgccagc gccggccccc cgggctacag 3120cacccagtgt gcgtggagaa 
gatgctgtca gcgaggccaa gtcgcctggc cccttcttct 3180ccaccaggaa agctaagagt ctggacctgc atgtccccag cctgaagcag aggtcctcag 3240ggtcaccagc tgccggggac cccgagagta gcctgtgtgt ggagtatgag caggagccag 3300ttcctgcccg gcagaggccc agggggctgc tggccgccct ggagcacagc gaacagcggg 3360cggggagccc tggcgaggag caggcccaca gctgctccac cctgtccctc ctgtctgaga 3420agaggccggc agaagaaccg cgaggaggga ggaagaagat ccggctggtc agccacccgg 3480aggagcccgt ggctggtgca cagacggaca gggccaagct cttcatggtg gccgtgaagc 3540aggagttgag ccaagccaac tttgccacct tcacccaggc cctgcaggac tacaagggtt 3600ccgatgactt cgccgccctg gccgcctgtc tcggccccct ctttgctgag gaccccaaga 3660agcacaacct gctccaaggc ttctaccagt ttgtgcggcc ccaccataag cagcagtttg 3720aggaggtctg tatccagctg acaggacgag gctgtggcta tcggcctgag cacagcattc 3780cccgaaggca gcgggcacag ccggtcctgg accccactgg aagaacggcg ccggatccca 3840agctgaccgt gtccacggct gcagcccagc agctggaccc ccaagagcac ctgaaccagg 3900gcaggcccca cctgtcgccc aggccacccc caacaggaga ccctggcagc caaccacagt 3960gggggtctgg agtgcccaga gcagggaagc agggccagca cgccgtgagc gcctacctgg 4020ctgatgcccg cagggccctg gggtccgcgg gctgtagcca actcttggca gcgctgacag 4080cctataagca agacgacgac ctcgacaagg tgctggctgt gttggccgcc ctgaccactg 4140caaagccaga ggacttcccc ctgctgcaca ggttcagcat gtttgtgcgt ccacaccaca 4200agcagcgctt ctcacagacg tgcacagacc tgaccggccg gccctacccg ggcatggagc 4260caccgggacc ccaggaggag aggcttgccg tgcctcctgt gcttacccac agggctcccc 4320aaccaggccc ctcacggtcc gagaagaccg ggaagaccca gagcaagatc tcgtccttcc 4380ttagacagag gccagcaggg actgtggggg cgggcggtga ggatgcaggt cccagccagt 4440cctcaggacc tccccacggg cctgcagcat ctgagtgggg cctctaggat gtgcccagcc 4500tgccacaccg cctccaggaa gcagagcgtc atgcaggtct tctggccaga gccccagtga 4560gtgcccacgg aggcccccag cacacccaac gtggcttgat cacctgcctg tccagctctg 4620gtgggccaag aacccaccca acagaatagg ccagcccatg ccagccggct tggcccgctg 4680caggcctcag gcaggcgggg cccatggttg gtccctgcgg tgggaccgga tctgggcctg 4740cctctgagaa gccctgagct accttggggt ctggggtggg tttctgggaa agtgcttccc 4800cagaacttcc ctggctcctg gcctgtgagt ggtgccacag gggcacccca 
gctgagcccc 4860tcaccgggaa ggaggagacc cccgtgggca cgtgtccact tttaatcagg ggacagggct 4920ctctaataaa gctgctggca gtgccc 49463721743DNAHomo sapiens 372cagtatccct cctgacaaaa ctaacaaaaa tcctgttagc caaataatca gccacattca 60tatttaccgt caaagttttt atcctcattt tacagcagtg gagagcgatt gccccgggtc 120ccacgttagg aagagagaga actgggattt gcacccaggc aatctgggga cagagctgtg 180atcacaactc catgagtcag ggccgagcca gccccttcac caccagccgg ccgcgccccg 240ggaaggaagt ttgtggcgga ggaggttcgt acgggaggag ggggaggcgc ccacgcatct 300ggggctgact cgctctttcg caaaacgtct gggaggagtc cctggggcca caaaactgcc 360tccttcctga ggccagaagg agagaagacg tgcagggacc ccgcgcacag gagctgccct 420cgcgacatgg gtcacccgcc gctgctgccg ctgctgctgc tgctccacac ctgcgtccca 480gcctcttggg gcctgcggtg catgcagtgt aagaccaacg gggattgccg tgtggaagag 540tgcgccctgg gacaggacct ctgcaggacc acgatcgtgc gcttgtggga agaaggagaa 600gagctggagc tggtggagaa aagctgtacc cactcagaga agaccaacag gaccctgagc 660tatcggactg gcttgaagat caccagcctt accgaggttg tgtgtgggtt agacttgtgc 720aaccagggca actctggccg ggctgtcacc tattcccgaa gccgttacct cgaatgcatt 780tcctgtggct catcagacat gagctgtgag aggggccggc accagagcct gcagtgccgc 840agccctgaag aacagtgcct ggatgtggtg acccactgga tccaggaagg tgaagaaggg 900cgtccaaagg atgaccgcca cctccgtggc tgtggctacc ttcccggctg cccgggctcc 960aatggtttcc acaacaacga caccttccac ttcctgaaat gctgcaacac caccaaatgc
1020aacgagggcc caatcctgga gcttgaaaat ctgccgcaga atggccgcca gtgttacagc 1080tgcaagggga acagcaccca tggatgctcc tctgaagaga ctttcctcat tgactgccga 1140ggccccatga atcaatgtct ggtagccacc ggcactcacg aaccgaaaaa ccaaagctat 1200atggtaagag gctgtgcaac cgcctcaatg tgccaacatg cccacctggg tgacgccttc 1260agcatgaacc acattgatgt ctcctgctgt actaaaagtg gctgtaacca cccagacctg 1320gatgtccagt accgcagtgg ggctgctcct cagcctggcc ctgcccatct cagcctcacc 1380atcaccctgc taatgactgc cagactgtgg ggaggcactc tcctctggac ctaaacctga 1440aatccccctc tctgccctgg ctggatccgg gggacccctt tgcccttccc tcggctccca 1500gccctacaga cttgctgtgt gacctcaggc cagtgtgccg acctctctgg gcctcagttt 1560tcccagctat gaaaacagct atctcacaaa gttgtgtgaa gcagaagaga aaagctggag 1620gaaggccgtg ggcaatggga gagctcttgt tattattaat attgttgccg ctgttgtgtt 1680gttgttatta attaatattc atattattta ttttatactt acataaagat tttgtaccag 1740tgg 17433735061DNAHomo sapiens 373atggctcaga tatttagcaa cagcggattt aaagaatgtc cattttcaca tccggaacca 60acaagagcaa aagatgtgga caaagaagaa gcattacaga tggaagcaga ggctttagca 120aaactgcaaa aggatagaca agtgactgac aatcagagag gctttgagtt gtcaagcagc 180accagaaaaa aagcacaggt ttataacaag caggattatg atctcatggt gtttcctgaa 240tcagattccc aaaaaagagc attagatatt gatgtagaaa agctcaccca agctgaactt 300gagaaactat tgctggatga cagtttcgag actaaaaaaa cacctgtatt accagttact 360cctattctga gcccttcctt ttcagcacag ctctatttta gacctactat tcagagagga 420cagtggccac ctggattacc tgggccttcc acttatgctt taccttctat ttatccttct 480acttacagta aacaggctgc attccaaaat ggcttcaatc caagaatgcc cacttttcca 540tctacagaac ctatatattt aagtcttccg ggacaatctc catatttctc atatcctttg 600acacctgcca caccctttca tccacaagga agcttaccta tctatcgtcc agtagtcagt 660actgacatgg caaaactatt tgacaaaata gctagtacat cagaattttt aaaaaatggg 720aaagcaagga ctgatttgga gataacagat tcaaaagtca gcaatctaca ggtatctcca 780aagtctgagg atatcagtaa atttgactgg ttagacttgg atcctctaag taagcctaag 840gtggataatg tggaggtatt agaccatgag gaagagaaaa atgtttcaag tttgctagca 900aaggatcctt gggatgctgt tcttcttgaa gagagatcga cagcaaattg tcatcttgaa 960agaaaggtga 
atggaaaatc cctttctgtg gcaactgtta caagaagcca gtctttaaat 1020attcgaacaa ctcagcttgc aaaagcccag ggccatatat ctcagaaaga cccaaatggg 1080accagtagtt tgccaactgg aagttctctt cttcaagaag ttgaagtaca gaatgaggag 1140atggcagctt tttgtcgatc cattacaaaa ttgaagacca aatttccata taccaatcac 1200cgcacaaacc caggctattt gttaagtcca gtcacagcgc aaagaaacat atgcggagaa 1260aatgctagtg tgaaggtctc cattgacatt gaaggatttc agctaccagt tacttttacg 1320tgtgatgtga gttctactgt agaaatcatt ataatgcaag ccctttgctg ggtacatgat 1380gacttgaatc aagtagatgt tggcagctat gttctaaaag tttgtggtca agaggaagtg 1440ctgcagaata atcattgcct tggaagtcat gagcatattc aaaactgtcg aaaatgggac 1500acagaaatta gactacaact cttgaccttc agtgcaatgt gtcaaaatct ggcccgaaca 1560gcagaagatg atgaaacacc cgtggattta aacaaacacc tgtatcaaat agaaaaacct 1620tgcaaagaag ccatgacgag acaccctgtt gaagaactct tagattctta tcacaaccaa 1680gtagaactgg ctcttcaaat tgaaaaccaa caccgagcag tagatcaagt aattaaagct 1740gtaagaaaaa tctgtagtgc tttagatggt gtcgagactc ttgccattac agaatcagta 1800aagaagctaa agagagcagt taatcttcca aggagtaaaa ctgctgatgt gacttctttg 1860tttggaggag aagacactag caggagttca actaggggct cacttaatcc tgaaaatcct 1920gttcaagtaa gcataaacca attaactgca gcaatttatg atcttctcag actccatgca 1980aattctggta ggagtcctac agactgtgcc caaagtagca agagtgtcaa ggaagcatgg 2040actacaacag agcagctcca gtttactatt tttgctgctc atggaatttc aagtaattgg 2100gtatcaaatt atgaaaaata ctacttgata tgttcactgt ctcacaatgg aaaggatctt 2160tttaaaccta ttcaatcaaa gaaggttggc acttacaaga atttcttcta tcttattaaa 2220tgggatgaac taatcatttt tcctatccag atatcacaat tgccattaga atcagttctt 2280caccttactc tttttggaat tttaaatcag agcagtggaa gttcccctga ttctaataag 2340cagagaaagg gaccagaagc tttgggcaaa gtttctttac ctctttgtga ctttagacgg 2400tttttaacat gtggaactaa acttctatat ctttggactt catcacatac aaattctgtt 2460cctggaacag ttaccaaaaa aggatatgtc atggaaagaa tagtgctaca ggttgatttt 2520ccttctcctg catttgatat tatttataca actcctcaag ttgacagaag cattatacag 2580caacataact tagaaacact agagaatgat ataaaaggga aacttcttga tattcttcat 2640aaagactcat cacttggact ttctaaagaa gataaagctt 
ttttatggga gaaacgttat 2700tattgcttca aacacccaaa ttgtcttcct aaaatattag caagcgcccc aaactggaaa 2760tggggtaatc ttgccaaaac ttactcattg cttcaccagt ggcctgcatt gtacccacta 2820attgcattgg aacttcttga ttcaaaattt gctgatcagg aagtaagatc cctagctgtg 2880acctggattg aggccattag tgatgatgag ctaacagatc ttcttccaca gtttgtacaa 2940gctttgaaat atgaaattta cttgaatagt tcattagtgc aattcctttt gtccagggca 3000ttgggaaata tccagatagc acacaattta tattggcttc tcaaagatgc cctgcatgat 3060gtacagttta gtacccgata cgaacatgtt ttgggtgctc tcctgtcagt aggaggaaaa 3120cgacttagag aagaacttct aaaacagacg aaacttgtac agcttttagg aggagtagca 3180gaaaaagtaa ggcaggctag tggatcagcc agacaggttg ttctccaaag aagtatggaa 3240cgagtacagt ccttttttca gaaaaataaa tgccgtctcc ctctcaagcc aagtctagtg 3300gcaaaagaat taaatattaa gtcgtgttcc ttcttcagtt ctaatgctgt ccccctaaaa 3360gtcacaatgg tgaatgctga ccctctggga gaagaaatta atgtcatgtt taaggttggt 3420gaagatcttc ggcaagatat gttagcttta cagatgataa agattatgga taagatctgg 3480cttaaagaag gactagatct gaggatggta attttcaaat gtctctcaac tggcagagat 3540cgaggcatgg tggagctggt tcctgcttcc gataccctca ggaaaatcca agtggaatat 3600ggtgtgacag gatcctttaa agataaacca cttgcagagt ggctaaggaa atacaatccc 3660tctgaagaag aatatgaaaa ggcttcagag aactttatct attcctgtgc tggatgctgt 3720gtagccacct atgttttagg catctgtgat cgacacaatg acaatataat gcttcgaagc 3780acgggacaca tgtttcacat tgactttgga aagtttttgg gacatgcaca gatgtttggc 3840agcttcaaaa gggatcgggc tccttttgtg ctgacctctg atatggcata tgtcattaat 3900gggggtgaaa agcccaccat tcgttttcag ttgtttgtgg acctctgctg tcaggcctac 3960aacttgataa gaaagcagac aaaccttttt cttaacctcc tttcactgat gattccttca 4020gggttaccag aacttacaag tattcaagat ttgaaatacg ttagagatgc acttcaaccc 4080caaactacag acgcagaagc tacaattttc tttactaggc ttattgaatc aagtttggga 4140agcattgcca caaagtttaa cttcttcatt cacaaccttg ctcagcttcg tttttctggt 4200cttccttcta atgatgagcc catcctttca ttttcaccta aaacatactc ctttagacaa 4260gatggtcgaa tcaaggaagt ctctgttttt acatatcata agaaatacaa cccagataaa 4320cattatattt atgtagtccg aattttgtgg gaaggacaga ttgaaccatc atttgtcttc 4380cgaacatttg 
tcgaatttca ggaacttcac aataagctca gtattatttt tccactttgg 4440aagttaccag gctttcctaa taggatggtt ctaggaagaa cacacataaa agatgtagca 4500gccaaaagga aaattgagtt aaacagttac ttacagagtt tgatgaatgc ttcaacggat 4560gtagcagagt gtgatcttgt ttgtactttc ttccaccctt tacttcgtga tgagaaagct 4620gaagggatag ctaggtctgc agatgcaggt tccttcagtc ctactccagg ccaaatagga 4680ggagctgtga aattatccat ctcttaccga aatggtactc ttttcatcat ggtgatgcat 4740atcaaagatc ttgttactga agatggagct gacccaaatc catatgtcaa aacataccta 4800cttccagata accacaaaac atccaaacgt aaaaccaaaa tttcacgaaa aacgaggaat 4860ccgacattca atgaaatgct tgtatacagt ggatatagca aagaaaccct aagacagcga 4920gaacttcaac taagtgtact cagtgcagaa tctctgcggg agaatttttt cttgggtgga 4980gtaaccctgc ctttgaaaga tttcaacttg agcaaagaga cggttaaatg gtatcagctg 5040actgcggcaa catacttgta a 50613746802DNAHomo sapiens 374cggccccaga aaacccgagc gagtaggggg cggcgcgcag gagggaggag aactgggggc 60gcgggaggct ggtgggtgtc gggggtggag atgtagaaga tgtgacgccg cggcccggcg 120ggtgccagat tagcggacgg ctgcccgcgg ttgcaacggg atcccgggcg ctgcagcttg 180ggaggcggct ctccccaggc ggcgtccgcg gagacaccca tccgtgaacc ccaggtcccg 240ggccgccggc tcgccgcgca ccaggggccg gcggacagaa gagcggccga gcggctcgag 300gctgggggac cgcgggcgcg gccgcgcgct gccgggcggg aggctggggg gccggggccg 360gggccgtgcc ccggagcggg tcggaggccg gggccggggc cgggggacgg cggctccccg 420cgcggctcca gcggctcggg gatcccggcc gggccccgca gggaccatgg cagccgggag 480catcaccacg ctgcccgcct tgcccgagga tggcggcagc ggcgccttcc cgcccggcca 540cttcaaggac cccaagcggc tgtactgcaa aaacgggggc ttcttcctgc gcatccaccc 600cgacggccga gttgacgggg tccgggagaa gagcgaccct cacatcaagc tacaacttca 660agcagaagag agaggagttg tgtctatcaa aggagtgtgt gctaaccgtt acctggctat 720gaaggaagat ggaagattac tggcttctaa atgtgttacg gatgagtgtt tcttttttga 780acgattggaa tctaataact acaatactta ccggtcaagg aaatacacca gttggtatgt 840ggcactgaaa cgaactgggc agtataaact tggatccaaa acaggacctg ggcagaaagc 900tatacttttt cttccaatgt ctgctaagag ctgattttaa tggccacatc taatctcatt 960tcacatgaaa gaagaagtat attttagaaa tttgttaatg agagtaaaag aaaataaatg 1020tgtatagctc 
agtttggata attggtcaaa caatttttta tccagtagta aaatatgtaa 1080ccattgtccc agtaaagaaa aataacaaaa gttgtaaaat gtatattctc ccttttatat 1140tgcatctgct gttacccagt gaagcttacc tagagcaatg atctttttca cgcatttgct 1200ttattcgaaa agaggctttt aaaatgtgca tgtttagaaa caaaatttct tcatggaaat 1260catatacatt agaaaatcac agtcagatgt ttaatcaatc caaaatgtcc actatttctt 1320atgtcattcg ttagtctaca tgtttctaaa catataaatg tgaatttaat caattccttt 1380catagtttta taattctctg gcagttcctt atgatagagt ttataaaaca gtcctgtgta 1440aactgctgga agttcttcca cagtcaggtc aattttgtca aacccttctc tgtacccata 1500cagcagcagc ctagcaactc tgctggtgat gggagttgta ttttcagtct tcgccaggtc 1560attgagatcc atccactcac atcttaagca ttcttcctgg caaaaattta tggtgaatga 1620atatggcttt aggcggcaga tgatatacat atctgacttc ccaaaagctc caggatttgt 1680gtgctgttgc cgaatactca ggacggacct gaattctgat tttataccag tctcttcaaa 1740aacttctcga accgctgtgt ctcctacgta aaaaaagaga tgtacaaatc aataataatt 1800acacttttag aaactgtatc atcaaagatt ttcagttaaa gtagcattat gtaaaggctc 1860aaaacattac cctaacaaag taaagttttc aatacaaatt ctttgccttg tggatatcaa 1920gaaatcccaa aatattttct taccactgta aattcaagaa gcttttgaaa tgctgaatat 1980ttctttggct gctacttgga ggcttatcta cctgtacatt tttggggtca gctcttttta 2040acttcttgct gctctttttc ccaaaaggta aaaatataga ttgaaaagtt aaaacatttt 2100gcatggctgc agttcctttg tttcttgaga taagattcca aagaacttag attcatttct 2160tcaacaccga aatgctggag gtgtttgatc agttttcaag aaacttggaa tataaataat 2220tttataattc aacaaaggtt ttcacatttt ataaggttga tttttcaatt aaatgcaaat 2280ttgtgtggca ggatttttat tgccattaac atatttttgt ggctgctttt tctacacatc 2340cagatggtcc ctctaactgg gctttctcta attttgtgat gttctgtcat tgtctcccaa 2400agtatttagg agaagccctt taaaaagctg ccttcctcta ccactttgct ggaaagcttc 2460acaattgtca cagacaaaga tttttgttcc aatactcgtt ttgcctctat ttttcttgtt 2520tgtcaaatag taaatgatat ttgcccttgc agtaattcta ctggtgaaaa acatgcaaag 2580aagaggaagt cacagaaaca tgtctcaatt cccatgtgct gtgactgtag actgtcttac 2640catagactgt cttacccatc ccctggatat gctcttgttt tttccctcta atagctatgg 2700aaagatgcat agaaagagta taatgtttta aaacataagg 
cattcatctg ccatttttca 2760attacatgct gacttccctt acaattgaga tttgcccata ggttaaacat ggttagaaac 2820aactgaaagc ataaaagaaa aatctaggcc gggtgcagtg gctcatgcct atattccctg 2880cactttggga ggccaaagca ggaggatcgc ttgagcccag gagttcaaga ccaacctggt 2940gaaaccccgt ctctacaaaa aaacacaaaa aatagccagg catggtggcg tgtacatgtg 3000gtctcagata cttgggaggc tgaggtggga gggttgatca cttgaggctg agaggtcaag 3060gttgcagtga gccataatcg tgccactgca gtccagccta ggcaacagag tgagactttg 3120tctcaaaaaa agagaaattt tccttaataa gaaaagtaat ttttactctg atgtgcaata 3180catttgttat taaatttatt atttaagatg gtagcactag tcttaaattg tataaaatat 3240cccctaacat gtttaaatgt ccatttttat tcattatgct ttgaaaaata attatgggga 3300aatacatgtt tgttattaaa tttattatta aagatagtag cactagtctt aaatttgata 3360taacatctcc taacttgttt aaatgtccat ttttattctt tatgcttgaa aataaattat 3420ggggatccta tttagctctt agtaccacta atcaaaagtt cggcatgtag ctcatgatct 3480atgctgtttc tatgtcgtgg aagcaccgga tgggggtagt gagcaaatct gccctgctca 3540gcagtcacca tagcagctga ctgaaaatca gcactgcctg agtagttttg atcagtttaa 3600cttgaatcac taactgactg aaaattgaat gggcaaataa gtgcttttgt ctccagagta 3660tgcgggagac ccttccacct caagatggat atttcttccc caaggatttc aagatgaatt 3720gaaattttta atcaagatag tgtgctttat tctgttgtat tttttattat tttaatatac 3780tgtaagccaa actgaaataa catttgctgt tttataggtt tgaagaacat aggaaaaact 3840aagaggtttt gtttttattt ttgctgatga agagatatgt ttaaatatgt tgtattgttt 3900tgtttagtta caggacaata atgaaatgga gtttatattt gttatttcta ttttgttata 3960tttaataata gaattagatt gaaataaaat ataatgggaa ataatctgca gaatgtgggt 4020ttcctggtgt ttcctctgac tctagtgcac tgatgatctc tgataaggct cagctgcttt 4080atagttctct ggctaatgca gcagatactc ttcctgccag tggtaatacg attttttaag 4140aaggcagttt gtcaatttta atcttgtgga tacctttata ctcttagggt attattttat 4200acaaaagcct tgaggattgc attctatttt ctatatgacc ctcttgatat ttaaaaaaca 4260ctatggataa caattcttca tttacctagt attatgaaag aatgaaggag ttcaaacaaa 4320tgtgtttccc agttaactag ggtttactgt ttgagccaat ataaatgttt aactgtttgt 4380gatggcagta ttcctaaagt acattgcatg ttttcctaaa tacagagttt aaataatttc 4440agtaattctt 
agatgattca gcttcatcat taagaatatc ttttgtttta tgttgagtta 4500gaaatgcctt catatagaca tagtctttca gacctctact gtcagttttc atttctagct 4560gctttcaggg ttttatgaat tttcaggcaa agctttaatt tatactaagc ttaggaagta 4620tggctaatgc caacggcagt ttttttcttc ttaattccac atgactgagg catatatgat 4680ctctgggtag gtgagttgtt gtgacaacca caagcacttt tttttttttt aaagaaaaaa 4740aggtagtgaa tttttaatca tctggacttt aagaaggatt ctggagtata cttaggcctg 4800aaattatata tatttggctt ggaaatgtgt ttttcttcaa ttacatctac aagtaagtac 4860agctgaaatt cagaggaccc ataagagttc acatgaaaaa aatcaattca tttgaaaagg 4920caagatgcag gagagaggaa gccttgcaaa cctgcagact gctttttgcc caatatagat 4980tgggtaaggc tgcaaaacat aagcttaatt agctcacatg ctctgctctc acgtggcacc 5040agtggatagt gtgagagaat taggctgtag aacaaatggc cttctctttc agcattcaca 5100ccactacaaa atcatctttt atatcaacag aagaataagc ataaactaag caaaaggtca 5160ataagtacct gaaaccaaga ttggctagag atatatctta atgcaatcca ttttctgatg 5220gattgttacg agttggctat ataatgtatg tatggtattt tgatttgtgt aaaagtttta 5280aaaatcaagc tttaagtaca tggacatttt taaataaaat atttaaagac aatttagaaa 5340attgccttaa tatcattgtt ggctaaatag aataggggac atgcatatta aggaaaaggt 5400catggagaaa taatattggt atcaaacaaa tacattgatt tgtcatgata cacattgaat 5460ttgatccaat agtttaagga ataggtagga aaatttggtt tctatttttc gatttcctgt 5520aaatcagtga cataaataat tcttagctta ttttatattt ccttgtctta aatactgagc 5580tcagtaagtt gtgttagggg attatttctc agttgagact ttcttatatg acattttact 5640atgttttgac ttcctgacta ttaaaaataa atagtagaaa caattttcat aaagtgaaga 5700attatataat cactgcttta taactgactt tattatattt atttcaaagt tcatttaaag 5760gctactattc atcctctgtg atggaatggt caggaatttg ttttctcata gtttaattcc 5820aacaacaata ttagtcgtat ccaaaataac ctttaatgct aaactttact gatgtatatc 5880caaagcttct ccttttcaga cagattaatc cagaagcagt cataaacaga agaataggtg 5940gtatgttcct aatgatatta tttctactaa tggaataaac tgtaatatta gaaattatgc 6000tgctaattat atcagctctg aggtaatttc tgaaatgttc agactcagtc ggaacaaatt 6060ggaaaattta aatttttatt cttagctata aagcaagaaa gtaaacacat taatttcctc 6120aacattttta agccaattaa aaatataaaa gatacacacc 
aatatcttct tcaggctctg 6180acaggcctcc tggaaacttc cacatatttt tcaactgcag tataaagtca gaaaataaag 6240ttaacataac tttcactaac acacacatat gtagatttca caaaatccac ctataattgg 6300tcaaagtggt tgagaatata ttttttagta attgcatgca aaatttttct agcttccatc 6360ctttctccct cgtttcttct ttttttgggg gagctggtaa ctgatgaaat cttttcccac 6420cttttctctt caggaaatat aagtggtttt gtttggttaa cgtgatacat tctgtatgaa 6480tgaaacattg gagggaaaca tctactgaat ttctgtaatt taaaatattt tgctgctagt 6540taactatgaa cagatagaag aatcttacag atgctgctat aaataagtag aaaatataaa 6600tttcatcact aaaatatgct attttaaaat ctatttccta tattgtattt ctaatcagat 6660gtattactct tattatttct attgtatgtg ttaatgattt tatgtaaaaa tgtaattgct 6720tttcatgagt agtatgaata aaattgatta gtttgtgttt tcttgtctcc cgaaaaaaaa 6780aaaaaaaaaa aaaaaaaaaa aa 68023751840DNAHomo sapiens 375cccattaggt gacaggtttt tagagaagcc aatcacgtcg ccgcggtcct ggttctaaag 60tcctcgctca cccacccgga ctcattctcc ccagacgcca aggatggtgg tcatggcgcc 120ccgaaccctc ttcctgctgc tctcgggggc cctgaccctg accgagacct gggcgggctc 180ccactccatg aggtatttca gcgccgccgt gtcccggccc ggccgcgggg agccccgctt 240catcgccatg ggctacgtgg acgacacgca gttcgtgcgg ttcgacagcg actcggcgtg 300tccgaggatg gagccgcggg cgccgtgggt ggagcaggag gggccggagt attgggaaga 360ggagacacgg aacaccaagg cccacgcaca gactgacaga atgaacctgc agaccctgcg 420cggctactac aaccagagcg aggccagttc tcacaccctc cagtggatga ttggctgcga 480cctggggtcc gacggacgcc tcctccgcgg gtatgaacag tatgcctacg atggcaagga 540ttacctcgcc ctgaacgagg acctgcgctc ctggaccgca gcggacactg cggctcagat 600ctccaagcgc aagtgtgagg cggccaatgt ggctgaacaa aggagagcct acctggaggg 660cacgtgcgtg gagtggctcc acagatacct ggagaacggg aaggagatgc tgcagcgcgc 720ggaccccccc aagacacacg tgacccacca ccctgtcttt gactatgagg ccaccctgag 780gtgctgggcc ctgggcttct accctgcgga gatcatactg acctggcagc gggatgggga 840ggaccagacc caggacgtgg agctcgtgga gaccaggcct gcaggggatg gaaccttcca 900gaagtgggca gctgtggtgg tgccttctgg agaggagcag agatacacgt gccatgtgca 960gcatgagggg ctgccggagc ccctcatgct gagatggaag cagtcttccc tgcccaccat 1020ccccatcatg ggtatcgttg ctggcctggt tgtccttgca 
gctgtagtca ctggagctgc 1080ggtcgctgct gtgctgtgga gaaagaagag ctcagattga aaaggaggga gctactctca 1140ggctgcaagt aagtatgaag gaggctgatc cctgagatcc ttgggatctt gtgtttggga 1200gccatggggg agctcaccca ccccacaatt cctcctctgg ccacatctcc tgtggtctct 1260gaccaggtgc tgtttttgtt ctactctagg cagtgacagt gcccagggct ctaatgtgtc 1320tctcacggct tgtaaatgtg acaccccggg gggcctgatg tgtgtgggtt gttgagggga 1380acaggggaca tagctgtgct atgaggtttc tttgacttca atgtattgag catgtgatgg 1440gctgtttaaa gtgtcacccc tcactgtgac tgatatgaat ttgttcatga atatttttct 1500gtagtgtgaa acagctgccc tgtgtgggac tgagtggcaa gtccctttgt gacttcaaga 1560accctgactt ctctttgtgc agagaccagc ccacccctgt gcccaccatg accctcttcc 1620tcatgctgaa ctgcattcct tccccaatca cctttcctgt tccagaaaag gggctgggat 1680gtctccgtct ctgtctcaaa tttgtggtcc actgagctat aacttacttc tgtattaaaa 1740ttagaatctg agtgtaaatt tactttttca aattatttcc aagagagatt gatgggttaa 1800ttaaaggaga agattcctga aatttgagag acaaaataaa 18403766754DNAHomo sapiens 376gtcgacgtgg cggccggcgg cggctgcggg ctgagcggcg agtttccgat ttaaagctga 60gctgcgagga aaatggcggc gggaggatca aaatacttgc tggatggtgg actcagagac 120caataaaaat aaactgcttg aacatccttt gactggttag ccagttgctg atgtatattc 180aagatgagtg gattaggaga aaacttggat ccactggcca gtgattcacg aaaacgcaaa 240ttgccatgtg atactccagg acaaggtctt acctgcagtg gtgaaaaacg gagacgggag 300caggaaagta aatatattga agaattggct gagctgatat ctgccaatct tagtgatatt 360gacaatttca atgtcaaacc agataaatgt gcgattttaa
aggaaacagt aagacagata 420cgtcaaataa aagagcaagg aaaaactatt tccaatgatg atgatgttca aaaagccgat 480gtatcttcta cagggcaggg agttattgat aaagactcct taggaccgct tttacttcag 540gcattggatg gtttcctatt tgtggtgaat cgagacggaa acattgtatt tgtatcagaa 600aatgtcacac aatacctgca atataagcaa gaggacctgg ttaacacaag tgtttacaat 660atcttacatg aagaagacag aaaggatttt cttaagaatt taccaaaatc tacagttaat 720ggagtttcct ggacaaatga gacccaaaga caaaaaagcc atacatttaa ttgccgtatg 780ttgatgaaaa caccacatga tattctggaa gacataaacg ccagtcctga aatgcgccag 840agatatgaaa caatgcagtg ctttgccctg tctcagccac gagctatgat ggaggaaggg 900gaagatttgc aatcttgtat gatctgtgtg gcacgccgca ttactacagg agaaagaaca 960tttccatcaa accctgagag ctttattacc agacatgatc tttcaggaaa ggttgtcaat 1020atagatacaa attcactgag atcctccatg aggcctggct ttgaagatat aatccgaagg 1080tgtattcaga gattttttag tctaaatgat gggcagtcat ggtcccagaa acgtcactat 1140caagaagtta ccagtgatgg gatattttcc ccaacagctt atcttaatgg ccatgcagaa 1200accccagtat atcgattctc gttggctgat ggaactatag tgactgcaca gacaaaaagc 1260aaactcttcc gaaatcctgt aacaaatgat cgacatggct ttgtctcaac ccacttcctt 1320cagagagaac agaatggata tagaccaaac ccaaatcctg ttggacaagg gattagacca 1380cctatggctg gatgcaacag ttcggtaggc ggcatgagta tgtcgccaaa ccaaggctta 1440cagatgccga gcagcagggc ctatggcttg gcagacccta gcaccacagg gcagatgagt 1500ggagctaggt atgggggttc cagtaacata gcttcattga cccctgggcc aggcatgcaa 1560tcaccatctt cctaccagaa caacaactat aggctcaaca tgagtagccc cccacatggg 1620agtcctggtc ttgccccaaa ccagcagaat atcatgattt ctcctcgtaa tcgtgggagt 1680ccaaagatag cctcacatca gttttctcct gttgcaggtg tgcactctcc catggcatct 1740tctggcaata ctgggaacca cagcttttcc agcagctctc tcagtgccct gcaagccatc 1800agtgaaggtg tggggacttc ccttttatct actctgtcat caccaggccc caaattggat 1860aactctccca atatgaatat tacccaacca agtaaagtaa gcaatcagga ttccaagagt 1920cctctgggct tttattgcga ccaaaatcca gtggagagtt caatgtgtca gtcaaatagc 1980agagatcacc tcagtgacaa agaaagtaag gagagcagtg ttgagggggc agagaatcaa 2040aggggtcctt tggaaagcaa aggtcataaa aaattactgc agttacttac ctgttcttct 2100gatgaccggg gtcattcctc 
cttgaccaac tcccccctag attcaagttg taaagaatct 2160tctgttagtg tcaccagccc ctctggagtc tcctcctcta catctggagg agtatcctct 2220acatccaata tgcatgggtc actgttacaa gagaagcacc ggattttgca caagttgctg 2280cagaatggga attcaccagc tgaggtagcc aagattactg cagaagccac tgggaaagac 2340accagcagta taacttcttg tggggacgga aatgttgtca agcaggagca gctaagtcct 2400aagaagaagg agaataatgc acttcttaga tacctgctgg acagggatga tcctagtgat 2460gcactctcta aagaactaca gccccaagtg gaaggagtgg ataataaaat gagtcagtgc 2520accagctcca ccattcctag ctcaagtcaa gagaaagacc ctaaaattaa gacagagaca 2580agtgaagagg gatctggaga cttggataat ctagatgcta ttcttggtga tctgactagt 2640tctgactttt acaataattc catatcctca aatggtagtc atctggggac taagcaacag 2700gtgtttcaag gaactaattc tctgggtttg aaaagttcac agtctgtgca gtctattcgt 2760cctccatata accgagcagt gtctctggat agccctgttt ctgttggctc aagtcctcca 2820gtaaaaaata tcagtgcttt ccccatgtta ccaaagcaac ccatgttggg tgggaatcca 2880agaatgatgg atagtcagga aaattatggc tcaagtatgg gagactgggg cttaccaaac 2940tcaaaggccg gcagaatgga acctatgaat tcaaactcca tgggaagacc aggaggagat 3000tataatactt ctttacccag acctgcactg ggtggctcta ttcccacatt gcctcttcgg 3060tctaatagca taccaggtgc gagaccagta ttgcaacagc agcagcagat gcttcaaatg 3120aggcctggtg aaatccccat gggaatgggg gctaatccct atggccaagc agcagcatct 3180aaccaactgg gttcctggcc cgatggcatg ttgtccatgg aacaagtttc tcatggcact 3240caaaataggc ctcttcttag gaattccctg gatgatcttg ttgggccacc ttccaacctg 3300gaaggccaga gtgacgaaag agcattattg gaccagctgc acactcttct cagcaacaca 3360gatgccacag gcctggaaga aattgacaga gctttgggca ttcctgaact tgtcaatcag 3420ggacaggcat tagagcccaa acaggatgct ttccaaggcc aagaagcagc agtaatgatg 3480gatcagaagg caggattata tggacagaca tacccagcac aggggcctcc aatgcaagga 3540ggctttcatc ttcagggaca atcaccatct tttaactcta tgatgaatca gatgaaccag 3600caaggcaatt ttcctctcca aggaatgcac ccacgagcca acatcatgag accccggaca 3660aacaccccca agcaacttag aatgcagctt cagcagaggc tgcagggcca gcagtttttg 3720aatcagagcc gacaggcact tgaattgaaa atggaaaacc ctactgctgg tggtgctgcg 3780gtgatgaggc ctatgatgca gccccagcag ggttttctta atgctcaaat 
ggtcgcccaa 3840cgcagcagag agctgctaag tcatcacttc cgacaacaga gggtggctat gatgatgcag 3900cagcagcaac agcagcagca gcagcagcag cagcagcaac agcaacagca acagcaacag 3960cagcaacagc agcaaaccca ggccttcagc ccacctccta atgtgactgc ttcccccagc 4020atggatgggc ttttggcagg acccacaatg ccacaagctc ctccgcaaca gtttccatat 4080caaccaaatt atggaatggg acaacaacca gatccagcct ttggtcgagt gtctagtcct 4140cccaatgcaa tgatgtcgtc aagaatgggt ccctcccaga atcccatgat gcaacacccg 4200caggctgcat ccatctatca gtcctcagaa atgaagggct ggccatcagg aaatttggcc 4260aggaacagct ccttttccca gcagcagttt gcccaccagg ggaatcctgc agtgtatagt 4320atggtgcaca tgaatggcag cagtggtcac atgggacaga tgaacatgaa ccccatgccc 4380atgtctggca tgcctatggg tcctgatcag aaatactgct gacatctctg caccaggacc 4440tcttaaggaa accactgtac aaatgacact gcactaggat tattgggaag gaatcattgt 4500tccaggcatc catcttggaa gaaaggacca gctttgagct ccatcaaggg tattttaagt 4560gatgtcattt gagcaggact ggattttaag ccgaagggca atatctacgt gtttttcccc 4620cctccttctg ctgtgtatca tggtgttcaa aacagaaatg ttttttggca ttccacctcc 4680tagggatata attctggaga catggagtgt tactgatcat aaaacttttg tgtcactttt 4740ttctgccttg ctagccaaaa tctcttaaat acacgtaggt gggccagaga acattggaag 4800aatcaagaga gattagaata tctggtttct ctagttgcag tattggacaa agagcatagt 4860cccagccttc aggtgtagta gttctgtgtt gaccctttgt ccagtggaat tggtgattct 4920gaattgtcct ttactaatgg tgttgagttg ctctgtccct attatttgcc ctaggctttc 4980tcctaatgaa ggttttcatt tgccattcat gtcctgtaat acttcacctc caggaactgt 5040catggatgtc caaatggctt tgcagaaagg aaatgagatg acagtattta atcgcagcag 5100tagcaaactt ttcacatgct aatgtgcagc tgagtgcact ttatttaaaa agaatggata 5160aatgcaatat tcttgaggtc ttgagggaat agtgaaacac attcctggtt tttgcctaca 5220cttacgtgtt agacaagaac tatgattttt ttttttaaag tactggtgtc accctttgcc 5280tatatggtag agcaataatg ctttttaaaa ataaacttct gaaaacccaa ggccaggtac 5340tgcattctga atcagaatct cgcagtgttt ctgtgaatag atttttttgt aaatatgacc 5400tttaagatat tgtattatgt aaaatatgta tatacctttt tttgtaggtc acaacaactc 5460atttttacag agtttgtgaa gctaaatatt taacattgtt gatttcagta agctgtgtgg 5520tgaggctacc agtggaagag 
acatcccttg acttttgtgg cctgggggag gggtagtgca 5580ccacagcttt tccttcccca ccccccagcc ttagatgcct cgctcttttc aatctcttaa 5640tctaaatgct ttttaaagag attatttgtt tagatgtagg cattttaatt ttttaaaaat 5700tcctctacca gaactaagca ctttgttaat ttggggggaa agaatagata tggggaaata 5760aacttaaaaa aaaatcagga atttaaaaaa aacgagcaat ttgaagagaa tcttttggat 5820tttaagcagt ccgaaataat agcaattcat gggctgtgtg tgtgtgtgta tgtgtgtgtg 5880tgtgtgtgta tgtttaatta tgttaccttt tcatcccctt taggagcgtt ttcagatttt 5940ggttcgtaag acctgaatcc catattgaga tctcgagtag aatccttggt gtggtttctg 6000gtgtctgctc agctgtcccc tcattctact aatgtgatgc tttcattatg tccctgtgga 6060ttagaatagt gtcagttatt tcttaagtaa ctcagtaccc agaacagcca gttttactgt 6120gattcagagc cacagtctaa ctgagcacct tttaaacccc tccctcttct gccccctacc 6180acttttctgc tgttgcctct ctttgacacc tgttttagtc agttgggagg aagggaaaaa 6240tcaagtttaa ttccctttat ctgggttaat tcatttggtt caaatagttg acggaattgg 6300gtttctgaat gtctgtgaat ttcagaggtc tctgctagcc ttggtatcat tttctagcaa 6360taactgagag ccagttaatt ttaagaattt cacacattta gccaatcttt ctagatgtct 6420ctgaaggtaa gatcatttaa tatctttgat atgcttacga gtaagtgaat cctgattatt 6480tccagaccca ccaccagagt ggatcttatt ttcaaagcag tatagacaat tatgagtttg 6540ccctctttcc cctaccaagt tcaaaatata tctaagaaag attgtaaatc cgaaaacttc 6600cattgtagtg gcctgtgctt ttcagatagt atactctcct gtttggagac agaggaagaa 6660ccaggtcagt ctgtctcttt ttcagctcaa ttgtatctga cccttcttta agttatgtgt 6720gtggggagaa atagaatggt gctcttatgt cgac 6754377757DNAHomo sapiens 377ggaaccgaga ggctgagact aacccagaaa catccaattc tcaaactgaa gctcgcactc 60tcgcctccag catgaaagtc tctgccgccc ttctgtgcct gctgctcata gcagccacct 120tcattcccca agggctcgct cagccagatg caatcaatgc cccagtcacc tgctgttata 180acttcaccaa taggaagatc tcagtgcaga ggctcgcgag ctatagaaga atcaccagca 240gcaagtgtcc caaagaagct gtgatcttca agaccattgt ggccaaggag atctgtgctg 300accccaagca gaagtgggtt caggattcca tggaccacct ggacaagcaa acccaaactc 360cgaagacttg aacactcact ccacaaccca agaatctgca gctaacttat tttcccctag 420ctttccccag acaccctgtt ttattttatt ataatgaatt ttgtttgttg atgtgaaaca 
480ttatgcctta agtaatgtta attcttattt aagttattga tgttttaagt ttatctttca 540tggtactagt gttttttaga tacagagact tggggaaatt gcttttcctc ttgaaccaca 600gttctacccc tgggatgttt tgagggtctt tgcaagaatc attaatacaa agaatttttt 660ttaacattcc aatgcattgc taaaatatta ttgtggaaat gaatattttg taactattac 720accaaataaa tatatttttg tacaaaaaaa aaaaaaa 757378476DNAHomo sapiens 378taaaggcaaa gaaggttttt atttaagtga caacatttga gagctaaaaa ccagctcaca 60tcaaaatcaa gacccagttg taaaaatctt ttaactccat aatgctgttt ttgtcttgtt 120agaaatctga tatcttacat tagcgtttct aacggatttt gtacaaggca gccataagga 180atataataaa cctttttcac cacagaacca tctgtcacag ataatactga aagttacaca 240cttaggaaca gtcagaccac agacaaggtc agactggctg ccaccaccaa gtaaacaact 300agaaaaggac agcggggtcc aagggtgggg gtccctgtgc acgagtcgcc ctcctctggc 360ctgccccccc tcgggtcacc tgtttctcct ttgccccaaa gagggtggag tcaaatgcag 420attttcctcc caactgcctg ttagtgtctc aacaaggaga gcagagccca ggtcag 4763792518DNAHomo sapiens 379gggtgcgctc ggccgtggcg cacctggtga gctccggggg cgctccgcct ccgcgcccca 60aatccccgga cctgcccaac gccgcctcgg cgccgcccgc cgccgctcca gaagcgccca 120ggagccctcc cgcgaaggct gggagcggga gcgcgacgcc cgcgaaggct gttgaggctc 180gagcgagctt ctccagaccg acctttctgc agctgagccc cggggggctg cgacgcgccg 240atgaccacgc gggccgggct gtgcaaagcc ccccggacac gggccgccgc ctgccctgga 300gcacaggcta cgccgagtga gcgccccctg gggcacccaa accaggatgg ggctcccacc 360cctctcccca gctccgcatc cccggcgcta ggacgcgttc cccacgccgc gtccgggcca 420ggagctccct tttccgtgga cctttgctat cctctggtct tcgggccgca ccccctccca 480acccattttc cagtgggggg cagcctgtgt caccttcttc acgtccttcc cgctcattga 540ctgccctcgc ccacgccgcc tcaggaccct gttctgcccc agagcccgga gggcggagag 600cccggcgaag gatgagttgg ccagttcccc gtcgcggccc ggcagcttaa aggctaaggg 660aaaaggggtt tcacgaagga gcggggttct ttttaatagg ggacatagcg gttgggaaga 720ctcgctcacc cgcttcccgg ctccagcgcc ccagttccct gtccctctta ccgtagttcc 780cctccccctc cacacccaga aatagcccgc gacaccagga ggccgccagc ttccccagga 840gcggggaggg ggacgcccgg ggtagaggag ggtcccattt agatgccctt cagcctgcca 900actcgtgctg gcctggcaaa gaagcggacc ccctgcccgg 
agcggccggc tggcccccgg 960gctgtgtgta ttttaaatgc atctgccggg aacgcagagc accgagggag atgggggcgc 1020tcagttcgct gaggaaggtg gctggtggcc catggaccca ccaccacctc ccttagcctc 1080ctgtgtggga ggagtttatg ggtatgtggc tcctgcccag tccaggtggg ctttcacttc 1140tactctattt cagttcctct ttcccgatct gggctggaga gcttcctcat tgttaaggca 1200gcagaaactt tcgctggatg gttttaggat aaggggtcat caatgctggc aagagtcggc 1260acaatgagga ccaggcttgc tgtgaagtgg tgtatgtgga aggtcggagg agtgttacag 1320gagtacctag ggagcctagc cgaggccagg gactctgctt ctactactgg ggcctatttg 1380atgggcatgc agggggcgga gctgctgaaa tggcctcacg gctcctgcat cgccatatcc 1440gagagcagct aaaggacctg aaggaagtga gccacgagag cctggtagtg ggggccattg 1500agaatgcctt ccagctcatg gatgagcaga tggcccggga gcggcgtggc caccaagtgg 1560aggggggctg ctgtgcactg gttgtgatct acctgctagg caaggtgtac gtggccaatg 1620caggcgatag cagggccatc attgtccgga atggtgaaat cattccaatg tcccgggagt 1680ttaccccgga gactgagcgc cagcgtcttc agctgcttgg cttcctgaaa ccagagctgc 1740taggcagtga attcacccac cttgagttcc cccgcagagt tctgcccaag gagctggggc 1800agaggatgtt gtaccgggac cagaacatga ccggctgggc ctacaaaaag atcgagctgg 1860aggatctcag gtttcctctg gtctgtgggg agggcaaaaa ggctcgggtg atggccacca 1920ttggggtgac ccgaggcttg ggagaccaca gccttaaggt ctgcagttcc accctgccca 1980tcaagccctt tctctcctgc ttccctgagg tacgagtgta tgacctgaca caatatgagc 2040actgcccaga tgatgtgcta gtcctgggaa cagatggcct gtgggatgtc actactgact 2100gtgaggtagc tgccactgtg gacagggtgc tgtcggccta tgagcctaat gaccacagca 2160ggtatacagc tctggcccaa gctctggtcc tgggggcccg gggtaccccc cgagaccgtg 2220gctggcgtct ccccaacaac aagctgggtt ccggggatga catctctgtc ttcgtcatcc 2280ccctgggagg gccaggcagt tactcctgag gggctgaaca ccatccctcc cactagcctc 2340tccatactta ctcctctcac agcccaaatt ctgaagttgt ctccctgacc cttctttagt 2400ggcaacttaa ctgaagaagg gatgtccgct atatccaaaa ttacagctat tggcaaataa 2460acgagatgga taaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaa 25183804160DNAHomo sapiens 380gcgcttgcgg aggattgcgt tgacgagact cttatttatt gtcaccaacc tgtggtggaa 60tttgcagttg cacattggat ctgattcgcc ccgccccgaa tgacgcctgc 
ccggaggcag 120tgaaagtaca gccgcgccgc cccaagtcag cctggacaca taaatcagca cgcggccgga 180gaaccccgca atctctgcgc ccacaaaata caccgacgat gcccgatcta ctttaagggc 240tgaaacccac gggcctgaga gactataaga gcgttcccta ccgccatgga acaacgggga 300cagaacgccc cggccgcttc gggggcccgg aaaaggcacg gcccaggacc cagggaggcg 360cggggagcca ggcctgggct ccgggtcccc aagacccttg tgctcgttgt cgccgcggtc 420ctgctgttgg tctcagctga gtctgctctg atcacccaac aagacctagc tccccagcag 480agagcggccc cacaacaaaa gaggtccagc ccctcagagg gattgtgtcc acctggacac 540catatctcag aagacggtag agattgcatc tcctgcaaat atggacagga ctatagcact 600cactggaatg acctcctttt ctgcttgcgc tgcaccaggt gtgattcagg tgaagtggag 660ctaagtccct gcaccacgac cagaaacaca gtgtgtcagt gcgaagaagg caccttccgg 720gaagaagatt ctcctgagat gtgccggaag tgccgcacag ggtgtcccag agggatggtc 780aaggtcggtg attgtacacc ctggagtgac atcgaatgtg tccacaaaga atcaggtaca 840aagcacagtg gggaagcccc agctgtggag gagacggtga cctccagccc agggactcct 900gcctctccct gttctctctc aggcatcatc ataggagtca cagttgcagc cgtagtcttg 960attgtggctg tgtttgtttg caagtcttta ctgtggaaga aagtccttcc ttacctgaaa 1020ggcatctgct caggtggtgg tggggaccct gagcgtgtgg acagaagctc acaacgacct 1080ggggctgagg acaatgtcct caatgagatc gtgagtatct tgcagcccac ccaggtccct 1140gagcaggaaa tggaagtcca ggagccagca gagccaacag gtgtcaacat gttgtccccc 1200ggggagtcag agcatctgct ggaaccggca gaagctgaaa ggtctcagag gaggaggctg 1260ctggttccag caaatgaagg tgatcccact gagactctga gacagtgctt cgatgacttt 1320gcagacttgg tgccctttga ctcctgggag ccgctcatga ggaagttggg cctcatggac 1380aatgagataa aggtggctaa agctgaggca gcgggccaca gggacacctt gtacacgatg 1440ctgataaagt gggtcaacaa aaccgggcga gatgcctctg tccacaccct gctggatgcc 1500ttggagacgc tgggagagag acttgccaag cagaagattg aggaccactt gttgagctct 1560ggaaagttca tgtatctaga aggtaatgca gactctgcca tgtcctaagt gtgattctct 1620tcaggaagtc agaccttccc tggtttacct tttttctgga aaaagcccaa ctggactcca 1680gtcagtagga aagtgccaca attgtcacat gaccggtact ggaagaaact ctcccatcca 1740acatcaccca gtggatggaa catcctgtaa cttttcactg cacttggcat tatttttata 1800agctgaatgt gataataagg acactatgga 
aatgtctgga tcattccgtt tgtgcgtact 1860ttgagatttg gtttgggatg tcattgtttt cacagcactt ttttatccta atgtaaatgc 1920tttatttatt tatttgggct acattgtaag atccatctac acagtcgttg tccgacttca 1980cttgatacta tatgatatga accttttttg ggtggggggt gcggggcagt tcactctgtc 2040tcccaggctg gagtgcaatg gtgcaatctt ggctcactat agccttgacc tctcaggctc 2100aagcgattct cccacctcag ccatccaaat agctgggacc acaggtgtgc accaccacgc 2160ccggctaatt ttttgtattt tgtctagata taggggctct ctatgttgct cagggtggtc 2220tcgaattcct ggactcaagc agtctgccca cctcagactc ccaaagcggt ggaattagag 2280gcgtgagccc ccatgcttgg ccttaccttt ctacttttat aattctgtat gttattattt 2340tatgaacatg aagaaacttt agtaaatgta cttgtttaca tagttatgtg aatagattag 2400ataaacataa aaggaggaga catacaatgg gggaagaaga agaagtcccc tgtaagatgt 2460cactgtctgg gttccagccc tccctcagat gtactttggc ttcaatgatt ggcaacttct 2520acaggggcca gtcttttgaa ctggacaacc ttacaagtat atgagtatta tttataggta 2580gttgtttaca tatgagtcgg gaccaaagag aactggatcc acgtgaagtc ctgtgtgtgg 2640ctggtcccta cctgggcagt ctcatttgca cccatagccc ccatctatgg acaggctggg 2700acagaggcag atgggttaga tcacacataa caatagggtc tatgtcatat cccaagtgaa 2760cttgagccct gtttgggctc aggagataga agacaaaatc tgtctcccac gtctgccatg 2820gcatcaaggg ggaagagtag atggtgcttg agaatggtgt gaaatggttg ccatctcagg 2880agtagatggc ccggctcact tctggttatc tgtcaccctg agcccatgag ctgcctttta 2940gggtacagat tgcctacttg aggaccttgg ccgctctgta agcatctgac tcatctcaga 3000aatgtcaatt cttaaacact gtggcaacag gacctagaat ggctgacgca ttaaggtttt 3060cttcttgtgt cctgttctat tattgtttta agacctcagt aaccatttca gcctctttcc 3120agcaaaccct tctccatagt atttcagtca tggaaggatc atttatgcag gtagtcattc 3180caggagtttt tggtcttttc tgtctcaagg cattgtgtgt tttgttccgg gactggtttg 3240ggtgggacaa agttagaatt gcctgaagat cacacattca gactgttgtg tctgtggagt 3300tttaggagtg gggggtgacc tttctggtct ttgcacttcc atcctctccc acttccatct 3360ggcatcccac gcgttgtccc ctgcacttct ggaaggcaca gggtgctgct gcctcctggt 3420ctttgccttt gctgggcctt ctgtgcagga cgctcagcct cagggctcag aaggtgccag 3480tccggtccca ggtcccttgt cccttccaca gaggccttcc tagaagatgc atctagagtg 
3540tcagccttat cagtgtttaa gatttgtctt ttatttttaa tttttttgag acagaatctc 3600actctctcgc ccaggctgga gtgcaacggt acgatcttgg ctcagtgcaa cctccgcctc 3660ctgggttcaa gcgattctcg tgcctcagcc tccggagtag ctgggattgc aggcacccgc 3720caccacgcct ggctaatttt tgtattttta gtagagacgg ggtttcacca tgttggtcag 3780gctggtctcg aactcctgac ctcaggtgat ccaccttggc ctccgaaagt gctgggatta 3840caggcgtgag ccaccagcca ggccaagcta ttcttttaaa gtaagcttcc tgacgacatg 3900aaataattgg gggttttgtt gtttagttac attaggcttt gctatatccc caggccaaat 3960agcatgtgac acaggacagc catagtatag tgtgtcactc gtggttggtg tcctttcatg 4020cttctgccct gtcaaaggtc cctatttgaa atgtgttata atacaaacaa ggaagcacat 4080tgtgtacaaa atacttatgt atttatgaat ccatgaccaa attaaatatg aaaccttata 4140taaaaaaaaa aaaaaaaaaa 41603811295DNAHomo sapiens 381gtgcggagtt tggctgctcc ggggttagca ggtgagcctg cgatgcgcgg gaagacgttc 60cgctttgaaa tgcagcggga tttggtgagt ttcccgctgt ctccagcggt gcgggtgaag 120ctggtgtctg cggggttcca gactgctgag gaactcctag aggtgaaacc ctccgagctt 180agcaaagaag ttgggatatc taaagcagaa gccttagaaa ctctgcaaat tatcagaaga 240gaatgtctca caaataaacc aagatatgct ggtacatctg agtcacacaa gaagtgtaca 300gcactggaac ttcttgagca ggagcatacc cagggcttca taatcacctt ctgttcagca 360ctagatgata ttcttggggg tggagtgccc ttaatgaaaa caacagaaat ttgtggtgca 420ccaggtgttg gaaaaacaca attatgtatg cagttggcag tagatgtgca gataccagaa 480tgttttggag gagtggcagg tgaagcagtt tttattgata cagagggaag ttttatggtt 540gatagagtgg tagaccttgc tactgcctgc attcagcacc

ttcagcttat agcagaaaaa 600cacaagggag aggaacaccg aaaagctttg gaggatttca ctcttgataa tattctttct 660catatttatt attttcgctg tcgtgactac acagagttac tggcacaagt ttatcttctt 720ccagatttcc tttcagaaca ctcaaaggtt cgactagtga tagtggatgg tattgctttt 780ccatttcgtc atgacctaga tgacctgtct cttcgtactc ggttattaaa tggcctagcc 840cagcaaatga tcagccttgc aaataatcac agattagctg taattttaac caatcagatg 900acaacaaaga ttgatagaaa tcaggccttg cttgttcctg cattagggga aagttgggga 960catgctgcta caatacggct aatctttcat tgggaccgaa agcaaaggtt ggcaacattg 1020tacaagtcac ccagccagaa ggaatgcaca gtactgtttc aaatcaaacc tcagggattt 1080agagatactg ttgttacttc tgcatgttca ttgcaaacag aaggttcctt gagcacccgg 1140aaacggtcac gagacccaga ggaagaatta taacccagaa acaaatctca aagtgtacaa 1200atttattgat gttgtgaaat caatgtgtac aagtggactt gttaccttaa agtataaata 1260aacacactat ggcatgaatg aaaaaaaaaa aaaaa 12953822210DNAHomo sapiens 382cgcgcccctc cctcctcgcg gacctggcgg tgccggcgcc cggagtggcc ctttaaaagg 60cagcttattg tccggagggg gcgggcgggg ggcgccgacc gcggcctgag gcccggcccc 120tcccctctcc ctccctctgt ccccgcgtcg ctcgctggct agctcgctgg ctcgctcgcc 180cgtccggcgc acgctccgcc tccgtcagtt ggctccgctg tcgggtgcgc ggcgtggagc 240ggcagccggt ctggacgcgc ggccggggct gggggctggg agcgcggcgc gcaagatctc 300cccgcgcgag agcggcccct gccaccgggc gaggcctgcg ccgcgatggc agagatgggc 360agtaaagggg tgacggcggg aaagatcgcc agcaacgtgc agaagaagct cacccgcgcg 420caggagaagg ttctccagaa gctggggaag gcagatgaga ccaaggatga gcagtttgag 480cagtgcgtcc agaatttcaa caagcagctg acggagggca cccggctgca gaaggatctc 540cggacctacc tggcctccgt caaagccatg cacgaggctt ccaagaagct gaatgagtgt 600ctgcaggagg tgtatgagcc cgattggccc ggcagggatg aggcaaacaa gatcgcagag 660aacaacgacc tgctgtggat ggattaccac cagaagctgg tggaccaggc gctgctgacc 720atggacacgt acctgggcca gttccccgac atcaagtcac gcattgccaa gcgggggcgc 780aagctggtgg actacgacag tgcccggcac cactacgagt cccttcaaac tgccaaaaag 840aaggatgaag ccaaaattgc caaggccgag gaggagctca tcaaagccca gaaggtgttt 900gaggagatga atgtggatct gcaggaggag ctgccgtccc tgtggaacag ccgcgtaggt 960ttctacgtca acacgttcca gagcatcgcg 
ggcctggagg aaaacttcca caaggagatg 1020agcaagctca accagaacct caatgatgtg ctggtcggcc tggagaagca acacgggagc 1080aacaccttca cggtcaaggc ccagcccaga aagaaaagta aactgttttc gcggctgcgc 1140agaaagaaga acagtgacaa cgcgcctgca aaagggaaca agagcccttc gcctccagat 1200ggctcccctg ccgccacccc cgagatcaga gtcaaccacg agccagagcc ggccggcggg 1260gccacgcccg gggccaccct ccccaagtcc ccatctcagc cagcagaggc ctcggaggtg 1320gcgggtggga cccaacctgc ggctggagcc caggagccag gggagacggc ggcaagtgaa 1380gcagcctcca gctctcttcc tgctgtcgtg gtggagacct tcccagcaac tgtgaatggc 1440accgtggagg gcggcagtgg ggccgggcgc ttggacctgc ccccaggttt catgttcaag 1500gtacaggccc agcacgacta cacggccact gacacagacg agctgcagct caaggctggt 1560gatgtggtgc tggtgatccc cttccagaac cctgaagagc aggatgaagg ctggctcatg 1620ggcgtgaagg agagcgactg gaaccagcac aaggagctgg agaagtgccg tggcgtcttc 1680cccgagaact tcactgagag ggtcccatga cggcggggcc caggcagcct ccgggcgtgt 1740gaagaacacc tcctcccgaa aaatgtgtgg ttcttttttt tgttttgttt tcgtttttca 1800tcttttgaag agcaaaggga aatcaagagg agacccccag gcagaggggc gttctcccaa 1860agattaggtc gttttccaaa gagccgcgtc ccggcaagtc cggcggaatt caccagtgtt 1920cctgaagctg ctgtgtcctc tagttgagtt tctggcgccc ctgcctgtgc ccgcatgtgt 1980gcctggccgc agggcggggc tgggggctgc cgagccacca tgcttgcctg aagcttcggc 2040cgcgccaccc gggcaagggt cctcttttcc tggcagctgc tgtgggtggg gcccagacac 2100cagcctagcc tggctctgcc ccgcagacgg tctgtgtgct gtttgaaaat aaatcttagt 2160gttcaaaaca aaatgaaaca aaaaaaaaat gataaaaact ctcaaaaaaa 22103834604DNAHomo sapiens 383ggaacagctt gtccacccgc cggccggacc agaagccttt gggtctgaag tgtctgtgag 60acctcacaga agagcacccc tgggctccac ttacctgccc cctgctcctt cagggatgga 120ggcaatggcg gccagcactt ccctgcctga ccctggagac tttgaccgga acgtgccccg 180gatctgtggg gtgtgtggag accgagccac tggctttcac ttcaatgcta tgacctgtga 240aggctgcaaa ggcttcttca ggcgaagcat gaagcggaag gcactattca cctgcccctt 300caacggggac tgccgcatca ccaaggacaa ccgacgccac tgccaggcct gccggctcaa 360acgctgtgtg gacatcggca tgatgaagga gttcattctg acagatgagg aagtgcagag 420gaagcgggag atgatcctga agcggaagga ggaggaggcc ttgaaggaca gtctgcggcc 
480caagctgtct gaggagcagc agcgcatcat tgccatactg ctggacgccc accataagac 540ctacgacccc acctactccg acttctgcca gttccggcct ccagttcgtg tgaatgatgg 600tggagggagc catccttcca ggcccaactc cagacacact cccagcttct ctggggactc 660ctcctcctcc tgctcagatc actgtatcac ctcttcagac atgatggact cgtccagctt 720ctccaatctg gatctgagtg aagaagattc agatgaccct tctgtgaccc tagagctgtc 780ccagctctcc atgctgcccc acctggctga cctggtcagt tacagcatcc aaaaggtcat 840tggctttgct aagatgatac caggattcag agacctcacc tctgaggacc agatcgtact 900gctgaagtca agtgccattg aggtcatcat gttgcgctcc aatgagtcct tcaccatgga 960cgacatgtcc tggacctgtg gcaaccaaga ctacaagtac cgcgtcagtg acgtgaccaa 1020agccggacac agcctggagc tgattgagcc cctcatcaag ttccaggtgg gactgaagaa 1080gctgaacttg catgaggagg agcatgtcct gctcatggcc atctgcatcg tctccccaga 1140tcgtcctggg gtgcaggacg ccgcgctgat tgaggccatc caggaccgcc tgtccaacac 1200actgcagacg tacatccgct gccgccaccc gcccccgggc agccacctgc tctatgccaa 1260gatgatccag aagctagccg acctgcgcag cctcaatgag gagcactcca agcagtaccg 1320ctgcctctcc ttccagcctg agtgcagcat gaagctaacg ccccttgtgc tcgaagtgtt 1380tggcaatgag atctcctgac taggacagcc tgtgcggtgc ctgggtgggg ctgctcctcc 1440agggccacgt gccaggcccg gggctggcgg ctactcagca gccctcctca cccgtctggg 1500gttcagcccc tcctctgcca cctcccctat ccacccagcc cattctctct cctgtccaac 1560ctaacccctt tcctgcgggc ttttccccgg tcccttgaga cctcagccat gaggagttgc 1620tgtttgtttg acaaagaaac ccaagtgggg gcagagggca gaggctggag gcaggccttg 1680cccagagatg cctccaccgc tgcctaagtg gctgctgact gatgttgagg gaacagacag 1740gagaaatgca tccattcctc agggacagag acacctgcac ctccccccac tgcaggcccc 1800gcttgtccag cgcctagtgg ggtctccctc tcctgcctta ctcacgataa ataatcggcc 1860cacagctccc accccacccc cttcagtgcc caccaacatc ccattgccct ggttatattc 1920tcacgggcag tagctgtggt gaggtgggtt ttcttcccat cactggagca ccaggcacga 1980acccacctgc tgagagaccc aaggaggaaa aacagacaaa aacagcctca cagaagaata 2040tgacagctgt ccctgtcacc aagctcacag ttcctcgccc tgggtctaag gggttggttg 2100aggtggaagc cctccttcca cggatccatg tagcaggact gaattgtccc cagtttgcag 2160aaaagcacct gccgacctcg tcctccccct gccagtgcct 
tacctcctgc ccaggagagc 2220cagccctccc tgtcctcctc ggatcaccga gagtagccga gagcctgctc ccccaccccc 2280tccccagggg agagggtctg gagaagcagt gagccgcatc ttctccatct ggcagggtgg 2340gatggaggag aagaattttc agaccccagc ggctgagtca tgatctccct gccgcctcaa 2400tgtggttgca aggccgctgt tcaccacagg gctaagagct aggctgccgc accccagagt 2460gtgggaaggg agagcggggc agtctcgggt ggctagtcag agagagtgtt tgggggttcc 2520gtgatgtagg gtaaggtgcc ttcttattct cactccacca cccaaaagtc aaaaggtgcc 2580tgtgaggcag gggcggagtg atacaacttc aagtgcatgc tctctgcagg tcgagcccag 2640cccagctggt gggaagcgtc tgtccgttta ctccaaggtg ggtctttgtg agagtgagct 2700gtaggtgtgc gggaccggta cagaaaggcg ttcttcgagg tggatcacag aggcttcttc 2760agatcaatgc ttgagtttgg aatcggccgc attccctgag tcaccaggaa tgttaaagtc 2820agtgggaacg tgactgcccc aactcctgga agctgtgtcc ttgcacctgc atccgtagtt 2880ccctgaaaac ccagagagga atcagacttc acactgcaag agccttggtg tccacctggc 2940cccatgtctc tcagaattct tcaggtggaa aaacatctga aagccacgtt ccttactgca 3000gaatagcata tatatcgctt aatcttaaat ttattagata tgagttgttt tcagactcag 3060actccatttg tattatagtc taatatacag ggtagcaggt accactgatt tggagatatt 3120tatgggggga gaacttacat tgtgaaactt ctgtacatta attattattg ctgttgttat 3180tttacaaggg tctagggaga gacccttgtt tgattttagc tgcagaactg tattggtcca 3240gcttgctctt cagtgggaga aaaacacttg taagttgcta aacgagtcaa tcccctcatt 3300caggaaaact gacagaggag ggcgtgactc acccaagcca tatataacta gctagaagtg 3360ggccaggaca ggccgggcgc ggtggctcac gcctgtaatc ccagcagttt gggaggtcga 3420ggtaggtgga tcacctgagg tcgggagttc gagaccaacc tgaccaacat ggagaaaccc 3480tgtctctatt aaaaatacaa aaaaaaaaaa aaaaaaaaat agccgggcat ggtggcgcaa 3540gcctgtaatc ccagctactc aggaggctga ggcagaagaa ttgaacccag gaggtggagg 3600ttgcagtgag ctgagatcgt gccgttactc tccaacctgg acaacaagag cgaaactccg 3660tcttagaagt ggaccaggac aggaccagat tttggagtca tggtccggtg tccttttcac 3720tacaccatgt ttgagctcag acccccactc tcattcccca ggtggctgac ccagtccctg 3780ggggaagccc tggatttcag aaagagccaa gtctggatct gggacccttt ccttccttcc 3840ctggcttgta actccaccaa gcccatcaga aggagaagga aggagactca cctctgcctc 3900aatgtgaatc 
agaccctacc ccaccacgat gtgccctggc tgctgggctc tccacctcag 3960gccttggata atgctgttgc ctcatctata acatgcattt gtctttgtaa tgtcaccacc 4020ttcccagctc tccctctggc cctgcttctt cggggaactc ctgaaatatc agttactcag 4080ccctgggccc caccacctag gccactcctc caaaggaagt ctaggagctg ggaggaaaag 4140aaaagagggg aaaatgagtt tttatggggc tgaacgggga gaaaaggtca tcatcgattc 4200tactttagaa tgagagtgtg aaatagacat ttgtaaatgt aaaactttta aggtatatca 4260ttataactga aggagaaggt gccccaaaat gcaagatttt ccacaagatt cccagagaca 4320ggaaaatcct ctggctggct aactggaagc atgtaggaga atccaagcga ggtcaacaga 4380gaaggcagga atgtgtggca gatttagtga aagctagaga tatggcagcg aaaggatgta 4440aacagtgcct gctgaatgat ttccaaagag aaaaaaagtt tgccagaagt ttgtcaagtc 4500aaccaatgta gaaagctttg cttatggtaa taaaaatggc tcatacttat atagcactta 4560ctttgtttgc aagtactgct gtaaataaat gctttatgca aacc 4604384545DNAHomo sapiens 384gagtgactct cacgagagcc gcgagagtca gcttggccaa tccgtgcggt cggcggccgc 60tccctttata agccgactcg cccggcagcg caccgggttg cggagggtgg gcctgggagg 120ggtggtggcc attttttgtc taaccctaac tgagaagggc gtaggcgccg tgcttttgct 180ccccgcgcgc tgtttttctc gctgactttc agcgggcgga aaagcctcgg cctgccgcct 240tccaccgttc attctagagc aaacaaaaaa tgtcagctgc tggcccgttc gcccctcccg 300gggacctgcg gcgggtcgcc tgcccagccc ccgaaccccg cctggaggcc gcggtcggcc 360cggggcttct ccggaggcac ccactgccac cgcgaagagt tgggctctgt cagccgcggg 420tctctcgggg gcgagggcga ggttcaggcc tttcaggccg caggaagagg aacggagcga 480gtccccgcgc gcggcgcgat tccctgagct gtgggacgtg cacccaggac tcggctcaca 540catgc 545

* * * * *

