Microorganisms Engineered To Use Unconventional Sources Of Nitrogen

South; Colin R. ;   et al.

Patent Application Summary

U.S. patent application number 17/349559 was filed with the patent office on 2021-12-09 for microorganisms engineered to use unconventional sources of nitrogen. The applicant listed for this patent is Ginkgo Bioworks, Inc. Invention is credited to Arthur J. Shaw, IV, Colin R. South.

Application Number: 20210380996 17/349559
Document ID: /
Family ID: 1000005795224
Filed Date: 2021-12-09

United States Patent Application 20210380996
Kind Code A1
South; Colin R. ;   et al. December 9, 2021

MICROORGANISMS ENGINEERED TO USE UNCONVENTIONAL SOURCES OF NITROGEN

Abstract

Disclosed are genetically engineered organisms, such as yeast and bacteria, that have the ability to metabolize atypical nitrogen sources, such as melamine and cyanamide. Fermentation methods using the genetically engineered organisms are also described. The methods of the invention are robust processes for the industrial bioproduction of a variety of compounds, including commodities, fine chemicals, and pharmaceuticals.


Inventors: South; Colin R.; (Lexington, MA) ; Shaw, IV; Arthur J.; (Belmont, MA)
Applicant:
Name                   City    State  Country  Type
Ginkgo Bioworks, Inc.  Boston  MA     US
Family ID: 1000005795224
Appl. No.: 17/349559
Filed: June 16, 2021

Related U.S. Patent Documents

Application Number Filing Date Patent Number
16392322 Apr 23, 2019 11041162
17349559
15679312 Aug 17, 2017 10316323
16392322
14759878 Jul 8, 2015 9765348
PCT/US2014/010332 Jan 6, 2014
15679312
61782351 Mar 14, 2013
61748901 Jan 4, 2013

Current U.S. Class: 1/1
Current CPC Class: C12P 3/00 20130101; C12N 15/81 20130101; C12N 15/70 20130101; C12N 1/20 20130101; C12Y 402/01069 20130101; C12Y 308/01 20130101; C12N 15/815 20130101; C12N 9/88 20130101; C12N 9/14 20130101
International Class: C12N 15/81 20060101 C12N015/81; C12P 3/00 20060101 C12P003/00; C12N 1/20 20060101 C12N001/20; C12N 9/14 20060101 C12N009/14; C12N 9/88 20060101 C12N009/88; C12N 15/70 20060101 C12N015/70

Claims



1-27. (canceled)

28. A fermentation method of culturing a cell comprising: providing a cell comprising a non-native nucleic acid molecule encoding a cyanamide hydratase enzyme; contacting the cell with a substrate, wherein the substrate comprises a nitrogen-containing fraction and a non-nitrogen-containing fraction; the nitrogen-containing fraction comprises, in an amount from about 10% by weight to about 100% by weight, cyanamide, or a salt thereof; and culturing the cell, wherein the cell can, unlike a cell of the same species that lacks the nucleic acid molecule, metabolize the cyanamide.

29. The method of claim 28, wherein the substrate does not comprise an antibiotic.

30. The method of claim 28, wherein the substrate does not comprise ammonium or urea.

31. The method of claim 28, wherein the substrate comprises lignocellulosic material, glucose, xylose, sucrose, acetic acid, formic acid, lactic acid, butyric acid, a free fatty acid, dextrose, glycerol, fructose, lactose, galactose, mannose, rhamnose, or arabinose, or a combination thereof.

32. The method of claim 28, wherein the genetically engineered organism is contacted with the substrate in a fermentor.

33. The method of claim 28, wherein the non-native nucleic acid molecule comprises any one of SEQ ID NOs: 15 and 22-37 or functional variants thereof.
Description



RELATED APPLICATIONS

[0001] This application claims the benefit of priority to U.S. Provisional Patent Application Ser. Nos. 61/748,901, filed Jan. 4, 2013, and 61/782,351, filed Mar. 14, 2013; the contents of both of which are hereby incorporated by reference.

BACKGROUND

[0002] In the fermentation industry, cell culture media is typically formulated to provide all nutrients necessary for the growth of a host cell line, with particular emphasis on meeting the cell line's requirements for carbon, nitrogen, phosphorus, sulfur, and other major nutrients. Some cell lines require additional components, including amino acids, trace minerals and metals, and complex growth factors. The presence of these nutrients provides a suitable growth environment for the organism of choice and, unfortunately, for any potential contaminating organisms. In this environment the production organism is required to compete directly with any contaminant organism in the cell culture.

[0003] Even in robust hosts, the combination of opportunistic infections of the culture and the metabolic burden resulting from the demands of product manufacture is a major concern in monoculture operations. Industrial robustness is typically considered a multigenic trait specific to the host strain and thus difficult to engineer predictably into organisms late in the development process. Addition of selective growth inhibitors, such as bacterial antibiotics, is one method used to create a more robust fermentation environment for host organisms that are resistant to the growth inhibitor. However, antibiotic addition is often undesirable or unfeasible, and spontaneously resistant contaminations frequently result.

[0004] Accordingly, there exists a need for rationally engineered traits that, when engineered into a host organism, create a robust monoculture fermentation environment.

SUMMARY OF THE INVENTION

[0005] In certain embodiments, the invention relates to a genetically engineered organism, wherein the genetically engineered organism has been transformed by a nucleic acid molecule comprising any one or more of the sequences disclosed herein.

[0006] In certain embodiments, the invention relates to a genetically engineered organism, wherein the genetically engineered organism has been transformed by a nucleic acid molecule; the nucleic acid molecule comprises a non-native gene; and the non-native gene encodes for a non-native enzyme selected from the group consisting of allophanate hydrolase, biuret amidohydrolase, cyanuric acid amidohydrolase, guanine deaminase, melamine deaminase, isopropylammelide isopropylaminohydrolase, cyanamide hydratase, urease, and urea carboxylase.

[0007] In certain embodiments, the invention relates to a method, comprising the step of

[0008] contacting any one of the aforementioned genetically engineered organisms with a substrate,

[0009] wherein

[0010] the substrate comprises a nitrogen-containing fraction and a non-nitrogen-containing fraction;

[0011] the nitrogen-containing fraction comprises, in an amount from about 10% by weight to about 100% by weight, a nitrogen-containing compound of any one of Formulas I-III, or a salt thereof;

[0012] a native organism of the same species as the genetically engineered organism could not metabolize (i.e., use as a source of nitrogen) the nitrogen-containing compound;

[0013] the genetically engineered organism converts the substrate to a product; and

[0014] the compound of formula I is

##STR00001##

[0015] wherein, independently for each occurrence,

[0016] is a five-, six-, nine-, or ten-membered aryl or heteroaryl group;

[0017] R is --OH, --CO.sub.2H, --NO.sub.2, --CN, substituted or unsubstituted amino, or substituted or unsubstituted alkyl; and

[0018] n is 0, 1, 2, 3, 4, or 5;

[0019] the compound of formula II is

##STR00002##

[0020] wherein, independently for each occurrence,

[0021] X is --NH--, --N(alkyl)-, --O--, --C(R.sup.1).sub.2--, --S--, or absent;

[0022] Y is --H, --NH.sub.2, --N(H)(alkyl), --N(alkyl).sub.2, --CO.sub.2H, --CN, or substituted or unsubstituted alkyl; and

[0023] R.sup.1 is --H, --OH, --CO.sub.2H, --NO.sub.2, --CN, substituted or unsubstituted amino, or substituted or unsubstituted alkyl; and

[0024] the compound of formula III is

##STR00003##

[0025] wherein, independently for each occurrence,

[0026] Y is --H, --NH.sub.2, --N(H)(alkyl), --N(alkyl).sub.2, --CO.sub.2H, --CN, or substituted or unsubstituted alkyl.

[0027] In certain embodiments, the invention relates to a method, comprising the step of

[0028] contacting any one of the aforementioned genetically engineered organisms with a substrate,

[0029] wherein

[0030] the substrate comprises a nitrogen-containing fraction and a non-nitrogen-containing fraction;

[0031] the nitrogen-containing fraction comprises, in an amount from about 10% by weight to about 100% by weight, a nitrogen-containing compound selected from the group consisting of triazine, urea, melamine, cyanamide, 2-cyanoguanidine, ammeline, guanidine carbonate, ethylenediamine, ammelide, biuret, diethylenetriamine, triethylenetetramine, 1,3-diaminopropane, calcium cyanamide, cyanuric acid, aminoethylpiperazine, piperazine, and allophanate; and

[0032] the genetically engineered organism converts the substrate to a product.
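The weight-percentage limitation on the nitrogen-containing fraction can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function names, the example masses, and the treatment of "about 10%" and "about 100%" as hard bounds of 10.0 and 100.0 are hypothetical and are not taken from the disclosure.

```python
def nitrogen_fraction_percent(masses_g, compound):
    """Weight percent of `compound` within the nitrogen-containing fraction.

    `masses_g` maps each nitrogen-containing component to its mass in grams.
    """
    total = sum(masses_g.values())
    if total <= 0:
        raise ValueError("nitrogen-containing fraction has no mass")
    return 100.0 * masses_g.get(compound, 0.0) / total


def within_recited_range(percent, low=10.0, high=100.0):
    """Check a weight percent against the recited range.

    Assumption: the word "about" is not modeled; the bounds are treated as exact.
    """
    return low <= percent <= high


if __name__ == "__main__":
    # Hypothetical nitrogen-containing fraction: 2 g melamine, 8 g ammonium sulfate.
    fraction = {"melamine": 2.0, "ammonium sulfate": 8.0}
    pct = nitrogen_fraction_percent(fraction, "melamine")
    print(pct, within_recited_range(pct))
```

The non-nitrogen-containing fraction (for example, the carbon source) is deliberately left out of the calculation, because the recited percentage is expressed relative to the nitrogen-containing fraction only.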

[0033] In certain embodiments, the invention relates to a method, comprising the step of

[0034] contacting any one of the aforementioned genetically engineered organisms with a substrate,

[0035] wherein

[0036] the substrate comprises a nitrogen-containing fraction and a non-nitrogen-containing fraction;

[0037] the nitrogen-containing fraction consists essentially of a nitrogen-containing compound selected from the group consisting of triazine, urea, melamine, cyanamide, 2-cyanoguanidine, ammeline, guanidine carbonate, ethylenediamine, ammelide, biuret, diethylenetriamine, triethylenetetramine, 1,3-diaminopropane, calcium cyanamide, cyanuric acid, aminoethylpiperazine, piperazine, and allophanate; and

[0038] the genetically engineered organism converts the substrate to a product.

[0039] In certain embodiments, the invention relates to a method comprising the step of

[0040] contacting any one of the aforementioned genetically engineered organisms with a substrate,

[0041] wherein

[0042] the substrate consists of a nitrogen-containing fraction and a non-nitrogen-containing fraction;

[0043] the nitrogen-containing fraction consists of a nitrogen-containing compound selected from the group consisting of triazine, urea, melamine, cyanamide, 2-cyanoguanidine, ammeline, guanidine carbonate, ethylenediamine, ammelide, biuret, diethylenetriamine, triethylenetetramine, 1,3-diaminopropane, calcium cyanamide, cyanuric acid, aminoethylpiperazine, piperazine, and allophanate; and

[0044] the genetically engineered organism converts the substrate to a product.

[0045] In certain embodiments, the invention relates to a product made by any one of the aforementioned methods.

[0046] In certain embodiments, the invention relates to a recombinant vector comprising a gene operably linked to a promoter, wherein the gene encodes an enzyme; and the enzyme is allophanate hydrolase, biuret amidohydrolase, cyanuric acid amidohydrolase, guanine deaminase, melamine deaminase, isopropylammelide isopropylaminohydrolase, cyanamide hydratase, urease, or urea carboxylase.

BRIEF DESCRIPTION OF THE FIGURES

[0047] FIG. 1 depicts a schematic representation of the melamine degradation pathway. 1--Melamine deaminase (tzrA) (EC 3.5.4.-); 2--Ammeline deaminase (guanine deaminase) (EC 3.5.4.3); 3--N-isopropylammelide isopropylamino (Ammelide) hydrolase (EC 3.5.99.4); 4--Cyanuric acid hydrolase (EC 3.5.2.15); 4a--Carboxybiuret decarboxylase, spontaneous reaction; 5--Biuret amidohydrolase (EC 3.5.1.84); 6--Allophanate hydrolase (EC 3.5.1.54). Nitrogen can be assimilated (as NH.sub.3) by the action of the complete pathway acting on melamine, liberating 6 mol NH.sub.3 per mol melamine, or via a subset of enzymes acting on pathway intermediates (e.g., steps 4, 4a, 5, and 6 acting on cyanuric acid releasing 3 mol NH.sub.3 per mol cyanuric acid).
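The nitrogen bookkeeping described for FIG. 1 can be sketched in Python as follows. Only the totals of 6 mol NH.sub.3 per mol melamine and 3 mol NH.sub.3 per mol cyanuric acid are stated in the text; the per-step ammonia counts in the table below are an assumption based on the commonly reported stoichiometry of these reactions, and the names PATHWAY and nh3_released are illustrative.

```python
# Ordered steps of the melamine degradation pathway (FIG. 1).
# Each entry: (step label, substrate, product, mol NH3 released per mol substrate).
# Assumption: per-step NH3 values follow the commonly reported stoichiometry;
# the text itself states only the overall totals.
PATHWAY = [
    ("1", "melamine", "ammeline", 1),
    ("2", "ammeline", "ammelide", 1),
    ("3", "ammelide", "cyanuric acid", 1),
    ("4", "cyanuric acid", "carboxybiuret", 0),
    ("4a", "carboxybiuret", "biuret", 0),
    ("5", "biuret", "allophanate", 1),
    ("6", "allophanate", "CO2", 2),
]


def nh3_released(start):
    """Total mol NH3 released per mol of `start` when all downstream steps run."""
    substrates = [step[1] for step in PATHWAY]
    if start not in substrates:
        raise ValueError(f"{start!r} is not a pathway intermediate")
    first = substrates.index(start)
    return sum(step[3] for step in PATHWAY[first:])


if __name__ == "__main__":
    for compound in ("melamine", "cyanuric acid"):
        print(compound, nh3_released(compound))
```

Entering the pathway at an intermediate simply drops the upstream steps from the sum, which is how a subset of the enzymes can still deliver assimilable nitrogen.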

[0048] FIG. 2 tabulates exemplary compounds capable of delivering nitrogen that could be accessed by an engineered organism.

[0049] FIG. 3 tabulates DNA and protein sequences encoding the melamine degradation pathway.

[0050] FIG. 4 depicts a schematic representation of the cyanamide assimilation pathway. After conversion of cyanamide to urea by cyanamide hydratase (EC 4.2.1.69), urea can be degraded either via urease (EC 3.5.1.5) or by urea carboxylase (EC 6.3.4.6) and allophanate hydrolase (EC 3.5.1.54).
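The two alternative routes described for FIG. 4 can be represented as a small reaction graph. The Python sketch below is illustrative only: the dictionary REACTIONS and the function routes are hypothetical names, and the graph encodes nothing beyond the enzymes and intermediates named in the paragraph above (with ammonia taken as the assimilated end product).

```python
# Reactions of the cyanamide assimilation pathway (FIG. 4).
# Maps each substrate to a list of (enzyme, product) pairs.
REACTIONS = {
    "cyanamide": [("cyanamide hydratase", "urea")],
    "urea": [
        ("urease", "ammonia"),
        ("urea carboxylase", "allophanate"),
    ],
    "allophanate": [("allophanate hydrolase", "ammonia")],
}


def routes(start, goal="ammonia"):
    """Return every enzyme sequence that converts `start` into `goal`."""
    if start == goal:
        return [[]]
    found = []
    for enzyme, product in REACTIONS.get(start, []):
        for tail in routes(product, goal):
            found.append([enzyme] + tail)
    return found


if __name__ == "__main__":
    for route in routes("cyanamide"):
        print(" -> ".join(route))
```

Both routes share cyanamide hydratase as the first step, so that enzyme is the one a host must acquire if it already has a native urea-degrading route.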

[0051] FIGS. 5-10 depict various plasmids of the invention.

[0052] FIG. 11 tabulates the concentrations of the components in the MOPS medium used in Example 9.

[0053] FIG. 12 depicts the growth progress of NS88 and NS91 (control) in media containing various concentrations of ammonium ion or melamine.

[0054] FIG. 13 depicts the growth progress of NS90 and NS91 (control) in media containing various concentrations of ammonium ion or biuret.

[0055] FIG. 14 depicts images, taken after 48 h, of cultures grown in MOPS media with different nitrogen sources. From left to right: NS88 with 10 mM melamine; NS91 with 10 mM melamine; NS90 with 10 mM biuret (replicate 1); NS90 with 10 mM biuret (replicate 2); and NS91 with 10 mM biuret.

[0056] FIG. 15 depicts a plasmid of the invention.

[0057] FIG. 16 depicts a plasmid of the invention.

[0058] FIG. 17 depicts the growth progress of NS100 (control) and NS101 in media containing no nitrogen source, urea, or cyanamide.

[0059] FIG. 18 depicts the population fraction of NS100 (control) and NS101 in a urea-containing medium.

[0060] FIG. 19 depicts the population fraction of NS100 (control) and NS101 in a cyanamide-containing medium.

[0061] FIG. 20 depicts the growth progress of NS100 (control) and NS101 in media containing no nitrogen source, or media containing cyanamide.

[0062] FIG. 21 depicts the growth of an organism of the invention in the presence of an antibiotic on various nitrogen-containing media (see FIG. 33 for composition of SC amino acid media).

[0063] FIG. 22 tabulates the optical density at 600 nm after growth of four organisms of the invention on various media.

[0064] FIG. 23 tabulates the optical density at 600 nm after growth of three organisms of the invention on various media.

[0065] FIG. 24 depicts the growth of four organisms of the invention (NS91=control) on 0.25 mM melamine, as compared to the standard curves for a native organism on NH.sub.4Cl. Because melamine has six nitrogen atoms, organisms having the ability to utilize melamine should be approximately six times more efficient (see, for example, NS110 on 0.25 mM melamine, as compared to a native organism on 1.5 mM NH.sub.4Cl).

[0066] FIG. 25 depicts the growth of four organisms of the invention (NS91=control) on 0.25 mM ammeline, as compared to the standard curves for a native organism on NH.sub.4Cl. Because ammeline has five nitrogen atoms, organisms having the ability to utilize ammeline should be approximately five times more efficient (see, for example, NS110 on 0.25 mM ammeline, as compared to a native organism on 1.25 mM NH.sub.4Cl).
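The comparison drawn for FIGS. 24 and 25 rests on simple arithmetic: the ammonium-equivalent concentration of a nitrogen source is its molar concentration multiplied by the number of nitrogen atoms per molecule. A minimal Python sketch follows; the function name is illustrative, and the calculation assumes that every nitrogen atom in the compound is released and assimilated.

```python
# Nitrogen atoms per molecule. The values for melamine (6) and ammeline (5)
# are those stated in the figure descriptions; NH4Cl carries one nitrogen.
N_ATOMS = {"NH4Cl": 1, "melamine": 6, "ammeline": 5}


def ammonium_equivalent_mM(compound, conc_mM):
    """Concentration of NH4Cl (mM) supplying the same amount of nitrogen.

    Assumption: complete release and assimilation of every nitrogen atom.
    """
    return N_ATOMS[compound] * conc_mM


if __name__ == "__main__":
    print(ammonium_equivalent_mM("melamine", 0.25))  # compare with 1.5 mM NH4Cl
    print(ammonium_equivalent_mM("ammeline", 0.25))  # compare with 1.25 mM NH4Cl
```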

[0067] FIG. 26 depicts the growth of various organisms of the invention on 0.5 mM NH.sub.4Cl. Importantly, the organisms described in FIGS. 26-28, for example NS120, NS91, NS107, and NS123, are E. coli strains derived from E. coli K12, E. coli B, E. coli Crooks, and E. coli MG1655 and are intended to show the breadth of the invention across various strains of E. coli.

[0068] FIG. 27 depicts the growth of various organisms of the invention on a medium containing no nitrogen.

[0069] FIG. 28 depicts the growth of various organisms of the invention on a medium containing 0.5 mM melamine.

[0070] FIG. 29 tabulates a summary of various plasmids of the invention.

[0071] FIG. 30 tabulates a summary of various organisms of the invention.

[0072] FIG. 31 tabulates the components and molar concentrations of each component in a MOPS defined medium, which is used, for example, with E. coli.

[0073] FIG. 32 tabulates the components and weight concentrations of each component in a YNB medium, which is used, for example, with S. cerevisiae.

[0074] FIG. 33 tabulates the components and weight concentrations of each component in a SC amino acid medium.

DETAILED DESCRIPTION OF THE INVENTION

Overview

[0075] In certain embodiments, the invention relates to a genetically engineered host organism, wherein the genetically engineered host organism has a non-native ability to obtain a growth-limiting nutrient from a complex substrate; and the complex substrate could not have been metabolized or used as a nutrient by the native host organism. In certain embodiments, the non-native ability will provide the organism with a significant competitive advantage, and provide a major barrier to the success of contaminants in a fermentation. In certain embodiments, the genetically engineered host organism is a bacterium, a yeast, a fungus, a mammalian cell, or an insect cell. In certain embodiments, the genetically engineered host organism is a bacterium or a yeast.

[0076] In certain embodiments, the invention relates to a method of using the above-mentioned genetically engineered host organism, comprising contacting the genetically engineered host organism with a modified cell culture medium. In certain embodiments, the invention relates to a method of using the above-mentioned genetically engineered host organism, comprising contacting the genetically engineered host organism with a modified cell culture medium, wherein the genetically engineered host organism converts the cell culture medium to a product. In certain embodiments, using this approach provides a unique and targeted manner to promote the growth of the desired genetically engineered host organism. In certain embodiments, the above-mentioned methods minimize the growth of contaminant organisms, provide a valuable competitive advantage, and allow management of production of a range of valuable products.

[0077] In certain embodiments, the inventive methods decrease or eliminate the need for use of prophylactic antibiotics in large scale yeast cultures. Avoiding unnecessary antibiotics is an important benefit due to emerging environmental considerations and societal pressures. Additionally, in certain embodiments, the technique can be applied to bacterial systems in which antibiotics may not be added.

[0078] In certain embodiments, the genetically engineered host organism is a yeast; and the product is ethanol, isobutanol, lactic acid, an isoprenoid, a lipid, an enzyme product, or a high value specialty chemical.

[0079] In certain embodiments, the genetically engineered host organism is a bacterium;

[0080] and the product is butanol, ethanol, isopropanol, 1,3-propanediol (PDO), 1,4-butanediol (BDO), succinic acid, itaconic acid, an enzyme product, a polyol, a protein product, or a high value specialty chemical.

[0081] In certain embodiments, the inventive technology is applicable in the production of one or more commodities, fine chemicals, and pharmaceuticals.

Definitions

[0082] "Dry weight" and "dry cell weight" mean weight determined in the relative absence of water. For example, reference to oleaginous cells as comprising a specified percentage of a particular component by dry weight means that the percentage is calculated based on the weight of the cell after substantially all water has been removed.

[0083] "Exogenous gene" is a nucleic acid that codes for the expression of an RNA and/or protein that has been introduced into a cell (e.g., by transformation/transfection), and is also referred to as a "transgene." A cell comprising an exogenous gene may be referred to as a recombinant cell, into which additional exogenous gene(s) may be introduced. The exogenous gene may be from a different species (and so heterologous), or from the same species (and so homologous), relative to the cell being transformed. Thus, an exogenous gene can include a homologous gene that occupies a different location in the genome of the cell or is under different control, relative to the endogenous copy of the gene. An exogenous gene may be present in more than one copy in the cell. An exogenous gene may be maintained in a cell as an insertion into the genome (nuclear or plastid) or as an episomal molecule.

[0084] "Expression vector" or "expression construct" or "plasmid" or "recombinant DNA construct" is a vehicle for introducing a nucleic acid into a host cell. The nucleic acid can be one that has been generated via human intervention, including by recombinant means or direct chemical synthesis, with a series of specified nucleic acid elements that permit transcription and/or translation of a particular nucleic acid. The expression vector can be part of a plasmid, virus, or nucleic acid fragment, or other suitable vehicle. Typically, the expression vector includes a nucleic acid to be transcribed operably linked to a promoter.

[0085] "Inducible promoter" is a promoter that mediates transcription of an operably linked gene in response to a particular stimulus.

[0086] "In operable linkage" is a functional linkage between two nucleic acid sequences, such as a control sequence (typically a promoter) and the linked sequence (typically a sequence that encodes a protein, also called a coding sequence). A promoter is in operable linkage with an exogenous gene if it can mediate transcription of the gene.

[0087] "Lysate" is a solution containing the contents of lysed cells.

[0088] "Lysis" is the breakage of the plasma membrane and optionally the cell wall of a biological organism sufficient to release at least some intracellular content, often by mechanical, viral or osmotic mechanisms that compromise its integrity.

[0089] "Lysing" is disrupting the cellular membrane and optionally the cell wall of a biological organism or cell sufficient to release at least some intracellular content.

[0090] "Osmotic shock" is the rupture of cells in a solution following a sudden reduction in osmotic pressure. Osmotic shock is sometimes induced to release cellular components of such cells into a solution.

[0091] The terms "plasmid", "vector" and "cassette" refer to an extrachromosomal element often carrying genes that are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA molecules. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3' untranslated sequence into a cell. "Transformation cassette" refers to a specific vector containing a foreign gene and having elements, in addition to the foreign gene, that facilitate transformation of a particular host cell. "Expression cassette" refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that allow for enhanced expression of that gene in a foreign host.

[0092] "Promoter" is a nucleic acid control sequence that directs transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic acid sequences near the start site of transcription. A promoter also optionally includes distal enhancer or repressor elements, which can be located as much as several thousand base pairs from the start site of transcription.

[0093] "Recombinant" is a cell, nucleic acid, protein, or vector, which has been modified due to the introduction of an exogenous nucleic acid or the alteration of a native nucleic acid. Thus, e.g., recombinant cells can express genes that are not found within the native (non-recombinant) form of the cell or express native genes differently than those genes are expressed by a non-recombinant cell. Recombinant cells can, without limitation, include recombinant nucleic acids that encode for a gene product or for suppression elements such as mutations, knockouts, antisense, interfering RNA (RNAi) or dsRNA that reduce the levels of active gene product in a cell. A "recombinant nucleic acid" is a nucleic acid originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using polymerases, ligases, exonucleases, and endonucleases, or otherwise is in a form not normally found in nature. Recombinant nucleic acids may be produced, for example, to place two or more nucleic acids in operable linkage. Thus, an isolated nucleic acid or an expression vector formed in vitro by ligating DNA molecules that are not normally joined in nature, are both considered recombinant for the purposes of this invention. Once a recombinant nucleic acid is made and introduced into a host cell or organism, it may replicate using the in vivo cellular machinery of the host cell; however, such nucleic acids, once produced recombinantly, although subsequently replicated intracellularly, are still considered recombinant for purposes of this invention. Similarly, a "recombinant protein" is a protein made using recombinant techniques, i.e., through the expression of a recombinant nucleic acid.

[0094] "Sonication" is a process of disrupting biological materials, such as a cell, by use of sound wave energy.

[0095] "Transformation" refers to the transfer of a nucleic acid fragment into a host organism or the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as "recombinant", "transgenic" or "transformed" organisms. Thus, isolated polynucleotides of the present invention can be incorporated into recombinant constructs, typically DNA constructs, capable of introduction into and replication in a host cell. Such a construct can be a vector that includes a replication system and sequences that are capable of transcription and translation of a polypeptide-encoding sequence in a given host cell. Typically, expression vectors include, for example, one or more cloned genes under the transcriptional control of 5' and 3' regulatory sequences and a selectable marker. Such vectors also can contain a promoter regulatory region (e.g., a regulatory region controlling inducible or constitutive, environmentally- or developmentally-regulated, or location-specific expression), a transcription initiation start site, a ribosome binding site, a transcription termination site, and/or a polyadenylation signal.

Microbe Engineering

[0096] A. Overview

[0097] In certain embodiments of the invention, a microorganism is genetically modified to improve or provide de novo growth characteristics on a variety of feedstock materials.

[0098] Genes and gene products may be introduced into microbial host cells. Suitable host cells for expression of the genes and nucleic acid molecules are microbial hosts that can be found broadly within the fungal or bacterial families and which grow over a wide range of temperature, pH values, and solvent tolerances. Examples of suitable host strains include but are not limited to fungal or yeast species, such as Aspergillus, Trichoderma, Saccharomyces, Pichia, Candida, Hansenula, Kluyveromyces, or bacterial species, such as members of the proteobacteria and actinomycetes as well as the specific genera Acinetobacter, Arthrobacter, Brevibacterium, Acidovorax, Bacillus, Clostridia, Streptomyces, Escherichia, Salmonella, Pseudomonas, and Corynebacterium.

[0099] E. coli is well suited to use as the host microorganism in the fermentative processes of the invention.

[0100] Microbial expression systems and expression vectors containing regulatory sequences that direct high level expression of foreign proteins are well known to those skilled in the art. Any of these could be used to construct chimeric genes to produce any one of the gene products of the instant sequences. These chimeric genes could then be introduced into appropriate microorganisms via transformation techniques to provide high-level expression of the enzymes.

[0101] For example, a gene encoding an enzyme can be cloned in a suitable plasmid, and the aforementioned starting parent strain as a host can be transformed with the resulting plasmid. This approach can increase the copy number of each of the genes encoding the enzymes and, as a result, the activities of these enzymes can be increased. The plasmid is not particularly limited so long as it can autonomously replicate in the microorganism.

[0102] Vectors or cassettes useful for the transformation of suitable host cells are well known in the art. Typically the vector or cassette contains sequences directing transcription and translation of the relevant gene, a selectable marker, and sequences allowing autonomous replication or chromosomal integration. Suitable vectors comprise a region 5' of the gene harboring transcriptional initiation controls and a region 3' of the DNA fragment which controls transcriptional termination. It is most preferred when both control regions are derived from genes homologous to the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a production host.

[0103] Promoters, cDNAs, and 3'UTRs, as well as other elements of the vectors, can be generated through cloning techniques using fragments isolated from native sources (see, for example, Molecular Cloning: A Laboratory Manual, Sambrook et al. (3d edition, 2001, Cold Spring Harbor Press); and U.S. Pat. No. 4,683,202 (incorporated by reference)). Alternatively, elements can be generated synthetically using known methods (see, for example, Gene. 1995 Oct. 16; 164(1):49-53).

[0104] B. Homologous Recombination

[0105] Homologous recombination is the ability of complementary DNA sequences to align and exchange regions of homology. Transgenic DNA ("donor") containing sequences homologous to the genomic sequences being targeted ("template") is introduced into the organism and then undergoes recombination into the genome at the site of the corresponding genomic homologous sequences.

[0106] The ability to carry out homologous recombination in a host organism has many practical implications for what can be carried out at the molecular genetic level and is useful in the generation of an oleaginous microbe that can produce tailored oils. By its very nature homologous recombination is a precise gene targeting event, hence, most transgenic lines generated with the same targeting sequence will be essentially identical in terms of phenotype, necessitating the screening of far fewer transformation events. Homologous recombination also targets gene insertion events into the host chromosome, potentially resulting in excellent genetic stability, even in the absence of genetic selection. Because different chromosomal loci will likely impact gene expression, even from heterologous promoters/UTRs, homologous recombination can be a method of querying loci in an unfamiliar genome environment and to assess the impact of these environments on gene expression.

[0107] A particularly useful genetic engineering approach using homologous recombination is to co-opt specific host regulatory elements such as promoters/UTRs to drive heterologous gene expression in a highly specific fashion.

[0108] Because homologous recombination is a precise gene targeting event, it can be used to precisely modify any nucleotide(s) within a gene or region of interest, so long as sufficient flanking regions have been identified. Therefore, homologous recombination can be used as a means to modify regulatory sequences impacting gene expression of RNA and/or proteins. It can also be used to modify protein coding regions in an effort to modify enzyme activities such as substrate specificity, affinities and Km, thereby effecting the desired change in metabolism of the host cell. Homologous recombination provides a powerful means to manipulate the host genome resulting in gene targeting, gene conversion, gene deletion, gene duplication, gene inversion and exchanging gene expression regulatory elements such as promoters, enhancers and 3'UTRs.

[0109] Homologous recombination can be achieved by using targeting constructs containing pieces of endogenous sequences to "target" the gene or region of interest within the endogenous host cell genome. Such targeting sequences can either be located 5' of the gene or region of interest, 3' of the gene/region of interest or even flank the gene/region of interest. Such targeting constructs can be transformed into the host cell either as a supercoiled plasmid DNA with additional vector backbone, a PCR product with no vector backbone, or as a linearized molecule. In some cases, it may be advantageous to first expose the homologous sequences within the transgenic DNA (donor DNA) with a restriction enzyme. This step can increase the recombination efficiency and decrease the occurrence of undesired events. Other methods of increasing recombination efficiency include using PCR to generate transforming transgenic DNA containing linear ends homologous to the genomic sequences being targeted.
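The idea of a targeting construct whose endogenous sequences flank the region of interest can be sketched in Python on plain sequence strings. This is a conceptual illustration only: the function names, the default arm length of 50 bases, and the idealized double-crossover outcome are assumptions, not parameters or results taken from the disclosure.

```python
def build_targeting_construct(genome, start, end, cassette, arm_len=50):
    """Return 5' homology arm + cassette + 3' homology arm.

    The arms are copied from the genome immediately upstream of `start`
    and immediately downstream of `end` (0-based, end-exclusive coordinates).
    `arm_len` is a hypothetical default, not a recommended value.
    """
    if not (0 <= start <= end <= len(genome)):
        raise ValueError("target coordinates fall outside the genome")
    if start < arm_len or len(genome) - end < arm_len:
        raise ValueError("not enough flanking sequence for the requested arms")
    left_arm = genome[start - arm_len:start]
    right_arm = genome[end:end + arm_len]
    return left_arm + cassette + right_arm


def recombine(genome, start, end, cassette):
    """Idealized result of a double crossover: the target region is replaced."""
    return genome[:start] + cassette + genome[end:]


if __name__ == "__main__":
    genome = "A" * 10 + "CCCC" + "G" * 10
    donor = build_targeting_construct(genome, 10, 14, "TTT", arm_len=5)
    print(donor)
    print(recombine(genome, 10, 14, "TTT"))
```

Setting `start` equal to `end` models a pure insertion, while an empty cassette models a deletion, which mirrors the range of outcomes (insertion, deletion, replacement) attributed to homologous recombination above.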

[0110] C. Vectors and Vector Components

[0111] Vectors for transformation of microorganisms in accordance with the present invention can be prepared by known techniques familiar to those skilled in the art in view of the disclosure herein. A vector typically contains one or more genes, in which each gene codes for the expression of a desired product (the gene product) and is operably linked to one or more control sequences that regulate gene expression or target the gene product to a particular location in the recombinant cell.

[0112] This section is divided into two subsections. Subsection 1 describes control sequences typically contained on vectors as well as novel control sequences provided by the present invention. Subsection 2 describes genes typically contained in vectors as well as novel codon optimization methods and genes prepared using them provided by the invention.

[0113] 1. Control Sequences

[0114] Control sequences are nucleic acids that regulate the expression of a coding sequence or direct a gene product to a particular location in or outside a cell. Control sequences that regulate expression include, for example, promoters that regulate transcription of a coding sequence and terminators that terminate transcription of a coding sequence. Another control sequence is a 3' untranslated sequence located at the end of a coding sequence that encodes a polyadenylation signal. Control sequences that direct gene products to particular locations include those that encode signal peptides, which direct the protein to which they are attached to a particular location in or outside the cell.

[0115] Thus, an exemplary vector design for expression of an exogenous gene in a microbe contains a coding sequence for a desired gene product (for example, a selectable marker, or an enzyme) in operable linkage with a promoter active in the microbe. Alternatively, if the vector does not contain a promoter in operable linkage with the coding sequence of interest, the coding sequence can be transformed into the cells such that it becomes operably linked to an endogenous promoter at the point of vector integration.

[0116] The promoter used to express an exogenous gene can be the promoter naturally linked to that gene or can be a heterologous promoter.

[0117] A promoter can generally be characterized as either constitutive or inducible. Constitutive promoters are generally active or function to drive expression at all times (or at certain times in the cell life cycle) at the same level. Inducible promoters, conversely, are active (or rendered inactive) or are significantly up- or down-regulated only in response to a stimulus. Both types of promoters find application in the methods of the invention. Inducible promoters useful in the invention include those that mediate transcription of an operably linked gene in response to a stimulus, such as an exogenously provided small molecule, temperature (heat or cold), lack of nitrogen in culture media, etc. Suitable promoters can activate transcription of an essentially silent gene or upregulate, preferably substantially, transcription of an operably linked gene that is transcribed at a low level.

[0118] Inclusion of a termination region control sequence is optional, and if employed, the choice is primarily one of convenience, as the termination region is relatively interchangeable. The termination region may be native to the transcriptional initiation region (the promoter), may be native to the DNA sequence of interest, or may be obtainable from another source. See, for example, Chen and Orozco, Nucleic Acids Res. (1988) 16:8411.

[0119] 2. Genes and Codon Optimization

[0120] Typically, a gene includes a promoter, coding sequence, and termination control sequences. When assembled by recombinant DNA technology, a gene may be termed an expression cassette and may be flanked by restriction sites for convenient insertion into a vector that is used to introduce the recombinant gene into a host cell. The expression cassette can be flanked by DNA sequences from the genome or other nucleic acid target to facilitate stable integration of the expression cassette into the genome by homologous recombination. Alternatively, the vector and its expression cassette may remain unintegrated (e.g., an episome), in which case, the vector typically includes an origin of replication, which is capable of providing for replication of the heterologous vector DNA.

[0121] A common gene present on a vector is a gene that codes for a protein, the expression of which allows the recombinant cell containing the protein to be differentiated from cells that do not express the protein. Such a gene, and its corresponding gene product, is called a selectable marker or selection marker. Any of a wide variety of selectable markers can be employed in a transgene construct useful for transforming the organisms of the invention.

[0122] For optimal expression of a recombinant protein, it is beneficial to employ coding sequences that produce mRNA with codons optimally used by the host cell to be transformed. Thus, proper expression of transgenes can require that the codon usage of the transgene matches the specific codon bias of the organism in which the transgene is being expressed. The precise mechanisms underlying this effect are many, but include the proper balancing of available aminoacylated tRNA pools with proteins being synthesized in the cell, coupled with more efficient translation of the transgenic messenger RNA (mRNA) when this need is met. When codon usage in the transgene is not optimized, available tRNA pools are not sufficient to allow for efficient translation of the heterologous mRNA resulting in ribosomal stalling and termination and possible instability of the transgenic mRNA.
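A minimal sketch of the codon-matching idea follows. It back-translates a protein using one preferred codon per residue and reports how much of an existing coding sequence already uses preferred codons. The table `PREFERRED_CODON` is a hypothetical placeholder covering only a few residues; it is not measured codon-usage data for any host named in this document, and the disclosure does not prescribe this particular procedure.

```python
# Hypothetical preferred-codon table (illustrative placeholder, NOT real host data).
PREFERRED_CODON = {
    "M": "ATG",
    "K": "AAG",
    "L": "CTG",
    "S": "TCC",
    "*": "TAA",  # stop
}


def back_translate(protein: str, table: dict) -> str:
    """Build a coding sequence using the single preferred codon for each residue."""
    missing = sorted({aa for aa in protein if aa not in table})
    if missing:
        raise ValueError(f"no preferred codon for residue(s): {missing}")
    return "".join(table[aa] for aa in protein)


def preferred_fraction(cds: str, table: dict) -> float:
    """Fraction of codons in `cds` that appear among the table's preferred codons."""
    if not cds or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    codons = [cds[i:i + 3].upper() for i in range(0, len(cds), 3)]
    preferred = set(table.values())
    return sum(codon in preferred for codon in codons) / len(codons)
```

Choosing only the single most-used codon is the simplest possible strategy; practical tools typically also weigh codon frequencies, GC content, and unwanted restriction sites.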

[0123] D. Expression of Two or More Exogenous Genes

[0124] Further, a genetically engineered microorganism may comprise and express more than one exogenous gene. One or more genes can be expressed using an inducible promoter, which allows the relative timing of expression of these genes to be controlled. Expression of the two or more exogenous genes may be under control of the same inducible promoter or under control of different inducible promoters. In the latter situation, expression of a first exogenous gene can be induced for a first period of time (during which expression of a second exogenous gene may or may not be induced) and expression of a second or further exogenous gene can be induced for a second period of time (during which expression of a first exogenous gene may or may not be induced). Provided herein are vectors and methods for engineering microbes to grow on non-traditional growth media.

[0125] E. Transformation

[0126] Cells can be transformed by any suitable technique including, e.g., biolistics, electroporation, glass bead transformation and silicon carbide whisker transformation. Any convenient technique for introducing a transgene into a microorganism can be employed in the present invention. Transformation can be achieved by, for example, the method of D. M. Morrison (Methods in Enzymology 68, 326 (1979)), the method by increasing permeability of recipient cells for DNA with calcium chloride (Mandel, M. and Higa, A., J. Mol. Biol., 53, 159 (1970)), or the like.

[0127] Examples of expression of transgenes in oleaginous yeast (e.g., Yarrowia lipolytica) can be found in the literature (see, for example, Bordes et al., J Microbiol Methods, Jun. 27 (2007)). Examples of expression of exogenous genes in bacteria such as E. coli are well known; see, for example, Molecular Cloning: A Laboratory Manual, Sambrook et al. (3d edition, 2001, Cold Spring Harbor Press).

[0128] Vectors for transformation of microorganisms in accordance with the present invention can be prepared by known techniques familiar to those skilled in the art. In one embodiment, an exemplary vector design for expression of a gene in a microorganism contains a gene encoding an enzyme in operable linkage with a promoter active in the microorganism. Alternatively, if the vector does not contain a promoter in operable linkage with the gene of interest, the gene can be transformed into the cells such that it becomes operably linked to an endogenous promoter at the point of vector integration. The vector can also contain a second gene that encodes a protein. Optionally, one or both gene(s) is/are followed by a 3' untranslated sequence containing a polyadenylation signal. Expression cassettes encoding the two genes can be physically linked in the vector or on separate vectors. Co-transformation of microbes can also be used, in which distinct vector molecules are simultaneously used to transform cells (see, for example, Protist 2004 December; 155(4):381-93). The transformed cells can be optionally selected based upon the ability to grow in the presence of the antibiotic or other selectable marker under conditions in which cells lacking the resistance cassette would not grow.

Nitrogen-Containing Compounds in Feedstocks

[0129] In certain embodiments, the invention relates to use of an atypical nitrogen-containing feedstock comprising, consisting essentially of, or consisting of a nitrogen-containing compound of any one of Formulas I-III. In certain embodiments, a non-genetically engineered organism, i.e., a native organism, could not metabolize (i.e., use as a source of nitrogen) the nitrogen-containing compounds in the feedstock.

[0130] In certain embodiments, the invention relates to any one of the aforementioned nitrogen-containing feedstocks, wherein the nitrogen-containing compound is a compound of formula I or a salt thereof:

##STR00004##

wherein, independently for each occurrence,

[0131] is a five-, six-, nine-, or ten-membered aryl or heteroaryl group;

[0132] R is --OH, --CO₂H, --NO₂, --CN, substituted or unsubstituted amino, or substituted or unsubstituted alkyl; and

[0133] n is 0, 1, 2, 3, 4, or 5.

[0134] In certain embodiments, the invention relates to any one of the aforementioned nitrogen-containing feedstocks, wherein the nitrogen-containing compound is a compound of formula II or a salt thereof:

##STR00005##

wherein, independently for each occurrence,

[0135] X is --NH--, --N(alkyl)-, --O--, --C(R¹)₂--, --S--, or absent;

[0136] Y is --H, --NH₂, --N(H)(alkyl), --N(alkyl)₂, --CO₂H, --CN, or substituted or unsubstituted alkyl; and

[0137] R¹ is --H, --OH, --CO₂H, --NO₂, --CN, substituted or unsubstituted amino, or substituted or unsubstituted alkyl.

[0138] In certain embodiments, the invention relates to any one of the aforementioned nitrogen-containing feedstocks, wherein the nitrogen-containing compound is a compound of formula III or a salt thereof:

##STR00006##

wherein, independently for each occurrence,

[0139] Y is --H, --NH₂, --N(H)(alkyl), --N(alkyl)₂, --CO₂H, --CN, or substituted or unsubstituted alkyl.

[0140] In certain embodiments, the invention relates to any one of the aforementioned nitrogen-containing feedstocks, wherein the nitrogen-containing compound is selected from the group consisting of:

##STR00007##

[0141] In certain embodiments, the invention relates to any one of the aforementioned nitrogen-containing feedstocks, wherein the nitrogen-containing compound is selected from the group consisting of Hydrazine, 5-Aminotetrazole, Tetrazole, Melamine, Cyanamide, 2-Cyanoguanidine, Sodium azide, Carbohydrazide, 1,2,3-Triazole, 1,2,4-Triazole, 1,3-Diaminoguanidine HCl, Ammeline, 1,3,5-triazine, Aminoacetonitrile, Cyanoethylhydrazine, Azodicarbonamide, Biurea, Formamidoxime, 1,2-Dimethylhydrazine, 1,1-Dimethylhydrazine, ethylhydrazine, Ethylenediamine, Sodium dicyanamide, Guanidine carbonate, Methylamine, Ammelide, Hydroxylamine, Malononitrile, Biuret, Diethylenetriamine, Hexamethylenetetramine, Triethylenetetramine, 1,3-Diaminopropane, Hydroxyurea, Tetraethylenepentamine, Thiourea, Succinonitrile, Calcium cyanamide, Cyanuric acid, Aminoethylpiperazine, Piperazine, Dimethylamine, Ethylamine, dalfampridine, Tetranitromethane, Imidazolidinyl urea, Trinitromethane, malonamide, Chloramine, Allophanate, Trimethylamine, Nitromethane, Acetaldoxime, Diazolidinyl urea, 1,2-Cyclohexanedione dioxime, Acetone oxime, Thioacetamide, Sodium thiocyanate, Isothiazole, Thiazole, Dimethylacetamide, Isothiazolinone, Methylene blue, Diethanolamine, Aspartame, Benzisothiazolinone, and Acesulfame potassium.

Exemplary Isolated Nucleic Acid Molecules and Vectors

[0142] In certain embodiments, the invention relates to an isolated nucleic acid molecule, wherein the nucleic acid molecule encodes an enzyme that provides the organism with the ability to assimilate a nitrogen source that otherwise would not have been accessible to the native organism; and the enzyme is allophanate hydrolase, biuret amidohydrolase, cyanuric acid amidohydrolase, guanine deaminase, ammeline hydrolase, ammelide hydrolase, melamine deaminase, isopropylammelide isopropylaminohydrolase, cyanamide hydratase, urease, or urea carboxylase.

[0143] In certain embodiments, the invention relates to an isolated nucleic acid molecule, wherein the nucleic acid molecule is selected from the group consisting of trzE from Rhodococcus sp. strain Mel, trzE from Rhizobium leguminosarum, trzC MEL, trzC 12227, cah from Fusarium oxysporum Fo5176, cah from F. pseudograminearum CS3096, cah from Gibberella zeae PH-1, cah from Aspergillus kawachii IFO 4308, cah from A. niger CBS 513.88, cah from A. niger ATCC 1015, cah from A. oryzae 3.042, cah from S. cerevisiae FostersB, atzF from Pseudomonas sp. strain ADP, DUR1,2 from S. cerevisiae, YALI0E07271g from Y. lipolytica CLIB122, atzE from Pseudomonas sp. strain ADP, atzD from Pseudomonas sp. strain ADP, trzD from Pseudomonas sp. strain NRRL B-12227, atzD from Rhodococcus sp. Mel, trzD from Rhodococcus sp. Mel, guaD from E. coli K12 strain MG1655, blr3880 from Bradyrhizobium japonicum USDA 110, GUD1/YDL238C from S. cerevisiae, YALI0E25740p from Y. lipolytica CLIB122, trzA from Williamsia sp. NRRL B-15444R, triA from Pseudomonas sp. strain NRRL B-12227, atzC from Pseudomonas sp. strain ADP, and cah from Myrothecium verrucaria.

[0144] In certain embodiments, the invention relates to an isolated nucleic acid molecule comprising any one of the sequences disclosed herein. In certain embodiments, the invention relates to an isolated nucleic acid molecule having at least 85% sequence homology with any one of the sequences disclosed herein. In certain embodiments, the invention relates to an isolated nucleic acid molecule having at least 90% sequence homology with any one of the sequences disclosed herein. In certain embodiments, the invention relates to an isolated nucleic acid molecule having at least 95% sequence homology with any one of the sequences disclosed herein. In certain embodiments, the invention relates to an isolated nucleic acid molecule having at least 99% sequence homology with any one of the sequences disclosed herein. In certain embodiments, the invention relates to an isolated nucleic acid molecule having any one of the sequences disclosed herein.
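The percentage thresholds above (85%, 90%, 95%, 99%) can be checked with a short sketch. The text speaks of "sequence homology" without naming an algorithm, so the sketch makes two labeled assumptions: percent identity is used as a stand-in for homology, and the two sequences have already been aligned (gaps written as "-") by some external tool. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment given as two equal-length strings.

    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("alignment contains no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 85.0) -> bool:
    """True if the aligned pair reaches at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Different conventions for the denominator (alignment length, shorter sequence, ungapped columns) give different percentages, so any real comparison should state which one is used.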

[0145] In certain embodiments, the invention relates to a recombinant vector comprising any one of the aforementioned nucleic acid molecules operably linked to a promoter.

[0146] In certain embodiments, the invention relates to a recombinant vector comprising any one of the sequences disclosed herein. In certain embodiments, the invention relates to a recombinant vector having at least 85% sequence homology with any one of the sequences disclosed herein. In certain embodiments, the invention relates to a recombinant vector having at least 90% sequence homology with any one of the sequences disclosed herein. In certain embodiments, the invention relates to a recombinant vector having at least 95% sequence homology with any one of the sequences disclosed herein. In certain embodiments, the invention relates to a recombinant vector having at least 99% sequence homology with any one of the sequences disclosed herein.

Exemplary Genetically Engineered Organisms of the Invention

[0147] In certain embodiments, the invention relates to a genetically engineered organism, wherein the genetically engineered organism has been transformed by a nucleic acid molecule or a recombinant vector comprising any one of the sequences disclosed herein. In certain embodiments, the nucleic acid molecule or recombinant vector has at least 85% sequence homology with any one of the sequences disclosed herein. In certain embodiments, the nucleic acid molecule or recombinant vector has at least 90% sequence homology with any one of the sequences disclosed herein. In certain embodiments, the nucleic acid molecule or recombinant vector has at least 95% sequence homology with any one of the sequences disclosed herein. In certain embodiments, the nucleic acid molecule or recombinant vector has at least 99% sequence homology with any one of the sequences disclosed herein. In certain embodiments, the invention relates to a genetically engineered organism, wherein the genetically engineered organism has been transformed by a nucleic acid molecule or a recombinant vector having any one of the sequences disclosed herein.

[0148] In certain embodiments, the invention relates to a genetically engineered organism, wherein the genetically engineered organism has been transformed by a nucleic acid molecule; the nucleic acid molecule comprises a non-native gene; and the non-native gene encodes a non-native enzyme selected from the group consisting of allophanate hydrolase, biuret amidohydrolase, cyanuric acid amidohydrolase, guanine deaminase, ammeline hydrolase, ammelide hydrolase, melamine deaminase, isopropylammelide isopropylaminohydrolase, cyanamide hydratase, urease, and urea carboxylase.

[0149] In certain embodiments, the invention relates to any one of the aforementioned genetically engineered organisms, wherein the non-native gene is selected from the group consisting of atzF, DUR1,2, YALI0E07271g, atzE, atzD, trzC, trzD, trzE, atzD, guaD, blr3880, GUD1/YDL238C, YALI0E25740p, trzA, triA, atzC, and cah. In certain embodiments, the invention relates to any one of the aforementioned genetically engineered organisms, wherein the non-native gene is selected from the group consisting of atzF, DUR1,2, YALI0E07271g, atzE, atzD, trzD, atzD, guaD, blr3880, GUD1/YDL238C, YALI0E25740p, trzA, triA, atzC, and cah. Any organism may be used as a source of the non-native gene, as long as the organism has the desired enzymatic activity. The non-native genes can each be obtained from chromosomal DNA of any one of the aforementioned microorganisms by isolating a DNA fragment complementing auxotrophy of a variant strain lacking the enzymatic activity. Alternatively, if the nucleotide sequences of these genes of the organism have already been elucidated (Biochemistry, Vol. 22, pp. 5243-5249, 1983; J. Biochem. Vol. 95, pp. 909-916, 1984; Gene, Vol. 27, pp. 193-199, 1984; Microbiology, Vol. 140, pp. 1817-1828, 1994; Mol. Gen. Genet. Vol. 218, pp. 330-339, 1989; and Molecular Microbiology, Vol. 6, pp. 317-326, 1992), the genes can be obtained by PCR using primers synthesized based on each of the elucidated nucleotide sequences, and the chromosomal DNA as a template.

[0150] In certain embodiments, the invention relates to any one of the aforementioned genetically engineered organisms, wherein the non-native gene is selected from the group consisting of trzE from Rhodococcus sp. strain Mel, trzE from Rhizobium leguminosarum, trzC MEL, trzC 12227, cah from Fusarium oxysporum Fo5176, cah from F. pseudograminearum CS3096, cah from Gibberella zeae PH-1, cah from Aspergillus kawachii IFO 4308, cah from A. niger CBS 513.88, cah from A. niger ATCC 1015, cah from A. oryzae 3.042, cah from S. cerevisiae FostersB, atzF from Pseudomonas sp. strain ADP, DUR1,2 from S. cerevisiae, YALI0E07271g from Y. lipolytica CLIB122, atzE from Pseudomonas sp. strain ADP, atzD from Pseudomonas sp. strain ADP, trzD from Pseudomonas sp. strain NRRL B-12227, atzD from Rhodococcus sp. Mel, trzD from Rhodococcus sp. Mel, guaD from E. coli K12 strain MG1655, blr3880 from Bradyrhizobium japonicum USDA 110, GUD1/YDL238C from S. cerevisiae, YALI0E25740p from Y. lipolytica CLIB122, trzA from Williamsia sp. NRRL B-15444R, triA from Pseudomonas sp. strain NRRL B-12227, atzC from Pseudomonas sp. strain ADP, and cah from Myrothecium verrucaria.

[0151] In certain embodiments, the invention relates to any one of the aforementioned genetically engineered organisms, wherein the genetically engineered organism is a species of the genus Yarrowia, Saccharomyces, Ogataea, Pichia, or Escherichia.

[0152] In certain embodiments, the invention relates to any one of the aforementioned genetically engineered organisms, wherein the genetically engineered organism is selected from the group consisting of Yarrowia lipolytica, Saccharomyces cerevisiae, Ogataea polymorpha, Pichia pastoris, and Escherichia coli.

[0153] In certain embodiments, the genetically engineered organism is not Rhodococcus sp. Strain Mel.

Exemplary Methods of the Invention

[0154] In certain embodiments, the invention relates to a method, comprising the step of

[0155] contacting any one of the aforementioned genetically engineered organisms with a substrate,

[0156] wherein

[0157] the substrate comprises a nitrogen-containing fraction and a non-nitrogen-containing fraction;

[0158] the nitrogen-containing fraction comprises, in an amount from about 10% by weight to about 100% by weight, a nitrogen-containing compound of any one of Formulas I-III;

[0159] a native organism of the same species as the genetically engineered organism could not metabolize (i.e., use as a source of nitrogen) the nitrogen-containing compound; and

[0160] the genetically engineered organism converts the substrate to a product.

[0161] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds have a low molecular weight. In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds have a molecular weight between about 30 Da and about 800 Da. In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds have a molecular weight between about 40 Da and about 600 Da. In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds have a molecular weight of about 40 Da, about 50 Da, about 60 Da, about 70 Da, about 80 Da, about 90 Da, about 100 Da, about 110 Da, about 120 Da, about 130 Da, about 140 Da, about 150 Da, about 160 Da, about 170 Da, about 180 Da, about 190 Da, about 200 Da, about 220 Da, about 240 Da, about 260 Da, about 280 Da, about 300 Da, about 320 Da, about 340 Da, about 360 Da, about 380 Da, about 400 Da, about 420 Da, about 440 Da, about 460 Da, about 480 Da, about 500 Da, about 520 Da, about 540 Da, about 560 Da, about 580 Da, or about 600 Da.

[0162] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds have less than 12 carbon atoms. In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds have less than 8 carbon atoms. In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds have 1, 2, 3, 4, 5, 6, or 7 carbon atoms.

[0163] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds have 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nitrogen atoms.

[0164] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds have 0, 1, 2, 3, 4, 5, 6, 7, or 8 oxygen atoms.
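The molecular-weight and atom-count ranges recited above can be checked from a molecular formula, as in the following sketch. It assumes simple formulas without parentheses or charges, supports only the elements listed in `ATOMIC_WEIGHT`, and treats each "about" bound as a hard limit; these simplifications and the function names are assumptions of the example, not part of the disclosure.

```python
import re

# Standard atomic weights (rounded) for the elements this sketch supports.
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> dict:
    """Parse a flat formula such as 'C3H6N6' into element counts."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    counts = {}
    for element, number in _TOKEN.findall(formula):
        if element not in ATOMIC_WEIGHT:
            raise ValueError(f"unsupported element: {element}")
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def molecular_weight(formula: str) -> float:
    """Molecular weight in Da computed from the element counts."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in parse_formula(formula).items())


def within_recited_ranges(formula: str) -> bool:
    """Check 30-800 Da, fewer than 12 C, 1-10 N, and 0-8 O atoms."""
    counts = parse_formula(formula)
    return (
        30.0 <= molecular_weight(formula) <= 800.0
        and counts.get("C", 0) < 12
        and 1 <= counts.get("N", 0) <= 10
        and 0 <= counts.get("O", 0) <= 8
    )
```

For example, melamine (`"C3H6N6"`) and cyanamide (`"CH2N2"`) pass all four checks, while a compound with twelve carbon atoms fails the carbon-count check.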

[0165] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds have an octanol-water partition coefficient (log P) less than about 5. In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds have an octanol-water partition coefficient (log P) from about -0.5 to about 5. In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds have an octanol-water partition coefficient (log P) of about -0.5, about 0, about 0.5, about 1, about 1.5, about 2, about 2.5, about 3, about 3.5, about 4, or about 4.5.

[0166] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds are soluble in water at about 20° C. at a concentration of from about 0.01 g/L to about 1000 g/L. In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds are soluble in water at about 20° C. at a concentration of about 0.01 g/L, about 0.05 g/L, about 0.1 g/L, about 0.5 g/L, about 1 g/L, about 5 g/L, about 10 g/L, about 15 g/L, about 20 g/L, about 25 g/L, about 30 g/L, about 35 g/L, about 40 g/L, about 45 g/L, about 50 g/L, about 55 g/L, about 60 g/L, about 65 g/L, about 70 g/L, about 75 g/L, about 80 g/L, about 85 g/L, about 90 g/L, about 95 g/L, or about 100 g/L.

[0167] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds move through the cell membrane by passive transport. Passive transport includes diffusion, facilitated diffusion, and filtration.

[0168] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds move through the cell membrane by active transport, such as, for example, via an ATP-Binding Cassette (ABC) transporter or other known transmembrane transporter.

[0169] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds are transported through the cell membrane.

[0170] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds are substantially non-biocidal.

[0171] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing compounds are substantially biodegradable.

[0172] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing fraction comprises the nitrogen-containing compound in about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, or about 100% by weight.

[0173] In certain embodiments, the invention relates to a method, comprising the step of

[0174] contacting any one of the aforementioned genetically engineered organisms with a substrate,

[0175] wherein

[0176] the substrate comprises a nitrogen-containing fraction and a non-nitrogen-containing fraction;

[0177] the nitrogen-containing fraction comprises, in an amount from about 10% by weight to about 100% by weight, a nitrogen-containing compound selected from the group consisting of triazine, urea, melamine, cyanamide, 2-cyanoguanidine, ammeline, guanidine carbonate, ethylenediamine, ammelide, biuret, diethylenetriamine, triethylenetetramine, 1,3-diaminopropane, calcium cyanamide, cyanuric acid, aminoethylpiperazine, piperazine, and allophanate; and

[0178] the genetically engineered organism converts the substrate to a product.

[0179] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the nitrogen-containing fraction comprises the nitrogen-containing compound in about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, or about 100% by weight.

[0180] In certain embodiments, the invention relates to a method, comprising the step of

[0181] contacting any one of the aforementioned genetically engineered organisms with a substrate,

[0182] wherein

[0183] the substrate comprises a nitrogen-containing fraction and a non-nitrogen-containing fraction;

[0184] the nitrogen-containing fraction consists essentially of a nitrogen-containing compound selected from the group consisting of triazine, urea, melamine, cyanamide, 2-cyanoguanidine, ammeline, guanidine carbonate, ethylenediamine, ammelide, biuret, diethylenetriamine, triethylenetetramine, 1,3-diaminopropane, calcium cyanamide, cyanuric acid, aminoethylpiperazine, piperazine, and allophanate; and

[0185] the genetically engineered organism converts the substrate to a product.

[0186] In certain embodiments, the invention relates to a method, comprising the step of

[0187] contacting any one of the aforementioned genetically engineered organisms with a substrate,

[0188] wherein

[0189] the substrate consists of a nitrogen-containing fraction and a non-nitrogen-containing fraction;

[0190] the nitrogen-containing fraction consists of a nitrogen-containing compound selected from the group consisting of triazine, urea, melamine, cyanamide, 2-cyanoguanidine, ammeline, guanidine carbonate, ethylenediamine, ammelide, biuret, diethylenetriamine, triethylenetetramine, 1,3-diaminopropane, calcium cyanamide, cyanuric acid, aminoethylpiperazine, piperazine, and allophanate; and the genetically engineered organism converts the substrate to a product.

[0191] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the genetically engineered organism sequesters the product.

[0192] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein a plurality of genetically engineered organisms is used.

[0193] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the substrate does not comprise an antibiotic.

[0194] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the substrate does not comprise ammonium sulfate.

[0195] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the substrate does not comprise urea.

[0196] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein a non-genetically engineered organism, i.e., a native organism, could not metabolize (i.e., use as a source of nitrogen) the nitrogen-containing compound. In certain embodiments, the genetically engineered organism is not Rhodococcus sp. Strain Mel.

[0197] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the substrate comprises lignocellulosic material, glucose, xylose, sucrose, acetic acid, formic acid, lactic acid, butyric acid, a free fatty acid, dextrose, glycerol, fructose, lactose, galactose, mannose, rhamnose, or arabinose, or a combination thereof.

[0198] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the pH of the substrate is from about 2.5 to about 10.

[0199] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the genetically engineered organism is contacted with the substrate at a temperature of from about 15° C. to about 80° C.

[0200] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the genetically engineered organism is contacted with the substrate over a time period of from about 6 h to about 10 d.
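The pH, temperature, and contact-time ranges recited above lend themselves to a simple parameter check, sketched below. The class and function names are hypothetical, and "about" is again treated as a hard bound purely for illustration.

```python
from dataclasses import dataclass

# Ranges quoted in the text; treating "about" as a hard bound is an assumption.
PH_RANGE = (2.5, 10.0)
TEMPERATURE_C_RANGE = (15.0, 80.0)
DURATION_H_RANGE = (6.0, 10 * 24.0)  # about 6 h to about 10 d, in hours


@dataclass(frozen=True)
class FermentationConditions:
    ph: float
    temperature_c: float
    duration_h: float


def out_of_range(conditions: FermentationConditions) -> list:
    """Return the names of parameters that fall outside the recited ranges."""
    checks = [
        ("pH", conditions.ph, PH_RANGE),
        ("temperature", conditions.temperature_c, TEMPERATURE_C_RANGE),
        ("duration", conditions.duration_h, DURATION_H_RANGE),
    ]
    return [name for name, value, (low, high) in checks if not low <= value <= high]
```

A run at pH 6.0 and 30° C. for 72 h yields an empty list, whereas a run at pH 1.0 for 300 h is flagged for both pH and duration.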

[0201] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the genetically engineered organism is contacted with the substrate in a fermentor.

[0202] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the genetically engineered organism is contacted with the substrate in an industrial-size fermentor.

[0203] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein a plurality of genetically engineered organisms is contacted with a plurality of substrates in a plurality of fermentors, wherein the plurality of fermentors are arranged in parallel.

[0204] In certain embodiments, the invention relates to any one of the aforementioned methods, wherein the product is ethanol, isopropanol, lactic acid, an isoprenoid, a lipid, a high-value specialty chemical, butanol, 1,3-propanediol, 1,4-butanediol, succinic acid, an expressed protein product, an enzyme product, a polyol, a pharmaceutical product, or itaconic acid.

Exemplary Products

[0205] In certain embodiments, the invention relates to a product made by any one of the aforementioned methods.

EXEMPLIFICATION

[0206] The following examples are provided to illustrate the invention. It will be understood, however, that the specific details given in each example have been selected for purpose of illustration and are not to be construed as limiting the scope of the invention. Generally, the experiments were conducted under similar conditions unless noted.

Example 1

[0207] The oleaginous yeast Yarrowia lipolytica may be engineered to convert melamine into ammonia. Melamine (C.sub.3N.sub.6H.sub.6) is a highly nitrogenous compound that can be degraded by only a very limited number of organisms, including Rhodococcus sp. Strain Mel. Incorporating the pathway for melamine degradation into Yarrowia, accompanied by a modification of the media composition to use melamine as the predominant nitrogen source, will generate a more robust industrial production solution applicable to a number of applications. The advantage conferred by this modification is significant enough to provide a benefit in multiple applications, including situations where the core technology may be a significant genetic burden on the organism.

Example 2

[0208] Genes from FIG. 3, or suitable homologs, will be cloned into a host strain such as Yarrowia lipolytica, Saccharomyces cerevisiae, or Escherichia coli. Enzymes native to the host organism, such as allophanate hydrolase or guanine deaminase, may be overexpressed with a heterologous promoter. Functional expression will be assayed by enzymatic activity and by the ability to confer nitrogen-limited growth on the appropriate pathway intermediate. Ultimately, strains able to degrade melamine will be selected for improved utilization of the pathway via melamine-limited continuous culturing or other selective methods. Similar strategies can be devised for the nitrogen compounds listed in FIG. 2.

Example 3--Vector Construction Via Yeast Mediated Ligation

Base Vector

[0209] Vector pNC10 contains an E. coli pMB1 origin of replication and ampicillin resistance gene, a S. cerevisiae 2 .mu.m origin of replication and URA3 gene, and a multiple cloning site containing the 8-bp recognition sequences for PacI, PmeI, and AscI. DNA of interest is inserted in the multiple cloning site via yeast mediated ligation (YML), a homologous recombination cloning method (Shanks et al. 2006; Shanks et al. 2009). Briefly, target DNA sequences are amplified by PCR using primers with 20-40 bp overhangs that are homologous to the adjacent DNA segments in the final vector. pNC10 or another suitable base vector is then restriction digested, creating a linearized plasmid. The PCR products and linear plasmid are transformed into S. cerevisiae, and the native S. cerevisiae gap repair mechanism assembles an intact plasmid based on the homology overhangs.

[0210] The complete vector can then be isolated from S. cerevisiae via a DNA extraction protocol and used to transform E. coli. Concentrated vector can then be recovered from E. coli via DNA plasmid mini-prep or other suitable standard molecular biology protocols. See FIG. 5.
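By way of non-limiting illustration, the primer design step described above can be sketched in Python. The function names, the 20-nt annealing length, and all sequences in the usage example are hypothetical placeholders chosen for illustration; only the 20-40 bp homology range comes from the text.

```python
# Sketch: add homology tails to PCR primers for yeast mediated ligation.
# Assumption: each primer = homology tail (20-40 bp) + annealing region.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(upstream: str, insert: str, downstream: str,
                   homology: int = 30, anneal: int = 20) -> tuple[str, str]:
    """Build forward and reverse primers for one insert.

    upstream / downstream are the segments that flank the insert in the
    final vector; the tails copy their ends so gap repair can join them.
    """
    if not 20 <= homology <= 40:
        raise ValueError("homology tail should be 20-40 bp")
    if len(upstream) < homology or len(downstream) < homology:
        raise ValueError("flanking segments are shorter than the tail")
    if len(insert) < anneal:
        raise ValueError("insert is shorter than the annealing region")
    # Forward primer: end of the upstream segment, then start of the insert.
    forward = upstream[-homology:] + insert[:anneal]
    # Reverse primer: end of the insert plus start of the downstream
    # segment, written as the reverse complement.
    reverse = reverse_complement(insert[-anneal:] + downstream[:homology])
    return forward, reverse


if __name__ == "__main__":
    # Placeholder sequences, not sequences from this document.
    fwd, rev = design_primers("ACGT" * 10, "ATG" + "GCA" * 10 + "TAA",
                              "TTGCA" * 8, homology=20, anneal=18)
    print(fwd)
    print(rev)
```

In practice the annealing length would be chosen from the melting temperature of the primer rather than fixed, which this sketch does not attempt.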

Example 4--S. cerevisiae Transformation

[0211] Grow a 5 mL culture of a S. cerevisiae ura3 auxotroph strain overnight in YPD at 30.degree. C.

[0212] Transfer 1.5 mL of overnight culture to 50 mL fresh YPD (OD.about.0.3) and shake at 200 rpm, 30.degree. C. in a flask. Allow to grow for approx. 4-5 hrs to an OD of 1.0.

[0213] Centrifuge cells at >5,000 rpm for 1 min, resuspend in 50 mL sterile water and repeat.

[0214] Add 1 mL of 100 mM Lithium acetate to cell pellet and transfer cells to a 1.5 mL tube.

[0215] Spin cells for 10 sec at >12,000 rpm, remove supernatant, and resuspend in 400-800 .mu.L of 100 mM LiAc (each transformation uses 50 .mu.L of this cell suspension).

[0216] Prepare a transformation master mix of the following, per sample

TABLE-US-00001
Transformation master mix (volumes per transformation; multiply by the number of transformations + 1)
Component                       Volume
50% PEG 3350                    240 .mu.L
1M LiAc                         36 .mu.L
Salmon sperm DNA* (2 mg/mL)     50 .mu.L
*SS DNA should be first boiled for 10 min and rapidly cooled to 4.degree. C.

[0217] Prepare one 1.5 mL tube for each transformation. Per tube, add: 5 .mu.L of digested vector, 5 .mu.L of each PCR insert (assuming a good PCR amplification, approx. 100-200 ng DNA), and water to bring the final volume to 34 .mu.L. Add 326 .mu.L master mix, and then 50 .mu.L of cell suspension. Vortex tubes to completely mix contents.

[0218] Incubate for 30 min at 30.degree. C., then mix by inverting and place in 42.degree. C. water bath for 30 min. (Note optimal time at 42.degree. C. varies strain to strain).

[0219] Spin down cells for 10 sec at >12,000 rpm, remove PEG mixture and resuspend in 1 mL sterile water. Spin down again, remove 800 .mu.L, and use final 200 .mu.L to resuspend and spread on SD-URA plates. Incubate at 30.degree. C. for 2-4 days.
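The master mix volumes in the protocol above scale with the number of transformations plus one. A minimal Python sketch of that arithmetic follows; the function and variable names are hypothetical, and the per-transformation volumes are those given in the table and in paragraph [0217].

```python
# Sketch: scale the transformation master mix and compute the water needed
# to bring the DNA portion of each tube to 34 uL. Volumes are in microliters.

MASTER_MIX_UL = {
    "50% PEG 3350": 240,
    "1 M LiAc": 36,
    "salmon sperm DNA (2 mg/mL)": 50,
}


def master_mix(n_transformations: int) -> dict[str, int]:
    """Total volume of each component for n transformations plus one spare."""
    if n_transformations < 1:
        raise ValueError("need at least one transformation")
    batches = n_transformations + 1
    return {name: vol * batches for name, vol in MASTER_MIX_UL.items()}


def water_to_add(n_inserts: int, vector_ul: float = 5, insert_ul: float = 5,
                 dna_total_ul: float = 34) -> float:
    """Water needed so vector + inserts + water equals 34 uL per tube."""
    water = dna_total_ul - vector_ul - insert_ul * n_inserts
    if water < 0:
        raise ValueError("DNA volumes exceed the 34 uL allowance")
    return water


if __name__ == "__main__":
    for component, volume in master_mix(4).items():
        print(f"{component}: {volume} uL")
    print("water per tube with 3 inserts:", water_to_add(3), "uL")
```

The three components sum to 326 .mu.L per transformation, which matches the master mix volume added to each tube.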

Example 5--Expression of Melamine Assimilation Enzymes in S. cerevisiae

[0220] Melamine assimilation genes, or a subset of them, can be expressed in S. cerevisiae by construction of a vector using the yeast mediated ligation described above. Expression vectors consist of an S. cerevisiae functional promoter, a gene encoding an enzyme of the melamine assimilation pathway, and an S. cerevisiae functional terminator. Assemblies of the promoter-gene-terminator motif can be incorporated into a single strain, either on a replicating plasmid or integrated into a chromosome. Possible promoters and terminators are listed below (see also Sun et al. 2012). A representative plasmid, expressing the trzA melamine deaminase under control of the Y. lipolytica TEF1 promoter and terminator, is shown in FIG. 6.

[0221] Plasmid AJS35 is an example of the melamine deaminase trzA transcribed via the Y. lipolytica TEF1 promoter and terminator. See FIG. 6.

[0222] Strains NS98 and NS99 are industrial S. cerevisiae strains carrying plasmids pNC96 (hyg.sup.R, and a codon optimized trzE from Rhodococcus sp. MEL) and pNC97 (hyg.sup.R, and a codon optimized trzE from Rhizobium leguminosarum), respectively. Strain NS100 is the same industrial S. cerevisiae strain carrying plasmid pNC67 (hyg.sup.R, nat.sup.R), which serves as a control strain.

[0223] Strains NS98, NS99, and NS100 were grown in defined YNB medium with 10 mM urea and 100 .mu.g/mL hygromycin to stationary phase aerobically at 30.degree. C. 1/1000 v/v inoculations were then made into the same defined medium with either 10 mM urea, 10 mM biuret, or no additional nitrogen and grown under the same conditions. Optical density was measured after 72 hours, as shown in FIG. 23. Strains NS98 and NS99 were able to grow to an optical density approximately double that of NS100 in medium containing biuret, and also approximately double that with medium with no nitrogen supply. This shows that S. cerevisiae strains expressing trzE genes are advantaged in their utilization of biuret.

DNA that can be used as promoters for gene transcription in S. cerevisiae

TABLE-US-00002 S. cerevisiae TPI promoter aggaacccatcaggttggtggaaGATTACCCGTTCTAAGACTTTTCAGCTT CCTCTATTGATGTTACACCTGGACACCCCTTTTCTGGCATCCAGTTTTTAA TCTTCAGTGGCATGTGAGATTCTCCGAAATTAATTAAAGCAATCACACAAT TCTCTCGGATACCACCTCGGTTGAAACTGACAGGTGGTTTGTTACGCATGC TAATGCAAAGGAGCCTATATACCTTTGGCTCGGCTGCTGTAACAGGGAATA TAAAGGGCAGCATAATTTAGGAGTTTAGTGAACTTGCAACATTTACTATTT TCCCTTCTTACGTAAATATTTTTCTTTTTAATTCTAAATCAATCTTTTTCA ATTTTTTGTTTGTATTCTTTTCTTGCTTAAAtctataac tacaaaaaacacatacataaactaaaa S. cerevisiae GPM1 promoter ttgctacgcaggctgcacaattacACGAGAATGCTCCCGCCTAGGATTTAA GGCTAAGGGACGTGCAATGCAGACGACAGATCTAAATGACCGTGTCGGTGA AGTGTTCGCCAAACTTTTCGGTTAACACATGCAGTGATGCACGCGCGATGG TGCTAAGTTACATATATATATATATATATATATATATATATATATAGCCAT AGTGATGTCTAAGTAACCTTTATGGTATATTTCTTAATGTGGAAAGATACT AGCGCGCGCACCCACACACAAGCTTCGTCTTTTCTTGAAGAAAAGAGGAAG CTCGCTAAATGGGATTCCACTTTCCGTTCCCTGCCAGCTGATGGAAAAAGG TTAGTGGAACGATGAAGAATAAAAAGAGAGATCCACTGAGGTGAAATTTCA GCTGACAGCGAGTTTCATGATCGTGATGAACAATGGTAACGAGTTGTGGCT GTTGCCAGGGAGGGTGGTTCTCAACTTTTAATGTATGGCCAAATCGCTACT TGGGTTTGTTATATAACAAAGAAGAAATAATGAACTGATTCTCTTCCTCCT TCTTGTCCTTTCTTAATTCTGTTGTAATTACCTTCCTTTGTAATTTTTTTT GTAATTATTCTtcttaataatccaaacaaacacacatattacaata S. cerevisiae TDH3 promoter tgctgtaacccgtacatgcccaaaATAGGGGGCGGGTTACACAGAATATAT AACATCGTAGGTGTCTGGGTGAACAGTTTATTCCTGGCATCCACTAAATAT AATGGAGCCCGCTTTTTAAGCTGGCATCCAGAAAAAAAAAGAATCCCAGCA CCAAAATATTGTTTTCTTCACCAACCATCAGTTCATAGGTCCATTCTCTTA GCGCAACTACAGAGAACAGGGGCACAAACAGGCAAAAAACGGGCACAACCT CAATGGAGTGATGCAACCTGCCTGGAGTAAATGATGACACAAGGCAATTGA CCCACGCATGTATCTATCTCATTTTCTTACACCTTCTATTACCTTCTGCTC TCTCTGATTTGGAAAAAGCTGAAAAAAAAGGTTGAAACCAGTTCCCTGAAA TTATTCCCCTACTTGACTAATAAGTATATAAAGACGGTAGGTATTGATTGT AATTCTGTAAATCTATTTCTTAAACTTCTTAAATTCTACTTTTATAGTTAG TCTTTTTTTTAGTTTTAAAACACCAAGAacttagtttcgaataaacacaca taaacaaacaaa S. 
cerevisiae FBA1 promoter gcaccgctggcttgaacaacaataCCAGCCTTCCAACTTCTGTAAATAACG GCGGTACGCCAGTGCCACCAGTACCGTTACCTTTCGGTATACCTCCTTTCC CCATGTTTCCAATGCCCTTCATGCCTCCAACGGCTACTATCACAAATCCTC ATCAAGCTGACGCAAGCCCTAAGAAATGAATAACAATACTGACAGTACTAA ATAATTGCCTACTTGGCTTCACATACGTTGCATACGTCGATATAGATAATA ATGATAATGACAGCAGGATTATCGTAATACGTAATAGTTGAAAATCTCAAA AATGTGTGGGTCATTACGTAAATAATGATAGGAATGGGATTCTTCTATTTT TCCTTTTTCCATTCTAGCAGCCGTCGGGAAAACGTGGCATCCTCTCTTTCG GGCTCAATTGGAGTCACGCTGCCGTGAGCATCCTCTCTTTCCATATCTAAC AACTGAGCACGTAACCAATGGAAAAGCATGAGCTTAGCGTTGCTCCAAAAA AGTATTGGATGGTTAATACCATTTGTCTGTTCTCTTCTGACTTTGACTCCT CAAAAAAAAAAAATCTACAATCAACAGATCGCTTCAATTACGCCCTCACAA AAACTTTTTTCCTTCTTCTTCGCCCACGTTAAATTTTATCCCTCATGTTGT CTAACGGATTTCTGCACTTGATTTATTATAAAAAGACAAAGACATAATACT TCTCTATCAATTTCAGTTATTGTTCTTCCTTGCGTTATTCTTCTGTTCTTC TTTTTCTTTTGTcatatataaccataaccaagtaatacatattcaaa Y. lipolytica TEF1 promoter tataaacggtattttcacaattgcACCCCAGCCAGACCGATAGCCGGTCGC AATCCGCCACCCACAACCGTCTACCTCCCACAGAACCCCGTCACTTCCACC CTTTTCCACCAGATCATATGTCCCAACTTGCCAAATTAAAACCGTGCGAAT TTTCAAAATAAACTTTGGCAAAGAGGCTGCAAAGGAGGGGCTGGTGAGGGC GTCTGGAAGTCGACCAGAGACCGGGTTGGCGGCGCATTTGTGTCCCAAAAA ACAGCCCCAATTGCCCCAATTGACCCCAAATTGACCCAGTAGCGGGCCCAA CCCCGGCGAGAGCCCCCTTCTCCCCACATATCAAACCTCCCCCGGTTCCCA CACTTGCCGTTAAGGGCGTAGGGTACTGCAGTCTGGAATCTACGCTTGTTC AGACTTTGTACTAGTTTCTTTGTCTGGCCATCCGGGTAACCCATGCCGGAC GCAAAATAGACTACTGAAAATTTTTTTGCTTTGTGGTTGGGACTTTAGCCA AGGGTATAAAAGACCACCGTCCCCGAATTACCTTTCCTCTTCTTTTCTCTC TCTCCTTGTCAACTCACACCCGAAATCGTtaagcatttccttctgagtata agaatcattcaaa S. cerevisiae PDC1 promoter gcataatattgtccgctgcccgttTTTCTGTTAGACGGTGTCTTGATCTAC TTGCTATCGTTCAACACCACCTTATTTTCTAACTATTTTTTTTTTAGCTCA TTTGAATCAGCTTATGGTGATGGCACATTTTTGCATAAACCTAGCTGTCCT CGTTGAACATAGGAAAAAAAAATATATAAACAAGGCTCTTTCACTCTCCTT GGAATCAGATTTGGGTTTGTTCCCTTTATTTTCATATTTCTTGTCATATTC TTTTCTCAATTATTATCTTCTACTCATAacctcacgcaaaataacacagtc aaatcaatcaaa S. 
cerevisiae TEF1 promoter CATAGCTTCAAAATGTTTCTACTCCTTTTTTACTCTTCCAGATTTTCTCGG ACTCCGCGCATCGCCGTACCACTTCAAAACACCCAAGCACAGCATACTAAA TTTCCCCTCTTTCTTCCTCTAGGGTGTCGTTAATTACCCGTACTAAAGGTT TGGAAAAGAAAAAAGAGACCGCCTCGTTTCTTTTTCTTCGTCGAAAAAGGC AATAAAAATTTTTATCACGTTTCTTTTTCTTGAAAATTTTTTTTTTTGATT TTTTTCTCTTTCGATGACCTCCCATTGATATTTAAGTTAATAAACGGTCTT CAATTTCTCAAGTTTCAGTTTCATTTTTCTTGTTCTATTACAACTTTTTTT ACTTCTTGCTCATTAGAAAGAaagcatagcaatctaatctaagttttaatt acaaa

DNA sequences that can be used as terminators of gene transcription

TABLE-US-00003 S. cerevisiae TPI terminator taagattaatataattatataaAAATATTATCTTCTTTTCTTTATATCTAG TGTTATGTAAAATAAATTGATGACTACGGAAAGCTTTTTTATATTGTTTCT TTTTCATTCTGAGCCACTTAAATTTCGTGAATGTTCTTGTAAGGGACGGTA GATTTACAAGTGATACAACAAAAAGCAAGGCGCTTTTTCTAATAAAAAGAA GAAAAGCATTTAACAATTGAACACCTCTATATCAACGAAGAATATTACTTT GTCTCTAAATCCTTGTAAAATGTGTACGATCTCTATATGGGTTACTCATAA gtgtaccgaagactgcattgaaag S. cerevisiae GPM1 terminator gtctgaagaatgaatgatttgaTGATTTCTTTTTCCCTCCATTTTTCTTAC TGAATATATCAATGATATAGACTTGTATAGTTTATTATTTCAAATTAAGTA GCTATATATAGTCAAGATAACGTTTGTTTGACACGATTACATTATTCGTCG ACATCTTTTTTCAGCCTGTCGTGGTAGCAATTTGAGGAGTATTATTAATTG AATAGGTTCATTTTGCGCTCGCATAAACAGTTTTCGTCAGGGACAGTATGT TGGAATGAGTGGTAATTAATGGTGACATGACATGTTATAGCAATAACCTTG ATGTTTACATCGTAGTTTAATGTACACCCCGCGAATTCGTTCAAGTAggag tgcaccaattgcaaagggaa S. cerevisiae TDH3 terminator gtgaatttactttaaatcttgcATTTAAATAAATTTTCTTTTTATAGCTTT ATGACTTAGTTTCAATTTATATACTATTTTAATGACATTTTCGATTCATTG ATTGAAAGCTTTGTGTTTTTTCTTGATGCGCTATTGCATTGTTCTTGTCTT TTTCGCCACATGTAATATCTGTAGTAGATACCTGATACATTGTGGATGCTG AGTGAAATTTTAGTTAATAATGGAGGCGCTCTTAATAATTTTGGGGATATT GGCTTTTTTTTTTAAAGTTTACAAATGAATTTTTTCCGCCAGGATAACGAT TCTGAAGTTACTCTTAGCGTTCCTATCGGTACAGCCATCAAATCATGCCTA TAAATCATGCCTATATTTGCGTGCAGTCAGTATCATCTACATGAAAAAAAC TCCCGCAATTTCTTATAGAATACGTTGAAAATTAAATGTACGCGCCAAGAT AAGATAACATATATCTAGATGCAGTAATATACACAGATTCCCGCGGA S. cerevisiae FBA1 terminator gttaattcaaattaattgatatAGTTTTTTAATGAGTATTGAATCTGTTTA GAAATAATGGAATATTATTTTTATTTATTTATTTATATTATTGGTCGGCTC TTTTCTTCTGAAGGTCAATGACAAAATGATATGAAGGAAATAATGATTTCT AAAATTTTACAACGTAAGATATTTTTACAaaagcctagctcatctt Y. 
lipolytica TEF1 terminator gctgcttgtacctagtgcaaccccagtttgttaaaAATTAGTAGTCAAAAA CTTCTGAGTTAGAAATTTGTGAGTGTAGTGAGATTGTAGAGTATCATGTGT GTCCGTAAGTGAAGTGTTATTGACTCTTAGTTAGTTTATCTAGTACTCGTT TAGTTGACACTGATCTAGTATTTTACGAGGCGTATGACTTTAGCCAAGTGT TGTACTTAGTCTTCTCTCCAAACATGAGAGGGCTCTGTCACTCAGTCGGCC TATGGGTGAGATGGCTTGGTGAGATCTTTCGATAGTCTCGTCAAGATGGTA GGATGATGGGGGAATACATTACTGCTCTCGTCAAGGAAACCACAATCAGAT CACACCATCCTCCATGGTAtccgatgactctcttctccacagt S. cerevisiae PDC1 terminator acaagctaagttgactgctgctACCAACGCTAAGCAATAAGCGATTTAATC TCTAATTATTAGTTAAAGTTTTATAAGCATTTTTATGTAACGAAAAATAAA TTGGTTCATATTATTACTGCACTGTCACTTACCATGGAAAGACCAGACAAG AAGTTGCCGACACGACAGTCTGTTGAattggcttaagtctgggtccgctt S. cerevisiae CYC1 terminator caggccccttttcctttgtcgaTATCATGTAATTAGTTATGTCACGCTTAC ATTCACGCCCTCCTCCCACATCCGCTCTAACCGAAAAGGAAGGAGTTAGAC AACCTGAAGTCTAGGTCCCTATTTATTTTTTTTAATAGTTATGTTAGTATT AAGAACGTTATTTATATTTCAAATTTTTCTTTTTTTTCTGTACAAACGCGT GTACGCATGTAACATTATACTGAAAACCTTGCTTGAGAAGGTTTTGGGACG CTCGAAGGCTTTAATTTGC

Example 6--Expression of melamine assimilation enzymes in E. coli

[0224] Melamine assimilation genes, or a subset of them, can be expressed in E. coli by construction of a vector using the yeast mediated ligation described above. Expression vectors consist of an E. coli functional promoter, a gene encoding an enzyme of the melamine assimilation pathway, and an E. coli functional terminator. Alternatively, several genes can be expressed from a single promoter as part of an operon; in this case inter-gene linker sequences are placed between genes. Sequences that can act as promoters, terminators, and linkers are listed below, as well as two representative E. coli expression plasmids, AJS67 (expressing genes for degradation of melamine to cyanuric acid with release of 3 NH.sub.3 per melamine) and AJS68 (expressing genes for degradation of cyanuric acid to NH.sub.3 and CO.sub.2 with release of 3 NH.sub.3 per cyanuric acid).

TABLE-US-00004
E. coli Ptach promoter
agctggtgacaattaatcatcggctcgtataatgtgtggaattgaatcgat ataaggaggttaatca
E. coli trpT' terminator
ctcaaaatatattttccctctatcttctcgttgcgcttaatttgactaatt ctcattagcgaggcgcgcctttccataggctccgcccc
Inter-gene operon linkers
lacZ-lacY linker: ggaaatccatt
galT-galK linker: ggaacgacc

[0225] See FIG. 7 and FIG. 8.
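The operon layout described in this example (one promoter, several genes separated by inter-gene linkers, one terminator) can be sketched in Python as simple string concatenation. The function name is hypothetical and the promoter, gene, and terminator strings in the usage example are short placeholders, not the sequences listed above; only the two linker sequences are taken from the table.

```python
# Sketch: concatenate promoter, genes, linkers, and terminator into one
# operon string. Only the linker sequences come from the text.

LINKERS = {
    "lacZ-lacY": "ggaaatccatt",
    "galT-galK": "ggaacgacc",
}


def assemble_operon(promoter: str, genes: list[str], terminator: str,
                    linkers: list[str]) -> str:
    """Return promoter + gene1 + linker1 + gene2 + ... + terminator."""
    if not genes:
        raise ValueError("at least one gene is required")
    if len(linkers) != len(genes) - 1:
        raise ValueError("need exactly one linker between each pair of genes")
    parts = [promoter]
    for index, gene in enumerate(genes):
        parts.append(gene)
        if index < len(linkers):
            parts.append(linkers[index])
    parts.append(terminator)
    return "".join(parts)


if __name__ == "__main__":
    # Placeholder promoter, genes, and terminator for illustration only.
    operon = assemble_operon(
        promoter="PROMOTER",
        genes=["atgaaataa", "atgccctaa", "atgggttaa"],
        terminator="TERMINATOR",
        linkers=[LINKERS["lacZ-lacY"], LINKERS["galT-galK"]],
    )
    print(operon)
```

A real design would also check reading frames and restriction sites, which are outside the scope of this sketch.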

Example 7--Expression of Cyanamide Assimilation Enzyme in S. cerevisiae

[0226] The gene expression methods described in Example 5 can also be used in Example 7. S. cerevisiae has the native ability to convert urea to NH.sub.3 and CO.sub.2 via the actions of urea carboxylase and allophanate hydrolase, encoded in the fusion gene DUR1,2. Therefore, functional expression of cyanamide hydratase is sufficient to convert cyanamide to NH.sub.3. A representative cyanamide hydratase expression vector is shown below, with the Y. lipolytica TEF1 promoter and terminator and a S. cerevisiae codon-optimized cyanamide hydratase (cah) from Myrothecium verrucaria. See FIG. 9.

Example 8--Expression of Cyanamide Assimilation Enzymes in E. coli

[0227] The gene expression methods described in Example 6 can also be used in Example 8. Unlike S. cerevisiae, most E. coli strains are unable to utilize urea as a nitrogen source, so these additional conversion steps must also be engineered. Either a urea carboxylase/allophanate hydrolase system or a urease enzyme with appropriate accessory enzymes must be expressed in addition to a cyanamide hydratase. Urease can be found in some E. coli isolates (Collins and Falkow 1990) or heterologously expressed (Cussac et al. 1992). Alternatively, the DUR1,2 genes from S. cerevisiae could be expressed, as shown below in plasmid AJS70, along with a cyanamide hydratase. See FIG. 10.

Example 9--Expression of Melamine Assimilation Enzymes in E. coli

[0228] Several E. coli strains containing partial or complete melamine utilization pathways were constructed, as shown in FIGS. 29 and 30. Vector and strain construction was as described in Example 6. All vectors contain the ampicillin resistance gene, and 100 .mu.g/mL ampicillin was added to all culture media. These strains were grown in MOPS defined medium with different nitrogen sources.

[0229] E. coli strains and melamine utilization genes

[0230] NS88--triA (step 1)

[0231] NS89--trzA, guaD, trzC (steps 1, 2, 3)

[0232] NS90--trzD, trzE, DUR1,2 (steps 4, 5, 6)

[0233] NS91--none (control strain)

[0234] NS93--triA, native guaD selected for improved ammeline utilization (steps 1, 2)

[0235] NS103--triA, guaD, trzC (steps 1, 2, 3)

[0236] NS109--triA, guaD, trzC, trzD 12227, trzE, DUR1,2 (steps 1-6)

[0237] NS110--triA, guaD, trzC, atzD ADP, trzE, DUR1,2 (steps 1-6)

[0238] FIG. 12 shows the growth progress of NS88 and NS91 (control) in media containing various concentrations of ammonium chloride or melamine. NS88 grown on 1 mM melamine reaches an optical density comparable to that reached with 2 mM ammonium chloride, suggesting that 2 mM ammonia is liberated from 1 mM melamine by the triA and natively encoded guaD genes. The control strain NS91 does not grow with melamine as nitrogen source.

[0239] FIG. 13 shows the growth progress of NS90 and NS91 (control) in media containing various concentrations of ammonium chloride or biuret. NS90 grown on 1 mM biuret reaches an optical density comparable to that reached with 3 mM ammonium chloride, suggesting that 3 mM ammonia is liberated from 1 mM biuret by trzE and DUR1,2. The control strain NS91 does not grow with biuret as nitrogen source.

[0240] FIG. 24 shows the growth progress of NS91, NS103, NS109, and NS110 in medium containing 0.25 mM melamine as sole nitrogen source. An average of all four strains grown on different ammonium chloride concentrations from 0 to 1.5 mM is also shown as a standard curve for growth with limiting nitrogen. NS91 grown on melamine is similar to the 0 mM ammonium chloride control. NS103 grown on 0.25 mM melamine is similar to 0.75-1 mM ammonium chloride, suggesting it is utilizing approximately the predicted 3 mM ammonia per 1 mM melamine. Strains NS109 and NS110 grown on 0.25 mM melamine are similar to 1.25-1.5 mM ammonium chloride, suggesting they are utilizing approximately the predicted 6 mM ammonia per 1 mM melamine.

[0241] FIG. 25 shows the growth progress of NS91, NS103, NS109, and NS110 in medium containing 0.25 mM ammeline as sole nitrogen source. An average of all four strains grown on different ammonium chloride concentrations from 0 to 1.5 mM is also shown as a standard curve for growth with limiting nitrogen. NS91 grown on ammeline is similar to the 0 mM ammonium chloride control. NS103 grown on 0.25 mM ammeline is similar to 0.5 mM ammonium chloride, suggesting it is utilizing approximately the predicted 2 mM ammonia per 1 mM ammeline. Strains NS109 and NS110 grown on 0.25 mM ammeline are similar to 1.0-1.25 mM ammonium chloride, suggesting they are utilizing approximately the predicted 5 mM ammonia per 1 mM ammeline.
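The predicted ammonia yields quoted for FIGS. 12, 13, 24, and 25 follow from simple stoichiometry, which can be sketched in Python as below. The sketch uses only the totals stated in the text: each of steps 1 to 3 releases one NH.sub.3, and completing the pathway through step 6 releases the remaining three. The text does not give separate yields for steps 4 and 5, so the sketch declines to model a pathway ending there. All names are hypothetical.

```python
# Sketch: expected NH3 released per substrate molecule for a strain that
# carries pathway steps up to and including last_step (numbered 1-6 as in
# the strain list). Only totals stated in the text are used.

ENTRY_STEP = {
    "melamine": 1,       # consumed by step 1
    "ammeline": 2,       # consumed by step 2
    "cyanuric acid": 4,  # consumed by step 4
    "biuret": 5,         # consumed by step 5
}


def nh3_per_molecule(substrate: str, last_step: int) -> int:
    """NH3 molecules released per molecule of substrate."""
    first = ENTRY_STEP[substrate]
    if not 0 <= last_step <= 6:
        raise ValueError("last_step must be between 0 and 6")
    if last_step < first:
        return 0  # the strain cannot act on this substrate
    if last_step in (4, 5):
        raise ValueError("per-step yields for steps 4-5 are not given")
    # Steps 1-3 each release one NH3.
    deaminations = max(0, min(last_step, 3) - first + 1)
    # The remaining three nitrogens are released by completing step 6.
    remaining = 3 if last_step == 6 else 0
    return deaminations + remaining


def expected_nh3_mM(substrate: str, conc_mM: float, last_step: int) -> float:
    """Ammonia-equivalent concentration expected from a substrate dose."""
    return conc_mM * nh3_per_molecule(substrate, last_step)


if __name__ == "__main__":
    for name, steps in [("NS103", 3), ("NS109", 6)]:
        for substrate in ("melamine", "ammeline"):
            print(name, substrate, expected_nh3_mM(substrate, 0.25, steps), "mM")
```

With 0.25 mM substrate this gives 0.75 and 1.5 mM for melamine and 0.5 and 1.25 mM for ammeline, in line with the ammonium chloride equivalents reported above.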

[0242] FIGS. 26, 27, and 28 show E. coli strains derived from E. coli K12, E. coli MG1655, E. coli B, and E. coli Crooks (C) containing either pNC121 with the complete melamine utilization pathway, or pNC53, a control vector. See FIGS. 29 and 30 for strain details. All the strains containing pNC121 are able to grow on 0.5 mM melamine as sole nitrogen source (FIG. 28). This indicates that the melamine utilization pathway is broadly applicable to E. coli strains that are commonly utilized for biotechnology applications.

[0243] Strains can also be selected for improved utilization of melamine derived nitrogen sources. In one example, NS88 was passaged for 11 serial transfers in MOPS defined medium with 0.5 mM ammeline as sole nitrogen source. After the final passage, single colonies were isolated, and one was designated as NS93. NS93 and NS91 were grown overnight in medium with 0.5 mM ammonium chloride as sole nitrogen source, and then inoculated in medium with 0.5 mM ammeline as sole nitrogen source. NS91 exhibited a maximum growth rate of 0.024 hr.sup.-1 on ammeline, while NS93 exhibited a maximum growth rate of 0.087 hr.sup.-1.
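The two growth rates reported above can be compared with a short Python sketch. The doubling times are derived values that assume simple exponential growth at the stated maximum rate; they are not reported in the text, and the function names are hypothetical.

```python
# Sketch: compare the reported maximum specific growth rates (per hour).
import math


def doubling_time_h(mu_per_h: float) -> float:
    """Doubling time in hours, assuming exponential growth at rate mu."""
    if mu_per_h <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / mu_per_h


def fold_change(mu_selected: float, mu_reference: float) -> float:
    """Ratio of the selected strain's growth rate to the reference rate."""
    if mu_reference <= 0:
        raise ValueError("reference growth rate must be positive")
    return mu_selected / mu_reference


if __name__ == "__main__":
    mu_ns91, mu_ns93 = 0.024, 0.087  # values reported for growth on ammeline
    print(f"NS91 doubling time: {doubling_time_h(mu_ns91):.1f} h")
    print(f"NS93 doubling time: {doubling_time_h(mu_ns93):.1f} h")
    print(f"fold change: {fold_change(mu_ns93, mu_ns91):.2f}")
```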

Media Utilization

[0244] Cultures were grown aerobically at 37.degree. C. with 100 mg/L ampicillin. Pre-cultures were grown in LB media with 100 mg/L ampicillin, washed once with an equal volume of MOPS media containing no nitrogen, and inoculated at 5% v/v of the final fermentation volume. The content of the MOPS medium is outlined in FIG. 11.

Imaging Cultures in Various Media

[0245] Precultures were grown in LB media with 100 mg/L ampicillin, and 0.1 mL was directly inoculated into 5 mL MOPS media with 100 mg/L ampicillin and the indicated nitrogen source. Cultures were grown at 37.degree. C. in a drum roller at 30 rpm. See FIG. 14.

Example 10--Organisms Engineered to Utilize Cyanamide

Organisms

[0246] NS100--industrial S. cerevisiae strain with pNC67 (hyg.sup.R, nat.sup.R)

[0247] NS101--industrial S. cerevisiae strain with pNC93 (hyg.sup.R, cah)

[0248] NS111--S. cerevisiae NRRL Y-2223 with pNC93 (hyg.sup.R, cah)

[0249] NS112--S. cerevisiae NRRL Y-2223 with pNC67 (hyg.sup.R, nat.sup.R)

[0250] See FIG. 16.

Utilization of Cyanamide in Defined Medium

[0251] Optical density was measured for NS100 and NS101 grown in defined medium with different nitrogen sources. NS100 and NS101 were grown overnight in YPD medium, washed once in an equal volume of sterile water, and inoculated at 3.33% v/v. Strain NS101 is able to grow to an optical density with cyanamide comparable to that with urea, while NS100 grows to an optical density comparable to that with no nitrogen present in the medium. Data are averages of 3 replicate wells in a 96 well plate, with 150 .mu.L per well. Cultures were grown at 30.degree. C. The YNB medium contained 20 g/L glucose, 1.7 g/L YNB base medium without amino acids or ammonium sulfate, 5 g/L sodium sulfate, 100 .mu.g/mL hygromycin, and either 10 mM urea, 10 mM cyanamide, or no nitrogen source. Inoculation was with 5 .mu.L of culture pregrown for 24 hrs in the same medium with urea as nitrogen source. See FIG. 17.
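The medium composition and inoculum size given above reduce to simple arithmetic, sketched here in Python. The function names are hypothetical, and the sketch assumes that 150 .mu.L is the final well volume, which is consistent with the stated 3.33% v/v inoculum.

```python
# Sketch: component masses for a batch of the defined YNB medium, and the
# inoculum fraction for a 96 well plate. Concentrations are from the text.

YNB_MEDIUM_G_PER_L = {
    "glucose": 20.0,
    "YNB base (no amino acids or ammonium sulfate)": 1.7,
    "sodium sulfate": 5.0,
}


def grams_for_volume(volume_l: float) -> dict[str, float]:
    """Mass of each dry component needed for the given batch volume."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: conc * volume_l for name, conc in YNB_MEDIUM_G_PER_L.items()}


def inoculum_percent(inoculum_ul: float, final_volume_ul: float) -> float:
    """Inoculum as a percentage (v/v) of the final culture volume."""
    if final_volume_ul <= 0:
        raise ValueError("final volume must be positive")
    return 100.0 * inoculum_ul / final_volume_ul


if __name__ == "__main__":
    for component, grams in grams_for_volume(0.5).items():
        print(f"{component}: {grams:.2f} g")
    print(f"inoculum: {inoculum_percent(5, 150):.2f}% v/v")
```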

[0252] Additionally, strains NS100, NS101, NS111, and NS112 were grown in defined YNB medium with 10 mM urea and 100 .mu.g/mL hygromycin to stationary phase aerobically at 30.degree. C. 1/1000 v/v inoculations were then made into the same defined medium with either 10 mM urea, 10 mM cyanamide, or no additional nitrogen and grown under the same conditions. Optical density was measured after 72 hours, as shown in FIG. 22. Strains NS101 and NS111, two different S. cerevisiae strains carrying the cah gene, were able to grow to an optical density comparable to that with urea; however, NS100 and NS112 only were able to grow to an optical density equal to or lower than in media with no nitrogen source. This shows that multiple S. cerevisiae strains are able to utilize cyanamide in the presence of the cah gene.

Competition in Defined Medium

[0253] Strains NS100 (hyg.sup.R, nat.sup.R) and NS101 (hyg.sup.R, cah) were grown in defined medium with 100 .mu.g/mL hygromycin with urea as nitrogen source, and then both inoculated into defined medium containing either 10 mM urea or 10 mM cyanamide as nitrogen source. Upon growth to stationary phase, 1/100 v/v serial transfers were made to fresh medium with the same composition. The culture population was monitored by counting the number of hyg.sup.R, nat.sup.R colony forming units (NS100) and subtracting it from the number of hyg.sup.R colony forming units (NS100 plus NS101) to obtain the NS101 count. See FIG. 18 and FIG. 19 for one experiment in defined minimal medium. A second experiment is shown in FIG. 21. The second experiment included both defined minimal (YNB) and defined complex (YNB+SC amino acids) medium compositions. The defined YNB medium contained 20 g/L glucose, 1.7 g/L YNB base medium without amino acids or ammonium sulfate, 5 g/L sodium sulfate, and either 10 mM urea, 10 mM cyanamide, or no nitrogen source. Medium compositions are additionally given in FIGS. 32 and 33. Growth occurred aerobically at 30.degree. C. Colony forming units were counted by serial dilutions in YPD media with either 300 .mu.g/mL hygromycin or 100 .mu.g/mL nourseothricin, and are the average of 3 dilution counts.
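The population bookkeeping in the competition experiment can be sketched in Python as follows. Hygromycin plates count both strains, nourseothricin plates count only NS100, and the difference is attributed to NS101. The function names and the colony counts in the usage example are hypothetical; the number of generations per transfer is a derived value that assumes the culture regrows to the same density after each 1/100 dilution.

```python
# Sketch: estimate the cah-carrying fraction of a mixed culture from
# selective plate counts, and the generations elapsed per serial transfer.
import math


def cah_fraction(hyg_cfu: float, nat_cfu: float) -> float:
    """Fraction of the population that is hyg-resistant but nat-sensitive."""
    if hyg_cfu <= 0:
        raise ValueError("hygromycin count must be positive")
    if not 0 <= nat_cfu <= hyg_cfu:
        raise ValueError("nourseothricin count must be between 0 and hyg count")
    return (hyg_cfu - nat_cfu) / hyg_cfu


def generations_per_transfer(dilution_factor: float = 100) -> float:
    """Doublings needed to regrow after a dilution of the given factor."""
    if dilution_factor < 1:
        raise ValueError("dilution factor must be at least 1")
    return math.log2(dilution_factor)


if __name__ == "__main__":
    # Hypothetical colony counts for illustration only.
    print(f"cah fraction: {cah_fraction(1000, 250):.2f}")
    print(f"generations per 1/100 transfer: {generations_per_transfer():.2f}")
```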

Utilization of Cyanamide in Rich Medium

[0254] Optical density was measured for NS100 and NS101 grown in rich YPD medium with 100 .mu.g/mL hygromycin, with and without 10 mM cyanamide. NS100 and NS101 were grown overnight in YNB medium, and inoculated at 3.33% v/v. NS101 experiences a shorter lag phase than NS100 in the presence of 10 mM cyanamide. Thus, cyanamide, in addition to functioning as a sole source of nitrogen, can also act as a deterrent for microbial growth. Data are averages of 3 replicate wells in a 96 well plate, with 150 .mu.L per well. Cultures were grown at 30.degree. C. in YPD medium or YPD medium with 10 mM cyanamide. Inoculation was with 5 .mu.L of culture pregrown for 24 hrs in the YNB medium with urea as nitrogen source.

[0255] See FIG. 20.

Example 11--Cyanamide Hydratase Activity Assay

[0256] This assay measured the conversion rate of cyanamide to urea. In the first step, cyanamide was hydrated to urea by cyanamide hydratase, which was detected in cell free extract of a S. cerevisiae strain expressing the cah gene and a control strain without cah. In the second step of the assay, a commercial kit (Megazyme, Ireland) was used to detect urea via enzymatic conversion of urea to ammonia followed by NADPH linked conversion of ammonia and 2-oxoglutarate to NADP+, H.sub.2O, and glutamic acid.

[0257] Cell free extracts were prepared by growing S. cerevisiae strains in 50 mL yeast extract, peptone, dextrose (YPD) medium with 300 .mu.g/mL hygromycin to an optical density between 1 and 2. Cells were harvested by centrifugation, washed once in an equal volume of water, and re-suspended in Y-PER lysis buffer (Thermo Scientific, USA) following the manufacturer's instructions. After incubation at room temperature for 20 minutes, the lysate was centrifuged at 14,000.times.g for 10 min and the supernatant was recovered as the cell free extract. Total protein was measured by a Nanodrop spectrophotometer (Thermo Scientific, USA).

Protocol

[0258] Add together in a 100 .mu.L volume:

[0259] 10 .mu.L of 50 mM NaPO4, pH 7.7;

[0260] 10 .mu.L of 200 mM cyanamide made fresh

[0261] 5-20 .mu.L cell free extract

[0262] balance water (60 .mu.L for 20 .mu.L CFE)

Add 100 .mu.L of the above sample to 2.9 mL of Megazyme urea/ammonia assay reagents and monitor absorbance at 340 nm.
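The reaction mix in the protocol above can be checked with a short Python sketch that computes the water balance for a chosen cell free extract (CFE) volume and the final concentrations of buffer and substrate in the 100 .mu.L reaction. The function names are hypothetical; the volumes and stock concentrations are those listed in the protocol.

```python
# Sketch: water balance and final concentrations for the 100 uL
# cyanamide hydratase reaction. Volumes in microliters, concentrations in mM.

TOTAL_UL = 100.0
BUFFER_UL = 10.0      # 50 mM NaPO4, pH 7.7
CYANAMIDE_UL = 10.0   # 200 mM cyanamide, made fresh


def reaction_mix(cfe_ul: float) -> dict[str, float]:
    """Volumes of each component for a given cell free extract volume."""
    if not 5 <= cfe_ul <= 20:
        raise ValueError("protocol uses 5-20 uL of cell free extract")
    water_ul = TOTAL_UL - BUFFER_UL - CYANAMIDE_UL - cfe_ul
    return {
        "NaPO4 buffer": BUFFER_UL,
        "cyanamide": CYANAMIDE_UL,
        "cell free extract": cfe_ul,
        "water": water_ul,
    }


def final_conc_mM(stock_mM: float, volume_ul: float,
                  total_ul: float = TOTAL_UL) -> float:
    """Concentration of a stock after dilution into the reaction volume."""
    return stock_mM * volume_ul / total_ul


if __name__ == "__main__":
    print(reaction_mix(20))
    print("cyanamide in reaction:", final_conc_mM(200, CYANAMIDE_UL), "mM")
    print("phosphate in reaction:", final_conc_mM(50, BUFFER_UL), "mM")
```

For 20 .mu.L of extract the water balance is 60 .mu.L, as stated in the protocol.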

TABLE-US-00005
Cyanamide hydratase activity
Strain    Genotype               Activity (.mu.mol mg.sup.-1 min.sup.-1)    Standard Deviation
NS100     hyg.sup.R nat.sup.R    0.019                                      0.001
NS101     hyg.sup.R cah          0.073                                      0.002

Example 12--Exemplary Sequences of the Invention

[0263] Sequence 1 is the DNA sequence of the allophanate hydrolase atzF gene in Pseudomonas sp. strain ADP.

[0265] Sequence 2 is the DNA sequence of allophanate hydrolase DUR1,2 gene in S. cerevisiae.

[0266] Sequence 3 is the DNA sequence of allophanate hydrolase YALI0E07271g gene in Y. lipolytica CLIB122.

[0267] Sequence 4 is the DNA sequence of the biuret amidohydrolase atzE gene in Pseudomonas sp. strain ADP.

[0268] Sequence 5 is the DNA sequence of the cyanuric acid amidohydrolase atzD gene in Pseudomonas sp. strain ADP.

[0269] Sequence 6 is the DNA sequence of the cyanuric acid amidohydrolase trzD gene in Pseudomonas sp. strain NRRL B-12227 (formerly Acidovorax citrulli).

[0270] Sequence 7 is the DNA sequence of the cyanuric acid amidohydrolase atzD trzD gene in Rhodococcus sp. Mel.

[0271] Sequence 8 is the DNA sequence of the guanine deaminase guaD gene in E. coli K12 strain MG1655.

[0272] Sequence 9 is the DNA sequence of the guanine deaminase blr3880 gene in Bradyrhizobium japonicum USDA 110.

[0273] Sequence 10 is the DNA sequence of the guanine deaminase GUD1/YDL238C gene in S. cerevisiae.

[0274] Sequence 11 is the DNA sequence of the guanine deaminase YALI0E25740p gene in Y. lipolytica CLIB122.

[0275] Sequence 12 is the DNA sequence of the melamine deaminase trzA gene in Williamsia sp. NRRL B-15444R (formerly R. corallinus).

[0276] Sequence 13 is the DNA sequence of the melamine deaminase triA gene in Pseudomonas sp. strain NRRL B-12227 (formerly Acidovorax citrulli).

[0277] Sequence 14 is the DNA sequence of the isopropylammelide isopropylaminohydrolase atzC gene in Pseudomonas sp. strain ADP.

[0278] Sequence 15 is the cDNA sequence of the Myrothecium verrucaria cyanamide hydratase (cah) gene.

[0279] Sequences 16-21 are DNA sequences of the invention.

[0280] Sequences 22-37 are the sequences of various cyanamide hydratase (cah) genes for use in the invention.

[0281] Sequences 38 and 39 are the sequences of various trzC genes for use in the invention.

[0282] Sequences 40 and 41 are the sequences of various trzE genes for use in the invention.

[0283] Sequence 42 is the sequence of plasmid pNC10.

[0284] Sequence 43 is the sequence of plasmid pNC53.

[0285] Sequence 44 is the sequence of plasmid pNC67.

[0286] Sequence 45 is the sequence of plasmid pNC85.

[0287] Sequence 46 is the sequence of plasmid pNC86.

[0288] Sequence 47 is the sequence of plasmid pNC87.

[0289] Sequence 48 is the sequence of plasmid pNC93.

[0290] Sequence 49 is the sequence of plasmid pNC96.

[0291] Sequence 50 is the sequence of plasmid pNC97.

[0292] Sequence 51 is the sequence of plasmid pNC101.

[0293] Sequence 52 is the sequence of plasmid pNC120.

[0294] Sequence 53 is the sequence of plasmid pNC121.

INCORPORATION BY REFERENCE

[0295] All of the U.S. patents and U.S. published patent applications cited herein are hereby incorporated by reference.

EQUIVALENTS

[0296] Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.

Sequence CWU 1

1

7611818DNAPseudomonas sp. 1atgaatgacc gcgcgcccca ccctgaaaga tctggtcgag tcacgccgga tcacctgacc 60gatctggctt cctatcaggc tgcctatgcc gccggtacag acgccgccga cgtcatttcg 120gacctgtatg cccgtatcaa agaagacggc gaaaatccga tctggattag cctgttgccc 180ttggaaagcg cattggcgat gctggccgac gcgcagcaac gcaaggacaa gggagaagcg 240ttgccgctct ttggcatccc cttcggcgtc aaggacaaca tcgacgtcgc aggccttccg 300acgactgccg ggtgtacggg gttcgcgcgt acgccccgac agcacgcctt cgtcgtacag 360cgcctggtgg acgctggcgc gatcccgatc ggaaaaacga acctcgatca attcgcgacc 420gggttgaacg gcactcgcac gccgtttggc attccgcgct gcgtgttcaa cgagaactac 480gtatccggcg gctccagcag tggctccgca gtggccgtcg ccaacggcac ggtaccgttc 540tcgctcggga cggacactgc cggttccggc cgcattcctg ctgcgttcaa caatctggtg 600ggcttgaaac cgaccaaagg cctgttctcg ggcagtggac tggttcccgc ggcgcgaagc 660cttgactgca tcagcgtcct cgcccatacc gtagatgacg cccttgcggt cgcacgcgtc 720gccgccggct acgatgctga tgacgctttt tcgcgcaagg cgggcgccgc cgcactgaca 780gaaaagagtt ggcctcgtcg cttcaatttc ggggtcccag cggcggaaca tcgccagttt 840ttcggtgacg cggaagccga ggcgcttttc aataaagcgg ttcgcaagct tgaagagatg 900ggtggcacct gcatctcgtt tgactatacc cccttcaggc aggctgctga actgctctac 960gccggccctt gggttgcgga gcgcctggcg gccatcgaga gccttgcgga cgagcatccc 1020gaggtgctcc acccggtcgt tcgtgacatc atcttgtccg cgaagcgaat gagcgcagtc 1080gacacgttca acggtatcta tcgcctggcc gaccttgtca gggctgcaga gagcacttgg 1140gaaaagatcg atgtgatgct gctgccgacg gcgccgacca tctacactgt agaagacatg 1200ctcgccgatc cggtacgcct caacagcaat ctgggcttct acacgaactt cgtgaacttg 1260atggatttgt ccgcgattgc tgttcccgca ggcttccgaa ccaatggcct gccatttggc 1320gtcactttca tcggtcgggc gttcgaagat ggggcgatcg caagcttggg aaaagctttc 1380gtggagcacg acctcgccaa gggcaacgcg gccacggcgg cgccacccaa ggataccgtc 1440gcaatcgccg tggtaggtgc acatctctcc gaccagccct tgaatcatca gctcacggag 1500agcggcggaa agctacgggc aacaacgcgt actgcgccgg gatatgcctt gtacgcactc 1560cgtgatgcga cgccggctaa gcctggaatg ttgcgcgacc agaatgcggt cgggagcatc 1620gaagtggaaa tctgggatct gccggtcgcc gggttcggtg cgtttgtaag tgaaattccg 1680gcgccgttgg gtatcgggac 
aataacactc gaagacggca gccatgtgaa aggctttctg 1740tgcgagccac atgccatcga gacggcgctc gacatcactc actacggcgg ctggcgagca 1800tacctcgcgg ctcaatag 181825508DNASaccharomyces cerevisiae 2atgacagtta gttccgatac aactgctgaa atatcgttag gttggtcaat ccaagactgg 60attgatttcc acaagtcatc aagctcccag gcttcactaa ggcttcttga atcactacta 120gactctcaaa atgttgcgcc agtcgataat gcgtggatat cgctaatttc aaaggaaaat 180ttactgcacc aattccaaat tttaaagagc agagaaaata aagaaactct acctctctac 240ggtgtcccta ttgctgttaa ggacaacatc gacgttagag gtctacgcac caccgctgca 300tgtccatcct ttgcatatga gccttccaaa gactctaaag tagtagaact actaagaaat 360gcaggtgcaa taatcgtggg taagacaaac ttggaccaat ttgccacagg attagtcggc 420acacggtctc catatgggaa aacaccttgc gcttttagca aagagcatgt atctggtggt 480tcctccgctg ggtcagcatc ggtggtcgcc agaggtatcg taccaattgc attgggtact 540gatacagcag gttctggtag agtcccagcc gccttgaaca acctgattgg cctaaagcca 600acaaagggcg tcttttcctg tcaaggtgta gttcccgctt gtaaatcttt agactgcgtc 660tccatctttg cattaaacct aagtgatgct gaacgctgct tccgcatcat gtgccagcca 720gatcctgata atgatgaata ttctagaccc tatgtttcca acccaaagaa aaatttttca 780agcaatgtaa cgattgctat tcctaaaaat atcccatggt atggtgaaac caagaatcct 840gtactgtttt ccaatgctgt cgaaaatcta tcaagaacgg gcgctaacgt catagaaatt 900gattttgagc ctcttttaga gttagctcgc tgtttatacg aaggtacttg ggtggccgag 960cgttatcaag ctattcaatc gtttttggac agtaaaccac caaaggaatc tttggaccct 1020actgttattt caattataga aggggccaag aaatacagtg cagtagactg cttcagtttt 1080gaatacaaaa gacaaggcat cttgcaaaaa gtgagacgac ttctcgaatc agtcgatgtc 1140ttgtgtgtgc ccacatgtcc cttaaatcct actatgcaac aagttgcgga tgaaccagtc 1200ctagtcaatt caagacaagg cacatggact aattttgtca acttggcaga tttggcagcc 1260cttgctgttc ccgcagggtt ccgagacgat ggtttgccaa atggtattac tttaatcggt 1320aaaaaattca cagattacgc actattagag ttggctaacc gctatttcca aaatatgttc 1380cccaacggtt ccagaacata cggtactttt acctcttctt cagtaaagcc agcaaacgat 1440caattagtgg gaccagacta tgacccatct acgtccataa aattggctgt tgtcggtgca 1500catcttaagg gtctgcctct acattggcaa ttggaaaagg tcaatgcaac atatttatgt 1560acaacaaaaa 
catcaaaagc ttaccagctt tttgctttgc ccaaaaatgg accagtttta 1620aaacctggtt tgagaagagt tcaagatagc aatggctctc aaatcgaatt agaagtgtac 1680agtgttccaa aagaactgtt cggtgctttt atttccatgg ttcctgaacc attgggaata 1740ggttcagtgg agttagaatc tggtgaatgg atcaaatcct ttatttgtga agaatctggt 1800tacaaagcca aaggtacagt tgatatcaca aagtatggtg gatttagagc atattttgaa 1860atgttgaaga aaaaagagtc ccaaaagaag aagttatttg ataccgtgtt aattgccaat 1920agaggtgaaa ttgccgttcg tattatcaag acattaaaaa aattgggtat tagatcagtt 1980gcagtttatt ccgaccctga taaatattct caacacgtta ctgatgcaga tgtttctgta 2040ccccttcatg gcacaaccgc agcccaaact tatttagaca tgaataagat catagatgcc 2100gctaagcaaa ctaatgcaca ggccattatt cctggttatg gtttcttgtc ggaaaatgcg 2160gatttttctg atgcgtgcac cagtgctggc attacctttg ttggtccttc gggagatatt 2220atcagaggtt tagggttaaa acattctgct agacagattg cacagaaggc tggcgttcct 2280ctagtgccag gctctttgct tatcacatca gttgaagagg ctaagaaagt cgcagcggaa 2340ttggaatacc cagttatggt gaagtcaact gctggtggcg gtggtattgg tttgcagaaa 2400gtcgattctg aagaggacat cgagcatatt tttgagactg tgaaacatca aggtgaaaca 2460tttttcggtg acgctggtgt atttctgaaa cggtttatcg aaaatgccag gcatgttgaa 2520gtccaactta tgggagatgg ttttggtaag gccattgctt tgggcgaacg tgattgttct 2580ttacagcgtc gtaaccaaaa agttatcgaa gaaactcctg caccaaattt gccagaaaag 2640acgaggttgg cgttaagaaa ggcagctgaa agtttgggat ctttattgaa ttacaagtgt 2700gctggtacgg ttgaatttat ttacgatgag aaaaaggacg agttttactt tttagaagtt 2760aatacaagat tacaagttga acatccaata acagaaatgg ttacagggtt agacttggtc 2820gagtggatga tcaggattgc cgctaatgat gcacctgatt ttgattctac aaaggtagaa 2880gtcaatgggg tttcaatgga ggcacgttta tatgctgaaa atccattgaa aaatttcaga 2940ccttctccag gtttacttgt cgatgtgaaa tttcctgatt gggcaagagt ggatacttgg 3000gttaagaaag gtactaatat ttctcccgaa tatgatccaa cattggccaa aattatcgtt 3060catgggaaag accgtgatga tgcaatttcc aagttaaatc aagcgttaga agaaacaaaa 3120gtttacggat gtattactaa cattgactac ctgaagtcta tcattaccag tgatttcttt 3180gctaaagcaa aagtttctac aaacattttg aactcttatc aatatgagcc taccgccatc 3240gaaattactt tgcccggtgc acacactagt attcaggatt 
accccggtag agttgggtac 3300tggagaattg gtgttccgcc ctctggtcca atggacgcat attcgtttag attggcgaac 3360agaattgttg gtaatgacta caggactcct gccattgaag taacgttgac tggtccatcc 3420atcgttttcc attgtgaaac tgtcattgcc attactggtg gtaccgctct atgtacatta 3480gacggccaag aaattcccca acacaaaccg gtcgaagtta agaggggatc tactttatcc 3540attggcaagt tgacaagcgg ctgtagagca tacttaggta tcaggggtgg cattgatgtg 3600cctaaatact tgggctctta ttctactttc actctaggaa atgtcggtgg atacaatgga 3660agggtgctaa aacttggaga cgtactattc ttaccaagca atgaagaaaa taaatcagtt 3720gagtgccttc cacagaatat tcctcaatca ttaattcctc aaatttccga aactaaggaa 3780tggagaattg gtgtaacatg tggtccccat gggtctccag atttttttaa acctgagtcc 3840atcgaagaat ttttcagtga gaagtggaag gttcattaca actccaatag atttggtgtc 3900cgtttgattg gacctaaacc taagtgggca agaagtaatg gtggtgaagg tggtatgcat 3960ccttcaaaca ctcacgatta cgtttattct ctgggtgcaa ttaatttcac gggtgatgag 4020ccagttatta ttacttgcga tggtccttcc ttaggtggtt ttgtgtgtca agctgttgtc 4080ccagaagcag aactgtggaa ggttggacag gttaaacccg gtgattccat tcagtttgtg 4140ccactttctt acgaaagctc gagatcctta aaggaatctc aggaagttgc aattaaatca 4200ttggatggta ctaagttaag gcgcttagac tctgtttcaa ttttaccatc attcgaaacg 4260cctattcttg cacaaatgga aaaagtgaat gagctttcac caaaggttgt atacagacaa 4320gcaggtgatc gttatgtttt ggtggaatac ggtgataatg aaatgaattt taatatttcc 4380tatagaattg aatgcctgat ctcccttgtg aaaaagaata agactattgg tattgttgaa 4440atgtcccaag gtgttagatc tgtgttgata gaatttgatg gttacaaagt cactcaaaaa 4500gaattgctta aagtattggt ggcatatgaa acagaaatcc agtttgatga aaattggaag 4560ataacttcta atataataag attaccgatg gctttcgaag actcgaagac tttggcatgt 4620gttcaaaggt atcaagaaac aattcgttcg tctgctccat ggttgccaaa taacgttgat 4680ttcattgcca atgtaaatgg aatttcaagg aatgaagttt atgatatgtt gtattctgcc 4740agatttatgg ttttaggttt aggtgatgtc ttcctagggt cgccttgtgc tgttccatta 4800gatcctcgtc acagattttt gggaagcaag tacaacccaa gtagaacata tacagaaaga 4860ggtgcagtcg gtattggcgg tatgtatatg tgcatatatg ctgctaacag tcctggtggg 4920taccaattag tgggtagaac aataccaatt tgggacaaac tatgtctggc cgcatcttct 4980gaggttccgt 
ggttgatgaa cccatttgac caagtcgaat tttacccagt ttctgaagaa 5040gatttggata aaatgactga agattgtgat aatggtgttt ataaagtcaa tatcgaaaag 5100agtgtttttg atcatcaaga atacttgaga tggatcaacg caaacaaaga ttccatcaca 5160gcattccagg agggccagct tggtgaaaga gcagaggaat ttgccaaatt gattcaaaat 5220gcaaactctg aactaaaaga aagtgtcaca gtcaaacctg acgaggaaga agacttccca 5280gaaggtgcag aaattgtata ttctgagtat tctgggcgtt tttggaaatc catagcatct 5340gttggagatg ttattgaagc aggtcaaggg ctactaatta ttgaagccat gaaagcggaa 5400atgattatat ccgctcctaa atcgggtaag attatcaaga tttgccatgg caatggtgat 5460atggttgatt ctggtgacat agtggccgtc atagagacat tggcatga 550835463DNAYarrowia lipolytica 3atgtgcaaat caatcggctg gactattgcc gaatggaagg aggcacagac caactcgtct 60tacgaggagg cccgacatcg actgttggac ctcgtggcca ccttcaagga ctacaagcat 120ggtgatccgg cttggatcac tgtcgcctca acagagcata tcaacaagca atggaaggag 180cttcagttga tgaagaagaa cccagagtcc cttccccttt acggagttcc tttcgctgta 240aaggacaaca ttgatgtcat cgactttccc acaaccgctg catgccccgc ctatctctac 300atccccaagg aagacgccac catggtccgt ctgatcaaag aggctggagg tatcgttgtc 360ggcaaaacca acctcgatca gttcgctact ggtctggtcg gaacccgatc tccttacgga 420aagactccca acaccttctc cgacaagcac gtatctggag gttcgtctgc tggctctgct 480tccgtagtcg cccgaggcct ggttcccttt tctcttggaa cagatactgc aggctcaggt 540cgggttcccg cctctctcaa caacctggta ggcctaaagc caaccgttgg cgcattttca 600gccaagggtg tggtacccgc ctgcaagtcg cttgattgcg tctccatttt ctcgctggtc 660ctgtctgacg ctcagctggt gttcaacatt gccgcccact ttgacaagga cgattgctac 720tcgcgacgtt tcccccagcg acctctcaag tcgtttggcc ccactccagt atttgccgtc 780cccgaaaccc ctctgtggtt tggagatgag ctcaaccctg ctctcttcga cgacgccgtt 840gagcgtttgc gacaacaggg cgtaaaggtc gtcaagattg acttcactcc tctgttcgac 900ctcgccaagt gcctctacga aggtccctgg gtggctgagc gatacgctgc catcaaggac 960tttgtgcaga accgaaagga agacatggac gaaactgtgt atggcattgt caagcaggct 1020gagaacttca ctgctgcaga cgcctttgcc tacgagtaca aacgacgagc cattgtgcga 1080aagattgagg agatcttctc ttccattgac ggtctgatcg tgcccacatg tcctctattc 1140cccaccatgg agtctgtggc taaggagcct gtcactgtca 
atgcccacca gggtacctac 1200accaactttg tcaacctcgc tgatctctct gctctagcta tccctgtcgg attccgaaag 1260gacggtttcc cctttggaat cactctcatc tctcaaaagt tcaacgacta cgctctgctg 1320gacatggctc agaagttcct gcctgcttct cgacctctgg gtgctctgcc aaaggacaag 1380ttcaccgcca agaagggaga tcttcttgcc tcttctatcg tcgacaacat gcctcgaacc 1440atccctctgg ctgttgtagg agcccatctc accggcatgc ctctcaactg gcagcttcaa 1500aaggtcgagg ctactcttgc ccgacgaacc aaaactgccg actactaccg actctacgct 1560ctggcgaaca ccgtgcctac aaagcctggt ctccgacggg ttcttccctc tgacactact 1620ctccgaggcg aggctattga ggttgaaatc tgggacgtgc cttacagaaa ctttggagag 1680ttcgtatcaa tggtccctca tcctcttggt atcggaacca ttgagcttgc cgacggaaaa 1740tgggtcaagg gtttcatttg cgagcagctg ggatacgacg acgctgagga catcaccaag 1800tttggcggct ggagagcgta caaggctgag actacccaga acctggagtc caagcctttc 1860gagactgttc tggtcgccaa ccgaggtgag attgccgttc gactcatcaa aactcttcga 1920aagatggata ttcgagctgt ggctgtcttc tccgagcctg atcggttcgc tcaacatgtt 1980cttgatgctg atgactctgt gtctctggaa ggtaccactg ccgccgagac ttacttgtcc 2040atccccaaga ttatcgctgc ttgcaagaag actggagccc aagccattct tcctggctac 2100ggtttcctgt ctgagaatgc tgacttctcc gacgcctgtg ccgaggctgg tatcgtattc 2160attggcccca ctggtgactc cattcgaaag ctcggtctca agcactctgc acgagagatt 2220gctcttgctt ctgacgtgcc tcttgtgccc ggtacaggcc tgatcgagac tgtttccgag 2280gcctccgagg ctgccgagaa gctcgagtac cccctgatga tcaagagtac cgctggtgga 2340ggtggtattg gtcttcagaa ggtcgacaaa cccgaggatc tcaagcgggc ttttgagacc 2400gtcaagcacc aaggtaagtc tttctttgga gacgatggtg tcttcatgga gcgatttgtc 2460gagaatgctc gacacgtgga ggttcagatt cttggtgacg gcaagggcaa cgctctcgct 2520attggcgagc gagactgttc tcttcagcga cgaaaccaga aggtcgtcga agagactcct 2580gcccccaact tccctgctga gactcgaact cgaatgatga aggcgtccga aatgctggca 2640aagaacctca actatcgagg tgccggcact gtggagttca ttttcgatga gaagcgaaac 2700gagttctact tccttgaggt taacgctcgt ctgcaggtcg agcatcccat cactgagtcc 2760gtcactggac tggatcttgt cgagtggatg attctcattg gagctggcaa ggccccagac 2820ttcgaggccc agcgtgccaa gaccccccag ggtgcttcta tcgaggcccg tctgtacgcc 2880gagaaccccg 
tcaaggactt tgtgccttct cccggtcagc tcaccgacgt gcagttccct 2940agtgatgctc gagtcgacac ctgggtcagc cgtggaacca agatctcagc agagtacgat 3000cccactcttg ccaagattat tgttcacggc tctgaccgag ctgacgccct gcgaaagctc 3060cagagagctc tggacgagac agtggttgcc ggcgtgacca ccaacctgga ctaccttaag 3120tccattgtcg gatctcagat gtttgccgag gccaaggtgt ccacccgagt actggactct 3180tacaactaca ctcccaatgc cattgagatc acttcccccg gctcctacac cactattcag 3240gattaccccg gtcgaaccaa gctgtggcat attggtgttc ctccttctgg acccatggat 3300gcctacgcct tccgggtggc caaccagatt gtgggcaacc accccaaggc tcctgctatc 3360gaagctacac ttgtgggccc ctcaattatg ttccacagcg acactgtgat tgccatcacc 3420ggtggatctg ctgaggccac tcttaatggt gagcccatcg agttctggaa gcctgtgact 3480gtcaaggctg gccagactct cgcaactggc cgtctcactt ctggctgcag attgtacatt 3540gcgattcgaa acggtctgtc tattccagag taccttggtt ctcgatccac cttcgctctc 3600ggtaaccttg gaggcttcaa cggtcgaact ctcaagtttg gcgatgtcat tttcatgggc 3660gagcccgagc ttccctcctg ctccattcct gctcccatct ccgagcatgc tcctgcctct 3720gatgacatga tccccaagta tggcaacgcc tggactgttg gagtcacttg cggccctcac 3780ggctcgccag acttttttgc tcacggctgg atggatacct tcttcgatgc caagtggaag 3840atccattaca actccaaccg atttggtgtt cgtctgattg gccccaagcc cgagtgggct 3900cgaaaggatg gaggagaggc tggtctgcat ccttccaacc agcacgacta tgtctactct 3960ctgggtgcca tcaatttcac cggtgatgag cctgtcattc tgacctgcga tggtccttct 4020ctcggtggct ttgtctgtgc tgctgttgtt gtagaggccg agctgtggaa gattggccag 4080gtcaagcccg gagacactgt gcagtttgtg cccatgacta ttgactctgc tcgacagctc 4140aagaaggccc aggacagaac cattaccaac ctgtgcggtt ccccgtacga gtctgttgat 4200gctcttctcg ctctggagga ttacgagaac cccatcatct acaccgtccc tgcctctacc 4260tccactcctc gagtcgtcta ccgacaggct ggagaccgat acattctggt cgagtacggt 4320gacaacaaca tggacattaa cctgtcctat cgaatccatc ggctcattga ggaagctcag 4380cagtctatca agggcattgt cgaaatgtct cgaggtgttc gttctgtgct gatcgagttc 4440catccttctg cctctcgatc cactctcatg caggctttgg tcgactttga gaagcgactt 4500cagtttgtcg agacctggca ggttccctct cgaattattc gactgccgat gtgctttgag 4560gactccaaga ccctggacgc tgtcaaacgg taccaggaga 
ccattcggtc aaaggctccc 4620tggcttccca acaacgtcga cttcattcga gacgtcaaca agttctccga ccgatctcag 4680gtccgagaca ttgtctacac tgcccgattc ctggttctgg gtcttggaga cgtgttcctt 4740ggtgctcctt gcgcggtacc tcttgatccc cgacacagac tgcttggaac aaagtacaat 4800ccctctcgaa cctacactcc caacggcact gtcggaattg gaggaatgta catgtgtatc 4860tacaccatgg aatctcctgg aggctaccag ttggttggtc gaactatccc catctgggac 4920aagctgtctc tcggccagga ccgaccttgg ctgctgtcac ccttcgacca gattgagtac 4980taccccgtcg acgaggagga gctcaaccac attaccaccg aggtggagaa cggtcgatat 5040gctgtggaga tggagcagtc cgtctttgat tatggcaagt attctgcctg gctcaaggac 5100aactctaagt ccattgaggc tcacattgct tctcaggcag agggtctgga cgacttcgcc 5160aacctgatca aggtcgccaa cgaggatctg gcctctggaa agactggagc caccaaggag 5220gagactcctc tgtcggcctc tgccgtccag gtcttctccg aggtcactgg ccgtttctgg 5280aagggcctgg ttgccgtcgg agatactgtt gacaagggcc agggtatcgt tgtggtggag 5340gccatgaaga ccgagatggt cgtcaacgcc cctgttgctg gaaaggttgt caagttgtac 5400aacaccaatg gagatatggt ggatactgga gattgtgtgg ctgtcatcga gcccattgtt 5460taa 546341374DNAPseudomonas sp. 
4atgaagacag tagaaattat tgaaggtatc gcctctggca gaaccagtgc gcgcgacgtg 60tgcgaagagg cgctcgcaac catcggcgcg accgatggac tcatcaatgc ctttacatgc 120cgtacggttg aacgagcccg cgcagaggcg gatgccatcg atgttcgacg ggcgcgcggc 180gaggtacttc cgcctcttgc cggcctcccc tacgcggtaa agaatctgtt cgacatcgaa 240ggcgtgacga cgcttgccgg ctcgaagatc aaccgtactc tcccgcctgc gcgcgcagac 300gccgtgctgg tgcaacggct gaaagctgcc ggcgccgtgc tcctgggcgg cctcaatatg 360gacgagtttg cctatggatt tacgaccgaa aatacgcact atgggccgac ccggaacccg 420catgacaccg ggcgtatcgc tggtggttcg tcaggggggt ctggagcggc aatcgctgcg 480gggcaggtac cactatcgct cggatcggac accaacggtt ccatacgcgt gccagcatca 540ttgtgtggcg tgtgggggct gaagcctacc ttcggccgcc tgtcccggcg agggacatac 600ccgtttgttc acagcattga tcacctcggg ccattggccg atagcgtgga aggcttggcg 660ttggcctacg atgcaatgca gggcccggat ccgctcgacc ccggatgcag cgcatcgcgc 720atccaaccct cggtaccggt cctcagtcag ggtatcgctg ggctccggat cggcgtgctg 780ggtggctggt ttcgggacaa tgccggcccg gccgcgcgag ccgcggtcga tgttgccgcg 840cttacgctcg gcgccagcga agtcgtcatg tggcccgacg cggagatcgg gcgcgcagcc 900gccttcgtta tcactgccag cgagggaggc tgtctgcatc tcgatgatct tcgcatccgt 960ccgcaagact tcgagcctct gtccgtagat cgctttatct cgggggtttt acaaccggtc 1020gcgtggtact tgcgtgcaca gcggtttcga cgtgtctatc gagataaggt gaatgctctt 1080ttccgtgact gggacatatt aatcgctccc gcaacgccaa taagtgctcc cgcaatcggc 1140accgaatgga tcgaggtaaa cggtacacgc catccgtgcc gcccggctat gggacttctc 1200actcagccgg tctccttcgc aggctgtccg gtggtcgccg ctccaacgtg gcctggagaa 1260aacgatggca tgccgatcgg ggtacagctc atcgcggcgc cctggaacga atctctatgc 1320ctgcgcgcag gcaaggtatt acaagacacc ggtatcgccc gactgaaatg ttaa 137451092DNAPseudomonas sp. 
5atgtatcaca tcgacgtttt ccgaatccct tgccacagcc ctggtgatac atcgggtctc 60gaggatttga ttgaaacagg ccgcgttgcc cccgccgaca tcgtcgcggt aatgggcaag 120accgagggca atggctgcgt caacgattac acgcgtgaat acgccaccgc catgcttgct 180gcgtgccttg ggcgtcattt gcaactccca ccccatgagg tggaaaagcg ggtcgcgttt 240gtgatgtcag gtgggacgga aggcgtgctg tccccccacc acacggtatt cgcaagacgt 300ccggcaatcg acgcgcatcg tcccgctggc aaacgtctca cgcttggaat cgccttcacg 360cgtgattttc tgccggagga aattggccgc cacgctcaga taacggagac agccggcgcc 420gtcaaacgcg caatgcgaga tgccgggatc gcttcgattg acgatctgca ttttgtgcag 480gtgaagtgtc cgctgctgac accagcaaag atcgcctcgg cgcgatcacg cggatgcgct 540ccagtcacga cggatacgta tgaatcgatg ggctattcgc gcggcgcttc ggccctgggc 600atcgctctcg ctacagaaga ggtgccctcc tcgatgctcg
tagacgaatc agtgctgaat 660gactggagtc tctcatcgtc actggcgtcg gcgtctgcag gcatcgaact ggagcacaac 720gtggtgatcg ctattggcat gagcgagcag gccaccagtg aactggtcat tgcccacggc 780gtgatgagcg acgcgatcga cgcggcctcg gtgcggcgaa cgattgaatc gctgggcata 840cgtagcgatg acgagatgga tcgcatcgtc aacgtattcg ccaaagcgga ggcgagcccg 900gacggggttg tacgaggtat gcggcacacg atgctaagtg actccgacat taattcgacc 960cgccatgcgc gggcggtcac cggcgcggcc attgcctcgg tagttgggca tggcatggtg 1020tatgtgtccg gtggcgccga gcatcaggga cctgccggcg gcggcccttt tgcagtcatt 1080gcccgcgctt aa 109261113DNAPseudomonas sp. 6atgcaagcgc aagtttttcg agttccaatg agtaatccag ccgatgttag tggcgtagcc 60aagctcatcg atgagggagt gatccgtgcc gaagaggtcg tctgcgttct cggcaagacc 120gaaggcaacg gctgtgtcaa tgacttcacg cgtggctaca ccaccctcgc gttcaaggtc 180tacttctccg agaaactggg cgtgtcccgg caagaggtcg gcgagcgcat cgctttcatc 240atgtccggcg gtaccgaagg cgtcatggcg cctcactgca ccatcttcac cgtgcagaag 300acggacaaca agcagaagac cgccgctgaa ggcaagcgac ttgccgttca gcagatcttt 360acccgcgagt tcctgccgga ggagatcggc cgcatgccgc aggtcacgga aacagccgac 420gctgttcgcc gcgccatgcg cgaagccggc atcgcggatg catccgatgt ccacttcgtt 480caggtcaagt gcccactgct cactgccggc cgcatgcatg acgctgtcga gcgcgggcat 540acggttgcca ccgaagatac ctatgagtcc atgggctact cccgcggcgc atccgcgctt 600ggtatcgccc tggccctcgg ggaagtcgag aaggccaacc tcagtgatga agttattacc 660gcagactaca gtctctactc ctcggttgcc tcaacttcgg cgggtatcga gttgatgaac 720aacgagatca tcgtcatggg caacagccgc gcatggggtg gtgacctcgt catcggccac 780gccgagatga aggacgccat cgacggtgca gcggtccggc aggccctgcg cgacgtcggg 840tgctgcgaga acgacctgcc gaccgtcgac gagctcggcc gcgtggtcaa tgtatttgcc 900aaggctgaag cctccccgga cggtgaggtt cgtaaccgcc gccacacgat gctggacgat 960tcggacatta acagcacgcg ccatgcgcga gcggtcgtca atgcagttat cgcttcgatc 1020gtgggagatc ccatggttta tgtctccggc ggctccgagc atcagggccc cgccggtggc 1080ggtcccgttg cagttatcgc gcgcacagct taa 111371113DNARhodococcus sp. 
7atgcaagcgc aagtttttcg agttccaatg agtaatccag ccgatgttag tggcgtagcc 60aagctcatcg atgagggagt gatccgtgcc gaagaggtcg tctgcgttct cggcaagacc 120gaaggcaacg gctgtgtcaa tgacttcacg cgtggctaca ccaccctcgc gttcaaggtc 180tacttctccg agaaactggg cgtgtcccgg caagaggtcg gcgagcgcat cgctttcatc 240atgtccggcg gtaccgaagg cgtcatggcg cctcactgca ccatcttcac cgtgcagaag 300acggacaaca agcagaagac cgccgctgaa ggcaagcgac ttgccgttca gcagatcttt 360acccgcgagt tcctgccgga ggagatcggc cgcatgccgc aggtcacgga aacagccgac 420gctgttcgcc gcgccatgcg cgaagccggc atcgcggatg catccgatgt ccacttcgtt 480caggtcaagt gcccactgct cactgccggc cgcatgcatg acgctgtcga gcgcgggcat 540acggttgcca ccgaagatac ctatgagtcc atgggctact cccgcggcgc atccgcgctt 600ggtatcgccc tggccctcgg ggaagtcgag aaggccaacc tcagtgatga agttattacc 660gcagactaca gtctctactc ctcggttgcc tcaacttcgg cgggtatcga gttgatgaac 720aacgagatca tcgtcatggg caacagccgc gcatggggtg gtgacctcgt catcggccac 780gccgagatga aggacgccat cgacggtgca gcggtccggc aggccctgcg cgacgtcggg 840tgctgcgaga acgacctgcc gaccgtcgac gagctcggcc gcgtggtcaa tgtatttgcc 900aaggctgaag cctccccgga cggtgaggtt cgtaaccgcc gccacacgat gctggacgat 960tcggacatta acagcacgcg ccatgcgcga gcggtcgtca atgcagttat cgcttcgatc 1020gtgggagatc ccatggttta tgtctccggc ggctccgagc atcagggccc cgccggtggc 1080ggtcccgttg cagttatcgc gcgcacagct taa 111381320DNAEscherichia coli 8atgatgtcag gagaacacac gttaaaagcg gtacgaggca gttttattga tgtcacccgt 60acgatcgata acccggaaga gattgcctct gcgctgcggt ttattgagga tggtttatta 120ctcattaaac agggaaaagt ggaatggttt ggcgaatggg aaaacggaaa gcatcaaatt 180cctgacacca ttcgcgtgcg cgactatcgc ggcaaactga tagtaccggg ctttgtcgat 240acacatatcc attatccgca aagtgaaatg gtgggggcct atggtgagca attgctggag 300tggttgaata aacacacctt ccctactgaa cgtcgttatg aggatttaga gtacgcccgc 360gaaatgtcgg cgttcttcat caagcagctt ttacgtaacg gaaccaccac ggcgctggtg 420tttggcactg ttcatccgca atctgttgat gcgctgtttg aagccgccag tcatatcaat 480atgcgtatga ttgccggtaa ggtgatgatg gaccgcaacg caccggatta tctgctcgac 540actgccgaaa gcagctatca ccaaagcaaa gaactgatcg aacgctggca caaaaatggt 
600cgtctgctat atgcgattac gccacgcttc gccccgacct catctcctga acagatggcg 660atggcgcaac gcctgaaaga agaatatccg gatacgtggg tacataccca tctctgtgaa 720aacaaagatg aaattgcctg ggtgaaatcg ctttatcctg accatgatgg ttatctggat 780gtttaccatc agtacggcct gaccggtaaa aactgtgtct ttgctcactg cgtccatctc 840gaagaaaaag agtgggatcg tctcagcgaa accaaatcca gcattgcttt ctgtccgacc 900tccaaccttt acctcggcag cggcttattc aacttgaaaa aagcatggca gaagaaagtt 960aaagtgggca tgggaacgga tatcggtgcc ggaaccactt tcaacatgct gcaaacgctg 1020aacgaagcct acaaagtatt gcaattacaa ggctatcgcc tctcggcata tgaagcgttt 1080tacctggcca cgctcggcgg agcgaaatct ctgggccttg acgatttgat tggcaacttt 1140ttacctggca aagaggctga tttcgtggtg atggaaccca ccgccactcc gctacagcag 1200ctgcgctatg acaactctgt ttctttagtc gacaaattgt tcgtgatgat gacgttgggc 1260gatgaccgtt cgatctaccg cacctacgtt gatggtcgtc tggtgtacga acgcaactaa 132091398DNABradyrhizobium japonicum 9atgaccaccg tcggtattcg cggcacgttc ttcgatttcg tcgacgatcc ctggaagcac 60atcggcaacg agcaggcggc tgcgcgcttt catcaggacg gcctcatggt cgtcaccgac 120ggcgtcatca aggcgttcgg tccgtacgag aagatcgccg ccgcgcatcc gggcgttgag 180atcacccata tcaaggaccg catcatcgtc ccgggcttca tcgacggcca catccatctg 240cctcagaccc gcgtgctcgg tgcctatggc gagcagctct tgccgtggct gcagaagtcg 300atctatcccg aggagatcaa gtacaaggat cgcaactacg cgcgcgaagg cgtgaagcgt 360tttctcgatg cactgctcgc cgccggcacc accacctgcc aggccttcac cagctcctca 420ccggtcgcga ccgaagagct gttcgaggag gcaagcaggc gcaacatgcg cgtgatcgcg 480ggtctcaccg ggatcgaccg caacgcgccg gccgaattca tcgatacgcc cgagaatttc 540tatcgcgaca gcaagcggct gatcgcgcag tatcacgaca agggccgtaa cctctacgct 600atcacgccgc gcttcgcctt cggcgcctcg cccgagctgc tgaaggcgtg tcagcgcctc 660aagcacgagc atccggactg ctgggtcaat acccacatct ccgagaaccc ggccgaatgc 720agcggcgtgc tggtcgagca cccggactgc caggattatc tcggcgtcta cgagaagttc 780gacctggtcg gcccaaagtt ctccggcggc cacggcgtct atctctcgaa caacgaattc 840cgccgcatgt ccaagaaagg cgcggcggta gtgttctgcc cgtgctcgaa cctgttcctc 900ggcagcggcc tgttccgtct cggccgcgcc accgatccgg agcatcgcgt gaagatgtcg 960ttcggcaccg atgtcggcgg 
cggcaaccgc ttctcgatga tctccgtgct cgacgacgct 1020tacaaggtcg gcatgtgcaa caacacgctg ctcgacggca gcatcgatcc gtcgcgcaag 1080gacctcgcgg aagccgagcg caacaagctc tcgccctatc gtggcttctg gtcggtcacg 1140ctcggcggcg ccgaaggcct ctacatcgac gacaagctcg gcaatttcga gcccggcaag 1200gaggccgatt tcgtcgcgct cgatccgaac ggcggacaac tggcgcaacc ctggcaccag 1260tcgctgattg ccgacggtgc aggtccgcgc acggttgatg aggccgcgag catgctgttc 1320gccgtcatga tggtcggcga cgatcgctgc gtcgacgaga cctgggtgat gggcaagcgc 1380ctctacaaga agagctga 1398101470DNASaccharomyces cerevisiae 10atgacaaaaa gtgatttatt atttgataaa ttcaacgaca aacatggaaa gtttctagtt 60ttttttggta cctttgtaga tacccctaaa ttaggagagc tgagaatcag agagaaaaca 120tctgttggag ttctcaacgg aatcatcagg tttgtgaaca gaaattcact cgatcctgtc 180aaagattgtt tagatcacga tagtagctta tcaccagagg atgtcacggt ggttgacata 240attggaaaag acaagactcg aaataacagc ttttattttc caggttttgt tgacacgcat 300aaccatgtct cgcaatatcc aaatgtcggc gtatttggga attctaccct gctggattgg 360ctagagaagt ataccttccc catagaagcc gcactagcaa acgaaaatat tgcgagagaa 420gtttacaata aggtaataag taagacgctt tctcacggta caacgactgt ggcttactat 480aataccattg atctcaagtc cactaagctc ttggctcaac taagctcctt attggggcag 540cgtgttcttg ttggaaaagt gtgcatggat accaatggtc ccgagtatta tattgaagat 600actaaaactt cctttgaaag cactgtgaaa gttgttaagt acatacggga aaccatttgt 660gatcccctcg taaatcctat agtgacacca aggttcgcgc cctcttgttc tagagaacta 720atgcaacagt tgtccaagct agtcaaggat gaaaacatac acgttcaaac ccacttgtcg 780gaaaataagg aggagataca gtgggttcaa gatttatttc ccgaatgtga gagctatact 840gatgtatacg acaaatatgg gctgctcaca gaaaaaacag tattggcaca ttgtattcat 900ctaacagatg ccgaagcgcg tgtgattaaa cagcgtcgct gtggtatatc tcattgtccc 960atttccaact cctctctgac ttctggagag tgtagggttc gatggttgct ggaccagggc 1020ataaaggttg gtctaggcac cgacgtttca gccggtcatt cttgtagcat actcaccacc 1080ggaaggcagg cctttgcagt ttcaaggcat ttggcaatga gagaaactga tcatgcaaaa 1140ctttcagtct ccgagtgcct atttcttgct acaatgggcg gagcacaagt cttgcgtatg 1200gatgagacct tggggacttt tgacgtcggt aagcagtttg acgctcaaat gatcgatacc 1260aatgctcccg 
gctcaaacgt ggatatgttt cattggcagc taaaggagaa ggatcaaatg 1320caagagcaag agcaagagca agggcaagac ccttataaga acccaccgct gcttactaat 1380gaagacataa tcgcaaaatg gttctttaac ggtgatgatc gcaacaccac taaagtttgg 1440gtagccggcc agcaagtcta ccagatttag 1470111356DNAYarrowia lipolytica 11atgactgctt caaacaccac agtttttttc ggagccatcg tcaatcccgc cagaagagca 60cttgaatacc tgccccaagc tgctatcggt gtcagggaag gggaaatcgt ctttttcgac 120agacatgctg aatcggcttc ggcgtctgct gccacccaca acattaagaa cttcgacacg 180gtggacttgt cgaaaaccac ctcgttcctt ttccccggtt tcatcgacac tcacattcat 240gcgccccagt accccaacag cggtattttc ggcaagacca cactgctaga ctggctgact 300acctacacct ttcccctgga gtcgtctctc aaggacccca aaatcgccca ggacgtgtac 360tccagggtag tcaagaagac tctcgccaac ggaactacaa cggctgctta ctacgccact 420gtccacgtgg agtccacaaa gaaactggct gacatttgtc tgtctcaagg tcagagagca 480cttgtgggaa gagtgtgcat ggaccaaaac actcctgatt actacagaga tgcaagcgtg 540gaggaggcca agaagagcga ccgggaagtt gttgagtata ttcagtctct taacaaaccc 600gatcgcatcc tccccatcat cacaccccgt tttgcgccct cttgcactgg tgaaatcatg 660tcctggcagg gagactatgc ccagaagaac aacctgcaca tccagactca catttctgaa 720aacaagggcg agattgcctg ggtcaaggag ctgtaccctg cttgcaaatc gtatgcagac 780acataccacc agcatggact gctgacagaa aagacgcttc tggcccatgc catctatctg 840accgacgaag aactcaacct ggtggagcag caaaagtgtg gactttccca ttgccccatt 900tccaactcgt cgctgacatc aggcgagttc catgctcgaa aaattctcga caggaacatt 960ccctttggtc tgggaaccga tgtttctgga ggttacgctc cttccattct cagcacagcc 1020agacacggtc ttctggtgtc tcgtcacgtg gccatgaagt ccgaaaacga cgccgacaag 1080ctgtctgtgg atgaggtact gtacttggcc actctgggtg gcgccgaggc tctcaaactg 1140gactcaaaga ttggttcttt cgaggtgggc aagaagttcg acgcccagca gattgatctc 1200gagactaacg gttctcctgt tgacattttt gactgggaat tgcctatttc cgagggaaac 1260aagctcgaga acctggtgca caagtggttg tttaatggag acgaccgaaa cacttctact 1320gtctgggtca acggagacaa ggtggtgacc aagtag 1356121437DNAWilliamsia sp. 
12atgaccagaa tcgcaatcac cggcggacga gtcctgacca tggaccccga gcgccgcgtg 60ctcgaaccag gaacggttgt ggtcgaggac cagttcatcg cacaagtggg atccccggac 120gacgtcgaca tccgcggcgc cgaaatcatc gacgccaccg ggatggcagt gctccccggc 180ttcgtcaaca cccacaccca cgtcccacaa atcctcctca ggggtggtgc atcccatgac 240cgcaacctcc tcgaatggct gcacaacgtg ctctatcccg gcctcgctgc ctacacagac 300gacgacatcc gagtcggaac actgctgtac tgcgccgaag cccttcgttc tggcatcacc 360actgtcgtcg acaacgagga cgtccgaccc aacgacttcg cccgcgccgg ggccgccggg 420atcggcgcct tcaccgacgc aggaatccga gccatttacg cgcgcatgta cttcgacgcg 480ccacgcgccg aactcgaaga actcgtcgcc accatccacg ccaaggcccc cggcgccgtg 540cgcatggacg aatcagccag caccgaccac gtactggcag acctagacca actcatcacc 600cgccacgacc gcacagcaga tggccgcatc agggtgtggc ccgcacccgc catccccttc 660atggtcagtg aaaaaggaat gaaggcagcg caagagatcg cagcgagccg caccgacggc 720tggaccatgc acgtcagcga ggatcccatc gaggcccgag tgcactccat gaacgccccg 780gaatatttac accacctcgg ctgcctcgac gaccgactcc ttgccgcgca ctgcgtgcat 840atcgacagcc gagacatccg cctgttccgc cagcacgacg taaaaatttc tacccaacca 900gtatcgaaca gctacctggc ggccggaatt gcaccggtcc ccgaaatgct cgcccacggc 960gtgaccgtgg gcatcggtac cgacgacgcc aactgcaacg acagcgtgaa cctcatctcg 1020gacatgaaag tgctagcgct cattcaccga gctgcacatc gagatgcctc aatcatcaca 1080cctgaaaaaa tcatcgaaat ggccaccatc gacggagccc gctgcatcgg tatggccgat 1140cagattggtt ccctcgaggc gggtaaacgc gccgacatca tcaccctcga ccttcgtcac 1200gcccaaacaa ccccagcgca cgacttggcg gccaccatcg tctttcaggc ctacggcaac 1260gaggtcaacg acgtcctcgt caatggctcg gtagtgatgc gcgatcgagt actttctttt 1320ctgccgactc cccaagaaga aaaagcgctc tacgacgatg cgtcggagcg atcggctgca 1380atgctcgcac gggccggcct caccggcaca cgcacatggc aaacactggg atcgtag 1437131425DNAPseudomonas sp. 
13atgcaaacgc tcagcatcca gcacggtacc ctcgtcacga tggatcagta ccgcagagtc 60cttggggata gctgggttca cgtgcaggat ggacggatcg tcgcgctcgg agtgcacgcc 120gagtcggtgc ctccgccagc ggatcgggtg atcgatgcac gcggcaaggt cgtgttaccc 180ggtttcatca atgcccacac ccatgtgaac cagatcctcc tgcgcggagg gccctcgcac 240gggcgtcaac tctatgactg gctgttcaac gttttgtatc cgggacaaaa ggcgatgaga 300ccggaggacg tagcggtggc ggtgaggttg tattgtgcgg aagctgtgcg cagcgggatt 360acgacgatca acgacaacgc cgattcggcc atctacccag gcaacatcga ggccgcgatg 420gcggtctatg gtgaggtggg tgtgagggtc gtctacgccc gcatgttctt tgatcggatg 480gacgggcgca ttcaagggta tgtggacgcc ttgaaggctc gctctcccca agtcgaactg 540tgctcgatca tggaggaaac ggctgtggcc aaagatcgga tcacagccct gtcagatcag 600tatcatggca cggcaggagg tcgtatatca gtttggcccg ctcctgccat taccccggcg 660gtgacagttg aaggaatgcg atgggcacaa gccttcgccc gtgatcgggc ggtaatgtgg 720acgcttcaca tggcggagag cgatcatgat gagcggcttc attggatgag tcccgccgag 780tacatggagt gttacggact cttggatgag cgtctgcagg tcgcgcattg cgtgtacttt 840gaccggaagg atgttcggct gctgcaccgc cacaatgtga aggtcgcgtc gcaggttgtg 900agcaatgcct acctcggctc aggggtggcc cccgtgccag agatggtgga gcgcggcatg 960gccgtgggca ttggaacaga tgacgggaat tgtaatgact ccgtaaacat gatcggagac 1020atgaagttta tggcccatat tcaccgcgcg gtgcatcggg atgcggacgt gctgacccca 1080gagaagattc ttgaaatggc gacgatcgat ggggcgcgtt cgttgggaat ggaccacgag 1140attggttcca tcgaaaccgg caagcgcgcg gaccttatcc tgcttgacct gcgtcaccct 1200cagacgactc ctcaccatca tttggcggcc acgatcgtgt ttcaggctta cggcaatgag 1260gtggacactg tcctgattga cggaaacgtt gtgatggaga accgccgctt gagctttctt 1320ccccctgaac gtgagttggc gttccttgag gaagcgcaga gccgcgccac agctattttg 1380cagcgggcga acatggtggc taacccagct tggcgcagcc tctag 1425141212DNAPseudomonas sp. 
14atgagtaaag attttgattt aatcattaga aacgcctatc taagtgaaaa agacagtgta 60tatgatattg ggattgttgg tgacagaata atcaaaatag aagctaaaat tgaaggaacc 120gtaaaagacg aaattgatgc aaagggtaac cttgtgtctc ccggatttgt cgatgcacat 180acccatatgg ataagtcatt tacgagcaca ggtgaaagat taccgaagtt ttggagcaga 240ccttatacaa gggatgctgc catcgaggat ggcttgaaat attataaaaa tgctacccac 300gaagaaataa aaagacatgt gatagaacat gctcacatgc aggtactcca tgggacttta 360tacacccgga cccatgtaga tgtagattca gttgctaaaa caaaagcagt ggaagcagtt 420ttagaagcca aggaagagtt aaaggatctt atcgatatac aagtcgtagc ctttgcacag 480agtggatttt tcgttgattt ggaatctgaa tcattgatta gaaaatcctt ggatatgggc 540tgtgatttag ttgggggagt tgatcctgct acgcgggaaa ataatgttga gggttcttta 600gacctatgct ttaaattagc aaaggaatac gatgttgata tcgactatca catacatgat 660attggaactg ttggagtata ttcgataaat cgtcttgccc aaaagacaat tgaaaatggg 720tataagggta gagtaactac gagtcatgcc tggtgttttg cagatgctcc gtccgaatgg 780ctcgatgagg caatcccatt gtacaaggat tcgggtatga aatttgttac ctgttttagt 840agtacaccgc ctactatgcc ggtgataaag ctgcttgaag ctggcatcaa tcttggctgt 900gcttcggaca atatcagaga tttttgggtt ccctttggca acggtgatat ggtacaaggg 960gctctgatcg aaactcagag attagagtta aagacaaaca gagatttggg actaatttgg 1020aaaatgataa cgtcagaggg tgctagagtt ttaggaattg aaaagaacta tgggatagaa 1080gttggtaaaa aggccgatct tgttgtatta aattcgttgt caccacaatg ggcaataatc 1140gaccaagcaa aaagactatg cgtaattaaa aatggacgta tcattgtgaa ggatgaggtt 1200atagttgcct aa 121215735DNAMyrothecium verrucaria 15atgtcttctt cagaagtcaa agccaacgga tggactgccg ttccagtcag cgcaaaggcc 60attgttgact ccctgggaaa gcttggtgat gtctcctcat attctgtgga agatatcgcg 120ttccctgcgg cagacaaact tgttgccgag gcacaggcct ttgtgaaggc ccgattgagt 180cccgaaacct acaatcactc catgcgcgtt ttctactggg gaaccgtcat cgcgagacgt 240ttacttcccg agcaagctaa agacttgtct ccaagtacat gggcactgac atgtcttctg 300catgacgttg gtactgcgga ggcatacttt acatctacac gaatgtcctt cgatatttac 360ggtggcatta aggctatgga ggtgctcaag gtccttggga gtagcaccga ccaggctgag 420gctgttgccg aggccatcat tcgtcatgag gatgtggggg tagatggcaa catcacattc 480ctcggtcagt 
tgatccagct ggctacgctt tatgacaatg tcggggccta cgatgggatt 540gatgattttg gtagctgggt tgatgacacc acacgcaaca gtatcaacac ggcattccca 600cgacatggtt ggtgttcttg gtttgcctgc acggttcgta aggaagaaag taacaagcct 660tggtgccaca caacgcatat ccctcagttc gataaacaga tggaagcgaa cactttgatg 720aagccttggg agtaa 735164268DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 16ttatcgatga taagctgtca aagatgagaa ttaattccac ggactataga ctatactaga 60tactccgtct actgtacgat acacttccgc tcaggtcctt gtcctttaac gaggccttac 120cactcttttg ttactctatt gatccagctc agcaaaggca gtgtgatcta agattctatc 180ttcgcgatgt agtaaaacta gctagaccga gaaagagact agaaatgcaa aaggcacttc 240tacaatggct gccatcatta ttatccgatg tgacgctgca gcttctcaat gatattcgaa 300tacgctttga ggagatacag cctaatatcc gacaaactgt tttacagatt tacgatcgta 360cttgttaccc atcattgaat tttgaacatc cgaacctggg agttttccct gaaacagata 420gtatatttga acctgtataa taatatatag tctagcgctt tacggaagac aatgtatgta 480tttcggttcc tggagaaact attgcatcta ttgcataggt aatcttgcac gtcgcatccc 540cggttcattt tctgcgtttc catcttgcac ttcaatagca tatctttgtt aacgaagcat 600ctgtgcttca ttttgtagaa caaaaatgca acgcgagagc gctaattttt caaacaaaga 660atctgagctg catttttaca gaacagaaat gcaacgcgaa agcgctattt taccaacgaa 720gaatctgtgc ttcatttttg taaaacaaaa atgcaacgcg acgagagcgc taatttttca 780aacaaagaat ctgagctgca tttttacaga acagaaatgc aacgcgagag cgctatttta 840ccaacaaaga atctatactt cttttttgtt ctacaaaaat gcatcccgag agcgctattt 900ttctaacaaa gcatcttaga ttactttttt tctcctttgt gcgctctata atgcagtctc 960ttgataactt tttgcactgt aggtccgtta aggttagaag aaggctactt tggtgtctat 1020tttctcttcc ataaaaaaag cctgactcca cttcccgcgt ttactgatta ctagcgaagc 1080tgcgggtgca ttttttcaag ataaaggcat ccccgattat attctatacc gatgtggatt 1140gcgcatactt tgtgaacaga aagtgatagc gttgatgatt cttcattggt cagaaaatta 1200tgaacggttt cttctatttt gtctctatat actacgtata ggaaatgttt acattttcgt 1260attgttttcg attcactcta tgaatagttc ttactacaat ttttttgtct aaagagtaat 1320actagagata aacataaaaa atgtagaggt
cgagtttaga tgcaagttca aggagcgaaa 1380ggtggatggg taggttatat agggatatag cacagagata tatagcaaag agatactttt 1440gagcaatgtt tgtggaagcg gtattcgcaa tttaattaag tttaaacggc gcgcctttcc 1500ataggctccg cccccctgac gagcatcaca aaaatcgacg ctcaagtcag aggtggcgaa 1560acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc gtgcgctctc 1620ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg ggaagcgtgg 1680cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc 1740tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc ggtaactatc 1800gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc actggtaaca 1860ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg tggcctaact 1920acggctacac tagaagaaca gtatttggta tctgcgctct gctgaagcca gttaccttcg 1980gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc ggtggttttt 2040ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat cctttgatct 2100tttctacggg gtctgacgct cagtggaacg aaaactcacg ttaagggatt ttggtcatga 2160gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt tttaaatcaa 2220tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc agtgaggcac 2280ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc gtcgtgtaga 2340taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata ccgcgagacc 2400cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg gccgagcgca 2460gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc cgggaagcta 2520gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct acaggcatcg 2580tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa cgatcaaggc 2640gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt cctccgatcg 2700ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca ctgcataatt 2760ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac tcaaccaagt 2820cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca atacgggata 2880ataccgcgcc acatagcaga actttaaaag tgctcatcat tggaaaacgt tcttcggggc 2940gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc actcgtgcac 3000ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca aaaacaggaa 
3060ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata ctcatactct 3120tcctttttca atattattga agcatttatc agggttattg tctcatgagc ggatacatat 3180ttgaatgtat ttagaaaaat aaacagcgat cgcgcggccg cgggtaataa ctgatataat 3240taaattgaag ctctaatttg tgagtttagt atacatgcat ttacttataa tacagttttt 3300tagttttgct ggccgcatct tctcaaatat gcttcccagc ctgcttttct gtaacgttca 3360ccctctacct tagcatccct tccctttgca aatagtcctc ttccaacaat aataatgtca 3420gatcctgtag agaccacatc atccacggtt ctatactgtt gacccaatgc gtctcccttg 3480tcatctaaac ccacaccggg tgtcataatc aaccaatcgt aaccttcatc tcttccaccc 3540atgtctcttt gagcaataaa gccgataaca aaatctttgt cgctcttcgc aatgtcaaca 3600gtacccttag tatattctcc agtagctagg gagcccttgc atgacaattc tgctaacatc 3660aaaaggcctc taggttcctt tgttacttct tccgccgcct gcttcaaacc gctaacaata 3720cctgggccca ccacaccgtg tgcattcgta atgtctgccc attctgctat tctgtataca 3780cccgcagagt actgcaattt gactgtatta ccaatgtcag caaattttct gtcttcgaag 3840agtaaaaaat tgtacttggc ggataatgcc tttagcggct taactgtgcc ctccatggaa 3900aaatcagtca agatatccac atgtgttttt agtaaacaaa ttttgggacc taatgcttca 3960actaactcca gtaattcctt ggtggtacga acatccaatg aagcacacaa gtttgtttgc 4020ttttcgtgca tgatattaaa tagcttggca gcaacaggac taggatgagt agcagcacgt 4080tccttatatg tagctttcga catgatttat cttcgtttcc tgcaggtttt tgttctgtgc 4140agttgggtta agaatactgg gcaatttcat gtttcttcaa caccacatat gcgtatatat 4200accaatctaa gtctgtgctc cttccttcgt tcttccttct gctcggagat taccgaatca 4260aagctagc 4268176706DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 17gatacttttg agcaatgttt gtggaagcgg tattcgcaat tataaacggt attttcacaa 60ttgcacccca gccagaccga tagccggtcg caatccgcca cccacaaccg tctacctccc 120acagaacccc gtcacttcca cccttttcca ccagatcata tgtcccaact tgccaaatta 180aaaccgtgcg aattttcaaa ataaactttg gcaaagaggc tgcaaaggag gggctggtga 240gggcgtctgg aagtcgacca gagaccgggt tggcggcgca tttgtgtccc aaaaaacagc 300cccaattgcc ccaattgacc ccaaattgac ccagtagcgg gcccaacccc ggcgagagcc 360cccttctccc cacatatcaa acctcccccg gttcccacac ttgccgttaa gggcgtaggg 420tactgcagtc 
tggaatctac gcttgttcag actttgtact agtttctttg tctggccatc 480cgggtaaccc atgccggacg caaaatagac tactgaaaat ttttttgctt tgtggttggg 540actttagcca agggtataaa agaccaccgt ccccgaatta cctttcctct tcttttctct 600ctctccttgt caactcacac ccgaaatcgt taagcatttc cttctgagta taagaatcat 660tcaaaatgac tagaatcgct atcacaggtg gtagagtttt gactatggac ccagaaagaa 720gagtattaga accaggtaca gttgttgttg aagatcaatt cattgcacaa gtcggttcac 780cagatgacgt agacatcaga ggtgctgaaa ttatagatgc cactggtatg gctgtattac 840caggtttcgt taatacacat acccacgttc ctcaaatttt gttaagaggt ggtgcttcac 900atgatagaaa tttgttggaa tggttgcaca acgtcttata tccaggtttg gctgcataca 960ctgatgacga tatcagagtt ggtacattgt tatattgtgc tgaagcattg agatccggta 1020ttactacagt tgtcgacaat gaagatgtta gacctaacga ttttgccaga gctggtgccg 1080ctggtattgg tgcattcact gatgccggta tcagagcaat ctatgccaga atgtactttg 1140atgctccaag agcagaattg gaagaattag tcgcaacaat acatgcaaaa gcccctggtg 1200ccgtaagaat ggacgaatct gcttcaaccg atcatgtttt ggcagactta gatcaattga 1260ttaccagaca tgacagaact gctgatggta gaattagagt atggccagct cctgcaatac 1320cattcatggt ttctgaaaag ggtatgaagg cagcccaaga aatagctgca tccagaactg 1380acggttggac aatgcatgtt agtgaagatc caatcgaagc cagagtccac tctatgaatg 1440ctcctgaata tttgcatcac ttgggttgtt tagacgatag attgttagcc gctcattgcg 1500ttcacataga ctcaagagat atcagattgt ttagacaaca tgatgttaag atatccacac 1560aacctgtctc caatagttac ttagcagccg gtatagcacc agttcctgaa atgttggctc 1620atggtgtcac agtaggtatt ggtaccgacg atgctaattg taacgactcc gtaaacttaa 1680tcagtgatat gaaggttttg gcattgatac atagagctgc acacagagat gctagtatca 1740ttaccccaga aaagataatc gaaatggcca ctattgacgg tgctagatgc attggtatgg 1800ctgatcaaat cggttctttg gaagctggta aaagagcaga cataatcact ttggatttga 1860gacatgcaca aaccactcct gcccacgatt tggccgctac aattgtcttt caagcttatg 1920gtaatgaagt aaacgatgtt ttggtcaacg gttctgtagt tatgagagat agagttttgt 1980cattcttacc aacccctcaa gaagaaaagg ctttatacga cgatgcatct gaaagatcag 2040cagccatgtt agccagagct ggtttgactg gtacaagaac ctggcaaact ttgggttctt 2100aagctgcttg tacctagtgc aaccccagtt tgttaaaaat tagtagtcaa 
aaacttctga 2160gttagaaatt tgtgagtgta gtgagattgt agagtatcat gtgtgtccgt aagtgaagtg 2220ttattgactc ttagttagtt tatctagtac tcgtttagtt gacactgatc tagtatttta 2280cgaggcgtat gactttagcc aagtgttgta cttagtcttc tctccaaaca tgagagggct 2340ctgtcactca gtcggcctat gggtgagatg gcttggtgag atctttcgat agtctcgtca 2400agatggtagg atgatggggg aatacattac tgctctcgtc aaggaaacca caatcagatc 2460acaccatcct ccatggtatc cgatgactct cttctccaca gttttccata ggctccgccc 2520ccctgacgag catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact 2580ataaagatac caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct 2640gccgcttacc ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag 2700ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca 2760cgaacccccc gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa 2820cccggtaaga cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc 2880gaggtatgta ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag 2940aagaacagta tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg 3000tagctcttga tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca 3060gcagattacg cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc 3120tgacgctcag tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag 3180gatcttcacc tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata 3240tgagtaaact tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat 3300ctgtctattt cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg 3360ggagggctta ccatctggcc ccagtgctgc aatgataccg cgagacccac gctcaccggc 3420tccagattta tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc 3480aactttatcc gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc 3540gccagttaat agtttgcgca acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc 3600gtcgtttggt atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc 3660ccccatgttg tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa 3720gttggccgca gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat 3780gccatccgta agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata 3840gtgtatgcgg cgaccgagtt 
gctcttgccc ggcgtcaata cgggataata ccgcgccaca 3900tagcagaact ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag 3960gatcttaccg ctgttgagat ccagttcgat gtaacccact cgtgcaccca actgatcttc 4020agcatctttt actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc 4080aaaaaaggga ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata 4140ttattgaagc atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta 4200gaaaaataaa cagcgatcgc gcggccgcgg gtaataactg atataattaa attgaagctc 4260taatttgtga gtttagtata catgcattta cttataatac agttttttag ttttgctggc 4320cgcatcttct caaatatgct tcccagcctg cttttctgta acgttcaccc tctaccttag 4380catcccttcc ctttgcaaat agtcctcttc caacaataat aatgtcagat cctgtagaga 4440ccacatcatc cacggttcta tactgttgac ccaatgcgtc tcccttgtca tctaaaccca 4500caccgggtgt cataatcaac caatcgtaac cttcatctct tccacccatg tctctttgag 4560caataaagcc gataacaaaa tctttgtcgc tcttcgcaat gtcaacagta cccttagtat 4620attctccagt agctagggag cccttgcatg acaattctgc taacatcaaa aggcctctag 4680gttcctttgt tacttcttcc gccgcctgct tcaaaccgct aacaatacct gggcccacca 4740caccgtgtgc attcgtaatg tctgcccatt ctgctattct gtatacaccc gcagagtact 4800gcaatttgac tgtattacca atgtcagcaa attttctgtc ttcgaagagt aaaaaattgt 4860acttggcgga taatgccttt agcggcttaa ctgtgccctc catggaaaaa tcagtcaaga 4920tatccacatg tgtttttagt aaacaaattt tgggacctaa tgcttcaact aactccagta 4980attccttggt ggtacgaaca tccaatgaag cacacaagtt tgtttgcttt tcgtgcatga 5040tattaaatag cttggcagca acaggactag gatgagtagc agcacgttcc ttatatgtag 5100ctttcgacat gatttatctt cgtttcctgc aggtttttgt tctgtgcagt tgggttaaga 5160atactgggca atttcatgtt tcttcaacac cacatatgcg tatatatacc aatctaagtc 5220tgtgctcctt ccttcgttct tccttctgct cggagattac cgaatcaaag ctagcttatc 5280gatgataagc tgtcaaagat gagaattaat tccacggact atagactata ctagatactc 5340cgtctactgt acgatacact tccgctcagg tccttgtcct ttaacgaggc cttaccactc 5400ttttgttact ctattgatcc agctcagcaa aggcagtgtg atctaagatt ctatcttcgc 5460gatgtagtaa aactagctag accgagaaag agactagaaa tgcaaaaggc acttctacaa 5520tggctgccat cattattatc cgatgtgacg ctgcagcttc tcaatgatat 
tcgaatacgc 5580tttgaggaga tacagcctaa tatccgacaa actgttttac agatttacga tcgtacttgt 5640tacccatcat tgaattttga acatccgaac ctgggagttt tccctgaaac agatagtata 5700tttgaacctg tataataata tatagtctag cgctttacgg aagacaatgt atgtatttcg 5760gttcctggag aaactattgc atctattgca taggtaatct tgcacgtcgc atccccggtt 5820cattttctgc gtttccatct tgcacttcaa tagcatatct ttgttaacga agcatctgtg 5880cttcattttg tagaacaaaa atgcaacgcg agagcgctaa tttttcaaac aaagaatctg 5940agctgcattt ttacagaaca gaaatgcaac gcgaaagcgc tattttacca acgaagaatc 6000tgtgcttcat ttttgtaaaa caaaaatgca acgcgacgag agcgctaatt tttcaaacaa 6060agaatctgag ctgcattttt acagaacaga aatgcaacgc gagagcgcta ttttaccaac 6120aaagaatcta tacttctttt ttgttctaca aaaatgcatc ccgagagcgc tatttttcta 6180acaaagcatc ttagattact ttttttctcc tttgtgcgct ctataatgca gtctcttgat 6240aactttttgc actgtaggtc cgttaaggtt agaagaaggc tactttggtg tctattttct 6300cttccataaa aaaagcctga ctccacttcc cgcgtttact gattactagc gaagctgcgg 6360gtgcattttt tcaagataaa ggcatccccg attatattct ataccgatgt ggattgcgca 6420tactttgtga acagaaagtg atagcgttga tgattcttca ttggtcagaa aattatgaac 6480ggtttcttct attttgtctc tatatactac gtataggaaa tgtttacatt ttcgtattgt 6540tttcgattca ctctatgaat agttcttact acaatttttt tgtctaaaga gtaatactag 6600agataaacat aaaaaatgta gaggtcgagt ttagatgcaa gttcaaggag cgaaaggtgg 6660atgggtaggt tatataggga tatagcacag agatatatag caaaga 6706188336DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 18gtttgtggaa gcggtattcg caatttaatt aaagctggtg acaattaatc atcggctcgt 60ataatgtgtg gaattgaatc gatataagga ggttaatcat atgactagaa tcgctatcac 120aggtggtaga gttttgacta tggacccaga aagaagagta ttagaaccag gtacagttgt 180tgttgaagat caattcattg cacaagtcgg ttcaccagat gacgtagaca tcagaggtgc 240tgaaattata gatgccactg gtatggctgt attaccaggt ttcgttaata cacataccca 300cgttcctcaa attttgttaa gaggtggtgc ttcacatgat agaaatttgt tggaatggtt 360gcacaacgtc ttatatccag gtttggctgc atacactgat gacgatatca gagttggtac 420attgttatat tgtgctgaag cattgagatc cggtattact acagttgtcg acaatgaaga 480tgttagacct aacgattttg 
ccagagctgg tgccgctggt attggtgcat tcactgatgc 540cggtatcaga gcaatctatg ccagaatgta ctttgatgct ccaagagcag aattggaaga 600attagtcgca acaatacatg caaaagcccc tggtgccgta agaatggacg aatctgcttc 660aaccgatcat gttttggcag acttagatca attgattacc agacatgaca gaactgctga 720tggtagaatt agagtatggc cagctcctgc aataccattc atggtttctg aaaagggtat 780gaaggcagcc caagaaatag ctgcatccag aactgacggt tggacaatgc atgttagtga 840agatccaatc gaagccagag tccactctat gaatgctcct gaatatttgc atcacttggg 900ttgtttagac gatagattgt tagccgctca ttgcgttcac atagactcaa gagatatcag 960attgtttaga caacatgatg ttaagatatc cacacaacct gtctccaata gttacttagc 1020agccggtata gcaccagttc ctgaaatgtt ggctcatggt gtcacagtag gtattggtac 1080cgacgatgct aattgtaacg actccgtaaa cttaatcagt gatatgaagg ttttggcatt 1140gatacataga gctgcacaca gagatgctag tatcattacc ccagaaaaga taatcgaaat 1200ggccactatt gacggtgcta gatgcattgg tatggctgat caaatcggtt ctttggaagc 1260tggtaaaaga gcagacataa tcactttgga tttgagacat gcacaaacca ctcctgccca 1320cgatttggcc gctacaattg tctttcaagc ttatggtaat gaagtaaacg atgttttggt 1380caacggttct gtagttatga gagatagagt tttgtcattc ttaccaaccc ctcaagaaga 1440aaaggcttta tacgacgatg catctgaaag atcagcagcc atgttagcca gagctggttt 1500gactggtaca agaacctggc aaactttggg ttcttaagga aatccattat gatgtcagga 1560gaacacacgt taaaagcggt acgaggcagt tttattgatg tcacccgtac gatcgataac 1620ccggaagaga ttgcctctgc gctgcggttt attgaggatg gtttattact cattaaacag 1680ggaaaagtgg aatggtttgg cgaatgggaa aacggaaagc atcaaattcc tgacaccatt 1740cgcgtgcgcg actatcgcgg caaactgata gtaccgggct ttgtcgatac acatatccat 1800tatccgcaaa gtgaaatggt gggggcctat ggtgagcaat tgctggagtg gttgaataaa 1860cacaccttcc ctactgaacg tcgttatgag gatttagagt acgcccgcga aatgtcggcg 1920ttcttcatca agcagctttt acgtaacgga accaccacgg cgctggtgtt tggcactgtt 1980catccgcaat ctgttgatgc gctgtttgaa gccgccagtc atatcaatat gcgtatgatt 2040gccggtaagg tgatgatgga ccgcaacgca ccggattatc tgctcgacac tgccgaaagc 2100agctatcacc aaagcaaaga actgatcgaa cgctggcaca aaaatggtcg tctgctatat 2160gcgattacgc cacgcttcgc cccgacctca tctcctgaac agatggcgat ggcgcaacgc 
2220ctgaaagaag aatatccgga tacgtgggta catacccatc tctgtgaaaa caaagatgaa 2280attgcctggg tgaaatcgct ttatcctgac catgatggtt atctggatgt ttaccatcag 2340tacggcctga ccggtaaaaa ctgtgtcttt gctcactgcg tccatctcga agaaaaagag 2400tgggatcgtc tcagcgaaac caaatccagc attgctttct gtccgacctc caacctttac 2460ctcggcagcg gcttattcaa cttgaaaaaa gcatggcaga agaaagttaa agtgggcatg 2520ggaacggata tcggtgccgg aaccactttc aacatgctgc aaacgctgaa cgaagcctac 2580aaagtattgc aattacaagg ctatcgcctc tcggcatatg aagcgtttta cctggccacg 2640ctcggcggag cgaaatctct gggccttgac gatttgattg gcaacttttt acctggcaaa 2700gaggctgatt tcgtggtgat ggaacccacc gccactccgc tacagcagct gcgctatgac 2760aactctgttt ctttagtcga caaattgttc gtgatgatga cgttgggcga tgaccgttcg 2820atctaccgca cctacgttga tggtcgtctg gtgtacgaac gcaactaagg aacgaccatg 2880agagaagtcc aattgttaga tggtagaaga gttgatgtcg cctgtgctgg tcctttgatt 2940agtgaaatag gtgcccactt agatttgact gctccagttg aaattgattg tggtggtggt 3000ttagcaacta gaccttttac tgaacctcat ttgcacttag acaaagcagg tactgccgat 3060agattgcctg ccggtgcttc cacaatcggt gacgctattg ctgcaatgca aagtgtcaag 3120gtaaccgaaa gagataatgt cgccgctgta gcagccagaa tgcatagagt tttaaacaga 3180atcgtcgatg acggttccca cgctattaga gcattggttg atgtcgacga agtttggggt 3240ttaacagctt ttcatgctgc acaacaagtc caagccgctt tggccccaag agctgttgtc 3300caaattgtcg ctttcccaca acacggttta acccctcaag tattggcaat gttagaacaa 3360gcagccgctg aaggtgcagg tgccttgggt gctcatactg atgttgaccc agatcctgca 3420gcccacgttg gtgccgtcgc tgcaatagcc gctggtgctt ccttgccatt agaagttcat 3480actgacgaag gtgctagtcc agataaattt tatttgcctg cagtattgga agttttagat 3540agattcccag gtttgtctac tacattagct cattgtttgt cattaggtac aattgcacct 3600aagcaacaac aacattggat cgaagaatta gctcacagag atatcaaagt atgcgttgca 3660ccatctattt tgggtttcgg tttgccatta gcacctgtta gagccttaat agaagctggt 3720gtcggtatct tagtaggttc agacaatttg caagatgttt tctttccttt gggtacaggt 3780agagcaattg aaaacgttag attgttagcc accgcagccc aattaactgc accagaattg 3840gccggtcctt taattgctgg tgtaaccgac atagcttacg caaccgttac tggtgctgca 3900gatgccttgg ctgttgaatc tccagctaca 
ttagtagttc atgatgctac ctcacctgca 3960gaattgttaa gaggtataga cggtacaaga attaccgtta tagatggttt gttgacatct 4020ccattgcaat tggataaagg tatcaagtaa gtttaaacta atcccacagc cgccagttcc 4080gctggcggca ttttaacttt ctttaatggg cgcgcctttc cataggctcc gcccccctga 4140cgagcatcac aaaaatcgac gctcaagtca gaggtggcga aacccgacag gactataaag 4200ataccaggcg tttccccctg gaagctccct cgtgcgctct cctgttccga ccctgccgct 4260taccggatac ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc atagctcacg 4320ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc 4380ccccgttcag cccgaccgct gcgccttatc cggtaactat cgtcttgagt ccaacccggt 4440aagacacgac ttatcgccac tggcagcagc cactggtaac aggattagca gagcgaggta 4500tgtaggcggt gctacagagt tcttgaagtg gtggcctaac tacggctaca ctagaagaac 4560agtatttggt atctgcgctc tgctgaagcc agttaccttc ggaaaaagag ttggtagctc 4620ttgatccggc aaacaaacca ccgctggtag cggtggtttt tttgtttgca agcagcagat 4680tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc 4740tcagtggaac gaaaactcac gttaagggat tttggtcatg agattatcaa aaaggatctt 4800cacctagatc cttttaaatt aaaaatgaag ttttaaatca atctaaagta tatatgagta 4860aacttggtct gacagttacc aatgcttaat cagtgaggca cctatctcag cgatctgtct 4920atttcgttca tccatagttg cctgactccc cgtcgtgtag ataactacga tacgggaggg 4980cttaccatct ggccccagtg ctgcaatgat accgcgagac ccacgctcac cggctccaga 5040tttatcagca ataaaccagc cagccggaag ggccgagcgc agaagtggtc ctgcaacttt 5100atccgcctcc atccagtcta ttaattgttg ccgggaagct agagtaagta gttcgccagt 5160taatagtttg cgcaacgttg ttgccattgc
tacaggcatc gtggtgtcac gctcgtcgtt 5220tggtatggct tcattcagct ccggttccca acgatcaagg cgagttacat gatcccccat 5280gttgtgcaaa aaagcggtta gctccttcgg tcctccgatc gttgtcagaa gtaagttggc 5340cgcagtgtta tcactcatgg ttatggcagc actgcataat tctcttactg tcatgccatc 5400cgtaagatgc ttttctgtga ctggtgagta ctcaaccaag tcattctgag aatagtgtat 5460gcggcgaccg agttgctctt gcccggcgtc aatacgggat aataccgcgc cacatagcag 5520aactttaaaa gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct caaggatctt 5580accgctgttg agatccagtt cgatgtaacc cactcgtgca cccaactgat cttcagcatc 5640ttttactttc accagcgttt ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa 5700gggaataagg gcgacacgga aatgttgaat actcatactc ttcctttttc aatattattg 5760aagcatttat cagggttatt gtctcatgag cggatacata tttgaatgta tttagaaaaa 5820taaacagcga tcgcgcggcc gcgggtaata actgatataa ttaaattgaa gctctaattt 5880gtgagtttag tatacatgca tttacttata atacagtttt ttagttttgc tggccgcatc 5940ttctcaaata tgcttcccag cctgcttttc tgtaacgttc accctctacc ttagcatccc 6000ttccctttgc aaatagtcct cttccaacaa taataatgtc agatcctgta gagaccacat 6060catccacggt tctatactgt tgacccaatg cgtctccctt gtcatctaaa cccacaccgg 6120gtgtcataat caaccaatcg taaccttcat ctcttccacc catgtctctt tgagcaataa 6180agccgataac aaaatctttg tcgctcttcg caatgtcaac agtaccctta gtatattctc 6240cagtagctag ggagcccttg catgacaatt ctgctaacat caaaaggcct ctaggttcct 6300ttgttacttc ttccgccgcc tgcttcaaac cgctaacaat acctgggccc accacaccgt 6360gtgcattcgt aatgtctgcc cattctgcta ttctgtatac acccgcagag tactgcaatt 6420tgactgtatt accaatgtca gcaaattttc tgtcttcgaa gagtaaaaaa ttgtacttgg 6480cggataatgc ctttagcggc ttaactgtgc cctccatgga aaaatcagtc aagatatcca 6540catgtgtttt tagtaaacaa attttgggac ctaatgcttc aactaactcc agtaattcct 6600tggtggtacg aacatccaat gaagcacaca agtttgtttg cttttcgtgc atgatattaa 6660atagcttggc agcaacagga ctaggatgag tagcagcacg ttccttatat gtagctttcg 6720acatgattta tcttcgtttc ctgcaggttt ttgttctgtg cagttgggtt aagaatactg 6780ggcaatttca tgtttcttca acaccacata tgcgtatata taccaatcta agtctgtgct 6840ccttccttcg ttcttccttc tgctcggaga ttaccgaatc aaagctagct tatcgatgat 
6900aagctgtcaa agatgagaat taattccacg gactatagac tatactagat actccgtcta 6960ctgtacgata cacttccgct caggtccttg tcctttaacg aggccttacc actcttttgt 7020tactctattg atccagctca gcaaaggcag tgtgatctaa gattctatct tcgcgatgta 7080gtaaaactag ctagaccgag aaagagacta gaaatgcaaa aggcacttct acaatggctg 7140ccatcattat tatccgatgt gacgctgcag cttctcaatg atattcgaat acgctttgag 7200gagatacagc ctaatatccg acaaactgtt ttacagattt acgatcgtac ttgttaccca 7260tcattgaatt ttgaacatcc gaacctggga gttttccctg aaacagatag tatatttgaa 7320cctgtataat aatatatagt ctagcgcttt acggaagaca atgtatgtat ttcggttcct 7380ggagaaacta ttgcatctat tgcataggta atcttgcacg tcgcatcccc ggttcatttt 7440ctgcgtttcc atcttgcact tcaatagcat atctttgtta acgaagcatc tgtgcttcat 7500tttgtagaac aaaaatgcaa cgcgagagcg ctaatttttc aaacaaagaa tctgagctgc 7560atttttacag aacagaaatg caacgcgaaa gcgctatttt accaacgaag aatctgtgct 7620tcatttttgt aaaacaaaaa tgcaacgcga cgagagcgct aatttttcaa acaaagaatc 7680tgagctgcat ttttacagaa cagaaatgca acgcgagagc gctattttac caacaaagaa 7740tctatacttc ttttttgttc tacaaaaatg catcccgaga gcgctatttt tctaacaaag 7800catcttagat tacttttttt ctcctttgtg cgctctataa tgcagtctct tgataacttt 7860ttgcactgta ggtccgttaa ggttagaaga aggctacttt ggtgtctatt ttctcttcca 7920taaaaaaagc ctgactccac ttcccgcgtt tactgattac tagcgaagct gcgggtgcat 7980tttttcaaga taaaggcatc cccgattata ttctataccg atgtggattg cgcatacttt 8040gtgaacagaa agtgatagcg ttgatgattc ttcattggtc agaaaattat gaacggtttc 8100ttctattttg tctctatata ctacgtatag gaaatgttta cattttcgta ttgttttcga 8160ttcactctat gaatagttct tactacaatt tttttgtcta aagagtaata ctagagataa 8220acataaaaaa tgtagaggtc gagtttagat gcaagttcaa ggagcgaaag gtggatgggt 8280aggttatata gggatatagc acagagatat atagcaaaga gatacttttg agcaat 8336198063DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 19gtttgtggaa gcggtattcg caatttaatt aaagctggtg acaattaatc atcggctcgt 60ataatgtgtg gaattgaatc gatataagga ggttaatcat atgtcctcca cagcattata 120caccgttcct accgcaggtc cagacgatgt tgccgccttg aaagcattag atggtcattc 180cgcctccgat attttggctg 
taataggtaa aacagagggt aatggttgtg ttaacgactt 240tagtagaacc ttatctgctg cagtttggca tccattgtta gaagattcag ccattacagt 300cttttccggt ggtgcagaag gtgtaataag tccacatgta aacatcttcg ttagagatga 360aagacaatat tctggtcacc ctagaggttt ggtaactgct gttggtagaa caagagttat 420cggtccagaa gaaattggta gacctgctca agtcgatgca gtacatgaaa ccgttgtcgc 480attgttaact gaattgggtg ttggtccaga tgacgttcac ttggtcttga ttaaatgccc 540tttgttatct tcagacgcta tagcaggtgt tcatagaaga ggtttaagac ctgtcactac 600agatacttac gaatctatgt caagatccag agccgcttct gctttgggta tagccatggc 660tttaaaggaa tgtgatagag acagagcatt gttagccttg gaaggtagag atgacgtttg 720gtcagcaaga gcctccgctt ccagtggtgc tgaattggat gactgccaca ttttagtagt 780tgcagaatca gatgcagccg ctaatccatt aagagcagcc catactgcca tgagagatgc 840tttggacatc caagctttaa cagaagtttt tgacagaatt gctgcagaag gtggtaccgt 900cagacaaata ttcgcaaagg ccgaagctga tccttcaggt gctatcagag gttatagaca 960taccatgtta actgattccg acgtcaatgc aacaagacac gccagagccg ctgtaggtgg 1020tttgattgca gccttacatg gtaacggtgc tgtctatgta tcaggtggtg cagaacacca 1080aggtccaagt ggtggtggtt ctgttactgt tatatatgat gttcctgcaa cagccaacgc 1140taccggtgaa gcttctagat aaggaaatcc attatgatat actcaacagt caacgctaat 1200ccttacgctt ggccttacga tggttcaata gaccctgctc acaccgcttt aatcttaatc 1260gattggcaaa tagacttttg tggtccaggt ggttatgtcg attccatggg ttacgactta 1320tccttgacta gaagtggttt agaacctaca gcaagagtat tggctgcagc cagagatact 1380ggtatgacag ttatccatac tagagaaggt cacagaccag atttggctga cttgccacct 1440aataagagat ggagatctgc atcagccggt gctgaaatcg gttcagttgg tccatgtggt 1500agaattttag tcagaggtga acctggttgg gaaatagtac cagaagttgc acctagagaa 1560ggtgaaccaa ttatagataa acctggtaaa ggtgctttct acgcaacaga tttggacttg 1620ttgttgagaa caagaggtat cacccatttg attttgaccg gtataactac agatgtttgc 1680gtccacacca ctatgagaga agccaacgat agaggttacg aatgtttaat tttgtctgat 1740tgcaccggtg ctactgacag aaagcatcac gaagctgcat tatctatggt caccatgcaa 1800ggtggtgtat tcggtgcaac tgcccattca gatgacttat tggccgcttt gggtacaacc 1860gttccagcag ccgctggtcc tagagctaga acagaataag gaacgaccat gacagttagt 
1920tccgatacaa ctgctgaaat atcgttaggt tggtcaatcc aagactggat tgatttccac 1980aagtcatcaa gctcccaggc ttcactaagg cttcttgaat cactactaga ctctcaaaat 2040gttgcgccag tcgataatgc gtggatatcg ctaatttcaa aggaaaattt actgcaccaa 2100ttccaaattt taaagagcag agaaaataaa gaaactctac ctctctacgg tgtccctatt 2160gctgttaagg acaacatcga cgttagaggt ctacccacca ccgctgcatg tccatccttt 2220gcatatgagc cttccaaaga ctctaaagta gtagaactac taagaaatgc aggtgcgata 2280atcgtgggta agacaaactt ggaccaattt gccacaggat tagtcggcac acggtctcca 2340tatgggaaaa caccttgcgc ttttagcaaa gagcatgtat ctggtggttc ctccgctggg 2400tcagcatcgg tggtcgccag aggtatcgta ccaattgcat tgggtactga tacagcaggt 2460tctggtagag tcccagccgc cttgaacaac ctgattggcc taaagccaac aaagggcgtc 2520ttttcctgtc aaggtgtagt tcccgcttgt aaatctttag actgcgtctc catctttgca 2580ttaaacctaa gtgatgctga acgctgcttc cgcatcatgt gccagccaga tcctgataat 2640gatgaatatt ctagacccta tgtttccaac cctttgaaaa aattttcaag caatgtaacg 2700attgctattc ctaaaaatat cccatggtat ggtgaaacca agaatcctgt actgttttcc 2760aatgctgtcg aaaatctatc aagaacgggc gctaacgtca tagaaattga ttttgagcct 2820cttttagagt tagctcgctg tttatacgaa ggtacttggg tggccgagcg ttatcaagct 2880attcaatcgt ttttggacag taaaccacca aaggaatctt tggaccctac tgttatttca 2940attatagaag gggccaagaa atacagtgca gtagactgct tcagttttga atacaaaaga 3000caaggcatct tgcaaaaagt gagacgactt ctcgaatcag tcgatgtatt gtgtgtgccc 3060acatgtcctt taaatcctac tatgcaacaa gttgcggatg aaccagtcct agtcaattca 3120agacaaggca catggactaa ttttgtcaac ttggcagatt tggcagccct tgctgttccc 3180gcagggttcc gagacgatgg tttgccaaat ggtattactt taatcggtaa aaaattcaca 3240gattacgcac tattagagtt ggctaaccgc tatttccaaa atatattccc caacggttcc 3300agaacatacg gtacttttac ctcttcttca gtaaagccag caaacgatca attagtggga 3360ccagactatg acccatctac gtccataaaa ttggctgttg tcggtgcaca tcttaagggt 3420ctgcctctac attggcaatt ggaaaaggtc aatgcaacat atttatgtac aacaaaaaca 3480tcaaaagctt accagctttt tgctttgccc aaaaatggac cagttttaaa acctggtttg 3540agaagagttc aagatagcaa tggctctcaa atcgaattag aagtgtacag tgttccaaaa 3600gaactgttcg gtgcttttat ttccatggtt 
cctgaaccat taggaatagg ttcagtggag 3660ttagaatctg gtgaatggat caaatccttt atttgtgaag aatctggtta caaagccaaa 3720ggtacagttg atatcacaaa gtatggtgga tttagagcat attttgaaat gttgtaagtt 3780taaactaatc ccacagccgc cagttccgct ggcggcattt taactttctt taatgggcgc 3840gcctttccat aggctccgcc cccctgacga gcatcacaaa aatcgacgct caagtcagag 3900gtggcgaaac ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt 3960gcgctctcct gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg 4020aagcgtggcg ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg 4080ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg 4140taactatcgt cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac 4200tggtaacagg attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg 4260gcctaactac ggctacacta gaagaacagt atttggtatc tgcgctctgc tgaagccagt 4320taccttcgga aaaagagttg gtagctcttg atccggcaaa caaaccaccg ctggtagcgg 4380tggttttttt gtttgcaagc agcagattac gcgcagaaaa aaaggatctc aagaagatcc 4440tttgatcttt tctacggggt ctgacgctca gtggaacgaa aactcacgtt aagggatttt 4500ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa aatgaagttt 4560taaatcaatc taaagtatat atgagtaaac ttggtctgac agttaccaat gcttaatcag 4620tgaggcacct atctcagcga tctgtctatt tcgttcatcc atagttgcct gactccccgt 4680cgtgtagata actacgatac gggagggctt accatctggc cccagtgctg caatgatacc 4740gcgagaccca cgctcaccgg ctccagattt atcagcaata aaccagccag ccggaagggc 4800cgagcgcaga agtggtcctg caactttatc cgcctccatc cagtctatta attgttgccg 4860ggaagctaga gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg ccattgctac 4920aggcatcgtg gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg gttcccaacg 4980atcaaggcga gttacatgat cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc 5040tccgatcgtt gtcagaagta agttggccgc agtgttatca ctcatggtta tggcagcact 5100gcataattct cttactgtca tgccatccgt aagatgcttt tctgtgactg gtgagtactc 5160aaccaagtca ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc cggcgtcaat 5220acgggataat accgcgccac atagcagaac tttaaaagtg ctcatcattg gaaaacgttc 5280ttcggggcga aaactctcaa ggatcttacc gctgttgaga tccagttcga tgtaacccac 
5340tcgtgcaccc aactgatctt cagcatcttt tactttcacc agcgtttctg ggtgagcaaa 5400aacaggaagg caaaatgccg caaaaaaggg aataagggcg acacggaaat gttgaatact 5460catactcttc ctttttcaat attattgaag catttatcag ggttattgtc tcatgagcgg 5520atacatattt gaatgtattt agaaaaataa acagcgatcg cgcggccgcg ggtaataact 5580gatataatta aattgaagct ctaatttgtg agtttagtat acatgcattt acttataata 5640cagtttttta gttttgctgg ccgcatcttc tcaaatatgc ttcccagcct gcttttctgt 5700aacgttcacc ctctacctta gcatcccttc cctttgcaaa tagtcctctt ccaacaataa 5760taatgtcaga tcctgtagag accacatcat ccacggttct atactgttga cccaatgcgt 5820ctcccttgtc atctaaaccc acaccgggtg tcataatcaa ccaatcgtaa ccttcatctc 5880ttccacccat gtctctttga gcaataaagc cgataacaaa atctttgtcg ctcttcgcaa 5940tgtcaacagt acccttagta tattctccag tagctaggga gcccttgcat gacaattctg 6000ctaacatcaa aaggcctcta ggttcctttg ttacttcttc cgccgcctgc ttcaaaccgc 6060taacaatacc tgggcccacc acaccgtgtg cattcgtaat gtctgcccat tctgctattc 6120tgtatacacc cgcagagtac tgcaatttga ctgtattacc aatgtcagca aattttctgt 6180cttcgaagag taaaaaattg tacttggcgg ataatgcctt tagcggctta actgtgccct 6240ccatggaaaa atcagtcaag atatccacat gtgtttttag taaacaaatt ttgggaccta 6300atgcttcaac taactccagt aattccttgg tggtacgaac atccaatgaa gcacacaagt 6360ttgtttgctt ttcgtgcatg atattaaata gcttggcagc aacaggacta ggatgagtag 6420cagcacgttc cttatatgta gctttcgaca tgatttatct tcgtttcctg caggtttttg 6480ttctgtgcag ttgggttaag aatactgggc aatttcatgt ttcttcaaca ccacatatgc 6540gtatatatac caatctaagt ctgtgctcct tccttcgttc ttccttctgc tcggagatta 6600ccgaatcaaa gctagcttat cgatgataag ctgtcaaaga tgagaattaa ttccacggac 6660tatagactat actagatact ccgtctactg tacgatacac ttccgctcag gtccttgtcc 6720tttaacgagg ccttaccact cttttgttac tctattgatc cagctcagca aaggcagtgt 6780gatctaagat tctatcttcg cgatgtagta aaactagcta gaccgagaaa gagactagaa 6840atgcaaaagg cacttctaca atggctgcca tcattattat ccgatgtgac gctgcagctt 6900ctcaatgata ttcgaatacg ctttgaggag atacagccta atatccgaca aactgtttta 6960cagatttacg atcgtacttg ttacccatca ttgaattttg aacatccgaa cctgggagtt 7020ttccctgaaa cagatagtat atttgaacct 
gtataataat atatagtcta gcgctttacg 7080gaagacaatg tatgtatttc ggttcctgga gaaactattg catctattgc ataggtaatc 7140ttgcacgtcg catccccggt tcattttctg cgtttccatc ttgcacttca atagcatatc 7200tttgttaacg aagcatctgt gcttcatttt gtagaacaaa aatgcaacgc gagagcgcta 7260atttttcaaa caaagaatct gagctgcatt tttacagaac agaaatgcaa cgcgaaagcg 7320ctattttacc aacgaagaat ctgtgcttca tttttgtaaa acaaaaatgc aacgcgacga 7380gagcgctaat ttttcaaaca aagaatctga gctgcatttt tacagaacag aaatgcaacg 7440cgagagcgct attttaccaa caaagaatct atacttcttt tttgttctac aaaaatgcat 7500cccgagagcg ctatttttct aacaaagcat cttagattac tttttttctc ctttgtgcgc 7560tctataatgc agtctcttga taactttttg cactgtaggt ccgttaaggt tagaagaagg 7620ctactttggt gtctattttc tcttccataa aaaaagcctg actccacttc ccgcgtttac 7680tgattactag cgaagctgcg ggtgcatttt ttcaagataa aggcatcccc gattatattc 7740tataccgatg tggattgcgc atactttgtg aacagaaagt gatagcgttg atgattcttc 7800attggtcaga aaattatgaa cggtttcttc tattttgtct ctatatacta cgtataggaa 7860atgtttacat tttcgtattg ttttcgattc actctatgaa tagttcttac tacaattttt 7920ttgtctaaag agtaatacta gagataaaca taaaaaatgt agaggtcgag tttagatgca 7980agttcaagga gcgaaaggtg gatgggtagg ttatataggg atatagcaca gagatatata 8040gcaaagagat acttttgagc aat 8063206004DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 20gatacttttg agcaatgttt gtggaagcgg tattcgcaat tataaacggt attttcacaa 60ttgcacccca gccagaccga tagccggtcg caatccgcca cccacaaccg tctacctccc 120acagaacccc gtcacttcca cccttttcca ccagatcata tgtcccaact tgccaaatta 180aaaccgtgcg aattttcaaa ataaactttg gcaaagaggc tgcaaaggag gggctggtga 240gggcgtctgg aagtcgacca gagaccgggt tggcggcgca tttgtgtccc aaaaaacagc 300cccaattgcc ccaattgacc ccaaattgac ccagtagcgg gcccaacccc ggcgagagcc 360cccttctccc cacatatcaa acctcccccg gttcccacac ttgccgttaa gggcgtaggg 420tactgcagtc tggaatctac gcttgttcag actttgtact agtttctttg tctggccatc 480cgggtaaccc atgccggacg caaaatagac tactgaaaat ttttttgctt tgtggttggg 540actttagcca agggtataaa agaccaccgt ccccgaatta cctttcctct tcttttctct 600ctctccttgt caactcacac ccgaaatcgt 
taagcatttc cttctgagta taagaatcat 660tcaaaatgtc atcctcagaa gtaaaagcaa atggttggac cgcagttcct gtttccgcaa 720aagcaatagt agactccttg ggtaaattag gagatgtctc ttcatattcc gtagaagata 780ttgcctttcc agctgcagac aaattggtag ccgaagctca agcattcgtt aaggctagat 840tatctcctga aacctacaac cattcaatga gagttttcta ttggggtact gtcattgcca 900gaagattgtt accagaacaa gctaaagatt tgtctccttc aacatgggca ttaacctgtt 960tgttacacga cgttggtact gccgaagctt attttacctc cactagaatg agtttcgata 1020tctacggtgg tattaaagct atggaagtat tgaaggtttt aggttccagt acagatcaag 1080cagaagccgt tgctgaagca attataagac atgaagatgt tggtgtcgac ggtaacatca 1140catttttggg tcaattgatc caattggcaa cattgtacga taacgtcggt gcctacgacg 1200gtattgatga cttcggttcc tgggttgatg acactacaag aaacagtata aacactgctt 1260tcccaagaca tggttggtgt tcttggttcg catgcacagt tagaaaagaa gaatcaaaca 1320agccttggtg ccacaccaca cacataccac aattcgacaa acaaatggaa gcaaacacct 1380tgatgaaacc ttgggaataa gctgcttgta cctagtgcaa ccccagtttg ttaaaaatta 1440gtagtcaaaa acttctgagt tagaaatttg tgagtgtagt gagattgtag agtatcatgt 1500gtgtccgtaa gtgaagtgtt attgactctt agttagttta tctagtactc gtttagttga 1560cactgatcta gtattttacg aggcgtatga ctttagccaa gtgttgtact tagtcttctc 1620tccaaacatg agagggctct gtcactcagt cggcctatgg gtgagatggc ttggtgagat 1680ctttcgatag tctcgtcaag atggtaggat gatgggggaa tacattactg ctctcgtcaa 1740ggaaaccaca atcagatcac accatcctcc atggtatccg atgactctct tctccacagt 1800tttccatagg ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg 1860gcgaaacccg acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg 1920ctctcctgtt ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag 1980cgtggcgctt tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc 2040caagctgggc tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa 2100ctatcgtctt gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg 2160taacaggatt agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc 2220taactacggc tacactagaa gaacagtatt tggtatctgc gctctgctga agccagttac 2280cttcggaaaa agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg 
2340tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt 2400gatcttttct acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt 2460catgagatta tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa 2520atcaatctaa agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga 2580ggcacctatc tcagcgatct gtctatttcg ttcatccata gttgcctgac tccccgtcgt 2640gtagataact acgatacggg agggcttacc atctggcccc agtgctgcaa tgataccgcg 2700agacccacgc tcaccggctc cagatttatc agcaataaac cagccagccg gaagggccga 2760gcgcagaagt ggtcctgcaa ctttatccgc ctccatccag tctattaatt gttgccggga 2820agctagagta agtagttcgc cagttaatag tttgcgcaac gttgttgcca ttgctacagg 2880catcgtggtg tcacgctcgt cgtttggtat ggcttcattc agctccggtt cccaacgatc 2940aaggcgagtt acatgatccc ccatgttgtg caaaaaagcg gttagctcct tcggtcctcc 3000gatcgttgtc agaagtaagt tggccgcagt gttatcactc atggttatgg cagcactgca 3060taattctctt actgtcatgc catccgtaag atgcttttct gtgactggtg agtactcaac 3120caagtcattc tgagaatagt gtatgcggcg accgagttgc tcttgcccgg cgtcaatacg 3180ggataatacc gcgccacata gcagaacttt aaaagtgctc atcattggaa aacgttcttc 3240ggggcgaaaa ctctcaagga tcttaccgct gttgagatcc agttcgatgt aacccactcg 3300tgcacccaac tgatcttcag catcttttac tttcaccagc gtttctgggt gagcaaaaac 3360aggaaggcaa aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt gaatactcat 3420actcttcctt tttcaatatt attgaagcat ttatcagggt tattgtctca tgagcggata 3480catatttgaa tgtatttaga aaaataaaca gcgatcgcgc ggccgcgggt aataactgat 3540ataattaaat tgaagctcta atttgtgagt ttagtataca tgcatttact tataatacag 3600ttttttagtt ttgctggccg catcttctca
aatatgcttc ccagcctgct tttctgtaac 3660gttcaccctc taccttagca tcccttccct ttgcaaatag tcctcttcca acaataataa 3720tgtcagatcc tgtagagacc acatcatcca cggttctata ctgttgaccc aatgcgtctc 3780ccttgtcatc taaacccaca ccgggtgtca taatcaacca atcgtaacct tcatctcttc 3840cacccatgtc tctttgagca ataaagccga taacaaaatc tttgtcgctc ttcgcaatgt 3900caacagtacc cttagtatat tctccagtag ctagggagcc cttgcatgac aattctgcta 3960acatcaaaag gcctctaggt tcctttgtta cttcttccgc cgcctgcttc aaaccgctaa 4020caatacctgg gcccaccaca ccgtgtgcat tcgtaatgtc tgcccattct gctattctgt 4080atacacccgc agagtactgc aatttgactg tattaccaat gtcagcaaat tttctgtctt 4140cgaagagtaa aaaattgtac ttggcggata atgcctttag cggcttaact gtgccctcca 4200tggaaaaatc agtcaagata tccacatgtg tttttagtaa acaaattttg ggacctaatg 4260cttcaactaa ctccagtaat tccttggtgg tacgaacatc caatgaagca cacaagtttg 4320tttgcttttc gtgcatgata ttaaatagct tggcagcaac aggactagga tgagtagcag 4380cacgttcctt atatgtagct ttcgacatga tttatcttcg tttcctgcag gtttttgttc 4440tgtgcagttg ggttaagaat actgggcaat ttcatgtttc ttcaacacca catatgcgta 4500tatataccaa tctaagtctg tgctccttcc ttcgttcttc cttctgctcg gagattaccg 4560aatcaaagct agcttatcga tgataagctg tcaaagatga gaattaattc cacggactat 4620agactatact agatactccg tctactgtac gatacacttc cgctcaggtc cttgtccttt 4680aacgaggcct taccactctt ttgttactct attgatccag ctcagcaaag gcagtgtgat 4740ctaagattct atcttcgcga tgtagtaaaa ctagctagac cgagaaagag actagaaatg 4800caaaaggcac ttctacaatg gctgccatca ttattatccg atgtgacgct gcagcttctc 4860aatgatattc gaatacgctt tgaggagata cagcctaata tccgacaaac tgttttacag 4920atttacgatc gtacttgtta cccatcattg aattttgaac atccgaacct gggagttttc 4980cctgaaacag atagtatatt tgaacctgta taataatata tagtctagcg ctttacggaa 5040gacaatgtat gtatttcggt tcctggagaa actattgcat ctattgcata ggtaatcttg 5100cacgtcgcat ccccggttca ttttctgcgt ttccatcttg cacttcaata gcatatcttt 5160gttaacgaag catctgtgct tcattttgta gaacaaaaat gcaacgcgag agcgctaatt 5220tttcaaacaa agaatctgag ctgcattttt acagaacaga aatgcaacgc gaaagcgcta 5280ttttaccaac gaagaatctg tgcttcattt ttgtaaaaca aaaatgcaac gcgacgagag 
5340cgctaatttt tcaaacaaag aatctgagct gcatttttac agaacagaaa tgcaacgcga 5400gagcgctatt ttaccaacaa agaatctata cttctttttt gttctacaaa aatgcatccc 5460gagagcgcta tttttctaac aaagcatctt agattacttt ttttctcctt tgtgcgctct 5520ataatgcagt ctcttgataa ctttttgcac tgtaggtccg ttaaggttag aagaaggcta 5580ctttggtgtc tattttctct tccataaaaa aagcctgact ccacttcccg cgtttactga 5640ttactagcga agctgcgggt gcattttttc aagataaagg catccccgat tatattctat 5700accgatgtgg attgcgcata ctttgtgaac agaaagtgat agcgttgatg attcttcatt 5760ggtcagaaaa ttatgaacgg tttcttctat tttgtctcta tatactacgt ataggaaatg 5820tttacatttt cgtattgttt tcgattcact ctatgaatag ttcttactac aatttttttg 5880tctaaagagt aatactagag ataaacataa aaaatgtaga ggtcgagttt agatgcaagt 5940tcaaggagcg aaaggtggat gggtaggtta tatagggata tagcacagag atatatagca 6000aaga 60042110640DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 21gtttgtggaa gcggtattcg caatttaatt aaagctggtg acaattaatc atcggctcgt 60ataatgtgtg gaattgaatc gatataagga ggttaatcat atgacagtta gttccgatac 120aactgctgaa atatcgttag gttggtcaat ccaagactgg attgatttcc acaagtcatc 180aagctcccag gcttcactaa ggcttcttga atcactacta gactctcaaa atgttgcgcc 240agtcgataat gcgtggatat cgctaatttc aaaggaaaat ttactgcacc aattccaaat 300tttaaagagc agagaaaata aagaaactct acctctctac ggtgtcccta ttgctgttaa 360ggacaacatc gacgttagag gtctacccac caccgctgca tgtccatcct ttgcatatga 420gccttccaaa gactctaaag tagtagaact actaagaaat gcaggtgcga taatcgtggg 480taagacaaac ttggaccaat ttgccacagg attagtcggc acacggtctc catatgggaa 540aacaccttgc gcttttagca aagagcatgt atctggtggt tcctccgctg ggtcagcatc 600ggtggtcgcc agaggtatcg taccaattgc attgggtact gatacagcag gttctggtag 660agtcccagcc gccttgaaca acctgattgg cctaaagcca acaaagggcg tcttttcctg 720tcaaggtgta gttcccgctt gtaaatcttt agactgcgtc tccatctttg cattaaacct 780aagtgatgct gaacgctgct tccgcatcat gtgccagcca gatcctgata atgatgaata 840ttctagaccc tatgtttcca accctttgaa aaaattttca agcaatgtaa cgattgctat 900tcctaaaaat atcccatggt atggtgaaac caagaatcct gtactgtttt ccaatgctgt 960cgaaaatcta tcaagaacgg 
gcgctaacgt catagaaatt gattttgagc ctcttttaga 1020gttagctcgc tgtttatacg aaggtacttg ggtggccgag cgttatcaag ctattcaatc 1080gtttttggac agtaaaccac caaaggaatc tttggaccct actgttattt caattataga 1140aggggccaag aaatacagtg cagtagactg cttcagtttt gaatacaaaa gacaaggcat 1200cttgcaaaaa gtgagacgac ttctcgaatc agtcgatgta ttgtgtgtgc ccacatgtcc 1260tttaaatcct actatgcaac aagttgcgga tgaaccagtc ctagtcaatt caagacaagg 1320cacatggact aattttgtca acttggcaga tttggcagcc cttgctgttc ccgcagggtt 1380ccgagacgat ggtttgccaa atggtattac tttaatcggt aaaaaattca cagattacgc 1440actattagag ttggctaacc gctatttcca aaatatattc cccaacggtt ccagaacata 1500cggtactttt acctcttctt cagtaaagcc agcaaacgat caattagtgg gaccagacta 1560tgacccatct acgtccataa aattggctgt tgtcggtgca catcttaagg gtctgcctct 1620acattggcaa ttggaaaagg tcaatgcaac atatttatgt acaacaaaaa catcaaaagc 1680ttaccagctt tttgctttgc ccaaaaatgg accagtttta aaacctggtt tgagaagagt 1740tcaagatagc aatggctctc aaatcgaatt agaagtgtac agtgttccaa aagaactgtt 1800cggtgctttt atttccatgg ttcctgaacc attaggaata ggttcagtgg agttagaatc 1860tggtgaatgg atcaaatcct ttatttgtga agaatctggt tacaaagcca aaggtacagt 1920tgatatcaca aagtatggtg gatttagagc atattttgaa atgttgaaga aaaaagagtc 1980ccaaaagaag aagttatttg ataccgtgtt aattgccaat agaggtgaaa ttgccgttcg 2040tattatcaag acattaaaaa aattgggtat tagatcagtt gcagtttatt ccgaccctga 2100taaatattct caacacgtta ctgatgcaga tgtttctgta ccccttcatg gcacaaccgc 2160agcccaaact tatttagaca tgaataagat catagatgcc gctaagcaaa ctaatgcaca 2220ggccattatt cctggttatg gtttcttgtc ggaaaatgcg gatttttctg atgcgtgcac 2280cagtgctggc attacctttg ttggtccttc gggagatatt atcagaggtt tagggttaaa 2340acattctgct agacagattg cacagaaggc tggcgttcct ctagtgccag gctctttgct 2400tatcacatca gttgaagagg ctaagaaagt cgcagcggaa ttggaatacc cagttatggt 2460gaagtcaact gctggtggcg gtggtattgg tttgcagaaa gtcgattctg aagaggacat 2520cgagcatatt tttgagactg tgaaacatca aggtgaaaca tttttcggtg acgctggtgt 2580atttctggaa cggtttatcg aaaatgccag gcatgttgaa gtccaactta tgggagatgg 2640ttttggtaag gccattgctt tgggcgaacg tgattgttct ttacagcgtc 
gtaaccaaaa 2700agttatcgaa gaaactcctg caccaaattt gccagaaaag acgaggttgg cgttaagaaa 2760ggcagctgaa agtttgggat ctttattgaa ttacaagtgt gctggtacgg ttgaatttat 2820ttacgatgag aaaaaggacg agttttactt tttagaagtt aatacaagat tacaagttga 2880acatccaata acagaaatgg ttacagggtt agacttggtc gagtggatga tcaggattgc 2940cgctaatgat gcacctgatt ttgattctac aaaggtagaa gtcaatgggg tttcaatgga 3000ggcacgttta tatgctgaaa atccattgaa aaatttcaga ccttctccag gtttacttgt 3060cgatgtgaaa tttcctgatt gggcaagagt ggatacttgg gttaagaaag gtactaatat 3120ttctcccgaa tatgatccaa cattggccaa aattatcgtt catgggaaag accgtgatga 3180tgcaatttcc aagttaaatc aagcgttaga agaaacaaaa gtttacggat gtattactaa 3240cattgactac ctgaagtcta tcattaccag tgatttcttt gctaaagcaa aagtttctac 3300aaacattttg aactcttatc aatatgagcc taccgccatc gaaattactt tgcccggtgc 3360acacactagt attcaggatt accccggtag agttgggtac tggagaattg gtgttccgcc 3420ctctggtcca atggacgcat attcgtttag attggcgaac agaattgttg gtaatgacta 3480caggactcct gccattgaag taacgttgac tggtccatcc atcgttttcc attgtgaaac 3540tgtcattgcc attactggtg gtaccgctct atgtacatta gacggccaag aaattcccca 3600acacaaaccg gtcgaagtta agaggggatc tactttatcc attggcaagt tgacaagcgg 3660ctgtagagca tacttaggta tcaggggtgg cattgatgtg cctaaatact tgggctctta 3720ttctactttc actctaggaa atgtcggtgg atacaatgga agggtgctaa aacttggaga 3780cgtactattc ttaccaagca atgaagaaaa taaatcagtt gagtgccttc cacagaatat 3840tcctcaatca ttaattcctc aaatttccga aactaaggaa tggagaattg gtgtaacatg 3900tggtccccat gggtctccag atttttttaa acctgagtcc atcgaagaat ttttcagtga 3960gaagtggaag gttcattaca actccaatag atttggtgtc cgtttgattg gacctaaacc 4020taagtgggca agaagtaatg gtggtgaagg tggtatgcat ccttcaaaca ctcacgatta 4080cgtttattct ctgggtgcaa ttaatttcac gggtgatgag ccagttatta ttacttgcga 4140tggtccttcc ttaggtggtt ttgtgtgtca agctgttgtc ccagaagcag aactgtggaa 4200ggttggacag gttaaacccg gtgattccat tcagtttgtg ccactttctt acgaaagctc 4260gagatcctta aaggaatctc aggatgttgc aattaaatca ttggatggta ctaagttaag 4320gcgcttagac tctgtttcaa ttttaccatc attcgaaacg cctattcttg cacaaatgga 4380aaaagtgaat gagctttcac 
caaaggttgt atacagacaa gcaggtgatc gttatgtttt 4440ggtggaatac ggtgataatg aaatgaattt taatatttcc tatagaattg aatgcctgat 4500ctcccttgtg aaaaagaata agactattgg tattgttgaa atgtcccaag gtgttagatc 4560tgtattgata gaatttgatg gttacaaagt cactcaaaaa gaattgctta aagtattggt 4620ggcatatgaa acagaaatcc agtttgatga aaattggaag ataacttcta atataataag 4680attaccgatg gctttcgaag actcgaagac tttggcatgt gttcaaaggt atcaagaaac 4740aattcgttcg tctgctccat ggttgccaaa taacgttgat ttcattgcca atgtaaatgg 4800aatttcaagg aatgaagttt atgatatgtt gtattctgcc agatttatgg ttttaggttt 4860aggtgatgtc ttcctagggt cgccttgtgc tgttccatta gatcctcgtc acagattttt 4920gggaagcaag tacaacccaa gtagaacata tacagaaaga ggtgcagtcg gtattggcgg 4980tatgtatatg tgcatatatg ctgctaacag tcctggtggg taccaattag tgggtagaac 5040aataccaatt tgggacaaac tatgtctggc cgcatcttct gaggttccgt ggttgatgaa 5100cccatttgac caagtcgaat tttacccagt ttctgaagaa gatttggata aaatgactga 5160agattgtgat aatggtgttt ataaagtcaa tatcgaaaag agtgtttttg atcatcaaga 5220atacttgaga tggatcaacg caaacaaaga ttccatcaca gcattccagg agggccagct 5280tggtgaaaga gcagaggaat ttgccaaatt gattcaaaat gcaaactctg aactaaaaga 5340aagtgtcaca gtcaaacctg acgaggaaga agacttccca gaaggtgcag aaattgtata 5400ttctgagtat tctgggcgtt tttggaaatc catagcatct gttggagatg ttattgaagc 5460aggtcaaggg ctactaatta ttgaagccat gaaagcggaa atgattatat ccgctcctaa 5520atcgggtaag attatcaaga tttgccatgg caatggtgat atggttgatt ctggtgacat 5580agtggccgtc atagagacat tggcatgagg aaatccatta tgtcatcctc agaagtaaaa 5640gcaaatggtt ggaccgcagt tcctgtttcc gcaaaagcaa tagtagactc cttgggtaaa 5700ttaggagatg tctcttcata ttccgtagaa gatattgcct ttccagctgc agacaaattg 5760gtagccgaag ctcaagcatt cgttaaggct agattatctc ctgaaaccta caaccattca 5820atgagagttt tctattgggg tactgtcatt gccagaagat tgttaccaga acaagctaaa 5880gatttgtctc cttcaacatg ggcattaacc tgtttgttac acgacgttgg tactgccgaa 5940gcttatttta cctccactag aatgagtttc gatatctacg gtggtattaa agctatggaa 6000gtattgaagg ttttaggttc cagtacagat caagcagaag ccgttgctga agcaattata 6060agacatgaag atgttggtgt cgacggtaac atcacatttt tgggtcaatt 
gatccaattg 6120gcaacattgt acgataacgt cggtgcctac gacggtattg atgacttcgg ttcctgggtt 6180gatgacacta caagaaacag tataaacact gctttcccaa gacatggttg gtgttcttgg 6240ttcgcatgca cagttagaaa agaagaatca aacaagcctt ggtgccacac cacacacata 6300ccacaattcg acaaacaaat ggaagcaaac accttgatga aaccttggga ataagtttaa 6360actaatccca cagccgccag ttccgctggc ggcattttaa ctttctttaa tgggcgcgcc 6420tttccatagg ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg 6480gcgaaacccg acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg 6540ctctcctgtt ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag 6600cgtggcgctt tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc 6660caagctgggc tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa 6720ctatcgtctt gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg 6780taacaggatt agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc 6840taactacggc tacactagaa gaacagtatt tggtatctgc gctctgctga agccagttac 6900cttcggaaaa agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg 6960tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt 7020gatcttttct acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt 7080catgagatta tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa 7140atcaatctaa agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga 7200ggcacctatc tcagcgatct gtctatttcg ttcatccata gttgcctgac tccccgtcgt 7260gtagataact acgatacggg agggcttacc atctggcccc agtgctgcaa tgataccgcg 7320agacccacgc tcaccggctc cagatttatc agcaataaac cagccagccg gaagggccga 7380gcgcagaagt ggtcctgcaa ctttatccgc ctccatccag tctattaatt gttgccggga 7440agctagagta agtagttcgc cagttaatag tttgcgcaac gttgttgcca ttgctacagg 7500catcgtggtg tcacgctcgt cgtttggtat ggcttcattc agctccggtt cccaacgatc 7560aaggcgagtt acatgatccc ccatgttgtg caaaaaagcg gttagctcct tcggtcctcc 7620gatcgttgtc agaagtaagt tggccgcagt gttatcactc atggttatgg cagcactgca 7680taattctctt actgtcatgc catccgtaag atgcttttct gtgactggtg agtactcaac 7740caagtcattc tgagaatagt gtatgcggcg accgagttgc tcttgcccgg cgtcaatacg 7800ggataatacc gcgccacata 
gcagaacttt aaaagtgctc atcattggaa aacgttcttc 7860ggggcgaaaa ctctcaagga tcttaccgct gttgagatcc agttcgatgt aacccactcg 7920tgcacccaac tgatcttcag catcttttac tttcaccagc gtttctgggt gagcaaaaac 7980aggaaggcaa aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt gaatactcat 8040actcttcctt tttcaatatt attgaagcat ttatcagggt tattgtctca tgagcggata 8100catatttgaa tgtatttaga aaaataaaca gcgatcgcgc ggccgcgggt aataactgat 8160ataattaaat tgaagctcta atttgtgagt ttagtataca tgcatttact tataatacag 8220ttttttagtt ttgctggccg catcttctca aatatgcttc ccagcctgct tttctgtaac 8280gttcaccctc taccttagca tcccttccct ttgcaaatag tcctcttcca acaataataa 8340tgtcagatcc tgtagagacc acatcatcca cggttctata ctgttgaccc aatgcgtctc 8400ccttgtcatc taaacccaca ccgggtgtca taatcaacca atcgtaacct tcatctcttc 8460cacccatgtc tctttgagca ataaagccga taacaaaatc tttgtcgctc ttcgcaatgt 8520caacagtacc cttagtatat tctccagtag ctagggagcc cttgcatgac aattctgcta 8580acatcaaaag gcctctaggt tcctttgtta cttcttccgc cgcctgcttc aaaccgctaa 8640caatacctgg gcccaccaca ccgtgtgcat tcgtaatgtc tgcccattct gctattctgt 8700atacacccgc agagtactgc aatttgactg tattaccaat gtcagcaaat tttctgtctt 8760cgaagagtaa aaaattgtac ttggcggata atgcctttag cggcttaact gtgccctcca 8820tggaaaaatc agtcaagata tccacatgtg tttttagtaa acaaattttg ggacctaatg 8880cttcaactaa ctccagtaat tccttggtgg tacgaacatc caatgaagca cacaagtttg 8940tttgcttttc gtgcatgata ttaaatagct tggcagcaac aggactagga tgagtagcag 9000cacgttcctt atatgtagct ttcgacatga tttatcttcg tttcctgcag gtttttgttc 9060tgtgcagttg ggttaagaat actgggcaat ttcatgtttc ttcaacacca catatgcgta 9120tatataccaa tctaagtctg tgctccttcc ttcgttcttc cttctgctcg gagattaccg 9180aatcaaagct agcttatcga tgataagctg tcaaagatga gaattaattc cacggactat 9240agactatact agatactccg tctactgtac gatacacttc cgctcaggtc cttgtccttt 9300aacgaggcct taccactctt ttgttactct attgatccag ctcagcaaag gcagtgtgat 9360ctaagattct atcttcgcga tgtagtaaaa ctagctagac cgagaaagag actagaaatg 9420caaaaggcac ttctacaatg gctgccatca ttattatccg atgtgacgct gcagcttctc 9480aatgatattc gaatacgctt tgaggagata cagcctaata tccgacaaac 
tgttttacag 9540atttacgatc gtacttgtta cccatcattg aattttgaac atccgaacct gggagttttc 9600cctgaaacag atagtatatt tgaacctgta taataatata tagtctagcg ctttacggaa 9660gacaatgtat gtatttcggt tcctggagaa actattgcat ctattgcata ggtaatcttg 9720cacgtcgcat ccccggttca ttttctgcgt ttccatcttg cacttcaata gcatatcttt 9780gttaacgaag catctgtgct tcattttgta gaacaaaaat gcaacgcgag agcgctaatt 9840tttcaaacaa agaatctgag ctgcattttt acagaacaga aatgcaacgc gaaagcgcta 9900ttttaccaac gaagaatctg tgcttcattt ttgtaaaaca aaaatgcaac gcgacgagag 9960cgctaatttt tcaaacaaag aatctgagct gcatttttac agaacagaaa tgcaacgcga 10020gagcgctatt ttaccaacaa agaatctata cttctttttt gttctacaaa aatgcatccc 10080gagagcgcta tttttctaac aaagcatctt agattacttt ttttctcctt tgtgcgctct 10140ataatgcagt ctcttgataa ctttttgcac tgtaggtccg ttaaggttag aagaaggcta 10200ctttggtgtc tattttctct tccataaaaa aagcctgact ccacttcccg cgtttactga 10260ttactagcga agctgcgggt gcattttttc aagataaagg catccccgat tatattctat 10320accgatgtgg attgcgcata ctttgtgaac agaaagtgat agcgttgatg attcttcatt 10380ggtcagaaaa ttatgaacgg tttcttctat tttgtctcta tatactacgt ataggaaatg 10440tttacatttt cgtattgttt tcgattcact ctatgaatag ttcttactac aatttttttg 10500tctaaagagt aatactagag ataaacataa aaaatgtaga ggtcgagttt agatgcaagt 10560tcaaggagcg aaaggtggat gggtaggtta tatagggata tagcacagag atatatagca 10620aagagatact tttgagcaat 1064022244PRTFusarium oxysporum 22Met Leu Pro Thr Glu Val Glu Ala Asn Gly Trp Thr Ala Val Pro Val1 5 10 15Ser Ala Lys Ala Ile Lys Asp Ser Val Gly Gln Leu Val Pro Thr Gln 20 25 30Thr Tyr Thr Leu Gln Asp Ile Val Phe Pro Ser Glu Asp Lys Leu Val 35 40 45Ser Glu Ala Gln Ala Phe Val Lys Ala Arg Leu Ser Gln Glu Ala Tyr 50 55 60Asn His Ser Met Arg Val Phe Tyr Trp Gly Ser Ile Ile Ala Lys Arg65 70 75 80Leu Leu Pro Lys His Ala Glu Ala Leu Ser Pro Ser Thr Trp Ala Leu 85 90 95Thr Cys Leu Leu His Asp Ile Gly Thr Ala Glu Ala Tyr Phe Thr Ser 100 105 110Thr Arg Met Ser Phe Asp Ile Tyr Gly Gly Ile Lys Ala Met Glu Val 115 120 125Leu Lys Val Leu Gly Ser Ser Asp Asp Gln Ala Glu Ala Val Ala Glu 130 135 
140Ala Ile Ile Arg His Glu Asp Met Gly Val Asp Gly Ser Ile Thr Phe145 150 155 160Leu Gly Gln Leu Ile Gln Leu Ala Thr Leu Tyr Asp Asn Val Gly Thr 165 170 175Tyr Glu Gly Ile Asp Asp Phe Gly Gly Trp Ile Asp Glu Ala Thr Arg 180 185 190Asp Asn Val Asn Lys Ala Ile Pro Arg His Gly Trp Cys Ser Trp Phe 195 200 205Ala Cys Thr Val Arg Lys Glu Glu Ser Asn Lys Pro Trp Cys His Thr 210 215 220Thr His Ile Pro Gln Phe Asp Lys Gln Met Glu Ala Asn Thr Leu Met225 230 235 240Lys Gln Trp Glu23735DNAFusarium oxysporum 23atgttgccca ccgaagtcga ggccaacggc tggactgccg tgcctgtcag cgccaaggca 60atcaaggact cggtcggaca gcttgtaccc acgcaaacct acactctcca agacatcgtt 120ttcccctctg aggacaaact tgtgtctgaa gctcaagcct ttgtcaaggc acggctaagt 180caagaagctt ataaccactc tatgcgagtt ttctactggg gatccattat tgccaagcgt 240ttgctaccca agcacgcaga ggccctgtcc ccgtccacct gggcgctgac atgtcttttg 300catgatatcg gtactgctga ggcttacttc
acttcaactc gcatgtcttt tgatatctat 360ggtggaatca aggcaatgga ggtgctcaaa gtcctcggta gcagcgacga tcaggccgag 420gcagtcgcag aggctatcat ccgtcatgaa gacatgggcg tggacggttc gattactttc 480ctaggccagt taattcagct tgctacgctg tatgacaacg ttgggacgta cgagggcatt 540gacgattttg gcggctggat tgacgaagct actcgggata atgtcaacaa agctattcct 600cgtcacggtt ggtgctcctg gtttgcctgt actgtccgca aggaggaatc caacaagcct 660tggtgccata ctacccatat tcctcaattt gataagcaga tggaggcaaa cactttgatg 720aaacagtggg agtag 73524245PRTFusarium pseudograminearum 24Met Ser Ser Pro Glu Val Lys Ile Asn Gly Trp Thr Ala Val Pro Leu1 5 10 15Asn Ala Lys Asn Ile Leu Asp Ser Val Gly Lys Leu Ala Glu Val Pro 20 25 30Thr Tyr Lys Ala Glu Asp Ile Lys Phe Pro Ser Asn Asp Lys Leu Val 35 40 45Ala Glu Ala Gln Ala Phe Val Lys Ala Arg Leu Ser Pro Glu Ala Tyr 50 55 60Asn His Ser Met Arg Val Phe Tyr Trp Gly Asn Ile Leu Ala Lys Arg65 70 75 80Leu Leu Pro Glu His Phe Glu Ala Leu Ser Thr Ser Thr Trp Ala Leu 85 90 95Thr Cys Leu Leu His Asp Ile Gly Thr Ala Asp Ala Phe Phe Thr Ser 100 105 110Thr His Met Ser Phe Asp Leu Tyr Gly Gly Ile Lys Ala Met Glu Val 115 120 125Leu Lys Val Leu Gly Gly Thr Thr Asp Gln Ala Glu Ala Val Ala Glu 130 135 140Ala Ile Ile Arg His Gln Asp Val Gly Val Asp Gly Thr Ile Thr Phe145 150 155 160Leu Gly Gln Leu Ile Gln Leu Ala Thr Leu Tyr Asp Asn Val Gly Val 165 170 175Tyr Glu Gly Ile Glu Asp Tyr Gly Ser Trp Val Asp Glu Val Thr Arg 180 185 190Asp Asn Ile Asn Arg Glu Phe Pro Arg His Lys Trp Ala Ser Cys Phe 195 200 205Ala Ser Val Ile Arg Gln Glu Glu Ser Asn Lys Pro Trp Cys His Ser 210 215 220Thr His Ile Val Gly Phe Pro Glu Lys Leu Glu Ala Asn Thr Leu Met225 230 235 240Lys Pro Trp Glu Glu 24525738DNAFusarium pseudograminearum 25atgtcttcac ctgaagtcaa gattaacggt tggactgctg tccccctcaa cgccaagaac 60attctcgatt ctgtaggaaa actcgcagaa gttcccacct acaaggcaga ggatattaaa 120ttcccatcaa atgacaagct cgtcgccgaa gcccaggcct ttgtcaaggc gcgactgagc 180ccagaagcgt ataatcactc catgagagta ttttactggg gaaacattct tgcaaagcgt 240ttgctgcccg agcattttga agctttgtcc 
acgtctacct gggcactcac ctgtctctta 300cacgacatag gaacggccga tgccttcttc acctccacgc acatgtcgtt cgatctctat 360ggcggcataa aggctatgga agtgctcaag gtgctcggcg gtactaccga ccaagctgaa 420gctgtcgccg aggccatcat acgtcatcag gatgtgggcg tggacggcac catcactttt 480cttgggcagc tgattcaact tgccacactt tacgacaacg tcggcgttta tgagggcatt 540gaggactatg gcagttgggt tgatgaggtc actcgcgata atatcaatag ggaatttcct 600cggcacaagt gggcatcttg ctttgcttct gtcattcgtc aggaggagtc caacaaaccc 660tggtgccatt ctacacatat tgtaggcttt cctgaaaagc ttgaggccaa cactcttatg 720aagccttggg aggagtag 73826245PRTFusarium graminearum 26Met Ser Ser Pro Glu Ala Lys Thr Asn Gly Trp Thr Ala Val Pro Leu1 5 10 15Asn Ala Lys Asn Ile Leu Asp Thr Val Gly Lys Leu Ala Glu Val Pro 20 25 30Thr Tyr Lys Ala Glu Asp Ile Gln Phe Pro Ser Asp Asp Lys Leu Val 35 40 45Ala Glu Ala Gln Ala Phe Ala Lys Ala Arg Leu Ser Pro Glu Ala Tyr 50 55 60Asn His Ser Met Arg Val Phe Tyr Trp Gly Asn Ile Leu Ala Lys Arg65 70 75 80Leu Leu Pro Glu His Phe Gly Ala Leu Ser Thr Ser Thr Trp Ala Leu 85 90 95Thr Cys Leu Leu His Asp Ile Gly Thr Ala Asp Val Phe Phe Thr Ser 100 105 110Thr His Met Ser Phe Asp Leu Tyr Gly Gly Ile Lys Ala Met Glu Val 115 120 125Leu Lys Val Leu Gly Gly Thr Thr Asp Gln Ala Glu Ala Val Ala Glu 130 135 140Ala Ile Ile Arg His Gln Asp Val Gly Val Asp Gly Thr Ile Thr Phe145 150 155 160Leu Gly Gln Leu Ile Gln Leu Ala Thr Leu Tyr Asp Asn Val Gly Val 165 170 175Tyr Glu Gly Ile Gln Asp Tyr Gly Ser Trp Val Asp Glu Ala Thr Arg 180 185 190Asp Asn Ile Asn Arg Ala Phe Pro Arg His Lys Trp Thr Ser Cys Phe 195 200 205Ala Ser Val Ile Arg Gln Glu Glu Ser Asn Lys Pro Trp Cys His Ser 210 215 220Thr His Ile Val Asp Phe Pro Glu Lys Leu Glu Ala Asn Thr Leu Met225 230 235 240Lys Pro Trp Glu Glu 24527738DNAFusarium graminearum 27atgtcttcac ctgaagccaa aactaacggt tggactgctg tccccctcaa cgctaagaat 60attctcgaca ctgtaggaaa gctcgcagaa gttcccacct acaaggcaga ggatattcaa 120tttccatcag acgacaagct agtcgccgaa gcccaagcct ttgccaaggc acgactaagc 180cctgaagcct ataatcactc catgcgagta ttttactggg 
gaaacattct tgcaaagcgt 240ttgctgccag agcattttgg agctttgtcc acgtctacct gggcactcac ctgtctctta 300cacgacatag gaacggccga tgtcttcttc acatccacac acatgtcgtt cgatctctat 360ggcggcataa aggctatgga agtgctcaag gtgctcggtg gtaccaccga ccaagctgaa 420gctgtcgccg aggccatcat acgtcatcag gatgtgggcg tggacggcac catcactttt 480cttgggcagc tgattcaact tgccacactt tatgataacg tcggcgttta tgagggcatt 540caagactatg gcagttgggt tgatgaggcc actcgcgata atatcaatag ggcatttcct 600cgacacaagt ggacgtcttg ctttgcttcc gtcattcgtc aggaggagtc caacaaaccc 660tggtgccatt ctacacatat tgtggacttt cctgaaaagc ttgaggccaa cactcttatg 720aagccttggg aggagtag 73828244PRTAspergillus kawachii 28Met Cys Asn Asp Glu Ile Lys Ala Asn Gly Trp Ser Ser Met Pro Ala1 5 10 15Asn Ala Gly Ala Ile Phe Thr Asp Gln Ser Phe Ile Glu Arg Ala Glu 20 25 30Ala Met Gln Leu Asp Thr Ile Ile Phe Pro Phe Asp Asp Pro Val Val 35 40 45Ser Lys Thr Trp Glu Tyr Ala Arg Ala Val Leu His Pro Gln Thr Leu 50 55 60Asn His Ser Met Arg Val Tyr Phe Tyr Gly Met Val Ile Thr Thr Gln65 70 75 80Gln Phe Pro Glu Ile Ala Ala Ser Leu Asn Pro Val Thr Trp Ala Leu 85 90 95Thr Cys Leu Leu His Asp Ile Gly Thr Ala Glu Glu Asn Leu Thr Ala 100 105 110Thr Arg Met Ser Phe Asp Ile Tyr Gly Gly Ile Lys Ala Leu His Val 115 120 125Leu Lys Glu Phe Gly Ala Thr Ala Asp Gln Ala Glu Ala Val Ala Glu 130 135 140Ala Ile Ile Arg His Glu Asp Met Gly Val Asp Gly Thr Ile Thr Tyr145 150 155 160Phe Gly Gln Leu Ile Gln Leu Ala Thr Thr Tyr Asp Asn Thr Gly Val 165 170 175His Pro His Val Lys Ser Phe Glu Gly Leu Val His Gln Thr Thr Arg 180 185 190Lys Gln Ile Asn Glu Ala Tyr Pro Arg Leu Lys Trp Cys Glu Phe Phe 195 200 205Ser Gly Met Ile Arg Lys Glu Glu Thr Ile Lys Pro Trp Cys His Ser 210 215 220Thr His Leu Val Asp Phe Asp Arg Glu Ile Glu Glu Asn Thr Leu Met225 230 235 240Arg Glu Trp Glu29735DNAAspergillus kawachii 29atgtgcaacg acgaaataaa agccaacggc tggtccagca tgcccgccaa tgccggtgcc 60atatttacgg accaatcctt catcgaaagg gcagaagcca tgcagctcga tacaatcata 120ttccccttcg acgatcctgt cgtttcaaag acctgggaat acgccagggc tgttcttcac 
180ccccagacat tgaaccattc catgagggtc tacttctacg gaatggtaat caccacccag 240caattccctg aaatagcagc atccctcaac ccagtcacct gggctctgac ctgcctcctc 300cacgacatcg gtactgcgga ggagaaccta actgcaacgc gcatgtcatt cgatatctat 360ggcggtatca aggccctcca tgtgctgaag gagtttggtg ccactgcgga ccaggccgag 420gccgttgctg aggcgatcat tcgacatgag gatatgggcg tcgatggaac tattacatat 480ttcggtcagc ttattcagtt ggctactaca tatgataata ccggagttca tccgcatgtg 540aagagttttg agggcttggt gcatcagaca actcgcaaac agatcaatga ggcgtatccg 600cggttgaagt ggtgtgaatt tttctcgggg atgattagga aggaagagac gatcaagcct 660tggtgtcatt cgacccattt ggtggacttt gacagggaga tagaagagaa tacgcttatg 720agggagtggg agtaa 73530244PRTAspergillus niger 30Met Cys His Asp Glu Ile Lys Ala Asn Gly Trp Ser Ser Thr Pro Ala1 5 10 15Asn Ala Gly Ala Ile Phe Thr Asp Gln Ser Phe Ile Glu Arg Ala Glu 20 25 30Ala Val Glu Leu Asp Thr Ile Gln Phe Pro Phe Asp Asp Pro Val Val 35 40 45Ser Lys Thr Leu Glu Tyr Val Lys Ala Val Leu His Pro Glu Thr Leu 50 55 60Asn His Ser Met Arg Val Tyr Tyr Tyr Gly Met Val Ile Thr Thr Gln65 70 75 80Gln Phe Pro Glu Gln Ala Ala Ser Ile Asn Pro Val Thr Trp Ala Leu 85 90 95Thr Cys Leu Leu His Asp Leu Gly Thr Ala Glu Glu Asn Leu Thr Ala 100 105 110Thr Arg Met Ser Phe Asp Ile Tyr Gly Gly Ile Lys Ala Leu His Val 115 120 125Leu Lys Glu Phe Gly Ala Thr Ala Asp Gln Ala Glu Ala Ala Ala Glu 130 135 140Ala Ile Ile Arg His Glu Asp Met Gly Val Asp Gly Thr Ile Thr Tyr145 150 155 160Phe Gly Gln Leu Ile Gln Leu Ala Thr Thr Tyr Asp Asn Thr Gly Ile 165 170 175His Pro His Val Lys Gly Phe Glu Gly Leu Val His Arg Thr Thr Arg 180 185 190Lys Gln Ile Asn Glu Ala Tyr Pro Arg Leu Lys Trp Cys Ala Phe Phe 195 200 205Ser Gly Leu Ile Arg Lys Glu Glu Thr Ile Lys Pro Trp Cys His Ser 210 215 220Thr His Leu Val Asp Phe Asp Lys Glu Ile Glu Glu Asn Thr Leu Met225 230 235 240Arg Glu Trp Glu31735DNAAspergillus niger 31atgtgccacg acgaaatcaa agccaacggc tggtccagca ctcccgccaa tgccggtgcc 60atatttacgg accaatcctt cattgaaagg gcagaagccg tggagctcga tacgatccag 120ttcccctttg acgaccctgt agtctcgaag 
acattggaat atgtcaaggc tgttcttcac 180cccgagactt tgaatcattc catgagggtt tactattacg gaatggtaat caccacccaa 240caattccccg aacaagcagc atccataaac ccagtgacct gggctctgac ttgtctcctc 300cacgacctcg gaaccgcgga ggagaacctc accgcaacgc gcatgtcatt cgatatctac 360ggcggcatca aagccctcca tgtgctgaag gagtttggtg ccactgcgga ccaggccgaa 420gcagcagctg aggcaatcat tcgacatgaa gatatgggag tcgatggaac gattacctac 480ttcggtcagc ttattcagct ggctacgacg tatgataata ccgggattca tccgcatgtg 540aagggctttg aggggttggt ccatcgcacg actcgcaagc agattaatga ggcgtatccg 600cggttgaagt ggtgtgcgtt tttctccggg ttgattagaa aggaggagac gattaagcct 660tggtgtcatt cgactcattt ggtggatttt gataaggaga tcgaggagaa tacgcttatg 720agggagtggg agtaa 73532241PRTAspergillus niger 32Met Cys His Asp Lys Ile Pro Leu Asn Gly Trp Thr Ser Thr Pro Ala1 5 10 15Asn Ala Gly Ala Ile Phe Pro Asp Lys Pro Phe Ile His Pro Pro Thr 20 25 30Pro Ile Ser Ile Thr Asp Ile Pro Phe Pro Ser Thr Asp Pro Leu Val 35 40 45Ala Lys Thr Leu Glu Tyr Val Gln Ser Leu Leu Pro Arg Glu Thr Val 50 55 60Asn His Ser Met Arg Val Tyr Ser Tyr Gly Met Ile Leu Leu Thr Gln65 70 75 80Gln Phe Pro Ser His His Leu Ser Pro Thr Thr Trp Ala Leu Thr Cys 85 90 95Leu Leu His Asp Ile Gly Thr Ala Pro Ser Leu Leu Thr Ser Thr Asn 100 105 110Met Ser Phe Asp Leu Tyr Gly Gly Ile Lys Ala His Ser Val Leu Thr 115 120 125Ser Phe Asp Cys Pro Ala Asp Val Ala Asp Ala Val Ala Glu Ala Ile 130 135 140Ile Arg His Gln Asp Leu Gly Val Asp Gly Asn Ile Thr Phe Leu Gly145 150 155 160Gln Leu Ile Gln Leu Ala Thr Ile Tyr Asp Asn Val Gly Glu His Pro 165 170 175His Val Lys Asp Phe Gly Gly Leu Ile His Glu Asp Ala Arg Arg Glu 180 185 190Val Asn Glu Arg Trp Arg Arg Glu Gly Trp Cys Gly Val Phe Ala Asp 195 200 205Val Val Lys Leu Glu Val Gly Arg Lys Pro Trp Cys His Ser Thr His 210 215 220Ile Val Gly Phe Glu Gly Lys Val Arg Gly Asn Ala Leu Phe Gly Glu225 230 235 240Lys33726DNAAspergillus niger 33atgtgccacg acaagatccc cctcaacggc tggaccagca cccccgccaa cgctggtgcc 60atcttccccg acaagccctt catccaccca cccacgccca tctccatcac cgacatcccc 120ttcccctcca 
ccgatcccct cgtcgccaag accctcgaat acgtccaatc cctcctcccc 180cgcgagaccg tcaaccactc catgcgcgta tactcctacg gaatgatcct cctcacccag 240caattccctt cccaccatct atctccaaca acctgggccc taacctgcct tctgcatgac 300atcggcaccg ccccctccct cctcacctca acaaacatgt cctttgacct ctacggcggc 360atcaaagccc actccgtact tacttccttc gactgtcccg ctgatgttgc tgacgccgta 420gcggaagcta ttatccggca tcaggatcta ggcgtggatg ggaatatcac gttcctggga 480cagttgatcc agctggctac catttatgat aatgtggggg aacatccgca cgtcaaggac 540tttggagggt tgattcatga ggatgcgagg agggaggtta atgagcgctg gagaagggag 600ggatggtgtg gggtgtttgc tgatgtggtg aagttggagg tggggaggaa gccgtggtgt 660cattcgacgc atattgtggg gtttgagggg aaggttaggg ggaatgcgct ttttggggag 720aaatag 72634242PRTAspergillus oryzae 34Met Ser Pro Thr Arg Ala Ala Gln Val Glu Glu Tyr Gly Trp Thr Ala1 5 10 15Val Ser Cys Asp Pro Gln Gln Arg Ala Ala Thr Asn Pro Pro Thr Lys 20 25 30Pro Ser Val Pro Gln Leu Val Lys Asp Thr Thr Leu Pro Asp Thr Pro 35 40 45Leu Val Lys Asp Ala Met Glu Tyr Val Lys Ala Glu Leu Pro Ala His 50 55 60Thr Phe Asn His Ser Met Arg Val Tyr Tyr Tyr Gly Leu Ala Ile Ala65 70 75 80Arg Gln His Phe Pro Glu Trp Lys Phe Ser Asp Glu Thr Trp Leu Leu 85 90 95Thr Cys Leu Phe His Asp Ile Gly Thr Ile Asp Lys Tyr Thr Gln Asp 100 105 110Val Phe Met Ser Phe Asp Ile Tyr Gly Gly Ile Val Ala Leu Asn Val 115 120 125Leu Thr Glu Lys Gly Ala Pro Ala Pro Gln Ala Glu Ser Val Ala Glu 130 135 140Ala Ile Ile Arg His Gln Asp Pro Val Lys Val Gly Thr Ile His Ser145 150 155 160Val Gly Leu Leu Ile Gln Leu Ala Thr Gln Phe Asp Asn Leu Gly Ala 165 170 175His Lys Glu Tyr Val His Pro Asp Thr Val Glu Asp Val Asn Gln His 180 185 190Tyr Pro Arg Arg Gln Trp Ser Lys Cys Phe Ser Ser Lys Leu Arg Glu 195 200 205Glu Ile Gly Leu Lys Pro Trp Cys His Thr Thr Ala Glu Gly Glu Gly 210 215 220Phe Pro Val Gly Ile Glu Asn Asn Thr Leu Met Glu Pro Tyr Asp Gly225 230 235 240Arg Phe35729DNAAspergillus oryzae 35atgtcaccca ccagagcagc tcaagtcgaa gaatacggtt ggacagcggt gtcctgcgat 60cctcagcagc gagctgctac aaacccacct accaagcctt ctgttcccca 
gttggtcaaa 120gatacaactc ttcccgatac tcctctagtc aaagatgcca tggaatatgt taaggcagag 180ctacccgctc acacttttaa ccacagcatg cgtgtctact attatggcct tgcaatcgcc 240agacaacact tcccagaatg gaagttcagc gatgaaacct ggcttctcac ctgcctcttc 300cacgacatcg gcactatcga caagtacacc caagacgtct ttatgtcctt cgatatctac 360ggtggaattg tcgctctgaa cgtcctcacg gagaaaggtg cgccagcacc ccaggctgaa 420agtgtcgcag aagccatcat ccgtcatcag gatccggtga aagttgggac tattcattct 480gtcggtttac ttattcagct tgctacgcag tttgacaacc ttggtgccca caaggagtat 540gtccaccctg atactgtgga agatgtgaac cagcattatc cgcgtcgtca gtggtcgaag 600tgcttctcga gtaagctgag ggaggaaatt gggctcaagc cttggtgcca tactactgcg 660gagggcgagg ggttccctgt tgggatcgag aacaacactt tgatggagcc ttatgatgga 720cgcttctag 72936243PRTSaccharomyces cerevisiae 36Met Lys Leu Leu Arg Thr Val Phe Leu Pro Cys Ser Ser Ser Lys Glu1 5 10 15Ser Ile Met Ser Gln Tyr Gly Phe Val Arg Val Pro Arg Glu Val Glu 20 25 30Lys Ala Ile Pro Val Val Asn Ala Ser Arg Pro Arg Ala Val Val Pro 35 40 45Pro Pro Asn Ser Glu Thr Ala Arg Leu Val Arg Glu Tyr Ala Ala Lys 50 55 60Glu Leu Thr Ala Pro Val Leu Asn His Ser Leu Arg Val Phe Gln Tyr65 70 75 80Ser Leu Ala Ile Ile Arg Asp Gln Phe Pro Ala Trp Asp Leu Asp Gln 85 90 95Glu Val Leu Tyr Val Thr Cys Leu Leu His Asp Ile Ala Thr Thr Asp 100 105 110Lys Asn Met Arg Ala Thr Lys Met Ser Phe Glu Tyr Tyr Gly Gly Ile 115
120 125Leu Ser Arg Glu Leu Val Phe Asn Ala Thr Gly Gly Asn Gln Asp Tyr 130 135 140Ala Asp Ala Val Thr Glu Ala Ile Ile Arg His Gln Asp Leu Thr Gly145 150 155 160Thr Gly Tyr Ile Thr Thr Leu Gly Leu Ile Leu Gln Ile Ala Thr Thr 165 170 175Leu Asp Asn Val Gly Ser Asn Thr Asp Leu Ile His Ile Asp Thr Val 180 185 190Arg Ala Ile Asn Glu Gln Phe Pro Arg Leu His Trp Leu Ser Cys Phe 195 200 205Ala Thr Val Val Asn Thr Glu Asn Ser Arg Lys Pro Trp Gly His Thr 210 215 220Ser Ser Leu Gly Asp Asp Phe Ser Lys Lys Val Ile Cys Asn Thr Phe225 230 235 240Gly Tyr Asn37732DNASaccharomyces cerevisiae 37ttagttatac ccaaatgtat tgcatatgac tttctttgaa aaatcatcac ccaaagaact 60ggtgtggccc cacggttttc tcgagttttc agtgttcacc accgtagcaa aacatgataa 120ccagtgcagt cttggaaatt gctcattaat ggctctaact gtatcgatat gaatcagatc 180ggtattggat ccgacattgt caagcgtagt agcaatctgc agaatgagcc ccaaggtggt 240aatgtagcca gtcccagtca aatcctggtg acgaatgatg gcctcagtta ctgcatctgc 300atagtcctga tttccacctg tcgcattaaa tacaagctcc cttgaaagta tgccaccata 360atactcaaat gacatcttcg tggctctcat attcttatct gttgttgcaa tatcatgaag 420taagcaggtg acgtacaaaa cttcctgatc caagtcccat gctggaaatt ggtctcttat 480gatagctaaa ctatattgaa aaacacgcaa agagtggttt agaacggggg cagtcaattc 540tttagcggca tattcccgaa caagcctagc agtttcactg tttggaggcg gaacaacggc 600ccgtggtcta gatgcattca ccactggaat ggccttttct acctctctag gaactcttac 660aaatccgtac tgtgacatga ttgattcttt tgaagaggag caaggcaaaa aaacagtacg 720aagcaacttc at 73238390PRTRhodococcus sp. 
38Met Arg Glu Val Gln Leu Leu Asp Gly Arg Arg Val Asp Val Ala Cys1 5 10 15Ala Gly Pro Leu Ile Ser Glu Ile Gly Ala His Leu Asp Leu Thr Ala 20 25 30Pro Val Glu Ile Asp Cys Gly Gly Gly Leu Ala Thr Arg Pro Phe Thr 35 40 45Glu Pro His Leu His Leu Asp Lys Ala Gly Thr Ala Asp Arg Leu Pro 50 55 60Ala Gly Ala Ser Thr Ile Gly Asp Ala Ile Ala Ala Met Gln Ser Val65 70 75 80Lys Val Thr Glu Arg Asp Asn Val Ala Ala Val Ala Ala Arg Met His 85 90 95Arg Val Leu Asn Arg Ile Val Asp Asp Gly Ser His Ala Ile Arg Ala 100 105 110Leu Val Asp Val Asp Glu Val Trp Gly Leu Thr Ala Phe His Ala Ala 115 120 125Gln Gln Val Gln Ala Ala Leu Ala Pro Arg Ala Val Val Gln Ile Val 130 135 140Ala Phe Pro Gln His Gly Leu Thr Pro Gln Val Leu Ala Met Leu Glu145 150 155 160Gln Ala Ala Ala Glu Gly Ala Gly Ala Leu Gly Ala His Thr Asp Val 165 170 175Asp Pro Asp Pro Ala Ala His Val Gly Ala Val Ala Ala Ile Ala Ala 180 185 190Gly Ala Ser Leu Pro Leu Glu Val His Thr Asp Glu Gly Ala Ser Pro 195 200 205Asp Lys Phe Tyr Leu Pro Ala Val Leu Glu Val Leu Asp Arg Phe Pro 210 215 220Gly Leu Ser Thr Thr Leu Ala His Cys Leu Ser Leu Gly Thr Ile Ala225 230 235 240Pro Lys Gln Gln Gln His Trp Ile Glu Glu Leu Ala His Arg Asp Ile 245 250 255Lys Val Cys Val Ala Pro Ser Ile Leu Gly Phe Gly Leu Pro Leu Ala 260 265 270Pro Val Arg Ala Leu Ile Glu Ala Gly Val Gly Ile Leu Val Gly Ser 275 280 285Asp Asn Leu Gln Asp Val Phe Phe Pro Leu Gly Thr Gly Arg Ala Ile 290 295 300Glu Asn Val Arg Leu Leu Ala Thr Ala Ala Gln Leu Thr Ala Pro Glu305 310 315 320Leu Ala Gly Pro Leu Ile Ala Gly Val Thr Asp Ile Ala Tyr Ala Thr 325 330 335Val Thr Gly Ala Ala Asp Ala Leu Ala Val Glu Ser Pro Ala Thr Leu 340 345 350Val Val His Asp Ala Thr Ser Pro Ala Glu Leu Leu Arg Gly Ile Asp 355 360 365Gly Thr Arg Ile Thr Val Ile Asp Gly Leu Leu Thr Ser Pro Leu Gln 370 375 380Leu Asp Lys Gly Ile Lys385 39039412PRTPseudomonas sp. 
39Met Ser Met Glu Thr His Ser Tyr Val Asp Val Ala Ile Arg Asn Ala1 5 10 15Arg Leu Ala Asp Thr Glu Gly Ile Val Asp Ile Leu Ile His Asp Gly 20 25 30Arg Ile Ala Ser Ile Val Lys Ser Thr Lys Thr Lys Gly Ser Val Glu 35 40 45Ile Asp Ala His Glu Gly Leu Val Thr Ser Gly Leu Val Glu Pro His 50 55 60Ile His Leu Asp Lys Ala Leu Thr Ala Asp Arg Val Pro Ala Gly Ser65 70 75 80Ile Gly Asp Leu Arg Thr Arg Arg Gly Leu Glu Met Ala Ile Arg Ala 85 90 95Thr Arg Asp Ile Lys Arg Thr Phe Thr Val Glu Asp Val Arg Glu Arg 100 105 110Ala Ile Arg Ala Ala Leu Met Ala Ser Arg Ala Gly Thr Thr Ala Leu 115 120 125Arg Thr His Val Asp Val Asp Pro Ile Val Gly Leu Ala Gly Ile Arg 130 135 140Gly Val Leu Glu Ala Arg Glu Val Cys Ala Gly Leu Ile Asp Ile Gln145 150 155 160Ile Val Ala Phe Pro Gln Glu Gly Leu Phe Cys Ser Ala Gly Ala Val 165 170 175Asp Leu Met Arg Glu Ala Ile Lys Leu Gly Ala Asp Ala Val Gly Gly 180 185 190Ala Pro Ala Leu Asp Asp Arg Pro Gln Asp His Val Arg Ala Val Phe 195 200 205Asp Leu Ala Ala Glu Phe Gly Leu Pro Val Asp Met His Val Asp Glu 210 215 220Ser Asp Arg Arg Glu Asp Phe Thr Leu Pro Phe Val Ile Glu Ala Ala225 230 235 240Arg Glu Arg Arg Val Pro Asn Val Thr Val Ala His Ile Ser Ser Leu 245 250 255Ser Val Gln Thr Asp Asp Val Ala Arg Ser Thr Ile Ala Ala Leu Ala 260 265 270Asp Ala Asp Val Asn Val Val Val Asn Pro Ile Ile Val Lys Ile Thr 275 280 285Arg Leu Ser Glu Leu Leu Asp Ala Gly Val Ser Val Met Phe Gly Ser 290 295 300Asp Asn Leu Arg Asp Pro Phe Tyr Pro Leu Gly Ala Ala Asn Pro Leu305 310 315 320Gly Ser Ala Ile Phe Ala Cys Gln Ile Ala Ala Leu Gly Thr Pro Gln 325 330 335Asp Leu Arg Arg Val Phe Asp Ala Val Thr Ile Asn Ala Ala Arg Met 340 345 350Leu Gly Phe Pro Ser Leu Leu Gly Val Val Glu Gly Ala Val Ala Asp 355 360 365Leu Ala Val Phe Pro Ser Ala Thr Pro Glu Glu Val Val Leu Asp Gln 370 375 380Gln Ser Pro Leu Phe Val Leu Lys Gly Gly Arg Val Val Ala Met Arg385 390 395 400Leu Ala Ala Gly Ser Thr Ser Phe Arg Asp Tyr Ser 405 41040241PRTRhodococcus sp. 
40Met Ile Tyr Ser Thr Val Asn Ala Asn Pro Tyr Ala Trp Pro Tyr Asp1 5 10 15Gly Ser Ile Asp Pro Ala His Thr Ala Leu Ile Leu Ile Asp Trp Gln 20 25 30Ile Asp Phe Cys Gly Pro Gly Gly Tyr Val Asp Ser Met Gly Tyr Asp 35 40 45Leu Ser Leu Thr Arg Ser Gly Leu Glu Pro Thr Ala Arg Val Leu Ala 50 55 60Ala Ala Arg Asp Thr Gly Met Thr Val Ile His Thr Arg Glu Gly His65 70 75 80Arg Pro Asp Leu Ala Asp Leu Pro Pro Asn Lys Arg Trp Arg Ser Ala 85 90 95Ser Ala Gly Ala Glu Ile Gly Ser Val Gly Pro Cys Gly Arg Ile Leu 100 105 110Val Arg Gly Glu Pro Gly Trp Glu Ile Val Pro Glu Val Ala Pro Arg 115 120 125Glu Gly Glu Pro Ile Ile Asp Lys Pro Gly Lys Gly Ala Phe Tyr Ala 130 135 140Thr Asp Leu Asp Leu Leu Leu Arg Thr Arg Gly Ile Thr His Leu Ile145 150 155 160Leu Thr Gly Ile Thr Thr Asp Val Cys Val His Thr Thr Met Arg Glu 165 170 175Ala Asn Asp Arg Gly Tyr Glu Cys Leu Ile Leu Ser Asp Cys Thr Gly 180 185 190Ala Thr Asp Arg Lys His His Glu Ala Ala Leu Ser Met Val Thr Met 195 200 205Gln Gly Gly Val Phe Gly Ala Thr Ala His Ser Asp Asp Leu Leu Ala 210 215 220Ala Leu Gly Thr Thr Val Pro Ala Ala Ala Gly Pro Arg Ala Arg Thr225 230 235 240Glu41233PRTRhizobium leguminosarum 41Met Asp Ala Met Val Glu Thr Asn Arg His Phe Ile Asp Ala Asp Pro1 5 10 15Tyr Pro Trp Pro Tyr Asn Gly Ala Leu Arg Pro Asp Asn Thr Ala Leu 20 25 30Ile Ile Ile Asp Met Gln Thr Asp Phe Cys Gly Lys Gly Gly Tyr Val 35 40 45Asp His Met Gly Tyr Asp Leu Ser Leu Val Gln Ala Pro Ile Glu Pro 50 55 60Ile Lys Arg Val Leu Ala Ala Met Arg Ala Lys Gly Tyr His Ile Ile65 70 75 80His Thr Arg Glu Gly His Arg Pro Asp Leu Ala Asp Leu Pro Ala Asn 85 90 95Lys Arg Trp Arg Ser Gln Arg Ile Gly Ala Gly Ile Gly Asp Pro Gly 100 105 110Pro Cys Gly Arg Ile Leu Thr Arg Gly Glu Pro Gly Trp Asp Ile Ile 115 120 125Pro Glu Leu Tyr Pro Ile Glu Gly Glu Thr Ile Ile Asp Lys Pro Gly 130 135 140Lys Gly Ser Phe Cys Ala Thr Asp Leu Glu Leu Val Leu Asn Gln Lys145 150 155 160Arg Ile Glu Asn Ile Ile Leu Thr Gly Ile Thr Thr Asp Val Cys Val 165 170 175Ser Thr Thr Met Arg Glu Ala Asn 
Asp Arg Gly Tyr Glu Cys Leu Leu 180 185 190Leu Glu Asp Cys Cys Gly Ala Thr Asp Tyr Gly Asn His Leu Ala Ala 195 200 205Ile Lys Met Val Lys Met Gln Gly Gly Val Phe Gly Ser Val Ser Asn 210 215 220Ser Ala Ala Leu Val Glu Ala Leu Pro225 230424268DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 42gtttgtggaa gcggtattcg caatttaatt aagtttaaac ggcgcgcctt tccataggct 60ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac 120aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc 180gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc 240tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg 300tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga 360gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta acaggattag 420cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta 480cactagaaga acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag 540agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg 600caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac 660ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc 720aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag 780tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg cacctatctc 840agcgatctgt ctatttcgtt catccatagt tgcctgactc cccgtcgtgt agataactac 900gatacgggag ggcttaccat ctggccccag tgctgcaatg ataccgcgag acccacgctc 960accggctcca gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg 1020tcctgcaact ttatccgcct ccatccagtc tattaattgt tgccgggaag ctagagtaag 1080tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc 1140acgctcgtcg tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac 1200atgatccccc atgttgtgca aaaaagcggt tagctccttc ggtcctccga tcgttgtcag 1260aagtaagttg gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac 1320tgtcatgcca tccgtaagat gcttttctgt gactggtgag tactcaacca agtcattctg 1380agaatagtgt atgcggcgac cgagttgctc ttgcccggcg tcaatacggg ataataccgc 1440gccacatagc 
agaactttaa aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact 1500ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg cacccaactg 1560atcttcagca tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa 1620tgccgcaaaa aagggaataa gggcgacacg gaaatgttga atactcatac tcttcctttt 1680tcaatattat tgaagcattt atcagggtta ttgtctcatg agcggataca tatttgaatg 1740tatttagaaa aataaacagc gatcgcgcgg ccgcgggtaa taactgatat aattaaattg 1800aagctctaat ttgtgagttt agtatacatg catttactta taatacagtt ttttagtttt 1860gctggccgca tcttctcaaa tatgcttccc agcctgcttt tctgtaacgt tcaccctcta 1920ccttagcatc ccttcccttt gcaaatagtc ctcttccaac aataataatg tcagatcctg 1980tagagaccac atcatccacg gttctatact gttgacccaa tgcgtctccc ttgtcatcta 2040aacccacacc gggtgtcata atcaaccaat cgtaaccttc atctcttcca cccatgtctc 2100tttgagcaat aaagccgata acaaaatctt tgtcgctctt cgcaatgtca acagtaccct 2160tagtatattc tccagtagct agggagccct tgcatgacaa ttctgctaac atcaaaaggc 2220ctctaggttc ctttgttact tcttccgccg cctgcttcaa accgctaaca atacctgggc 2280ccaccacacc gtgtgcattc gtaatgtctg cccattctgc tattctgtat acacccgcag 2340agtactgcaa tttgactgta ttaccaatgt cagcaaattt tctgtcttcg aagagtaaaa 2400aattgtactt ggcggataat gcctttagcg gcttaactgt gccctccatg gaaaaatcag 2460tcaagatatc cacatgtgtt tttagtaaac aaattttggg acctaatgct tcaactaact 2520ccagtaattc cttggtggta cgaacatcca atgaagcaca caagtttgtt tgcttttcgt 2580gcatgatatt aaatagcttg gcagcaacag gactaggatg agtagcagca cgttccttat 2640atgtagcttt cgacatgatt tatcttcgtt tcctgcaggt ttttgttctg tgcagttggg 2700ttaagaatac tgggcaattt catgtttctt caacaccaca tatgcgtata tataccaatc 2760taagtctgtg ctccttcctt cgttcttcct tctgctcgga gattaccgaa tcaaagctag 2820cttatcgatg ataagctgtc aaagatgaga attaattcca cggactatag actatactag 2880atactccgtc tactgtacga tacacttccg ctcaggtcct tgtcctttaa cgaggcctta 2940ccactctttt gttactctat tgatccagct cagcaaaggc agtgtgatct aagattctat 3000cttcgcgatg tagtaaaact agctagaccg agaaagagac tagaaatgca aaaggcactt 3060ctacaatggc tgccatcatt attatccgat gtgacgctgc agcttctcaa tgatattcga 3120atacgctttg aggagataca gcctaatatc cgacaaactg 
ttttacagat ttacgatcgt 3180acttgttacc catcattgaa ttttgaacat ccgaacctgg gagttttccc tgaaacagat 3240agtatatttg aacctgtata ataatatata gtctagcgct ttacggaaga caatgtatgt 3300atttcggttc ctggagaaac tattgcatct attgcatagg taatcttgca cgtcgcatcc 3360ccggttcatt ttctgcgttt ccatcttgca cttcaatagc atatctttgt taacgaagca 3420tctgtgcttc attttgtaga acaaaaatgc aacgcgagag cgctaatttt tcaaacaaag 3480aatctgagct gcatttttac agaacagaaa tgcaacgcga aagcgctatt ttaccaacga 3540agaatctgtg cttcattttt gtaaaacaaa aatgcaacgc gacgagagcg ctaatttttc 3600aaacaaagaa tctgagctgc atttttacag aacagaaatg caacgcgaga gcgctatttt 3660accaacaaag aatctatact tcttttttgt tctacaaaaa tgcatcccga gagcgctatt 3720tttctaacaa agcatcttag attacttttt ttctcctttg tgcgctctat aatgcagtct 3780cttgataact ttttgcactg taggtccgtt aaggttagaa gaaggctact ttggtgtcta 3840ttttctcttc cataaaaaaa gcctgactcc acttcccgcg tttactgatt actagcgaag 3900ctgcgggtgc attttttcaa gataaaggca tccccgatta tattctatac cgatgtggat 3960tgcgcatact ttgtgaacag aaagtgatag cgttgatgat tcttcattgg tcagaaaatt 4020atgaacggtt tcttctattt tgtctctata tactacgtat aggaaatgtt tacattttcg 4080tattgttttc gattcactct atgaatagtt cttactacaa tttttttgtc taaagagtaa 4140tactagagat aaacataaaa aatgtagagg tcgagtttag atgcaagttc aaggagcgaa 4200aggtggatgg gtaggttata tagggatata gcacagagat atatagcaaa gagatacttt 4260tgagcaat 4268434399DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 43gtttgtggaa gcggtattcg caatttaatt aaagctggtg acaattaatc atcggctcgt 60ataatgtgtg gaattgaatc gatataagga ggttaatcat gtttaaaccc tcaaaatata 120ttttccctct atcttctcgt tgcgcttaat ttgactaatt ctcattagcg aggcgcgcct 180ttccataggc tccgcccccc tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg 240cgaaacccga caggactata aagataccag gcgtttcccc ctggaagctc cctcgtgcgc 300tctcctgttc cgaccctgcc gcttaccgga tacctgtccg cctttctccc ttcgggaagc 360gtggcgcttt ctcatagctc acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc 420aagctgggct gtgtgcacga accccccgtt cagcccgacc gctgcgcctt atccggtaac 480tatcgtcttg agtccaaccc ggtaagacac gacttatcgc cactggcagc 
agccactggt 540aacaggatta gcagagcgag gtatgtaggc ggtgctacag agttcttgaa gtggtggcct 600aactacggct acactagaag aacagtattt ggtatctgcg ctctgctgaa gccagttacc 660ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt 720ttttttgttt gcaagcagca gattacgcgc agaaaaaaag gatctcaaga agatcctttg 780atcttttcta cggggtctga cgctcagtgg aacgaaaact cacgttaagg gattttggtc 840atgagattat caaaaaggat cttcacctag atccttttaa attaaaaatg aagttttaaa 900tcaatctaaa gtatatatga gtaaacttgg tctgacagtt accaatgctt aatcagtgag 960gcacctatct cagcgatctg tctatttcgt tcatccatag ttgcctgact ccccgtcgtg 1020tagataacta cgatacggga gggcttacca tctggcccca gtgctgcaat gataccgcga 1080gacccacgct caccggctcc agatttatca gcaataaacc agccagccgg aagggccgag 1140cgcagaagtg gtcctgcaac tttatccgcc tccatccagt ctattaattg ttgccgggaa
1200gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg ttgttgccat tgctacaggc 1260atcgtggtgt cacgctcgtc gtttggtatg gcttcattca gctccggttc ccaacgatca 1320aggcgagtta catgatcccc catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg 1380atcgttgtca gaagtaagtt ggccgcagtg ttatcactca tggttatggc agcactgcat 1440aattctctta ctgtcatgcc atccgtaaga tgcttttctg tgactggtga gtactcaacc 1500aagtcattct gagaatagtg tatgcggcga ccgagttgct cttgcccggc gtcaatacgg 1560gataataccg cgccacatag cagaacttta aaagtgctca tcattggaaa acgttcttcg 1620gggcgaaaac tctcaaggat cttaccgctg ttgagatcca gttcgatgta acccactcgt 1680gcacccaact gatcttcagc atcttttact ttcaccagcg tttctgggtg agcaaaaaca 1740ggaaggcaaa atgccgcaaa aaagggaata agggcgacac ggaaatgttg aatactcata 1800ctcttccttt ttcaatatta ttgaagcatt tatcagggtt attgtctcat gagcggatac 1860atatttgaat gtatttagaa aaataaacag cgatcgcgcg gccgcgggta ataactgata 1920taattaaatt gaagctctaa tttgtgagtt tagtatacat gcatttactt ataatacagt 1980tttttagttt tgctggccgc atcttctcaa atatgcttcc cagcctgctt ttctgtaacg 2040ttcaccctct accttagcat cccttccctt tgcaaatagt cctcttccaa caataataat 2100gtcagatcct gtagagacca catcatccac ggttctatac tgttgaccca atgcgtctcc 2160cttgtcatct aaacccacac cgggtgtcat aatcaaccaa tcgtaacctt catctcttcc 2220acccatgtct ctttgagcaa taaagccgat aacaaaatct ttgtcgctct tcgcaatgtc 2280aacagtaccc ttagtatatt ctccagtagc tagggagccc ttgcatgaca attctgctaa 2340catcaaaagg cctctaggtt cctttgttac ttcttccgcc gcctgcttca aaccgctaac 2400aatacctggg cccaccacac cgtgtgcatt cgtaatgtct gcccattctg ctattctgta 2460tacacccgca gagtactgca atttgactgt attaccaatg tcagcaaatt ttctgtcttc 2520gaagagtaaa aaattgtact tggcggataa tgcctttagc ggcttaactg tgccctccat 2580ggaaaaatca gtcaagatat ccacatgtgt ttttagtaaa caaattttgg gacctaatgc 2640ttcaactaac tccagtaatt ccttggtggt acgaacatcc aatgaagcac acaagtttgt 2700ttgcttttcg tgcatgatat taaatagctt ggcagcaaca ggactaggat gagtagcagc 2760acgttcctta tatgtagctt tcgacatgat ttatcttcgt ttcctgcagg tttttgttct 2820gtgcagttgg gttaagaata ctgggcaatt tcatgtttct tcaacaccac atatgcgtat 2880atataccaat ctaagtctgt gctccttcct 
tcgttcttcc ttctgctcgg agattaccga 2940atcaaagcta gcttatcgat gataagctgt caaagatgag aattaattcc acggactata 3000gactatacta gatactccgt ctactgtacg atacacttcc gctcaggtcc ttgtccttta 3060acgaggcctt accactcttt tgttactcta ttgatccagc tcagcaaagg cagtgtgatc 3120taagattcta tcttcgcgat gtagtaaaac tagctagacc gagaaagaga ctagaaatgc 3180aaaaggcact tctacaatgg ctgccatcat tattatccga tgtgacgctg cagcttctca 3240atgatattcg aatacgcttt gaggagatac agcctaatat ccgacaaact gttttacaga 3300tttacgatcg tacttgttac ccatcattga attttgaaca tccgaacctg ggagttttcc 3360ctgaaacaga tagtatattt gaacctgtat aataatatat agtctagcgc tttacggaag 3420acaatgtatg tatttcggtt cctggagaaa ctattgcatc tattgcatag gtaatcttgc 3480acgtcgcatc cccggttcat tttctgcgtt tccatcttgc acttcaatag catatctttg 3540ttaacgaagc atctgtgctt cattttgtag aacaaaaatg caacgcgaga gcgctaattt 3600ttcaaacaaa gaatctgagc tgcattttta cagaacagaa atgcaacgcg aaagcgctat 3660tttaccaacg aagaatctgt gcttcatttt tgtaaaacaa aaatgcaacg cgacgagagc 3720gctaattttt caaacaaaga atctgagctg catttttaca gaacagaaat gcaacgcgag 3780agcgctattt taccaacaaa gaatctatac ttcttttttg ttctacaaaa atgcatcccg 3840agagcgctat ttttctaaca aagcatctta gattactttt tttctccttt gtgcgctcta 3900taatgcagtc tcttgataac tttttgcact gtaggtccgt taaggttaga agaaggctac 3960tttggtgtct attttctctt ccataaaaaa agcctgactc cacttcccgc gtttactgat 4020tactagcgaa gctgcgggtg cattttttca agataaaggc atccccgatt atattctata 4080ccgatgtgga ttgcgcatac tttgtgaaca gaaagtgata gcgttgatga ttcttcattg 4140gtcagaaaat tatgaacggt ttcttctatt ttgtctctat atactacgta taggaaatgt 4200ttacattttc gtattgtttt cgattcactc tatgaatagt tcttactaca atttttttgt 4260ctaaagagta atactagaga taaacataaa aaatgtagag gtcgagttta gatgcaagtt 4320caaggagcga aaggtggatg ggtaggttat atagggatat agcacagaga tatatagcaa 4380agagatactt ttgagcaat 4399448762DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 44gtttgtggaa gcggtattcg caatcattta gtcgtgcaat gtatgacttt aagatttgtg 60agcaggaaga aaagggagaa tcttctaacg ataaaccctt gaaaaactgg gtagactacg 120ctatgttgag ttgctacgca 
ggctgcacaa ttacacgaga atgctcccgc ctaggattta 180aggctaaggg acgtgcaatg cagacgacag atctaaatga ccgtgtcggt gaagtgttcg 240ccaaactttt cggttaacac atgcagtgat gcacgcgcga tggtgctaag ttacatatat 300atatatatat atatatatat atatatatag ccatagtgat gtctaagtaa cctttatggt 360atatttctta atgtggaaag atactagcgc gcgcacccac acacaagctt cgtcttttct 420tgaagaaaag aggaagctcg ctaaatggga ttccactttc cgttccctgc cagctgatgg 480aaaaaggtta gtggaacgat gaagaataaa aagagagatc cactgaggtg aaatttcagc 540tgacagcgag tttcatgatc gtgatgaaca atggtaacga gttgtggctg ttgccaggga 600gggtggttct caacttttaa tgtatggcca aatcgctact tgggtttgtt atataacaaa 660gaagaaataa tgaactgatt ctcttcctcc ttcttgtcct ttcttaattc tgttgtaatt 720accttccttt gtaatttttt ttgtaattat tcttcttaat aatccaaaca aacacacata 780ttacaataat gaagaagccc gagctgaccg ctacctctgt tgagaagttc ctgattgaga 840agtttgattc cgtttccgac ctgatgcagc tgtccgaggg cgaggagtct cgagccttct 900cctttgacgt gggcggacga ggttacgttc tgcgagtgaa ctcgtgtgcc gacggcttct 960acaaggatcg atacgtctac cgacactttg cttctgccgc tctgcccatc cctgaggttc 1020tcgacattgg cgagttctct gagtccctca cctactgcat ctctcgacga gctcagggag 1080tcaccctgca ggacctccct gagactgagc tgcctgctgt cctccagcct gttgctgagg 1140ccatggacgc tatcgctgct gctgatctgt cccagacctc gggtttcggc ccctttggac 1200ctcagggaat tggacagtac accacttggc gagacttcat ctgtgctatt gccgatcctc 1260acgtctacca ttggcagacc gttatggacg atactgtgtc ggcttctgtc gctcaggctc 1320tggacgagct gatgctctgg gccgaggatt gccccgaggt tcgacacctg gtgcatgctg 1380acttcggttc caacaacgtt ctcaccgaca acggccgaat cactgccgtg attgactggt 1440ccgaggctat gtttggcgac tcgcagtacg aggtggccaa catcttcttt tggcgaccct 1500ggctggcttg tatggagcag cagacccgat acttcgagcg acgacatcct gagctcgctg 1560gatcccctcg actgcgagct tacatgctcc gaattggtct ggaccagctc taccagtcgc 1620tggtggatgg caactttgac gatgctgcct gggctcaggg acgatgtgac gccatcgtgc 1680gatctggcgc tggaaccgtc ggacgaactc agattgcccg acgatccgct gctgtctgga 1740ccgacggatg cgtggaggtc ctggctgatt cgggtaaccg acgaccctct actcgacctc 1800gagctaagga gtaataaacg gcgcgccgtc tgaagaatga atgatttgat gatttctttt 
1860tccctccatt tttcttactg aatatatcaa tgatatagac ttgtatagtt tattatttca 1920aattaagtag ctatatatag tcaagataac gtttgtttga cacgattaca ttattcgtcg 1980acatcttttt tcagcctgtc gtggtagcaa tttgaggagt attattaatt gaataggttc 2040attttgcgct cgcataaaca gttttcgtca gggacagtat gttggaatga gtggtaatta 2100atggtgacat gacatgttat agcaataacc ttgatgttta catcgtagtt taatgtacac 2160cccgcgaatt cgttcaagta ggagtgcacc aattgcaaag ggaaaagctg aatgggcagt 2220tcgaatagta ctttttccat aggctccgcc cccctgacga gcatcacaaa aatcgacgct 2280caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt ccccctggaa 2340gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg tccgcctttc 2400tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc agttcggtgt 2460aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg 2520ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta tcgccactgg 2580cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct acagagttct 2640tgaagtggtg gcctaactac ggctacacta gaagaacagt atttggtatc tgcgctctgc 2700tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa caaaccaccg 2760ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa aaaggatctc 2820aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa aactcacgtt 2880aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa 2940aatgaagttt taaatcaatc taaagtatat atgagtaaac ttggtctgac agttaccaat 3000gcttaatcag tgaggcacct atctcagcga tctgtctatt tcgttcatcc atagttgcct 3060gactccccgt cgtgtagata actacgatac gggagggctt accatctggc cccagtgctg 3120caatgatacc gcgagaccca cgctcaccgg ctccagattt atcagcaata aaccagccag 3180ccggaagggc cgagcgcaga agtggtcctg caactttatc cgcctccatc cagtctatta 3240attgttgccg ggaagctaga gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg 3300ccattgctac aggcatcgtg gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg 3360gttcccaacg atcaaggcga gttacatgat cccccatgtt gtgcaaaaaa gcggttagct 3420ccttcggtcc tccgatcgtt gtcagaagta agttggccgc agtgttatca ctcatggtta 3480tggcagcact gcataattct cttactgtca tgccatccgt aagatgcttt tctgtgactg 3540gtgagtactc aaccaagtca ttctgagaat 
agtgtatgcg gcgaccgagt tgctcttgcc 3600cggcgtcaat acgggataat accgcgccac atagcagaac tttaaaagtg ctcatcattg 3660gaaaacgttc ttcggggcga aaactctcaa ggatcttacc gctgttgaga tccagttcga 3720tgtaacccac tcgtgcaccc aactgatctt cagcatcttt tactttcacc agcgtttctg 3780ggtgagcaaa aacaggaagg caaaatgccg caaaaaaggg aataagggcg acacggaaat 3840gttgaatact catactcttc ctttttcaat attattgaag catttatcag ggttattgtc 3900tcatgagcgg atacatattt gaatgtattt agaaaaataa acagcgatcg cagctagctc 3960gtcgtgttca ggaactgttc gatggttcgg agagagtcgc cgcccagaac atacgcgcac 4020cgatgtcagc agacagcctt attacaagta tattcaagca agtatatccg tagggtgcgg 4080gtgatttgga tctaaggttc gtactcaaca ctcacgagca gcttgcctat gttacatcct 4140tttatcagac ataacataat tggagtttac ttacacacgg ggtgtacctg tatgagcacc 4200acctacaatt gtagcactgg tacttgtaca aagaatttat tcgtacgaat cacagggacg 4260gccgccctca ccgaaccagc gaatacctca gcggtcccct gcagtgactc aacaaagcga 4320tatgaacatc ttgcgatggt atcctgctga tagtttttac tgtacaaaca cctgtgtagc 4380tccttctagc atttttaagt tattcacacc tcaaggggag ggataaatta aataaattcc 4440aaaagcgaag atcgagaaac taaattaaaa ttccaaaaac gaagttggaa cacaaccccc 4500cgaaaaaaaa caacaaacaa aaaacccaac aaaataaaca aaaacaaaat aaatatataa 4560ctaccagtat ctgactaaaa gttcaaatac tcgtacttac aacaaataga aatgagccgg 4620ccaaaattct gcagaaaaaa atttcaaaca agtactggta taattaaatt aaaaaacaca 4680tcaaagtatc ataacgttag ttattttatt ttatttaata aaagaaaaca acaagatggg 4740ctcaaaactt tcaacttata cgatacatac caaataacaa tttagtattt atctaagtgc 4800ttttcgtaga taatggaata caaatggata tccagagtat acacatggat agtatacact 4860gacacgacaa ttctgtatct ctttatgtta actactgtga ggcattaaat agagcttgat 4920atataaaatg ttacatttca cagtctgaac ttttgcagat tacctaattt ggtaagatat 4980taattatgaa ctgaaagttg atggcggccg catagcttca aaatgtttct actccttttt 5040tactcttcca gattttctcg gactccgcgc atcgccgtac cacttcaaaa cacccaagca 5100cagcatacta aatttcccct ctttcttcct ctagggtgtc gttaattacc cgtactaaag 5160gtttggaaaa gaaaaaagag accgcctcgt ttctttttct tcgtcgaaaa aggcaataaa 5220aatttttatc acgtttcttt ttcttgaaaa tttttttttt tgattttttt ctctttcgat 
5280gacctcccat tgatatttaa gttaataaac ggtcttcaat ttctcaagtt tcagtttcat 5340ttttcttgtt ctattacaac tttttttact tcttgctcat tagaaagaaa gcatagcaat 5400ctaatctaag ttttaattac aaaatgacca ctctggatga caccgcttac cgataccgaa 5460cttccgttcc tggcgatgcc gaggctattg aggctctgga tggatctttc accactgaca 5520ccgttttccg agtgaccgct actggcgacg gcttcaccct gcgagaggtg cctgtcgacc 5580ctcctctcac caaggttttc cctgacgatg agtcggacga tgagtctgac gctggagagg 5640acggcgaccc tgactctcga actttcgtgg cttacggcga cgatggagac ctggccggct 5700ttgtggtcgt ttcttactcc ggatggaacc gacgactgac cgtggaggac atcgaggtcg 5760ctcctgagca ccgaggtcat ggtgtcggac gagctctgat gggtctcgct actgagttcg 5820ctcgagagcg aggtgctggc cacctgtggc tcgaggtcac caacgttaac gcccctgcta 5880ttcatgccta ccgacgaatg ggttttaccc tgtgtggcct cgatactgcc ctgtacgacg 5940gaaccgcttc cgatggagag caggccctct acatgtcgat gccctgccct taaacaggcc 6000ccttttcctt tgtcgatatc atgtaattag ttatgtcacg cttacattca cgccctcctc 6060ccacatccgc tctaaccgaa aaggaaggag ttagacaacc tgaagtctag gtccctattt 6120atttttttta atagttatgt tagtattaag aacgttattt atatttcaaa tttttctttt 6180ttttctgtac aaacgcgtgt acgcatgtaa cattatactg aaaaccttgc ttgagaaggt 6240tttgggacgc tcgaaggctt taatttgcgg gtaataactg atataattaa attgaagctc 6300taatttgtga gtttagtata catgcattta cttataatac agttttttag ttttgctggc 6360cgcatcttct caaatatgct tcccagcctg cttttctgta acgttcaccc tctaccttag 6420catcccttcc ctttgcaaat agtcctcttc caacaataat aatgtcagat cctgtagaga 6480ccacatcatc cacggttcta tactgttgac ccaatgcgtc tcccttgtca tctaaaccca 6540caccgggtgt cataatcaac caatcgtaac cttcatctct tccacccatg tctctttgag 6600caataaagcc gataacaaaa tctttgtcgc tcttcgcaat gtcaacagta cccttagtat 6660attctccagt agctagggag cccttgcatg acaattctgc taacatcaaa aggcctctag 6720gttcctttgt tacttcttcc gccgcctgct tcaaaccgct aacaatacct gggcccacca 6780caccgtgtgc attcgtaatg tctgcccatt ctgctattct gtatacaccc gcagagtact 6840gcaatttgac tgtattacca atgtcagcaa attttctgtc ttcgaagagt aaaaaattgt 6900acttggcgga taatgccttt agcggcttaa ctgtgccctc catggaaaaa tcagtcaaga 6960tatccacatg tgtttttagt aaacaaattt 
tgggacctaa tgcttcaact aactccagta 7020attccttggt ggtacgaaca tccaatgaag cacacaagtt tgtttgcttt tcgtgcatga 7080tattaaatag cttggcagca acaggactag gatgagtagc agcacgttcc ttatatgtag 7140ctttcgacat gatttatctt cgtttcctgc aggtttttgt tctgtgcagt tgggttaaga 7200atactgggca atttcatgtt tcttcaacac cacatatgcg tatatatacc aatctaagtc 7260tgtgctcctt ccttcgttct tccttctgct cggagattac cgaatcaaag ctagcttatc 7320gatgataagc tgtcaaagat gagaattaat tccacggact atagactata ctagatactc 7380cgtctactgt acgatacact tccgctcagg tccttgtcct ttaacgaggc cttaccactc 7440ttttgttact ctattgatcc agctcagcaa aggcagtgtg atctaagatt ctatcttcgc 7500gatgtagtaa aactagctag accgagaaag agactagaaa tgcaaaaggc acttctacaa 7560tggctgccat cattattatc cgatgtgacg ctgcagcttc tcaatgatat tcgaatacgc 7620tttgaggaga tacagcctaa tatccgacaa actgttttac agatttacga tcgtacttgt 7680tacccatcat tgaattttga acatccgaac ctgggagttt tccctgaaac agatagtata 7740tttgaacctg tataataata tatagtctag cgctttacgg aagacaatgt atgtatttcg 7800gttcctggag aaactattgc atctattgca taggtaatct tgcacgtcgc atccccggtt 7860cattttctgc gtttccatct tgcacttcaa tagcatatct ttgttaacga agcatctgtg 7920cttcattttg tagaacaaaa atgcaacgcg agagcgctaa tttttcaaac aaagaatctg 7980agctgcattt ttacagaaca gaaatgcaac gcgaaagcgc tattttacca acgaagaatc 8040tgtgcttcat ttttgtaaaa caaaaatgca acgcgacgag agcgctaatt tttcaaacaa 8100agaatctgag ctgcattttt acagaacaga aatgcaacgc gagagcgcta ttttaccaac 8160aaagaatcta tacttctttt ttgttctaca aaaatgcatc ccgagagcgc tatttttcta 8220acaaagcatc ttagattact ttttttctcc tttgtgcgct ctataatgca gtctcttgat 8280aactttttgc actgtaggtc cgttaaggtt agaagaaggc tactttggtg tctattttct 8340cttccataaa aaaagcctga ctccacttcc cgcgtttact gattactagc gaagctgcgg 8400gtgcattttt tcaagataaa ggcatccccg attatattct ataccgatgt ggattgcgca 8460tactttgtga acagaaagtg atagcgttga tgattcttca ttggtcagaa aattatgaac 8520ggtttcttct attttgtctc tatatactac gtataggaaa tgtttacatt ttcgtattgt 8580tttcgattca ctctatgaat agttcttact acaatttttt tgtctaaaga gtaatactag 8640agataaacat aaaaaatgta gaggtcgagt ttagatgcaa gttcaaggag cgaaaggtgg 
8700atgggtaggt tatataggga tatagcacag agatatatag caaagagata cttttgagca 8760at 8762455824DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 45gtttgtggaa gcggtattcg caatttaatt aaagctggtg acaattaatc atcggctcgt 60ataatgtgtg gaattgaatc gatataagga ggttaatcat atgcaaacgc tcagcatcca 120gcacggtacc ctcgtcacga tggatcagta ccgcagagtc cttggggata gctgggttca 180cgtgcaggat ggacggatcg tcgcgctcgg agtgcacgcc gagtcggtgc ctccgccagc 240ggatcgggtg atcgatgcac gcggcaaggt cgtgttaccc ggtttcatca atgcccacac 300ccatgtgaac cagatcctcc tgcgcggagg gccctcgcac gggcgtcaac tctatgactg 360gctgttcaac gttttgtatc cgggacaaaa ggcgatgaga ccggaggacg tagcggtggc 420ggtgaggttg tattgtgcgg aagctgtgcg cagcgggatt acgacgatca acgacaacgc 480cgattcggcc atctacccag gcaacatcga ggccgcgatg gcggtctatg gtgaggtggg 540tgtgagggtc gtctacgccc gcatgttctt tgatcggatg gacgggcgca ttcaagggta 600tgtggacgcc ttgaaggctc gctctcccca agtcgaactg tgctcgatca tggaggaaac 660ggctgtggcc aaagatcgga tcacagccct gtcagatcag tatcatggca cggcaggagg 720tcgtatatca gtttggcccg ctcctgccat taccccggcg gtgacagttg aaggaatgcg 780atgggcacaa gccttcgccc gtgatcgggc ggtaatgtgg acgcttcaca tggcggagag 840cgatcatgat gagcggcttc attggatgag tcccgccgag tacatggagt gttacggact 900cttggatgag cgtctgcagg tcgcgcattg cgtgtacttt gaccggaagg atgttcggct 960gctgcaccgc cacaatgtga aggtcgcgtc gcaggttgtg agcaatgcct acctcggctc 1020aggggtggcc cccgtgccag agatggtgga gcgcggcatg gccgtgggca ttggaacaga 1080tgacgggaat tgtaatgact ccgtaaacat gatcggagac atgaagttta tggcccatat 1140tcaccgcgcg gtgcatcggg atgcggacgt gctgacccca gagaagattc ttgaaatggc 1200gacgatcgat ggggcgcgtt cgttgggaat ggaccacgag attggttcca tcgaaaccgg 1260caagcgcgcg gaccttatcc tgcttgacct gcgtcaccct cagacgactc ctcaccatca 1320tttggcggcc acgatcgtgt ttcaggctta cggcaatgag gtggacactg tcctgattga 1380cggaaacgtt gtgatggaga accgccgctt gagctttctt ccccctgaac gtgagttggc 1440gttccttgag gaagcgcaga gccgcgccac agctattttg cagcgggcga acatggtggc 1500taacccagct tggcgcagcc tctaggttta aaccctcaaa atatattttc cctctatctt 1560ctcgttgcgc ttaatttgac 
taattctcat tagcgaggcg cgcctttcca taggctccgc 1620ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga 1680ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc 1740ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcat 1800agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg 1860cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg tcttgagtcc 1920aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga 1980gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact 2040agaagaacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt 2100ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag 2160cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg 2220tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag attatcaaaa 2280aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata 2340tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc tatctcagcg 2400atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat aactacgata 2460cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc acgctcaccg 2520gctccagatt tatcagcaat aaaccagcca gccggaaggg ccgagcgcag aagtggtcct 2580gcaactttat ccgcctccat ccagtctatt aattgttgcc gggaagctag agtaagtagt 2640tcgccagtta atagtttgcg caacgttgtt gccattgcta caggcatcgt ggtgtcacgc 2700tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gatcaaggcg agttacatga 2760tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt tgtcagaagt
2820aagttggccg cagtgttatc actcatggtt atggcagcac tgcataattc tcttactgtc 2880atgccatccg taagatgctt ttctgtgact ggtgagtact caaccaagtc attctgagaa 2940tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca 3000catagcagaa ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg aaaactctca 3060aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc caactgatct 3120tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc 3180gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt cctttttcaa 3240tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt 3300tagaaaaata aacagcgatc gcgcggccgc gggtaataac tgatataatt aaattgaagc 3360tctaatttgt gagtttagta tacatgcatt tacttataat acagtttttt agttttgctg 3420gccgcatctt ctcaaatatg cttcccagcc tgcttttctg taacgttcac cctctacctt 3480agcatccctt ccctttgcaa atagtcctct tccaacaata ataatgtcag atcctgtaga 3540gaccacatca tccacggttc tatactgttg acccaatgcg tctcccttgt catctaaacc 3600cacaccgggt gtcataatca accaatcgta accttcatct cttccaccca tgtctctttg 3660agcaataaag ccgataacaa aatctttgtc gctcttcgca atgtcaacag tacccttagt 3720atattctcca gtagctaggg agcccttgca tgacaattct gctaacatca aaaggcctct 3780aggttccttt gttacttctt ccgccgcctg cttcaaaccg ctaacaatac ctgggcccac 3840cacaccgtgt gcattcgtaa tgtctgccca ttctgctatt ctgtatacac ccgcagagta 3900ctgcaatttg actgtattac caatgtcagc aaattttctg tcttcgaaga gtaaaaaatt 3960gtacttggcg gataatgcct ttagcggctt aactgtgccc tccatggaaa aatcagtcaa 4020gatatccaca tgtgttttta gtaaacaaat tttgggacct aatgcttcaa ctaactccag 4080taattccttg gtggtacgaa catccaatga agcacacaag tttgtttgct tttcgtgcat 4140gatattaaat agcttggcag caacaggact aggatgagta gcagcacgtt ccttatatgt 4200agctttcgac atgatttatc ttcgtttcct gcaggttttt gttctgtgca gttgggttaa 4260gaatactggg caatttcatg tttcttcaac accacatatg cgtatatata ccaatctaag 4320tctgtgctcc ttccttcgtt cttccttctg ctcggagatt accgaatcaa agctagctta 4380tcgatgataa gctgtcaaag atgagaatta attccacgga ctatagacta tactagatac 4440tccgtctact gtacgataca cttccgctca ggtccttgtc ctttaacgag gccttaccac 4500tcttttgtta ctctattgat ccagctcagc 
aaaggcagtg tgatctaaga ttctatcttc 4560gcgatgtagt aaaactagct agaccgagaa agagactaga aatgcaaaag gcacttctac 4620aatggctgcc atcattatta tccgatgtga cgctgcagct tctcaatgat attcgaatac 4680gctttgagga gatacagcct aatatccgac aaactgtttt acagatttac gatcgtactt 4740gttacccatc attgaatttt gaacatccga acctgggagt tttccctgaa acagatagta 4800tatttgaacc tgtataataa tatatagtct agcgctttac ggaagacaat gtatgtattt 4860cggttcctgg agaaactatt gcatctattg cataggtaat cttgcacgtc gcatccccgg 4920ttcattttct gcgtttccat cttgcacttc aatagcatat ctttgttaac gaagcatctg 4980tgcttcattt tgtagaacaa aaatgcaacg cgagagcgct aatttttcaa acaaagaatc 5040tgagctgcat ttttacagaa cagaaatgca acgcgaaagc gctattttac caacgaagaa 5100tctgtgcttc atttttgtaa aacaaaaatg caacgcgacg agagcgctaa tttttcaaac 5160aaagaatctg agctgcattt ttacagaaca gaaatgcaac gcgagagcgc tattttacca 5220acaaagaatc tatacttctt ttttgttcta caaaaatgca tcccgagagc gctatttttc 5280taacaaagca tcttagatta ctttttttct cctttgtgcg ctctataatg cagtctcttg 5340ataacttttt gcactgtagg tccgttaagg ttagaagaag gctactttgg tgtctatttt 5400ctcttccata aaaaaagcct gactccactt cccgcgttta ctgattacta gcgaagctgc 5460gggtgcattt tttcaagata aaggcatccc cgattatatt ctataccgat gtggattgcg 5520catactttgt gaacagaaag tgatagcgtt gatgattctt cattggtcag aaaattatga 5580acggtttctt ctattttgtc tctatatact acgtatagga aatgtttaca ttttcgtatt 5640gttttcgatt cactctatga atagttctta ctacaatttt tttgtctaaa gagtaatact 5700agagataaac ataaaaaatg tagaggtcga gtttagatgc aagttcaagg agcgaaaggt 5760ggatgggtag gttatatagg gatatagcac agagatatat agcaaagaga tacttttgag 5820caat 5824468336DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 46gtttgtggaa gcggtattcg caatttaatt aaagctggtg acaattaatc atcggctcgt 60ataatgtgtg gaattgaatc gatataagga ggttaatcat atgactagaa tcgctatcac 120aggtggtaga gttttgacta tggacccaga aagaagagta ttagaaccag gtacagttgt 180tgttgaagat caattcattg cacaagtcgg ttcaccagat gacgtagaca tcagaggtgc 240tgaaattata gatgccactg gtatggctgt attaccaggt ttcgttaata cacataccca 300cgttcctcaa attttgttaa gaggtggtgc ttcacatgat 
agaaatttgt tggaatggtt 360gcacaacgtc ttatatccag gtttggctgc atacactgat gacgatatca gagttggtac 420attgttatat tgtgctgaag cattgagatc cggtattact acagttgtcg acaatgaaga 480tgttagacct aacgattttg ccagagctgg tgccgctggt attggtgcat tcactgatgc 540cggtatcaga gcaatctatg ccagaatgta ctttgatgct ccaagagcag aattggaaga 600attagtcgca acaatacatg caaaagcccc tggtgccgta agaatggacg aatctgcttc 660aaccgatcat gttttggcag acttagatca attgattacc agacatgaca gaactgctga 720tggtagaatt agagtatggc cagctcctgc aataccattc atggtttctg aaaagggtat 780gaaggcagcc caagaaatag ctgcatccag aactgacggt tggacaatgc atgttagtga 840agatccaatc gaagccagag tccactctat gaatgctcct gaatatttgc atcacttggg 900ttgtttagac gatagattgt tagccgctca ttgcgttcac atagactcaa gagatatcag 960attgtttaga caacatgatg ttaagatatc cacacaacct gtctccaata gttacttagc 1020agccggtata gcaccagttc ctgaaatgtt ggctcatggt gtcacagtag gtattggtac 1080cgacgatgct aattgtaacg actccgtaaa cttaatcagt gatatgaagg ttttggcatt 1140gatacataga gctgcacaca gagatgctag tatcattacc ccagaaaaga taatcgaaat 1200ggccactatt gacggtgcta gatgcattgg tatggctgat caaatcggtt ctttggaagc 1260tggtaaaaga gcagacataa tcactttgga tttgagacat gcacaaacca ctcctgccca 1320cgatttggcc gctacaattg tctttcaagc ttatggtaat gaagtaaacg atgttttggt 1380caacggttct gtagttatga gagatagagt tttgtcattc ttaccaaccc ctcaagaaga 1440aaaggcttta tacgacgatg catctgaaag atcagcagcc atgttagcca gagctggttt 1500gactggtaca agaacctggc aaactttggg ttcttaagga aatccattat gatgtcagga 1560gaacacacgt taaaagcggt acgaggcagt tttattgatg tcacccgtac gatcgataac 1620ccggaagaga ttgcctctgc gctgcggttt attgaggatg gtttattact cattaaacag 1680ggaaaagtgg aatggtttgg cgaatgggaa aacggaaagc atcaaattcc tgacaccatt 1740cgcgtgcgcg actatcgcgg caaactgata gtaccgggct ttgtcgatac acatatccat 1800tatccgcaaa gtgaaatggt gggggcctat ggtgagcaat tgctggagtg gttgaataaa 1860cacaccttcc ctactgaacg tcgttatgag gatttagagt acgcccgcga aatgtcggcg 1920ttcttcatca agcagctttt acgtaacgga accaccacgg cgctggtgtt tggcactgtt 1980catccgcaat ctgttgatgc gctgtttgaa gccgccagtc atatcaatat gcgtatgatt 2040gccggtaagg tgatgatgga 
ccgcaacgca ccggattatc tgctcgacac tgccgaaagc 2100agctatcacc aaagcaaaga actgatcgaa cgctggcaca aaaatggtcg tctgctatat 2160gcgattacgc cacgcttcgc cccgacctca tctcctgaac agatggcgat ggcgcaacgc 2220ctgaaagaag aatatccgga tacgtgggta catacccatc tctgtgaaaa caaagatgaa 2280attgcctggg tgaaatcgct ttatcctgac catgatggtt atctggatgt ttaccatcag 2340tacggcctga ccggtaaaaa ctgtgtcttt gctcactgcg tccatctcga agaaaaagag 2400tgggatcgtc tcagcgaaac caaatccagc attgctttct gtccgacctc caacctttac 2460ctcggcagcg gcttattcaa cttgaaaaaa gcatggcaga agaaagttaa agtgggcatg 2520ggaacggata tcggtgccgg aaccactttc aacatgctgc aaacgctgaa cgaagcctac 2580aaagtattgc aattacaagg ctatcgcctc tcggcatatg aagcgtttta cctggccacg 2640ctcggcggag cgaaatctct gggccttgac gatttgattg gcaacttttt acctggcaaa 2700gaggctgatt tcgtggtgat ggaacccacc gccactccgc tacagcagct gcgctatgac 2760aactctgttt ctttagtcga caaattgttc gtgatgatga cgttgggcga tgaccgttcg 2820atctaccgca cctacgttga tggtcgtctg gtgtacgaac gcaactaagg aacgaccatg 2880agagaagtcc aattgttaga tggtagaaga gttgatgtcg cctgtgctgg tcctttgatt 2940agtgaaatag gtgcccactt agatttgact gctccagttg aaattgattg tggtggtggt 3000ttagcaacta gaccttttac tgaacctcat ttgcacttag acaaagcagg tactgccgat 3060agattgcctg ccggtgcttc cacaatcggt gacgctattg ctgcaatgca aagtgtcaag 3120gtaaccgaaa gagataatgt cgccgctgta gcagccagaa tgcatagagt tttaaacaga 3180atcgtcgatg acggttccca cgctattaga gcattggttg atgtcgacga agtttggggt 3240ttaacagctt ttcatgctgc acaacaagtc caagccgctt tggccccaag agctgttgtc 3300caaattgtcg ctttcccaca acacggttta acccctcaag tattggcaat gttagaacaa 3360gcagccgctg aaggtgcagg tgccttgggt gctcatactg atgttgaccc agatcctgca 3420gcccacgttg gtgccgtcgc tgcaatagcc gctggtgctt ccttgccatt agaagttcat 3480actgacgaag gtgctagtcc agataaattt tatttgcctg cagtattgga agttttagat 3540agattcccag gtttgtctac tacattagct cattgtttgt cattaggtac aattgcacct 3600aagcaacaac aacattggat cgaagaatta gctcacagag atatcaaagt atgcgttgca 3660ccatctattt tgggtttcgg tttgccatta gcacctgtta gagccttaat agaagctggt 3720gtcggtatct tagtaggttc agacaatttg caagatgttt tctttccttt 
gggtacaggt 3780agagcaattg aaaacgttag attgttagcc accgcagccc aattaactgc accagaattg 3840gccggtcctt taattgctgg tgtaaccgac atagcttacg caaccgttac tggtgctgca 3900gatgccttgg ctgttgaatc tccagctaca ttagtagttc atgatgctac ctcacctgca 3960gaattgttaa gaggtataga cggtacaaga attaccgtta tagatggttt gttgacatct 4020ccattgcaat tggataaagg tatcaagtaa gtttaaacta atcccacagc cgccagttcc 4080gctggcggca ttttaacttt ctttaatggg cgcgcctttc cataggctcc gcccccctga 4140cgagcatcac aaaaatcgac gctcaagtca gaggtggcga aacccgacag gactataaag 4200ataccaggcg tttccccctg gaagctccct cgtgcgctct cctgttccga ccctgccgct 4260taccggatac ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc atagctcacg 4320ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc 4380ccccgttcag cccgaccgct gcgccttatc cggtaactat cgtcttgagt ccaacccggt 4440aagacacgac ttatcgccac tggcagcagc cactggtaac aggattagca gagcgaggta 4500tgtaggcggt gctacagagt tcttgaagtg gtggcctaac tacggctaca ctagaagaac 4560agtatttggt atctgcgctc tgctgaagcc agttaccttc ggaaaaagag ttggtagctc 4620ttgatccggc aaacaaacca ccgctggtag cggtggtttt tttgtttgca agcagcagat 4680tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc 4740tcagtggaac gaaaactcac gttaagggat tttggtcatg agattatcaa aaaggatctt 4800cacctagatc cttttaaatt aaaaatgaag ttttaaatca atctaaagta tatatgagta 4860aacttggtct gacagttacc aatgcttaat cagtgaggca cctatctcag cgatctgtct 4920atttcgttca tccatagttg cctgactccc cgtcgtgtag ataactacga tacgggaggg 4980cttaccatct ggccccagtg ctgcaatgat accgcgagac ccacgctcac cggctccaga 5040tttatcagca ataaaccagc cagccggaag ggccgagcgc agaagtggtc ctgcaacttt 5100atccgcctcc atccagtcta ttaattgttg ccgggaagct agagtaagta gttcgccagt 5160taatagtttg cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt 5220tggtatggct tcattcagct ccggttccca acgatcaagg cgagttacat gatcccccat 5280gttgtgcaaa aaagcggtta gctccttcgg tcctccgatc gttgtcagaa gtaagttggc 5340cgcagtgtta tcactcatgg ttatggcagc actgcataat tctcttactg tcatgccatc 5400cgtaagatgc ttttctgtga ctggtgagta ctcaaccaag tcattctgag aatagtgtat 5460gcggcgaccg agttgctctt 
gcccggcgtc aatacgggat aataccgcgc cacatagcag 5520aactttaaaa gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct caaggatctt 5580accgctgttg agatccagtt cgatgtaacc cactcgtgca cccaactgat cttcagcatc 5640ttttactttc accagcgttt ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa 5700gggaataagg gcgacacgga aatgttgaat actcatactc ttcctttttc aatattattg 5760aagcatttat cagggttatt gtctcatgag cggatacata tttgaatgta tttagaaaaa 5820taaacagcga tcgcgcggcc gcgggtaata actgatataa ttaaattgaa gctctaattt 5880gtgagtttag tatacatgca tttacttata atacagtttt ttagttttgc tggccgcatc 5940ttctcaaata tgcttcccag cctgcttttc tgtaacgttc accctctacc ttagcatccc 6000ttccctttgc aaatagtcct cttccaacaa taataatgtc agatcctgta gagaccacat 6060catccacggt tctatactgt tgacccaatg cgtctccctt gtcatctaaa cccacaccgg 6120gtgtcataat caaccaatcg taaccttcat ctcttccacc catgtctctt tgagcaataa 6180agccgataac aaaatctttg tcgctcttcg caatgtcaac agtaccctta gtatattctc 6240cagtagctag ggagcccttg catgacaatt ctgctaacat caaaaggcct ctaggttcct 6300ttgttacttc ttccgccgcc tgcttcaaac cgctaacaat acctgggccc accacaccgt 6360gtgcattcgt aatgtctgcc cattctgcta ttctgtatac acccgcagag tactgcaatt 6420tgactgtatt accaatgtca gcaaattttc tgtcttcgaa gagtaaaaaa ttgtacttgg 6480cggataatgc ctttagcggc ttaactgtgc cctccatgga aaaatcagtc aagatatcca 6540catgtgtttt tagtaaacaa attttgggac ctaatgcttc aactaactcc agtaattcct 6600tggtggtacg aacatccaat gaagcacaca agtttgtttg cttttcgtgc atgatattaa 6660atagcttggc agcaacagga ctaggatgag tagcagcacg ttccttatat gtagctttcg 6720acatgattta tcttcgtttc ctgcaggttt ttgttctgtg cagttgggtt aagaatactg 6780ggcaatttca tgtttcttca acaccacata tgcgtatata taccaatcta agtctgtgct 6840ccttccttcg ttcttccttc tgctcggaga ttaccgaatc aaagctagct tatcgatgat 6900aagctgtcaa agatgagaat taattccacg gactatagac tatactagat actccgtcta 6960ctgtacgata cacttccgct caggtccttg tcctttaacg aggccttacc actcttttgt 7020tactctattg atccagctca gcaaaggcag tgtgatctaa gattctatct tcgcgatgta 7080gtaaaactag ctagaccgag aaagagacta gaaatgcaaa aggcacttct acaatggctg 7140ccatcattat tatccgatgt gacgctgcag cttctcaatg atattcgaat 
acgctttgag 7200gagatacagc ctaatatccg acaaactgtt ttacagattt acgatcgtac ttgttaccca 7260tcattgaatt ttgaacatcc gaacctggga gttttccctg aaacagatag tatatttgaa 7320cctgtataat aatatatagt ctagcgcttt acggaagaca atgtatgtat ttcggttcct 7380ggagaaacta ttgcatctat tgcataggta atcttgcacg tcgcatcccc ggttcatttt 7440ctgcgtttcc atcttgcact tcaatagcat atctttgtta acgaagcatc tgtgcttcat 7500tttgtagaac aaaaatgcaa cgcgagagcg ctaatttttc aaacaaagaa tctgagctgc 7560atttttacag aacagaaatg caacgcgaaa gcgctatttt accaacgaag aatctgtgct 7620tcatttttgt aaaacaaaaa tgcaacgcga cgagagcgct aatttttcaa acaaagaatc 7680tgagctgcat ttttacagaa cagaaatgca acgcgagagc gctattttac caacaaagaa 7740tctatacttc ttttttgttc tacaaaaatg catcccgaga gcgctatttt tctaacaaag 7800catcttagat tacttttttt ctcctttgtg cgctctataa tgcagtctct tgataacttt 7860ttgcactgta ggtccgttaa ggttagaaga aggctacttt ggtgtctatt ttctcttcca 7920taaaaaaagc ctgactccac ttcccgcgtt tactgattac tagcgaagct gcgggtgcat 7980tttttcaaga taaaggcatc cccgattata ttctataccg atgtggattg cgcatacttt 8040gtgaacagaa agtgatagcg ttgatgattc ttcattggtc agaaaattat gaacggtttc 8100ttctattttg tctctatata ctacgtatag gaaatgttta cattttcgta ttgttttcga 8160ttcactctat gaatagttct tactacaatt tttttgtcta aagagtaata ctagagataa 8220acataaaaaa tgtagaggtc gagtttagat gcaagttcaa ggagcgaaag gtggatgggt 8280aggttatata gggatatagc acagagatat atagcaaaga gatacttttg agcaat 8336478063DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 47gtttgtggaa gcggtattcg caatttaatt aaagctggtg acaattaatc atcggctcgt 60ataatgtgtg gaattgaatc gatataagga ggttaatcat atgtcctcca cagcattata 120caccgttcct accgcaggtc cagacgatgt tgccgccttg aaagcattag atggtcattc 180cgcctccgat attttggctg taataggtaa aacagagggt aatggttgtg ttaacgactt 240tagtagaacc ttatctgctg cagtttggca tccattgtta gaagattcag ccattacagt 300cttttccggt ggtgcagaag gtgtaataag tccacatgta aacatcttcg ttagagatga 360aagacaatat tctggtcacc ctagaggttt ggtaactgct gttggtagaa caagagttat 420cggtccagaa gaaattggta gacctgctca agtcgatgca gtacatgaaa ccgttgtcgc 480attgttaact 
gaattgggtg ttggtccaga tgacgttcac ttggtcttga ttaaatgccc 540tttgttatct tcagacgcta tagcaggtgt tcatagaaga ggtttaagac ctgtcactac 600agatacttac gaatctatgt caagatccag agccgcttct gctttgggta tagccatggc 660tttaaaggaa tgtgatagag acagagcatt gttagccttg gaaggtagag atgacgtttg 720gtcagcaaga gcctccgctt ccagtggtgc tgaattggat gactgccaca ttttagtagt 780tgcagaatca gatgcagccg ctaatccatt aagagcagcc catactgcca tgagagatgc 840tttggacatc caagctttaa cagaagtttt tgacagaatt gctgcagaag gtggtaccgt 900cagacaaata ttcgcaaagg ccgaagctga tccttcaggt gctatcagag gttatagaca 960taccatgtta actgattccg acgtcaatgc aacaagacac gccagagccg ctgtaggtgg 1020tttgattgca gccttacatg gtaacggtgc tgtctatgta tcaggtggtg cagaacacca 1080aggtccaagt ggtggtggtt ctgttactgt tatatatgat gttcctgcaa cagccaacgc 1140taccggtgaa gcttctagat aaggaaatcc attatgatat actcaacagt caacgctaat 1200ccttacgctt ggccttacga tggttcaata gaccctgctc acaccgcttt aatcttaatc 1260gattggcaaa tagacttttg tggtccaggt ggttatgtcg attccatggg ttacgactta 1320tccttgacta gaagtggttt agaacctaca gcaagagtat tggctgcagc cagagatact 1380ggtatgacag ttatccatac tagagaaggt cacagaccag atttggctga cttgccacct 1440aataagagat ggagatctgc atcagccggt gctgaaatcg gttcagttgg tccatgtggt 1500agaattttag tcagaggtga acctggttgg gaaatagtac cagaagttgc acctagagaa 1560ggtgaaccaa ttatagataa acctggtaaa ggtgctttct acgcaacaga tttggacttg 1620ttgttgagaa caagaggtat cacccatttg attttgaccg gtataactac agatgtttgc 1680gtccacacca ctatgagaga agccaacgat agaggttacg aatgtttaat tttgtctgat 1740tgcaccggtg ctactgacag aaagcatcac gaagctgcat tatctatggt caccatgcaa 1800ggtggtgtat tcggtgcaac tgcccattca gatgacttat tggccgcttt gggtacaacc 1860gttccagcag ccgctggtcc tagagctaga acagaataag gaacgaccat gacagttagt 1920tccgatacaa ctgctgaaat atcgttaggt tggtcaatcc aagactggat tgatttccac 1980aagtcatcaa gctcccaggc ttcactaagg cttcttgaat cactactaga ctctcaaaat 2040gttgcgccag tcgataatgc gtggatatcg ctaatttcaa aggaaaattt actgcaccaa 2100ttccaaattt taaagagcag agaaaataaa gaaactctac ctctctacgg tgtccctatt 2160gctgttaagg acaacatcga cgttagaggt ctacccacca ccgctgcatg 
tccatccttt 2220gcatatgagc cttccaaaga ctctaaagta gtagaactac taagaaatgc aggtgcgata 2280atcgtgggta agacaaactt ggaccaattt gccacaggat tagtcggcac acggtctcca 2340tatgggaaaa caccttgcgc ttttagcaaa gagcatgtat ctggtggttc ctccgctggg 2400tcagcatcgg tggtcgccag aggtatcgta ccaattgcat tgggtactga tacagcaggt 2460tctggtagag tcccagccgc cttgaacaac ctgattggcc taaagccaac aaagggcgtc 2520ttttcctgtc aaggtgtagt tcccgcttgt aaatctttag actgcgtctc catctttgca 2580ttaaacctaa gtgatgctga acgctgcttc cgcatcatgt gccagccaga tcctgataat 2640gatgaatatt ctagacccta tgtttccaac cctttgaaaa aattttcaag caatgtaacg 2700attgctattc ctaaaaatat cccatggtat ggtgaaacca agaatcctgt actgttttcc 2760aatgctgtcg aaaatctatc aagaacgggc gctaacgtca tagaaattga ttttgagcct 2820cttttagagt tagctcgctg tttatacgaa ggtacttggg tggccgagcg ttatcaagct 2880attcaatcgt ttttggacag taaaccacca aaggaatctt tggaccctac tgttatttca 2940attatagaag gggccaagaa atacagtgca gtagactgct tcagttttga atacaaaaga 3000caaggcatct tgcaaaaagt gagacgactt ctcgaatcag tcgatgtatt gtgtgtgccc 3060acatgtcctt taaatcctac tatgcaacaa gttgcggatg aaccagtcct agtcaattca 3120agacaaggca catggactaa ttttgtcaac ttggcagatt tggcagccct tgctgttccc 3180gcagggttcc gagacgatgg tttgccaaat ggtattactt taatcggtaa aaaattcaca 3240gattacgcac tattagagtt ggctaaccgc tatttccaaa atatattccc caacggttcc 3300agaacatacg gtacttttac ctcttcttca gtaaagccag caaacgatca attagtggga 3360ccagactatg acccatctac gtccataaaa ttggctgttg tcggtgcaca tcttaagggt 3420ctgcctctac attggcaatt ggaaaaggtc aatgcaacat atttatgtac aacaaaaaca
3480tcaaaagctt accagctttt tgctttgccc aaaaatggac cagttttaaa acctggtttg 3540agaagagttc aagatagcaa tggctctcaa atcgaattag aagtgtacag tgttccaaaa 3600gaactgttcg gtgcttttat ttccatggtt cctgaaccat taggaatagg ttcagtggag 3660ttagaatctg gtgaatggat caaatccttt atttgtgaag aatctggtta caaagccaaa 3720ggtacagttg atatcacaaa gtatggtgga tttagagcat attttgaaat gttgtaagtt 3780taaactaatc ccacagccgc cagttccgct ggcggcattt taactttctt taatgggcgc 3840gcctttccat aggctccgcc cccctgacga gcatcacaaa aatcgacgct caagtcagag 3900gtggcgaaac ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt 3960gcgctctcct gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg 4020aagcgtggcg ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg 4080ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg 4140taactatcgt cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac 4200tggtaacagg attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg 4260gcctaactac ggctacacta gaagaacagt atttggtatc tgcgctctgc tgaagccagt 4320taccttcgga aaaagagttg gtagctcttg atccggcaaa caaaccaccg ctggtagcgg 4380tggttttttt gtttgcaagc agcagattac gcgcagaaaa aaaggatctc aagaagatcc 4440tttgatcttt tctacggggt ctgacgctca gtggaacgaa aactcacgtt aagggatttt 4500ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa aatgaagttt 4560taaatcaatc taaagtatat atgagtaaac ttggtctgac agttaccaat gcttaatcag 4620tgaggcacct atctcagcga tctgtctatt tcgttcatcc atagttgcct gactccccgt 4680cgtgtagata actacgatac gggagggctt accatctggc cccagtgctg caatgatacc 4740gcgagaccca cgctcaccgg ctccagattt atcagcaata aaccagccag ccggaagggc 4800cgagcgcaga agtggtcctg caactttatc cgcctccatc cagtctatta attgttgccg 4860ggaagctaga gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg ccattgctac 4920aggcatcgtg gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg gttcccaacg 4980atcaaggcga gttacatgat cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc 5040tccgatcgtt gtcagaagta agttggccgc agtgttatca ctcatggtta tggcagcact 5100gcataattct cttactgtca tgccatccgt aagatgcttt tctgtgactg gtgagtactc 5160aaccaagtca ttctgagaat agtgtatgcg 
gcgaccgagt tgctcttgcc cggcgtcaat 5220acgggataat accgcgccac atagcagaac tttaaaagtg ctcatcattg gaaaacgttc 5280ttcggggcga aaactctcaa ggatcttacc gctgttgaga tccagttcga tgtaacccac 5340tcgtgcaccc aactgatctt cagcatcttt tactttcacc agcgtttctg ggtgagcaaa 5400aacaggaagg caaaatgccg caaaaaaggg aataagggcg acacggaaat gttgaatact 5460catactcttc ctttttcaat attattgaag catttatcag ggttattgtc tcatgagcgg 5520atacatattt gaatgtattt agaaaaataa acagcgatcg cgcggccgcg ggtaataact 5580gatataatta aattgaagct ctaatttgtg agtttagtat acatgcattt acttataata 5640cagtttttta gttttgctgg ccgcatcttc tcaaatatgc ttcccagcct gcttttctgt 5700aacgttcacc ctctacctta gcatcccttc cctttgcaaa tagtcctctt ccaacaataa 5760taatgtcaga tcctgtagag accacatcat ccacggttct atactgttga cccaatgcgt 5820ctcccttgtc atctaaaccc acaccgggtg tcataatcaa ccaatcgtaa ccttcatctc 5880ttccacccat gtctctttga gcaataaagc cgataacaaa atctttgtcg ctcttcgcaa 5940tgtcaacagt acccttagta tattctccag tagctaggga gcccttgcat gacaattctg 6000ctaacatcaa aaggcctcta ggttcctttg ttacttcttc cgccgcctgc ttcaaaccgc 6060taacaatacc tgggcccacc acaccgtgtg cattcgtaat gtctgcccat tctgctattc 6120tgtatacacc cgcagagtac tgcaatttga ctgtattacc aatgtcagca aattttctgt 6180cttcgaagag taaaaaattg tacttggcgg ataatgcctt tagcggctta actgtgccct 6240ccatggaaaa atcagtcaag atatccacat gtgtttttag taaacaaatt ttgggaccta 6300atgcttcaac taactccagt aattccttgg tggtacgaac atccaatgaa gcacacaagt 6360ttgtttgctt ttcgtgcatg atattaaata gcttggcagc aacaggacta ggatgagtag 6420cagcacgttc cttatatgta gctttcgaca tgatttatct tcgtttcctg caggtttttg 6480ttctgtgcag ttgggttaag aatactgggc aatttcatgt ttcttcaaca ccacatatgc 6540gtatatatac caatctaagt ctgtgctcct tccttcgttc ttccttctgc tcggagatta 6600ccgaatcaaa gctagcttat cgatgataag ctgtcaaaga tgagaattaa ttccacggac 6660tatagactat actagatact ccgtctactg tacgatacac ttccgctcag gtccttgtcc 6720tttaacgagg ccttaccact cttttgttac tctattgatc cagctcagca aaggcagtgt 6780gatctaagat tctatcttcg cgatgtagta aaactagcta gaccgagaaa gagactagaa 6840atgcaaaagg cacttctaca atggctgcca tcattattat ccgatgtgac gctgcagctt 
6900ctcaatgata ttcgaatacg ctttgaggag atacagccta atatccgaca aactgtttta 6960cagatttacg atcgtacttg ttacccatca ttgaattttg aacatccgaa cctgggagtt 7020ttccctgaaa cagatagtat atttgaacct gtataataat atatagtcta gcgctttacg 7080gaagacaatg tatgtatttc ggttcctgga gaaactattg catctattgc ataggtaatc 7140ttgcacgtcg catccccggt tcattttctg cgtttccatc ttgcacttca atagcatatc 7200tttgttaacg aagcatctgt gcttcatttt gtagaacaaa aatgcaacgc gagagcgcta 7260atttttcaaa caaagaatct gagctgcatt tttacagaac agaaatgcaa cgcgaaagcg 7320ctattttacc aacgaagaat ctgtgcttca tttttgtaaa acaaaaatgc aacgcgacga 7380gagcgctaat ttttcaaaca aagaatctga gctgcatttt tacagaacag aaatgcaacg 7440cgagagcgct attttaccaa caaagaatct atacttcttt tttgttctac aaaaatgcat 7500cccgagagcg ctatttttct aacaaagcat cttagattac tttttttctc ctttgtgcgc 7560tctataatgc agtctcttga taactttttg cactgtaggt ccgttaaggt tagaagaagg 7620ctactttggt gtctattttc tcttccataa aaaaagcctg actccacttc ccgcgtttac 7680tgattactag cgaagctgcg ggtgcatttt ttcaagataa aggcatcccc gattatattc 7740tataccgatg tggattgcgc atactttgtg aacagaaagt gatagcgttg atgattcttc 7800attggtcaga aaattatgaa cggtttcttc tattttgtct ctatatacta cgtataggaa 7860atgtttacat tttcgtattg ttttcgattc actctatgaa tagttcttac tacaattttt 7920ttgtctaaag agtaatacta gagataaaca taaaaaatgt agaggtcgag tttagatgca 7980agttcaagga gcgaaaggtg gatgggtagg ttatataggg atatagcaca gagatatata 8040gcaaagagat acttttgagc aat 8063488927DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 48gtttgtggaa gcggtattcg caatcattta gtcgtgcaat gtatgacttt aagatttgtg 60agcaggaaga aaagggagaa tcttctaacg ataaaccctt gaaaaactgg gtagactacg 120ctatgttgag ttgctacgca ggctgcacaa ttacacgaga atgctcccgc ctaggattta 180aggctaaggg acgtgcaatg cagacgacag atctaaatga ccgtgtcggt gaagtgttcg 240ccaaactttt cggttaacac atgcagtgat gcacgcgcga tggtgctaag ttacatatat 300atatatatat atatatatat atatatatag ccatagtgat gtctaagtaa cctttatggt 360atatttctta atgtggaaag atactagcgc gcgcacccac acacaagctt cgtcttttct 420tgaagaaaag aggaagctcg ctaaatggga ttccactttc cgttccctgc 
cagctgatgg 480aaaaaggtta gtggaacgat gaagaataaa aagagagatc cactgaggtg aaatttcagc 540tgacagcgag tttcatgatc gtgatgaaca atggtaacga gttgtggctg ttgccaggga 600gggtggttct caacttttaa tgtatggcca aatcgctact tgggtttgtt atataacaaa 660gaagaaataa tgaactgatt ctcttcctcc ttcttgtcct ttcttaattc tgttgtaatt 720accttccttt gtaatttttt ttgtaattat tcttcttaat aatccaaaca aacacacata 780ttacaataat gaagaagccc gagctgaccg ctacctctgt tgagaagttc ctgattgaga 840agtttgattc cgtttccgac ctgatgcagc tgtccgaggg cgaggagtct cgagccttct 900cctttgacgt gggcggacga ggttacgttc tgcgagtgaa ctcgtgtgcc gacggcttct 960acaaggatcg atacgtctac cgacactttg cttctgccgc tctgcccatc cctgaggttc 1020tcgacattgg cgagttctct gagtccctca cctactgcat ctctcgacga gctcagggag 1080tcaccctgca ggacctccct gagactgagc tgcctgctgt cctccagcct gttgctgagg 1140ccatggacgc tatcgctgct gctgatctgt cccagacctc gggtttcggc ccctttggac 1200ctcagggaat tggacagtac accacttggc gagacttcat ctgtgctatt gccgatcctc 1260acgtctacca ttggcagacc gttatggacg atactgtgtc ggcttctgtc gctcaggctc 1320tggacgagct gatgctctgg gccgaggatt gccccgaggt tcgacacctg gtgcatgctg 1380acttcggttc caacaacgtt ctcaccgaca acggccgaat cactgccgtg attgactggt 1440ccgaggctat gtttggcgac tcgcagtacg aggtggccaa catcttcttt tggcgaccct 1500ggctggcttg tatggagcag cagacccgat acttcgagcg acgacatcct gagctcgctg 1560gatcccctcg actgcgagct tacatgctcc gaattggtct ggaccagctc taccagtcgc 1620tggtggatgg caactttgac gatgctgcct gggctcaggg acgatgtgac gccatcgtgc 1680gatctggcgc tggaaccgtc ggacgaactc agattgcccg acgatccgct gctgtctgga 1740ccgacggatg cgtggaggtc ctggctgatt cgggtaaccg acgaccctct actcgacctc 1800gagctaagga gtaataaacg gcgcgccgtc tgaagaatga atgatttgat gatttctttt 1860tccctccatt tttcttactg aatatatcaa tgatatagac ttgtatagtt tattatttca 1920aattaagtag ctatatatag tcaagataac gtttgtttga cacgattaca ttattcgtcg 1980acatcttttt tcagcctgtc gtggtagcaa tttgaggagt attattaatt gaataggttc 2040attttgcgct cgcataaaca gttttcgtca gggacagtat gttggaatga gtggtaatta 2100atggtgacat gacatgttat agcaataacc ttgatgttta catcgtagtt taatgtacac 2160cccgcgaatt cgttcaagta ggagtgcacc 
aattgcaaag ggaaaagctg aatgggcagt 2220tcgaatagta ctttttccat aggctccgcc cccctgacga gcatcacaaa aatcgacgct 2280caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt ccccctggaa 2340gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg tccgcctttc 2400tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc agttcggtgt 2460aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg 2520ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta tcgccactgg 2580cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct acagagttct 2640tgaagtggtg gcctaactac ggctacacta gaagaacagt atttggtatc tgcgctctgc 2700tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa caaaccaccg 2760ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa aaaggatctc 2820aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa aactcacgtt 2880aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa 2940aatgaagttt taaatcaatc taaagtatat atgagtaaac ttggtctgac agttaccaat 3000gcttaatcag tgaggcacct atctcagcga tctgtctatt tcgttcatcc atagttgcct 3060gactccccgt cgtgtagata actacgatac gggagggctt accatctggc cccagtgctg 3120caatgatacc gcgagaccca cgctcaccgg ctccagattt atcagcaata aaccagccag 3180ccggaagggc cgagcgcaga agtggtcctg caactttatc cgcctccatc cagtctatta 3240attgttgccg ggaagctaga gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg 3300ccattgctac aggcatcgtg gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg 3360gttcccaacg atcaaggcga gttacatgat cccccatgtt gtgcaaaaaa gcggttagct 3420ccttcggtcc tccgatcgtt gtcagaagta agttggccgc agtgttatca ctcatggtta 3480tggcagcact gcataattct cttactgtca tgccatccgt aagatgcttt tctgtgactg 3540gtgagtactc aaccaagtca ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc 3600cggcgtcaat acgggataat accgcgccac atagcagaac tttaaaagtg ctcatcattg 3660gaaaacgttc ttcggggcga aaactctcaa ggatcttacc gctgttgaga tccagttcga 3720tgtaacccac tcgtgcaccc aactgatctt cagcatcttt tactttcacc agcgtttctg 3780ggtgagcaaa aacaggaagg caaaatgccg caaaaaaggg aataagggcg acacggaaat 3840gttgaatact catactcttc ctttttcaat attattgaag catttatcag ggttattgtc 
3900tcatgagcgg atacatattt gaatgtattt agaaaaataa acagcgatcg cagctagctc 3960gtcgtgttca ggaactgttc gatggttcgg agagagtcgc cgcccagaac atacgcgcac 4020cgatgtcagc agacagcctt attacaagta tattcaagca agtatatccg tagggtgcgg 4080gtgatttgga tctaaggttc gtactcaaca ctcacgagca gcttgcctat gttacatcct 4140tttatcagac ataacataat tggagtttac ttacacacgg ggtgtacctg tatgagcacc 4200acctacaatt gtagcactgg tacttgtaca aagaatttat tcgtacgaat cacagggacg 4260gccgccctca ccgaaccagc gaatacctca gcggtcccct gcagtgactc aacaaagcga 4320tatgaacatc ttgcgatggt atcctgctga tagtttttac tgtacaaaca cctgtgtagc 4380tccttctagc atttttaagt tattcacacc tcaaggggag ggataaatta aataaattcc 4440aaaagcgaag atcgagaaac taaattaaaa ttccaaaaac gaagttggaa cacaaccccc 4500cgaaaaaaaa caacaaacaa aaaacccaac aaaataaaca aaaacaaaat aaatatataa 4560ctaccagtat ctgactaaaa gttcaaatac tcgtacttac aacaaataga aatgagccgg 4620ccaaaattct gcagaaaaaa atttcaaaca agtactggta taattaaatt aaaaaacaca 4680tcaaagtatc ataacgttag ttattttatt ttatttaata aaagaaaaca acaagatggg 4740ctcaaaactt tcaacttata cgatacatac caaataacaa tttagtattt atctaagtgc 4800ttttcgtaga taatggaata caaatggata tccagagtat acacatggat agtatacact 4860gacacgacaa ttctgtatct ctttatgtta actactgtga ggcattaaat agagcttgat 4920atataaaatg ttacatttca cagtctgaac ttttgcagat tacctaattt ggtaagatat 4980taattatgaa ctgaaagttg atggcggccg catagcttca aaatgtttct actccttttt 5040tactcttcca gattttctcg gactccgcgc atcgccgtac cacttcaaaa cacccaagca 5100cagcatacta aatttcccct ctttcttcct ctagggtgtc gttaattacc cgtactaaag 5160gtttggaaaa gaaaaaagag accgcctcgt ttctttttct tcgtcgaaaa aggcaataaa 5220aatttttatc acgtttcttt ttcttgaaaa tttttttttt tgattttttt ctctttcgat 5280gacctcccat tgatatttaa gttaataaac ggtcttcaat ttctcaagtt tcagtttcat 5340ttttcttgtt ctattacaac tttttttact tcttgctcat tagaaagaaa gcatagcaat 5400ctaatctaag ttttaattac aaaatgtcat cctcagaagt aaaagcaaat ggttggaccg 5460cagttcctgt ttccgcaaaa gcaatagtag actccttggg taaattagga gatgtctctt 5520catattccgt agaagatatt gcctttccag ctgcagacaa attggtagcc gaagctcaag 5580cattcgttaa ggctagatta tctcctgaaa 
cctacaacca ttcaatgaga gttttctatt 5640ggggtactgt cattgccaga agattgttac cagaacaagc taaagatttg tctccttcaa 5700catgggcatt aacctgtttg ttacacgacg ttggtactgc cgaagcttat tttacctcca 5760ctagaatgag tttcgatatc tacggtggta ttaaagctat ggaagtattg aaggttttag 5820gttccagtac agatcaagca gaagccgttg ctgaagcaat tataagacat gaagatgttg 5880gtgtcgacgg taacatcaca tttttgggtc aattgatcca attggcaaca ttgtacgata 5940acgtcggtgc ctacgacggt attgatgact tcggttcctg ggttgatgac actacaagaa 6000acagtataaa cactgctttc ccaagacatg gttggtgttc ttggttcgca tgcacagtta 6060gaaaagaaga atcaaacaag ccttggtgcc acaccacaca cataccacaa ttcgacaaac 6120aaatggaagc aaacaccttg atgaaacctt gggaataaac aggccccttt tcctttgtcg 6180atatcatgta attagttatg tcacgcttac attcacgccc tcctcccaca tccgctctaa 6240ccgaaaagga aggagttaga caacctgaag tctaggtccc tatttatttt ttttaatagt 6300tatgttagta ttaagaacgt tatttatatt tcaaattttt cttttttttc tgtacaaacg 6360cgtgtacgca tgtaacatta tactgaaaac cttgcttgag aaggttttgg gacgctcgaa 6420ggctttaatt tgcgggtaat aactgatata attaaattga agctctaatt tgtgagttta 6480gtatacatgc atttacttat aatacagttt tttagttttg ctggccgcat cttctcaaat 6540atgcttccca gcctgctttt ctgtaacgtt caccctctac cttagcatcc cttccctttg 6600caaatagtcc tcttccaaca ataataatgt cagatcctgt agagaccaca tcatccacgg 6660ttctatactg ttgacccaat gcgtctccct tgtcatctaa acccacaccg ggtgtcataa 6720tcaaccaatc gtaaccttca tctcttccac ccatgtctct ttgagcaata aagccgataa 6780caaaatcttt gtcgctcttc gcaatgtcaa cagtaccctt agtatattct ccagtagcta 6840gggagccctt gcatgacaat tctgctaaca tcaaaaggcc tctaggttcc tttgttactt 6900cttccgccgc ctgcttcaaa ccgctaacaa tacctgggcc caccacaccg tgtgcattcg 6960taatgtctgc ccattctgct attctgtata cacccgcaga gtactgcaat ttgactgtat 7020taccaatgtc agcaaatttt ctgtcttcga agagtaaaaa attgtacttg gcggataatg 7080cctttagcgg cttaactgtg ccctccatgg aaaaatcagt caagatatcc acatgtgttt 7140ttagtaaaca aattttggga cctaatgctt caactaactc cagtaattcc ttggtggtac 7200gaacatccaa tgaagcacac aagtttgttt gcttttcgtg catgatatta aatagcttgg 7260cagcaacagg actaggatga gtagcagcac gttccttata tgtagctttc gacatgattt 
7320atcttcgttt cctgcaggtt tttgttctgt gcagttgggt taagaatact gggcaatttc 7380atgtttcttc aacaccacat atgcgtatat ataccaatct aagtctgtgc tccttccttc 7440gttcttcctt ctgctcggag attaccgaat caaagctagc ttatcgatga taagctgtca 7500aagatgagaa ttaattccac ggactataga ctatactaga tactccgtct actgtacgat 7560acacttccgc tcaggtcctt gtcctttaac gaggccttac cactcttttg ttactctatt 7620gatccagctc agcaaaggca gtgtgatcta agattctatc ttcgcgatgt agtaaaacta 7680gctagaccga gaaagagact agaaatgcaa aaggcacttc tacaatggct gccatcatta 7740ttatccgatg tgacgctgca gcttctcaat gatattcgaa tacgctttga ggagatacag 7800cctaatatcc gacaaactgt tttacagatt tacgatcgta cttgttaccc atcattgaat 7860tttgaacatc cgaacctggg agttttccct gaaacagata gtatatttga acctgtataa 7920taatatatag tctagcgctt tacggaagac aatgtatgta tttcggttcc tggagaaact 7980attgcatcta ttgcataggt aatcttgcac gtcgcatccc cggttcattt tctgcgtttc 8040catcttgcac ttcaatagca tatctttgtt aacgaagcat ctgtgcttca ttttgtagaa 8100caaaaatgca acgcgagagc gctaattttt caaacaaaga atctgagctg catttttaca 8160gaacagaaat gcaacgcgaa agcgctattt taccaacgaa gaatctgtgc ttcatttttg 8220taaaacaaaa atgcaacgcg acgagagcgc taatttttca aacaaagaat ctgagctgca 8280tttttacaga acagaaatgc aacgcgagag cgctatttta ccaacaaaga atctatactt 8340cttttttgtt ctacaaaaat gcatcccgag agcgctattt ttctaacaaa gcatcttaga 8400ttactttttt tctcctttgt gcgctctata atgcagtctc ttgataactt tttgcactgt 8460aggtccgtta aggttagaag aaggctactt tggtgtctat tttctcttcc ataaaaaaag 8520cctgactcca cttcccgcgt ttactgatta ctagcgaagc tgcgggtgca ttttttcaag 8580ataaaggcat ccccgattat attctatacc gatgtggatt gcgcatactt tgtgaacaga 8640aagtgatagc gttgatgatt cttcattggt cagaaaatta tgaacggttt cttctatttt 8700gtctctatat actacgtata ggaaatgttt acattttcgt attgttttcg attcactcta 8760tgaatagttc ttactacaat ttttttgtct aaagagtaat actagagata aacataaaaa 8820atgtagaggt cgagtttaga tgcaagttca aggagcgaaa ggtggatggg taggttatat 8880agggatatag cacagagata tatagcaaag agatactttt gagcaat 8927498918DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 49gtttgtggaa gcggtattcg 
caatcattta gtcgtgcaat gtatgacttt aagatttgtg 60agcaggaaga aaagggagaa tcttctaacg ataaaccctt gaaaaactgg gtagactacg 120ctatgttgag ttgctacgca ggctgcacaa ttacacgaga atgctcccgc ctaggattta 180aggctaaggg acgtgcaatg cagacgacag atctaaatga ccgtgtcggt gaagtgttcg 240ccaaactttt cggttaacac atgcagtgat gcacgcgcga tggtgctaag ttacatatat 300atatatatat atatatatat atatatatag ccatagtgat gtctaagtaa cctttatggt 360atatttctta atgtggaaag atactagcgc gcgcacccac acacaagctt cgtcttttct 420tgaagaaaag aggaagctcg ctaaatggga ttccactttc cgttccctgc cagctgatgg 480aaaaaggtta gtggaacgat gaagaataaa aagagagatc cactgaggtg aaatttcagc 540tgacagcgag tttcatgatc gtgatgaaca atggtaacga gttgtggctg ttgccaggga 600gggtggttct caacttttaa tgtatggcca aatcgctact tgggtttgtt atataacaaa 660gaagaaataa tgaactgatt ctcttcctcc ttcttgtcct ttcttaattc tgttgtaatt 720accttccttt gtaatttttt ttgtaattat tcttcttaat aatccaaaca aacacacata 780ttacaataat gaagaagccc gagctgaccg ctacctctgt tgagaagttc ctgattgaga 840agtttgattc cgtttccgac ctgatgcagc tgtccgaggg cgaggagtct cgagccttct 900cctttgacgt gggcggacga ggttacgttc tgcgagtgaa ctcgtgtgcc gacggcttct 960acaaggatcg atacgtctac cgacactttg cttctgccgc tctgcccatc cctgaggttc 1020tcgacattgg cgagttctct gagtccctca cctactgcat ctctcgacga gctcagggag 1080tcaccctgca ggacctccct gagactgagc tgcctgctgt cctccagcct gttgctgagg 1140ccatggacgc tatcgctgct gctgatctgt cccagacctc gggtttcggc ccctttggac 1200ctcagggaat tggacagtac accacttggc gagacttcat ctgtgctatt gccgatcctc 1260acgtctacca ttggcagacc gttatggacg atactgtgtc ggcttctgtc gctcaggctc
1320tggacgagct gatgctctgg gccgaggatt gccccgaggt tcgacacctg gtgcatgctg 1380acttcggttc caacaacgtt ctcaccgaca acggccgaat cactgccgtg attgactggt 1440ccgaggctat gtttggcgac tcgcagtacg aggtggccaa catcttcttt tggcgaccct 1500ggctggcttg tatggagcag cagacccgat acttcgagcg acgacatcct gagctcgctg 1560gatcccctcg actgcgagct tacatgctcc gaattggtct ggaccagctc taccagtcgc 1620tggtggatgg caactttgac gatgctgcct gggctcaggg acgatgtgac gccatcgtgc 1680gatctggcgc tggaaccgtc ggacgaactc agattgcccg acgatccgct gctgtctgga 1740ccgacggatg cgtggaggtc ctggctgatt cgggtaaccg acgaccctct actcgacctc 1800gagctaagga gtaataaacg gcgcgccgtc tgaagaatga atgatttgat gatttctttt 1860tccctccatt tttcttactg aatatatcaa tgatatagac ttgtatagtt tattatttca 1920aattaagtag ctatatatag tcaagataac gtttgtttga cacgattaca ttattcgtcg 1980acatcttttt tcagcctgtc gtggtagcaa tttgaggagt attattaatt gaataggttc 2040attttgcgct cgcataaaca gttttcgtca gggacagtat gttggaatga gtggtaatta 2100atggtgacat gacatgttat agcaataacc ttgatgttta catcgtagtt taatgtacac 2160cccgcgaatt cgttcaagta ggagtgcacc aattgcaaag ggaaaagctg aatgggcagt 2220tcgaatagta ctttttccat aggctccgcc cccctgacga gcatcacaaa aatcgacgct 2280caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt ccccctggaa 2340gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg tccgcctttc 2400tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc agttcggtgt 2460aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg 2520ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta tcgccactgg 2580cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct acagagttct 2640tgaagtggtg gcctaactac ggctacacta gaagaacagt atttggtatc tgcgctctgc 2700tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa caaaccaccg 2760ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa aaaggatctc 2820aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa aactcacgtt 2880aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa 2940aatgaagttt taaatcaatc taaagtatat atgagtaaac ttggtctgac agttaccaat 3000gcttaatcag tgaggcacct atctcagcga 
tctgtctatt tcgttcatcc atagttgcct 3060gactccccgt cgtgtagata actacgatac gggagggctt accatctggc cccagtgctg 3120caatgatacc gcgagaccca cgctcaccgg ctccagattt atcagcaata aaccagccag 3180ccggaagggc cgagcgcaga agtggtcctg caactttatc cgcctccatc cagtctatta 3240attgttgccg ggaagctaga gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg 3300ccattgctac aggcatcgtg gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg 3360gttcccaacg atcaaggcga gttacatgat cccccatgtt gtgcaaaaaa gcggttagct 3420ccttcggtcc tccgatcgtt gtcagaagta agttggccgc agtgttatca ctcatggtta 3480tggcagcact gcataattct cttactgtca tgccatccgt aagatgcttt tctgtgactg 3540gtgagtactc aaccaagtca ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc 3600cggcgtcaat acgggataat accgcgccac atagcagaac tttaaaagtg ctcatcattg 3660gaaaacgttc ttcggggcga aaactctcaa ggatcttacc gctgttgaga tccagttcga 3720tgtaacccac tcgtgcaccc aactgatctt cagcatcttt tactttcacc agcgtttctg 3780ggtgagcaaa aacaggaagg caaaatgccg caaaaaaggg aataagggcg acacggaaat 3840gttgaatact catactcttc ctttttcaat attattgaag catttatcag ggttattgtc 3900tcatgagcgg atacatattt gaatgtattt agaaaaataa acagcgatcg cagctagctc 3960gtcgtgttca ggaactgttc gatggttcgg agagagtcgc cgcccagaac atacgcgcac 4020cgatgtcagc agacagcctt attacaagta tattcaagca agtatatccg tagggtgcgg 4080gtgatttgga tctaaggttc gtactcaaca ctcacgagca gcttgcctat gttacatcct 4140tttatcagac ataacataat tggagtttac ttacacacgg ggtgtacctg tatgagcacc 4200acctacaatt gtagcactgg tacttgtaca aagaatttat tcgtacgaat cacagggacg 4260gccgccctca ccgaaccagc gaatacctca gcggtcccct gcagtgactc aacaaagcga 4320tatgaacatc ttgcgatggt atcctgctga tagtttttac tgtacaaaca cctgtgtagc 4380tccttctagc atttttaagt tattcacacc tcaaggggag ggataaatta aataaattcc 4440aaaagcgaag atcgagaaac taaattaaaa ttccaaaaac gaagttggaa cacaaccccc 4500cgaaaaaaaa caacaaacaa aaaacccaac aaaataaaca aaaacaaaat aaatatataa 4560ctaccagtat ctgactaaaa gttcaaatac tcgtacttac aacaaataga aatgagccgg 4620ccaaaattct gcagaaaaaa atttcaaaca agtactggta taattaaatt aaaaaacaca 4680tcaaagtatc ataacgttag ttattttatt ttatttaata aaagaaaaca acaagatggg 
4740ctcaaaactt tcaacttata cgatacatac caaataacaa tttagtattt atctaagtgc 4800ttttcgtaga taatggaata caaatggata tccagagtat acacatggat agtatacact 4860gacacgacaa ttctgtatct ctttatgtta actactgtga ggcattaaat agagcttgat 4920atataaaatg ttacatttca cagtctgaac ttttgcagat tacctaattt ggtaagatat 4980taattatgaa ctgaaagttg atggcggccg catagcttca aaatgtttct actccttttt 5040tactcttcca gattttctcg gactccgcgc atcgccgtac cacttcaaaa cacccaagca 5100cagcatacta aatttcccct ctttcttcct ctagggtgtc gttaattacc cgtactaaag 5160gtttggaaaa gaaaaaagag accgcctcgt ttctttttct tcgtcgaaaa aggcaataaa 5220aatttttatc acgtttcttt ttcttgaaaa tttttttttt tgattttttt ctctttcgat 5280gacctcccat tgatatttaa gttaataaac ggtcttcaat ttctcaagtt tcagtttcat 5340ttttcttgtt ctattacaac tttttttact tcttgctcat tagaaagaaa gcatagcaat 5400ctaatctaag ttttaattac aaaatgatat actcaacagt caacgctaat ccttacgctt 5460ggccttacga tggttcaata gaccctgctc acaccgcttt aatcttaatc gattggcaaa 5520tagacttttg tggtccaggt ggttatgtcg attccatggg ttacgactta tccttgacta 5580gaagtggttt agaacctaca gcaagagtat tggctgcagc cagagatact ggtatgacag 5640ttatccatac tagagaaggt cacagaccag atttggctga cttgccacct aataagagat 5700ggagatctgc atcagccggt gctgaaatcg gttcagttgg tccatgtggt agaattttag 5760tcagaggtga acctggttgg gaaatagtac cagaagttgc acctagagaa ggtgaaccaa 5820ttatagataa acctggtaaa ggtgctttct acgcaacaga tttggacttg ttgttgagaa 5880caagaggtat cacccatttg attttgaccg gtataactac agatgtttgc gtccacacca 5940ctatgagaga agccaacgat agaggttacg aatgtttaat tttgtctgat tgcaccggtg 6000ctactgacag aaagcatcac gaagctgcat tatctatggt caccatgcaa ggtggtgtat 6060tcggtgcaac tgcccattca gatgacttat tggccgcttt gggtacaacc gttccagcag 6120ccgctggtcc tagagctaga acagaataaa caggcccctt ttcctttgtc gatatcatgt 6180aattagttat gtcacgctta cattcacgcc ctcctcccac atccgctcta accgaaaagg 6240aaggagttag acaacctgaa gtctaggtcc ctatttattt tttttaatag ttatgttagt 6300attaagaacg ttatttatat ttcaaatttt tctttttttt ctgtacaaac gcgtgtacgc 6360atgtaacatt atactgaaaa ccttgcttga gaaggttttg ggacgctcga aggctttaat 6420ttgcgggtaa taactgatat aattaaattg 
aagctctaat ttgtgagttt agtatacatg 6480catttactta taatacagtt ttttagtttt gctggccgca tcttctcaaa tatgcttccc 6540agcctgcttt tctgtaacgt tcaccctcta ccttagcatc ccttcccttt gcaaatagtc 6600ctcttccaac aataataatg tcagatcctg tagagaccac atcatccacg gttctatact 6660gttgacccaa tgcgtctccc ttgtcatcta aacccacacc gggtgtcata atcaaccaat 6720cgtaaccttc atctcttcca cccatgtctc tttgagcaat aaagccgata acaaaatctt 6780tgtcgctctt cgcaatgtca acagtaccct tagtatattc tccagtagct agggagccct 6840tgcatgacaa ttctgctaac atcaaaaggc ctctaggttc ctttgttact tcttccgccg 6900cctgcttcaa accgctaaca atacctgggc ccaccacacc gtgtgcattc gtaatgtctg 6960cccattctgc tattctgtat acacccgcag agtactgcaa tttgactgta ttaccaatgt 7020cagcaaattt tctgtcttcg aagagtaaaa aattgtactt ggcggataat gcctttagcg 7080gcttaactgt gccctccatg gaaaaatcag tcaagatatc cacatgtgtt tttagtaaac 7140aaattttggg acctaatgct tcaactaact ccagtaattc cttggtggta cgaacatcca 7200atgaagcaca caagtttgtt tgcttttcgt gcatgatatt aaatagcttg gcagcaacag 7260gactaggatg agtagcagca cgttccttat atgtagcttt cgacatgatt tatcttcgtt 7320tcctgcaggt ttttgttctg tgcagttggg ttaagaatac tgggcaattt catgtttctt 7380caacaccaca tatgcgtata tataccaatc taagtctgtg ctccttcctt cgttcttcct 7440tctgctcgga gattaccgaa tcaaagctag cttatcgatg ataagctgtc aaagatgaga 7500attaattcca cggactatag actatactag atactccgtc tactgtacga tacacttccg 7560ctcaggtcct tgtcctttaa cgaggcctta ccactctttt gttactctat tgatccagct 7620cagcaaaggc agtgtgatct aagattctat cttcgcgatg tagtaaaact agctagaccg 7680agaaagagac tagaaatgca aaaggcactt ctacaatggc tgccatcatt attatccgat 7740gtgacgctgc agcttctcaa tgatattcga atacgctttg aggagataca gcctaatatc 7800cgacaaactg ttttacagat ttacgatcgt acttgttacc catcattgaa ttttgaacat 7860ccgaacctgg gagttttccc tgaaacagat agtatatttg aacctgtata ataatatata 7920gtctagcgct ttacggaaga caatgtatgt atttcggttc ctggagaaac tattgcatct 7980attgcatagg taatcttgca cgtcgcatcc ccggttcatt ttctgcgttt ccatcttgca 8040cttcaatagc atatctttgt taacgaagca tctgtgcttc attttgtaga acaaaaatgc 8100aacgcgagag cgctaatttt tcaaacaaag aatctgagct gcatttttac agaacagaaa 
8160tgcaacgcga aagcgctatt ttaccaacga agaatctgtg cttcattttt gtaaaacaaa 8220aatgcaacgc gacgagagcg ctaatttttc aaacaaagaa tctgagctgc atttttacag 8280aacagaaatg caacgcgaga gcgctatttt accaacaaag aatctatact tcttttttgt 8340tctacaaaaa tgcatcccga gagcgctatt tttctaacaa agcatcttag attacttttt 8400ttctcctttg tgcgctctat aatgcagtct cttgataact ttttgcactg taggtccgtt 8460aaggttagaa gaaggctact ttggtgtcta ttttctcttc cataaaaaaa gcctgactcc 8520acttcccgcg tttactgatt actagcgaag ctgcgggtgc attttttcaa gataaaggca 8580tccccgatta tattctatac cgatgtggat tgcgcatact ttgtgaacag aaagtgatag 8640cgttgatgat tcttcattgg tcagaaaatt atgaacggtt tcttctattt tgtctctata 8700tactacgtat aggaaatgtt tacattttcg tattgttttc gattcactct atgaatagtt 8760cttactacaa tttttttgtc taaagagtaa tactagagat aaacataaaa aatgtagagg 8820tcgagtttag atgcaagttc aaggagcgaa aggtggatgg gtaggttata tagggatata 8880gcacagagat atatagcaaa gagatacttt tgagcaat 8918

SEQ ID NO 50; LENGTH: 8894; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Description of Artificial Sequence: Synthetic polynucleotide; SEQUENCE: 50
gtttgtggaa gcggtattcg caatcattta gtcgtgcaat gtatgacttt aagatttgtg 60agcaggaaga aaagggagaa tcttctaacg ataaaccctt gaaaaactgg gtagactacg 120ctatgttgag ttgctacgca ggctgcacaa ttacacgaga atgctcccgc ctaggattta 180aggctaaggg acgtgcaatg cagacgacag atctaaatga ccgtgtcggt gaagtgttcg 240ccaaactttt cggttaacac atgcagtgat gcacgcgcga tggtgctaag ttacatatat 300atatatatat atatatatat atatatatag ccatagtgat gtctaagtaa cctttatggt 360atatttctta atgtggaaag atactagcgc gcgcacccac acacaagctt cgtcttttct 420tgaagaaaag aggaagctcg ctaaatggga ttccactttc cgttccctgc cagctgatgg 480aaaaaggtta gtggaacgat gaagaataaa aagagagatc cactgaggtg aaatttcagc 540tgacagcgag tttcatgatc gtgatgaaca atggtaacga gttgtggctg ttgccaggga 600gggtggttct caacttttaa tgtatggcca aatcgctact tgggtttgtt atataacaaa 660gaagaaataa tgaactgatt ctcttcctcc ttcttgtcct ttcttaattc tgttgtaatt 720accttccttt gtaatttttt ttgtaattat tcttcttaat aatccaaaca aacacacata 780ttacaataat gaagaagccc gagctgaccg ctacctctgt tgagaagttc ctgattgaga 840agtttgattc cgtttccgac ctgatgcagc tgtccgaggg cgaggagtct
cgagccttct 900cctttgacgt gggcggacga ggttacgttc tgcgagtgaa ctcgtgtgcc gacggcttct 960acaaggatcg atacgtctac cgacactttg cttctgccgc tctgcccatc cctgaggttc 1020tcgacattgg cgagttctct gagtccctca cctactgcat ctctcgacga gctcagggag 1080tcaccctgca ggacctccct gagactgagc tgcctgctgt cctccagcct gttgctgagg 1140ccatggacgc tatcgctgct gctgatctgt cccagacctc gggtttcggc ccctttggac 1200ctcagggaat tggacagtac accacttggc gagacttcat ctgtgctatt gccgatcctc 1260acgtctacca ttggcagacc gttatggacg atactgtgtc ggcttctgtc gctcaggctc 1320tggacgagct gatgctctgg gccgaggatt gccccgaggt tcgacacctg gtgcatgctg 1380acttcggttc caacaacgtt ctcaccgaca acggccgaat cactgccgtg attgactggt 1440ccgaggctat gtttggcgac tcgcagtacg aggtggccaa catcttcttt tggcgaccct 1500ggctggcttg tatggagcag cagacccgat acttcgagcg acgacatcct gagctcgctg 1560gatcccctcg actgcgagct tacatgctcc gaattggtct ggaccagctc taccagtcgc 1620tggtggatgg caactttgac gatgctgcct gggctcaggg acgatgtgac gccatcgtgc 1680gatctggcgc tggaaccgtc ggacgaactc agattgcccg acgatccgct gctgtctgga 1740ccgacggatg cgtggaggtc ctggctgatt cgggtaaccg acgaccctct actcgacctc 1800gagctaagga gtaataaacg gcgcgccgtc tgaagaatga atgatttgat gatttctttt 1860tccctccatt tttcttactg aatatatcaa tgatatagac ttgtatagtt tattatttca 1920aattaagtag ctatatatag tcaagataac gtttgtttga cacgattaca ttattcgtcg 1980acatcttttt tcagcctgtc gtggtagcaa tttgaggagt attattaatt gaataggttc 2040attttgcgct cgcataaaca gttttcgtca gggacagtat gttggaatga gtggtaatta 2100atggtgacat gacatgttat agcaataacc ttgatgttta catcgtagtt taatgtacac 2160cccgcgaatt cgttcaagta ggagtgcacc aattgcaaag ggaaaagctg aatgggcagt 2220tcgaatagta ctttttccat aggctccgcc cccctgacga gcatcacaaa aatcgacgct 2280caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt ccccctggaa 2340gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg tccgcctttc 2400tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc agttcggtgt 2460aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg 2520ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta tcgccactgg 2580cagcagccac tggtaacagg 
attagcagag cgaggtatgt aggcggtgct acagagttct 2640tgaagtggtg gcctaactac ggctacacta gaagaacagt atttggtatc tgcgctctgc 2700tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa caaaccaccg 2760ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa aaaggatctc 2820aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa aactcacgtt 2880aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa 2940aatgaagttt taaatcaatc taaagtatat atgagtaaac ttggtctgac agttaccaat 3000gcttaatcag tgaggcacct atctcagcga tctgtctatt tcgttcatcc atagttgcct 3060gactccccgt cgtgtagata actacgatac gggagggctt accatctggc cccagtgctg 3120caatgatacc gcgagaccca cgctcaccgg ctccagattt atcagcaata aaccagccag 3180ccggaagggc cgagcgcaga agtggtcctg caactttatc cgcctccatc cagtctatta 3240attgttgccg ggaagctaga gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg 3300ccattgctac aggcatcgtg gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg 3360gttcccaacg atcaaggcga gttacatgat cccccatgtt gtgcaaaaaa gcggttagct 3420ccttcggtcc tccgatcgtt gtcagaagta agttggccgc agtgttatca ctcatggtta 3480tggcagcact gcataattct cttactgtca tgccatccgt aagatgcttt tctgtgactg 3540gtgagtactc aaccaagtca ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc 3600cggcgtcaat acgggataat accgcgccac atagcagaac tttaaaagtg ctcatcattg 3660gaaaacgttc ttcggggcga aaactctcaa ggatcttacc gctgttgaga tccagttcga 3720tgtaacccac tcgtgcaccc aactgatctt cagcatcttt tactttcacc agcgtttctg 3780ggtgagcaaa aacaggaagg caaaatgccg caaaaaaggg aataagggcg acacggaaat 3840gttgaatact catactcttc ctttttcaat attattgaag catttatcag ggttattgtc 3900tcatgagcgg atacatattt gaatgtattt agaaaaataa acagcgatcg cagctagctc 3960gtcgtgttca ggaactgttc gatggttcgg agagagtcgc cgcccagaac atacgcgcac 4020cgatgtcagc agacagcctt attacaagta tattcaagca agtatatccg tagggtgcgg 4080gtgatttgga tctaaggttc gtactcaaca ctcacgagca gcttgcctat gttacatcct 4140tttatcagac ataacataat tggagtttac ttacacacgg ggtgtacctg tatgagcacc 4200acctacaatt gtagcactgg tacttgtaca aagaatttat tcgtacgaat cacagggacg 4260gccgccctca ccgaaccagc gaatacctca gcggtcccct gcagtgactc 
aacaaagcga 4320tatgaacatc ttgcgatggt atcctgctga tagtttttac tgtacaaaca cctgtgtagc 4380tccttctagc atttttaagt tattcacacc tcaaggggag ggataaatta aataaattcc 4440aaaagcgaag atcgagaaac taaattaaaa ttccaaaaac gaagttggaa cacaaccccc 4500cgaaaaaaaa caacaaacaa aaaacccaac aaaataaaca aaaacaaaat aaatatataa 4560ctaccagtat ctgactaaaa gttcaaatac tcgtacttac aacaaataga aatgagccgg 4620ccaaaattct gcagaaaaaa atttcaaaca agtactggta taattaaatt aaaaaacaca 4680tcaaagtatc ataacgttag ttattttatt ttatttaata aaagaaaaca acaagatggg 4740ctcaaaactt tcaacttata cgatacatac caaataacaa tttagtattt atctaagtgc 4800ttttcgtaga taatggaata caaatggata tccagagtat acacatggat agtatacact 4860gacacgacaa ttctgtatct ctttatgtta actactgtga ggcattaaat agagcttgat 4920atataaaatg ttacatttca cagtctgaac ttttgcagat tacctaattt ggtaagatat 4980taattatgaa ctgaaagttg atggcggccg catagcttca aaatgtttct actccttttt 5040tactcttcca gattttctcg gactccgcgc atcgccgtac cacttcaaaa cacccaagca 5100cagcatacta aatttcccct ctttcttcct ctagggtgtc gttaattacc cgtactaaag 5160gtttggaaaa gaaaaaagag accgcctcgt ttctttttct tcgtcgaaaa aggcaataaa 5220aatttttatc acgtttcttt ttcttgaaaa tttttttttt tgattttttt ctctttcgat 5280gacctcccat tgatatttaa gttaataaac ggtcttcaat ttctcaagtt tcagtttcat 5340ttttcttgtt ctattacaac tttttttact tcttgctcat tagaaagaaa gcatagcaat 5400ctaatctaag ttttaattac aaaatggacg caatggtaga aacaaataga cacttcatag 5460atgccgaccc ttacccttgg ccttacaacg gtgccttgag acctgataac acagccttga 5520ttataatcga tatgcaaacc gacttttgtg gtaaaggtgg ttatgtcgat catatgggtt 5580acgacttatc attggtacaa gccccaatcg aacctattaa aagagtttta gctgcaatga 5640gagctaaggg ttatcatatt atacacacaa gagaaggtca cagaccagat ttggctgact 5700tacctgcaaa caagagatgg agatctcaaa gaataggtgc tggtatcggt gacccaggtc 5760cttgtggtag aattttgacc agaggtgaac caggttggga tatcattcca gaattgtacc 5820ctatagaagg tgaaactatc atcgataaac ctggtaaagg tagtttttgc gcaacagact 5880tagaattggt tttgaaccaa aagagaatcg aaaacatcat cttgaccggt atcactacag 5940atgtttgtgt ctctaccact atgagagaag caaacgatag aggttacgaa tgcttgttgt 6000tggaagattg ttgcggtgcc 
actgactacg gtaaccattt ggccgctatt aaaatggtca 6060agatgcaagg tggtgtattc ggttctgttt caaattccgc agccttggtt gaagcattac 6120cataaacagg ccccttttcc tttgtcgata tcatgtaatt agttatgtca cgcttacatt 6180cacgccctcc tcccacatcc gctctaaccg aaaaggaagg agttagacaa cctgaagtct 6240aggtccctat ttattttttt taatagttat gttagtatta agaacgttat ttatatttca 6300aatttttctt ttttttctgt acaaacgcgt gtacgcatgt aacattatac tgaaaacctt 6360gcttgagaag gttttgggac gctcgaaggc tttaatttgc gggtaataac tgatataatt 6420aaattgaagc tctaatttgt gagtttagta tacatgcatt tacttataat acagtttttt 6480agttttgctg gccgcatctt ctcaaatatg cttcccagcc tgcttttctg taacgttcac 6540cctctacctt agcatccctt ccctttgcaa atagtcctct tccaacaata ataatgtcag 6600atcctgtaga gaccacatca tccacggttc tatactgttg acccaatgcg tctcccttgt 6660catctaaacc cacaccgggt gtcataatca accaatcgta accttcatct cttccaccca 6720tgtctctttg agcaataaag ccgataacaa aatctttgtc gctcttcgca atgtcaacag 6780tacccttagt atattctcca gtagctaggg agcccttgca tgacaattct gctaacatca 6840aaaggcctct aggttccttt gttacttctt ccgccgcctg cttcaaaccg ctaacaatac 6900ctgggcccac cacaccgtgt gcattcgtaa tgtctgccca ttctgctatt ctgtatacac 6960ccgcagagta ctgcaatttg actgtattac caatgtcagc aaattttctg tcttcgaaga 7020gtaaaaaatt gtacttggcg gataatgcct ttagcggctt aactgtgccc tccatggaaa 7080aatcagtcaa gatatccaca tgtgttttta gtaaacaaat tttgggacct aatgcttcaa 7140ctaactccag taattccttg gtggtacgaa catccaatga agcacacaag tttgtttgct 7200tttcgtgcat gatattaaat agcttggcag caacaggact aggatgagta gcagcacgtt 7260ccttatatgt agctttcgac atgatttatc ttcgtttcct gcaggttttt gttctgtgca 7320gttgggttaa gaatactggg caatttcatg

tttcttcaac accacatatg cgtatatata 7380ccaatctaag tctgtgctcc ttccttcgtt cttccttctg ctcggagatt accgaatcaa 7440agctagctta tcgatgataa gctgtcaaag atgagaatta attccacgga ctatagacta 7500tactagatac tccgtctact gtacgataca cttccgctca ggtccttgtc ctttaacgag 7560gccttaccac tcttttgtta ctctattgat ccagctcagc aaaggcagtg tgatctaaga 7620ttctatcttc gcgatgtagt aaaactagct agaccgagaa agagactaga aatgcaaaag 7680gcacttctac aatggctgcc atcattatta tccgatgtga cgctgcagct tctcaatgat 7740attcgaatac gctttgagga gatacagcct aatatccgac aaactgtttt acagatttac 7800gatcgtactt gttacccatc attgaatttt gaacatccga acctgggagt tttccctgaa 7860acagatagta tatttgaacc tgtataataa tatatagtct agcgctttac ggaagacaat 7920gtatgtattt cggttcctgg agaaactatt gcatctattg cataggtaat cttgcacgtc 7980gcatccccgg ttcattttct gcgtttccat cttgcacttc aatagcatat ctttgttaac 8040gaagcatctg tgcttcattt tgtagaacaa aaatgcaacg cgagagcgct aatttttcaa 8100acaaagaatc tgagctgcat ttttacagaa cagaaatgca acgcgaaagc gctattttac 8160caacgaagaa tctgtgcttc atttttgtaa aacaaaaatg caacgcgacg agagcgctaa 8220tttttcaaac aaagaatctg agctgcattt ttacagaaca gaaatgcaac gcgagagcgc 8280tattttacca acaaagaatc tatacttctt ttttgttcta caaaaatgca tcccgagagc 8340gctatttttc taacaaagca tcttagatta ctttttttct cctttgtgcg ctctataatg 8400cagtctcttg ataacttttt gcactgtagg tccgttaagg ttagaagaag gctactttgg 8460tgtctatttt ctcttccata aaaaaagcct gactccactt cccgcgttta ctgattacta 8520gcgaagctgc gggtgcattt tttcaagata aaggcatccc cgattatatt ctataccgat 8580gtggattgcg catactttgt gaacagaaag tgatagcgtt gatgattctt cattggtcag 8640aaaattatga acggtttctt ctattttgtc tctatatact acgtatagga aatgtttaca 8700ttttcgtatt gttttcgatt cactctatga atagttctta ctacaatttt tttgtctaaa 8760gagtaatact agagataaac ataaaaaatg tagaggtcga gtttagatgc aagttcaagg 8820agcgaaaggt ggatgggtag gttatatagg gatatagcac agagatatat agcaaagaga 8880tacttttgag caat 8894

SEQ ID NO 51; LENGTH: 8395; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Description of Artificial Sequence: Synthetic polynucleotide; SEQUENCE: 51
gtttgtggaa gcggtattcg caatttaatt aaagctggtg acaattaatc atcggctcgt 60ataatgtgtg gaattgaatc gatataagga
ggttaatcat atgtcaatgg aaacccatag 120ttatgtagac gtcgcaattc gtaacgcgcg tcttgccgat acggagggaa ttgtcgatat 180tcttattcac gatgggcgca ttgcgtccat cgtgaagtcg acaaaaacaa aaggatcggt 240ggagatcgat gctcatgagg gtctggtcac ttccggcctg gtagagcctc acatccatct 300cgataaggcc ctgacggcag atcgggttcc cgcaggaagc attggcgacc ttcgaacgcg 360acgaggcctt gagatggcaa ttcgggccac ccgtgatatc aagcgtacgt tcacggttga 420agatgttcga gaacgggcca tacgtgcggc cctgatggca tcccgtgcgg gaaccaccgc 480attgcggaca cacgtcgatg tcgacccgat tgtcggcctc gcaggtatcc gtggtgtcct 540tgaggcgcgt gaagtctgcg cgggattgat cgatatccag atcgtcgcct tccctcagga 600gggactcttc tgctctgcgg gggccgtgga cctcatgcgg gaggcgatca aactgggcgc 660ggatgccgtc ggcggcgcac ccgcgctgga tgatcgcccg caggaccatg tccgagccgt 720ttttgacctt gctgctgagt tcggcctgcc cgtagacatg cacgtcgatg agtccgaccg 780gcgggaagac tttacgcttc cctttgtgat tgaagctgcc cgtgaacggc gtgtgcccaa 840tgtgaccgtc gcgcacatca gctcgctgtc cgtacagacg gatgacgtag cacggtcgac 900cattgccgcc cttgcggacg ccgatgttaa tgtcgtggtt aatccgatca ttgtcaaaat 960tacgcggctg agtgaattac tcgatgccgg agtctccgta atgtttggct cggacaacct 1020gcgggatccg ttctatccgc tcggagcggc gaatcccctt ggatcagcca tttttgcctg 1080tcaaattgcc gcgctgggaa caccgcaaga tctcagacgg gtattcgatg cggtcaccat 1140caacgctgcc cgcatgctgg gattcccctc acttttaggc gtcgtggaag gggcagtcgc 1200ggatctcgca gtattcccat cggcgacgcc cgaggaggtt gttctggatc aacagtctcc 1260gctcttcgta ctcaagggcg gacgtgtcgt tgccatgcga ttggccgctg gatcaacgtc 1320gttccgcgac tactcatgag gaaatccatt atgatgtcag gagaacacac gttaaaagcg 1380gtacgaggca gttttattga tgtcacccgt acgatcgata acccggaaga gattgcctct 1440gcgctgcggt ttattgagga tggtttatta ctcattaaac agggaaaagt ggaatggttt 1500ggcgaatggg aaaacggaaa gcatcaaatt cctgacacca ttcgcgtgcg cgactatcgc 1560ggcaaactga tagtaccggg ctttgtcgat acacatatcc attatccgca aagtgaaatg 1620gtgggggcct atggtgagca attgctggag tggttgaata aacacacctt ccctactgaa 1680cgtcgttatg aggatttaga gtacgcccgc gaaatgtcgg cgttcttcat caagcagctt 1740ttacgtaacg gaaccaccac ggcgctggtg tttggcactg ttcatccgca atctgttgat 1800gcgctgtttg 
aagccgccag tcatatcaat atgcgtatga ttgccggtaa ggtgatgatg 1860gaccgcaacg caccggatta tctgctcgac actgccgaaa gcagctatca ccaaagcaaa 1920gaactgatcg aacgctggca caaaaatggt cgtctgctat atgcgattac gccacgcttc 1980gccccgacct catctcctga acagatggcg atggcgcaac gcctgaaaga agaatatccg 2040gatacgtggg tacataccca tctctgtgaa aacaaagatg aaattgcctg ggtgaaatcg 2100ctttatcctg accatgatgg ttatctggat gtttaccatc agtacggcct gaccggtaaa 2160aactgtgtct ttgctcactg cgtccatctc gaagaaaaag agtgggatcg tctcagcgaa 2220accaaatcca gcattgcttt ctgtccgacc tccaaccttt acctcggcag cggcttattc 2280aacttgaaaa aagcatggca gaagaaagtt aaagtgggca tgggaacgga tatcggtgcc 2340ggaaccactt tcaacatgct gcaaacgctg aacgaagcct acaaagtatt gcaattacaa 2400ggctatcgcc tctcggcata tgaagcgttt tacctggcca cgctcggcgg agcgaaatct 2460ctgggccttg acgatttgat tggcaacttt ttacctggca aagaggctga tttcgtggtg 2520atggaaccca ccgccactcc gctacagcag ctgcgctatg acaactctgt ttctttagtc 2580gacaaattgt tcgtgatgat gacgttgggc gatgaccgtt cgatctaccg cacctacgtt 2640gatggtcgtc tggtgtacga acgcaactaa ggaacgacca tgcaaacgct cagcatccag 2700cacggtaccc tcgtcacgat ggatcagtac cgcagagtcc ttggggatag ctgggttcac 2760gtgcaggatg gacggatcgt cgcgctcgga gtgcacgccg agtcggtgcc tccgccagcg 2820gatcgggtga tcgatgcacg cggcaaggtc gtgttacccg gtttcatcaa tgcccacacc 2880catgtgaacc agatcctcct gcgcggaggg ccctcgcacg ggcgtcaact ctatgactgg 2940ctgttcaacg ttttgtatcc gggacaaaag gcgatgagac cggaggacgt agcggtggcg 3000gtgaggttgt attgtgcgga agctgtgcgc agcgggatta cgacgatcaa cgacaacgcc 3060gattcggcca tctacccagg caacatcgag gccgcgatgg cggtctatgg tgaggtgggt 3120gtgagggtcg tctacgcccg catgttcttt gatcggatgg acgggcgcat tcaagggtat 3180gtggacgcct tgaaggctcg ctctccccaa gtcgaactgt gctcgatcat ggaggaaacg 3240gctgtggcca aagatcggat cacagccctg tcagatcagt atcatggcac ggcaggaggt 3300cgtatatcag tttggcccgc tcctgccatt accccggcgg tgacagttga aggaatgcga 3360tgggcacaag ccttcgcccg tgatcgggcg gtaatgtgga cgcttcacat ggcggagagc 3420gatcatgatg agcggcttca ttggatgagt cccgccgagt acatggagtg ttacggactc 3480ttggatgagc gtctgcaggt cgcgcattgc gtgtactttg 
accggaagga tgttcggctg 3540ctgcaccgcc acaatgtgaa ggtcgcgtcg caggttgtga gcaatgccta cctcggctca 3600ggggtggccc ccgtgccaga gatggtggag cgcggcatgg ccgtgggcat tggaacagat 3660gacgggaatt gtaatgactc cgtaaacatg atcggagaca tgaagtttat ggcccatatt 3720caccgcgcgg tgcatcggga tgcggacgtg ctgaccccag agaagattct tgaaatggcg 3780acgatcgatg gggcgcgttc gttgggaatg gaccacgaga ttggttccat cgaaaccggc 3840aagcgcgcgg accttatcct gcttgacctg cgtcaccctc agacgactcc tcaccatcat 3900ttggcggcca cgatcgtgtt tcaggcttac ggcaatgagg tggacactgt cctgattgac 3960ggaaacgttg tgatggagaa ccgccgcttg agctttcttc cccctgaacg tgagttggcg 4020ttccttgagg aagcgcagag ccgcgccaca gctattttgc agcgggcgaa catggtggct 4080aacccagctt ggcgcagcct ctagcctcaa aatatatttt ccctctatct tctcgttgcg 4140cttaatttga ctaattctca ttagcgaggc gcgcctttcc ataggctccg cccccctgac 4200gagcatcaca aaaatcgacg ctcaagtcag aggtggcgaa acccgacagg actataaaga 4260taccaggcgt ttccccctgg aagctccctc gtgcgctctc ctgttccgac cctgccgctt 4320accggatacc tgtccgcctt tctcccttcg ggaagcgtgg cgctttctca tagctcacgc 4380tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc 4440cccgttcagc ccgaccgctg cgccttatcc ggtaactatc gtcttgagtc caacccggta 4500agacacgact tatcgccact ggcagcagcc actggtaaca ggattagcag agcgaggtat 4560gtaggcggtg ctacagagtt cttgaagtgg tggcctaact acggctacac tagaagaaca 4620gtatttggta tctgcgctct gctgaagcca gttaccttcg gaaaaagagt tggtagctct 4680tgatccggca aacaaaccac cgctggtagc ggtggttttt ttgtttgcaa gcagcagatt 4740acgcgcagaa aaaaaggatc tcaagaagat cctttgatct tttctacggg gtctgacgct 4800cagtggaacg aaaactcacg ttaagggatt ttggtcatga gattatcaaa aaggatcttc 4860acctagatcc ttttaaatta aaaatgaagt tttaaatcaa tctaaagtat atatgagtaa 4920acttggtctg acagttacca atgcttaatc agtgaggcac ctatctcagc gatctgtcta 4980tttcgttcat ccatagttgc ctgactcccc gtcgtgtaga taactacgat acgggagggc 5040ttaccatctg gccccagtgc tgcaatgata ccgcgagacc cacgctcacc ggctccagat 5100ttatcagcaa taaaccagcc agccggaagg gccgagcgca gaagtggtcc tgcaacttta 5160tccgcctcca tccagtctat taattgttgc cgggaagcta gagtaagtag ttcgccagtt 5220aatagtttgc 
gcaacgttgt tgccattgct acaggcatcg tggtgtcacg ctcgtcgttt 5280ggtatggctt cattcagctc cggttcccaa cgatcaaggc gagttacatg atcccccatg 5340ttgtgcaaaa aagcggttag ctccttcggt cctccgatcg ttgtcagaag taagttggcc 5400gcagtgttat cactcatggt tatggcagca ctgcataatt ctcttactgt catgccatcc 5460gtaagatgct tttctgtgac tggtgagtac tcaaccaagt cattctgaga atagtgtatg 5520cggcgaccga gttgctcttg cccggcgtca atacgggata ataccgcgcc acatagcaga 5580actttaaaag tgctcatcat tggaaaacgt tcttcggggc gaaaactctc aaggatctta 5640ccgctgttga gatccagttc gatgtaaccc actcgtgcac ccaactgatc ttcagcatct 5700tttactttca ccagcgtttc tgggtgagca aaaacaggaa ggcaaaatgc cgcaaaaaag 5760ggaataaggg cgacacggaa atgttgaata ctcatactct tcctttttca atattattga 5820agcatttatc agggttattg tctcatgagc ggatacatat ttgaatgtat ttagaaaaat 5880aaacagcgat cgcgcggccg cgggtaataa ctgatataat taaattgaag ctctaatttg 5940tgagtttagt atacatgcat ttacttataa tacagttttt tagttttgct ggccgcatct 6000tctcaaatat gcttcccagc ctgcttttct gtaacgttca ccctctacct tagcatccct 6060tccctttgca aatagtcctc ttccaacaat aataatgtca gatcctgtag agaccacatc 6120atccacggtt ctatactgtt gacccaatgc gtctcccttg tcatctaaac ccacaccggg 6180tgtcataatc aaccaatcgt aaccttcatc tcttccaccc atgtctcttt gagcaataaa 6240gccgataaca aaatctttgt cgctcttcgc aatgtcaaca gtacccttag tatattctcc 6300agtagctagg gagcccttgc atgacaattc tgctaacatc aaaaggcctc taggttcctt 6360tgttacttct tccgccgcct gcttcaaacc gctaacaata cctgggccca ccacaccgtg 6420tgcattcgta atgtctgccc attctgctat tctgtataca cccgcagagt actgcaattt 6480gactgtatta ccaatgtcag caaattttct gtcttcgaag agtaaaaaat tgtacttggc 6540ggataatgcc tttagcggct taactgtgcc ctccatggaa aaatcagtca agatatccac 6600atgtgttttt agtaaacaaa ttttgggacc taatgcttca actaactcca gtaattcctt 6660ggtggtacga acatccaatg aagcacacaa gtttgtttgc ttttcgtgca tgatattaaa 6720tagcttggca gcaacaggac taggatgagt agcagcacgt tccttatatg tagctttcga 6780catgatttat cttcgtttcc tgcaggtttt tgttctgtgc agttgggtta agaatactgg 6840gcaatttcat gtttcttcaa caccacatat gcgtatatat accaatctaa gtctgtgctc 6900cttccttcgt tcttccttct gctcggagat taccgaatca 
aagctagctt atcgatgata 6960agctgtcaaa gatgagaatt aattccacgg actatagact atactagata ctccgtctac 7020tgtacgatac acttccgctc aggtccttgt cctttaacga ggccttacca ctcttttgtt 7080actctattga tccagctcag caaaggcagt gtgatctaag attctatctt cgcgatgtag 7140taaaactagc tagaccgaga aagagactag aaatgcaaaa ggcacttcta caatggctgc 7200catcattatt atccgatgtg acgctgcagc ttctcaatga tattcgaata cgctttgagg 7260agatacagcc taatatccga caaactgttt tacagattta cgatcgtact tgttacccat 7320cattgaattt tgaacatccg aacctgggag ttttccctga aacagatagt atatttgaac 7380ctgtataata atatatagtc tagcgcttta cggaagacaa tgtatgtatt tcggttcctg 7440gagaaactat tgcatctatt gcataggtaa tcttgcacgt cgcatccccg gttcattttc 7500tgcgtttcca tcttgcactt caatagcata tctttgttaa cgaagcatct gtgcttcatt 7560ttgtagaaca aaaatgcaac gcgagagcgc taatttttca aacaaagaat ctgagctgca 7620tttttacaga acagaaatgc aacgcgaaag cgctatttta ccaacgaaga atctgtgctt 7680catttttgta aaacaaaaat gcaacgcgac gagagcgcta atttttcaaa caaagaatct 7740gagctgcatt tttacagaac agaaatgcaa cgcgagagcg ctattttacc aacaaagaat 7800ctatacttct tttttgttct acaaaaatgc atcccgagag cgctattttt ctaacaaagc 7860atcttagatt actttttttc tcctttgtgc gctctataat gcagtctctt gataactttt 7920tgcactgtag gtccgttaag gttagaagaa ggctactttg gtgtctattt tctcttccat 7980aaaaaaagcc tgactccact tcccgcgttt actgattact agcgaagctg cgggtgcatt 8040ttttcaagat aaaggcatcc ccgattatat tctataccga tgtggattgc gcatactttg 8100tgaacagaaa gtgatagcgt tgatgattct tcattggtca gaaaattatg aacggtttct 8160tctattttgt ctctatatac tacgtatagg aaatgtttac attttcgtat tgttttcgat 8220tcactctatg aatagttctt actacaattt ttttgtctaa agagtaatac tagagataaa 8280cataaaaaat gtagaggtcg agtttagatg caagttcaag gagcgaaagg tggatgggta 8340ggttatatag ggatatagca cagagatata tagcaaagag atacttttga gcaat 8395

SEQ ID NO 52; LENGTH: 12133; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Description of Artificial Sequence: Synthetic polynucleotide; SEQUENCE: 52
gtttgtggaa gcggtattcg caatttaatt aaagctggtg acaattaatc atcggctcgt 60ataatgtgtg gaattgaatc gatataagga ggttaatcat atgcaagcgc aagtttttcg 120agttccaatg agtaatccag ccgatgttag tggcgtagcc aagctcatcg atgagggagt
180gatccgtgcc gaagaggtcg tctgcgttct cggcaagacc gaaggcaacg gctgtgtcaa 240tgacttcacg cgtggctaca ccaccctcgc gttcaaggtc tacttctccg agaaactggg 300cgtgtcccgg caagaggtcg gcgagcgcat cgctttcatc atgtccggcg gtaccgaagg 360cgtcatggcg cctcactgca ccatcttcac cgtgcagaag acggacaaca agcagaagac 420cgccgctgaa ggcaagcgac ttgccgttca gcagatcttt acccgcgagt tcctgccgga 480ggagatcggc cgcatgccgc aggtcacgga aacagccgac gctgttcgcc gcgccatgcg 540cgaagccggc atcgcggatg catccgatgt ccacttcgtt caggtcaagt gcccactgct 600cactgccggc cgcatgcatg acgctgtcga gcgcgggcat acggttgcca ccgaagatac 660ctatgagtcc atgggctact cccgcggcgc atccgcgctt ggtatcgccc tggccctcgg 720ggaagtcgag aaggccaacc tcagtgatga agttattacc gcagactaca gtctctactc 780ctcggttgcc tcaacttcgg cgggtatcga gttgatgaac aacgagatca tcgtcatggg 840caacagccgc gcatggggtg gtgacctcgt catcggccac gccgagatga aggacgccat 900cgacggtgca gcggtccggc aggccctgcg cgacgtcggg tgctgcgaga acgacctgcc 960gaccgtcgac gagctcggcc gcgtggtcaa tgtatttgcc aaggctgaag cctccccgga 1020cggtgaggtt cgtaaccgcc gccacacgat gctggacgat tcggacatta acagcacgcg 1080ccatgcgcga gcggtcgtca atgcagttat cgcttcgatc gtgggagatc ccatggttta 1140tgtctccggc ggctccgagc atcagggccc cgccggtggc ggtcccgttg cagttatcgc 1200gcgcacagct taaggaaatc cattatgata tactcaacag tcaacgctaa tccttacgct 1260tggccttacg atggttcaat agaccctgct cacaccgctt taatcttaat cgattggcaa 1320atagactttt gtggtccagg tggttatgtc gattccatgg gttacgactt atccttgact 1380agaagtggtt tagaacctac agcaagagta ttggctgcag ccagagatac tggtatgaca 1440gttatccata ctagagaagg tcacagacca gatttggctg acttgccacc taataagaga 1500tggagatctg catcagccgg tgctgaaatc ggttcagttg gtccatgtgg tagaatttta 1560gtcagaggtg aacctggttg ggaaatagta ccagaagttg cacctagaga aggtgaacca 1620attatagata aacctggtaa aggtgctttc tacgcaacag atttggactt gttgttgaga 1680acaagaggta tcacccattt gattttgacc ggtataacta cagatgtttg cgtccacacc 1740actatgagag aagccaacga tagaggttac gaatgtttaa ttttgtctga ttgcaccggt 1800gctactgaca gaaagcatca cgaagctgca ttatctatgg tcaccatgca aggtggtgta 1860ttcggtgcaa ctgcccattc agatgactta ttggccgctt 
tgggtacaac cgttccagca 1920gccgctggtc ctagagctag aacagaataa ggaacgacca tgacagttag ttccgataca 1980actgctgaaa tatcgttagg ttggtcaatc caagactgga ttgatttcca caagtcatca 2040agctcccagg cttcactaag gcttcttgaa tcactactag actctcaaaa tgttgcgcca 2100gtcgataatg cgtggatatc gctaatttca aaggaaaatt tactgcacca attccaaatt 2160ttaaagagca gagaaaataa agaaactcta cctctctacg gtgtccctat tgctgttaag 2220gacaacatcg acgttagagg tctacccacc accgctgcat gtccatcctt tgcatatgag 2280ccttccaaag actctaaagt agtagaacta ctaagaaatg caggtgcgat aatcgtgggt 2340aagacaaact tggaccaatt tgccacagga ttagtcggca cacggtctcc atatgggaaa 2400acaccttgcg cttttagcaa agagcatgta tctggtggtt cctccgctgg gtcagcatcg 2460gtggtcgcca gaggtatcgt accaattgca ttgggtactg atacagcagg ttctggtaga 2520gtcccagccg ccttgaacaa cctgattggc ctaaagccaa caaagggcgt cttttcctgt 2580caaggtgtag ttcccgcttg taaatcttta gactgcgtct ccatctttgc attaaaccta 2640agtgatgctg aacgctgctt ccgcatcatg tgccagccag atcctgataa tgatgaatat 2700tctagaccct atgtttccaa ccctttgaaa aaattttcaa gcaatgtaac gattgctatt 2760cctaaaaata tcccatggta tggtgaaacc aagaatcctg tactgttttc caatgctgtc 2820gaaaatctat caagaacggg cgctaacgtc atagaaattg attttgagcc tcttttagag 2880ttagctcgct gtttatacga aggtacttgg gtggccgagc gttatcaagc tattcaatcg 2940tttttggaca gtaaaccacc aaaggaatct ttggacccta ctgttatttc aattatagaa 3000ggggccaaga aatacagtgc agtagactgc ttcagttttg aatacaaaag acaaggcatc 3060ttgcaaaaag tgagacgact tctcgaatca gtcgatgtat tgtgtgtgcc cacatgtcct 3120ttaaatccta ctatgcaaca agttgcggat gaaccagtcc tagtcaattc aagacaaggc 3180acatggacta attttgtcaa cttggcagat ttggcagccc ttgctgttcc cgcagggttc 3240cgagacgatg gtttgccaaa tggtattact ttaatcggta aaaaattcac agattacgca 3300ctattagagt tggctaaccg ctatttccaa aatatattcc ccaacggttc cagaacatac 3360ggtactttta cctcttcttc agtaaagcca gcaaacgatc aattagtggg accagactat 3420gacccatcta cgtccataaa attggctgtt gtcggtgcac atcttaaggg tctgcctcta 3480cattggcaat tggaaaaggt caatgcaaca tatttatgta caacaaaaac atcaaaagct 3540taccagcttt ttgctttgcc caaaaatgga ccagttttaa aacctggttt gagaagagtt 3600caagatagca 
atggctctca aatcgaatta gaagtgtaca gtgttccaaa agaactgttc 3660ggtgctttta tttccatggt tcctgaacca ttaggaatag gttcagtgga gttagaatct 3720ggtgaatgga tcaaatcctt tatttgtgaa gaatctggtt acaaagccaa aggtacagtt 3780gatatcacaa agtatggtgg atttagagca tattttgaaa tgttgtaagg acacgataat 3840gtcaatggaa acccatagtt atgtagacgt cgcaattcgt aacgcgcgtc ttgccgatac 3900ggagggaatt gtcgatattc ttattcacga tgggcgcatt gcgtccatcg tgaagtcgac 3960aaaaacaaaa ggatcggtgg agatcgatgc tcatgagggt ctggtcactt ccggcctggt 4020agagcctcac atccatctcg ataaggccct gacggcagat cgggttcccg caggaagcat 4080tggcgacctt cgaacgcgac gaggccttga gatggcaatt cgggccaccc gtgatatcaa 4140gcgtacgttc acggttgaag atgttcgaga acgggccata cgtgcggccc tgatggcatc 4200ccgtgcggga accaccgcat tgcggacaca cgtcgatgtc gacccgattg tcggcctcgc 4260aggtatccgt ggtgtccttg aggcgcgtga agtctgcgcg ggattgatcg atatccagat 4320cgtcgccttc cctcaggagg gactcttctg ctctgcgggg gccgtggacc tcatgcggga 4380ggcgatcaaa ctgggcgcgg atgccgtcgg cggcgcaccc gcgctggatg atcgcccgca 4440ggaccatgtc cgagccgttt ttgaccttgc tgctgagttc ggcctgcccg tagacatgca 4500cgtcgatgag tccgaccggc gggaagactt tacgcttccc tttgtgattg aagctgcccg 4560tgaacggcgt gtgcccaatg tgaccgtcgc gcacatcagc tcgctgtccg tacagacgga 4620tgacgtagca cggtcgacca ttgccgccct tgcggacgcc gatgttaatg tcgtggttaa 4680tccgatcatt gtcaaaatta cgcggctgag tgaattactc gatgccggag tctccgtaat 4740gtttggctcg gacaacctgc gggatccgtt ctatccgctc ggagcggcga atccccttgg 4800atcagccatt tttgcctgtc aaattgccgc gctgggaaca ccgcaagatc tcagacgggt 4860attcgatgcg gtcaccatca acgctgcccg
catgctggga ttcccctcac ttttaggcgt 4920cgtggaaggg gcagtcgcgg atctcgcagt attcccatcg gcgacgcccg aggaggttgt 4980tctggatcaa cagtctccgc tcttcgtact caagggcgga cgtgtcgttg ccatgcgatt 5040ggccgctgga tcaacgtcgt tccgcgacta ctcatgagga aatccattat gatgtcagga 5100gaacacacgt taaaagcggt acgaggcagt tttattgatg tcacccgtac gatcgataac 5160ccggaagaga ttgcctctgc gctgcggttt attgaggatg gtttattact cattaaacag 5220ggaaaagtgg aatggtttgg cgaatgggaa aacggaaagc atcaaattcc tgacaccatt 5280cgcgtgcgcg actatcgcgg caaactgata gtaccgggct ttgtcgatac acatatccat 5340tatccgcaaa gtgaaatggt gggggcctat ggtgagcaat tgctggagtg gttgaataaa 5400cacaccttcc ctactgaacg tcgttatgag gatttagagt acgcccgcga aatgtcggcg 5460ttcttcatca agcagctttt acgtaacgga accaccacgg cgctggtgtt tggcactgtt 5520catccgcaat ctgttgatgc gctgtttgaa gccgccagtc atatcaatat gcgtatgatt 5580gccggtaagg tgatgatgga ccgcaacgca ccggattatc tgctcgacac tgccgaaagc 5640agctatcacc aaagcaaaga actgatcgaa cgctggcaca aaaatggtcg tctgctatat 5700gcgattacgc cacgcttcgc cccgacctca tctcctgaac agatggcgat ggcgcaacgc 5760ctgaaagaag aatatccgga tacgtgggta catacccatc tctgtgaaaa caaagatgaa 5820attgcctggg tgaaatcgct ttatcctgac catgatggtt atctggatgt ttaccatcag 5880tacggcctga ccggtaaaaa ctgtgtcttt gctcactgcg tccatctcga agaaaaagag 5940tgggatcgtc tcagcgaaac caaatccagc attgctttct gtccgacctc caacctttac 6000ctcggcagcg gcttattcaa cttgaaaaaa gcatggcaga agaaagttaa agtgggcatg 6060ggaacggata tcggtgccgg aaccactttc aacatgctgc aaacgctgaa cgaagcctac 6120aaagtattgc aattacaagg ctatcgcctc tcggcatatg aagcgtttta cctggccacg 6180ctcggcggag cgaaatctct gggccttgac gatttgattg gcaacttttt acctggcaaa 6240gaggctgatt tcgtggtgat ggaacccacc gccactccgc tacagcagct gcgctatgac 6300aactctgttt ctttagtcga caaattgttc gtgatgatga cgttgggcga tgaccgttcg 6360atctaccgca cctacgttga tggtcgtctg gtgtacgaac gcaactaagg aacgaccatg 6420caaacgctca gcatccagca cggtaccctc gtcacgatgg atcagtaccg cagagtcctt 6480ggggatagct gggttcacgt gcaggatgga cggatcgtcg cgctcggagt gcacgccgag 6540tcggtgcctc cgccagcgga tcgggtgatc gatgcacgcg gcaaggtcgt gttacccggt 
6600ttcatcaatg cccacaccca tgtgaaccag atcctcctgc gcggagggcc ctcgcacggg 6660cgtcaactct atgactggct gttcaacgtt ttgtatccgg gacaaaaggc gatgagaccg 6720gaggacgtag cggtggcggt gaggttgtat tgtgcggaag ctgtgcgcag cgggattacg 6780acgatcaacg acaacgccga ttcggccatc tacccaggca acatcgaggc cgcgatggcg 6840gtctatggtg aggtgggtgt gagggtcgtc tacgcccgca tgttctttga tcggatggac 6900gggcgcattc aagggtatgt ggacgccttg aaggctcgct ctccccaagt cgaactgtgc 6960tcgatcatgg aggaaacggc tgtggccaaa gatcggatca cagccctgtc agatcagtat 7020catggcacgg caggaggtcg tatatcagtt tggcccgctc ctgccattac cccggcggtg 7080acagttgaag gaatgcgatg ggcacaagcc ttcgcccgtg atcgggcggt aatgtggacg 7140cttcacatgg cggagagcga tcatgatgag cggcttcatt ggatgagtcc cgccgagtac 7200atggagtgtt acggactctt ggatgagcgt ctgcaggtcg cgcattgcgt gtactttgac 7260cggaaggatg ttcggctgct gcaccgccac aatgtgaagg tcgcgtcgca ggttgtgagc 7320aatgcctacc tcggctcagg ggtggccccc gtgccagaga tggtggagcg cggcatggcc 7380gtgggcattg gaacagatga cgggaattgt aatgactccg taaacatgat cggagacatg 7440aagtttatgg cccatattca ccgcgcggtg catcgggatg cggacgtgct gaccccagag 7500aagattcttg aaatggcgac gatcgatggg gcgcgttcgt tgggaatgga ccacgagatt 7560ggttccatcg aaaccggcaa gcgcgcggac cttatcctgc ttgacctgcg tcaccctcag 7620acgactcctc accatcattt ggcggccacg atcgtgtttc aggcttacgg caatgaggtg 7680gacactgtcc tgattgacgg aaacgttgtg atggagaacc gccgcttgag ctttcttccc 7740cctgaacgtg agttggcgtt ccttgaggaa gcgcagagcc gcgccacagc tattttgcag 7800cgggcgaaca tggtggctaa cccagcttgg cgcagcctct agcctcaaaa tatattttcc 7860ctctatcttc tcgttgcgct taatttgact aattctcatt agcgaggcgc gcctttccat 7920aggctccgcc cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac 7980ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 8040gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 8100ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 8160ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 8220cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 8280attagcagag cgaggtatgt aggcggtgct 
acagagttct tgaagtggtg gcctaactac 8340ggctacacta gaagaacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga 8400aaaagagttg gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt 8460gtttgcaagc agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt 8520tctacggggt ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatgaga 8580ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa aatgaagttt taaatcaatc 8640taaagtatat atgagtaaac ttggtctgac agttaccaat gcttaatcag tgaggcacct 8700atctcagcga tctgtctatt tcgttcatcc atagttgcct gactccccgt cgtgtagata 8760actacgatac gggagggctt accatctggc cccagtgctg caatgatacc gcgagaccca 8820cgctcaccgg ctccagattt atcagcaata aaccagccag ccggaagggc cgagcgcaga 8880agtggtcctg caactttatc cgcctccatc cagtctatta attgttgccg ggaagctaga 8940gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg ccattgctac aggcatcgtg 9000gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg gttcccaacg atcaaggcga 9060gttacatgat cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc tccgatcgtt 9120gtcagaagta agttggccgc agtgttatca ctcatggtta tggcagcact gcataattct 9180cttactgtca tgccatccgt aagatgcttt tctgtgactg gtgagtactc aaccaagtca 9240ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc cggcgtcaat acgggataat 9300accgcgccac atagcagaac tttaaaagtg ctcatcattg gaaaacgttc ttcggggcga 9360aaactctcaa ggatcttacc gctgttgaga tccagttcga tgtaacccac tcgtgcaccc 9420aactgatctt cagcatcttt tactttcacc agcgtttctg ggtgagcaaa aacaggaagg 9480caaaatgccg caaaaaaggg aataagggcg acacggaaat gttgaatact catactcttc 9540ctttttcaat attattgaag catttatcag ggttattgtc tcatgagcgg atacatattt 9600gaatgtattt agaaaaataa acagcgatcg cgcggccgcg ggtaataact gatataatta 9660aattgaagct ctaatttgtg agtttagtat acatgcattt acttataata cagtttttta 9720gttttgctgg ccgcatcttc tcaaatatgc ttcccagcct gcttttctgt aacgttcacc 9780ctctacctta gcatcccttc cctttgcaaa tagtcctctt ccaacaataa taatgtcaga 9840tcctgtagag accacatcat ccacggttct atactgttga cccaatgcgt ctcccttgtc 9900atctaaaccc acaccgggtg tcataatcaa ccaatcgtaa ccttcatctc ttccacccat 9960gtctctttga gcaataaagc cgataacaaa atctttgtcg ctcttcgcaa tgtcaacagt 
10020acccttagta tattctccag tagctaggga gcccttgcat gacaattctg ctaacatcaa 10080aaggcctcta ggttcctttg ttacttcttc cgccgcctgc ttcaaaccgc taacaatacc 10140tgggcccacc acaccgtgtg cattcgtaat gtctgcccat tctgctattc tgtatacacc 10200cgcagagtac tgcaatttga ctgtattacc aatgtcagca aattttctgt cttcgaagag 10260taaaaaattg tacttggcgg ataatgcctt tagcggctta actgtgccct ccatggaaaa 10320atcagtcaag atatccacat gtgtttttag taaacaaatt ttgggaccta atgcttcaac 10380taactccagt aattccttgg tggtacgaac atccaatgaa gcacacaagt ttgtttgctt 10440ttcgtgcatg atattaaata gcttggcagc aacaggacta ggatgagtag cagcacgttc 10500cttatatgta gctttcgaca tgatttatct tcgtttcctg caggtttttg ttctgtgcag 10560ttgggttaag aatactgggc aatttcatgt ttcttcaaca ccacatatgc gtatatatac 10620caatctaagt ctgtgctcct tccttcgttc ttccttctgc tcggagatta ccgaatcaaa 10680gctagcttat cgatgataag ctgtcaaaga tgagaattaa ttccacggac tatagactat 10740actagatact ccgtctactg tacgatacac ttccgctcag gtccttgtcc tttaacgagg 10800ccttaccact cttttgttac tctattgatc cagctcagca aaggcagtgt gatctaagat 10860tctatcttcg cgatgtagta aaactagcta gaccgagaaa gagactagaa atgcaaaagg 10920cacttctaca atggctgcca tcattattat ccgatgtgac gctgcagctt ctcaatgata 10980ttcgaatacg ctttgaggag atacagccta atatccgaca aactgtttta cagatttacg 11040atcgtacttg ttacccatca ttgaattttg aacatccgaa cctgggagtt ttccctgaaa 11100cagatagtat atttgaacct gtataataat atatagtcta gcgctttacg gaagacaatg 11160tatgtatttc ggttcctgga gaaactattg catctattgc ataggtaatc ttgcacgtcg 11220catccccggt tcattttctg cgtttccatc ttgcacttca atagcatatc tttgttaacg 11280aagcatctgt gcttcatttt gtagaacaaa aatgcaacgc gagagcgcta atttttcaaa 11340caaagaatct gagctgcatt tttacagaac agaaatgcaa cgcgaaagcg ctattttacc 11400aacgaagaat ctgtgcttca tttttgtaaa acaaaaatgc aacgcgacga gagcgctaat 11460ttttcaaaca aagaatctga gctgcatttt tacagaacag aaatgcaacg cgagagcgct 11520attttaccaa caaagaatct atacttcttt tttgttctac aaaaatgcat cccgagagcg 11580ctatttttct aacaaagcat cttagattac tttttttctc ctttgtgcgc tctataatgc 11640agtctcttga taactttttg cactgtaggt ccgttaaggt tagaagaagg ctactttggt 
11700gtctattttc tcttccataa aaaaagcctg actccacttc ccgcgtttac tgattactag 11760cgaagctgcg ggtgcatttt ttcaagataa aggcatcccc gattatattc tataccgatg 11820tggattgcgc atactttgtg aacagaaagt gatagcgttg atgattcttc attggtcaga 11880aaattatgaa cggtttcttc tattttgtct ctatatacta cgtataggaa atgtttacat 11940tttcgtattg ttttcgattc actctatgaa tagttcttac tacaattttt ttgtctaaag 12000agtaatacta gagataaaca taaaaaatgt agaggtcgag tttagatgca agttcaagga 12060gcgaaaggtg gatgggtagg ttatataggg atatagcaca gagatatata gcaaagagat 12120acttttgagc aat 121335312112DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 53gtttgtggaa gcggtattcg caatttaatt aaagctggtg acaattaatc atcggctcgt 60ataatgtgtg gaattgaatc gatataagga ggttaatcat atgtaccata tagatgtatt 120cagaatccct tgccatagtc caggtgacac ttcaggttta gaagatttga tagaaacagg 180tagagtcgct ccagcagata ttgttgctgt catgggtaaa acagagggta atggttgtgt 240taacgactat acaagagaat acgccaccgc tatgttggct gcatgcttag gtagacattt 300gcaattacca cctcacgaag ttgaaaagag agtagctttt gttatgtccg gtggtacaga 360aggtgtattg tctccacatc acaccgtttt cgctagaaga ccagcaatag atgcccatag 420acctgcaggt aaaagattga ctttaggtat cgcttttaca agagatttct tgcctgaaga 480aattggtaga catgcacaaa taaccgaaac tgcaggtgcc gttaagagag ctatgagaga 540tgctggtatc gcatcaatag atgacttaca tttcgtacag gttaagtgtc cattgttgac 600tcctgcaaag atcgcttcag caagatccag aggttgcgct ccagtcacta cagatacata 660tgaaagtatg ggttactcta gaggtgcctc agctttgggt attgcattag ccaccgaaga 720agttccttct tcaatgttgg tcgatgaatc cgtattaaat gactggagtt tgtccagttc 780tttagcttca gcatccgccg gtatagaatt ggaacataac gttgtcattg ccataggcat 840gtccgaacaa gctacaagtg aattagttat cgcacacggt gtcatgtctg atgccattga 900cgccgcttca gttagaagaa ctatagaatc tttgggtatc agatcagatg acgaaatgga 960tagaatagtc aacgtattcg ctaaagcaga agcctctcca gacggtgtag ttagaggcat 1020gagacatact atgttgagtg attctgacat caactctacc agacacgcta gagcagttac 1080tggtgcagcc atagcttctg tcgtaggtca tggtatggtt tatgtctcag gtggtgccga 1140acaccaaggt ccagctggtg gtggtccttt cgcagttatt gccagagctt aaggaaatcc 
1200attatgatat actcaacagt caacgctaat ccttacgctt ggccttacga tggttcaata 1260gaccctgctc acaccgcttt aatcttaatc gattggcaaa tagacttttg tggtccaggt 1320ggttatgtcg attccatggg ttacgactta tccttgacta gaagtggttt agaacctaca 1380gcaagagtat tggctgcagc cagagatact ggtatgacag ttatccatac tagagaaggt 1440cacagaccag atttggctga cttgccacct aataagagat ggagatctgc atcagccggt 1500gctgaaatcg gttcagttgg tccatgtggt agaattttag tcagaggtga acctggttgg 1560gaaatagtac cagaagttgc acctagagaa ggtgaaccaa ttatagataa acctggtaaa 1620ggtgctttct acgcaacaga tttggacttg ttgttgagaa caagaggtat cacccatttg 1680attttgaccg gtataactac agatgtttgc gtccacacca ctatgagaga agccaacgat 1740agaggttacg aatgtttaat tttgtctgat tgcaccggtg ctactgacag aaagcatcac 1800gaagctgcat tatctatggt caccatgcaa ggtggtgtat tcggtgcaac tgcccattca 1860gatgacttat tggccgcttt gggtacaacc gttccagcag ccgctggtcc tagagctaga 1920acagaataag gaacgaccat gacagttagt tccgatacaa ctgctgaaat atcgttaggt 1980tggtcaatcc aagactggat tgatttccac aagtcatcaa gctcccaggc ttcactaagg 2040cttcttgaat cactactaga ctctcaaaat gttgcgccag tcgataatgc gtggatatcg 2100ctaatttcaa aggaaaattt actgcaccaa ttccaaattt taaagagcag agaaaataaa 2160gaaactctac ctctctacgg tgtccctatt gctgttaagg acaacatcga cgttagaggt 2220ctacccacca ccgctgcatg tccatccttt gcatatgagc cttccaaaga ctctaaagta 2280gtagaactac taagaaatgc aggtgcgata atcgtgggta agacaaactt ggaccaattt 2340gccacaggat tagtcggcac acggtctcca tatgggaaaa caccttgcgc ttttagcaaa 2400gagcatgtat ctggtggttc ctccgctggg tcagcatcgg tggtcgccag aggtatcgta 2460ccaattgcat tgggtactga tacagcaggt tctggtagag tcccagccgc cttgaacaac 2520ctgattggcc taaagccaac aaagggcgtc ttttcctgtc aaggtgtagt tcccgcttgt 2580aaatctttag actgcgtctc catctttgca ttaaacctaa gtgatgctga acgctgcttc 2640cgcatcatgt gccagccaga tcctgataat gatgaatatt ctagacccta tgtttccaac 2700cctttgaaaa aattttcaag caatgtaacg attgctattc ctaaaaatat cccatggtat 2760ggtgaaacca agaatcctgt actgttttcc aatgctgtcg aaaatctatc aagaacgggc 2820gctaacgtca tagaaattga ttttgagcct cttttagagt tagctcgctg tttatacgaa 2880ggtacttggg tggccgagcg ttatcaagct 
attcaatcgt ttttggacag taaaccacca 2940aaggaatctt tggaccctac tgttatttca attatagaag gggccaagaa atacagtgca 3000gtagactgct tcagttttga atacaaaaga caaggcatct tgcaaaaagt gagacgactt 3060ctcgaatcag tcgatgtatt gtgtgtgccc acatgtcctt taaatcctac tatgcaacaa 3120gttgcggatg aaccagtcct agtcaattca agacaaggca catggactaa ttttgtcaac 3180ttggcagatt tggcagccct tgctgttccc gcagggttcc gagacgatgg tttgccaaat 3240ggtattactt taatcggtaa aaaattcaca gattacgcac tattagagtt ggctaaccgc 3300tatttccaaa atatattccc caacggttcc agaacatacg gtacttttac ctcttcttca 3360gtaaagccag caaacgatca attagtggga ccagactatg acccatctac gtccataaaa 3420ttggctgttg tcggtgcaca tcttaagggt ctgcctctac attggcaatt ggaaaaggtc 3480aatgcaacat atttatgtac aacaaaaaca tcaaaagctt accagctttt tgctttgccc 3540aaaaatggac cagttttaaa acctggtttg agaagagttc aagatagcaa tggctctcaa 3600atcgaattag aagtgtacag tgttccaaaa gaactgttcg gtgcttttat ttccatggtt 3660cctgaaccat taggaatagg ttcagtggag ttagaatctg gtgaatggat caaatccttt 3720atttgtgaag aatctggtta caaagccaaa ggtacagttg atatcacaaa gtatggtgga 3780tttagagcat attttgaaat gttgtaagga cacgataatg tcaatggaaa cccatagtta 3840tgtagacgtc gcaattcgta acgcgcgtct tgccgatacg gagggaattg tcgatattct 3900tattcacgat gggcgcattg cgtccatcgt gaagtcgaca aaaacaaaag gatcggtgga 3960gatcgatgct catgagggtc tggtcacttc cggcctggta gagcctcaca tccatctcga 4020taaggccctg acggcagatc gggttcccgc aggaagcatt ggcgaccttc gaacgcgacg 4080aggccttgag atggcaattc gggccacccg tgatatcaag cgtacgttca cggttgaaga 4140tgttcgagaa cgggccatac gtgcggccct gatggcatcc cgtgcgggaa ccaccgcatt 4200gcggacacac gtcgatgtcg acccgattgt cggcctcgca ggtatccgtg gtgtccttga 4260ggcgcgtgaa gtctgcgcgg gattgatcga tatccagatc gtcgccttcc ctcaggaggg 4320actcttctgc tctgcggggg ccgtggacct catgcgggag gcgatcaaac tgggcgcgga 4380tgccgtcggc ggcgcacccg cgctggatga tcgcccgcag gaccatgtcc gagccgtttt 4440tgaccttgct gctgagttcg gcctgcccgt agacatgcac gtcgatgagt ccgaccggcg 4500ggaagacttt acgcttccct ttgtgattga agctgcccgt gaacggcgtg tgcccaatgt 4560gaccgtcgcg cacatcagct cgctgtccgt acagacggat gacgtagcac ggtcgaccat 
4620tgccgccctt gcggacgccg atgttaatgt cgtggttaat ccgatcattg tcaaaattac 4680gcggctgagt gaattactcg atgccggagt ctccgtaatg tttggctcgg acaacctgcg 4740ggatccgttc tatccgctcg gagcggcgaa tccccttgga tcagccattt ttgcctgtca 4800aattgccgcg ctgggaacac cgcaagatct cagacgggta ttcgatgcgg tcaccatcaa 4860cgctgcccgc atgctgggat tcccctcact tttaggcgtc gtggaagggg cagtcgcgga 4920tctcgcagta ttcccatcgg cgacgcccga ggaggttgtt ctggatcaac agtctccgct 4980cttcgtactc aagggcggac gtgtcgttgc catgcgattg gccgctggat caacgtcgtt 5040ccgcgactac tcatgaggaa atccattatg atgtcaggag aacacacgtt aaaagcggta 5100cgaggcagtt ttattgatgt cacccgtacg atcgataacc cggaagagat tgcctctgcg 5160ctgcggttta ttgaggatgg tttattactc attaaacagg gaaaagtgga atggtttggc 5220gaatgggaaa acggaaagca tcaaattcct gacaccattc gcgtgcgcga ctatcgcggc 5280aaactgatag taccgggctt tgtcgataca catatccatt atccgcaaag tgaaatggtg 5340ggggcctatg gtgagcaatt gctggagtgg ttgaataaac acaccttccc tactgaacgt 5400cgttatgagg atttagagta cgcccgcgaa atgtcggcgt tcttcatcaa gcagctttta 5460cgtaacggaa ccaccacggc gctggtgttt ggcactgttc atccgcaatc tgttgatgcg 5520ctgtttgaag ccgccagtca tatcaatatg cgtatgattg ccggtaaggt gatgatggac 5580cgcaacgcac cggattatct gctcgacact gccgaaagca gctatcacca aagcaaagaa 5640ctgatcgaac gctggcacaa aaatggtcgt ctgctatatg cgattacgcc acgcttcgcc 5700ccgacctcat ctcctgaaca gatggcgatg gcgcaacgcc tgaaagaaga atatccggat 5760acgtgggtac atacccatct ctgtgaaaac aaagatgaaa ttgcctgggt gaaatcgctt 5820tatcctgacc atgatggtta tctggatgtt taccatcagt acggcctgac cggtaaaaac 5880tgtgtctttg ctcactgcgt ccatctcgaa gaaaaagagt gggatcgtct cagcgaaacc 5940aaatccagca ttgctttctg tccgacctcc aacctttacc tcggcagcgg cttattcaac 6000ttgaaaaaag catggcagaa gaaagttaaa gtgggcatgg gaacggatat cggtgccgga 6060accactttca acatgctgca aacgctgaac gaagcctaca aagtattgca attacaaggc 6120tatcgcctct cggcatatga agcgttttac ctggccacgc tcggcggagc gaaatctctg 6180ggccttgacg atttgattgg caacttttta cctggcaaag aggctgattt cgtggtgatg 6240gaacccaccg ccactccgct acagcagctg cgctatgaca actctgtttc tttagtcgac 6300aaattgttcg tgatgatgac gttgggcgat 
gaccgttcga tctaccgcac ctacgttgat 6360ggtcgtctgg tgtacgaacg caactaagga acgaccatgc aaacgctcag catccagcac 6420ggtaccctcg tcacgatgga tcagtaccgc agagtccttg gggatagctg ggttcacgtg 6480caggatggac ggatcgtcgc gctcggagtg cacgccgagt cggtgcctcc gccagcggat 6540cgggtgatcg atgcacgcgg caaggtcgtg ttacccggtt tcatcaatgc ccacacccat 6600gtgaaccaga tcctcctgcg cggagggccc tcgcacgggc gtcaactcta tgactggctg 6660ttcaacgttt tgtatccggg acaaaaggcg atgagaccgg aggacgtagc ggtggcggtg 6720aggttgtatt gtgcggaagc tgtgcgcagc gggattacga cgatcaacga caacgccgat 6780tcggccatct acccaggcaa catcgaggcc gcgatggcgg tctatggtga ggtgggtgtg 6840agggtcgtct acgcccgcat gttctttgat cggatggacg ggcgcattca agggtatgtg 6900gacgccttga aggctcgctc tccccaagtc gaactgtgct cgatcatgga ggaaacggct 6960gtggccaaag atcggatcac agccctgtca gatcagtatc atggcacggc aggaggtcgt 7020atatcagttt ggcccgctcc tgccattacc ccggcggtga cagttgaagg aatgcgatgg 7080gcacaagcct tcgcccgtga tcgggcggta atgtggacgc ttcacatggc ggagagcgat 7140catgatgagc ggcttcattg gatgagtccc gccgagtaca tggagtgtta cggactcttg 7200gatgagcgtc tgcaggtcgc gcattgcgtg tactttgacc ggaaggatgt tcggctgctg 7260caccgccaca atgtgaaggt cgcgtcgcag gttgtgagca atgcctacct cggctcaggg 7320gtggcccccg tgccagagat ggtggagcgc ggcatggccg tgggcattgg aacagatgac 7380gggaattgta atgactccgt aaacatgatc ggagacatga agtttatggc ccatattcac 7440cgcgcggtgc atcgggatgc ggacgtgctg accccagaga agattcttga aatggcgacg 7500atcgatgggg cgcgttcgtt gggaatggac cacgagattg gttccatcga aaccggcaag 7560cgcgcggacc ttatcctgct tgacctgcgt caccctcaga cgactcctca ccatcatttg 7620gcggccacga tcgtgtttca ggcttacggc aatgaggtgg acactgtcct gattgacgga
7680aacgttgtga tggagaaccg ccgcttgagc tttcttcccc ctgaacgtga gttggcgttc 7740cttgaggaag cgcagagccg cgccacagct attttgcagc gggcgaacat ggtggctaac 7800ccagcttggc gcagcctcta gcctcaaaat atattttccc tctatcttct cgttgcgctt 7860aatttgacta attctcatta gcgaggcgcg cctttccata ggctccgccc ccctgacgag 7920catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 7980caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 8040ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 8100aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 8160gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 8220cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 8280ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aagaacagta 8340tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 8400tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 8460cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 8520tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 8580tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact 8640tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt 8700cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta 8760ccatctggcc ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta 8820tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc 8880gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat 8940agtttgcgca acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt 9000atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg 9060tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca 9120gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta 9180agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg 9240cgaccgagtt gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact 9300ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg 9360ctgttgagat ccagttcgat gtaacccact 
cgtgcaccca actgatcttc agcatctttt 9420actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga 9480ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc 9540atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta gaaaaataaa 9600cagcgatcgc gcggccgcgg gtaataactg atataattaa attgaagctc taatttgtga 9660gtttagtata catgcattta cttataatac agttttttag ttttgctggc cgcatcttct 9720caaatatgct tcccagcctg cttttctgta acgttcaccc tctaccttag catcccttcc 9780ctttgcaaat agtcctcttc caacaataat aatgtcagat cctgtagaga ccacatcatc 9840cacggttcta tactgttgac ccaatgcgtc tcccttgtca tctaaaccca caccgggtgt 9900cataatcaac caatcgtaac cttcatctct tccacccatg tctctttgag caataaagcc 9960gataacaaaa tctttgtcgc tcttcgcaat gtcaacagta cccttagtat attctccagt 10020agctagggag cccttgcatg acaattctgc taacatcaaa aggcctctag gttcctttgt 10080tacttcttcc gccgcctgct tcaaaccgct aacaatacct gggcccacca caccgtgtgc 10140attcgtaatg tctgcccatt ctgctattct gtatacaccc gcagagtact gcaatttgac 10200tgtattacca atgtcagcaa attttctgtc ttcgaagagt aaaaaattgt acttggcgga 10260taatgccttt agcggcttaa ctgtgccctc catggaaaaa tcagtcaaga tatccacatg 10320tgtttttagt aaacaaattt tgggacctaa tgcttcaact aactccagta attccttggt 10380ggtacgaaca tccaatgaag cacacaagtt tgtttgcttt tcgtgcatga tattaaatag 10440cttggcagca acaggactag gatgagtagc agcacgttcc ttatatgtag ctttcgacat 10500gatttatctt cgtttcctgc aggtttttgt tctgtgcagt tgggttaaga atactgggca 10560atttcatgtt tcttcaacac cacatatgcg tatatatacc aatctaagtc tgtgctcctt 10620ccttcgttct tccttctgct cggagattac cgaatcaaag ctagcttatc gatgataagc 10680tgtcaaagat gagaattaat tccacggact atagactata ctagatactc cgtctactgt 10740acgatacact tccgctcagg tccttgtcct ttaacgaggc cttaccactc ttttgttact 10800ctattgatcc agctcagcaa aggcagtgtg atctaagatt ctatcttcgc gatgtagtaa 10860aactagctag accgagaaag agactagaaa tgcaaaaggc acttctacaa tggctgccat 10920cattattatc cgatgtgacg ctgcagcttc tcaatgatat tcgaatacgc tttgaggaga 10980tacagcctaa tatccgacaa actgttttac agatttacga tcgtacttgt tacccatcat 11040tgaattttga acatccgaac ctgggagttt tccctgaaac agatagtata 
tttgaacctg 11100tataataata tatagtctag cgctttacgg aagacaatgt atgtatttcg gttcctggag 11160aaactattgc atctattgca taggtaatct tgcacgtcgc atccccggtt cattttctgc 11220gtttccatct tgcacttcaa tagcatatct ttgttaacga agcatctgtg cttcattttg 11280tagaacaaaa atgcaacgcg agagcgctaa tttttcaaac aaagaatctg agctgcattt 11340ttacagaaca gaaatgcaac gcgaaagcgc tattttacca acgaagaatc tgtgcttcat 11400ttttgtaaaa caaaaatgca acgcgacgag agcgctaatt tttcaaacaa agaatctgag 11460ctgcattttt acagaacaga aatgcaacgc gagagcgcta ttttaccaac aaagaatcta 11520tacttctttt ttgttctaca aaaatgcatc ccgagagcgc tatttttcta acaaagcatc 11580ttagattact ttttttctcc tttgtgcgct ctataatgca gtctcttgat aactttttgc 11640actgtaggtc cgttaaggtt agaagaaggc tactttggtg tctattttct cttccataaa 11700aaaagcctga ctccacttcc cgcgtttact gattactagc gaagctgcgg gtgcattttt 11760tcaagataaa ggcatccccg attatattct ataccgatgt ggattgcgca tactttgtga 11820acagaaagtg atagcgttga tgattcttca ttggtcagaa aattatgaac ggtttcttct 11880attttgtctc tatatactac gtataggaaa tgtttacatt ttcgtattgt tttcgattca 11940ctctatgaat agttcttact acaatttttt tgtctaaaga gtaatactag agataaacat 12000aaaaaatgta gaggtcgagt ttagatgcaa gttcaaggag cgaaaggtgg atgggtaggt 12060tatataggga tatagcacag agatatatag caaagagata cttttgagca at 12112541173DNARhodococcus sp. 
54ttgcgggaag tccaactgct cgacggccgc cgagttgatg ttgcatgcgc cggaccgttg 60atctccgaaa tcggcgccca tctcgacctc accgctccag tggagatcga ctgtggcggc 120ggcctggcga cgcgaccgtt caccgaaccc catttgcacc tcgacaaggc ggggaccgcc 180gatcgtctac cggcaggcgc cagcaccatc ggtgatgcga tcgccgccat gcaatcggtg 240aaagtcactg agcgcgacaa tgtggcggcg gtcgccgcac gaatgcaccg cgtcctgaac 300cgcattgtcg acgatggttc ccacgccatt cgcgctctcg tcgacgtcga tgaggtctgg 360ggattgaccg cttttcatgc agcccaacaa gtccaagctg ctctcgcgcc gcgcgcggta 420gtacaaatcg tggccttccc acaacatggc ctcaccccgc aggtacttgc catgctcgag 480caagcggccg cagaaggtgc aggagcactc ggcgcccaca ccgacgtcga ccctgaccca 540gcggcgcacg tcggtgctgt ggccgccatt gccgccgggg catcgctacc gctcgaagtc 600cacactgatg aaggcgccag tcccgacaag ttctacttgc ctgcagtact ggaggtcctc 660gaccggtttc ctggactctc gacgaccctc gcacactgtc tgtcactcgg aacgatcgcg 720ccgaaacaac agcagcactg gattgaggaa ctggcccatc gggacatcaa agtctgtgtc 780gcgcctagca ttttaggttt cggcctgccc ttggcgccag tccgggcact catcgaggcc 840ggcgtcggaa tacttgtcgg atcagacaac ctgcaggacg ttttctttcc gctcggtacg 900ggccgcgcca tcgaaaacgt gcgtctgctg gcgaccgcag cacagctcac cgcacctgag 960ctcgctggcc cgctcatcgc aggtgtcacc gacatcgcgt acgccaccgt gaccggcgca 1020gcagatgcac tggcggtgga atcccccgca accctcgtcg tccacgacgc gacctcgccg 1080gcggagctgc ttcgcggcat cgacggtact cgaatcaccg ttatcgacgg cctgttgaca 1140tccccgctcc aactcgacaa aggaatcaag tga 1173551239DNAPseudomonas sp. 
55atgtcaatgg aaacccatag ttatgtagac gtcgcaattc gtaacgcgcg tcttgccgat 60acggagggaa ttgtcgatat tcttattcac gatgggcgca ttgcgtccat cgtgaagtcg 120acaaaaacaa aaggatcggt ggagatcgat gctcatgagg gtctggtcac ttccggcctg 180gtagagcctc acatccatct cgataaggcc ctgacggcag atcgggttcc cgcaggaagc 240attggcgacc ttcgaacgcg acgaggcctt gagatggcaa ttcgggccac ccgtgatatc 300aagcgtacgt tcacggttga agatgttcga gaacgggcca tacgtgcggc cctgatggca 360tcccgtgcgg gaaccaccgc attgcggaca cacgtcgatg tcgacccgat tgtcggcctc 420gcaggtatcc gtggtgtcct tgaggcgcgt gaagtctgcg cgggattgat cgatatccag 480atcgtcgcct tccctcagga gggactcttc tgctctgcgg gggccgtgga cctcatgcgg 540gaggcgatca aactgggcgc ggatgccgtc ggcggcgcac ccgcgctgga tgatcgcccg 600caggaccatg tccgagccgt ttttgacctt gctgctgagt tcggcctgcc cgtagacatg 660cacgtcgatg agtccgaccg gcgggaagac tttacgcttc cctttgtgat tgaagctgcc 720cgtgaacggc gtgtgcccaa tgtgaccgtc gcgcacatca gctcgctgtc cgtacagacg 780gatgacgtag cacggtcgac cattgccgcc cttgcggacg ccgatgttaa tgtcgtggtt 840aatccgatca ttgtcaaaat tacgcggctg agtgaattac tcgatgccgg agtctccgta 900atgtttggct cggacaacct gcgggatccg ttctatccgc tcggagcggc gaatcccctt 960ggatcagcca tttttgcctg tcaaattgcc gcgctgggaa caccgcaaga tctcagacgg 1020gtattcgatg cggtcaccat caacgctgcc cgcatgctgg gattcccctc acttttaggc 1080gtcgtggaag gggcagtcgc ggatctcgca gtattcccat cggcgacgcc cgaggaggtt 1140gttctggatc aacagtctcc gctcttcgta ctcaagggcg gacgtgtcgt tgccatgcga 1200ttggccgctg gatcaacgtc gttccgcgac tactcatga 123956726DNARhodococcus sp. 
56atgatatact caacagtcaa cgctaatcct tacgcttggc cttacgatgg ttcaatagac 60cctgctcaca ccgctttaat cttaatcgat tggcaaatag acttttgtgg tccaggtggt 120tatgtcgatt ccatgggtta cgacttatcc ttgactagaa gtggtttaga acctacagca 180agagtattgg ctgcagccag agatactggt atgacagtta tccatactag agaaggtcac 240agaccagatt tggctgactt gccacctaat aagagatgga gatctgcatc agccggtgct 300gaaatcggtt cagttggtcc atgtggtaga attttagtca gaggtgaacc tggttgggaa 360atagtaccag aagttgcacc tagagaaggt gaaccaatta tagataaacc tggtaaaggt 420gctttctacg caacagattt ggacttgttg ttgagaacaa gaggtatcac ccatttgatt 480ttgaccggta taactacaga tgtttgcgtc cacaccacta tgagagaagc caacgataga 540ggttacgaat gtttaatttt gtctgattgc accggtgcta ctgacagaaa gcatcacgaa 600gctgcattat ctatggtcac catgcaaggt ggtgtattcg gtgcaactgc ccattcagat 660gacttattgg ccgctttggg tacaaccgtt ccagcagccg ctggtcctag agctagaaca 720gaataa 72657702DNARhizobium leguminosarum 57atggacgcga tggtcgaaac caaccggcat tttatcgacg ccgatccgta tccgtggccc 60tataacggag ctctgaggcc tgacaatacc gccctcatca tcatcgacat gcagacggat 120ttctgcggca agggcggtta tgtcgaccac atgggctacg acctgtcgct ggtgcaggcg 180ccgatcgaac ccatcaaacg cgtgcttgcc gccatgcggg ccaagggtta tcacatcatc 240cacacccgcg agggccaccg ccccgacctc gccgatctgc cagcaaacaa acgctggcgc 300tcgcaacgga tcggggccgg catcggtgat cccggcccct gcggccgaat cctgacgcgt 360ggcgaacccg gctgggacat catccccgaa ctctacccga tcgaaggcga gacgatcatc 420gacaagcccg gcaagggttc gttctgcgcc accgacctcg aactcgtcct caaccagaaa 480cgcatcgaga acattatcct caccgggatc accaccgatg tctgcgtctc gacgacgatg 540cgcgaggcga acgaccgcgg ctacgaatgc ctgctgctgg aggactgctg tggtgcgacc 600gactacggaa accacctcgc cgccatcaag atggtgaaga tgcagggcgg cgtcttcggc 660tcggtctcca attccgcggc tctagtcgag gcgctgccct ga 70258474PRTArtificial SequenceDescription of Artificial Sequence Synthetic polypeptide 58Met Gln Thr Leu Ser Ile Gln His Gly Thr Leu Val Thr Met Asp Gln1 5 10 15Tyr Arg Arg Val Leu Gly Asp Ser Trp Val His Val Gln Asp Gly Arg 20 25 30Ile Val Ala Leu Gly Val His Ala Glu Ser Val Pro Pro Pro Ala Asp 35 40 45Arg Val Ile Asp 
Ala Arg Gly Lys Val Val Leu Pro Gly Phe Ile Asn 50 55 60Ala His Thr His Val Asn Gln Ile Leu Leu Arg Gly Gly Pro Ser His65 70 75 80Gly Arg Gln Leu Tyr Asp Trp Leu Phe Asn Val Leu Tyr Pro Gly Gln 85 90 95Lys Ala Met Arg Pro Glu Asp Val Ala Val Ala Val Arg Leu Tyr Cys 100 105 110Ala Glu Ala Val Arg Ser Gly Ile Thr Thr Ile Asn Asp Asn Ala Asp 115 120 125Ser Ala Ile Tyr Pro Gly Asn Ile Glu Ala Ala Met Ala Val Tyr Gly 130 135 140Glu Val Gly Val Arg Val Val Tyr Ala Arg Met Phe Phe Asp Arg Met145 150 155 160Asp Gly Arg Ile Gln Gly Tyr Val Asp Ala Leu Lys Ala Arg Ser Pro 165 170 175Gln Val Glu Leu Cys Ser Ile Met Glu Glu Thr Ala Val Ala Lys Asp 180 185 190Arg Ile Thr Ala Leu Ser Asp Gln Tyr His Gly Thr Ala Gly Gly Arg 195 200 205Ile Ser Val Trp Pro Ala Pro Ala Ile Thr Pro Ala Val Thr Val Glu 210 215 220Gly Met Arg Trp Ala Gln Ala Phe Ala Arg Asp Arg Ala Val Met Trp225 230 235 240Thr Leu His Met Ala Glu Ser Asp His Asp Glu Arg Leu His Trp Met 245 250 255Ser Pro Ala Glu Tyr Met Glu Cys Tyr Gly Leu Leu Asp Glu Arg Leu 260 265 270Gln Val Ala His Cys Val Tyr Phe Asp Arg Lys Asp Val Arg Leu Leu 275 280 285His Arg His Asn Val Lys Val Ala Ser Gln Val Val Ser Asn Ala Tyr 290 295 300Leu Gly Ser Gly Val Ala Pro Val Pro Glu Met Val Glu Arg Gly Met305 310 315 320Ala Val Gly Ile Gly Thr Asp Asp Gly Asn Cys Asn Asp Ser Val Asn 325 330 335Met Ile Gly Asp Met Lys Phe Met Ala His Ile His Arg Ala Val His 340 345 350Arg Asp Ala Asp Val Leu Thr Pro Glu Lys Ile Leu Glu Met Ala Thr 355 360 365Ile Asp Gly Ala Arg Ser Leu Gly Met Asp His Glu Ile Gly Ser Ile 370 375 380Glu Thr Gly Lys Arg Ala Asp Leu Ile Leu Leu Asp Leu Arg His Pro385 390 395 400Gln Thr Thr Pro His His His Leu Ala Ala Thr Ile Val Phe Gln Ala 405 410 415Tyr Gly Asn Glu Val Asp Thr Val Leu Ile Asp Gly Asn Val Val Met 420 425 430Glu Asn Arg Arg Leu Ser Phe Leu Pro Pro Glu Arg Glu Leu Ala Phe 435 440 445Leu Glu Glu Ala Gln Ser Arg Ala Thr Ala Ile Leu Gln Arg Ala Asn 450 455 460Met Val Ala Asn Pro Ala Trp Arg Ser Leu465 
470

SEQ ID NO: 59; LENGTH: 423; TYPE: DNA; ORGANISM: Saccharomyces cerevisiae; SEQUENCE: 59
aggaacccat caggttggtg gaagattacc cgttctaaga cttttcagct tcctctattg 60
atgttacacc tggacacccc ttttctggca tccagttttt aatcttcagt ggcatgtgag 120
attctccgaa attaattaaa gcaatcacac aattctctcg gataccacct cggttgaaac 180
tgacaggtgg tttgttacgc atgctaatgc aaaggagcct atataccttt ggctcggctg 240
ctgtaacagg gaatataaag ggcagcataa tttaggagtt tagtgaactt gcaacattta 300
ctattttccc ttcttacgta aatatttttc tttttaattc taaatcaatc tttttcaatt 360
ttttgtttgt attcttttct tgcttaaatc tataactaca aaaaacacat acataaacta 420
aaa 423

SEQ ID NO: 60; LENGTH: 658; TYPE: DNA; ORGANISM: Saccharomyces cerevisiae; SEQUENCE: 60
ttgctacgca ggctgcacaa ttacacgaga atgctcccgc ctaggattta aggctaaggg 60
acgtgcaatg cagacgacag atctaaatga ccgtgtcggt gaagtgttcg ccaaactttt 120
cggttaacac atgcagtgat gcacgcgcga tggtgctaag ttacatatat atatatatat 180
atatatatat atatatatag ccatagtgat gtctaagtaa cctttatggt atatttctta 240
atgtggaaag atactagcgc gcgcacccac acacaagctt cgtcttttct tgaagaaaag 300
aggaagctcg ctaaatggga ttccactttc cgttccctgc cagctgatgg aaaaaggtta 360
gtggaacgat gaagaataaa aagagagatc cactgaggtg aaatttcagc tgacagcgag 420
tttcatgatc gtgatgaaca atggtaacga gttgtggctg ttgccaggga gggtggttct 480
caacttttaa tgtatggcca aatcgctact tgggtttgtt atataacaaa gaagaaataa 540
tgaactgatt ctcttcctcc ttcttgtcct ttcttaattc tgttgtaatt accttccttt 600
gtaatttttt ttgtaattat tcttcttaat aatccaaaca aacacacata ttacaata 658

SEQ ID NO: 61; LENGTH: 573; TYPE: DNA; ORGANISM: Saccharomyces cerevisiae; SEQUENCE: 61
tgctgtaacc cgtacatgcc caaaataggg ggcgggttac acagaatata taacatcgta 60
ggtgtctggg tgaacagttt attcctggca tccactaaat ataatggagc ccgcttttta 120
agctggcatc cagaaaaaaa aagaatccca gcaccaaaat attgttttct tcaccaacca 180
tcagttcata ggtccattct cttagcgcaa ctacagagaa caggggcaca aacaggcaaa 240
aaacgggcac aacctcaatg gagtgatgca acctgcctgg agtaaatgat gacacaaggc 300
aattgaccca cgcatgtatc tatctcattt tcttacacct tctattacct tctgctctct 360
ctgatttgga aaaagctgaa aaaaaaggtt gaaaccagtt ccctgaaatt attcccctac 420
ttgactaata agtatataaa gacggtaggt attgattgta attctgtaaa tctatttctt 480
aaacttctta aattctactt ttatagttag tctttttttt agttttaaaa caccaagaac 540
ttagtttcga ataaacacac ataaacaaac aaa 573

SEQ ID NO: 62; LENGTH: 812; TYPE: DNA; ORGANISM: Saccharomyces cerevisiae; SEQUENCE: 62
gcaccgctgg cttgaacaac aataccagcc ttccaacttc tgtaaataac ggcggtacgc 60
cagtgccacc agtaccgtta cctttcggta tacctccttt ccccatgttt ccaatgccct 120
tcatgcctcc aacggctact atcacaaatc ctcatcaagc tgacgcaagc cctaagaaat 180
gaataacaat actgacagta ctaaataatt gcctacttgg cttcacatac gttgcatacg 240
tcgatataga taataatgat aatgacagca ggattatcgt aatacgtaat agttgaaaat 300
ctcaaaaatg tgtgggtcat tacgtaaata atgataggaa tgggattctt ctatttttcc 360
tttttccatt ctagcagccg tcgggaaaac gtggcatcct ctctttcggg ctcaattgga 420
gtcacgctgc cgtgagcatc ctctctttcc atatctaaca actgagcacg taaccaatgg 480
aaaagcatga gcttagcgtt gctccaaaaa agtattggat ggttaatacc atttgtctgt 540
tctcttctga ctttgactcc tcaaaaaaaa aaaatctaca atcaacagat cgcttcaatt 600
acgccctcac aaaaactttt ttccttcttc ttcgcccacg ttaaatttta tccctcatgt 660
tgtctaacgg atttctgcac ttgatttatt ataaaaagac aaagacataa tacttctcta 720
tcaatttcag ttattgttct tccttgcgtt attcttctgt tcttcttttt cttttgtcat 780
atataaccat aaccaagtaa tacatattca aa 812

SEQ ID NO: 63; LENGTH: 625; TYPE: DNA; ORGANISM: Yarrowia lipolytica; SEQUENCE: 63
tataaacggt attttcacaa ttgcacccca gccagaccga tagccggtcg caatccgcca 60
cccacaaccg tctacctccc acagaacccc gtcacttcca cccttttcca ccagatcata 120
tgtcccaact tgccaaatta aaaccgtgcg aattttcaaa ataaactttg gcaaagaggc 180
tgcaaaggag gggctggtga gggcgtctgg aagtcgacca gagaccgggt tggcggcgca 240
tttgtgtccc aaaaaacagc cccaattgcc ccaattgacc ccaaattgac ccagtagcgg 300
gcccaacccc ggcgagagcc cccttctccc cacatatcaa acctcccccg gttcccacac 360
ttgccgttaa gggcgtaggg tactgcagtc tggaatctac gcttgttcag actttgtact 420
agtttctttg tctggccatc cgggtaaccc atgccggacg caaaatagac tactgaaaat 480
ttttttgctt tgtggttggg actttagcca agggtataaa agaccaccgt ccccgaatta 540
cctttcctct tcttttctct ctctccttgt caactcacac ccgaaatcgt taagcatttc 600
cttctgagta taagaatcat tcaaa 625

SEQ ID NO: 64; LENGTH: 318; TYPE: DNA; ORGANISM: Saccharomyces cerevisiae; SEQUENCE: 64
gcataatatt gtccgctgcc cgtttttctg ttagacggtg tcttgatcta cttgctatcg 60
ttcaacacca ccttattttc taactatttt ttttttagct catttgaatc agcttatggt 120
gatggcacat ttttgcataa acctagctgt cctcgttgaa cataggaaaa aaaaatatat 180
aaacaaggct ctttcactct ccttggaatc agatttgggt ttgttccctt tattttcata 240
tttcttgtca tattcttttc tcaattatta tcttctactc ataacctcac gcaaaataac 300
acagtcaaat caatcaaa 318

SEQ ID NO: 65; LENGTH: 413; TYPE: DNA; ORGANISM: Saccharomyces cerevisiae; SEQUENCE: 65
catagcttca aaatgtttct actccttttt tactcttcca gattttctcg gactccgcgc 60
atcgccgtac cacttcaaaa cacccaagca cagcatacta aatttcccct ctttcttcct 120
ctagggtgtc gttaattacc cgtactaaag gtttggaaaa gaaaaaagag accgcctcgt 180
ttctttttct tcgtcgaaaa aggcaataaa aatttttatc acgtttcttt ttcttgaaaa 240
tttttttttt tgattttttt ctctttcgat gacctcccat tgatatttaa gttaataaac 300
ggtcttcaat ttctcaagtt tcagtttcat ttttcttgtt ctattacaac tttttttact 360
tcttgctcat tagaaagaaa gcatagcaat ctaatctaag ttttaattac aaa 413

SEQ ID NO: 66; LENGTH: 330; TYPE: DNA; ORGANISM: Saccharomyces cerevisiae; SEQUENCE: 66
taagattaat ataattatat aaaaatatta tcttcttttc tttatatcta gtgttatgta 60
aaataaattg atgactacgg aaagcttttt tatattgttt ctttttcatt ctgagccact 120
taaatttcgt gaatgttctt gtaagggacg gtagatttac aagtgataca acaaaaagca 180
aggcgctttt tctaataaaa agaagaaaag catttaacaa ttgaacacct ctatatcaac 240
gaagaatatt actttgtctc taaatccttg taaaatgtgt acgatctcta tatgggttac 300
tcataagtgt accgaagact gcattgaaag 330

SEQ ID NO: 67; LENGTH: 377; TYPE: DNA; ORGANISM: Saccharomyces cerevisiae; SEQUENCE: 67
gtctgaagaa tgaatgattt gatgatttct ttttccctcc atttttctta ctgaatatat 60
caatgatata gacttgtata gtttattatt tcaaattaag tagctatata tagtcaagat 120
aacgtttgtt tgacacgatt acattattcg tcgacatctt ttttcagcct gtcgtggtag 180
caatttgagg agtattatta attgaatagg ttcattttgc gctcgcataa acagttttcg 240
tcagggacag tatgttggaa tgagtggtaa ttaatggtga catgacatgt tatagcaata 300
accttgatgt ttacatcgta gtttaatgta caccccgcga attcgttcaa gtaggagtgc 360
accaattgca aagggaa 377

SEQ ID NO: 68; LENGTH: 506; TYPE: DNA; ORGANISM: Saccharomyces cerevisiae; SEQUENCE: 68
gtgaatttac tttaaatctt gcatttaaat aaattttctt tttatagctt tatgacttag 60
tttcaattta tatactattt taatgacatt ttcgattcat tgattgaaag ctttgtgttt 120
tttcttgatg cgctattgca ttgttcttgt ctttttcgcc acatgtaata tctgtagtag 180
atacctgata cattgtggat gctgagtgaa attttagtta ataatggagg cgctcttaat 240
aattttgggg atattggctt ttttttttaa agtttacaaa tgaatttttt ccgccaggat 300
aacgattctg aagttactct tagcgttcct atcggtacag ccatcaaatc atgcctataa 360
atcatgccta tatttgcgtg cagtcagtat catctacatg aaaaaaactc ccgcaatttc 420
ttatagaata cgttgaaaat taaatgtacg cgccaagata agataacata tatctagatg 480
cagtaatata cacagattcc cgcgga 506

SEQ ID NO: 69; LENGTH: 199; TYPE: DNA; ORGANISM: Saccharomyces cerevisiae; SEQUENCE: 69
gttaattcaa attaattgat atagtttttt aatgagtatt gaatctgttt agaaataatg 60
gaatattatt tttatttatt tatttatatt attggtcggc tcttttcttc tgaaggtcaa 120
tgacaaaatg atatgaagga aataatgatt tctaaaattt tacaacgtaa gatattttta 180
caaaagccta gctcatctt 199

SEQ ID NO: 70; LENGTH: 400; TYPE: DNA; ORGANISM: Yarrowia lipolytica; SEQUENCE: 70
gctgcttgta cctagtgcaa ccccagtttg ttaaaaatta gtagtcaaaa acttctgagt 60
tagaaatttg tgagtgtagt gagattgtag agtatcatgt gtgtccgtaa gtgaagtgtt 120
attgactctt agttagttta tctagtactc gtttagttga cactgatcta gtattttacg 180
aggcgtatga ctttagccaa gtgttgtact tagtcttctc tccaaacatg agagggctct 240
gtcactcagt cggcctatgg gtgagatggc ttggtgagat ctttcgatag tctcgtcaag 300
atggtaggat gatgggggaa tacattactg ctctcgtcaa ggaaaccaca atcagatcac 360
accatcctcc atggtatccg atgactctct tctccacagt 400

SEQ ID NO: 71; LENGTH: 203; TYPE: DNA; ORGANISM: Saccharomyces cerevisiae; SEQUENCE: 71
acaagctaag ttgactgctg ctaccaacgc taagcaataa gcgatttaat ctctaattat 60
tagttaaagt tttataagca tttttatgta acgaaaaata aattggttca tattattact 120
gcactgtcac ttaccatgga aagaccagac aagaagttgc cgacacgaca gtctgttgaa 180
ttggcttaag tctgggtccg ctt 203

SEQ ID NO: 72; LENGTH: 274; TYPE: DNA; ORGANISM: Saccharomyces cerevisiae; SEQUENCE: 72
caggcccctt ttcctttgtc gatatcatgt aattagttat gtcacgctta cattcacgcc 60
ctcctcccac atccgctcta accgaaaagg aaggagttag acaacctgaa gtctaggtcc 120
ctatttattt tttttaatag ttatgttagt attaagaacg ttatttatat ttcaaatttt 180
tctttttttt ctgtacaaac gcgtgtacgc atgtaacatt atactgaaaa ccttgcttga 240
gaaggttttg ggacgctcga aggctttaat ttgc 274

SEQ ID NO: 73; LENGTH: 67; TYPE: DNA; ORGANISM: Escherichia coli; SEQUENCE: 73
agctggtgac aattaatcat cggctcgtat aatgtgtgga attgaatcga tataaggagg 60
ttaatca 67

SEQ ID NO: 74; LENGTH: 89; TYPE: DNA; ORGANISM: Escherichia coli; SEQUENCE: 74
ctcaaaatat attttccctc tatcttctcg ttgcgcttaa tttgactaat tctcattagc 60
gaggcgcgcc tttccatagg ctccgcccc 89

SEQ ID NO: 75; LENGTH: 11; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Description of Artificial Sequence: Synthetic oligonucleotide; SEQUENCE: 75
ggaaatccat t 11

SEQ ID NO: 76; LENGTH: 9; TYPE: DNA; ORGANISM: Artificial Sequence; OTHER INFORMATION: Description of Artificial Sequence: Synthetic oligonucleotide; SEQUENCE: 76
ggaacgacc 9

* * * * *

