Methods, Synthetic Hosts And Reagents For The Biosynthesis Of Isoprene And Derivatives

Cartman; Stephen Thomas ;   et al.

Patent Application Summary

U.S. patent application number 16/337481 was filed with the patent office on 2019-07-18 for methods, synthetic hosts and reagents for the biosynthesis of isoprene and derivatives. The applicant listed for this patent is INVISTA NORTH AMERICA S.A.R.L. Invention is credited to Stephen Thomas Cartman, Alexander Brett Foster, Ana Teresa De Santos Brito Mendes Roberts, Mark Paul Taylor.

Application Number: 20190218577 16/337481
Document ID: /
Family ID: 60143757
Filed Date: 2019-07-18

United States Patent Application 20190218577
Kind Code A1
Cartman; Stephen Thomas ;   et al. July 18, 2019

METHODS, SYNTHETIC HOSTS AND REAGENTS FOR THE BIOSYNTHESIS OF ISOPRENE AND DERIVATIVES

Abstract

Methods and compositions for synthesizing dienes and derivatives thereof, such as isoprene, in Cupriavidus necator are provided.


Inventors: Cartman; Stephen Thomas; (Redcar, GB) ; Foster; Alexander Brett; (Redcar, GB) ; Roberts; Ana Teresa De Santos Brito Mendes; (Redcar, GB) ; Taylor; Mark Paul; (Redcar, GB)
Applicant: INVISTA NORTH AMERICA S.A.R.L. (Wilmington, DE, US)
Family ID: 60143757
Appl. No.: 16/337481
Filed: September 27, 2017
PCT Filed: September 27, 2017
PCT NO: PCT/US2017/053607
371 Date: March 28, 2019

Related U.S. Patent Documents

Application Number Filing Date Patent Number
62402209 Sep 30, 2016

Current U.S. Class: 1/1
Current CPC Class: Y02E 50/30 20130101; C07C 11/12 20130101; C12P 5/007 20130101; Y02E 50/343 20130101; C12N 9/88 20130101; C12N 15/74 20130101; C12Y 503/03002 20130101; C12N 9/90 20130101; C08F 136/08 20130101; C12N 2330/30 20130101; C12R 1/01 20130101; C12Y 402/03027 20130101; C12P 5/026 20130101
International Class: C12P 5/00 20060101 C12P005/00; C12N 9/90 20060101 C12N009/90; C12N 9/88 20060101 C12N009/88; C07C 11/12 20060101 C07C011/12; C08F 136/08 20060101 C08F136/08

Claims



1. A method for synthesizing isoprene in Cupriavidus necator, said method comprising enzymatically converting isopentenyl-pyrophosphate to dimethylallylpyrophosphate using a polypeptide having isopentenyl diphosphate isomerase enzyme activity.

2. The method of claim 1 wherein the polypeptide having isopentenyl diphosphate isomerase enzyme activity has at least 70% sequence identity to an amino acid sequence set forth in any of SEQ ID NOs: 1, 2, 3, 4, 5 or 6 or a functional fragment thereof.

3. The method of claim 1 wherein the polypeptide having isopentenyl diphosphate isomerase enzyme activity comprises the amino acid sequence set forth in any of SEQ ID NOs: 1, 2, 3, 4, 5 or 6 or a functional fragment thereof.

4. The method of claim 1 wherein the polypeptide having isopentenyl diphosphate isomerase enzyme activity is encoded by a nucleic acid sequence having at least 70% sequence identity to the nucleic acid sequence set forth in any of SEQ ID NOs: 8, 9, 10, 11, 12 or 13 or a functional fragment thereof.

5. The method of claim 1 wherein the polypeptide having isopentenyl diphosphate isomerase enzyme activity is encoded by a nucleic acid sequence comprising the nucleic acid sequence set forth in SEQ ID NOs: 8, 9, 10, 11, 12 or 13 or a functional fragment thereof.

6. A method for synthesizing isoprene in Cupriavidus necator, said method comprising enzymatically converting dimethylallylpyrophosphate to isoprene using a polypeptide having isoprene synthase enzyme activity.

7. The method of claim 6 wherein the polypeptide having isoprene synthase enzyme activity has at least 70% sequence identity to the amino acid sequence set forth in SEQ ID NO: 7 or a functional fragment thereof.

8. The method of claim 6 wherein the polypeptide having isoprene synthase enzyme activity comprises the amino acid sequence set forth in SEQ ID NO: 7 or a functional fragment thereof.

9. The method of claim 6 wherein the polypeptide having isoprene synthase enzyme activity is encoded by a nucleic acid sequence having at least 70% sequence identity to the nucleic acid sequence set forth in SEQ ID NO: 14 or a functional fragment thereof.

10. The method of claim 6 wherein the polypeptide having isoprene synthase enzyme activity is encoded by a nucleic acid sequence comprising the nucleic acid sequence set forth in SEQ ID NO: 14 or a functional fragment thereof.

11. A method for synthesizing isoprene in Cupriavidus necator, said method comprising enzymatically converting isopentenyl-pyrophosphate to dimethylallylpyrophosphate using a polypeptide having isopentenyl diphosphate isomerase enzyme activity; and enzymatically converting dimethylallylpyrophosphate to isoprene using a polypeptide having isoprene synthase enzyme activity.

12. The method of claim 11 wherein the polypeptide having isopentenyl diphosphate isomerase enzyme activity has at least 70% sequence identity to an amino acid sequence set forth in any of SEQ ID NOs: 1, 2, 3, 4, 5 or 6 or a functional fragment thereof.

13. The method of claim 11 wherein the polypeptide having isopentenyl diphosphate isomerase enzyme activity comprises an amino acid sequence set forth in any of SEQ ID NOs: 1, 2, 3, 4, 5 or 6 or a functional fragment thereof.

14. The method of claim 11 wherein the polypeptide having isopentenyl diphosphate isomerase enzyme activity is encoded by a nucleic acid sequence having at least 70% sequence identity to the nucleic acid sequence set forth in any of SEQ ID NOs: 8, 9, 10, 11, 12 or 13 or a functional fragment thereof.

15. The method of claim 11 wherein the polypeptide having isopentenyl diphosphate isomerase enzyme activity is encoded by a nucleic acid sequence comprising the nucleic acid sequence set forth in SEQ ID NOs: 8, 9, 10, 11, 12 or 13 or a functional fragment thereof.

16. The method of claim 11 wherein the polypeptide having isoprene synthase enzyme activity has at least 70% sequence identity to an amino acid sequence set forth in SEQ ID NO: 7 or a functional fragment thereof.

17. The method of claim 11 wherein the polypeptide having isoprene synthase enzyme activity comprises an amino acid sequence set forth in SEQ ID NO: 7 or a functional fragment thereof.

18. The method of claim 11 wherein the polypeptide having isoprene synthase enzyme activity is encoded by a nucleic acid sequence having at least 70% sequence identity to the nucleic acid sequence set forth in SEQ ID NO: 14 or a functional fragment thereof.

19. The method of claim 11 wherein the polypeptide having isoprene synthase enzyme activity is encoded by a nucleic acid sequence comprising the nucleic acid sequence set forth in SEQ ID NO: 14 or a functional fragment thereof.

20. The method of any of claims 1-19, wherein said method is performed in a recombinant Cupriavidus necator host.

21. The method of claim 20 wherein the recombinant Cupriavidus necator host comprises an exogenous nucleic acid sequence encoding a polypeptide having isopentenyl diphosphate isomerase enzyme activity.

22. The method of claim 21 wherein the polypeptide having isopentenyl diphosphate isomerase enzyme activity has at least 70% sequence identity to an amino acid sequence set forth in any of SEQ ID NOs: 1, 2, 3, 4, 5 or 6 or a functional fragment thereof.

23. The method of claim 21 wherein the polypeptide having isopentenyl diphosphate isomerase enzyme activity comprises an amino acid sequence set forth in any of SEQ ID NOs: 1, 2, 3, 4, 5 or 6 or a functional fragment thereof.

24. The method of claim 21 wherein the exogenous nucleic acid sequence has at least 70% sequence identity to the nucleic acid sequence set forth in any of SEQ ID NOs: 8, 9, 10, 11, 12 or 13 or a functional fragment thereof.

25. The method of claim 21 wherein the exogenous nucleic acid sequence comprises SEQ ID NO: 8, 9, 10, 11, 12 or 13 or a functional fragment thereof.

26. The method of claim 20 wherein the recombinant Cupriavidus necator host comprises an exogenous nucleic acid encoding a polypeptide having isoprene synthase enzyme activity.

27. The method of claim 26 wherein the polypeptide having isoprene synthase enzyme activity has at least 70% sequence identity to an amino acid sequence set forth in SEQ ID NO: 7 or a functional fragment thereof.

28. The method of claim 26 wherein the polypeptide having isoprene synthase enzyme activity comprises an amino acid sequence set forth in SEQ ID NO: 7 or a functional fragment thereof.

29. The method of claim 26 wherein the exogenous nucleic acid sequence has at least 70% sequence identity to the nucleic acid sequence set forth in SEQ ID NO: 14 or a functional fragment thereof.

30. The method of claim 26 wherein the exogenous nucleic acid sequence comprises SEQ ID NO: 14 or a functional fragment thereof.

31. The method of claim 20 wherein the recombinant Cupriavidus necator host comprises an exogenous nucleic acid encoding a polypeptide having isopentenyl diphosphate isomerase enzyme activity and an exogenous nucleic acid encoding a polypeptide having isoprene synthase enzyme activity.

32. The method of claim 20 wherein the recombinant Cupriavidus necator host has been transfected with a vector comprising any of SEQ ID NOs: 15, 16, 17, 18, 19, 20 or 21.

33. The method of any of claims 1 through 32, wherein at least one of the enzymatic conversions comprises gas fermentation within the Cupriavidus necator.

34. The method of claim 33, wherein the gas fermentation comprises at least one of natural gas, syngas, CO2/H2, methanol, ethanol, non-volatile residue, caustic wash from cyclohexane oxidation processes, or waste stream from a chemical or petrochemical industry.

35. The method of claim 34 wherein the gas fermentation comprises CO2/H2.

36. The method of any of claims 1 through 35, further comprising recovering produced isoprene.

37. A substantially pure recombinant Cupriavidus necator host capable of producing isoprene via a methylerythritol phosphate (MEP) pathway.

38. The recombinant Cupriavidus necator host of claim 37 comprising an exogenous nucleic acid sequence encoding a polypeptide having isopentenyl diphosphate isomerase enzyme activity.

39. The recombinant Cupriavidus necator host of claim 38 wherein the polypeptide having isopentenyl diphosphate isomerase enzyme activity has at least 70% sequence identity to an amino acid sequence set forth in any of SEQ ID NOs: 1, 2, 3, 4, 5 or 6 or a functional fragment thereof.

40. The recombinant Cupriavidus necator host of claim 38 wherein the polypeptide having isopentenyl diphosphate isomerase enzyme activity comprises an amino acid sequence set forth in any of SEQ ID NOs: 1, 2, 3, 4, 5 or 6 or a functional fragment thereof.

41. The recombinant Cupriavidus necator host of claim 38 wherein the exogenous nucleic acid sequence has at least 70% sequence identity to the nucleic acid sequence set forth in any of SEQ ID NOs: 8, 9, 10, 11, 12 or 13 or a functional fragment thereof.

42. The recombinant Cupriavidus necator host of claim 38 wherein the exogenous nucleic acid sequence comprises SEQ ID NO: 8, 9, 10, 11, 12 or 13 or a functional fragment thereof.

43. The recombinant Cupriavidus necator host of claim 37 comprising an exogenous nucleic acid sequence encoding a polypeptide having isoprene synthase enzyme activity.

44. The recombinant Cupriavidus necator host of claim 43 wherein the polypeptide having isoprene synthase enzyme activity has at least 70% sequence identity to an amino acid sequence set forth in SEQ ID NO: 7 or a functional fragment thereof.

45. The recombinant Cupriavidus necator host of claim 43 wherein the polypeptide having isoprene synthase enzyme activity comprises an amino acid sequence set forth in SEQ ID NO: 7 or a functional fragment thereof.

46. The recombinant Cupriavidus necator host of claim 43 wherein the exogenous nucleic acid sequence has at least 70% sequence identity to the nucleic acid sequence set forth in SEQ ID NO: 14 or a functional fragment thereof.

47. The recombinant Cupriavidus necator host of claim 43 wherein the exogenous nucleic acid sequence comprises SEQ ID NO: 14 or a functional fragment thereof.

48. The recombinant Cupriavidus necator host of claim 37 comprising an exogenous nucleic acid sequence encoding a polypeptide having isopentenyl diphosphate isomerase enzyme activity and an exogenous nucleic acid sequence encoding a polypeptide having isoprene synthase enzyme activity.

49. The recombinant Cupriavidus necator host of any of claims 37 to 48, wherein at least one of the exogenous nucleic acid sequences is contained within a plasmid.

50. The recombinant Cupriavidus necator host of any of claims 37 to 48, wherein at least one of the exogenous nucleic acid sequences is integrated into a chromosome of the host.

51. The recombinant Cupriavidus necator host of claim 37 which has been transfected with a vector comprising any of SEQ ID NOs: 15, 16, 17, 18, 19, 20 or 21.

52. The recombinant Cupriavidus necator host of claim 37, wherein the host performs the enzymatic synthesis by gas fermentation.

53. The recombinant Cupriavidus necator host of claim 52, wherein the gas fermentation comprises at least one of natural gas, syngas, CO2/H2, methanol, ethanol, non-volatile residue, caustic wash from cyclohexane oxidation processes, or waste stream from a chemical or petrochemical industry.

54. The recombinant Cupriavidus necator host of claim 53, wherein the gas fermentation comprises CO2/H2.

55. A bioderived isoprene produced in a recombinant Cupriavidus necator host according to any of claims 37 through 54, wherein said bioderived isoprene has a carbon-12, carbon-13, and carbon-14 isotope ratio that reflects an atmospheric carbon dioxide uptake source.

56. A bio-derived, bio-based, or fermentation-derived product produced from any of the methods or hosts of any of claims 1 to 54, wherein said product comprises:
(i) a composition comprising at least one bio-derived, bio-based, or fermentation-derived compound or any combination thereof;
(ii) a bio-derived, bio-based, or fermentation-derived polymer comprising the bio-derived, bio-based, or fermentation-derived composition or compound of (i), or any combination thereof;
(iii) a bio-derived, bio-based, or fermentation-derived cis-polyisoprene rubber, trans-polyisoprene rubber, or liquid polyisoprene rubber, comprising the bio-derived, bio-based, or fermentation-derived compound or bio-derived, bio-based, or fermentation-derived composition of (i), or any combination thereof or the bio-derived, bio-based, or fermentation-derived polymer of (ii), or any combination thereof;
(iv) a molded substance obtained by molding the bio-derived, bio-based, or fermentation-derived polymer of (ii), or the bio-derived, bio-based, or fermentation-derived rubber of (iii), or any combination thereof;
(v) a bio-derived, bio-based, or fermentation-derived formulation comprising the bio-derived, bio-based, or fermentation-derived composition of (i), the bio-derived, bio-based, or fermentation-derived compound of (i), the bio-derived, bio-based, or fermentation-derived polymer of (ii), the bio-derived, bio-based, or fermentation-derived rubber of (iii), or the bio-derived, bio-based, or fermentation-derived molded substance of (iv), or any combination thereof; or
(vi) a bio-derived, bio-based, or fermentation-derived semi-solid or a non-semi-solid stream, comprising the bio-derived, bio-based, or fermentation-derived composition of (i), the bio-derived, bio-based, or fermentation-derived compound of (i), the bio-derived, bio-based, or fermentation-derived polymer of (ii), the bio-derived, bio-based, or fermentation-derived rubber of (iii), the bio-derived, bio-based, or fermentation-derived formulation of (iv), or the bio-derived, bio-based, or fermentation-derived molded substance of (v), or any combination thereof.
Description



[0001] This patent application claims the benefit of priority from U.S. Provisional Application Ser. No. 62/402,209, filed Sep. 30, 2016, teachings of which are hereby incorporated by reference in their entirety.

FIELD

[0002] The present invention relates to methods and compositions for synthesizing dienes and derivatives thereof, such as isoprene, in Cupriavidus necator.

BACKGROUND

[0003] Isoprene is an important monomer for the production of specialty elastomers including motor mounts/fittings, surgical gloves, rubber bands, golf balls and shoes. Styrene-isoprene-styrene block copolymers form a key component of hot-melt pressure-sensitive adhesive formulations and cis-polyisoprene is utilized in the manufacture of tires (Whited et al. Industrial Biotechnology 2010 6(3):152-163). Manufacturers of rubber goods depend on either imported natural rubber from the Brazilian rubber tree or petroleum-based synthetic rubber polymers (Whited et al. 2010, supra).

[0004] Given an over-reliance on petrochemical feedstocks, biotechnology offers an alternative approach to the generation of industrially relevant products, via biocatalysis. Biotechnology offers more sustainable methods for producing industrial intermediates, in particular isoprene.

[0005] There are known metabolic pathways leading to the synthesis of isoprene in eukaryotes such as Populus alba, and some prokaryotes such as Bacillus subtilis have been reported to emit isoprene (Whited et al. 2010, supra). Isoprene production in prokaryotes is, however, rare, and no prokaryotic isoprene synthase (hereafter ISPS) has been described to date.

[0006] Generally, two metabolic routes have been described incorporating the molecule dimethylallyl-pyrophosphate (dimethylallyl-PP), the precursor to isoprene. These are known as the mevalonate and the non-mevalonate pathways (Kuzuyama Biosci. Biotechnol. Biochem. 2002 66(8):1619-1627), both of which function in terpenoid synthesis in vivo. Both require the introduction of a non-native ISPS in order to divert carbon to isoprene production.

[0007] The mevalonate pathway generally occurs in higher eukaryotes and Archaea and incorporates a decarboxylase enzyme, mevalonate diphosphate decarboxylase (hereafter MDD), that introduces the first vinyl-group into the precursors leading to isoprene. The second vinyl-group is introduced by isoprene synthase in the final step in synthesizing isoprene. The non-mevalonate pathway or 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway occurs in many bacteria; in it, dimethylallyl-PP is generated alongside isopentenyl-PP, two molecules which are interconvertible via the action of isopentenyl pyrophosphate isomerase or isopentenyl diphosphate isomerase (hereafter IDI).

SUMMARY

[0008] An aspect of the present invention relates to methods for synthesizing isoprene in Cupriavidus necator.

[0009] In one nonlimiting embodiment, the method comprises enzymatically converting isopentenyl-pyrophosphate to dimethylallylpyrophosphate using a polypeptide having isopentenyl diphosphate isomerase enzyme activity.

[0010] In one nonlimiting embodiment, the method comprises enzymatically converting dimethylallylpyrophosphate to isoprene using a polypeptide having isoprene synthase enzyme activity.

[0011] Another aspect of the present invention relates to methods for synthesizing isoprene in Cupriavidus necator which comprise enzymatically converting isopentenyl-pyrophosphate to dimethylallylpyrophosphate using a polypeptide having isopentenyl diphosphate isomerase enzyme activity; and enzymatically converting dimethylallylpyrophosphate to isoprene using a polypeptide having isoprene synthase enzyme activity.

[0012] Another aspect of the present invention relates to substantially pure recombinant Cupriavidus necator hosts capable of producing isoprene via a methylerythritol phosphate (MEP) pathway.

[0013] Another aspect of the present invention relates to bioderived isoprene produced in a recombinant Cupriavidus necator host.

[0014] Another aspect of the present invention relates to bio-derived, bio-based, or fermentation-derived products produced from any of the methods or hosts described herein.

[0015] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. Although methods and materials similar or equivalent to those described herein can be used to practice the invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

[0016] The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and the drawings, and from the claims. The word "comprising" in the claims may be replaced by "consisting essentially of" or with "consisting of," according to standard practice in patent law.

BRIEF DESCRIPTION OF THE FIGURES

[0017] FIGS. 1A and 1B are bar graphs showing isoprene production (ppm) of IDI-ISPS expressing C. necator strains compared to a strain expressing ISPS alone. FIG. 1A compares isoprene production in C. necator strains transfected with vectors pBBR1-ISPS, pBBR1-EC IDI-ISPS, pBBR1-BS IDI-ISPS, pBBR1-SCIDI-ISPS, pBBR1-EFIDI-ISPS and pBBR1-SPyrIDI-ISPS. The S. pneumoniae IDI construct is shown separately in FIG. 1B, wherein it was tested with a different incubation volume and time alongside an E. coli IDI, accounting for the difference in isoprene yield.

[0018] FIGS. 2A through 2G are images of vectors pBBR1-ISPS (FIG. 2A), pBBR1-EC IDI-ISPS (FIG. 2B), pBBR1-BS IDI-ISPS (FIG. 2C), pBBR1-SCIDI-ISPS (FIG. 2D), pBBR1-EFIDI-ISPS (FIG. 2E), pBBR1-SPyrIDI-ISPS (FIG. 2F) and pBBR1-Spneu IDI-ISPS (FIG. 2G). Nucleic acid sequences of these vectors are set forth herein in SEQ ID NOs: 15 through 21, respectively.

DETAILED DESCRIPTION

[0019] Cupriavidus necator is a Gram-negative soil bacterium of the Betaproteobacteria class. This hydrogen-oxidizing bacterium is capable of growing at the interface of anaerobic and aerobic environments and easily adapts between heterotrophic and autotrophic lifestyles. Sources of energy for the bacterium include both organic compounds and hydrogen. C. necator does not naturally contain genes for isoprene synthase (ISPS) or isopentenyl diphosphate isomerase (IDI) and therefore does not express these enzymes.

[0020] The present invention provides methods and compositions for synthesizing isoprene in C. necator. In the methods and compositions of the present invention, C. necator is used to synthesize isoprene via a methylerythritol phosphate (MEP) pathway.

[0021] Surprisingly, the inventors herein have found that the overexpression of IDI and ISPS in C. necator resulted in the production of isoprene, via the MEP pathway. Various vectors were constructed and confirmed by sequencing. Vectors constructed included pBBR1-ISPS, pBBR1-EC IDI-ISPS, pBBR1-BS IDI-ISPS, pBBR1-SCIDI-ISPS, pBBR1-EFIDI-ISPS, pBBR1-SPyrIDI-ISPS and pBBR1-Spneu IDI-ISPS. Images of the constructed vectors are set forth in FIGS. 2A through 2G, respectively, and their nucleic acid sequences are shown in SEQ ID NOs: 15 through 21, respectively. Isoprene production by strains of C. necator H16 ΔphaCAB transformed with these vectors is summarized in Table 3 and depicted graphically in FIGS. 1A and 1B. The construction of a bicistronic expression cassette comprising the P. alba isoprene synthase and an IDI was demonstrated to be sufficient to achieve isoprene production in C. necator H16 ΔphaCAB. The IDIs from E. coli, B. subtilis, S. cerevisiae and E. faecalis were shown to be active in C. necator H16 across a greater than ten-fold range of yields (0.03 to 0.4 ppm). The strain containing the IDI from B. subtilis produced the most isoprene under these growth conditions, approximately 0.4 ppm. Other functional IDIs generated strains with a range of isoprene yields.

[0022] This document thus provides methods and compositions which can convert central precursors including isopentenyl-pyrophosphate and/or dimethylallylpyrophosphate into isoprene.

[0023] As used herein, the term "central precursor" is used to denote any metabolite in any metabolic pathway described herein leading to the synthesis of isoprene.

[0024] The term "central metabolite" is used herein to denote a metabolite that is produced in all microorganisms to support growth.

[0025] A nonlimiting example of a C. necator host useful in the present invention is a C. necator of the H16 strain. In one nonlimiting embodiment, a C. necator host of the H16 strain with the phaCAB gene locus knocked out (ΔphaCAB) is used.

[0026] In one nonlimiting embodiment, the method comprises enzymatically converting isopentenyl-pyrophosphate to dimethylallylpyrophosphate using a polypeptide having IDI enzyme activity.

[0027] Polypeptides having IDI enzyme activity and nucleic acids encoding IDIs have been identified from various organisms and are readily available in publicly available databases such as GenBank or EMBL. Examples include, but are in no way limited to, IDIs from E. coli, B. subtilis, S. cerevisiae, E. faecalis, S. pyogenes and S. pneumoniae. In one nonlimiting embodiment, the polypeptide having IDI enzyme activity has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 99.5% sequence identity to an amino acid sequence set forth in any of SEQ ID NOs: 1, 2, 3, 4, 5 or 6 or a functional fragment thereof. In one nonlimiting embodiment, the polypeptide having IDI enzyme activity comprises the amino acid sequence set forth in any of SEQ ID NOs: 1, 2, 3, 4, 5 or 6 or a functional fragment thereof. In one nonlimiting embodiment, the polypeptide having IDI enzyme activity is encoded by a nucleic acid sequence having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 99.5% sequence identity to the nucleic acid sequence set forth in any of SEQ ID NOs: 8, 9, 10, 11, 12 or 13 or a functional fragment thereof. In one nonlimiting embodiment, the polypeptide having IDI enzyme activity is encoded by a nucleic acid sequence comprising the nucleic acid sequence set forth in any of SEQ ID NOs: 8, 9, 10, 11, 12 or 13 or a functional fragment thereof.

[0028] In another nonlimiting embodiment, the method comprises enzymatically converting dimethylallylpyrophosphate to isoprene using a polypeptide having ISPS enzyme activity.

[0029] Polypeptides having ISPS enzyme activity and nucleic acids encoding ISPSs have been identified from various organisms and are readily available in publicly available databases such as GenBank or EMBL. A nonlimiting example is the ISPS of Populus alba. In one nonlimiting embodiment, the polypeptide having ISPS enzyme activity has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 99.5% sequence identity to the amino acid sequence set forth in SEQ ID NO: 7 or a functional fragment thereof. In one nonlimiting embodiment, the polypeptide having ISPS enzyme activity comprises the amino acid sequence set forth in SEQ ID NO: 7 or a functional fragment thereof. In one nonlimiting embodiment, the polypeptide having ISPS enzyme activity is encoded by a nucleic acid sequence having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 99.5% sequence identity to the nucleic acid sequence set forth in SEQ ID NO: 14 or a functional fragment thereof. In one nonlimiting embodiment, the polypeptide having ISPS enzyme activity is encoded by a nucleic acid sequence comprising the nucleic acid sequence set forth in SEQ ID NO: 14 or a functional fragment thereof.

[0030] In one nonlimiting embodiment, the method for synthesizing isoprene in Cupriavidus necator comprises enzymatically converting isopentenyl-pyrophosphate to dimethylallylpyrophosphate using a polypeptide having IDI enzyme activity and enzymatically converting dimethylallylpyrophosphate to isoprene using a polypeptide having ISPS enzyme activity. In this embodiment, any of the polypeptides having IDI enzyme activity or ISPS enzyme activity described supra can be used.

[0031] The percent identity (homology) between two amino acid sequences can be determined as follows. First, the amino acid sequences are aligned using the BLAST 2 Sequences (Bl2seq) program from the stand-alone version of BLAST containing BLASTP version 2.0.14. This stand-alone version of BLAST can be obtained from the U.S. government's National Center for Biotechnology Information web site (www with the extension ncbi.nlm.nih.gov). Instructions explaining how to use the Bl2seq program can be found in the readme file accompanying BLASTZ. Bl2seq performs a comparison between two amino acid sequences using the BLASTP algorithm. To compare two amino acid sequences, the options of Bl2seq are set as follows: -i is set to a file containing the first amino acid sequence to be compared (e.g., C:\seq1.txt); -j is set to a file containing the second amino acid sequence to be compared (e.g., C:\seq2.txt); -p is set to blastp; -o is set to any desired file name (e.g., C:\output.txt); and all other options are left at their default setting. For example, the following command can be used to generate an output file containing a comparison between two amino acid sequences: C:\Bl2seq -i c:\seq1.txt -j c:\seq2.txt -p blastp -o c:\output.txt. If the two compared sequences share homology (identity), then the designated output file will present those regions of homology as aligned sequences. If the two compared sequences do not share homology (identity), then the designated output file will not present aligned sequences. Similar procedures can be followed for nucleic acid sequences except that blastn is used.
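
The option settings described above can be assembled programmatically. The following is a minimal sketch in Python, assuming the stand-alone BLAST 2 Sequences program is installed and reachable under the executable name `bl2seq`; the function name `build_bl2seq_command`, the executable name and the file names are hypothetical, and the sketch only builds the argument list without running the program.

```python
from typing import List


def build_bl2seq_command(first_seq: str, second_seq: str, output: str,
                         program: str = "blastp",
                         executable: str = "bl2seq") -> List[str]:
    """Return the argument list for one pairwise comparison.

    Only the four options named in the text (-i, -j, -p, -o) are set;
    every other option is left at its default setting.
    The executable name is an assumption, not taken from the text.
    """
    if program not in ("blastp", "blastn"):
        raise ValueError("program must be 'blastp' (amino acid) or 'blastn' (nucleic acid)")
    return [executable,
            "-i", first_seq,    # file containing the first sequence
            "-j", second_seq,   # file containing the second sequence
            "-p", program,      # blastp for proteins, blastn for nucleic acids
            "-o", output]       # any desired output file name


# Hypothetical file names, mirroring the example paths in the text.
command = build_bl2seq_command("seq1.txt", "seq2.txt", "output.txt")
print(" ".join(command))
```

Passing `program="blastn"` gives the corresponding argument list for two nucleic acid sequences.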

[0032] Once aligned, the number of matches is determined by counting the number of positions where an identical amino acid residue is presented in both sequences. The percent identity (homology) is determined by dividing the number of matches by the length of the full-length polypeptide amino acid sequence followed by multiplying the resulting value by 100. It is noted that the percent identity (homology) value is rounded to the nearest tenth. For example, 90.11, 90.12, 90.13, and 90.14 are rounded down to 90.1, while 90.15, 90.16, 90.17, 90.18, and 90.19 are rounded up to 90.2. It also is noted that the length value will always be an integer.
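
The calculation described above can be illustrated with a short sketch in Python. It assumes the two sequences have already been aligned and are supplied as equal-length strings with `-` marking gaps; the function names and the toy sequences are hypothetical. Decimal arithmetic with half-up rounding is used so that a value such as 90.15 rounds up to 90.2, as stated in the text.

```python
from decimal import Decimal, ROUND_HALF_UP


def count_matches(aligned_a: str, aligned_b: str, gap: str = "-") -> int:
    """Count aligned positions where both sequences show the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)


def percent_identity(matches: int, full_length: int) -> Decimal:
    """Matches divided by the full-length sequence length, times 100,
    rounded to the nearest tenth (halves round up)."""
    if full_length <= 0:
        raise ValueError("full_length must be a positive integer")
    value = Decimal(matches) * 100 / Decimal(full_length)
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Toy alignment (hypothetical residues, not taken from any SEQ ID NO).
query = "MKT-AYIAKQ"
reference = "MKTGAYLAKQ"
matches = count_matches(query, reference)
full_length = len(reference.replace("-", ""))  # length of the full-length reference
identity = percent_identity(matches, full_length)
print(matches, full_length, identity)
print(identity >= Decimal("70"))  # comparison against a 70% identity threshold
```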

[0033] It will be appreciated that a number of nucleic acids can encode a polypeptide having a particular amino acid sequence. The degeneracy of the genetic code is well known in the art; i.e., for many amino acids, there is more than one nucleotide triplet that serves as the codon for the amino acid. For example, codons in the coding sequence for a given enzyme can be modified such that optimal expression in a particular species (e.g., bacteria or fungus) is obtained, using appropriate codon bias tables for that species.
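
The degeneracy described above can be sketched in Python using the standard genetic code. In this illustrative example the function names, the short coding sequence and the `preferred` codon choices are all hypothetical; a real codon bias table for a given species would have to be supplied by the user. Input is assumed to be upper-case DNA.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate an upper-case coding sequence, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(amino_acid: str) -> list:
    """All codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def recode(dna: str, preferred: dict) -> str:
    """Replace each codon with the preferred synonymous codon, where one is given."""
    usable = len(dna) - len(dna) % 3
    out = []
    for i in range(0, usable, 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        choice = preferred.get(amino_acid, codon)
        if CODON_TABLE[choice] != amino_acid:
            raise ValueError(f"{choice} does not encode {amino_acid}")
        out.append(choice)
    return "".join(out)


original = "ATGGCTAAATTA"                          # hypothetical sequence: M A K L
preferred = {"A": "GCC", "K": "AAG", "L": "CTG"}   # hypothetical codon preferences
recoded = recode(original, preferred)
print(synonymous_codons("L"))
print(original, translate(original))
print(recoded, translate(recoded))
```

Both nucleic acid sequences differ at the nucleotide level but translate to the same amino acid sequence, which is the property that codon modification for a particular host relies on.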

[0034] Functional fragments of any of the polypeptides or nucleic acid sequences described herein can also be used in the methods of the document. The term "functional fragment" as used herein refers to a peptide fragment of a polypeptide or a nucleic acid sequence fragment encoding a peptide fragment of a polypeptide that has at least 25% (e.g., at least: 30%; 40%; 50%; 60%; 70%; 75%; 80%; 85%; 90%; 95%; 98%; 99%; 100%; or even greater than 100%) of the activity of the corresponding mature, full-length, polypeptide. The functional fragment can generally, but not always, be comprised of a continuous region of the polypeptide, wherein the region has functional activity.
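
The activity criterion in the definition above can be expressed as a simple check. This is a minimal sketch in Python, assuming that the activity of the fragment and of the corresponding mature, full-length polypeptide have been measured in the same (unspecified) units; the function names and the numeric values are hypothetical.

```python
def relative_activity(fragment_activity: float, full_length_activity: float) -> float:
    """Fragment activity as a percentage of the full-length polypeptide's activity."""
    if full_length_activity <= 0:
        raise ValueError("full_length_activity must be positive")
    return 100.0 * fragment_activity / full_length_activity


def is_functional_fragment(fragment_activity: float, full_length_activity: float,
                           threshold_percent: float = 25.0) -> bool:
    """True if the fragment retains at least threshold_percent of the activity."""
    return relative_activity(fragment_activity, full_length_activity) >= threshold_percent


# Hypothetical measurements in arbitrary units.
print(is_functional_fragment(3.0, 10.0))
print(is_functional_fragment(2.0, 10.0))
print(is_functional_fragment(12.0, 10.0, threshold_percent=100.0))
```

The default threshold of 25% follows the definition; the `threshold_percent` argument allows the stricter values listed in the text (for example 50% or 90%) to be applied instead.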

[0035] In one nonlimiting embodiment, methods of the present invention are performed in a recombinant Cupriavidus necator host. Recombinant hosts can naturally express none or some (e.g., one or more, two or more) of the enzymes of the pathways described herein. Endogenous genes of the recombinant hosts also can be disrupted to prevent the formation of undesirable metabolites or prevent the loss of intermediates in the pathway through other enzymes acting on such intermediates. Recombinant hosts can be referred to as recombinant host cells, engineered cells, or engineered hosts. Thus, as described herein, recombinant hosts can include exogenous nucleic acids encoding one or more of IDIs and/or ISPSs, as described herein.

[0036] The term "exogenous" as used herein with reference to a nucleic acid (or a protein) and a host refers to a nucleic acid that does not occur in (and cannot be obtained from) a cell of that particular type as it is found in nature or a protein encoded by such a nucleic acid. Thus, a non-naturally-occurring nucleic acid is considered to be exogenous to a host once in the host. It is important to note that non-naturally-occurring nucleic acids can contain nucleic acid subsequences or fragments of nucleic acid sequences that are found in nature provided the nucleic acid as a whole does not exist in nature. For example, a nucleic acid molecule containing a genomic DNA sequence within an expression vector is non-naturally-occurring nucleic acid, and thus is exogenous to a host cell once introduced into the host, since that nucleic acid molecule as a whole (genomic DNA plus vector DNA) does not exist in nature. Thus, any vector, autonomously replicating plasmid, or virus (e.g., retrovirus, adenovirus, or herpes virus) that as a whole does not exist in nature is considered to be non-naturally-occurring nucleic acid. It follows that genomic DNA fragments produced by PCR or restriction endonuclease treatment as well as cDNAs are considered to be non-naturally-occurring nucleic acid since they exist as separate molecules not found in nature. It also follows that any nucleic acid containing a promoter sequence and polypeptide-encoding sequence (e.g., cDNA or genomic DNA) in an arrangement not found in nature is non-naturally-occurring nucleic acid. A nucleic acid that is naturally-occurring can be exogenous to a particular host microorganism. For example, an entire chromosome isolated from a cell of yeast x is an exogenous nucleic acid with respect to a cell of yeast y once that chromosome is introduced into a cell of yeast y.

[0037] In contrast, the term "endogenous" as used herein with reference to a nucleic acid (e.g., a gene) (or a protein) and a host refers to a nucleic acid (or protein) that does occur in (and can be obtained from) that particular host as it is found in nature. Moreover, a cell "endogenously expressing" a nucleic acid (or protein) expresses that nucleic acid (or protein) as does a host of the same particular type as it is found in nature. Moreover, a host "endogenously producing" or that "endogenously produces" a nucleic acid, protein, or other compound produces that nucleic acid, protein, or compound as does a host of the same particular type as it is found in nature.

[0038] In one nonlimiting embodiment of the present invention, the method for isoprene production is performed in a recombinant Cupriavidus necator host comprising an exogenous nucleic acid sequence encoding a polypeptide having IDI enzyme activity. In this embodiment, any of the nucleic acid sequences encoding a polypeptide having IDI enzyme activity as described supra can be used.

[0039] In another nonlimiting embodiment of the present invention, the method is performed using a recombinant Cupriavidus necator host comprising an exogenous nucleic acid encoding a polypeptide having ISPS enzyme activity. In this embodiment, any of the nucleic acid sequences encoding a polypeptide having ISPS enzyme activity as described supra can be used.

[0040] In another nonlimiting embodiment, the method is performed using a recombinant Cupriavidus necator host comprising an exogenous nucleic acid encoding a polypeptide having IDI enzyme activity and an exogenous nucleic acid encoding a polypeptide having ISPS enzyme activity. In this embodiment, any of the nucleic acid sequences encoding a polypeptide having IDI enzyme activity and any of the nucleic acid sequences having ISPS enzyme activity as described supra can be used.

[0041] In another nonlimiting embodiment, the method for isoprene production of the present invention is performed in a recombinant Cupriavidus necator host which has been transformed with a vector comprising any of SEQ ID NOs:15, 16, 17, 18, 19, 20 or 21.

[0042] In any of the methods described herein, a fermentation strategy can be used that entails anaerobic, micro-aerobic or aerobic cultivation. A fermentation strategy can entail nutrient limitation such as nitrogen, phosphate or oxygen limitation. A cell retention strategy using a ceramic hollow fiber membrane can be employed to achieve and maintain a high cell density during fermentation. The principal carbon source fed to the fermentation can derive from a biological or non-biological feedstock. The biological feedstock can be, or can derive from, monosaccharides, disaccharides, lignocellulose, hemicellulose, cellulose, lignin, levulinic acid and formic acid, triglycerides, glycerol, fatty acids, agricultural waste, condensed distillers' solubles or municipal waste. The non-biological feedstock can be, or can derive from, natural gas, syngas, CO₂/H₂, methanol, ethanol, non-volatile residue (NVR), a caustic wash waste stream from cyclohexane oxidation processes, or a waste stream from a chemical or petrochemical industry.

[0043] In one nonlimiting embodiment, at least one of the enzymatic conversions of the isoprene production method comprises gas fermentation within the Cupriavidus necator. In this embodiment, the gas fermentation may comprise at least one of natural gas, syngas, CO₂/H₂, methanol, ethanol, non-volatile residue, caustic wash from cyclohexane oxidation processes, or waste stream from a chemical or petrochemical industry. In one nonlimiting embodiment, the gas fermentation comprises CO₂/H₂.

[0044] The methods of the present invention may further comprise recovering produced isoprene from the Cupriavidus necator.

[0045] Once produced, any method can be used to isolate isoprene. For example, isoprene can be recovered from the fermenter off-gas stream as a volatile product, as the boiling point of isoprene is 34.1 °C. At a typical fermentation temperature of approximately 30 °C, isoprene has a high vapor pressure and can be stripped by the gas flow through the broth for recovery from the off-gas. Isoprene can be selectively adsorbed onto, for example, an adsorbent and separated from the other off-gas components. Membrane separation technology may also be employed to separate isoprene from the other off-gas compounds. Isoprene may be desorbed from the adsorbent using, for example, nitrogen and condensed at low temperature and high pressure.

[0046] Compositions for synthesizing isoprene in C. necator are also provided by the present invention.

[0047] In one nonlimiting embodiment, a substantially pure recombinant C. necator host capable of producing isoprene via a methylerythritol phosphate (MEP) pathway is provided.

[0048] As used herein, a "substantially pure culture" of a recombinant host microorganism is a culture of that microorganism in which less than about 40% (i.e., less than about 35%; 30%; 25%; 20%; 15%; 10%; 5%; 2%; 1%; 0.5%; 0.25%; 0.1%; 0.01%; 0.001%; 0.0001%; or even less) of the total number of viable cells in the culture are viable cells other than the recombinant microorganism, e.g., bacterial, fungal (including yeast), mycoplasmal, or protozoan cells. The term "about" in this context means that the relevant percentage can be 15% of the specified percentage above or below the specified percentage. Thus, for example, about 20% can be 17% to 23%. Such a culture of recombinant microorganisms includes the cells and a growth, storage, or transport medium. Media can be liquid, semi-solid (e.g., gelatinous media), or frozen. The culture includes the cells growing in the liquid or in/on the semi-solid medium or being stored or transported in a storage or transport medium, including a frozen storage or transport medium. The cultures are in a culture vessel or storage vessel or substrate (e.g., a culture dish, flask, or tube or a storage vial or tube).
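The arithmetic behind the term "about" in this paragraph can be sketched as below. The names are hypothetical and the sketch is illustrative only; exact fractions are used so that the worked example in the text (about 20% being 17% to 23%) is reproduced without floating-point error.

```python
from fractions import Fraction

# "About" is defined in the text as 15% of the specified percentage,
# above or below the specified percentage.
ABOUT_TOLERANCE = Fraction(15, 100)


def about_range(percentage):
    """Return the (low, high) bounds covered by 'about <percentage>%'."""
    p = Fraction(percentage)
    delta = p * ABOUT_TOLERANCE
    return p - delta, p + delta


def contaminant_percentage(other_viable_cells: int, total_viable_cells: int) -> Fraction:
    """Percentage of viable cells that are not the recombinant microorganism."""
    return Fraction(other_viable_cells, total_viable_cells) * 100


if __name__ == "__main__":
    low, high = about_range(20)
    print(low, high)
    # Hypothetical counts: 5 contaminating cells among 1000 viable cells.
    print(contaminant_percentage(5, 1000))
```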

[0049] In one nonlimiting embodiment, the recombinant C. necator host comprises an exogenous nucleic acid sequence encoding a polypeptide having IDI enzyme activity. Any nucleic acid sequence encoding a polypeptide having IDI enzyme activity as described supra can be used in this embodiment.

[0050] In another nonlimiting embodiment, the recombinant C. necator host comprises an exogenous nucleic acid encoding a polypeptide having ISPS enzyme activity. Any nucleic acid sequence encoding a polypeptide having ISPS enzyme activity as described supra can be used in this embodiment.

[0051] In another nonlimiting embodiment, the recombinant C. necator host comprises an exogenous nucleic acid encoding a polypeptide having IDI enzyme activity and an exogenous nucleic acid encoding a polypeptide having ISPS enzyme activity. Any of the nucleic acid sequences encoding a polypeptide having IDI enzyme activity or ISPS enzyme activity as described supra can be used.

[0052] In one nonlimiting embodiment, at least one of the exogenous nucleic acid sequences in the recombinant host is contained within a plasmid.

[0053] In one nonlimiting embodiment, at least one of the exogenous nucleic acid sequences is integrated into a chromosome of the host.

[0054] In one nonlimiting embodiment, the recombinant C. necator host has been transfected with a vector comprising any of SEQ ID NOs:15, 16, 17, 18, 19, 20 or 21.

[0055] Also provided by the present invention is isoprene bioderived from a recombinant C. necator host according to any of the methods described herein. In one nonlimiting embodiment, the bioderived isoprene has a carbon isotope ratio that reflects an atmospheric carbon dioxide uptake source. Examples of such ratios include, but are not limited to, ratios of carbon-12, carbon-13, and carbon-14 isotopes.

[0056] In addition, the present invention provides bio-derived, bio-based, or fermentation-derived product produced using the methods and/or compositions disclosed herein. Examples of such products include, but are not limited to, compositions comprising at least one bio-derived, bio-based, or fermentation-derived compound or any combination thereof, as well as polymers, rubbers such as cis-polyisoprene rubber, trans-polyisoprene rubber, or liquid polyisoprene rubber, molded substances, formulations and semi-solid or non-semi-solid streams comprising one or more of the bio-derived, bio-based, or fermentation-derived compounds or compositions, combinations or products thereof.

[0057] The following section provides further illustration of the methods and compositions of the present invention. These working examples are illustrative only and are not intended to limit the scope of the invention in any way.

EXAMPLES

Example 1: Primers

[0058] Primers as listed in Table 1 were used in the following disclosed experiments.

TABLE-US-00001 TABLE 1
Primer  Sequence
1   5' GGAAGGAGCGAAGCATGCGTTGTAGCGTTAGC 3' (SEQ ID NO: 22)
2   5' GGGCTTTGTTAGCAGGCTTAGCGTTCGAACGGCAGAAT 3' (SEQ ID NO: 23)
3   5' GCCTGCTAACAAAGCCCGAAA 3' (SEQ ID NO: 24)
4   5' GCTTCGCTCCTTCCTTAAAG 3' (SEQ ID NO: 25)
5   5' GCCGCCCTATACCTTGTCT 3' (SEQ ID NO: 26)
6   5' ACGGCGTCACACTTTGCTAT 3' (SEQ ID NO: 27)
7   5' CGCGTCGCGAACGCCAGCAA 3' (SEQ ID NO: 28)
8   5' ACGGGGCCTGCCACCATACC 3' (SEQ ID NO: 29)
9   5' CTTATCGATGATAAGCTGTC 3' (SEQ ID NO: 30)
10  5' CAGCCCTAGATCGGCCACAG 3' (SEQ ID NO: 31)
11  5' TGCCTGCCCCTCCCTTTTGG 3' (SEQ ID NO: 32)
12  5' GCGGCGAGTGCGGGGGTTCC 3' (SEQ ID NO: 33)
13  5' GGAAACCCACGGCGGCAATG 3' (SEQ ID NO: 34)
14  5' ATCGGCTGTAGCCGCCTCTAGATT 3' (SEQ ID NO: 35)
15  5' AGTAACAATTGCTCAAGCAG 3' (SEQ ID NO: 36)
16  5' ATTCAGAGAAGAAACCAATT 3' (SEQ ID NO: 37)
17  5' GCTAGAAATAATTTTGAGCTCGCCAAGGAGATATAATGCAAAC 3' (SEQ ID NO: 38)
18  5' GCTTCGCTCCTTCCTTAAAGTTATTTAAGCTGGGTAAATGC 3' (SEQ ID NO: 39)
19  5' GCTAGAAATAATTTTGAGCTCGCCAAGGAGATATAATGGTC 3' (SEQ ID NO: 40)
20  5' GCTTCGCTCCTTCCTTAAAGTCAGCGCACCGAATACGA 3' (SEQ ID NO: 41)
21  5' GCTAGAAATAATTTTGAGCTCGCCAAGGAGATATAATGACTGCCGACAACAATAG 3' (SEQ ID NO: 42)
22  5' GCTTCGCTCCTTCCTTAAAGTTATAGCATTCTATGAATTTGCC 3' (SEQ ID NO: 43)
23  5' GCTAGAAATAATTTTGAGCTCGCCAAGGAGATATAATGAATCGAAAAGATGAAC 3' (SEQ ID NO: 44)
24  5' GCTTCGCTCCTTCCTTAAAGTTAACGTTTTGCGAAAACAG 3' (SEQ ID NO: 45)
25  5' GCTAGAAATAATTTTGAGCTCGCCAAGGAGATATAATGACTAACCGTAAAGATGATC 3' (SEQ ID NO: 46)
26  5' GCTTCGCTCCTTCCTTAAAGCTAATTGACCTGCTGCAAG 3' (SEQ ID NO: 47)
27  5' GCTAGAAATAATTTTGAGCTCGCCAAGGAGATATAATGACGACCAACCGCAAGGATG 3' (SEQ ID NO: 48)
28  5' GCTTCGCTCCTTCCTTAAAGTCACGCCTTCTTCATCTG 3' (SEQ ID NO: 49)
29  5' GCCGCCCTATACCTTGTCT 3' (SEQ ID NO: 50)
30  5' ACGGCGTCACACTTTGCTAT 3' (SEQ ID NO: 51)

Example 2: Cloning of Poplar ISPS for Expression in C. necator Spp.

[0059] The protein sequence for the Populus alba ISPS was obtained from GenBank (BAD98243.1), and the full gene (with an additional promoter and terminator), codon optimized for E. coli, was purchased from Eurofins MWG (SEQ ID NO:52). This DNA was used as a template for amplification of the gene using primers 1 and 2 (see Table 1) and Phusion polymerase (NEB) with an annealing temperature of 45 °C (the open reading frame (ORF) generated lacked the native plasmid tag; this ORF corresponds to nucleotides 168-1865 of SEQ ID NO:52). The vector backbone of pBBR1MCS3-pBAD was generated with primers 3 and 4 (see Table 1) and Merck Millipore KOD polymerase with annealing temperatures of 50-55 °C. The two fragments were ligated using NEB Gibson Assembly reaction master mix as per the manufacturer's recommended protocol. The ligation mix was transformed into chemically competent E. coli NEB5α, and correct clones were verified via a combination of colony PCR and sequencing with primers 5 and 6 (see Table 1). Subsequently the whole construct was sequenced by MWG-Eurofins using primers 7-16 (see Table 1). A single verified construct was taken forward for further work and designated pBBR1-ISPS (see FIG. 2A; SEQ ID NO:15).
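Gibson assembly joins fragments whose ends share sequence, so the insert primers (1 and 2) would be expected to carry 5' tails that match the ends of the backbone produced with primers 3 and 4. The following sketch, offered only as an illustration, checks this from the Table 1 sequences. The function names are hypothetical, and the pairing of primer 1 with primer 4 and of primer 2 with primer 3 is inferred here from the sequences themselves rather than stated in the text.

```python
# Primer sequences copied from Table 1 (5' to 3').
PRIMER_1 = "GGAAGGAGCGAAGCATGCGTTGTAGCGTTAGC"
PRIMER_2 = "GGGCTTTGTTAGCAGGCTTAGCGTTCGAACGGCAGAAT"
PRIMER_3 = "GCCTGCTAACAAAGCCCGAAA"
PRIMER_4 = "GCTTCGCTCCTTCCTTAAAG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tail_overlap(insert_primer: str, backbone_primer: str) -> str:
    """Longest 5' stretch of insert_primer that matches the 3' end of the
    reverse complement of backbone_primer (the shared fragment end)."""
    insert_primer = insert_primer.upper()
    rc = reverse_complement(backbone_primer)
    for k in range(min(len(insert_primer), len(rc)), 0, -1):
        if rc.endswith(insert_primer[:k]):
            return insert_primer[:k]
    return ""


if __name__ == "__main__":
    for name, insert, backbone in (
        ("primer 1 vs primer 4", PRIMER_1, PRIMER_4),
        ("primer 2 vs primer 3", PRIMER_2, PRIMER_3),
    ):
        overlap = tail_overlap(insert, backbone)
        print(name, overlap, len(overlap))
```

With the sequences as listed, the shared stretches are 14 and 17 nucleotides long; in primer 1 the remaining 3' portion begins with ATG, consistent with the start of the ISPS open reading frame.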

Example 3: Cloning of IDI-ISPS Bicistrons for Expression in C. necator spp.

[0060] A unique SacI restriction site was identified in pBBR1-ISPS, upstream of the ribosome binding site and downstream of the predicted transcriptional start site. pBBR1-ISPS was purified from NEB5α using the Qiagen plasmid Midi prep kit, cut with SacI (NEB) and purified using the Qiagen PCR purification kit as per the recommended protocol. Nucleic acid sequences for IDIs from E. coli (SEQ ID NO:8), B. subtilis (SEQ ID NO:9), S. cerevisiae (SEQ ID NO:10), E. faecalis (SEQ ID NO:11), S. pyogenes (SEQ ID NO:12) and S. pneumoniae (SEQ ID NO:13) were obtained from GenBank. Each IDI was amplified from genomic DNA (purchased directly from DSMZ or ATCC) or, in the case of the B. subtilis and S. pneumoniae variants, from a codon optimized (C. necator) synthetic operon purchased from Eurofins MWG.

[0061] PCR products were generated with Merck Millipore KOD polymerase at an annealing temperature of 55 °C using primers 17-28 (see Table 1) and purified using the Qiagen PCR purification kit and the recommended protocol. The PCR products were then used in a Gibson assembly with the SacI digested and purified pBBR1-ISPS, and individual ligations were transformed into E. coli NEB5α. Clones were verified via a combination of colony PCR with Taq polymerase (NEB) and sequencing with primers 29 and 30 (see Table 1). Single verified constructs representing each IDI coupled to ISPS were designated pBBR-EC IDI-ISPS (FIG. 2B; SEQ ID NO:16), pBBR1-BS IDI-ISPS (FIG. 2C; SEQ ID NO:17), pBBR1-SCIDI-ISPS (FIG. 2D; SEQ ID NO:18), pBBR1-EFIDI-ISPS (FIG. 2E; SEQ ID NO:19), pBBR1-SPyrIDI-ISPS (FIG. 2F; SEQ ID NO:20) and pBBR1 SpneuIDI-ISPS (FIG. 2G; SEQ ID NO:21) and further examined.
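As an illustration of how the IDI primers relate to the sequences in this example, the sketch below takes primer 21 from Table 1 and the first 30 nucleotides of SEQ ID NO:10 (S. cerevisiae IDI) and separates the primer into a 5' tail and a 3' annealing region. It also counts occurrences of the SacI recognition sequence GAGCTC in the primer. The function names are hypothetical, and splitting the primer this way is an interpretation made for the example rather than a statement from the text.

```python
SACI_SITE = "GAGCTC"  # SacI recognition sequence

# Primer 21 from Table 1 and the first 30 nt of SEQ ID NO:10.
PRIMER_21 = "GCTAGAAATAATTTTGAGCTCGCCAAGGAGATATAATGACTGCCGACAACAATAG"
ORF_START_SEQ_ID_10 = "ATGACTGCCGACAACAATAGTATGCCCCAT"


def annealing_region(primer: str, orf_start: str) -> str:
    """Longest 3' stretch of the primer that matches the start of the ORF."""
    primer, orf_start = primer.upper(), orf_start.upper()
    for k in range(min(len(primer), len(orf_start)), 0, -1):
        if primer.endswith(orf_start[:k]):
            return orf_start[:k]
    return ""


def count_sites(seq: str, site: str) -> int:
    """Count occurrences (overlapping allowed) of a site in a sequence."""
    seq, site = seq.upper(), site.upper()
    return sum(1 for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i))


if __name__ == "__main__":
    anneal = annealing_region(PRIMER_21, ORF_START_SEQ_ID_10)
    tail = PRIMER_21[: len(PRIMER_21) - len(anneal)]
    print("5' tail:", tail)
    print("annealing region:", anneal)
    print("SacI sites in primer:", count_sites(PRIMER_21, SACI_SITE))
```

The same 5' tail appears in the other odd-numbered primers from 17 to 27 in Table 1, so the check can be repeated for each IDI by swapping in the corresponding primer and ORF start.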

Example 4: Vector Preparation and Transference to C. necator H16 ΔphaCAB

[0062] Vectors pBBR-EC IDI-ISPS, pBBR1-BS IDI-ISPS, pBBR1-SCIDI-ISPS, pBBR1-EFIDI-ISPS, pBBR1-SPyrIDI-ISPS and pBBR1 SpneuIDI-ISPS were prepared from their respective NEB5α hosts using the Qiagen Midi prep kit and appropriate culture volumes. A C. necator H16 strain with the phaCAB gene locus knocked out (ΔphaCAB) was grown to mid/late exponential phase in tryptic soy broth (TSB) media at 30 °C. Cells were made competent with glycerol washes and used immediately. Unexpectedly, competent cells were transformed with at least 1 µg of vector DNA via electroporation and recovered in TSB medium. Transformants were identified on TSB agar with 10 µg/ml tetracycline. Single transformants representative of each IDI-ISPS clone were further examined.

Example 5: Isoprene Production in C. necator H16 ΔphaCAB

[0063] IDI-ISPS clones in C. necator H16 ΔphaCAB, representative of each IDI under study, were grown over 48 hours on TSB agar (without dextrose). The strain containing the P. alba ISPS construct (pBBR1-ISPS) was also grown on the same medium, as a control. Cultures were grown, induced and harvested. Cell pellets were resuspended in a suitable medium and normalized in solution based on the wet cell weight. Further incubations with induction were performed in screw cap headspace gas chromatography (GC) vials (Anatune 093640-040-00 and 093640-038-00). Surprisingly, isoprene was produced and could be measured via gas chromatography-mass spectrometry (GCMS), the parameters for which are set out in Table 2. Ions monitored for isoprene were 39, 53 and 67 on an Agilent DB-624 column.

TABLE-US-00002 TABLE 2 GCMS analysis conditions for Isoprene

GCMS CONDITIONS
PARAMETER                 VALUE
Carrier Gas               Helium at constant flow (2.0 ml/min)
Injector
  Split ratio             Split 10:1
  Temperature             150 °C
Detector
  Source Temperature      230 °C
  Quad Temperature        150 °C
  Interface               260 °C
  Gain                    1
  Scan Range              m/z 30-200
  Threshold               150
  Scan Speed              2^2 (A/D samples) 4
  Sampling Rate           2^n = 2^2
  Mode                    SCAN and SIM
  Solvent delay *         2.80 min
Oven Temperature          Initial T: 40 °C × 10 min
Oven Ramp                 40 °C/min to 260 °C for 5 min
Injection volume          50 µl from the HS in the GC 2 ml vial
Incubation time and T     15 min at 95 °C
Agitator                  ON 500 rpm
Injection volume          500 µl of the Head Space
Gas saver                 On after 2 min
Concentration range       0.1-5.0 (µg/ml)
GC Column                 DB-624 (122-1334 Agilent) 60 m × 250 µm × 1.4 µm

Results of these isoprene production studies are shown in Table 3 and depicted graphically in FIG. 1.

TABLE-US-00003 TABLE 3 Isoprene production results of IDI-ISPS expressing C. necator strains

Culture (C. necator H16 ΔphaCAB)    Mean isoprene ppm    Standard deviation
pBBR1-ISPS                          0.0078               0.000051
pBBR1 - EC IDI-ISPS                 0.030                0.0032
pBBR1 - BS IDI-ISPS                 0.40                 0.021
pBBR1 - SC-IDI-ISPS                 0.076                0.0005
pBBR1 - EF IDI-ISPS                 0.018                0.0012
pBBR1 - SPyr IDI-ISPS               0.0089               0.00089
pBBR1 - EC IDI-ISPS                 0.184                0.003
pBBR1 - Spneu IDI-ISPS              0.595                0.011
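One way to read Table 3 is to express each mean as a multiple of the ISPS-only control (pBBR1-ISPS). The sketch below does this with the values exactly as listed; the function and variable names are hypothetical, and the fold-change comparison is an illustrative calculation rather than an analysis reported in the text. Table 3 lists pBBR1 - EC IDI-ISPS twice with different values, and both rows are kept as given.

```python
# (culture, mean isoprene ppm, standard deviation), as listed in Table 3.
TABLE_3 = [
    ("pBBR1-ISPS", 0.0078, 0.000051),
    ("pBBR1 - EC IDI-ISPS", 0.030, 0.0032),
    ("pBBR1 - BS IDI-ISPS", 0.40, 0.021),
    ("pBBR1 - SC-IDI-ISPS", 0.076, 0.0005),
    ("pBBR1 - EF IDI-ISPS", 0.018, 0.0012),
    ("pBBR1 - SPyr IDI-ISPS", 0.0089, 0.00089),
    ("pBBR1 - EC IDI-ISPS", 0.184, 0.003),
    ("pBBR1 - Spneu IDI-ISPS", 0.595, 0.011),
]


def fold_over_control(rows, control="pBBR1-ISPS"):
    """Return (culture, mean / control mean) for every row, in table order."""
    control_mean = next(mean for name, mean, _ in rows if name == control)
    return [(name, mean / control_mean) for name, mean, _ in rows]


if __name__ == "__main__":
    for name, fold in fold_over_control(TABLE_3):
        print(f"{name}: {fold:.1f}x control")
```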

Sequence CWU 1

1

521182PRTE. coli 1Met Gln Thr Glu His Val Ile Leu Leu Asn Ala Gln Gly Val Pro Thr1 5 10 15Gly Thr Leu Glu Lys Tyr Ala Ala His Thr Ala Asp Thr Arg Leu His 20 25 30Leu Ala Phe Ser Ser Trp Leu Phe Asn Ala Lys Gly Gln Leu Leu Val 35 40 45Thr Arg Arg Ala Leu Ser Lys Lys Ala Trp Pro Gly Val Trp Thr Asn 50 55 60Ser Val Cys Gly His Pro Gln Leu Gly Glu Ser Asn Glu Asp Ala Val65 70 75 80Ile Arg Arg Cys Arg Tyr Glu Leu Gly Val Glu Ile Thr Pro Pro Glu 85 90 95Ser Ile Tyr Pro Asp Phe Arg Tyr Arg Ala Thr Asp Pro Ser Gly Ile 100 105 110Val Glu Asn Glu Val Cys Pro Val Phe Ala Ala Arg Thr Thr Ser Ala 115 120 125Leu Gln Ile Asn Asp Asp Glu Val Met Asp Tyr Gln Trp Cys Asp Leu 130 135 140Ala Asp Val Leu His Gly Ile Asp Ala Thr Pro Trp Ala Phe Ser Pro145 150 155 160Trp Met Val Met Gln Ala Thr Asn Arg Glu Ala Arg Lys Arg Leu Ser 165 170 175Ala Phe Thr Gln Leu Lys 1802350PRTB. subtilis 2Met Val Thr Arg Ala Glu Arg Lys Arg Gln His Ile Asn His Ala Leu1 5 10 15Ser Ile Gly Gln Lys Arg Glu Thr Gly Leu Asp Asp Ile Thr Phe Val 20 25 30His Val Ser Leu Pro Asp Leu Ala Leu Glu Gln Val Asp Ile Ser Thr 35 40 45Lys Ile Gly Glu Leu Ser Ser Ser Ser Pro Ile Phe Ile Asn Ala Met 50 55 60Thr Gly Gly Gly Gly Lys Leu Thr Tyr Glu Ile Asn Lys Ser Leu Ala65 70 75 80Arg Ala Ala Ser Gln Ala Gly Ile Pro Leu Ala Val Gly Ser Gln Met 85 90 95Ser Ala Leu Lys Asp Pro Ser Glu Arg Leu Ser Tyr Glu Ile Val Arg 100 105 110Lys Glu Asn Pro Asn Gly Leu Ile Phe Ala Asn Leu Gly Ser Glu Ala 115 120 125Thr Ala Ala Gln Ala Lys Glu Ala Val Glu Met Ile Gly Ala Asn Ala 130 135 140Leu Gln Ile His Leu Asn Val Ile Gln Glu Ile Val Met Pro Glu Gly145 150 155 160Asp Arg Ser Phe Ser Gly Ala Leu Lys Arg Ile Glu Gln Ile Cys Ser 165 170 175Arg Val Ser Val Pro Val Ile Val Lys Glu Val Gly Phe Gly Met Ser 180 185 190Lys Ala Ser Ala Gly Lys Leu Tyr Glu Ala Gly Ala Ala Ala Val Asp 195 200 205Ile Gly Gly Tyr Gly Gly Thr Asn Phe Ser Lys Ile Glu Asn Leu Arg 210 215 220Arg Gln Arg Gln Ile Ser Phe Phe Asn Ser Trp Gly Ile Ser Thr Ala225 230 235 240Ala Ser 
Leu Ala Glu Ile Arg Ser Glu Phe Pro Ala Ser Thr Met Ile 245 250 255Ala Ser Gly Gly Leu Gln Asp Ala Leu Asp Val Ala Lys Ala Ile Ala 260 265 270Leu Gly Ala Ser Cys Thr Gly Met Ala Gly His Phe Leu Lys Ala Leu 275 280 285Thr Asp Ser Gly Glu Glu Gly Leu Leu Glu Glu Ile Gln Leu Ile Leu 290 295 300Glu Glu Leu Lys Leu Ile Met Thr Val Leu Gly Ala Arg Thr Ile Ala305 310 315 320Asp Leu Gln Lys Ala Pro Leu Val Ile Lys Gly Glu Thr His His Trp 325 330 335Leu Thr Glu Arg Gly Val Asn Thr Ser Ser Tyr Ser Val Arg 340 345 3503288PRTS. cerevisiae 3Met Thr Ala Asp Asn Asn Ser Met Pro His Gly Ala Val Ser Ser Tyr1 5 10 15Ala Lys Leu Val Gln Asn Gln Thr Pro Glu Asp Ile Leu Glu Glu Phe 20 25 30Pro Glu Ile Ile Pro Leu Gln Gln Arg Pro Asn Thr Arg Ser Ser Glu 35 40 45Thr Ser Asn Asp Glu Ser Gly Glu Thr Cys Phe Ser Gly His Asp Glu 50 55 60Glu Gln Ile Lys Leu Met Asn Glu Asn Cys Ile Val Leu Asp Trp Asp65 70 75 80Asp Asn Ala Ile Gly Ala Gly Thr Lys Lys Val Cys His Leu Met Glu 85 90 95Asn Ile Glu Lys Gly Leu Leu His Arg Ala Phe Ser Val Phe Ile Phe 100 105 110Asn Glu Gln Gly Glu Leu Leu Leu Gln Gln Arg Ala Thr Glu Lys Ile 115 120 125Thr Phe Pro Asp Leu Trp Thr Asn Thr Cys Cys Ser His Pro Leu Cys 130 135 140Ile Asp Asp Glu Leu Gly Leu Lys Gly Lys Leu Asp Asp Lys Ile Lys145 150 155 160Gly Ala Ile Thr Ala Ala Val Arg Lys Leu Asp His Glu Leu Gly Ile 165 170 175Pro Glu Asp Glu Thr Lys Thr Arg Gly Lys Phe His Phe Leu Asn Arg 180 185 190Ile His Tyr Met Ala Pro Ser Asn Glu Pro Trp Gly Glu His Glu Ile 195 200 205Asp Tyr Ile Leu Phe Tyr Lys Ile Asn Ala Lys Glu Asn Leu Thr Val 210 215 220Asn Pro Asn Val Asn Glu Val Arg Asp Phe Lys Trp Val Ser Pro Asn225 230 235 240Asp Leu Lys Thr Met Phe Ala Asp Pro Ser Tyr Lys Phe Thr Pro Trp 245 250 255Phe Lys Ile Ile Cys Glu Asn Tyr Leu Phe Asn Trp Trp Glu Gln Leu 260 265 270Asp Asp Leu Ser Glu Val Glu Asn Asp Arg Gln Ile His Arg Met Leu 275 280 2854347PRTE. 
faecalis 4Met Asn Arg Lys Asp Glu His Leu Ser Leu Ala Lys Ala Phe His Lys1 5 10 15Glu Lys Ser Asn Asp Phe Asp Arg Val Arg Phe Val His Gln Ser Phe 20 25 30Ala Glu Ser Ala Val Asn Glu Val Asp Ile Ser Thr Ser Phe Leu Ser 35 40 45Phe Gln Leu Pro Gln Pro Phe Tyr Val Asn Ala Met Thr Gly Gly Ser 50 55 60Gln Arg Ala Lys Glu Ile Asn Gln Gln Leu Gly Ile Ile Ala Lys Glu65 70 75 80Thr Gly Leu Leu Val Ala Thr Gly Ser Val Ser Ala Ala Leu Lys Asp 85 90 95Ala Ser Leu Ala Asp Thr Tyr Gln Ile Met Arg Lys Glu Asn Pro Asp 100 105 110Gly Leu Ile Phe Ala Asn Ile Gly Ala Gly Leu Gly Val Glu Glu Ala 115 120 125Lys Arg Ala Leu Asp Leu Phe Gln Ala Asn Ala Leu Gln Ile His Val 130 135 140Asn Val Pro Gln Glu Leu Val Met Pro Glu Gly Asp Arg Asp Phe Thr145 150 155 160Asn Trp Leu Thr Lys Ile Glu Ala Ile Val Gln Ala Val Glu Val Pro 165 170 175Val Ile Val Lys Glu Val Gly Phe Gly Met Ser Gln Glu Thr Leu Glu 180 185 190Lys Leu Thr Ser Ile Gly Val Gln Ala Ala Asp Val Ser Gly Gln Gly 195 200 205Gly Thr Ser Phe Thr Gln Ile Glu Asn Ala Arg Arg Lys Lys Arg Glu 210 215 220Leu Ser Phe Leu Asp Asp Trp Gly Gln Ser Thr Val Ile Ser Leu Leu225 230 235 240Glu Ser Gln Asn Trp Gln Lys Lys Leu Thr Ile Leu Gly Ser Gly Gly 245 250 255Val Arg Asn Ser Leu Asp Ile Val Lys Gly Leu Ala Leu Gly Ala Lys 260 265 270Ser Met Gly Val Ala Gly Thr Ile Leu Ala Ser Leu Met Ser Lys Asn 275 280 285Gly Leu Glu Asn Thr Leu Ala Leu Val Gln Gln Trp Gln Glu Glu Val 290 295 300Lys Met Leu Tyr Thr Leu Leu Gly Lys Lys Thr Thr Glu Glu Leu Thr305 310 315 320Ser Thr Ala Leu Val Leu Asp Pro Val Leu Val Asn Trp Cys His Asn 325 330 335Arg Gly Ile Asp Ser Thr Val Phe Ala Lys Arg 340 3455329PRTS. 
pyrogenes 5Met Thr Asn Arg Lys Asp Asp His Ile Lys Tyr Ala Leu Lys Tyr Gln1 5 10 15Ser Pro Tyr Asn Ala Phe Asp Asp Ile Glu Leu Ile His His Ser Leu 20 25 30Pro Ser Tyr Asp Leu Ser Asp Ile Asp Leu Ser Thr His Phe Ala Gly 35 40 45Gln Asp Phe Asp Phe Pro Phe Tyr Ile Asn Ala Met Thr Gly Gly Ser 50 55 60Gln Lys Gly Lys Ala Val Asn Glu Lys Leu Ala Lys Val Ala Ala Ala65 70 75 80Thr Gly Ile Val Met Val Thr Gly Ser Tyr Ser Ala Ala Leu Lys Asn 85 90 95Pro Asn Asp Asp Ser Tyr Arg Leu His Glu Val Ala Asp Asn Leu Lys 100 105 110Leu Ala Thr Asn Ile Gly Leu Asp Lys Pro Val Ala Leu Gly Gln Gln 115 120 125Thr Val Gln Glu Met Gln Pro Leu Phe Leu Gln Val His Val Asn Val 130 135 140Met Gln Glu Leu Leu Met Pro Glu Gly Glu Arg Val Phe His Thr Trp145 150 155 160Lys Lys His Leu Ala Glu Tyr Ala Ser Gln Ile Pro Val Pro Val Ile 165 170 175Leu Lys Glu Val Gly Phe Gly Met Asp Val Asn Ser Ile Lys Leu Ala 180 185 190His Asp Leu Gly Ile Gln Thr Phe Asp Ile Ser Gly Arg Gly Gly Thr 195 200 205Ser Phe Ala Tyr Ile Glu Asn Gln Arg Gly Gly Asp Arg Ser Tyr Leu 210 215 220Asn Asp Trp Gly Gln Thr Thr Val Gln Cys Leu Leu Asn Ala Gln Gly225 230 235 240Leu Met Asp Gln Val Glu Ile Leu Ala Ser Gly Gly Val Arg His Pro 245 250 255Leu Asp Met Ile Lys Cys Phe Val Leu Gly Ala Arg Ala Val Gly Leu 260 265 270Ser Arg Thr Val Leu Glu Leu Val Glu Lys Tyr Pro Thr Glu Arg Val 275 280 285Ile Ala Ile Val Asn Gly Trp Lys Glu Glu Leu Lys Ile Ile Met Cys 290 295 300Ala Leu Asp Cys Lys Thr Ile Lys Glu Leu Lys Gly Val Asp Tyr Leu305 310 315 320Leu Tyr Gly Arg Leu Gln Gln Val Asn 3256336PRTS. 
pneumonia 6Met Thr Thr Asn Arg Lys Asp Glu His Ile Leu Tyr Ala Leu Glu Gln1 5 10 15Lys Ser Ser Tyr Asn Ser Phe Asp Glu Val Glu Leu Ile His Ser Ser 20 25 30Leu Pro Leu Tyr Asn Leu Asp Glu Ile Asp Leu Ser Thr Glu Phe Ala 35 40 45Gly Arg Lys Trp Asp Phe Pro Phe Tyr Ile Asn Ala Met Thr Gly Gly 50 55 60Ser Asn Lys Gly Arg Glu Ile Asn Gln Lys Leu Ala Gln Val Ala Glu65 70 75 80Ser Cys Gly Ile Leu Phe Val Thr Gly Ser Tyr Ser Ala Ala Leu Lys 85 90 95Asn Pro Thr Asp Asp Ser Phe Ser Val Lys Ser Ser His Pro Asn Leu 100 105 110Leu Leu Gly Thr Asn Ile Gly Leu Asp Lys Pro Val Glu Leu Gly Leu 115 120 125Gln Thr Val Glu Glu Met Asn Pro Val Leu Leu Gln Val His Val Asn 130 135 140Val Met Gln Glu Leu Leu Met Pro Glu Gly Glu Arg Lys Phe Arg Ser145 150 155 160Trp Gln Ser His Leu Ala Asp Tyr Ser Lys Gln Ile Pro Val Pro Ile 165 170 175Val Leu Lys Glu Val Gly Phe Gly Met Asp Ala Lys Thr Ile Glu Arg 180 185 190Ala Tyr Glu Phe Gly Val Arg Thr Val Asp Leu Ser Gly Arg Gly Gly 195 200 205Thr Ser Phe Ala Tyr Ile Glu Asn Arg Arg Ser Gly Gln Arg Asp Tyr 210 215 220Leu Asn Gln Trp Gly Gln Ser Thr Met Gln Ala Leu Leu Asn Ala Gln225 230 235 240Glu Trp Lys Asp Lys Val Glu Leu Leu Val Ser Gly Gly Val Arg Asn 245 250 255Pro Leu Asp Met Ile Lys Cys Leu Val Phe Gly Ala Lys Ala Val Gly 260 265 270Leu Ser Arg Thr Val Leu Glu Leu Val Glu Thr Tyr Thr Val Glu Glu 275 280 285Val Ile Gly Ile Val Gln Gly Trp Lys Ala Asp Leu Arg Leu Ile Met 290 295 300Cys Ser Leu Asn Cys Ala Thr Ile Ala Asp Leu Gln Lys Val Asp Tyr305 310 315 320Leu Leu Tyr Gly Lys Leu Lys Glu Ala Lys Asp Gln Met Lys Lys Ala 325 330 3357560PRTPopulus alba 7Met Arg Cys Ser Val Ser Thr Glu Asn Val Ser Phe Thr Glu Thr Glu1 5 10 15Thr Glu Ala Arg Arg Ser Ala Asn Tyr Glu Pro Asn Ser Trp Asp Tyr 20 25 30Asp Tyr Leu Leu Ser Ser Asp Thr Asp Glu Ser Ile Glu Val Tyr Lys 35 40 45Asp Lys Ala Lys Lys Leu Glu Ala Glu Val Arg Arg Glu Ile Asn Asn 50 55 60Glu Lys Ala Glu Phe Leu Thr Leu Leu Glu Leu Ile Asp Asn Val Gln65 70 75 80Arg Leu Gly Leu Gly Tyr Arg Phe Glu 
Ser Asp Ile Arg Gly Ala Leu 85 90 95Asp Arg Phe Val Ser Ser Gly Gly Phe Asp Ala Val Thr Lys Thr Ser 100 105 110Leu His Gly Thr Ala Leu Ser Phe Arg Leu Leu Arg Gln His Gly Phe 115 120 125Glu Val Ser Gln Glu Ala Phe Ser Gly Phe Lys Asp Gln Asn Gly Asn 130 135 140Phe Leu Glu Asn Leu Lys Glu Asp Ile Lys Ala Ile Leu Ser Leu Tyr145 150 155 160Glu Ala Ser Phe Leu Ala Leu Glu Gly Glu Asn Ile Leu Asp Glu Ala 165 170 175Lys Val Phe Ala Ile Ser His Leu Lys Glu Leu Ser Glu Glu Lys Ile 180 185 190Gly Lys Glu Leu Ala Glu Gln Val Asn His Ala Leu Glu Leu Pro Leu 195 200 205His Arg Arg Thr Gln Arg Leu Glu Ala Val Trp Ser Ile Glu Ala Tyr 210 215 220Arg Lys Lys Glu Asp Ala Asn Gln Val Leu Leu Glu Leu Ala Ile Leu225 230 235 240Asp Tyr Asn Met Ile Gln Ser Val Tyr Gln Arg Asp Leu Arg Glu Thr 245 250 255Ser Arg Trp Trp Arg Arg Val Gly Leu Ala Thr Lys Leu His Phe Ala 260 265 270Arg Asp Arg Leu Ile Glu Ser Phe Tyr Trp Ala Val Gly Val Ala Phe 275 280 285Glu Pro Gln Tyr Ser Asp Cys Arg Asn Ser Val Ala Lys Met Phe Ser 290 295 300Phe Val Thr Ile Ile Asp Asp Ile Tyr Asp Val Tyr Gly Thr Leu Asp305 310 315 320Glu Leu Glu Leu Phe Thr Asp Ala Val Glu Arg Trp Asp Val Asn Ala 325 330 335Ile Asn Asp Leu Pro Asp Tyr Met Lys Leu Cys Phe Leu Ala Leu Tyr 340 345 350Asn Thr Ile Asn Glu Ile Ala Tyr Asp Asn Leu Lys Asp Lys Gly Glu 355 360 365Asn Ile Leu Pro Tyr Leu Thr Lys Ala Trp Ala Asp Leu Cys Asn Ala 370 375 380Phe Leu Gln Glu Ala Lys Trp Leu Tyr Asn Lys Ser Thr Pro Thr Phe385 390 395 400Asp Asp Tyr Phe Gly Asn Ala Trp Lys Ser Ser Ser Gly Pro Leu Gln 405 410 415Leu Val Phe Ala Tyr Phe Ala Val Val Gln Asn Ile Lys Lys Glu Glu 420 425 430Ile Glu Asn Leu Gln Lys Tyr His Asp Thr Ile Ser Arg Pro Ser His 435 440 445Ile Phe Arg Leu Cys Asn Asp Leu Ala Ser Ala Ser Ala Glu Ile Ala 450 455 460Arg Gly Glu Thr Ala Asn Ser Val Ser Cys Tyr Met Arg Thr Lys Gly465 470 475 480Ile Ser Glu Glu Leu Ala Thr Glu Ser Val Met Asn Leu Ile Asp Glu 485 490 495Thr Trp Lys Lys Met Asn Lys Glu Lys Leu Gly Gly Ser Leu Phe Ala 500 505 
510Lys Pro Phe Val Glu Thr Ala Ile Asn Leu Ala Arg Gln Ser His Cys 515 520 525Thr Tyr His Asn Gly Asp Ala His Thr Ser Pro Asp Glu Leu Thr Arg 530 535 540Lys Arg Val Leu Ser Val Ile Thr Glu Pro Ile Leu Pro Phe Glu Arg545 550 555 5608549DNAE. coli 8atgcaaacgg aacacgtcat tttattgaat gcacagggag ttcccacggg tacgctggaa 60aagtatgccg cacacacggc agacacccgc ttacatctcg cgttctccag ttggctgttt 120aatgccaaag gacaattatt agttacccgc cgcgcactga gcaaaaaagc atggcctggc 180gtgtggacta actcggtttg tgggcaccca caactgggag aaagcaacga agacgcagtg 240atccgccgtt gccgttatga gcttggcgtg gaaattacgc ctcctgaatc tatctatcct 300gactttcgct accgcgccac cgatccgagt ggcattgtgg aaaatgaagt gtgtccggta 360tttgccgcac gcaccactag tgcgttacag

atcaatgatg atgaagtgat ggattatcaa 420tggtgtgatt tagcagatgt attacacggt attgatgcca cgccgtgggc gttcagtccg 480tggatggtga tgcaggcgac aaatcgcgaa gccagaaaac gattatctgc atttacccag 540cttaaataa 54991053DNAB. subtilis 9atggtcacgc gcgcggagcg caagcgccag cacatcaacc acgcgctctc catcggccag 60aagcgcgaaa ccggcctgga cgacatcacg tttgtgcatg tctcgctgcc ggacctggcc 120ctcgaacagg tcgacatctc gacgaagatt ggcgagctga gctcctcgtc gccgatcttc 180atcaacgcga tgaccggcgg tggtggcaag ctgacctacg agatcaacaa gtccctggcg 240cgcgcggcca gccaggccgg catcccgctg gcggtcggca gccagatgtc ggccctgaag 300gaccccagcg agcgcctgtc gtacgagatt gtccgcaagg aaaacccgaa cggcctgatc 360ttcgccaatc tgggctcgga agccaccgcg gcgcaggcca aagaagcggt ggagatgatc 420ggcgccaacg ccctgcagat ccacctgaac gtgatccaag agatcgtgat gcccgagggc 480gaccgttcct tctccggcgc cctcaagcgc atcgagcaaa tctgcagccg cgtgtcggtg 540cccgtcatcg tcaaggaagt gggcttcggc atgtcgaagg ccagcgccgg caagctgtac 600gaagccggcg cggccgccgt ggacatcggc ggctacggcg gcacgaactt cagcaagatt 660gagaatctgc gccgccagcg gcagatcagc ttcttcaact cgtggggcat cagcacggcc 720gcgtcgctgg cggagatccg gtccgagttc ccggcctcga ccatgatcgc gtccggtggc 780ctccaagacg ccctggacgt cgccaaggcc atcgccctgg gcgcgagctg caccggcatg 840gccggtcact tcctgaaggc cctgaccgat agcggcgagg aaggcctgct ggaagagatc 900cagctgatcc tggaagaact gaagctgatc atgacggtgc tgggcgcccg taccatcgcg 960gatctgcaaa aggcgccgct cgtgatcaag ggcgaaaccc atcactggct caccgagcgg 1020ggcgtgaaca ccagctcgta ttcggtgcgc tga 105310867DNAS. 
cerevisiae 10atgactgccg acaacaatag tatgccccat ggtgcagtat ctagttacgc caaattagtg 60caaaaccaaa cacctgaaga cattttggaa gagtttcctg aaattattcc attacaacaa 120agacctaata cccgatctag tgagacgtca aatgacgaaa gcggagaaac atgtttttct 180ggtcatgatg aggagcaaat taagttaatg aatgaaaatt gtattgtttt ggattgggac 240gataatgcta ttggtgccgg taccaagaaa gtttgtcatt taatggaaaa tattgaaaag 300ggtttactac atcgtgcatt ctccgtcttt attttcaatg aacaaggtga attactttta 360caacaaagag ccactgaaaa aataactttc cctgatcttt ggactaacac atgctgctct 420catccactat gtattgatga cgaattaggt ttgaagggta agctagacga taagattaag 480ggcgctatta ctgcggcggt gagaaaacta gatcatgaat taggtattcc agaagatgaa 540actaagacaa ggggtaagtt tcacttttta aacagaatcc attacatggc accaagcaat 600gaaccatggg gtgaacatga aattgattac atcctatttt ataagatcaa cgctaaagaa 660aacttgactg tcaacccaaa cgtcaatgaa gttagagact tcaaatgggt ttcaccaaat 720gatttgaaaa ctatgtttgc tgacccaagt tacaagttta cgccttggtt taagattatt 780tgcgagaatt acttattcaa ctggtgggag caattagatg acctttctga agtggaaaat 840gacaggcaaa ttcatagaat gctataa 867111044DNAE. 
faecalis 11atgaatcgaa aagatgaaca tctatcatta gctaaagcgt tccacaaaga aaaaagtaat 60gactttgatc gtgtgcgttt tgttcaccaa tcgtttgctg aatccgctgt taacgaagtg 120gatatttcca cttcgtttct ttcttttcag cttccccaac ctttttatgt caatgcaatg 180acaggtggta gtcagcgtgc aaaagaaatt aatcagcaat taggcattat tgccaaagaa 240actggccttt tagttgcgac aggatctgtc tcggcagcgt taaaagatgc tagtttagcg 300gatacgtatc aaattatgcg aaaagaaaac ccagatggac tcatttttgc caatattggt 360gcaggcttgg gtgtggaaga agcaaagcga gcgcttgatt tatttcaagc gaatgcctta 420caaatccatg taaatgtgcc ccaagaattg gtcatgcctg aaggagatcg tgatttcact 480aattggctaa ccaagattga agctatcgta caggccgtag aagtgcctgt cattgtcaaa 540gaggttggct ttggcatgag ccaagaaacc ttagaaaaac ttacctctat cggcgttcaa 600gcagcggatg tgagcggcca aggcggaacg agttttacac aaattgaaaa tgcccggcgg 660aagaaacgag aactttcttt cttagatgat tgggggcaat caacggtcat ctctcttctg 720gaatcacaaa attggcaaaa gaaactaact attctcggct ctggcggtgt gcgtaactct 780cttgatattg tcaaaggact cgctttaggt gccaaaagca tgggagttgc tgggactatc 840ttagcttccc ttatgagtaa aaatggttta gaaaatacct tagcccttgt acagcaatgg 900caagaagaag tgaaaatgct ttatactctt ttaggaaaaa agacgacaga agaattgacg 960agtaccgcac ttgtcctcga tccagtttta gttaattggt gtcataaccg tggtatcgac 1020agcactgttt tcgcaaaacg ttaa 104412990DNAS. 
pyogenes 12atgactaacc gtaaagatga tcacatcaaa tatgctctca agtaccaatc gccttataat 60gcttttgatg acatagaact catacaccat tccttaccta gctatgattt gtctgatatt 120gatctcagta ctcattttgc tgggcaagac ttcgactttc ccttttacat caatgccatg 180acaggaggaa gtcaaaaagg caaagctgtc aatgaaaaat tggccaaagt agcagcagca 240acagggattg tcatggtgac agggtcttat agcgctgctt taaaaaatcc taacgacgat 300tcctatcgtt tacatgaggt ggcagataac ttgaaactag ccacgaatat tggtctagat 360aaacctgtgg cgctaggaca acaaacggtt caagaaatgc agcccctctt tttacaggtt 420catgtgaatg tgatgcaaga gttgctgatg ccagagggtg agcgcgtctt tcatacctgg 480aaaaaacacc tcgctgaata cgctagtcaa ataccagttc ctgtcattct caaagaagtt 540ggttttggca tggatgtcaa tagtatcaag ctagcacatg acctaggcat tcaaaccttt 600gatatttcag gtagaggagg aacttcattt gcttacattg aaaatcaaag agggggagac 660cgctcttact taaacgattg gggacaaacc actgttcagt gcttactgaa tgcacaagga 720ctgatggacc aagtggaaat cttagcttcg ggtggtgtca gacacccctt ggacatgatt 780aagtgttttg tcttaggagc acgtgcagtg ggactctcac gcaccgtttt agaattggtc 840gaaaaatacc caaccgagcg tgtgattgct atcgttaatg gctggaaaga agaattaaaa 900atcattatgt gtgctcttga ctgtaaaact attaaagaat taaagggagt cgactactta 960ctatatggac gcttgcagca ggtcaattag 990131008DNAS. 
pneumoniae 13atgacgacca accgcaagga tgagcacatc ctctacgccc tggagcagaa gtcgtcgtac 60aactcgttcg acgaagtgga actgatccac tcgtcgctgc cgctgtataa cctggacgaa 120atcgacctgt ccaccgagtt cgccggccgc aagtgggatt tcccgttcta catcaatgcc 180atgaccggcg gtagcaacaa gggccgcgaa atcaatcaga agctggccca ggtcgccgag 240tcgtgcggca tcctgttcgt caccggcagc tactccgccg cgctgaagaa cccgaccgac 300gactcgttct cggtcaagag cagccacccg aatctgctgc tgggcacgaa catcggcctc 360gacaagcccg tcgaactggg cctgcagacc gtggaagaaa tgaaccccgt gctgctccag 420gtgcatgtga acgtgatgca agagctgctg atgccggagg gcgaacgcaa gttccgcagc 480tggcagtcgc acctggccga ctactcgaag cagatccccg tgccgatcgt gctgaaagaa 540gtgggcttcg gcatggacgc caagaccatc gagcgtgcct acgagttcgg cgtgcgcacc 600gtggacctct cgggccgcgg tggcacgagc ttcgcgtaca tcgaaaaccg gcgcagcggc 660cagcgcgact acctgaacca gtggggccaa tcgaccatgc aggccctgct gaacgcgcaa 720gaatggaagg acaaggtcga gctgctggtg tcgggcggcg tgcgtaaccc gctcgacatg 780atcaagtgcc tggtgttcgg cgccaaggcc gtgggcctgt cccgcaccgt gctggagctg 840gtcgaaacct acaccgtcga agaagtcatc ggcattgtcc agggctggaa ggccgacctc 900cgcctcatca tgtgctccct gaactgcgcc acgatcgcgg acctccagaa ggtggactat 960ctcctctacg gcaagctcaa agaagccaag gaccagatga agaaggcg 1008141683DNAPopulus alba 14atgcgttgta gcgttagcac cgaaaatgtg tcgtttacgg aaacggaaac cgaagctcgc 60cgcagcgcaa actatgaacc gaactcgtgg gattacgatt acctccttag cagcgatacg 120gatgaaagca ttgaagtgta taaagacaaa gccaagaaac tggaggccga agtccgtcgc 180gaaatcaaca atgagaaagc ggagtttctt acgttactgg aattgatcga taacgtgcaa 240cggttaggcc tcggctaccg ctttgagagc gatatccgtg gtgcactgga ccgcttcgta 300tcgtctggtg gttttgacgc cgttacgaaa acgagcctgc atggtacagc attgtctttt 360cggctgttgc gccagcatgg atttgaagtg tcacaggagg cattttcagg cttcaaagac 420cagaacggga attttttgga gaatttgaaa gaagatatca aagcgatctt atctctgtat 480gaggcgtcat ttctcgctct ggaaggggaa aatattctgg acgaagcgaa agtgttcgca 540atttcccatc tgaaagaact ttccgaagaa aagattggga aagaattggc cgaacaggtg 600aaccatgcgc tggaactgcc actgcaccgt cgcacccaac gcctcgaagc ggtatggtcg 660attgaagcgt atcgcaaaaa agaggatgca aatcaggttc 
tgctggaact ggccattctc 720gactataaca tgattcagtc cgtctatcaa cgtgatctgc gcgaaactag tcgttggtgg 780cgccgtgtag gacttgccac taaactgcat tttgcacgtg atcgtctgat tgagtcgttc 840tattgggcgg ttggtgtagc gtttgagccg cagtattctg attgccgcaa tagtgtggcg 900aaaatgttct cctttgtgac catcattgac gatatttacg acgtgtatgg caccctggat 960gaactggaat tattcaccga tgcagtagaa cgctgggacg tcaacgcgat caatgatttg 1020ccggattaca tgaaactgtg ttttctggcc ctgtataaca ccattaacga aattgcctat 1080gacaacctca aagacaaggg tgaaaatatc ctgccctatc tgactaaagc ttgggctgat 1140ctgtgtaacg cgttcttaca ggaagccaaa tggctctaca acaagagtac gcctactttc 1200gatgactact ttggcaacgc ttggaaaagc tctagcggcc ctttacaact ggtgttcgcg 1260tatttcgccg ttgttcagaa tatcaagaaa gaagagattg agaacctcca aaagtaccac 1320gatacgattt cgcgtccgtc acacatcttt cgcctttgca atgatttggc cagtgcatct 1380gcagagattg cgcgcggtga aactgccaac tccgtcagtt gctacatgcg taccaaaggc 1440atcagcgagg aactggctac cgagtcggtg atgaacttaa tcgatgaaac ctggaagaag 1500atgaacaaag agaaacttgg tggcagtctg tttgctaaac cgttcgttga gacagcgatt 1560aatctggcgc gtcaaagcca ctgcacctac cacaatggcg atgcccacac atccccagac 1620gaattaaccc ggaaacgtgt cctgagtgtc atcaccgaac ccattctgcc gttcgaacgc 1680taa 1683157399DNAArtificial sequenceSynthetic 15tagattaatt aacctccagc gcggggatct catgctggag ttcttcgccc acccccagac 60aagctgtgac cgtctccggg agctgcatgt gtcagaggtt ttcaccgtca tcaccgaaac 120gcgcgaggca gcagatcaat tcgcgcgcga aggcgaagcg gcatgcataa tgtgcctgtc 180aaatggacga agcagggatt ctgcaaaccc tatgctactc cgtcaagccg tcaattgtct 240gattcgttac caattatgac aacttgacgg ctacatcatt cactttttct tcacaaccgg 300cacggaactc gctcgggctg gccccggtgc attttttaaa tacccgcgag aaatagagtt 360gatcgtcaaa accaacattg cgaccgacgg tggcgatagg catccgggtg gtgctcaaaa 420gcagcttcgc ctggctgata cgttggtcct cgcgccagct taagacgcta atccctaact 480gctggcggaa aagatgtgac agacgcgacg gcgacaagca aacatgctgt gcgacgctgg 540cgatatcaaa attgctgtct gccaggtgat cgctgatgta ctgacaagcc tcgcgtaccc 600gattatccat cggtggatgg agcgactcgt taatcgcttc catgcgccgc agtaacaatt 660gctcaagcag atttatcgcc agcagctccg aatagcgccc 
ttccccttgc ccggcgttaa 720tgatttgccc aaacaggtcg ctgaaatgcg gctggtgcgc ttcatccggg cgaaagaacc 780ccgtattggc aaatattgac ggccagttaa gccattcatg ccagtaggcg cgcggacgaa 840agtaaaccca ctggtgatac cattcgcgag cctccggatg acgaccgtag tgatgaatct 900ctcctggcgg gaacagcaaa atatcacccg gtcggcaaac aaattctcgt ccctgatttt 960tcaccacccc ctgaccgcga atggtgagat tgagaatata acctttcatt cccagcggtc 1020ggtcgataaa aaaatcgaga taaccgttgg cctcaatcgg cgttaaaccc gccaccagat 1080gggcattaaa cgagtatccc ggcagcaggg gatcattttg cgcttcagcc atacttttca 1140tactcccgcc attcagagaa gaaaccaatt gtccatattg catcagacat tgccgtcact 1200gcgtctttta ctggctcttc tcgctaacca aaccggtaac cccgcttatt aaaagcattc 1260tgtaacaaag cgggaccaaa gccatgacaa aaacgcgtaa caaaagtgtc tataatcacg 1320gcagaaaagt ccacattgat tatttgcacg gcgtcacact ttgctatgcc atagcatttt 1380tatccataag attagcggat cctacctgac gctttttatc gcaactctct actgtttctc 1440catacccgtt ttttgggcta gaaataattt tgagctcctt taaggaagga gcgaagcatg 1500cgttgtagcg ttagcaccga aaatgtgtcg tttacggaaa cggaaaccga agctcgccgc 1560agcgcaaact atgaaccgaa ctcgtgggat tacgattacc tccttagcag cgatacggat 1620gaaagcattg aagtgtataa agacaaagcc aagaaactgg aggccgaagt ccgtcgcgaa 1680atcaacaatg agaaagcgga gtttcttacg ttactggaat tgatcgataa cgtgcaacgg 1740ttaggcctcg gctaccgctt tgagagcgat atccgtggtg cactggaccg cttcgtatcg 1800tctggtggtt ttgacgccgt tacgaaaacg agcctgcatg gtacagcatt gtcttttcgg 1860ctgttgcgcc agcatggatt tgaagtgtca caggaggcat tttcaggctt caaagaccag 1920aacgggaatt ttttggagaa tttgaaagaa gatatcaaag cgatcttatc tctgtatgag 1980gcgtcatttc tcgctctgga aggggaaaat attctggacg aagcgaaagt gttcgcaatt 2040tcccatctga aagaactttc cgaagaaaag attgggaaag aattggccga acaggtgaac 2100catgcgctgg aactgccact gcaccgtcgc acccaacgcc tcgaagcggt atggtcgatt 2160gaagcgtatc gcaaaaaaga ggatgcaaat caggttctgc tggaactggc cattctcgac 2220tataacatga ttcagtccgt ctatcaacgt gatctgcgcg aaactagtcg ttggtggcgc 2280cgtgtaggac ttgccactaa actgcatttt gcacgtgatc gtctgattga gtcgttctat 2340tgggcggttg gtgtagcgtt tgagccgcag tattctgatt gccgcaatag tgtggcgaaa 2400atgttctcct 
ttgtgaccat cattgacgat atttacgacg tgtatggcac cctggatgaa 2460ctggaattat tcaccgatgc agtagaacgc tgggacgtca acgcgatcaa tgatttgccg 2520gattacatga aactgtgttt tctggccctg tataacacca ttaacgaaat tgcctatgac 2580aacctcaaag acaagggtga aaatatcctg ccctatctga ctaaagcttg ggctgatctg 2640tgtaacgcgt tcttacagga agccaaatgg ctctacaaca agagtacgcc tactttcgat 2700gactactttg gcaacgcttg gaaaagctct agcggccctt tacaactggt gttcgcgtat 2760ttcgccgttg ttcagaatat caagaaagaa gagattgaga acctccaaaa gtaccacgat 2820acgatttcgc gtccgtcaca catctttcgc ctttgcaatg atttggccag tgcatctgca 2880gagattgcgc gcggtgaaac tgccaactcc gtcagttgct acatgcgtac caaaggcatc 2940agcgaggaac tggctaccga gtcggtgatg aacttaatcg atgaaacctg gaagaagatg 3000aacaaagaga aacttggtgg cagtctgttt gctaaaccgt tcgttgagac agcgattaat 3060ctggcgcgtc aaagccactg cacctaccac aatggcgatg cccacacatc cccagacgaa 3120ttaacccgga aacgtgtcct gagtgtcatc accgaaccca ttctgccgtt cgaacgctaa 3180gcctgctaac aaagcccgaa aggaagctga gttggctgct gccaccgctg agcactagtg 3240cggccgcttt gcgcattcac agttctccgc aagaattgat tggctccaat tcttggagtg 3300gtgaatccgt tagcgaggtg ccgccggctt ccattcaggt cgaggtggcc cggctccatg 3360caccgcgacg caacgcgggg aggcagacaa ggtatagggc ggcgcctaca atccatgcca 3420acccgttcca tgtgctcgcc gaggcggcat aaatcgccgt gacgatcagc ggtccagtga 3480tcgaagttag gctggtaaga gccgcgagcg atccttgaag ctgtccctga tggtcgtcat 3540ctacctgcct ggacagcatg gcctgcaacg cgggcatccc gatgccgccg gaagcgagaa 3600gaatcataat ggggaaggcc atccagcctc gcgtcgcgaa cgccagcaag acgtagccca 3660gcgcgtcggc cgccatgccg gcgataatgg cctgcttctc gccgaaacgt ttggtggcgg 3720gaccagtgac gaaggcttga gcgagggcgt gcaagattcc gaataccgca agcgacaggc 3780cgatcatcgt cgcgctccag cgaaagcggt cctcgccgaa aatgacccag agcgctgccg 3840gcacctgtcc tacgagttgc atgataaaga agacagtcat aagtgcggcg acgatagtca 3900tgccccgcgc ccaccggaag gagctgactg ggttgaaggc tctcaagggc atcggtcgac 3960gctctccctt atgcgactcc tgcattagga agcagcccag tagtaggttg aggccgttga 4020gcaccgccgc cgcaaggaat ggtgcatgca aggagatggc gcccaacagt cccccggcca 4080cggggcctgc caccataccc acgccgaaac aagcgctcat 
gagcccgaag tggcgagccc 4140gatcttcccc atcggtgatg tcggcgatat aggcgccagc aaccgcacct gtggcgccgg 4200tgatgccggc cacgatgcgt ccggcgtaga ggatccacag gacgggtgtg gtcgccatga 4260tcgcgtagtc gatagtggct ccaagtagcg aagcgagcag gactgggcgg cggccaaagc 4320ggtcggacag tgctccgaga acgggtgcgc atagaaattg catcaacgca tatagcgcta 4380gcagcacgcc atagtgactg gcgatgctgt cggaatggac gatatcccgc aagaggcccg 4440gcagtaccgg cataaccaag cctatgccta cagcatccag ggtgacggtg ccgaggatga 4500cgatgagcgc attgttagat ttcatacacg gtgcctgact gcgttagcaa tttaactgtg 4560ataaactacc gcattaaagc ttatcgatga taagctgtca aacatgagaa ttcttgaaga 4620cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat aatggtttct 4680tagacgtcag gtggcacttt tcggggaaat gtgcgcgccc gcgttcctgc tggcgctggg 4740cctgtttctg gcgctggact tcccgctgtt ccgtcagcag cttttcgccc acggccttga 4800tgatcgcggc ggccttggcc tgcatatccc gattcaacgg ccccagggcg tccagaacgg 4860gcttcaggcg ctcccgaagg tctcgggccg tctcttgggc ttgatcggcc ttcttgcgca 4920tctcacgcgc tcctgcggcg gcctgtaggg caggctcata cccctgccga accgcttttg 4980tcagccggtc ggccacggct tccggcgtct caacgcgctt tgagattccc agcttttcgg 5040ccaatccctg cggtgcatag gcgcgtggct cgaccgcttg cgggctgatg gtgacgtggc 5100ccactggtgg ccgctccagg gcctcgtaga acgcctgaat gcgcgtgtga cgtgccttgc 5160tgccctcgat gccccgttgc agccctagat cggccacagc ggccgcaaac gtggtctggt 5220cgcgggtcat ctgcgctttg ttgccgatga actccttggc cgacagcctg ccgtcctgcg 5280tcagcggcac cacgaacgcg gtcatgtgcg ggctggtttc gtcacggtgg atgctggccg 5340tcacgatgcg atccgccccg tacttgtccg ccagccactt gtgcgccttc tcgaagaacg 5400ccgcctgctg ttcttggctg gccgacttcc accattccgg gctggccgtc atgacgtact 5460cgaccgccaa cacagcgtcc ttgcgccgct tctctggcag caactcgcgc agtcggccca 5520tcgcttcatc ggtgctgctg gccgcccagt gctcgttctc tggcgtcctg ctggcgtcag 5580cgttgggcgt ctcgcgctcg cggtaggcgt gcttgagact ggccgccacg ttgcccattt 5640tcgccagctt cttgcatcgc atgatcgcgt atgccgccat gcctgcccct cccttttggt 5700gtccaaccgg ctcgacgggg gcagcgcaag gcggtgcctc cggcgggcca ctcaatgctt 5760gagtatactc actagacttt gcttcgcaaa gtcgtgaccg cctacggcgg ctgcggcgcc 5820ctacgggctt 
gctctccggg cttcgccctg cgcggtcgct gcgctccctt gccagcccgt 5880ggatatgtgg acgatggccg cgagcggcca ccggctggct cgcttcgctc ggcccgtgga 5940caaccctgct ggacaagctg atggacaggc tgcgcctgcc cacgagcttg accacaggga 6000ttgcccaccg gctacccagc cttcgaccac atacccaccg gctccaactg cgcggcctgc 6060ggccttgccc catcaatttt tttaattttc tctggggaaa agcctccggc ctgcggcctg 6120cgcgcttcgc ttgccggttg gacaccaagt ggaaggcggg tcaaggctcg cgcagcgacc 6180gcgcagcggc ttggccttga cgcgcctgga acgacccaag cctatgcgag tgggggcagt 6240cgaagggcga agcccgcccg cctgcccccc gagcctcacg gcggcgagtg cgggggttcc 6300aagggggcag cgccaccttg ggcaaggccg aaggccgcgc agtcgatcaa caagccccgg 6360aggggccact ttttgccgga gggggagccg cgccgaaggc gtgggggaac cccgcagggg 6420tgcccttctt tgggcaccaa agaactagat atagggcgaa atgcgaaaga cttaaaaatc 6480aacaacttaa aaaagggggg tacgcaacag ctcattgcgg caccccccgc aatagctcat 6540tgcgtaggtt aaagaaaatc tgtaattgac tgccactttt acgcaacgca taattgttgt 6600cgcgctgccg aaaagttgca gctgattgcg catggtgccg caaccgtgcg gcacccctac 6660cgcatggaga taagcatggc cacgcagtcc agagaaatcg gcattcaagc caagaacaag 6720cccggtcact gggtgcaaac ggaacgcaaa gcgcatgagg cgtgggccgg gcttattgcg 6780aggaaaccca cggcggcaat gctgctgcat cacctcgtgg cgcagatggg ccaccagaac 6840gccgtggtgg tcagccagaa gacactttcc aagctcatcg gacgttcttt gcggacggtc 6900caatacgcag tcaaggactt ggtggccgag cgctggatct ccgtcgtgaa gctcaacggc 6960cccggcaccg tgtcggccta cgtggtcaat gaccgcgtgg cgtggggcca gccccgcgac 7020cagttgcgcc tgtcggtgtt cagtgccgcc gtggtggttg atcacgacga ccaggacgaa 7080tcgctgttgg ggcatggcga cctgcgccgc atcccgaccc tgtatccggg cgagcagcaa 7140ctaccgaccg gccccggcga ggagccgccc agccagcccg gcattccggg catggaacca 7200gacctgccag ccttgaccga aacggaggaa tgggaacggc gcgggcagca gcgcctgccg 7260atgcccgatg agccgtgttt tctggacgat ggcgagccgt tggagccgcc gacacgggtc 7320acgctgccgc gccggtagca cttgggttgc gcagcaaccc gtaagtgcgc tgttccagac 7380tatcggctgt agccgcctc 7399167960DNAArtificial sequenceSynthetic 16ctcgatgccc cgttgcagcc ctagatcggc cacagcggcc gcaaacgtgg tctggtcgcg 60ggtcatctgc gctttgttgc cgatgaactc cttggccgac agcctgccgt 
cctgcgtcag 120cggcaccacg aacgcggtca tgtgcgggct ggtttcgtca cggtggatgc tggccgtcac 180gatgcgatcc gccccgtact tgtccgccag ccacttgtgc gccttctcga agaacgccgc 240ctgctgttct tggctggccg acttccacca ttccgggctg gccgtcatga cgtactcgac 300cgccaacaca gcgtccttgc gccgcttctc tggcagcaac tcgcgcagtc ggcccatcgc

360ttcatcggtg ctgctggccg cccagtgctc gttctctggc gtcctgctgg cgtcagcgtt 420gggcgtctcg cgctcgcggt aggcgtgctt gagactggcc gccacgttgc ccattttcgc 480cagcttcttg catcgcatga tcgcgtatgc cgccatgcct gcccctccct tttggtgtcc 540aaccggctcg acgggggcag cgcaaggcgg tgcctccggc gggccactca atgcttgagt 600atactcacta gactttgctt cgcaaagtcg tgaccgccta cggcggctgc ggcgccctac 660gggcttgctc tccgggcttc gccctgcgcg gtcgctgcgc tcccttgcca gcccgtggat 720atgtggacga tggccgcgag cggccaccgg ctggctcgct tcgctcggcc cgtggacaac 780cctgctggac aagctgatgg acaggctgcg cctgcccacg agcttgacca cagggattgc 840ccaccggcta cccagccttc gaccacatac ccaccggctc caactgcgcg gcctgcggcc 900ttgccccatc aattttttta attttctctg gggaaaagcc tccggcctgc ggcctgcgcg 960cttcgcttgc cggttggaca ccaagtggaa ggcgggtcaa ggctcgcgca gcgaccgcgc 1020agcggcttgg ccttgacgcg cctggaacga cccaagccta tgcgagtggg ggcagtcgaa 1080ggcgaagccc gcccgcctgc cccccgagcc tcacggcggc gagtgcgggg gttccaaggg 1140ggcagcgcca ccttgggcaa ggccgaaggc cgcgcagtcg atcaacaagc cccggagggg 1200ccactttttg ccggaggggg agccgcgccg aaggcgtggg ggaaccccgc aggggtgccc 1260ttctttgggc accaaagaac tagatatagg gcgaaatgcg aaagacttaa aaatcaacaa 1320cttaaaaaag gggggtacgc aacagctcat tgcggcaccc cccgcaatag ctcattgcgt 1380aggttaaaga aaatctgtaa ttgactgcca cttttacgca acgcataatt gttgtcgcgc 1440tgccgaaaag ttgcagctga ttgcgcatgg tgccgcaacc gtgcggcacc ctaccgcatg 1500gagataagca tggccacgca gtccagagaa atcggcattc aagccaagaa caagcccggt 1560cactgggtgc aaacggaacg caaagcgcat gaggcgtggg ccgggcttat tgcgaggaaa 1620cccacggcgg caatgctgct gcatcacctc gtggcgcaga tgggccacca gaacgccgtg 1680gtggtcagcc agaagacact ttccaagctc atcggacgtt ctttgcggac ggtccaatac 1740gcagtcaagg acttggtggc cgagcgctgg atctccgtcg tgaagctcaa cggccccggc 1800accgtgtcgg cctacgtggt caatgaccgc gtggcgtggg gccagccccg cgaccagttg 1860cgcctgtcgg tgttcagtgc cgccgtggtg gttgatcacg acgaccagga cgaatcgctg 1920ttggggcatg gcgacctgcg ccgcatcccg accctgtatc cgggcgagca gcaactaccg 1980accggccccg gcgaggagcc gcccagccag cccggcattc cgggcatgga accagacctg 2040ccagccttga ccgaaacgga ggaatgggaa cggcgcgggc 
agcagcgcct gccgatgccc 2100gatgagccgt gttttctgga cgatggcgag ccgttggagc cgccgacacg ggtcacgctg 2160ccgcgccggt agcacttggg ttgcgcagca acccgtaagt gcgctgttcc agactatcgg 2220ctgtagccgc ctctagatta attaacctcc agcgcgggga tctcatgctg gagttcttcg 2280cccaccccca gacaagctgt gaccgtctcc gggagctgca tgtgtcagag gttttcaccg 2340tcatcaccga aacgcgcgag gcagcagatc aattcgcgcg cgaaggcgaa gcggcatgca 2400taatgtgcct gtcaaatgga cgaagcaggg attctgcaaa ccctatgcta ctccgtcaag 2460ccgtcaattg tctgattcgt taccaattat gacaacttga cggctacatc attcactttt 2520tcttcacaac cggcacggaa ctcgctcggg ctggccccgg tgcatttttt aaatacccgc 2580gagaaataga gttgatcgtc aaaaccaaca ttgcgaccga cggtggcgat aggcatccgg 2640gtggtgctca aaagcagctt cgcctggctg atacgttggt cctcgcgcca gcttaagacg 2700ctaatcccta actgctggcg gaaaagatgt gacagacgcg acggcgacaa gcaaacatgc 2760tgtgcgacgc tggcgatatc aaaattgctg tctgccaggt gatcgctgat gtactgacaa 2820gcctcgcgta cccgattatc catcggtgga tggagcgact cgttaatcgc ttccatgcgc 2880cgcagtaaca attgctcaag cagatttatc gccagcagct ccgaatagcg cccttcccct 2940tgcccggcgt taatgatttg cccaaacagg tcgctgaaat gcggctggtg cgcttcatcc 3000gggcgaaaga accccgtatt ggcaaatatt gacggccagt taagccattc atgccagtag 3060gcgcgcggac gaaagtaaac ccactggtga taccattcgc gagcctccgg atgacgaccg 3120tagtgatgaa tctctcctgg cgggaacagc aaaatatcac ccggtcggca aacaaattct 3180cgtccctgat ttttcaccac cccctgaccg cgaatggtga gattgagaat ataacctttc 3240attcccagcg gtcggtcgat aaaaaaatcg agataaccgt tggcctcaat cggcgttaaa 3300cccgccacca gatgggcatt aaacgagtat cccggcagca ggggatcatt ttgcgcttca 3360gccatacttt tcatactccc gccattcaga gaagaaacca attgtccata ttgcatcaga 3420cattgccgtc actgcgtctt ttactggctc ttctcgctaa ccaaaccggt aaccccgctt 3480attaaaagca ttctgtaaca aagcgggacc aaagccatga caaaaacgcg taacaaaagt 3540gtctataatc acggcagaaa agtccacatt gattatttgc acggcgtcac actttgctat 3600gccatagcat ttttatccat aagattagcg gatcctacct gacgcttttt atcgcaactc 3660tctactgttt ctccataccc gttttttggg ctagaaataa ttttgagctc gccaaggaga 3720tataatgcaa acggaacacg tcattttatt gaatgcacag ggagttccca cgggtacgct 3780ggaaaagtat 
gccgcacaca cggcagacac ccgcttacat ctcgcgttct ccagttggct 3840gtttaatgcc aaaggacaat tattagttac ccgccgcgca ctgagcaaaa aagcatggcc 3900tggcgtgtgg actaactcgg tttgtgggca cccacaactg ggagaaagca acgaagacgc 3960agtgatccgc cgttgccgtt atgagcttgg cgtggaaatt acgcctcctg aatctatcta 4020tcctgacttt cgctaccgcg ccaccgatcc gagtggcatt gtggaaaatg aagtgtgtcc 4080ggtatttgcc gcacgcacca ctagtgcgtt acagatcaat gatgatgaag tgatggatta 4140tcaatggtgt gatttagcag atgtattaca cggtattgat gccacgccgt gggcgttcag 4200tccgtggatg gtgatgcagg cgacaaatcg cgaagccaga aaacgattat ctgcatttac 4260ccagcttaaa taactttaag gaaggagcga agcatgcgtt gtagcgttag caccgaaaat 4320gtgtcgttta cggaaacgga aaccgaagct cgccgcagcg caaactatga accgaactcg 4380tgggattacg attacctcct tagcagcgat acggatgaaa gcattgaagt gtataaagac 4440aaagccaaga aactggaggc cgaagtccgt cgcgaaatca acaatgagaa agcggagttt 4500cttacgttac tggaattgat cgataacgtg caacggttag gcctcggcta ccgctttgag 4560agcgatatcc gtggtgcact ggaccgcttc gtatcgtctg gtggttttga cgccgttacg 4620aaaacgagcc tgcatggtac agcattgtct tttcggctgt tgcgccagca tggatttgaa 4680gtgtcacagg aggcattttc aggcttcaaa gaccagaacg ggaatttttt ggagaatttg 4740aaagaagata tcaaagcgat cttatctctg tatgaggcgt catttctcgc tctggaaggg 4800gaaaatattc tggacgaagc gaaagtgttc gcaatttccc atctgaaaga actttccgaa 4860gaaaagattg ggaaagaatt ggccgaacag gtgaaccatg cgctggaact gccactgcac 4920cgtcgcaccc aacgcctcga agcggtatgg tcgattgaag cgtatcgcaa aaaagaggat 4980gcaaatcagg ttctgctgga actggccatt ctcgactata acatgattca gtccgtctat 5040caacgtgatc tgcgcgaaac tagtcgttgg tggcgccgtg taggacttgc cactaaactg 5100cattttgcac gtgatcgtct gattgagtcg ttctattggg cggttggtgt agcgtttgag 5160ccgcagtatt ctgattgccg caatagtgtg gcgaaaatgt tctcctttgt gaccatcatt 5220gacgatattt acgacgtgta tggcaccctg gatgaactgg aattattcac cgatgcagta 5280gaacgctggg acgtcaacgc gatcaatgat ttgccggatt acatgaaact gtgttttctg 5340gccctgtata acaccattaa cgaaattgcc tatgacaacc tcaaagacaa gggtgaaaat 5400atcctgccct atctgactaa agcttgggct gatctgtgta acgcgttctt acaggaagcc 5460aaatggctct acaacaagag tacgcctact ttcgatgact 
actttggcaa cgcttggaaa 5520agctctagcg gccctttaca actggtgttc gcgtatttcg ccgttgttca gaatatcaag 5580aaagaagaga ttgagaacct ccaaaagtac cacgatacga tttcgcgtcc gtcacacatc 5640tttcgccttt gcaatgattt ggccagtgca tctgcagaga ttgcgcgcgg tgaaactgcc 5700aactccgtca gttgctacat gcgtaccaaa ggcatcagcg aggaactggc taccgagtcg 5760gtgatgaact taatcgatga aacctggaag aagatgaaca aagagaaact tggtggcagt 5820ctgtttgcta aaccgttcgt tgagacagcg attaatctgg cgcgtcaaag ccactgcacc 5880taccacaatg gcgatgccca cacatcccca gacgaattaa cccggaaacg tgtcctgagt 5940gtcatcaccg aacccattct gccgttcgaa cgctaagcct gctaacaaag cccgaaagga 6000agctgagttg gctgctgcca ccgctgagca ctagtgcggc cgctttgcgc attcacagtt 6060ctccgcaaga attgattggc tccaattctt ggagtggtga atccgttagc gaggtgccgc 6120cggcttccat tcaggtcgag gtggcccggc tccatgcacc gcgacgcaac gcggggaggc 6180agacaaggta tagggcggcg cctacaatcc atgccaaccc gttccatgtg ctcgccgagg 6240cggcataaat cgccgtgacg atcagcggtc cagtgatcga agttaggctg gtaagagccg 6300cgagcgatcc ttgaagctgt ccctgatggt cgtcatctac ctgcctggac agcatggcct 6360gcaacgcggg catcccgatg ccgccggaag cgagaagaat cataatgggg aaggccatcc 6420agcctcgcgt cgcgaacgcc agcaagacgt agcccagcgc gtcggccgcc atgccggcga 6480taatggcctg cttctcgccg aaacgtttgg tggcgggacc agtgacgaag gcttgagcga 6540gggcgtgcaa gattccgaat accgcaagcg acaggccgat catcgtcgcg ctccagcgaa 6600agcggtcctc gccgaaaatg acccagagcg ctgccggcac ctgtcctacg agttgcatga 6660taaagaagac agtcataagt gcggcgacga tagtcatgcc ccgcgcccac cggaaggagc 6720tgactgggtt gaaggctctc aagggcatcg gtcgacgctc tcccttatgc gactcctgca 6780ttaggaagca gcccagtagt aggttgaggc cgttgagcac cgccgccgca aggaatggtg 6840catgcaagga gatggcgccc aacagtcccc cggccacggg gcctgccacc atacccacgc 6900cgaaacaagc gctcatgagc ccgaagtggc gagcccgatc ttccccatcg gtgatgtcgg 6960cgatataggc gccagcaacc gcacctgtgg cgccggtgat gccggccacg atgcgtccgg 7020cgtagaggat ccacaggacg ggtgtggtcg ccatgatcgc gtagtcgata gtggctccaa 7080gtagcgaagc gagcaggact gggcggcggc caaagcggtc ggacagtgct ccgagaacgg 7140gtgcgcatag aaattgcatc aacgcatata gcgctagcag cacgccatag tgactggcga 7200tgctgtcgga 
atggacgata tcccgcaaga ggcccggcag taccggcata accaagccta 7260tgcctacagc atccagggtg acggtgccga ggatgacgat gagcgcattg ttagatttca 7320tacacggtgc ctgactgcgt tagcaattta actgtgataa actaccgcat taaagcttat 7380cgatgataag ctgtcaaaca tgagaattct tgaagacgaa agggcctcgt gatacgccta 7440tttttatagg ttaatgtcat gataataatg gtttcttaga cgtcaggtgg cacttttcgg 7500ggaaatgtgc gcgcccgcgt tcctgctggc gctgggcctg tttctggcgc tggacttccc 7560gctgttccgt cagcagcttt tcgcccacgg ccttgatgat cgcggcggcc ttggcctgca 7620tatcccgatt caacggcccc agggcgtcca gaacgggctt caggcgctcc cgaaggtctc 7680gggccgtctc ttgggcttga tcggccttct tgcgcatctc acgcgctcct gcggcggcct 7740gtagggcagg ctcatacccc tgccgaaccg cttttgtcag ccggtcggcc acggcttccg 7800gcgtctcaac gcgctttgag attcccagct tttcggccaa tccctgcggt gcataggcgc 7860gtggctcgac cgcttgcggg ctgatggtga cgtggcccac tggtggccgc tccagggcct 7920cgtagaacgc ctgaatgcgc gtgtgacgtg ccttgctgcc 7960178464DNAArtificial sequenceSynthetic 17ctcgatgccc cgttgcagcc ctagatcggc cacagcggcc gcaaacgtgg tctggtcgcg 60ggtcatctgc gctttgttgc cgatgaactc cttggccgac agcctgccgt cctgcgtcag 120cggcaccacg aacgcggtca tgtgcgggct ggtttcgtca cggtggatgc tggccgtcac 180gatgcgatcc gccccgtact tgtccgccag ccacttgtgc gccttctcga agaacgccgc 240ctgctgttct tggctggccg acttccacca ttccgggctg gccgtcatga cgtactcgac 300cgccaacaca gcgtccttgc gccgcttctc tggcagcaac tcgcgcagtc ggcccatcgc 360ttcatcggtg ctgctggccg cccagtgctc gttctctggc gtcctgctgg cgtcagcgtt 420gggcgtctcg cgctcgcggt aggcgtgctt gagactggcc gccacgttgc ccattttcgc 480cagcttcttg catcgcatga tcgcgtatgc cgccatgcct gcccctccct tttggtgtcc 540aaccggctcg acgggggcag cgcaaggcgg tgcctccggc gggccactca atgcttgagt 600atactcacta gactttgctt cgcaaagtcg tgaccgccta cggcggctgc ggcgccctac 660gggcttgctc tccgggcttc gccctgcgcg gtcgctgcgc tcccttgcca gcccgtggat 720atgtggacga tggccgcgag cggccaccgg ctggctcgct tcgctcggcc cgtggacaac 780cctgctggac aagctgatgg acaggctgcg cctgcccacg agcttgacca cagggattgc 840ccaccggcta cccagccttc gaccacatac ccaccggctc caactgcgcg gcctgcggcc 900ttgccccatc aattttttta attttctctg gggaaaagcc 
tccggcctgc ggcctgcgcg 960cttcgcttgc cggttggaca ccaagtggaa ggcgggtcaa ggctcgcgca gcgaccgcgc 1020agcggcttgg ccttgacgcg cctggaacga cccaagccta tgcgagtggg ggcagtcgaa 1080ggcgaagccc gcccgcctgc cccccgagcc tcacggcggc gagtgcgggg gttccaaggg 1140ggcagcgcca ccttgggcaa ggccgaaggc cgcgcagtcg atcaacaagc cccggagggg 1200ccactttttg ccggaggggg agccgcgccg aaggcgtggg ggaaccccgc aggggtgccc 1260ttctttgggc accaaagaac tagatatagg gcgaaatgcg aaagacttaa aaatcaacaa 1320cttaaaaaag gggggtacgc aacagctcat tgcggcaccc cccgcaatag ctcattgcgt 1380aggttaaaga aaatctgtaa ttgactgcca cttttacgca acgcataatt gttgtcgcgc 1440tgccgaaaag ttgcagctga ttgcgcatgg tgccgcaacc gtgcggcacc ctaccgcatg 1500gagataagca tggccacgca gtccagagaa atcggcattc aagccaagaa caagcccggt 1560cactgggtgc aaacggaacg caaagcgcat gaggcgtggg ccgggcttat tgcgaggaaa 1620cccacggcgg caatgctgct gcatcacctc gtggcgcaga tgggccacca gaacgccgtg 1680gtggtcagcc agaagacact ttccaagctc atcggacgtt ctttgcggac ggtccaatac 1740gcagtcaagg acttggtggc cgagcgctgg atctccgtcg tgaagctcaa cggccccggc 1800accgtgtcgg cctacgtggt caatgaccgc gtggcgtggg gccagccccg cgaccagttg 1860cgcctgtcgg tgttcagtgc cgccgtggtg gttgatcacg acgaccagga cgaatcgctg 1920ttggggcatg gcgacctgcg ccgcatcccg accctgtatc cgggcgagca gcaactaccg 1980accggccccg gcgaggagcc gcccagccag cccggcattc cgggcatgga accagacctg 2040ccagccttga ccgaaacgga ggaatgggaa cggcgcgggc agcagcgcct gccgatgccc 2100gatgagccgt gttttctgga cgatggcgag ccgttggagc cgccgacacg ggtcacgctg 2160ccgcgccggt agcacttggg ttgcgcagca acccgtaagt gcgctgttcc agactatcgg 2220ctgtagccgc ctctagatta attaacctcc agcgcgggga tctcatgctg gagttcttcg 2280cccaccccca gacaagctgt gaccgtctcc gggagctgca tgtgtcagag gttttcaccg 2340tcatcaccga aacgcgcgag gcagcagatc aattcgcgcg cgaaggcgaa gcggcatgca 2400taatgtgcct gtcaaatgga cgaagcaggg attctgcaaa ccctatgcta ctccgtcaag 2460ccgtcaattg tctgattcgt taccaattat gacaacttga cggctacatc attcactttt 2520tcttcacaac cggcacggaa ctcgctcggg ctggccccgg tgcatttttt aaatacccgc 2580gagaaataga gttgatcgtc aaaaccaaca ttgcgaccga cggtggcgat aggcatccgg 2640gtggtgctca 
aaagcagctt cgcctggctg atacgttggt cctcgcgcca gcttaagacg 2700ctaatcccta actgctggcg gaaaagatgt gacagacgcg acggcgacaa gcaaacatgc 2760tgtgcgacgc tggcgatatc aaaattgctg tctgccaggt gatcgctgat gtactgacaa 2820gcctcgcgta cccgattatc catcggtgga tggagcgact cgttaatcgc ttccatgcgc 2880cgcagtaaca attgctcaag cagatttatc gccagcagct ccgaatagcg cccttcccct 2940tgcccggcgt taatgatttg cccaaacagg tcgctgaaat gcggctggtg cgcttcatcc 3000gggcgaaaga accccgtatt ggcaaatatt gacggccagt taagccattc atgccagtag 3060gcgcgcggac gaaagtaaac ccactggtga taccattcgc gagcctccgg atgacgaccg 3120tagtgatgaa tctctcctgg cgggaacagc aaaatatcac ccggtcggca aacaaattct 3180cgtccctgat ttttcaccac cccctgaccg cgaatggtga gattgagaat ataacctttc 3240attcccagcg gtcggtcgat aaaaaaatcg agataaccgt tggcctcaat cggcgttaaa 3300cccgccacca gatgggcatt aaacgagtat cccggcagca ggggatcatt ttgcgcttca 3360gccatacttt tcatactccc gccattcaga gaagaaacca attgtccata ttgcatcaga 3420cattgccgtc actgcgtctt ttactggctc ttctcgctaa ccaaaccggt aaccccgctt 3480attaaaagca ttctgtaaca aagcgggacc aaagccatga caaaaacgcg taacaaaagt 3540gtctataatc acggcagaaa agtccacatt gattatttgc acggcgtcac actttgctat 3600gccatagcat ttttatccat aagattagcg gatcctacct gacgcttttt atcgcaactc 3660tctactgttt ctccataccc gttttttggg ctagaaataa ttttgagctc gccaaggaga 3720tataatggtc acgcgcgcgg agcgcaagcg ccagcacatc aaccacgcgc tctccatcgg 3780ccagaagcgc gaaaccggcc tggacgacat cacgtttgtg catgtctcgc tgccggacct 3840ggccctcgaa caggtcgaca tctcgacgaa gattggcgag ctgagctcct cgtcgccgat 3900cttcatcaac gcgatgaccg gcggtggtgg caagctgacc tacgagatca acaagtccct 3960ggcgcgcgcg gccagccagg ccggcatccc gctggcggtc ggcagccaga tgtcggccct 4020gaaggacccc agcgagcgcc tgtcgtacga gattgtccgc aaggaaaacc cgaacggcct 4080gatcttcgcc aatctgggct cggaagccac cgcggcgcag gccaaagaag cggtggagat 4140gatcggcgcc aacgccctgc agatccacct gaacgtgatc caagagatcg tgatgcccga 4200gggcgaccgt tccttctccg gcgccctcaa gcgcatcgag caaatctgca gccgcgtgtc 4260ggtgcccgtc atcgtcaagg aagtgggctt cggcatgtcg aaggccagcg ccggcaagct 4320gtacgaagcc ggcgcggccg ccgtggacat cggcggctac 
ggcggcacga acttcagcaa 4380gattgagaat ctgcgccgcc agcggcagat cagcttcttc aactcgtggg gcatcagcac 4440ggccgcgtcg ctggcggaga tccggtccga gttcccggcc tcgaccatga tcgcgtccgg 4500tggcctccaa gacgccctgg acgtcgccaa ggccatcgcc ctgggcgcga gctgcaccgg 4560catggccggt cacttcctga aggccctgac cgatagcggc gaggaaggcc tgctggaaga 4620gatccagctg atcctggaag aactgaagct gatcatgacg gtgctgggcg cccgtaccat 4680cgcggatctg caaaaggcgc cgctcgtgat caagggcgaa acccatcact ggctcaccga 4740gcggggcgtg aacaccagct cgtattcggt gcgctgactt taaggaagga gcgaagcatg 4800cgttgtagcg ttagcaccga aaatgtgtcg tttacggaaa cggaaaccga agctcgccgc 4860agcgcaaact atgaaccgaa ctcgtgggat tacgattacc tccttagcag cgatacggat 4920gaaagcattg aagtgtataa agacaaagcc aagaaactgg aggccgaagt ccgtcgcgaa 4980atcaacaatg agaaagcgga gtttcttacg ttactggaat tgatcgataa cgtgcaacgg 5040ttaggcctcg gctaccgctt tgagagcgat atccgtggtg cactggaccg cttcgtatcg 5100tctggtggtt ttgacgccgt tacgaaaacg agcctgcatg gtacagcatt gtcttttcgg 5160ctgttgcgcc agcatggatt tgaagtgtca caggaggcat tttcaggctt caaagaccag 5220aacgggaatt ttttggagaa tttgaaagaa gatatcaaag cgatcttatc tctgtatgag 5280gcgtcatttc tcgctctgga aggggaaaat attctggacg aagcgaaagt gttcgcaatt 5340tcccatctga aagaactttc cgaagaaaag attgggaaag aattggccga acaggtgaac 5400catgcgctgg aactgccact gcaccgtcgc acccaacgcc tcgaagcggt atggtcgatt 5460gaagcgtatc gcaaaaaaga ggatgcaaat caggttctgc tggaactggc cattctcgac 5520tataacatga ttcagtccgt ctatcaacgt gatctgcgcg aaactagtcg ttggtggcgc 5580cgtgtaggac ttgccactaa actgcatttt gcacgtgatc gtctgattga gtcgttctat 5640tgggcggttg gtgtagcgtt tgagccgcag tattctgatt gccgcaatag tgtggcgaaa 5700atgttctcct ttgtgaccat cattgacgat atttacgacg tgtatggcac cctggatgaa 5760ctggaattat tcaccgatgc agtagaacgc tgggacgtca acgcgatcaa tgatttgccg 5820gattacatga aactgtgttt tctggccctg tataacacca ttaacgaaat tgcctatgac 5880aacctcaaag acaagggtga aaatatcctg ccctatctga ctaaagcttg ggctgatctg 5940tgtaacgcgt tcttacagga agccaaatgg ctctacaaca agagtacgcc tactttcgat 6000gactactttg gcaacgcttg gaaaagctct agcggccctt tacaactggt gttcgcgtat 6060ttcgccgttg 
ttcagaatat caagaaagaa gagattgaga acctccaaaa gtaccacgat 6120acgatttcgc gtccgtcaca catctttcgc ctttgcaatg atttggccag tgcatctgca 6180gagattgcgc gcggtgaaac tgccaactcc gtcagttgct acatgcgtac caaaggcatc 6240agcgaggaac tggctaccga gtcggtgatg aacttaatcg atgaaacctg gaagaagatg 6300aacaaagaga aacttggtgg cagtctgttt gctaaaccgt tcgttgagac agcgattaat 6360ctggcgcgtc aaagccactg cacctaccac aatggcgatg cccacacatc cccagacgaa 6420ttaacccgga aacgtgtcct gagtgtcatc accgaaccca ttctgccgtt cgaacgctaa 6480gcctgctaac aaagcccgaa aggaagctga gttggctgct gccaccgctg agcactagtg 6540cggccgcttt gcgcattcac agttctccgc aagaattgat tggctccaat tcttggagtg 6600gtgaatccgt tagcgaggtg ccgccggctt ccattcaggt cgaggtggcc cggctccatg 6660caccgcgacg caacgcgggg aggcagacaa ggtatagggc ggcgcctaca atccatgcca 6720acccgttcca tgtgctcgcc gaggcggcat aaatcgccgt gacgatcagc ggtccagtga 6780tcgaagttag gctggtaaga gccgcgagcg atccttgaag ctgtccctga tggtcgtcat 6840ctacctgcct ggacagcatg gcctgcaacg cgggcatccc gatgccgccg gaagcgagaa 6900gaatcataat ggggaaggcc atccagcctc gcgtcgcgaa cgccagcaag acgtagccca 6960gcgcgtcggc cgccatgccg gcgataatgg cctgcttctc gccgaaacgt ttggtggcgg 7020gaccagtgac gaaggcttga gcgagggcgt gcaagattcc gaataccgca agcgacaggc 7080cgatcatcgt cgcgctccag cgaaagcggt cctcgccgaa aatgacccag agcgctgccg 7140gcacctgtcc tacgagttgc atgataaaga agacagtcat aagtgcggcg acgatagtca 7200tgccccgcgc ccaccggaag gagctgactg ggttgaaggc tctcaagggc atcggtcgac 7260gctctccctt atgcgactcc tgcattagga agcagcccag tagtaggttg aggccgttga 7320gcaccgccgc cgcaaggaat ggtgcatgca aggagatggc gcccaacagt cccccggcca 7380cggggcctgc caccataccc

acgccgaaac aagcgctcat gagcccgaag tggcgagccc 7440gatcttcccc atcggtgatg tcggcgatat aggcgccagc aaccgcacct gtggcgccgg 7500tgatgccggc cacgatgcgt ccggcgtaga ggatccacag gacgggtgtg gtcgccatga 7560tcgcgtagtc gatagtggct ccaagtagcg aagcgagcag gactgggcgg cggccaaagc 7620ggtcggacag tgctccgaga acgggtgcgc atagaaattg catcaacgca tatagcgcta 7680gcagcacgcc atagtgactg gcgatgctgt cggaatggac gatatcccgc aagaggcccg 7740gcagtaccgg cataaccaag cctatgccta cagcatccag ggtgacggtg ccgaggatga 7800cgatgagcgc attgttagat ttcatacacg gtgcctgact gcgttagcaa tttaactgtg 7860ataaactacc gcattaaagc ttatcgatga taagctgtca aacatgagaa ttcttgaaga 7920cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat aatggtttct 7980tagacgtcag gtggcacttt tcggggaaat gtgcgcgccc gcgttcctgc tggcgctggg 8040cctgtttctg gcgctggact tcccgctgtt ccgtcagcag cttttcgccc acggccttga 8100tgatcgcggc ggccttggcc tgcatatccc gattcaacgg ccccagggcg tccagaacgg 8160gcttcaggcg ctcccgaagg tctcgggccg tctcttgggc ttgatcggcc ttcttgcgca 8220tctcacgcgc tcctgcggcg gcctgtaggg caggctcata cccctgccga accgcttttg 8280tcagccggtc ggccacggct tccggcgtct caacgcgctt tgagattccc agcttttcgg 8340ccaatccctg cggtgcatag gcgcgtggct cgaccgcttg cgggctgatg gtgacgtggc 8400ccactggtgg ccgctccagg gcctcgtaga acgcctgaat gcgcgtgtga cgtgccttgc 8460tgcc 8464188278DNAArtificial sequenceSynthetic 18ctcgatgccc cgttgcagcc ctagatcggc cacagcggcc gcaaacgtgg tctggtcgcg 60ggtcatctgc gctttgttgc cgatgaactc cttggccgac agcctgccgt cctgcgtcag 120cggcaccacg aacgcggtca tgtgcgggct ggtttcgtca cggtggatgc tggccgtcac 180gatgcgatcc gccccgtact tgtccgccag ccacttgtgc gccttctcga agaacgccgc 240ctgctgttct tggctggccg acttccacca ttccgggctg gccgtcatga cgtactcgac 300cgccaacaca gcgtccttgc gccgcttctc tggcagcaac tcgcgcagtc ggcccatcgc 360ttcatcggtg ctgctggccg cccagtgctc gttctctggc gtcctgctgg cgtcagcgtt 420gggcgtctcg cgctcgcggt aggcgtgctt gagactggcc gccacgttgc ccattttcgc 480cagcttcttg catcgcatga tcgcgtatgc cgccatgcct gcccctccct tttggtgtcc 540aaccggctcg acgggggcag cgcaaggcgg tgcctccggc gggccactca atgcttgagt 600atactcacta gactttgctt 
cgcaaagtcg tgaccgccta cggcggctgc ggcgccctac 660gggcttgctc tccgggcttc gccctgcgcg gtcgctgcgc tcccttgcca gcccgtggat 720atgtggacga tggccgcgag cggccaccgg ctggctcgct tcgctcggcc cgtggacaac 780cctgctggac aagctgatgg acaggctgcg cctgcccacg agcttgacca cagggattgc 840ccaccggcta cccagccttc gaccacatac ccaccggctc caactgcgcg gcctgcggcc 900ttgccccatc aattttttta attttctctg gggaaaagcc tccggcctgc ggcctgcgcg 960cttcgcttgc cggttggaca ccaagtggaa ggcgggtcaa ggctcgcgca gcgaccgcgc 1020agcggcttgg ccttgacgcg cctggaacga cccaagccta tgcgagtggg ggcagtcgaa 1080ggcgaagccc gcccgcctgc cccccgagcc tcacggcggc gagtgcgggg gttccaaggg 1140ggcagcgcca ccttgggcaa ggccgaaggc cgcgcagtcg atcaacaagc cccggagggg 1200ccactttttg ccggaggggg agccgcgccg aaggcgtggg ggaaccccgc aggggtgccc 1260ttctttgggc accaaagaac tagatatagg gcgaaatgcg aaagacttaa aaatcaacaa 1320cttaaaaaag gggggtacgc aacagctcat tgcggcaccc cccgcaatag ctcattgcgt 1380aggttaaaga aaatctgtaa ttgactgcca cttttacgca acgcataatt gttgtcgcgc 1440tgccgaaaag ttgcagctga ttgcgcatgg tgccgcaacc gtgcggcacc ctaccgcatg 1500gagataagca tggccacgca gtccagagaa atcggcattc aagccaagaa caagcccggt 1560cactgggtgc aaacggaacg caaagcgcat gaggcgtggg ccgggcttat tgcgaggaaa 1620cccacggcgg caatgctgct gcatcacctc gtggcgcaga tgggccacca gaacgccgtg 1680gtggtcagcc agaagacact ttccaagctc atcggacgtt ctttgcggac ggtccaatac 1740gcagtcaagg acttggtggc cgagcgctgg atctccgtcg tgaagctcaa cggccccggc 1800accgtgtcgg cctacgtggt caatgaccgc gtggcgtggg gccagccccg cgaccagttg 1860cgcctgtcgg tgttcagtgc cgccgtggtg gttgatcacg acgaccagga cgaatcgctg 1920ttggggcatg gcgacctgcg ccgcatcccg accctgtatc cgggcgagca gcaactaccg 1980accggccccg gcgaggagcc gcccagccag cccggcattc cgggcatgga accagacctg 2040ccagccttga ccgaaacgga ggaatgggaa cggcgcgggc agcagcgcct gccgatgccc 2100gatgagccgt gttttctgga cgatggcgag ccgttggagc cgccgacacg ggtcacgctg 2160ccgcgccggt agcacttggg ttgcgcagca acccgtaagt gcgctgttcc agactatcgg 2220ctgtagccgc ctctagatta attaacctcc agcgcgggga tctcatgctg gagttcttcg 2280cccaccccca gacaagctgt gaccgtctcc gggagctgca tgtgtcagag gttttcaccg 
2340tcatcaccga aacgcgcgag gcagcagatc aattcgcgcg cgaaggcgaa gcggcatgca 2400taatgtgcct gtcaaatgga cgaagcaggg attctgcaaa ccctatgcta ctccgtcaag 2460ccgtcaattg tctgattcgt taccaattat gacaacttga cggctacatc attcactttt 2520tcttcacaac cggcacggaa ctcgctcggg ctggccccgg tgcatttttt aaatacccgc 2580gagaaataga gttgatcgtc aaaaccaaca ttgcgaccga cggtggcgat aggcatccgg 2640gtggtgctca aaagcagctt cgcctggctg atacgttggt cctcgcgcca gcttaagacg 2700ctaatcccta actgctggcg gaaaagatgt gacagacgcg acggcgacaa gcaaacatgc 2760tgtgcgacgc tggcgatatc aaaattgctg tctgccaggt gatcgctgat gtactgacaa 2820gcctcgcgta cccgattatc catcggtgga tggagcgact cgttaatcgc ttccatgcgc 2880cgcagtaaca attgctcaag cagatttatc gccagcagct ccgaatagcg cccttcccct 2940tgcccggcgt taatgatttg cccaaacagg tcgctgaaat gcggctggtg cgcttcatcc 3000gggcgaaaga accccgtatt ggcaaatatt gacggccagt taagccattc atgccagtag 3060gcgcgcggac gaaagtaaac ccactggtga taccattcgc gagcctccgg atgacgaccg 3120tagtgatgaa tctctcctgg cgggaacagc aaaatatcac ccggtcggca aacaaattct 3180cgtccctgat ttttcaccac cccctgaccg cgaatggtga gattgagaat ataacctttc 3240attcccagcg gtcggtcgat aaaaaaatcg agataaccgt tggcctcaat cggcgttaaa 3300cccgccacca gatgggcatt aaacgagtat cccggcagca ggggatcatt ttgcgcttca 3360gccatacttt tcatactccc gccattcaga gaagaaacca attgtccata ttgcatcaga 3420cattgccgtc actgcgtctt ttactggctc ttctcgctaa ccaaaccggt aaccccgctt 3480attaaaagca ttctgtaaca aagcgggacc aaagccatga caaaaacgcg taacaaaagt 3540gtctataatc acggcagaaa agtccacatt gattatttgc acggcgtcac actttgctat 3600gccatagcat ttttatccat aagattagcg gatcctacct gacgcttttt atcgcaactc 3660tctactgttt ctccataccc gttttttggg ctagaaataa ttttgagctc gccaaggaga 3720tataatgact gccgacaaca atagtatgcc ccatggtgca gtatctagtt acgccaaatt 3780agtgcaaaac caaacacctg aagacatttt ggaagagttt cctgaaatta ttccattaca 3840acaaagacct aatacccgat ctagtgagac gtcaaatgac gaaagcggag aaacatgttt 3900ttctggtcat gatgaggagc aaattaagtt aatgaatgaa aattgtattg ttttggattg 3960ggacgataat gctattggtg ccggtaccaa gaaagtttgt catttaatgg aaaatattga 4020aaagggttta ctacatcgtg cattctccgt 
ctttattttc aatgaacaag gtgaattact 4080tttacaacaa agagccactg aaaaaataac tttccctgat ctttggacta acacatgctg 4140ctctcatcca ctatgtattg atgacgaatt aggtttgaag ggtaagctag acgataagat 4200taagggcgct attactgcgg cggtgagaaa actagatcat gaattaggta ttccagaaga 4260tgaaactaag acaaggggta agtttcactt tttaaacaga atccattaca tggcaccaag 4320caatgaacca tggggtgaac atgaaattga ttacatccta ttttataaga tcaacgctaa 4380agaaaacttg actgtcaacc caaacgtcaa tgaagttaga gacttcaaat gggtttcacc 4440aaatgatttg aaaactatgt ttgctgaccc aagttacaag tttacgcctt ggtttaagat 4500tatttgcgag aattacttat tcaactggtg ggagcaatta gatgaccttt ctgaagtgga 4560aaatgacagg caaattcata gaatgctata actttaagga aggagcgaag catgcgttgt 4620agcgttagca ccgaaaatgt gtcgtttacg gaaacggaaa ccgaagctcg ccgcagcgca 4680aactatgaac cgaactcgtg ggattacgat tacctcctta gcagcgatac ggatgaaagc 4740attgaagtgt ataaagacaa agccaagaaa ctggaggccg aagtccgtcg cgaaatcaac 4800aatgagaaag cggagtttct tacgttactg gaattgatcg ataacgtgca acggttaggc 4860ctcggctacc gctttgagag cgatatccgt ggtgcactgg accgcttcgt atcgtctggt 4920ggttttgacg ccgttacgaa aacgagcctg catggtacag cattgtcttt tcggctgttg 4980cgccagcatg gatttgaagt gtcacaggag gcattttcag gcttcaaaga ccagaacggg 5040aattttttgg agaatttgaa agaagatatc aaagcgatct tatctctgta tgaggcgtca 5100tttctcgctc tggaagggga aaatattctg gacgaagcga aagtgttcgc aatttcccat 5160ctgaaagaac tttccgaaga aaagattggg aaagaattgg ccgaacaggt gaaccatgcg 5220ctggaactgc cactgcaccg tcgcacccaa cgcctcgaag cggtatggtc gattgaagcg 5280tatcgcaaaa aagaggatgc aaatcaggtt ctgctggaac tggccattct cgactataac 5340atgattcagt ccgtctatca acgtgatctg cgcgaaacta gtcgttggtg gcgccgtgta 5400ggacttgcca ctaaactgca ttttgcacgt gatcgtctga ttgagtcgtt ctattgggcg 5460gttggtgtag cgtttgagcc gcagtattct gattgccgca atagtgtggc gaaaatgttc 5520tcctttgtga ccatcattga cgatatttac gacgtgtatg gcaccctgga tgaactggaa 5580ttattcaccg atgcagtaga acgctgggac gtcaacgcga tcaatgattt gccggattac 5640atgaaactgt gttttctggc cctgtataac accattaacg aaattgccta tgacaacctc 5700aaagacaagg gtgaaaatat cctgccctat ctgactaaag cttgggctga tctgtgtaac 
5760gcgttcttac aggaagccaa atggctctac aacaagagta cgcctacttt cgatgactac 5820tttggcaacg cttggaaaag ctctagcggc cctttacaac tggtgttcgc gtatttcgcc 5880gttgttcaga atatcaagaa agaagagatt gagaacctcc aaaagtacca cgatacgatt 5940tcgcgtccgt cacacatctt tcgcctttgc aatgatttgg ccagtgcatc tgcagagatt 6000gcgcgcggtg aaactgccaa ctccgtcagt tgctacatgc gtaccaaagg catcagcgag 6060gaactggcta ccgagtcggt gatgaactta atcgatgaaa cctggaagaa gatgaacaaa 6120gagaaacttg gtggcagtct gtttgctaaa ccgttcgttg agacagcgat taatctggcg 6180cgtcaaagcc actgcaccta ccacaatggc gatgcccaca catccccaga cgaattaacc 6240cggaaacgtg tcctgagtgt catcaccgaa cccattctgc cgttcgaacg ctaagcctgc 6300taacaaagcc cgaaaggaag ctgagttggc tgctgccacc gctgagcact agtgcggccg 6360ctttgcgcat tcacagttct ccgcaagaat tgattggctc caattcttgg agtggtgaat 6420ccgttagcga ggtgccgccg gcttccattc aggtcgaggt ggcccggctc catgcaccgc 6480gacgcaacgc ggggaggcag acaaggtata gggcggcgcc tacaatccat gccaacccgt 6540tccatgtgct cgccgaggcg gcataaatcg ccgtgacgat cagcggtcca gtgatcgaag 6600ttaggctggt aagagccgcg agcgatcctt gaagctgtcc ctgatggtcg tcatctacct 6660gcctggacag catggcctgc aacgcgggca tcccgatgcc gccggaagcg agaagaatca 6720taatggggaa ggccatccag cctcgcgtcg cgaacgccag caagacgtag cccagcgcgt 6780cggccgccat gccggcgata atggcctgct tctcgccgaa acgtttggtg gcgggaccag 6840tgacgaaggc ttgagcgagg gcgtgcaaga ttccgaatac cgcaagcgac aggccgatca 6900tcgtcgcgct ccagcgaaag cggtcctcgc cgaaaatgac ccagagcgct gccggcacct 6960gtcctacgag ttgcatgata aagaagacag tcataagtgc ggcgacgata gtcatgcccc 7020gcgcccaccg gaaggagctg actgggttga aggctctcaa gggcatcggt cgacgctctc 7080ccttatgcga ctcctgcatt aggaagcagc ccagtagtag gttgaggccg ttgagcaccg 7140ccgccgcaag gaatggtgca tgcaaggaga tggcgcccaa cagtcccccg gccacggggc 7200ctgccaccat acccacgccg aaacaagcgc tcatgagccc gaagtggcga gcccgatctt 7260ccccatcggt gatgtcggcg atataggcgc cagcaaccgc acctgtggcg ccggtgatgc 7320cggccacgat gcgtccggcg tagaggatcc acaggacggg tgtggtcgcc atgatcgcgt 7380agtcgatagt ggctccaagt agcgaagcga gcaggactgg gcggcggcca aagcggtcgg 7440acagtgctcc gagaacgggt gcgcatagaa 
attgcatcaa cgcatatagc gctagcagca 7500cgccatagtg actggcgatg ctgtcggaat ggacgatatc ccgcaagagg cccggcagta 7560ccggcataac caagcctatg cctacagcat ccagggtgac ggtgccgagg atgacgatga 7620gcgcattgtt agatttcata cacggtgcct gactgcgtta gcaatttaac tgtgataaac 7680taccgcatta aagcttatcg atgataagct gtcaaacatg agaattcttg aagacgaaag 7740ggcctcgtga tacgcctatt tttataggtt aatgtcatga taataatggt ttcttagacg 7800tcaggtggca cttttcgggg aaatgtgcgc gcccgcgttc ctgctggcgc tgggcctgtt 7860tctggcgctg gacttcccgc tgttccgtca gcagcttttc gcccacggcc ttgatgatcg 7920cggcggcctt ggcctgcata tcccgattca acggccccag ggcgtccaga acgggcttca 7980ggcgctcccg aaggtctcgg gccgtctctt gggcttgatc ggccttcttg cgcatctcac 8040gcgctcctgc ggcggcctgt agggcaggct catacccctg ccgaaccgct tttgtcagcc 8100ggtcggccac ggcttccggc gtctcaacgc gctttgagat tcccagcttt tcggccaatc 8160cctgcggtgc ataggcgcgt ggctcgaccg cttgcgggct gatggtgacg tggcccactg 8220gtggccgctc cagggcctcg tagaacgcct gaatgcgcgt gtgacgtgcc ttgctgcc 8278198455DNAArtificial sequenceSynthetic 19ctcgatgccc cgttgcagcc ctagatcggc cacagcggcc gcaaacgtgg tctggtcgcg 60ggtcatctgc gctttgttgc cgatgaactc cttggccgac agcctgccgt cctgcgtcag 120cggcaccacg aacgcggtca tgtgcgggct ggtttcgtca cggtggatgc tggccgtcac 180gatgcgatcc gccccgtact tgtccgccag ccacttgtgc gccttctcga agaacgccgc 240ctgctgttct tggctggccg acttccacca ttccgggctg gccgtcatga cgtactcgac 300cgccaacaca gcgtccttgc gccgcttctc tggcagcaac tcgcgcagtc ggcccatcgc 360ttcatcggtg ctgctggccg cccagtgctc gttctctggc gtcctgctgg cgtcagcgtt 420gggcgtctcg cgctcgcggt aggcgtgctt gagactggcc gccacgttgc ccattttcgc 480cagcttcttg catcgcatga tcgcgtatgc cgccatgcct gcccctccct tttggtgtcc 540aaccggctcg acgggggcag cgcaaggcgg tgcctccggc gggccactca atgcttgagt 600atactcacta gactttgctt cgcaaagtcg tgaccgccta cggcggctgc ggcgccctac 660gggcttgctc tccgggcttc gccctgcgcg gtcgctgcgc tcccttgcca gcccgtggat 720atgtggacga tggccgcgag cggccaccgg ctggctcgct tcgctcggcc cgtggacaac 780cctgctggac aagctgatgg acaggctgcg cctgcccacg agcttgacca cagggattgc 840ccaccggcta cccagccttc gaccacatac ccaccggctc 
caactgcgcg gcctgcggcc 900ttgccccatc aattttttta attttctctg gggaaaagcc tccggcctgc ggcctgcgcg 960cttcgcttgc cggttggaca ccaagtggaa ggcgggtcaa ggctcgcgca gcgaccgcgc 1020agcggcttgg ccttgacgcg cctggaacga cccaagccta tgcgagtggg ggcagtcgaa 1080ggcgaagccc gcccgcctgc cccccgagcc tcacggcggc gagtgcgggg gttccaaggg 1140ggcagcgcca ccttgggcaa ggccgaaggc cgcgcagtcg atcaacaagc cccggagggg 1200ccactttttg ccggaggggg agccgcgccg aaggcgtggg ggaaccccgc aggggtgccc 1260ttctttgggc accaaagaac tagatatagg gcgaaatgcg aaagacttaa aaatcaacaa 1320cttaaaaaag gggggtacgc aacagctcat tgcggcaccc cccgcaatag ctcattgcgt 1380aggttaaaga aaatctgtaa ttgactgcca cttttacgca acgcataatt gttgtcgcgc 1440tgccgaaaag ttgcagctga ttgcgcatgg tgccgcaacc gtgcggcacc ctaccgcatg 1500gagataagca tggccacgca gtccagagaa atcggcattc aagccaagaa caagcccggt 1560cactgggtgc aaacggaacg caaagcgcat gaggcgtggg ccgggcttat tgcgaggaaa 1620cccacggcgg caatgctgct gcatcacctc gtggcgcaga tgggccacca gaacgccgtg 1680gtggtcagcc agaagacact ttccaagctc atcggacgtt ctttgcggac ggtccaatac 1740gcagtcaagg acttggtggc cgagcgctgg atctccgtcg tgaagctcaa cggccccggc 1800accgtgtcgg cctacgtggt caatgaccgc gtggcgtggg gccagccccg cgaccagttg 1860cgcctgtcgg tgttcagtgc cgccgtggtg gttgatcacg acgaccagga cgaatcgctg 1920ttggggcatg gcgacctgcg ccgcatcccg accctgtatc cgggcgagca gcaactaccg 1980accggccccg gcgaggagcc gcccagccag cccggcattc cgggcatgga accagacctg 2040ccagccttga ccgaaacgga ggaatgggaa cggcgcgggc agcagcgcct gccgatgccc 2100gatgagccgt gttttctgga cgatggcgag ccgttggagc cgccgacacg ggtcacgctg 2160ccgcgccggt agcacttggg ttgcgcagca acccgtaagt gcgctgttcc agactatcgg 2220ctgtagccgc ctctagatta attaacctcc agcgcgggga tctcatgctg gagttcttcg 2280cccaccccca gacaagctgt gaccgtctcc gggagctgca tgtgtcagag gttttcaccg 2340tcatcaccga aacgcgcgag gcagcagatc aattcgcgcg cgaaggcgaa gcggcatgca 2400taatgtgcct gtcaaatgga cgaagcaggg attctgcaaa ccctatgcta ctccgtcaag 2460ccgtcaattg tctgattcgt taccaattat gacaacttga cggctacatc attcactttt 2520tcttcacaac cggcacggaa ctcgctcggg ctggccccgg tgcatttttt aaatacccgc 2580gagaaataga 
gttgatcgtc aaaaccaaca ttgcgaccga cggtggcgat aggcatccgg 2640gtggtgctca aaagcagctt cgcctggctg atacgttggt cctcgcgcca gcttaagacg 2700ctaatcccta actgctggcg gaaaagatgt gacagacgcg acggcgacaa gcaaacatgc 2760tgtgcgacgc tggcgatatc aaaattgctg tctgccaggt gatcgctgat gtactgacaa 2820gcctcgcgta cccgattatc catcggtgga tggagcgact cgttaatcgc ttccatgcgc 2880cgcagtaaca attgctcaag cagatttatc gccagcagct ccgaatagcg cccttcccct 2940tgcccggcgt taatgatttg cccaaacagg tcgctgaaat gcggctggtg cgcttcatcc 3000gggcgaaaga accccgtatt ggcaaatatt gacggccagt taagccattc atgccagtag 3060gcgcgcggac gaaagtaaac ccactggtga taccattcgc gagcctccgg atgacgaccg 3120tagtgatgaa tctctcctgg cgggaacagc aaaatatcac ccggtcggca aacaaattct 3180cgtccctgat ttttcaccac cccctgaccg cgaatggtga gattgagaat ataacctttc 3240attcccagcg gtcggtcgat aaaaaaatcg agataaccgt tggcctcaat cggcgttaaa 3300cccgccacca gatgggcatt aaacgagtat cccggcagca ggggatcatt ttgcgcttca 3360gccatacttt tcatactccc gccattcaga gaagaaacca attgtccata ttgcatcaga 3420cattgccgtc actgcgtctt ttactggctc ttctcgctaa ccaaaccggt aaccccgctt 3480attaaaagca ttctgtaaca aagcgggacc aaagccatga caaaaacgcg taacaaaagt 3540gtctataatc acggcagaaa agtccacatt gattatttgc acggcgtcac actttgctat 3600gccatagcat ttttatccat aagattagcg gatcctacct gacgcttttt atcgcaactc 3660tctactgttt ctccataccc gttttttggg ctagaaataa ttttgagctc gccaaggaga 3720tataatgaat cgaaaagatg aacatctatc attagctaaa gcgttccaca aagaaaaaag 3780taatgacttt gatcgtgtgc gttttgttca ccaatcgttt gctgaatccg ctgttaacga 3840agtggatatt tccacttcgt ttctttcttt tcagcttccc caaccttttt atgtcaatgc 3900aatgacaggt ggtagtcagc gtgcaaaaga aattaatcag caattaggca ttattgccaa 3960agaaactggc cttttagttg cgacaggatc tgtctcggca gcgttaaaag atgctagttt 4020agcggatacg tatcaaatta tgcgaaaaga aaacccagat ggactcattt ttgccaatat 4080tggtgcaggc ttgggtgtgg aagaagcaaa gcgagcgctt gatttatttc aagcgaatgc 4140cttacaaatc catgtaaatg tgccccaaga attggtcatg cctgaaggag atcgtgattt 4200cactaattgg ctaaccaaga ttgaagctat cgtacaggcc gtagaagtgc ctgtcattgt 4260caaagaggtt ggctttggca tgagccaaga aaccttagaa 
aaacttacct ctatcggcgt 4320tcaagcagcg gatgtgagcg gccaaggcgg aacgagtttt acacaaattg aaaatgcccg 4380gcggaagaaa cgagaacttt ctttcttaga tgattggggg caatcaacgg tcatctctct 4440tctggaatca caaaattggc aaaagaaact aactattctc ggctctggcg gtgtgcgtaa 4500ctctcttgat attgtcaaag gactcgcttt aggtgccaaa agcatgggag ttgctgggac 4560tatcttagct tcccttatga gtaaaaatgg tttagaaaat accttagccc ttgtacagca 4620atggcaagaa gaagtgaaaa tgctttatac tcttttagga aaaaagacga cagaagaatt 4680gacgagtacc gcacttgtcc tcgatccagt tttagttaat tggtgtcata accgtggtat 4740cgacagcact gttttcgcaa aacgttaact ttaaggaagg agcgaagcat gcgttgtagc 4800gttagcaccg aaaatgtgtc gtttacggaa acggaaaccg aagctcgccg cagcgcaaac 4860tatgaaccga actcgtggga ttacgattac ctccttagca gcgatacgga tgaaagcatt 4920gaagtgtata aagacaaagc caagaaactg gaggccgaag tccgtcgcga aatcaacaat 4980gagaaagcgg agtttcttac gttactggaa ttgatcgata acgtgcaacg gttaggcctc 5040ggctaccgct ttgagagcga tatccgtggt gcactggacc gcttcgtatc gtctggtggt 5100tttgacgccg ttacgaaaac gagcctgcat ggtacagcat tgtcttttcg gctgttgcgc 5160cagcatggat ttgaagtgtc acaggaggca ttttcaggct tcaaagacca gaacgggaat 5220tttttggaga atttgaaaga agatatcaaa gcgatcttat ctctgtatga ggcgtcattt 5280ctcgctctgg aaggggaaaa tattctggac gaagcgaaag tgttcgcaat ttcccatctg 5340aaagaacttt ccgaagaaaa gattgggaaa gaattggccg aacaggtgaa ccatgcgctg 5400gaactgccac tgcaccgtcg cacccaacgc ctcgaagcgg tatggtcgat tgaagcgtat 5460cgcaaaaaag aggatgcaaa tcaggttctg ctggaactgg ccattctcga ctataacatg 5520attcagtccg tctatcaacg tgatctgcgc gaaactagtc gttggtggcg ccgtgtagga

5580cttgccacta aactgcattt tgcacgtgat cgtctgattg agtcgttcta ttgggcggtt 5640ggtgtagcgt ttgagccgca gtattctgat tgccgcaata gtgtggcgaa aatgttctcc 5700tttgtgacca tcattgacga tatttacgac gtgtatggca ccctggatga actggaatta 5760ttcaccgatg cagtagaacg ctgggacgtc aacgcgatca atgatttgcc ggattacatg 5820aaactgtgtt ttctggccct gtataacacc attaacgaaa ttgcctatga caacctcaaa 5880gacaagggtg aaaatatcct gccctatctg actaaagctt gggctgatct gtgtaacgcg 5940ttcttacagg aagccaaatg gctctacaac aagagtacgc ctactttcga tgactacttt 6000ggcaacgctt ggaaaagctc tagcggccct ttacaactgg tgttcgcgta tttcgccgtt 6060gttcagaata tcaagaaaga agagattgag aacctccaaa agtaccacga tacgatttcg 6120cgtccgtcac acatctttcg cctttgcaat gatttggcca gtgcatctgc agagattgcg 6180cgcggtgaaa ctgccaactc cgtcagttgc tacatgcgta ccaaaggcat cagcgaggaa 6240ctggctaccg agtcggtgat gaacttaatc gatgaaacct ggaagaagat gaacaaagag 6300aaacttggtg gcagtctgtt tgctaaaccg ttcgttgaga cagcgattaa tctggcgcgt 6360caaagccact gcacctacca caatggcgat gcccacacat ccccagacga attaacccgg 6420aaacgtgtcc tgagtgtcat caccgaaccc attctgccgt tcgaacgcta agcctgctaa 6480caaagcccga aaggaagctg agttggctgc tgccaccgct gagcactagt gcggccgctt 6540tgcgcattca cagttctccg caagaattga ttggctccaa ttcttggagt ggtgaatccg 6600ttagcgaggt gccgccggct tccattcagg tcgaggtggc ccggctccat gcaccgcgac 6660gcaacgcggg gaggcagaca aggtataggg cggcgcctac aatccatgcc aacccgttcc 6720atgtgctcgc cgaggcggca taaatcgccg tgacgatcag cggtccagtg atcgaagtta 6780ggctggtaag agccgcgagc gatccttgaa gctgtccctg atggtcgtca tctacctgcc 6840tggacagcat ggcctgcaac gcgggcatcc cgatgccgcc ggaagcgaga agaatcataa 6900tggggaaggc catccagcct cgcgtcgcga acgccagcaa gacgtagccc agcgcgtcgg 6960ccgccatgcc ggcgataatg gcctgcttct cgccgaaacg tttggtggcg ggaccagtga 7020cgaaggcttg agcgagggcg tgcaagattc cgaataccgc aagcgacagg ccgatcatcg 7080tcgcgctcca gcgaaagcgg tcctcgccga aaatgaccca gagcgctgcc ggcacctgtc 7140ctacgagttg catgataaag aagacagtca taagtgcggc gacgatagtc atgccccgcg 7200cccaccggaa ggagctgact gggttgaagg ctctcaaggg catcggtcga cgctctccct 7260tatgcgactc ctgcattagg aagcagccca 
gtagtaggtt gaggccgttg agcaccgccg 7320ccgcaaggaa tggtgcatgc aaggagatgg cgcccaacag tcccccggcc acggggcctg 7380ccaccatacc cacgccgaaa caagcgctca tgagcccgaa gtggcgagcc cgatcttccc 7440catcggtgat gtcggcgata taggcgccag caaccgcacc tgtggcgccg gtgatgccgg 7500ccacgatgcg tccggcgtag aggatccaca ggacgggtgt ggtcgccatg atcgcgtagt 7560cgatagtggc tccaagtagc gaagcgagca ggactgggcg gcggccaaag cggtcggaca 7620gtgctccgag aacgggtgcg catagaaatt gcatcaacgc atatagcgct agcagcacgc 7680catagtgact ggcgatgctg tcggaatgga cgatatcccg caagaggccc ggcagtaccg 7740gcataaccaa gcctatgcct acagcatcca gggtgacggt gccgaggatg acgatgagcg 7800cattgttaga tttcatacac ggtgcctgac tgcgttagca atttaactgt gataaactac 7860cgcattaaag cttatcgatg ataagctgtc aaacatgaga attcttgaag acgaaagggc 7920ctcgtgatac gcctattttt ataggttaat gtcatgataa taatggtttc ttagacgtca 7980ggtggcactt ttcggggaaa tgtgcgcgcc cgcgttcctg ctggcgctgg gcctgtttct 8040ggcgctggac ttcccgctgt tccgtcagca gcttttcgcc cacggccttg atgatcgcgg 8100cggccttggc ctgcatatcc cgattcaacg gccccagggc gtccagaacg ggcttcaggc 8160gctcccgaag gtctcgggcc gtctcttggg cttgatcggc cttcttgcgc atctcacgcg 8220ctcctgcggc ggcctgtagg gcaggctcat acccctgccg aaccgctttt gtcagccggt 8280cggccacggc ttccggcgtc tcaacgcgct ttgagattcc cagcttttcg gccaatccct 8340gcggtgcata ggcgcgtggc tcgaccgctt gcgggctgat ggtgacgtgg cccactggtg 8400gccgctccag ggcctcgtag aacgcctgaa tgcgcgtgtg acgtgccttg ctgcc 8455208400DNAArtificial sequenceSynthetic 20ctcgatgccc cgttgcagcc ctagatcggc cacagcggcc gcaaacgtgg tctggtcgcg 60ggtcatctgc gctttgttgc cgatgaactc cttggccgac agcctgccgt cctgcgtcag 120cggcaccacg aacgcggtca tgtgcgggct ggtttcgtca cggtggatgc tggccgtcac 180gatgcgatcc gccccgtact tgtccgccag ccacttgtgc gccttctcga agaacgccgc 240ctgctgttct tggctggccg acttccacca ttccgggctg gccgtcatga cgtactcgac 300cgccaacaca gcgtccttgc gccgcttctc tggcagcaac tcgcgcagtc ggcccatcgc 360ttcatcggtg ctgctggccg cccagtgctc gttctctggc gtcctgctgg cgtcagcgtt 420gggcgtctcg cgctcgcggt aggcgtgctt gagactggcc gccacgttgc ccattttcgc 480cagcttcttg catcgcatga tcgcgtatgc cgccatgcct 
gcccctccct tttggtgtcc 540aaccggctcg acgggggcag cgcaaggcgg tgcctccggc gggccactca atgcttgagt 600atactcacta gactttgctt cgcaaagtcg tgaccgccta cggcggctgc ggcgccctac 660gggcttgctc tccgggcttc gccctgcgcg gtcgctgcgc tcccttgcca gcccgtggat 720atgtggacga tggccgcgag cggccaccgg ctggctcgct tcgctcggcc cgtggacaac 780cctgctggac aagctgatgg acaggctgcg cctgcccacg agcttgacca cagggattgc 840ccaccggcta cccagccttc gaccacatac ccaccggctc caactgcgcg gcctgcggcc 900ttgccccatc aattttttta attttctctg gggaaaagcc tccggcctgc ggcctgcgcg 960cttcgcttgc cggttggaca ccaagtggaa ggcgggtcaa ggctcgcgca gcgaccgcgc 1020agcggcttgg ccttgacgcg cctggaacga cccaagccta tgcgagtggg ggcagtcgaa 1080ggcgaagccc gcccgcctgc cccccgagcc tcacggcggc gagtgcgggg gttccaaggg 1140ggcagcgcca ccttgggcaa ggccgaaggc cgcgcagtcg atcaacaagc cccggagggg 1200ccactttttg ccggaggggg agccgcgccg aaggcgtggg ggaaccccgc aggggtgccc 1260ttctttgggc accaaagaac tagatatagg gcgaaatgcg aaagacttaa aaatcaacaa 1320cttaaaaaag gggggtacgc aacagctcat tgcggcaccc cccgcaatag ctcattgcgt 1380aggttaaaga aaatctgtaa ttgactgcca cttttacgca acgcataatt gttgtcgcgc 1440tgccgaaaag ttgcagctga ttgcgcatgg tgccgcaacc gtgcggcacc ctaccgcatg 1500gagataagca tggccacgca gtccagagaa atcggcattc aagccaagaa caagcccggt 1560cactgggtgc aaacggaacg caaagcgcat gaggcgtggg ccgggcttat tgcgaggaaa 1620cccacggcgg caatgctgct gcatcacctc gtggcgcaga tgggccacca gaacgccgtg 1680gtggtcagcc agaagacact ttccaagctc atcggacgtt ctttgcggac ggtccaatac 1740gcagtcaagg acttggtggc cgagcgctgg atctccgtcg tgaagctcaa cggccccggc 1800accgtgtcgg cctacgtggt caatgaccgc gtggcgtggg gccagccccg cgaccagttg 1860cgcctgtcgg tgttcagtgc cgccgtggtg gttgatcacg acgaccagga cgaatcgctg 1920ttggggcatg gcgacctgcg ccgcatcccg accctgtatc cgggcgagca gcaactaccg 1980accggccccg gcgaggagcc gcccagccag cccggcattc cgggcatgga accagacctg 2040ccagccttga ccgaaacgga ggaatgggaa cggcgcgggc agcagcgcct gccgatgccc 2100gatgagccgt gttttctgga cgatggcgag ccgttggagc cgccgacacg ggtcacgctg 2160ccgcgccggt agcacttggg ttgcgcagca acccgtaagt gcgctgttcc agactatcgg 2220ctgtagccgc ctctagatta 
attaacctcc agcgcgggga tctcatgctg gagttcttcg 2280cccaccccca gacaagctgt gaccgtctcc gggagctgca tgtgtcagag gttttcaccg 2340tcatcaccga aacgcgcgag gcagcagatc aattcgcgcg cgaaggcgaa gcggcatgca 2400taatgtgcct gtcaaatgga cgaagcaggg attctgcaaa ccctatgcta ctccgtcaag 2460ccgtcaattg tctgattcgt taccaattat gacaacttga cggctacatc attcactttt 2520tcttcacaac cggcacggaa ctcgctcggg ctggccccgg tgcatttttt aaatacccgc 2580gagaaataga gttgatcgtc aaaaccaaca ttgcgaccga cggtggcgat aggcatccgg 2640gtggtgctca aaagcagctt cgcctggctg atacgttggt cctcgcgcca gcttaagacg 2700ctaatcccta actgctggcg gaaaagatgt gacagacgcg acggcgacaa gcaaacatgc 2760tgtgcgacgc tggcgatatc aaaattgctg tctgccaggt gatcgctgat gtactgacaa 2820gcctcgcgta cccgattatc catcggtgga tggagcgact cgttaatcgc ttccatgcgc 2880cgcagtaaca attgctcaag cagatttatc gccagcagct ccgaatagcg cccttcccct 2940tgcccggcgt taatgatttg cccaaacagg tcgctgaaat gcggctggtg cgcttcatcc 3000gggcgaaaga accccgtatt ggcaaatatt gacggccagt taagccattc atgccagtag 3060gcgcgcggac gaaagtaaac ccactggtga taccattcgc gagcctccgg atgacgaccg 3120tagtgatgaa tctctcctgg cgggaacagc aaaatatcac ccggtcggca aacaaattct 3180cgtccctgat ttttcaccac cccctgaccg cgaatggtga gattgagaat ataacctttc 3240attcccagcg gtcggtcgat aaaaaaatcg agataaccgt tggcctcaat cggcgttaaa 3300cccgccacca gatgggcatt aaacgagtat cccggcagca ggggatcatt ttgcgcttca 3360gccatacttt tcatactccc gccattcaga gaagaaacca attgtccata ttgcatcaga 3420cattgccgtc actgcgtctt ttactggctc ttctcgctaa ccaaaccggt aaccccgctt 3480attaaaagca ttctgtaaca aagcgggacc aaagccatga caaaaacgcg taacaaaagt 3540gtctataatc acggcagaaa agtccacatt gattatttgc acggcgtcac actttgctat 3600gccatagcat ttttatccat aagattagcg gatcctacct gacgcttttt atcgcaactc 3660tctactgttt ctccataccc gttttttggg ctagaaataa ttttgagctc gccaaggaga 3720tataatgact aaccgtaaag atgatcacat caaatatgct ctcaagtacc aatcgcctta 3780taatgctttt gatgacatag aactcataca ccattcctta cctagctatg atttgtctga 3840tattgatctc agtactcatt ttgctgggca agacttcgac tttccctttt acatcaatgc 3900catgacagga ggaagtcaaa aaggcaaagc tgtcaatgaa aaattggcca 
aagtagcagc 3960agcaacaggg attgtcatgg tgacagggtc ttatagcgct gctttaaaaa atcctaacga 4020cgattcctat cgtttacatg aggtggcaga taacttgaaa ctagccacga atattggtct 4080agataaacct gtggcgctag gacaacaaac ggttcaagaa atgcagcccc tctttttaca 4140ggttcatgtg aatgtgatgc aagagttgct gatgccagag ggtgagcgcg tctttcatac 4200ctggaaaaaa cacctcgctg aatacgctag tcaaatacca gttcctgtca ttctcaaaga 4260agttggtttt ggcatggatg tcaatagtat caagctagca catgacctag gcattcaaac 4320ctttgatatt tcaggtagag gaggaacttc atttgcttac attgaaaatc aaagaggggg 4380agaccgctct tacttaaacg attggggaca aaccactgtt cagtgcttac tgaatgcaca 4440aggactgatg gaccaagtgg aaatcttagc ttcgggtggt gtcagacacc ccttggacat 4500gattaagtgt tttgtcttag gagcacgtgc agtgggactc tcacgcaccg ttttagaatt 4560ggtcgaaaaa tacccaaccg agcgtgtgat tgctatcgtt aatggctgga aagaagaatt 4620aaaaatcatt atgtgtgctc ttgactgtaa aactattaaa gaattaaagg gagtcgacta 4680cttactatat ggacgcttgc agcaggtcaa ttagcttaag gaaggagcga agcatgcgtt 4740gtagcgttag caccgaaaat gtgtcgttta cggaaacgga aaccgaagct cgccgcagcg 4800caaactatga accgaactcg tgggattacg attacctcct tagcagcgat acggatgaaa 4860gcattgaagt gtataaagac aaagccaaga aactggaggc cgaagtccgt cgcgaaatca 4920acaatgagaa agcggagttt cttacgttac tggaattgat cgataacgtg caacggttag 4980gcctcggcta ccgctttgag agcgatatcc gtggtgcact ggaccgcttc gtatcgtctg 5040gtggttttga cgccgttacg aaaacgagcc tgcatggtac agcattgtct tttcggctgt 5100tgcgccagca tggatttgaa gtgtcacagg aggcattttc aggcttcaaa gaccagaacg 5160ggaatttttt ggagaatttg aaagaagata tcaaagcgat cttatctctg tatgaggcgt 5220catttctcgc tctggaaggg gaaaatattc tggacgaagc gaaagtgttc gcaatttccc 5280atctgaaaga actttccgaa gaaaagattg ggaaagaatt ggccgaacag gtgaaccatg 5340cgctggaact gccactgcac cgtcgcaccc aacgcctcga agcggtatgg tcgattgaag 5400cgtatcgcaa aaaagaggat gcaaatcagg ttctgctgga actggccatt ctcgactata 5460acatgattca gtccgtctat caacgtgatc tgcgcgaaac tagtcgttgg tggcgccgtg 5520taggacttgc cactaaactg cattttgcac gtgatcgtct gattgagtcg ttctattggg 5580cggttggtgt agcgtttgag ccgcagtatt ctgattgccg caatagtgtg gcgaaaatgt 5640tctcctttgt gaccatcatt 
gacgatattt acgacgtgta tggcaccctg gatgaactgg 5700aattattcac cgatgcagta gaacgctggg acgtcaacgc gatcaatgat ttgccggatt 5760acatgaaact gtgttttctg gccctgtata acaccattaa cgaaattgcc tatgacaacc 5820tcaaagacaa gggtgaaaat atcctgccct atctgactaa agcttgggct gatctgtgta 5880acgcgttctt acaggaagcc aaatggctct acaacaagag tacgcctact ttcgatgact 5940actttggcaa cgcttggaaa agctctagcg gccctttaca actggtgttc gcgtatttcg 6000ccgttgttca gaatatcaag aaagaagaga ttgagaacct ccaaaagtac cacgatacga 6060tttcgcgtcc gtcacacatc tttcgccttt gcaatgattt ggccagtgca tctgcagaga 6120ttgcgcgcgg tgaaactgcc aactccgtca gttgctacat gcgtaccaaa ggcatcagcg 6180aggaactggc taccgagtcg gtgatgaact taatcgatga aacctggaag aagatgaaca 6240aagagaaact tggtggcagt ctgtttgcta aaccgttcgt tgagacagcg attaatctgg 6300cgcgtcaaag ccactgcacc taccacaatg gcgatgccca cacatcccca gacgaattaa 6360cccggaaacg tgtcctgagt gtcatcaccg aacccattct gccgttcgaa cgctaagcct 6420gctaacaaag cccgaaagga agctgagttg gctgctgcca ccgctgagca ctagtgcggc 6480cgctttgcgc attcacagtt ctccgcaaga attgattggc tccaattctt ggagtggtga 6540atccgttagc gaggtgccgc cggcttccat tcaggtcgag gtggcccggc tccatgcacc 6600gcgacgcaac gcggggaggc agacaaggta tagggcggcg cctacaatcc atgccaaccc 6660gttccatgtg ctcgccgagg cggcataaat cgccgtgacg atcagcggtc cagtgatcga 6720agttaggctg gtaagagccg cgagcgatcc ttgaagctgt ccctgatggt cgtcatctac 6780ctgcctggac agcatggcct gcaacgcggg catcccgatg ccgccggaag cgagaagaat 6840cataatgggg aaggccatcc agcctcgcgt cgcgaacgcc agcaagacgt agcccagcgc 6900gtcggccgcc atgccggcga taatggcctg cttctcgccg aaacgtttgg tggcgggacc 6960agtgacgaag gcttgagcga gggcgtgcaa gattccgaat accgcaagcg acaggccgat 7020catcgtcgcg ctccagcgaa agcggtcctc gccgaaaatg acccagagcg ctgccggcac 7080ctgtcctacg agttgcatga taaagaagac agtcataagt gcggcgacga tagtcatgcc 7140ccgcgcccac cggaaggagc tgactgggtt gaaggctctc aagggcatcg gtcgacgctc 7200tcccttatgc gactcctgca ttaggaagca gcccagtagt aggttgaggc cgttgagcac 7260cgccgccgca aggaatggtg catgcaagga gatggcgccc aacagtcccc cggccacggg 7320gcctgccacc atacccacgc cgaaacaagc gctcatgagc ccgaagtggc 
gagcccgatc 7380ttccccatcg gtgatgtcgg cgatataggc gccagcaacc gcacctgtgg cgccggtgat 7440gccggccacg atgcgtccgg cgtagaggat ccacaggacg ggtgtggtcg ccatgatcgc 7500gtagtcgata gtggctccaa gtagcgaagc gagcaggact gggcggcggc caaagcggtc 7560ggacagtgct ccgagaacgg gtgcgcatag aaattgcatc aacgcatata gcgctagcag 7620cacgccatag tgactggcga tgctgtcgga atggacgata tcccgcaaga ggcccggcag 7680taccggcata accaagccta tgcctacagc atccagggtg acggtgccga ggatgacgat 7740gagcgcattg ttagatttca tacacggtgc ctgactgcgt tagcaattta actgtgataa 7800actaccgcat taaagcttat cgatgataag ctgtcaaaca tgagaattct tgaagacgaa 7860agggcctcgt gatacgccta tttttatagg ttaatgtcat gataataatg gtttcttaga 7920cgtcaggtgg cacttttcgg ggaaatgtgc gcgcccgcgt tcctgctggc gctgggcctg 7980tttctggcgc tggacttccc gctgttccgt cagcagcttt tcgcccacgg ccttgatgat 8040cgcggcggcc ttggcctgca tatcccgatt caacggcccc agggcgtcca gaacgggctt 8100caggcgctcc cgaaggtctc gggccgtctc ttgggcttga tcggccttct tgcgcatctc 8160acgcgctcct gcggcggcct gtagggcagg ctcatacccc tgccgaaccg cttttgtcag 8220ccggtcggcc acggcttccg gcgtctcaac gcgctttgag attcccagct tttcggccaa 8280tccctgcggt gcataggcgc gtggctcgac cgcttgcggg ctgatggtga cgtggcccac 8340tggtggccgc tccagggcct cgtagaacgc ctgaatgcgc gtgtgacgtg ccttgctgcc 8400218443DNAArtificial sequenceSynthetic 21ctcgatgccc cgttgcagcc ctagatcggc cacagcggcc gcaaacgtgg tctggtcgcg 60ggtcatctgc gctttgttgc cgatgaactc cttggccgac agcctgccgt cctgcgtcag 120cggcaccacg aacgcggtca tgtgcgggct ggtttcgtca cggtggatgc tggccgtcac 180gatgcgatcc gccccgtact tgtccgccag ccacttgtgc gccttctcga agaacgccgc 240ctgctgttct tggctggccg acttccacca ttccgggctg gccgtcatga cgtactcgac 300cgccaacaca gcgtccttgc gccgcttctc tggcagcaac tcgcgcagtc ggcccatcgc 360ttcatcggtg ctgctggccg cccagtgctc gttctctggc gtcctgctgg cgtcagcgtt 420gggcgtctcg cgctcgcggt aggcgtgctt gagactggcc gccacgttgc ccattttcgc 480cagcttcttg catcgcatga tcgcgtatgc cgccatgcct gcccctccct tttggtgtcc 540aaccggctcg acgggggcag cgcaaggcgg tgcctccggc gggccactca atgcttgagt 600atactcacta gactttgctt cgcaaagtcg tgaccgccta cggcggctgc ggcgccctac 
660gggcttgctc tccgggcttc gccctgcgcg gtcgctgcgc tcccttgcca gcccgtggat 720atgtggacga tggccgcgag cggccaccgg ctggctcgct tcgctcggcc cgtggacaac 780cctgctggac aagctgatgg acaggctgcg cctgcccacg agcttgacca cagggattgc 840ccaccggcta cccagccttc gaccacatac ccaccggctc caactgcgcg gcctgcggcc 900ttgccccatc aattttttta attttctctg gggaaaagcc tccggcctgc ggcctgcgcg 960cttcgcttgc cggttggaca ccaagtggaa ggcgggtcaa ggctcgcgca gcgaccgcgc 1020agcggcttgg ccttgacgcg cctggaacga cccaagccta tgcgagtggg ggcagtcgaa 1080ggcgaagccc gcccgcctgc cccccgagcc tcacggcggc gagtgcgggg gttccaaggg 1140ggcagcgcca ccttgggcaa ggccgaaggc cgcgcagtcg atcaacaagc cccggagggg 1200ccactttttg ccggaggggg agccgcgccg aaggcgtggg ggaaccccgc aggggtgccc 1260ttctttgggc accaaagaac tagatatagg gcgaaatgcg aaagacttaa aaatcaacaa 1320cttaaaaaag gggggtacgc aacagctcat tgcggcaccc cccgcaatag ctcattgcgt 1380aggttaaaga aaatctgtaa ttgactgcca cttttacgca acgcataatt gttgtcgcgc 1440tgccgaaaag ttgcagctga ttgcgcatgg tgccgcaacc gtgcggcacc ctaccgcatg 1500gagataagca tggccacgca gtccagagaa atcggcattc aagccaagaa caagcccggt 1560cactgggtgc aaacggaacg caaagcgcat gaggcgtggg ccgggcttat tgcgaggaaa 1620cccacggcgg caatgctgct gcatcacctc gtggcgcaga tgggccacca gaacgccgtg 1680gtggtcagcc agaagacact ttccaagctc atcggacgtt ctttgcggac ggtccaatac 1740gcagtcaagg acttggtggc cgagcgctgg atctccgtcg tgaagctcaa cggccccggc 1800accgtgtcgg cctacgtggt caatgaccgc gtggcgtggg gccagccccg cgaccagttg 1860cgcctgtcgg tgttcagtgc cgccgtggtg gttgatcacg acgaccagga cgaatcgctg 1920ttggggcatg gcgacctgcg ccgcatcccg accctgtatc cgggcgagca gcaactaccg 1980accggccccg gcgaggagcc gcccagccag cccggcattc cgggcatgga accagacctg 2040ccagccttga ccgaaacgga ggaatgggaa cggcgcgggc agcagcgcct gccgatgccc 2100gatgagccgt gttttctgga cgatggcgag ccgttggagc cgccgacacg ggtcacgctg 2160ccgcgccggt agcacttggg ttgcgcagca acccgtaagt gcgctgttcc agactatcgg 2220ctgtagccgc ctctagatta attaacctcc agcgcgggga tctcatgctg gagttcttcg 2280cccaccccca gacaagctgt gaccgtctcc gggagctgca tgtgtcagag gttttcaccg 2340tcatcaccga aacgcgcgag gcagcagatc 
aattcgcgcg cgaaggcgaa gcggcatgca 2400taatgtgcct gtcaaatgga cgaagcaggg attctgcaaa ccctatgcta ctccgtcaag 2460ccgtcaattg tctgattcgt taccaattat gacaacttga cggctacatc attcactttt 2520tcttcacaac cggcacggaa ctcgctcggg ctggccccgg tgcatttttt aaatacccgc 2580gagaaataga gttgatcgtc aaaaccaaca ttgcgaccga cggtggcgat aggcatccgg 2640gtggtgctca aaagcagctt cgcctggctg atacgttggt cctcgcgcca gcttaagacg 2700ctaatcccta actgctggcg gaaaagatgt gacagacgcg acggcgacaa gcaaacatgc 2760tgtgcgacgc tggcgatatc aaaattgctg tctgccaggt gatcgctgat gtactgacaa 2820gcctcgcgta cccgattatc catcggtgga tggagcgact cgttaatcgc ttccatgcgc 2880cgcagtaaca attgctcaag cagatttatc gccagcagct ccgaatagcg cccttcccct 2940tgcccggcgt taatgatttg cccaaacagg tcgctgaaat gcggctggtg cgcttcatcc 3000gggcgaaaga accccgtatt ggcaaatatt gacggccagt taagccattc atgccagtag 3060gcgcgcggac gaaagtaaac ccactggtga taccattcgc gagcctccgg atgacgaccg 3120tagtgatgaa tctctcctgg cgggaacagc aaaatatcac ccggtcggca aacaaattct 3180cgtccctgat ttttcaccac cccctgaccg cgaatggtga gattgagaat ataacctttc 3240attcccagcg gtcggtcgat aaaaaaatcg agataaccgt tggcctcaat cggcgttaaa 3300cccgccacca gatgggcatt aaacgagtat cccggcagca ggggatcatt ttgcgcttca 3360gccatacttt tcatactccc gccattcaga gaagaaacca attgtccata ttgcatcaga 3420cattgccgtc actgcgtctt ttactggctc ttctcgctaa ccaaaccggt aaccccgctt 3480attaaaagca ttctgtaaca aagcgggacc aaagccatga caaaaacgcg taacaaaagt 3540gtctataatc acggcagaaa agtccacatt gattatttgc acggcgtcac actttgctat 3600gccatagcat ttttatccat aagattagcg gatcctacct gacgcttttt atcgcaactc 3660tctactgttt ctccataccc gttttttggg ctagaaataa

ttttgagctc gccaaggaga 3720tataatgacg accaaccgca aggatgagca catcctctac gccctggagc agaagtcgtc 3780gtacaactcg ttcgacgaag tggaactgat ccactcgtcg ctgccgctgt ataacctgga 3840cgaaatcgac ctgtccaccg agttcgccgg ccgcaagtgg gatttcccgt tctacatcaa 3900tgccatgacc ggcggtagca acaagggccg cgaaatcaat cagaagctgg cccaggtcgc 3960cgagtcgtgc ggcatcctgt tcgtcaccgg cagctactcc gccgcgctga agaacccgac 4020cgacgactcg ttctcggtca agagcagcca cccgaatctg ctgctgggca cgaacatcgg 4080cctcgacaag cccgtcgaac tgggcctgca gaccgtggaa gaaatgaacc ccgtgctgct 4140ccaggtgcat gtgaacgtga tgcaagagct gctgatgccg gagggcgaac gcaagttccg 4200cagctggcag tcgcacctgg ccgactactc gaagcagatc cccgtgccga tcgtgctgaa 4260agaagtgggc ttcggcatgg acgccaagac catcgagcgt gcctacgagt tcggcgtgcg 4320caccgtggac ctctcgggcc gcggtggcac gagcttcgcg tacatcgaaa accggcgcag 4380cggccagcgc gactacctga accagtgggg ccaatcgacc atgcaggccc tgctgaacgc 4440gcaagaatgg aaggacaagg tcgagctgct ggtgtcgggc ggcgtgcgta acccgctcga 4500catgatcaag tgcctggtgt tcggcgccaa ggccgtgggc ctgtcccgca ccgtgctgga 4560gctggtcgaa acctacaccg tcgaagaagt catcggcatt gtccagggct ggaaggccga 4620cctccgcctc atcatgtgct ccctgaactg cgccacgatc gcggacctcc agaaggtgga 4680ctatctcctc tacggcaagc tcaaagaagc caaggaccag atgaagaagg cgtgacttta 4740aggaaggagc gaagcatgcg ttgtagcgtt agcaccgaaa atgtgtcgtt tacggaaacg 4800gaaaccgaag ctcgccgcag cgcaaactat gaaccgaact cgtgggatta cgattacctc 4860cttagcagcg atacggatga aagcattgaa gtgtataaag acaaagccaa gaaactggag 4920gccgaagtcc gtcgcgaaat caacaatgag aaagcggagt ttcttacgtt actggaattg 4980atcgataacg tgcaacggtt aggcctcggc taccgctttg agagcgatat ccgtggtgca 5040ctggaccgct tcgtatcgtc tggtggtttt gacgccgtta cgaaaacgag cctgcatggt 5100acagcattgt cttttcggct gttgcgccag catggatttg aagtgtcaca ggaggcattt 5160tcaggcttca aagaccagaa cgggaatttt ttggagaatt tgaaagaaga tatcaaagcg 5220atcttatctc tgtatgaggc gtcatttctc gctctggaag gggaaaatat tctggacgaa 5280gcgaaagtgt tcgcaatttc ccatctgaaa gaactttccg aagaaaagat tgggaaagaa 5340ttggccgaac aggtgaacca tgcgctggaa ctgccactgc accgtcgcac ccaacgcctc 5400gaagcggtat 
ggtcgattga agcgtatcgc aaaaaagagg atgcaaatca ggttctgctg 5460gaactggcca ttctcgacta taacatgatt cagtccgtct atcaacgtga tctgcgcgaa 5520actagtcgtt ggtggcgccg tgtaggactt gccactaaac tgcattttgc acgtgatcgt 5580ctgattgagt cgttctattg ggcggttggt gtagcgtttg agccgcagta ttctgattgc 5640cgcaatagtg tggcgaaaat gttctccttt gtgaccatca ttgacgatat ttacgacgtg 5700tatggcaccc tggatgaact ggaattattc accgatgcag tagaacgctg ggacgtcaac 5760gcgatcaatg atttgccgga ttacatgaaa ctgtgttttc tggccctgta taacaccatt 5820aacgaaattg cctatgacaa cctcaaagac aagggtgaaa atatcctgcc ctatctgact 5880aaagcttggg ctgatctgtg taacgcgttc ttacaggaag ccaaatggct ctacaacaag 5940agtacgccta ctttcgatga ctactttggc aacgcttgga aaagctctag cggcccttta 6000caactggtgt tcgcgtattt cgccgttgtt cagaatatca agaaagaaga gattgagaac 6060ctccaaaagt accacgatac gatttcgcgt ccgtcacaca tctttcgcct ttgcaatgat 6120ttggccagtg catctgcaga gattgcgcgc ggtgaaactg ccaactccgt cagttgctac 6180atgcgtacca aaggcatcag cgaggaactg gctaccgagt cggtgatgaa cttaatcgat 6240gaaacctgga agaagatgaa caaagagaaa cttggtggca gtctgtttgc taaaccgttc 6300gttgagacag cgattaatct ggcgcgtcaa agccactgca cctaccacaa tggcgatgcc 6360cacacatccc cagacgaatt aacccggaaa cgtgtcctga gtgtcatcac cgaacccatt 6420ctgccgttcg aacgctaagc ctgctaacaa agcccgaaag gaagctgagt tggctgctgc 6480caccgctgag ttggctgctg ccaccgctga gcactagtgc ggccgctttg cgcattcaca 6540gttctccgca agaattgatt ggctccaatt cttggagtgg tgaatccgtt agcgaggtgc 6600cgccggcttc cattcaggtc gaggtggccc ggctccatgc accgcgacgc aacgcgggga 6660ggcagacaag gtatagggcg gcgcctacaa tccatgccaa cccgttccat gtgctcgccg 6720aggcggcata aatcgccgtg acgatcagcg gtccagtgat cgaagttagg ctggtaagag 6780ccgcgagcga tccttgaagc tgtccctgat ggtcgtcatc tacctgcctg gacagcatgg 6840cctgcaacgc gggcatcccg atgccgccgg aagcgagaag aatcataatg gggaaggcca 6900tccagcctcg cgtcgcgaac gccagcaaga cgtagcccag cgcgtcggcc gccatgccgg 6960cgataatggc ctgcttctcg ccgaaacgtt tggtggcggg accagtgacg aaggcttgag 7020cgagggcgtg caagattccg aataccgcaa gcgacaggcc gatcatcgtc gcgctccagc 7080gaaagcggtc ctcgccgaaa atgacccaga gcgctgccgg 
cacctgtcct acgagttgca 7140tgataaagaa gacagtcata agtgcggcga cgatagtcat gccccgcgcc caccggaagg 7200agctgactgg gttgaaggct ctcaagggca tcggtcgacg ctctccctta tgcgactcct 7260gcattaggaa gcagcccagt agtaggttga ggccgttgag caccgccgcc gcaaggaatg 7320gtgcatgcaa ggagatggcg cccaacagtc ccccggccac ggggcctgcc accataccca 7380cgccgaaaca agcgctcatg agcccgaagt ggcgagcccg atcttcccca tcggtgatgt 7440cggcgatata ggcgccagca accgcacctg tggcgccggt gatgccggcc acgatgcgtc 7500cggcgtagag gatccacagg acgggtgtgg tcgccatgat cgcgtagtcg atagtggctc 7560caagtagcga agcgagcagg actgggcggc ggccaaagcg gtcggacagt gctccgagaa 7620cgggtgcgca tagaaattgc atcaacgcat atagcgctag cagcacgcca tagtgactgg 7680cgatgctgtc ggaatggacg atatcccgca agaggcccgg cagtaccggc ataaccaagc 7740ctatgcctac agcatccagg gtgacggtgc cgaggatgac gatgagcgca ttgttagatt 7800tcatacacgg tgcctgactg cgttagcaat ttaactgtga taaactaccg cattaaagct 7860tatcgatgat aagctgtcaa acatgagaat tcttgaagac gaaagggcct cgtgatacgc 7920ctatttttat aggttaatgt catgataata atggtttctt agacgtcagg tggcactttt 7980cggggaaatg tgcgcgcccg cgttcctgct ggcgctgggc ctgtttctgg cgctggactt 8040cccgctgttc cgtcagcagc ttttcgccca cggccttgat gatcgcggcg gccttggcct 8100gcatatcccg attcaacggc cccagggcgt ccagaacggg cttcaggcgc tcccgaaggt 8160ctcgggccgt ctcttgggct tgatcggcct tcttgcgcat ctcacgcgct cctgcggcgg 8220cctgtagggc aggctcatac ccctgccgaa ccgcttttgt cagccggtcg gccacggctt 8280ccggcgtctc aacgcgcttt gagattccca gcttttcggc caatccctgc ggtgcatagg 8340cgcgtggctc gaccgcttgc gggctgatgg tgacgtggcc cactggtggc cgctccaggg 8400cctcgtagaa cgcctgaatg cgcgtgtgac gtgccttgct gcc 84432232DNAArtificial sequenceSynthetic 22ggaaggagcg aagcatgcgt tgtagcgtta gc 322338DNAArtificial sequenceSynthetic 23gggctttgtt agcaggctta gcgttcgaac ggcagaat 382421DNAArtificial sequenceSynthetic 24gcctgctaac aaagcccgaa a 212520DNAArtificial sequenceSynthetic 25gcttcgctcc ttccttaaag 202619DNAArtificial sequenceSynthetic 26gccgccctat accttgtct 192720DNAArtificial sequenceSynthetic 27acggcgtcac actttgctat 202820DNAArtificial sequenceSynthetic 
28 cgcgtcgcga acgccagcaa 20

29 20 DNA Artificial sequence Synthetic
29 acggggcctg ccaccatacc 20

30 20 DNA Artificial sequence Synthetic
30 cttatcgatg ataagctgtc 20

31 20 DNA Artificial sequence Synthetic
31 cagccctaga tcggccacag 20

32 20 DNA Artificial sequence Synthetic
32 tgcctgcccc tcccttttgg 20

33 20 DNA Artificial sequence Synthetic
33 gcggcgagtg cgggggttcc 20

34 20 DNA Artificial sequence Synthetic
34 ggaaacccac ggcggcaatg 20

35 24 DNA Artificial sequence Synthetic
35 atcggctgta gccgcctcta gatt 24

36 20 DNA Artificial sequence Synthetic
36 agtaacaatt gctcaagcag 20

37 20 DNA Artificial sequence Synthetic
37 attcagagaa gaaaccaatt 20

38 43 DNA Artificial sequence Synthetic
38 gctagaaata attttgagct cgccaaggag atataatgca aac 43

39 41 DNA Artificial sequence Synthetic
39 gcttcgctcc ttccttaaag ttatttaagc tgggtaaatg c 41

40 41 DNA Artificial sequence Synthetic
40 gctagaaata attttgagct cgccaaggag atataatggt c 41

41 38 DNA Artificial sequence Synthetic
41 gcttcgctcc ttccttaaag tcagcgcacc gaatacga 38

42 55 DNA Artificial sequence Synthetic
42 gctagaaata attttgagct cgccaaggag atataatgac tgccgacaac aatag 55

43 43 DNA Artificial sequence Synthetic
43 gcttcgctcc ttccttaaag ttatagcatt ctatgaattt gcc 43

44 54 DNA Artificial sequence Synthetic
44 gctagaaata attttgagct cgccaaggag atataatgaa tcgaaaagat gaac 54

45 40 DNA Artificial sequence Synthetic
45 gcttcgctcc ttccttaaag ttaacgtttt gcgaaaacag 40

46 57 DNA Artificial sequence Synthetic
46 gctagaaata attttgagct cgccaaggag atataatgac taaccgtaaa gatgatc 57

47 39 DNA Artificial sequence Synthetic
47 gcttcgctcc ttccttaaag ctaattgacc tgctgcaag 39

48 57 DNA Artificial sequence Synthetic
48 gctagaaata attttgagct cgccaaggag atataatgac gaccaaccgc aaggatg 57

49 38 DNA Artificial sequence Synthetic
49 gcttcgctcc ttccttaaag tcacgccttc ttcatctg 38

50 19 DNA Artificial sequence Synthetic
50 gccgccctat accttgtct 19

51 20 DNA Artificial sequence Synthetic
51 acggcgtcac actttgctat 20

52 1878 DNA E.
coli 52ataggatcct aatacgactc actatagggt tttgtttaac tttaagaagg agatatacca 60tggctactga actgctgtgt ttgcatcgcc cgatttcact tacacataaa ctgtttcgca 120atccactgcc gaaggttatt caggcgaccc ctctgacgtt aaaactgcgt tgtagcgtta 180gcaccgaaaa tgtgtcgttt acggaaacgg aaaccgaagc tcgccgcagc gcaaactatg 240aaccgaactc gtgggattac gattacctcc ttagcagcga tacggatgaa agcattgaag 300tgtataaaga caaagccaag aaactggagg ccgaagtccg tcgcgaaatc aacaatgaga 360aagcggagtt tcttacgtta ctggaattga tcgataacgt gcaacggtta ggcctcggct 420accgctttga gagcgatatc cgtggtgcac tggaccgctt cgtatcgtct ggtggttttg 480acgccgttac gaaaacgagc ctgcatggta cagcattgtc ttttcggctg ttgcgccagc 540atggatttga agtgtcacag gaggcatttt caggcttcaa agaccagaac gggaattttt 600tggagaattt gaaagaagat atcaaagcga tcttatctct gtatgaggcg tcatttctcg 660ctctggaagg ggaaaatatt ctggacgaag cgaaagtgtt cgcaatttcc catctgaaag 720aactttccga agaaaagatt gggaaagaat tggccgaaca ggtgaaccat gcgctggaac 780tgccactgca ccgtcgcacc caacgcctcg aagcggtatg gtcgattgaa gcgtatcgca 840aaaaagagga tgcaaatcag gttctgctgg aactggccat tctcgactat aacatgattc 900agtccgtcta tcaacgtgat ctgcgcgaaa ctagtcgttg gtggcgccgt gtaggacttg 960ccactaaact gcattttgca cgtgatcgtc tgattgagtc gttctattgg gcggttggtg 1020tagcgtttga gccgcagtat tctgattgcc gcaatagtgt ggcgaaaatg ttctcctttg 1080tgaccatcat tgacgatatt tacgacgtgt atggcaccct ggatgaactg gaattattca 1140ccgatgcagt agaacgctgg gacgtcaacg cgatcaatga tttgccggat tacatgaaac 1200tgtgttttct ggccctgtat aacaccatta acgaaattgc ctatgacaac ctcaaagaca 1260agggtgaaaa tatcctgccc tatctgacta aagcttgggc tgatctgtgt aacgcgttct 1320tacaggaagc caaatggctc tacaacaaga gtacgcctac tttcgatgac tactttggca 1380acgcttggaa aagctctagc ggccctttac aactggtgtt cgcgtatttc gccgttgttc 1440agaatatcaa gaaagaagag attgagaacc tccaaaagta ccacgatacg atttcgcgtc 1500cgtcacacat ctttcgcctt tgcaatgatt tggccagtgc atctgcagag attgcgcgcg 1560gtgaaactgc caactccgtc agttgctaca tgcgtaccaa aggcatcagc gaggaactgg 1620ctaccgagtc ggtgatgaac ttaatcgatg aaacctggaa gaagatgaac aaagagaaac 1680ttggtggcag tctgtttgct aaaccgttcg ttgagacagc 
gattaatctg gcgcgtcaaa 1740gccactgcac ctaccacaat ggcgatgccc acacatcccc agacgaatta acccggaaac 1800gtgtcctgag tgtcatcacc gaacccattc tgccgttcga acgccatcat caccatcacc 1860attaatagcc tagggtgt 1878

* * * * *

