I-crei Meganuclease Variants With Modified Specificity, Method Of Preparation And Uses Thereof DUCHATEAU; Philippe ; et al. [CELLECTIS]

Patent Application Summary

U.S. patent application number 12/859905 was filed with the patent office on 2010-08-20 and published on 2011-03-24 for I-CreI meganuclease variants with modified specificity, method of preparation and uses thereof. This patent application is currently assigned to CELLECTIS. Invention is credited to Philippe DUCHATEAU, Frederic Paques.

Application Number	20110072527 12/859905
Family ID	36659837
Filed Date	2010-08-20

United States Patent Application	20110072527
Kind Code	A1
DUCHATEAU; Philippe ; et al.	March 24, 2011

I-CREI MEGANUCLEASE VARIANTS WITH MODIFIED SPECIFICITY, METHOD OF PREPARATION AND USES THEREOF

Abstract

Method of preparing I-CreI meganuclease variants having a modified cleavage specificity, variants obtainable by said method and their applications either for cleaving new DNA target or for genetic engineering and genome engineering for non-therapeutic purposes. Nucleic acids encoding said variants, expression cassettes comprising said nucleic acids, vectors comprising said expression cassettes, cells or organisms, plants or animals except humans, transformed by said vectors.

Inventors:	DUCHATEAU; Philippe; (Gandelu, FR) ; Paques; Frederic; (Bourg-la-Reine, FR)
Assignee:	CELLECTIS Romainville FR
Family ID:	36659837
Appl. No.:	12/859905
Filed:	August 20, 2010

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
11908798	Sep 17, 2007
PCT/IB2006/001203	Mar 15, 2006
12859905

Current U.S. Class:	800/14 ; 435/196; 435/320.1; 435/325; 435/91.41; 514/44R; 536/23.2; 800/298
Current CPC Class:	A61P 31/12 20180101; C12N 9/22 20130101; A61P 43/00 20180101; A61K 48/00 20130101; A61P 31/00 20180101; A61K 38/00 20130101
Class at Publication:	800/14 ; 435/196; 536/23.2; 435/320.1; 435/325; 800/298; 435/91.41; 514/44.R
International Class:	A01K 67/027 20060101 A01K067/027; C12N 9/16 20060101 C12N009/16; C07H 21/00 20060101 C07H021/00; C12N 15/63 20060101 C12N015/63; C12N 5/10 20060101 C12N005/10; A01H 5/00 20060101 A01H005/00; C12N 15/66 20060101 C12N015/66; A61K 31/7088 20060101 A61K031/7088

Foreign Application Data

Date	Code	Application Number
Mar 15, 2005	IB	PCT IB2005 000981
Sep 19, 2005	IB	PCT IB2005 003083

Claims

1. A method of preparing at least one I-CreI meganuclease variant having a modified cleavage specificity, said method comprising: (a) replacing amino acids Q44, R68 and/or R70, in reference with I-CreI pdb accession code 1g9y, with an amino acid selected from the group consisting of A, D, E, G, H, K, N, P, Q, R, S, T and Y to obtain one or more I-CreI meganuclease variants; and (b) selecting said one or more I-CreI meganuclease variants obtained in (a) having at least one of the following R.sub.3 triplet cleaving profile in reference to positions -5 to -3 in a double-strand DNA target, said positions -5 to -3 corresponding to R.sub.3 of the following formula I: 5'-R.sub.1CAAAR.sub.2R.sub.3R.sub.4R'.sub.4R'.sub.3R'.sub.2TTTGR'.sub.1-3' (SEQ ID NO: 92),

wherein: R.sub.1 is absent or present; and when present represents a nucleic acid fragment comprising 1 to 9 nucleotides corresponding either to a random nucleic acid sequence or to a fragment of a I-CreI meganuclease homing site situated from position -20 to -12 (from 5' to 3'), R.sub.1 corresponding at least to position -12 of said homing site, R.sub.2 represents the nucleic acid doublet ac or ct and corresponds to positions -7 to -6 of said homing site, R.sub.3 represents a nucleic acid triplet corresponding to said positions -5 to -3, selected among g, t, c and a, except the following triplets: gtc, gcc, gtg, gtt and gct, R.sub.4 represents the nucleic acid doublet gt or tc and corresponds to positions -2 to -1 of said homing site, R'.sub.1 is absent or present; and when present represents a nucleic acid fragment comprising 1 to 9 nucleotides corresponding either to a random nucleic acid sequence or to a fragment of a I-CreI meganuclease homing site situated from position +12 to +20 (from 5' to 3'), R'.sub.1 corresponding at least to position +12 of said homing site, R'.sub.2 represents the nucleic acid doublet ag or gt, and corresponds to positions +6 to +7 of said homing site, R'.sub.3 represents a nucleic acid triplet corresponding to said positions +3 to +5, selected among g, t, c, and a; R'.sub.3 being different from gac, ggc, cac, aac, and agc, when R.sub.3 and R'.sub.3 are non-palindromic, and R'.sub.4 represents the nucleic acid doublet ga or ac and corresponds to positions +1 to +2 of said homing site.

2. The method according to claim 1, wherein said nucleic acid triplet R.sub.3 is selected among the following triplets: ggg, gga, ggt, ggc, gag, gaa, gat, gac, gta, gcg, gca, tgg, tga, tgt, tgc, tag, taa, tat, tac, ttg, tta, ttt, ttc, tcg, tca, tct, tcc, agg, aga, agt, agc, aag, aaa, aat, aac, atg, ata, att, atc, acg, aca, act, acc, cgg, cga, cgt, cgc, cag, caa, cat, cac, ctg, cta, ctt, ctc, ccg, cca, cct and ccc.

3. The method according to claim 2, wherein said nucleic acid triplet R.sub.3 is selected among the following triplets: ggg, ggt, ggc, gag, gat, gac, gta, gcg, gca, tag, taa, tat, tac, ttg, ttt, ttc, tcg, tct, tcc, agg, aag, aat, aac, att, atc, act, acc, cag, cat, cac, ctt, ctc, ccg, cct and ccc.

4. The method according to claim 1, wherein the at least one I-CreI meganuclease variant selected in (b) is selected from the group consisting of: A44/A68/A70, A44/A68/G70, A44/A68/H70, A44/A68/K70, A44/A68/N70, A44/A68/Q70, A44/A68/R70, A44/A68/S70, A44/A68/T70, A44/D68/H70, A44/D68/K70, A44/D68/R70, A44/G68/H70, A44/G68/K70, A44/G68/N70, A44/G68/P70, A44/G68/R70, A44/H68/A70, A44/H68/G70, A44/H68/H70, A44/H68/K70, A44/H68/N70, A44/H68/Q70, A44/H68/R70, A44/H68/S70, A44/H68/T70, A44/K68/A70, A44/K68/G70, A44/K68/H70, A44/K68/K70, A44/K68/N70, A44/K68/Q70, A44/K68/R70, A44/K68/S70, A44/K68/T70, A44/N68/A70, A44/N68/E70, A44/N68/G70, A44/N68/H70, A44/N68/K70, A44/N68/N70, A44/N68/Q70, A44/N68/R70, A44/N68/S70, A44/N68/T70, A44/Q68/A70, A44/Q68/D70, A44/Q68/G70, A44/Q68/H70, A44/Q68/N70, A44/Q68/R70, A44/Q68/S70, A44/R68/A70, A44/R68/D70, A44/R68/E70, A44/R68/G70, A44/R68/H70, A44/R68/K70, A44/R68/L70, A44/R68/N70, A44/R68/R70, A44/R68/S70, A44/R68/T70, A44/S68/A70, A44/S68/G70, A44/S68/K70, A44/S68/N70, A44/S68/Q70, A44/S68/R70, A44/S68/S70, A44/S68/T70, A44/T68/A70, A44/T68/G70, A44/T68/H70, A44/T68/K70, A44/T68/N70, A44/T68/Q70, A44/T68/R70, A44/T68/S70, A44/T68/T70, D44/D68/H70, D44/N68/S70, D44/R68/A70, D44/R68/K70, D44/R68/N70, D44/R68/Q70, D44/R68/R70, D44/R68/S70, D44/R68/T70, E44/H68/H70, E44/R68/A70, E44/R68/H70, E44/R68/N70, E44/R68/S70, E44/R68/T70, E44/S68/T70, G44/H68/K70, G44/Q68/H70, G44/R68/Q70, G44/R68/R70, G44/T68/D70, G44/T68/P70, G44/T68/R70, H44/A68/S70, H44/A68/T70, H44/R68/A70, H44/R68/D70, H44/R68/E70, H44/R68/G70, H44/R68/N70, H44/R68/R70, H44/R68/S70, H44/R68/T70, H44/S68/G70, H44/S68/S70, H44/S68/T70, H44/T68/S70, H44/T68/T70, K44/A68/A70, K44/A68/D70, K44/A68/E70, K44/A68/G70, K44/A68/H70, K44/A68/N70, K44/A68/Q70, K44/A68/S70, K44/A68/T70, K44/D68/A70, K44/D68/T70, K44/E68/G70, K44/E68/N70, K44/E68/S70, K44/G68/A70, K44/G68/G70, K44/G68/N70, K44/G68/S70, K44/G68/T70, K44/H68/D70, K44/H68/E70, K44/H68/G70, K44/H68/N70, K44/H68/S70, 
K44/H68/T70, K44/K68/A70, K44/K68/D70, K44/K68/H70, K44/K68/T70, K44/N68/A70, K44/N68/D70, K44/N68/E70, K44/N68/G70, K44/N68/H70, K44/N68/N70, K44/N68/Q70, K44/N68/S70, K44/N68/T70, K44/P68/H70, K44/Q68/A70, K44/Q68/D70, K44/Q68/E70, K44/Q68/S70, K44/Q68/T70, K44/R68/A70, K44/R68/D70, K44/R68/E70, K44/R68/G70, K44/R68/H70, K44/R68/N70, K44/R68/Q70, K44/R68/S70, K44/R68/T70, K44/S68/A70, K44/S68/D70, K44/S68/H70, K44/S68/N70, K44/S68/S70, K44/S68/T70, K44/T68/A70, K44/T68/D70, K44/T68/E70, K44/T68/G70, K44/T68/H70, K44/T68/N70, K44/T68/Q70, K44/T68/S70, K44/T68/T70, N44/A68/H70, N44/A68/R70, N44/H68/N70, N44/H68/R70, N44/K68/G70, N44/K68/H70, N44/K68/R70, N44/K68/S70, N44/N68/R70, N44/P68/D70, N44/Q68/H70, N44/Q68/R70, N44/R68/A70, N44/R68/D70, N44/R68/E70, N44/R68/G70, N44/R68/H70, N44/R68/K70, N44/R68/N70, N44/R68/R70, N44/R68/S70, N44/R68/T70, N44/S68/G70, N44/S68/H70, N44/S68/K70, N44/S68/R70, N44/T68/H70, N44/T68/K70, N44/T68/Q70, N44/T68/R70, N44/T68/S70, P44/N68/D70, P44/T68/T70, Q44/A68/A70, Q44/A68/H70, Q44/A68/R70, Q44/G68/K70, Q44/G68/R70, Q44/K68/G70, Q44/N68/A70, Q44/N68/H70, Q44/N68/S70, Q44/P68/P70, Q44/Q68/G70, Q44/R68/A70, Q44/R68/D70, Q44/R68/E70, Q44/R68/G70, Q44/R68/H70, Q44/R68/N70, Q44/R68/Q70, Q44/R68/S70, Q44/S68/H70, Q44/S68/R70, Q44/S68/S70, Q44/T68/A70, Q44/T68/G70, Q44/T68/H70, Q44/T68/R70, R44/A68/G70, R44/A68/T70, R44/G68/T70, R44/H68/D70, R44/H68/T70, R44/N68/T70, R44/R68/A70, R44/R68/D70, R44/R68/E70, R44/R68/G70, R44/R68/N70, R44/R68/Q70, R44/R68/S70, R44/R68/T70, R44/S68/G70, R44/S68/N70, R44/S68/S70, R44/S68/T70, S44/D68/K70, S44/H68/R70, S44/R68/G70, S44/R68/N70, S44/R68/R70, S44/R68/S70, T44/A68/K70, T44/A68/R70, T44/H68/R70, T44/K68/R70, T44/N68/P70, T44/N68/R70, T44/Q68/K70, T44/Q68/R70, T44/R68/A70, T44/R68/D70, T44/R68/E70, T44/R68/G70, T44/R68/H70, T44/R68/K70, T44/R68/N70, T44/R68/Q70, T44/R68/R70, T44/R68/S70, T44/R68/T70, T44/S68/K70, T44/S68/R70, T44/T68/K70, and T44/T68/R70.

5. The method according to claim 1, wherein said selecting (b) of said at least one I-CreI meganuclease variant is performed in vivo in yeast cells.

6. At least one I-CreI meganuclease variant prepared by the method according to claim 1, wherein said at least one I-CreI meganuclease variant is selected from the group consisting of: A44/A68/A70, A44/A68/G70, A44/A68/H70, A44/A68/K70, A44/A68/N70, A44/A68/Q70, A44/A68/S70, A44/A68/T70, A44/D68/H70, A44/D68/K70, A44/D68/R70, A44/G68/H70, A44/G68/K70, A44/G68/N70, A44/G68/P70, A44/H68/A70, A44/H68/G70, A44/H68/H70, A44/H68/K70, A44/H68/N70, A44/H68/Q70, A44/H68/S70, A44/H68/T70, A44/K68/A70, A44/K68/G70, A44/K68/H70, A44/K68/N70, A44/K68/Q70, A44/K68/R70, A44/K68/S70, A44/K68/T70, A44/N68/A70, A44/N68/E70, A44/N68/G70, A44/N68/H70, A44/N68/K70, A44/N68/N70, A44/N68/Q70, A44/N68/R70, A44/N68/S70, A44/N68/T70, A44/Q68/A70, A44/Q68/D70, A44/Q68/G70, A44/Q68/H70, A44/Q68/N70, A44/Q68/S70, A44/R68/E70, A44/R68/K70, A44/R68/L70, A44/S68/A70, A44/S68/G70, A44/S68/N70, A44/S68/Q70, A44/S68/R70, A44/S68/S70, A44/S68/T70, A44/T68/A70, A44/T68/G70, A44/T68/H70, A44/T68/N70, A44/T68/Q70, A44/T68/S70, A44/T68/T70, D44/D68/H70, D44/N68/S70, D44/R68/A70, D44/R68/N70, D44/R68/Q70, D44/R68/R70, D44/R68/S70, D44/R68/T70, E44/H68/H70, E44/R68/A70, E44/R68/H70, E44/R68/N70, E44/R68/S70, E44/R68/T70, E44/S68/T70, G44/H68/K70, G44/Q68/H70, G44/R68/Q70, G44/T68/D70, G44/T68/P70, G44/T68/R70, H44/A68/S70, H44/A68/T70, H44/R68/D70, H44/R68/E70, H44/R68/G70, H44/R68/N70, H44/R68/R70, H44/R68/S70, H44/S68/G70, H44/S68/S70, H44/S68/T70, H44/T68/S70, H44/T68/T70, K44/A68/A70, K44/A68/D70, K44/A68/E70, K44/A68/G70, K44/A68/H70, K44/A68/N70, K44/A68/Q70, K44/D68/A70, K44/D68/T70, K44/E68/G70, K44/E68/S70, K44/G68/A70, K44/G68/G70, K44/G68/N70, K44/G68/S70, K44/G68/T70, K44/H68/D70, K44/H68/E70, K44/H68/G70, K44/H68/N70, K44/H68/S70, K44/H68/T70, K44/K68/A70, K44/K68/D70, K44/K68/H70, K44/K68/T70, K44/N68/A70, K44/N68/D70, K44/N68/E70, K44/N68/G70, K44/N68/H70, K44/N68/N70, K44/N68/Q70, K44/N68/S70, K44/N68/T70, K44/P68/H70, K44/Q68/A70, K44/Q68/D70, K44/Q68/E70, K44/Q68/S70, K44/Q68/T70, 
K44/R68/A70, K44/R68/D70, K44/R68/E70, K44/R68/G70, K44/R68/H70, K44/R68/N70, K44/R68/S70, K44/S68/A70, K44/S68/D70, K44/S68/H70, K44/S68/N70, K44/S68/S70, K44/S68/T70, K44/T68/A70, K44/T68/D70, K44/T68/E70, K44/T68/G70, K44/T68/H70, K44/T68/N70, K44/T68/Q70, K44/T68/S70, K44/T68/T70, N44/A68/H70, N44/H68/N70, N44/H68/R70, N44/K68/G70, N44/K68/H70, N44/K68/R70, N44/K68/S70, N44/P68/D70, N44/Q68/H70, N44/R68/A70, N44/R68/D70, N44/R68/E70, N44/R68/K70, N44/S68/G70, N44/S68/H70, N44/S68/K70, N44/S68/R70, N44/T68/H70, N44/T68/K70, N44/T68/Q70, N44/T68/S70, P44/N68/D70, P44/T68/T70, Q44/G68/K70, Q44/G68/R70, Q44/K68/G70, Q44/N68/A70, Q44/N68/H70, Q44/N68/S70, Q44/P68/P70, Q44/Q68/G70, Q44/R68/D70, Q44/R68/E70, Q44/R68/G70, Q44/R68/Q70, Q44/S68/S70, Q44/T68/A70, Q44/T68/G70, Q44/T68/H70, R44/A68/G70, R44/A68/T70, R44/G68/T70, R44/H68/D70, R44/H68/T70, R44/N68/T70, R44/R68/A70, R44/R68/D70, R44/R68/E70, R44/R68/G70, R44/R68/Q70, R44/R68/S70, R44/R68/T70, R44/S68/G70, R44/S68/N70, R44/S68/S70, R44/S68/T70, S44/D68/K70, S44/R68/R70, S44/R68/S70, T44/A68/K70, T44/N68/P70, T44/N68/R70, T44/R68/E70, T44/R68/Q70, and T44/S68/K70.

7. The at least one I-CreI meganuclease variant according to claim 6, wherein said at least one I-CreI meganuclease variant comprises an alanine (A) or an asparagine (N) in position 44 and cleaves a double-strand nucleic acid target comprising nucleotide a in position -4, and/or nucleotide t in position +4.

8. The at least one I-CreI meganuclease variant according to claim 6, wherein said at least one I-CreI meganuclease variant comprises a lysine (K) in position 44 and cleaves a double-strand nucleic acid target comprising nucleotide c in position -4, and/or nucleotide g in position +4.

9. The at least one I-CreI meganuclease variant according to claim 6, wherein said at least one I-CreI meganuclease variant comprises a glutamine (Q) in position 44 and cleaves a double-strand nucleic acid target comprising nucleotide t in position -4, and/or nucleotide a in position +4.

10. The at least one I-CreI meganuclease variant according to claim 6, wherein said at least one I-CreI meganuclease variant is a homodimer.

11. The at least one I-CreI meganuclease variant according to claim 6, wherein said at least one I-CreI meganuclease variant is a heterodimer consisting of two monomers, each of said monomers being selected from a different I-CreI meganuclease variant as defined in claim 6.

12. A polynucleotide encoding the at least one I-CreI meganuclease variant according to claim 6.

13. An expression cassette comprising the polynucleotide according to claim 12 and regulation sequences.

14. An expression vector comprising the expression cassette according to claim 13.

15. The expression vector according to claim 14, wherein said expression vector further comprises a targeting DNA construct.

16. The expression vector according to claim 15, wherein said targeting DNA construct comprises a sequence sharing homologies with a region surrounding a cleavage site of the at least one I-CreI meganuclease variant.

17. The expression vector according to claim 16, wherein said targeting DNA construct comprises: a) sequences sharing homologies with the region surrounding the cleavage site of the at least one I-CreI meganuclease variant, and b) sequences to be introduced flanked by sequence as in a).

18. A modified cell comprising the polynucleotide according to claim 12.

19. A transgenic plant comprising the polynucleotide according to claim 12.

20. A non-human transgenic mammal comprising the polynucleotide according to claim 12.

21. A method of genetic engineering comprising double-strand nucleic acid breaking in a site of interest located on a vector, comprising a DNA target of at least one I-CreI meganuclease variant according to claim 6, by contacting said vector with said at least one I-CreI meganuclease variant, thereby inducing a homologous recombination with another vector presenting homology with a sequence surrounding a cleavage site of said at least one I-CreI meganuclease variant.

22. A method of genome engineering comprising: 1) double-strand breaking a genomic locus comprising at least one recognition and cleavage site of at least one I-CreI meganuclease variant according to claim 6, by contacting said cleavage site with said at least one I-CreI meganuclease variant; and 2) maintaining said broken genomic locus under conditions appropriate for homologous recombination with a targeting DNA construct comprising a sequence to be introduced in said genomic locus, flanked by sequences sharing homologies with said genomic locus.

23. A method of genome engineering comprising: 1) double-strand breaking a genomic locus comprising at least one recognition and cleavage site of at least one I-CreI meganuclease variant according to claim 6, by contacting said cleavage site with said at least one I-CreI meganuclease variant; and 2) maintaining said broken genomic locus under conditions appropriate for homologous recombination with a chromosomal DNA sharing homologies to regions surrounding said cleavage site.

24. A composition comprising said at least one I-CreI meganuclease variant according to claim 6.

25. The composition according to claim 24, wherein said composition further comprises a targeting DNA construct comprising a sequence which repairs a site of interest flanked by sequences sharing homologies with a targeted locus.

26. A modified cell comprising the expression vector according to claim 14.

27. A transgenic plant comprising the expression vector according to claim 14.

28. A non-human transgenic mammal comprising the expression vector according to claim 14.

29. A composition comprising said polynucleotide according to claim 12.

30. The composition according to claim 29, wherein said composition further comprises a targeting DNA construct comprising a sequence which repairs a site of interest flanked by sequences sharing homologies with a targeted locus.

31. A composition comprising said expression vector according to claim 14.

32. The composition according to claim 31, wherein said composition further comprises a targeting DNA construct comprising a sequence which repairs a site of interest flanked by sequences sharing homologies with a targeted locus.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] The present application is a continuation of U.S. Ser. No. 11/908,798, filed on Sep. 17, 2007, which is a National Stage (371) of PCT/IB06/01203, filed on Mar. 15, 2006, which claims priority to PCT/IB05/00981, filed on Mar. 15, 2005, and PCT/IB05/03083, filed on Sep. 19, 2005.

BACKGROUND OF THE INVENTION

[0002] The present invention relates to a method of preparing I-CreI meganuclease variants having a modified cleavage specificity. The invention relates also to the I-CreI meganuclease variants obtainable by said method and to their applications either for cleaving new DNA targets or for genetic engineering and genome engineering for non-therapeutic purposes.

[0003] The invention also relates to nucleic acids encoding said variants, to expression cassettes comprising said nucleic acids, to vectors comprising said expression cassettes, to cells or organisms, plants or animals except humans, transformed by said vectors.

[0004] Meganucleases are sequence specific endonucleases recognizing large (>12 bp; usually 14-40 bp) DNA cleavage sites (Thierry and Dujon, 1992). In the wild, meganucleases are essentially represented by homing endonucleases, generally encoded by mobile genetic elements such as inteins and class I introns (Belfort and Roberts, 1997; Chevalier and Stoddard, 2001). Homing refers to the mobilization of these elements, which relies on DNA double-strand break (DSB) repair, initiated by the endonuclease activity of the meganuclease. Early studies on the HO (Haber, 1998; Klar et al., 1984; Kostriken et al., 1983), I-SceI (Colleaux et al., 1988; Jacquier and Dujon, 1985; Perrin et al., 1993; Plessis et al., 1992) and I-TevI (Bell-Pedersen et al., 1990; Bell-Pedersen et al., 1989; Bell-Pedersen et al., 1991; Mueller et al., 1996) proteins have illustrated the biology of the homing process. On another hand, these studies have also provided a paradigm for the study of DSB repair in living cells.

[0005] General asymmetry of homing endonuclease target sequences contrasts with the characteristic dyad symmetry of most restriction enzyme recognition sites. Several homing endonucleases encoded by introns ORF or inteins have been shown to promote the homing of their respective genetic elements into allelic intronless or inteinless sites. By making a site-specific double-strand break in the intronless or inteinless alleles, these nucleases create recombinogenic ends, which engage in a gene conversion process that duplicates the coding sequence and leads to the insertion of an intron or an intervening sequence at the DNA level.

[0006] Homing endonucleases fall into 4 separate families on the basis of well-conserved amino acid motifs [for review, see Chevalier and Stoddard (Nucleic Acids Research, 2001, 29, 3757-3774)]. One of them is the dodecapeptide family (dodecamer, DOD, D1-D2, LAGLIDADG (SEQ ID NO: 91), P1-P2). This is the largest family of proteins clustered by their most general conserved sequence motif: one or two copies (vast majority) of a twelve-residue sequence: the dodecapeptide. Homing endonucleases with one dodecapeptide (D) are around 20 kDa in molecular mass and act as homodimers. Those with two copies (DD) range from 25 kDa (230 amino acids) to 50 kDa (HO, 545 amino acids) with 70 to 150 residues between each motif and act as monomers. Cleavage is inside the recognition site, leaving a 4 nt staggered cut with 3'OH overhangs. Enzymes that contain a single copy of the LAGLIDADG (SEQ ID NO: 91) motif, such as I-CeuI and I-CreI, act as homodimers and recognize a nearly palindromic homing site.

[0007] The sequence and the structure of the homing endonuclease I-CreI (pdb accession code 1g9y) have been determined (Rochaix J D et al., NAR, 1985, 13, 975-984; Heath P J et al., Nat. Struct. Biol., 1997, 4, 468-476; Wang et al., NAR, 1997, 25, 3767-3776; Jurica et al. Mol. Cell, 1998, 2, 469-476) and structural models using X-ray crystallography have been generated (Heath et al., 1997).

[0008] I-CreI comprises 163 amino acids (pdb accession code 1g9y); said endonuclease cuts as a dimer. The LAGLIDADG (SEQ ID NO: 91) motif corresponds to residues 13 to 21; on either side of the LAGLIDADG (SEQ ID NO: 91) α-helices, a four-stranded β-sheet (positions 21-29; 37-48; 66-70 and 73-78) provides a DNA binding interface that drives the interaction of the protein with the half-site of the target DNA sequence. The dimerization interface involves the two LAGLIDADG (SEQ ID NO: 91) helices as well as other residues.

[0009] The homing site recognized and cleaved by I-CreI is 22-24 bp in length and is a degenerate palindrome (see FIG. 2 of Jurica M S et al, 1998 and SEQ ID NO:65). More precisely, said I-CreI homing site is a semi-palindromic 22 bp sequence, with 7 of 11 bp identical in each half-site (Seligman L M et al., NAR, 2002, 30, 3870-3879).
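The notion of a semi-palindromic site can be illustrated with a minimal sketch, given here only as an aid to reading. It counts the half-site positions at which a double-strand site reads the same on both strands, i.e. where the left half-site matches the reverse complement of the right half-site. The function names and the two 22 bp example sequences are assumptions chosen for illustration; they are not SEQ ID NO:65.

```python
# Illustrative sketch only: names and example sequences are hypothetical.
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (letters a, c, g, t)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.lower()))


def symmetric_matches(site):
    """Count half-site positions where the site equals its own reverse complement.

    A perfect palindrome of even length n scores n/2; a degenerate
    (semi-palindromic) site scores less.
    """
    site = site.lower()
    if len(site) % 2 != 0:
        raise ValueError("site length must be even")
    half = len(site) // 2
    mirrored = reverse_complement(site)
    return sum(1 for a, b in zip(site[:half], mirrored[:half]) if a == b)


# Hypothetical 22 bp perfect palindrome.
print(symmetric_matches("caaaacgtcgtacgacgttttg"))  # 11
# Hypothetical 22 bp degenerate palindrome: 7 of 11 positions symmetric.
print(symmetric_matches("caaaacgtcgtgagacagtttg"))  # 7
```

A score of 7 out of 11 for a 22 bp site corresponds to the degree of symmetry stated above for the I-CreI homing site.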

[0010] The endonuclease-DNA interface has also been described (see FIG. 4 of Jurica M S et al, 1998) and has led to a number of predictions about specific protein-DNA contacts (Seligman L M et al., Genetics, 1997, 147, 1653-1664; Jurica M S et al., 1998; Chevalier B. et al., Biochemistry, 2004, 43, 14015-14026).

[0011] It emerges from said documents that:

[0012] the residues G19, D20, Q47, R51, K98 and D137 are part of the endonucleolytic site of I-CreI;

[0013] homing site sequence must have at least 20 bp to achieve a maximal binding affinity of 0.2 nM;

[0014] sequence-specific contacts are distributed across the entire length of the homing site;

[0015] base-pair substitutions can be tolerated at many different homing site positions, without seriously disrupting homing site binding or cleavage;

[0016] R51 and K98 are located in the enzyme active site and are candidates to act as Lewis acid or to activate a proton donor in the cleavage reaction; mutations in each of these residues have been observed to sharply reduce I-CreI endonucleolytic activity (R51G, K98Q);

[0017] five additional residues, which when mutated abolish I-CreI endonuclease activity are located in or near the enzyme active site (R70A, L39R, L91R, D75G, Q47H).

[0018] These studies have paved the way for a general use of meganuclease for genome engineering. Homologous gene targeting is the most precise way to stably modify a chromosomal locus in living cells, but its low efficiency remains a major drawback. Since meganuclease-induced DSB stimulates homologous recombination up to 10 000-fold, meganucleases are today the best way to improve the efficiency of gene targeting in mammalian cells (Choulika et al., 1995; Cohen-Tannoudji et al., 1998; Donoho et al., 1998; Elliott et al., 1998; Rouet et al., 1994), and to bring it to workable efficiencies in organisms such as plants (Puchta et al., 1993; Puchta et al., 1996) and insects (Rong and Golic, 2000; Rong and Golic, 2001; Rong et al., 2002).

[0019] Meganucleases have been used to induce various kinds of homologous recombination events, such as direct repeat recombination in mammalian cells (Liang et al., 1998), plants (Siebert and Puchta, 2002), insects (Rong et al., 2002), and bacteria (Posfai et al., 1999), or interchromosomal recombination (Moynahan and Jasin, 1997; Puchta, 1999; Richardson et al., 1998).

[0020] However, this technology is still limited by the low number of potential natural target sites for meganucleases: although several hundreds of natural homing endonucleases have been identified (Belfort and Roberts, 1997; Chevalier and Stoddard, 2001), the probability to have a natural meganuclease cleaving a gene of interest is extremely low. The making of artificial meganucleases with dedicated specificities would bypass this limitation.

[0021] Artificial endonucleases with novel specificity have been made, based on the fusion of endonuclease domains to zinc-finger DNA binding domains (Bibikova et al., 2003; Bibikova et al., 2001; Bibikova et al., 2002; Porteus and Baltimore, 2003).

[0022] Homing endonucleases have also been used as scaffolds to make novel endonucleases, either by fusion of different protein domains (Chevalier et al., 2002; Epinat et al., 2003), or by mutation of single specific amino acid residues (Seligman et al., 1997, 2002; Sussman et al., 2004; International PCT Application WO 2004/067736).

[0023] The International PCT Application WO 2004/067736 describes a general method for producing a custom-made meganuclease derived from an initial meganuclease, said meganuclease variant being able to cleave a DNA target sequence which is different from the recognition and cleavage site of the initial meganuclease. This general method comprises the steps of preparing a library of meganuclease variants having mutations at positions contacting the DNA target sequence or interacting directly or indirectly with said DNA target, and selecting the variants able to cleave the DNA target sequence. When the initial meganuclease is the I-CreI N75 protein, a library wherein residues 44, 68 and 70 have been mutated was built and screened against a series of six targets close to the I-CreI natural target site; the screened mutants have altered binding profiles compared to the I-CreI N75 scaffold protein; however, they cleave the I-CreI natural target site.

[0024] Seligman et al., 2002, describe mutations altering the cleavage specificity of I-CreI. More specifically, they have studied the role of the nine amino acids of I-CreI predicted to directly contact the DNA target (Q26, K28, N30, S32, Y33, Q38, Q44, R68 and R70). Among these nine amino acids, seven are thought to interact with nucleotides at symmetrical positions (S32, Y33, N30, Q38, R68, Q44 and R70). Mutants having each of said nine amino acids and a tenth (T140) predicted to participate in a water-mediated interaction, converted to alanine, were constructed and tested in an E. coli-based assay.

[0025] The resulting I-CreI mutants fell into four distinct phenotypic classes in relation to the wild-type homing site:

[0026] S32A and T140A contacts appear least important for homing site recognition,

[0027] N30A, Q38A and Q44A displayed intermediate levels of activity in each assay,

[0028] Q26A, R68A and Y33A are inactive,

[0029] K28A and R70A are inactive and non-toxic.

[0030] It emerges from the results that I-CreI mutants at positions 30, 38, 44, 26, 68, 33, 28 and 70 have a modified behaviour in relation to the wild-type I-CreI homing site.

[0031] As regards the mutations altering the seven symmetrical positions in the I-CreI homing site, it emerges from the obtained results that five of the seven symmetrical positions in each half-site appear to be essential for efficient site recognition in vivo by wild-type I-CreI: 2/21, 3/20, 7/16, 8/15 and 9/14 (corresponding to positions -10/+10, -9/+9, -5/+5, -4/+4 and -3/+3 in SEQ ID NO:65). All mutants altered at these positions were resistant to cleavage by wild-type I-CreI in vivo; however, the in vitro assay using E. coli appears to be more sensitive than the in vivo test and allows the detection of homing sites of wild-type I-CreI more effectively than the in vivo test; thus the in vitro test shows that the DNA target of wild-type I-CreI may be any of the following: gtc (recognized homing site in all the cited documents), gcc or gtt triplet at positions -5 to -3, in reference to SEQ ID NO:65.
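The two numbering schemes used above (base numbers 1 to 22 along the homing site, and signed positions counted outward from the centre of the site) can be related by a short sketch. It assumes a 22 bp site, no position 0, and base 11/base 12 flanking the centre; the function names are hypothetical and serve only to make the correspondence 2/21 = -10/+10, 7/16 = -5/+5, etc. explicit.

```python
# Illustrative sketch only: function names are hypothetical.
def base_to_position(base_number, site_length=22):
    """Convert a 1-based base number into a signed position relative to the
    centre of the site (negative in the left half-site, positive in the right).
    There is no position 0."""
    if site_length % 2 != 0:
        raise ValueError("site length must be even")
    if not 1 <= base_number <= site_length:
        raise ValueError("base number outside the site")
    half = site_length // 2
    if base_number <= half:
        return base_number - half - 1  # e.g. base 2 -> -10
    return base_number - half          # e.g. base 21 -> +10


def position_to_bases(position, site_length=22):
    """Return the pair of base numbers for the symmetrical positions
    -position and +position (position given as a positive integer)."""
    half = site_length // 2
    if not 1 <= position <= half:
        raise ValueError("position outside the half-site")
    return (half + 1 - position, half + position)


for pair in ((2, 21), (3, 20), (7, 16), (8, 15), (9, 14)):
    print(pair, tuple(base_to_position(b) for b in pair))
# (2, 21) (-10, 10)
# (3, 20) (-9, 9)
# (7, 16) (-5, 5)
# (8, 15) (-4, 4)
# (9, 14) (-3, 3)
```

Under the same assumptions, positions -5 to -3 of the target correspond to bases 7 to 9, and positions +3 to +5 to bases 14 to 16.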

[0032] Seligman et al. have also studied the interaction between I-CreI position 33 and homing site bases 2 and 21 (±10) or between I-CreI position 32 and homing site bases 1 and 22 (±11); Y33C, Y33H, Y33R, Y33L, Y33S and Y33T mutants were found to cleave a homing site modified in positions ±10 that is not cleaved by I-CreI (Table 3). On the other hand, S32K and S32R were found to cleave a homing site modified in positions ±11 that is cleaved by I-CreI (Table 3).

[0033] Sussman et al., 2004, report studies in which the homodimeric LAGLIDADG (SEQ ID NO: 91) homing endonuclease I-CreI is altered at positions 26, and eventually 66, or at position 33, contacting the homing site bases in positions ±6 and ±10, respectively. The resulting enzyme constructs (Q26A, Q26C, Y66R, Q26C/Y66R, Y33C, Y33H) drive specific elimination of selected DNA targets in vivo and display shifted specificities of DNA binding and cleavage in vitro.

[0034] The overall result of the selection and characterization of enzyme point mutants against individual target site variants is both a shift and a broadening in binding specificity and in kinetics of substrate cleavage.

[0035] Each mutant displays a higher dissociation constant (lower affinity) against the original wild-type target site than does the wild-type enzyme, and each mutant displays a lower dissociation constant (higher affinity) against its novel target than does the wild-type enzyme.

[0036] The enzyme mutants display similar kinetics of substrate cleavage, with shifts and broadening in substrate preferences similar to those described for binding affinities.

[0037] To reach a larger number of DNA target sequences, it would be extremely valuable to generate new I-CreI variants with novel specificity, i.e., able to cleave DNA targets which are not cleaved by I-CreI or the few variants which have been isolated so far.

[0038] Such variants would be of a particular interest for genetic and genome engineering.

SUMMARY OF THE INVENTION

[0039] Here the inventors have found mutations in positions 44, 68 and 70 of I-CreI which result in variants able to cleave at least one homing site modified in positions ±3 to 5.

[0040] Therefore, the subject-matter of the present invention is a method of preparing a I-CreI meganuclease variant having a modified cleavage specificity, said method comprising:

[0041] (a) replacing amino acids Q44, R68 and/or R70, in reference with I-CreI pdb accession code 1g9y, with an amino acid selected in the group consisting of A, D, E, G, H, K, N, P, Q, R, S, T and Y;

[0042] (b) selecting the I-CreI meganuclease variants obtained in step (a) having at least one of the following R.sub.3 triplet cleaving profile in reference to positions -5 to -3 in a double-strand DNA target, said positions -5 to -3 corresponding to R.sub.3 of the following formula I:

5'- R.sub.1CAAAR.sub.2R.sub.3R.sub.4R'.sub.4R'.sub.3R'.sub.2TTTGR'.sub.1 -3' (SEQ ID NO: 92)

[0043] wherein:

[0044] R.sub.1 is absent or present; and when present represents a nucleic acid fragment comprising 1 to 9 nucleotides corresponding either to a random nucleic acid sequence or to a fragment of a I-CreI meganuclease homing site situated from position -20 to -12 (from 5' to 3'), R.sub.1 corresponding at least to position -12 of said homing site,

[0045] R.sub.2 represents the nucleic acid doublet ac or ct and corresponds to positions -7 to -6 of said homing site,

[0046] R.sub.3 represents a nucleic acid triplet corresponding to said positions -5 to -3, selected among g, t, c and a, except the following triplets: gtc, gcc, gtg, gtt and gct; therefore said nucleic acid triplet is preferably selected among the following triplets: ggg, gga, ggt, ggc, gag, gaa, gat, gac, gta, gcg, gca, tgg, tga, tgt, tgc, tag, taa, tat, tac, ttg, tta, ttt, ttc, tcg, tca, tct, tcc, agg, aga, agt, agc, aag, aaa, aat, aac, atg, ata, att, atc, acg, aca, act, acc, cgg, cga, cgt, cgc, cag, caa, cat, cac, ctg, cta, ctt, ctc, ccg, cca, cct and ccc and more preferably among the following triplets: ggg, ggt, ggc, gag, gat, gac, gta, gcg, gca, tag, taa, tat, tac, ttg, ttt, ttc, tcg, tct, tcc, agg, aag, aat, aac, att, atc, act, acc, cag, cat, cac, ctt, ctc, ccg, cct and ccc,

[0047] R.sub.4 represents the nucleic acid doublet gt or tc and corresponds to positions -2 to -1 of said homing site,

[0048] R'.sub.1 is absent or present; and when present represents a nucleic acid fragment comprising 1 to 9 nucleotides corresponding either to a random nucleic acid sequence or to a fragment of a I-CreI meganuclease homing site situated from position +12 to +20 (from 5' to 3'), R'.sub.1 corresponding at least to position +12 of said homing site,

[0049] R'.sub.2 represents the nucleic acid doublet ag or gt, and corresponds to positions +6 to +7 of said homing site,

[0050] R'.sub.3 represents a nucleic acid triplet corresponding to said positions +3 to +5, selected among g, t, c, and a; R'.sub.3 being different from gac, ggc, cac, aac, and agc, when R.sub.3 and R'.sub.3 are non-palindromic,

[0051] R'.sub.4 represents the nucleic acid doublet ga or ac and corresponds to positions +1 to +2 of said homing site.
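As an illustrative check of the R.sub.3 and R'.sub.3 definitions of formula I, the following Python sketch (not part of the original disclosure; all names are hypothetical) enumerates the 64 possible triplets, removes the five excluded R.sub.3 triplets, and compares the excluded R'.sub.3 triplets with the reverse complements of the excluded R.sub.3 triplets.

```python
from itertools import product

# Watson-Crick complement for lower-case DNA bases.
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def reverse_complement(seq):
    """Return the reverse complement of a lower-case DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Triplets excluded for R3 (positions -5 to -3) and R'3 (positions +3 to +5),
# copied from the definitions of formula I.
EXCLUDED_R3 = {"gtc", "gcc", "gtg", "gtt", "gct"}
EXCLUDED_R3_PRIME = {"gac", "ggc", "cac", "aac", "agc"}

# All 4**3 = 64 triplets over g, t, c and a.
ALL_TRIPLETS = ["".join(t) for t in product("gtca", repeat=3)]
ALLOWED_R3 = [t for t in ALL_TRIPLETS if t not in EXCLUDED_R3]

print(len(ALL_TRIPLETS), len(ALLOWED_R3))
print({reverse_complement(t) for t in EXCLUDED_R3} == EXCLUDED_R3_PRIME)
```

Under these definitions 59 triplets remain for R.sub.3, and each excluded R'.sub.3 triplet is the reverse complement of one excluded R.sub.3 triplet, as expected for the two halves of a palindromic site.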

DETAILED DESCRIPTION OF THE INVENTION

Definitions

[0052] Amino acid residues in a polypeptide sequence are designated herein according to the one-letter code, in which, for example, Q means Gln or Glutamine residue, R means Arg or Arginine residue and D means Asp or Aspartic acid residue.

[0053] In the present invention, unless otherwise mentioned, the residue numbers refer to the amino acid numbering of the I-CreI sequence SWISSPROT P05725 or the pdb accession code 1g9y. According to this definition, a variant named "ADR" is I-CreI meganuclease in which amino acid residues Q44 and R68 have been replaced by alanine and aspartic acid, respectively, while R70 has not been replaced. Other mutations that do not alter the cleavage activity of the variant are not indicated and the nomenclature adopted here does not limit the mutations to the only three positions 44, 68 and 70.
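The three-letter nomenclature can be illustrated with a short Python sketch. It is only an illustration under the assumption that a name lists the residues at positions 44, 68 and 70 in that order; the function names are hypothetical and do not appear in the original text.

```python
# Wild-type I-CreI residues at the three positions named in the text.
WILD_TYPE = {44: "Q", 68: "R", 70: "R"}
POSITIONS = (44, 68, 70)


def parse_variant(name):
    """Map a three-letter variant name such as 'ADR' to {position: residue}."""
    if len(name) != len(POSITIONS):
        raise ValueError("a variant name must have exactly three letters")
    return dict(zip(POSITIONS, name.upper()))


def substitutions(name):
    """List the substitutions relative to wild-type, e.g. 'Q44A'."""
    residues = parse_variant(name)
    return [f"{WILD_TYPE[pos]}{pos}{aa}"
            for pos, aa in residues.items() if aa != WILD_TYPE[pos]]


def slash_notation(name):
    """Write the variant in the 'A44/D68/R70' style used in the variant lists."""
    return "/".join(f"{aa}{pos}" for pos, aa in parse_variant(name).items())


print(substitutions("ADR"))   # ['Q44A', 'R68D']
print(slash_notation("ADR"))  # A44/D68/R70
```

As in the definition above, R70 is unchanged in "ADR", so only two substitutions are reported.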

[0054] Nucleotides are designated as follows: one-letter code is used for designating the base of a nucleoside: a is adenine, t is thymine, c is cytosine, and g is guanine. For the degenerated nucleotides, r represents g or a (purine nucleotides), k represents g or t, s represents g or c, w represents a or t, m represents a or c, y represents t or c (pyrimidine nucleotides), d represents g, a or t, v represents g, a or c, b represents g, t or c, h represents a, t or c, and n represents g, a, t or c.
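The degenerate nucleotide codes listed above can be expressed as a lookup table. The following Python sketch is an illustration only (the names IUPAC and expand are hypothetical); it expands a degenerate pattern into the explicit sequences it represents.

```python
from itertools import product

# One-letter codes as defined in the paragraph above.
IUPAC = {
    "a": "a", "t": "t", "c": "c", "g": "g",
    "r": "ga", "k": "gt", "s": "gc", "w": "at", "m": "ac", "y": "tc",
    "d": "gat", "v": "gac", "b": "gtc", "h": "atc", "n": "gatc",
}


def expand(pattern):
    """Return every explicit sequence matched by a degenerate pattern."""
    choices = [IUPAC[code] for code in pattern.lower()]
    return ["".join(combo) for combo in product(*choices)]


print(sorted(expand("gtn")))  # ['gta', 'gtc', 'gtg', 'gtt']
print(len(expand("nnn")))     # 64
```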

[0055] In the present application, when a sequence is given for illustrating a recognition or homing site, it is to be understood that it represents, from 5' to 3', only one strand of the double-stranded polynucleotide.

[0056] The terms "partially palindromic sequence", "partially symmetrical sequence", "degenerate palindrome" and "pseudopalindromic sequence" are used interchangeably for designating a palindromic sequence having a broken symmetry. For example the 22 bp sequence: c.sub.-11a.sub.-10a.sub.-9a.sub.-8a.sub.-7c.sub.-6g.sub.-5t.sub.-4c.sub.-3g.sub.-2t.sub.-1g.sub.+1a.sub.+2g.sub.+3a.sub.+4c.sub.+5a.sub.+6g.sub.+7t.sub.+8t.sub.+9t.sub.+10g.sub.+11 (SEQ ID NO: 71) is a partially palindromic sequence in which symmetry is broken at base-pairs +/-1, 2, 6 and 7. According to another formulation, nucleotide sequences of positions +/-8 to 11 and +/-3 to 5 are palindromic sequences. The symmetry axis is situated between the base-pairs in positions -1 and +1. Using another numbering, from the 5' extremity to the 3' extremity, palindromic sequences are in positions 1 to 4 and 19 to 22, and 7 to 9 and 14 to 16, symmetry is broken at base-pairs 5, 6, 10, 11, 12, 13, 17 and 18, and the symmetry axis is situated between the base-pairs in positions 11 and 12.
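The positions of broken symmetry stated above can be checked with a few lines of Python. This is an illustrative sketch only; the function names are hypothetical, and the sequence used is the 22-nucleotide I-CreI site of SEQ ID NO: 71, written 5' to 3'.

```python
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def broken_symmetry_positions(seq):
    """Return 1-based positions at which seq differs from its reverse complement."""
    n = len(seq)
    return [i + 1 for i in range(n) if seq[i] != COMPLEMENT[seq[n - 1 - i]]]


def to_signed(position, length):
    """Convert 1-based numbering to the -n..-1, +1..+n numbering (no position 0)."""
    half = length // 2
    return position - half - 1 if position <= half else position - half


site = "caaaacgtcgtgagacagtttg"
broken = broken_symmetry_positions(site)
print(broken)                                     # [5, 6, 10, 11, 12, 13, 17, 18]
print([to_signed(p, len(site)) for p in broken])  # [-7, -6, -2, -1, 1, 2, 6, 7]
```

A fully palindromic sequence returns an empty list, which gives a quick test for the "palindromic" versus "partially palindromic" distinction used here.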

[0057] As used herein, the term "wild-type I-CreI" designates a I-CreI meganuclease having the sequence SWISSPROT P05725 or pdb accession code 1g9y.

[0058] The terms "recognition site", "recognition sequence", "target", "target sequence", "DNA target", "homing recognition site", "homing site", "cleavage site" are indiscriminately used for designating a 14 to 40 bp double-stranded, palindromic, non-palindromic or partially palindromic polynucleotide sequence that is recognized and cleaved by a meganuclease. These terms refer to a distinct DNA location, preferably a chromosomal location, at which a double stranded break (cleavage) is to be induced by the meganuclease.

[0059] For example, the known homing recognition site of wild-type I-CreI is represented by the 22 bp sequence 5'-caaaacgtcgtgagacagtttg-3' (SEQ ID NO: 71) or the 24 bp sequence 5'-tcaaaacgtcgtgagacagtttgg-3' presented in FIG. 2A (here named C1234, SEQ ID NO: 65; gtc in positions -5 to -3 and gac in positions +3 to +5). This particular site is hereafter also named "I-CreI natural target site". From the natural target can be derived two palindromic sequences by mutation of the nucleotides in positions +1, +2, +6, and +7 or -1, -2, -6 and -7: C1221 (SEQ ID NO: 12) and C4334 (SEQ ID NO: 66), presented in FIG. 2A. Both have gtc in positions -5 to -3 and gac in positions +3 to +5, and are cut by I-CreI, in vitro and in yeast.
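The derivation of palindromic targets from the natural site can be sketched in Python as follows. This is a hedged illustration, not the sequences of FIG. 2A: it assumes that each palindrome is obtained by mirroring one half of the 22 bp site of SEQ ID NO: 71 onto the other half, and the function names are hypothetical.

```python
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def mirror_left(seq):
    """Palindrome built from the left half (changes only the right half)."""
    half = seq[: len(seq) // 2]
    return half + reverse_complement(half)


def mirror_right(seq):
    """Palindrome built from the right half (changes only the left half)."""
    half = seq[len(seq) // 2:]
    return reverse_complement(half) + half


def differing_positions(a, b):
    """1-based positions at which two equal-length sequences differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


natural = "caaaacgtcgtgagacagtttg"
left_palindrome = mirror_left(natural)
right_palindrome = mirror_right(natural)
print(left_palindrome, differing_positions(natural, left_palindrome))
print(right_palindrome, differing_positions(natural, right_palindrome))
```

With this assumption the first palindrome differs from the natural site only at positions 12, 13, 17 and 18 (that is, +1, +2, +6 and +7) and the second only at positions 5, 6, 10 and 11 (-7, -6, -2 and -1), and both keep gtc in positions -5 to -3.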

[0060] The term "modified specificity" relates to a meganuclease variant able to cleave a homing site that is not cleaved, in the same conditions by the initial meganuclease (scaffold protein) it is derived from; said initial or scaffold protein may be the wild-type meganuclease or a mutant thereof.

[0061] Indeed, when using an in vivo assay in a yeast strain, the Inventors found that wild-type I-CreI cleaves not only homing sites wherein the palindromic sequence in positions -5 to -3 is gtc (as in C1234, C1221 or C4334), but also gcc, gac, ggc, atc, ctc and ttc (FIG. 9a).

[0062] The I-CreI D75N mutant (I-CreI N75), which may also be used as scaffold protein for making variants with novel specificity, cleaves not only homing sites wherein the palindromic sequence in positions -5 to -3 is gtc, but also gcc, gtt, gtg, or gct (FIGS. 8 and 9a).

[0063] Heterodimeric form may be obtained for example by proceeding to the fusion of the two monomers. Resulting heterodimeric meganuclease is able to cleave at least one target site that is not cleaved by the homodimeric form. Therefore a meganuclease variant is still part of the invention when used in a heteromeric form. The other monomer chosen for the formation of the heterodimeric meganuclease may be another variant monomer, but it may also be a wild-type monomer, for example a I-CreI monomer or a I-DmoI monomer.

[0064] Thus, the inventors constructed a library of I-CreI variants from a I-CreI scaffold protein (I-CreI D75N), each of them presenting at least one mutation in the amino acid residues in positions 44, 68 and/or 70 (pdb code 1g9y), and each of them being able to cleave at least one target site not cleaved by the I-CreI scaffold protein.

[0065] In this particular approach, the mutation consists of the replacement of at least one amino acid residue in position 44, 68, and/or 70 by another residue selected in the group comprising A, D, E, G, H, K, N, P, Q, R, S, T and Y. Each mutated amino acid residue is changed independently from the other residues, and the selected amino acid residues may be the same or may be different from the other amino acid residues in position 44, 68 and/or 70. In this approach, the homing site, cleaved by the I-CreI meganuclease variant according to the invention but not cleaved by the I-CreI scaffold protein, is the same as described above and illustrated in FIG. 2, except that the triplet sequence in positions -5 to -3 (corresponding to R.sub.3 in formula I) and/or triplet sequence in positions +3 to +5 (corresponding to R.sub.3' in formula I) differ from the triplet sequence in the same positions in the homing sites cleaved by the I-CreI scaffold protein.
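To give a sense of the combinatorial space described above, the following Python sketch counts the residue combinations at positions 44, 68 and 70. It is an illustration only: it assumes that each of the three positions may independently take any of the 13 listed residues, it says nothing about which combinations are actually present or active in the library, and the variable names are hypothetical.

```python
from itertools import product

# The 13 replacement residues listed in the text.
RESIDUES = "ADEGHKNPQRSTY"

# Residues of the scaffold at positions 44, 68 and 70 (Q44, R68, R70).
SCAFFOLD = ("Q", "R", "R")

# Every combination of residues at the three positions.
combinations = list(product(RESIDUES, repeat=3))

# Combinations that differ from the scaffold at one or more positions.
mutated = [combo for combo in combinations if combo != SCAFFOLD]

print(len(combinations))  # 2197
print(len(mutated))       # 2196
print("/".join(f"{aa}{pos}" for aa, pos in zip(mutated[0], (44, 68, 70))))  # A44/A68/A70
```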

[0066] Unexpectedly, the I-CreI meganuclease variants, obtainable by the method described above, i.e. with a "modified specificity", are able to cleave at least one target that differs from the I-CreI scaffold protein target in positions -5 to -3 and/or in positions +3 to +5. It must be noted that said DNA target is not necessarily palindromic in positions +/-3 to 5. I-CreI is active in homodimeric form, but may be active in a heterodimeric form. Therefore I-CreI variants according to the instant invention could be active not only in a homodimeric form, but also in a heterodimeric form, and in both cases, they could recognize a target with either palindromic or non palindromic sequence in position +/-3 to 5, provided that when the I-CreI N75 protein is used as scaffold, the triplet in position -5 to -3 and/or +3 to +5 differs from gtc, gcc, gtg, gtt and gct, and from gac, ggc, cac, aac, and agc, respectively. Since each monomer of I-CreI variant binds a half of the homing site, a variant able to cleave a plurality of targets could also cleave a target whose sequence in position +/-3 to 5 is not palindromic. Further, a variant could act both in a homodimeric form and in a heterodimeric form. A I-CreI variant could form a heterodimeric meganuclease, in which the other monomer could be a wild-type I-CreI monomer, another wild-type meganuclease monomer, such as I-DmoI, another I-CreI variant monomer, or a monomer of a variant from another meganuclease than I-CreI.

[0067] According to an advantageous embodiment of said method, the I-CreI meganuclease variant obtained in step (b) is selected from the group consisting of: A44/A68/A70, A44/A68/G70, A44/A68/H70, A44/A68/K70, A44/A68/N70, A44/A68/Q70, A44/A68/R70, A44/A68/S70, A44/A68/T70, A44/D68/H70, A44/D68/K70, A44/D68/R70, A44/G68/H70, A44/G68/K70, A44/G68/N70, A44/G68/P70, A44/G68/R70, A44/H68/A70, A44/H68/G70, A44/H68/H70, A44/H68/K70, A44/H68/N70, A44/H68/Q70, A44/H68/R70, A44/H68/S70, A44/H68/T70, A44/K68/A70, A44/K68/G70, A44/K68/H70, A44/K68/K70, A44/K68/N70, A44/K68/Q70, A44/K68/R70, A44/K68/S70, A44/K68/T70, A44/N68/A70, A44/N68/E70, A44/N68/G70, A44/N68/H70, A44/N68/K70, A44/N68/N70, A44/N68/Q70, A44/N68/R70, A44/N68/S70, A44/N68/T70, A44/Q68/A70, A44/Q68/D70, A44/Q68/G70, A44/Q68/H70, A44/Q68/N70, A44/Q68/R70, A44/Q68/S70, A44/R68/A70, A44/R68/D70, A44/R68/E70, A44/R68/G70, A44/R68/H70, A44/R68/K70, A44/R68/L70, A44/R68/N70, A44/R68/R70, A44/R68/S70, A44/R68/T70, A44/S68/A70, A44/S68/G70, A44/S68/K70, A44/S68/N70, A44/S68/Q70, A44/S68/R70, A44/S68/S70, A44/S68/T70, A44/T68/A70, A44/T68/G70, A44/T68/H70, A44/T68/K70, A44/T68/N70, A44/T68/Q70, A44/T68/R70, A44/T68/S70, A44/T68/T70, D44/D68/H70, D44/N68/S70, D44/R68/A70, D44/R68/K70, D44/R68/N70, D44/R68/Q70, D44/R68/R70, D44/R68/S70, D44/R68/T70, E44/H68/H70, E44/R68/A70, E44/R68/H70, E44/R68/N70, E44/R68/S70, E44/R68/T70, E44/S68/T70, G44/H68/K70, G44/Q68/H70, G44/R68/Q70, G44/R68/R70, G44/T68/D70, G44/T68/P70, G44/T68/R70, H44/A68/S70, H44/A68/T70, H44/R68/A70, H44/R68/D70, H44/R68/E70, H44/R68/G70, H44/R68/N70, H44/R68/R70, H44/R68/S70, H44/R68/T70, H44/S68/G70, H44/S68/S70, H44/S68/T70, H44/T68/S70, H44/T68/T70, K44/A68/A70, K44/A68/D70, K44/A68/E70, K44/A68/G70, K44/A68/H70, K44/A68/N70, K44/A68/Q70, K44/A68/S70, K44/A68/T70, K44/D68/A70, K44/D68/T70, K44/E68/G70, K44/E68/N70, K44/E68/S70, K44/G68/A70, K44/G68/G70, K44/G68/N70, K44/G68/S70, K44/G68/T70, K44/H68/D70, K44/H68/E70, K44/H68/G70, K44/H68/N70, 
K44/H68/S70, K44/H68/T70, K44/K68/A70, K44/K68/D70, K44/K68/H70, K44/K68/T70, K44/N68/A70, K44/N68/D70, K44/N68/E70, K44/N68/G70, K44/N68/H70, K44/N68/N70, K44/N68/Q70, K44/N68/S70, K44/N68/T70, K44/P68/H70, K44/Q68/A70, K44/Q68/D70, K44/Q68/E70, K44/Q68/S70, K44/Q68/T70, K44/R68/A70, K44/R68/D70, K44/R68/E70, K44/R68/G70, K44/R68/H70, K44/R68/N70, K44/R68/Q70, K44/R68/S70, K44/R68/T70, K44/S68/A70, K44/S68/D70, K44/S68/H70, K44/S68/N70, K44/S68/S70, K44/S68/T70, K44/T68/A70, K44/T68/D70, K44/T68/E70, K44/T68/G70, K44/T68/H70, K44/T68/N70, K44/T68/Q70, K44/T68/S70, K44/T68/T70, N44/A68/H70, N44/A68/R70, N44/H68/N70, N44/H68/R70, N44/K68/G70, N44/K68/H70, N44/K68/R70, N44/K68/S70, N44/N68/R70, N44/P68/D70, N44/Q68/H70, N44/Q68/R70, N44/R68/A70, N44/R68/D70, N44/R68/E70, N44/R68/G70, N44/R68/H70, N44/R68/K70, N44/R68/N70, N44/R68/R70, N44/R68/S70, N44/R68/T70, N44/S68/G70, N44/S68/H70, N44/S68/K70, N44/S68/R70, N44/T68/H70, N44/T68/K70, N44/T68/Q70, N44/T68/R70, N44/T68/S70, P44/N68/D70, P44/T68/T70, Q44/A68/A70, Q44/A68/H70, Q44/A68/R70, Q44/G68/K70, Q44/G68/R70, Q44/K68/G70, Q44/N68/A70, Q44/N68/H70, Q44/N68/S70, Q44/P68/P70, Q44/Q68/G70, Q44/R68/A70, Q44/R68/D70, Q44/R68/E70, Q44/R68/G70, Q44/R68/H70, Q44/R68/N70, Q44/R68/Q70, Q44/R68/S70, Q44/S68/H70, Q44/S68/R70, Q44/S68/S70, Q44/T68/A70, Q44/T68/G70, Q44/T68/H70, Q44/T68/R70, R44/A68/G70, R44/A68/T70, R44/G68/T70, R44/H68/D70, R44/H68/T70, R44/N68/T70, R44/R68/A70, R44/R68/D70, R44/R68/E70, R44/R68/G70, R44/R68/N70, R44/R68/Q70, R44/R68/S70, R44/R68/T70, R44/S68/G70, R44/S68/N70, R44/S68/S70, R44/S68/T70, S44/D68/K70, S44/H68/R70, S44/R68/G70, S44/R68/N70, S44/R68/R70, S44/R68/S70, T44/A68/K70, T44/A68/R70, T44/H68/R70, T44/K68/R70, T44/N68/P70, T44/N68/R70, T44/Q68/K70, T44/Q68/R70, T44/R68/A70, T44/R68/D70, T44/R68/E70, T44/R68/G70, T44/R68/H70, T44/R68/K70, T44/R68/N70, T44/R68/Q70, T44/R68/R70, T44/R68/S70, T44/R68/T70, T44/S68/K70, T44/S68/R70, T44/T68/K70, and T44/T68/R70.

[0068] According to another advantageous embodiment of said method, the step (b) of selecting said I-CreI meganuclease variant is performed in vivo in yeast cells.

[0069] The subject-matter of the present invention is also the use of a I-CreI meganuclease variant as defined here above, i.e. obtainable by the method as described above, in vitro or in vivo for non-therapeutic purposes, for cleaving a double-strand nucleic acid target comprising at least a 20-24 bp partially palindromic sequence, wherein at least the sequence in positions +/-8 to 11 is palindromic, and the nucleotide triplet in positions -5 to -3 and/or the nucleotide triplet in positions +3 to +5 differs from gtc, gcc, gtg, gtt, and gct, and from gac, ggc, cac, aac and agc, respectively. Formula I describes such a DNA target.
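The structure of formula I can be illustrated with a regular expression. The Python sketch below is not part of the original disclosure: it covers only the 22-nucleotide core of formula I (R.sub.1 and R'.sub.1 absent), the names are hypothetical, and the "and/or" condition on the two triplets is read as requiring that one or both of them fall outside the excluded sets.

```python
import re

# Core of formula I: CAAA R2 R3 R4 R'4 R'3 R'2 TTTG, written in lower case.
CORE = re.compile(
    r"^caaa(ac|ct)([acgt]{3})(gt|tc)(ga|ac)([acgt]{3})(ag|gt)tttg$"
)

EXCLUDED_R3 = {"gtc", "gcc", "gtg", "gtt", "gct"}
EXCLUDED_R3_PRIME = {"gac", "ggc", "cac", "aac", "agc"}


def parse_core(seq):
    """Split a 22-nucleotide target into the R groups of formula I, or return None."""
    match = CORE.match(seq.lower())
    if match is None:
        return None
    r2, r3, r4, r4p, r3p, r2p = match.groups()
    return {"R2": r2, "R3": r3, "R4": r4, "R'4": r4p, "R'3": r3p, "R'2": r2p}


def is_modified_target(seq):
    """True if the target fits the core and has a non-excluded triplet on either side."""
    parts = parse_core(seq)
    if parts is None:
        return False
    return parts["R3"] not in EXCLUDED_R3 or parts["R'3"] not in EXCLUDED_R3_PRIME


print(parse_core("caaaacgtcgtgagacagtttg"))
print(is_modified_target("caaaacgtcgtgagacagtttg"))  # False: gtc and gac are excluded
```

The natural I-CreI site parses as R.sub.2 = ac, R.sub.3 = gtc, R.sub.4 = gt, R'.sub.4 = ga, R'.sub.3 = gac and R'.sub.2 = ag, and is therefore not a modified target in the sense of this sketch.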

[0070] According to an advantageous embodiment of said use, said I-CreI meganuclease variant is selected from the group consisting of: A44/A68/A70, A44/A68/G70, A44/A68/H70, A44/A68/K70, A44/A68/N70, A44/A68/Q70, A44/A68/R70, A44/A68/S70, A44/A68/T70, A44/D68/H70, A44/D68/K70, A44/D68/R70, A44/G68/H70, A44/G68/K70, A44/G68/N70, A44/G68/P70, A44/G68/R70, A44/H68/A70, A44/H68/G70, A44/H68/H70, A44/H68/K70, A44/H68/N70, A44/H68/Q70, A44/H68/R70, A44/H68/S70, A44/H68/T70, A44/K68/A70, A44/K68/G70, A44/K68/H70, A44/K68/K70, A44/K68/N70, A44/K68/Q70, A44/K68/R70, A44/K68/S70, A44/K68/T70, A44/N68/A70, A44/N68/E70, A44/N68/G70, A44/N68/H70, A44/N68/K70, A44/N68/N70, A44/N68/Q70, A44/N68/R70, A44/N68/S70, A44/N68/T70, A44/Q68/A70, A44/Q68/D70, A44/Q68/G70, A44/Q68/H70, A44/Q68/N70, A44/Q68/R70, A44/Q68/S70, A44/R68/A70, A44/R68/D70, A44/R68/E70, A44/R68/G70, A44/R68/H70, A44/R68/K70, A44/R68/L70, A44/R68/N70, A44/R68/R70, A44/R68/S70, A44/R68/T70, A44/S68/A70, A44/S68/G70, A44/S68/K70, A44/S68/N70, A44/S68/Q70, A44/S68/R70, A44/S68/S70, A44/S68/T70, A44/T68/A70, A44/T68/G70, A44/T68/H70, A44/T68/K70, A44/T68/N70, A44/T68/Q70, A44/T68/R70, A44/T68/S70, A44/T68/T70, D44/D68/H70, D44/N68/S70, D44/R68/A70, D44/R68/K70, D44/R68/N70, D44/R68/Q70, D44/R68/R70, D44/R68/S70, D44/R68/T70, E44/H68/H70, E44/R68/A70, E44/R68/H70, E44/R68/N70, E44/R68/S70, E44/R68/T70, E44/S68/T70, G44/H68/K70, G44/Q68/H70, G44/R68/Q70, G44/R68/R70, G44/T68/D70, G44/T68/P70, G44/T68/R70, H44/A68/S70, H44/A68/T70, H44/R68/A70, H44/R68/D70, H44/R68/E70, H44/R68/G70, H44/R68/N70, H44/R68/R70, H44/R68/S70, H44/R68/T70, H44/S68/G70, H44/S68/S70, H44/S68/T70, H44/T68/S70, H44/T68/T70, K44/A68/A70, K44/A68/D70, K44/A68/E70, K44/A68/G70, K44/A68/H70, K44/A68/N70, K44/A68/Q70, K44/A68/S70, K44/A68/T70, K44/D68/A70, K44/D68/T70, K44/E68/G70, K44/E68/N70, K44/E68/S70, K44/G68/A70, K44/G68/G70, K44/G68/N70, K44/G68/S70, K44/G68/T70, K44/H68/D70, K44/H68/E70, K44/H68/G70, K44/H68/N70, K44/H68/S70, K44/H68/T70, 
K44/K68/A70, K44/K68/D70, K44/K68/H70, K44/K68/T70, K44/N68/A70, K44/N68/D70, K44/N68/E70, K44/N68/G70, K44/N68/H70, K44/N68/N70, K44/N68/Q70, K44/N68/S70, K44/N68/T70, K44/P68/H70, K44/Q68/A70, K44/Q68/D70, K44/Q68/E70, K44/Q68/S70, K44/Q68/T70, K44/R68/A70, K44/R68/D70, K44/R68/E70, K44/R68/G70, K44/R68/H70, K44/R68/N70, K44/R68/Q70, K44/R68/S70, K44/R68/T70, K44/S68/A70, K44/S68/D70, K44/S68/H70, K44/S68/N70, K44/S68/S70, K44/S68/T70, K44/T68/A70, K44/T68/D70, K44/T68/E70, K44/T68/G70, K44/T68/H70, K44/T68/N70, K44/T68/Q70, K44/T68/S70, K44/T68/T70, N44/A68/H70, N44/A68/R70, N44/H68/N70, N44/H68/R70, N44/K68/G70, N44/K68/H70, N44/K68/R70, N44/K68/S70, N44/N68/R70, N44/P68/D70, N44/Q68/H70, N44/Q68/R70, N44/R68/A70, N44/R68/D70, N44/R68/E70, N44/R68/G70, N44/R68/H70, N44/R68/K70, N44/R68/N70, N44/R68/R70, N44/R68/S70, N44/R68/T70, N44/S68/G70, N44/S68/H70, N44/S68/K70, N44/S68/R70, N44/T68/H70, N44/T68/K70, N44/T68/Q70, N44/T68/R70, N44/T68/S70, P44/N68/D70, P44/T68/T70, Q44/A68/A70, Q44/A68/H70, Q44/A68/R70, Q44/G68/K70, Q44/G68/R70, Q44/K68/G70, Q44/N68/A70, Q44/N68/H70, Q44/N68/S70, Q44/P68/P70, Q44/Q68/G70, Q44/R68/A70, Q44/R68/D70, Q44/R68/E70, Q44/R68/G70, Q44/R68/H70, Q44/R68/N70, Q44/R68/Q70, Q44/R68/S70, Q44/S68/H70, Q44/S68/R70, Q44/S68/S70, Q44/T68/A70, Q44/T68/G70, Q44/T68/H70, Q44/T68/R70, R44/A68/G70, R44/A68/T70, R44/G68/T70, R44/H68/D70, R44/H68/T70, R44/N68/T70, R44/R68/A70, R44/R68/D70, R44/R68/E70, R44/R68/G70, R44/R68/N70, R44/R68/Q70, R44/R68/S70, R44/R68/T70, R44/S68/G70, R44/S68/N70, R44/S68/S70, R44/S68/T70, S44/D68/K70, S44/H68/R70, S44/R68/G70, S44/R68/N70, S44/R68/R70, S44/R68/S70, T44/A68/K70, T44/A68/R70, T44/H68/R70, T44/K68/R70, T44/N68/P70, T44/N68/R70, T44/Q68/K70, T44/Q68/R70, T44/R68/A70, T44/R68/D70, T44/R68/E70, T44/R68/G70, T44/R68/H70, T44/R68/K70, T44/R68/N70, T44/R68/Q70, T44/R68/R70, T44/R68/S70, T44/R68/T70, T44/S68/K70, T44/S68/R70, T44/T68/K70, and T44/T68/R70.

[0071] According to another advantageous embodiment of said use, the I-CreI meganuclease variant is a homodimer.

[0072] According to another advantageous embodiment of said use, said I-CreI meganuclease variant is a heterodimer.

[0073] Said heterodimer may be either a single-chain chimeric molecule consisting of the fusion of two different I-CreI variants as defined in the present invention or of I-CreI scaffold protein with a I-CreI variant as defined in the present invention. Alternatively, said heterodimer may consist of two separate monomers chosen from two different I-CreI variants as defined in the present invention or I-CreI scaffold protein and a I-CreI variant as defined in the present invention.

[0074] According to said use:

[0075] either the I-CreI meganuclease variant is able to cleave a DNA target in which sequence in positions +/-3 to 5 is palindromic,

[0076] or, said I-CreI meganuclease variant is able to cleave a DNA target in which sequence in positions +/-3 to 5 is non-palindromic.

[0077] According to another advantageous embodiment of said use the cleaved nucleic acid target is a DNA target in which palindromic sequences in positions -11 to -8 and +8 to +11 are caaa and tttg, respectively.

[0078] According to another advantageous embodiment of said use, said I-CreI meganuclease variant further comprises a mutation in position 75, preferably a mutation in an uncharged amino acid, more preferably an asparagine or a valine (D75N or D75V).

[0079] According to yet another advantageous embodiment of said use, said I-CreI meganuclease variant has an alanine (A) or an asparagine (N) in position 44, for cleaving a DNA target comprising nucleotide a in position -4, and/or t in position +4.

[0080] According to yet another advantageous embodiment of said use, said I-CreI meganuclease variant has a glutamine (Q) in position 44, for cleaving a DNA target comprising nucleotide t in position -4 or a in position +4.

[0081] According to yet another advantageous embodiment of said use, said I-CreI meganuclease variant has a lysine (K) in position 44, for cleaving a target comprising nucleotide c in position -4, and/or g in position +4.
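The three embodiments above relate the residue in position 44 to the nucleotide in positions -4 and +4 of the target. They can be summarised as a lookup table, sketched here in Python for illustration only; the names are hypothetical, and residues not mentioned in these embodiments simply return None.

```python
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}

# Residue in position 44 -> nucleotide in position -4, as stated in the text.
BASE_AT_MINUS_4 = {"A": "a", "N": "a", "Q": "t", "K": "c"}


def target_bases_for_residue_44(residue):
    """Return the nucleotides at -4 and +4 associated with a residue in position 44."""
    base = BASE_AT_MINUS_4.get(residue.upper())
    if base is None:
        return None
    # The +4 nucleotide stated in the text is the complement of the -4 nucleotide.
    return {"-4": base, "+4": COMPLEMENT[base]}


for residue in "ANQK":
    print(residue, target_bases_for_residue_44(residue))
```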

[0082] The subject-matter of the present invention is also I-CreI meganuclease variants:

[0083] Obtainable by the method of preparation as defined above;

[0084] Having one mutation of at least one of the amino acid residues in positions 44, 68 and 70 of I-CreI; said mutations may be the only ones within the amino acids contacting directly the DNA target; and

[0085] Having a modified cleavage specificity in positions +/-3 to 5.

[0086] Such novel I-CreI meganucleases may be used either as very specific endonucleases in in vitro digestion, for restriction or mapping use, or in vivo or ex vivo as tools for genome engineering. In addition, each one can be used as a new scaffold for a second round of mutagenesis and selection/screening, for the purpose of making novel, second generation homing endonucleases.

[0087] The I-CreI meganuclease variants according to the invention are mutated only at positions 44, 68 and/or 70 of the DNA binding domain. However, the instant invention also includes different proteins able to form heterodimers: heterodimerization of two different proteins from the above list also results in cleavage of non palindromic sequences, made of two halves from the sites cleaved by the parental proteins alone. This can be obtained in vitro by adding the two different I-CreI variants in the reaction buffer, and in vivo or ex vivo by coexpression. Another possibility is to build a single-chain molecule, as described by Epinat et al. (Epinat et al., 2003). This single chain molecule would be the fusion of two different I-CreI variants, and should also result in the cleavage of chimeric, non-palindromic sequences.
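The notion of a non-palindromic target "made of two halves" of the parental sites can be sketched as follows. This Python example is purely illustrative: the two parental targets are hypothetical palindromes built around the triplets ggg and tac, no cleavage data is implied, and the function names are not from the original text.

```python
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def palindromic_target(r3):
    """Hypothetical 22-nucleotide palindrome: caaa + ac + R3 + gt and its mirror."""
    left_half = "caaa" + "ac" + r3 + "gt"
    return left_half + reverse_complement(left_half)


def chimeric_target(target_a, target_b):
    """Join the left half of target_a to the right half of target_b."""
    if len(target_a) != len(target_b) or len(target_a) % 2:
        raise ValueError("targets must have the same, even length")
    half = len(target_a) // 2
    return target_a[:half] + target_b[half:]


site_a = palindromic_target("ggg")
site_b = palindromic_target("tac")
chimera = chimeric_target(site_a, site_b)
print(site_a)
print(site_b)
print(chimera, chimera == reverse_complement(chimera))
```

The chimeric sequence keeps the palindromic positions +/-8 to 11 but is no longer its own reverse complement, which is the kind of target a heterodimer of two different variants would be expected to address.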

[0088] According to an advantageous embodiment of said I-CreI meganuclease variant, the amino acid residue chosen for the replacement of the amino acid in positions 44, 68 and/or 70 is selected in the group comprising A, D, E, G, H, K, N, P, Q, R, S, T and Y.

[0089] According to another advantageous embodiment, said I-CreI meganuclease variant is selected in the group consisting of: A44/A68/A70, A44/A68/G70, A44/A68/H70, A44/A68/K70, A44/A68/N70, A44/A68/Q70, A44/A68/S70, A44/A68/T70, A44/D68/H70, A44/D68/K70, A44/D68/R70, A44/G68/H70, A44/G68/K70, A44/G68/N70, A44/G68/P70, A44/H68/A70, A44/H68/G70, A44/H68/H70, A44/H68/K70, A44/H68/N70, A44/H68/Q70, A44/H68/S70, A44/H68/T70, A44/K68/A70, A44/K68/G70, A44/K68/H70, A44/K68/N70, A44/K68/Q70, A44/K68/R70, A44/K68/S70, A44/K68/T70, A44/N68/A70, A44/N68/E70, A44/N68/G70, A44/N68/H70, A44/N68/K70, A44/N68/N70, A44/N68/Q70, A44/N68/R70, A44/N68/S70, A44/N68/T70, A44/Q68/A70, A44/Q68/D70, A44/Q68/G70, A44/Q68/H70, A44/Q68/N70, A44/Q68/S70, A44/R68/E70, A44/R68/K70, A44/R68/L70, A44/S68/A70, A44/S68/G70, A44/S68/N70, A44/S68/Q70, A44/S68/R70, A44/S68/S70, A44/S68/T70, A44/T68/A70, A44/T68/G70, A44/T68/H70, A44/T68/N70, A44/T68/Q70, A44/T68/S70, A44/T68/T70, D44/D68/H70, D44/N68/S70, D44/R68/A70, D44/R68/N70, D44/R68/Q70, D44/R68/R70, D44/R68/S70, D44/R68/T70, E44/H68/H70, E44/R68/A70, E44/R68/H70, E44/R68/N70, E44/R68/S70, E44/R68/T70, E44/S68/T70, G44/H68/K70, G44/Q68/H70, G44/R68/Q70, G44/T68/D70, G44/T68/P70, G44/T68/R70, H44/A68/S70, H44/A68/T70, H44/R68/D70, H44/R68/E70, H44/R68/G70, H44/R68/N70, H44/R68/R70, H44/R68/S70, H44/S68/G70, H44/S68/S70, H44/S68/T70, H44/T68/S70, H44/T68/T70, K44/A68/A70, K44/A68/D70, K44/A68/E70, K44/A68/G70, K44/A68/H70, K44/A68/N70, K44/A68/Q70, K44/D68/A70, K44/D68/T70, K44/E68/G70, K44/E68/S70, K44/G68/A70, K44/G68/G70, K44/G68/N70, K44/G68/S70, K44/G68/T70, K44/H68/D70, K44/H68/E70, K44/H68/G70, K44/H68/N70, K44/H68/S70, K44/H68/T70, K44/K68/A70, K44/K68/D70, K44/K68/H70, K44/K68/T70, K44/N68/A70, K44/N68/D70, K44/N68/E70, K44/N68/G70, K44/N68/H70, K44/N68/N70, K44/N68/Q70, K44/N68/S70, K44/N68/T70, K44/P68/H70, K44/Q68/A70, K44/Q68/D70, K44/Q68/E70, K44/Q68/S70, K44/Q68/T70, K44/R68/A70, K44/R68/D70, K44/R68/E70, K44/R68/G70, K44/R68/H70, 
K44/R68/N70, K44/R68/S70, K44/S68/A70, K44/S68/D70, K44/S68/H70, K44/S68/N70, K44/S68/S70, K44/S68/T70, K44/T68/A70, K44/T68/D70, K44/T68/E70, K44/T68/G70, K44/T68/H70, K44/T68/N70, K44/T68/Q70, K44/T68/S70, K44/T68/T70, N44/A68/H70, N44/H68/N70, N44/H68/R70, N44/K68/G70, N44/K68/H70, N44/K68/R70, N44/K68/S70, N44/P68/D70, N44/Q68/H70, N44/R68/A70, N44/R68/D70, N44/R68/E70, N44/R68/K70, N44/S68/G70, N44/S68/H70, N44/S68/K70, N44/S68/R70, N44/T68/H70, N44/T68/K70, N44/T68/Q70, N44/T68/S70, P44/N68/D70, P44/T68/T70, Q44/G68/K70, Q44/G68/R70, Q44/K68/G70, Q44/N68/A70, Q44/N68/H70, Q44/N68/S70, Q44/P68/P70, Q44/Q68/G70, Q44/R68/D70, Q44/R68/E70, Q44/R68/G70, Q44/R68/Q70, Q44/S68/S70, Q44/T68/A70, Q44/T68/G70, Q44/T68/H70, R44/A68/G70, R44/A68/T70, R44/G68/T70, R44/H68/D70, R44/H68/T70, R44/N68/T70, R44/R68/A70, R44/R68/D70, R44/R68/E70, R44/R68/G70, R44/R68/Q70, R44/R68/S70, R44/R68/T70, R44/S68/G70, R44/S68/N70, R44/S68/S70, R44/S68/T70, S44/D68/K70, S44/R68/R70, S44/R68/S70, T44/A68/K70, T44/N68/P70, T44/N68/R70, T44/R68/E70, T44/R68/Q70, and T44/S68/K70; said I-CreI meganuclease variant is able to cleave at least one target, as defined above, that is not cleaved by the I-CreI N75 scaffold protein.

[0090] According to yet another advantageous embodiment, the I-CreI meganuclease variant has an alanine (A) or an asparagine (N), in position 44, and cleaves a target comprising the nucleotide a in position -4, and/or t in position +4, with the exclusion of the variants presented in Table 4 and Table 5 of the International PCT Application WO 2004/067736, preferably said variant has an alanine or an asparagine.

[0091] According to yet another advantageous embodiment, the I-CreI meganuclease variant has a glutamine (Q) in position 44, and cleaves a target comprising the nucleotide t in position -4, and/or a in position +4, with the exclusion of the variants presented in Table 3, Table 4 and Table 5 of the International PCT Application WO 2004/067736.

[0092] According to yet another advantageous embodiment, the I-CreI meganuclease variant of the invention has a lysine (K) in position 44, and cleaves a target comprising c in position -4, and/or g in position +4, with the exclusion of the variant presented in Table 5 of the International PCT Application WO 2004/067736.

[0093] As specified hereabove, in the frame of the definition of the I-CreI meganuclease variant in the use application, said I-CreI meganuclease variant may be a homodimer or a heterodimer. It may be able to cleave a palindromic or a non-palindromic DNA target. It may further comprise a mutation in position 75, as specified hereabove.

[0094] The subject-matter of the present invention is also a polynucleotide, characterized in that it encodes a I-CreI meganuclease variant according to the invention.

[0095] Further, the subject-matter of the present invention is an expression cassette comprising said polynucleotide and regulation sequences such as a promoter, and an expression vector comprising said expression cassette. When said variant is a heterodimer consisting of two different monomers, each monomer may be expressed from a single vector (dual expression vector) or from two different vectors.

[0096] The subject-matter of the present invention is also an expression vector, as described above, further comprising a targeting DNA construct.

[0097] The term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of preferred vector is an episome, i.e., a nucleic acid capable of extra-chromosomal replication. Preferred vectors are those capable of autonomous replication and/or expression of nucleic acids to which they are linked. Vectors capable of directing the expression of genes to which they are operatively linked are referred to herein as "expression vectors". A vector according to the present invention comprises, but is not limited to, a YAC (yeast artificial chromosome), a BAC (bacterial artificial chromosome), a baculovirus vector, a phage, a phagemid, a cosmid, a viral vector, a plasmid, a RNA vector or a linear or circular DNA or RNA molecule which may consist of chromosomal, non chromosomal, semi-synthetic or synthetic DNA. In general, expression vectors of utility in recombinant DNA techniques are often in the form of "plasmids" which refer generally to circular double stranded DNA loops which, in their vector form, are not bound to the chromosome. Large numbers of suitable vectors are known to those of skill in the art and commercially available, such as the following bacterial vectors: pQE70, pQE60, pQE-9 (Qiagen), pbs, pD10, phagescript, psiX174, pbluescript SK, pbsks, pNH8A, pNH16A, pNH18A, pNH46A (Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia); pWLNEO, pSV2CAT, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia); pQE-30 (QIAexpress), pET (Novagen).

[0098] Viral vectors include retrovirus, adenovirus, parvovirus (e.g., adeno-associated viruses), coronavirus, negative strand RNA viruses such as orthomyxovirus (e.g., influenza virus), rhabdovirus (e.g., rabies and vesicular stomatitis virus), paramyxovirus (e.g., measles and Sendai), positive strand RNA viruses such as picornavirus and alphavirus, and double-stranded DNA viruses including adenovirus, herpesvirus (e.g., Herpes Simplex virus types 1 and 2, Epstein-Barr virus, cytomegalovirus), and poxvirus (e.g., vaccinia, fowlpox and canarypox). Other viruses include Norwalk virus, togavirus, flavivirus, reoviruses, papovavirus, hepadnavirus, and hepatitis virus, for example.

[0099] Vectors can comprise selectable markers, for example: neomycin phosphotransferase, histidinol dehydrogenase, dihydrofolate reductase, hygromycin phosphotransferase, herpes simplex virus thymidine kinase, adenosine deaminase, glutamine synthetase, and hypoxanthine-guanine phosphoribosyl transferase for eukaryotic cell culture; TRP1 for S. cerevisiae; tetracycline, rifampicin or ampicillin resistance in E. coli.

[0100] Preferably said vectors are expression vectors, wherein the sequences encoding the polypeptides of the invention are placed under control of appropriate transcriptional and translational control elements to permit production or synthesis of said polypeptides. Therefore, said polynucleotides are comprised in expression cassette(s). More particularly, the vector comprises a replication origin, a promoter operatively linked to said encoding polynucleotide, a ribosome binding site, an RNA-splicing site (when genomic DNA is used), a polyadenylation site and a transcription termination site. It also can comprise an enhancer. Selection of the promoter will depend upon the cell in which the polypeptide is expressed.

[0101] According to an advantageous embodiment of said expression vector, said targeting DNA construct comprises a sequence sharing homologies with the region surrounding the cleavage site of the I-CreI meganuclease variant of the invention.

[0102] According to another advantageous embodiment of said expression vector, said targeting DNA construct comprises:

[0103] a) sequences sharing homologies with the region surrounding the cleavage site of the I-CreI meganuclease variant according to claim, and

[0104] b) sequences to be introduced flanked by sequence as in a).

[0105] The subject-matter of the present invention is also a cell, characterized in that it is modified by a polynucleotide as defined above or by a vector as defined above.

[0106] The subject-matter of the present invention is also a transgenic plant, characterized in that it comprises a polynucleotide as defined above, or a vector as defined above.

[0107] The subject-matter of the present invention is also a non-human transgenic mammal, characterized in that it comprises a polynucleotide as defined above or a vector as defined above.

[0108] The polynucleotide sequences encoding the polypeptides as defined in the present invention may be prepared by any method known by the man skilled in the art. For example, they are amplified from a cDNA template, by polymerase chain reaction with specific primers. Preferably the codons of said cDNA are chosen to favour the expression of said protein in the desired expression system.

[0109] The recombinant vector comprising said polynucleotides may be obtained and introduced in a host cell by the well-known recombinant DNA and genetic engineering techniques.

[0110] The heterodimeric meganuclease of the invention is produced by expressing the two polypeptides as defined above; preferably said polypeptides are co-expressed in a host cell modified by two expression vectors, each comprising a polynucleotide fragment encoding a different polypeptide as defined above or by a dual expression vector comprising both polynucleotide fragments as defined above, under conditions suitable for the co-expression of the polypeptides, and the heterodimeric meganuclease is recovered from the host cell culture.

[0111] The subject-matter of the present invention is further the use of a I-CreI meganuclease variant, one or two polynucleotide(s), preferably both included in one expression vector (dual expression vector) or each included in a different expression vector, a cell, a transgenic plant, a non-human transgenic mammal, as defined above, for molecular biology, for in vivo or in vitro genetic engineering, and for in vivo or in vitro genome engineering, for non-therapeutic purposes.

[0112] Non therapeutic purposes include for example (i) gene targeting of specific loci in cell packaging lines for protein production, (ii) gene targeting of specific loci in crop plants, for strain improvements and metabolic engineering, (iii) targeted recombination for the removal of markers in genetically modified crop plants, (iv) targeted recombination for the removal of markers in genetically modified microorganism strains (for antibiotic production for example).

[0113] According to an advantageous embodiment of said use, it is for inducing a double-strand break in a site of interest comprising a DNA target sequence, thereby inducing a DNA recombination event, a DNA loss or cell death.

[0114] According to the invention, said double-strand break is for: repairing a specific sequence, modifying a specific sequence, restoring a functional gene in place of a mutated one, attenuating or activating an endogenous gene of interest, introducing a mutation into a site of interest, introducing an exogenous gene or a part thereof, inactivating or deleting an endogenous gene or a part thereof, translocating a chromosomal arm, or leaving the DNA unrepaired and degraded.

[0115] According to another advantageous embodiment of said use, said I-CreI meganuclease variant, polynucleotide, vector, cell, transgenic plant or non-human transgenic mammal are associated with a targeting DNA construct as defined above.

[0116] The subject-matter of the present invention is also a method of genetic engineering, characterized in that it comprises a step of double-strand nucleic acid breaking in a site of interest located on a vector, comprising a DNA target of a I-CreI meganuclease variant as defined hereabove, by contacting said vector with a I-CreI meganuclease variant as defined above, thereby inducing a homologous recombination with another vector presenting homology with the sequence surrounding the cleavage site of said I-CreI meganuclease variant.

[0117] The subject-matter of the present invention is also a method of genome engineering, characterized in that it comprises the following steps: 1) double-strand breaking a genomic locus comprising at least one recognition and cleavage site of a I-CreI meganuclease variant as defined above, by contacting said cleavage site with said I-CreI meganuclease variant; 2) maintaining said broken genomic locus under conditions appropriate for homologous recombination with a targeting DNA construct comprising the sequence to be introduced in said locus, flanked by sequences sharing homologies with the target locus.

[0118] The subject-matter of the present invention is also a method of genome engineering, characterized in that it comprises the following steps: 1) double-strand breaking a genomic locus comprising at least one recognition and cleavage site of a I-CreI meganuclease variant as defined above, by contacting said cleavage site with said I-CreI meganuclease variant; 2) maintaining said broken genomic locus under conditions appropriate for homologous recombination with chromosomal DNA sharing homologies to regions surrounding the cleavage site.

[0119] The subject-matter of the present invention is also a composition characterized in that it comprises at least one I-CreI meganuclease variant, a polynucleotide or a vector as defined above.

[0120] In a preferred embodiment of said composition, it comprises a targeting DNA construct comprising the sequence which repairs the site of interest flanked by sequences sharing homologies with the targeted locus.

[0121] The subject-matter of the present invention is also the use of at least one I-CreI meganuclease variant, a polynucleotide or a vector, as defined above for the preparation of a medicament for preventing, improving or curing a genetic disease in an individual in need thereof, said medicament being administrated by any means to said individual.

[0122] The subject-matter of the present invention is also the use of at least one I-CreI meganuclease variant, a polynucleotide or a vector as defined above for the preparation of a medicament for preventing, improving or curing a disease caused by an infectious agent that presents a DNA intermediate, in an individual in need thereof, said medicament being administrated by any means to said individual.

[0123] The subject-matter of the present invention is also the use of at least one I-CreI meganuclease variant, a polynucleotide or a vector, as defined above, in vitro, for inhibiting the propagation, inactivating or deleting an infectious agent that presents a DNA intermediate, in biological derived products or products intended for biological uses or for disinfecting an object.

[0124] In a particular embodiment, said infectious agent is a virus.

[0125] In addition to the preceding features, the invention further comprises other features which will emerge from the description which follows, which refers to examples illustrating the I-CreI meganuclease variants and their uses according to the invention, as well as to the appended drawings in which:

[0126] FIG. 1 illustrates the rationale of the experiments. (a) Structure of I-CreI bound to its DNA target. (b) Zoom of the structure showing residues 44, 68, 70 chosen for randomization, D75 and interacting base pairs. (c) Design of the library and targets. The interactions of I-CreI residues Q44, R68 and R70 with DNA targets are indicated (top). Other amino acid residues interacting directly or indirectly with the DNA target are not shown. Arginine (R) residue in position 68 of a I-CreI monomer directly interacts with guanine in position -5 of the target sequence, while glutamine (Q) residue of position 44 and Arginine (R) residue of position 70 directly interact with adenine in position +4 and guanine in position +3 of the complementary strand, respectively. The target described here (C1221, SEQ ID NO: 12) is a palindrome derived from the I-CreI natural target (C1234, SEQ ID NO: 65), and cleaved by I-CreI (Chevalier et al., 2003, precited). Cleavage positions are indicated by arrowheads. In the library, residues 44, 68 and 70 are replaced with ADEGHKNPQRST. Since I-CreI is a homodimer, the library was screened with palindromic targets. Sixty-four palindromic targets resulting from substitutions in positions ±3, ±4 and ±5 were generated. A few examples of such targets are shown (bottom; SEQ ID NO: 1 to 7).

[0127] FIG. 2 illustrates the targets used in the study. A. Two palindromic targets derived from the natural I-CreI target (here named C1234, SEQ ID NO: 65). The I-CreI natural target contains two palindromes, boxed in grey: the -8 to -12 and +8 to +12 nucleotides on the one hand, and the -5 to -3 and +3 to +5 nucleotides on the other hand. The vertical dotted line, from which the nucleotide bases are numbered, represents the symmetry axis for the palindromic sequences. Two palindromic sequences can be derived from the natural target: C1221 (SEQ ID NOS: 1-6, 81, 8-18, 82-83, 21-27, 84, 29-31, 85, 33-38, 86, 40-50, 87-88, 53-59, 89, 61-63 and 90, respectively, in order of appearance) and C4334 (SEQ ID NO: 66). Both are cut by I-CreI, in vitro and in yeast. Only one strand of each target site is shown. B. The 64 targets. The 64 targets (SEQ ID NO: 1 to 64) are derived from C1221 (SEQ ID NO: 12), a palindrome derived from the I-CreI natural target (C1234, SEQ ID NO: 65), and cleaved by I-CreI (Chevalier et al., 2003, precited). They correspond to all the 24 bp palindromes resulting from substitutions at positions -5, -4, -3, +3, +4 and +5.

[0128] FIG. 3 illustrates the screening of the variants. (a) Yeast are transformed with the meganuclease expressing vector, marked with the LEU2 gene, and individually mated with yeast transformed with the reporter plasmid, marked by the TRP1 gene. In the reporter plasmid, a LacZ reporter gene is interrupted with an insert containing the site of interest, flanked by two direct repeats. In diploids (LEU2 TRP1), cleavage of the target site by the meganuclease (white oval) induces homologous recombination between the two lacZ repeats, resulting in a functional beta-galactosidase gene (grey oval), which can be monitored by X-Gal staining. (b) Scheme of the experiment. A library of I-CreI variants is built using PCR, cloned into a replicative yeast expression vector and transformed into S. cerevisiae strain FYC2-6A (MATα, trp1Δ63, leu2Δ1, his3Δ200). The 64 palindromic targets are cloned in the LacZ-based yeast reporter vector, and the resulting clones transformed into strain FYBL2-7B (MATa, ura3Δ851, trp1Δ63, leu2Δ1, lys2Δ202). Robot-assisted gridding on filter membrane is used to perform mating between individual clones expressing meganuclease variants and individual clones harboring a reporter plasmid. After primary high throughput screening, the ORFs of positive clones are amplified by PCR and sequenced. 410 different variants were identified among the 2100 positives, and tested at low density, to establish complete patterns, and 350 clones were validated. Also, 294 mutants were recloned in yeast vectors, and tested in a secondary screen, and results confirmed those obtained without recloning. Chosen clones are then assayed for cleavage activity in a similar CHO-based assay and eventually in vitro.

[0129] FIG. 4 represents the cDNA sequence encoding the I-CreI N75 scaffold protein and degenerated primers used for the Ulib2 library construction. A. The coding sequence (CDS) of the scaffold protein (SEQ ID NO: 69) is from base-pair 1 to base-pair 501 and the "STOP" codon TGA (not shown) follows the base-pair 501. In addition to the D75N mutation, the protein further contains mutations that do not alter its activity; in the protein sequence (SEQ ID NO: 70), the two first N-terminal residues are methionine and alanine (MA), and the three C-terminal residues alanine, alanine and aspartic acid (AAD). B. Degenerated primers (SEQ ID NO: 67, 68).

[0130] FIG. 5 represents the pCLS0542 meganuclease expression vector map. The meganuclease expression vector is marked with LEU2. cDNAs encoding I-CreI meganuclease variants are cloned into this vector digested with NcoI and EagI, in order to have the variant expression driven by the inducible Gal10 promoter.

[0131] FIG. 6 represents the pCLS0042 reporter vector map. The reporter vector is marked with TRP1 and URA3. The LacZ tandem repeats share 800 bp of homology, and are separated by 1.3 kb of DNA. They are surrounded by ADH promoter and terminator sequences. Target sites are cloned into the SmaI site.

[0132] FIG. 7 illustrates the cleavage profile of 292 I-CreI meganuclease variants with a modified specificity. The variants derive from the I-CreI N75 scaffold protein. Proteins are defined by the amino acid present in positions 44, 68 and 70 (three first columns). Numeration of the amino acids is according to pdb accession code 1g9y. Targets are defined by nucleotides at positions -5 to -3. For each protein, observed cleavage (1) or non observed cleavage (0) is shown for each one of the 64 targets.

[0133] FIG. 8 illustrates eight examples of I-CreI variant cleavage patterns. The meganucleases are tested 4 times against the 64 targets described in FIG. 2B. The position of the different targets is indicated on the top, left panel. The variants, which derive from the I-CreI N75 scaffold protein, are identified by the amino acids in positions 44, 68 and 70 (e.g., KSS is K44, S68, S70 and N75, or K44/S68/S70). Numeration of the amino acids is according to pdb code 1g9y. QRR corresponds to I-CreI N75. The cleaved targets are indicated beside the panels.

[0134] FIG. 9 illustrates the cleavage patterns of the variants. Mutants are identified by three letters, corresponding to the residues in positions 44, 68 and 70. Each mutant is tested versus the 64 targets derived from the I-CreI natural targets, and a series of control targets. Target map is indicated in the top right panel. (a) Cleavage patterns in yeast (left) and mammalian cells (right) for the wild-type I-CreI (I-CreI) and I-CreI N75 (QRR) proteins, and 7 derivatives of the I-CreI N75 protein. For yeast, the initial raw data (filter) is shown. For CHO cells, quantitative raw data (ONPG measurement) are shown, values superior to 0.25 are boxed, values superior to 0.5 are highlighted in medium grey, values superior to 1 in dark grey. LacZ: positive control. 0: no target. U1, U2 and U3: three different uncleaved controls. (b) Cleavage in vitro. I-CreI and four mutants are tested against a set of 2 or 4 targets, including the target resulting in the strongest signal in yeast and CHO. Digests are performed at 37°C for 1 hour, with 2 nM linearized substrate, as described in Methods. Raw data are shown for I-CreI with two different targets. With both ggg and cct, cleavage is not detected with I-CreI.

[0135] FIG. 10 represents the statistical analysis. (a) Cleaved targets: targets cleaved by I-CreI variants are colored in grey. The number of proteins cleaving each target is shown below, and the level of grey coloration is proportional to the average signal intensity obtained with these cutters in yeast. (b) Analysis of 3 out of the 7 clusters. For each mutant cluster (clusters 1, 3 and 7), the cumulated intensities for each target were computed and a bar plot (left column) shows the normalized intensities in decreasing order. For each cluster, the number of amino acids of each type at each position (44, 68 and 70) is shown as a coded histogram in the right column. The legend of the amino-acid color code is at the bottom of the figure. (c) Hierarchical clustering of mutant and target data in yeast. Both mutants and targets were clustered using hierarchical clustering with Euclidean distance and Ward's method (Ward, J. H., American statist. Assoc., 1963, 58, 236-244). Clustering was done with hclust from the R package. Mutant and target dendrograms were reordered to optimize positions of the clusters and the mutant dendrogram was cut at the height of 8 with deduced clusters. QRR mutant and GTC target are indicated by an arrow. Gray levels reflect the intensity of the signal.
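
The clustering named for FIG. 10 (Euclidean distance, Ward's method) can be sketched in plain Python as follows. This is only a stand-in for R's hclust and not the code used for the figure: the mutant names and four-target intensity profiles are invented toy values, the helper names are hypothetical, and the tree is cut by a requested number of clusters rather than at a height of 8.

```python
from itertools import combinations

def centroid(points):
    """Mean vector of a list of equal-length numeric profiles."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_clusters(profiles, n_clusters):
    """Agglomerate until n_clusters remain, merging the cheapest pair each time."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(
                [profiles[n] for n in clusters[ij[0]]],
                [profiles[n] for n in clusters[ij[1]]],
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

# Toy signal intensities on four hypothetical targets (illustrative only).
toy_profiles = {
    "m1": (1.0, 1.0, 0.0, 0.0),
    "m2": (1.0, 0.8, 0.0, 0.0),
    "m3": (0.0, 0.0, 1.0, 1.0),
    "m4": (0.0, 0.0, 0.9, 1.0),
}
toy_clusters = ward_clusters(toy_profiles, 2)
```

The naive pairwise search is cubic in the number of mutants, which is acceptable for a toy example but not for a full screen.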

[0136] FIG. 11 illustrates an example of hybrid or chimeric site: gtt (SEQ ID NO: 79) and cct (SEQ ID NO: 77) are two palindromic sites derived from the I-CreI site. The gtt/cct hybrid site (SEQ ID NO: 80) displays the gtt sequence on the top strand in -5, -4, -3 and the cct sequence on the bottom strand in 5, 4, 3.

[0137] FIG. 12 illustrates the cleavage activity of the heterodimeric variants. Yeast were co-transformed with the KTG and QAN variants. Target organization is shown on the top panel: target with a single gtt, cct or gcc half site are in bold; targets with two such half sites, which are expected to be cleaved by homo- and/or heterodimers, are in bold and highlighted in grey; 0: no target. Results are shown on the three panels below. Unexpected faint signals are observed only for gtc/cct and gtt/gtc, cleaved by KTG and QAN, respectively.

[0138] FIG. 13 represents the quantitative analysis of the cleavage activity of the heterodimeric variants. (a) Co-transformation of selected mutants in yeast. For clarity, only results on relevant hybrid targets are shown. The aac/acc target is always shown as an example of unrelated target. For the KTGxAGR couple, the palindromic tac and tct targets, although not shown, are cleaved by AGR and KTG, respectively. Cleavage of the cat target by the RRN mutant is very low, and could not be quantified in yeast. (b) Transient co-transfection in CHO cells. For (a) and (b), Black bars: signal for the first mutant alone; grey bars: signal for the second mutant alone; striped bars: signal obtained by co-expression or cotransfection.

[0139] FIG. 14 illustrates the activity of the assembled heterodimer ARS-KRE on the selected mouse chromosome 17 DNA target. CHO-K1 cells were co-transfected with equimolar amounts of target LagoZ plasmid and of ARS and KRE expression plasmids, and the beta-galactosidase activity was measured. Cells co-transfected with the LagoZ plasmid and the I-SceI, I-CreI, ARS or KRE recombinant plasmid or an empty plasmid were used as controls.

EXAMPLES

[0140] The following examples are presented here only for illustrating the invention and not for limiting the scope thereof. Other variants, obtained from a cDNA whose sequence differs from SEQ ID NO: 69, and using appropriate primers, are still part of the invention.

Example 1

Screening for New Functional Endonucleases

[0141] The method for producing meganuclease variants and the assays based on cleavage-induced recombination in mammal or yeast cells, which are used for screening variants with altered specificity, are described in the International PCT Application WO 2004/067736. These assays result in a functional LacZ reporter gene which can be monitored by standard methods (FIG. 3a).

A) Material and Methods

a) Construction of Mutant Libraries

[0142] I-CreI wt and I-CreI D75N (or I-CreI N75) open reading frames (SEQ ID NO: 69, FIG. 4A) were synthesized, as described previously (Epinat et al., N.A.R., 2003, 31, 2952-2962). Mutation D75N was introduced by replacing codon 75 with aac. The diversity of the meganuclease library was generated by PCR using degenerate primers from Sigma harboring codon VVK (18 codons, amino acids ADEGHKNPQRST) at positions 44, 68 and 70, which interact directly with the bases at positions 3 to 5, and, as DNA template, the I-CreI D75N gene. Such primers allow mutation of residues 44, 68 and 70 with a theoretical diversity of 12 amino acids at each position. Briefly, forward primer (5'-gtttaaacatcagctaagattgacctttvvkgtgacttcaaaagacccag-3', SEQ ID NO: 67) and reverse primer (5'-gatgtagttggaaacggatccmbbatcmbbtacgtaaccaacgcc-3', SEQ ID NO: 68) were used to amplify a PCR fragment in 50 µl PCR reactions. PCR products were pooled, EtOH precipitated and resuspended in 50 µl 10 mM Tris. PCR products were cloned into a pET expression vector containing the I-CreI D75N gene, digested with appropriate restriction enzymes. Digestion of vector and insert DNA was conducted in two steps (single enzyme digestion) between which the DNA sample was extracted (using classic phenol:chloroform:isoamylalcohol-based methods) and EtOH-precipitated. 10 µg of digested vector DNA was used for ligation, with a 5:1 excess of insert DNA. E. coli TG1 cells were transformed with the resulting vector by electroporation. To produce a number of cell clones above the theoretical diversity of the library, 6×10⁴ clones were produced. Bacterial clones were scraped from plates and the corresponding plasmid vectors were extracted and purified.
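
As a quick check on the primer design described above, the following sketch reverse-complements the reverse primer (SEQ ID NO: 68) using the IUPAC ambiguity codes and counts the degenerate triplets that read as "vvk" on the coding strand. The helper name `reverse_complement` and the variable names are illustrative assumptions, not part of the described method.

```python
# IUPAC complement table, lower case as in the primer listings.
IUPAC_COMPLEMENT = {
    "a": "t", "t": "a", "c": "g", "g": "c",
    "m": "k", "k": "m", "r": "y", "y": "r",
    "s": "s", "w": "w", "b": "v", "v": "b",
    "d": "h", "h": "d", "n": "n",
}

def reverse_complement(seq):
    """Reverse-complement a sequence that may contain IUPAC ambiguity codes."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(seq))

# Reverse primer as given in the text (SEQ ID NO: 68).
REVERSE_PRIMER = "gatgtagttggaaacggatccmbbatcmbbtacgtaaccaacgcc"

coding_strand = reverse_complement(REVERSE_PRIMER)
n_degenerate_codons = coding_strand.count("vvk")
```

On the coding strand the two degenerate triplets are separated by a single fixed codon, which is consistent with randomizing residues 68 and 70 while leaving residue 69 unchanged.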

[0143] The library was recloned in the yeast pCLS0542 vector (FIG. 5), by sub-cloning a NcoI-EagI DNA fragment containing the entire I-CreI D75N ORF. In this 2 micron-based replicative vector marked with the LEU2 gene, I-CreI variants are under the control of a galactose inducible promoter (Epinat et al., precited). After electroporation in E. coli, 7×10⁴ clones were obtained, representing 12 times the theoretical diversity at the DNA level (18³ = 5832). DNA was extracted and transformed into S. cerevisiae strain FYC2-6A (MATα, trp1Δ63, leu2Δ1, his3Δ200). 13,824 colonies were picked using a colony picker (QpixII, GENETIX), and grown in 144 microtiter plates.
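
The coverage figures in this paragraph can be reproduced with a few lines of arithmetic. The sketch below uses only numbers stated in the text, except for the plate format: 96 wells per microtiter plate is an assumption, chosen because it matches the stated 13,824 colonies in 144 plates.

```python
CODONS_PER_POSITION = 18     # VVK codon, from the text
RANDOMIZED_POSITIONS = 3     # residues 44, 68 and 70

# Theoretical diversity at the DNA level.
dna_diversity = CODONS_PER_POSITION ** RANDOMIZED_POSITIONS

# Fold coverage of the E. coli library (7 x 10^4 clones).
ecoli_clones = 7 * 10**4
fold_coverage = ecoli_clones / dna_diversity

# Assumption: standard 96-well microtiter plates.
WELLS_PER_PLATE = 96
picked_colonies = 144 * WELLS_PER_PLATE
```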

b) Construction of Target Clones

[0144] The C1221 twenty-four bp palindrome (tcaaaacgtcgtacgacgttttga, SEQ ID NO: 12) is a repeat of the half-site of the nearly palindromic natural I-CreI target (tcaaaacgtcgtgagacagtttgg, SEQ ID NO: 65). C1221 is cleaved as efficiently as the I-CreI natural target in vitro and ex vivo in both yeast and mammalian cells. The 64 palindromic targets were derived as follows: 64 pairs of oligonucleotides (ggcatacaagtttcaaaacnnngtacnnngttttgacaatcgtctgtca (SEQ ID NO: 72) and reverse complementary sequences) corresponding to the two strands of the 64 DNA targets, with 12 bp of non-palindromic extra sequence on each side, were ordered from Sigma, annealed and cloned into pGEM-T Easy (PROMEGA). Next, a 400 bp PvuII fragment was excised from each one of the 64 pGEM-T-derived vectors and cloned into the yeast vector pFL39-ADH-LACURAZ, described previously (Epinat et al., precited), also called pCLS0042 (FIG. 6), resulting in 64 yeast reporter vectors. Steps of excision, digestion and ligation are performed using typical methods known by those skilled in the art. Insertion of the target sequence is made at the SmaI site of pCLS0042. The 64 palindromic targets are described in FIG. 2B (positions -5 to -3 and +3 to +5, SEQ ID NOS: 1-6, 81, 8-18, 82-83, 21-27, 84, 29-31, 85, 33-38, 86, 40-50, 87-88, 53-59, 89, 61-63 and 90, respectively, in order of appearance).
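
The derivation of the 64 palindromic targets can be illustrated with a short sketch. It takes the 24 bp core of the oligonucleotide template given above (tcaaaacnnngtacnnngttttga), places each of the 64 possible triplets at positions -5 to -3, and fills positions +3 to +5 with the reverse complement so that the site stays palindromic. Function and variable names are hypothetical, and the 12 bp flanking sequences of SEQ ID NO: 72 are deliberately left out.

```python
from itertools import product

COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}

def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def palindromic_target(triplet):
    """24 bp palindrome with `triplet` at positions -5 to -3."""
    return "tcaaaac" + triplet + "gtac" + reverse_complement(triplet) + "gttttga"

# One target per triplet, keyed by the nucleotides at positions -5 to -3.
targets = {
    "".join(t): palindromic_target("".join(t))
    for t in product("acgt", repeat=3)
}
```

With this keying, `targets["gtc"]` is the C1221 sequence itself, which matches the way targets are named by their -5 to -3 triplet in the figures.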

c) Yeast Strains and Transformation

[0145] The library of meganuclease expression variants and the A44/R68/L70 variant were transformed into strain FYC2-6A (MATα, trp1Δ63, leu2Δ1, his3Δ200).

[0146] The target plasmids were transformed into yeast strain FYBL2-7B (MATa, ura3Δ851, trp1Δ63, leu2Δ1, lys2Δ202).

[0147] For transformation, a classical chemical/heat shock protocol can be used, and routinely gives 10⁶ independent transformants per µg of DNA; transformants were selected on leucine drop-out synthetic medium (Gietz and Woods, 2002).

d) Mating of Meganuclease Expressing Clones and Screening in Yeast

[0148] I-CreI variant clones as well as yeast reporter strains were stored in glycerol (20%) stocks and replicated in new microplates. Mutants were gridded on nylon filters covering YPD plates, using a high gridding density (about 20 spots/cm²). A second gridding process was performed on the same filters to spot a second layer consisting of 64 or 75 different reporter-harboring yeast strains for each variant. Briefly, each reporter strain was spotted 13,824 times on a nylon membrane, and on each of these spots was spotted one of the 13,824 yeast clones expressing a variant meganuclease. Membranes were placed on solid agar YPD rich medium, and incubated at 30°C for one night, to allow mating. Next, filters were transferred to synthetic medium, lacking leucine and tryptophan, with galactose (1%) as a carbon source (and with G418 for coexpression experiments), and incubated for five days at 37°C, to select for diploids carrying the expression and target vectors. After 5 days, filters were placed on solid agarose medium with 0.02% X-Gal in 0.5 M sodium phosphate buffer, pH 7.0, 0.1% SDS, 6% dimethyl formamide (DMF), 7 mM β-mercaptoethanol, 1% agarose, and incubated at 37°C, to monitor β-galactosidase activity. Positive clones were identified after two days of incubation, according to staining. Results were analyzed by scanning and quantification was performed using proprietary software. For secondary screening, the same procedure was followed with the 292 selected positives, except that each mutant was tested 4 times on the same membrane (see FIGS. 8 and 9a).

e) Sequence and Re-Cloning of Primary Hits

[0149] The open reading frame (ORF) of positive clones identified during the primary screening in yeast was amplified by PCR and sequenced. Then, ORFs were recloned using the Gateway protocol (Invitrogen). ORFs were amplified by PCR on yeast colonies (Akada et al., Biotechniques, 28, 668-670, 672-674), using primers: ggggacaagtttgtacaaaaaagcaggcttcgaaggagatagaaccatggccaataccaaatataacaaagagttcc (SEQ ID NO: 73) and ggggaccactttgtacaagaaagctgggtttagtcggccgccggggaggatttcttcttctcgc (SEQ ID NO: 74) from PROLIGO. PCR products were cloned in: (i) a yeast gateway expression vector harboring a galactose inducible promoter, LEU2 or KanR as selectable marker and a 2 micron origin of replication, and (ii) a pET 24d(+) vector from NOVAGEN. Resulting clones were verified by sequencing (MILLEGEN).

B) Results

[0150] I-CreI is a dimeric homing endonuclease that cleaves a 22 bp pseudo-palindromic target. Analysis of I-CreI structure bound to its natural target has shown that in each monomer, eight residues establish direct interactions with seven bases (Jurica et al., 1998, precited). Residues Q44, R68, R70 contact three consecutive base pairs at position 3 to 5 (and -3 to -5, FIG. 1). An exhaustive protein library vs. target library approach was undertaken to engineer locally this part of the DNA binding interface. First, the I-CreI scaffold was mutated from D75 to N to decrease the energetic strain likely to be caused by replacing, in the library, the basic residues R68 and R70, which satisfy the hydrogen-acceptor potential of the buried D75 in the I-CreI structure. Homodimers of mutant D75N (purified from E. coli cells wherein it was over-expressed using a pET expression vector) were shown to cleave the I-CreI homing site. The D75N mutation did not affect the protein structure, but decreased the toxicity of I-CreI in overexpression experiments. Next, positions 44, 68 and 70 were randomized and 64 palindromic targets resulting from substitutions in positions ±3, ±4 and ±5 of a palindromic target cleaved by I-CreI (Chevalier et al., 2003, precited) were generated, as described in FIGS. 1 and 2B. Mutants in the protein library thus corresponded to independent combinations of any of the 12 amino acids encoded by the VVK codon at three residue positions. In consequence, the maximal (theoretical) diversity of the protein library was 12³ or 1728. However, in terms of nucleic acids, the diversity is 18³ or 5832.
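
The two diversity figures above follow directly from expanding the VVK codon with the standard genetic code. The sketch below enumerates the codons (V = A, C or G; K = G or T, per IUPAC), translates them, and derives both diversities. The compact codon-table construction and all variable names are illustrative choices, not part of the described method.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

V = "ACG"  # IUPAC V: any base except T
K = "GT"   # IUPAC K: G or T

vvk_codons = ["".join(c) for c in product(V, V, K)]
encoded_amino_acids = "".join(sorted({CODON_TABLE[c] for c in vvk_codons}))

# Three independently randomized positions (44, 68 and 70).
protein_diversity = len(encoded_amino_acids) ** 3
dna_diversity = len(vvk_codons) ** 3
```

Because several of the 18 codons are synonymous, the DNA-level diversity exceeds the protein-level diversity, which is why library coverage is quoted against the larger number.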

[0151] The resulting library was cloned in a yeast replicative expression vector carrying a LEU2 auxotrophic marker gene and transformed into a leu2 mutant haploid yeast strain (FYC2-6A). The 64 targets were cloned in the appropriate yeast reporter vector and transformed into a haploid strain (FYBL2-7B), resulting in 64 tester strains.

[0152] A robot-assisted mating protocol was used to screen a large number of meganucleases from our library. The general screening strategy is described in FIG. 3b. 13,824 meganuclease expressing clones (about 2.3-fold the theoretical diversity) were spotted at high density (20 spots/cm²) on nylon filters and individually tested against each one of the 64 target strains (884,608 spots). 2100 clones showing an activity against at least one target were isolated (FIG. 3b) and the ORF encoding the meganuclease was amplified by PCR and sequenced. 410 different sequences were identified and a similar number of corresponding clones were chosen for further analysis. The spotting density was reduced to 4 spots/cm² and each clone was tested against the 64 reporter strains in quadruplicate, thereby creating complete profiles (as in FIGS. 8 and 9a). 350 positives could be confirmed. Next, to avoid the possibility of strains containing more than one clone, mutant ORFs were amplified by PCR, and recloned in the yeast vector. The resulting plasmids were individually transformed back into yeast. 294 such clones were obtained and tested at low density (4 spots/cm²). Differences with primary screening were observed mostly for weak signals, with 28 weak cleavers appearing now as negatives. Only one positive clone displayed a pattern different from what was observed in the primary profiling.

Example 2

I-CreI Meganuclease Variants with Different Cleavage Profiles

[0153] The validated clones from example 1 showed very diverse patterns. Some of these new profiles shared some similarity with the initial scaffold whereas many others were totally different. Various examples of profiles, including wild-type I-CreI and I-CreI N75, are shown in FIGS. 8 and 9a. The overall results (only for the 292 variants with modified specificity) are summarized in FIG. 7.

[0154] Homing endonucleases can usually accommodate some degeneracy in their target sequences, and one of our first findings was that the original I-CreI protein itself cleaves seven different targets in yeast. Many of our mutants followed this rule as well, with the number of cleaved sequences ranging from 1 to 21 with an average of 5.0 sequences cleaved (standard deviation=3.6). Interestingly, in 50 mutants (14%), specificity was altered so that they cleaved exactly one target. 37 (11%) cleaved 2 targets, 61 (17%) cleaved 3 targets and 58 (17%) cleaved 4 targets. For 5 targets and above, percentages were lower than 10%. Altogether, 38 targets were cleaved by the mutants (FIG. 10a). It is noteworthy that cleavage was barely observed on targets with an A in position ±3, and never with targets with tgn and cgn at position ±5, ±4, ±3.
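
The percentages quoted above can be checked against the mutant counts. The sketch assumes that the denominator is the 350 validated clones reported in Example 1; the text does not state the denominator explicitly, so this is an inference that happens to reproduce all four rounded values.

```python
# Assumption: percentages are relative to the 350 confirmed positives.
VALIDATED_CLONES = 350

# Number of mutants cleaving exactly 1, 2, 3 or 4 targets (from the text).
mutants_by_target_count = {1: 50, 2: 37, 3: 61, 4: 58}

percentages = {
    n_targets: round(100 * n_mutants / VALIDATED_CLONES)
    for n_targets, n_mutants in mutants_by_target_count.items()
}
```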

[0155] These results do not limit the scope of the invention, since FIG. 7 only shows results obtained with 292 variants (291 out of the 1728 (or 12.sup.3) I-CreI meganuclease variants obtainable in a complete library).

Example 3

Novel Meganucleases Can Cleave Novel Targets While Keeping High Activity and Narrow Specificity

A) Material and Methods

a) Construction of Target Clones

[0156] The 64 palindromic targets were cloned into pGEM-T Easy (PROMEGA), as described in example 1. Next, a 400 bp PvuII fragment was excised and cloned into the mammalian vector pcDNA3.1-LACURAZ-AURA, described previously (Epinat et al., precited). The 75 hybrid target sequences were cloned as follows: oligonucleotides were designed that contained two different half sites of each mutant palindrome (PROLIGO).
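
A hybrid target combining two different half sites can be sketched as below. The example assumes that hybrid sites keep the C1221 flanks and the central gtac of SEQ ID NO: 12 and differ only at positions -5 to -3 and +3 to +5; it has not been checked against the listed hybrid sequences, and the function names are hypothetical.

```python
COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}

def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def hybrid_target(top_triplet, bottom_triplet):
    """24 bp site with `top_triplet` at -5 to -3 on the top strand and
    `bottom_triplet` read at the same positions on the bottom strand."""
    return (
        "tcaaaac" + top_triplet + "gtac"
        + reverse_complement(bottom_triplet) + "gttttga"
    )

# Example: the gtt/cct combination discussed for FIG. 11.
gtt_cct = hybrid_target("gtt", "cct")
```

When both triplets are identical the function returns the corresponding palindromic target, so the palindromes are a special case of the hybrid construction.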

b) Re-Cloning of Primary Hits

[0157] The open reading frame (ORF) of positive clones identified during the primary screening in yeast was recloned in: (i) a CHO Gateway expression vector pCDNA6.2, following the instructions of the supplier (INVITROGEN), and (ii) a pET-24d(+) vector from NOVAGEN. Resulting clones were verified by sequencing (MILLEGEN).

c) Mammalian Cells Assay

[0158] The CHO-K1 cell line from the American Type Culture Collection (ATCC) was cultured in Ham's F12K medium supplemented with 10% Fetal Bovine Serum. For transient Single Strand Annealing (SSA) assays, cells were seeded in 12-well plates at 13×10³ cells per well one day prior to transfection. Cotransfection was carried out the following day with 400 ng of DNA using the EFFECTENE transfection kit (QIAGEN). Equimolar amounts of target LagoZ plasmid and expression plasmid were used. The next day, medium was replaced and cells were incubated for another 72 hours. CHO-K1 cell monolayers were washed once with PBS. The cells were then lysed with 150 µl of lysis/revelation buffer added for β-galactosidase liquid assay (100 ml of lysis buffer (Tris-HCl 10 mM pH 7.5, NaCl 150 mM, Triton X100 0.1%, BSA 0.1 mg/ml, protease inhibitors) and 900 ml of revelation buffer (10 ml of Mg 100× buffer (MgCl2 100 mM, β-mercaptoethanol 35%), 110 ml ONPG (8 mg/ml) and 780 ml of sodium phosphate 0.1 M pH 7.5)), 30 minutes on ice. β-galactosidase activity was assayed by measuring optical density at 415 nm. The entire process was performed on an automated Velocity11 BioCel platform. The β-galactosidase activity was calculated as relative units normalized for protein concentration, incubation time and transfection efficiency.
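As a worked check of the buffer recipe above, the following sketch computes the final concentration of selected components once the 100 ml of lysis buffer and the 900 ml of revelation buffer are combined. It assumes simple volumetric dilution with additive volumes; the variable and function names are illustrative.

```python
# Volumes (ml) of the components mixed into the lysis/revelation buffer.
volumes_ml = {
    "lysis buffer": 100,
    "Mg 100x buffer": 10,
    "ONPG 8 mg/ml": 110,
    "sodium phosphate 0.1 M": 780,
}
total_ml = sum(volumes_ml.values())


def final_conc(stock, component):
    """Concentration of a stock after dilution into the complete mix."""
    return stock * volumes_ml[component] / total_ml


onpg_mg_per_ml = final_conc(8.0, "ONPG 8 mg/ml")
mgcl2_mM = final_conc(100.0, "Mg 100x buffer")
phosphate_M = final_conc(0.1, "sodium phosphate 0.1 M")
tris_mM = final_conc(10.0, "lysis buffer")

print(total_ml, onpg_mg_per_ml, mgcl2_mM, phosphate_M, tris_mM)
```

The three revelation components add up to the stated 900 ml, and the Mg 100× stock ends up at 1× in the 1000 ml mix, which is consistent with its name.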

d) Protein Expression and Purification

[0159] His-tagged proteins were over-expressed in E. coli BL21 (DE3)pLysS cells using pET-24d(+) vectors (NOVAGEN). Induction with IPTG (0.3 mM) was performed at 25° C. Cells were sonicated in a solution of 50 mM sodium phosphate (pH 8), 300 mM sodium chloride containing protease inhibitors (Complete EDTA-free tablets, Roche) and 5% (v/v) glycerol. Cell lysates were centrifuged at 100,000 g for 60 min. His-tagged proteins were then affinity-purified using 5 ml Hi-Trap chelating HP columns (Amersham Biosciences) loaded with cobalt. Several fractions were collected during elution with a linear gradient of imidazole (up to 0.25 M imidazole, followed by a plateau at 0.5 M imidazole, 0.3 M NaCl and 50 mM sodium phosphate pH 8). Protein-rich fractions (determined by SDS-PAGE) were applied to the second column. The crude purified samples were taken to pH 6 and applied to a 5 ml HiTrap Heparin HP column (Amersham Biosciences) equilibrated with 20 mM sodium phosphate pH 6.0. Bound proteins were eluted with a continuous sodium chloride gradient with 20 mM sodium phosphate and 1 M sodium chloride. The purified fractions were submitted to SDS-PAGE and concentrated (10 kDa cut-off centriprep Amicon Ultra system), frozen in liquid nitrogen and stored at -80° C. Purified proteins were desalted using PD10 columns (Sephadex G-25M, Amersham Biosciences) in PBS or 10 mM Tris-HCl (pH 8) buffer.

e) In Vitro Cleavage Assays

[0160] pGEM plasmids with single meganuclease DNA target cut sites were first linearized with XmnI. Cleavage assays were performed at 37° C. in 10 mM Tris-HCl (pH 8), 50 mM NaCl, 10 mM MgCl2, 1 mM DTT and 50 µg/ml BSA. The target substrate concentration was 2 nM. A dilution range between 0 and 85 nM was used for each protein, in a 25 µl final reaction volume. Reactions were stopped after 1 hour by addition of 5 µl of 45% glycerol, 95 mM EDTA (pH 8), 1.5% (w/v) SDS, 1.5 mg/ml proteinase K and 0.048% (w/v) bromophenol blue (6× Stop Buffer) and incubated at 37° C. for 30 minutes. Digests were run on agarose electrophoresis gels, and fragments were quantified after ethidium bromide staining to calculate the percentage of cleavage.
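The quantities in this assay can be illustrated with a small sketch. The formula for the percentage of cleavage is an assumption (signal in the product bands divided by total signal, with ethidium bromide signal taken as proportional to DNA mass); the source does not give the formula, and the band intensities below are invented for illustration.

```python
def percent_cleavage(uncut, cut_fragments):
    """Share of the total band signal found in the cleavage products.

    Assumes staining signal is proportional to DNA mass, so the two
    products of one linear substrate together carry the same signal
    as one uncut molecule.
    """
    cut = sum(cut_fragments)
    total = uncut + cut
    if total <= 0:
        raise ValueError("no signal to quantify")
    return 100.0 * cut / total


# Substrate per reaction: 2 nM in 25 µl, expressed in femtomoles.
substrate_fmol = 2e-9 * 25e-6 * 1e15

# Dilution of the stop buffer: 5 µl added to a 25 µl reaction.
stop_buffer_fold = (25 + 5) / 5

# Hypothetical band intensities (arbitrary units).
example = percent_cleavage(uncut=600.0, cut_fragments=[250.0, 150.0])
print(substrate_fmol, stop_buffer_fold, example)
```

The sixfold dilution obtained for the stop buffer agrees with its designation as a 6× stock.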

B) Results

[0161] Eight representative mutants (belonging to 6 different clusters, see below) were chosen for further characterization (FIG. 9). First, data in yeast were confirmed in mammalian cells, by using an assay based on the transient cotransfection of a meganuclease expressing vector and a target vector, as described in a previous report. The 8 mutant ORFs and the 64 targets were cloned into appropriate vectors, and a robot-assisted microtiter-based protocol was used to co-transfect in CHO cells each selected variant with each one of the 64 different reporter plasmids. Meganuclease-induced recombination was measured by a standard, quantitative ONPG assay that monitors the restoration of a functional β-galactosidase gene. Profiles were found to be qualitatively and quantitatively reproducible in five independent experiments. As shown in FIG. 9a, strong and medium signals were nearly always observed with both yeast and CHO cells (with the exception of ADK), thereby validating the relevance of the yeast HTS process. However, weak signals observed in yeast were often not detected in CHO cells, likely due to a difference in the detection level (see QRR and targets gtg, gct, and ttc). Four mutants were also produced in E. coli and purified by metal affinity chromatography. Their relative in vitro cleavage efficiencies against the wild-type site and their cognate sites were determined. The extent of cleavage under standardized conditions was assessed across a broad range of concentrations for the mutants (FIG. 9b). Similarly, the activity of I-CreI wt on these targets was analysed. In many cases, 100% cleavage of the substrate could not be achieved, likely reflecting the fact that these proteins may have little or no turnover (Perrin et al., EMBO J., 1993, 12, 2939-2947; Wang et al., Nucleic Acids Res., 1997, 25, 3767-3776). In general, the in vitro assay confirmed the data obtained in yeast and CHO cells but, surprisingly, the gtt target was efficiently cleaved by I-CreI.

[0162] Specificity shifts were obvious from the profiles obtained in yeast and CHO: the I-CreI favorite gtc target was not cleaved or barely cleaved, while signals were observed with new targets. This switch of specificity was confirmed for QAN, DRK, RAT and KTG by in vitro analysis, as shown on FIG. 9b. In addition, these four mutants, which display various levels of activity in yeast and CHO (FIG. 9a) were shown to cleave 17-60% of their favorite target in vitro (FIG. 9b), with similar kinetics to I-CreI (half of maximal cleavage by 13-25 nM). Thus, activity was largely preserved by engineering. Third, the number of cleaved targets varied among the mutants: strong cleavers such as QRR, QAN, ARL and KTG have a spectrum of cleavage in the range of what is observed with I-CreI (5-8 detectable signals in yeast, 3-6 in CHO). Specificity is more difficult to compare with mutants that cleave weakly. For example, a single weak signal is observed with DRK but might represent the only detectable signal resulting from the attenuation of a more complex pattern. Nevertheless, the behavior of variants that cleave strongly shows that engineering preserves a very narrow specificity.

Example 4

Hierarchical Clustering Defines Seven I-CreI Variant Families

A) Material and Methods

[0163] Clustering was done using the hclust function of R. We used quantitative data from the primary, low density screening. Both variants and targets were clustered using standard hierarchical clustering with Euclidean distance and Ward's method (Ward, J. H., American Stat. Assoc., 1963, 58, 236-244). Mutant and target dendrograms were reordered to optimize positions of the clusters, and the mutant dendrogram was cut at a height of 8 to define the clusters.
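The clustering step can be sketched as follows. This is not the hclust call used in the source but a small pure-Python agglomerative clustering with Euclidean distance and the Ward (Lance-Williams) update; merge heights may be scaled differently from the hclust version used at the time, so the cut height of 8 serves here only as an illustrative threshold. The cleavage profiles are invented toy data and all names are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt


def ward_clusters(profiles, cut_height):
    """Group profiles by Ward clustering, cutting at `cut_height`.

    Merging stops as soon as the next merge height exceeds the cut,
    which is equivalent to cutting the dendrogram because Ward merge
    heights never decrease.
    """
    clusters = {i: [i] for i in range(len(profiles))}
    # Squared Euclidean distances between current clusters.
    d2 = {
        frozenset((i, j)): dist(profiles[i], profiles[j]) ** 2
        for i, j in combinations(clusters, 2)
    }
    next_id = len(profiles)
    while len(clusters) > 1:
        pair = min(d2, key=d2.get)
        if sqrt(d2[pair]) > cut_height:
            break
        i, j = tuple(pair)
        ni, nj = len(clusters[i]), len(clusters[j])
        dij = d2.pop(pair)
        for k in clusters:
            if k in (i, j):
                continue
            nk = len(clusters[k])
            dik = d2.pop(frozenset((i, k)))
            djk = d2.pop(frozenset((j, k)))
            # Lance-Williams update for Ward's criterion.
            d2[frozenset((next_id, k))] = (
                (ni + nk) * dik + (nj + nk) * djk - nk * dij
            ) / (ni + nj + nk)
        clusters[next_id] = clusters.pop(i) + clusters.pop(j)
        next_id += 1
    return sorted(sorted(members) for members in clusters.values())


# Toy cleavage intensities of four variants on three targets.
profiles = [[9, 1, 0], [8, 2, 0], [0, 1, 9], [0, 0, 8]]
print(ward_clusters(profiles, cut_height=8))
```

On these toy profiles, the two variants that cleave the first target group together, as do the two that cleave the third, and raising the cut height eventually merges everything into a single cluster.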

B) Results

[0164] Next, hierarchical clustering was used to determine whether families could be identified among the numerous and diverse cleavage patterns of the variants. Since primary and secondary screening gave congruent results, quantitative data from the first round of yeast low density screening was used for analysis, to permit a larger sample size. Both variants and targets were clustered using standard hierarchical clustering with Euclidean distance and Ward's method (Ward, J. H., precited) and seven clusters were defined (FIG. 10b). Detailed analysis is shown for 3 of them (FIG. 10c) and the results are summarized in Table I.

TABLE I
Cluster Analysis

Cluster       Examples    Three preferred targets ¹  Nucleotide in     Preferred amino acid ²
              (FIG. 3a)   sequence   % cleavage      position 4 (%) ¹  44              68        70

1             QAN         gtt        46.2            g   0.5           Q
                          gtc        18.3            a   2.0           80.5%
77 proteins               gtg        13.6            t  82.4           (62/77)
                          Σ =        78.1            c  15.1

2             QRR         gtt        13.4            g   0             Q               R
                          gtc        11.8            a   4.9           100.0%          100.0%
8 proteins                tct        11.4            t  56.9           (8/8)           (8/8)
                          Σ =        36.6            c  38.2

3             ARL         gat        27.9            g   2.4           A               R
                          tat        23.2            a  88.9           63.0%           33.8%
65 proteins               gag        15.7            t   5.7           (41/65)         (22/65)
                          Σ =        66.8            c   3.0

4             AGR         gac        22.7            g   0.3           A & N           R         R
                          tac        14.5            a  91.9           51.6% & 35.4%   48.4%     67.7%
31 proteins               gat        13.4            t   6.6           (16 & 11/31)    15/31     21/31
                          Σ =        50.6            c   1.2

5             ADK         gat        29.2            g   1.6
              DRK         tat        15.4            a  73.8
81 proteins               gac        11.4            t  13.4
                          Σ =        56.0            c  11.2

6             KTG         cct        30.1            g   0             K
              RAT         tct        19.6            a   4.0           62.7%
51 proteins               tcc        13.9            t   6.3           (32/51)
                          Σ =        63.6            c  89.7

7                         cct        20.8            g   0             K
                          tct        19.6            a   0.2           91.9%
37 proteins               tcc        15.3            t  14.4           (34/37)
                          Σ =        55.7            c  85.4

¹ Frequencies according to the cleavage index, as described in FIG. 10c.
² In each position, residues present in more than 1/3 of the cluster are indicated.

[0165] For each cluster, a set of preferred targets could be identified on the basis of the frequency and intensity of the signal (FIG. 10c). The three preferred targets for each cluster are indicated in Table I, with their cleavage frequencies. The sum of these frequencies is a measurement of the specificity of the cluster. For example, in cluster 1, the three preferred targets (gtt/c/g) account for 78.1% of the observed cleavage, with 46.2% for gtt alone, revealing a very narrow specificity. Indeed, this cluster includes several proteins which, like QAN, cleave mostly gtt (FIG. 9a). In contrast, the three preferred targets in cluster 2 represent only 36.6% of all observed signals. In accordance with the relatively broad and diverse patterns observed in this cluster, QRR cleaves 5 targets (FIG. 9a), while the activity of other cluster members is not restricted to these 5 targets.
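The specificity measure described above is a plain sum, as the following sketch shows for clusters 1 and 2, using the cleavage frequencies listed in Table I. The dictionary layout and function name are illustrative.

```python
# Cleavage frequencies (%) of the three preferred targets, from Table I.
preferred = {
    1: {"gtt": 46.2, "gtc": 18.3, "gtg": 13.6},
    2: {"gtt": 13.4, "gtc": 11.8, "tct": 11.4},
}


def specificity(freqs):
    """Sum of the frequencies of the preferred targets, in percent."""
    return round(sum(freqs.values()), 1)


for cluster, freqs in preferred.items():
    print(cluster, specificity(freqs))
```

The sums reproduce the 78.1% and 36.6% quoted in the text for clusters 1 and 2.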

[0166] Analysis of the residues found in each cluster showed strong biases for position 44: Q is overwhelmingly represented in clusters 1 and 2, whereas A and N are more frequent in clusters 3 and 4, and K in clusters 6 and 7. Meanwhile, these biases were correlated with strong base preferences for DNA positions ±4, with a large majority of t:a base pairs in clusters 1 and 2, a:t in clusters 3, 4 and 5, and c:g in clusters 6 and 7 (see Table I). The structure of I-CreI bound to its target shows that residue Q44 interacts with the bottom strand in position -4 (and the top strand of position +4, see FIGS. 1b and 1c). These results suggest that this interaction is largely conserved in our mutants, and reveal a "code", wherein Q44 would establish contact with adenine, A44 (or less frequently N44) with thymine, and K44 with guanine. Such correlation was not observed for positions 68 and 70.
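The proposed "code" can be written down as a lookup, sketched below. It assumes that variant names list residues 44, 68 and 70 in that order (as in QAN or KTG) and that the contacted base lies on the bottom strand at position -4, so the top-strand base is its complement. The names are hypothetical, and the mapping is a correlation reported above, not a guaranteed predictor.

```python
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}

# Base proposed to be contacted on the bottom strand at position -4.
CONTACTED_BASE = {"Q": "a", "A": "t", "N": "t", "K": "g"}


def predicted_top_strand_base(variant):
    """Top-strand base at position -4 suggested by residue 44.

    Returns None when residue 44 is not covered by the proposed code.
    """
    contacted = CONTACTED_BASE.get(variant[0].upper())
    return COMPLEMENT[contacted] if contacted else None


for name in ("QAN", "ARL", "KTG", "DRK"):
    print(name, predicted_top_strand_base(name))
```

For QAN, ARL and KTG the suggested bases (t, a and c) match the middle nucleotide of their most frequent targets in Table I (gtt, gat and cct).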

Example 5

Variants Can be Assembled in Functional Heterodimers to Cleave New DNA Target Sequences

A) Materials and Methods

[0167] The 75 hybrid target sequences were cloned as follows: oligonucleotides were designed that contained two different half sites of each mutant palindrome (PROLIGO). Double-stranded target DNA, generated by PCR amplification of the single stranded oligonucleotides, was cloned using the Gateway protocol (INVITROGEN) into yeast and mammalian reporter vectors. Yeast reporter vectors were transformed into S. cerevisiae strain FYBL2-7B (MATα, ura3Δ851, trp1Δ63, leu2Δ1, lys2Δ202).

B) Results

[0168] Variants are homodimers capable of cleaving palindromic sites. To test whether the list of cleavable targets could be extended by creating heterodimers that would cleave hybrid cleavage sites (as described in FIG. 11), a subset of I-CreI variants with distinct profiles was chosen and cloned in two different yeast vectors marked by LEU2 or KAN genes. Combinations of mutants having mutations at positions 44, 68 and/or 70 and N at position 75 were then co-expressed in yeast with a set of palindromic and non-palindromic chimeric DNA targets. An example is shown in FIG. 12: co-expression of the K44, T68, G70, N75 (KTG) and Q44, A68, N70, N75 (QAN) mutants resulted in the cleavage of two chimeric targets, gtt/gcc and gtt/cct, that were not cleaved by either mutant alone. The palindromic gtt, cct and gcc targets (and other targets of KTG and QAN) were also cleaved, likely resulting from homodimeric species formation, but unrelated targets were not. In addition, a gtt, cct or gcc half-site was not sufficient to allow cleavage, since such targets were fully resistant (see ggg/gcc, gat/gcc, gcc/tac, and many others, in FIG. 12). Unexpected cleavage was observed only with gtc/cct and gtt/gtc, with KTG and QAN homodimers, respectively, but the signal remained very weak. Thus, efficient cleavage requires the cooperative binding of two mutant monomers. These results demonstrate a good level of specificity for heterodimeric species.
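A chimeric target such as gtt/cct can be pictured with the sketch below. The hybrid oligonucleotide sequences are not given in this example, so the layout is an assumption modeled on the palindromic targets of the sequence listing: the left half-site is written as such, and the right half-site appears as its reverse complement on the top strand. Function names are hypothetical.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def hybrid_target(left, right):
    """Assumed 24-nt layout of a chimeric target named left/right."""
    return "tcaaaac" + left + "gtac" + reverse_complement(right) + "gttttga"


def expected_sites(half_a, half_b):
    """Sites expected when two variants with these half-sites are co-expressed."""
    return {
        "homodimer A": hybrid_target(half_a, half_a),
        "homodimer B": hybrid_target(half_b, half_b),
        "heterodimer": hybrid_target(half_a, half_b),
    }


for label, site in expected_sites("gtt", "cct").items():
    print(label, site)
```

When both half-sites are identical the function returns the corresponding palindromic target, which is why homodimer-derived signals are expected alongside the heterodimer-specific one.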

[0169] Altogether, a total of 112 combinations of 14 different proteins were tested in yeast, and 37.5% of the combinations (42/112) revealed a positive signal on their predicted chimeric target. Quantitative data are shown for six examples in FIG. 13a, and for the same six combinations, results were confirmed in CHO cells in transient co-transfection experiments, with a subset of relevant targets (FIG. 13b). As a general rule, functional heterodimers were always obtained when one of the two expressed proteins gave a strong signal as homodimer. For example, DRN and RRN, two low activity mutants, give functional heterodimers with strong cutters such as KTG or QRR (FIGS. 13a and 13b), whereas no cleavage of chimeric targets could be detected by co-expression of the same weak mutants.

Example 6

Cleavage of a Natural DNA Target by Assembled Heterodimer

A) Materials and Methods

a) Genome Survey

[0170] A natural target potentially cleaved by an I-CreI variant was identified by scanning the public databases for genomic sequences matching the pattern caaaacnnnnnnnnnngttttg, wherein n is a, t, c, or g (SEQ ID NO: 78). The natural target DNA sequence caaaactatgtagagggttttg (SEQ ID NO: 75) was identified in mouse chromosome 17.

[0171] This DNA sequence is potentially cleaved by a combination of two I-CreI variants cleaving the sequences tcaaaactatgtgaatagttttga (SEQ ID NO: 76) and tcaaaaccctgtgaagggttttga (SEQ ID NO: 77), respectively.
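The genome survey can be sketched as a regular-expression scan. This is an illustration on a toy string, not the database search actually performed; the function names are hypothetical. The half-sites are taken as the triplets adjacent to the conserved caaaac and gttttg arms, each read towards the center of the site, so the right-hand triplet is reverse complemented.

```python
import re

COMPLEMENT = str.maketrans("acgt", "tgca")
# Lookahead so that overlapping candidate sites are also reported.
PATTERN = re.compile(r"(?=(caaaac([acgt]{10})gttttg))")


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan(genome):
    """Yield (start, site, left_half, right_half) for each matching site."""
    for match in PATTERN.finditer(genome.lower()):
        site, core = match.group(1), match.group(2)
        yield match.start(), site, core[:3], reverse_complement(core[-3:])


toy_genome = "ggg" + "caaaactatgtagagggttttg" + "ccc"
for hit in scan(toy_genome):
    print(hit)
```

For the mouse sequence of SEQ ID NO: 75, the two half-sites obtained are tat and cct, the triplets carried by the palindromic sequences of SEQ ID NO: 76 and 77. Because the pattern is its own reverse complement, scanning one strand is sufficient.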

b) Isolation of Meganuclease Variants

[0172] Variants were selected by the cleavage-induced recombination assay in yeast, as described in example 1, using the sequence tcaaaactatgtgaatagttttga (SEQ ID NO: 76) or the sequence tcaaaaccctgtgaagggttttga (SEQ ID NO: 77) as targets.

c) Construction of the Target Plasmid

[0173] Oligonucleotides were designed that contained two different half sites of each mutant palindrome (PROLIGO). Double-stranded target DNA, generated by PCR amplification of the single stranded oligonucleotides, was cloned using the Gateway protocol (INVITROGEN) into the mammalian reporter vector pcDNA3.1-LACURAZ-AURA, described previously (Epinat et al., precited), to generate the target LagoZ plasmid.

d) Construction of Meganuclease Expression Vector

[0174] The open reading frames (ORFs) of the clones identified during the screening in yeast were amplified by PCR on yeast colony and cloned individually in the CHO expression vector pCDNA6.2 (INVITROGEN), as described in example 1. I-CreI variants were expressed under the control of the CMV promoter.

e) Mammalian Cells Assay

[0175] CHO-K1 cells were transiently co-transfected with equimolar amounts of target LagoZ plasmid and expression plasmids, and the beta-galactosidase activity was measured as described in examples 3 and 5.

B) Results

[0176] A natural DNA target potentially cleaved by I-CreI variants was identified by performing a genome survey of sequences matching the pattern caaaacnnnnnnnnnngttttg (SEQ ID NO: 78). A randomly chosen DNA sequence (SEQ ID NO: 75) identified in chromosome 17 of the mouse was cloned into a reporter plasmid. This DNA target was potentially cleaved by a combination of the I-CreI variants A44, R68, S70, N75 (ARS) and K44, R68, E70, N75 (KRE).

[0177] The co-expression of these two variants in CHO cells leads to the formation of a functional heterodimeric protein, as shown in FIG. 14. Indeed, when the I-CreI variants were expressed individually, virtually no cleavage activity could be detected on the mouse DNA target, although the KRE protein showed a residual activity. In contrast, when these two variants were co-expressed together with the plasmid carrying the potential target, a strong beta-galactosidase activity could be measured. Altogether, these data revealed that heterodimerization occurred in the CHO cells and that heterodimers were functional.

[0178] These data demonstrate that heterodimeric proteins created by assembling homodimeric variants extend the list of naturally occurring DNA target sequences to all the potential hybrid cleavable targets resulting from all possible combinations of the variants.

[0179] Moreover, these data demonstrate that it is possible to predict the DNA sequences that can be cleaved by a combination of variants, knowing the DNA targets they cleave individually as homodimers. Furthermore, the nucleotides at positions 1 and 2 (and -1 and -2) of the target can be different from gtac, indicating that they play little role in DNA/protein interaction.

REFERENCES

[0180] Belfort, M. and Roberts, R. J. (1997) Homing endonucleases: keeping the house in order. Nucleic Acids Res, 25, 3379-3388.

[0181] Bell-Pedersen, D., Quirk, S., Clyman, J. and Belfort, M. (1990) Intron mobility in phage T4 is dependent upon a distinctive class of endonucleases and independent of DNA sequences encoding the intron core: mechanistic and evolutionary implications. Nucleic Acids Res, 18, 3763-3770.

[0182] Bell-Pedersen, D., Quirk, S. M., Aubrey, M. and Belfort, M. (1989) A site-specific endonuclease and co-conversion of flanking exons associated with the mobile td intron of phage T4. Gene, 82, 119-126.

[0183] Bell-Pedersen, D., Quirk, S. M., Bryk, M. and Belfort, M. (1991) I-TevI, the endonuclease encoded by the mobile td intron, recognizes binding and cleavage domains on its DNA target. Proc Natl Acad Sci USA, 88, 7719-7723.

[0184] Bibikova, M., Beumer, K., Trautman, J. K. and Carroll, D. (2003) Enhancing gene targeting with designed zinc finger nucleases. Science, 300, 764.

[0185] Bibikova, M., Carroll, D., Segal, D. J., Trautman, J. K., Smith, J., Kim, Y. G. and Chandrasegaran, S. (2001) Stimulation of homologous recombination through targeted cleavage by chimeric nucleases. Mol Cell Biol, 21, 289-297.

[0186] Bibikova, M., Golic, M., Golic, K. G. and Carroll, D. (2002) Targeted chromosomal cleavage and mutagenesis in Drosophila using zinc-finger nucleases. Genetics, 161, 1169-1175.

[0187] Chevalier, B., Sussman, D., Otis, C., Noel, A. J., Turmel, M., Lemieux, C., Stephens, K., Monnat, R. Jr. and Stoddard, B. L. (2004) Metal-dependent DNA cleavage mechanism of the I-CreI LAGLIDADG (SEQ ID NO: 91) homing endonuclease. Biochemistry, 43, 14015-14026.

[0188] Chevalier, B. S., Kortemme, T., Chadsey, M. S., Baker, D., Monnat, R. J. and Stoddard, B. L. (2002) Design, activity, and structure of a highly specific artificial endonuclease. Mol Cell, 10, 895-905.

[0189] Chevalier, B. S. and Stoddard, B. L. (2001) Homing endonucleases: structural and functional insight into the catalysts of intron/intein mobility. Nucleic Acids Res, 29, 3757-3774.

[0190] Choulika, A., Perrin, A., Dujon, B. and Nicolas, J. F. (1995) Induction of homologous recombination in mammalian chromosomes by using the I-SceI system of Saccharomyces cerevisiae. Mol Cell Biol, 15, 1968-1973.

[0191] Cohen-Tannoudji, M., Robine, S., Choulika, A., Pinto, D., El Marjou, F., Babinet, C., Louvard, D. and Jaisser, F. (1998) I-SceI-induced gene replacement at a natural locus in embryonic stem cells. Mol Cell Biol, 18, 1444-1448.

[0192] Colleaux, L., D'Auriol, L., Galibert, F. and Dujon, B. (1988) Recognition and cleavage site of the intron-encoded omega transposase. Proc Natl Acad Sci USA, 85, 6022-6026.

[0193] Donoho, G., Jasin, M. and Berg, P. (1998) Analysis of gene targeting and intrachromosomal homologous recombination stimulated by genomic double-strand breaks in mouse embryonic stem cells. Mol Cell Biol, 18, 4070-4078.

[0194] Elliott, B., Richardson, C., Winderbaum, J., Nickoloff, J. A. and Jasin, M. (1998) Gene conversion tracts from double-strand break repair in mammalian cells. Mol Cell Biol, 18, 93-101.

[0195] Epinat, J. C., Arnould, S., Chames, P., Rochaix, P., Desfontaines, D., Puzin, C., Patin, A., Zanghellini, A., Paques, F. and Lacroix, E. (2003) A novel engineered meganuclease induces homologous recombination in yeast and mammalian cells. Nucleic Acids Res, 31, 2952-2962.

[0196] Gietz, R. D. and Woods, R. A. (2002) Transformation of yeast by lithium acetate/single-stranded carrier DNA/polyethylene glycol method. Methods Enzymol, 350, 87-96.

[0197] Haber, J. E. (1998) Mating-type gene switching in Saccharomyces cerevisiae. Annu Rev Genet, 32, 561-599.

[0198] Jacquier, A. and Dujon, B. (1985) An intron-encoded protein is active in a gene conversion process that spreads an intron into a mitochondrial gene. Cell, 41, 383-394.

[0199] Jurica, M. S., Monnat, R. J., Jr. and Stoddard, B. L. (1998) DNA recognition and cleavage by the LAGLIDADG (SEQ ID NO: 91) homing endonuclease I-CreI. Mol Cell, 2, 469-476.

[0200] Klar, A. J., Strathern, J. N. and Abraham, J. A. (1984) Involvement of double-strand chromosomal breaks for mating-type switching in Saccharomyces cerevisiae. Cold Spring Harb Symp Quant Biol, 49, 77-88.

[0201] Kostriken, R., Strathern, J. N., Klar, A. J., Hicks, J. B. and Heffron, F. (1983) A site-specific endonuclease essential for mating-type switching in Saccharomyces cerevisiae. Cell, 35, 167-174.

[0202] Liang, F., Han, M., Romanienko, P. J. and Jasin, M. (1998) Homology-directed repair is a major double-strand break repair pathway in mammalian cells. Proc Natl Acad Sci USA, 95, 5172-5177.

[0203] Moynahan, M. E. and Jasin, M. (1997) Loss of heterozygosity induced by a chromosomal double-strand break. Proc Natl Acad Sci USA, 94, 8988-8993.

[0204] Mueller, J. E., Smith, D. and Belfort, M. (1996) Exon coconversion biases accompanying intron homing: battle of the nucleases. Genes Dev, 10, 2158-2166.

[0205] Perrin, A., Buckle, M. and Dujon, B. (1993) Asymmetrical recognition and activity of the I-SceI endonuclease on its site and on intron-exon junctions. Embo J, 12, 2939-2947.

[0206] Plessis, A., Perrin, A., Haber, J. E. and Dujon, B. (1992) Site-specific recombination determined by I-SceI, a mitochondrial group I intron-encoded endonuclease expressed in the yeast nucleus. Genetics, 130, 451-460.

[0207] Porteus, M. H. and Baltimore, D. (2003) Chimeric nucleases stimulate gene targeting in human cells. Science, 300, 763.

[0208] Posfai, G., Kolisnychenko, V., Bereczki, Z. and Blattner, F. R. (1999) Markerless gene replacement in Escherichia coli stimulated by a double-strand break in the chromosome. Nucleic Acids Res, 27, 4409-4415.

[0209] Puchta, H. (1999) Double-strand break-induced recombination between ectopic homologous sequences in somatic plant cells. Genetics, 152, 1173-1181.

[0210] Puchta, H., Dujon, B. and Hohn, B. (1993) Homologous recombination in plant cells is enhanced by in vivo induction of double strand breaks into DNA by a site-specific endonuclease. Nucleic Acids Res, 21, 5034-5040.

[0211] Puchta, H., Dujon, B. and Hohn, B. (1996) Two different but related mechanisms are used in plants for the repair of genomic double-strand breaks by homologous recombination. Proc Natl Acad Sci USA, 93, 5055-5060.

[0212] Richardson, C., Moynahan, M. E. and Jasin, M. (1998) Double-strand break repair by interchromosomal recombination: suppression of chromosomal translocations. Genes Dev, 12, 3831-3842.

[0213] Rong, Y. S. and Golic, K. G. (2000) Gene targeting by homologous recombination in Drosophila. Science, 288, 2013-2018.

[0214] Rong, Y. S. and Golic, K. G. (2001) A targeted gene knockout in Drosophila. Genetics, 157, 1307-1312.

[0215] Rong, Y. S., Titen, S. W., Xie, H. B., Golic, M. M., Bastiani, M., Bandyopadhyay, P., Olivera, B. M., Brodsky, M., Rubin, G. M. and Golic, K. G. (2002) Targeted mutagenesis by homologous recombination in D. melanogaster. Genes Dev, 16, 1568-1581.

[0216] Rouet, P., Smih, F. and Jasin, M. (1994) Introduction of double-strand breaks into the genome of mouse cells by expression of a rare-cutting endonuclease. Mol Cell Biol, 14, 8096-8106.

[0217] Seligman et al. (1997) Genetics, 147, 1653-1664.

[0218] Seligman, L. M., Chisholm, K. M., Chevalier, B. S., Chadsey, M. S., Edwards, S. T., Savage, J. H. and Veillet, A. L. (2002) Mutations altering the cleavage specificity of a homing endonuclease. Nucleic Acids Res, 30, 3870-3879.

[0219] Siebert, R. and Puchta, H. (2002) Efficient repair of genomic double-strand breaks by homologous recombination between directly repeated sequences in the plant genome. Plant Cell, 14, 1121-1131.

[0220] Sussman, D., Chadsey, M., Fauce, S., Engel, A., Bruett, A., Monnat, R., Jr., Stoddard, B. L. and Seligman, L. M. (2004) Isolation and characterization of new homing endonuclease specificities at individual target site positions. J Mol Biol, 342, 31-41.

[0221] Thierry, A. and Dujon, B. (1992) Nested chromosomal fragmentation in yeast using the meganuclease I-Sce I: a new method for physical mapping of eukaryotic genomes. Nucleic Acids Res, 20, 5625-5631.

Sequence Listing

Number of sequences: 92

SEQ ID NOs: 1 to 66: DNA, 24 nucleotides each, Artificial Sequence. Description of Artificial Sequence: Synthetic oligonucleotide.

1   tcaaaacggg gtaccccgtt ttga
2   tcaaaacgga gtactccgtt ttga
3   tcaaaacggt gtacaccgtt ttga
4   tcaaaacggc gtacgccgtt ttga
5   tcaaaacgag gtacctcgtt ttga
6   tcaaaacgaa gtacttcgtt ttga
7   tcaaaacgat gtacatcgtt ttga
8   tcaaaacgac gtacgtcgtt ttga
9   tcaaaacgtg gtaccacgtt ttga
10  tcaaaacgta gtactacgtt ttga
11  tcaaaacgtt gtacaacgtt ttga
12  tcaaaacgtc gtacgacgtt ttga
13  tcaaaacgcg gtaccgcgtt ttga
14  tcaaaacgca gtactgcgtt ttga
15  tcaaaacgct gtacagcgtt ttga
16  tcaaaacgcc gtacggcgtt ttga
17  tcaaaacagg gtaccctgtt ttga
18  tcaaaacaga gtactctgtt ttga
19  tcaaaacagt gtacactgtt ttga
20  tcaaaacagc gtacgctgtt ttga
21  tcaaaacaag gtaccttgtt ttga
22  tcaaaacaaa gtactttgtt ttga
23  tcaaaacaat gtacattgtt ttga
24  tcaaaacaac gtacgttgtt ttga
25  tcaaaacatg gtaccatgtt ttga
26  tcaaaacata gtactatgtt ttga
27  tcaaaacatt gtacaatgtt ttga
28  tcaaaacatc gtacgatgtt ttga
29  tcaaaacacg gtaccgtgtt ttga
30  tcaaaacaca gtactgtgtt ttga
31  tcaaaacact gtacagtgtt ttga
32  tcaaaacacc gtacggtgtt ttga
33  tcaaaactgg gtacccagtt ttga
34  tcaaaactga gtactcagtt ttga
35  tcaaaactgt gtacacagtt ttga
36  tcaaaactgc gtacgcagtt ttga
37  tcaaaactag gtacctagtt ttga
38  tcaaaactaa gtacttagtt ttga
39  tcaaaactat gtacatagtt ttga
40  tcaaaactac gtacgtagtt ttga
41  tcaaaacttg gtaccaagtt ttga
42  tcaaaactta gtactaagtt ttga
43  tcaaaacttt gtacaaagtt ttga
44  tcaaaacttc gtacgaagtt ttga
45  tcaaaactcg gtaccgagtt ttga
46  tcaaaactca gtactgagtt ttga
47  tcaaaactct gtacagagtt ttga
48  tcaaaactcc gtacggagtt ttga
49  tcaaaaccgg gtacccggtt ttga
50  tcaaaaccga gtactcggtt ttga
51  tcaaaaccgt gtacacggtt ttga
52  tcaaaaccgc gtacgcggtt ttga
53  tcaaaaccag gtacctggtt ttga
54  tcaaaaccaa gtacttggtt ttga
55  tcaaaaccat gtacatggtt ttga
56  tcaaaaccac gtacgtggtt ttga
57  tcaaaacctg gtaccaggtt ttga
58  tcaaaaccta gtactaggtt ttga
59  tcaaaacctt gtacaaggtt ttga
60  tcaaaacctc gtacgaggtt ttga
61  tcaaaacccg gtaccgggtt ttga
62  tcaaaaccca gtactgggtt ttga
63  tcaaaaccct gtacagggtt ttga
64  tcaaaacccc gtacggggtt ttga
65  tcaaaacgtc gtgagacagt ttgg
66  ccaaactgtc tcgagacagt ttgg

SEQ ID NO: 67: DNA, 49 nucleotides, Artificial Sequence. Description of Artificial Sequence: Synthetic primer.

gtttaaacat cagctaagct tgacctttvv kgtgactcaa aagacccag

SEQ ID NO: 68: DNA, 45 nucleotides, Artificial Sequence. Description of Artificial Sequence: Synthetic primer.

gatgtagttg gaaacggatc cmbbatcmbb tacgtaacca acgcc

SEQ ID NO: 69: DNA, 501 nucleotides, Chlamydomonas reinhardtii. CDS (1)..(501). Each nucleotide line ends with its last nucleotide position and each amino acid line with its last residue number.

atg gcc aat acc aaa tat aac aaa gag ttc ctg ctg tac ctg gcc ggc   48
Met Ala Asn Thr Lys Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly   16
ttt gtg gac ggt gac ggt agc atc atc gct cag att aaa cca aac cag   96
Phe Val Asp Gly Asp Gly Ser Ile Ile Ala Gln Ile Lys Pro Asn Gln   32
tct tat aag ttt aaa cat cag cta agc ttg acc ttt cag gtg act caa   144
Ser Tyr Lys Phe Lys His Gln Leu Ser Leu Thr Phe Gln Val Thr Gln   48
aag acc cag cgc cgt tgg ttt ctg gac aaa cta gtg gat gaa att ggc   192
Lys Thr Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly   64
gtt ggt tac gta cgt gat cgc gga tcc gtt tcc aac tac atc tta agc   240
Val Gly Tyr Val Arg Asp Arg Gly Ser Val Ser Asn Tyr Ile Leu Ser   80
gaa atc aag ccg ctg cac aac ttc ctg act caa ctg cag ccg ttt ctg   288
Glu Ile Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu   96
aaa ctg aaa cag aaa cag gca aac ctg gtt ctg aaa att atc gaa cag   336
Lys Leu Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln   112
ctg ccg tct gca aaa gaa tcc ccg gac aaa ttc ctg gaa gtt tgt acc   384
Leu Pro Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr   128
tgg gtg gat cag att gca gct ctg aac gat tct aag acg cgt aaa acc   432
Trp Val Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr   144
act tct gaa acc gtt cgt gct gtg ctg gac agc ctg agc gag aag aag   480
Thr Ser Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Glu Lys Lys   160
aaa tcc tcc ccg gcg gcc gac   501
Lys Ser Ser Pro Ala Ala Asp   167

SEQ ID NO: 70: PRT, 167 residues, Chlamydomonas reinhardtii. Each line ends with its last residue number.

Met Ala Asn Thr Lys Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly   16
Phe Val Asp Gly Asp Gly Ser Ile Ile Ala Gln Ile Lys Pro Asn Gln   32
Ser Tyr Lys Phe Lys His Gln Leu Ser Leu Thr Phe Gln Val Thr Gln   48
Lys Thr Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly   64
Val Gly Tyr Val Arg Asp Arg Gly Ser Val Ser Asn Tyr Ile Leu Ser   80
Glu Ile Lys 
Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu 85 90 95Lys Leu Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln 100 105 110Leu Pro Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr 115 120 125Trp Val Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr 130 135 140Thr Ser Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Glu Lys Lys145 150 155 160Lys Ser Ser Pro Ala Ala Asp 1657122DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 71caaaacgtcg tgagacagtt tg 227249DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 72ggcatacaag tttcaaaacn nngtacnnng ttttgacaat cgtctgtca 497377DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 73ggggacaagt ttgtacaaaa aagcaggctt cgaaggagat agaaccatgg ccaataccaa 60atataacaaa gagttcc 777464DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 74ggggaccact ttgtacaaga aagctgggtt tagtcggccg ccggggagga tttcttcttc 60tcgc 647522DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 75caaaactatg tagagggttt tg 227624DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 76tcaaaactat gtgaatagtt ttga 247724DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 77tcaaaaccct gtacagggtt ttga 247822DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 78caaaacnnnn nnnnnngttt tg 227924DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 79tcaaaacgtt gtacaacgtt ttga 248024DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 80tcaaaacgtt gtacagggtt ttga 248124DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 81tcaaaacgat gtaaatcgtt ttga 248224DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 82tcaaaacagt gtacgttgtt ttga 248324DNAArtificial SequenceDescription of Artificial Sequence Synthetic 
oligonucleotide 83tcaaaacagc gtacattgtt ttga 248424DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 84tcaaaacatg gtaccatgtt ttga 248524DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 85tcaaaacacg gtaccgtgtt ttga 248624DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 86tcaaaactat gtaaatagtt ttga 248724DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 87tcaaaaccgt gtacgtggtt ttga 248824DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 88tcaaaaccgc gtacatggtt ttga 248924DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 89tcaaaacctg gtaccaggtt ttga 249024DNAArtificial SequenceDescription of Artificial

Sequence Synthetic oligonucleotide 90tcaaaacccg gtaccgggtt ttga 24919PRTArtificial SequenceDescription of Artificial Sequence Synthetic peptide 91Leu Ala Gly Leu Ile Asp Ala Asp Gly1 59240DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 92nnnnnnnnnc aaannnnnnn nnnnnnnttt gnnnnnnnnn 40
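The 24-nucleotide oligonucleotides of SEQ ID NO: 18 to 64 listed above appear to share one palindromic layout: a fixed left arm `tcaaaac`, a variable triplet, the central `gtac`, the reverse complement of the triplet, and a fixed right arm `gttttga`. The Python sketch below is illustrative only and is not part of the listing. It assumes that this layout, and a numbering in which each triplet position cycles through the bases in the order g, a, t, c, extends to the whole series of 64; both assumptions are inferred from the entries shown here. The function names are hypothetical.

```python
from itertools import product

# Watson-Crick complement table for lower-case DNA.
COMPLEMENT = str.maketrans("acgt", "tgca")

# Base order inferred from SEQ ID NO: 18 to 64 (assumption, not stated in the listing).
BASE_ORDER = "gatc"

LEFT_ARM = "tcaaaac"
CENTER = "gtac"
RIGHT_ARM = "gttttga"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lower-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def palindromic_target(triplet: str) -> str:
    """Build a 24-nt palindromic oligonucleotide around the given triplet."""
    return LEFT_ARM + triplet + CENTER + reverse_complement(triplet) + RIGHT_ARM


def enumerate_targets() -> dict[int, str]:
    """Map an inferred sequence number (1..64) to its 24-nt oligonucleotide."""
    triplets = ("".join(t) for t in product(BASE_ORDER, repeat=3))
    return {n: palindromic_target(t) for n, t in enumerate(triplets, start=1)}


if __name__ == "__main__":
    targets = enumerate_targets()
    for number in (18, 39, 64):
        print(number, targets[number])
```

Every oligonucleotide generated this way equals its own reverse complement, which gives a quick consistency check when transcribing entries from the listing.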

* * * * *