Focused Libraries Of Genetic Packages Ladner; Robert Charles [Dyax Corp.]

Patent Application Summary

U.S. patent application number 16/789468 was filed with the patent office on 2020-02-13 and published on 2020-10-15 for focused libraries of genetic packages. This patent application is currently assigned to Dyax Corp. The applicant listed for this patent is Dyax Corp. Invention is credited to Robert Charles Ladner.

Publication Number	20200325469
Application Number	16/789468
Family ID	1000004928735
Publication Date	2020-10-15

United States Patent Application	20200325469
Kind Code	A1
Ladner; Robert Charles	October 15, 2020

FOCUSED LIBRARIES OF GENETIC PACKAGES

Abstract

Focused libraries of vectors or genetic packages that display, display and express, or comprise a member of a diverse family of antibody peptides, polypeptides or proteins and collectively display, display and express, or comprise at least a portion of the focused diversity of the family. The libraries have length and sequence diversities that mimic that found in native human antibodies.

Inventors:

Ladner; Robert Charles; (Ijamsville, MD)

Applicant:

Name	City	State	Country	Type
Dyax Corp.	Lexington	MA	US

Assignee:

Dyax Corp.
Lexington
MA

Family ID:

1000004928735

Appl. No.:

16/789468

Filed:

February 13, 2020

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number	Child Application
15797927	Oct 30, 2017	10604753	16789468
13571661	Aug 10, 2012	9617536	15797927
13250520	Sep 30, 2011	8258082	13571661
12762051	Apr 16, 2010	8895475	13250520
11416460	May 1, 2006		12762051
10026925	Dec 18, 2001		11416460
60256380	Dec 18, 2000

Current U.S. Class:	1/1
Current CPC Class:	C40B 40/10 20130101; C12N 15/1037 20130101; C07K 2317/565 20130101; C07K 16/00 20130101; C07K 16/005 20130101; C40B 40/08 20130101; C40B 40/02 20130101
International Class:	C12N 15/10 20060101 C12N015/10; C07K 16/00 20060101 C07K016/00; C40B 40/02 20060101 C40B040/02

Claims

1-43. (canceled)

44. A focused library of DNA plasmids or genetic packages comprising the DNA plasmids, the focused library comprising multiple different sets of variegated DNA molecules, each of which comprises a DNA sequence that encodes an antibody light chain variable domain having both framework (FW) regions and complementarity determining regions (CDR), wherein each light chain variable domain comprises antibody light chain FW1, antibody light chain CDR1, antibody heavy chain FW2, antibody light chain CDR2, antibody light chain FW3, antibody light chain CDR3, and antibody light chain FW4 in a DNA molecule arranged in the orientation of FW1-CDR1-FW2-CDR2-FW3-CDR3-FW4, and wherein: (a) a first collection of antibody light chains that are kappa light chains, which comprises a plurality of LCκ CDR3 regions selected from the group consisting of: (1) QQ<3><1><1><1>P<1>T (SEQ ID NO:16), wherein <1> is a mixture of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y; and <3> is a mixture of amino acid residues Y, A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, and W; (2) QQ<3><3><1><1><1>P (SEQ ID NO:103), wherein <1> and <3> are as defined in (1) above; (3) QQ<3><2><1><1>PP<1>T (SEQ ID NO:17), wherein <1> and <3> are as defined in (1) above and <2> is a mixture of amino acid residues S, A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, W, and Y; and (4) a mixture of any of (1) to (3) set forth above; or (b) a second collection of antibody light chains that are lambda light chains, which comprise a plurality of LCλ CDR3 regions selected from the group consisting of: (1) <4><5><4><2><4>S<4><4><4><4>V (SEQ ID NO:106), wherein <2> is a mixture of amino acid residues D, N, A, E, F, G, H, I, K, L, M, P, Q, R, S, T, V, W, and Y; <4> is a mixture of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, and W; and <5> is a mixture of amino acid residues S, A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, W, and Y; (2) <5>SY<1><5>S<5><1><4>V (SEQ ID NO:19), wherein <1> is a mixture of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y; and <4> and <5> are as defined in (1) above; and (3) a mixture of (1) and (2) set forth above.

45. The focused library of claim 44, wherein the first collection of antibody light chains are kappa light chains, which further comprise: (a) a plurality of LCκ CDR1 regions selected from the group consisting of: (1) RASQ<1>V<2><2><3>LA (SEQ ID NO:14), wherein <1> is a mixture of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y; <2> is a mixture of amino acid residues S, A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, W, and Y; and <3> is a mixture of amino acid residues Y, A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, W and S; (2) RASQ<1>V<2><2><2><3>LA (SEQ ID NO:15), wherein <1>, <2>, and <3> are as defined in (1) above; and (3) a mixture of (1) and (2) set forth above; (b) a plurality of LCκ CDR2 regions <1>AS<2>R<4><1> (SEQ ID NO:102), wherein <1> is a mixture of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y; <2> is a mixture of amino acid residues S, A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, W, and Y; and <4> is a mixture of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y; or both (a) and (b).

46. The focused library of claim 44, wherein the library is a library of DNA plasmids.

47. The focused library of claim 44, wherein the library is a library of genetic packages comprising the DNA plasmids.

48. The focused library of DNA plasmids of claim 44, wherein the DNA plasmids are in yeast cells.

49. The focused library of claim 48, wherein the yeast cells display the plurality of antibody light chain variable regions.

50. The focused library of claim 44, wherein the DNA plasmids are yeast DNA plasmids.

51. The focused library of claim 44, wherein the LCκ CDR3s (1), (2), and (3) are present in the library in a ratio of 0.65:0.1:0.25.

52. The focused library of claim 45, wherein the LCκ CDR1s (1) and (2) are present in the library in a ratio of 0.68:0.32.

53. The focused library of claim 52, wherein the first collection of antibody light chains are kappa light chains, and wherein the library further comprises a second set of variegated DNA molecules encoding a second collection of antibody light chains, which are lambda light chains comprising a plurality of LCλ CDR3 regions selected from the group consisting of: (1) <4><5><4><2><4>S<4><4><4><4>V (SEQ ID NO:106), wherein <2> is a mixture of amino acid residues D, N, A, E, F, G, H, I, K, L, M, P, Q, R, S, T, V, W, and Y; <4> is a mixture of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, and W; and <5> is a mixture of amino acid residues S, A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, W, and Y; (2) <5>SY<1><5>S<5><1><4>V (SEQ ID NO:19), wherein <1> is a mixture of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y; and <4> and <5> are as defined in (1) above; and (3) a mixture of (1) and (2) set forth above.

54. The focused library of claim 53, wherein the second collection of antibody light chains further comprises: (a) a plurality of LCλ CDR1 regions selected from the group consisting of: (1) TG<1>SS<2>VG<1><3><2><3>VS (SEQ ID NO:18), wherein <1> is a mixture of amino acid residues T, G, A, D, E, F, H, I, K, L, M, N, P, Q, R, S, V, W, and Y, <2> is a mixture of amino acid residues D, N, A, E, F, G, H, I, K, L, M, P, Q, R, S, T, V, W, and Y, and <3> is a mixture of amino acid residues Y, A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, and W; (2) G<2><4>L<4><4><4><3><4><4> (SEQ ID NO:104), wherein <2> is as defined in (1) above, <3> is a mixture of amino acid residues Y, A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, and W, and <4> is a mixture of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y; and (b) a plurality of LCλ CDR2 regions <4><4><4><2>RPS (SEQ ID NO:105), wherein <2> is a mixture of amino acid residues D, N, A, E, F, G, H, I, K, L, M, P, Q, R, S, T, V, W, and Y, and <4> is a mixture of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, and W; or both (a) and (b).

55. The focused library of claim 54, wherein the LCλ CDR3s (1) and (2) are present in the library in an equimolar mixture.

56. The focused library of claim 54, wherein the LCλ CDR1s (1) and (2) are present in the library in a ratio of 0.67:0.33.

57. The focused library of claim 44, wherein the first collection of antibody light chains are lambda chains, which further comprise (a) a plurality of LCλ CDR1 regions selected from the group consisting of: (1) TG<1>SS<2>VG<1><3><2><3>VS (SEQ ID NO:18), wherein <1> is a mixture of amino acid residues T, G, A, D, E, F, H, I, K, L, M, N, P, Q, R, S, V, W, and Y, <2> is a mixture of amino acid residues D, N, A, E, F, G, H, I, K, L, M, P, Q, R, S, T, V, W, and Y, and <3> is a mixture of amino acid residues Y, A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, and W; (2) G<2><4>L<4><4><4><3><4><4> (SEQ ID NO:104), wherein <2> is as defined in (1) above, <3> is a mixture of amino acid residues Y, A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, and W, and <4> is a mixture of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y; and (3) a mixture of (1) and (2) set forth above; (b) a plurality of LCλ CDR2 regions <4><4><4><2>RPS (SEQ ID NO:105), wherein <2> is a mixture of amino acid residues D, N, A, E, F, G, H, I, K, L, M, P, Q, R, S, T, V, W, and Y and <4> is a mixture of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, and W; or both (a) and (b).

58. The focused library of claim 57, wherein the LCλ CDR3s (1) and (2) are present in the library in an equimolar mixture.

59. The focused library of claim 57, wherein the LCλ CDR1s (1) and (2) are present in the library in a ratio of 0.67:0.33.

Description

[0001] This application is a continuation of U.S. application Ser. No. 15/797,927, filed Oct. 30, 2017, now allowed, which is a continuation of U.S. application Ser. No. 13/571,661, filed Aug. 10, 2012, now U.S. Pat. No. 9,617,536, which is a continuation of U.S. application Ser. No. 13/250,520, filed Sep. 30, 2011, now U.S. Pat. No. 8,258,082, which is a continuation of U.S. application Ser. No. 12/762,051, filed Apr. 16, 2010, now U.S. Pat. No. 8,895,475, which is a continuation of U.S. application Ser. No. 11/416,460, filed on May 1, 2006, now abandoned, which is a continuation of U.S. application Ser. No. 10/026,925, filed on Dec. 18, 2001, now abandoned, which claims the benefit under 35 U.S.C. § 119 of U.S. provisional application 60/256,380, filed Dec. 18, 2000, the entire content of each of which is herein incorporated by reference.

[0002] The present invention relates to focused libraries of genetic packages that each display, display and express, or comprise a member of a diverse family of peptides, polypeptides or proteins and collectively display, display and express, or comprise at least a portion of the focused diversity of the family. The focused diversity of the libraries of this invention comprises both sequence diversity and length diversity. In a preferred embodiment, the focused diversity of the libraries of this invention is biased toward the natural diversity of the selected family. In a more preferred embodiment, the libraries are biased toward the natural diversity of human antibodies and are characterized by variegation in their heavy chain and light chain complementarity determining regions ("CDRs").

[0003] The present invention further relates to vectors and genetic packages (e.g., cells, spores or viruses) for displaying, or displaying and expressing a focused diverse family of peptides, polypeptides or proteins. In a preferred embodiment the genetic packages are filamentous phage or phagemids or yeast. Again, the focused diversity of the family comprises diversity in sequence and diversity in length.

[0004] The present invention further relates to methods of screening the focused libraries of the invention and to the peptides, polypeptides and proteins identified by such screening.

BACKGROUND OF THE INVENTION

[0005] It is now common practice in the art to prepare libraries of genetic packages that individually display, display and express, or comprise a member of a diverse family of peptides, polypeptides or proteins and collectively display, display and express, or comprise at least a portion of the amino acid diversity of the family. In many common libraries, the peptides, polypeptides or proteins are related to antibodies (e.g., single chain Fv (scFv), Fv, Fab, whole antibodies or minibodies (i.e., dimers that consist of VH linked to VL)). Often, they comprise one or more of the CDRs and framework regions of the heavy and light chains of human antibodies.

[0006] Peptide, polypeptide or protein libraries have been produced in several ways in the prior art. See, e.g., Knappik et al., J. Mol. Biol., 296, pp. 57-86 (2000), which is incorporated herein by reference. One method is to capture the diversity of native donors, either naive or immunized. Another way is to generate libraries having synthetic diversity. A third method is a combination of the first two. Typically, the diversity produced by these methods is limited to sequence diversity, i.e., each member of the library differs from the other members of the family by having different amino acids or variegation at a given position in the peptide, polypeptide or protein chain. Naturally diverse peptides, polypeptides or proteins, however, are not limited to diversity only in their amino acid sequences. For example, human antibodies are not limited to sequence diversity in their amino acids; they are also diverse in the lengths of their amino acid chains.

[0007] For antibodies, diversity in length occurs, for example, during variable region rearrangements. See, e.g., Corbett et al., J. Mol. Biol., 270, pp. 587-97 (1997). The joining of V genes to J genes, for example, results in the inclusion of a recognizable D segment in CDR3 in about half of the heavy chain antibody sequences, thus creating regions encoding varying lengths of amino acids. The following also may occur during joining of antibody gene segments: (i) the end of the V gene may have zero to several bases deleted or changed; (ii) the end of the D segment may have zero to many bases removed or changed; (iii) a number of random bases may be inserted between V and D or between D and J; and (iv) the 5' end of J may be edited to remove or to change several bases. These rearrangements result in antibodies that are diverse both in amino acid sequence and in length.

[0008] Libraries that contain only amino acid sequence diversity are thus disadvantaged in that they do not reflect the natural diversity of the peptide, polypeptide or protein that the library is intended to mimic. Further, diversity in length may be important to the ultimate functioning of the protein, peptide or polypeptide. For example, with regard to a library comprising antibody regions, many of the peptides, polypeptides or proteins displayed, displayed and expressed, or comprised by the genetic packages of the library may not fold properly, or their binding to an antigen may be disadvantaged, if diversity in both sequence and length is not represented in the library.

[0009] An additional disadvantage of prior art libraries of genetic packages that display, display and express, or comprise peptides, polypeptides and proteins is that they are not focused on those members that are based on naturally occurring diversity and thus on members that are most likely to be functional. Rather, the prior art libraries typically attempt to include as much diversity or variegation at every amino acid residue as possible. This makes library construction time-consuming and less efficient than possible. The large number of members that are produced by trying to capture complete diversity also makes screening more cumbersome than it needs to be. This is particularly true given that many members of the library will not be functional.

SUMMARY OF THE INVENTION

[0010] One object of this invention is focused libraries of vectors or genetic packages that encode members of a diverse family of peptides, polypeptides or proteins wherein the libraries encode populations that are diverse in both length and sequence. The length diversity comprises components that contain motifs that are likely to fold and function in the context of the parental peptide, polypeptide or protein.

[0011] Another object of this invention is focused libraries of genetic packages that display, display and express, or comprise a member of a diverse family of peptides, polypeptides and proteins and collectively display, display and express, or comprise at least a portion of the focused diversity of the family. These libraries are diverse not only in their amino acid sequences, but also in their lengths. And, their diversity is focused so as to more closely mimic or take into account the naturally-occurring diversity of the specific family that the library represents.

[0012] Another object of this invention is diverse, but focused, populations of DNA sequences encoding peptides, polypeptides or proteins suitable for display or display and expression using genetic packages (such as phage or phagemids) or other regimens that allow selection of specific binding components of a library.

[0013] A further object of this invention is focused libraries comprising the CDRs of human antibodies that are diverse in both their amino acid sequence and in their length (examples of such libraries include libraries of single chain Fv (scFv), Fv, Fab, whole antibodies or minibodies (i.e., dimers that consist of VH linked to VL)). Such regions may be from the heavy or light chains or both and may include one or more of the CDRs of those chains. More preferably, the diversity or variegation occurs in all of the heavy chain and light chain CDRs.

[0014] It is another object of this invention to provide methods of making and screening the above libraries and the peptides, polypeptides and proteins obtained in such screening.

[0015] Among the preferred embodiments of this invention are the following:

[0016] 1. A focused library of vectors or genetic packages that display, display and express, or comprise a member of a diverse family of human antibody related peptides, polypeptides and proteins and collectively display, display and express, or comprise at least a portion of the diversity of the antibody family, the vectors or genetic packages being characterized by variegated DNA sequences that encode a heavy chain CDR1 selected from the group consisting of:

[0017] (1) <1>Y<1>M<1> (positions 1 to 5; SEQ ID NO:100), wherein <1> is an equimolar mixture of each of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y;

[0018] (2) (S/T)(S/G/X)(S/G/X)YYW(S/G/X) (positions 1 to 7; SEQ ID NO:101), wherein (S/T) is a 1:1 mixture of S and T residues, and (S/G/X) is a mixture of 0.2025 S, 0.2025 G and 0.035 of each of amino acid residues A, D, E, F, H, I, K, L, M, N, P, Q, R, T, V, W, and Y;

[0019] (3) VSGGSIS<1><1><1>YYW<1> (positions 1 to 14; SEQ ID NO:1), wherein <1> is an equimolar mixture of each of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y; and

[0020] (4) mixtures of vectors or genetic packages characterized by any of the above DNA sequences, preferably in the ratio: HC CDR1s (1):(2):(3)::0.80:0.17:0.02.
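As a rough illustration of how much sequence diversity such templates encode, the following Python sketch counts the theoretical number of distinct amino acid sequences for the length-5 and length-14 HC CDR1 templates. The templates are written here without position subscripts. The helper names (`tokenize`, `theoretical_diversity`) are hypothetical and not part of the disclosure, and the count ignores residue weighting and any codon-level effects.

```python
import re
from math import prod

# Residues allowed by variegation code <1> in embodiment 1
# (19 residues; cysteine is excluded).
CODE_RESIDUES = {"1": "ADEFGHIKLMNPQRSTVWY"}

def tokenize(template):
    """Split a template such as '<1>Y<1>M<1>' into one token per position."""
    return re.findall(r"<\d+>|[A-Z]", template)

def theoretical_diversity(template, codes=CODE_RESIDUES):
    """Count the distinct amino acid sequences the template can encode."""
    return prod(
        len(codes[tok[1:-1]]) if tok.startswith("<") else 1
        for tok in tokenize(template)
    )

HC_CDR1_LEN5 = "<1>Y<1>M<1>"              # SEQ ID NO:100
HC_CDR1_LEN14 = "VSGGSIS<1><1><1>YYW<1>"  # SEQ ID NO:1

for name, tpl in [("length 5", HC_CDR1_LEN5), ("length 14", HC_CDR1_LEN14)]:
    print(name, len(tokenize(tpl)), theoretical_diversity(tpl))
```

Three variegated positions give 19^3 = 6,859 sequences for the length-5 template, and four give 19^4 = 130,321 for the length-14 template.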

[0021] 2. A focused library of vectors or genetic packages that display, display and express, or comprise a member of a diverse family of human antibody related peptides, polypeptides and proteins and collectively display, display and express, or comprise at least a portion of the diversity of the antibody family, the vectors or genetic packages being characterized by variegated DNA sequences that encode a heavy chain CDR2 selected from the group consisting of:

[0022] (1) <2>I<2><3>SGG<1>T<1>YADSVKG (SEQ ID NO:2), wherein <1> is an equimolar mixture of each of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y; <2> is an equimolar mixture of each of amino acid residues Y, R, W, V, G, and S; and <3> is an equimolar mixture of each of amino acid residues P, S, and G or an equimolar mixture of P and S;

[0023] (2) <1>I<4><1><1>G<5><1><1><1>YADSVKG (SEQ ID NO:3), wherein <1> is an equimolar mixture of each of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y; <4> is an equimolar mixture of residues D, I, N, S, W, and Y; and <5> is an equimolar mixture of residues S, G, D and N;

[0024] (3) <1>I<4><1><1>G<5><1><1>YNPSLKG (SEQ ID NO:4), wherein <1> is an equimolar mixture of each of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W and Y, and <4> and <5> are as defined above;

[0025] (4) <1>I<8>S<1><1><1>GGYY<1>YAASVKG (SEQ ID NO:5), wherein <1> is an equimolar mixture of each of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, and Y; <8> is 0.27 R and 0.027 of each of ADEFGHIKLMNPQSTVWY; and

[0026] (5) mixtures of vectors or genetic packages characterized by any of the above DNA sequences, preferably in the ratio: HC CDR2s: (1)/(2) (equimolar):(3):(4)::0.54:0.43:0.03.

[0027] 3. A focused library of vectors or genetic packages that display, display and express, or comprise a member of a diverse family of human antibody related peptides, polypeptides and proteins and collectively display, display and express, or comprise at least a portion of the diversity of the antibody family, the vectors or genetic packages being characterized by variegated DNA sequences that encode a heavy chain CDR3 selected from the group consisting of:

[0028] (1) YYCA21111YFDYWG (SEQ ID NO:6), wherein 1 is an equimolar mixture of each of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W and Y; and 2 is an equimolar mixture of K and R;

[0029] (2) YYCA2111111YFDYWG (SEQ ID NO:7), wherein 1 is an equimolar mixture of each of amino acid residues A, D, E, F, G, H, K, L, M, N, P, Q, R, S, T, V, W and Y; and 2 is an equimolar mixture of K and R;

(3) YYCA211111111YFDYWG (SEQ ID NO:8), wherein 1 is an equimolar mixture of each of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W and Y; and 2 is an equimolar mixture of K and R;

(4) YYCAR111S2S3111YFDYWG (SEQ ID NO:9), wherein 1 is an equimolar mixture of each of amino acid residues A, D, E, G, H, I, K, L, M, N, P, Q, R, S, T, V, W and Y; 2 is an equimolar mixture of S and G; and 3 is an equimolar mixture of Y and W;

(5) YYCA2111CSG11CY1YFDYWG (SEQ ID NO:10), wherein 1 is an equimolar mixture of each of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W and Y; and 2 is an equimolar mixture of K and R;

(6) YYCA211S1TIFG11111YFDYWG (SEQ ID NO:11), wherein 1 is an equimolar mixture of each of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W and Y; and 2 is an equimolar mixture of K and R;

[0030] (7) YYCAR111YY2S3344111YFDYWG (SEQ ID NO:12), wherein 1 is an equimolar mixture of each of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W and Y; 2 is an equimolar mixture of D and S; and 3 is an equimolar mixture of S and G;

[0031] (8) YYCAR1111YC2231CY111YFDYWG (SEQ ID NO:13), wherein 1 is an equimolar mixture of each of amino acid residues A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W and Y; 2 is an equimolar mixture of S and G; and 3 is an equimolar mixture of T, D and G; and

[0032] (9) mixtures of vectors or genetic packages characterized by any of the above DNA sequences, preferably the HC CDR3s (1) through (8) are in the following proportions in the mixture:
[0033] (1) 0.10
[0034] (2) 0.14
[0035] (3) 0.25
[0036] (4) 0.13
[0037] (5) 0.13
[0038] (6) 0.11
[0039] (7) 0.04 and
[0040] (8) 0.10;
and more preferably the HC CDR3s (1) through (8) are in the following proportions in the mixture:
[0041] (1) 0.02
[0042] (2) 0.14
[0043] (3) 0.25
[0044] (4) 0.14
[0045] (5) 0.14
[0046] (6) 0.12
[0047] (7) 0.08 and
[0048] (8) 0.11.
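The two sets of HC CDR3 proportions listed above can be checked and converted into expected member counts with a short Python sketch. The seventh value of the second set is taken as 0.08, which makes that set sum to 1.00. The library size of 1,000,000 and the function name `expected_counts` are assumptions made only for illustration.

```python
# Proportions of HC CDR3 components (1) through (8), as listed above.
PREFERRED = [0.10, 0.14, 0.25, 0.13, 0.13, 0.11, 0.04, 0.10]
MORE_PREFERRED = [0.02, 0.14, 0.25, 0.14, 0.14, 0.12, 0.08, 0.11]

def expected_counts(proportions, library_size):
    """Expected number of library members per component, after normalizing."""
    total = sum(proportions)
    return [round(library_size * p / total) for p in proportions]

LIBRARY_SIZE = 1_000_000  # hypothetical library size, not from the source

for label, props in [("preferred", PREFERRED), ("more preferred", MORE_PREFERRED)]:
    print(label, "sum =", round(sum(props), 2), expected_counts(props, LIBRARY_SIZE))
```

Normalizing by the total keeps the sketch usable for ratios that do not sum exactly to 1.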

[0049] Preferably, 1 in one or all of HC CDR3s (1) through (8) is 0.095 of each of G and Y and 0.048 of each of A, D, E, F, H, I, K, L, M, N, P, Q, R, S, T, V, and W.

[0050] 4. A focused library of vectors or genetic packages that display, display and express, or comprise a member of a diverse family of human antibody related peptides, polypeptides and proteins and collectively display, display and express, or comprise at least a portion of the diversity of the antibody family, the vectors or genetic packages being characterized by variegated DNA sequences that encode a kappa light chain CDR1 selected from the group consisting of:

(1) RASQ<1>V<2><2><3>LA (SEQ ID NO:14); and

(2) RASQ<1>V<2><2><2><3>LA (SEQ ID NO:15);

wherein <1> is an equimolar mixture of amino acid residues ADEFGHIKLMNPQRSTVWY; <2> is 0.2 S and 0.044 of each of ADEFGHIKLMNPQRTVWY; and <3> is 0.2 Y and 0.044 each of ADEFGHIKLMNPQRTVW and S; and

[0051] (3) mixtures of vectors or genetic packages characterized by any of the above DNA sequences, preferably in the ratio CDR1s (1):(2)::0.68:0.32.

[0052] 5. A focused library of vectors or genetic packages that display, display and express, or comprise a member of a diverse family of human antibody related peptides, polypeptides and proteins and collectively display, display and express, or comprise at least a portion of the diversity of the antibody family, the vectors or genetic packages being characterized by variegated DNA sequences that encode a kappa light chain CDR2 having the sequence:

<1>AS<2>R<4><1> (SEQ ID NO:102),

wherein <1> is an equimolar mixture of amino acid residues ADEFGHIKLMNPQRSTVWY; <2> is 0.2 S and 0.044 of each of ADEFGHIKLMNPQRTVWY; and <4> is 0.2 A and 0.044 each of DEFGHIKLMNPQRSTVWY.
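A per-position residue distribution of this kind can be sampled directly. The Python sketch below draws hypothetical kappa light chain CDR2 amino acid sequences from the template <1>AS<2>R<4><1>, using the stated fractions as relative weights (each doped mixture sums to 0.992, and `random.choices` treats the weights as relative). The names `doped`, `MIXTURES` and `sample_cdr2` and the fixed seed are illustrative assumptions; the sketch models amino acid frequencies only, not the DNA synthesis.

```python
import random
import re

ALL19 = "ADEFGHIKLMNPQRSTVWY"

def doped(favored, fraction, others, other_fraction):
    """Weights for one favored residue plus an even share for each other residue."""
    weights = {favored: fraction}
    weights.update({aa: other_fraction for aa in others})
    return weights

MIXTURES = {
    "1": {aa: 1.0 for aa in ALL19},                     # equimolar
    "2": doped("S", 0.2, "ADEFGHIKLMNPQRTVWY", 0.044),  # 0.2 S
    "4": doped("A", 0.2, "DEFGHIKLMNPQRSTVWY", 0.044),  # 0.2 A
}

TEMPLATE = "<1>AS<2>R<4><1>"  # SEQ ID NO:102

def sample_cdr2(rng):
    """Draw one amino acid sequence from the template."""
    out = []
    for tok in re.findall(r"<\d+>|[A-Z]", TEMPLATE):
        if tok.startswith("<"):
            mix = MIXTURES[tok[1:-1]]
            out.append(rng.choices(list(mix), weights=list(mix.values()))[0])
        else:
            out.append(tok)  # fixed residue
    return "".join(out)

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
for _ in range(5):
    print(sample_cdr2(rng))
```

Every sampled sequence has seven residues, with A, S and R fixed at positions 2, 3 and 5.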

[0053] 6. A focused library of vectors or genetic packages that display, display and express, or comprise a member of a diverse family of human antibody related peptides, polypeptides and proteins and collectively display, display and express, or comprise at least a portion of the diversity of the antibody family, the vectors or genetic packages being characterized by variegated DNA sequences that encode a kappa light chain CDR3 selected from the group consisting of:

(1) QQ<3><1><1><1>P<1>T (SEQ ID NO:16), wherein <1> is an equimolar mixture of amino acid residues ADEFGHIKLMNPQRSTVWY; and <3> is 0.2 Y and 0.044 each of ADEFGHIKLMNPQRTVW;

[0054] (2) QQ<3><3><1><1><1>P (SEQ ID NO:103), wherein <1> and <3> are as defined in (1) above;

[0055] (3) QQ<3><2><1><1>PP<1>T (SEQ ID NO:17), wherein <1> and <3> are as defined in (1) above and <2> is 0.2 S and 0.044 each of ADEFGHIKLMNPQRTVWY; and

[0056] (4) mixtures of vectors or genetic packages characterized by any of the above DNA sequences, preferably in the ratio CDR3s (1):(2):(3)::0.65:0.1:0.25.

[0057] 7. A focused library of vectors or genetic packages that display, display and express, or comprise a member of a diverse family of human antibody related peptides, polypeptides and proteins and collectively display, display and express, or comprise at least a portion of the diversity of the antibody family, the vectors or genetic packages being characterized by variegated DNA sequences that encode a lambda light chain CDR1 selected from the group consisting of:

[0058] (1) TG<1>SS<2>VG<1><3><2><3>VS (SEQ ID NO:18), wherein <1> is 0.27 T, 0.27 G and 0.027 each of ADEFHIKLMNPQRSVWY; <2> is 0.27 D, 0.27 N and 0.027 each of AEFGHIKLMPQRSTVWY; and <3> is 0.36 Y and 0.036 each of ADEFGHIKLMNPQRSTVW;

(2) G<2><4>L<4><4><4><3><4><4> (SEQ ID NO:104), wherein <2> is as defined in (1) above and <4> is an equimolar mixture of amino acid residues ADEFGHIKLMNPQRSTVWY; and

[0059] (3) mixtures of vectors or genetic packages characterized by any of the above DNA sequences, preferably in the ratio CDR1s (1):(2)::0.67:0.33.

[0060] 8. A focused library of vectors or genetic packages that display, display and express, or comprise a member of a diverse family of human antibody related peptides, polypeptides and proteins and collectively display, display and express, or comprise at least a portion of the diversity of the antibody family, the vectors or genetic packages being characterized by variegated DNA sequences that encode a lambda light chain CDR2 having the sequence:

<4><4><4><2>RPS (SEQ ID NO:105),

wherein <2> is 0.27 D, 0.27 N, and 0.027 each of AEFGHIKLMPQRSTVWY and <4> is an equimolar mixture of amino acid residues ADEFGHIKLMNPQRSTVW.

[0061] 9. A focused library of vectors or genetic packages that display, display and express, or comprise a member of a diverse family of human antibody related peptides, polypeptides and proteins and collectively display, display and express, or comprise at least a portion of the diversity of the antibody family, the vectors or genetic packages being characterized by variegated DNA sequences that encode a lambda light chain CDR3 selected from the group consisting of:

[0062] (1) <4><5><4><2><4>S<4><4><4><4>V (SEQ ID NO:106), wherein <2> is 0.27 D, 0.27 N, and 0.027 each of AEFGHIKLMPQRSTVWY; <4> is an equimolar mixture of amino acid residues ADEFGHIKLMNPQRSTVW; and <5> is 0.36 S and 0.0355 each of ADEFGHIKLMNPQRTVWY;

(2) <5>SY<1><5>S<5><1><4>V (SEQ ID NO:19), wherein <1> is an equimolar mixture of ADEFGHIKLMNPQRSTVWY; and <4> and <5> are as defined in (1) above; and

[0063] (3) mixtures of vectors or genetic packages characterized by any of the above DNA sequences, preferably in the ratio CDR3s

[0064] 10. A focused library comprising variegated DNA sequences that encode a heavy chain CDR selected from the group consisting of:

[0065] (1) one or more of the heavy chain CDR1s of 1 above;

[0066] (2) one or more of the heavy chain CDR2s of 2 above;

[0067] (3) one or more of the heavy chain CDR3s of 3 above; and

[0068] (4) mixtures of vectors or genetic packages characterized by (1), (2) and (3).

[0069] 11. The focused library comprising one or more of the variegated DNA sequences that encode a heavy chain CDR of 1, 2 and 3 and further comprising variegated DNA sequences that encode a light chain CDR selected from the group consisting of:

[0070] (1) one or more of the kappa light chain CDR1s of 4;

[0071] (2) the kappa light chain CDR2 of 5;

[0072] (3) one or more of the kappa light chain CDR3s of 6;

[0073] (4) one or more of the lambda light chain CDR1s of 7;

[0074] (5) the lambda light chain CDR2 of 8;

[0075] (6) one or more of the lambda light chain CDR3s of 9; and

[0076] (7) mixtures of vectors and genetic packages characterized by one or more of (1) through (6).

[0077] 12. A population of variegated DNA sequences as described in 1-11 above.

[0078] 13. A population of vectors comprising the variegated DNA sequences as described in 1-11 above.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0079] Antibodies ("Ab") concentrate their diversity into those regions that are involved in determining affinity and specificity of the Ab for particular targets. These regions may be diverse in sequence or in length. Generally, they are diverse in both ways. However, within families of human antibodies the diversities, both in sequence and in length, are not truly random. Rather, some amino acid residues are preferred at certain positions of the CDRs and some CDR lengths are preferred. These preferred diversities account for the natural diversity of the antibody family.

[0080] According to this invention, and as more fully described below, libraries of vectors and genetic packages that more closely mirror the natural diversity, both in sequence and in length, of antibody families, or portions thereof are prepared and used.

Human Antibody Heavy Chain Sequence and Length Diversity

[0081] (a) Framework

[0082] The heavy chain ("HC") Germ-Line Gene (GLG) 3-23 (also known as DP-47) accounts for about 12% of all human Abs and is preferred as the framework in the preferred embodiment of the invention. It should, however, be understood that other well-known frameworks, such as 4-34, 3-30, 3-30.3 and 4-30.1, may also be used without departing from the principles of the focused diversities of this invention.

[0083] In addition, JH4 (YFDYWGQGTLVTVSS; SEQ ID NO:20) occurs more often than JH3 in native antibodies. Hence, it is preferred for the focused libraries of this invention. However, JH3 (AFDIWGQGTMVTVSS; SEQ ID NO:21) could as well be used.

(b) Focused Length Diversity: CDR1, 2 and 3

[0084] (i) CDR1

[0085] For CDR1, GLGs provide CDR1s only of the lengths 5, 6, and 7. Mutations during the maturation of the V-domain gene, however, can lead to CDR1s having lengths as short as 2 and as long as 16. Nevertheless, length 5 predominates. Accordingly, in the preferred embodiment of this invention the preferred HC CDR1 is 5 amino acids, with less preferred CDR1s having lengths of 7 and 14. In the most preferred libraries of this invention, all three lengths are used in proportions similar to those found in natural antibodies.

[0086] (ii) CDR2

[0087] GLGs provide CDR2s only of the lengths 15-19, but mutations during maturation may result in CDR2s of lengths from 16 to 28 amino acids. The lengths 16 and 17 predominate in mature Ab genes. Accordingly, length 17 is the preferred length for HC CDR2 of the present invention. Less preferred HC CDR2s of this invention have lengths 16 and 19. In the most preferred focused libraries of this invention, all three lengths are included in proportions similar to those found in natural antibody families.

[0088] (iii) CDR3

[0089] HC CDR3s vary in length. About half of human HCs consist of the components: V::nz::D::ny::JHn where V is a V gene, nz is a series of bases (mean 12) that are essentially random, D is a D segment, often with heavy editing at both ends, ny is a series of bases (mean 6) that are essentially random, and JH is one of the six JH segments, often with heavy editing at the 5' end. The D segments appear to provide spacer segments that allow folding of the IgG. The greatest diversity is at the junctions of V with D and of D with JH.

[0090] In the preferred libraries of this invention both types of HC CDR3s are used. In HC CDR3s that have no identifiable D segment, the structure is V::nz::JHn where JH is usually edited at the 5' end. In HC CDR3s that have an identifiable D segment, the structure is V::nz::D::ny::JHn.

(c) Focused Sequence Diversity: CDR1, 2 and 3

[0091] (i) CDR1

[0092] In 5 amino acid length CDR1, examination of a 3D model of a humanized Ab showed that the side groups of residues 1, 3, and 5 were directed toward the combining pocket. Consequently, in the focused libraries of this invention, each of these positions may be selected from any of the native amino acid residues, except cysteine ("C"). Cysteine can form disulfide bonds, which are an important component of the canonical Ig fold. Having free thiol groups could, thus, interfere with proper folding of the HC and could lead to problems in production or manipulation of selected Abs. Thus, in the focused libraries of this invention cysteine is excluded from positions 1, 3 and 5 of the preferred 5 amino acid CDR1s. The other 19 natural amino acid residues may be used at positions 1, 3 and 5. Preferably, each is present in equimolar ratios in the variegated libraries of this invention.

[0093] 3D modeling also suggests that the side groups of residue 2 in a 5 amino acid CDR1 are directed away from the combining pocket. Although this position shows substantial diversity, both in GLG and mature genes, in the focused libraries of this invention this residue is preferably Tyr (Y) because it occurs in 681/820 mature antibody genes. However, any of the other native amino acid residues, except Cys (C), could also be used at this position.

[0094] For position 4, there is also some diversity in GLG and mature antibody genes. However, almost all mature genes have uncharged hydrophobic amino acid residues: A, G, L, P, F, M, W, I, V, at this position. Inspection of a 3D model also shows that the side group of residue 4 is packed into the innards of the HC. Thus, in the preferred embodiment of this invention which uses framework 3-23, residue 4 is preferably Met because it is likely to fit very well into the framework of 3-23. With other frameworks, a similar fit consideration is used to assign residue 4.

[0095] Thus, the most preferred HC CDR1 of this invention consists of the amino acid sequence <1>Y<1>M<1> where <1> can be any one of amino acid residues: A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y (not C), preferably present at each position in an equimolar amount. This diversity is shown in the context of a framework 3-23:JH4 in Table 1. It has a diversity of 6859-fold.
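
The 6859-fold figure follows from three variegated positions with 19 allowed residues each (19 x 19 x 19). A minimal sketch of that count is given below; the function name pattern_diversity and the dictionary layout are illustrative choices, not part of the specification.

```python
# Residues allowed at each variegated <1> position: all natural amino acids except Cys.
ALLOWED = "ADEFGHIKLMNPQRSTVWY"


def pattern_diversity(pattern: str, mixtures: dict[str, str]) -> int:
    """Count the distinct peptides encoded by a pattern such as '<1>Y<1>M<1>'."""
    total = 1
    i = 0
    while i < len(pattern):
        if pattern[i] == "<":
            close = pattern.index(">", i)
            # A variegated position multiplies the count by its number of residues.
            total *= len(set(mixtures[pattern[i + 1:close]]))
            i = close + 1
        else:
            i += 1  # a fixed residue contributes a factor of 1
    return total


cdr1_diversity = pattern_diversity("<1>Y<1>M<1>", {"1": ALLOWED})
print(cdr1_diversity)  # 6859
```

The count covers amino acid sequences only; it says nothing about how evenly the members are represented in a synthesized library.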

[0096] The two less preferred HC CDR1s of this invention have length 7 and length 14. For length 7, a preferred variegation is (S/T).sub.1(S/G/<1>).sub.2(S/G/<1>).sub.3Y.sub.4Y.sub.5W.sub.6(S/G/<1>).sub.7 (SEQ ID NO:107); where (S/T) indicates an equimolar mixture of Ser and Thr codons; (S/G/<1>) indicates a mixture of 0.2025 S, 0.2025 G, and 0.035 for each of A, D, E, F, H, I, K, L, M, N, P, Q, R, T, V, W, Y. This design gives a predominance of Ser and Gly at positions 2, 3, and 7, as occurs in mature HC genes. For length 14, a preferred variegation is VSGGSIS<1><1><1>YYW<1> (SEQ ID NO:108), where <1> is an equimolar mixture of the 19 native amino acid residues, except Cys (C).
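
As a quick consistency check on the (S/G/<1>) mixture, the stated fractions can be summed to confirm that they account for the whole position. This is only an arithmetic sketch; the variable names are illustrative.

```python
import math

# The 17 residues that each receive 0.035 in the (S/G/<1>) mixture.
OTHER_RESIDUES = "ADEFHIKLMNPQRTVWY"

weights = {"S": 0.2025, "G": 0.2025}
weights.update({aa: 0.035 for aa in OTHER_RESIDUES})

total_fraction = sum(weights.values())
ser_gly_fraction = weights["S"] + weights["G"]

print(len(weights))                # 19 residues in total, Cys excluded
print(round(total_fraction, 4))    # 1.0
print(round(ser_gly_fraction, 4))  # 0.405
```

Ser and Gly together take about 40% of each such position, which is the predominance the paragraph describes.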

[0097] The DNA that encodes these preferred HC CDR1s is preferably synthesized using trinucleotide building blocks so that each amino acid residue is present in essentially equimolar or other described amounts. The preferred codons for the <1> amino acid residues are gct, gat, gag, ttt, ggt, cat, att, aag, ctt, atg, aat, cct, cag, cgt, tct, act, gtt, tgg, and tat. Of course, other codons for the chosen amino acid residue could also be used.
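
The listed codons can be checked against the standard genetic code to confirm that they encode the 19 non-cysteine residues, one codon per residue. The sketch below builds the standard codon table from the usual t/c/a/g ordering; the names CODON_TABLE and PREFERRED_CODONS are illustrative.

```python
from itertools import product

# Standard genetic code, with bases ordered t, c, a, g at each codon position.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Preferred codons for the <1> residues, in the order given in the text.
PREFERRED_CODONS = (
    "gct gat gag ttt ggt cat att aag ctt atg "
    "aat cct cag cgt tct act gtt tgg tat"
).split()

encoded = "".join(CODON_TABLE[codon] for codon in PREFERRED_CODONS)
print(encoded)  # ADEFGHIKLMNPQRSTVWY
```

Each residue appears exactly once, and neither Cys nor a stop codon is present.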

[0098] The diversity oligonucleotide (ON) is preferably synthesized from BspEI to BstXI (as shown in Table 1) and can, therefore, be incorporated either by PCR synthesis using overlapping ONs or introduced by ligation of BspEI/BstXI-cut fragments. Table 2 shows the oligonucleotides that embody the specified variegations of the preferred length 5 HC CDR1s of this invention. PCR using ON-R1V1vg, ON-R1top, and ON-R1bot gives a dsDNA product of 73 base pairs; cleavage with BspEI and BstXI trims 11 and 13 bases from the ends and provides cohesive ends that can be ligated to similarly cut vector having the 3-23 domain shown in Table 1. Replacement of ON-R1V1vg with either ON-R1V2vg or ON-R1V3vg (see Table 2) allows synthesis of the two alternative diversity patterns--the 7 residue length and the 14 residue length HC CDR1.

[0099] The more preferred libraries of this invention comprise the 3 preferred HC CDR1 length diversities. Most preferably, the 3 lengths should be incorporated in approximately the ratios in which they are observed in antibodies selected without reference to the length of the CDRs. For example, one sample of 1095 HC genes has the three lengths present in the ratio: L=5:L=7:L=14::820:175:23::0.80:0.17:0.02. This is the preferred ratio in accordance with this invention.
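
The conversion from observed counts to library fractions is a simple normalization over the three lengths that are used. The sketch below assumes the fractions are taken relative to the 1018 genes having length 5, 7 or 14 (the three counts sum to 1018, not 1095); that reading is an assumption, and the variable names are illustrative.

```python
# Observed counts of HC CDR1 lengths 5, 7 and 14, as quoted in the text.
length_counts = {5: 820, 7: 175, 14: 23}

total_counted = sum(length_counts.values())
length_fractions = {
    length: count / total_counted for length, count in length_counts.items()
}

for length, fraction in length_fractions.items():
    print(f"L={length}: {fraction:.3f}")
```

The resulting values are close to the quoted 0.80:0.17:0.02.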

[0100] (ii) CDR2

[0101] Diversity in HC CDR2 was designed with the same considerations as for HC CDR1: GLG sequences, mature sequences and 3D structure. A preferred length for CDR2 is 17, as shown in Table 1. For this preferred 17 length CDR2, the preferred variegation in accordance with the invention is: <2>I<2><3>SGG<1>T<1>YADSVKG (SEQ ID NO:2), where <2> indicates any amino acid residue selected from the group of Y, R, W, V, G and S (equimolar mixture), <3> is P, S and G or P and S only (equimolar mixture), and <1> is any native amino acid residue except C (equimolar mixture).

[0102] ON-R2V1vg shown in Table 3 embodies this diversity pattern. It is preferably synthesized so that fragments of dsDNA containing the BstXI and XbaI site can be generated by PCR. PCR with ON-R2V1vg, ON-R2top, and ON-R2bot gives a dsDNA product of 122 base pairs. Cleavage with BstXI and XbaI removes about 10 bases from each end and produces cohesive ends that can be ligated to similarly cut vector that contains the 3-23 gene shown in Table 1.

[0103] In an alternative embodiment for a 17 length HC CDR2, the following variegation may be used: <1>I<4><1><1>G<5><1><1><1>YADSVKG (SEQ ID NO:3), where <1> is as described above for the more preferred alternative of HC CDR2; <4> indicates an equimolar mixture of DINSWY, and <5> indicates an equimolar mixture of SGDN. This diversity pattern is embodied in ON-R2V2vg shown in Table 3. Preferably, the two embodiments are used in equimolar mixtures in the libraries of this invention.
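
The sequence diversity of this length-17 pattern can be computed by multiplying the sizes of the mixtures at its five variegated positions. The sketch below does this for both stated choices of <3> and combines the P/S result with the 19 x 19 x 19 diversity of the length-5 CDR1; the names are illustrative.

```python
from math import prod

MIXTURES = {
    "1": "ADEFGHIKLMNPQRSTVWY",  # any residue except Cys
    "2": "YRWVGS",
    "3_PS": "PS",                # <3> as P and S only
    "3_PSG": "PSG",              # <3> as P, S and G
}

# Variegated positions of <2>I<2><3>SGG<1>T<1>YADSVKG, in order.
positions_ps = ["2", "2", "3_PS", "1", "1"]
positions_psg = ["2", "2", "3_PSG", "1", "1"]

cdr2_diversity_ps = prod(len(MIXTURES[p]) for p in positions_ps)
cdr2_diversity_psg = prod(len(MIXTURES[p]) for p in positions_psg)

cdr1_diversity = 19 ** 3  # <1>Y<1>M<1>
combined_cdr1_cdr2 = cdr1_diversity * cdr2_diversity_ps

print(cdr2_diversity_ps)            # 25992
print(cdr2_diversity_psg)           # 38988
print(f"{combined_cdr1_cdr2:.3g}")  # 1.78e+08
```

The P/S value matches the 25992-fold CDR2 diversity and the 1.78 x 10^8 CDR1/2 diversity stated in Table 1.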

[0104] Other preferred HC CDR2s have lengths 16 and 19.

TABLE-US-00011
Length 16: <1>I<4><1><1>G<5><1><1>YNPSLKG (SEQ ID NO: 4);
Length 19: <1>I<8>S<1><1><1>GGYY<1>YAASVKG (SEQ ID NO: 5),

wherein <1> is an equimolar mixture of all native amino acid residues except C; <4> is an equimolar mixture of DINSWY; <5> is an equimolar mixture of SGDN; and <8> is 0.27 R and 0.027 of each of residues ADEFGHIKLMNPQSTVWY. Table 3 shows ON-R2V3vg which embodies a preferred CDR2 variegation of length 16 and ON-R2V4vg which embodies a preferred CDR2 variegation of length 19. To prepare these variegations ON-R2V3vg may be PCR amplified with ON-R2top and ON-R2bo3 and ON-R2V4vg may be PCR amplified with ON-R2top and ON-R2bo4. See Table 3. In the most preferred embodiment of this invention, all three HC CDR2 lengths are used. Preferably, they are present in a ratio 17:16:19::579:464:31::0.54:0.43:0.03.
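
Because each <n> symbol stands for a single residue position, the length of a variegation pattern can be checked by tokenizing it. The following sketch applies this to the three HC CDR2 patterns of lengths 17, 16 and 19; the helper name pattern_length is illustrative.

```python
import re

# One token per residue position: either a mixture symbol like <4> or a fixed residue.
TOKEN = re.compile(r"<\d+>|[A-Z]")


def pattern_length(pattern: str) -> int:
    """Return the number of residue positions in a variegation pattern."""
    tokens = TOKEN.findall(pattern)
    if "".join(tokens) != pattern:
        raise ValueError(f"unrecognized characters in pattern: {pattern!r}")
    return len(tokens)


cdr2_patterns = {
    17: "<2>I<2><3>SGG<1>T<1>YADSVKG",
    16: "<1>I<4><1><1>G<5><1><1>YNPSLKG",
    19: "<1>I<8>S<1><1><1>GGYY<1>YAASVKG",
}

for expected, pattern in cdr2_patterns.items():
    print(expected, pattern_length(pattern))
```

The ValueError guard rejects a pattern with a malformed symbol (for example a missing ">") instead of silently miscounting it.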

[0105] (iii) CDR3

[0106] The preferred libraries of this invention comprise several HC CDR3 components. Some of these will have only sequence diversity. Others will have sequence diversity with embedded D segments to extend the length, while also incorporating sequences known to allow Igs to fold. The HC CDR3 components of the preferred libraries of this invention and their diversities are depicted in Table 4: Components 1-8.

[0107] This set of components was chosen after studying the sequences of 1383 human HC sequences. The proposed components are meant to fulfill the following goals:

[0108] 1) approximately the same distribution of lengths as seen in native Ab genes;

[0109] 2) high level of sequence diversity at places having high diversity in native Ab genes; and

[0110] 3) incorporation of constant sequences often seen in native Ab genes.

[0111] Component 1 represents all the genes having lengths 0 to 8 (counting from the YYCAR motif at the end of FR3 to the WG dipeptide motif near the start of the J region, i.e., FR4). Component 2 corresponds to all the genes having lengths 9 or 10. Component 3 corresponds to the genes having lengths 11 or 12 plus half the genes having length 13. Component 4 corresponds to those having length 14 plus half those having length 13. Component 5 corresponds to the genes having length 15 and half of those having length 16. Component 6 corresponds to genes of length 17 plus half of those with length 16. Component 7 corresponds to those with length 18. Component 8 corresponds to those having length 19 and greater. See Table 4.
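
The assignment of native CDR3 lengths to the eight components, including the half-and-half split of lengths 13 and 16, can be written as a small lookup. The sketch below is one way to express it; the function names and the example histogram are hypothetical and are not data from the specification.

```python
def components_for_length(length: int) -> dict[int, float]:
    """Return {component: share} for one HC CDR3 length, following the rules above."""
    if length < 0:
        raise ValueError("CDR3 length cannot be negative")
    if length <= 8:
        return {1: 1.0}
    if length <= 10:
        return {2: 1.0}
    if length <= 12:
        return {3: 1.0}
    if length == 13:
        return {3: 0.5, 4: 0.5}  # length 13 is split between components 3 and 4
    if length == 14:
        return {4: 1.0}
    if length == 15:
        return {5: 1.0}
    if length == 16:
        return {5: 0.5, 6: 0.5}  # length 16 is split between components 5 and 6
    if length == 17:
        return {6: 1.0}
    if length == 18:
        return {7: 1.0}
    return {8: 1.0}  # length 19 and greater


def component_totals(length_counts: dict[int, int]) -> dict[int, float]:
    """Distribute a histogram of CDR3 lengths over the eight components."""
    totals = {component: 0.0 for component in range(1, 9)}
    for length, count in length_counts.items():
        for component, share in components_for_length(length).items():
            totals[component] += share * count
    return totals


# Hypothetical histogram, for illustration only.
example_counts = {8: 10, 13: 4, 16: 6, 22: 3}
example_totals = component_totals(example_counts)
print(example_totals)
```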

[0112] For each HC CDR3 residue having the diversity <1>, equimolar ratios are preferably not used. Rather, the following ratios are used 0.095 [G and Y] and 0.048 [A, D, E, F, H, I, K, L, M, N, P, Q, R, S, T, V, and W]. Thus, there is a double dose of G and Y with the other residues being in equimolar ratios. For the other diversities, e.g., KR or SG, the residues are present in equimolar mixtures.
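
The quoted fractions correspond to giving G and Y a weight of 2 and each of the other 17 residues a weight of 1, then normalizing over the total weight of 21. The sketch below reproduces the rounded values and draws a few residues from the distribution; the names and the fixed random seed are illustrative.

```python
import random

DOUBLE_DOSE = "GY"
SINGLE_DOSE = "ADEFHIKLMNPQRSTVW"

raw_weights = {aa: 2 for aa in DOUBLE_DOSE}
raw_weights.update({aa: 1 for aa in SINGLE_DOSE})

weight_sum = sum(raw_weights.values())  # 21
cdr3_weights = {aa: w / weight_sum for aa, w in raw_weights.items()}

print(round(cdr3_weights["G"], 3))  # 0.095
print(round(cdr3_weights["A"], 3))  # 0.048

# Draw ten residues from the biased mixture (seed fixed only for repeatability).
rng = random.Random(0)
sample = rng.choices(list(cdr3_weights), weights=list(cdr3_weights.values()), k=10)
print("".join(sample))
```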

[0113] In the preferred libraries of this invention the eight components are present in the following fractions: 1 (0.10), 2 (0.14), 3 (0.25), 4 (0.13), 5 (0.13), 6 (0.11), 7 (0.04) and 8 (0.10). See Table 4.

[0114] In the more preferred embodiment of this invention, the amounts of the eight components are adjusted because the first component is not complex enough to justify including it as 10% of the library. For example, if the final library were to have 1.times.10.sup.9 members, then 1.times.10.sup.8 sequences would come from component 1, but it has only 2.6.times.10.sup.5 CDR3 sequences so that each one would occur in .about.385 CDR1/2 contexts. Therefore, the more preferred amounts of the eight components are 1(0.02), 2(0.14), 3(0.25), 4(0.14), 5(0.14), 6(0.12), 7(0.08), 8(0.11). In accordance with the more preferred embodiment component 1 occurs in .about.77 CDR1/2 contexts and the other, longer CDR3s occur more often.

[0115] Table 5 shows vgDNA that embodies each of the eight HC CDR3 components shown in Table 4. In Table 5, the oligonucleotides (ON) Ctop25, CtprmA, CBprmB, and CBot25 allow PCR amplification of each of the variegated ONs (vgDNA): C1t08, C2t10, C3t12, C4t14, C5t15, C6t17, C7t18, and C8t19. After amplification, the dsDNA can be cleaved with AflII and BstEII (or KpnI) and ligated to similarly cleaved vector that contains the remainder of the 3-23 domain. Preferably, this vector already contains diversity in one, or both, of CDR1 and CDR2 as disclosed herein. Most preferably, it contains diversity in both the CDR1 and CDR2 regions. It is, of course, to be understood that the various diversities can be incorporated into the vector in any order.

[0116] Preferably, the recipient vector originally contains a stuffer in place of CDR1, CDR2 and CDR3 so that there will be no parental sequence that would then occur in the resulting library. Table 6 shows a version of the V3-23 gene segment with each CDR replaced by a short segment that contains both stop codons and restriction sites that will allow specific cleavage of any vector that does not have the stuffer removed. The stuffer can be short and contain a restriction enzyme site that will not occur in the finished library, allowing removal of vectors that are not cleaved by both AflII and BstEII (or KpnI) and religated. Alternatively, the stuffer could be 200-400 bases long so that uncleaved or once-cleaved vector can be readily separated from doubly cleaved vector.
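
The arithmetic behind the adjustment can be reproduced directly: the number of CDR1/2 contexts per CDR3 sequence is the number of library members allotted to the component divided by the component's complexity. The sketch below uses the rounded complexity quoted for component 1 and the adjusted fractions listed in Table 4; the names are illustrative.

```python
import math

LIBRARY_SIZE = 1e9              # members in the example library
COMPONENT1_COMPLEXITY = 2.6e5   # CDR3 sequences in component 1, as quoted


def contexts_per_sequence(fraction: float) -> float:
    """Average number of CDR1/2 contexts in which each component 1 CDR3 occurs."""
    return LIBRARY_SIZE * fraction / COMPONENT1_COMPLEXITY


print(round(contexts_per_sequence(0.10)))  # 385
print(round(contexts_per_sequence(0.02)))  # 77

# Exact complexity of YYCA21111YFDYWG: one K/R position and four 19-residue positions.
exact_component1_complexity = 2 * 19 ** 4
print(exact_component1_complexity)  # 260642

# Adjusted fractions of the eight components as listed in Table 4.
adjusted_fractions = [0.02, 0.14, 0.25, 0.14, 0.14, 0.12, 0.08, 0.11]
print(math.isclose(sum(adjusted_fractions), 1.0))  # True
```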

Human Antibody Light Chain: Sequence and Length Diversity

[0117] (i) Kappa Chain

[0118] (a) Framework

[0119] In the preferred embodiment of this invention, the kappa light chain is built in an A27 framework with a JK1 region. These are the most common V and J regions in the native genes. Other frameworks, such as O12, L2, and A11, and other J regions, such as JK4, however, may be used without departing from the scope of this invention.

[0120] (b) CDR1

[0121] In native human kappa chains, CDR1s with lengths of 11, 12, 13, 16, and 17 were observed with length 11 being predominant and length 12 being well represented. Thus, in the preferred embodiments of this invention LC CDR1s of length 11 and 12 are used (in a mixture similar to that observed in native antibodies), length 11 being most preferred. Length 11 has the following sequence: RASQ<1>V<2><2><3>LA (SEQ ID NO:14) and Length 12 has the following sequence: RASQ<1>V<2><2><2><3>LA (SEQ ID NO:15), wherein <1> is an equimolar mixture of all of the native amino acid residues, except C, <2> is 0.2 S and 0.044 of each of ADEFGHIKLMNPQRTVWY, and <3> is 0.2 Y and 0.044 each of A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V and W. In the most preferred embodiment of this invention, both CDR1 lengths are used. Preferably, they are present in a ratio of 11:12::154:73::0.68:0.32.

[0122] (c) CDR2

[0123] In native kappa, CDR2 exhibits only length 7. This length is used in the preferred embodiments of this invention. It has the sequence <1>AS<2>R<4><1>, wherein <1> is an equimolar mixture of amino acid residues ADEFGHIKLMNPQRSTVWY; <2> is 0.2 S and 0.044 of each of ADEFGHIKLMNPQRTVWY; and <4> is 0.2 A and 0.044 of each of DEFGHIKLMNPQRSTVWY.

[0124] (d) CDR3

[0125] In native kappa, CDR3 exhibits lengths of 4, 6, 7, 8, 9, 10, 11, 12, 13, and 19. While any of these lengths and mixtures of them can be employed in this invention, we prefer lengths 8, 9 and 10, length 9 being more preferred. For the preferred Length 9, the sequence is QQ<3><1><1><1>P<1>T, wherein <1> is an equimolar mixture of amino acid residues ADEFGHIKLMNPQRSTVWY and <3> is 0.2 Y and 0.044 each of ADEFGHIKLMNPQRSTVW. Length 8 is preferably QQ33111P and Length 10 is preferably QQ3211PP1T, wherein 1 and 3 are as defined for Length 9 and 2 is S (0.2) and 0.044 each of ADEFGHIKLMNPQRTVWY. A mixture of all 3 lengths is most preferred (ratios as in native antibodies), i.e., 8:9:10::28:166:63::0.1:0.65:0.25.

[0126] Table 7 shows a kappa chain gene of this invention, including a PlacZ promoter, a ribosome-binding site, and signal sequence (M13 III signal). The DNA sequence encodes the GLG amino acid sequence but does not comprise the GLG DNA sequence. Restriction sites are designed to fall within each framework region so that diversity can be cloned into the CDRs. XmaI and EspI are in FR1, SexAI is in FR2, RsrII is in FR3, and KpnI (or Acc65I) is in FR4. Additional sites are provided in the constant kappa chain to facilitate construction of the gene.

[0127] Table 7 also shows a suitable scheme of variegation for kappa. In CDR1, the most preferred length 11 is depicted. However, most preferably both lengths 11 and 12 are used. Length 12 in CDR1 can be constructed by introducing codon 51 as <2> (i.e. a Ser-biased mixture). CDR2 of kappa is always 7 codons. Table 7 shows a preferred variegation scheme for CDR2. Table 7 shows a variegation scheme for the most preferred CDR3 (length 9). Similar variegations can be used for CDRs of length 8 and 10. In the preferred embodiment of this invention, those three lengths (8, 9 and 10) are included in the libraries of this invention in the native ratios, as described above.

[0128] Table 9 shows a series of diversity oligonucleotides and primers that may be used to construct the kappa chain diversities depicted in Table 7.
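
The length 8 and length 10 patterns are written in a compact shorthand in which a bare digit stands for the corresponding bracketed mixture. A minimal sketch of expanding that shorthand into the bracketed notation used elsewhere in the text follows; the helper name expand_shorthand is illustrative.

```python
def expand_shorthand(pattern: str) -> str:
    """Rewrite digits as bracketed mixture symbols, e.g. 'QQ3' -> 'QQ<3>'."""
    return "".join(f"<{ch}>" if ch.isdigit() else ch for ch in pattern)


kappa_cdr3_shorthand = {8: "QQ33111P", 10: "QQ3211PP1T"}

for length, shorthand in kappa_cdr3_shorthand.items():
    # In the shorthand, every character is exactly one residue position.
    print(length, len(shorthand), expand_shorthand(shorthand))
```

This works only because every mixture symbol in these patterns is a single digit.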

[0129] (ii) Lambda Chain

[0130] (a) Framework

[0131] The lambda chain is preferably built in a 2a2 framework with an L2J region. These are the most common V and J regions in the native genes. Other frameworks, such as 31, 4b, 1a and 6a, and other J regions, such as L1J, L3J and L7J, however, may be used without departing from the scope of this invention.

[0132] (b) CDR1

[0133] In native human lambda chains, CDR1s with length 14 predominate; lengths 11, 12 and 13 also occur. While any of these can be used in this invention, lengths 11 and 14 are preferred. For length 11 the sequence is: TG<2><4>L<4><4><4><3><4><4> (SEQ ID NO:22) and for Length 14 the sequence is: TG<1>SS<2>VG<1><3><2><3>VS (SEQ ID NO:18), wherein <1> is 0.27 T, 0.21 G and 0.027 each of ADEFHIKLMNPQRSVWY; <2> is 0.27 D, 0.27 N and 0.027 each of AEFGHIKLMPQRSTVWY; <3> is 0.36 Y and 0.0355 each of ADEFGHIKLMNPQRSTVW; and <4> is an equimolar mixture of amino acid residues ADEFGHIKLMNPQRSTVWY. Most preferably, mixtures of the two preferred lengths are used (similar to those occurring in native antibodies); preferably, the ratio is 11:14::23:46::0.33:0.67.

[0134] (c) CDR2

[0135] In native human lambda chains, CDR2s with length 7 are by far the most common. This length is preferred in this invention. The sequence of this Length 7 CDR2 is <4><4><4><2>RPS, wherein <2> is 0.27 D, 0.27 N, and 0.027 each of AEFGHIKLMPQRTVWY and <4> is an equimolar mixture of amino acid residues ADEFGHIKLMNPQRSTVW.

[0136] (d) CDR3

[0137] In native human lambda chains, CDR3s of length 10 and 11 predominate, while length 9 is also common. Any of these three lengths can be used in the invention. Length 11 is preferred and mixtures of 10 and 11 more preferred. The sequence of Length 11 is <4><5><4><2><4>S<4><4><4><4>V, where <2> and <4> are as defined for the lambda CDR1 and <5> is 0.36 S and 0.0355 each of ADEFGHIKLMNPQRTVWY. The sequence of Length 10 is <5>SY<1><5>S<5><1><4>V (SEQ ID NO:19), wherein <1> is an equimolar mixture of ADEFGHIKLMNPQRSTVWY; and <4> and <5> are as defined for Length 11. The preferred mixtures of this invention comprise an equimolar mixture of Length 10 and Length 11. Table 8 shows a preferred focused lambda light chain diversity in accordance with this invention.

[0138] Table 9 shows a series of diversity oligonucleotides and primers that may be used to construct the lambda chain diversities depicted in Table 7.

Method of Construction of the Genetic Package

[0139] The diversities of heavy chain and the kappa and lambda light chains are best constructed in separate vectors. First, a synthetic gene is designed to embody each of the synthetic variable domains. The light chains are bounded by restriction sites for ApaLI (positioned at the very end of the signal sequence) and AscI (positioned after the stop codon). The heavy chain is bounded by SfiI (positioned within the PelB signal sequence) and NotI (positioned in the linker between CH1 and the anchor protein). Signal sequences other than PelB may also be used, e.g., a M13 pIII signal sequence.

[0140] The initial genes are made with "stuffer" sequences in place of the desired CDRs. A "stuffer" is a sequence that is to be cut away and replaced by diverse DNA but which does not allow expression of a functional antibody gene. For example, the stuffer may contain several stop codons and restriction sites that will not occur in the correct finished library vector. For example, in Table 10, the stuffer for CDR1 of kappa A27 contains a StuI site. The vgDNA for CDR1 is introduced as a cassette from EspI, XmaI, or AflII to either SexAI or KasI. After the ligation, the DNA is cleaved with StuI; there should be no StuI sites in the desired vectors.

[0141] The sequence of the heavy chain gene with stuffers is depicted in Table 6. The sequence of the kappa light chain gene with stuffers is depicted in Table 10. The sequence of the lambda light chain gene with stuffers is depicted in Table 11.

[0142] In another embodiment of the present invention the diversities of heavy chain and the kappa or lambda light chains are constructed in a single vector or genetic package (e.g., for display or display and expression) having appropriate restriction sites that allow cloning of these chains. The processes to construct such vectors are well known and widely used in the art. Preferably, a heavy chain and kappa light chain library and a heavy chain and lambda light chain library would be prepared separately. The two libraries, most preferably, will then be mixed in equimolar amounts to attain maximum diversity.

[0143] Most preferably, the display is had on the surface of a derivative of M13 phage. The most preferred vector contains all the genes of M13, an antibiotic resistance gene, and the display cassette. The preferred vector is provided with restriction sites that allow introduction and excision of members of the diverse family of genes, as cassettes. The preferred vector is stable against rearrangement under the growth conditions used to amplify phage.
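
The two properties that make a stuffer useful, in-frame stop codons and a diagnostic restriction site, can be checked with simple string searches. The sketch below uses the heavy chain CDR3 stuffer bases from Table 6 (tag taa agg cct taa) as the example, since the kappa CDR1 stuffer of Table 10 is not reproduced here, and relies on the StuI recognition sequence AGGCCT. The helper names are illustrative, and searching a single strand suffices only because the site is palindromic.

```python
STOP_CODONS = {"taa", "tag", "tga"}
STUI_SITE = "aggcct"  # StuI recognition sequence (palindromic)


def clean(seq: str) -> str:
    """Lower-case a sequence and drop spaces, slashes and other separators."""
    return "".join(ch for ch in seq.lower() if ch in "acgt")


def has_site(seq: str, site: str = STUI_SITE) -> bool:
    """Report whether the recognition site occurs anywhere in the sequence."""
    return site in clean(seq)


def in_frame_stops(seq: str) -> list[int]:
    """Return the zero-based codon indices of in-frame stop codons."""
    s = clean(seq)
    return [i // 3 for i in range(0, len(s) - 2, 3) if s[i:i + 3] in STOP_CODONS]


cdr3_stuffer = "tag taa agg cct taa"                   # stuffer bases from Table 6
fixed_3prime = "tac/ttc/gat/tac/tgg/ggc/caa/gg"        # constant 3' end of the CDR3 vgDNA

print(has_site(cdr3_stuffer), in_frame_stops(cdr3_stuffer))  # True [0, 1, 4]
print(has_site(fixed_3prime), in_frame_stops(fixed_3prime))  # False []
```

A vector that still carries the stuffer is therefore both cleavable by StuI and unable to express a full-length chain, while the constant part of the replacement DNA has neither feature.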

[0144] In another embodiment of this invention, the diversity captured by the methods of the present invention may be displayed and/or expressed in a phagemid vector (e.g., pCES1) that displays and/or expresses the peptide, polypeptide or protein. Such vectors may also be used to store the diversity for subsequent display and/or expression using other vectors or phage.

[0145] In another embodiment of this invention, the diversity captured by the methods of the present invention may be displayed and/or expressed in a yeast vector.

TABLE-US-00012 TABLE 1 3-23:JH4 CDR1/2 diversity = 1.78 .times. 10.sup.8
FR1(DP47/V3-23)---------------
          20  21  22  23  24  25  26  27  28  29  30  (SEQ ID NO: 99)
           A   M   A   E   V   Q   L   L   E   S   G
ctgtctgaac cc atg gcc gaa/gtt/caa/ttg/tta/gag/tct/ggt/
Scab ..... NcoI....           MfeI
----------FR1---------------------------------
 31  32  33  34  35  36  37  38  39  40  41  42  43  44  45
  G   G   L   V   Q   P   G   G   S   L   R   L   S   C   A
/ggc/ggt/ctt/gtt/cag/cct/ggt/ggt/tct/tta/cgt/ctt/tct/tgc/gct/
Sites of variegation         <1>     <1>     <1>   6859-fold diversity
----FR1-------------------->/..CDR1............./---FR2-----
 46  47  48  49  50  51  52  53  54  55  56  57  58  59  60
  A   S   G   F   T   F   S   -   Y   -   M   -   W   V   R
/gct/tcc/gga/ttc/act/ttc/tct/ - /tac/ - /atg/ - /tgg/gtt/cgc/
     BspEI               BsiWI                        BstXI.
Sites of variegation->                       <2>     <2> <3>
-----FR2----------------------------------->/..CDR2
 61  62  63  64  65  66  67  68  69  70  71  72  73  74  75
  Q   A   P   G   K   G   L   E   W   V   S   -   I   -   -
/caa/gct/cct/ggt/aaa/ggt/ttg/gag/tgg/gtt/tct/ - /atc/ - / - /
...BstXI
             <1>     <1>   25992-fold diversity in CDR2
...CDR2............................................./---FR3-----
 76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
  S   G   G   -   T   -   Y   A   D   S   V   K   G   R   F
/tct/ggt/ggc/ - /act/ - /tat/gct/gac/tcc/gtt/aaa/ggt/cgc/ttc/
----FR3-------------------------------------------------
 91  92  93  94  95  96  97  98  99  100 101 102 103 104 105
  T   I   S   R   D   N   S   K   N   T   L   Y   L   Q   M
/act/atc/tct/aga/gac/aac/tct/aag/aat/act/ctc/tac/ttg/cag/atg/
         XbaI
---FR3------------------------------------------------------
 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
  N   S   L   R   A   E   D   T   A   V   Y   Y   C   A   K
/aac/agc/tta/agg/gct/gag/gac/acc/gct/gtc/tac/tac/tgc/gcc/aaa/
       AflII
.....CDR3................../ Replaced by the various components!
 121 122 123 124 125 126 127
  D   Y   E   G   T   G   Y   (SEQ ID NO: 24)
/gac/tat/gaa/ggt/act/ggt/tat/ (SEQ ID NO: 23)
/---------- FR4 ---(JH4)--------------------------------------------------
  Y   F   D   Y   W   G   Q   G   T   L   V   T   V   S   S   (SEQ ID NO: 26)
/tat/ttc/gat/tat/tgg/ggt/caa/ggt/acc/ctg/gtc/acc/gtc/tct/agt/. (SEQ ID NO: 25)
                             KpnI      BstEII
<1> = Codons for ADEFGHIKLMNPQRSTVWY (equimolar mixture)
<2> = Codons for YRWVGS (equimolar mixture)
<3> = Codons for PS or PS and G (equimolar mixture)

TABLE-US-00013 TABLE 2 Oligonucleotides used to variegate CDR1 of human HC
CDR1 - 5 residues
(ON-R1V1vg): 5'-ct/tcc/gga/ttc/act/ttc/tct/<1>/tac/<1>/atg/<1>/tgg/gtt/cgc/caa/gct/cct/gg-3' (SEQ ID NO: 27)
<1> = Codons of ADEFGHIKLMNPQRSTVWY 1:1
(ON-R1top): 5'-cctactgtct/tcc/gga/ttc/act/ttc/tct-3' (SEQ ID NO: 28)
(ON-R1bot)[RC]: 5'-tgg/gtt/cgc/caa/gct/cct/ggttgctcactc-3' (SEQ ID NO: 29)
CDR1 - 7 residues
(ON-R1V2vg): 5'-ct/tcc/gga/ttc/act/ttc/tct/<6>/<7>/<7>/tac/tac/tgg/<7>/tgg/gtt/cgc/caa/gct/cct/gg-3' (SEQ ID NO: 30)
<6> = Codons for ST, 1:1
<7> = 0.2025 (Codons for SG) + 0.035 (Codons for ADEFHIKLMNPQRTVWY)
CDR1 - 14 residues
(ON-R1V3vg): 5'-ct/tcc/gga/ttc/act/ttc/tct/atc/agc/ggt/ggt/tct/atc/tcc/<1>/<1>/<1>/tac/tac/tgg/<1>/tgg/gtt/cgc/caa/gct/cct/gg-3' (SEQ ID NO: 31)
<1> = Codons for ADEFGHIKLMNPQRSTVWY 1:1

TABLE-US-00014 TABLE 3 Oligonucleotides used to variegate CDR2 of human HC
CDR2 - 17 residues
(ON-R2V1vg): 5'-ggt/ttg/gag/tgg/gtt/tct/<2>/atc/<2>/<3>/tct/ggt/ggc/<1>/act/<1>/tat/gct/gac/tcc/gtt/aaa/gg-3' (SEQ ID NO: 32)
(ON-R2top): 5'-ct/tgg/gtt/cgc/caa/gct/cct/ggt/aaa/ggt/ttg/gag/tgg/gtt/tct-3' (SEQ ID NO: 33)
(ON-R2bot)[RC]: 5'-tat/gct/gac/tcc/gtt/aaa/ggt/cgc/ttc/act/atc/tct/aga/ttcctgtcac-3' (SEQ ID NO: 34)
<1> = Codons for A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W and Y (equimolar mixture)
<2> = Codons for Y, R, W, V, G and S (equimolar mixture)
<3> = Codons for P and S (equimolar mixture) or P, S and G (equimolar mixture)
(ON-R2V2vg): 5'-ggt/ttg/gag/tgg/gtt/tct/<1>/atc/<4>/<1>/<1>/ggt/<5>/<1>/<1>/<1>/tat/gct/gac/tcc/gtt/aaa/gg-3' (SEQ ID NO: 35)
<4> = Codons for DINSWY (equimolar mixture)
<5> = Codons for SGDN (equimolar mixture)
CDR2 - 16 residues
(ON-R2V3vg): 5'-ggt/ttg/gag/tgg/gtt/tct/<1>/att/<4>/<1>/<1>/ggt/<5>/<1>/<1>/tat/aac/cct/tcc/ctt/aag/gg-3' (SEQ ID NO: 36)
(ON-R2bo3)[RC]: 5'-tat/aac/cct/tcc/ctt/aag/ggt/cgc/ttc/act/atc/tct/aga/ttcctgtcac-3' (SEQ ID NO: 37)
CDR2 - 19 residues
(ON-R2V4vg): 5'-ggt/ttg/gag/tgg/gtt/tct/<1>/atc/<8>/agt/<1>/<1>/<1>/ggt/ggt/act/act/<1>/tat/gcc/gct/tcc/gtt/aag/gg-3' (SEQ ID NO: 38)
(ON-R2bo4)[RC]: 5'-tat/gcc/gct/tcc/gtt/aag/ggt/cgc/ttc/act/atc/tct/aga/ttcctgtcac-3' (SEQ ID NO: 39)
<1>, <2>, <3>, <4> and <5> are as defined above
<8> is 0.27 R and 0.027 each of ADEFGHIKLMNPQSTVWY

TABLE-US-00015 TABLE 4 Preferred Components
Component	Sequence	Length	Complexity	Fraction of Library	Preferred Adjusted Fraction
1	YYCA21111YFDYWG (SEQ ID NO: 6) (1 = any amino acid residue, except C; 2 = K and R)	8	2.6 .times. 10.sup.5	.10	.02
2	YYCA2111111YFDYWG (SEQ ID NO: 7) (1 = any amino acid residue, except C; 2 = K and R)	10	9.4 .times. 10.sup.7	.14	.14
3	YYCA211111111YFDYWG (SEQ ID NO: 8) (1 = any amino acid residue, except C; 2 = K and R)	12	3.4 .times. 10.sup.10	.25	.25
4	YYCAR111S2S3111YFDYWG (SEQ ID NO: 9) (1 = any amino acid residue, except C; 2 = S and G; 3 = Y and W)	14	1.9 .times. 10.sup.8	.13	.14
5	YYCA2111CSG11CY1YFDYWG (SEQ ID NO: 10) (1 = any amino acid residue, except C; 2 = K and R)	15	9.4 .times. 10.sup.7	.13	.14
6	YYCA211S1TIFG11111YFDYWG (SEQ ID NO: 11) (1 = any amino acid residue, except C; 2 = K and R)	17	1.7 .times. 10.sup.10	.11	.12
7	YYCAR111YY2S33YY111YFDYWG (SEQ ID NO: 12) (1 = any amino acid residue, except C; 2 = D or G; 3 = S and G)	18	3.8 .times. 10.sup.8	.04	.08
8	YYCAR1111YC2231CY111YFDYWG (SEQ ID NO: 13) (1 = any amino acid residue, except C; 2 = S and G; 3 = T, D and G)	19	2.0 .times. 10.sup.11	.10	.11

TABLE-US-00016 TABLE 5 Oligonucleotides used to variegate the eight components of HC CDR3
(Ctop25): 5'-gctctggtcaac/tta/agg/gct/gag/g-3' (SEQ ID NO: 40)
(CtprmA): 5'-gctctggtcaac/tta/agg/gct/gag/gac/acc/gct/gtc/tac/tac/tgc/gcc-3' (SEQ ID NO: 41)
AflII...
(CBprmB)[RC]: 5'-/tac/ttc/gat/tac/tgg/ggc/caa/ggt/acc/ctg/gtc/acc/tcgctccacc-3' (SEQ ID NO: 42)
BstEII...
(CBot25)[RC]: 5'-/ggt/acc/ctg/gtc/acc/tcgctccacc-3' (SEQ ID NO: 43)
The 20 bases at 3' end of CtprmA are identical to the most 5' 20 bases of each of the vgDNA molecules. Ctop25 is identical to the most 5' 25 bases of CtprmA. The 23 most 3' bases of CBprmB are the reverse complement of the most 3' 23 bases of each of the vgDNA molecules. CBot25 is identical to the 25 bases at the 5' end of CBprmB.
Component 1 (C1t08): 5'-cc/gct/gtc/tac/tac/tgc/gcc/<2>/<1>/<1>/<1>/<1>/tac/ttc/gat/tac/tgg/ggc/caa/gg-3' (SEQ ID NO: 44)
<1> = 0.095 Y + 0.095 G + 0.048 each of the residues ADEFHIKLMNPQRSTVW, no C; <2> = K and R (equimolar mixture)
Component 2 (C2t10): 5'-cc/gct/gtc/tac/tac/tgc/gcc/<2>/<1>/<1>/<1>/<1>/<1>/<1>/tac/ttc/gat/tac/tgg/ggc/caa/gg-3' (SEQ ID NO: 45)
<1> = 0.095 Y + 0.095 G + 0.048 each of ADEFHIKLMNPQRSTVW, no C; <2> = K and R (equimolar mixture)
Component 3 (C3t12): 5'-cc/gct/gtc/tac/tac/tgc/gcc/<2>/<1>/<1>/<1>/<1>/<1>/<1>/<1>/<1>/tac/ttc/gat/tac/tgg/ggc/caa/gg-3' (SEQ ID NO: 46)
<1> = 0.095 Y + 0.095 G + 0.048 each of ADEFHIKLMNPQRSTVW, no C; <2> = K and R (equimolar mixture)
Component 4 (C4t14): 5'-cc/gct/gtc/tac/tac/tgc/gcc/cgt/<1>/<1>/<1>/tct/<2>/tct/<3>/<1>/<1>/<1>/tac/ttc/gat/tac/tgg/ggc/caa/gg-3' (SEQ ID NO: 47)
<1> = 0.095 Y + 0.095 G + 0.048 each of ADEFHIKLMNPQRSTVW, no C; <2> = S and G (equimolar mixture); <3> = Y and W (equimolar mixture)
Component 5 (C5t15): 5'-cc/gct/gtc/tac/tac/tgc/gcc/<2>/<1>/<1>/<1>/tgc/tct/ggt/<1>/<1>/tgc/tat/<1>/tac/ttc/gat/tac/tgg/ggc/caa/gg-3' (SEQ ID NO: 48)
<1> = 0.095 Y + 0.095 G + 0.048 each of ADEFHIKLMNPQRSTVW, no C; <2> = K and R (equimolar mixture)
Component 6 (C6t17): 5'-cc/gct/gtc/tac/tac/tgc/gcc/<2>/<1>/<1>/tct/<1>/act/atc/ttc/ggt/<1>/<1>/<1>/<1>/<1>/tac/ttc/gat/tac/tgg/ggc/caa/gg-3' (SEQ ID NO: 49)
<1> = 0.095 Y + 0.095 G + 0.048 each of ADEFHIKLMNPQRSTVW, no C; <2> = K and R (equimolar mixture)
Component 7 (C7t18): 5'-cc/gct/gtc/tac/tac/tgc/gcc/cgt/<1>/<1>/<1>/tat/tac/<2>/tct/<3>/<3>/tac/tat/<1>/<1>/<1>/tac/ttc/gat/tac/tgg/ggc/caa/gg-3' (SEQ ID NO: 50)
<1> = 0.095 Y + 0.095 G + 0.048 each of ADEFHIKLMNPQRSTVW, no C; <2> = D and G (equimolar mixture); <3> = S and G (equimolar mixture)
Component 8 (C8t19): 5'-cc/gct/gtc/tac/tac/tgc/gcc/cgt/<1>/<1>/<1>/<1>/tat/tgc/<2>/<2>/<3>/<1>/tgc/tat/<1>/<1>/<1>/tac/ttc/gat/tac/tgg/ggc/caa/gg-3' (SEQ ID NO: 51)
<1> = 0.095 Y + 0.095 G + 0.048 each of ADEFHIKLMNPQRSTVW, no C; <2> = S and G (equimolar mixture); <3> = TDG (equimolar mixture);

TABLE-US-00017 TABLE 6 3-23:: JH4 Stuffers in place of CDRs FR1(DP47/V3-23)------------------------ 20 21 22 23 24 25 26 27 28 29 30 A M A E V Q L L E S G ctgtctgaac cc atg gcc gaa/gtt/caa/ttg/tta/gag/tct/ggt/ (SEQ ID NO: 99) Scab .......NcoI.... MfeI ---------------------------- FR1---------------------------- 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 G G L V Q P G G S L R L S C A /ggc/ggt/ctt/gtt/cag/cct/ggt/ggt/tct/tta/cgt/ctt/tct/tgc/gct/ ---FR1--------------------->/...CDR1 stuffer..../---FR2------ 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 A S G F T F S S Y A / / W V R /gct/tcc/gga/ttc/act/ttc/tct/tcg/tac/gct/tag/taa/tgg/gtt/cgc/ BspEI BsiWI BstXI. -------FR2-------------------------------->/...CDR2 stuffer. 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 Q A P G K G L E W V S / P R / /caa/gct/cct/ggt/aaa/ggt/ttg/gag/tgg/gtt/tct/taa/cct/agg/tag/ ...BstXI AvrII.. ....CDR2 stuffer..................................../---FR3--- 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 T I S R D N S K N T L Y L Q M /act/atc/tct/aga/gac/aac/tct/aag/aat/act/ctc/tac/ttg/cag/atg/ XbaI --FR3------------..> CDR3 Stuffer ------------>/ 106 107 108 109 110 N S L R A (SEQ ID NO: 53) /aac/agc/tta/agg/gct/tag taa agg cct taa (SEQ ID NO: 52) AflII StuI... /-----FR4 --- (JH4) ------------------------------------------- Y F D Y W G Q G T L V T V S S (SEQ ID NO: 26) /tat/ttc/gat/tat/tgg/ggt/caa/ggt/acc/ctg/gtc/acc/gtc/tct/agt/... (SEQ ID NO: 25) KpnI BstEII

TABLE-US-00018 TABLE 7 A27:JH1 Human Kappa light chain gene gaggacc attgggcccc ctccgagact ctcgagcgca Scab ......Eco0109I XhoI ApaI acgcaattaa tgtgagttag ctcactcatt aggcacccca ggctttacac tttatgcttc ..-35.. Plac ..-10. cggctcgtat gttgtgtgga attgtgagcg gataacaatt tcacacagga aacagctatg accatgatta cgccaagctt tggagccttt tttttggaga ttttcaac (SEQ ID NO: 54) pflMI....... Hind III M13 III signal sequence (AA seg) ------------------------------ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 M K K L L F A I P L V V P F Y gtg aag aag ctc cta ttt gct atc ccg ctt gtc gtt ccg ttt tac --Signal-->FR1---------------------------------------------> 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 S H S A Q S V L T Q S P G T L /agc/cat/agt/gca/caa/tcc/gtc/ctt/act/caa/tct/cct/ggc/act/ctt/ ApaLI... ---- FR1 -------------------------------------->/ CDR1----> 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 S L S P G E R A T L S C R A S (SEQ ID NO: 55) /tcg/cta/agc/ccg/ggt/gaa/cgt/gct/acc/tta/agt/tgc/cgt/gct/tcc/ (SEQ ID NO: 54; Cont'd) EspI..... AflII ... XmaI... For CDR1: <1> ADEFGHIKLMNPQRSTVWY 1:1 <2> S(0.2) ADEFGHIKLMNPQRTVWY (0.044 each) <3> Y(0.2) ADEFGHIKLMNPQRSTVW (0.044 each) (CDR1 installed as AflII-(SexAI or KasI) cassette.) For the most preferred 11 length codon 51 (XXX) is omitted; for the preferred 12 length this codon is <2> ------ CDR1--------------------- --->/ --- FR2-------------> <1> <2> <2> xxx <3> 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Q - V - - - - L A W Y Q Q K P (SEQ ID NO: 55; Cont'd) /cag/ - /gtt/ - / - / - / - /ctt/gct/tgg/tat/caa/cag/aaa/cct/ (SEQ ID NO: 54; Cont'd) SexAI.... For CDR2: <1> ADEFGHIKLMNPQRSTVWY 1:1 <2> S(0.2) ADEFGHIKLMNPQRTVWY (0.044 each) <4> A(0.2) DEFGHIKLMNPQRSTVWY (0.044 each) CDR2 installed as (SexAI or KasI) to (BamHI or RsrII) cassette.) 
----- FR2 ------------------------->/------CDR2-----------> <1> <2> <4> 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 G Q A P R L L I Y - A S - R - (SEQ ID NO: 55; Cont'd) /ggt/cag/gcg/ccg/cgt/tta/ctt/att/tat/ - /gct/tct/ - /cgc/ - (SEQ ID NO: 54; Cont'd) SexAI.... KasI.... CDR2-->/--- FR3 ------------------------------------------> <1> 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 - G I P D R F S G S G S G T D / - /ggg/atc/ccg/gac/cgt/ttc/tct/ggc/tct/ggt/tca/ggt/act/gac/ BamHI RsrII ..... --------FR3------------------------------------------------> 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 F T L T I S R L E P E D F A V (SEQ ID NO: 55' Cont'd) /ttt/acc/ctt/act/att/tct/aga/ttg/gaa/cct/gaa/gac/ttc/gct/gtt/ (SEQ ID NO: 54; Cont'd) XbaI For CDR3 (Length 9): <1> ADEFGHIKLMNPQRSTVWY 1:1 <3> Y(0.2) ADEFGHIKLMNPQRTVW (0.044 each) For CDR3 (Length 8): QQ33111P 1 and 3 as defined for Length 9 For CDR3 (Length 10): QQ3211PP1T 1 and 3 as defined for Length 9 2 S(0.2) and 0.044 each of ADEFGHIKLMNPQRTVWY CDR3 installed as XbaI to (StyI Or BsiWI)cassette. ------------->/----CDR3-------------------------->/----FR4---> <3> <1> <1> <1> <1> 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 Y Y C Q Q - - - - P - T F G Q (SEQ ID NO: 55; Cont'd) /tat/tat/tgc/caa/cag/ - / - / - / - /cct/ - /act/ttc/ggt/caa/ (SEQ ID NO: 54; Cont'd) BstXI......... ----FR4-------------------->/ <-------Ckappa -------------- 121 122 123 124 125 126 127 128 129 130 131 132 133 134 G T K V E I K R T V A A P S /ggt/acc/aag/gtt/gaa/atc/aag/ /cgt/acg/gtt/gcc/gct/cct/agt/ StyI.... BsiWI.. 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 V F I F P P S D E Q L K S G T /gtg/ttt/atc/ttt/cct/cct/tct/gac/gaa/caa/ttg/aag/tca/ggt/act/ MfeI... 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 A S V V C L L N N F Y P R E A (SEQ ID NO: 55; Cont'd) /gct/tct/gtc/gta/tgt/ttg/ctc/aac/aat/ttc/tac/cct/cgt/gaa/gct/ (SEQ ID NO: 54; Cont'd) BssSI.. 
165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 K V Q W K V D N A L Q S G N S /aaa/gtt/cag/tgg/aaa/gtc/gat/aac/gcg/ttg/cag/tcg/ggt/aac/agt/ MluI.... 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 Q E S V T E Q D S K D S T Y S /caa/gaa/tcc/gtc/act/gaa/cag/gat/agt/aag/gac/tct/acc/tac/tct/ 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 L S S T L T L S K A D Y E K H /ttg/tcc/tct/act/ctt/act/tta/tca/aag/gct/gat/tat/gag/aag/cat/ 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 K V Y A C E V T H Q G L S S P (SEQ ID NO: 55; Cont'd) /aag/gtc/tat/GCt/TGC/gaa/gtt/acc/cac/cag/ggt/ctg/agc/ttc/cct/ (SEQ ID NO: 54; Cont'd) SacI.... 225 226 227 228 229 230 231 232 233 234 V T K S F N R G E C (SEQ ID NO: 55; Cont'd) /gtt/acc/aaa/agt/ttc/aac/cgt/ggt/gaa/tgc/taa/tag ggcgcgcc DsaI.... AscI.... BssHII acgcatctctaa gcggccgc aacaggaggag (SEQ ID NO: 54; Cont'd) NotI....

TABLE-US-00019 TABLE 8 2a2:JH2 Human lambda-chain gene gaggaccatt gggcccc ttactccgtgac Scab...... Eco0109I ---------FR1--------------------------------------------> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 S A Q S A L T Q P A S V S G S P G (SEQ ID NO: 57) agt/gca/caa/tcc/gct/ctc/act/cag/cct/gct/agc/gtt/tcc/ggg/tca/cct/ggt/ (SEQ ID NO: 56) ApaLI... NheI... BstEII... SexAI.... For CDR1 (length 14): <1> = 0.27 T, 0.27 G, 0.027 each of ADEFHIKLMNPQRSVWY, no C <2> = 0.27 D, 0.27 N, .0.027 each of AEFGHIKLMPQRSTVWY, no C <3> = 0.36 Y, 0.0355 each of ADEFGHIKLMNPQRSTVW, no C T G <1> S S <2> V G -----FR1 ----------------> /----CDR1----------------------- 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 Q S I T I S C T G - S S - V G /caa/agt/atc/act/att/tct/tgt/aca/ggt/ - /tct/tct/ - /gtt/ggc/ BsrGI.. <1> <3> <2> <3> V S = vg Scheme #1, length = 14 -----CDR1 -------------> /------------- FR2----------------- 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 - - - - V S W Y Q Q H P G K A (SEQ ID NO: 57; Cont'd) / - / - / - / - /gtt/tct/tgg/tat/caa/caa/cac/ccg/ggc/aag/gcg/ (SEQ ID NO: 56; Cont'd) XmaI.... KasI..... AvaI.... A second Vg scheme for CDR1 gives segments of length 11: T.sub.22G<2><4>L<4><4><4><3><4>&- lt;4> where <4> = equimolar mixture of each of ADEFGHIKLMNPQRSTVWY, no C <3> = as defined above for the alternative CDR1 For CDR2: <2> and <4> are the same variegation as for CDR1 <4> <4> <4> <2> R P S --FR2---------------> /-------CDR2--------- ----->/------FR3- 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 P K L M I Y - - - - R P S G V /ccg/aag/ttg/atg/atc/tac/ - / - / - / - /cgt/cct/tct/ggt/gtt/ KasI.... --------FR3------------------------------------------------- 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 S N R F S G S K S G N T A S L (SEQ ID NO: 57; Cont'd) /agc/aat/cgt/ttc/tcc/gga/tct/aaa/tcc/ggt/aat/acc/gca/agc/tta/ (SEQ ID NO: 56; Cont'd) BspEI.. HindIII. 
BsaBI..............(blunt) ------FR3-------------------------------------------------> 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 T I S G L Q A E D E A D Y Y C (SEQ ID NO: 57; Cont'd) /act/atc/tct/ggt/ctg/cag/gct/gaa/gac/gag/gct/gac/tac/tat/tgt/ (SEQ ID NO: 56; Cont'd) PstI... CDR3 (Length 11): <2> and <4> are the same variegation as for CDR1 <5> = 0.36 S, 0.0355 each of ADEFGHIKLMNPQRTVWY no C CDR3 (Length 10): <5> SY <1> <5> S <5> <1> <4> V <1> is an equimolar mixture of ADEFGHIKLMNPQRSTVWY, no C <4> and <5> are as defined for Length 11 <4> <5> <4> <2> <4> S <4> <4> <4> <4> V ------CDR3--------------------------------->/----FR4------- 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 - - - - - S - - - - V F G G G / - / - / - / - / - /tct/ - / - / - / - /gtc/ttc/ggc/ggt/ggt/ KpnI.. -------FR4-------------> 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 T K L T V L G Q P K A A P S V /acc/aaa/ctt/act/gtc/ctc/ggt/caa/cct/aag/gct/gct/cct/tcc/gtt/ KpnI... HincII.. Bsu36I... 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 T L F P P S S E E L Q A N K A (SEQ ID NO: 57; Cont'd) /act/ctc/ttc/cct/cct/agt/tct/gaa/gag/ctt/caa/gct/aac/aag/gct/ (SEQ ID NO: 56; Cont'd) SapI..... 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 T L V C L I S D F Y P G A V T /act/ctt/gtt/tgc/ttg/atc/agt/gac/ttt/tat/cct/ggt/gct/gtt/act/ BclI.... 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 V A W K A D S S P V K A G V E /gtc/gct/tgg/aaa/gcc/gat/tct/tct/cct/gtt/aaa/gct/ggt/gtt/gag/ BsmBI... 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 T T T P S K Q S N N K Y A A S /acg/acc/act/cct/tct/aaa/caa/tct/aac/aat/aag/tac/gct/gcg/agc/ BsmBI... SacI.... 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 S Y L S L T P E Q W K S H K S (SEQ ID NO: 57; Cont'd) /tct/tat/ctt/tct/ctc/acc/cct/gaa/caa/tgg/aag/tct/cat/aaa/tcc/ (SEQ ID NO: 56; Cont'd) SacI... 
196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 Y S C Q V T H E G S T V E K T /tat/tcc/tgt/caa/gtt/act/cat/gaa/ggt/tct/acc/gtt/gaa/aag/act/ BspHI... 211 212 213 214 215 216 217 218 219 V A P T E C S (SEQ ID NO: 57; Cont'd) /gtt/gcc/cct/act/gag/tgt/tct/tag/tga/ggcgcgcc AscI.... BssHII aacgatgttc aag gcggccgc aacaggaggag (SEQ ID NO: 56; Cont'd) NotI.... Scab.......

TABLE-US-00020 TABLE 9 Oligonucleotides For Kappa and Lambda Light Chain Variegation
(Ctop25): 5'-gctctggtcaac/tta/agg/gct/gag/g-3' (SEQ ID NO: 58)
(CtprmA): 5'-gctctggtcaac/tta/agg/gct/gag/gac/acc/gct/gtc/tac/tac/tgc/gcc-3' (SEQ ID NO: 59) AflII...
(CBprmB)[RC]: 5'-/tac/ttc/gat/tac/ttg/ggc/caa/ggt/acc/ctg/gtc/acc/tcgctccacc-3' (SEQ ID NO: 60) BstEII...
(CBot25)[RC]: 5'-/ggt/acc/ctg/gtc/acc/tcgaccacc-3' (SEQ ID NO: 61)
Kappa chains: CDR1 ("1"), CDR2 ("2"), CDR3 ("3")
CDR1
(Ka1Top610): 5'-ggtctcagttg/cta/agc/ccg/ggt/gaa/cgt/gct/acc/tta/agt/tgc/cgt/gct/tcc/cag-3' (SEQ ID NO: 62)
(Ka1STp615): 5'-ggtctcagttg/cta/agc/ccg/ggt/g-3' (SEQ ID NO: 63)
(Ka1Bot620)[RC]: 5'-ctt/gct/tgg/tat/caa/cag/aaa/cct/ggt/cag/gcg/ccaagtcgtgtc-3' (SEQ ID NO: 64)
(Ka1SB625)[RC]: 5'-cct/ggt/cag/gcg/ccaagtcgtgtc-3' (SEQ ID NO: 65)
(Ka1vg600): 5'-gct/acc/tta/agt/tgc/cgt/gct/tcc/cag/<1>/gtt/<2>/<2>/<3>/ctt/gct/tgg/tat/caa/cag/aaa/cc-3' (SEQ ID NO: 66)
(Ka1vg600-12): 5'-gct/acc/tta/agt/tgc/cgt/gct/tcc/cag/<1>/gtt/<2>/<2>/<2>/<3>/ctt/gct/tgg/tat/caa/cag/aaa/cc-3' (SEQ ID NO: 67)
CDR2
(Ka2Tshort657): 5'-cacgagtccta/cct/ggt/cag/gc-3' (SEQ ID NO: 68)
(Ka2Tlong655): 5'-cacgagtccta/cct/ggt/cag/gcg/ccg/cgt/tta/ctt/att/tat-3' (SEQ ID NO: 69)
(Ka2Bshort660)[RC]: 5'-/gac/cgt/ttc/tct/ggt/tctcacc-3' (SEQ ID NO: 70)
(Ka2vg650): 5'-cag/gcg/ccg/cgt/tta/ctt/att/tat/<1>/gct/tct/<2>/cgc/<4>/<1>/ggg/atc/ccg/gac/cgt/ttc/tct/ggt/tctcacc-3' (SEQ ID NO: 71)
CDR3
(Ka3Tlon672): 5'-gacgagtccttct/aga/ttg/gaa/cct/gaa/gac/ttc/gct/gtt/tat/tat/tgc/caa/c-3' (SEQ ID NO: 72)
(Ka3BotL682)[RC]: 5'-act/ttc/ggt/caa/ggt/acc/aag/gtt/gaa/atc/aag/cgt/acg/tcacaggtgag-3' (SEQ ID NO: 73)
(Ka3Bsho694)[RC]: 5'-gaa/atc/aag/cgt/acg/tcacaggtgag-3' (SEQ ID NO: 74)
(Ka3vg670): 5'-gac/ttc/gct/gtt/tat/tat/tgc/caa/cag/<3>/<1>/<1>/<1>/cct/<1>/act/ttc/ggt/caa/ggt/acc/aag/gtt/g-3' (SEQ ID NO: 75)
(Ka3v670-8): 5'-gac/ttc/gct/gtt/tat/tat/tgc/caa/cag/<3>/<3>/<1>/<1>/<1>/cct/ttc/ggt/caa/ggt/acc/aag/gtt/g-3' (SEQ ID NO: 76)
(Ka3vg670-10): 5'-gac/ttc/gct/gtt/tat/tat/tgc/caa/cag/<3>/<2>/<1>/<1>/cct/cct/<1>/act/ttc/ggt/caa/ggt/acc/aag/gtt/g-3' (SEQ ID NO: 77)
Lambda Chains: CDR1 ("1"), CDR2 ("2"), CDR3 ("3")
CDR1
(Lm1TPri75): 5'-gacgagtcctgg/tca/cct/ggt/-3' (SEQ ID NO: 78)
(Lm1t1o715): 5'-gacgagtcctgg/tca/cct/ggt/caa/agt/atc/act/att/tct/tgt/aca/ggt-3' (SEQ ID NO: 79)
(Lm1b1o724)[rc]: 5'-/gtt/tct/tgg/tat/caa/caa/cac/ccg/ggc/aag/gcg/agatcttcacaggtgag-3' (SEQ ID NO: 80)
(Lm1bsh737)[rc]: 5'-gc/aag/gcg/agatcttcacaggtgag-3' (SEQ ID NO: 81)
(Lm1vg710b): 5'-gt/atc/act/att/tct/tgt/aca/ggt/<2>/<4>/ctc/<4>/<4>/<4>/<3>/<4>/<4>/tgg/tat/caa/caa/cac/cc-3' (SEQ ID NO: 82)
(Lm1vg710): 5'-gt/atc/act/att/tct/tgt/aca/ggt/<1>/tct/tct/<2>/gtt/ggc/<1>/<3>/<2>/<3>/gtt/tct/tgg/tat/caa/caa/cac/cc-3' (SEQ ID NO: 83)
CDR2
(Lm2TSh757): 5'-gagcagaggac/ccg/ggc/aag/gc-3' (SEQ ID NO: 84)
(Lm2TLo753): 5'-gagcagaggac/ccg/ggc/aag/gcg/ccg/aag/ttg/atg/atc/tac/-3' (SEQ ID NO: 85)
(Lm2BLo762)[RC]: 5'-cgt/cct/tct/ggt/gtc/agc/aat/cgt/ttc/tcc/gga/tcacaggtgag-3' (SEQ ID NO: 86)
(Lm2Bsh765)[RC]: 5'-cgt/ttc/tcc/gga/tcacaggtgag-3' (SEQ ID NO: 87)
(Lm2vg750): 5'-g/ccg/aag/ttg/atg/atc/tac/<4>/<4>/<4>/<2>/cgt/cct/tct/ggt/gtc/agc/aat/c-3' (SEQ ID NO: 88)
CDR3
(Lm3TSh822): 5'-ctg/cag/gct/gaa/gac/gag/gct/gac-3' (SEQ ID NO: 89)
(Lm3TLo819): 5'-ctg/cag/gct/gaa/gac/gag/gct/gac/tac/tat/tgt/-3' (SEQ ID NO: 90)
(Lm3BLo825)[RC]: 5'-gtc/ttc/ggc/ggt/ggt/acc/aaa/ctt/act/gtc/ctc/ggt/caa/cct/aag/gacacaggtgag-3' (SEQ ID NO: 91)
(Lm3BSh832)[RC]: 5'-c/ggt/caa/cct/aag/gacacaggtgag-3' (SEQ ID NO: 92)
(Lm3vg17): 5'-gac/gag/gct/gac/tac/tat/tgt/<4>/<5>/<4>/<2>/<4>/tct/<4>/<4>/<4>/<4>/Gtc/ttc/ggc/ggt/ggt/acc/aaa/ctt/ac-3' (SEQ ID NO: 93)
(Lm3vg817-10): 5'-gac/gag/gct/gac/tac/tat/tgt/<5>/agc/tat/<1>/<5>/tct/<5>/<1>/<4>/gtc/ttc/ggc/ggt/ggt/acc/aaa/ctt/ac-3' (SEQ ID NO: 94)

TABLE-US-00021 TABLE 10 A27:JH1 Kappa light chain gene with stuffers in place of CDRs Each stuffer contains at least one stop codon and a restriction site that will be unique within the diversity vector. gaggacc attgggcccc ctccgagact ctcgagcgca Scab..... EcoO109I ApaI. XhoI.. acgcaattaa tgtgagttag ctcactcatt aggcacccca ggctttacac tttatgcttc ..-35.. Plac ..-10. cggctcgtat gttgtgtgga attgtgagcg gataacaatt tcacacagga aacagctatgac catgatta cgccaagctt tggagccttt tttttggaga ttttcaac (SEQ ID NO: 95) PflMI ............. Hind3. M13 III signal sequence (AA seq)--------------------------> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 M K K L L F A I P L V V P F Y gtg aag aag ctc cta ttt gct atc ccg ctt gtc gtt ccg ttt tac --Signal--> FR1-------------------------------------------- 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 S H S A Q S V L T Q S P G T L /agc/cat/agt/gca/caa/tcc/gtc/ctt/act/caa/tct/cct/ggc/act/ctt/ ApaLI... ----- FR1------------------- -------------->/---------Stuffer-> 31 32 33 34 35 36 37 38 39 40 41 42 43 S L S P G E R A T L S / / (SEQ ID NO: 96) /tcg/cta/agc/ccg/ggt/gga/cgt/gct/acc/tta/agt/tag/taa/gct/ccc/ (SEQ ID NO: 95; Cont'd) EspI..... AflII... XmaI.... - Stuffer for CDR1--> FR2 --------------- FR2--- >/ Stuffer for CDR2 59 60 61 62 63 64 65 66 K P G Q A P R /agg/cct/ctt/tga/tct/g/aaa/cct/ggt/cag/gcg/ccg/cgt/taa/tga/aagcgctaatggcca- acagtg StuI... SexAI... KasI.... AfeI.. MscI.. Stuffer-->/--- FR3 ------------------------------------------> 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 T G I P D R F S G S G S G T D (SEQ ID NO: 96; Cont'd) /act/ggg/atc/ccg/gac/cgt/ttc/tct/ggc/tct/ggt/tca/ggt/act/gac/ (SEQ ID NO: 95; Cont'd) BamHI... RsrII............ --------FR3------>----------------STUFFER for CDR3-------------------> 91 92 93 94 95 96 97 F T L T I S R / / /ttt/acc/ctt/act/att/tct/aga/taa/tga/ gttaac tag acc tacgta acc tag XbaI... HpaI.. SnaBI. 
----------------------CDR3 stuffer---------------->/------FR4-------> 118 119 120 F G Q /ttc/ggt/caa/ -----FR4--------------------> <--------Ckappa ----------- 121 122 123 124 125 126 127 128 129 130 131 132 133 134 G T K V E I K R T V A A P S (SEQ ID NO: 96; Cont'd) /ggt/acc/aag/gtt/gaa/atc/aag/ /cgt/acg/gtt/gcc/gct/cct/agt/ StyI.... BsiWI.. (SEQ ID NO: 95; Cont'd) 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 V F I F P P S D E Q L K S G T (SEQ ID NO: 96; Cont'd) /gtg/ttt/atc/ttt/cct/cct/tct/gac/gaa/caa/ttg/aag/tca/ggt/act/ MfeI... acgcatctctaa gcggccgc aacaggaggag (SEQ ID NO: 95; Cont'd) NotI.... EagI..

TABLE-US-00022 TABLE 11 2a2:JH2 Human lambda-chain gene with stuffers in place of CDR3 gaggaccatt gggcccc ttactccgtgac Scab . . . EcoO109I ApaI ----------FR1-------------- ------------------------------ > 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 S A Q S A L T Q P A S V S G S P G agt/gca/caa/tcc/gct/ctc/act/cag/cct/gct/agc/gtt/tcc/ggg/tca/cct/ggt/ ApaLI . . . NheI . . . BstEll . . . SexAI . . . -------FR1----------------> /---------stuffer for CDR1 --------- 16 17 18 19 20 21 22 23 Q S I T I S C T (SEQ ID NO: 98) /caa/agt/atc/act/att/tct/tgt/aca/tct tag tga ctc (SEQ ID NO: 97) BsrGI.. ----Stuffer----------------------------->--------FR2-------> 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 R S / / P / H P G K A aga tct taa tga ccg tag cac/ccg/ggc/aag/gcg/ BglII XmaI . . . KasI . . . AvaI . . . --/-------------Stuffer for CDR2---------------------------------> P /ccg/taa/tga/atc/tcg tac g ct/ggt/gtt/ KasI . . . BsiWI . . . -------FR3------------------------------------------------ 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 S N R F S G S K S G N T A S L (SEQ ID NO: 98; Cont'd) /agc/aat/cgt/ttc/tcc/gga/tct/aaa/tcc/ggt/aat/acc/gca/agc/tta/ (SEQ ID NO: 97; Cont'd) BseEI . . . HindIII . . . BsaBI . . . (blunt) ------FR3------------->/--Stuffer for ORD3------------------>/ 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 T I S G L Q /act/atc/tct/ggt/ctg/cag/gtt ctg tag ttc caattg ctt tag tga ccc PstI . . . MfeI . . . -----Stuffer------------------------------->/---FR4--------- 103 104 105 G G G /ggc/ggt/ggt/ KpnI . . . --------FR4--------------> 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 T K L T V L G Q P K A A P S (SEQ ID NO: 98; Cont'd) V/acc/aaa/ctt/act/gtc/ctc/ggt/caa/cct/aag/gct/gct/cct/tcc/gtt/ (SEQ ID NO: 97; Cont'd) KpnI . . . HincII . . . Bsu36I . . . 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 T L F P P S S E E L Q A N K A /act/ctc/ttc/cat/cct/agt/tct/gaa/gag/ctt/caa/gct/aac/aag/gct/ SapI . . . 
136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 T L V C L I S D F Y P G A V T (SEQ ID NO: 98; Cont'd) /act/cct/gtt/tgc/ttg/atc/agt/gac/ttt/tat/cct/ggt/gct/gtt/act/ (SEQ ID NO: 97; Con'td) BclI . . .

[0146] The invention relates to generation of useful diversity in synthetic antibody (Ab) genes, especially to Ab genes having frameworks derived from human Abs.

BACKGROUND OF THE INVENTION

[0147] Antibodies are highly useful molecules because of their ability to bind almost any substance with high specificity and affinity and their ability to remain in circulation in blood for prolonged periods as therapeutic or diagnostic agents. For treatment of humans, Abs derived from human Abs are much preferred to avoid immune response to the Ab. For example, murine Abs very often cause Human Anti Mouse Antibodies (HAMA) which at a minimum prevent the therapeutic effects of the murine Ab. For many medical applications, monoclonal Abs are preferred. Nowadays the preferred method of obtaining a human Ab having a particular binding specificity is to select the Ab from a library of human-derived Abs displayed on a genetic package, such as filamentous phage.

[0148] Libraries of phage-displayed Fabs and scFvs have been produced in several ways. One method is to capture the diversity of donors, either naive or immunized. Another way is to generate libraries having synthetic diversity. The present invention relates to methods of generating useful diversity in human Ab scaffolds.

[0149] As is well known, typical Abs consist of two heavy chains (HC) and two light chains (LC). There are several types of HCs: gamma, mu, epsilon, delta, etc. Each type has an N-terminal V domain followed by three or more constant domains. The LCs comprise an N-terminal V domain followed by a constant domain. LCs come in two types: kappa and lambda.

[0150] Within each V domain (LC or HC) there are seven canonical regions, named FR1, CDR1, FR2, CDR2, FR3, CDR3, and FR4, where "FR" stands for "Framework Region" and "CDR" stands for "Complementarity Determining Region". For LC and HC, the FRs and CDRs of the germ-line genes (GLGs) have been selected over time to be secretable, stable, and non-antigenic, and these properties should be preserved as much as possible. Actual Ab genes contain mutations in the FR regions and some of these mutations contribute to binding, but such useful FR mutations are rare and are not necessary to obtain high-affinity binding. Thus, the present invention will concentrate diversity in the CDR regions.

[0151] In LC, FR1 up to FR3 and part of CDR3 come from a genomic collection of genes called "V-genes". The remainder of CDR3 and FR4 come from a genomic collection of genes called "J-genes". The joining may involve a certain degree of mutation, allowing diversity in CDR3 that is not present in the genomic sequences. After the LC gene is formed, somatic mutations can give rise to mature, rearranged LC genes that have higher affinity for an antigen (Ag) than does any LC encoded by genomic sequences. A large fraction of somatic mutations occur in CDRs.

[0152] The HC V region is more complicated. A V gene is joined to a J gene with the possible inclusion of a D segment. About half of HC Ab sequences contain a recognizable D segment in CDR3. The joining is achieved with an amazing degree of molecular sloppiness. Roughly, the end of the V gene may have zero to several bases deleted or changed, the D segment may have zero to many bases removed or changed at either end, a number of random bases may be inserted between V and D or between D and J, and the 5' end of J may be edited to remove or change several bases. Withal, it is amazing that human heavy chains work, but they do. The upshot is that the CDR3 is highly diverse both in encoded amino-acid sequences and in length. In designing synthetic libraries, there is the temptation to just throw in a high degree of synthetic diversity and let the phage sort it out. Nevertheless, D regions serve a function. They cause the Ab repertoire to be rich in sequences that a) allow Abs to fold correctly, and b) are conducive to binding to biological molecules, i.e. antigens.

[0153] One purpose of the present invention is to show how a manageable collection of diversified sequences can confer these advantages on synthetic Ab libraries. Another purpose of the present invention is to disclose an analysis of known mature Ab sequences that leads to improved designs for diversity in the CDR1 and CDR2 of HC and the three CDRs of lambda and kappa chains.

BRIEF STATEMENT OF THE INVENTION

[0154] The invention is directed to methods of preparing synthetically diverse populations of Ab genes suitable for display on genetic packages (such as phage or phagemids) or for other regimens that allow selection of specific binding. Said populations concentrate the diversity into regions of the Ab that are likely to be involved in determining affinity and specificity of the Ab for particular targets. In particular, a collection of actual Ab genes has been analyzed and the sites of actual diversity have been identified. In addition, structural considerations were used to determine whether the diversity is likely to greatly influence the binding activity of the Ab. Schemes of variegation are presented that encode populations in which the majority of members will fold correctly and in which there is likely to be a plurality of members that will bind to any given Ag. Specifically, a plan of variegation is presented for each CDR of the human heavy chain, kappa light chain, and lambda light chain. The variegated CDRs are presented in synthetic HC and LC frameworks.

[0155] In one embodiment, the invention involves variegation of human HC variable domains based on a synthetic 3-23 domain joined to a JH4 segment in which the variability in CDR1 and CDR2 comprises sequence variation of segments of fixed length while in CDR3 there are several components such that the population has lengths roughly corresponding to lengths seen in human Abs and having embedded D segments in a portion of the longer segments. In the light chains, the kappa chain is built in an A27 framework and a JK1 while lambda is built in a 2a2 framework with an L2 J region.

EXAMPLES

Choice of a Heavy-Chain V Domain

[0156] The HC Germ-Line Gene (GLG) 3-23 (also known as DP-47) accounts for about 12% of all human Abs and it is suitable for the framework of the library. Certain types of Ags elicit Abs having particular types of VH genes; in some cases, the types elicited are otherwise rarely found. This apparent Ag/Ab type specificity has been ascribed to possible structural differences between the various families of V genes. It is also possible that the selection has to do with the availability of particular AA types in the GLG CDRs. Suppose, for example, that the sequence YR at positions 4 and 5 of CDR2 is particularly effective in binding a particular type of Ag. Only the V gene 6-1 provides this combination. Most Abs specific for the Ag will come from GLG 6-1. If Y4-R5 were provided in other frameworks, then other frameworks are likely to be as effective in binding the Ag.

Analysis of HC CDR1 and CDR2:

[0157] In CDR1 and CDR2 of HCs, the GLGs provide limited length diversity as shown in Table 15P. Note that GLGs provide CDR1s only of lengths 5, 6, and 7. Mutations during the maturation of the V-domain gene lead to CDR1s having lengths as short as 2 and as long as 16. Nevertheless, length 5 predominates. The preferred length for the present invention is 5 AAs in CDR1, with possible supplemental components having lengths of 7 and 14.

[0158] GLGs provide CDR2s only of the lengths 15-19, but mutations during maturation result in CDR2s of length from 16 to 28 AAs. The lengths 16 and 17 predominate in mature Ab genes and length 17 is the most preferred length for the present invention. Possible supplementary components of length 16 and 19 may also be incorporated.

[0159] Table 20P shows the AA sequences of human GLG CDR1s and CDR2s. Table 21P shows the frequency of each amino-acid type at each position in the GLGs. The GLGs as shown in Table 20P have been aligned by inserting gaps near the middle of the segment so that the ends align.

[0160] The 1398 mature V-domain genes used in studying D segments (vide infra) were scanned for examples in which CDR1 and CDR2 could be readily identified. Of this sample 1095 had identifiable CDR1, 2, and 3. The CDRs were identified by finding subsequences of the GLGs in an open reading frame. There are 51 human HC V genes. At the end of FR1, there are 20 different 9-mers. At the start of FR2, there are 11 different 9-mers. At the end of FR2 there are 14 different 9-mers. At the start of FR3, there are 14 different 9-mers. At the end of FR3, there are 13 different 9-mers. At the start of JH, there are three different 9-mers. These motifs were compared to the reported gene in frame and a match, at the site of maximum similarity, of seven out of nine was deemed acceptable. Only when all three CDRs were identified were any of the CDRs included in the analysis. In addition, the type of the gene was determined by comparing the framework regions to the GLG frameworks; the results are shown in Table 22P.
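The boundary search described in paragraph [0160] (compare a 9-mer motif to the reported gene in frame, take the site of maximum similarity, accept it if at least seven of nine positions match) can be sketched as follows. This is a minimal Python illustration, not the procedure actually used for the analysis. The function names, the `frame_step` parameter, and the toy sequences are hypothetical; whether the 9-mers are nucleotides (step 3) or amino acids (step 1) is not stated in the text, so the step is left as a parameter.

```python
from typing import Optional, Tuple


def best_in_frame_match(gene: str, motif: str, frame_step: int = 3) -> Tuple[int, int]:
    """Return (offset, matches) for the in-frame window most similar to motif.

    Only windows starting at multiples of frame_step are examined, so the
    motif is always compared in the same reading frame as the gene.
    """
    best_offset, best_matches = -1, -1
    for start in range(0, len(gene) - len(motif) + 1, frame_step):
        window = gene[start:start + len(motif)]
        matches = sum(a == b for a, b in zip(window, motif))
        if matches > best_matches:
            best_offset, best_matches = start, matches
    return best_offset, best_matches


def locate_motif(gene: str, motif: str, min_matches: int = 7,
                 frame_step: int = 3) -> Optional[int]:
    """Offset of the best site if it reaches the 7-of-9 threshold, else None."""
    offset, matches = best_in_frame_match(gene, motif, frame_step)
    return offset if matches >= min_matches else None


# Hypothetical toy data: a 9-mer motif and a gene carrying a copy of it
# with two mismatches, placed in frame at offset 9.
toy_motif = "tactactgc"
toy_gene = "gacaccgct" + "tattactgt" + "gccaaa"

site = locate_motif(toy_gene, toy_motif)
print("accepted site:", site)
```

Under this scheme a gene would enter the analysis only if a site is accepted for every boundary motif, which mirrors the rule that all three CDRs must be identified before any of them is counted.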

Design of HC CDR1 and CDR2 Diversity.

[0161] Diversity in CDR1 and CDR2 was designed from: a) the diversity of the GLGs, b) observed diversity in mature HC genes, and c) structural considerations. In CDR1, examination of a 3D model of a humanized Ab showed that the side groups of residues 1, 3, and 5 were directed toward the combining pocket. Consequently, we allow each of these positions to be any amino-acid type except cysteine. Cysteine can form disulfide bonds. Disulfide bonds are an important component of the canonical Ig fold. Having free thiol groups could interfere with proper folding of the HC and could lead to problems in production or manipulation of selected Abs. Thus, I exclude cysteine from the menu. The side group of residue 2 is directed away from the combining pocket. Although this position shows substantial diversity, both in GLG and mature genes, I fixed this residue as Tyr because it occurs in 681/820 mature genes (Table 21P). Position 4 is fixed as Met. There is some diversity here, but almost all mature genes have uncharged hydrophobic AA types: M, W, I, V, etc. (Table 21P). Inspection of a 3D model shows that the side group of residue 4 is packed into the innards of the HC. Since we are using a single framework (3-23), we retain the Met that 3-23 has because it is likely to fit very well into the framework of 3-23. Thus, the most preferred CDR1 library consists of XYXMX (SEQ ID NO:109) where X can be any one of [A,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y] (no C). The DNA that encodes this is preferably synthesized using trinucleotide building blocks so that each AA type is present in essentially equimolar amounts. Specifically, the X codons are synthesized using a mixture of the codons [gct, gat, gag, ttt, ggt, cat, att, aag, atg, aat, cct, cag, cgt, tct, act, gtt, tgg, tat]. This diversity is shown in the context of a synthetic 3-23 gene in Table 18P.
The diversity oligonucleotide (ON) is synthesized from BspEI to BstXI and can be incorporated either by PCR synthesis using overlapping ONs or by ligation of BspEI/BstXI-cut fragments. Table 22P shows ONs that embody the specified variegation. PCR using ON-R1V1vg, ON-R1top, and ON-R1bot gives a dsDNA product of 73 base pairs; cleavage with BspEI and BstXI trims 11 and 13 bases from the ends and provides cohesive ends that can be ligated to similarly cut vector having the synthetic 3-23 domain shown in Table 18P. Replacement of ON-R1V1vg with either ONR1V2vg or ONR1V3vg allows synthesis of the two alternative diversity patterns given below.
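The relationship between the CDR1 amino-acid menu and the trinucleotide mixture given in paragraph [0161] can be cross-checked by translating the printed codon list with the standard genetic code. The Python sketch below is illustrative only; the variable names are hypothetical. It reports which residues of the stated 19-residue menu have no codon in the list as printed, and the nominal size of the XYXMX library under each reading. The text does not say whether a missing codon is an omission, so the sketch only surfaces the discrepancy.

```python
# Standard genetic code entries for the codons printed in paragraph [0161].
CODON_TO_AA = {
    "gct": "A", "gat": "D", "gag": "E", "ttt": "F", "ggt": "G", "cat": "H",
    "att": "I", "aag": "K", "atg": "M", "aat": "N", "cct": "P", "cag": "Q",
    "cgt": "R", "tct": "S", "act": "T", "gtt": "V", "tgg": "W", "tat": "Y",
}

# Codon mixture exactly as listed in the text.
LISTED_CODONS = [
    "gct", "gat", "gag", "ttt", "ggt", "cat", "att", "aag", "atg",
    "aat", "cct", "cag", "cgt", "tct", "act", "gtt", "tgg", "tat",
]

# Amino-acid menu stated for X: all types except cysteine.
STATED_MENU = set("ADEFGHIKLMNPQRSTVWY")

encoded = {CODON_TO_AA[codon] for codon in LISTED_CODONS}
not_encoded = STATED_MENU - encoded      # residues in the menu with no codon listed
unexpected = encoded - STATED_MENU       # residues encoded but not in the menu

# XYXMX has three variegated positions (1, 3, and 5).
size_from_menu = len(STATED_MENU) ** 3
size_from_codons = len(encoded) ** 3

print("menu residues without a listed codon:", sorted(not_encoded))
print("library size from stated menu:", size_from_menu)
print("library size from listed codons:", size_from_codons)
```

Because each position uses one codon per residue, the number of DNA sequences equals the number of protein sequences, which is the practical benefit of trinucleotide synthesis noted in the text.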

[0162] Alternatively, one can include CDR1s of length 7 and/or 14. For length 7, a preferred diversity is (S/T).sub.1(S/G/x).sub.2(S/G/x).sub.3Y.sub.4Y.sub.5W.sub.6(S/G/x).sub.7 (SEQ ID NO:107); where (S/T) indicates an equimolar mixture of Ser and Thr codons; (S/G/x) indicates a mixture of 0.2025 S, 0.2025 G, and 0.035 for each of A, D, E, F, H, I, K, L, M, N, P, Q, R, T, V, W, Y. Other proportions could be used. The design gives a predominance of Ser and Gly at positions 2, 3, and 7, as occurs in mature HC genes. For length 14, a preferred pattern of diversity is VSGGSISXXXYYWX (SEQ ID NO:1) where X can be any AA type except Cys. This pattern appears to arise by insertions into the GLG sequences (SGGYYWS; SEQ ID NO:110) of 4-30.1 and 4-31 and similar sequences. There is a preference for a hydrophobic residue at position 1 (V or C) with a second insertion of SISXXX (SEQ ID NO:111) between GG and YY. Diversity ONs having CDR1s of length 7 or 14 are synthesized from BspEI to BstXI and introduced into the library in appropriate proportions to the CDR1 of length 5. The components should be incorporated in approximately the ratios in which they are observed in antibodies selected without reference to the length of the CDRs. For example, the sample of 1095 HC genes examined here has them in the ratios (L=5:L=7:L=14::820:175:23::0.80:0.17:0.02).
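Two pieces of arithmetic in paragraph [0162] lend themselves to a quick check: that the (S/G/x) mixture proportions sum to one, and that the counts 820:175:23 give approximately the stated mixing fractions for the three CDR1 lengths. The following Python sketch is a worked illustration under the assumption that the fractions are taken over the 1018 genes having one of the three lengths (rather than over all 1095 genes); the variable names are hypothetical.

```python
import math

# (S/G/x) mixture from paragraph [0162]: Ser and Gly enriched, 17 others equal.
OTHER_RESIDUES = "ADEFHIKLMNPQRTVWY"
sgx_mixture = {"S": 0.2025, "G": 0.2025}
sgx_mixture.update({aa: 0.035 for aa in OTHER_RESIDUES})

mixture_total = sum(sgx_mixture.values())

# Observed CDR1 length counts quoted in the text.
length_counts = {5: 820, 7: 175, 14: 23}
counted = sum(length_counts.values())

# Assumption: fractions are normalized over the three listed lengths only.
length_fractions = {
    length: count / counted for length, count in length_counts.items()
}

print(f"(S/G/x) residues: {len(sgx_mixture)}, total proportion: {mixture_total:.4f}")
for length, fraction in length_fractions.items():
    print(f"CDR1 length {length}: {fraction:.3f}")
```

The computed fractions agree with the quoted 0.80:0.17:0.02 to within rounding, so a library builder could mix the three diversity ONs in those proportions before ligation.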

CDR2

[0163] Diversity at CDR2 was designed with the same considerations: GLG sequences, mature sequences and 3D structure. A preferred length for CDR2 is 17, as shown in Table 18P. Examination of a 3D model suggests that the residues shown as varied in Table 18P are the most likely to interact directly with Ag. Thus a preferred pattern of variegation is: <2>I<2><3>SGG<1>T<1>YADSVKG (SEQ ID NO:2), where <2> indicates a mixture of YRWVGS, <3> is a mixture of P and S, and <1> is a mixture of ADEFGHIKLMNPQRSTVWY (no C). ON-R2V1vg shown in Table 22P embodies this diversity pattern. PCR with ON-R2V1vg, ON-R2top, and ONR2bot gives a dsDNA product of 122 base pairs. Cleavage with BstXI and XbaI removes about 10 bases from each end and produces cohesive ends that can be ligated to similarly cut vector that contains the 3-23 gene shown in Table 18P.
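The variegation pattern in paragraph [0163] can be expanded position by position to confirm its length and to count the amino-acid sequences it nominally encodes. The Python sketch below is illustrative; the parser and its names are hypothetical, and the count is the theoretical number of distinct protein sequences, not the number of transformants a real library would contain.

```python
import re
from math import prod

# Mixtures defined in paragraph [0163].
MIXES = {
    "1": "ADEFGHIKLMNPQRSTVWY",  # any residue except Cys
    "2": "YRWVGS",
    "3": "PS",
}

CDR2_PATTERN = "<2>I<2><3>SGG<1>T<1>YADSVKG"


def expand(pattern: str, mixes: dict) -> list:
    """Return the list of allowed residues at each position of the pattern.

    A token like <2> is replaced by its mixture; a bare letter is a fixed
    residue and contributes a single-character menu.
    """
    tokens = re.findall(r"<(\d)>|([A-Z])", pattern)
    return [mixes[mix] if mix else fixed for mix, fixed in tokens]


positions = expand(CDR2_PATTERN, MIXES)
cdr2_length = len(positions)
varied_positions = [i + 1 for i, menu in enumerate(positions) if len(menu) > 1]
theoretical_diversity = prod(len(menu) for menu in positions)

print("CDR2 length:", cdr2_length)
print("varied positions:", varied_positions)
print("theoretical diversity:", theoretical_diversity)
```

The same `expand` helper accepts any other pattern written in this angle-bracket notation, provided the corresponding mixtures are added to the dictionary.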

[0164] An alternative pattern would include the variability seen in mature CDR2s as shown in Table 21P: <1>I<4><1><1>G<5><1><1><1>YADSVKG (SEQ ID NO:3), where <4> indicates a mixture of DINSWY, and <5> indicates a mixture of SGDN. This diversity pattern is embodied in ON-R2V2vg shown in Table 22P. For either case, the variegated ONs would be synthesized so that fragments of dsDNA containing the BstXI and XbaI sites can be generated by PCR.
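The theoretical amino-acid diversity of each CDR2 pattern is the product of the mixture sizes at its variegated positions. The following Python sketch is an illustration under that assumption; the helper name `diversity` and the `MIX` table are hypothetical, with the mixture definitions taken from the two CDR2 paragraphs.

```python
import re
from math import prod

# Mixtures as defined for the CDR2 patterns.
MIX = {
    "1": "ADEFGHIKLMNPQRSTVWY",  # all AA types except Cys
    "2": "YRWVGS",
    "3": "PS",
    "4": "DINSWY",
    "5": "SGDN",
}

def tokens(pattern):
    """Split a pattern such as '<2>I<3>S' into mixture tags and fixed residues."""
    return re.findall(r"<(\d)>|([A-Z])", pattern)

def diversity(pattern):
    """Number of distinct amino-acid sequences the pattern can encode."""
    return prod(len(MIX[tag]) if tag else 1 for tag, _ in tokens(pattern))

PATTERN_SEQ2 = "<2>I<2><3>SGG<1>T<1>YADSVKG"           # SEQ ID NO:2
PATTERN_SEQ3 = "<1>I<4><1><1>G<5><1><1><1>YADSVKG"     # SEQ ID NO:3

div2 = diversity(PATTERN_SEQ2)   # 6 * 6 * 2 * 19 * 19
div3 = diversity(PATTERN_SEQ3)   # 19**6 * 6 * 4
```

Both patterns have 17 positions, matching the preferred CDR2 length; the second pattern allows far more sequences than the first because it varies eight positions instead of five.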

[0165] Alternatively, one can allow shorter or longer CDR2s. Table 22P shows ON-R2V3vg, which embodies a CDR2 of length 16, and ON-R2V4vg, which embodies a CDR2 of length 19. ON-R2V3vg is PCR amplified with ON-R2top and ON-R2bo3, while ON-R2V4vg is amplified with ON-R2top and ONR2-bo4.

Analysis of HC CDR3:

[0166] CDR3s of HC vary in length and in sequence. About half of human HCs consist of the components: V::nz::D::ny::JHn where V is a V gene, nz is a series of bases (mean 12) that are essentially random, D is a D segment, often with heavy editing at both ends, ny is a series of bases (mean 6) that are essentially random, and JH is one of the six JH segments, often with heavy editing at the 5' end. In HCs that have no identifiable D segment, the structure is V::nz::JHn where JH is usually edited at the 5' end. Our goal is to mimic the diversity of CDR3, but not to duplicate it (which would be impossible). The D segments appear to provide spacer segments that allow folding of the IgG. The greatest diversity is at the junctions of V with D and of D with JH. The planned CDR3 library will consist of several components. Some of these will have only sequence diversity. Others will have sequence diversity with embedded D segments to extend the length while incorporating sequences known to allow Igs to fold.

[0167] There are many papers on D segments. Corbett et al. (1997) show which D segments are used in which reading frames. My analysis basically confirms their findings. They did not report, however, the level of editing of each D segment and this information is needed for design of an effective library.

[0168] The following diversified sequences would be incorporated in the indicated proportions: "1" stands for 0.095 [G, Y] and 0.048 [A, D, E, F, H, I, K, L, M, N, P, Q, R, S, T, V, W]; double dose of Gly and Tyr plus all other AAs except Cys at equal level.

[0169] The amount of each component is assigned from the tabulation of lengths of the collection of natural VH genes. Component 1 represents all the genes having length 0 to 8 (counting from the YYCAR (SEQ ID NO:112) motif to the WG dipeptide motif). Component 2 corresponds to all the chains having length 9 or 10. Component 3 corresponds to the genes having length 11 or 12 plus half the genes having length 13. Component 4 corresponds to those having length 14 plus half those having length 13. Component 5 corresponds to the genes having length 15 and half of those having length 16. Component 6 corresponds to genes of length 17 plus half of those with length 16. Component 7 corresponds to those with length 18. Component 8 corresponds to those having length 19 and greater.
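The binning rule above can be applied directly to a length tabulation. The Python sketch below assumes that the relevant tabulation is the all-sequences distribution of Table 8P (the N row, zero counts omitted); that choice and the names `N_BY_LENGTH` and `components` are assumptions made for illustration.

```python
# Counts of CDR3 lengths (YYCAR to WG), all sequences, from Table 8P.
# Lengths with a count of zero are omitted.
N_BY_LENGTH = {
    0: 6, 3: 4, 4: 2, 5: 9, 6: 13, 7: 38, 8: 61, 9: 88, 10: 101,
    11: 118, 12: 154, 13: 150, 14: 118, 15: 125, 16: 105, 17: 84,
    18: 61, 19: 46, 20: 42, 21: 16, 22: 17, 23: 7, 24: 9, 25: 2,
    26: 1, 28: 2, 29: 1, 37: 1, 42: 1, 46: 1,
}

def n(length):
    """Number of sequences with the given CDR3 length."""
    return N_BY_LENGTH.get(length, 0)

# Lengths 13 and 16 are split evenly between neighbouring components.
components = {
    1: sum(n(L) for L in range(0, 9)),        # lengths 0 to 8
    2: n(9) + n(10),
    3: n(11) + n(12) + n(13) / 2,
    4: n(14) + n(13) / 2,
    5: n(15) + n(16) / 2,
    6: n(17) + n(16) / 2,
    7: n(18),
    8: sum(n(L) for L in range(19, 47)),      # length 19 and greater
}

total = sum(N_BY_LENGTH.values())
fractions = {k: v / total for k, v in components.items()}
```

Under this assumption component 1 receives 133 of 1383 sequences, a little under 10% of the collection, before any adjustment of the composition.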

[0170] The composition has been adjusted because the first component is not complex enough to justify including it as 10% of the library. If the final library were to be 1.0E9, then 1.0E8 sequences would come from component 1, but it has only 2.6E5 CDR3 sequences, so that each one would occur in about 385 CDR1/2 contexts. I think it better to have this short CDR3 diversity occur in about 77 CDR1/2 contexts and have the other, longer CDR3s occur more often.
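The arithmetic behind the two context counts can be written out as follows. This is a worked example only; the function name `contexts_per_cdr3` is hypothetical, and the 2% share is back-calculated from the figure of about 77 contexts rather than stated in the text.

```python
LIBRARY_SIZE = 1.0e9        # planned number of library members
COMPONENT1_CDR3 = 2.6e5     # distinct CDR3 sequences in component 1

def contexts_per_cdr3(share):
    """Average number of CDR1/2 contexts in which each component-1 CDR3 appears."""
    return share * LIBRARY_SIZE / COMPONENT1_CDR3

at_ten_percent = contexts_per_cdr3(0.10)   # unadjusted share
at_two_percent = contexts_per_cdr3(0.02)   # assumed share implied by ~77 contexts
```

Reducing the share of component 1 by a factor of five reduces the redundancy of each short CDR3 by the same factor and frees that part of the library for the longer components.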

[0171] The ONs would be PCR amplified with the primers CtprmA and CBprmB, cut with AflII and BstEII, and ligated to similarly cut V3-23.

[0172] This set of components was designed after studying the sequences of 1383 human HC sequences as described below. The proposed components are meant to fulfill the goals:

1) approximately the same distribution of lengths as seen in real Ab genes, 2) high level of sequence diversity at places having high diversity in real Ab genes, and 3) incorporation of constant sequences often seen in real Ab genes.

[0173] Note that the design uses JH4 (YFDYWGQGTLVTVSS; SEQ ID NO:20), which is found more often, instead of JH3 (AFDIWGQGTMVTVSS; SEQ ID NO:21). This involves three changes in AA sequence, shown as double underscored bold. An alternative JH segment is shown.

How the Library Components were Designed:

[0174] The processing of sequence data was accomplished by a series of custom-written FORTRAN programs, each of which carries out a fairly simple transformation on the data and writes its results as one or more ASCII files. The next program then uses these files as input.

[0175] A set of 2049 human heavy-chain genes was selected from the version of GenBank that was available at Dyax on the Sun server on 26 Jun. 2000. A program named "Reformat" changed the format of the files to that of GenBank from the GCG format, creating one file per sequence. A second program named "IDENT_CDR3" processed each of these files as follows. Files were tested for duplication of previous entries, and duplicates were discarded. Each reading frame was tested. Most entries had a single open reading frame (ORF), none had two, and some had none. Entries with multiple stops in every reading frame were discarded because this indicates poor quality of sequencing. The sequence was written in triplets in the ORF or in all three reading frames if no ORF was found. The sequence was examined for three motifs: a) AA sequence="YYCxx", b) DNA sequence="tgg ggc" (=WG), and c) DNA sequence="g gtc acc" (=BstEII). FR3 ends with a conserved motif YYCAR or a close approximation. When writing the DNA sequence, IDENT_CDR3 prints the DNA mostly in lower case. Cysteine codons (TGT or TGC) are printed in uppercase. When the motif "tay tay tgy" is found, IDENT_CDR3 starts a new line that contains "< > xxx xxx xxx xxx xxx" where the xxx's stand for the actual five codons that encode YYC and the next two codons (most often AR or AK). The following DNA is printed in triplets on new lines. A typical processed entry appears as in Table 1P.

[0176] Following the YYC motif, IDENT_CDR3 seeks the sequence "TGG GGC" (the "WG" motif) in the correct reading frame; a match of at least 5 of the 6 bases is counted as a hit. If found, the DNA is made uppercase. Following the WG motif (if found) or the YYC motif (if no WG found), IDENT_CDR3 seeks the sequence "G GTC ACC" (the BstEII site) in the correct reading frame; a match of at least 6 of the 7 bases is counted as a hit. If found, the bases are made upper case. If either the WG or the BstEII motif is not found, a note is inserted saying that the feature was not identified. The output of IDENT_CDR3 was processed by hand. In many cases, the lacking YYC motif could be seen as a closely related sequence, such as YFC, FYC, or HYC. When this was supported by an appropriately positioned WG and/or BstEII site, the effective YYC site was marked and the sequence retained for further analysis. If the YYC motif could not be identified or if the WG or BstEII sites could not be found, the entry was discarded. For example, the entry in Table 2P had no automatically identified YYC motif.
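The motif search performed by IDENT_CDR3 can be sketched in Python. This is not the original FORTRAN program; it is a minimal reconstruction that assumes the reading frame is already known, and the function names (`find_yyc`, `find_fuzzy`, `locate_motifs`) are hypothetical. The example fragments are taken from the entries of Tables 1P and 2P.

```python
import re

# "tay tay tgy" encodes Tyr-Tyr-Cys (y = t or c).
YYC_RE = re.compile(r"ta[tc]ta[tc]tg[tc]")

def find_yyc(dna, frame=0):
    """Index of the first in-frame 'tay tay tgy', or None."""
    for i in range(frame, len(dna) - 8, 3):
        if YYC_RE.match(dna, i):
            return i
    return None

def find_fuzzy(dna, motif, start, min_hits, frame=0, phase=0):
    """First index >= start, at the given codon phase, where at least
    min_hits bases of the motif match."""
    for i in range(start, len(dna) - len(motif) + 1):
        if (i - frame) % 3 != phase:
            continue
        hits = sum(a == b for a, b in zip(dna[i:i + len(motif)], motif))
        if hits >= min_hits:
            return i
    return None

def locate_motifs(dna, frame=0):
    """Locate YYC, WG and BstEII; returns None when YYC is not found."""
    dna = dna.lower().replace(" ", "")
    yyc = find_yyc(dna, frame)
    if yyc is None:
        return None                      # corresponds to "YYC not found"
    after_fr3 = yyc + 15                 # YYC plus the next two codons
    wg = find_fuzzy(dna, "tggggc", after_fr3, 5, frame, phase=0)
    bst_start = wg + 6 if wg is not None else after_fr3
    # "G GTC ACC" starts on the third base of a codon, hence phase 2.
    bst = find_fuzzy(dna, "ggtcacc", bst_start, 6, frame, phase=2)
    return {"yyc": yyc, "wg": wg, "bsteii": bst}

# 3' end of the Table 1P entry, starting in frame at "gac acg gct gtg".
TABLE_1P_END = ("gac acg gct gtg tat tac tgt gcg aca ggt aat cgc ctg gaa atg "
                "gct gca att aac tgg ggc caa gga acc ctg gtc acc")
# Corresponding region of the Table 2P entry (encodes YHC, not YYC).
TABLE_2P_PART = "gac acg gcc gtg tat cac tgt gcg agt gag gga tgg"

hit = locate_motifs(TABLE_1P_END)
miss = locate_motifs(TABLE_2P_PART)
```

In this sketch the Table 2P fragment returns None, which is the case that was resolved by hand when a closely related sequence such as YHC was visible.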

[0177] The double underscored sequence encodes YHCAS and is taken as the end of FR3. Note that there is a WG motif at bases 403-408 (bold upper case) and a BstEII site at bases 420-426 (bold upper case). Using WordPerfect, I first made all occurrences of TGC and TGT bold. I then searched for "YYC not found". If I could see the "YYC"-related sequence quickly, I edited the entry so that a YYC was shown. The entry above would be converted to that shown in Table 3P. This processing reduced the list of entries to 1669.

[0178] A third program named "New_DJ" processed the output of IDENT_CDR3. The end of the YYC motif (including the two codons following TGy=Cys) was taken as the end of FR3. The WG motif was taken as the end of the region that might contain a D segment. If WG was not observed and BstEII was, the WG site was assumed to be 17 bases upstream of BstEII. Using the WG motif for alignment, the sequence was compared to each human GLG JH segment (1-6) and the best one identified (New_DJ always assigned a JH segment). Starting from the WG motif of JH and moving toward the 5' end, the program looked for the first codon having more than one mismatch. The region from YYCxx (SEQ ID NO:113) to this codon was taken as the region that might contain a D segment.

[0179] The region that might contain a D segment was tested against all the germ-line genes (GLGs) of human D segments and the best D segment was identified. The scoring involved matching the observed sequence to the GLG sequence in all possible ways. Starting at each base, multiply by 4 for a match and divide by 4 for a mismatch. Record the maximum value obtained for this function. The match was deemed significant if 7/7, 8/9, 9/11, etc. or more bases matched. Of the 1383 sequences examined for D segments,
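The scoring rule (multiply by 4 for a match, divide by 4 for a mismatch, keep the maximum over all starting and ending points) can be sketched as below. This Python version is an illustration, not the original program: it tracks the exponent of 4 as an integer, and the significance threshold of 4 to the 7th power is inferred from the observation that 7/7, 8/9, and 9/11 matches all give that value. The sequences are the D#13 GLG and observed sequence shown in Table 5P.

```python
def d_segment_score(observed, glg):
    """Best match score over every relative offset, start and end point.

    The running exponent rises by 1 per match and falls by 1 per mismatch;
    resetting it to 0 when it goes negative is equivalent to trying a new
    starting point. The score is 4 raised to the best exponent.
    """
    best = 0
    for offset in range(-(len(glg) - 1), len(observed)):
        run = 0
        for j, base in enumerate(glg):
            i = offset + j
            if i < 0 or i >= len(observed):
                continue
            run += 1 if observed[i] == base else -1
            if run < 0:
                run = 0
            best = max(best, run)
    return 4 ** best

SIGNIFICANT = 4 ** 7   # inferred threshold: 7/7, 8/9, 9/11 all equal 4**7

OBSERVED_D = "gatcgccacaattactatgatagtagtgggtcatactcc"   # Table 5P, observed
GLG_D13 = "gtattactatgatagtagtggttattactac"              # Table 5P, D#13

score = d_segment_score(OBSERVED_D, GLG_D13)
is_significant = score >= SIGNIFICANT
```

For this pair the best run has a net exponent of 21, which corresponds to the score of 4.3980E+12 reported in Table 5P.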

[0180] "Assign_D" processes the output of New_DJ. For each sequence that had a significant match with a GLG D segment, a file was written containing the putative D segment, the DJ segment, the identified GLG D segment, the identified JH segment, the phase of the match between observed and GLG gene. For example, "D1_1-01_Phz0_hsa239356.txt" is a file recording the match of entry hsa239356 with D1-01 in phase 0. The file contains the information shown in Table 4P. The final DV of the second sequence immediately precedes the WG in JH and is ascribed to JH3. Other files that begin D1_1-01_Phz0 match the same GLG D segment and these can be aligned by sliding amino-acid sequences across each other.

[0181] Table 5P shows how sequence hs6d4xb7 is first assigned to JH4 and then to D3-22. Note that the DNA sequence TGGGGG is aligned to the TGG GGC of the GLG and that the sequence is truncated on the left to fit. The program finds that JH4 has the best fit (5 misses and 18 correct out of 23). From the right, the program sees that DYWGQ (underscored) come from JH, but then the match drops off and the rest of the sequence on the left comes either from added bases or a D segment.
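The JH assignment illustrated in Table 5P can be reproduced with a short Python sketch. It aligns the observed sequence and each GLG JH on their WG motifs, truncates on the left to the shorter of the two, and counts mismatches. The GLG JH sequences are copied from Table 5P; the function name and the choice of the highest matched fraction as the selection criterion are assumptions.

```python
# GLG JH segments as printed in Table 5P (alignment dashes removed).
JH_GLG = {
    "JH1": "gctgaatacttccagcactggggccagggcaccctggtcaccgtctcctcag",
    "JH2": "ctactggtacttcgatctctggggccgtggcaccctggtcactgtctcctcag",
    "JH3": "tgatgcttttgatatctggggccaagggacaatggtcaccgtctcttcag",
    "JH4": "actactttgactactggggccagggaaccctggtcaccgtctcctcag",
    "JH5": "acaactggttcgacccctggggccagggaaccctggtcaccgtctcctcag",
    "JH6": "attactactactactacggtatggacgtctggggccaagggaccacggtcaccgtctcctcag",
}

OBSERVED = "tatgatagtagtgggtcatactccgactactgggggcag"   # hs6d4xb7, Table 5P
OBSERVED_WG = 30   # 0-based index of the WG motif (TGGGGG) in OBSERVED

def compare_to_jh(observed, obs_wg, jh):
    """Return (misses, bases compared) with both sequences aligned on WG."""
    jh_wg = jh.index("tggggc")
    left = min(obs_wg, jh_wg)                              # truncate on the left
    right = min(len(observed) - obs_wg, len(jh) - jh_wg)   # and on the right
    a = observed[obs_wg - left:obs_wg + right]
    b = jh[jh_wg - left:jh_wg + right]
    misses = sum(x != y for x, y in zip(a, b))
    return misses, len(a)

results = {name: compare_to_jh(OBSERVED, OBSERVED_WG, seq)
           for name, seq in JH_GLG.items()}
# Assumed criterion: the JH with the highest fraction of matching bases.
best = max(results, key=lambda name: 1 - results[name][0] / results[name][1])
```

For JH4 the comparison gives 5 misses in 23 bases, the same Miss and Nt values listed in Table 5P.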

[0182] The lower part of Table 5P shows that the possible D segment is a very good match to D #13 (3-23).

[0183] Of 1383 files accepted by Assign_D, 757 had identifiable D segments. The tally of JHs in Table 6P shows that JH4 is by far the most common.

[0184] JH4 is most common, JH6 next, followed by JH3 and JH5. JH1 and JH2 are seldom used. Table 7P shows the length distributions of each JH class; they do not differ significantly class to class. These lengths count only amino-acids that are not accounted for by JH and so are shorter than the lengths given in Table 8P, which cover from YYCAR (SEQ ID NO:112) to WG.

[0185] Table 8P contains the distribution of lengths for a) all the CDR3 segments, b) the CDR3 segments with identified D segments, and c) the CDR3 segments having no identifiable D segment. The CDR3s with identifiable D segments (13.9) are systematically longer than are those that lack D segments (11.2).
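The medians quoted in Table 8P can be recovered from the tabulated counts by linear interpolation of the cumulative count between successive integer lengths. The source does not state how the medians were computed; the interpolation convention in the Python sketch below is an assumption, adopted because it reproduces the tabulated values. Counts are taken from the N rows of Table 8P with zero entries omitted.

```python
ALL = {0: 6, 3: 4, 4: 2, 5: 9, 6: 13, 7: 38, 8: 61, 9: 88, 10: 101,
       11: 118, 12: 154, 13: 150, 14: 118, 15: 125, 16: 105, 17: 84,
       18: 61, 19: 46, 20: 42, 21: 16, 22: 17, 23: 7, 24: 9, 25: 2,
       26: 1, 28: 2, 29: 1, 37: 1, 42: 1, 46: 1}
WITH_D = {0: 3, 6: 3, 7: 9, 8: 21, 9: 15, 10: 39, 11: 64, 12: 77, 13: 97,
          14: 72, 15: 77, 16: 75, 17: 63, 18: 45, 19: 35, 20: 38, 21: 15,
          22: 15, 23: 6, 24: 9, 25: 2, 26: 1, 28: 1, 29: 1, 37: 1, 46: 1}
NO_D = {0: 3, 3: 4, 4: 2, 5: 9, 6: 10, 7: 29, 8: 40, 9: 73, 10: 62,
        11: 54, 12: 77, 13: 53, 14: 46, 15: 48, 16: 30, 17: 21, 18: 16,
        19: 11, 20: 4, 21: 1, 22: 2, 23: 1, 28: 1, 42: 1}

def interpolated_median(counts):
    """Length at which the cumulative count reaches half the total,
    interpolating linearly between lengths L-1 and L."""
    half = sum(counts.values()) / 2
    cumulative = 0
    for length in range(max(counts) + 1):
        n = counts.get(length, 0)
        if n and cumulative + n >= half:
            return (length - 1) + (half - cumulative) / n
        cumulative += n

median_all = interpolated_median(ALL)
median_with_d = interpolated_median(WITH_D)
median_no_d = interpolated_median(NO_D)
```

With this convention the three distributions give approximately 12.65, 13.90, and 11.17, so the CDR3s with an assigned D segment are about 2.7 residues longer at the median than those without.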

[0186] The identified CDR3 segments can be collated in two ways: aligned to the left (looking for a pattern following YYCAR; SEQ ID NO:112) or aligned to the right (looking for a pattern preceding WG). Table 9P shows the collation of left-aligned sequences while Table 10P shows the right-aligned sequences. For each position, I have tabulated the frequency of each AA type (A-M in the first block and N-Y in the second). The column headed "#" shows how many sequences have some AA at that position. The final column shows all of the AA types seen at that position with the most frequent first and the least frequent last. In the left-aligned sequences, we see that Gly is highly over-represented in the first seven positions while Tyr is over-represented at positions 8-16.
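The left-aligned and right-aligned collations can be sketched in a few lines of Python. This is an illustration of the tallying idea, not the original program; the function names are hypothetical and the three short sequences are made-up examples, not entries from the collection.

```python
from collections import Counter

def tally(seqs, align="left"):
    """Per-position amino-acid counts for left- or right-aligned sequences."""
    width = max(len(s) for s in seqs)
    columns = [Counter() for _ in range(width)]
    for s in seqs:
        pad = 0 if align == "left" else width - len(s)
        for i, aa in enumerate(s):
            columns[pad + i][aa] += 1
    return columns

def summarize(column):
    """Return (# of sequences with a residue here, AA types ranked by frequency)."""
    n = sum(column.values())
    ranked = "".join(aa for aa, _ in column.most_common())
    return n, ranked

# Hypothetical CDR3 segments, for illustration only.
toy = ["GGYDY", "GSYYFDY", "DGYFDY"]

left = tally(toy, "left")
right = tally(toy, "right")
first_left = summarize(left[0])      # pattern following YYCAR
last_right = summarize(right[-1])    # pattern preceding WG
first_right = summarize(right[0])    # only the longest sequence reaches here
```

Left alignment exposes trends near the V junction, while right alignment exposes the JH-derived residues, which is why both collations are tabulated.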

[0187] In Table 11P, I have tabulated the AA frequencies for the sequences having between 7 and 15 AAs between YYCAR (SEQ ID NO:112) and WG. The last four positions can be viewed as coming from JH and so would be given lower levels of diversity than would earlier positions. From these tabulations, I conclude that most AA types are allowed at all the positions, but there is a fairly strong tendency to have Gly at the early positions and to end in Asp-Tyr (DY). We could use these tendencies in designing a pattern of variegation. I would not exclude any AA except Cys, but I might increase the frequency of Gly in the first several positions and Tyr in the last few.

[0188] There are 80 sequences (5.8%) having a pair of cysteines in CDR3. It is more surprising that 53 (3.8%) have a single Cys in CDR3.

[0189] MS-DOS was used to make a list of the files written by Assign_D. "Filter" converts the output of MS-DOS Dir into a form that can be read into WordPerfect and sorted to bring files belonging to the same D region together.

[0190] "Filter2" collects the sequences and produces a draft table of sequences, grouped by the D-segment used, and written so that the sequences can be aligned. The output of Filter2 was edited by hand. For each group, the translation of the GLG was inserted and the collection of observed sequences was aligned to the conserved part of the GLG. "Filter3" collated the aligned sequences. Table 12P shows an example of an alignment and the tabulation of AA types. The entries are as follows: "Entry" is the name used in the data base, "Seq1" is the sequence from the YYCAR (SEQ ID NO:112) motif to the first amino acid not assigned to JH and "L1" is the length of the segment. The segments are shown aligned to the identified D segment. "Seq2" is the sequence from the YYCAR (SEQ ID NO:112) motif to the WG motif (i.e. including part of JH) and "L2" is the length of that sequence. "JH" is the identified JH segment for this sequence. "P" is the phase of the match. For positive values of P, P bases are found in the observed sequence that do not correspond to any from the GLG, i.e. the observed sequence has had that many bases inserted. For negative values of P, there are |P| bases in the GLG sequence for which there are no corresponding bases in the observed sequence. "Score" is approximately 1/(probability of accidental match). This is calculated by looking at all possible alignments. For each alignment, the score is first set to 1.0. Base by base, the score is multiplied by 4 if the bases match and divided by 4 if they do not. This is done for all starting points and ending points and the maximum value is recorded.

[0191] Table 13P is a summary of how often each D segment was identified and in which reading frame. I have not been consistent with Corbett et al. in assigning the phases of the GLG D segments. The MRC Web page that I took the GLGs from did not have D segments D1-14, D4-11, D5-18, or D6-25. None of these contribute to any great extent and this omission is unlikely to have any serious effect on the conclusions. The column headed "%" contains the percentage of the sequences examined here. The column headed "C %" contains the percentage reported by Corbett et al. I assume that the data used in Corbett et al. is mostly included in my collection. Nevertheless, the observed frequencies differ in detail. For example, my compilation shows that 10.7% of the collection contains a D segment encoding two cysteines while they have only 4.16% in this category. In D3 phase "0", I see 19.4% of the collection while they report 11.8%.

[0192] The most common actual D segments were further analyzed. The GLGs are heavily edited at either end. The observed sequences were aligned. For each D-segment having more than seven examples, Filter3 produced a table of the frequency of each amino-acid type at each position. From these tabulations, library components shown in Table 17P were designed. At each position where at least half the examples have an amino acid, I entered either the dominant AA type or "x". An AA type was "dominant" if it occurred more than 50% of the time. L is the length and f is the number of sequences observed that have related sequences.
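The rule for turning an alignment into a component pattern can be sketched as follows. This Python version is illustrative only: the function name is hypothetical, "-" is assumed to mark an alignment gap, dominance is assumed to be measured among the sequences that have a residue at the position, and the four aligned sequences are made-up examples.

```python
from collections import Counter

def component_pattern(aligned, gap="-"):
    """Consensus pattern: dominant AA, or 'x', at each sufficiently occupied position."""
    n_examples = len(aligned)
    width = max(len(s) for s in aligned)
    pattern = []
    for pos in range(width):
        residues = [s[pos] for s in aligned if pos < len(s) and s[pos] != gap]
        if 2 * len(residues) < n_examples:
            continue                      # fewer than half the examples have an AA here
        aa, count = Counter(residues).most_common(1)[0]
        # Dominant means more than 50% of the residues seen at this position.
        pattern.append(aa if 2 * count > len(residues) else "x")
    return "".join(pattern)

# Hypothetical aligned D-segment examples, for illustration only.
toy_alignment = ["-GYSS", "AGYCS", "-GFTS", "-GYA-"]
pattern = component_pattern(toy_alignment)
```

In the toy alignment the first column is dropped because only one of four examples has a residue there, and the fourth column becomes "x" because no single type exceeds 50%.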

[0193] Table 14P shows possible library components for a library of CDR3s. "L" is the length of the insert and "f" is the frequency of the motif in the assayed collection. Table 17P shows vgDNA that embodies each of the components shown in Table 14P. In Table 17P, the oligonucleotides (ON) Ctop25, CtprmA, CBprmB, and CBot25 allow PCR amplification of each of the variegated ONs (vgDNA): C1t08, C2t10, C3t12, C4t14, C5t15, C6t17, C7t18, and c8t19. After amplification, the dsDNA can be cleaved with AflII and BstEII (or KpnI) and ligated to similarly cleaved vector that contains the remainder of the 3-23 synthetic domain. Preferably, this vector already contains diversity in CDR1 and CDR2 as disclosed herein. Preferably, the recipient vector contains a stuffer in place of CDR3 so that there will be no parental sequence that would then occur in the resulting library. Table 50P shows a version of the V3-23 gene segment with each CDR replaced by a short segment that contains both stop codons and restriction sites that will allow specific cleavage of any vector that does not have the stuffer removed. The stuffer can be short and contain a restriction enzyme site that will not occur in the finished library, allowing removal of vectors that are not cleaved by both AflII and BstEII (or KpnI) and religated. Alternatively, the stuffer could be 200-400 bases long so that uncleaved or once cleaved vector can be readily separated from doubly cleaved vector.

[0194] In the vgDNA for HC CDR3, <1> means a mixture comprising 0.27 Y, 0.27 G, and 0.027 of each of the amino-acid codons {A, D, E, F, H, I, K, L, M, N, P, Q, R, S, T, V, W}; <2> means an equimolar mixture of K and R; and <3> means an equimolar mixture of S and G.

Analysis of Human Kappa Light Chains and Preferred Variegation Scheme:

[0195] A collection of 285 human kappa chains was assembled from the public data base. Table 27 shows the names of the entries used. The GLG sequences of nine bases at each end of the framework regions were used to find the FR/CDR junctions. Only in cases where all six junctions could be found was the sequence included. Table 25P shows the distribution of lengths in CDRs in human kappas. CDR1s with lengths of 11, 12, 13, 16, and 17 were observed with 11 being predominant and 12 well represented. CDR2 exhibits only length 7. CDR3 exhibits lengths of 1, 4, 6, 7, 8, 9, 10, 11, 12, 13, and 19. Essentially all examples are in the 8, 9, or 10 length groups. Table 26P shows the distribution of V and J genes seen in the sample. A27 is the most common V and JK1 is the most common J. Thus, a suitable synthetic kappa gene comprises A27 joined to JK1. Table 30P shows a suitable synthetic kappa chain gene, including a PlacZ promoter, ribosome-binding site, and signal sequence (M13 III signal). The DNA sequence encodes the GLG amino-acid sequence, but does not comprise the GLG DNA sequence. Restriction sites are designed to fall within each framework region so that diversity can be cloned into the CDRs. XmaI and EspI are in FR1, SexAI is in FR2, RsrII is in FR3, and KpnI (or Acc65I) is in FR4. Additional sites are provided in the constant kappa chain to facilitate construction of the gene.

[0196] Table 30P also shows a suitable scheme of variegation for kappa. In CDR1, a preferred length is 11 codons. The A27 GLG has a CDR1 of 12 codons, but the sample of mature kappa chains has length 11 predominating. One could also introduce a component of kappas having length 12 in CDR1 by introducing codon 52 as <2> (i.e. a Ser-biased mixture). CDR2 of kappa is always 7 codons. Table 31P shows a tally of 285 CDR2s and a preferred variegation scheme for CDR2. The predominant length of CDR3 in kappa chains is 9 codons. Table 32P shows a tally of 166 CDR3s from human kappas and a preferred variegation scheme (which is also shown in Table 30P).

Analysis of Lambda Chains and Preferred Variegation Scheme:

[0197] A collection of 158 lambda sequences was obtained from the public data base. Of these, 93 contained sequences in which the FR/CDR boundaries could be identified automatically. Table 33P shows the distribution of lengths of CDRs.

Method of Construction:

[0198] The diversity of HC, kappa, and lambda are best constructed in separate vectors. First a synthetic gene is designed to embody each of the synthetic variable domains. The light chains are bounded by restriction sites for ApaLI (positioned at the very end of the signal sequence) and AscI (positioned after the stop codon). The heavy chain is bounded by SfiI (positioned within the PelB signal sequence) and NotI (positioned in the linker between CH1 and the anchor protein). The initial genes are made with "stuffer" sequences in place of the desired CDRs. A "stuffer" is a sequence that is to be cut away and replaced by diverse DNA but which does not allow expression of a functional antibody gene. For example, the stuffer may contain several stop codons and restriction sites that will not occur in the correct finished library vector. In Table 40P, the stuffer for CDR1 of kappa A27 contains a StuI site. The vgDNA for CDR1 is introduced as a cassette from EspI, XmaI, or AflII to either SexAI or KasI. After the ligation, the DNA is cleaved with StuI; there should be no StuI sites in the desired vectors.

REFERENCES

[0199] Corbett, S J, Tomlinson, I M, Sonnhammer, E L L, Buck, D, Winter, G. "Sequences of the Human Immunoglobulin Diversity (D) Segment Locus: A Systematic Analysis Provides No Evidence for the Use of DIR Segments, Inverted D Segments, `Minor` D Segments or D-D Recombination". J Molec Biol (1997) 270:587-597.

TABLES

TABLE-US-00023 [0200] TABLE 1P Typical entry in which YYC motif is found.
++++C: \tmp\haj10335.txt
LOCUS HAJ10335 306 bp mRNA PRI 18-AUG-1998
DEFINITION Homo sapiens mRNA for immunoglobulin heavy chain variable region, clone ELD16/6.
ACCESSION AJ010335
VERSION AJ010335.1 GI: 3445266
Ngene = 306
Stop codons in reading frame 1 49 115 124 253 277
No stops in reading frame 2
Stop codons in reading frame 3 12 60 81 147 204 213
  1 t ttg ggg tcc ctg aga ctc tcc TGT gca gcc tct gga ttc acc
 44 gtc agt agc aac tac atg acc tgg gtc cgc cag gct cta ggg aag
 89 ggg ctg gag tgg gtc tca gtt att tat agc ggt ggt agc aca tac
134 tac gca gac tcc gtg aag ggc gga ttc acc atc tcc aga gac aat
179 tcc aag aac aca ctg tat ctt caa atg aac agc ctg aga ccc gag
224 gac acg gct gtg
    < > TAT TAC TGT gcg aca
251 ggt aat cgc ctg gaa atg gct gca att aac TGG GGC caa gga acc
263 ctG GTC ACC aa (SEQ ID NO: 113)
------------------------------------------------------------------------------

TABLE-US-00024 TABLE 2P entry in which YYC motif was not automatically identified
++C: \tmp\hs202g3.txt
!!NA_SEQUENCE 1.0
LOCUS HS202G3 522 bp mRNA PRI 03-AUG-1995
DEFINITION H. sapiens mRNA for immunoglobulin variable region (clone 202-G3).
ACCESSION Z47259
VERSION Z47259.1 GI: 619470
Ngene = 522
No stops in reading frame 1
Stop codons in reading frame 2 89 110 305 314
Stop codons in reading frame 3 84 192 321 351 369
  1 atg gac tgg acc tgg agg ttc ctc ttt gtg gtg gca gca gct aca
 46 ggt gtc cag tcc cag gtg cag ctg gtg cag tct ggg gct gag gtg
 91 aag aag cct ggg tcc tcg gtg aag gtc tcc TGC aag gct tct gga
136 ggc acc ttc agc agc tat gct atc agc tgg gtg cga cag gcc cct
181 gga caa ggg ctt gag tgg atg gga ggg atc atc cct atc ttt ggt
226 aca gca aac tac gca cag aag ttc cag ggc aga gtc acg att acc
271 gcg gac gaa tcc acg agc aca gcc tac atg gag ctg agc agc ctg
316 aga tct gag gac acg gcc gtg tat cac TGT gcg agt gag gga tgg
361 gag agt TGT agt ggt ggt ggc TGC tac gac ggt atg gac gtc TGG
406 GGC caa ggg acc acG GTC ACC gtc tcc tca gct tcc acc aag ggc
451 cca tcg gtc ttc ccc ctg gcg ccc TGC tcc agg agc acc tct ggg
496 ggc aca gcg gcc ctg ggc TGC ctg (SEQ ID NO: 114)
YYC not found !!!
------------------------------------------------------------------------------

TABLE-US-00025 TABLE 3P Entry of Table 2P after editing.
++C: \tmp\hs202g3.txt
!!NA_SEQUENCE 1.0
LOCUS HS202G3 522 bp mRNA PRI 03-AUG-1995
DEFINITION H. sapiens mRNA for immunoglobulin variable region (clone 202-G3).
ACCESSION Z47259
VERSION Z47259.1 GI: 619470
Ngene = 522
No stops in reading frame 1
Stop codons in reading frame 2 89 110 305 314
Stop codons in reading frame 3 84 192 321 351 369
  1 atg gac tgg acc tgg agg ttc ctc ttt gtg gtg gca gca gct aca
 46 ggt gtc cag tcc cag gtg cag ctg gtg cag tct ggg gct gag gtg
 91 aag aag cct ggg tcc tcg gtg aag gtc tcc TGC aag gct tct gga
136 ggc acc ttc agc agc tat gct atc agc tgg gtg cga cag gcc cct
181 gga caa ggg ctt gag tgg atg gga ggg atc atc cct atc ttt ggt
226 aca gca aac tac gca cag aag ttc cag ggc aga gtc acg att acc
271 gcg gac gaa tcc acg agc aca gcc tac atg gag ctg agc agc ctg
316 aga tct gag gac acg gcc gtg
    <YHCAS> tat cac TGT gcg agt (SEQ ID NO: 116)
    gag gga tgg
361 gag agt TGT agt ggt ggt ggc TGC tac gac ggt atg gac gtc TGG
406 GGC caa ggg acc acG GTC ACC gtc tcc tca gct tcc acc aag ggc
451 cca tcg gtc ttc ccc ctg gcg ccc TGC tcc agg agc acc tct ggg
496 ggc aca gcg gcc ctg ggc TGC ctg (SEQ ID NO: 115)
YYC not found !!!
------------------------------------------------------------------------------

TABLE-US-00026 TABLE 4P contents of file D1_1-01_Phz0_hsa239356.txt
SRGGKYQLAPKGGM (SEQ ID NO: 117)
DRGGKYQLAPKGGMDV (SEQ ID NO: 118)
JH3 D# 1 Phase 15 Score 6.55D+04
------------------------------------------------------------------------------

TABLE-US-00027 TABLE 5P alignment of a CDR3::JH segment to GLG JHs and D-segments.
+c:\tmp\hs6d4xb7.text
                  1    1    2    2    3    3   3
         1234567890    5    0    5    0    5   9
Observed tatgatagtagtgggtcatactccgactacTGGGGGcag (SEQ ID NO: 119)
JH1      ------------gctgaatacttccagcactggggccagggcaccctggtcaccgtctcctcag-- (SEQ ID NO: 120) Miss = 9 Nt = 27
JH2      -----------ctactggtacttcgatctctggggccgtggcaccctggtcactgtctcctcag-- (SEQ ID NO: 121) Miss = 13 Nt = 28
JH3      --------------tgatgcttttgatatctggggccaagggacaatggtcaccgtctcttcag-- (SEQ ID NO: 122) Miss = 14 Nt = 25
JH4      ----------------actactttgactactggggccagggaaccctggtcaccgtctcctcag-- (SEQ ID NO: 123) Miss = 5 Nt = 23
JH5      -------------acaactggttcgacccctggggccagggaaccctggtcaccgtctcctcag-- (SEQ ID NO: 124) Miss = 11 Nt = 26
JH6      -attactactactactacggtatggacgtctggggccaagggaccacggtcaccgtctcctcag-- (SEQ ID NO: 125) Miss = 23 Nt = 38

4        tat gat agt agt ggg tca TAC Tcc GAC TAC TGG GGg CAG (SEQ ID NO: 126)
          Y   D   S   S   G   S   Y   S   D   Y   W   G   Q  (SEQ ID NO: 127)
JH4      --- --- --- --- --- -ac tac ttt gac tac tgg ggc cag gga acc ctg gtc acc gtc tcc tca g-- (SEQ ID NO: 128)
          -   -   -   -   -   -   Y   F   D   Y   W   G   Q   G   T   L   V   T   V   S   S   -  (SEQ ID NO: 129)
Fract = 0.783 = 18 / 23

Matching the rest to D segments:
D#13 --------gtattactatgatagtagtggttattactac GLG (SEQ ID NO: 130)
     gatcgccacaattactatgatagtagtgggtcatactcc Observed (SEQ ID NO: 131)
     --------gt...................t.at....a. . = match
D#13 Phase = 9 Score = 4.3980E+12

TABLE-US-00028 TABLE 6P Number of sequences identified as having JH derived from GLG JHn
JH             1    2    3    4    5    6
# sequences   17   40  198  707  160  261

TABLE-US-00029 TABLE 7P Distribution of CDR3 fragments that might contain D segments. (L is the length; N is the number of examples.)
For JH1
L    0   1   2   3   4   5   6   7   8   9  10  11  12  13
N    0   0   1   1   3   1   1   2   0   3   1   1   1   2
Total = 17 Median = 8.0
For JH2
L    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
N    0   0   0   0   0   2   4   6   2   6   3   4   5   2   3
L   15  16  17  18
N    2   0   0   1
Total = 40 Median = 9.0
For JH3
L    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
N    0   0   2   6  16  12  17  17  15  22  20  20  18  13   4
L   15  16  17  18  19
N    8   3   2   1   2
Total = 198 Median = 8.6
For JH4
L    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
N    0   0   7  15  19  40  63  82  81  77  81  53  57  44  30
L   15  16  17  18  19  20  21  22  23  24  25  26  27  28  29
N   15  23   8   3   5   2   0   1   0   0   0   0   0   0   0
L   30  31  32  33  34  35
N    0   0   0   0   0   1
Total = 707 Median = 8.6
For JH5
L    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
N    0   0   0   3   4   6  13  19  12  14  22  18  10  18  10
L   15  16  17  18  19  20  21  22  23  24  25  26  27  28  29
N    5   1   1   0   0   1   1   0   0   0   0   0   0   0   0
L   30  31  32  33  34  35  36  37  38  39  40  41  42  43  44
N    0   0   0   0   0   0   0   0   0   0   0   0   1   0   0
L   45  46
N    0   1
Total = 160 Median = 9.4
For JH6
L    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
N    2   0   1   2   5  15  20  18  22  29  29  28  23  16  10
L   15  16  17  18  19  20
N   14   9   9   4   2   3
Total = 261 Median = 9.6

TABLE-US-00030 TABLE 8P Lengths of CDR3 segments from YYCAR to WG.
Distribution of lengths from end of FR3 to WG motif, all sequences.
L       0     1     2     3     4     5     6     7     8     9    10
N       6     0     0     4     2     9    13    38    61    88   101
SN      6     6     6    10    12    21    34    72   133   221   322
f    .004  .004  .004  .007  .009  .015  .025  .052  .096  .160  .233
L      11    12    13    14    15    16    17    18    19    20    21
N     118   154   150   118   125   105    84    61    46    42    16
SN    440   594   744   862   987  1092  1176  1237  1283  1325  1341
f    .318  .430  .538  .623  .714  .790  .850  .894  .928  .958  .970
L      22    23    24    25    26    27    28    29    30    31    32
N      17     7     9     2     1     0     2     1     0     0     0
SN   1358  1365  1374  1376  1377  1377  1379  1380  1380  1380  1380
f    .982  .987  .993  .995  .996  .996  .997  .998  .998  .998  .998
L      33    34    35    36    37    38    39    40    41    42    43
N       0     0     0     0     1     0     0     0     0     1     0
SN   1380  1380  1380  1380  1381  1381  1381  1381  1381  1382  1382
f    .998  .998  .998  .998  .999  .999  .999  .999  .999  .999  .999
L      44    45    46
N       0     0     1
SN   1382  1382  1383
f    .999  .999   1.0
Median = 12.65
Distribution of lengths from end of FR3 to WG motif with assigned D.
L       0     1     2     3     4     5     6     7     8     9    10
N       3     0     0     0     0     0     3     9    21    15    39
SN      3     3     3     3     3     3     6    15    36    51    90
f    .004  .004  .004  .004  .004  .004  .008  .019  .046  .065  .115
L      11    12    13    14    15    16    17    18    19    20    21
N      64    77    97    72    77    75    63    45    35    38    15
SN    154   231   328   400   477   552   615   660   695   733   748
f    .196  .294  .418  .510  .608  .703  .783  .841  .885  .934  .953
L      22    23    24    25    26    27    28    29    30    31    32
N      15     6     9     2     1     0     1     1     0     0     0
SN    763   769   778   780   781   781   782   783   783   783   783
f    .972  .980  .991  .994  .995  .995  .996  .997  .997  .997  .997
L      33    34    35    36    37    38    39    40    41    42    43
N       0     0     0     0     1     0     0     0     0     0     0
SN    783   783   783   783   784   784   784   784   784   784   784
f    .997  .997  .997  .997  .999  .999  .999  .999  .999  .999  .999
L      44    45    46
N       0     0     1
SN    784   784   785
f    .999  .999   1.0
Median = 13.90
Distribution of lengths from end of FR3 to WG motif with no assigned D.
L       0     1     2     3     4     5     6     7     8     9    10
N       3     0     0     4     2     9    10    29    40    73    62
SN      3     3     3     7     9    18    28    57    97   170   232
f    .005  .005  .005  .012  .015  .030  .047  .095  .162  .284  .388
L      11    12    13    14    15    16    17    18    19    20    21
N      54    77    53    46    48    30    21    16    11     4     1
SN    286   363   416   462   510   540   561   577   588   592   593
f    .478  .607  .696  .773  .853  .903  .938  .965  .983  .990  .992
L      22    23    24    25    26    27    28    29    30    31    32
N       2     1     0     0     0     0     1     0     0     0     0
SN    595   596   596   596   596   596   597   597   597   597   597
f    .995  .997  .997  .997  .997  .997  .998  .998  .998  .998  .998
L      33    34    35    36    37    38    39    40    41    42
N       0     0     0     0     0     0     0     0     0     1
SN    597   597   597   597   597   597   597   597   597   598
f    .998  .998  .998  .998  .998  .998  .998  .998  .998   1.0
Median = 11.17
L is the length
N is the number of examples
Sum(N) = SN is the sum of the Ns
f is the cumulative fraction seen

TABLE-US-00031 TABLE 9P Tally of left-aligned CDR3 sequences A C D E F G H I K L M # 1 74 6 278 109 11 319 50 18 11 60 8 1383 GDERVASLHTNQPIWYFKMCX 2 50 9 64 32 29 249 43 42 41 109 22 1377 GRPSLDVYTANHIQKEFMWCX 3 81 18 74 39 25 214 29 42 16 83 19 1377 GSYRTVLADPIWEQHNFMCK| 4 70 23 92 49 50 228 23 58 21 70 16 1373 GSYDRVALTIPFEWNCHQKMX 5 86 28 106 32 59 217 21 41 16 72 19 1371 GYSDAVTLRFIPWNECHMQK|X 6 88 17 104 28 94 171 17 48 12 50 17 1362 GYSDFATVRWPLINEQCHMK| 7 69 15 110 21 89 176 22 50 15 81 12 1349 GSYDFVLTAPRWINHEQCKM|X 8 53 19 141 17 90 150 18 47 17 68 11 1311 YSGDFLTVWAPIRNCHEKQM| 9 44 21 120 24 102 174 24 36 20 71 11 1250 YGSDFLNVRTAWPIEHCKQM| 10 39 31 129 23 124 116 23 42 9 58 32 1162 YDFGSLIARPTVWNMCEHQK 11 36 12 158 17 137 83 13 18 10 40 21 1061 YDFGSPLVANWMTRIEHCKQX 12 34 11 164 10 82 74 34 30 1 31 20 943 YDFGPSVAHLINMRTWCEQKX 13 32 2 121 6 84 56 10 26 7 43 32 789 YDFGLSPVAMIWRTHNKQEC 14 23 131 5 59 65 10 16 4 25 34 639 YDGFMVLAPISWNRHTQEKX 15 15 4 107 5 43 42 1 23 20 34 521 YDFGVMILWAPRSENCQTH| 16 4 2 80 3 33 26 4 5 1 10 29 396 YDVFMGPSLNTRIWAHECQ|K 17 3 1 63 19 19 9 13 12 21 291 DYVMFGILHPSTWAQRCNX 18 3 47 16 13 1 4 7 23 207 DYVMFGPSLTIAHN 19 5 1 39 1 4 13 3 3 1 14 146 DYVMGAFHINRSCELPQW 20 2 17 4 5 3 4 12 100 VYDMGFLIPSARWQ 21 17 3 8 1 1 4 58 DVGYMFHINTW 22 1 7 6 1 1 5 42 VDFMYSAGITW 23 9 1 1 1 1 25 DVYGILMPS 24 1 2 1 1 1 18 VYDAHLMPT 25 1 3 9 GVDPSY 26 2 2 7 GMSTV 27 2 1 1 6 DKMST 28 1 1 1 6 VADGS 29 1 4 DPSV 30 1 3 FST 31 1 1 3 KLV 32 1 1 3 FGP 33 1 3 PG 34 1 1 1 3 HLS 35 3 AVW 36 1 1 3 DFP 37 3 PSY 38 1 2 LS 39 1 1 2 AK 40 2 PS 41 2 ST 42 2 S 43 1 1 K 44 1 S 45 1 T 46 1 S 816 220 2186 421 1166 2428 358 568 205 920 421 N P Q R S T V W Y | X # 1 35 23 31 108 63 50 94 16 13 6 1383 GDERVASLHTNQPIWYFKMCX 2 44 114 42 169 114 59 62 21 60 2 1377 GRPSLDVYTANHIQKEFMWCX 3 26 73 37 110 140 97 89 42 122 1 1377 GSYRTVLADPIWEQHNFMCK| 4 48 51 22 79 141 65 77 49 139 2 1373 GSYDRVALTIPFEWNCHQKMX 5 37 41 18 61 157 75 85 38 158 2 2 1371 GYSDAVTLRFIPWNECHMQK|X 6 32 54 
23 67 152 80 78 64 165 1 1362 GYSDFATVRWPLINEQCHMK| 7 44 59 18 58 157 73 85 54 139 1 1 1349 GSYDFVLTAPRWINHEQCKM|X 8 38 48 14 41 167 68 59 59 185 1 1311 YSGDFLTVWAPIRNCHEKQM| 9 52 40 14 47 123 45 48 41 192 1 1250 YGSDFLNVRTAWPIEHCKQM| 10 33 37 12 39 73 36 36 35 235 1162 YDFGSLIARPTVWNMCEHQK 11 33 49 7 20 68 21 37 29 251 1 1061 YDFGSPLVANWMTRIEHCKQX 12 30 53 10 19 45 19 42 18 215 1 943 YDFGPSVAHLINMRTWCEQKX 13 10 34 7 22 40 15 33 25 184 789 YDFGLSPVAMIWRTHNKQEC 14 13 22 6 12 15 10 26 14 148 1 639 YDGFMVLAPISWNRHTQEKX 15 5 12 2 12 12 3 40 20 119 1 521 YDFGVMILWAPRSENCQTH| 16 10 24 3 6 12 7 49 5 82 2 396 YDVFMGPSLNTRIWAHECQ|K 17 1 8 3 2 8 5 42 4 58 1 291 DYVMFGILHPSTWAQRCNX 18 1 13 8 5 31 35 207 DYVMFGPSLTIAHN 19 2 1 1 2 2 24 1 29 146 DYVMGAFHINRSCELPQW 20 3 1 2 3 23 2 19 100 VYDMGFLIPSARWQ 21 1 1 14 1 7 58 DVGYMFHINTW 22 2 1 12 1 5 42 VDFMYSAGITW 23 1 1 5 5 25 DVYGILMPS 24 1 1 5 5 18 VYDAHLMPT 25 1 1 2 1 9 GVDPSY 26 1 1 1 7 GMSTV 27 1 1 6 DKMST 28 1 2 6 VADGS 29 1 1 1 4 DPSV 30 1 1 3 FST 31 1 3 KLV 32 1 3 FGP 33 2 3 PG 34 1 3 HLS 35 1 1 3 AVW 36 1 1 3 DFP 37 1 1 3 PSY 38 1 2 LS 39 2 AK 40 1 1 2 PS 41 1 1 2 ST 42 2 2 S 43 1 K 44 1 1 S 45 1 1 T 46 1 1 S 495 769 270 876 1518 741 1104 540 2572 10 17 18621
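The letter string that closes each row of the tally (for example GDERVASLHTNQPIWYFKMCX for position 1) appears to list the residue types seen at that position from most to least frequent. The following minimal sketch reproduces that string for position 1 of Table 9P. The function name `rank_string` and the tie-breaking rule (ties keep the table's column order) are assumptions made for illustration, not something the tables state.

```python
# Column order of the tally tables; "|" is a stop codon, "X" an unknown residue.
COLUMNS = "ACDEFGHIKLMNPQRSTVWY|X"


def rank_string(counts):
    """Return the observed residue types ordered by decreasing count.

    Ties keep the column order (assumed rule); Python's sort is stable.
    """
    present = [aa for aa in COLUMNS if counts.get(aa, 0) > 0]
    return "".join(sorted(present, key=lambda aa: -counts[aa]))


# Position 1 of Table 9P (left-aligned CDR3s); the "|" column is empty there.
row1 = dict(zip(
    "ACDEFGHIKLMNPQRSTVWYX",
    [74, 6, 278, 109, 11, 319, 50, 18, 11, 60, 8,
     35, 23, 31, 108, 63, 50, 94, 16, 13, 6],
))

print(sum(row1.values()))   # row total, the "#" column
print(rank_string(row1))
```

For this row the counts add up to the 1383 given in the "#" column, and the ranking matches the printed string.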

TABLE-US-00032 TABLE 10P Tally of right-aligned sequences A C D E F G H I K L M 5 1 1 G 6 1 S 7 1 1 G 8 1 1 G 9 2 RV 10 2 RV 11 1 1 2 GI 12 2 V 13 2 TY 14 1 1 3 DGN 15 1 3 ISY 16 1 3 DSY 17 1 3 APY 18 1 1 1 3 DFM 19 2 1 3 DG 20 1 1 3 ILV 21 3 WP 22 3 4 GS 23 2 1 6 GHQSV 24 1 3 1 6 GALR 25 1 2 1 7 DTAIS 26 1 1 1 1 1 1 1 9 ACDGKLMST 27 2 5 1 2 1 1 18 DAGVEILNQRS 28 2 2 3 1 2 25 TGQSDELPRIV 29 3 5 6 7 1 1 1 42 GEDVAPQRSKLMTY| 30 2 9 1 9 1 4 5 2 58 DGRLSIVPAMQTFHNY 31 4 2 19 9 2 18 1 2 1 3 100 DGSERVYALPTCFINHKW 32 10 5 18 5 3 16 3 3 2 14 1 146 DGLRVAPYSTCEQFHINWKM 33 20 18 10 7 34 7 8 2 6 1 207 GARDPSYTEVIFHLQWKM 34 13 4 31 18 9 37 8 16 4 14 4 291 GDRYPVEILASTFHQWCKMNX| 35 17 5 32 23 10 70 12 10 6 25 1 396 GRSDYLEVTPAHNFIWKCQM| 36 23 6 51 21 9 79 19 15 14 36 9 521 GDSYRLTVPAEHIKNFMWCQ| 37 35 12 56 23 15 110 14 17 5 24 4 639 GYDVRSTAPLEIFHNCWQKMX 38 28 19 68 27 29 133 26 31 12 43 7 789 GSYDVRLPTIFAEHCNWKQM 39 51 25 80 27 33 162 16 30 18 55 15 943 GSDRYVLATPFWIECKHMQNK 40 44 14 73 36 46 161 27 32 17 59 8 1061 GSRDYVTLPFAEIWHQNKCM 41 54 21 74 25 23 178 23 52 15 57 11 1162 GSYTDRVLPAIWNQEFHCKMX| 42 57 13 82 40 42 190 14 39 15 82 15 1250 GSYDLVRTANPFEIWQKMHC| 43 75 18 54 25 35 242 13 29 18 49 12 1311 GYSTARVPDLWNFIQECKHM| 44 63 17 79 15 43 197 20 38 14 76 8 1349 YGSTDLRAPVWNFIQHCEKM 45 59 16 69 35 55 165 26 23 23 75 9 1362 YGSLRTDNAFPVWEHIKCQM 46 41 19 125 26 27 208 31 14 16 38 8 1371 YGDSNRWATLPHFECQCKIM 47 160 10 24 13 53 332 36 16 11 40 10 1373 GYAWPSFRLHTVNDIEKCMQX 48 21 4 8 5 680 27 4 44 5 145 288 1377 FMLISGVYPAWTDNQREKCHX 49 23 2 1181 29 1 30 15 4 2 8 1 1377 DGEAHNQSYVLPTIRCKW|FMX 50 7 7 15 42 3 41 135 3 59 4 1383 YVIPSLFHNDTACXMGKQRW| 816 220 2186 421 1166 2428 358 568 205 920 421 N P Q R S T V W Y | X # 5 1 1 G 6 1 S 7 1 G 8 1 G 9 1 1 2 RV 10 1 1 2 RV 11 2 GI 12 2 2 V 13 1 1 2 TY 14 1 3 DGN 15 1 1 3 ISY 16 1 1 3 DSY 17 1 1 3 APY 18 3 DFM 19 3 DG 20 1 3 ILV 21 1 2 3 WP 22 1 4 GS 23 1 1 1 6 GSQSV 24 1 6 GALR 25 1 2 7 DTAIS 26 1 1 9 ACDGKLMST 27 1 1 1 1 2 18 
DAGVEILNQRS 28 2 3 2 3 4 1 25 TGQSDELPRIV 29 3 3 2 2 1 5 1 1 42 GEDVAPQRSKLMTY| 30 1 3 2 7 5 2 4 1 58 DGRLSIVPAMQTFHNY 31 2 3 7 10 3 7 1 6 100 DGSERVYALPTCFINHKW 32 3 9 4 12 8 6 12 3 9 146 DGLRVAPYSTCEQFHINWKM 33 16 6 19 15 12 10 3 13 207 GARDPSYTEVIFHLQWKM 34 2 20 5 31 12 12 20 5 23 1 2 291 GDRYPVEILSASTFHQWCKMNX| 35 12 18 5 39 35 19 23 7 26 1 396 GRSDYLEVTPAHNFIWKCQM| 36 11 24 6 42 47 29 28 7 44 1 521 GDSYRLTVPAEHIKNFMWCQ| 37 14 33 9 54 52 37 55 11 58 1 639 GYDVRSTAPLEIFHNCWQKMX 38 18 33 12 46 77 32 58 17 73 789 GSYDVRLPTIFAEHCNWKQM 39 11 38 12 70 94 42 61 33 68 2 943 GSDRYVLATPFWIECKHMQNX 40 24 52 27 74 140 61 66 29 71 1061 GSRDYVTLPFAEIWHQNKCM 41 31 55 29 70 146 76 61 51 97 1 2 1162 GSYTDRVLPAIWNQEFHCKMX| 42 48 47 24 68 171 68 70 39 125 1 1250 GSYDLVRTANPFEIWQKMHC| 43 38 58 28 73 164 76 66 43 194 1 1311 GYSTARVPDLWNFIQECKHM| 44 48 60 24 69 131 86 57 52 252 1349 YGSTDLRAPVWNFIQHCEKM 45 62 51 16 75 116 74 50 39 324 1362 YGSLRTDNAFPVWEHIKCQM 46 97 38 21 55 110 39 26 55 377 1371 YGDSNRWATLPHFEVQCKIM 47 25 54 9 44 54 34 32 122 292 2 1373 GYAWPSFRLHTVNDIEKCMQX 48 8 22 7 6 28 10 25 16 23 1 1377 FMLISGVYPAWTDNQREKCHX 49 15 6 13 4 13 5 9 2 11 2 1 1377 DGEAHNQSYVLPTIRCKW|FMX 50 23 122 3 3 67 9 350 3 480 1 6 1383 YVIPSLFHNDTACXMGKQRW| 50 495 769 270 876 1518 741 1104 540 2572 10 17 18621

TABLE-US-00033 TABLE 11P Tallies of AA-frequencies in all CDR3 by length Tally of sequences of length 7 # = 38 A C D E F G H I K L M # 1 1 8 1 1 14 1 1 5 38 GDLRWAEFHKS 2 1 1 2 6 3 2 1 1 38 RGNHVFKTYADLMW 3 1 4 1 5 1 2 2 38 GSDWYPVILTAFHN 4 3 1 1 12 1 1 1 38 GYSANRVDFHILPT 5 2 1 14 3 4 1 3 3 38 FIGLMARVYEKP 6 26 1 1 38 DVPTHISWY 7 1 2 2 3 1 38 YVINDHSALR 9 42 2 19 40 9 11 4 13 4 N P Q R S T V W Y | X # 1 3 1 2 38 GDLRWAEFHKS 2 6 7 2 3 1 2 38 RGNHVFKTYADLMW 3 1 3 5 2 3 4 4 38 GSDWYPVILTAFHN 4 2 1 2 4 1 2 6 38 GYSANRVDFHILPT 5 1 2 2 2 38 FIGLMARVYEKP 6 2 1 2 3 1 1 38 DVPTHISWY 7 3 1 2 7 16 38 YVINDHSALR 12 7 15 13 7 20 8 31 266 Tally of sequences of length 8 # = 61 A C D E F G H I K L M # 1 3 7 3 14 2 2 5 61 GDLTVRSAEHINWPQY 2 1 9 1 1 15 1 2 1 61 GDTNRSVKWYAEFILPQ 3 2 3 1 10 1 1 7 61 GLSTYVDPRAFHIMNQW 4 4 1 3 1 1 15 1 4 61 GYRADQDSWVCEFHNPT 5 10 2 1 9 5 1 5 1 61 AGYHLTPRVDSEKMW 6 5 1 24 2 7 5 2 61 FIALPSVYGMCQRW 7 5 37 2 4 1 2 61 DAHSELNVIP| 8 1 2 3 1 12 3 61 YISFLVDNAHPRT 31 2 63 8 30 65 14 24 3 32 4 N P Q R S T V W Y | X # 1 2 1 1 4 4 5 5 1 1 61 GDLTVRSAEHINWPQY 2 6 1 1 4 3 8 3 2 2 61 GDTNRSVKWYAEFILPQ 3 1 3 1 3 7 7 5 1 7 61 GLSTYVDPRAFHIMNQW 4 1 1 4 5 3 1 2 3 11 61 GYRALQDSWVCEFHNPT 5 4 4 2 5 4 1 7 61 AGYHLTPRVDSEKMW 6 3 1 1 3 3 1 3 61 FIALPSVYGMCQRW 7 2 1 4 2 1 61 DAHSELNVIP| 8 2 1 1 7 1 3 24 61 YISFLVDNAHPRT 14 15 8 22 33 27 27 10 55 1 488 Tally of sequences of length 9 # = 88 A C D E F G H I K L M # 1 9 12 4 21 1 1 2 5 88 GDARNVLEQTKWHIPSY 2 2 2 3 3 13 4 3 7 2 88 GPSRLNTHEFKYADMQW 3 4 2 3 3 3 15 1 1 88 GTPSQNRVWYADEFCLM 4 5 1 6 3 6 22 2 4 1 6 1 88 GSDFLARITYENPWHVCKM 5 7 1 4 3 4 14 2 7 2 88 GSYALNDFVERWHMQTCP 6 13 2 1 3 13 6 2 1 4 1 88 YAGHNLPSVFTWDIEKMQR 7 4 2 41 2 3 1 14 5 88 FLMAPWIDGSVKNQTY 8 1 1 73 2 2 1 2 88 DEGLSACHNQRV 9 1 1 4 1 3 8 2 88 YVISFHPLNTCDGF 45 6 105 19 64 103 19 18 8 48 12 N P Q R S T V W Y | X # 1 7 1 3 8 1 3 7 2 1 88 GDARNVLEQTKWHIPSY 2 5 11 2 10 11 5 2 3 88 GPSRLNTHEFKYADMQW 3 5 7 6 5 7 11 5 5 5 88 GTPSQNRVWYADEFCLM 4 3 3 5 7 4 2 3 4 88 
GSDFLARITYENPWHVCKM 5 6 1 2 3 12 2 4 3 11 88 GSYALNDFVERWHMQTCP 6 5 4 1 1 4 3 4 3 17 88 YAGHNLPSVFTWDIEKMQR 7 1 4 1 2 1 2 4 1 88 FLMAPWIDGSVKNQTY 8 1 1 1 2 1 88 DEGLSACHNQRV 9 2 3 1 8 2 9 43 88 YVISFHPLNTCDGR 35 34 16 34 54 31 34 22 85 792 Tally of sequences of length 10 # = 101 A C D E F G H I K L M # 1 8 1 19 7 1 16 3 2 3 2 101 DGNAERTSQVHLWKMYCF 2 3 8 3 5 13 5 15 2 101 LGRDSPVFINTAEQYMW 3 6 9 1 26 1 3 1 4 1 101 GSYDAVTLNRIPWFHKMQ 4 7 6 1 25 1 5 4 1 101 GSYARDINPLTVWQFHM 5 6 5 9 4 16 1 3 4 101 GYTESANDPRFLVKQWH 6 6 1 6 5 4 23 2 4 3 3 1 101 GYRSWADEFINKLTHCMQV 7 13 3 1 5 9 3 1 4 1 101 YASGPRWFTVLDHNEIMQ 8 2 1 1 57 3 4 15 4 101 9 3 78 2 6 1 1 1 101 10 3 4 4 13 1 101 54 3 137 28 82 137 15 36 10 54 12 N P Q R S T V W Y | X # 1 9 4 6 5 6 4 3 2 101 DGNAERTSQVHLWKMYCF 2 5 6 3 11 8 4 6 1 3 101 LGRDSPVFINTAEQYMW 3 4 3 1 4 14 5 6 2 10 101 GSYDAVTLNRIPWFHKMQ 4 5 5 3 7 11 4 4 4 8 101 GSYARDINPLTVWQFHM 5 6 5 2 5 8 10 4 2 11 101 GYTESANDPRFLVKQWH 6 4 1 8 7 3 1 7 12 101 GYRSWADEFINKLTHCMQV 7 2 7 1 7 11 5 5 6 17 101 YASGPRWFTVLDHNEIMQ 8 2 2 4 2 3 1 101 FLIMSGWANPVCEY 9 2 1 3 1 1 1 101 DGAQENIKLPRSW 10 4 8 7 5 52 101 YIPSVFHNDL 43 37 18 49 76 37 37 29 116 1010 Tally of sequences of length 11 # = 118 A C D E F G H I K L M # 1 7 1 21 11 23 5 2 7 118 GDEVRALQHSPTINCWY 2 1 2 9 1 1 24 5 6 2 7 3 118 GSRDYLPIVHQTMNCKWAEFX 3 4 4 2 4 13 2 3 1 7 2 118 SGTVRLYWADFNQIEHMKP 4 10 3 3 2 25 1 2 4 3 118 SGARTWYLVDEMQFINPH 5 5 2 10 1 4 24 2 1 5 1 118 GSVYDTNALRFWCHQEKM 6 6 4 2 7 19 2 3 1 5 1 118 GSYWTFAVLRDINEHQKMP 7 4 1 8 5 2 20 4 1 2 1 118 GYSNRDWTEPAHFLQVCIM 8 13 2 6 1 8 12 4 2 7 118 YAGWFLDPRSTHCKVE 9 2 2 68 2 5 14 7 118 FLMYVITADGP 10 2 1 100 5 3 2 1 1 118 DEGAHCLMNPQ 11 2 6 1 7 1 6 1 118 YPVISFLNDHKM 54 9 169 31 102 165 28 29 8 65 20 N P Q R S T V W Y | X # 1 2 4 7 8 5 3 10 1 1 118 GDEVRALQHSPTINCWY 2 3 7 4 10 11 4 6 2 9 1 118 GSRDYLPIVHQTMNCKWAEFX 3 4 1 4 8 25 12 9 6 7 118 SGTVRLYWADFNQIEHMKP 4 2 2 3 9 26 8 4 6 5 118 SGARTWYLVDEMQFINPH 5 6 2 5 15 9 11 4 11 118 GSVYDTNALRFWCHQEKM 6 3 1 2 5 
16 9 6 11 15 118 GSYWTFAVLRDINEHQKMP 7 9 5 2 9 11 6 2 7 19 118 GYSNRDWTEPAHFLQVCIM 8 6 5 5 5 2 11 29 118 YAGWFLDPRSTHCKVE 9 1 4 6 7 118 FLMYVITADGP 10 1 1 1 118 DEGAHCLMNPQ 11 3 13 7 11 60 118 YPVISFLNDHKM 33 41 25 59 121 60 67 48 163 1 1298 Tally of sequences of length 12 # = 154 A C D E F G H I K L M # 1 5 31 12 37 6 1 1 7 3 154 GDRESVLHAPMNQTWYIK 2 5 1 7 6 1 25 3 7 3 13 2 154 GSRLPDIQEAVYHKNTMWCF 3 10 2 7 5 1 19 5 4 12 2 154 GRSYLATVPDQEIKWCMNF 4 8 9 6 8 27 6 5 6 1 154 GVSDNAFRTYEILKWPQM 5 18 1 8 5 6 42 1 9 1 7 3 154 GSAIDYLFPTEQVMNWCHK 6 13 12 4 10 23 1 7 8 1 154 GAVDSFYTLPRWINEQHM 7 11 2 4 3 10 15 1 4 12 154 YGSPLRAFWTNVDIECQH 8 3 2 18 3 3 25 4 2 5 6 154 YGDSNLTKRWHPAEFCIQV 9 15 1 2 8 33 4 7 1 5 1 154 GYWARFISPLHTDQCKMN 10 1 1 2 1 79 1 2 5 1 19 26 154 FMLIPYDHVWACEGKNQRST 11 2 135 2 4 2 154 DGYAEHSVNR 12 1 1 6 1 9 16 4 154 YVPIHFSLNCDGW 91 11 236 47 132 252 33 69 21 99 39 N P Q R S T V W Y | X # 1 3 4 3 14 10 3 10 2 2 154 GDRESVLHAPMNQTWYIK 2 3 11 7 22 24 3 5 2 4 154 GSRLPDIQEAVYHKNTMWCF 3 2 8 6 17 17 9 9 4 15 154 GRSYLATVPDQEIKWCMNF 4 9 4 4 7 17 7 18 5 7 154 GVSDNAFRTYEILKWPQM 5 3 6 4 20 6 4 2 8 154 GSAIDYLFPTEQVMNWCHK 6 5 8 3 8 11 9 13 8 10 154 GAVDSFYTLPRWINEQHM 7 5 14 2 12 15 6 5 9 24 154 YGSPLRAFWTNVDIECQH 8 10 4 2 5 15 6 2 5 34 154 YGDSNLTKRWHPAEFCIQV 9 1 6 2 10 7 3 18 30 154 GYWARFISPLHTDQCKMN 10 1 4 1 1 1 1 2 2 3 154 FMLIPYDHVWACEGKNQRST 11 1 1 2 2 3 154 DGYAEHSVNR 12 2 18 5 32 1 58 YVPIHFSLNCDGW 45 87 34 97 144 53 102 58 198 1848 Tally of sequences of length 13 # = 150 A C D E F G H I K L M # 1 4 2 28 9 3 37 8 3 3 5 150 GDTESHRVLPAQFIKCNW 2 11 4 4 1 2 32 3 1 5 11 3 150 GRSPALTKVCDYHMQWFEIN 3 7 2 8 4 4 23 11 1 4 6 2 150 GSYHQTDPRAVLEFKNCMWI 4 6 2 6 4 6 30 1 8 6 1 150 GSWYTIADFLPVEQRCHMNX 5 8 10 4 2 28 1 2 22 3 150 GLSYDATWPREQMNVFIH 6 10 2 11 1 6 21 2 2 5 1 150 GYSPTDAQVFRLNWCIKEM 7 5 1 8 1 4 19 1 6 5 21 2 150 LGYSTDPIRVAKFNWMQCEH 8 7 5 22 5 3 12 3 3 3 8 1 150 YDSGLARTCEQVNPFHIKWM 9 1 2 12 3 1 26 7 2 4 7 2 150 NGYDSWHLPRKETVCIMAFQ 10 19 1 2 2 17 24 5 2 
5 1 150 YGAFWHLPTNSVDEIQRCM 11 1 1 105 2 2 1 13 14 150 FMLYGIVAEKPQRSWX 12 130 3 5 1 150 DGYEQNHT 13 1 2 5 5 14 18 1 150 YVLIPSFHTDAMN 80 21 243 38 158 259 46 46 27 127 31 N P Q R S T V W Y | X # 1 2 5 4 8 9 11 8 1 150 GDTESHRVLPAQFIKCNW 2 1 13 3 20 17 7 5 3 4 150 GRSPALTKVCDYEMQWFEIN 3 3 8 11 8 16 11 7 2 12 150 GSYHQTDPRAVLEFKNCMWI 4 1 6 4 4 18 10 6 16 14 1 150 GSWYTIADFLPVEQRCHMNX 5 3 6 4 5 19 8 3 7 15 150 GLSYDATWPREQMNVFIH 6 3 15 8 6 15 13 8 3 17 150 GYSPTDAQVFRLNWCIKEM 7 4 7 2 6 15 14 6 4 19 150 LGYSTDPIRVAKFNWMQCEH 8 4 4 5 7 15 7 5 2 29 150 YDSGLARTCEQVNPFHIKWM 9 31 5 1 5 10 3 3 9 16 150 NGYDSWHLPRKETVCIMAFQ 10 3 5 2 2 3 4 3 15 35 150 YGAFWHLPTNSVDEIQRCM 11 1 1 1 1 2 1 3 1 150 FMLYGIVAEKPQRSWX 12 2 3 1 5 150 GDYEQNHT 13 1 14 13 4 21 51 150 YVLIPSFHTDAMN 58 89 48 72 152 93 77 63 220 2 1950 Tally of sequences of length 14 # = 118 A C D E F G H I K L M # 1 6 29 7 2 32 8 1 1 2 118 GDVHERTAFLPSIKNQ 2 4 10 1 5 22 7 3 4 7 118 GPDRYSVHLFAKIQTENW 3 11 2 7 2 3 25 5 1 9 2 118 GVARYLSDITFWCEMPK 4 5 2 7 7 3 12 4 4 3 6 118 SGVYPDELRTANHIFKWC 5 6 5 12 2 18 2 2 2 4 1 118 GYSDTVARCLPFHIKNWMQ 6 6 10 5 4 16 5 3 2 1 118 YGSTDRAEIFVKWLPQMN 7 4 4 1 4 32 2 2 2 1 118 GSVTYNADFHIKPQRWEM 8 6 1 5 1 4 18 2 5 3 2 118 GSYTWAPRDIFNVLHMCE 9 5 2 4 1 2 11 2 1 5 9 1 118 YSGTLVAKNRDWCFHPEIM 10 2 5 9 2 3 21 2 2 4 118 YGSDNTCQLRFWAEIKPV 11 12 1 3 5 25 2 2 1 118 YGWAPVFNEHLTDMQR 12 1 64 5 1 5 12 16 118 FMLGIPSVAHQTY 13 3 97 4 5 1 1 1 1 118 DGEANQHIKLV 14 2 3 4 12 6 118 YVPILHFANS 73 17 195 34 104 242 35 48 24 67 25 N P Q R S T V W Y | X # 1 1 2 1 7 2 7 10 118 GDVHERTAFLPSIKNQ 2 1 13 2 10 8 2 8 1 10 118 GPDRYSVHLFAKIQTENW 3 2 11 8 4 13 3 10 118 GVARYLSDITFWCEMPK 4 5 8 6 13 6 12 3 12 118 SGVYPDELRTANHIFKWC 5 2 3 1 6 15 10 7 2 18 118 GYSDTVARCLPFHIKNWMQ 6 1 2 2 7 16 12 4 3 19 118 YGSTDRAEIFVKWLPQMN 7 5 2 2 2 18 12 13 2 10 118 GSVTYNADFHIKPQRWEM 8 4 6 6 16 12 4 9 14 118 GSYTWAPRDIFNVLHMCE 9 5 2 5 14 10 8 4 27 118 YSGTLVAKNRDWCFHPEIM 10 6 2 5 4 13 6 2 3 27 118 YGSDNTCQLRFWAEIKPV 11 4 7 1 1 2 6 14 32 
118 YGWAPVFNEHLTDMQR 12 4 1 4 1 3 1 118 FMLGIPSVAHQTY 13 2 2 1 118 DGEANQHILKV 14 2 14 2 20 53 118 YVPILHFANS 38 67 17 65 129 84 111 44 233 1652

TABLE-US-00034 TABLE 12P Alignment and tabulation of sequences having 3-22 D segments
D3:3-22_Phz0 YYYDSSGYYY (SEQ ID NO: 448) = GLG

Entry	Seq1	L1	Seq2	L2	JH	P	Score
1 hs3d6hcv	GRDYYDSGGYFT (SEQ ID NO: 334)	12	GRDYYDSGGYFTVAFDI (SEQ ID NO: 335)	17	3	6	1.76D+13
2 hs6d4xb7	DRHNYYDSSGSYS (SEQ ID NO: 336)	13	DRHNYYDSSGSYSDY (SEQ ID NO: 337)	15	4	9	4.40D+12
3 hs6d4xg3	DCPAPAKMYYYGSGICT (SEQ ID NO: 338)	17	DCPAPAKMYYYGSGICTFDY (SEQ ID NO: 339)	20	4	3	6.55D+04
4 hs83x6f2	AFYDSAD (SEQ ID NO: 340)	7	AFYDSADDY (SEQ ID NO: 341)	9	4	-4	2.62D+05
5 hsa230644	RDYYDSSGPEAG (SEQ ID NO: 342)	12	RDYYDSSGPEAGFDI (SEQ ID NO: 343)	15	3	3	6.87D+10
6 hsa239386	DGTLIDTSAYYYL (SEQ ID NO: 344)	13	DGTLIDTSAYYYLY (SEQ ID NO: 345)	14	4	6	6.87D+10
7 hsa234232	NSSDSS (SEQ ID NO: 346)	6	NSSDSSVLDV (SEQ ID NO: 347)	10	6	-4	6.55D+04
8 hsa239378	DQVFDSGGYNHR (SEQ ID NO: 348)	12	DQVFDSGGYNHRFDS (SEQ ID NO: 349)	15	4	3	1.07D+09
9 hsa239367	DLEYYYDSGGHYSP (SEQ ID NO: 350)	14	DLEYYYDSGGHYSPFHY (SEQ ID NO: 351)	17	4	9	1.10D+12
10 hsa239339	DDSSGY (SEQ ID NO: 352)	6	DDSSGYYYIDY (SEQ ID NO: 353)	11	4	-10	1.72D+10
11 hsa245311	GHYYDSPGQYSYS (SEQ ID NO: 354)	13	GHYYDSPGQYSYSEY (SEQ ID NO: 355)	15	4	3	1.07D+09
12 hsa240578	GGFRPPPYDYESSAYRTYR (SEQ ID NO: 356)	19	GGFRPPPYDYESSAYRTYRLDF (SEQ ID NO: 357)	22	4	21	2.75D+11
13 hsa245359	DSDTRAY (SEQ ID NO: 358)	7	DSDTRAYYWYFDL (SEQ ID NO: 359)	13	2	-7	1.68D+07
14 hsa245028	GRHYYDSSGYYSTPE (SEQ ID NO: 360)	15	GRHYYDSSGYYSTPENYFDY (SEQ ID NO: 361)	20	4	6	1.80D+16
15 hsa245019	DPSYYYDSSGLPL (SEQ ID NO: 362)	13	DPSYYYDSSGLPLHGMDV (SEQ ID NO: 363)	18	6	9	4.40D+12
16 hsa244991	TYYYDSSGYLLTR (SEQ ID NO: 364)	13	TYYYDSSGYLLTRYFQH (SEQ ID NO: 365)	17	1	3	4.50D+15
17 hsa244945	NAPHYDSSGYYQT (SEQ ID NO: 366)	13	NAPHYDSSGYYQTFDY (SEQ ID NO: 367)	16	4	6	7.04D+13
18 hsa244943	GYHSSSYA (SEQ ID NO: 368)	8	GYHSSSYADAFDI (SEQ ID NO: 369)	13	3	-7	6.71D+07
19 hsa245289	PIGYCSGGSC (SEQ ID NO: 370)	10	PIGYCSGGSCYSFDY (SEQ ID NO: 371)	15	4	-4	2.62D+05
20 hsa240554	THGTYVTSGYYPKI (SEQ ID NO: 372)	14	THGTYVTSGYYPKI (SEQ ID NO: 373)	14	4	6	2.68D+08
21 hsa279533	GATYYYESSGNYP (SEQ ID NO: 374)	13	GATYYYESSGNYPDY (SEQ ID NO: 375)	15	4	9	7.04D+13
22 hsa389177	AFYHYDSTGYPNRRY (SEQ ID NO: 376)	15	AFYHYDSTGYPNRRYYFDY (SEQ ID NO: 377)	19	4	6	4.29D+09
23 hsa7321	SYSYYYDSSGYWGG (SEQ ID NO: 378)	14	SYSYYYDSSGYWGGYFDY (SEQ ID NO: 379)	18	4	9	4.50D+15
24 hsaj2772	LSPYYYDSSSYH (SEQ ID NO: 380)	12	LSPYYYDSSSYHDAFDI (SEQ ID NO: 381)	17	3	6	2.62D+05
25 hsb7g4f08	EEDYYDSSGQAS (SEQ ID NO: 382)	12	EEDYYDSSGQASYNWFXP (SEQ ID NO: 383)	18	5	6	2.75D+11
26 hsb7g3b02	ETNYYDSGGYPG (SEQ ID NO: 384)	12	ETNYYDSGGYPGFDF (SEQ ID NO: 385)	15	4	6	4.40D+12
27 hsb7g3c12	GDHYYDRSGYRH (SEQ ID NO: 386)	12	GDHYYDRSGYRHSYYYYAMDV (SEQ ID NO: 387)	21	6	6	2.75D+11
28 hsb8g3b07	DRSSGN (SEQ ID NO: 388)	6	DRSSGNYFDGMDV (SEQ ID NO: 389)	13	6	-10	6.55D+04
29 hsfoglh	GRSRYSGYG (SEQ ID NO: 390)	9	GRSRYSGYGFYSGMDV (SEQ ID NO: 391)	16	6	-4	2.62D+05
30 hsgvh0209	DDTSGYGP (SEQ ID NO: 392)	8	DDTSGYGPYYFYYGMDV (SEQ ID NO: 393)	17	6	-10	2.68D+08
31 hsgvh55	RAYYDTSFYFEY (SEQ ID NO: 394)	12	RAYYDTSFYFEYY (SEQ ID NO: 395)	13	4	3	1.72D+10
32 hsgvh0304	DRIDYYKSGYYLGSA (SEQ ID NO: 396)	15	DRIDYYKSGYYLGSADS (SEQ ID NO: 397)	17	4	6	1.68D+07
33 hsgvh0332	DTDSSSHYG (SEQ ID NO: 398)	9	DTDSSSHYGRFDP (SEQ ID NO: 399)	13	5	-7	1.68D+07
34 hsgvh0328	VSISHYDSSGRPQRVF (SEQ ID NO: 400)	16	VSISHYDSSGRPQRVFYGMDV (SEQ ID NO: 401)	21	6	9	1.07D+09
35 hsgvh536	QARENVFYDSSGPTAP (SEQ ID NO: 402)	16	QARENVFYDSSGPTAPFDH (SEQ ID NO: 403)	19	4	15	1.72D+10
36 hshcmg42	VPAGNYYDTSGPDN (SEQ ID NO: 404)	14	VPAGNYYDTSGPDNAD (SEQ ID NO: 405)	16	4	12	1.72D+10
37 hsig001vh	WYYFDTSGYYPRNFYYMDV (SEQ ID NO: 406)	19	WYYFDTSGYYPRNFYYMDV (SEQ ID NO: 407)	19	4	3	2.81D+14
38 hsig13g10	GYYYDSGGNYNG (SEQ ID NO: 408)	12	GYYYDSGGNYNGDY (SEQ ID NO: 409)	14	4	3	1.10D+12
39 hsighpat3	DLRSYDPSGYYN (SEQ ID NO: 410)	12	DLRSYDPSGYYNDGFDI (SEQ ID NO: 411)	17	3	6	2.75D+11
40 hsigh13g7	GYYYDRGGNCNG (SEQ ID NO: 412)	12	GYYYDRGGNCNGDY (SEQ ID NO: 413)	14	4	3	6.87D+10
41 hsighl3g1	GYYYDRGGNYNG (SEQ ID NO: 414)	12	GYYYDRGGNYNGDY (SEQ ID NO: 415)	14	4	3	1.10D+12
42 hsighxx20	THYDSSGL (SEQ ID NO: 416)	8	THYDSSGLDAFDI (SEQ ID NO: 417)	13	3	-4	1.72D+10
43 hsihr9	DDSSGS (SEQ ID NO: 418)	6	DDSSGSYYFDY (SEQ ID NO: 419)	11	4	-10	1.07D+09
44 hsihv11	LSGGYYS (SEQ ID NO: 420)	7	LSGGYYSDFDY (SEQ ID NO: 421)	11	4	-13	2.68D+08
45 hs ej1f	GDYSDSSDSYI (SEQ ID NO: 422)	11	GDYSDSSDSYIDAFDV (SEQ ID NO: 423)	16	3	3	1.10D+12
46 hsmvh51	GETYYYDSRGYA (SEQ ID NO: 424)	12	GETYYYDSRGYAFDH (SEQ ID NO: 425)	15	4	6	2.62D+05
47 hsmvh517	PTRDSSGY (SEQ ID NO: 426)	8	PTRDSSGYYVGY (SEQ ID NO: 427)	12	4	-4	1.07D+09
48 hsmvh0406	GSFYYDSSGYPP (SEQ ID NO: 428)	12	GSFYYDSSGYPPFDC (SEQ ID NO: 429)	15	4	6	6.87D+10
49 hst14x14	GPYYYDSSGYYL (SEQ ID NO: 430)	12	GPYYYDSSGYYLLDY (SEQ ID NO: 431)	15	4	6	1.80D+16
50 hsvhig2	EEGYYDSSGYYSLGA (SEQ ID NO: 432)	15	EEGYYDSSGYYSLGASDY (SEQ ID NO: 433)	18	4	6	4.50D+15
51 hsvhia2	RPDSSGSRW (SEQ ID NO: 434)	9	RPDSSGSRWYFDY (SEQ ID NO: 435)	13	4	-7	6.71D+07
52 hsy14936	GYYDISGYYF (SEQ ID NO: 436)	10	GYYDISGYYFDAFNI (SEQ ID NO: 437)	15	3	-4	2.81D+14
53 hsy14934	DRGYDSSGYYGN (SEQ ID NO: 438)	12	DRGYDSSGYYGNLDC (SEQ ID NO: 439)	15	4	3	1.76D+13
54 hsy14935	DRGYDSIGYYGN (SEQ ID NO: 440)	12	DRGYDSIGYYGNLDC (SEQ ID NO: 441)	15	4	3	1.10D+12
55 hsz80519	AEDLTYYYDRSGWGVHGLL (SEQ ID NO: 442)	19	AEDLTYYYDRSGWGVHGLLYYFDY (SEQ ID NO: 443)	24	4	15	4.40D+12
56 hsz80429	LYPHYDSSGYYYV (SEQ ID NO: 444)	13	LYPHYDSSGYYYVLDY (SEQ ID NO: 445)	16	4	6	4.50D+15
57 hsz80461	DRVGYYDSSGYPPGSP (SEQ ID NO: 446)	16	DRVGYYDSSGYPPGSPLDY (SEQ ID NO: 447)	19	4	9	1.76D+13

Frequency of each AA type at each position in 57 Sequences having D3-22 segments
Pos A C D E F G H I K L M N P Q R S T V W Y | X #
1 1 1
2 1 1
3 1 1 1 3
4 1 1 1 1 4
5 5 1 1 2 1 1 1 12
6 3 3 4 6 3 1 2 2 2 1 1 28 x
7 1 5 4 1 7 2 1 1 1 3 5 3 4 1 1 1 41 x
8 2 1 4 1 5 3 1 4 4 1 3 1 3 1 14 48 x
9 4 2 3 5 1 1 1 2 2 2 1 28 52 Y
10 1 4 2 1 1 1 1 4 1 40 56 Y
11 46 2 1 1 1 2 1 3 57 D
12 1 1 1 1 1 1 4 39 7 1 57 S
13 1 8 1 1 1 1 43 1 57 S
14 3 2 1 45 1 1 3 56 G
15 2 2 2 5 3 2 1 4 1 33 55 Y
16 2 1 1 1 2 3 1 1 1 6 3 1 1 1 24 49 x
17 3 1 1 1 5 2 1 4 6 6 2 7 2 1 1 3 46 x
18 8 1 1 2 2 2 4 3 1 3 27
19 2 1 1 1 3 4 1 13
20 2 1 2 1 1 1 1 9
21 1 1 1 3
22 1 1 2
23 1 1 2
24 1 1
25 1 1

Average Dseg = 11.9 Average DJ = 15.7
Median D = 12 12 Shortest 6 Longest 19
Median DJ = 15 15 Shortest 9 Longest 24

TABLE-US-00035 TABLE 13P Frequency of D-segments. "|" stands for a stop codon.

D seg	"0"	% C	% GLG	Sequence	"1"	% C	% GLG	Sequence	"2"	% C	% GLG	Sequence	SEQ ID NOs
1-01	1	0.13	0	VQLERX	4	0.53	0.22	GTTGTX	5	0.66	0.34	YNWND	(SEQ ID NO: 132) (SEQ ID NO: 133) (SEQ ID NO: 134)
1-07	0	0	0	V|LELX	3	0.4	0.11	GITGTX	9	1.19	0.34	YNWNY	(SEQ ID NO: 135) (SEQ ID NO: 136) (SEQ ID NO: 137)
1-20	0	0	0	V|LERX	1	0.13	0.22	GITGTX	4	0.53	0.45	YNWND	(SEQ ID NO: 138) (SEQ ID NO: 139) (SEQ ID NO: 140)
1-26	4	0.53	0	V|WELLX	13	1.72	0.90	GIVGATX	36	4.76	0.78	YSGSYY	(SEQ ID NO: 141) (SEQ ID NO: 142) (SEQ ID NO: 143)
2-02	31	4.1	2.47	GYCSSTSCYT	4	0.53	0.22	RIL||YQLLYX	9	1.19	2.47	DIVVVPAAIX	(SEQ ID NO: 144) (SEQ ID NO: 145) (SEQ ID NO: 146)
2-08	5	0.66	0.56	GYCTNGVCYT	0	0	0	RILY|WCMLYX	3	0.4	0.56	DIVLMVYAIX	(SEQ ID NO: 147) (SEQ ID NO: 148) (SEQ ID NO: 149)
2-15	29	3.83	1.57	GYCSGGSCYS	2	0.26	0.11	RIL|WW|LLLX	7	0.92	1.57	DIVVVVAATX	(SEQ ID NO: 150) (SEQ ID NO: 151) (SEQ ID NO: 152)
2-21	16	2.11	0.67	AYCGGDCYS	0	0	0	SILWW|LLFX	7	0.92	0.67	HIVVVTAIX	(SEQ ID NO: 153) (SEQ ID NO: 154) (SEQ ID NO: 155)
3-03	32	4.23	2.80	YYDFWSGYYT	7	0.92	0.90	VLRFLEWLLYX	27	3.57	1.12	ITIFGVVIIX	(SEQ ID NO: 156) (SEQ ID NO: 157) (SEQ ID NO: 158)
3-09	13	1.72	1.35	YYDILTGYYN	5	0.66	0.78	VLRYFDWLL|X	0	0	0	ITIF|LVIIX	(SEQ ID NO: 159) (SEQ ID NO: 160) (SEQ ID NO: 161)
3-10	42	5.55	4.26	YYYGSGSYYN	13	1.72	0.89	VLLWFGELL|X	11	1.45	2.91	ITMVRGVIIX	(SEQ ID NO: 162) (SEQ ID NO: 163) (SEQ ID NO: 164)
3-16	18	2.38	0.67	YYDYVWGSYRYT	8	1.06	0	VL|LRLGELSLYX	5	0.66	0.34	IMITFGGVIVIX	(SEQ ID NO: 165) (SEQ ID NO: 166) (SEQ ID NO: 167)
3-22	57	7.53	3.36	YYYDSSGYYY	1	0.13	0.11	VLL|||WLLLX	6	0.79	0.34	ITMIVVVITX	(SEQ ID NO: 168) (SEQ ID NO: 169) (SEQ ID NO: 170)
4-04	5	0.66	0.28	DYSNY	2	0.26	0	|LQ|LX	2	0.26	0.06	TTVTX	(SEQ ID NO: 171) (SEQ ID NO: 172) (SEQ ID NO: 173)
4-17	29	3.83	1.45	DYGDY	0	0	0	|LR|LX	20	2.64	0.90	TTVTX	(SEQ ID NO: 174) (SEQ ID NO: 175) (SEQ ID NO: 176)
4-23	10	1.32	0.56	DYGGNS	1	0.13	0	|LRW|LX	4	0.53	0.56	TTVVTX	(SEQ ID NO: 177) (SEQ ID NO: 178) (SEQ ID NO: 179)
5-05	3	0.4	0.06	WIQLWLX	10	1.32	0.39	VDTAMVX	31	4.1	0.73	GYSYGY	(SEQ ID NO: 180) (SEQ ID NO: 181) (SEQ ID NO: 182)
5-12	0	0	0	WI|WLRLX	8	1.06	0.45	VDIVATIX	14	1.85	1.12	GYSGYDY	(SEQ ID NO: 183) (SEQ ID NO: 184) (SEQ ID NO: 185)
5-24	11	1.45	0	|RWLQLX	5	0.66	0.34	VEMATIX	13	1.72	0.44	RDGYNY	(SEQ ID NO: 186) (SEQ ID NO: 187) (SEQ ID NO: 188)
6-06	11	1.45	0.78	SIAARX	9	1.19	0.48	EYSSSS	1	0.13	0.11	V|QLVX	(SEQ ID NO: 189) (SEQ ID NO: 190) (SEQ ID NO: 191)
6-13	19	2.51	1.01	GIAAAGX	35	4.62	2.13	GYSSSWY	2	0.26	0.31	V|QQLVX	(SEQ ID NO: 192) (SEQ ID NO: 193) (SEQ ID NO: 194)
6-19	14	1.85	2.12	GIAVAGX	48	6.34	2.02	GYSSGWY	4	0.53	0.56	V|QWLVX	(SEQ ID NO: 195) (SEQ ID NO: 196) (SEQ ID NO: 197)
D7: 7-27	1	0.13	0	|LGX	2	0.26	0.68	LTGX	2	0.26	0.22	NWG	(SEQ ID NO: 198)

Total = 757
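The "% C" values in Table 13P are consistent with each count being divided by the stated total of 757 and expressed as a percentage. A minimal sketch of that arithmetic follows, using a handful of rows copied from the table; the dictionary layout and the helper name `percent_of_total` are illustrative assumptions.

```python
TOTAL = 757  # "Total = 757" as stated at the foot of Table 13P

# (D segment, reading phase) -> count, copied from a few rows of Table 13P
counts = {
    ("3-22", 0): 57,
    ("6-19", 1): 48,
    ("3-10", 0): 42,
    ("1-26", 2): 36,
    ("2-02", 0): 31,
}


def percent_of_total(count, total=TOTAL):
    """Percentage of all tallied D segments, rounded as in the table."""
    return round(100.0 * count / total, 2)


for (segment, phase), n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(f"D{segment} phase {phase}: {n:3d}  {percent_of_total(n):5.2f}%")
```

The values computed for these five rows agree with the "% C" entries printed in the table.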

TABLE-US-00036 TABLE 14P Possible library components.

Component	Sequence	L	f
D2_2-02_Phz0	xxxYCSSTSCxxx (SEQ ID NO: 199)	13	31
D3_3-16_Phz0	xxxxYVWGSYxxx (SEQ ID NO: 200)	13	18
D5_5-12_Phz2	xxxxxxxSGYxxx (SEQ ID NO: 201)	13	14
D3_3-09_Phz0	xxxYDILIGYYxx (SEQ ID NO: 202)	13	13
D2_2-02_Phz2	xxxVVVPAAxxxx (SEQ ID NO: 203)	13	9
D3_3-22_Phz0	xxxYYDSSGYxx (SEQ ID NO: 204)	12	57
D3_3-03_Phz0	xxxDFWSGxxxx (SEQ ID NO: 205)	12	32
D3_3-03_Phz2	xxxTIFGVxxxx (SEQ ID NO: 206)	12	27
D5_5-12_Phz1	xxxxIVATxxxx (SEQ ID NO: 207)	12	8
D3_3-10_Phz0	xxxYGSGSYYx (SEQ ID NO: 208)	11	42	! could add one x at either end
D5_5-05_Phz2	xxxxYSYGxxx (SEQ ID NO: 209)	11	31
D2_2-15_Phz0	xxxCSGxxCYx (SEQ ID NO: 210)	11	29
D6_6-13_Phz0	xxxxAAAGxxx (SEQ ID NO: 211)	11	19
D4_4-23_Phz0	xGxxxGGNxxx (SEQ ID NO: 212)	11	10
D1_1-26_Phz2	xxxSGSYxxx (SEQ ID NO: 213)	10	35
D6_6-13_Phz1	xxxSSSWxxx (SEQ ID NO: 214)	10	35
D4_4-17_Phz2	xxxxTTVTTx (SEQ ID NO: 215)	10	20
D2_2-21_Phz0	xxxC(SG)GDxCx (SEQ ID NO: 216)	10	16
D6_6-19_Phz0	xxx(IV)AVAGxx (SEQ ID NO: 217)	10	14
D3_3-10_Phz1	xxLWFGELxx (SEQ ID NO: 218)	10	13
D5_5-24_Phz0	GxxWLxxxxF (SEQ ID NO: 219)	10	11
D5_5-05_Phz1	xxxDTxMVxx (SEQ ID NO: 220)	10	10
D3_3-16_Phz1	xxxxxGExxx (SEQ ID NO: 221)	10	8
D6_6-19_Phz1	xxxxSGWxx (SEQ ID NO: 222)	9	48
D5_5-24_Phz2	xxxxGYNxx (SEQ ID NO: 223)	9	13
D3_3-10_Phz2	xxxVRGVxx (SEQ ID NO: 224)	9	11
D6_6-06_Phz0	xxxIAAxxx (SEQ ID NO: 225)	9	11
D1_1-07_Phz2	xxYxWNxxx (SEQ ID NO: 226)	9	9
D4_4-17_Phz0	xxxYGDxx (SEQ ID NO: 227)	8	29
D1_1-26_Phz1	xxVGATxx (SEQ ID NO: 228)	8	13
D6_6-06_Phz1	xxxYSSSx (SEQ ID NO: 229)	8	9
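The component sequences in Table 14P read naturally as patterns. As a sketch only, the code below assumes that "x" stands for any of the twenty amino acids and that a parenthesized group such as (SG) means "one of these residues"; the table itself does not spell out either convention. Under those assumptions each component can be translated into a regular expression and tested against observed D-segment sequences such as those in Table 12P. The helper names are hypothetical.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_TOKEN = re.compile(r"\([A-Z]+\)|.")  # a parenthesized group or a single character


def component_length(component):
    """Number of residue positions in a Table 14P component string."""
    return len(_TOKEN.findall(component))


def component_to_regex(component):
    """Translate e.g. 'xxxC(SG)GDxCx' into a compiled regular expression."""
    parts = []
    for token in _TOKEN.findall(component):
        if token == "x":                      # assumed: any amino acid
            parts.append(f"[{AMINO_ACIDS}]")
        elif token.startswith("("):           # assumed: one residue from the group
            parts.append(f"[{token[1:-1]}]")
        else:                                 # fixed residue
            parts.append(re.escape(token))
    return re.compile("".join(parts))


d3_22 = component_to_regex("xxxYYDSSGYxx")          # D3_3-22_Phz0, L = 12
print(bool(d3_22.fullmatch("GPYYYDSSGYYL")))        # Seq1 of entry 49, Table 12P
print(bool(d3_22.fullmatch("GRDYYDSGGYFT")))        # Seq1 of entry 1, Table 12P
print(component_length("xxxC(SG)GDxCx"))            # D2_2-21_Phz0
```

The token count from `component_length` gives the L column for the components tried here, which is a quick check that a pattern was transcribed with the right number of positions.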

TABLE-US-00037 TABLE 15P Lengths of CDRs: 1095 actual VH domains and 51 VH GLGs.

Length	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17
CDR1	0	0	10	0	1	820	38	175	1	1	5	1	11	0	23	1	7	0
GLG	0	0	0	0	0	38	3	10	0	0	. . .
CDR2	0	0	0	0	0	2	0	0	0	0	0	0	0	0	0	0	464	579
GLG	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	17	28
CDR3	0	0	0	4	2	8	6	28	40	65	77	90	117	117	88	105	86	81

Length	18	19	20	21	22	23	24	25	26	27	28	29	30	31	32	(33 or more)
CDR2	9	31	1	3	3	1	0	0	0	0	2	0	0	. . .
GLG	1	4	0	0	. . .
CDR3	45	36	36	16	16	8	8	2	3	0	2	1	0	0	1	5
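The CDR3 row of Table 15P can be read as a length distribution, and the parenthetical notes in the "Fraction of library" column of Table 16P, such as "(11 + 12 + 13/2)", look like a recipe for assigning natural CDR3 lengths to library components. The sketch below applies that reading. It is an interpretation, not something the tables state, and the resulting fractions come out close to, but not identical with, the printed ones.

```python
# CDR3 counts from Table 15P for lengths 0..32; index 33 holds "33 or more".
cdr3_counts = [0, 0, 0, 4, 2, 8, 6, 28, 40, 65, 77, 90, 117, 117, 88, 105, 86, 81,
               45, 36, 36, 16, 16, 8, 8, 2, 3, 0, 2, 1, 0, 0, 1, 5]

# Assumed reading of the notes in Table 16P: component -> {length: weight}.
bins = {
    1: {n: 1.0 for n in range(0, 9)},     # (0-8)
    2: {9: 1.0, 10: 1.0},                 # (9-10)
    3: {11: 1.0, 12: 1.0, 13: 0.5},       # (11 + 12 + 13/2)
    4: {14: 1.0, 13: 0.5},                # (14 + 13/2)
    5: {15: 1.0, 16: 0.5},                # (15 + 16/2)
    6: {17: 1.0, 16: 0.5},                # (17 + 16/2)
    7: {18: 1.0},                         # (18)
    8: {n: 1.0 for n in range(19, 34)},   # (19 on)
}


def binned_fractions(counts, bins):
    """Share of all CDR3s that falls into each component's length bin."""
    total = sum(counts)
    return {
        comp: sum(weight * counts[length] for length, weight in lengths.items()) / total
        for comp, lengths in bins.items()
    }


fractions = binned_fractions(cdr3_counts, bins)
for comp, frac in fractions.items():
    print(f"component {comp}: {frac:.2f}")
```

Because every length is assigned exactly once (lengths 13 and 16 are split in half between neighbours), the fractions add up to 1. The transcribed CDR3 counts sum to 1093 rather than the 1095 domains named in the table title.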

TABLE-US-00038 TABLE 16P Library of HC CDR3

Component	Length	#X	Complexity	Fraction of library	Adjusted
1: YYCA21111YFDYWG. (2 = KR; SEQ ID NO: 6)	8	4	2.6E5	.10 (0-8)	.02
2: YYCA2111111YFDYWG. (2 = KR; SEQ ID NO: 7)	10	6	9.4E7	.14 (9-10)	.14
3: YYCA211111111YFDYTG. (2 = KR; SEQ ID NO: 8)	12	8	3.4E10	.25 (11 + 12 + 13/2)	.25
4: YYCAR111S2S3111YFDYWG. (2 = SG 3 = YW; SEQ ID NO: 9)	14	6	1.9E8	.13 (14 + 13/2)	.14
5: YYCA2111CSG11CY1YFDYWG. (2 = KR; SEQ ID NO: 10)	15	6	9.4E7	.13 (15 + 16/2)	.14
6: YYCA211S1TIFG11111YFDYWG. (2 = KR; SEQ ID NO: 11)	17	8	1.7E10	.11 (17 + 16/2)	.12
7: YYCAR111YY2S33YY111YFDYWG. (2 = D|G; 3 = S|G; SEQ ID NO: 12)	18	6	3.8E8	.04 (18)	.08
8: YYCAR1111YC2231CY111YFDYWG. (2 = S|G; 3 = T|D|G; SEQ ID NO: 13)	19	8	2.0E11	.10 (19 on)	.11

Allowed lengths: 8, 10, 12, 14, 15, 17, 18, & 19
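The "Complexity" column of Table 16P can be approximated by multiplying the number of allowed residues at each numbered position of a component. The sketch below assumes that position type 1 admits 19 residue types (every amino acid except Cys, following the "no C" note in Table 17P) and that types 2 and 3 admit as many residues as are listed for that component. The function name and data layout are illustrative only.

```python
def complexity(template, choices):
    """Product of the number of allowed residues over all numbered positions."""
    total = 1
    for ch in template:
        if ch.isdigit():
            total *= choices[ch]
    return total


# template, allowed residue types per position code (assumed: "1" = 19 types)
components = {
    1: ("YYCA21111YFDYWG",            {"1": 19, "2": 2}),           # 2 = KR
    2: ("YYCA2111111YFDYWG",          {"1": 19, "2": 2}),
    3: ("YYCA211111111YFDYTG",        {"1": 19, "2": 2}),
    4: ("YYCAR111S2S3111YFDYWG",      {"1": 19, "2": 2, "3": 2}),   # 2 = SG, 3 = YW
    5: ("YYCA2111CSG11CY1YFDYWG",     {"1": 19, "2": 2}),
    6: ("YYCA211S1TIFG11111YFDYWG",   {"1": 19, "2": 2}),
    7: ("YYCAR111YY2S33YY111YFDYWG",  {"1": 19, "2": 2, "3": 2}),   # 2 = D|G, 3 = S|G
    8: ("YYCAR1111YC2231CY111YFDYWG", {"1": 19, "2": 2, "3": 3}),   # 2 = S|G, 3 = T|D|G
}

for number, (template, choices) in components.items():
    print(f"{number}: {complexity(template, choices):.1e}")
```

Under these assumptions the products for components 1-5, 7 and 8 round to the values printed in the table. For component 6 the product is 2 x 19^8, about 3.4E10, whereas the table lists 1.7E10, which equals 19^8 alone; the sketch does not resolve which figure was intended.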

TABLE-US-00039 TABLE 17P vgDNA encoding the CDR3 elements of the library

! CDR3 library components
(Ctop25) 5'-gctctggtcaa C|TTA|AGg|gct|gag|g-3' (SEQ ID NO: 40)
(CtprmA) 5'-gctctggtcaa C|TTA|AGg|gct|gag|gac-
!                       AflII...
            |acc|gct|gtc|tac|tac|tgc|gcc-3' (SEQ ID NO: 41)
!
(CBprmB)[RC] 5'-|tac|ttc|gat|tac|ttg|ggc|caa|GGT|ACC|ctG|GTC|ACC|tcgctccacc-3' (SEQ ID NO: 42)
!                                                BstEII...
(CBot25)[RC] 5'-|GGT|ACC|ctG|GTC|ACC|tcgctccacc-3' (SEQ ID NO: 43)
!
! N.B. [RC] means the actual oligonucleotide is the reverse complement
!      of the one shown.
! N.B. The 20 bases at 3' end of CtprmA are identical to the most 5' 20 bases
!      of each of the vgDNA molecules.
! N.B. Ctop25 is identical to the most 5' 25 bases of CtprmA.
! N.B. The 23 most 3' bases of CBprmB are the reverse complement of the
!      most 3' 23 bases of each of the vgDNA molecules.
! N.B. CBot25 is identical to the 25 bases at the 5' end of CBprmB.
!
(C1t08) 5'-cc|gct|gtc|tac|tac|tgc|B|-
           <2>|<1>|<1>|<1>|<1>-
           |tac|ttc|gat|tac|ttg|ggc|caa|GG-3' (SEQ ID NO: 44)
! 2 = KR, 1 = 0.27Y + 0.27G + 0.027{ADEFHIKLMNPQRSTVW} no C
!
(C2t10) 5'-cc|gct|gtc|tac|tac|tgc|gcc|-
           <2>|<1>|<1>|<1>|<1>|<1>|<1>|-
           tac|ttc|gat|tac|ttg|ggc|caa|GG-3' (SEQ ID NO: 45)
! 2 = KR, 1 = 0.27Y + 0.27G + 0.027{ADEFHIKLMNPQRSTVW} no C
!
(C3t12) 5'-cc|gct|gtc|tac|tac|tgc|gcc|-
           <2>|<1>|<1>|<1>|<1>|<1>|<1>|<1>|<1>|-
           tac|ttc|gat|tac|ttg|ggc|caa|GG-3' (SEQ ID NO: 46)
! 2 = KR, 1 = 0.27Y + 0.27G + 0.027{ADEFHIKLMNPQRSTVW} no C
!
(C4t14) 5'-cc|gct|gtc|tac|tac|tgc|gcc|cgt|-
           |<1>|<1>|<1>|tct|<2>|tct|<3>|<1>|<1>|<1>|-
           tac|ttc|gat|tac|ttg|ggc|caa|GG-3' (SEQ ID NO: 47)
! 2 = SG, 1 = 0.27Y + 0.27G + 0.027{ADEFHIKLMNPQRSTVW} no C, 3 = YW
!
(C5t15) 5'-cc|gct|gtc|tac|tac|tgc|gcc|-
           <2>|<1>|<1>|<1>|tgc|tct|ggt|<1>|<1>|tgc|tat|<1>|-
           tac|ttc|gat|tac|ttg|ggc|caa|GG-3' (SEQ ID NO: 48)
! 2 = KR, 1 = 0.27Y + 0.27G + 0.027{ADEFHIKLMNPQRSTVW} no C
!
(C6t17) 5'-cc|gct|gtc|tac|tac|tgc|gcc|-
           <2>|<1>|<1>|tct|<1>|act|atc|ttc|ggt|<1>|<1>|<1>|<1>|<1>|-
           tac|ttc|gat|tac|ttg|ggc|caa|GG-3' (SEQ ID NO: 49)
! 2 = KR, 1 = 0.27Y + 0.27G + 0.027{ADEFHIKLMNPQRSTVW} no C
!
(C7t18) 5'-cc|gct|gtc|tac|tac|tgc|gcc|cgt|-
           |<1>|<1>|<1>|tat|tac|<2>|tct|<3>|<3>|tac|tat|<1>|<1>|<1>|-
           tac|ttc|gat|tac|ttg|ggc|caa|GG-3' (SEQ ID NO: 50)
! 2 = DG, 1 = 0.27Y + 0.27G + 0.027{ADEFHIKLMNPQRSTVW} no C, 3 = SG
!
(C8t19) 5'-cc|gct|gtc|tac|tac|tgc|gcc|cgt|-
           |<1>|<1>|<1>|<1>|tat|tgc|<2>|<2>|<3>|<1>|tgc|tat|<1>|<1>|<1>|-
           tac|ttc|gat|tac|ttg|ggc|caa|GG-3' (SEQ ID NO: 51)
! 2 = SG, 1 = 0.27Y + 0.27G + 0.027{ADEFHIKLMNPQRSTVW} no C, 3 = TDG
!
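The N.B. notes in Table 17P describe how the four primers overlap one another and the constant ends of the vgDNA. These relationships can be checked mechanically. The sketch below does so for the sequences as printed, using the constant 5' and 3' flanks of C2t10 to represent "each of the vgDNA molecules". The helper names `clean` and `revcomp` are illustrative, and treating the [RC] entries as top-strand text whose reverse complement is the real oligonucleotide follows the first N.B. note.

```python
def clean(seq):
    """Drop codon separators and spacing; compare in lowercase."""
    return "".join(ch for ch in seq if ch.isalpha()).lower()


_COMPLEMENT = str.maketrans("acgt", "tgca")


def revcomp(seq):
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


ctop25 = clean("gctctggtcaa C|TTA|AGg|gct|gag|g")
ctprma = clean("gctctggtcaa C|TTA|AGg|gct|gag|gac|acc|gct|gtc|tac|tac|tgc|gcc")
# The [RC] entries are printed as the top strand; the real oligos are their reverse complements.
cbprmb = revcomp(clean("|tac|ttc|gat|tac|ttg|ggc|caa|GGT|ACC|ctG|GTC|ACC|tcgctccacc"))
cbot25 = revcomp(clean("|GGT|ACC|ctG|GTC|ACC|tcgctccacc"))
# Constant flanks of the variegated oligos, taken from C2t10.
vg_5prime = clean("cc|gct|gtc|tac|tac|tgc|gcc|")
vg_3prime = clean("tac|ttc|gat|tac|ttg|ggc|caa|GG")

checks = {
    "Ctop25 equals the first 25 bases of CtprmA": ctprma[:25] == ctop25,
    "last 20 bases of CtprmA equal the first 20 of the vgDNA": ctprma[-20:] == vg_5prime[:20],
    "last 23 bases of CBprmB are the reverse complement of the vgDNA 3' end":
        cbprmb[-23:] == revcomp(vg_3prime[-23:]),
    "CBot25 equals the first 25 bases of CBprmB": cbprmb[:25] == cbot25,
}

for description, ok in checks.items():
    print(("ok   " if ok else "FAIL ") + description)
```

All four checks pass for the sequences as printed, which also gives a quick way to catch transcription errors if the oligonucleotides are retyped.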

TABLE-US-00040 TABLE 19 Names of 1398 GeneBank entries examined haj10335 hsa006165 hsa234190 hsa234288 hsa239366 hsa240594 hsa244963 hs201e3 hsa006167 hsa234191 hsa234290 hsa239367 hsa240595 hsa244965 hs201g1 hsa006169 hsa234193 hsa234291 hsa239368 hsa240599 hsa244966 hs201m2 hsa006171 hsa234194 hsa234294 hsa239369 hsa240604 hsa244967 hs202e2 hsa006173 hsa234196 hsa234296 hsa239370 hsa241344 hsa244968 hs202g3 hsa131921 hsa234197 hsa234298 hsa239371 hsa241345 hsa244969 hs202g9 hsa132847 hsa234199 hsa235649 hsa239372 hsa241346 hsa244970 hs202m3 hsa132849 hsa234202 hsa235658 hsa239373 hsa241347 hsa244971 hs203e1 hsa132850 hsa234203 hsa235662 hsa239375 hsa241348 hsa244972 hs203g1 hsa132851 hsa234205 hsa235664 hsa239376 hsa241349 hsa244973 hs203m5 hsa132852 hsa234206 hsa235665 hsa239377 hsa241350 hsa244974 hs204e1 hsa224746 hsa234207 hsa235667 hsa239378 hsa241351 hsa244975 hs204g1 hsa225092 hsa234208 hsa235671 hsa239379 hsa241353 hsa244976 hs3d6hcv hsa225093 hsa234209 hsa235675 hsa239380 hsa241354 hsa244977 hs6d4xa7 hsa230634 hsa234211 hsa235677 hsa239381 hsa241355 hsa244978 hs6d4xb7 hsa230635 hsa234212 hsa238036 hsa239382 hsa241356 hsa244979 hs6d4xf1 hsa230636 hsa234214 hsa238037 hsa239383 hsa241357 hsa244980 hs6d4xf2 hsa230637 hsa234217 hsa238038 hsa239384 hsa241420 hsa244981 hs6d4xg3 hsa230638 hsa234221 hsa238039 hsa239385 hsa241421 hsa244982 hs6d4xh5 hsa230639 hsa234224 hsa238040 hsa239386 hsa242555 hsa244983 hs83x6b2 hsa230640 hsa234227 hsa238326 hsa239387 hsa242556 hsa244984 hs83x6b5 hsa230641 hsa234229 hsa238327 hsa239388 hsa243108 hsa244985 hs83x6c3 hsa230643 hsa234230 hsa238328 hsa239390 hsa243110 hsa244986 hs83x6c4 hsa230644 hsa234232 hsa239330 hsa239391 hsa244928 hsa244987 hs83x6c5 hsa230645 hsa234235 hsa239331 hsa240553 hsa244929 hsa244988 hs83x6d4 hsa230646 hsa234238 hsa239332 hsa240554 hsa244930 hsa244989 hs83x6f1 hsa230647 hsa234239 hsa239333 hsa240555 hsa244931 hsa244990 hs83x6f2 hsa230648 hsa234242 hsa239334 hsa240556 hsa244932 hsa244991 hs83x6f3 
hsa230649 hsa234245 hsa239335 hsa240557 hsa244933 hsa244992 hs83x6f5 hsa230650 hsa234248 hsa239336 hsa240558 hsa244934 hsa244993 hs83x6h3 hsa230651 hsa234249 hsa239337 hsa240559 hsa244935 hsa244994 hs83x9a6 hsa230652 hsa234251 hsa239338 hsa240560 hsa244936 hsa244995 hs83x9b6 hsa230653 hsa234252 hsa239339 hsa240561 hsa244937 hsa244996 hs83x9b9 hsa230654 hsa234255 hsa239340 hsa240562 hsa244938 hsa244997 hs83x9c8 hsa230655 hsa234256 hsa239341 hsa240563 hsa244939 hsa244998 hs83x9d6 hsa230656 hsa234257 hsa239342 hsa240564 hsa244940 hsa244999 hs83x9d7 hsa230657 hsa234258 hsa239343 hsa240565 hsa244941 hsa245000 hs83x9e6 hsa230658 hsa234259 hsa239344 hsa240566 hsa244942 hsa245001 hs83x9e8 hsa234156 hsa234260 hsa239345 hsa240567 hsa244943 hsa245002 hs83x9e9 hsa234158 hsa234262 hsa239346 hsa240568 hsa244944 hsa245003 hs83x9f6 hsa234160 hsa234263 hsa239347 hsa240569 hsa244945 hsa245004 hs83x9g6 hsa234161 hsa234264 hsa239348 hsa240570 hsa244946 hsa245005 hs9d4x10 hsa234163 hsa234266 hsa239349 hsa240571 hsa244947 hsa245006 hs9d4x7 hsa234164 hsa234268 hsa239350 hsa240572 hsa244948 hsa245007 hs9d4x8 hsa234166 hsa234269 hsa239351 hsa240573 hsa244949 hsa245008 hs9d4x9 hsa234168 hsa234270 hsa239353 hsa240575 hsa244950 hsa245009 hs9d4xa6 hsa234169 hsa234272 hsa239354 hsa240576 hsa244951 hsa245010 hs9d4xa7 hsa234171 hsa234273 hsa239355 hsa240578 hsa244952 hsa245011 hs9d4xb6 hsa234172 hsa234274 hsa239356 hsa240580 hsa244953 hsa245012 hs9d4xc2 hsa234175 hsa234276 hsa239357 hsa240581 hsa244954 hsa245013 hs9d4xd6 hsa234178 hsa234277 hsa239358 hsa240582 hsa244955 hsa245014 hs9d4xe6 hsa234180 hsa234279 hsa239359 hsa240585 hsa244956 hsa245015 hs9d4xf3 hsa234181 hsa234281 hsa239360 hsa240586 hsa244957 hsa245016 hs9d4xh4 hsa234183 hsa234282 hsa239361 hsa240588 hsa244958 hsa245017 hs9d4xh5 hsa234184 hsa234283 hsa239362 hsa240589 hsa244959 hsa245018 hsa005975 hsa234186 hsa234284 hsa239363 hsa240590 hsa244960 hsa245019 hsa005977 hsa234187 hsa234286 hsa239364 hsa240592 hsa244961 hsa245020 
hsa006161 hsa234189 hsa234287 hsa239365 hsa240593 hsa244962 hsa245021 hsa245022 hsa245217 hsa245305 hsa279524 hsabhiv8 hsb8g2g08 hsevh52a1 hsa245023 hsa245218 hsa245307 hsa279526 hsadeigvh hsb8g3b07 hsevh52a2 hsa245024 hsa245219 hsa245309 hsa279527 hsaj2768 hsb8g3c07 hsevh52a3 hsa245025 hsa245220 hsa245311 hsa279528 hsaj2769 hsb8g3c08 hsevh52a4 hsa245026 hsa245221 hsa245312 hsa279529 hsaj2771 hsb8g3c12 hsevh52a5 hsa245027 hsa245222 hsa245313 hsa279530 hsaj2772 hsb8g3d03 hsevh52b1 hsa245028 hsa245223 hsa245315 hsa279531 hsaj2773 hsb8g3d04 hsevh53a1 hsa245029 hsa245224 hsa245317 hsa279532 hsaj2776 hsb8g3d07 hsevh53a2 hsa245030 hsa245225 hsa245318 hsa279533 hsaj2777 hsb8g3d08 hsfog1h hsa245031 hsa245226 hsa245319 hsa279535 hsaj4083 hsb8g3e02 hsfog3h hsa245032 hsa245228 hsa245320 hsa279536 hsaj4899 hsb8g3e03 hsfogbh hsa245033 hsa245229 hsa245321 hsa279537 hsasighc hsb8g3f03 hsfom1h hsa245034 hsa245230 hsa245322 hsa279543 hsavh510 hsb8g3g01 hsfs10hc hsa245035 hsa245231 hsa245323 hsa279544 hsavh512 hsb8g3g03 hsfs11hc hsa245036 hsa245232 hsa245325 hsa279545 hsavh513 hsb8g3g05 hsfs9whc hsa245037 hsa245233 hsa245326 hsa279552 hsavh514 hsb8g3g10 hsgad2h hsa245039 hsa245234 hsa245338 hsa389169 hsavh515 hsb8g3h01 hsgvh0117 hsa245040 hsa245235 hsa245342 hsa389170 hsavh516 hsb8g4c02 hsgvh0118 hsa245041 hsa245236 hsa245343 hsa389171 hsavh517 hsb8g4e01 hsgvh0119 hsa245042 hsa245237 hsa245345 hsa389172 hsavh519 hsb8g4e05 hsgvh0120 hsa245043 hsa245238 hsa245346 hsa389173 hsavh520 hsb8g4f11 hsgvh0121 hsa245044 hsa245239 hsa245347 hsa389174 hsavh523 hsb8g4h09 hsgvh0122 hsa245045 hsa245240 hsa245348 hsa389175 hsavh524 hsb8g4h10 hsgvh0123 hsa245046 hsa245241 hsa245349 hsa389176 hsavh526 hsb8g5d10 hsgvh0124 hsa245047 hsa245246 hsa245350 hsa389177 hsavh529 hsb8g5h08 hsgvh0201 hsa245048 hsa245251 hsa245352 hsa389178 hsavh53 hsbel1 hsgvh0202 hsa245049 hsa245255 hsa245353 hsa389179 hsavh56 hsbel14 hsgvh0203 hsa245050 hsa245258 hsa245355 hsa389180 hsb3g4a07 hsbel28 hsgvh0204 hsa245051 
hsa245260 hsa245356 hsa389181 hsb73g04n hsbel29 hsgvh0205 hsa245052 hsa245261 hsa245357 hsa389182 hsb74a08n hsbel3 hsgvh0206 hsa245053 hsa245262 hsa245358 hsa389183 hsb7g1a11 hsbel34 hsgvh0207 hsa245054 hsa245263 hsa245359 hsa389184 hsb7g2b01 hsbel43 hsgvh0208 hsa245055 hsa245265 hsa249378 hsa389185 hsb7g3a01 hsbel45 hsgvh0209 hsa245056 hsa245266 hsa249628 hsa389186 hsb7g3a05 hsbel5 hsgvh0210 hsa245057 hsa245268 hsa249629 hsa389187 hsb7g3a10 bsbel54 hsgvh0211 hsa245058 hsa245272 hsa249630 hsa389188 hsb7g3b02 bsbel69 hsgvh0213 hsa245059 hsa245273 hsa249631 hsa389190 hsb7g3b03 hsbo1vhig hsgvh0214 hsa245060 hsa245275 hsa249632 hsa389191 hsb7g3b05 hsbo3vhig hsgvh0215 hsa245061 hsa245277 hsa249633 hsa389192 hsb7g3c03 hsbr1vhig hsgvh0216 hsa245062 hsa245278 hsa249634 hsa389193 hsb7g3c12 hsbradh3 hsgvh0217 hsa245063 hsa245279 hsa249635 hsa389194 hsb7g3d07 hscal4ghc hsgvh0218 hsa245064 hsa245280 hsa249636 hsa389195 hsb7g3e01 hsd4xd10 hsgvh0219 hsa245065 hsa245281 hsa249637 hsa389927 hsb7g3f02 hsd4xf21 hsgvh0220 hsa245066 hsa245282 hsa271600 hsa389929 hsb7g3f10 hsd4xg2 hsgvh0221 hsa245067 hsa245283 hsa271601 hsa6351 hsb7g3g02 hsd4xi10 hsgvh0222 hsa245068 hsa245284 hsa271602 hsa7321 hsb7g3g04 hsd4xi4 hsgvh0223 hsa245069 hsa245285 hsa271603 hsa7322 hsb7g4a08 hsd4xk9 hsgvh0224 hsa245071 hsa245286 hsa271604 hsa7323 hsb7g4c05 hsd4xl3 hsgvh0302 hsa245072 hsa245287 hsa279513 hsa7325 hsb7g4d09 hsd5hc hsgvh0304 hsa245073 hsa245288 hsa279514 hsa7326 hsb7g4f08 hsdo1vhig hsgvh0306 hsa245201 hsa245289 hsa279515 hsa7328 hsb7g4g07 hseliepa1 hsgvh0307 hsa245203 hsa245290 hsa279516 hsa7438 hsb7g5g03 hseliepa3 hsgvh0308 hsa245204 hsa245291 hsa279517 hsa7440 hsb8g1c04 hseliepa4 hsgvh0309 hsa245208 hsa245292 hsa279519 hsa7441 hsb8g1e04 hseliepb2 hsgvh0310 hsa245209 hsa245294 hsa279520 hsa7442 hsb8g1f03 hseliepd2 hsgvh0311 hsa245210 hsa245298 hsa279521 hsa7443 hsb8g1g04 hselilpb1 hsgvh0312 hsa245214 hsa245299 hsa279522 hsa7444 hsb8g1h02 hsevh51a1 hsgvh0314 hsa245215 hsa245301 hsa279523 hsaarma1 
hsb8g2f09 hsevh51b1 hsgvh0315 hsgvh0318 hsig001vh hsighpat5 hsigvhc07 hsimghc1 hsmvh0401 hsrou233 hsgvh0320 hsig030vh hsighpat6 hsigvhc08 hsimghc2 hsmvh0403 hsrt792hc hsgvh0321 hsig033vh hsighpat7 hsigvhc09 hsimghc3 hsmvh0404 hsrt79hc hsgvh0322 hsig039vh hsighpat8 hsigvhc10 hsimghc4 hsmvh0405 hssm1vhig hsgvh0323 hsig040vh hsighpat9 hsigvhc11 hsimghcS hsmvh0406 hssp46a hsgvh0324 hsig055vh hsighpt11 hsigvhc12 hsin42p5 hsmvh0501 hst14vh hsgvh0325 hsig057vh hsighpt12 hsigvhc14 hsin51p7 hsmvh0502 hst14x1 hsgvh0326 hsig1059 hsighpta1 hsigvhc16 hsin51p8 hsmvh0503 hst14x10 hsgvh0327 hsig10610 hsighvb5 hsigvhc17 hsin78 hsmvh0504 hst14x11 hsgvh0328 hsig13g10 hsighvca hsigvhc18 hsin87 hsmvh0505 hst14x12 hsgvh0329 hsig473 hsighvcb hsigvhc19 hsin89p2 hsmvh0506 hst14x13 hsgvh0330 hsig7sa11 hsighvcc hsigvhc20 hsin92 hsmvh0507 hst14x14 hsgvh0331 hsigaehc hsighvcd hsigvhc21 hsin98p1 hsmvh0508 hst14x15 hsgvh0332 hsigaf2h2 hsighvce hsigvhc22 hsjac10h hsmvh0509 hst14x16 hsgvh0333 hsigashc hsighvm hsigvhc23 hsjhba1f hsmvh0510 hst14x17 hsgvh0334 hsigathc hsighxx1 hsigvhc24 hsjhbr2f hsmvh0511 hst14x18 hsgvh0335 hsigdvrhc hsighxx10 hsigvhc25 hsjhej1f hsmvh0513 hst14x19 hsgvh0336 hsigg1kh hsighxx11 hsigvhc26 hsld1110 hsmvh0515 hst14x20 hsgvh0419 hsigg1kl hsighxx12 hsigvhc27 hsld1117 hsmvh0529 hst14x21 hsgvh0420 hsigg1lh hsighxx14 hsigvhc28 hsld152 hsmvh51 hst14x22 hsgvh0421 hsigghc85 hsighxx16 hsigvhc29 hsld21 hsmvh510 hst14x23 hsgvh0422 hsigghcv3 hsighxx18 hsigvhc30 hsld217 hsmvh511 hst14x24 hsgvh0423 hsigghevr hsighxx2 hsigvhc31 hsld218 hsmvh512 hst14x25 hsgvh0424 hsiggvdj1 hsighxx20 hsigvhc32 hsld25 hsmvh515 hst14x3 hsgvh0428 hsiggvdj2 hsighxx21 hsigvhc33 hsmad2h hsmvh516 hst14x6 hsgvh0429 hsiggvhb hsighxx22 hsigvhc35 hsmbcl5h4 hsmvh517 hst14x7 hsgvh0430 hsiggvhc hsighxx23 hsigvhc36 hsmica1h hsmvh53 hst14x8 hsgvh0517 hsigh10g1 hsighxx25 hsigvhc37 hsmica3h hsmvh54 hst14x9 hsgvh0519 hsigh10g2 hsighxx26 hsigvhc38 hsmica4h hsmvh55 hst22x1 hsgvh0522 hsigh10g3 hsighxx28 hsigvhc39 hsmica5h 
hsmvh56 hst22x11 hsgvh0523 hsigh10g4 hsighxx29 hsigvhc40 hsmica6h hsmvh57 hst22x12 hsgvh0526 hsigh10g5 hsighxx3 hsigvhc41 hsmica7h hsmvh58 hst22x13 hsgvh0527 hsigh10g7 hsighxx30 hsigvhc42 hsmt11ige hsmvh59 hst22x14 hsgvh0531 hsigh10g8 hsighxx31 hsigvhc43 hsmt12ige hsnamembo hst22x15 hsgvh511 hsigh10g9 hsighxx32 hsigvhls hsmt13ige hsnpb346e hst22x18 hsgvh512 hsigh13g1 hsighxx34 hsigvhttd hsmt14ige hsoak3h hst22x20 hsgvh513 hsigh13g7 hsighxx36 hsigvp151 hsmt15ige hsog31h hst22x21 hsgvh515 hsigh14g1 hsighxx37 hsigvp152 hsmt16ige hspag1h hst22x22 hsgvh519 hsigh14g2 hsighxx38 hsigvp153 hsmt17ige hsrael hst22x23 hsgvh521 hsigh2f2 hsighxx5 hsigvp154 hsmt21ige hsregah hst22x25 hsgvh526 hsigh3135 hsighxx6 hsigvp155 hsmt22ige hsrfabh37 hst22x26 hsgvh530 hsigh35 hsighxx7 hsigvp156 hsmt23ige hsrighvja hst22x27 hsgvh533 hsigh44 hsighxx8 hsigvp157 hsmt24ige hsrighvjb hst22x28 hsgvh534 hsigh4c2 hsighxx9 hsigvp158 hsmt25ige hsrou10 hst22x30 hsgvh535 hsigh9e1 hsigkrf hsigvp251 hsmt26ige hsrou11 hst22x9 hsgvh536 hsighadi2 hsigmhavh hsigvp255 hsmt27ige hsrou111 hsu24687 hsgvh55 hsighadi3 hsigrhe15 hsigvp256 hsmutuiem hsrou112 hsu24688 hsh217e hsighcvr hsigtgk1h hsigvp257 hsmvh0001 hsrou119 hsu24690 hsh241e hsighcza hsigtgk4h hsigvp360 hsmvh0002 hsrou122 hsu24691 hsh28e hsighczb hsigtgl9h hsigvp363 hsmvh0003 hsrou126 hsv52a512 hsha3d1ig hsighczc hsigvarh1 hsigvp369 hsmvh0004 hsrou127 hsvdj10h hshambh hsighczd hsigvhc hsigvp39 hsmvh0005 hsrou129 hsvdj12h hshcmg42 hsighczf hsigvhc01 hsihr8 hsmvh0006 hsrou13 hsvgcg1 hshcmg43 hsighczg hsigvhc02 hsihr9 hsmvh0007 hsrou131 hsvgcm1 hshcmg44 hsigheavy hsigvhc03 hsihv1 hsmvh0009 hsrou18 hsvgcm2 hshcmg46 hsighpat2 hsigvhc04 hsihv11 hsmvh0010 hsrou219 hsvh1djh6 hshcmt42 hsighpat3 hsigvhc05 hsihv18 hsmvh0011 hsrou221 hsvh3djh4 hshcmt47 hsighpat4 hsigvhc06 hsim9vch hsmvh0012 hsrou222 hsvh4dj hsvh4djh6 hsvhic11 hsww1p10e hsy14935 hsz80377 hsz80424 hsz80482 hsvh4r hsvhic2 hsx98932 hsy14936 hsz80378 hsz80426 hsz80483 hsvh52a43 hsvhic3 hsx98933 
hsy14937 hsz80383 hsz80427 hsz80487 hsvh52a55 hsvhid1 hsx98934 hsy14938 hsz80385 hsz80429 hsz80489 hsvh5dj hsvhid5 hsx98935 hsy14939 hsz80386 hsz80433 hsz80492 hsvh5djh5 hsvhid7 hsx98936 hsy14940 hsz80388 hsz80436 hsz80495 hsvh710p1 hsvhid9 hsx98938 hsy14943 hsz80390 hsz80438 hsz80496 hsvheg7 hsvhie4 hsx98939 hsy14945 hsz80391 hsz80439 hsz80499 hsvhfa2 hsvhif10 hsx98940 hsy18120 hsz80392 hsz80441 hsz80500 hsvhfa7 hsvhif3 hsx98941 hsz74663 hsz80393 hsz80442 hsz80502 hsvhfb5 hsvhif7 hsx98943 hsz74665 hsz80394 hsz80443 hsz80504 hsvhfc2 hsvhig2 hsx98944 hsz74668 hsz80397 hsz80445 hsz80507 hsvhfd7 hsvhp2 hsx98945 hsz74671 hsz80400 hsz80458 hsz80509 hsvhfe5 hsvhp29 hsx98946 hsz74672 hsz80403 hsz80459 hsz80512 hsvhfg9 hsvhp30 hsx98947 hsz74682 hsz80406 hsz80460 hsz80513 hsvhgd8 hsvhp32 hsx98948 hsz74688 hsz80407 hsz80461 hsz80517 hsvhgd9 hsvhp34 hsx98950 hsz74690 hsz80409 hsz80462 hsz80519 hsvhgh7 hsvhp4 hsx98951 hsz74693 hsz80411 hsz80463 hsz80520 hsvhha10 hsvhp46 hsx98952 hsz74695 hsz80412 hsz80465 hsz80527 hsvhia2 hsvhp48 hsx98953 hsz80363 hsz80414 hsz80466 hsz80534 hsvhia5 hsvhp53 hsx98954 hsz80364 hsz80415 hsz80473 hsz80538 hsvhib12 hsvhp7 hsx98955 hsz80365 hsz80416 hsz80474 hsz80544 hsvhib6 hsvigd9 hsx98956 hsz80367 hsz80417 hsz80475 hsz80545 hsvhib8 hswad35vh hsx98958 hsz80368 hsz80418 hsz80476 hsvhic1 hswanembo hsx98963 hsz80372 hsz80421 hsz80477 hsvhic10 hswo1vhig hsy14934 hsz80375 hsz80422 hsz80480

TABLE-US-00041 TABLE 20P Human GLG CDR1 & CDR2 AA seqs
Name	CDR1 (positions 1-7)	CDR2 (positions 1-19)
1-02	GYY--MH (SEQ ID NO: 230)	WINPNSGG--TNYAQKFQG (SEQ ID NO: 231)
1-03	SYA--MH (SEQ ID NO: 232)	WINAGNGN--TKYSQKFQG (SEQ ID NO: 233)
1-08	SYD--IN (SEQ ID NO: 234)	WMNPNSGN--TGYAQKFQG (SEQ ID NO: 235)
1-18	SYG--IS (SEQ ID NO: 236)	WISAYNGN--TNYAQKLQG (SEQ ID NO: 237)
1-24	ELS--MH (SEQ ID NO: 238)	GFDPEDGE--TIYAQKFQG (SEQ ID NO: 239)
1-45	YRY--LH (SEQ ID NO: 240)	WITPFNGN--TNYAQKFQD (SEQ ID NO: 241)
1-46	SYY--MH (SEQ ID NO: 242)	IINPSGGS--TSYAQKFQG (SEQ ID NO: 243)
1-58	SSA--VQ (SEQ ID NO: 244)	WIVVGSGN--TNYAQKFQE (SEQ ID NO: 245)
1-69	SYA--IS (SEQ ID NO: 246)	GIIPIFGT--ANYAQKFQG (SEQ ID NO: 247)
1-e	SYA--IS (SEQ ID NO: 248)	GIIPIFGT--ANYAQKFQG (SEQ ID NO: 249)
1-f	DYY--MH (SEQ ID NO: 250)	LVDPEDGE--TIYAEKFQG (SEQ ID NO: 251)
2-05	TSGVGVG (SEQ ID NO: 252)	LIYWNDDK---RYSPSLKS (SEQ ID NO: 253)
2-26	NARMGVS (SEQ ID NO: 254)	HIFSNDEK---SYSTSLKS (SEQ ID NO: 255)
2-70	TSGMRVS (SEQ ID NO: 256)	RIDWDDDK---FYSTSLKT (SEQ ID NO: 257)
3-07	SYW--MS (SEQ ID NO: 258)	NIKQDGSE--KYYVDSVKG (SEQ ID NO: 259)
3-09	DYA--MH (SEQ ID NO: 260)	GISWNSGS--IGYADSVKG (SEQ ID NO: 261)
3-11	DYY--MS (SEQ ID NO: 262)	YISSSGST--IYYADSVKG (SEQ ID NO: 263)
3-13	SYD--MH (SEQ ID NO: 264)	AIGTAGD---TYYPGSVKG (SEQ ID NO: 265)
3-15	NAW--MS (SEQ ID NO: 266)	RIKSKIDGGITDYAAPVKG (SEQ ID NO: 267)
3-20	DYG--MS (SEQ ID NO: 268)	GINWNGGS--TGYADSVKG (SEQ ID NO: 269)
3-21	SYS--MN (SEQ ID NO: 270)	SISSSSSY--IYYADSVKG (SEQ ID NO: 271)
3-23	SYA--MS (SEQ ID NO: 272)	AISGSGGS--TYYADSVKG (SEQ ID NO: 273)
3-30	SYG--MH (SEQ ID NO: 274)	VISYDGSN--KYYADSVKG (SEQ ID NO: 275)
3303	SYA--MH (SEQ ID NO: 276)	VISYDGSN--KYYADSVKG (SEQ ID NO: 277)
3305	SYG--MH (SEQ ID NO: 278)	VISYDGSN--KYYADSVKG (SEQ ID NO: 279)
3-33	SYG--MH (SEQ ID NO: 280)	VIWYDGSN--KYYADSVKG (SEQ ID NO: 281)
3-43	DYT--MH (SEQ ID NO: 282)	LISWDGGS--TYYADSVKG (SEQ ID NO: 283)
3-48	SYS--MN (SEQ ID NO: 284)	YISSSSST--IYYADSVKG (SEQ ID NO: 285)
3-49	DYA--MS (SEQ ID NO: 286)	FIRSKAYGGTTEYTASVKG (SEQ ID NO: 287)
3-53	SNY--MS (SEQ ID NO: 288)	VIYSGGS---TYYADSVKG (SEQ ID NO: 289)
3-64	SYA--MH (SEQ ID NO: 290)	AISSNGGS--TYYANSVKG (SEQ ID NO: 291)
3-66	SNY--MS (SEQ ID NO: 292)	VIYSGGS---TYYADSVKG (SEQ ID NO: 293)
3-72	DHY--MD (SEQ ID NO: 294)	RTRNKANSYTTEYAASVKG (SEQ ID NO: 295)
3-73	GSA--MH (SEQ ID NO: 296)	RIRSKANSYATAYAASVKG (SEQ ID NO: 297)
3-74	SYW--MH (SEQ ID NO: 298)	RINSDGSS--TSYADSVKG (SEQ ID NO: 299)
3-d	SNE--MS (SEQ ID NO: 300)	SISGGS----TYYADSRKG (SEQ ID NO: 301)
4-04	SSNW-WS (SEQ ID NO: 302)	EIYHSGS---TNYNPSLKS (SEQ ID NO: 303)
4-28	SSNW-WG (SEQ ID NO: 304)	YIYYSGS---TYYNPSLKS (SEQ ID NO: 305)
4301	SGGYYWS (SEQ ID NO: 306)	YIYYSGS---TYYNPSLKS (SEQ ID NO: 307)
4302	SGGYSWS (SEQ ID NO: 308)	YIYHSGS---TYYNPSLKS (SEQ ID NO: 309)
4304	SGDYYWS (SEQ ID NO: 310)	YIYYSGS---TYYNPSLKS (SEQ ID NO: 311)
4-31	SGGYYWS (SEQ ID NO: 312)	YIYYSGS---TYYNPSLKS (SEQ ID NO: 313)
4-34	GYY--WS (SEQ ID NO: 314)	EINHSGS---TNYNPSLKS (SEQ ID NO: 315)
4-39	SSSYYWG (SEQ ID NO: 316)	SIYYSGS---TYYNPSLKS (SEQ ID NO: 317)
4-59	SYY--WS (SEQ ID NO: 318)	YIYYSGS---TNYNPSLKS (SEQ ID NO: 319)
4-61	SGSYYWS (SEQ ID NO: 320)	YIYYSGS---TNYNPSLKS (SEQ ID NO: 321)
4-b	SGYY-WG (SEQ ID NO: 322)	SIYHSGS---TYYNPSLKS (SEQ ID NO: 323)
5-51	SYW--IG (SEQ ID NO: 324)	IIYPGDSD--TRYSPSFQG (SEQ ID NO: 325)
5-a	SYW--IS (SEQ ID NO: 326)	RIDPSDSY--TNYSPSFQG (SEQ ID NO: 327)
6-1	SNSAAWN (SEQ ID NO: 328)	RTYYRSKWY-NDYAVSVKS (SEQ ID NO: 329)
74.1	SYA--MN (SEQ ID NO: 330)	WINTNIGN--PTYAQGFIG (SEQ ID NO: 331)
CDR1 of human GLGs A C D E F G H I K L M N P Q R S T V W Y -- Consens.
1 7 1 3 2 35 2 1 Sd x 2 2 6 1 1 4 1 7 29 Ysg x 3 11 3 1 10 2 1 6 1 5 11 YAGS x 4 1 2 1 2 7 38 -- 5 1 2 1 1 5 41 -- 6 6 1 28 4 12 Mwi 7 1 5 16 5 1 23 SHng CDR2 of human GLGs 1 3 2 1 5 1 2 3 1 7 4 6 7 9 X 2 1 46 1 2 1 I 3 4 1 1 2 2 8 3 12 1 1 1 15 ysn x 4 2 2 4 1 10 1 11 2 1 5 12 ysp x 5 1 8 2 1 6 2 4 8 1 17 1 sd x 6 3 7 2 26 3 8 2 Gsd x 7 4 1 17 1 2 24 1 1 SG x 8 1 3 3 3 10 9 4 1 2 15 -ns 9 2 3 46 -- 10 1 3 47 -- 11 2 4 5 1 1 35 3 T 12 1 2 2 1 3 2 1 11 2 3 1 22 Yn x 13 51 Y 14 31 11 1 6 1 1 An x 15 4 16 1 1 1 14 11 2 1 dpq x 16 1 11 1 38 Sk 17 13 15 1 22 Vlf 18 37 13 1 Kq 19 1 1 34 14 1 GS

TABLE-US-00042 TABLE 21P Tallies of Amino-acid frequencies in mature CDR1 and CDR2 Tally of 23 examples with length 14 A C D E F G H I K L M N P Q R S T V W Y | X 1 8 2 13 2 3 15 3 2 3 2 1 14 1 5 4 2 2 11 5 3 5 7 1 1 13 1 6 1 4 3 12 2 1 7 3 1 1 2 1 5 10 8 6 1 1 2 1 6 4 2 9 1 5 1 3 1 4 7 1 10 1 8 3 1 2 1 4 1 2 11 1 1 1 1 2 1 16 12 1 2 1 1 1 1 1 1 14 13 4 2 17 14 4 1 5 4 5 4 Tally of 11 examples with length 12 A C D E F G H I K L M N P Q R S T V W Y | X 1 4 7 2 1 4 4 2 3 7 4 4 1 1 1 5 2 1 5 1 9 1 6 2 1 3 2 3 7 3 1 3 1 3 8 1 3 2 1 2 2 9 1 1 9 10 1 10 11 11 12 2 1 7 1 Tally of 175 examples with length 7 A C D E F G H I K L M N P Q R S T V W Y | X 1 2 1 1 2 1 3 2 153 10 2 3 2 1 87 1 10 1 5 61 2 2 3 3 26 1 54 1 5 1 2 76 3 1 2 4 6 1 1 6 1 2 1 11 1 145 5 5 2 13 2 2 3 6 2 140 6 1 1 1 13 159 7 2 1 67 1 10 88 5 1 Tally of 38 examples with length 6 A C D E F G H I K L M N P 4 R S T V W Y | X 1 2 34 2 2 1 2 1 8 4 22 3 3 26 9 4 1 1 29 7 5 38 6 10 3 3 Tally of 820 examples with length 5 A C D E F G H I K L M N P Q R S T V W Y Seen 1 8 81 10 151 4 8 5 3 100 4 15 364 55 8 4 SGNDT x 15 2 7 5 12 24 1 30 1 1 5 26 1 1 23 2 681 Y 15 3 202 4 24 13 13 133 10 2 7 5 2 3 32 14 13 112 231 YAGW x 17 4 6 172 2 7 409 3 16 205 MWI 8 5 8 6 1 1 49 241 2 79 1 3 367 56 2 4 SHNTx 14 CDR2 Tally of 31 examples with CDR2 of length 19 A C D E F G H I K L M N P Q R S T V W Y X RF x 1 11 1 1 1 15 1 1 I 2 1 28 2 Rk 3 9 1 18 1 1 1 S 4 1 2 6 21 1 K x 5 1 1 1 22 1 1 1 1 1 1 A x 6 16 1 1 1 1 3 1 6 1 y x 7 1 9 7 3 1 10 G 8 23 1 1 5 1 G 9 2 18 1 1 1 7 1 T 10 4 1 1 1 1 1 21 1 T 11 1 3 1 26 x 12 2 11 9 1 1 1 1 2 1 2 Y 13 1 1 29 A 14 29 1 1 A 15 25 3 1 1 1 Sp 16 1 10 20 V 17 1 1 29 K 18 1 27 1 2 G 19 1 30 Tally of 579 (n > 50, bold; over 400, underscored) examples with length 17 A C D E F G H I K L M N P Q R S T V W Y X 1 44 1 1 2 11 81 5 69 1 14 6 41 1 4 34 30 19 118 66 31 VGIW x 2 7 522 1 10 17 1 3 3 8 10 I 3 3 1 22 5 7 6 51 25 1 76 8 262 19 1 46 46 SNIx 4 39 2 8 6 16 64 9 3 2 3 15 178 23 6 50 11 8 16 120 PYGx 5 3 
194 6 1 70 6 44 6 4 1 55 4 8 133 9 7 1 27 DSGN x 6 3 1 75 4 45 326 1 6 43 1 63 8 1 2 GDS x 7 8 24 5 226 3 3 3 4 24 2 11 245 14 6 1 SG x 8 4 2 57 37 5 22 4 18 18 2 2 161 1 4 11 106 90 2 1 32 NST x 9 56 11 2 63 157 1 3 3 11 5 13 4 242 8 TKIA x 10 1 14 2 13 30 23 6 29 2 3 110 3 52 20 10 1 1 259 YNR x 11 2 7 5 1 4 3 5 551 Y 12 405 1 2 18 1 6 2 3 1 89 8 44 A 13 7 323 22 7 4 1 4 66 138 3 1 3 DQP x 14 2 5 6 3 123 4 2 7 421 1 2 2 SK x 15 1 1 188 2 1 22 3 1 357 2 1 VF 16 1 13 1 1 332 3 2 1 1 199 21 4 KQ x 17 11 1 565 1 1 G Tally of 464 (over 50, bold; over 400, underscored) A C D E F G H I K L M N P Q R S T V W Y X 1 5 13 184 8 1 7 1 2 15 6 3 26 65 9 14 105 EYSL x 2 6 429 3 4 1 2 19 I 3 1 13 13 4 10 5 154 1 12 1 250 YN x 4 1 12 2 6 199 2 1 3 4 5 2 19 28 15 165 YH x 5 5 20 1 1 18 4 9 1 22 365 16 1 1 S x 6 13 8 439 1 1 1 1 G 7 20 2 14 2 4 2 26 1 12 357 20 1 2 1 S x 8 13 2 4 8 1 2 4 3 6 420 1 T 9 10 4 1 10 1 8 1 245 13 9 3 1 1 157 NY x 10 6 2 2 2 1 7 444 Y 11 14 3 1 1 8 408 4 21 2 2 N 12 4 13 4 2 1 418 14 7 1 P 13 2 2 6 452 1 1 S 14 2 2 441 1 18 L 15 18 413 3 5 11 10 1 2 1 K 16 1 1 31 2 2 3 419 5 S

TABLE-US-00043 TABLE 22P Tally of VH types
VH type	Tally
1-02	16
1-03	16
1-08	13
1-18	27
1-24	5
1-45	0
1-46	14
1-58	1
1-69	37
1-e	16
1-f	1
2-05	13
2-26	1
2-70	2
3-07	33
3-09	13
3-11	15
3-13	4
3-15	10
3-20	4
3-21	25
3-23	85
3-30	55
3303	59
3305	0
3-33	42
3-43	1
3-48	24
3-49	11
3-53	12
3-64	4
3-66	4
3-72	3
3-73	3
3-74	12
3-d	0
4-04	29
4-28	3
4301	46
4302	7
4304	37
4-31	0
4-34	184
4-39	65
4-59	45
4-61	9
4-b	11
5-51	55
5-a	13
6-1	7
74.1	3
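By way of illustration only, the tallies of Table 22P can be summarized per VH family. The sketch below is not part of the original analysis. It assumes that the family is given by the first character of each name, so that "3303" and "4301" fall in families 3 and 4 and "74.1" in family 7; that grouping rule and the names VH_TALLY, family_totals and most_common_type are assumptions made for the example.

```python
from collections import Counter

# Counts copied from Table 22P (VH type -> number of sequences tallied).
VH_TALLY = {
    "1-02": 16, "1-03": 16, "1-08": 13, "1-18": 27, "1-24": 5, "1-45": 0,
    "1-46": 14, "1-58": 1, "1-69": 37, "1-e": 16, "1-f": 1,
    "2-05": 13, "2-26": 1, "2-70": 2,
    "3-07": 33, "3-09": 13, "3-11": 15, "3-13": 4, "3-15": 10, "3-20": 4,
    "3-21": 25, "3-23": 85, "3-30": 55, "3303": 59, "3305": 0, "3-33": 42,
    "3-43": 1, "3-48": 24, "3-49": 11, "3-53": 12, "3-64": 4, "3-66": 4,
    "3-72": 3, "3-73": 3, "3-74": 12, "3-d": 0,
    "4-04": 29, "4-28": 3, "4301": 46, "4302": 7, "4304": 37, "4-31": 0,
    "4-34": 184, "4-39": 65, "4-59": 45, "4-61": 9, "4-b": 11,
    "5-51": 55, "5-a": 13, "6-1": 7, "74.1": 3,
}


def family_totals(tally):
    """Sum the counts per VH family.

    Assumption: the family is the first character of the type name.
    """
    totals = Counter()
    for name, count in tally.items():
        totals[name[0]] += count
    return dict(totals)


def most_common_type(tally):
    """Return the (name, count) pair with the largest count."""
    return max(tally.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for family, total in sorted(family_totals(VH_TALLY).items()):
        print(f"VH{family}: {total}")
    print("most frequent type:", most_common_type(VH_TALLY))
```

Run as a script, this prints one total per family followed by the most frequently observed type.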

TABLE-US-00044 TABLE 23P Oligonucleotides used to variegate CDR1 and CDR2 of human HC
!(name) 5'-....DNA sequence....-3'
! Everything to the right of an exclamation point is commentary.
! [RC] means "reverse complement" of sequence shown.
! If the last non-comment and non-blank character is "-", then continue on the next line.
! Ignore case, "a" = "A", "c" = "C", etc.
! Ignore "|" and blanks.
! <number> means incorporate trinucleotide mixture of given number.
!-------------------------------------------------------------------------
!
! CDR1
(ON-R1V1vg) 5'- ct|TCC|GGA|ttc|act|ttc|tct|-
  <1>|tac|<1>|atg|<1>|- ! CDR1 of length 5, ON = 55 bases
  tgg|gtt|cgC|CAa|gct|ccT|GG-3' (SEQ ID NO: 27)
! <1> = ADEFGHIKLMNPQRSTVWY no C
!
(ON-R1top) 5'-cctactgtct |TCC|GGA|ttc|act|ttc|tct-3' (SEQ ID NO: 28)
(ON-R1bot)[RC] 5'-tgg|gtt|cgC|CAa|gct|ccT|GG ttgctcactc-3' (SEQ ID NO: 29)
(ON-R1V2vg) 5'- ct|TCC|GGA|ttc|act|ttc|tct|-
  <6>|<7>|<7>|tac|tac|tgg|<7>|- ! CDR1 of length 7, ON = 61 bases
  tgg|gtt|cgC|CAa|gct|ccT|GG-3' (SEQ ID NO: 30)
! <6> = ST, 1:1
! <7> = 0.2025(SG) + 0.035(ADEFHIKLMNPQRTVWY) no C
(ON-R1V3vg) 5'-ct|TCC|GGA|ttc|act|ttc|tct|-
  |atc|agc|ggt|ggt|tct|atc|tcc|<1>|<1>|<1>|tac|tac|tgg|<1>|- ! CDR1, L = 14
  tgg|gtt|cgC|CAa|gct|ccT|GG-3' (SEQ ID NO: 31) ! ON = 82 bases
! CDR2
(ON-R2V1vg) 5'-ggt|ttg|gag|tgg|gtt|tct|-
  <2>|atc|<2>|<3>|tct|ggt|ggc|<1>|act|<1>|-
  tat|gct|gac|tcc|gtt|aaa|gg-3' (SEQ ID NO: 32) ! ON = 68 bases, CDR2 = 17 AA
(ON-R2top) 5'-ct|tgg|gtt|cgC|CAa|gct|ccT|GGt|aaa|ggt|ttg|gag|tgg|gtt|tct-3' (SEQ ID NO: 33)
(ON-R2bot)[RC] 5'-tat|gct|gac|tcc|gtt|aaa|ggt|-
  cgc|ttc|act|atc|TCT|AGA|ttcctgtcac-3' (SEQ ID NO: 34) ! XbaI plus 10 bases of scab
(ON-R2V2vg) 5'-ggt|ttg|gag|tgg|gtt|tct|-
  <1>|atc|<4>|<1>|<1>|ggt|<5>|<1>|<1>|<1>|-
  tat|gct|gac|tcc|gtt|aaa|gg-3' (SEQ ID NO: 35) ! ON = 68 bases, CDR2 = 17 AA
! <4> = DINSWY, equimolar
! <5> = SGDN, equimolar
(ON-R2V3vg) 5'-ggt|ttg|gag|tgg|gtt|tct|-
  <1>|atc|<4>|<1>|<1>|ggt|<5>|<1>|<1>|-
  tat|aac|cct|tcc|ctt|aag|gg-3' (SEQ ID NO: 36) ! ON = 65 bases, CDR2 = 16 AA
(ON-R2bo3)[RC] 5'-tat|aac|cct|tcc|ctt|aag|ggt|-
  cgc|ttc|act|atc|TCT|AGA|ttcctgtcac-3' (SEQ ID NO: 37) ! XbaI plus 10 bases of scab
(ON-R2V4vg) 5'-ggt|ttg|gag|tgg|gtt|tct|-
  <1>|atc|<8>|agt|<1>|<1>|<1>|ggt|ggt|act|act|<1> tat|gcc|gct|tcc|gtt|aag|gg-3' (SEQ ID NO: 38) ! ON = 74 bases, CDR2 = 19 AA
(ON-R2bo4)[RC] 5'-tat|gcc|gct|tcc|gtt|aag|ggt|-
  cgc|ttc|act|atc|TCT|AGA|ttcctgtcac-3' (SEQ ID NO: 39) ! XbaI plus 10 bases of scab
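The notation rules given at the head of Table 23P can be checked mechanically. The following sketch, offered only as an illustration, counts the bases of an oligonucleotide written in that notation: commentary after "!" is dropped, "|", blanks and the continuation "-" are ignored, and each <number> is counted as one trinucleotide (three bases). The function name oligo_length and the constants R1V1VG and R2V4VG are hypothetical names introduced for the example; the sequences are copied from the table.

```python
import re


def oligo_length(spec):
    """Count bases in an oligonucleotide written in the Table 23P notation."""
    # Everything to the right of "!" on a line is commentary.
    text = "".join(line.split("!", 1)[0] for line in spec.splitlines())
    # Each <number> stands for one trinucleotide mixture, i.e. three bases.
    n_mixed = len(re.findall(r"<\d+>", text))
    text = re.sub(r"<\d+>", "", text)
    # The remaining letters a/c/g/t (either case) are fixed bases;
    # "|", blanks, "-" and the 5'/3' markers are not counted.
    n_fixed = len(re.findall(r"[acgtACGT]", text))
    return n_fixed + 3 * n_mixed


# Sequence portions of ON-R1V1vg and ON-R2V4vg, copied from Table 23P.
R1V1VG = """5'- ct|TCC|GGA|ttc|act|ttc|tct|-
<1>|tac|<1>|atg|<1>|- ! CDR1 of length 5
tgg|gtt|cgC|CAa|gct|ccT|GG-3'"""

R2V4VG = """5'-ggt|ttg|gag|tgg|gtt|tct|-
<1>|atc|<8>|agt|<1>|<1>|<1>|ggt|ggt|act|act|<1>|-
tat|gcc|gct|tcc|gtt|aag|gg-3'"""

if __name__ == "__main__":
    print("ON-R1V1vg:", oligo_length(R1V1VG), "bases")
    print("ON-R2V4vg:", oligo_length(R2V4VG), "bases")
```

The counts obtained this way can be compared with the "ON = ... bases" annotations in the table.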

TABLE-US-00045 TABLE 25P Lengths of CDRs in 285 human kappa chains
Length	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19
CDR1	0	0	0	0	0	0	0	0	0	0	0	154	73	3	0	0	28	27	0	0
CDR2	0	0	0	0	0	0	0	285	0	0	0	0	0	0	0	0	0	0	0	0
CDR3	0	5	0	0	1	0	3	2	28	166	63	12	1	1	0	0	0	0	0	1
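As an illustration of how the length tallies of Table 25P may be read, the sketch below reports, for each CDR, the most frequent length and the fraction of tallied chains having that length. It is a minimal example and not part of the original analysis; the names KAPPA_CDR_LENGTHS and summarize are assumptions. Fractions are computed against each row's own total as printed.

```python
# Length tallies copied from Table 25P; list index = CDR length (0 to 19).
KAPPA_CDR_LENGTHS = {
    "CDR1": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 154, 73, 3, 0, 0, 28, 27, 0, 0],
    "CDR2": [0, 0, 0, 0, 0, 0, 0, 285, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "CDR3": [0, 5, 0, 0, 1, 0, 3, 2, 28, 166, 63, 12, 1, 1, 0, 0, 0, 0, 0, 1],
}


def summarize(counts):
    """Return (most frequent length, row total, fraction at that length)."""
    total = sum(counts)
    mode = max(range(len(counts)), key=lambda length: counts[length])
    return mode, total, counts[mode] / total


if __name__ == "__main__":
    for cdr, counts in KAPPA_CDR_LENGTHS.items():
        mode, total, fraction = summarize(counts)
        print(f"{cdr}: modal length {mode}, {fraction:.1%} of {total} tallied")
```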

TABLE-US-00046 TABLE 26P Tally of kappa types: V and J
V genes:
V gene	Tally
O12	59
O2	0
O18	0
O8	0
A20	0
A30	0
L14	0
L1	2
L15	0
L4	2
L18	0
L5	4
L19	0
L8	4
L23	0
L9	1
L24	0
L11	4
L12	8
O11	10
O1	0
A17	5
A1	0
A18	3
A2	0
A19	13
A3	0
A23	4
A27	79
A11	26
L2	28
L16	0
L6	11
L20	0
L25	0
B3	22
B2	0
A26	0
A10	0
A14	0
JH#	1	2	3	4	5
tally	105	64	29	78	9

TABLE-US-00047 TABLE 27P Names of Kappa chains analyzed AB022651 AB022653 AB022654 AB022656 AF007572 AF021036 AF103499 AF103500 AF103527 AF103873 AF107244 AF107245 AF107246 AF107247 AF115361 AF165099 AF165101 AF165103 AF165108 AF165110 AF165111 AF184763 AF184767 hsa004955 hsa004956 hsa011133 HSA241367 HSA241375 HSA388639 HSA388640 HSA388641 HSA388642 HSA388643 HSA388644 HSA388645 HSA388646 HSA388647 HSA388648 HSA388650 HSA388651 HSA388652 HSA388653 HSA388654 HSA388655 HSA388656 HSA388657 hsew1vk hsew3vk hsew4vk hsigdpk13 hsigg1kl HSIGGVKA hsigk123 hsigk319 hsigklc14 hsigklc28 hsigklc5 hsigklg31 hsigklv01 hsigklv02 hsigklv03 hsigklv04 hsigklv05 hsigklv06 hsigklv07 hsigklv09 hsigklv10 hsigklv12 hsigklv13 hsigklv14 hsigklv15 hsigklv16 hsigklv17 hsigklv18 hsigklv19 hsigklv20 hsigklv21 hsigklv22 hsigklv23 hsigklv24 hsigklv25 hsigklv27 hsigklv28 hsigklv29 hsigklv31 hsigklv32 hsigklv33 hsigklv34 hsigklv35 hsigklv36 hsigklv37 hsigklv38 hsigklv39 hsigklv40 hsigklv41 hsigklv42 hsigklv43 hsigklv44 hsigklv45 hsigklv46 hsigklv49 hsigklv50 hsigklv51 hsigklv52 hsigklv53 hsigklv54a hsigklv56 hsigklv57 hsigklv58 hsigklv59 hsigklv60 hsigklv61 hsigklv62 hsigklv63 hsigklv65 hsigklv66 hsigklv68 hsigklv69 hsigklv71 hsigkvba hsigkvbb hsigkvbc hsigkvbd hsigkvbe hsigkvbf hsigkvc01 hsigkvc03 hsigkvc06 hsigkvc11 hsigkvc12 hsigkvc20 hsigkvc23 hsigkvc27 hsigkvc29 hsigrklc hsikcvjp1 hsikcvjp2 hsikcvjp3 hsikcvjp6 hsikcvjp7 hsld110vl hsld117vl hsld128vl hsld140vl hsld152vl hsld184vl hsld198vl hsld24vl hsmbcl1k1 hsmbcl1k2 hsmbcl2k2 hsmbcl5k4 hss10avl hss17bvl hss1a15vl HSU44792 HSU44794 HSU94422 hsz84852 hsz84853 humigk1dm humigk3am humigk3bm humigk3cm humigkacoa humigkacob humigkacoc humigkacoe humigkacof humigkb1aa humigkb1ab humigkb1ac humigkvra humigkvrb humigkvrc humigkvrd humigkvre humigkvrg humigkvrh humigkvri humigkx humigky1 humigky2 humigky4 humigky5 humigky6 humigl3ac humikc humikca humikcad humikcaf humikcag humikcah humikcai humikcaj humikcal humikcam humikcan humikcas humikcau 
humikcav humikcaw humikcax humikcay humikcaz humikcb humikcba humikcbb humikcbc humikcbd humikcbe humikcbf humikcbg humikcbh humikcbi humikcbj humikcbl humikcbm humikcbn humikcbo humikcbp humikcbq humikcbs humikcbt humikcbu humikcbv humikcbw humikcbx humikcbz humikcc humikcca humikccb humikccc humikccd humikcce humikccf humikccg humikcch humikcci humikccj humikcck humikcco humikccp humikccq humikccr

humikccs humikcct humikccu humikccv humikccw humikcd humikcf humikcg humikch humikci humikck humikcm humikcn humikco humikcp humikcq humikcr humikcs humikct humikcu humikcv humikcva humikcvb humikcvc humikcvd humikcve humikcvf humikcvg humikcvh humikcvi humikcvj humikcw humikcx humikcy humikcz S46248 S82746 S82747 SU96396 SU96397

TABLE-US-00048 TABLE 28P AA types seen in 154 kappa sequences having CDR1 of length 11 Tally A C D E F G H I K L M N P Q R S T V W Y 1 11 143 R 2 148 1 2 2 1 A 3 152 2 S 4 1 3 3 147 Q 5 12 1 27 7 3 99 4 1 S 6 1 81 1 71 V 7 2 4 18 5 1 2 9 12 97 3 1 S 8 3 5 1 2 1 31 1 10 87 12 1 S 9 2 7 10 1 6 29 1 8 13 77 Y 10 2 150 1 1 L 11 96 4 2 46 2 1 3 A

TABLE-US-00049 TABLE 30P Synthetic Kappa light chain gene ! ! ! A27::JH1 with all CDRs replaced by stuffers. ! Each stuffer contains at least one stop codon and a ! restriction site that will be unique within the diversity vector. ! 1 GAGGACCATt GGGCCCC ctccgagact ! Scab...... EcoO109I ! ApaI. !----------------------------------- ! 28 CTCGAG cgca ! XhoI.. !----------------------------------- ! 38 acgcaatTAA TGTgagttag ctcactcatt aggcacccca ggcTTTACAc tttatgcttc ! ..-35.. Plac ..-10. !----------------------------------- ! 98 cggctcgtat gttgtgtgga attgtgagcg gataacaatt tc !----------------------------------- ! 140 acacagga aacagctatgac !----------------------------------- ! 160 catgatta cgCCAAGCTT TGGagccttt tttttggaga ttttcaac (SEQ ID NO: 54) ! PflMI....... ! Hind3. !----------------------------------- ! ! M13 III signal sequence (AA seq)---------------------------> ! 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ! M K K L L F A I P L V V P F Y 206 gtg aag aag ctc cta ttt gct atc ccg ctt gtc gtt ccg ttt tac !---------------------------------- ! ! --Signal--> FR1-------------------------------------------> ! 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 ! S H S A Q S V L T Q S P G T L 251 |agc|cat|aGT|GCA|Caa|tcc|gtc|ctt|act|caa|tct|cct|ggc|act|ctt| ! ApaLI... !---------------------------------- ! ! ----- FR1 ------------------------------------->| CDR1------> ! 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 ! S L S P G E R A T L S C R A S (SEQ ID NO: 55) ! |tcG|CTA|AGC|CCG|GGt|gaa|cgt|gct|acC|TTA|AGt|tgc|cgt|gct|tcc|(SEQ ID NO: 54; ! EspI..... AflII... ! XmaI.... ! !---------------------------------- ! For CDR1: ! <1> ADEFGHIKLMNPQRSTVWY equimolar ! <2> S(0.2) ADEFGHIKLMNPQRTVWY (0.044 each) ! <3> Y(0.2) ADEFGHIKLMNPQRSTVW (0.044 each) ! In a preferred embodiment, we omit codon 52 in vgDNA for CDR1. ! ! ------- CDR1 --------------------->|--- FR2 ---------------> ! <1> <2> <2> xxx <3> ! 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ! 
Q S V S S S Y L A W Y Q Q K P |cag|tct|gtt|tcc|tct|tct|tat|ctt|gct|tgg|tat|caa|cag|aaA|CCT| ! SexAI... !----------------------------------- ! For CDR2: ! <1> ADEFGHIKLMNPQRSTVWY equimolar ! <2> S(0.2) ADEFGHIKLMNPQRTVWY (0.044 each) ! <4> A(0.2) DEFGHIKLMNPQRSTVWY (0.044 each) ! ----- FR2 ------------------------->|------- CDR2 ----------> ! <1> <2> <4> ! 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 ! G Q A P R L L I Y G A S S R A |GGT|caG|GCG|CCg|cgt|tta|ctt|att|tat|ggt|gct|tct|tcc|cgc|gct| ! SexAI.... KasI.... (CDR1 installed as AFlII-(SexAI or KasI) cassette.) ! !----------------------------------- ! ! CDR2-->|--- FR3 -----------------------------------------------> ! <1> ! 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 ! T G I P D R F S G S G S G T D |act|gGG|ATC|CCG|GAC|CGt|ttc|tct|ggc|tct|ggt|tca|ggt|act|gac| ! BamHI... ! RsrII..... ! (CDR2 installed as (SexAI or KasI) to (BamHI or RsrII) cassette.) !----------------------------------- ! ! ------ FR3 -------------------------------------------------> ! 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 ! F T L T I S R L E P E D F A V 477 |ttt|acc|ctt|act|att|TCT|AGA|ttg|gaa|cct|gaa|gac|ttc|gct|gtt| ! XbaI... ! !----------------------------------- ! ! ----------->|----CDR3-------------------------->|-----FR4-- --> ! <3> <1> <1> <1> <1> ! 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 ! Y Y C Q Q Y G S S P E T F G Q |tat|tat|tgC|CAa|cag|taT|GGt|tct|tct|cct|gaa|act|ttc|ggt|caa| ! BstXI........... ! !----------------------------------- ! ! -----FR4------------------->| <------- Ckappa ------------ ! 121 122 123 124 125 126 127 128 129 130 131 132 133 134 ! G T K V E I K R T V A A P S 510 |ggt|aCC|AAG|Gtt|gaa|atc|aag| |CGT|ACG|gtt|gcc|gct|cct|agt| ! StyI.... BsiWI.. ! ! (CDR3 installed as XbaI to (StyI or BsiWI) cassette.) ! ! 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 ! V F I F P P S D E Q L K S G T 552 |gtg|ttt|atc|ttt|cct|cct|tct|gac|gaa|CAA|TTG|aag|tca|ggt|act| ! MfeI... ! ! 
150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 ! A S V V C L L N N F Y P R E A 597 |gct|tct|gtc|gta|tgt|ttg|ctc|aac|aat|ttc|tac|cCT|CGT|Gaa|gct| ! BssSI... ! ! 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 ! K V Q W K V D N A L Q S G N S 642 |aaa|gtt|cag|tgg|aaa|gtc|gat|aAC|GCG|Ttg|cag|tcg|ggt|aac|agt| ! MluI.... ! ! 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 ! Q E S V T E Q D S K D S T Y S 687 |caa|gaa|tcc|gtc|act|gaa|cag|gat|agt|aag|gac|tct|acc|tac|tct| ! ! 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 ! L S S T L T L S K A D Y E K H 732 |ttg|tcc|tct|act|ctt|act|tta|tca|aag|gct|gat|tat|gag|aag|cat| ! ! 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 ! K V Y A C E V T H Q G L S S P 777 |aag|gtc|tat|GCt|TGC|gaa|gtt|acc|cac|cag|ggt|ctG|AGC|TCc|cct| ! SacI.... ! ! 225 226 227 228 229 230 231 232 233 234 ! V T K S F N R G E C . . (SEQ ID NO: 332) 822 |gtt|acc|aaa|agt|ttc|aaC|CGT|GGt|gaa|tgc|taa|tag GGCGCGCC ! DsaI.... AscI.... ! BssHII ! 866 acgcatctctaa GCGGCCGC aacaggaggag (SEQ ID NO: 333) ! NotI.... ! EagI..
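The annotated listing of Table 30P pairs each codon with the amino acid it encodes, and writes restriction sites in upper case. As a hedged illustration, the sketch below translates one row of that listing with the standard genetic code and confirms that the MfeI recognition sequence (CAATTG) is present. The helper names clean_dna and translate, and the constant CKAPPA_135_149, are assumptions introduced for the example; the codons are those listed for positions 135 to 149.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_dna(annotated):
    """Drop "|" separators and blanks and fold case, as the table notes direct."""
    return "".join(ch for ch in annotated.upper() if ch in "ACGT")


def translate(annotated):
    """Translate an annotated codon string; "*" marks a stop codon."""
    dna = clean_dna(annotated)
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Codons 135-149 of the Ckappa region, copied from Table 30P.
CKAPPA_135_149 = "|gtg|ttt|atc|ttt|cct|cct|tct|gac|gaa|CAA|TTG|aag|tca|ggt|act|"

if __name__ == "__main__":
    print(translate(CKAPPA_135_149))
    print("MfeI site present:", "CAATTG" in clean_dna(CKAPPA_135_149))
```

The translation can be compared with the amino-acid row printed above the codons in the table. Note that a plain table lookup of this kind translates every codon literally, so an initiator codon such as the gtg at the start of the signal sequence would need separate handling.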

TABLE-US-00050 TABLE 31P Tally of 285 CDR2s of length 7 in human kappa Tally A C D E F G H I K L M N P Q R S T V W Y 1 51 62 7 95 1 11 15 2 1 2 6 6 3 22 1 x 2 225 18 5 5 2 1 1 3 16 9 A 3 2 9 1 2 267 2 1 1 S 4 2 1 5 4 9 1 77 4 93 80 2 7 Sx 5 1 2 80 200 2 R 6 162 7 36 4 4 1 3 3 63 2 Ax 7 5 1 3 1 1 2 2 1 125 144 x

TABLE-US-00051 TABLE 32P Tally of 166 CDR3s of length 9 from human kappa. Tally A C D E F G H I K L M N P Q R S T V W Y 1 4 8 21 131 1 1 Q 2 1 9 2 1 153 Q 3 14 4 4 3 6 4 1 1 3 21 16 3 4 82 Yx 4 1 9 1 2 37 4 2 2 15 1 33 2 20 7 1 29 x 5 2 2 6 3 4 5 3 28 17 7 65 19 1 1 3 x 6 7 1 11 2 3 8 1 4 3 41 33 5 28 19 x 7 1 2 6 146 2 2 5 2 P 8 2 4 1 2 21 7 3 5 1 38 7 4 25 1 3 1 16 25 x 9 3 2 1 1 2 157 T
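The consensus column of Table 32P (Q, Q, Yx, x, x, x, P, x, T) corresponds to the length-9 kappa CDR3 scheme QQ<3><1><1><1>P<1>T recited in the claims. The sketch below, given only as an illustration, counts the amino-acid sequences that such a scheme can encode and estimates the expected fraction of library members carrying one particular sequence. It assumes the proportions stated in Table 30P (<1> equimolar over 19 types; <3> with Y at 0.2 and the remaining 18 types sharing the rest equally, about 0.044 each). The names equimolar, biased, SCHEME, n_sequences and probability are hypothetical.

```python
from math import prod

AA19 = "ADEFGHIKLMNPQRSTVWY"  # the 19 allowed amino-acid types (no C)


def equimolar():
    """Mixture <1>: all 19 types at equal frequency."""
    return {aa: 1 / len(AA19) for aa in AA19}


def biased(favored, p):
    """Mixture with one favored type at frequency p; the rest share 1 - p equally."""
    others = [aa for aa in AA19 if aa != favored]
    mix = {aa: (1 - p) / len(others) for aa in others}
    mix[favored] = p
    return mix


def fixed(aa):
    """A non-variegated position."""
    return {aa: 1.0}


# QQ<3><1><1><1>P<1>T, with <3> taken as Y(0.2) plus 18 others (assumption
# based on the CDR mixture definitions printed in Table 30P).
SCHEME = [
    fixed("Q"), fixed("Q"), biased("Y", 0.2),
    equimolar(), equimolar(), equimolar(),
    fixed("P"), equimolar(), fixed("T"),
]


def n_sequences(scheme):
    """Number of distinct amino-acid sequences the scheme can encode."""
    return prod(len(position) for position in scheme)


def probability(scheme, sequence):
    """Expected fraction of members with exactly this sequence."""
    if len(sequence) != len(scheme):
        return 0.0
    return prod(pos.get(aa, 0.0) for pos, aa in zip(scheme, sequence))


if __name__ == "__main__":
    print("distinct CDR3 sequences:", n_sequences(SCHEME))
    print("fraction QQYGSSPET:", probability(SCHEME, "QQYGSSPET"))
```

Under these assumptions the five variegated positions each admit 19 types, so the count is 19 to the fifth power; the bias toward Y at the third position raises the expected frequency of sequences resembling the A27 germline CDR3.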

TABLE-US-00052 TABLE 33P Lengths of CDRs in 93 human lambda chains
Length	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18+
CDR1	0	0	0	0	0	0	0	0	0	0	0	23	7	15	46	0	0	0	2
CDR2	5	0	0	1	0	0	0	80	2	0	0	1	4	0	0	0	0	0	1
CDR3	0	0	0	0	0	0	0	0	1	16	28	27	6	1	0	4	6	4	0

TABLE-US-00053 TABLE 34P Tally of 46 CDR1s of length 14 from human lambda chains Tally A C D E F G H I K L M N P Q R S T V W Y 1 2 2 1 41 T 2 43 3 G 3 2 1 1 6 36 TGx 4 1 45 S 5 5 1 40 S 6 39 1 4 2 DNx 7 8 1 37 V 8 1 42 2 1 G 9 4 1 35 1 2 3 TGx 10 1 1 3 1 2 38 Yx 11 4 1 35 6 DNx 12 3 1 2 1 1 2 36 Yx 13 1 2 43 V 14 1 4 41 S

TABLE-US-00054 TABLE 35P Synthetic human lambda-chain gene
! Lambda 14-7(A) 2a2::JH2::Clambda
! AA sequence tested
!
1 GAGGACCATt GGGCCCC ttactccgtgac
! Scab...... EcoO109I
! ApaI..
!-----------------------------------------------
!
! -----------FR1-------------------------------------------->
! 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
! S A Q S A L T Q P A S V S G S P G
30 aGT|GCA|Caa|tcc|gct|ctc|act|cag|cct|GCT|AGC|gtt|tcc|gGG|TcA|CCt|GGT|
! ApaLI... NheI... BstEII...
! SexAI....
!-----------------------------------------------
!
! For CDR1,
! <1> = 0.27 T, 0.27 G, 0.027 {ADEFHIKLMNPQRSVWY} no C
! <2> = 0.27 D, 0.27 N, 0.027 {AEFGHIKLMPQRSTVWY} no C
! <3> = 0.36 Y, 0.0355{ADEFGHIKLMNPQRSTVW} no C
! T G <1> S S <2> V G
! ------FR1------------------> |-----CDR1---------------------
! 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
! Q S I T I S C T G T S S D V G
|caa|agt|atc|act|att|tct|TGT|ACA|ggt|act|tct|tct|gat|gtt|ggc|
! BsrGI..
!
! a second vg scheme for CDR1 gives segments of length 11:
! G.sub.23<2><4>L<4><4><4><3><4><4> where
! <4> = equimolar {ADEFGHIKLMNPQRSTVWY} no C
!-------------------------------------------------------
!
! <1> <3> <2> <3> V S = vg Scheme #1, length = 14
! -----CDR1------------->|--------FR2-------------------------
! 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
! G Y N Y V S W Y Q Q H P G K A
|ggt|tac|aat|tac|gtt|tct|tgg|tat|caa|caa|caC|CCG|GGc|aaG|GCG|
! XmaI.... KasI.....
! AvaI....
!-------------------------------------------------------------------
!
! <4> <4> <4> <2> R P S
! --FR2-----------------> |------CDR2--------------->|-----FR3--
! 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
! P K L M I Y E V S N R P S G V
|CCg|aag|ttg|atg|atc|tac|gaa|gtt|tcc|aat|cgt|cct|tct|ggt|gtt|
! KasI....
!-------------------------------------------------------------------
!
! -------FR3----------------------------------------------------
! 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
! S N R F S G S K S G N T A S L
|agc|aat|cgt|ttc|TCC|GGA|tct|aaa|tcc|ggt|aat|acc|gcA|AGC|TTa|
! BspEI.. | HindIII.
! BsaBI........(blunt)
!-------------------------------------------------------------------
!
! -------FR3-------------------------------------------------->|
! 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
! T I S G L Q A E D E A D Y Y C
|act|atc|tct|ggt|CTG|CAG|gct|gaa|gac|gag|gct|gac|tac|tat|tgt|
! PstI...
!
!-------------------------------------------------------------------
!
! <5> = 0.36 S, 0.0355{ADEFGHIKLMNPQRTVWY} no C
!
! <4> <5> <4> <2> <4> S <4> <4> <4> <4> V
! -----CDR3---------------------------------->|---FR4---------
! 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105
! S S Y T S S S T L V V F G G G
|tct|tct|tac|act|tct|tct|agt|acc|ctt|gtt|gtc|ttc|ggc|ggt|GGT|
! KpnI...
!
!------------------------------------------------------------------------
!
! -------FR4-------------->
! 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
! T K L T V L G Q P K A A P S V
279 |ACC|aaa|ctt|act|gtc|ctc|gGT|CAA|CCT|aAG|Gct|gct|cct|tcc|gtt|
! KpnI... HindII..
! Bsu36I...
!
! 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135
! T L F P P S S E E L Q A N K A
324 |act|ctc|ttc|cct|cct|agt|tct|GAA|GAG|Ctt|caa|gct|aac|aag|gct|
! SapI.....
!
! 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150
! T L V C L I S D F Y P G A V T
369 |act|ctt|gtt|tgc|tTG|ATC|Agt|gac|ttt|tat|cct|ggt|gct|gtt|act|
! BclI....
!
! 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165
! V A W K A D S S P V K A G V E
414 |gtc|gct|tgg|aaa|gcc|gat|tct|tct|cct|gtt|aaa|gct|ggt|gtt|GAG|
! BsmBI...
!
! 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180
! T T T P S K Q S N N K Y A A S
459 |ACG|acc|act|cct|tct|aaa|caa|tct|aac|aat|aag|tac|gct|gcG|AGC|
! BsmBI.... SacI....
!
! 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195
! S Y L S L T P E Q W K S H K S
504 |TCt|tat|ctt|tct|ctc|acc|cct|gaa|caa|tgg|aag|tct|cat|aaa|tcc|
! SacI...
!
! 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210
! Y S C Q V T H E G S T V E K T
549 |tat|tcc|tgt|caa|gtt|acT|CAT|GAa|ggt|tct|acc|gtt|gaa|aag|act|
! BspHI...
!
! 211 212 213 214 215 216 217 218 219
! V A P T E C S . . (SEQ ID NO: 57)
594 |gtt|gcc|cct|act|gag|tgt|tct|tag|tga|GGCGCGCC
! AscI....
! BssHII
!
629 aacgatgttc aag GCGGCCGC aacaggaggag (SEQ ID NO: 56)
! NotI.... Scab.......
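The trinucleotide mixtures defined in Table 35P each name one or two favored amino-acid types and a lower, equal proportion for the rest. As an illustrative check only, the sketch below builds each mixture from the proportions printed in the table and reports how many types it contains, whether cysteine is excluded, and what the stated proportions sum to. The names make_mixture, LAMBDA_MIXTURES and check are assumptions introduced for the example.

```python
def make_mixture(favored, others, p_other):
    """Build a mixture from favored {type: proportion} and an equal share for others."""
    mix = dict(favored)
    for aa in others:
        mix[aa] = p_other
    return mix


# Proportions copied from the comment lines of Table 35P.
LAMBDA_MIXTURES = {
    "<1>": make_mixture({"T": 0.27, "G": 0.27}, "ADEFHIKLMNPQRSVWY", 0.027),
    "<2>": make_mixture({"D": 0.27, "N": 0.27}, "AEFGHIKLMPQRSTVWY", 0.027),
    "<3>": make_mixture({"Y": 0.36}, "ADEFGHIKLMNPQRSTVW", 0.0355),
    "<4>": make_mixture({}, "ADEFGHIKLMNPQRSTVWY", 1 / 19),  # equimolar
    "<5>": make_mixture({"S": 0.36}, "ADEFGHIKLMNPQRTVWY", 0.0355),
}


def check(mix):
    """Return (number of types, True if C is absent, sum of proportions)."""
    return len(mix), "C" not in mix, sum(mix.values())


if __name__ == "__main__":
    for name, mix in LAMBDA_MIXTURES.items():
        n_types, no_cys, total = check(mix)
        print(f"{name}: {n_types} types, no C: {no_cys}, sum = {total:.3f}")
```

Because the table gives rounded proportions, the sums are expected to fall slightly short of 1 rather than to equal it exactly.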

TABLE-US-00055 TABLE 36P Tally of 23 CDR1s of length 11 from human lambda chains Tally A C D E F G H I K L M N P Q R S T V W Y 1 1 6 10 6 x 2 1 1 21 G 3 15 1 7 DNx 4 2 1 1 3 7 1 8 X 5 7 16 L 6 11 1 2 8 1 X 7 1 1 1 2 2 1 14 1 X 8 1 10 1 1 1 2 7 X 9 2 6 15 Yx 10 11 1 11 X 11 3 7 9 2 2 X

TABLE-US-00056 TABLE 37P Tally of 80 CDR2s of length 7 from human lambda chains Tally A C D E F G H I K L M N P Q R S T V W Y 1 1 14 32 1 13 3 1 4 5 1 2 3 X 2 18 2 8 16 2 34 X 3 1 2 1 31 39 4 2 X 4 6 4 1 14 1 41 8 1 1 2 1 DNx 5 1 1 78 R 6 1 77 2 P 7 2 78 S

TABLE-US-00057 TABLE 38P Tally of 27 CDR3s of length 11 from human lambda chains Tally A C D E F G H I K L M N P Q R S T V W Y 1 4 5 6 5 4 3 X 2 3 1 2 14 5 2 Sx 3 1 7 13 6 X 4 19 2 1 1 4 DNx 5 1 4 2 2 2 1 13 2 X 6 1 3 1 21 1 S 7 1 7 12 1 4 2 X 8 2 1 10 1 6 6 1 X 9 3 1 8 10 3 1 1 X 10 1 4 1 1 1 3 1 1 6 5 3 X 11 2 25 V

TABLE-US-00058 TABLE 40P Synthetic Kappa light chain gene with stuffers ! ! A27::JH1 with all CDRs replaced by stuffers. ! Each stuffer contains at least one stop codon and a ! restriction site that will be unique within the diversity vector. ! 1 GAGGACCATt GGGCCCC ctccgagact ! Scab...... EcoO109I ! ApaI. !---------------------------------- ! 28 CTCGAG cgca ! XhoI.. !---------------------------------- 38 acgcaatTAA TGTgagttag ctcactcatt aggcacccca ggcTTTACAc tttatgcttc ! ..-35.. Plac ..-10. !---------------------------------- ! 98 cggctcgtat gttgtgtgga attgtgagcg gataacaatt tc !---------------------------------- ! 140 acacagga aacagctatgac !---------------------------------- ! 160 catgatta cgCCAAGCTT TGGagccttt tttttggaga ttttcaac ! PflMI....... ! Hind3. !---------------------------------- ! ! M13 III signal sequence (AA seq)---------------------------> ! 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ! M K K L L F A I P L V V P F Y 206 gtg aag aag ctc cta ttt gct atc ccg ctt gtc gtt ccg ttt tac !---------------------------------- ! ! --Signal--> FR1-------------------------------------------> ! 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 ! S H S A Q S V L T Q S P G T L 251 |agc|cat|aGT|GCA|Caa|tcc|gtc|ctt|act|caa|tct|cct|ggc|act|ctt| ! ApaLI... !---------------------------------- ! ! ----- FR1 --------------------------------->|-------Stuffer-> ! 31 32 33 34 35 36 37 38 39 40 41 42 43 ! S L S P G E R A T L S | | 296 |tcG|CTA|AGC|CCG|GGt|gaa|cgt|gct|acC|TTA|AGt|TAG|TAA|gct|ccc| ! EspI..... AflII... ! XmaI.... !---------------------------------- ! ! ------- Stuffer for CDR1------------------------->|- FR2 --> ! 59 60 ! K P 341 |AGG|CCT|ctt|TGA|tct| g|aaA|CCT| ! StuI... SexAI... !---------------------------------- ! ! ----- FR2 ------|-----------Stuffer for CDR2----------------> ! 61 62 63 64 65 66 ! G Q A P R | | 363 |GGT|caG|GCG|CCg|cgt|TAA|TGA|a AGCGCT aa TGGCCA aca gtg ! SexAI.... KasI.... AfeI.. MscI.. !---------------------------------- ! ! 
Stuffer-->|--- FR3 -----------------------------------------------> ! <1> ! 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 ! T G I P D R F S G S G S G T D 405 |act|gGG|ATC|CCG|GAC|CGt|ttc|tct|ggc|tct|ggt|tca|ggt|act|gac| ! BamHI... ! RsrII..... !---------------------------------- ! ! ------ FR3 ---------------------STUFFER for CDR3------------------> ! 91 92 93 94 95 96 97 ! F T L T I S R | | 450 |ttt|acc|ctt|act|att|TCT|AGA|TAA|TGA| GTTAAC TAG acc TACGTA acc tag ! XbaI... HpaI.. SnaBI. !---------------------------------- ! ! ! ------------------------CDR3 stuffer------------------>|-----FR4---> ! 118 119 120 ! F G Q 501 |ttc|ggt|caa| !---------------------------------- ! ! -----FR4------------------->| <------- Ckappa ------------ ! 121 122 123 124 125 126 127 128 129 130 131 132 133 134 ! G T K V E I K R T V A A P S 510 |ggt|aCC|AAG|Gtt|gaa|atc|aag| |CGT|ACG|gtt|gcc|gct|cct|agt| ! StyI.... BsiWI.. ! !(CDR3 installed as XbaI to (StyI or BsiWI) cassette.) ! ! 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 ! V F I F P P S D E Q L K S G T (SEQ ID NO: 96) 552 |gtg|ttt|atc|ttt|cct|cct|tct|gac|gaa|CAA|TTG|aag|tca|ggt|act| ! MfeI... ! ! 866 acgcatctctaa GCGGCCGC aacaggaggag (SEQ ID NO: 95) ! NotI.... ! EagI..
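The stuffer rule stated in Table 40P (each stuffer carries at least one stop codon and a restriction site meant to be unique) can be checked with a short Python sketch. The CDR1 stuffer below is transcribed from the table. The function names are hypothetical, the reading frame is assumed to start at the first stuffer codon, and the site count covers only the sequence passed in, not the full diversity vector.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# CDR1 stuffer of the synthetic kappa gene, transcribed from Table 40P.
KAPPA_CDR1_STUFFER = "TAGTAAgctcccAGGCCTcttTGAtctg"
STUI_SITE = "AGGCCT"  # StuI recognition sequence annotated in the table


def in_frame_stops(seq, frame=0):
    """Return (offset, codon) for every stop codon in the given reading frame."""
    s = seq.upper()
    return [(i, s[i:i + 3])
            for i in range(frame, len(s) - 2, 3)
            if s[i:i + 3] in STOP_CODONS]


def count_site(seq, site):
    """Count occurrences (overlaps included) of a recognition site on the given strand."""
    s, site = seq.upper(), site.upper()
    return sum(1 for i in range(len(s) - len(site) + 1)
               if s[i:i + len(site)] == site)


stops = in_frame_stops(KAPPA_CDR1_STUFFER)
stui_count = count_site(KAPPA_CDR1_STUFFER, STUI_SITE)
print(stops, stui_count)
```

To test uniqueness in a real construct, pass the whole vector sequence to `count_site` instead of the stuffer alone.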

TABLE-US-00059 TABLE 41P Variegated DNA for kappa chains
!----------------------------------------------------------------
! Kappa chains
! For CDR1:
! <1> ADEFGHIKLMNPQRSTVWY equimolar
! <2> S(0.2) ADEFGHIKLMNPQRTVWY (0.044 each)
! <3> Y(0.2) ADEFGHIKLMNPQRSTVW (0.044 each)
! <4> A(0.2) DEFGHIKLMNPQRSTVWY (0.044 each)
(Ka1vg600) 5'-gct|acC|TTA|AGt|tgc|cgt|gct|tcc|cag-
 |<1>|gtt|<2>|<2>|<3>|ctt|gct|tgg|tat|caa|cag|aaA|CC-3' (SEQ ID NO: 66)
(Ka2vg650) 5'-caG|GCG|CCg|cgt|tta|ctt|att|tat|<1>|gct|tct|<2>|cgc|<4>|-
 |<1>|gGG|ATC|CCG|GAC|CGt|ttc|tct|ggt|tctcacc-3' (SEQ ID NO: 71)
(Ka3vg670) 5'-gac|ttc|gct|gtt|-
 |tat|tat|tgC|CAa|cag|<3>|<1>|<1>|<1>|cct|<1>|act|ttc|ggt|caa|-
 |ggt|aCC|AAG|Gtt|g-3' (SEQ ID NO: 77)
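To make the scale of the kappa variegation concrete, the Python sketch below counts how many distinct amino-acid sequences each oligonucleotide of Table 41P can encode. The code sets and oligonucleotide strings are transcribed from the table, with line-continuation hyphens removed. The function names are hypothetical, and the result is a theoretical count of encoded peptides, not a statement about the size of any library actually built.

```python
import re
from math import prod

# Residues allowed by each variegation code in Table 41P (19 each, no Cys).
KAPPA_CODES = {
    "<1>": "ADEFGHIKLMNPQRSTVWY",
    "<2>": "S" + "ADEFGHIKLMNPQRTVWY",
    "<3>": "Y" + "ADEFGHIKLMNPQRSTVW",
    "<4>": "A" + "DEFGHIKLMNPQRSTVWY",
}

KAPPA_OLIGOS = {
    "Ka1vg600": "gct|acC|TTA|AGt|tgc|cgt|gct|tcc|cag|<1>|gtt|<2>|<2>|<3>|"
                "ctt|gct|tgg|tat|caa|cag|aaA|CC",
    "Ka2vg650": "caG|GCG|CCg|cgt|tta|ctt|att|tat|<1>|gct|tct|<2>|cgc|<4>|"
                "<1>|gGG|ATC|CCG|GAC|CGt|ttc|tct|ggt|tctcacc",
    "Ka3vg670": "gac|ttc|gct|gtt|tat|tat|tgC|CAa|cag|<3>|<1>|<1>|<1>|cct|"
                "<1>|act|ttc|ggt|caa|ggt|aCC|AAG|Gtt|g",
}


def variegated_positions(oligo):
    """Return the variegation codes in the order they occur in the oligo."""
    return re.findall(r"<\d>", oligo)


def peptide_diversity(oligo, codes):
    """Number of distinct amino-acid sequences the oligo can encode."""
    return prod(len(set(codes[c])) for c in variegated_positions(oligo))


for name, oligo in KAPPA_OLIGOS.items():
    print(name, peptide_diversity(oligo, KAPPA_CODES))
```

Because the favored residue in codes `<2>`, `<3>` and `<4>` is weighted at 0.2, the encoded sequences are not equally likely; the count above ignores those weights.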

TABLE-US-00060 TABLE 42P Variegated DNA for lambda chains
!------------------------
! For CDR1,
! <1> = 0.27 T, 0.27 G, 0.027 {ADEFHIKLMNPQRSVWY} no C
! <2> = 0.27 D, 0.27 N, 0.027 {AEFGHIKLMPQRSTVWY} no C
! <3> = 0.36 Y, 0.0355{ADEFGHIKLMNPQRSTVW} no C
! <4> = equimolar {ADEFGHIKLMNPQRSTVWY} no C
! <5> = 0.36 S, 0.0355{ADEFGHIKLMNPQRTVWY} no C
(Lm1vg710) 5'-gt|atc|act|att|tct|TGT|ACA|ggt|<1>|tct|tct|<2>|gtt|ggc|-
 |<1>|<3>|<2>|<3>|gtt|tct|tgg|tat|caa|caa|caC|CC-3' (SEQ ID NO: 83)
!-------------------------------------------------
(Lm2vg750) 5'-G|CCg|aag|ttg|atg|atc|tac|-
 <4>|<4>|<4>|<2>|cgt|cct|tct|ggt|gtc|agc|aat|c-3' (SEQ ID NO: 88)
(Lm3vg817) 5'-gac|gag|gct|gac|tac|tat|tgt|-
 |<4>|<5>|<4>|<2>|<4>|tct|<4>|<4>|<4>|<4>|gtc|ttc|ggc|ggt|GGT|-
 |ACC|aaa|ctt|ac-3' (SEQ ID NO: 93)
!----------------------------------------------------------------
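The weighted mixtures of Table 42P can be read as per-position probability distributions. The Python sketch below encodes them and computes the probability that the Lm3vg817 CDR3 pattern yields one given peptide, using the parental CDR3 (SSYTSSSTLVV) shown in the lambda gene table as the example. The helper names are hypothetical, and the calculation assumes each variegated position is drawn independently with exactly the listed weights, which real oligonucleotide synthesis only approximates.

```python
AA19 = "ADEFGHIKLMNPQRSTVWY"  # the 19 residues used in Table 42P (no Cys)


def mix(favored, others_weight):
    """Build a residue -> weight map: listed weights for favored residues,
    others_weight for each remaining residue."""
    weights = {aa: others_weight for aa in AA19}
    weights.update(favored)
    return weights


LAMBDA_MIXES = {
    "<1>": mix({"T": 0.27, "G": 0.27}, 0.027),
    "<2>": mix({"D": 0.27, "N": 0.27}, 0.027),
    "<3>": mix({"Y": 0.36}, 0.0355),
    "<4>": mix({}, 1 / 19),
    "<5>": mix({"S": 0.36}, 0.0355),
}

# CDR3 portion of Lm3vg817; the fixed codons tct and gtc are written as S and V.
LM3_CDR3_PATTERN = ["<4>", "<5>", "<4>", "<2>", "<4>", "S",
                    "<4>", "<4>", "<4>", "<4>", "V"]


def peptide_probability(peptide, pattern, mixes):
    """Probability of one peptide under a pattern of mixes and fixed residues."""
    if len(peptide) != len(pattern):
        raise ValueError("peptide and pattern lengths differ")
    p = 1.0
    for aa, slot in zip(peptide, pattern):
        if slot in mixes:
            p *= mixes[slot].get(aa, 0.0)
        elif aa != slot:
            return 0.0  # mismatch at a fixed position
    return p


parental = peptide_probability("SSYTSSSTLVV", LM3_CDR3_PATTERN, LAMBDA_MIXES)
print(parental)
```

The listed weights sum to 0.999 for each biased mixture, so the probabilities are very slightly under-normalized; rescaling each map to sum to 1 is an optional refinement.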

TABLE-US-00061 TABLE 43P Constant DNA for Synthetic Library ! CDR3 library components (Ctop25) 5'-gctctggtcaa C|TTA|AGg|gct|gag|g-3' (SEQ ID NO: 58) (CtprmA) 5'-gctctggtcaa C|TTA|AGg|gct|gag|gac- ! AflII... |acc|gct|gtc|tac|tac|tgc|gcc-3' (SEQ ID NO: 59) ! (CBprmB)[RC] 5'-|tac|ttc|gat|tac|ttg|ggc|caa|GGT|ACC|ctG|GTC|ACC|tcgctccacc-3'(SEQ ID NO: 60) ! BstEII... (CBot25)[Rc] 5'-|GGT|ACC|ctG|GTC|ACC|tcgctccacc-3'(SEQ ID NO: 61) !---------------------------------------------------------------- ! Kappa chains (Ka1Top610) 5'-ggtctcagtt- G|CTA|AGC|CCG|GGt|gaa|cgt|gct|acC|TTA|AGt|tgc|cgt|gct|tcc|cag-3' (SEQ ID NO: 62) (Ka1STp615) 5'-ggtctcagtt- G|CTA|AGC|CCG|GGt|g-3' (SEQ ID NO: 63) (Ka1Bot620) [RC] 5'-ctt|gct|tgg|tat|caa|cag|aaA|- CCt|GGT|caG|GCG|CC aagtcgtgtc-3' (SEQ ID NO: 64) (Ka1SB625) [RC] 5'-cct |GGT|caG|GCG|CC aagtcgtgtc-3' (SEQ ID NO: 65) ! (Ka2Tshort657) 5'-cacgagtcctA|CCT|GGT|- caG|GC-3' (SEQ ID NO: 68) (Ka2Tlong655) 5'-cacgagtcctA|CCT|GGT|- caG|GCG|CCg|cgt|tta|ctt|att|tat-3' (SEQ ID NO: 69) (Ka2Bshort660) [RC] 5'- |GAC|CGt|ttc|tct|ggt|tctcacc-3' (SEQ ID NO: 70) !--------------------------------------------------------------- (Ka3Tlon672)5'- gacgagtcct TCT|AGA|ttg|gaa|cct|gaa|gac|ttc|gct|gtt|- |tat|tat|tgC|CAa|c -3' (SEQ ID NO: 72) (Ka3BotL682) [RC]5'-act|ttc|ggt|caa|- |ggt|aCC|AAG|Gtt|gaa|atc|aag| |CGT|ACG|tcacaggtgag-3' (SEQ ID NO: 73) (Ka3Bsho694) [RC]5'- gaa|atc|aag| |CGT|ACG| tcacaggtgag-3'(SEQ ID NO: 74) !----------------------------------------------------------------- (Lm1TPri75) 5'-gacgagtcct GG|TcA|CCt|GGT|-3' (SEQ ID NO: 78) (Lm1TLo715) 5'-gacgagtcct GG|TcA|CCt|GGt|- caa|agt|atc|act|att|tct|TGT|ACA|ggt-3' (SEQ ID NO: 79) (Lm1BLo724)[RC] 5'-gtt|tct|tgg|tat|caa|caa|caC|CCG|GGc|aaG|GCG|- AGA TCT tcacaggtgag-3' (SEQ ID NO: 80) (Lm1BSh737)[RC] 5'- Gc|aaG|GCG|- AGA TCT tcacaggtgag-3' (SEQ ID NO: 81) !------------------------------------------------- (Lm2TSh757) 5'-gagcagagga C|CCG|GGc|aaG|GC-3' (SEQ ID NO: 84) (Lm2TLo753) 5'-gagcagagga 
C|CCG|GGc|aaG|GCG|CCg|aag|ttg|atg|atc|tac|-3' (SEQ ID NO: 85) (Lm2BLo762)[RC] 5'-cgt|cct|tct|ggt|gtc|agc|aat|cgt|ttc|TCC|GGA|tcacaggtgag-3' (SEQ ID NO: 86) (Lm2BSh765)[RC] 5'- cgt|ttc|TCC|GGA|tcacaggtgag-3' (SEQ ID NO: 87) !------------------------------------------------- (Lm3TSh822) 5'-CTG|CAG|gct|gaa|gac|gag|gct|gac -3' (SEQ ID NO: 89) (Lm3TLo819) 5'-CTG|CAG|gct|gaa|gac|gag|gct|gac|tac|tat|tgt|-3' (SEQ ID NO: 90) (Lm3BLo825) [RC] 5'-gtc|ttc|ggc|ggt|GGT|- |ACC|aaa|ctt|act|gtc|ctc|gGT|CAA|CCT|aAG|G acacaggtgag-3' (SEQ ID NO: 91) (Lm3BSh832) [RC] 5'- c|gGT|CAA|CCT|aAG|G acacaggtgag-3' (SEQ ID NO: 92) !----------------------------------------------------------------
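Several oligonucleotides in Table 43P are marked [RC]. Assuming that marker means the oligonucleotide to be used is the reverse complement of the sequence as printed (an interpretation, not something the table states explicitly), a minimal Python helper for deriving that strand could look like the following. The function name is hypothetical.

```python
# Watson-Crick complement map that preserves the upper/lower-case
# convention the tables use to highlight restriction sites.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence.

    Codon separators ('|') and spaces are removed before processing.
    """
    cleaned = seq.replace("|", "").replace(" ", "")
    return cleaned.translate(_COMPLEMENT)[::-1]


# Sequence printed for CBot25 in Table 43P.
cbot25_printed = "|GGT|ACC|ctG|GTC|ACC|tcgctccacc"
print(reverse_complement(cbot25_printed))
```

Applying the function twice returns the cleaned input, which is a quick sanity check when transcribing sequences by hand.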

TABLE-US-00062 TABLE 48P Synthetic human lambda-chain gene with stuffers in place of CDRs ! Lambda 14-7(A) 2a2::JH2::Clambda ! AA sequence tested ! 1 GAGGACCATt GGGCCCC ttactccgtgac ! Scab...... EcoO109I ! ApaI.. !----------------------------------------------- ! ! -----------FR1--------------------------------------------> ! 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ! S A Q S A L T Q P A S V S G S P G 30 aGT|GCA|Caa|tcc|gct|ctc|act|cag|cct|GCT|AGC|gtt|tcc|gGG|TcA|CCt|GGT| ! ApaLI... NheI... BstEII... ! SexAI.... !----------------------------------------------- ! ! ------FR1------------------> |-----stuffer for CDR1--------- ! 16 17 18 19 20 21 22 23 ! Q S I T I S C T 81 |caa|agt|atc|act|att|tct|TGT|ACA|tct TAG TGA ctc ! BsrGI.. !------------------------------------------------------- ! ! -----Stuffer--------------------------->-------------------- ! 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 ! R S | | P | H P G K A 117 AGA TCT TAA TGA ccg tag caC|CCG|GGc|aaG|GCG| ! BglII XmaI.... KasI..... ! AvaI.... !------------------------------------------------------------------- ! ! --|-------------Stuffer -------------------------------------> ! P 150 |CCg|TAA|TGA|atc tCG TAC G ct|ggt|gtt| ! KasI.... BsiWI... !------------------------------------------------------------------- ! ! -------FR3---------------------------------------------------- ! 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 ! S N R F S G S K S G N T A S L 177 |agc|aat|cgt|ttc|TCC|GGA|tct|aaa|tcc|ggt|aat|acc|gcA|AGC|TTa| ! BspEI.. | HindIII. ! BsaBI........(blunt) !------------------------------------------------------------------- ! ! -------FR3------------->|--Stuffer-------------------------->- | ! 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 ! T I S G L Q 222 |act|atc|tct|ggt|CTG|CAG|gtt ctg tag ttc CAATTG ctt tag tga ccc ! PstI... MfeI.. !------------------------------------------------------------------- ! ! -----Stuffer------------------------------->|---FR4--------- ! 103 104 105 ! 
G G G 270 |ggc|ggt|GGT| ! KpnI... !---------------------------------------------------------------------- ! ! -------FR4--------------> ! 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 ! T K L T V L G Q P K A A P S V 279 |ACC|aaa|ctt|act|gtc|ctc|gGT|CAA|CCT|aAG|Gct|gct|cct|tcc|gtt| ! KpnI... HincII.. ! Bsu36I... ! ! 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 ! T L F P P S S E E L Q A N K A 324 |act|ctc|ttc|cct|cct|agt|tct|GAA|GAG|Ctt|caa|gct|aac|aag|gct| ! SapI..... ! ! 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 ! T L V C L I S D F Y P G A V T (SEQ ID NO: 98) 369 |act|ctt|gtt|tgc|tTG|ATC|Agt|gac|ttt|tat|cct|ggt|gct|gtt|act| (SEQ ID NO: 97) ! BclI....

TABLE-US-00063 TABLE 50P 3-23::CDR3::JH4 Stuffers in place of CDRs FR1(DP47/V3-23)--------------- 20 21 22 23 24 25 26 27 28 29 30 A M A E V Q L L E S G ctgtctgaac CC atg gcc gaa|gtt|CAA|TTG|tta|gag|tct|ggt| Scab...... NcoI.... | MfeI | --------------FR1-------------------------------------------- 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 G G L V Q P G G S L R L S C A |ggc|ggt|ctt|gtt|cag|cct|ggt|ggt|tct|tta|cgt|ctt|tct|tgc|gct| ----FR1-------------------->|...CDR1 stuffer....|---FR2------ 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 A S G F T F S S Y A | | W V R |gct|TCC|GGA|ttc|act|ttc|tct|tCG|TAC|Gct|TAG|TAA|tgg|gtt|cgC| | BspEI | | BsiWI| |BstXI. -------FR2-------------------------------->|...CDR2 stuffer. 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 Q A P G K G L E W V S | p r | |CAa|gct|ccT|GGt|aaa|ggt|ttg|gag|tgg|gtt|tct|TAA|CCT|AGG|TAG| ...BstXI | AvrII.. .....CDR2 stuffer....................................|---FR3--- --------FR3-------------------------------------------------- 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 T I S R D N S K N T L Y L Q M |act|atc|TCT|AGA|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg| | XbaI | ---FR3-----------..> Stuffer------------->| 106 107 108 109 110 N S L R A (SEQ ID NO: 53) |aac|agC|TTA|AGg|gct|TAG TAA AGG cct TAA (SEQ ID NO: 52) |AflII | StuI... |----- FR4 ---(JH4)------------------------------------------ Y F D Y W G Q G T L V T V S S (SEQ ID NO: 26) |tat|ttc|gat|tat|tgg|ggt|caa|GGT|ACC|ctG|GTC|ACC|gtc|tct|agt|... (SEQ ID NO: 25) | KpnI| | BstEII |
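The annotated tables pair each DNA row with its amino-acid row. A minimal Python sketch for re-deriving the amino-acid row from the DNA is shown below, using the JH4 FR4 segment at the end of Table 50P (SEQ ID NO: 25 and 26) as the worked case. The function name is hypothetical and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(dna):
    """Translate DNA in frame 0; separators ('|', spaces) are ignored and
    a trailing partial codon is dropped."""
    s = dna.replace("|", "").replace(" ", "").upper()
    return "".join(CODON_TABLE[s[i:i + 3]] for i in range(0, len(s) - 2, 3))


# FR4 (JH4) DNA as printed in Table 50P.
jh4_dna = "|tat|ttc|gat|tat|tgg|ggt|caa|GGT|ACC|ctG|GTC|ACC|gtc|tct|agt|"
print(translate(jh4_dna))  # YFDYWGQGTLVTVSS
```

The same function reports stop codons in a stuffer as `*`, which gives a quick visual check that a stuffer interrupts the reading frame.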

Sequence CWU 1

1

447114PRTArtificial SequenceSynthetic peptideMOD_RES(8)..(10)A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or YMOD_RES(14)..(14)A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or Y 1Val Ser Gly Gly Ser Ile Ser Xaa Xaa Xaa Tyr Tyr Trp Xaa1 5 10217PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(1)amino acid Y, R, W, V, G, or SMOD_RES(3)..(3)amino acid Y, R, W, V, G, or SMOD_RES(4)..(4)amino acid P, S, or GMOD_RES(8)..(8)amino acid E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or YMOD_RES(10)..(10)amino acid E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or Y 2Xaa Ile Xaa Xaa Ser Gly Gly Xaa Thr Xaa Tyr Ala Asp Ser Val Lys1 5 10 15Gly317PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(1)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or YMOD_RES(3)..(3)amino acid D, I, N, S, W, or YMOD_RES(4)..(5)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or YMOD_RES(7)..(7)amino acid S, G, D or NMOD_RES(8)..(10)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or Y 3Xaa Ile Xaa Xaa Xaa Gly Xaa Xaa Xaa Xaa Tyr Ala Asp Ser Val Lys1 5 10 15Gly416PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(1)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or YMOD_RES(3)..(3)amino acid D, I, N, S, W, or YMOD_RES(4)..(5)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or YMOD_RES(7)..(7)amino acid S, G, D or NMOD_RES(8)..(9)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or Y 4Xaa Ile Xaa Xaa Xaa Gly Xaa Xaa Xaa Tyr Asn Pro Ser Leu Lys Gly1 5 10 15519PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(1)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, or YMOD_RES(3)..(3)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, S, T, V, W, or YMOD_RES(5)..(7)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, or YMOD_RES(12)..(12)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, or Y 5Xaa Ile Xaa Ser Xaa Xaa Xaa Gly Gly Tyr Tyr Xaa Tyr Ala Ala Ser1 5 10 
15Val Lys Gly615PRTArtificial SequenceSynthetic peptideMOD_RES(5)..(5)amino acid K or RMOD_RES(6)..(9)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or Y 6Tyr Tyr Cys Ala Xaa Xaa Xaa Xaa Xaa Tyr Phe Asp Tyr Trp Gly1 5 10 15717PRTArtificial SequenceSynthetic peptideMOD_RES(5)..(5)amino acid K or RMOD_RES(6)..(13)amino acid A, D, E, F, G, H, K, L, M, N, P, Q, R, S, T, V, W or Y 7Tyr Tyr Cys Ala Xaa Xaa Xaa Xaa Xaa Xaa Xaa Tyr Phe Asp Tyr Trp1 5 10 15Gly819PRTArtificial SequenceSynthetic peptideMOD_RES(5)..(5)amino acid K or RMOD_RES(6)..(13)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or Y 8Tyr Tyr Cys Ala Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Tyr Phe Asp1 5 10 15Tyr Trp Gly921PRTArtificial SequenceSynthetic peptideMOD_RES(6)..(8)amino acid A, D, E, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or YMOD_RES(10)..(10)amino acid S or GMOD_RES(11)..(11)amino acid Y or WMOD_RES(12)..(14)amino acid A, D, E, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or Ymisc_feature(15)..(15)Xaa can be any naturally occurring amino acid 9Tyr Tyr Cys Ala Arg Xaa Xaa Xaa Ser Xaa Ser Xaa Xaa Xaa Xaa Tyr1 5 10 15Phe Asp Tyr Trp Gly 201022PRTArtificial SequenceSynthetic peptideMOD_RES(5)..(5)amino acid K or RMOD_RES(6)..(8)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or YMOD_RES(12)..(13)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or YMOD_RES(16)..(16)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or Y 10Tyr Tyr Cys Ala Xaa Xaa Xaa Xaa Cys Ser Gly Xaa Xaa Cys Tyr Xaa1 5 10 15Tyr Phe Asp Tyr Trp Gly 201124PRTArtificial SequenceSynthetic peptidemisc_feature(5)..(5)Xaa can be any naturally occurring amino acidMOD_RES(6)..(6)amino acid K or RMOD_RES(7)..(8)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or Ymisc_feature(9)..(9)Xaa can be any naturally occurring amino acidMOD_RES(10)..(10)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or Ymisc_feature(14)..(14)Xaa can be any naturally 
occurring amino acidMOD_RES(15)..(19)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or Y 11Tyr Tyr Cys Ala Xaa Xaa Xaa Ser Xaa Thr Ile Phe Gly Xaa Xaa Xaa1 5 10 15Xaa Xaa Tyr Phe Asp Tyr Trp Gly 201225PRTArtificial SequenceSynthetic peptideMOD_RES(6)..(8)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or YMOD_RES(11)..(11)amino acid D or SMOD_RES(13)..(14)amino acid S or GMOD_RES(15)..(16)any amino acidMOD_RES(17)..(19)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or Y 12Tyr Tyr Cys Ala Arg Xaa Xaa Xaa Tyr Tyr Xaa Ser Xaa Xaa Xaa Xaa1 5 10 15Xaa Xaa Xaa Tyr Phe Asp Tyr Trp Gly 20 251326PRTArtificial SequenceSynthetic peptideMOD_RES(6)..(9)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or YMOD_RES(12)..(13)amino acid S or GMOD_RES(14)..(14)amino acid T, D, or GMOD_RES(15)..(15)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or YMOD_RES(18)..(20)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W or Y 13Tyr Tyr Cys Ala Arg Xaa Xaa Xaa Xaa Tyr Cys Xaa Xaa Xaa Xaa Cys1 5 10 15Tyr Xaa Xaa Xaa Tyr Phe Asp Tyr Trp Gly 20 251411PRTArtificial SequenceSynthetic peptideMOD_RES(5)..(5)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or YMOD_RES(7)..(8)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, W, or YMOD_RES(9)..(9)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, or W 14Arg Ala Ser Gln Xaa Val Xaa Xaa Xaa Leu Ala1 5 101512PRTArtificial SequenceSynthetic peptideMOD_RES(5)..(5)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or YMOD_RES(7)..(10)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, W, or Y 15Arg Ala Ser Gln Xaa Val Xaa Xaa Xaa Xaa Leu Ala1 5 10169PRTArtificial SequenceSynthetic peptideMOD_RES(3)..(3)amino acid A, D, E, F, G, H, I, K, I, M, N, P, Q, R, T, V, or WMOD_RES(4)..(6)amino acid A, D, E, F, G, H, I, K, I, M, N, P, Q, R, S, T, V, W, or YMOD_RES(8)..(8)amino acid A, D, E, F, G, H, I, K, I, M, N, P, Q, 
R, S, T, V, W, or Y 16Gln Gln Xaa Xaa Xaa Xaa Pro Xaa Thr1 51710PRTArtificial SequenceSynthetic peptideMOD_RES(3)..(3)amino acid A, D, E, F, G, H, I, K, I, M, N, P, Q, R, T, V, or WMOD_RES(4)..(4)amino acid A, D, E, F, G, H, I, K, I, M, N, P, Q, R, T, V, W, or YMOD_RES(5)..(6)amino acid A, D, E, F, G, H, I, K, I, M, N, P, Q, R, S, T, V, W, or YMOD_RES(9)..(9)amino acid A, D, E, F, G, H, I, K, I, M, N, P, Q, R, S, T, V, W, or Y 17Gln Gln Xaa Xaa Xaa Xaa Pro Pro Xaa Thr1 5 101814PRTArtificial SequenceSynthetic peptideMOD_RES(3)..(3)amino acid A, D, E, F, R, I, K, L, M, N, P, Q, R, S, V, W, or YMOD_RES(6)..(6)amino acid A, E, F, G, H, I, K, L, M, P, Q, R, S, T, V, W, or YMOD_RES(9)..(9)amino acid A, D, E, F, R, I, K, L, M, N, P, Q, R, S, V, W, or YMOD_RES(10)..(10)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, or WMOD_RES(11)..(11)amino acid A, E, F, G, H, I, K, L, M, P, Q, R, S, T, V, W, or YMOD_RES(12)..(12)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, or W 18Thr Gly Xaa Ser Ser Xaa Val Gly Xaa Xaa Xaa Xaa Val Ser1 5 101910PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(1)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, W, or YMOD_RES(4)..(4)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or YMOD_RES(5)..(5)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, W, or YMOD_RES(7)..(7)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, W, or YMOD_RES(8)..(8)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or YMOD_RES(9)..(9)amino acid A, D, E, F, G, H, I, K, L, M, V, P, Q, R, S, T, V, or W 19Xaa Ser Tyr Xaa Xaa Ser Xaa Xaa Xaa Val1 5 102015PRTArtificial SequenceSynthetic peptide 20Tyr Phe Asp Tyr Trp Gly Gln Gly Thr Leu Val Thr Val Ser Ser1 5 10 152115PRTArtificial SequenceSynthetic peptide 21Ala Phe Asp Ile Trp Gly Gln Gly Thr Met Val Thr Val Ser Ser1 5 10 152211PRTArtificial SequenceSynthetic peptideMOD_RES(3)..(3)amino acid A, E, F, G, H, I, K, L, M, P, Q, R, S, T, V, W, or 
YMOD_RES(4)..(4)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or YMOD_RES(6)..(8)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or YMOD_RES(9)..(9)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, or WMOD_RES(10)..(11)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or Y 22Thr Gly Xaa Xaa Leu Xaa Xaa Xaa Xaa Xaa Xaa1 5 102321DNAArtificial SequenceSynthetic oligonucleotide 23gactatgaag gtactggtta t 21247PRTArtificial SequenceSynthetic peptide 24Asp Tyr Glu Gly Thr Gly Tyr1 52545DNAArtificial SequenceSynthetic oligonucleotide 25tatttcgatt attggggtca aggtaccctg gtcaccgtct ctagt 452615PRTArtificial SequenceSynthetic peptide 26Tyr Phe Asp Tyr Trp Gly Gln Gly Thr Leu Val Thr Val Ser Ser1 5 10 152746DNAArtificial SequenceSynthetic oligonucleotide 27cttccggatt cactttctct tacatgtggg ttcgccaagc tcctgg 462828DNAArtificial SequenceSynthetic oligonucleotide 28cctactgtct tccggattca ctttctct 282930DNAArtificial SequenceSynthetic oligonucleotide 29tgggttcgcc aagctcctgg ttgctcactc 303049DNAArtificial SequenceSynthetic oligonucleotide 30cttccggatt cactttctct tactactggt gggttcgcca agctcctgg 493170DNAArtificial SequenceSynthetic oligonucleotide 31cttccggatt cactttctct atcagcggtg gttctatctc ctactactgg tgggttcgcc 60aagctcctgg 703253DNAArtificial SequenceSynthetic oligonucleotide 32ggtttggagt gggtttctat ctctggtggc acttatgctg actccgttaa agg 533344DNAArtificial SequenceSynthetic oligonucleotide 33cttgggttcg ccaagctcct ggtaaaggtt tggagtgggt ttct 443449DNAArtificial SequenceSynthetic oligonucleotide 34tatgctgact ccgttaaagg tcgcttcact atctctagat tcctgtcac 493544DNAArtificial SequenceSynthetic oligonucleotide 35ggtttggagt gggtttctat cggttatgct gactccgtta aagg 443644DNAArtificial SequenceSynthetic oligonucleotide 36ggtttggagt gggtttctat tggttataac ccttccctta aggg 443749DNAArtificial SequenceSynthetic oligonucleotide 37tataaccctt cccttaaggg tcgcttcact atctctagat tcctgtcac 493856DNAArtificial SequenceSynthetic oligonucleotide 
38ggtttggagt gggtttctat cagtggtggt actacttatg ccgcttccgt taaggg 563949DNAArtificial SequenceSynthetic oligonucleotide 39tatgccgctt ccgttaaggg tcgcttcact atctctagat tcctgtcac 494025DNAArtificial SequenceSynthetic oligonucleotide 40gctctggtca acttaagggc tgagg 254148DNAArtificial SequenceSynthetic oligonucleotide 41gctctggtca acttaagggc tgaggacacc gctgtctact actgcgcc 484246DNAArtificial SequenceSynthetic oligonucleotide 42tacttcgatt actggggcca aggtaccctg gtcacctcgc tccacc 464325DNAArtificial SequenceSynthetic oligonucleotide 43ggtaccctgg tcacctcgct ccacc 254443DNAArtificial sequenceSynthetic oligonucleotide 44ccgctgtcta ctactgcgcc tacttcgatt actggggcca agg 434542DNAArtificial sequenceSynthetic oligonucleotide 45ccgctgtcta ctactgcgcc tacttcgatt actgggccaa gg 424643DNAArtificial SequenceSynthetic oligonucleotide 46ccgctgtcta ctactgcgcc tacttcgatt actggggcca agg 434752DNAArtificial SequenceSynthetic oligonucleotide 47ccgctgtcta ctactgcgcc cgttcttctt acttcgatta ctggggccaa gg 524858DNAArtificial SequenceSynthetic oligonucleotide 48ccgctgtcta ctactgcgcc tgctctggtt gctattactt cgattactgg ggccaagg 584958DNAArtificial SequenceSynthetic oligonucleotide 49ccgctgtcta ctactgcgcc tctactatct tcggttactt cgattactgg ggccaagg 585061DNAArtificial SequenceSynthetic oligonucleotide 50cccctgtcta ctactgcgcc cgttattact cttactatta cttcgattac tggggccaag 60g 615158DNAArtificial SequenceSynthetic oligonucleotide 51ccgctgtcta ctactgcgcc cgttattgct gctattactt cgattactgg ggccaagg 5852234DNAArtificial SequenceSynthetic oligonucleotide 52gaagttcaat tgttagagtc tggtggcggt cttgttcagc ctggtggttc tttacgtctt 60tcttgcgctg cttccggatt cactttctct tcgtacgctt agtaatgggt tcgccaagct 120cctggtaaag gtttggagtg ggtttcttaa cctaggtaga ctatctctag agacaactct 180aagaatactc tctacttgca gatgaacagc ttaagggctt agtaaaggcc ttaa 2345370PRTArtificial SequenceSynthetic peptide 53Glu Val Gln Leu Leu Glu Ser Gly Gly Gly Leu Val Gln Pro Gly Gly1 5 10 15Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly Phe Thr Phe Ser Ser Tyr 20 25 30Ala Trp Val 
Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val Ser Ile 35 40 45Pro Arg Thr Ile Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln 50 55 60Met Asn Ser Leu Arg Ala65 7054909DNAArtificial SequenceSynthetic oligonucleotide 54gaggaccatt gggccccctc cgagactctc gagcgcaacg caattaatgt gagttagctc 60actcattagg caccccaggc tttacacttt atgcttccgg ctcgtatgtt gtgtggaatt 120gtgagcggat aacaatttca cacaggaaac agctatgacc atgattacgc caagctttgg 180agcctttttt ttggagattt tcaacgtgaa gaagctccta tttgctatcc cgcttgtcgt 240tccgttttac agccatagtg cacaatccgt ccttactcaa tctcctggca ctctttcgct 300aagcccgggt gaacgtgcta ccttaagttg ccgtgcttcc caggttcttg cttggtatca 360acagaaacct ggtcaggcgc cgcgtttact tatttatgct tctcgcggga tcccggaccg 420tttctctggc tctggttcag gtactgactt tacccttact atttctagat tggaacctga 480agacttcgct gtttattatt gccaacagcc tactttcggt caaggtacca aggttgaaat 540caagcgtacg gttgccgctc ctagtgtgtt tatctttcct ccttctgacg aacaattgaa 600gtcaggtact gcttctgtcg tatgtttgct caacaatttc taccctcgtg aagctaaagt 660tcagtggaaa gtcgataacg cgttgcagtc gggtaacagt caagaatccg tcactgaaca 720ggatagtaag gactctacct actctttgtc tctactctta ctttatcaaa ggctgattat 780gagaagcata aggtctatgc ttgcgaagtt acccagcagg gtctgagctt ccctgttacc 840aaaagtttca accgtggtga atgctaatag ggcgcgccac gcatctctaa gcggccgcaa 900caggaggag 90955220PRTArtificial SequenceSynthetic peptide 55Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Pro1 5 10 15His Ser Ala Gln Ser Val Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu 20 25 30Ser Pro Gly Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Val Leu 35 40 45Ala Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Leu Ile Tyr 50 55 60Ala Ser Arg Gly Ile Pro Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr65 70 75 80Asp Phe Thr Leu Thr Ile Ser Arg Leu Glu Pro Glu Pro Phe Ala Val 85 90 95Tyr Tyr Cys Gln Gln Pro Thr Phe Gly Gln Gly Thr Lys Val Glu Ile 100 105 110Lys Arg Thr Val Ala Ala Pro Ser Val Phe Ile Phe Pro Pro Ser Asp 115 120 125Glu Gln Leu Lys Ser Gly Thr Ala Ser Val Val Cys Leu Leu Asn Asn 130 135 140Phe Tyr Pro Arg Glu Ala Lys Val 
Gln Trp Lys Val Asp Asn Ala Leu145 150 155 160Gln Ser Gly Asn Ser Gln Glu Ser Val Thr Glu Gln Asp Ser Lys Asp 165 170 175Ser Thr Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr 180 185 190Glu Lys His Lys Val Tyr Ala Cys Glu Val Thr His Gln Gly Leu Ser 195 200 205Ser Pro Val Thr Lys Ser Phe Asn Arg Gly Glu Cys 210 215 22056646DNAArtificial SequenceSynthetic oligonucleotide 56agtgcacaat ccgctctcac tcagcctgct agcgtttccg ggtcacctgg

tcaaagtatc 60actatttctt gtacaggttc ttctgttggc gtttcttggt atcaacaaca cccgggcaag 120gcgccgaagt tgatgatcta ccgtccttct ggtgttagca atcgtttctc cggatctaaa 180tccggtaata ccgcaagctt aactatctct ggtctgcagg ctgaagacga ggctgactac 240tattgttctg tcttcggcgg tggtaccaaa cttactgtcc tcggtcaacc taaggctgct 300ccttccgtta ctctcttccc tcctagttct gaagagcttc aagctaacaa ggctactctt 360gtttgcttga tcagtgactt ttatcctggt gctgttactg tcgcttggaa agccgattct 420tctcctgtta aagctggtgt tgagacgacc actccttcta aacaatctaa caataagtac 480gctgcgagct cttatctttc tctcacccct gaacaatgga agtctcataa atcctattcc 540tatcaagtta ctcatgaagg ttctaccgtt gaaaagactg ttgcccctac tgagtgttct 600tagtgaggcg cgccaacgat gttcaaggcg gccgcaacag gaggag 64657200PRTArtificial SequenceSynthetic peptide 57Ser Ala Gln Ser Ala Leu Thr Gln Pro Ala Ser Val Ser Gly Ser Pro1 5 10 15Gly Gln Ser Ile Thr Ile Ser Cys Thr Gly Ser Ser Val Gly Val Ser 20 25 30Trp Tyr Gln Gln His Pro Gly Lys Ala Pro Lys Leu Met Ile Tyr Arg 35 40 45Pro Ser Gly Val Ser Asn Arg Phe Ser Gly Ser Lys Ser Gly Asn Thr 50 55 60Ala Ser Leu Thr Ile Ser Gly Leu Gln Arg Glu Asp Glu Ala Asp Tyr65 70 75 80Tyr Cys Ser Val Phe Gly Gly Gly Thr Lys Leu Thr Val Leu Gly Gln 85 90 95Pro Lys Ala Ala Pro Ser Val Thr Leu Phe Pro Pro Ser Ser Glu Glu 100 105 110Leu Gln Ala Asn Lys Ala Thr Leu Val Cys Leu Ile Ser Asp Phe Tyr 115 120 125Pro Gly Ala Val Thr Val Ala Trp Lys Ala Ala Ser Ser Phe Val Lys 130 135 140Ala Gly Val Glu Thr Thr Thr Pro Ser Lys Gln Ser Asn Asn Lys Tyr145 150 155 160Ala Ala Ser Ser Tyr Leu Ser Leu Thr Pro Glu Gln Trp Lys Ser His 165 170 175Lys Ser Tyr Ser Cys Gln Val Thr His Glu Gly Ser Thr Val Glu Lys 180 185 190Thr Val Ala Pro Thr Glu Cys Ser 195 2005825DNAArtificial SequenceSynthetic oligonucleotide 58gctctggtca acttaagggc tgagg 255948DNAArtificial SequenceSynthetic oligonucleotide 59gctctggtca acttaagggc tgaggacacc gctgtctact actgcgcc 486046DNAArtificial SequenceSynthetic oligonucleotide 60tacttcaatt acttgggcca aggtaccctg gtcacctcgc tccacc 466124DNAArtificial SequenceSynthetic 
oligonucleotide 61ggtaccctgg tcacctcgac cacc 246256DNAArtificial SequenceSynthetic oligonucleotide 62ggtctcagtt gctaagcccg ggtgaacgtg ctaccttaag ttgccgtgct tcccag 566324DNAArtificial SequenceSynthetic oligonucleotide 63ggtctcagtt gctaagcccg ggtg 246445DNAArtificial SequenceSynthetic oligonucleotide 64cttgcttggt atcaacagaa acctggtcag gcgccaagtc gtgtc 456523DNAArtificial SequenceSynthetic oligonucleotide 65cctggtcagg cgcaagtcgt gtc 236653DNAArtificial SequenceSynthetic oligonucleotide 66gctaccttaa gttgccgtgc ttcccaggtt cttgcttggt atcaacagaa acc 536753DNAArtificial SequenceSynthetic oligonucleotide 67gctaccttaa gttgccgtgc ttcccaggtt cttgcttggt atcaacagaa acc 536822DNAArtificial SequenceSynthetic oligonucleotide 68cacgagtcct acctggtcag gc 226941DNAArtificial SequenceSynthetic oligonucleotide 69cacgagtcct acctggtcag gcgccgcgtt tacttattta t 417022DNAArtificial SequenceSynthetic oligonucleotide 70gaccgtttct ctggttctca cc 227164DNAArtificial SequenceSynthetic oligonucleotide 71caggcgccgc gtttacttat ttatgcttct cgcgggatcc cggaccgttt ctctggttct 60cacc 647253DNAArtificial SequenceSynthetic oligonucleotide 72gacgagtcct tctagattgg aacctgaaga cttcgctgtt tattattgcc aac 537350DNAArtificial SequenceSynthetic oligonucleotide 73actttcggtc aaggtaccaa ggttgaaatc aagcgtacgt cacaggtgag 507426DNAArtificial SequenceSynthetic oligonucleotide 74gaaatcaagc gtacgtcaca ggtgag 267555DNAArtificial SequenceSynthetic oligonucleotide 75gacttcgctg tttattattg ccaacagcct actttcggtc aaggtaccaa ggttg 557652DNAArtificial SequenceSynthetic oligonucleotide 76gacttcgctg tttattattg ccaacagcct ttcggtcaag gtaccaaggt tg 527758DNAArtificial SequenceSynthetic oligonucleotide 77gacttcgctg tttattattg ccaacagcct cctactttcg gtcaaggtac caaggttg 587821DNAArtificial SequenceSynthetic oligonucleotide 78gacgagtcct ggtcacctgg t 217948DNAArtificial SequenceSynthetic oligonucleotide 79gacgagtcct ggtcacctgg tcaaagtatc actatttctt gtacaggt 488050DNAArtificial SequenceSynthetic oligonucleotide 80gtttcttggt atcaacaaca cccgggcaag 
gcgagatctt cacaggtgag 508125DNAArtificial SequenceSynthetic oligonucleotide 81gcaaggcgag atcttcacag gtgag 258243DNAArtificial SequenceSynthetic oligonucleotide 82gttatcatat ttcttgtaca ggtctctggt atcaacaaca ccc 438358DNAArtificial SequenceSynthetic oligonucleotide 83gtatcactat ttcttgtaca ggttcttctg ttggcgtttc ttggtatcaa caacaccc 588422DNAArtificial SequenceSynthetic oligonucleotide 84gagcagagga cccgggcaag gc 228541DNAArtificial SequenceSynthetic oligonucleotide 85gagcagagga cccgggcaag gcgccgaagt tgatgatcta c 418644DNAArtificial SequenceSynthetic oligonucleotide 86cgtccttctg gtgtcagcaa tcgtttctcc ggatcacagg tgag 448723DNAArtificial SequenceSynthetic oligonucleotide 87cgtttctccg gatcacaggt gag 238841DNAArtificial SequenceSynthetic oligonucleotide 88gccgaagttg atgatctacc gtccttctgg tgtcagcaat c 418924DNAArtificial SequenceSynthetic oligonucleotide 89ctgcaggctg aagacgaggc tgac 249033DNAArtificial SequenceSynthetic oligonucleotide 90ctgcaggctg aagacgaggc tgactactat tgt 339157DNAArtificial SequenceSynthetic oligonucleotide 91gtcttcggcg gtggtaccaa acttactgtc ctcggtcaac ctaaggacac aggtgag 579224DNAArtificial SequenceSynthetic oligonucleotide 92cggtcaacct aaggacacag gtga 249350DNAArtificial SequenceSynthetic oligonucleotide 93gacgaggctg actactattg ttctgtcttc ggcggtggta ccaaacttac 509456DNAArtificial SequenceSynthetic oligonucleotide 94gacgaggctg actactattg tagctattct gtcttcggcg gtggtaccaa acttac 5695618DNAArtificial SequenceSynthetic oligonucleotide 95gaggaccatt gggccccctc cgagactctc gagcgcaacg caattaatgt gagttagctc 60actcattagg caccccaggc tttacacttt atgcttccgg ctcgtatgtt gtgtggaatt 120gtgagcggat aacaatttca cacaggaaac agctatgacc atgattacgc caagctttgg 180agcctttttt ttggagattt tcaacgtgaa gaagctccta tttgctatcc cgcttgtcgt 240tccgttttac agccatagtg cacaatccgt ccttactcaa tctcctggca ctctttcgct 300aagcccgggt ggacgtgcta ccttaagtta gtaagctccc aggcctcttt gatctgaaac 360ctggtcaggc gccgcgttaa tgaaagcgct aatggccaac agtgactggg atcccggacc 420gtttctctgg ctctggttca ggtactgact ttacccttac 
tatttctaga taatgagtta 480actagaccta cgtaacctag ggtaccaagg ttgaaatcaa gcgtacggtt gccgctccta 540gtgtgtttat ctttcctcct tctgacgaac aattgaagtc aggtactacg catctctaag 600cggccgcaac aggaggag 61896102PRTArtificial SequenceSynthetic peptide 96Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser1 5 10 15His Ser Ala Gln Ser Val Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu 20 25 30Ser Pro Gly Glu Arg Ala Thr Leu Ser Lys Pro Gly Gln Ala Pro Arg 35 40 45Thr Gly Ile Pro Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe 50 55 60Thr Leu Thr Ile Ser Arg Phe Gly Gln Gly Thr Lys Val Glu Ile Lys65 70 75 80Arg Thr Val Ala Ala Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu 85 90 95Gln Leu Lys Ser Gly Thr 10097404DNAArtificial SequenceSynthetic oligonucleotide 97gaggaccatt gggcccctta ctccgtgaca gtgcacaatc cgctctcact cagcctgcta 60gcgtttccgg gtcacctggt caaagtatca ctatttcttg tacatcttag tgactcagat 120cttaatgacc gtagcacccg ggcaaggcgc cgtaatgaat ctcgtacgct ggtgttagca 180atcgtttctc cggatctaaa tccggtaata ccgcaagctt aactatctct ggtctgcagg 240ttctgtagtt ccaattgctt tagtgaccca ccaaacttac tgtcctcggt caacctaagg 300ctgctccttc cgttactctc ttccatccta gttctgaaga gattcaagct aacaaggcta 360ctcctgtttg cttgatcagt gacttttatc ctggtgctgt tact 4049899PRTArtificial SequenceSynthetic peptide 98Ser Ala Gln Ser Ala Leu Thr Gln Pro Ala Ser Val Ser Gly Ser Pro1 5 10 15Gly Gln Ser Ile Thr Ile Ser Cys Thr Arg Ser Pro His Pro Gly Lys 20 25 30Ala Ser Asn Arg Phe Ser Gly Ser Lys Ser Gly Asn Thr Ala Ser Leu 35 40 45Thr Ile Ser Gly Leu Gln Thr Lys Leu Thr Val Leu Gly Gln Pro Lys 50 55 60Ala Ala Pro Ser Val Thr Leu Phe Pro Pro Ser Ser Glu Glu Leu Gln65 70 75 80Ala Asn Lys Ala Thr Leu Val Cys Leu Ile Ser Asp Phe Tyr Pro Gly 85 90 95Ala Val Thr9918DNAArtificial SequenceSynthetic oligonucleotide 99ctgtctgaac ccatggcc 181005PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(1)amino acid A, D, E, F, G, H,I, K, L, M, N, P, Q, R, S, T, V, W, or YMOD_RES(3)..(3)amino acid A, D, E, F, G, H,I, K, L, M, N, P, Q, R, S, T, V, W, or 
YMOD_RES(5)..(5)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or Y 100Xaa Tyr Xaa Met Xaa1 51017PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(1)amino acid S or TMOD_RES(2)..(3)amino acid S, G, or XMOD_RES(7)..(7)amino acid S, G, or X 101Xaa Xaa Xaa Tyr Tyr Trp Xaa1 51027PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(1)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or YMOD_RES(4)..(4)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, W, or YMOD_RES(6)..(6)amino acid D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or YMOD_RES(7)..(7)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or Y 102Xaa Ala Ser Xaa Arg Xaa Xaa1 51038PRTArtificial SequenceSynthetic peptideMOD_RES(3)..(4)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, or WMOD_RES(5)..(7)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or Y 103Gln Gln Xaa Xaa Xaa Xaa Xaa Pro1 510410PRTArtificial SequenceSynthetic peptideMOD_RES(2)..(2)amino acid A, E, F, G, H, I, K, L, M, P, Q, R, S, T, V, W, or YMOD_RES(3)..(3)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or YMOD_RES(5)..(7)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or YMOD_RES(8)..(8)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, or WMOD_RES(9)..(10)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, or Y 104Gly Xaa Xaa Leu Xaa Xaa Xaa Xaa Xaa Xaa1 5 101057PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)amino acid A, D, E, F, G, H, I, K, L, O, N, P, Q, R, S, T, V, or WMOD_RES(4)..(4)amino acid A, E, F, G, H, I, K, L, M, P, Q, R, S, T, V, W, or Y 105Xaa Xaa Xaa Xaa Arg Pro Ser1 510611PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(1)amino acid A, D, E, F, G, H, I, K, L, M, V, P, Q, R, S, T, V, or WMOD_RES(2)..(2)amino acid A, D, E, F, G, H, I, K, L, M, N, P, Q, R, T, V, W, or YMOD_RES(3)..(3)amino acid A, D, E, F, G, H, I, K, L, M, V, P, Q, R, S, T, V, or WMOD_RES(4)..(4)amino acid A,
E, F, G, H, I, K, L, M, P, Q, R, S, T, V, W, or YMOD_RES(5)..(5)amino acid A, D, E, F, G, H, I, K, L, M, V, P, Q, R, S, T, V, or WMOD_RES(7)..(10)amino acid A, D, E, F, G, H, I, K, L, M, V, P, Q, R, S, T, V, or W 106Xaa Xaa Xaa Xaa Xaa Ser Xaa Xaa Xaa Xaa Val1 5 101077PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(1)amino acid S or TMOD_RES(2)..(3)amino acid S or G, or A, D, E, F, H, I, K, L, M, N, P, Q, R, T, V, W, or YMOD_RES(7)..(7)amino acid S or G, or A, D, E, F, H, I, K, L, M, N, P, Q, R, T, V, W, or Y 107Xaa Xaa Xaa Tyr Tyr Trp Xaa1 510814PRTArtificial SequenceSynthetic peptideMOD_RES(8)..(10)any amino acid except CMOD_RES(14)..(14)any amino acid except C 108Val Ser Gly Gly Ser Ile Ser Xaa Xaa Xaa Tyr Tyr Trp Xaa1 5 101095PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(1)any amino acid except CMOD_RES(3)..(3)any amino acid except CMOD_RES(5)..(5)any amino acid except C 109Xaa Tyr Xaa Met Xaa1 51107PRTArtificial SequenceSynthetic peptide 110Ser Gly Gly Tyr Tyr Trp Ser1 51116PRTArtificial SequenceSynthetic peptidemisc_feature(4)..(6)Xaa can be any naturally occurring amino acid 111Ser Ile Ser Xaa Xaa Xaa1 51125PRTArtificial SequenceSynthetic peptide 112Tyr Tyr Cys Ala Arg1 51135PRTArtificial SequenceSynthetic peptideMOD_RES(4)..(5)any amino acid 113Tyr Tyr Cys Xaa Xaa1 5114519DNAArtificial SequenceSynthetic oligonucleotide 114atggactgga cctggaggtt cctctttgtg gtggcagcag ctacaggtgt ccagtcccag 60gtgcagctgg tgcagtctgg ggctgaggtg aagaagcctg ggtcctcggt gaaggtctcc 120tgcaaggctt ctggaggcac cttcagcagc tatgctatca gctgggtgcg acaggcccct 180ggacaagggc ttgagtggat gggagggatc atccctatct ttggtacagc aaactacgca 240cagaagttcc agggcagagt cacgattacc gcggacgaat ccacgagcac agcctacatg 300gagctgagca gcctgagatc tgaggacacg gccgtgtatc actgtgcgag tgagggatgg 360gagagttgta gtggtggtgg ctgctacgac ggtatggacg tctggggcca agggaccacg 420gtcaccgtct cctcagcttc caccaagggc ccatcggtct tccccctggc gccctgctcc 480aggagcacct ctgggggcac agcggccctg ggctgcctg 519115518DNAArtificial SequenceSynthetic 
oligonucleotide 115atggactgga cctggaggtt cctctttgtg gtggcagcag ctacaggtgt ccagtcccag 60gtgcagctgg tgcagtctgg ggctgaggtg aagaagcctg ggtcctcggt gaaggtctcc 120tgcaaggctt ctggaggcac cttcagcagc tatgctatca gctgggtgcg acaggcccct 180ggacaagggc ttgagtggat gggagggatc atccctatct ttggtacagc aaactacgca 240cagaagttcc agggcagagt cacgattacc gcggacgaat ccacgagcac agcctacatg 300gagctgagca gcctgagatc tgaggacacg gccgtgtatc actgtgcgag tgagggatgg 360gagagttgta gtggtggtgg ctgctacgac ggtatggacg tctggggcca agggaccacg 420gtcaccgtct cctcagcttc caccaagggc catcggtctt ccccctggcg ccctgctcca 480ggagcacctc tgggggcaca gcggccctgg gctgcctg 5181165PRTArtificial SequenceSynthetic peptide 116Tyr His Cys Ala Ser1 511714PRTArtificial SequenceSynthetic peptide 117Asp Arg Gly Gly Lys Tyr Gln Leu Ala Pro Lys Gly Gly Met1 5 1011816PRTArtificial SequenceSynthetic peptide 118Asp Arg Gly Gly Lys Tyr Gln Leu Ala Pro Lys Gly Gly Met Asp Val1 5 10 1511939DNAArtificial SequenceSynthetic oligonucleotide 119tatgatagta gtgggtcata ctccgactac tgggggcag 3912052DNAArtificial SequenceSynthetic oligonucleotide 120gctgaatact tccagcactg gggccagggc accctggtca ccgtctcctc ag 5212153DNAArtificial SequenceSynthetic oligonucleotide 121ctactggtac ttcgatctct ggggccgtgg caccctggtc actgtctcct cag 5312250DNAArtificial SequenceSynthetic oligonucleotide 122tgatgctttt gatatctggg gccaagggac aatggtcacc gtctcttcag 5012348DNAArtificial SequenceSynthetic oligonucleotide 123actactttga ctactggggc cagggaaccc tggtcaccgt ctcctcag 4812451DNAArtificial SequenceSynthetic oligonucleotide 124acaactggtt cgacccctgg ggccagggaa ccctggtcac cgtctcctca g 5112563DNAArtificial SequenceSynthetic oligonucleotide 125attactacta ctactacggt atggacgtct ggggccaagg gaccacggtc accgtctcct 60cag 6312639DNAArtificial SequenceSynthetic oligonucleotide 126tatgatagta gtgggtcata ctccgactac tgggggcag 3912713PRTArtificial SequenceSynthetic peptide 127Tyr Asp Ser Ser Gly Ser Tyr Ser Asp Tyr Trp Gly Gln1 5 1012848DNAArtificial SequenceSynthetic oligonucleotide 128actactttga ctactggggc 
cagggaaccc tggtcaccgt ctcctcag
4812915PRTArtificial SequenceSynthetic peptide 129Tyr Phe Asp Tyr Trp Gly Gln Gly Thr Leu Val Thr Val Ser Ser1 5 10 1513031DNAArtificial SequenceSynthetic oligonucleotide 130gtattactat gatagtagtg gttattacta c 3113139DNAArtificial SequenceSynthetic oligonucleotide 131gatcgccaca attactatga tagtagtggg tcatactcc 391326PRTArtificial SequenceSynthetic peptidemisc_feature(6)..(6)Xaa can be any naturally occurring amino acid 132Val Gln Leu Glu Arg Xaa1 51336PRTArtificial SequenceSynthetic peptidemisc_feature(6)..(6)Xaa can be any naturally occurring amino acid 133Gly Thr Thr Gly Thr Xaa1 51345PRTArtificial SequenceSynthetic peptide 134Tyr Asn Trp Asn Asp1 51355PRTArtificial SequenceSynthetic peptidemisc_feature(5)..(5)Xaa can be any naturally occurring amino acid 135Val Leu Glu Leu Xaa1 51366PRTArtificial SequenceSynthetic peptidemisc_feature(6)..(6)Xaa can be any naturally occurring amino acid 136Gly Ile Thr Gly Thr Xaa1 51375PRTArtificial SequenceSynthetic peptide 137Tyr Asn Trp Asn Tyr1 51385PRTArtificial SequenceSynthetic peptidemisc_feature(5)..(5)Xaa can be any naturally occurring amino acid 138Val Leu Glu Arg Xaa1 51396PRTArtificial SequenceSynthetic peptidemisc_feature(6)..(6)Xaa can be any naturally occurring amino acid 139Gly Ile Thr Gly Thr Xaa1 51405PRTArtificial SequenceSynthetic peptide 140Tyr Asn Trp Asn Asp1 51416PRTArtificial SequenceSynthetic peptidemisc_feature(6)..(6)Xaa can be any naturally occurring amino acid 141Val Trp Glu Leu Leu Xaa1 51427PRTArtificial SequenceSynthetic peptidemisc_feature(7)..(7)Xaa can be any naturally occurring amino acid 142Gly Ile Val Gly Ala Thr Xaa1 51436PRTArtificial SequenceSynthetic peptide 143Tyr Ser Gly Ser Tyr Tyr1 514410PRTArtificial SequenceSynthetic peptide 144Gly Tyr Cys Ser Ser Thr Ser Cys Tyr Thr1 5 101459PRTArtificial SequenceSynthetic peptidemisc_feature(9)..(9)Xaa can be any naturally occurring amino acid 145Arg Ile Leu Tyr Gln Leu Leu Tyr Xaa1 514610PRTArtificial SequenceSynthetic 
peptidemisc_feature(10)..(10)Xaa can be any naturally occurring amino acid 146Asp Ile Val Val Val Pro Ala Ala Ile Xaa1 5 1014710PRTArtificial SequenceSynthetic peptide 147Gly Tyr Cys Thr Asn Gly Val Cys Tyr Thr1 5 1014810PRTArtificial SequenceSynthetic peptidemisc_feature(10)..(10)Xaa can be any naturally occurring amino acid 148Arg Ile Leu Tyr Trp Cys Met Leu Tyr Xaa1 5 1014910PRTArtificial SequenceSynthetic peptidemisc_feature(10)..(10)Xaa can be any naturally occurring amino acid 149Asp Ile Val Leu Met Val Tyr Ala Ile Xaa1 5 1015010PRTArtificial SequenceSynthetic peptide 150Gly Tyr Cys Ser Gly Gly Ser Cys Tyr Ser1 5 101519PRTArtificial SequenceSynthetic peptidemisc_feature(9)..(9)Xaa can be any naturally occurring amino acid 151Arg Ile Leu Trp Trp Leu Leu Leu Xaa1 515210PRTArtificial SequenceSynthetic peptidemisc_feature(10)..(10)Xaa can be any naturally occurring amino acid 152Asp Ile Val Val Val Val Ala Ala Thr Xaa1 5 101539PRTArtificial SequenceSynthetic peptide 153Ala Tyr Cys Gly Gly Asp Cys Tyr Ser1 51549PRTArtificial SequenceSynthetic peptidemisc_feature(9)..(9)Xaa can be any naturally occurring amino acid 154Ser Ile Leu Trp Trp Leu Leu Phe Xaa1 51559PRTArtificial SequenceSynthetic peptidemisc_feature(9)..(9)Xaa can be any naturally occurring amino acid 155His Ile Val Val Val Thr Ala Ile Xaa1 515610PRTArtificial SequenceSynthetic peptide 156Tyr Tyr Asp Phe Trp Ser Gly Tyr Tyr Thr1 5 1015711PRTArtificial SequenceSynthetic peptidemisc_feature(11)..(11)Xaa can be any naturally occurring amino acid 157Val Leu Arg Phe Leu Glu Trp Leu Leu Tyr Xaa1 5 1015810PRTArtificial SequenceSynthetic peptidemisc_feature(10)..(10)Xaa can be any naturally occurring amino acid 158Ile Thr Ile Phe Gly Val Val Ile Ile Xaa1 5 1015910PRTArtificial SequenceSynthetic peptide 159Tyr Tyr Asp Ile Leu Thr Gly Tyr Tyr Asn1 5 1016010PRTArtificial SequenceSynthetic peptidemisc_feature(10)..(10)Xaa can be any naturally occurring amino acid 160Val Leu Arg Tyr Phe Asp Trp Leu Leu Xaa1 5 
101619PRTArtificial SequenceSynthetic peptidemisc_feature(9)..(9)Xaa can be any naturally occurring amino acid 161Ile Thr Ile Phe Leu Val Ile Ile Xaa1 516210PRTArtificial SequenceSynthetic peptide 162Tyr Tyr Tyr Gly Ser Gly Ser Tyr Tyr Asn1 5 1016310PRTArtificial SequenceSynthetic peptidemisc_feature(10)..(10)Xaa can be any naturally occurring amino acid 163Val Leu Leu Trp Phe Gly Glu Leu Leu Xaa1 5 1016410PRTArtificial SequenceSynthetic peptidemisc_feature(10)..(10)Xaa can be any naturally occurring amino acid 164Ile Thr Met Val Arg Gly Val Ile Ile Xaa1 5 1016512PRTArtificial SequenceSynthetic peptide 165Tyr Tyr Asp Tyr Val Trp Gly Ser Tyr Arg Tyr Thr1 5 1016612PRTArtificial SequenceSynthetic peptidemisc_feature(12)..(12)Xaa can be any naturally occurring amino acid 166Val Leu Leu Arg Leu Gly Glu Leu Ser Leu Tyr Xaa1 5 1016712PRTArtificial SequenceSynthetic peptidemisc_feature(12)..(12)Xaa can be any naturally occurring amino acid 167Ile Met Ile Thr Phe Gly Gly Val Ile Val Ile Xaa1 5 1016810PRTArtificial SequenceSynthetic peptide 168Tyr Tyr Tyr Asp Ser Ser Gly Tyr Tyr Tyr1 5 101698PRTArtificial SequenceSynthetic peptidemisc_feature(8)..(8)Xaa can be any naturally occurring amino acid 169Val Leu Leu Trp Leu Leu Leu Xaa1 517010PRTArtificial SequenceSynthetic peptidemisc_feature(10)..(10)Xaa can be any naturally occurring amino acid 170Ile Thr Met Ile Val Val Val Ile Thr Xaa1 5 101715PRTArtificial SequenceSynthetic peptide 171Asp Tyr Ser Asn Tyr1 51724PRTArtificial SequenceSynthetic peptidemisc_feature(4)..(4)Xaa can be any naturally occurring amino acid 172Leu Gln Leu Xaa11735PRTArtificial SequenceSynthetic peptidemisc_feature(5)..(5)Xaa can be any naturally occurring amino acid 173Thr Thr Val Thr Xaa1 51745PRTArtificial SequenceSynthetic peptide 174Asp Tyr Gly Asp Tyr1 51754PRTArtificial SequenceSynthetic peptidemisc_feature(4)..(4)Xaa can be any naturally occurring amino acid 175Leu Arg Leu Xaa11765PRTArtificial SequenceSynthetic peptidemisc_feature(5)..(5)Xaa can 
be any naturally occurring amino acid 176Thr Thr Val Thr Xaa1 51776PRTArtificial SequenceSynthetic peptide 177Asp Tyr Gly Gly Asn Ser1 51785PRTArtificial SequenceSynthetic peptidemisc_feature(5)..(5)Xaa can be any naturally occurring amino acid 178Leu Arg Trp Leu Xaa1 51796PRTArtificial SequenceSynthetic peptidemisc_feature(6)..(6)Xaa can be any naturally occurring amino acid 179Thr Thr Val Val Thr Xaa1 51807PRTArtificial SequenceSynthetic peptidemisc_feature(7)..(7)Xaa can be any naturally occurring amino acid 180Trp Ile Gln Leu Trp Leu Xaa1 51817PRTArtificial SequenceSynthetic peptidemisc_feature(7)..(7)Xaa can be any naturally occurring amino acid 181Val Asp Thr Ala Met Val Xaa1 51826PRTArtificial SequenceSynthetic peptide 182Gly Tyr Ser Tyr Gly Tyr1 51837PRTArtificial SequenceSynthetic peptidemisc_feature(7)..(7)Xaa can be any naturally occurring amino acid 183Trp Ile Trp Leu Arg Leu Xaa1 51848PRTArtificial SequenceSynthetic peptidemisc_feature(8)..(8)Xaa can be any naturally occurring amino acid 184Val Asp Ile Val Ala Thr Ile Xaa1 51857PRTArtificial SequenceSynthetic peptide 185Gly Tyr Ser Gly Tyr Asp Tyr1 51866PRTArtificial SequenceSynthetic peptidemisc_feature(6)..(6)Xaa can be any naturally occurring amino acid 186Arg Trp Leu Gln Leu Xaa1 51877PRTArtificial SequenceSynthetic peptidemisc_feature(7)..(7)Xaa can be any naturally occurring amino acid 187Val Glu Met Ala Thr Ile Xaa1 51886PRTArtificial SequenceSynthetic peptide 188Arg Asp Gly Tyr Asn Tyr1 51896PRTArtificial SequenceSynthetic peptidemisc_feature(6)..(6)Xaa can be any naturally occurring amino acid 189Ser Ile Ala Ala Arg Xaa1 51906PRTArtificial SequenceSynthetic peptide 190Glu Tyr Ser Ser Ser Ser1 51915PRTArtificial SequenceSynthetic peptidemisc_feature(5)..(5)Xaa can be any naturally occurring amino acid 191Val Gln Leu Val Xaa1 51927PRTArtificial SequenceSynthetic peptidemisc_feature(7)..(7)Xaa can be any naturally occurring amino acid 192Gly Ile Ala Ala Ala Gly Xaa1 51937PRTArtificial 
SequenceSynthetic peptide 193Gly Tyr Ser Ser Ser Trp Tyr1 51946PRTArtificial SequenceSynthetic peptidemisc_feature(6)..(6)Xaa can be any naturally occurring amino acid 194Val Gln Gln Leu Val Xaa1 51957PRTArtificial SequenceSynthetic peptidemisc_feature(7)..(7)Xaa can be any naturally occurring amino acid 195Gly Ile Ala Val Ala Gly Xaa1 51967PRTArtificial SequenceSynthetic peptide 196Gly Tyr Ser Ser Gly Trp Tyr1 51976PRTArtificial SequenceSynthetic peptidemisc_feature(6)..(6)Xaa can be any naturally occurring amino acid 197Val Gln Trp Leu Val Xaa1 51984PRTArtificial SequenceSynthetic peptidemisc_feature(4)..(4)Xaa can be any naturally occurring amino acid 198Leu Thr Gly Xaa119913PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(11)..(13)any amino acid 199Xaa Xaa Xaa Tyr Cys Ser Ser Thr Ser Cys Xaa Xaa Xaa1 5 1020013PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(4)any amino acidMOD_RES(11)..(13)any amino acid 200Xaa Xaa Xaa Xaa Tyr Val Trp Gly Ser Tyr Xaa Xaa Xaa1 5 1020113PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(7)any amino acidMOD_RES(11)..(13)any amino acid 201Xaa Xaa Xaa Xaa Xaa Xaa Xaa Ser Gly Tyr Xaa Xaa Xaa1 5 1020213PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(12)..(13)any amino acid 202Xaa Xaa Xaa Tyr Asp Ile Leu Thr Gly Tyr Tyr Xaa Xaa1 5 1020313PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(10)..(13)any amino acid 203Xaa Xaa Xaa Val Val Val Pro Ala Ala Xaa Xaa Xaa Xaa1 5 1020412PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(11)..(12)any amino acid 204Xaa Xaa Xaa Tyr Tyr Asp Ser Ser Gly Tyr Xaa Xaa1 5 1020512PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(9)..(12)any amino acid 205Xaa Xaa Xaa Asp Phe Trp Ser Gly Xaa Xaa Xaa Xaa1 5 1020612PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(9)..(12)any amino acid 206Xaa Xaa Xaa Thr Ile Phe Gly Val Xaa Xaa Xaa Xaa1 5 
1020712PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(4)any amino acidMOD_RES(9)..(12)any amino acid 207Xaa Xaa Xaa Xaa Ile Val Ala Thr Xaa Xaa Xaa Xaa1 5 1020811PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(11)..(11)any amino acid 208Xaa Xaa Xaa Tyr Gly Ser Gly Ser Tyr Tyr Xaa1 5 1020911PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(4)any amino acidMOD_RES(9)..(11)any amino acid 209Xaa Xaa Xaa Xaa Tyr Ser Tyr Gly Xaa Xaa Xaa1 5 1021011PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(7)..(8)any amino acidMOD_RES(11)..(11)any amino acid 210Xaa Xaa Xaa Cys Ser Gly Xaa Xaa Cys Tyr Xaa1 5 1021111PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(4)any amino acidMOD_RES(9)..(11)any amino acid 211Xaa Xaa Xaa Xaa Ala Ala Ala Gly Xaa Xaa Xaa1 5 1021211PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(1)any amino acidMOD_RES(3)..(5)any amino acidMOD_RES(9)..(11)any amino acid 212Xaa Gly Xaa Xaa Xaa Gly Gly Asn Xaa Xaa Xaa1 5 1021310PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(8)..(10)any amino acid 213Xaa Xaa Xaa Ser Gly Ser Tyr Xaa Xaa Xaa1 5 1021410PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(8)..(10)any amino acid 214Xaa Xaa Xaa Ser Ser Ser Trp Xaa Xaa Xaa1 5 1021510PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(4)any amino acidMOD_RES(10)..(10)any amino acid 215Xaa Xaa Xaa Xaa Thr Thr Val Thr Thr Xaa1 5 1021610PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(5)..(5)amino acid S or GMOD_RES(8)..(8)any amino acidMOD_RES(10)..(10)any amino acid 216Xaa Xaa Xaa Cys Xaa Gly Asp Xaa Cys Xaa1 5 1021710PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(4)..(4)amino acid I or VMOD_RES(9)..(10)any amino acid 217Xaa Xaa Xaa Xaa Ala Val Ala Gly Xaa Xaa1 5 1021810PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(2)any amino acidMOD_RES(9)..(10)any amino acid 218Xaa Xaa Leu Trp Phe Gly Glu Leu Xaa 
Xaa1 5 1021910PRTArtificial SequenceSynthetic peptideMOD_RES(2)..(3)any amino acidMOD_RES(6)..(9)any amino acid 219Gly Xaa Xaa Trp Leu Xaa Xaa Xaa Xaa Phe1 5 1022010PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(6)..(6)any amino acidMOD_RES(9)..(10)any amino acid 220Xaa Xaa Xaa Asp Thr Xaa Met Val Xaa Xaa1 5 1022110PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(5)any amino acidMOD_RES(8)..(10)any amino acid 221Xaa Xaa Xaa Xaa Xaa Gly Glu Xaa Xaa Xaa1 5 102229PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(4)any amino acidMOD_RES(8)..(9)any amino acid 222Xaa Xaa Xaa Xaa Ser Gly Trp Xaa Xaa1 52239PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(4)any amino acidMOD_RES(8)..(9)any amino acid 223Xaa Xaa Xaa Xaa Gly Tyr Asn Xaa Xaa1 52249PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(8)..(9)any amino acid 224Xaa Xaa Xaa Val Arg Gly Val Xaa Xaa1 52259PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(7)..(9)any amino acid 225Xaa Xaa Xaa Ile Ala Ala Xaa Xaa Xaa1 52269PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(2)any amino acidMOD_RES(4)..(4)any amino acidMOD_RES(7)..(9)any amino acid 226Xaa Xaa Tyr Xaa Trp Asn Xaa Xaa Xaa1 52278PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(7)..(8)any amino acid 227Xaa Xaa Xaa Tyr Gly Asp Xaa Xaa1 52288PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(2)any amino acidMOD_RES(7)..(8)any amino acid 228Xaa Xaa Val Gly Ala Thr Xaa Xaa1 52298PRTArtificial SequenceSynthetic peptideMOD_RES(1)..(3)any amino acidMOD_RES(8)..(8)any amino acid 229Xaa Xaa Xaa Tyr Ser Ser Ser Xaa1 52305PRTArtificial SequenceSynthetic peptide 230Gly Tyr Tyr Met His1 523117PRTArtificial SequenceSynthetic peptide 231Trp Ile Asn Pro Asn Ser Gly Gly Thr Asn Tyr Ala Gln Lys Phe Gln1 5 10 15Gly2325PRTArtificial SequenceSynthetic peptide 232Ser Tyr Ala Met His1 523317PRTArtificial SequenceSynthetic peptide 233Trp Ile Asn Ala Gly Asn Gly Asn Thr 
Lys Tyr Ser Gln Lys Phe Gln1 5 10 15Gly2345PRTArtificial SequenceSynthetic peptide 234Ser Tyr Asp Ile Asn1 523517PRTArtificial SequenceSynthetic peptide 235Trp Met Asn Pro Asn Ser Gly Asn Thr Gly Tyr Ala Gln Lys Phe Gln1 5 10 15Gly2365PRTArtificial SequenceSynthetic peptide 236Ser Tyr Gly Ile Ser1 523717PRTArtificial SequenceSynthetic peptide 237Trp Ile Ser Ala Tyr Asn Gly Asn Thr Asn Tyr Ala Gln Lys Leu Gln1 5 10 15Gly2385PRTArtificial SequenceSynthetic peptide 238Glu Leu Ser Met His1 523917PRTArtificial SequenceSynthetic peptide 239Gly Phe Asp Pro Glu Asp Gly Glu Thr Ile Tyr Ala Gln Lys Phe Gln1 5 10 15Gly2405PRTArtificial SequenceSynthetic peptide 240Tyr Arg Tyr Leu His1 524117PRTArtificial
SequenceSynthetic peptide 241Trp Ile Thr Pro Phe Asn Gly Asn Thr Asn Tyr Ala Gln Lys Phe Gln1 5 10 15Asp2425PRTArtificial SequenceSynthetic peptide 242Ser Tyr Tyr Met His1 524317PRTArtificial SequenceSynthetic peptide 243Ile Ile Asn Pro Ser Gly Gly Ser Thr Ser Tyr Ala Gln Lys Phe Gln1 5 10 15Gly2445PRTArtificial SequenceSynthetic peptide 244Ser Ser Ala Val Gln1 524517PRTArtificial SequenceSynthetic peptide 245Trp Ile Val Val Gly Ser Gly Asn Thr Asn Tyr Ala Gln Lys Phe Gln1 5 10 15Glu2465PRTArtificial SequenceSynthetic peptide 246Ser Tyr Ala Ile Ser1 524717PRTArtificial SequenceSynthetic peptide 247Gly Ile Ile Pro Ile Phe Gly Thr Ala Asn Tyr Ala Gln Lys Phe Gln1 5 10 15Gly2485PRTArtificial SequenceSynthetic peptide 248Ser Tyr Ala Ile Ser1 524917PRTArtificial SequenceSynthetic peptide 249Gly Ile Ile Pro Ile Phe Gly Thr Ala Asn Tyr Ala Gln Lys Phe Gln1 5 10 15Gly2505PRTArtificial SequenceSynthetic peptide 250Asp Tyr Tyr Met His1 525117PRTArtificial SequenceSynthetic peptide 251Leu Val Asp Pro Glu Asp Gly Glu Thr Ile Tyr Ala Glu Lys Phe Gln1 5 10 15Gly2527PRTArtificial SequenceSynthetic peptide 252Thr Ser Gly Val Gly Val Gly1 525316PRTArtificial SequenceSynthetic peptide 253Leu Ile Tyr Trp Asn Asp Asp Lys Arg Tyr Ser Pro Ser Leu Lys Ser1 5 10 152547PRTArtificial SequenceSynthetic peptide 254Asn Ala Arg Met Gly Val Ser1 525516PRTArtificial SequenceSynthetic peptide 255His Ile Phe Ser Asn Asp Glu Lys Ser Tyr Ser Thr Ser Leu Lys Ser1 5 10 152567PRTArtificial SequenceSynthetic peptide 256Thr Ser Gly Met Arg Val Ser1 525716PRTArtificial SequenceSynthetic peptide 257Arg Ile Asp Trp Asp Asp Asp Lys Phe Tyr Ser Thr Ser Leu Lys Thr1 5 10 152585PRTArtificial SequenceSynthetic peptide 258Ser Tyr Trp Met Ser1 525917PRTArtificial SequenceSynthetic peptide 259Asn Ile Lys Gln Asp Gly Ser Glu Lys Tyr Tyr Val Asp Ser Val Lys1 5 10 15Gly2605PRTArtificial SequenceSynthetic peptide 260Asp Tyr Ala Met His1 526117PRTArtificial SequenceSynthetic peptide 261Gly Ile Ser Trp Asn Ser Gly Ser Ile 
Gly Tyr Ala Asp Ser Val Lys1 5 10 15Gly2625PRTArtificial SequenceSynthetic peptide 262Asp Tyr Tyr Met Ser1 526317PRTArtificial SequenceSynthetic peptide 263Tyr Ile Ser Ser Ser Gly Ser Thr Ile Tyr Tyr Ala Asp Ser Val Lys1 5 10 15Gly2645PRTArtificial SequenceSynthetic peptide 264Ser Tyr Asp Met His1 526516PRTArtificial SequenceSynthetic peptide 265Ala Ile Gly Thr Ala Gly Asp Thr Tyr Tyr Pro Gly Ser Val Lys Gly1 5 10 152665PRTArtificial SequenceSynthetic peptide 266Asn Ala Trp Met Ser1 526719PRTArtificial SequenceSynthetic peptide 267Arg Ile Lys Ser Lys Thr Asp Gly Gly Thr Thr Asp Tyr Ala Ala Pro1 5 10 15Val Lys Gly2685PRTArtificial SequenceSynthetic peptide 268Asp Tyr Gly Met Ser1 526917PRTArtificial SequenceSynthetic peptide 269Gly Ile Asn Trp Asn Gly Gly Ser Thr Gly Tyr Ala Asp Ser Val Lys1 5 10 15Gly2705PRTArtificial SequenceSynthetic peptide 270Ser Tyr Ser Met Asn1 527117PRTArtificial SequenceSynthetic peptide 271Ser Ile Ser Ser Ser Ser Ser Tyr Ile Tyr Tyr Ala Asp Ser Val Lys1 5 10 15Gly2725PRTArtificial SequenceSynthetic peptide 272Ser Tyr Ala Met Ser1 527317PRTArtificial SequenceSynthetic peptide 273Ala Ile Ser Gly Ser Gly Gly Ser Thr Tyr Tyr Ala Asp Ser Val Lys1 5 10 15Gly2745PRTArtificial SequenceSynthetic peptide 274Ser Tyr Gly Met His1 527517PRTArtificial SequenceSynthetic peptide 275Val Ile Ser Tyr Asp Gly Ser Asn Lys Tyr Tyr Ala Asp Ser Val Lys1 5 10 15Gly2765PRTArtificial SequenceSynthetic peptide 276Ser Tyr Ala Met His1 527717PRTArtificial SequenceSynthetic peptide 277Val Ile Ser Tyr Asp Gly Ser Asn Lys Tyr Tyr Ala Asp Ser Val Lys1 5 10 15Gly2785PRTArtificial SequenceSynthetic peptide 278Ser Tyr Gly Met His1 527917PRTArtificial SequenceSynthetic peptide 279Val Ile Ser Tyr Asp Gly Ser Asn Lys Tyr Tyr Ala Asp Ser Val Lys1 5 10 15Gly2805PRTArtificial SequenceSynthetic peptide 280Ser Tyr Gly Met His1 528117PRTArtificial SequenceSynthetic peptide 281Val Ile Trp Tyr Asp Gly Ser Asn Lys Tyr Tyr Ala Asp Ser Val Lys1 5 10 15Gly2825PRTArtificial SequenceSynthetic 
peptide 282Asp Tyr Thr Met His1 528317PRTArtificial SequenceSynthetic peptide 283Leu Ile Ser Trp Asp Gly Gly Ser Thr Tyr Tyr Ala Asp Ser Val Lys1 5 10 15Gly2845PRTArtificial SequenceSynthetic peptide 284Ser Tyr Ser Met Asn1 528517PRTArtificial SequenceSynthetic peptide 285Tyr Ile Ser Ser Ser Ser Ser Thr Ile Tyr Tyr Ala Asp Ser Val Lys1 5 10 15Gly2865PRTArtificial SequenceSynthetic peptide 286Asp Tyr Ala Met Ser1 528719PRTArtificial SequenceSynthetic peptide 287Phe Ile Arg Ser Lys Ala Tyr Gly Gly Thr Thr Glu Tyr Thr Ala Ser1 5 10 15Val Lys Gly2885PRTArtificial SequenceSynthetic peptide 288Ser Asn Tyr Met Ser1 528916PRTArtificial SequenceSynthetic peptide 289Val Ile Tyr Ser Gly Gly Ser Thr Tyr Tyr Ala Asp Ser Val Lys Gly1 5 10 152905PRTArtificial SequenceSynthetic peptide 290Ser Tyr Ala Met His1 529117PRTArtificial SequenceSynthetic peptide 291Ala Ile Ser Ser Asn Gly Gly Ser Thr Tyr Tyr Ala Asn Ser Val Lys1 5 10 15Gly2925PRTArtificial SequenceSynthetic peptide 292Ser Asn Tyr Met Ser1 529316PRTArtificial SequenceSynthetic peptide 293Val Ile Tyr Ser Gly Gly Ser Thr Tyr Tyr Ala Asp Ser Val Lys Gly1 5 10 152945PRTArtificial SequenceSynthetic peptide 294Asp His Tyr Met Asp1 529519PRTArtificial SequenceSynthetic peptide 295Arg Thr Arg Asn Lys Ala Asn Ser Tyr Thr Thr Glu Tyr Ala Ala Ser1 5 10 15Val Lys Gly2965PRTArtificial SequenceSynthetic peptide 296Gly Ser Ala Met His1 529719PRTArtificial SequenceSynthetic peptide 297Arg Ile Arg Ser Lys Ala Asn Ser Tyr Ala Thr Ala Tyr Ala Ala Ser1 5 10 15Val Lys Gly2985PRTArtificial SequenceSynthetic peptide 298Ser Tyr Trp Met His1 529917PRTArtificial SequenceSynthetic peptide 299Arg Ile Asn Ser Asp Gly Ser Ser Thr Ser Tyr Ala Asp Ser Val Lys1 5 10 15Gly3005PRTArtificial SequenceSynthetic peptide 300Ser Asn Glu Met Ser1 530115PRTArtificial SequenceSynthetic peptide 301Ser Ile Ser Gly Gly Ser Thr Tyr Tyr Ala Asp Ser Arg Lys Gly1 5 10 153026PRTArtificial SequenceSynthetic peptide 302Ser Ser Asn Trp Trp Ser1 530316PRTArtificial 
SequenceSynthetic peptide 303Glu Ile Tyr His Ser Gly Ser Thr Asn Tyr Asn Pro Ser Leu Lys Ser1 5 10 153046PRTArtificial SequenceSynthetic peptide 304Ser Ser Asn Trp Trp Gly1 530516PRTArtificial SequenceSynthetic peptide 305Tyr Ile Tyr Tyr Ser Gly Ser Thr Tyr Tyr Asn Pro Ser Leu Lys Ser1 5 10 153067PRTArtificial SequenceSynthetic peptide 306Ser Gly Gly Tyr Tyr Trp Ser1 530716PRTArtificial SequenceSynthetic peptide 307Tyr Ile Tyr Tyr Ser Gly Ser Thr Tyr Tyr Asn Pro Ser Leu Lys Ser1 5 10 153087PRTArtificial SequenceSynthetic peptide 308Ser Gly Gly Tyr Ser Trp Ser1 530916PRTArtificial SequenceSynthetic peptide 309Tyr Ile Tyr His Ser Gly Ser Thr Tyr Tyr Asn Pro Ser Leu Lys Ser1 5 10 153107PRTArtificial SequenceSynthetic peptide 310Ser Gly Asp Tyr Tyr Trp Ser1 531116PRTArtificial SequenceSynthetic peptide 311Tyr Ile Tyr Tyr Ser Gly Ser Thr Tyr Tyr Asn Pro Ser Leu Lys Ser1 5 10 153127PRTArtificial SequenceSynthetic peptide 312Ser Gly Gly Tyr Tyr Trp Ser1 531316PRTArtificial SequenceSynthetic peptide 313Tyr Ile Tyr Tyr Ser Gly Ser Thr Tyr Tyr Asn Pro Ser Leu Lys Ser1 5 10 153145PRTArtificial SequenceSynthetic peptide 314Gly Tyr Tyr Trp Ser1 531516PRTArtificial SequenceSynthetic peptide 315Glu Ile Asn His Ser Gly Ser Thr Asn Tyr Asn Pro Ser Leu Lys Ser1 5 10 153167PRTArtificial SequenceSynthetic peptide 316Ser Ser Ser Tyr Tyr Trp Gly1 531716PRTArtificial SequenceSynthetic peptide 317Ser Ile Tyr Tyr Ser Gly Ser Thr Tyr Tyr Asn Pro Ser Leu Lys Ser1 5 10 153185PRTArtificial SequenceSynthetic peptide 318Ser Tyr Tyr Trp Ser1 531916PRTArtificial SequenceSynthetic peptide 319Tyr Ile Tyr Tyr Ser Gly Ser Thr Asn Tyr Asn Pro Ser Leu Lys Ser1 5 10 153207PRTArtificial SequenceSynthetic peptide 320Ser Gly Ser Tyr Tyr Trp Ser1 532116PRTArtificial SequenceSynthetic peptide 321Tyr Ile Tyr Tyr Ser Gly Ser Thr Asn Tyr Asn Pro Ser Leu Lys Ser1 5 10 153226PRTArtificial SequenceSynthetic peptide 322Ser Gly Tyr Tyr Trp Gly1 532316PRTArtificial SequenceSynthetic peptide 323Ser Ile Tyr His Ser Gly 
Ser Thr Tyr Tyr Asn Pro Ser Leu Lys Ser1 5 10 153245PRTArtificial SequenceSynthetic peptide 324Ser Tyr Trp Ile Gly1 532517PRTArtificial SequenceSynthetic peptide 325Ile Ile Tyr Pro Gly Asp Ser Asp Thr Arg Tyr Ser Pro Ser Phe Gln1 5 10 15Gly3265PRTArtificial SequenceSynthetic peptide 326Ser Tyr Trp Ile Ser1 532717PRTArtificial SequenceSynthetic peptide 327Arg Ile Asp Pro Ser Asp Ser Tyr Thr Asn Tyr Ser Pro Ser Phe Gln1 5 10 15Gly3287PRTArtificial SequenceSynthetic peptide 328Ser Asn Ser Ala Ala Trp Asn1 532918PRTArtificial SequenceSynthetic peptide 329Arg Thr Tyr Tyr Arg Ser Lys Trp Tyr Asn Asp Tyr Ala Val Ser Val1 5 10 15Lys Ser3305PRTArtificial SequenceSynthetic peptide 330Ser Tyr Ala Met Asn1 533117PRTArtificial SequenceSynthetic peptide 331Trp Ile Asn Thr Asn Thr Gly Asn Pro Thr Tyr Ala Gln Gly Phe Thr1 5 10 15Gly33210PRTArtificial SequenceSynthetic peptide 332Val Thr Lys Ser Phe Asn Arg Gly Glu Cys1 5 1033375DNAArtificial SequenceSynthetic oligonucleotide 333gttaccaaaa gtttcaaccg tggtgaatgc taatagggcg cgccacgcat ctctaagcgg 60ccgcaacagg aggag 7533412PRTArtificial SequenceSynthetic peptide 334Gly Arg Asp Tyr Tyr Asp Ser Gly Gly Tyr Phe Thr1 5 1033517PRTArtificial SequenceSynthetic peptide 335Gly Arg Asp Tyr Tyr Asp Ser Gly Gly Tyr Phe Thr Val Ala Phe Asp1 5 10 15Ile33613PRTArtificial SequenceSynthetic peptide 336Asp Arg His Asn Tyr Tyr Asp Ser Ser Gly Ser Tyr Ser1 5 1033715PRTArtificial SequenceSynthetic peptide 337Asp Arg His Asn Tyr Tyr Asp Ser Ser Gly Ser Tyr Ser Asp Tyr1 5 10 1533817PRTArtificial SequenceSynthetic peptide 338Asp Cys Pro Ala Pro Ala Lys Met Tyr Tyr Tyr Gly Ser Gly Ile Cys1 5 10 15Thr33920PRTArtificial SequenceSynthetic peptide 339Asp Cys Pro Ala Pro Ala Lys Met Tyr Tyr Tyr Gly Ser Gly Ile Cys1 5 10 15Thr Phe Asp Tyr 203407PRTArtificial SequenceSynthetic peptide 340Ala Phe Tyr Asp Ser Ala Asp1 53419PRTArtificial SequenceSynthetic peptide 341Ala Phe Tyr Asp Ser Ala Asp Asp Tyr1 534212PRTArtificial SequenceSynthetic peptide 342Arg Asp Tyr Tyr Asp 
Ser Ser Gly Pro Glu Ala Gly1 5 1034315PRTArtificial SequenceSynthetic peptide 343Arg Asp Tyr Tyr Asp Ser Ser Gly Pro Glu Ala Gly Phe Asp Ile1 5 10 1534413PRTArtificial SequenceSynthetic peptide 344Asp Gly Thr Leu Ile Asp Thr Ser Ala Tyr Tyr Tyr Leu1 5 1034514PRTArtificial SequenceSynthetic peptide 345Asp Gly Thr Leu Ile Asp Thr Ser Ala Tyr Tyr Tyr Leu Tyr1 5 103466PRTArtificial SequenceSynthetic peptide 346Asn Ser Ser Asp Ser Ser1 534710PRTArtificial SequenceSynthetic peptide 347Asn Ser Ser Asp Ser Ser Val Leu Asp Val1 5 1034812PRTArtificial SequenceSynthetic peptide 348Asp Gln Val Phe Asp Ser Gly Gly Tyr Asn His Arg1 5 1034915PRTArtificial SequenceSynthetic peptide 349Asp Gln Val Phe Asp Ser Gly Gly Tyr Asn His Arg Phe Asp Ser1 5 10 1535014PRTArtificial SequenceSynthetic peptide 350Asp Leu Glu Tyr Tyr Tyr Asp Ser Gly Gly His Tyr Ser Pro1 5 1035117PRTArtificial SequenceSynthetic peptide 351Asp Leu Glu Tyr Tyr Tyr Asp Ser Gly Gly His Tyr Ser Pro Phe His1 5 10 15Tyr3526PRTArtificial SequenceSynthetic peptide 352Asp Asp Ser Ser Gly Tyr1 535311PRTArtificial SequenceSynthetic peptide 353Asp Asp Ser Ser Gly Tyr Tyr Tyr Ile Asp Tyr1 5 1035413PRTArtificial SequenceSynthetic peptide 354Gly His Tyr Tyr Asp Ser Pro Gly Gln Tyr Ser Tyr Ser1 5 1035515PRTArtificial SequenceSynthetic peptide 355Gly His Tyr Tyr Asp Ser Pro Gly Gln Tyr Ser Tyr Ser Glu Tyr1 5 10 1535619PRTArtificial SequenceSynthetic peptide 356Gly Gly Phe Arg Pro Pro Pro Tyr Asp Tyr Glu Ser Ser Ala Tyr Arg1 5 10 15Thr Tyr Arg35722PRTArtificial SequenceSynthetic peptide 357Gly Gly Phe Arg Pro Pro Pro Tyr Asp Tyr Glu Ser Ser Ala Tyr Arg1 5 10 15Thr Tyr Arg Leu Asp Phe 203587PRTArtificial SequenceSynthetic peptide 358Asp Ser Asp Thr Arg Ala Tyr1 535913PRTArtificial SequenceSynthetic peptide 359Asp Ser Asp Thr Arg Ala Tyr Tyr Trp Tyr Phe Asp Leu1 5 1036015PRTArtificial SequenceSynthetic peptide 360Gly Arg His Tyr Tyr Asp Ser Ser Gly Tyr Tyr Ser Thr Pro Glu1 5 10 1536120PRTArtificial SequenceSynthetic peptide 361Gly Arg 
His Tyr Tyr Asp Ser Ser Gly Tyr Tyr Ser Thr Pro Glu Asn1 5 10 15Tyr Phe Asp Tyr 2036213PRTArtificial SequenceSynthetic peptide 362Asp Pro Ser Tyr Tyr Tyr Asp Ser Ser Gly Leu Pro Leu1 5 1036318PRTArtificial SequenceSynthetic peptide 363Asp Pro Ser Tyr Tyr Tyr Asp Ser Ser Gly Leu Pro Leu His Gly Met1 5 10 15Asp Val36413PRTArtificial SequenceSynthetic peptide 364Thr Tyr Tyr Tyr Asp Ser Ser Gly Tyr Leu Leu Thr Arg1 5 1036517PRTArtificial SequenceSynthetic peptide 365Thr Tyr Tyr Tyr Asp Ser Ser Gly Tyr Leu Leu Thr Arg Tyr Phe Gln1 5 10 15His36613PRTArtificial SequenceSynthetic peptide 366Asn Ala Pro His Tyr Asp Ser Ser Gly Tyr Tyr Gln Thr1 5 1036716PRTArtificial SequenceSynthetic peptide 367Asn Ala Pro His Tyr Asp Ser Ser Gly Tyr Tyr Gln Thr Phe Asp Tyr1 5 10 153688PRTArtificial SequenceSynthetic peptide 368Gly Tyr His Ser Ser Ser Tyr Ala1 536913PRTArtificial SequenceSynthetic peptide 369Gly Tyr His Ser Ser Ser Tyr Ala Asp Ala Phe Asp Ile1 5 1037010PRTArtificial SequenceSynthetic peptide 370Pro Ile Gly Tyr Cys Ser Gly Gly Ser Cys1 5 1037115PRTArtificial SequenceSynthetic peptide 371Pro Ile Gly Tyr Cys Ser Gly Gly Ser Cys Tyr Ser Phe Asp Tyr1 5 10 1537214PRTArtificial SequenceSynthetic peptide 372Thr His Gly Thr Tyr Val Thr Ser Gly Tyr Tyr Pro Lys Ile1 5 1037314PRTArtificial SequenceSynthetic peptide 373Thr His
Gly Thr Tyr Val Thr Ser Gly Tyr Tyr Pro Lys Ile1 5 1037413PRTArtificial SequenceSynthetic peptide 374Gly Ala Thr Tyr Tyr Tyr Glu Ser Ser Gly Asn Tyr Pro1 5 1037515PRTArtificial SequenceSynthetic peptide 375Gly Ala Thr Tyr Tyr Tyr Glu Ser Ser Gly Asn Tyr Pro Asp Tyr1 5 10 1537615PRTArtificial SequenceSynthetic peptide 376Ala Phe Tyr His Tyr Asp Ser Thr Gly Tyr Pro Asn Arg Arg Tyr1 5 10 1537719PRTArtificial SequenceSynthetic peptide 377Ala Phe Tyr His Tyr Asp Ser Thr Gly Tyr Pro Asn Arg Arg Tyr Tyr1 5 10 15Phe Asp Tyr37814PRTArtificial SequenceSynthetic peptide 378Ser Tyr Ser Tyr Tyr Tyr Asp Ser Ser Gly Tyr Trp Gly Gly1 5 1037918PRTArtificial SequenceSynthetic peptide 379Ser Tyr Ser Tyr Tyr Tyr Asp Ser Ser Gly Tyr Trp Gly Gly Tyr Phe1 5 10 15Asp Tyr38012PRTArtificial SequenceSynthetic peptide 380Leu Ser Pro Tyr Tyr Tyr Asp Ser Ser Ser Tyr His1 5 1038117PRTArtificial SequenceSynthetic peptide 381Leu Ser Pro Tyr Tyr Tyr Asp Ser Ser Ser Tyr His Asp Ala Phe Asp1 5 10 15Ile38212PRTArtificial SequenceSynthetic peptide 382Glu Glu Asp Tyr Tyr Asp Ser Ser Gly Gln Ala Ser1 5 1038318PRTArtificial SequenceSynthetic peptidemisc_feature(17)..(17)Xaa can be any naturally occurring amino acid 383Glu Glu Asp Tyr Tyr Asp Ser Ser Gly Gln Ala Ser Tyr Asn Trp Phe1 5 10 15Xaa Pro38412PRTArtificial SequenceSynthetic peptide 384Glu Thr Asn Tyr Tyr Asp Ser Gly Gly Tyr Pro Gly1 5 1038515PRTArtificial SequenceSynthetic peptide 385Glu Thr Asn Tyr Tyr Asp Ser Gly Gly Tyr Pro Gly Phe Asp Phe1 5 10 1538612PRTArtificial SequenceSynthetic peptide 386Gly Asp His Tyr Tyr Asp Arg Ser Gly Tyr Arg His1 5 1038721PRTArtificial SequenceSynthetic peptide 387Gly Asp His Tyr Tyr Asp Arg Ser Gly Tyr Arg His Ser Tyr Tyr Tyr1 5 10 15Tyr Ala Met Asp Val 203886PRTArtificial SequenceSynthetic peptide 388Asp Arg Ser Ser Gly Asn1 538913PRTArtificial SequenceSynthetic peptide 389Asp Arg Ser Ser Gly Asn Tyr Phe Asp Gly Met Asp Val1 5 103909PRTArtificial SequenceSynthetic peptide 390Gly Arg Ser Arg Tyr Ser Gly Tyr Gly1 
539116PRTArtificial SequenceSynthetic peptide 391Gly Arg Ser Arg Tyr Ser Gly Tyr Gly Phe Tyr Ser Gly Met Asp Val1 5 10 153928PRTArtificial SequenceSynthetic peptide 392Asp Asp Thr Ser Gly Tyr Gly Pro1 539317PRTArtificial SequenceSynthetic peptide 393Asp Asp Thr Ser Gly Tyr Gly Pro Tyr Tyr Phe Tyr Tyr Gly Met Asp1 5 10 15Val39412PRTArtificial SequenceSynthetic peptide 394Arg Ala Tyr Tyr Asp Thr Ser Phe Tyr Phe Glu Tyr1 5 1039513PRTArtificial SequenceSynthetic peptide 395Arg Ala Tyr Tyr Asp Thr Ser Phe Tyr Phe Glu Tyr Tyr1 5 1039615PRTArtificial SequenceSynthetic peptide 396Asp Arg Ile Asp Tyr Tyr Lys Ser Gly Tyr Tyr Leu Gly Ser Ala1 5 10 1539717PRTArtificial SequenceSynthetic peptide 397Asp Arg Ile Asp Tyr Tyr Lys Ser Gly Tyr Tyr Leu Gly Ser Ala Asp1 5 10 15Ser3989PRTArtificial SequenceSynthetic peptide 398Asp Thr Asp Ser Ser Ser His Tyr Gly1 539913PRTArtificial SequenceSynthetic peptide 399Asp Thr Asp Ser Ser Ser His Tyr Gly Arg Phe Asp Pro1 5 1040016PRTArtificial SequenceSynthetic peptide 400Val Ser Ile Ser His Tyr Asp Ser Ser Gly Arg Pro Gln Arg Val Phe1 5 10 1540121PRTArtificial SequenceSynthetic peptide 401Val Ser Ile Ser His Tyr Asp Ser Ser Gly Arg Pro Gln Arg Val Phe1 5 10 15Tyr Gly Met Asp Val 2040216PRTArtificial SequenceSynthetic peptide 402Gln Ala Arg Glu Asn Val Phe Tyr Asp Ser Ser Gly Pro Thr Ala Pro1 5 10 1540319PRTArtificial SequenceSynthetic peptide 403Gln Ala Arg Glu Asn Val Phe Tyr Asp Ser Ser Gly Pro Thr Ala Pro1 5 10 15Phe Asp His40414PRTArtificial SequenceSynthetic peptide 404Val Pro Ala Gly Asn Tyr Tyr Asp Thr Ser Gly Pro Asp Asn1 5 1040516PRTArtificial SequenceSynthetic peptide 405Val Pro Ala Gly Asn Tyr Tyr Asp Thr Ser Gly Pro Asp Asn Ala Asp1 5 10 1540619PRTArtificial SequenceSynthetic peptide 406Trp Tyr Tyr Phe Asp Thr Ser Gly Tyr Tyr Pro Arg Asn Phe Tyr Tyr1 5 10 15Met Asp Val40719PRTArtificial SequenceSynthetic peptide 407Trp Tyr Tyr Phe Asp Thr Ser Gly Tyr Tyr Pro Arg Asn Phe Tyr Tyr1 5 10 15Met Asp Val40812PRTArtificial 
SequenceSynthetic peptide 408Gly Tyr Tyr Tyr Asp Ser Gly Gly Asn Tyr Asn Gly1 5 1040914PRTArtificial SequenceSynthetic peptide 409Gly Tyr Tyr Tyr Asp Ser Gly Gly Asn Tyr Asn Gly Asp Tyr1 5 1041012PRTArtificial SequenceSynthetic peptide 410Asp Leu Arg Ser Tyr Asp Pro Ser Gly Tyr Tyr Asn1 5 1041117PRTArtificial SequenceSynthetic peptide 411Asp Leu Arg Ser Tyr Asp Pro Ser Gly Tyr Tyr Asn Asp Gly Phe Asp1 5 10 15Ile41212PRTArtificial SequenceSynthetic peptide 412Gly Tyr Tyr Tyr Asp Arg Gly Gly Asn Cys Asn Gly1 5 1041314PRTArtificial SequenceSynthetic peptide 413Gly Tyr Tyr Tyr Asp Arg Gly Gly Asn Cys Asn Gly Asp Tyr1 5 1041412PRTArtificial SequenceSynthetic peptide 414Gly Tyr Tyr Tyr Asp Arg Gly Gly Asn Tyr Asn Gly1 5 1041514PRTArtificial SequenceSynthetic peptide 415Gly Tyr Tyr Tyr Asp Arg Gly Gly Asn Tyr Asn Gly Asp Tyr1 5 104168PRTArtificial SequenceSynthetic peptide 416Thr His Tyr Asp Ser Ser Gly Leu1 541713PRTArtificial SequenceSynthetic peptide 417Thr His Tyr Asp Ser Ser Gly Leu Asp Ala Phe Asp Ile1 5 104186PRTArtificial SequenceSynthetic peptide 418Asp Asp Ser Ser Gly Ser1 541911PRTArtificial SequenceSynthetic peptide 419Asp Asp Ser Ser Gly Ser Tyr Tyr Phe Asp Tyr1 5 104207PRTArtificial SequenceSynthetic peptide 420Leu Ser Gly Gly Tyr Tyr Ser1 542111PRTArtificial SequenceSynthetic peptide 421Leu Ser Gly Gly Tyr Tyr Ser Asp Phe Asp Tyr1 5 1042211PRTArtificial SequenceSynthetic peptide 422Gly Asp Tyr Ser Asp Ser Ser Asp Ser Tyr Ile1 5 1042316PRTArtificial SequenceSynthetic peptide 423Gly Asp Tyr Ser Asp Ser Ser Asp Ser Tyr Ile Asp Ala Phe Asp Val1 5 10 1542412PRTArtificial SequenceSynthetic peptide 424Gly Glu Thr Tyr Tyr Tyr Asp Ser Arg Gly Tyr Ala1 5 1042515PRTArtificial SequenceSynthetic peptide 425Gly Glu Thr Tyr Tyr Tyr Asp Ser Arg Gly Tyr Ala Phe Asp His1 5 10 154268PRTArtificial SequenceSynthetic peptide 426Pro Thr Arg Asp Ser Ser Gly Tyr1 542712PRTArtificial SequenceSynthetic peptide 427Pro Thr Arg Asp Ser Ser Gly Tyr Tyr Val Gly Tyr1 5 1042812PRTArtificial 
SequenceSynthetic peptide 428Gly Ser Phe Tyr Tyr Asp Ser Ser Gly Tyr Pro Pro1 5 1042915PRTArtificial SequenceSynthetic peptide 429Gly Ser Phe Tyr Tyr Asp Ser Ser Gly Tyr Pro Pro Phe Asp Cys1 5 10 1543012PRTArtificial SequenceSynthetic peptide 430Gly Pro Tyr Tyr Tyr Asp Ser Ser Gly Tyr Tyr Leu1 5 1043115PRTArtificial SequenceSynthetic peptide 431Gly Pro Tyr Tyr Tyr Asp Ser Ser Gly Tyr Tyr Leu Leu Asp Tyr1 5 10 1543215PRTArtificial SequenceSynthetic peptide 432Glu Glu Gly Tyr Tyr Asp Ser Ser Gly Tyr Tyr Ser Leu Gly Ala1 5 10 1543318PRTArtificial SequenceSynthetic peptide 433Glu Glu Gly Tyr Tyr Asp Ser Ser Gly Tyr Tyr Ser Leu Gly Ala Ser1 5 10 15Asp Tyr4349PRTArtificial SequenceSynthetic peptide 434Arg Pro Asp Ser Ser Gly Ser Arg Trp1 543513PRTArtificial SequenceSynthetic peptide 435Arg Pro Asp Ser Ser Gly Ser Arg Trp Tyr Phe Asp Tyr1 5 1043610PRTArtificial SequenceSynthetic peptide 436Gly Tyr Tyr Asp Ile Ser Gly Tyr Tyr Phe1 5 1043715PRTArtificial SequenceSynthetic peptide 437Gly Tyr Tyr Asp Ile Ser Gly Tyr Tyr Phe Asp Ala Phe Asn Ile1 5 10 1543812PRTArtificial SequenceSynthetic peptide 438Asp Arg Gly Tyr Asp Ser Ser Gly Tyr Tyr Gly Asn1 5 1043915PRTArtificial SequenceSynthetic peptide 439Asp Arg Gly Tyr Asp Ser Ser Gly Tyr Tyr Gly Asn Leu Asp Cys1 5 10 1544012PRTArtificial SequenceSynthetic peptide 440Asp Arg Gly Tyr Asp Ser Ile Gly Tyr Tyr Gly Asn1 5 1044115PRTArtificial SequenceSynthetic peptide 441Asp Arg Gly Tyr Asp Ser Ile Gly Tyr Tyr Gly Asn Leu Asp Cys1 5 10 1544219PRTArtificial SequenceSynthetic peptide 442Ala Glu Asp Leu Thr Tyr Tyr Tyr Asp Arg Ser Gly Trp Gly Val His1 5 10 15Gly Leu Leu44324PRTArtificial SequenceSynthetic peptide 443Ala Glu Asp Leu Thr Tyr Tyr Tyr Asp Arg Ser Gly Trp Gly Val His1 5 10 15Gly Leu Leu Tyr Tyr Phe Asp Tyr 2044413PRTArtificial SequenceSynthetic peptide 444Leu Tyr Pro His Tyr Asp Ser Ser Gly Tyr Tyr Tyr Val1 5 1044516PRTArtificial SequenceSynthetic peptide 445Leu Tyr Pro His Tyr Asp Ser Ser Gly Tyr Tyr Tyr Val Leu Asp Tyr1 5 10 
1544616PRTArtificial SequenceSynthetic peptide 446Asp Arg Val Gly Tyr Tyr Asp Ser Ser Gly Tyr Pro Pro Gly Ser Pro1 5 10 1544719PRTArtificial SequenceSynthetic peptide 447Asp Arg Val Gly Tyr Tyr Asp Ser Ser Gly Tyr Pro Pro Gly Ser Pro1 5 10 15Leu Asp Tyr

* * * * *
