Mutant Sodium Channel Nav 1.7 And Methods Related Thereto Leppert; Mark F. ; et al. [Leppert; Mark F.]

Patent Application Summary

U.S. patent application number 12/536245 was filed with the patent office on 2009-08-05 and published on 2011-05-05 for mutant sodium channel nav 1.7 and methods related thereto. Invention is credited to Mark F. Leppert and Nanda A. Singh.

Publication Number	20110104665
Application Number	12/536245
Family ID	34807157
Publication Date	2011-05-05

United States Patent Application	20110104665
Kind Code	A1
Leppert; Mark F. ; et al.	May 5, 2011

MUTANT SODIUM CHANNEL NAV 1.7 AND METHODS RELATED THERETO

Abstract

Described are mutant Na.sub.v1.7 sodium channel alpha-subunits and nucleic acid sequences encoding such mutants. Further described are methods for characterizing a nucleic acid sequence that encodes a Na.sub.v1.7 sodium channel alpha-subunit, methods for determining a Na.sub.v1.7 haplotype, methods for determining a subject's predisposition to a neurologic disorder associated with a sodium channel mutation, and methods of identifying a compound that modulates mutant Na.sub.v1.7 sodium channels. Other materials, compositions, articles, devices, and methods relating to mutant Na.sub.v1.7 sodium channels are also described herein.

Inventors:	Leppert; Mark F.; (Salt Lake City, UT) ; Singh; Nanda A.; (Heber City, UT)
Family ID:	34807157
Appl. No.:	12/536245
Filed:	August 5, 2009

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
10585717	Mar 14, 2007	7670771
PCT/US05/02059	Jan 21, 2005
12536245
60538149	Jan 21, 2004

Current U.S. Class:	1/1
Current CPC Class:	C07K 14/705 20130101; Y10T 436/143333 20150115; C12Q 1/6883 20130101; C12Q 2600/172 20130101; C12Q 2600/156 20130101
Class at Publication:	435/6
International Class:	C12Q 1/68 20060101 C12Q001/68

Government Interests

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

[0002] This invention was made with government support under federal grant R01-NS-32666 awarded by the National Institutes of Health. The Government has certain rights to this invention.

Claims

1-88. (canceled)

89. A method for determining a subject's predisposition to a neurologic disorder associated with a sodium channel mutation, comprising the step of identifying mutations at one or more sites in regions of the nucleic acid sequence that encode an intracellular N-terminal region, an extracellular loop in domain I, an intracellular loop between domains I and II, an intracellular loop between domains II and III, an intramembrane region of domain II, or any combination thereof, such identified nucleotides indicating a predisposition to the neurologic disorder.

90. The method of claim 89, wherein the mutation is present in the nucleic acid region encoding the intracellular N-terminus region of the subunit.

91. The method of claim 89, wherein the mutation is present in the nucleic acid region encoding the extracellular loop of domain I of the subunit.

92. The method of claim 89, wherein the mutation is present in the nucleic acid region encoding the intracellular loop between domains I and II of the subunit.

93. The method of claim 89, wherein the mutation is present in the nucleic acid region encoding the intracellular loop between domains II and III of the subunit.

94. The method of claim 89, wherein the mutation is present in the nucleic acid region encoding the intramembrane region of domain II of the subunit.

95. The method of claim 89, wherein the step of identifying the mutations comprises comparing the nucleic acid sequence to a wild-type nucleic acid sequence.

96. The method of claim 95, wherein the wild-type nucleic acid sequence encodes the amino acid sequence of SEQ ID NO: 38.

97. The method of claim 89, wherein the identifying step comprises obtaining a biological sample and testing the sample to identify the nucleotides at the mutation sites of the nucleic acid contained therein.

98. The method of claim 89, wherein the neurologic disorder is a seizure disorder.

99. The method of claim 98, wherein the seizure disorder is a febrile seizure disorder.

100. A method for determining a subject's predisposition to a neurologic disorder associated with a sodium channel mutation comprising comparing the subject's Na.sub.v1.7 haplotype with one or more reference haplotypes that correlate with the neurologic disorder, a similar haplotype in the subject's Na.sub.v1.7 haplotype as compared to the reference haplotype or haplotypes indicating a predisposition to the neurologic disorder.

101. The method of claim 100, wherein the neurologic disorder is a seizure disorder.

102. The method of claim 101, wherein the seizure disorder is a febrile seizure disorder.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is a continuation of copending application Ser. No. 10/585,717, filed Jul. 11, 2006, which was the National Stage of International Application No. PCT/US2005/002059, filed Jan. 21, 2005, which claims benefit of U.S. Provisional Application No. 60/538,149, filed Jan. 21, 2004. Application Ser. No. 10/585,717, filed Jul. 11, 2006, International Application No. PCT/US2005/002059, filed Jan. 21, 2005, and U.S. Provisional Application No. 60/538,149, filed Jan. 21, 2004, are hereby incorporated herein by reference in their entirety.

BACKGROUND

[0003] Voltage-gated sodium channels are transmembrane proteins that mediate regenerative inward currents that are responsible for the initial depolarization of action potentials in excitable cells, such as neurons and muscle. Sodium channels are typically a complex of various subunits, the principal one being the alpha-subunit. The alpha-subunit is the pore-forming subunit, and it alone is sufficient for all known sodium channel function. However, in certain sodium channels, smaller, auxiliary subunits called beta-subunits are known to associate with the larger alpha-subunit and are believed to modulate some of the functions of the alpha-subunit. (See Kraner, et al. (1985) J Biol Chem 260:6341-6347; Tanaka, et al. (1983) J Biol Chem 258:7519-7526; Hartshorne, et al. (1984) J Biol Chem 259:1667-1675; Catterall, (1992) Physiol Rev 72:S14-S48; Anderson, et al. (1992) Physiol Rev 72:S89-S158.) A review of sodium channels is presented in Catterall, (1995) Ann Rev Biochem 64:493-531.

[0004] The primary structures of sodium channel alpha-subunits from a variety of tissues (brain, peripheral nerve, skeletal muscle, and cardiac muscle) and organisms (jellyfish, squid, eel, rat, human) have been identified, and their amino acid sequences show individual regions which have been conserved over a long evolutionary period (see Alberts, et al., eds., "Molecular Biology of the Cell" 534-535, Garland Pub., New York, N.Y. (1994)). From these studies it is known that the alpha-subunit of a sodium channel is a large glycoprotein containing four homologous domains (labeled I-IV in FIG. 1) connected by intracellular loops. The N-terminus of the alpha-subunit extends intracellularly at domain I (i.e., DI) and the C-terminus of the alpha-subunit extends intracellularly at domain IV (i.e., DIV). In the plasma membrane, the four domains orient in such a way as to create a central pore whose structural constituents determine the selectivity and conductance properties of the sodium channel.

[0005] Each domain of the sodium channel alpha-subunit contains six transmembrane alpha-helices or segments (labeled 1-6 in FIG. 1). Five of these transmembrane segments are hydrophobic, whereas one segment is positively charged with several lysine or arginine residues. This highly charged segment is the fourth transmembrane segment in each domain. Extracellular loops connect segment 1 (i.e., S1) to segment 2 (i.e., S2) and segment 3 (i.e., S3) to segment 4 (i.e., S4). Intracellular loops connect S2 to S3 and S4 to segment 5 (i.e., S5). An extracellular re-entrant loop connects S5 to segment 6 (i.e., S6). (See Agnew, et al. (1978) Proc Natl Acad Sci USA 75:2606-2610; Agnew, et al. (1980) Biochem Biophys Res Comm 92:860-866; Catterall, (1986) Ann Rev Biochem 55:953-985; Catterall, (1992) Physiol Rev 72:S14-S48.)

[0006] Voltage-gated sodium channels can be named according to a standardized form of nomenclature outlined in Goldin, et al. (2000) Neuron 28:365-368. According to that system, voltage-gated sodium channels are grouped into one family from which nine mammalian isoforms have been identified and expressed. These nine isoforms are given the names Na.sub.v1.1 through Na.sub.v1.9. Also, splice variants of the various isoforms are distinguished by the use of lower case letters following the numbers (e.g., "Na.sub.v1.1a").

[0007] Because of the important role sodium channels play in the transmission of action potentials in excitable cells like neurons and muscle, sodium channels have been implicated in many sensory, motor, and neurologic disorders. Accordingly, sodium channels have been the focus of much scientific research. However, while a great deal has been learned about sodium channels, there remains a need for further understanding of the functioning of sodium channels, and means to diagnose, predict, prevent, and treat diseases, disorders, and conditions that result from variations and abnormalities of sodium channels. These and other objects and advantages of the materials, compositions, articles, devices, and methods described herein, as well as additional inventive features, will be apparent from the following disclosure.

BRIEF SUMMARY

[0008] In accordance with the purposes of the disclosed materials, compositions, articles, devices, and methods, as embodied and broadly described herein, the disclosed subject matter, in one aspect, relates to a method of characterizing a nucleic acid sequence that encodes a Na.sub.v1.7 sodium channel alpha-subunit, wherein the method comprises the step of identifying mutations at one or more sites in regions of the nucleic acid sequence that encode an intracellular N-terminal region, an extracellular loop in domain I, an intracellular loop between domains I and II, an intracellular loop between domains II and III, an intramembrane region of domain II, or any combination thereof, such identified nucleotides indicating the character of the nucleic acid sequence.

[0009] In another aspect, the disclosed subject matter relates to a method for determining a Na.sub.v1.7 haplotype in a human subject, wherein the method comprises identifying one or more nucleotides encoding amino acid residues 62, 149, 641, 655, 739, 1123, or any combination thereof, wherein the nucleotide or nucleotides indicate the haplotype.

[0010] In yet another aspect, the disclosed subject matter relates to a method for determining a subject's predisposition to a neurologic disorder associated with a sodium channel mutation comprising comparing the subject's Na.sub.v1.7 haplotype with one or more reference haplotypes that correlate with the neurologic disorder, a similar haplotype in the subject's Na.sub.v1.7 haplotype as compared to the reference haplotype or haplotypes indicating a predisposition to the neurologic disorder.
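By way of non-limiting illustration, the comparison of a subject's haplotype with a reference haplotype can be sketched as follows. The sketch represents a haplotype as the amino acid residue found at each typed site and counts the sites at which two haplotypes agree. The function names, the example haplotypes, and the matching threshold are hypothetical and are used for illustration only; no numerical criterion for a "similar" haplotype is specified herein.

```python
# Hypothetical sketch: compare a subject haplotype with a reference haplotype.
# The residue numbers come from the text; the data and threshold are assumptions.
SITES = (62, 149, 641, 655, 739, 1123)


def matching_sites(subject, reference):
    """Return the sites typed in both haplotypes that carry the same residue."""
    return [
        site
        for site in SITES
        if site in subject and site in reference and subject[site] == reference[site]
    ]


def is_similar(subject, reference, min_matches):
    """Call two haplotypes similar when at least min_matches sites agree (assumed rule)."""
    return len(matching_sites(subject, reference)) >= min_matches


# Illustrative data only: one-letter residue codes at three typed sites.
reference_haplotype = {62: "I", 641: "Y", 655: "K"}
subject_haplotype = {62: "I", 641: "Y", 655: "R"}

print(matching_sites(subject_haplotype, reference_haplotype))  # [62, 641]
```

In practice the threshold, and whether all sites must agree, would be set by whoever defines the reference haplotypes.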

[0011] In a still further aspect, described herein is a method of identifying a compound that modulates mutant Na.sub.v1.7 sodium channels, wherein the method comprises contacting with a test compound a cell containing a mutant Na.sub.v1.7 nucleic acid that encodes a mutant Na.sub.v1.7 sodium channel comprising one or more mutations at residue 62, residue 149, residue 641, residue 655, residue 739, or residue 1123, detecting Na.sub.v1.7 sodium channel activity, and comparing the Na.sub.v1.7 sodium channel activity in the contacted cell with the amount of Na.sub.v1.7 sodium channel activity in a control cell, wherein the control cell is not contacted by the test compound, an increased or decreased Na.sub.v1.7 sodium channel activity in the test cell as compared to the control cell indicating a compound that modulates mutant Na.sub.v1.7 sodium channels.
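As a non-limiting sketch of the comparing step of this screening method, channel activity measured in cells contacted with a test compound can be compared with activity measured in uncontacted control cells. The function name, the use of mean activity magnitudes, and the 10% tolerance are assumptions made for illustration; no particular statistic or cutoff is prescribed herein.

```python
# Hypothetical sketch of the comparison step in the screening method.
from statistics import mean


def classify_modulation(treated, control, tolerance=0.10):
    """Label a compound by comparing mean activity with and without it.

    treated, control: activity magnitudes in the same units.
    tolerance: assumed fractional change below which activity is called unchanged.
    """
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control activity must be nonzero")
    change = (mean(treated) - control_mean) / abs(control_mean)
    if change > tolerance:
        return "increased"
    if change < -tolerance:
        return "decreased"
    return "no change"


# Made-up measurements for illustration only.
control_cells = [100.0, 96.0, 104.0]
treated_cells = [61.0, 58.0, 64.0]
print(classify_modulation(treated_cells, control_cells))  # decreased
```

A result of "increased" or "decreased" would correspond to a compound that modulates the mutant channel under these assumed criteria.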

[0012] Also, described herein are isolated nucleic acids comprising nucleotide sequences encoding mutant Na.sub.v1.7 sodium channel alpha-subunits, expression vectors made from such nucleic acids, cultured cells comprising such vectors, and methods of making mutant Na.sub.v1.7 sodium channel alpha-subunits comprising culturing such cells under conditions allowing expression of the polypeptide encoded by the nucleic acids, wherein the polypeptide comprises a mutant Na.sub.v1.7 sodium channel alpha-subunit. Further, described herein are isolated polypeptides comprising mutant Na.sub.v1.7 sodium channel alpha-subunits and fragments thereof as well as purified antibodies that bind to epitopes of such mutant Na.sub.v1.7 sodium channel alpha-subunits.

[0013] Additional advantages will be set forth in part in the description that follows, and in part will be obvious from the description, or may be learned by practice of the aspects described below. The advantages described below will be realized and attained by means of the elements and combinations particularly pointed out in the appended claims. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive.

BRIEF DESCRIPTION OF THE DRAWINGS

[0014] The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate several aspects described below.

[0015] FIG. 1 is a diagram of the secondary structure of a sodium channel alpha-subunit. Not shown is the pore region in each of the four domains, which consists of an inward loop between transmembrane regions 5 and 6.

[0016] FIG. 2 is a diagram showing the segregation of the N641Y mutation and phenotypic findings of kindred 4425. The following abbreviations are used in the diagram: "fs" means febrile seizures; "afs" means afebrile seizures; "+" means wild type; and "m" means mutant.

[0017] FIG. 3 is a diagram of the secondary structure of a Na.sub.v1.7 sodium channel alpha-subunit where the locations of various mutations are identified.

[0018] FIG. 4 is a graph showing current voltage relationships of whole-cell currents. Full-length wild-type SCN9A and mutant SCN9A (K655R and N641Y) constructs were transiently transfected into tsA201 cells. Currents were elicited by test pulses from -60 mV to +40 mV in 5 mV increments. At negative potentials, K655R has a higher current density than wild type. At positive potentials, N641Y has reduced current density compared to wild-type, p<0.05.
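For illustration only, the test-pulse series described for FIG. 4 can be enumerated as shown below. The helper names are hypothetical, and the normalization of peak current by cell capacitance is an assumption about how current density is computed, since the calculation is not stated in this description.

```python
# Sketch of the FIG. 4 voltage protocol: -60 mV to +40 mV in 5 mV increments.
def pulse_voltages(start_mv=-60, stop_mv=40, step_mv=5):
    """Return the list of test-pulse voltages in mV, inclusive of both ends."""
    return list(range(start_mv, stop_mv + step_mv, step_mv))


def current_density(peak_current_pa, capacitance_pf):
    """Current density in pA/pF (normalizing by capacitance is an assumption)."""
    return peak_current_pa / capacitance_pf


voltages = pulse_voltages()
print(len(voltages), voltages[0], voltages[-1])  # 21 -60 40
```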

DETAILED DESCRIPTION

[0019] The materials, compositions, articles, devices, and methods described herein may be understood more readily by reference to the following detailed description of specific aspects of the disclosed subject matter, and methods and the Examples included therein and to the Figures and their previous and following description.

[0020] Before the present materials, compositions, articles, devices, and methods are disclosed and described, it is to be understood that the aspects described below are not limited to specific synthetic methods or specific reagents, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular aspects only and is not intended to be limiting.

[0021] Disclosed herein are materials, compositions, and components that can be used for, can be used in conjunction with, can be used in preparation for, or are products of the disclosed methods and compositions. These and other materials are disclosed herein, and it is understood that when combinations, subsets, interactions, groups, etc. of these materials are disclosed, while specific reference to each of the various individual and collective combinations and permutations of these compounds may not be explicitly disclosed, each is specifically contemplated and described herein. For example, if a Na.sub.v1.7 sodium channel is disclosed and a number of modifications that can be made to a number of amino acid residues or nucleotides, including those related to the mutant Na.sub.v1.7 sodium channel, are discussed, each and every combination and permutation that is possible is specifically contemplated unless specifically indicated to the contrary. Thus, if a class of substituents A, B, and C is disclosed as well as a class of substituents D, E, and F, and an example of a combination molecule, A-D, is disclosed, then even if each is not individually recited, each is individually and collectively contemplated. Thus, in this example, each of the combinations A-E, A-F, B-D, B-E, B-F, C-D, C-E, and C-F are specifically contemplated and should be considered disclosed from disclosure of A, B, and C; D, E, and F; and the example combination A-D. Likewise, any subset or combination of these is also specifically contemplated and disclosed. Thus, for example, the sub-group of A-E, B-F, and C-E are specifically contemplated and should be considered disclosed from disclosure of A, B, and C; D, E, and F; and the example combination A-D. This concept applies to all aspects of this disclosure including, but not limited to, steps in methods of making and using the disclosed compositions. Thus, if there are a variety of additional steps that can be performed it is understood that each of these additional steps can be performed with any specific embodiment or combination of embodiments of the disclosed methods, and that each such combination is specifically contemplated and should be considered disclosed.
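The A-D example above amounts to pairing every member of one class with every member of the other. As a brief, non-limiting illustration (the variable names are arbitrary), the nine pairings can be enumerated as follows.

```python
# Enumerate every pairing of one substituent from each class.
from itertools import product

class_one = ["A", "B", "C"]
class_two = ["D", "E", "F"]

pairings = [f"{first}-{second}" for first, second in product(class_one, class_two)]
print(pairings)
# ['A-D', 'A-E', 'A-F', 'B-D', 'B-E', 'B-F', 'C-D', 'C-E', 'C-F']
```

The list contains the example combination A-D together with the eight further combinations recited above.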

[0022] Throughout this specification, various publications are referenced. The disclosures of these publications in their entireties are hereby incorporated by reference into this application in order to more fully describe the state of the art to which this pertains. The references disclosed are also individually and specifically incorporated by reference herein for the material contained in them that is discussed in the sentence in which the reference is relied upon.

Definitions

[0023] In this specification and in the claims that follow, reference will be made to a number of terms, which shall be defined to have the following meanings.

[0024] As used in the specification and the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a nucleotide" includes mixtures of two or more such nucleotides, reference to "an amino acid" includes mixtures of two or more such amino acids, reference to "the sodium channel" includes mixtures of two or more such sodium channels, and the like.

[0025] "Optional" or "optionally" means that the subsequently described event or circumstance can or cannot occur, and that the description includes instances where the event or circumstance occurs and instances where it does not. For example, the phrase "the array can optionally comprise the most commonly found allele at a second . . . position" means that the most commonly found allele at a second position may or may not be present in the array and that the description includes both arrays without the most commonly found allele at the second position and arrays where there is the most commonly found allele at the second position.

[0026] Ranges may be expressed herein as from "about" one particular value, and/or to "about" another particular value. When such a range is expressed, another aspect includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by use of the antecedent "about," it will be understood that the particular value forms another aspect. It will be further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint.

[0027] "Subject," as used herein, means an individual. In one aspect, the subject is a mammal such as a primate, and, in another aspect, the subject is a human. The term "subject" also includes domesticated animals (e.g., cats, dogs, etc.), livestock (e.g., cattle, horses, pigs, sheep, goats, etc.), and laboratory animals (e.g., mouse, rabbit, rat, guinea pig, etc.).

[0028] "Na.sub.v1.7," as used herein, refers to an isoform of a sodium channel known in the art by names such as NaS, hNE-Na, and PN1. The traditional gene symbol for a Na.sub.v1.7 sodium channel is SCN9A, and thus the term Na.sub.v1.7, as used herein, is synonymous with the term SCN9A. There are a variety of sequences related to the Na.sub.v1.7 gene having the following GenBank Accession Numbers: NM_002977 (human), U35238 (rabbit), X82835 (human), U79568 (rat), and AF000368 (rat). These nucleic acid sequences, the polypeptides encoded by them, and other nucleic acid and polypeptide sequences are herein incorporated by reference in their entireties as well as for individual subsequences contained therein.

[0029] There are a variety of compositions disclosed herein that are amino acid based, including for example Na.sub.v1.7 sodium channel alpha-subunits. Thus, as used herein, "amino acid" means the typically encountered twenty amino acids which make up polypeptides. In addition, it further includes less typical constituents which are naturally occurring, such as, but not limited to, formylmethionine and selenocysteine, as well as analogs of typically found amino acids and mimetics of amino acids or amino acid functionalities. Non-limiting examples of these and other molecules are discussed herein.

[0030] As used herein, the terms "peptide" and "polypeptide" refer to a class of compounds composed of amino acids chemically bound together. Non-limiting examples of these and other molecules are discussed herein. In general, the amino acids are chemically bound together via amide linkages (CONH); however, the amino acids may be bound together by other chemical bonds known in the art. For example, the amino acids may be bound by amine linkages. Peptide as used herein includes oligomers of amino acids and small and large peptides, including polypeptides and proteins.

[0031] There are a variety of compositions disclosed herein that are nucleic acid based, including for example the nucleic acids that encode, for example, Na.sub.v1.7 sodium channel alpha-subunits. Thus, as used herein, "nucleic acid" means a molecule made up of, for example, nucleotides, nucleotide analogs, or nucleotide substitutes. Non-limiting examples of these and other molecules are discussed herein. A nucleic acid can be double stranded or single stranded. It is understood that, for example, when a vector is expressed in a cell the expressed mRNA will typically be made up of A, C, G, and U. Likewise, it is understood that if, for example, an antisense molecule is introduced into a cell or cell environment through, for example, exogenous delivery, it is advantageous that the antisense molecule be made up of nucleotide analogs that reduce the degradation of the antisense molecule in the cellular environment.

[0032] As used herein, "nucleotide" is a molecule that contains a base moiety, a sugar moiety and a phosphate moiety. Nucleotides can be linked together through their phosphate moieties and sugar moieties creating an internucleoside linkage. The base moiety of a nucleotide can be adenine-9-yl (A), cytosine-1-yl (C), guanine-9-yl (G), uracil-1-yl (U), and thymin-1-yl (T). The sugar moiety of a nucleotide is a ribose or a deoxyribose. The phosphate moiety of a nucleotide is pentavalent phosphate. A non-limiting example of a nucleotide would be 3'-AMP (3'-adenosine monophosphate) or 5'-GMP (5'-guanosine monophosphate).

[0033] "Nucleotide analog," as used herein, is a nucleotide which contains some type of modification to either the base, sugar, or phosphate moieties. Modifications to nucleotides are well known in the art and would include for example, 5-methylcytosine (5-me-C), 5-hydroxymethyl cytosine, xanthine, hypoxanthine, and 2-aminoadenine as well as modifications at the sugar or phosphate moieties.

[0034] "Nucleotide substitutes," as used herein, are molecules having similar functional properties to nucleotides, but which do not contain a phosphate moiety, such as peptide nucleic acid (PNA). Nucleotide substitutes are molecules that will recognize nucleic acids in a Watson-Crick or Hoogsteen manner, but which are linked together through a moiety other than a phosphate moiety. Nucleotide substitutes are able to conform to a double helix type structure when interacting with the appropriate target nucleic acid.

[0035] It is also possible to link other types of molecules (conjugates) to nucleotides or nucleotide analogs to enhance, for example, cellular uptake. Conjugates can be chemically linked to the nucleotide or nucleotide analogs. Such conjugates include but are not limited to lipid moieties such as a cholesterol moiety (Letsinger, et al. (1989) Proc Natl Acad Sci USA 86:6553-6556).

[0036] A "Watson-Crick interaction" is at least one interaction with the Watson-Crick face of a nucleotide, nucleotide analog, or nucleotide substitute. The Watson-Crick face of a nucleotide, nucleotide analog, or nucleotide substitute includes the C2, N1, and C6 positions of a purine based nucleotide, nucleotide analog, or nucleotide substitute and the C2, N3, C4 positions of a pyrimidine based nucleotide, nucleotide analog, or nucleotide substitute.

[0037] A "Hoogsteen interaction" is the interaction that takes place on the Hoogsteen face of a nucleotide or nucleotide analog, which is exposed in the major groove of duplex DNA. The Hoogsteen face includes the N7 position and reactive groups (NH.sub.2 or O) at the C6 position of purine nucleotides.

[0038] "Deletion," as used herein, refers to a change in an amino acid or nucleotide sequence in which one or more amino acid or nucleotide residues, respectively, are absent relative to the reference sequence.

[0039] "Insertion" or "addition," as used herein, refers to a change in an amino acid or nucleotide sequence resulting in the addition of one or more amino acid or nucleotide residues, respectively, as compared to the reference sequence.

[0040] "Substitution," as used herein, refers to the replacement of one or more amino acids or nucleotides by one or more different amino acids or nucleotides, respectively, in a reference sequence.

[0041] "Isolated," as used herein, refers to material, such as a nucleic acid or a polypeptide, which is: (1) substantially or essentially free from components which normally accompany or interact with it as found in its naturally occurring environment, although the isolated material optionally comprises material not found with the material in its natural environment; or (2) if the material is in its natural environment, the material has been synthetically (non-naturally) altered by deliberate human intervention to a composition and/or placed at a locus in the cell (e.g., genome or subcellular organelle) not native to a material found in that environment. The alteration to yield the synthetic material can be performed on the material within or removed from its natural state.

Characterizing Mutant Na.sub.v1.7 Nucleic Acid Sequences

[0042] It has been found that, in certain neurologic disorders, specific sites in the Na.sub.v1.7 gene are mutated, i.e., the nucleotide at a specific position or at specific positions differs from that observed in the most commonly found Na.sub.v1.7 gene sequence. Accordingly, disclosed herein are methods of characterizing mutant nucleic acid sequences that encode a Na.sub.v1.7 sodium channel alpha-subunit and the use of such nucleic acids to diagnose and treat disease states and neurologic disorders, such as seizures.

[0043] In one aspect, disclosed herein is a method of characterizing a nucleic acid sequence that encodes a Na.sub.v1.7 sodium channel alpha-subunit, comprising the step of identifying mutations at one or more sites in regions of the nucleic acid sequence that encode various regions of the Na.sub.v1.7 sodium channel alpha-subunit. While mutations can be present in any region of the Na.sub.v1.7 nucleic acid sequence, specific regions of the nucleic acid sequence where mutations can be identified include, but are not limited to, those regions that encode an intracellular N-terminal region, an extracellular loop in domain I, an intracellular loop between domains I and II, an intracellular loop between domains II and III, an intramembrane region of domain II, or any combination thereof. Such identified nucleotides can indicate the character of the nucleic acid sequence.

[0044] The terms "mutation" and "mutant," as used herein, mean that, at one or more specific positions in a nucleic acid or amino acid sequence, a nucleotide or amino acid that differs from the most commonly found nucleotide or amino acid can be identified. A mutation includes deletions, additions, insertions, and substitutions in the nucleotide or amino acid sequence. For example, in one particular mutant Na.sub.v1.7 nucleic acid sequence disclosed herein, position 184 of the nucleic acid sequence contains a substitution; that is, the most commonly found nucleotide at position 184 of the Na.sub.v1.7 gene is A, whereas in the mutant Na.sub.v1.7 nucleic acid sequence, the nucleotide found at position 184, i.e., the mutated site, is G. One of skill in the art can analyze position 184 and determine which of the two nucleotides (A or G) is present. As another example, in one particular mutant Na.sub.v1.7 sodium channel alpha-subunit disclosed herein, position 62 of the amino acid sequence contains a substitution; that is, the most commonly found amino acid at position 62 of the Na.sub.v1.7 amino acid sequence is isoleucine, whereas in the mutant Na.sub.v1.7 amino acid sequence, the amino acid found at position 62, i.e., the mutated site, is valine. Also, one of skill in the art can analyze position 62 of the amino acid sequence and determine which of the two amino acids (isoleucine or valine) is present. Further, as used herein, "mutant" also includes combinations of mutations at more than one position in the Na.sub.v1.7 nucleic acid or amino acid sequence. Mutations may provide functional differences in the genetic sequence, through changes in the encoded polypeptide, changes in mRNA stability, binding of transcriptional and translation factors to the DNA or RNA, and the like. The mutations can also be used as single nucleotide or single amino acid mutations to detect genetic linkage to phenotypic variation in activity and expression of sodium channels.
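By way of non-limiting illustration, identifying a substitution of the kind described above amounts to comparing a sample sequence with a reference sequence position by position. The sketch below uses short toy sequences rather than any actual Na.sub.v1.7 sequence, and the function name is hypothetical. It assumes that both sequences have the same length, so that only substitutions, and not insertions or deletions, are present.

```python
# Hypothetical sketch: list substitutions relative to a reference sequence.
def find_substitutions(reference, sample):
    """Return (position, reference_base, sample_base) tuples with 1-based positions.

    Assumes equal-length sequences, i.e. substitutions only.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have equal length")
    return [
        (index, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


# Toy 9-nucleotide sequences, not actual SCN9A sequence.
reference_seq = "ATGATTGCA"
sample_seq = "ATGGTTGCA"
print(find_substitutions(reference_seq, sample_seq))  # [(4, 'A', 'G')]
```

In the toy data, the A-to-G change falls in the first position of the second codon and converts ATT (isoleucine) to GTT (valine), which parallels the isoleucine-to-valine example given above.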

[0045] As utilized herein, the "character" of the Na.sub.v1.7 nucleic acid sequence can be the combination of nucleotides present at mutated sites that make up the Na.sub.v1.7 sodium channel alpha-subunit haplotype as well as the biological activity associated with a particular mutation or combination of mutations.

[0046] In one specific aspect, a mutation can be present in the nucleic acid region encoding the intracellular N-terminus region of the Na.sub.v1.7 sodium channel alpha-subunit. For example, such a mutation can be at the site that encodes amino acid residue 62. The mutated site can be at position 184 of the Na.sub.v1.7 nucleic acid sequence. In one particular aspect, the mutation can encode a valine at amino acid residue 62.

[0047] In another aspect, a mutation can be present in the nucleic acid region encoding the extracellular loop of domain I of the Na.sub.v1.7 sodium channel alpha-subunit. For example, such a mutation can be at the site that encodes amino acid residue 149. The mutated site can be at position 446 of the Na.sub.v1.7 nucleic acid sequence. In one specific aspect, the mutation can encode a glutamine at amino acid residue 149.

[0048] In yet another aspect, mutations can be present in the nucleic acid region encoding the intracellular loop between domains I and II of the Na.sub.v1.7 sodium channel alpha-subunit. For example, such mutations can be at sites that encode amino acid residue 641 and/or amino acid residue 655. The mutated sites can be at positions 1921 and/or 1964 of the Na.sub.v1.7 nucleic acid sequence. In one specific aspect, the mutation can encode a tyrosine at amino acid residue 641. In another aspect, the mutation can encode an arginine at amino acid residue 655.

[0049] In a further aspect, a mutation can be present in the nucleic acid region encoding the intramembrane region of domain II of the Na.sub.v1.7 sodium channel alpha-subunit. For example, such a mutation can be at the site that encodes amino acid residue 739. The mutated site can be at position 2215 of the Na.sub.v1.7 nucleic acid sequence. In one specific aspect, the mutation can encode a valine at amino acid residue 739.

[0050] In still another aspect, a mutation can be present in the nucleic acid region encoding the intracellular loop between domains II and III of the Na.sub.v1.7 sodium channel alpha-subunit. For example, such a mutation can be at the site that encodes amino acid residue 1123. The mutated site can be at position 3369 of the Na.sub.v1.7 nucleic acid sequence. In one specific aspect, the mutation can encode a phenylalanine at amino acid residue 1123.

[0051] Mutations can also be present in more than one region of the nucleic acid sequence, such as in regions that encode an intracellular N-terminal region and an extracellular loop in domain I; an intracellular N-terminal region and an intracellular loop between domains I and II; an intracellular N-terminal region and an intracellular loop between domains II and III; an intracellular N-terminal region and an intramembrane region of domain II; an extracellular loop in domain I and an intracellular loop between domains I and II; an extracellular loop in domain I and an intracellular loop between domains II and III; an extracellular loop in domain I and an intramembrane region of domain II; an intracellular loop between domains I and II and an intracellular loop between domains II and III; an intracellular loop between domains I and II and an intramembrane region of domain II; and an intracellular loop between domains II and III and an intramembrane region of domain II.

[0052] Some of the mutations that can be identified by the methods disclosed herein include, but are not limited to, mutations at positions 184, 446, 1921, 1964, 2215, 3369, or any combination thereof, of the Na.sub.v1.7 nucleic acid sequence. Any individual mutation can be analyzed at any of these positions, or combinations of mutant variants at more than one position can be identified and analyzed by the methods disclosed herein.
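The pairing of nucleotide positions and amino acid residues given in the preceding paragraphs can be checked with simple codon arithmetic. The following is a minimal sketch, assuming that nucleotide position 1 is the first base of the start codon; the function name codon_of and the table SITES are hypothetical and are used only for illustration.

```python
# Illustrative sketch: map a 1-based coding-sequence position to its codon.
# Assumption: position 1 is the first base of the start codon.

def codon_of(nt_pos):
    """Return (residue number, position within the codon, 1-3)."""
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    residue = (nt_pos - 1) // 3 + 1
    offset = (nt_pos - 1) % 3 + 1
    return residue, offset

# Nucleotide position -> amino acid residue, as stated in the text.
SITES = {184: 62, 446: 149, 1921: 641, 1964: 655, 2215: 739, 3369: 1123}

for nt_pos, stated_residue in SITES.items():
    residue, offset = codon_of(nt_pos)
    print(nt_pos, residue, offset, residue == stated_residue)
```

Under that numbering assumption, each of the six nucleotide positions falls within the codon of the residue with which the text pairs it.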

[0053] A number of methods are available for analyzing nucleic acids for the presence of a specific sequence. For all of the methods described herein, genomic DNA can be extracted from a sample and this sample can be from any organism and can be, but is not limited to, peripheral blood, bone marrow specimens, primary tumors, embedded tissue sections, frozen tissue sections, cell preparations, cytological preparations, exfoliate samples (e.g., sputum), fine needle aspirations, amnion cells, fresh tissue, dry tissue, and cultured cells or tissue. Such samples can be obtained directly from a subject, commercially obtained or obtained via other means. Thus, the methods described herein can be utilized to analyze a nucleic acid sample that comprises genomic DNA, amplified DNA (such as a PCR product), cDNA, cRNA, a restriction fragment or any other desired nucleic acid sample. When one performs one of the herein described methods on genomic DNA, typically the genomic DNA will be treated in a manner to reduce viscosity of the DNA and allow better contact of a primer or probe with the target region of the genomic DNA. Such reduction in viscosity can be achieved by any desired methods, which are known to the skilled artisan, such as DNase treatment or shearing of the genomic DNA, preferably lightly.

[0054] If sufficient DNA is available, genomic DNA can be used directly. Alternatively, the region of interest is cloned into a suitable vector and grown in sufficient quantity for analysis. The nucleic acid may be amplified by conventional techniques, such as the polymerase chain reaction (PCR), to provide sufficient amounts for analysis. A variety of PCR techniques are familiar to those skilled in the art. For a review of PCR technology, see the publication entitled "PCR Methods and Applications" (1991, Cold Spring Harbor Laboratory Press), which is incorporated herein by reference in its entirety for amplification methods. In each of these PCR procedures, PCR primers on either side of the nucleic acid sequences to be amplified are added to a suitably prepared nucleic acid sample along with dNTPs and a thermostable polymerase such as Taq polymerase, Pfu polymerase, or Vent polymerase. The nucleic acid in the sample is denatured and the PCR primers are specifically hybridized to complementary nucleic acid sequences in the sample. The hybridized primers are extended. Thereafter, another cycle of denaturation, hybridization, and extension is initiated. The cycles are repeated multiple times to produce an amplified fragment containing the nucleic acid sequence between the primer sites. PCR has further been described in several patents including U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,965,188. Each of these publications is incorporated herein by reference in its entirety for PCR methods. One of skill in the art would know how to design and synthesize primers flanking any of the nucleic acid sequences disclosed herein.

[0055] For example, the disclosed method provides primers GTCCCGCCCATTGCCTGACAC (SEQ ID NO: 20) and TTCTGGTCATGATATGGTTATTCAC (SEQ ID NO: 21), which can be utilized to amplify the region of the Na.sub.v1.7 nucleic acid sequence comprising nucleotide position 184 in order to identify a mutation at this site. The disclosed method also provides primers TGATAGATGCGTTGATGACATTGG (SEQ ID NO: 22) and TTCATAAATGCAGTAACTTCCTGG (SEQ ID NO: 23), which can be utilized to amplify the region of the Na.sub.v1.7 nucleic acid sequence comprising nucleotide position 446 in order to identify a mutation at this site. Also, the disclosed method provides primers TGTTTCTTTTAAGTCAGTACAGAG (SEQ ID NO: 24) and AGAGCCATTCACAAGACCAGAG (SEQ ID NO: 25), which can be utilized to amplify the region of the Na.sub.v1.7 nucleic acid sequence comprising nucleotide position 1921 in order to identify a mutation at this site. Additionally, the disclosed method provides primers ACTCAGAAAGGCAGAGAGGTG (SEQ ID NO: 26) and TTGCCATGTTATCAATGTCTGTG (SEQ ID NO: 27), which can be utilized to amplify the region of the Na.sub.v1.7 nucleic acid sequence comprising nucleotide position 1964 in order to identify a mutation at this site. Further, the disclosed method provides primers GACTGATTTGTATCTGGTTAGGAG (SEQ ID NO: 28) and GCAATGTAATTAGGAAGGTGTGAG (SEQ ID NO: 29), which can be utilized to amplify the region of the Na.sub.v1.7 nucleic acid sequence comprising nucleotide position 2215 in order to identify a mutation at this site. Finally, the disclosed method provides primers TTTGAATGAACTCTAAATGAACTACC (SEQ ID NO: 30) and TAAGTATTAGGCGTTAAGACAAACC (SEQ ID NO: 31), which can be utilized to amplify the region of the Na.sub.v1.7 nucleic acid sequence comprising nucleotide position 3369 in order to identify a mutation at this site. One of skill in the art would know how to design primers accordingly to amplify any region of the Na.sub.v1.7 nucleic acid sequence for the purposes of identifying a mutation at any nucleotide position throughout the Na.sub.v1.7 sodium channel alpha-subunit sequence. Amplification may also be used to determine whether a mutation is present by using a primer that is specific for the mutation.
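As one way of organizing the primer pairs listed above, the sequences can be tabulated by the nucleotide position each pair is said to flank and summarized by length and G+C content. The sketch below is illustrative only; the names PRIMER_PAIRS, gc_fraction, and summarize are hypothetical. Reporting length and G+C content is a convenience added here, not a step of the disclosed method.

```python
# Illustrative sketch: tabulate the primer pairs (SEQ ID NOs: 20-31) by the
# nucleotide position whose surrounding region each pair amplifies.

PRIMER_PAIRS = {
    184: ("GTCCCGCCCATTGCCTGACAC", "TTCTGGTCATGATATGGTTATTCAC"),
    446: ("TGATAGATGCGTTGATGACATTGG", "TTCATAAATGCAGTAACTTCCTGG"),
    1921: ("TGTTTCTTTTAAGTCAGTACAGAG", "AGAGCCATTCACAAGACCAGAG"),
    1964: ("ACTCAGAAAGGCAGAGAGGTG", "TTGCCATGTTATCAATGTCTGTG"),
    2215: ("GACTGATTTGTATCTGGTTAGGAG", "GCAATGTAATTAGGAAGGTGTGAG"),
    3369: ("TTTGAATGAACTCTAAATGAACTACC", "TAAGTATTAGGCGTTAAGACAAACC"),
}

def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def summarize(pairs):
    """Return (position, primer, length, G+C fraction) for every primer."""
    rows = []
    for position, primers in sorted(pairs.items()):
        for primer in primers:
            rows.append((position, primer, len(primer),
                         round(gc_fraction(primer), 2)))
    return rows

for row in summarize(PRIMER_PAIRS):
    print(row)
```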

[0056] Various methods are known in the art that utilize oligonucleotide ligation as a means of detecting mutations; for example, see Riley, et al. (1990) Nucleic Acids Res 18:2887-2890; and Delahunty, et al. (1996) Am J Hum Genet 58:1239-1246, which are incorporated herein by reference in their entirety for methods of detecting mutations. Such methods include single base chain extension (SBCE), oligonucleotide ligation assay (OLA) and cleavase reaction/signal release (Invader methods, Third Wave Technologies).

[0057] LCR and Gap LCR are exponential amplification techniques. Both depend on DNA ligase to join adjacent primers annealed to a DNA molecule. In Ligase Chain Reaction (LCR), probe pairs are used which include two primary (first and second) and two secondary (third and fourth) probes, all of which are employed in molar excess to target. The first probe hybridizes to a first segment of the target strand and the second probe hybridizes to a second segment of the target strand, the first and second segments being contiguous so that the primary probes abut one another in 5'-phosphate-3'-hydroxyl relationship, and so that a ligase can covalently fuse or ligate the two probes into a fused product. In addition, a third (secondary) probe can hybridize to a portion of the first probe and a fourth (secondary) probe can hybridize to a portion of the second probe in a similar abutting fashion. Of course, if the target is initially double stranded, the secondary probes also will hybridize to the target complement in the first instance. Once the ligated strand of primary probes is separated from the target strand, it will hybridize with the third and fourth probes, which can be ligated to form a complementary, secondary ligated product. It is important to realize that the ligated products are functionally equivalent to either the target or its complement. By repeated cycles of hybridization and ligation, amplification of the target sequence is achieved. A method for multiplex LCR has also been described (WO 9320227, which is incorporated herein by reference in its entirety for the methods taught therein). Gap LCR (GLCR) is a version of LCR where the probes are not adjacent but are separated by 2 to 3 bases.

[0058] A method for typing single nucleotide mutations in DNA, labeled Genetic Bit Analysis (GBA), has been described in Nikiforov, et al. (1994) Nucleic Acids Res 22:4167-4175. In this method, specific fragments of genomic DNA containing the mutated site(s) are first amplified by the polymerase chain reaction (PCR) using one regular and one phosphorothioate-modified primer. The double-stranded PCR product is rendered single-stranded by treatment with the enzyme T7 gene 6 exonuclease, and captured onto individual wells of a 96 well polystyrene plate by hybridization to an immobilized oligonucleotide primer. This primer is designed to hybridize to the single-stranded target DNA immediately adjacent to the mutated site of interest. Using the Klenow fragment of E. coli DNA polymerase I or the modified T7 DNA polymerase (Sequenase), the 3' end of the capture oligonucleotide is extended by one base using a mixture of one biotin-labeled, one fluorescein-labeled, and two unlabeled dideoxynucleoside triphosphates. Antibody conjugates of alkaline phosphatase and horseradish peroxidase are then used to determine the nature of the extended base in an ELISA format. Additionally, minisequencing with immobilized primers has been utilized for detection of mutations in PCR products (see Pastinen, et al. (1997) Genome Res 7:606-614).

[0059] The effect of phosphorothioate bonds on the hydrolytic activity of the 5'.fwdarw.3' double-strand-specific T7 gene 6 exonuclease is used in order to improve upon GBA. The use of phosphorothioate primers and exonuclease hydrolysis for the preparation of single-stranded PCR products and their detection by solid-phase hybridization can be used. (See Nikiforov, et al. (1994) PCR Methods and Applications 3:285-291.) Double-stranded DNA substrates containing one phosphorothioate residue at the 5' end were found to be hydrolyzed by this enzyme as efficiently as unmodified ones. The enzyme activity was, however, completely inhibited by the presence of four phosphorothioates. On the basis of these results, a method for the conversion of double-stranded PCR products into full-length, single-stranded DNA fragments was developed. In this method, one of the PCR primers contains four phosphorothioates at its 5' end, and the opposite strand primer is unmodified. Following the amplification, the double-stranded product is treated with T7 gene 6 exonuclease. The phosphorothioated strand is protected from the action of this enzyme, whereas the opposite strand is hydrolyzed. When the phosphorothioated PCR primer is 5' biotinylated, the single-stranded PCR product can be easily detected colorimetrically after hybridization to an oligonucleotide probe immobilized on a microtiter plate. A simple and efficient method for the immobilization of relatively short oligonucleotides to microtiter plates with a hydrophilic surface in the presence of salt can be used.

[0060] DNA analysis based on template hybridization (or hybridization plus enzymatic processing) to an array of surface-bound oligonucleotides is well suited for high density, parallel, low cost and automatable processing (Ives, et al. (1996) Proc SPIE-Int Soc Opt Eng 2680 (Ultrasensitive Biochemical Diagnostics) 258-269). Direct fluorescence detection of labeled DNA provides the benefits of linearity, large dynamic range, multianalyte detection, processing simplicity and safe handling at reasonable cost. The Molecular Tool Corporation has applied a proprietary enzymatic method of solid phase genotyping to DNA processing in 96-well plates and glass microscope slides. Detecting the fluor-labeled GBA dideoxynucleotides requires a detection limit of approximately 100 mols/.mu.m.sup.2. Commercially available plate readers detect about 1000 mols/.mu.m.sup.2, and an experimental setup with an argon laser and thermoelectrically-cooled CCD can detect approximately 1 order of magnitude less signal. The current limit is due to glass fluorescence. Dideoxynucleotides labeled with fluorescein, eosin, tetramethylrhodamine, Lissamine and Texas Red have been characterized, and photobleaching, quenching and indirect detection with fluorogenic substrates have been investigated.

[0061] Other amplification techniques that can be used in the context of the present invention include, but are not limited to, Q-beta amplification as described in European Patent Application No 4544610, strand displacement amplification as described in EP 684 315A, and target mediated amplification as described in PCT Publication WO 9322461, the disclosures of which are incorporated herein by reference in their entirety for the methods taught therein.

[0062] Allele specific amplification can also be utilized for biallelic markers. Discrimination between the two alleles of a biallelic marker can be achieved by allele specific amplification, a selective strategy whereby one of the alleles is amplified without amplification of the other allele. For allele specific amplification, at least one member of the pair of primers is sufficiently complementary with a region of a reference sequence (i.e., Na.sub.v1.7) comprising the polymorphic base of a biallelic marker of the present invention to hybridize therewith. Such primers are able to discriminate between the two alleles of a biallelic marker. This can be accomplished by placing the mutated base at the 3' end of one of the amplification primers. Such allele specific primers tend to selectively prime an amplification or sequencing reaction so long as they are used with a nucleic acid sample that contains one of the two alleles present at a biallelic marker. Because the extension proceeds from the 3' end of the primer, a mismatch at or near this position has an inhibitory effect on amplification. Therefore, under appropriate amplification conditions, these primers only direct amplification on their complementary allele. Determining the precise location of the mismatch and the corresponding assay conditions is well within the ordinary skill in the art.
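The placement of the mutated base at the 3' end of an amplification primer can be sketched as follows. The example uses the 15-nucleotide fragment surrounding position 184 (SEQ ID NO: 14) as a stand-in template, so the resulting 8-nucleotide primers are far shorter than would be used in practice and serve only to show the construction. The function names are hypothetical, and the check shown is a simple string comparison, not a model of polymerase behavior.

```python
# Illustrative sketch: build forward primers whose 3'-terminal base is the
# allele to be detected, using the fragment around position 184 as template.

def allele_specific_primer(template, site_index, allele, length):
    """Return a primer of `length` bases ending (3' end) at site_index (0-based)
    with `allele` as its final base."""
    start = site_index - length + 1
    if start < 0:
        raise ValueError("template too short upstream of the site")
    return template[start:site_index] + allele

def three_prime_matches(primer, template, site_index):
    """True if the primer's 3'-terminal base equals the template base at the site."""
    return primer[-1] == template[site_index]

COMMON = "GCCCTTCATCTATGG"   # SEQ ID NO: 14, A at the central position (184)
MUTANT = "GCCCTTCGTCTATGG"   # same fragment with G at position 184
SITE = 7                     # 0-based index of the central (8th) nucleotide

primer_a = allele_specific_primer(COMMON, SITE, "A", 8)
primer_g = allele_specific_primer(COMMON, SITE, "G", 8)

print(primer_a, three_prime_matches(primer_a, COMMON, SITE),
      three_prime_matches(primer_a, MUTANT, SITE))
print(primer_g, three_prime_matches(primer_g, COMMON, SITE),
      three_prime_matches(primer_g, MUTANT, SITE))
```

Each primer matches at its 3' end only on the template carrying its own allele, which is the property the allele specific amplification described above relies upon.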

[0063] A detectable label may be included in an amplification reaction. Suitable labels include fluorochromes, e.g., fluorescein isothiocyanate (FITC), rhodamine, Texas Red, phycoerythrin, allophycocyanin, 6-carboxyfluorescein (6-FAM), 2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein (JOE), 6-carboxy-X-rhodamine (ROX), 6-carboxy-2',4',7',4,7-hexachlorofluorescein (HEX), 5-carboxyfluorescein (5-FAM) or N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA); radioactive labels, e.g., .sup.32P, .sup.35S, .sup.3H; etc. The label may be a two stage system, where the amplified DNA is conjugated to biotin, haptens, etc. having a high affinity binding partner, e.g., avidin, specific antibodies, etc., where the binding partner is conjugated to a detectable label. The label may be conjugated to one or both of the primers. Alternatively, the pool of nucleotides used in the amplification is labeled, so as to incorporate the label into the amplification product.

[0064] The sample nucleic acid, e.g., amplified or cloned fragment, can be analyzed by one of a number of methods known in the art. The nucleic acid can be sequenced by dideoxy or other methods. Hybridization with the variant sequence can also be used to determine its presence, by Southern blots, dot blots, etc. The hybridization pattern of a control (reference) and variant sequence to an array of oligonucleotide probes immobilized on a solid support, as described in U.S. Pat. No. 5,445,934 and WO 95/35505, which are incorporated herein by reference in their entirety for the methods, may also be used as a means of detecting the presence of variant sequences. Single strand conformational polymorphism (SSCP) analysis, denaturing gradient gel electrophoresis (DGGE), mismatch cleavage detection, and heteroduplex analysis in gel matrices are used to detect conformational changes created by DNA sequence variation as alterations in electrophoretic mobility. Alternatively, where a mutation creates or destroys a recognition site for a restriction endonuclease (restriction fragment length polymorphism, RFLP), the sample is digested with that endonuclease, and the products size fractionated to determine whether the fragment was digested. Fractionation is performed by gel or capillary electrophoresis, particularly acrylamide or agarose gels.

[0065] The disclosed materials, compositions, and methods also provide the use of the nucleic acid sequences described herein in methods using a mobile solid support to analyze mutations. See, for example, WO 01/48244, which is incorporated herein by reference in its entirety for the methods taught therein.

[0066] The method of performing a Luminex FlowMetrix-based SNP analysis involves differential hybridization of a PCR product to two differently-colored FACS-analyzable beads. The FlowMetrix system currently consists of uniformly-sized 5 micron polystyrene-divinylbenzene beads stained in eight concentrations of two dyes (orange and red). The matrix of the two dyes in eight concentrations allows for 64 differently-colored beads that can each be differentiated by a FACScalibur suitably modified with the Luminex PC computer board. In the Luminex SNP analysis, covalently-linked to a bead is a short (approximately 18-20 bases) "target" oligodeoxynucleotide (oligo). The nucleotide positioned at the center of the target oligo encodes the polymorphic base. A pair of beads is synthesized; each bead of the pair has attached to it one of the polymorphic oligonucleotides. A PCR of the region of DNA surrounding the to-be-analyzed SNP is performed to generate a PCR product. Conditions are established that allow hybridization of the PCR product preferentially to the bead on which is encoded the precise complement. In one format ("without competitor"), the PCR product itself incorporates a fluorescein dye, and it is the gain of the fluorescein stain on the bead, as measured during the FACScalibur run, that indicates hybridization. In a second format ("with competitor"), the beads are hybridized with a competitor to the PCR product. The competitor itself in this case is labeled with fluorescein, and it is the loss of the fluorescein by displacement by unlabeled PCR product that indicates successful hybridization.

Isolated Na.sub.v1.7 Nucleic Acid Sequences

[0067] The nucleic acid sequences disclosed herein can be isolated by methods known in the art and described herein. In one aspect, disclosed herein are isolated nucleic acids comprising nucleotide sequences encoding mutant Na.sub.v1.7 sodium channel alpha-subunits. For example, in one aspect, disclosed herein is an isolated nucleic acid sequence comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 2. In another aspect, disclosed herein is an isolated nucleic acid sequence comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 3. In yet another aspect, disclosed herein is an isolated nucleic acid sequence comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 4. In a further aspect, disclosed herein is an isolated nucleic acid sequence comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 5. In a still further aspect, disclosed herein is an isolated nucleic acid sequence comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 6. In one aspect, disclosed herein is an isolated nucleic acid sequence comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 7.

[0068] Also, disclosed herein is an isolated nucleic acid comprising a nucleotide sequence encoding at least 5 contiguous residues of the Na.sub.v1.7 sodium channel alpha-subunit. For example, in one aspect, disclosed herein is an isolated nucleic acid comprising a nucleotide sequence encoding at least 5 contiguous residues of the amino acid sequence of SEQ ID NO: 2, wherein one of the amino acid residues comprises a valine in a position that corresponds to position 62 in SEQ ID NO: 2. In another aspect, disclosed herein is an isolated nucleic acid comprising a nucleotide sequence encoding at least 5 contiguous residues of the amino acid sequence of SEQ ID NO: 3, wherein one of the amino acid residues comprises a glutamine in a position that corresponds to position 149 in SEQ ID NO: 3. In yet another aspect, disclosed herein is an isolated nucleic acid comprising a nucleotide sequence encoding at least 5 contiguous residues of the amino acid sequence of SEQ ID NO: 4, wherein one of the amino acid residues comprises a tyrosine in a position that corresponds to position 641 in SEQ ID NO: 4. In a further aspect, disclosed herein is an isolated nucleic acid comprising a nucleotide sequence encoding at least 5 contiguous residues of the amino acid sequence of SEQ ID NO: 5, wherein one of the amino acid residues comprises an arginine in a position that corresponds to position 655 in SEQ ID NO: 5. In a still further aspect, disclosed herein is an isolated nucleic acid comprising a nucleotide sequence encoding at least 5 contiguous residues of the amino acid sequence of SEQ ID NO: 6, wherein one of the amino acid residues comprises a valine in a position that corresponds to position 739 in SEQ ID NO: 6. In one aspect, disclosed herein is an isolated nucleic acid comprising a nucleotide sequence encoding at least 5 contiguous residues of the amino acid sequence of SEQ ID NO: 7, wherein one of the amino acid residues comprises a phenylalanine in a position that corresponds to position 1123 in SEQ ID NO: 7.

Reference Nucleic Acid Sequences

[0069] Reference sequences of the Na.sub.v1.7 gene comprising the most commonly found allele are provided herein. As utilized herein, "reference sequence" refers to a nucleic acid sequence that encodes a Na.sub.v1.7 sodium channel alpha-subunit or fragment thereof comprising a specific nucleotide at a particular position(s) in the Na.sub.v1.7 nucleic acid sequence. Optionally, the reference sequence comprises the most commonly found nucleotide or allele at the particular position or positions. This reference sequence can be a full-length Na.sub.v1.7 nucleic acid sequence or fragments thereof. An example of a full-length human Na.sub.v1.7 nucleic acid sequence is provided herein as SEQ ID NO: 1.

[0070] The term "wild-type" may also be used to refer to the reference sequence comprising the most commonly found allele. It will be understood by one of skill in the art that the designation as "wild-type" is merely a convenient label for a common allele and should not be construed as conferring any particular property on that form of the sequence.

[0071] Alternatively, one of skill in the art can utilize a reference sequence or a fragment thereof comprising a nucleotide or allele that is not the most commonly found nucleotide or allele at a specific nucleotide position(s) in the Na.sub.v1.7 nucleic acid sequence or can utilize a reference sequence that comprises alternative nucleotides at a specific position(s). An example of a full-length Na.sub.v1.7 nucleic acid sequence that comprises such an alternative nucleotide at position 184 is provided herein as SEQ ID NO: 8. Therefore, when utilizing this reference sequence or a fragment thereof, the nucleotide at position 184 can be A or G. Other examples of full-length Na.sub.v1.7 reference sequences that comprise such alternative nucleotides at positions 446, 1921, 1964, 2215, and 3369 are provided herein as SEQ ID NO's: 9, 10, 11, 12, and 13, respectively. Therefore, when utilizing these reference sequences or fragments thereof, respectively, the nucleotide at position 446 can be C or A, the nucleotide at position 1921 can be A or T, the nucleotide at position 1964 can be A or G, the nucleotide at position 2215 can be A or G, and the nucleotide at position 3369 can be G or T.

[0072] In one aspect, the reference sequence can comprise a fragment of the Na.sub.v1.7 nucleic acid sequence. For example, disclosed herein is a reference sequence comprising the nucleotide sequence GCCCTTCATCTATGG (SEQ ID NO: 14), corresponding to nucleotides 177 to 191 of the Na.sub.v1.7 gene sequence. This reference sequence has an "A" at position 184, which is the most commonly found nucleotide at this position. Therefore, one of skill in the art can compare this reference sequence to a test sequence and determine if the most commonly found nucleotide (A) is present at position 184 of the test sequence or if another nucleotide (G) is present at position 184 of the test sequence. Also provided are nucleotide sequences corresponding to any fragment of SEQ ID NO: 14 that includes the A at position 184 or the corresponding sequence with a G at position 184.

[0073] As another example, disclosed herein is a reference sequence comprising the nucleotide sequence AACCCGCCGGACTGG (SEQ ID NO: 15), corresponding to nucleotides 439 to 453 of the Na.sub.v1.7 gene sequence. This reference sequence has a "C" at position 446, which is the most commonly found nucleotide at this position. Therefore, one of skill in the art can compare this reference sequence to a test sequence and determine if the most commonly found nucleotide (C) is present at position 446 of the test sequence or if another nucleotide (A) is present at position 446 of the test sequence. Also provided are nucleotide sequences corresponding to any fragment of SEQ ID NO: 15 that includes the C at position 446 or the corresponding sequence with an A at position 446.

[0074] Also, disclosed herein is a reference sequence comprising the nucleotide sequence GCTCCCCAATGGACA (SEQ ID NO: 16), corresponding to nucleotides 1914 to 1928 of the Na.sub.v1.7 gene sequence. This reference sequence has an "A" at position 1921, which is the most commonly found nucleotide at this position. Therefore, one of skill in the art can compare this reference sequence to a test sequence and determine if the most commonly found nucleotide (A) is present at position 1921 of the test sequence or if another nucleotide (T) is present at position 1921 of the test sequence. Also provided are nucleotide sequences corresponding to any fragment of SEQ ID NO: 16 that includes the A at position 1921 or the corresponding sequence with a T at position 1921.

[0075] Further, disclosed herein is a reference sequence comprising the nucleotide sequence ATACACAAGAAAAGG (SEQ ID NO: 17), corresponding to nucleotides 1957 to 1971 of the Na.sub.v1.7 gene sequence. This reference sequence has an "A" at position 1964, which is the most commonly found nucleotide at this position. Therefore, one of skill in the art can compare this reference sequence to a test sequence and determine if the most commonly found nucleotide (A) is present at position 1964 of the test sequence or if another nucleotide (G) is present at position 1964 of the test sequence. Also provided are nucleotide sequences corresponding to any fragment of SEQ ID NO: 17 that includes the A at position 1964 or the corresponding sequence with a G at position 1964.

[0076] In yet another example, disclosed herein is a reference sequence comprising the nucleotide sequence TCTTGCAATTACCAT (SEQ ID NO: 18), corresponding to nucleotides 2208 to 2222 of the Na.sub.v1.7 gene sequence. This reference sequence has an "A" at position 2215, which is the most commonly found nucleotide at this position. Therefore, one of skill in the art can compare this reference sequence to a test sequence and determine if the most commonly found nucleotide (A) is present at position 2215 of the test sequence or if another nucleotide (G) is present at position 2215 of the test sequence. Also provided are nucleotide sequences corresponding to any fragment of SEQ ID NO: 18 that includes the A at position 2215 or the corresponding sequence with a G at position 2215.

[0077] In still another example, disclosed herein is a reference sequence comprising the nucleotide sequence ACCCTTTGCCTGGAG (SEQ ID NO: 19), corresponding to nucleotides 3362 to 3376 of the Na.sub.v1.7 gene sequence. This reference sequence has a "G" at position 3369, which is the most commonly found nucleotide at this position. Therefore, one of skill in the art can compare this reference sequence to a test sequence and determine if the most commonly found nucleotide (G) is present at position 3369 of the test sequence or if another nucleotide (T) is present at position 3369 of the test sequence. Also provided are nucleotide sequences corresponding to any fragment of SEQ ID NO: 19 that includes the G at position 3369 or the corresponding sequence with a T at position 3369.
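The comparison of a reference fragment to a test sequence described in the preceding paragraphs can be sketched as a simple lookup. The following is a minimal illustration, assuming that the test fragment has already been aligned to the same 15-nucleotide window so that the site of interest is the central (8th) nucleotide; the names REFERENCE_FRAGMENTS and call_site are hypothetical.

```python
# Illustrative sketch: compare an aligned 15-nucleotide test fragment with the
# reference fragments (SEQ ID NOs: 14-19) and report the base at the central site.

REFERENCE_FRAGMENTS = {
    184: "GCCCTTCATCTATGG",   # SEQ ID NO: 14
    446: "AACCCGCCGGACTGG",   # SEQ ID NO: 15
    1921: "GCTCCCCAATGGACA",  # SEQ ID NO: 16
    1964: "ATACACAAGAAAAGG",  # SEQ ID NO: 17
    2215: "TCTTGCAATTACCAT",  # SEQ ID NO: 18
    3369: "ACCCTTTGCCTGGAG",  # SEQ ID NO: 19
}
CENTER = 7  # 0-based index of the 8th nucleotide of each 15-mer

def call_site(position, test_fragment):
    """Return (base at the site, True if it is the most commonly found base)."""
    ref = REFERENCE_FRAGMENTS[position]
    test = test_fragment.upper()
    if len(test) != len(ref):
        raise ValueError("test fragment must span the same 15-nucleotide window")
    # Require the flanking sequence to match so that the window is the right one.
    if test[:CENTER] != ref[:CENTER] or test[CENTER + 1:] != ref[CENTER + 1:]:
        raise ValueError("flanking sequence does not match the reference")
    base = test[CENTER]
    return base, base == ref[CENTER]

print(call_site(184, "GCCCTTCGTCTATGG"))   # G at position 184
print(call_site(3369, "ACCCTTTGCCTGGAG"))  # reference base at position 3369
```

A practical implementation would first align sequencing reads or amplicons to the full-length reference; the exact-match check on the flanks is a simplification made for this sketch.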

Probes and Primers

[0078] Nucleic acids of interest comprising the mutations provided herein can be utilized as probes or primers. The complementary sequences of the Na.sub.v1.7 nucleic acid sequences disclosed herein are also provided. For the most part, the nucleic acid fragments will be of at least about 15 nucleotides, usually at least about 20 nucleotides, often at least about 50 nucleotides. Such fragments are useful as primers for PCR, hybridization screening, etc. Larger nucleic acid fragments, for example, greater than about 100 nucleotides are useful for production of promoter fragments, motifs, etc. For use in amplification reactions, such as PCR, a pair of primers will be used. The exact composition of primer sequences is not critical to the invention, but for most applications the primers will hybridize to the subject sequence under stringent conditions, as known in the art.

[0079] "Probes," as used herein, are molecules capable of interacting with a target nucleic acid, typically in a sequence specific manner, for example, through hybridization. The hybridization of nucleic acids is well understood in the art and is discussed herein. Typically, a probe can be made from any combination of nucleotides or nucleotide derivatives or analogs available in the art.

[0080] By "hybridizing under stringent conditions" or "hybridizing under highly stringent conditions" is meant that the hybridizing portion of the hybridizing nucleic acid, typically comprising at least 15 (e.g., 20, 25, 30, or 50 nucleotides), hybridizes to all or a portion of the provided nucleotide sequence under stringent conditions. The term "hybridization" typically means a sequence driven interaction between at least two nucleic acid molecules, such as a primer or a probe and a gene. Sequence driven interaction means an interaction that occurs between two nucleotides or nucleotide analogs or nucleotide derivatives in a nucleotide specific manner. For example, G interacting with C or A interacting with T are sequence driven interactions. Typically sequence driven interactions occur on the Watson-Crick face or Hoogsteen face of the nucleotide. The hybridization of two nucleic acids is affected by a number of conditions and parameters known to those of skill in the art. For example, the salt concentrations, pH, and temperature of the reaction all affect whether two nucleic acid molecules will hybridize. Generally, the hybridizing portion of the hybridizing nucleic acid is at least 80%, for example, at least 90%, 95%, or 98%, identical to the sequence of or a portion of the Na.sub.v1.7 nucleic acid of the invention, or its complement. Hybridizing nucleic acids of the invention can be used, for example, as a cloning probe, a primer (e.g., for PCR), a diagnostic probe, or an antisense probe. Hybridization of the oligonucleotide probe to a nucleic acid sample typically is performed under stringent conditions. Nucleic acid duplex or hybrid stability is expressed as the melting temperature or T.sub.m, which is the temperature at which a probe dissociates from a target DNA. This melting temperature is used to define the required stringency conditions. 
If sequences are to be identified that are related and substantially identical to the probe, rather than identical, then it is useful to first establish the lowest temperature at which only homologous hybridization occurs with a particular concentration of salt (e.g., SSC or SSPE). Assuming that a 1% mismatch results in a 1.degree. C. decrease in the T.sub.m, the temperature of the final wash in the hybridization reaction is reduced accordingly (for example, if sequences having >95% identity with the probe are sought, the final wash temperature is decreased by 5.degree. C.). In practice, the change in T.sub.m can be between 0.5.degree. C. and 1.5.degree. C. per 1% mismatch. Stringent conditions involve hybridizing at 68.degree. C. in 5.times.SSC/5.times. Denhardt's solution/1.0% SDS, and washing in 0.2.times.SSC/0.1% SDS at room temperature. Moderately stringent conditions include washing in 3.times.SSC at 42.degree. C. The parameters of salt concentration and temperature can be varied to achieve the optimal level of identity between the probe and the target nucleic acid. Additional guidance regarding such conditions is readily available in the art, for example, in Sambrook, et al., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, New York, N.Y. (1989); and Ausubel, et al., eds., Current Protocols in Molecular Biology, John Wiley & Sons, New York, N.Y., at Unit 2.10 (1995).
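
By way of non-limiting illustration, the wash-temperature adjustment described above can be sketched in Python as follows. The function name, its parameter names, and the starting temperature of 65.degree. C. are assumptions introduced only for this example; the starting temperature would in practice be established empirically for the chosen salt concentration.

```python
def final_wash_temperature(homologous_temp_c, min_identity_pct,
                           drop_per_pct_mismatch=1.0):
    """Lower the final wash temperature to tolerate a given level of mismatch.

    homologous_temp_c: empirically established lowest temperature (deg C) at
        which only homologous hybridization occurs at the chosen salt
        concentration.
    min_identity_pct: lowest percent identity to the probe that is sought.
    drop_per_pct_mismatch: assumed melting-temperature decrease (deg C) per
        1% mismatch; the text gives 1.0 as the working assumption and
        0.5 to 1.5 as the practical range.
    """
    if not 0.0 <= min_identity_pct <= 100.0:
        raise ValueError("min_identity_pct must be between 0 and 100")
    mismatch_pct = 100.0 - min_identity_pct
    return homologous_temp_c - mismatch_pct * drop_per_pct_mismatch


# Hypothetical starting temperature, for illustration only.
start_temp = 65.0
print(final_wash_temperature(start_temp, 95.0))       # 1 deg C per 1% mismatch
print(final_wash_temperature(start_temp, 95.0, 0.5))  # low end of the range
print(final_wash_temperature(start_temp, 95.0, 1.5))  # high end of the range
```

With the default of 1.degree. C. per 1% mismatch, seeking sequences of at least 95% identity lowers the final wash temperature by 5.degree. C., matching the example in the text.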

[0081] Synthetic analogs of nucleic acids may be preferred for use as probes because of superior stability under assay conditions. Modifications in the native structure, including alterations in the backbone, sugars or heterocyclic bases, have been shown to increase intracellular stability and binding affinity. Among useful changes in the backbone chemistry are phosphorothioates; phosphorodithioates, where both of the non-bridging oxygens are substituted with sulfur; phosphoroamidites; alkyl phosphotriesters and boranophosphates. Achiral phosphate derivatives include 3'-O-5'-S-phosphorothioate, 3'-S-5'-O-phosphorothioate, 3'-CH.sub.2-5'-O-phosphonate and 3'-NH-5'-O-phosphoroamidate. Peptide nucleic acids replace the entire ribose phosphodiester backbone with a peptide linkage.

[0082] Sugar modifications are also used to enhance stability and affinity. The alpha-anomer of deoxyribose may be used, where the base is inverted with respect to the natural beta-anomer. The 2'-OH of the ribose sugar may be altered to form 2'-O-methyl or 2'-O-allyl sugars, which provides resistance to degradation without compromising affinity.

[0083] Modification of the heterocyclic bases must maintain proper base pairing. Some useful substitutions include deoxyuridine for deoxythymidine; 5-methyl-2'-deoxycytidine and 5-bromo-2'-deoxycytidine for deoxycytidine. 5-propynyl-2'-deoxyuridine and 5-propynyl-2'-deoxycytidine have been shown to increase affinity and biological activity when substituted for deoxythymidine and deoxycytidine, respectively.

[0084] In one aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid comprising a nucleotide sequence encoding an amino acid sequence of a mutated Na.sub.v1.7 sodium channel alpha-subunit but not to a nucleic acid sequence that encodes the amino acid sequence of the wild-type Na.sub.v1.7 sodium channel alpha-subunit. For example, in one aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 2 but not to the nucleic acid sequence that encodes SEQ ID NO: 38. In another aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 3 but not to the nucleic acid sequence that encodes SEQ ID NO: 38. In yet another aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 4 but not to the nucleic acid sequence that encodes SEQ ID NO: 38. In a further aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 5 but not to the nucleic acid sequence that encodes SEQ ID NO: 38. In a still further aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 6 but not to the nucleic acid sequence that encodes SEQ ID NO: 38.
In one aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 7 but not to the nucleic acid sequence that encodes SEQ ID NO: 38.

[0085] In another aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a mutated Na.sub.v1.7 nucleic acid sequence or a fragment thereof but not to a wild-type Na.sub.v1.7 nucleic acid sequence. For example, in one aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid sequence of SEQ ID NO: 8, or a fragment thereof, such as SEQ ID NO: 14, but not to the nucleic acid sequence of SEQ ID NO: 1. In another aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid of SEQ ID NO: 9, or a fragment thereof, such as SEQ ID NO: 15, but not to the nucleic acid sequence of SEQ ID NO: 1. In yet another aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid sequence of SEQ ID NO: 10, or a fragment thereof, such as SEQ ID NO: 16, but not to the nucleic acid sequence of SEQ ID NO: 1. In a further aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid sequence of SEQ ID NO: 11, or a fragment thereof, such as SEQ ID NO: 17, but not to the nucleic acid sequence of SEQ ID NO: 1. In a still further aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid sequence of SEQ ID NO: 12, or a fragment thereof, such as SEQ ID NO: 18, but not to the nucleic acid sequence of SEQ ID NO: 1. In one aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid sequence of SEQ ID NO: 13, or a fragment thereof, such as SEQ ID NO: 19, but not to the nucleic acid sequence of SEQ ID NO: 1.

[0086] In yet another aspect, disclosed herein are isolated nucleic acids encoding mutant Na.sub.v1.7 sodium channels comprising a sequence that hybridizes under stringent conditions to a mutated Na.sub.v1.7 nucleic acid comprising a nucleotide sequence encoding an amino acid sequence of a sodium channel alpha-subunit but not to a wild-type Na.sub.v1.7 nucleic acid sequence. For example, in one aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 2 but not to the nucleic acid sequence of SEQ ID NO: 1. In another aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 3 but not to the nucleic acid sequence of SEQ ID NO: 1. In yet another aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 4 but not to the nucleic acid sequence of SEQ ID NO: 1. In a further aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 5 but not to the nucleic acid sequence of SEQ ID NO: 1. In a still further aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 6 but not to the nucleic acid sequence of SEQ ID NO: 1.
In one aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 7 but not to the nucleic acid sequence of SEQ ID NO: 1.

[0087] In a further aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a mutated Na.sub.v1.7 nucleic acid sequence or a fragment thereof but not to the nucleic acid sequence that encodes SEQ ID NO: 38. For example, in one aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid sequence of SEQ ID NO: 8, or a fragment thereof, such as SEQ ID NO: 14, but not to the nucleic acid sequence that encodes SEQ ID NO: 38. In another aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid of SEQ ID NO: 9, or a fragment thereof, such as SEQ ID NO: 15, but not to the nucleic acid sequence that encodes SEQ ID NO: 38. In yet another aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid sequence of SEQ ID NO: 10, or a fragment thereof, such as SEQ ID NO: 16, but not to the nucleic acid sequence that encodes SEQ ID NO: 38. In a further aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid sequence of SEQ ID NO: 11, or a fragment thereof, such as SEQ ID NO: 17, but not to the nucleic acid sequence that encodes SEQ ID NO: 38. In a still further aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid sequence of SEQ ID NO: 12, or a fragment thereof, such as SEQ ID NO: 18, but not to the nucleic acid sequence that encodes SEQ ID NO: 38. In one aspect, disclosed herein are isolated nucleic acids comprising a sequence that hybridizes under stringent conditions to a nucleic acid sequence of SEQ ID NO: 13, or a fragment thereof, such as SEQ ID NO: 19, but not to the nucleic acid sequence that encodes SEQ ID NO: 38.

Arrays

[0088] The disclosed materials, compounds, and methods also provide an array of oligonucleotides for identification of mutations, where discrete positions on the array are complementary to one or more of the provided mutated sequences, e.g., oligonucleotides of at least 12 nucleotides, frequently 15 nucleotides, 20 nucleotides, or larger, and including the sequence flanking the mutated position. Such an array may comprise a series of oligonucleotides, each of which can specifically hybridize to a different mutation of the disclosed compositions. An array may comprise all or a subset of nucleic acid sequences having SEQ ID NOs: 8, 9, 10, 11, 12, and/or 13, or any fragment of at least 15 contiguous nucleotides thereof, for example SEQ ID NOs: 14, 15, 16, 17, 18, and/or 19. Usually such an array will include at least 2 different mutated sequences, i.e., mutations located at unique positions within the locus, and may include all of the provided mutations. The array can also include wild-type sequences comprising the most commonly found alleles. The array can optionally comprise the most commonly found allele at a first, second, third, fourth, fifth, or more positions as well as other nucleotides at each of these positions. Each oligonucleotide sequence on the array will usually be at least about 12 nucleotides in length (i.e., 10-15 nucleotides), may be the length of the provided mutated sequences, or may extend into the flanking regions to generate fragments of 100 to 200 nucleotides in length. For examples of arrays, see Ramsay, (1998) Nat Biotech 16:40-44; Hacia, et al. (1996) Nature Genetics 14:441-447; Lockhart, et al. (1996) Nature Biotechnol 14:1675-1680; and DeRisi, et al. (1996) Nature Genetics 14:457-460, which are incorporated by reference in their entirety for the methods of making and using arrays.
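
As a hedged sketch of how an array oligonucleotide that includes the sequence flanking a mutated position might be selected, the following Python function extracts a window of a chosen length around a 1-based position. The function name and the toy input sequence are hypothetical and are not any of the disclosed SEQ ID NOs; the minimum length of 12 follows the text above.

```python
def flanking_oligo(sequence, position, length=15):
    """Return (start, oligo) for a window that contains `position`.

    sequence: nucleotide sequence as a string.
    position: 1-based position of the mutated nucleotide.
    length:   oligonucleotide length (at least 12, per the text).
    The window is centered on the position where possible and shifted
    inward when the position lies near either end of the sequence.
    `start` is the 1-based position of the first nucleotide of the oligo.
    """
    if length < 12:
        raise ValueError("oligonucleotides should be at least 12 nucleotides")
    if length > len(sequence):
        raise ValueError("sequence is shorter than the requested oligo")
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    index = position - 1
    start = index - length // 2
    start = max(0, min(start, len(sequence) - length))
    return start + 1, sequence[start:start + length]


# Toy sequence for illustration only.
toy = "ACGT" * 10
start, oligo = flanking_oligo(toy, 20, length=15)
print(start, oligo)
```

Longer fragments that extend into the flanking regions can be obtained by passing a larger `length`.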

Haplotyping

[0089] In another aspect, the disclosed materials, compositions, articles, devices, and methods relate to a method for determining a Na.sub.v1.7 haplotype in a human subject, wherein the method comprises identifying one or more nucleotides encoding amino acid residues 62, 149, 641, 655, 739, 1123, or any combination thereof, wherein the nucleotide or nucleotides indicate the haplotype. The disclosed subject matter also provides a method for determining a Na.sub.v1.7 haplotype in a human subject comprising identifying one or more nucleotides present at one or more of sites 184, 446, 1921, 1964, 2215, or 3369, in either or both copies of the Na.sub.v1.7 gene contained in the subject genomic nucleic acid, wherein the nucleotide present at the mutated site or sites indicates the Na.sub.v1.7 haplotype. It will be recognized by one of skill in the art that numerous haplotypes are possible.

[0090] For example, one of skill in the art could identify the nucleotide present in either or both copies of the Na.sub.v1.7 gene contained in the subject genomic nucleic acid that encodes amino acid 62 in the Na.sub.v1.7 sodium channel alpha-subunit sequence. The haplotypes for this particular analysis can be I62V, P149Q, N641Y, K655R, I739V, L1123F, or any combination thereof, where the number indicates a position in the Na.sub.v1.7 sodium channel alpha-subunit, the first letter represents the most common amino acid found at that position, and the last letter represents the amino acid found in the haplotype. Similarly, one of skill in the art could identify the nucleotide in a Na.sub.v1.7 nucleic acid sequence at position 184, 446, 1921, 1964, 2215, and/or 3369, and determine the Na.sub.v1.7 haplotype. Therefore, any of positions 184, 446, 1921, 1964, 2215, and/or 3369 in the nucleic acid sequence or positions 62, 149, 641, 655, 739, and/or 1123 in the encoded amino acid sequence can be analyzed individually or in combination to obtain the haplotypes of the disclosed subject matter.
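
The correspondence between the nucleotide positions and the amino acid residues listed above can be checked with a short calculation. The following Python sketch assumes that the nucleotide positions are numbered from 1 beginning at the first nucleotide of the coding sequence; that numbering convention, and the function names, are assumptions made for this illustration.

```python
# Nucleotide positions and amino acid residues as listed in the text.
NUCLEOTIDE_POSITIONS = (184, 446, 1921, 1964, 2215, 3369)
RESIDUES = (62, 149, 641, 655, 739, 1123)


def residue_number(nt_position):
    """Residue whose codon contains the given 1-based coding position."""
    return (nt_position - 1) // 3 + 1


def codon_offset(nt_position):
    """Position (1, 2, or 3) of the nucleotide within its codon."""
    return (nt_position - 1) % 3 + 1


for nt in NUCLEOTIDE_POSITIONS:
    print(nt, residue_number(nt), codon_offset(nt))
```

Under the stated numbering assumption, each listed nucleotide position falls within the codon for the correspondingly listed residue.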

Determining a Predisposition

[0091] Disclosed herein is a method for determining a subject's predisposition to a neurologic disorder associated with a sodium channel mutation, comprising comparing the subject's Na.sub.v1.7 haplotype with one or more reference haplotypes that correlate with the neurologic disorder, wherein a similarity between the subject's Na.sub.v1.7 haplotype and the reference haplotype or haplotypes indicates a predisposition to the neurologic disorder.

[0092] As used herein, "neurologic disorder associated with a sodium channel mutation" includes, but is not limited to, seizure disorders (e.g., febrile seizures, nonfebrile seizures, and epileptic seizures). As used herein, "epileptic seizures" includes, but is not limited to, partial (e.g., simple and complex) and generalized (e.g., absence, myoclonic, and tonic-clonic) seizures, temporal lobe epilepsy, and severe myoclonic epilepsy of infancy.

[0093] Each haplotype can be correlated with specific neurologic disorders or severity of such disorders to generate a database of reference haplotypes, such that one of skill in the art can compare a subject's haplotype to a reference haplotype or haplotypes and determine whether the subject is at risk for a neurologic disorder.

[0094] The reference haplotype can comprise nucleotides that encode one or more mutations in the Na.sub.v1.7 sodium channel alpha-subunit. For example, the reference haplotype can comprise nucleotides that encode one or more mutations at residue 62, residue 149, residue 641, residue 655, residue 739, or residue 1123 of the encoded amino acid sequence of Na.sub.v1.7.
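
One possible way to organize a database of reference haplotypes and to compare a subject's haplotype against it is sketched below in Python. The database contents and annotation labels are placeholders invented for this illustration and carry no clinical meaning; treating a reference as matched when all of its variants are present in the subject is likewise only one assumed interpretation of a "similar" haplotype.

```python
# Placeholder reference database: each key is a set of variants and each
# value is a placeholder annotation. These entries are hypothetical.
REFERENCE_HAPLOTYPES = {
    frozenset({"I62V"}): "placeholder annotation A",
    frozenset({"N641Y", "K655R"}): "placeholder annotation B",
}


def matching_references(subject_variants, references):
    """Return annotations of references fully contained in the subject.

    subject_variants: iterable of variant names such as "I62V".
    references: mapping of frozenset of variant names to an annotation.
    """
    subject = frozenset(subject_variants)
    return sorted(
        annotation
        for variants, annotation in references.items()
        if variants <= subject  # every reference variant is present
    )


print(matching_references({"I62V", "P149Q"}, REFERENCE_HAPLOTYPES))
```

Additional fields, such as the demographic parameters by which haplotype information may be classified, could be stored alongside each annotation.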

[0095] Since subjects will vary depending on numerous parameters including, but not limited to, race, age, weight, medical history etc., as more information is gathered on populations, the database can contain haplotype information classified by race, age, weight, medical history etc., such that one of skill in the art can assess the subject's risk of developing neurologic disorders based on information more closely associated with the subject's demographic profile. Where there is a differential distribution of a mutation by racial background or another parameter, guidelines for drug administration can be generally tailored to a particular group.

[0096] It will be appreciated by those skilled in the art that the nucleic acids provided herein as well as the nucleic acid and amino acid sequences identified from subjects can be stored, recorded, and manipulated on any medium which can be read and accessed by a computer. As used herein, the words "recorded" and "stored" refer to a process for storing information on a computer medium. A skilled artisan can readily adopt any of the presently known methods for recording information on a computer readable medium to generate a list of sequences comprising one or more of the nucleic acids of the invention. Another aspect of the present invention is a computer readable medium having recorded thereon at least 2, 5, 10, 15, 20, 25, 30, 50, 100, 200, 250, 300, 400, 500, 1000, 2000, 3000, 4000 or 5000 nucleic acids of the invention or nucleic acid sequences identified from subjects.

[0097] Computer readable media include magnetically readable media, optically readable media, electronically readable media and magnetic/optical media. For example, the computer readable media may be a hard disc, a floppy disc, a magnetic tape, CD-ROM, DVD, RAM, or ROM as well as other types of media known to those skilled in the art.
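
As one illustrative way of recording a list of sequences on a computer readable medium, the following Python sketch writes named sequences to a plain ASCII file in the widely used FASTA layout and reads them back. The choice of FASTA, the function names, the record names, and the toy sequences are assumptions made for this example and are not prescribed by the disclosure.

```python
import os
import tempfile


def write_fasta(records, path, width=60):
    """Write (name, sequence) pairs to an ASCII FASTA file."""
    with open(path, "w", encoding="ascii") as handle:
        for name, sequence in records:
            handle.write(f">{name}\n")
            # Wrap each sequence at a fixed line width.
            for i in range(0, len(sequence), width):
                handle.write(sequence[i:i + width] + "\n")


def read_fasta(path):
    """Read a FASTA file back into a list of (name, sequence) pairs."""
    records, name, chunks = [], None, []
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:], []
            else:
                chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


# Toy records for illustration only.
toy_records = [("toy_reference", "ACGT" * 20), ("toy_subject", "ACGA" * 20)]
with tempfile.TemporaryDirectory() as folder:
    fasta_path = os.path.join(folder, "sequences.fasta")
    write_fasta(toy_records, fasta_path)
    print(read_fasta(fasta_path) == toy_records)
```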

[0098] Embodiments of the present invention include systems, particularly computer systems which contain the sequence information described herein. As used herein, "a computer system" refers to the hardware components, software components, and data storage components used to store and/or analyze the nucleotide sequences of the present invention or other sequences. The computer system preferably includes the computer readable media described above, and a processor for accessing and manipulating the sequence data.

[0099] Preferably, the computer is a general purpose system that comprises a central processing unit (CPU), one or more data storage components for storing data, and one or more data retrieving devices for retrieving the data stored on the data storage components. A skilled artisan can readily appreciate that any one of the currently available computer systems is suitable.

[0100] In one particular aspect, the computer system includes a processor connected to a bus which is connected to a main memory, preferably implemented as RAM, and one or more data storage devices, such as a hard drive and/or other computer readable media having data recorded thereon. In some embodiments, the computer system further includes one or more data retrieving devices for reading the data stored on the data storage components. The data retrieving device may represent, for example, a floppy disk drive, a compact disk drive, a magnetic tape drive, a hard disk drive, a CD-ROM drive, a DVD drive, etc. In some embodiments, the data storage component is a removable computer-readable medium such as a floppy disk, a compact disk, a magnetic tape, etc. containing control logic and/or data recorded thereon. The computer system may advantageously include or be programmed by appropriate software for reading the control logic and/or the data from the data storage component once inserted in the data retrieving device. Software for accessing and processing the nucleotide sequences of the nucleic acids of the invention (such as search tools, compare tools, modeling tools, etc.) may reside in main memory during execution.

[0101] In some aspects, the computer system may further comprise a sequence comparer for comparing the nucleic acid sequences stored on a computer readable medium to another test sequence stored on a computer readable medium. A "sequence comparer" refers to one or more programs which are implemented on the computer system to compare a nucleotide sequence with other nucleotide sequences.

[0102] Accordingly, one aspect of the present invention is a computer system comprising a processor, a data storage device having stored thereon a nucleic acid of the invention, a data storage device having retrievably stored thereon reference nucleotide sequences to be compared with test or sample sequences and a sequence comparer for conducting the comparison. The sequence comparer may indicate a homology level between the sequences compared or identify a difference between the two sequences. For example, a reference sequence comprising SEQ ID NO: 8 or any fragment thereof, such as SEQ ID NO: 14, can be compared with a test sequence from a subject to determine if the test sequence is the same as the reference sequence, e.g., contains an A at position 184 or a different nucleotide (G).
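
A minimal sketch of a sequence comparer of the kind described above is given below in Python. It reports a simple percent identity between two pre-aligned sequences of equal length and returns the nucleotide at a given 1-based position. The function names and the toy sequences are hypothetical, and percent identity over equal-length sequences is only one assumed way of expressing a homology level.

```python
def percent_identity(reference, test):
    """Percent of positions at which two equal-length sequences agree."""
    if not reference or len(reference) != len(test):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(r == t for r, t in zip(reference.upper(), test.upper()))
    return 100.0 * matches / len(reference)


def base_at(sequence, position):
    """Nucleotide at a 1-based position in the sequence."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    return sequence[position - 1].upper()


# Toy sequences for illustration only.
toy_reference = "ACGTACGTAC"
toy_test = "ACGTACGTAA"
print(percent_identity(toy_reference, toy_test))
print(base_at(toy_reference, 10), base_at(toy_test, 10))
```

For a full-length comparison, `base_at` could be called with a position of interest, such as 184, on both the reference and the test sequence.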

[0103] Alternatively, the computer program may be a computer program which compares a test nucleotide sequence(s) from a subject or a plurality of subjects to a reference nucleotide sequence(s) in order to determine whether the test nucleotide sequence(s) differs from or is the same as a reference nucleic acid sequence(s) at one or more positions. Optionally such a program records the length and identity of inserted, deleted or substituted nucleotides with respect to the sequence of either the reference polynucleotide or the test nucleotide sequence. In one embodiment, the computer program may be a program which determines whether the test nucleotide sequence contains one or more single nucleotide mutations with respect to a reference nucleotide sequence. These single nucleotide mutations may each comprise a single base substitution, insertion, or deletion.
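
A program that records inserted, deleted, or substituted nucleotides can be sketched with the `difflib.SequenceMatcher` class of the Python standard library, as shown below. This general-purpose matcher is used here only for illustration in place of a dedicated alignment tool, and the function name and toy sequences are hypothetical.

```python
from difflib import SequenceMatcher

# Map difflib operation tags to the terms used in the text.
KINDS = {"replace": "substitution", "delete": "deletion", "insert": "insertion"}


def describe_differences(reference, test):
    """List differences of `test` relative to `reference`.

    Each entry is (kind, position, reference_bases, test_bases), where
    position is 1-based in the reference and the base strings record the
    identity (and hence the length) of the affected nucleotides.
    """
    matcher = SequenceMatcher(None, reference, test, autojunk=False)
    differences = []
    for tag, r1, r2, t1, t2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        differences.append((KINDS[tag], r1 + 1, reference[r1:r2], test[t1:t2]))
    return differences


# Toy sequences for illustration only.
print(describe_differences("ACGTTGCA", "ACGATGCA"))
print(describe_differences("ACGTAGCA", "ACGTGCA"))
```

For repetitive sequences the matcher may place an insertion or deletion at any of several equivalent positions, so a dedicated alignment program would generally be preferred for real data.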

[0104] Accordingly, another aspect of the materials, compounds, articles, devices, and methods disclosed herein is a method for determining whether a test nucleotide sequence differs at one or more nucleotides from a reference nucleotide sequence comprising the steps of reading the test nucleotide sequence and the reference nucleotide sequence through use of a computer program which identifies differences between nucleic acid sequences and identifying differences between the test nucleotide sequence and the reference nucleotide sequence with the computer program. The computer program can be a program which identifies single nucleotide polymorphisms. The method may be implemented by the computer systems described above. The method may also be performed by reading at least 2, 5, 10, 15, 20, 25, 30, 50, 100, or more test nucleotide sequences and the reference nucleotide sequences through the use of the computer program and identifying differences between the test nucleotide sequences and the reference nucleotide sequences with the computer program. A computer program that identifies single nucleotide mutations in a Na.sub.v1.7 gene sequence and determines a subject's haplotype is also contemplated by the subject matter disclosed herein. The subject matter disclosed herein also provides for a computer program that correlates haplotypes with Na.sub.v1.7 levels such that one of skill in the art can assess a subject's risk of developing a neurologic disorder, such as febrile seizures, nonfebrile seizures, or epileptic seizures. The computer program can optionally include treatment options or drug indications for subjects with haplotypes associated with increased risk of seizures.
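
The following Python sketch illustrates, under stated assumptions, how a program might identify single nucleotide mutations at residues of interest and report them in the notation used herein (for example, I62V). It assumes that the reference and test coding sequences are in the same reading frame with no insertions or deletions, and it uses the standard genetic code. The function names and the short toy sequences are hypothetical and are not the disclosed sequences.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def call_variant(reference_cds, test_cds, residue):
    """Return e.g. 'I62V' if the residue differs, otherwise None."""
    start = 3 * (residue - 1)
    ref_codon = reference_cds[start:start + 3].upper()
    test_codon = test_cds[start:start + 3].upper()
    if len(ref_codon) < 3 or len(test_codon) < 3:
        raise ValueError("residue lies outside one of the sequences")
    ref_aa, test_aa = CODON_TABLE[ref_codon], CODON_TABLE[test_codon]
    if ref_aa == test_aa:
        return None
    return f"{ref_aa}{residue}{test_aa}"


def haplotype(reference_cds, test_cds, residues):
    """Collect the amino acid changes found at the residues of interest."""
    calls = (call_variant(reference_cds, test_cds, r) for r in residues)
    return [call for call in calls if call is not None]


# Toy coding sequences (four codons each), for illustration only.
toy_reference = "ATGATTCCAAAT"
toy_test = "ATGGTTCCATAT"
print(haplotype(toy_reference, toy_test, [2, 3, 4]))
```

For full-length sequences, the residues of interest would be 62, 149, 641, 655, 739, and 1123.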

[0105] The nucleic acids of the invention (both test nucleic acid sequences and reference nucleic acid sequences) may be stored and manipulated in a variety of data processor programs in a variety of formats. For example, they may be stored as text in a word processing file, such as Microsoft WORD or WORDPERFECT or as an ASCII file in a variety of database programs familiar to those of skill in the art, such as DB2, SYBASE, or ORACLE. In addition, many computer programs and databases may be used as sequence comparers, identifiers, or sources of reference nucleotide sequences. The following list is intended not to limit the invention but to provide guidance to programs and databases which are useful with the nucleic acid sequences of the invention. The programs and databases which may be used include, but are not limited to: MacPattern (EMBL), DiscoveryBase (Molecular Applications Group), GeneMine (Molecular Applications Group), Look (Molecular Applications Group), MacLook (Molecular Applications Group), BLAST and BLAST2 (NCBI), BLASTN and BLASTX (Altschul, et al. (1990) J Mol Biol 215:403-410), FASTA (Pearson and Lipman, (1988) Proc Natl Acad Sci USA 85:2444-2448), FASTDB (Brutlag et al., (1990) Comput Appl Biosci 6:237-245), Catalyst (Molecular Simulations Inc.), Catalyst/SHAPE (Molecular Simulations Inc.), Cerius.sup.2.DBAccess (Molecular Simulations Inc.), HypoGen (Molecular Simulations Inc.), Insight II (Molecular Simulations Inc.), Discover (Molecular Simulations Inc.), CHARMm (Molecular Simulations Inc.), Felix (Molecular Simulations Inc.), DelPhi (Molecular Simulations Inc.), QuanteMM (Molecular Simulations Inc.), Homology (Molecular Simulations Inc.), Modeler (Molecular Simulations Inc.), ISIS (Molecular Simulations Inc.), Quanta/Protein Design (Molecular Simulations Inc.), WebLab (Molecular Simulations Inc.), WebLab Diversity Explorer (Molecular Simulations Inc.), Gene Explorer (Molecular Simulations Inc.), SeqFold (Molecular Simulations Inc.), the EMBL/Swissprotein database, the MDL Available Chemicals Directory database, the MDL Drug Data Report database, the Comprehensive Medicinal Chemistry database, Derwent's World Drug Index database, the BioByteMasterFile database, the Genbank database, and the Genseqn database. Many other programs and databases would be apparent.

Delivery of the Na.sub.v1.7 Nucleic Acid Sequence

[0106] Optionally, the nucleic acids described herein are delivered to various expression systems. There are a number of compositions and methods which can be used to deliver nucleic acids to cells, either in vitro or in vivo. These methods and compositions can largely be broken down into two classes: viral based delivery systems and non-viral based delivery systems. For example, the nucleic acids can be delivered through a number of direct delivery systems such as electroporation, lipofection, calcium phosphate precipitation, plasmids, viral vectors, viral nucleic acids, phage nucleic acids, phages, cosmids, or via transfer of genetic material in cells or carriers such as cationic liposomes. Appropriate means for transfection, including viral vectors, chemical transfectants, or physico-mechanical methods such as electroporation and direct diffusion of DNA, are described by, for example, Wolff, et al. (1990) Science 247:1465-1468; and Wolff, (1991) Nature 352:815-818. Such methods are well known in the art and readily adaptable for use with the compositions and methods described herein. In certain cases, the methods will be modified to specifically function with large DNA molecules.

[0107] Nucleic Acid Based Delivery Systems: Vectors

[0108] In one aspect, disclosed herein are expression vectors comprising a nucleic acid comprising a nucleotide sequence encoding an amino acid sequence of mutated Na.sub.v1.7 sodium channel alpha-subunit wherein the nucleotide sequence is operably linked to an expression control sequence. For example, in one aspect, disclosed herein are expression vectors comprising a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 2 operably linked to an expression control sequence. In another aspect, disclosed herein are expression vectors comprising a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 3 operably linked to an expression control sequence. In yet another aspect, disclosed herein are expression vectors comprising a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 4 operably linked to an expression control sequence. In a further aspect, disclosed herein are expression vectors comprising a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 5 operably linked to an expression control sequence. In a still further aspect, disclosed herein are expression vectors comprising a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 6 operably linked to an expression control sequence. In one aspect, disclosed herein are expression vectors comprising a nucleic acid comprising a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 7 operably linked to an expression control sequence.

[0109] Further provided are expression vectors comprising any fragment of the nucleic acid encoding SEQ ID NOs: 2-7. Such fragments preferably encode at least 5 contiguous amino acid sequences of SEQ ID NOs: 2-7.

[0110] Expression or transfer vectors can be any nucleotide construction used to deliver genes into cells (e.g., a plasmid), or as part of a general strategy to deliver genes, e.g., as part of recombinant retrovirus or adenovirus (Ram, et al. (1993) Cancer Res 53:83-88).

[0111] As used herein, plasmid or viral vectors are agents that transport the disclosed nucleic acids, such as SEQ ID NOs: 8, 9, 10, 11, 12, and/or 13, into the cell without degradation and include a promoter yielding expression of the gene in the cells into which it is delivered. Viral vectors are, for example, Adenovirus, Adeno-associated virus, Herpes virus, Vaccinia virus, Polio virus, AIDS virus, neuronal trophic virus, Sindbis and other RNA viruses, including those viruses with the HIV backbone. Also preferred are any viral families which share the properties of these viruses which make them suitable for use as vectors. Retroviruses include Moloney Murine Leukemia Virus (MMLV) and retroviruses that express the desirable properties of MMLV as a vector. Retroviral vectors are able to carry a larger genetic payload, i.e., a transgene or marker gene, than other viral vectors, and for this reason are a commonly used vector. However, they are not as useful in non-proliferating cells. Adenovirus vectors are relatively stable and easy to work with, have high titers, can be delivered in aerosol formulation, and can transfect non-dividing cells. Pox viral vectors are large and have several sites for inserting genes; they are thermostable and can be stored at room temperature. One specific aspect is a viral vector that has been engineered so as to suppress the immune response of the host organism elicited by the viral antigens. Vectors of this type will carry coding regions for Interleukin 8 or 10.

[0112] Viral vectors can have higher transduction abilities (i.e., ability to introduce genes) than chemical or physical methods of introducing genes into cells. Typically, viral vectors contain nonstructural early genes, structural late genes, an RNA polymerase III transcript, inverted terminal repeats necessary for replication and encapsidation, and promoters to control the transcription and replication of the viral genome. When engineered as vectors, viruses typically have one or more of the early genes removed and a gene or gene/promoter cassette is inserted into the viral genome in place of the removed viral DNA. Constructs of this type can carry up to about 8 kb of foreign genetic material. The necessary functions of the removed early genes are typically supplied by cell lines which have been engineered to express the gene products of the early genes in trans.

[0113] Retroviral Vectors

[0114] A retrovirus is an animal virus belonging to the virus family of Retroviridae, including any types, subfamilies, genus, or tropisms. Retroviral vectors, in general, are described by Verma, Retroviral vectors for gene transfer. In Microbiology-1985, American Society for Microbiology, pp. 229-232, Washington, (1985), which is incorporated by reference herein in its entirety for retroviral vectors and methods of making them. Examples of methods for using retroviral vectors for gene therapy are described in U.S. Pat. Nos. 4,868,116 and 4,980,286; PCT applications WO 90/02806 and WO 89/07136; and Mulligan, (Science 260:926-932 (1993)); the teachings of which are incorporated by reference herein in their entirety for retroviral vectors and methods of using them.

[0115] A retrovirus is essentially a package into which nucleic acid cargo has been packed. The nucleic acid cargo carries with it a packaging signal, which ensures that the replicated daughter molecules will be efficiently packaged within the package coat. In addition to the packaging signal, there are a number of molecules which are needed in cis for the replication and packaging of the replicated virus. Typically a retroviral genome contains the gag, pol, and env genes, which are involved in the making of the protein coat. It is the gag, pol, and env genes which are typically replaced by the foreign DNA that is to be transferred to the target cell. Retrovirus vectors typically contain a packaging signal for incorporation into the package coat, a sequence which signals the start of the gag transcription unit, elements necessary for reverse transcription, including a primer binding site to bind the tRNA primer of reverse transcription, terminal repeat sequences that guide the switch of RNA strands during DNA synthesis, a purine-rich sequence 5' to the 3' LTR that serves as the priming site for the synthesis of the second strand of DNA, and specific sequences near the ends of the LTRs that enable the DNA state of the retrovirus to insert into the host genome. The removal of the gag, pol, and env genes allows for about 8 kb of foreign sequence to be inserted into the viral genome, become reverse transcribed, and upon replication be packaged into a new retroviral particle. This amount of nucleic acid is sufficient for the delivery of one to many genes, depending on the size of each transcript. It is preferable to include either positive or negative selectable markers along with other genes in the insert.

[0116] Since the replication machinery and packaging proteins in most retroviral vectors have been removed (gag, pol, and env), the vectors are typically generated by placing them into a packaging cell line. A packaging cell line is a cell line which has been transfected or transformed with a retrovirus that contains the replication and packaging machinery, but lacks any packaging signal. When the vector carrying the DNA of choice is transfected into these cell lines, the vector containing the gene of interest is replicated and packaged into new retroviral particles by the machinery provided in trans by the helper cell. The genomes for the machinery are not packaged because they lack the necessary signals.

[0117] Adenoviral Vectors

[0118] The construction of replication-defective adenoviruses has been described (Berkner, et al. (1987) J Virology 61:1213-1220; Massie, et al. (1986) Mol Cell Biol 6:2872-2883; Haj-Ahmad, et al. (1986) J Virology 57:267-274; Davidson, et al. (1987) J Virology 61:1226-1239; Zhang, (1993) BioTechniques 15:868-872). The benefit of the use of these viruses as vectors is that they are limited in the extent to which they can spread to other cell types, since they can replicate within an initial infected cell, but are unable to form new infectious viral particles. Recombinant adenoviruses have been shown to achieve high efficiency gene transfer after direct, in vivo delivery to airway epithelium, hepatocytes, vascular endothelium, CNS parenchyma and a number of other tissue sites (Morsy, (1993) J Clin Invest 92:1580-1586; Kirshenbaum, (1993) J Clin Invest 92:381-387; Roessler, (1993) J Clin Invest 92:1085-1092; Moullier, (1993) Nature Genetics 4:154-159; La Salle, (1993) Science 259:988-990; Gomez-Foix, (1992) J Biol Chem 267:25129-25134; Rich, (1993) Human Gene Therapy 4:461-476; Zabner, (1994) Nature Genetics 6:75-83; Guzman, (1993) Circulation Res 73:1201-1207; Bout, (1994) Human Gene Therapy 5:3-10; Zabner, (1993) Cell 75:207-216; Caillaud, (1993) Eur. J. Neuroscience 5:1287-1291; Ragot, (1993) J Gen Virology 74:501-507). Recombinant adenoviruses achieve gene transduction by binding to specific cell surface receptors, after which the virus is internalized by receptor-mediated endocytosis, in the same manner as wild type or replication-defective adenovirus (Chardonnet, et al. (1970) Virology 40:462-477; Brown, et al. (1973) J Virology 12:386-396; Svensson, et al. (1985) J Virology 55:442-449; Seth, et al. (1985) J Virol 51:650-655; Seth, et al. (1984) Mol Cell Biol 4:1528-1533; Varga, et al. (1991) J Virology 65:6061-6070; Wickham, et al. (1993) Cell 73:309-319).

[0119] A viral vector can be one based on an adenovirus which has had the E1 gene removed, and these virions are generated in a cell line such as the human 293 cell line. In another aspect, both the E1 and E3 genes are removed from the adenovirus genome.

[0120] Adeno-Associated Viral Vectors

[0121] Another type of viral vector is based on an adeno-associated virus (AAV). This defective parvovirus is a preferred vector because it can infect many cell types and is nonpathogenic to humans. AAV type vectors can transport about 4 to 5 kb and wild type AAV is known to stably insert into chromosome 19. Vectors which contain this site specific integration property are preferred. An especially preferred embodiment of this type of vector is the P4.1 C vector produced by Avigen, San Francisco, Calif., which can contain the herpes simplex virus thymidine kinase gene, HSV-tk, and/or a marker gene, such as the gene encoding the green fluorescent protein, GFP.

[0122] In another type of AAV virus, the AAV contains a pair of inverted terminal repeats (ITRs) which flank at least one cassette containing a promoter which directs cell-specific expression operably linked to a heterologous gene. Heterologous in this context refers to any nucleotide sequence or gene which is not native to the AAV or B19 parvovirus.

[0123] Typically the AAV and B19 coding regions have been deleted, resulting in a safe, noncytotoxic vector. The AAV ITRs, or modifications thereof, confer infectivity and site-specific integration, but not cytotoxicity, and the promoter directs cell-specific expression. U.S. Pat. No. 6,261,834 is incorporated by reference herein in its entirety for material related to the AAV vector.

[0124] The disclosed vectors described throughout thus provide nucleic acids which are capable of integration into a mammalian chromosome without substantial toxicity. The vectors can also provide nucleic acids that can be expressed in oocytes (including, e.g., Xenopus oocytes).

[0125] The inserted genes in viral and retroviral systems usually contain promoters and/or enhancers to help control the expression of the desired gene product. A promoter is generally a sequence or sequences of nucleic acids that function when in a relatively fixed location in regard to the transcription start site. A promoter contains core elements required for basic interaction of RNA polymerase and transcription factors, and may contain upstream elements and response elements.

[0126] Large Payload Viral Vectors

[0127] Molecular genetic experiments with large human herpesviruses have provided a means whereby large heterologous DNA fragments can be cloned, propagated and established in cells permissive for infection with herpesviruses (Sun, et al. (1994) Nature Genetics 8:33-41; Cotter, et al. (1999) Curr Opin Mol Ther 5:633-644). These large DNA viruses (herpes simplex virus (HSV) and Epstein-Barr virus (EBV)) have the potential to deliver fragments of human heterologous DNA >150 kb to specific cells. EBV recombinants can maintain large pieces of DNA in the infected B-cells as episomal DNA. Individual clones carrying human genomic inserts of up to 330 kb appeared genetically stable. The maintenance of these episomes requires a specific EBV nuclear protein, EBNA1, constitutively expressed during infection with EBV. Additionally, these vectors can be used for transfection, where large amounts of protein can be generated transiently in vitro. Herpesvirus amplicon systems are also being used to package pieces of DNA >220 kb and to infect cells that can stably maintain DNA as episomes.

[0128] Other useful systems include, for example, replicating and host-restricted non-replicating vaccinia virus vectors.

[0129] Non-Nucleic Acid Based Systems

[0130] The disclosed compositions can also be delivered to the target cells in a variety of ways other than through nucleic acid based methods. For example, the compositions can be delivered through electroporation, or through lipofection, or through calcium phosphate precipitation. The delivery mechanism chosen will depend in part on the type of cell targeted and whether the delivery is occurring for example in vivo or in vitro.

[0131] Thus, the compositions can comprise, in addition to the disclosed mutant Na.sub.v1.7 nucleic acid sequences or vectors, for example, lipids such as liposomes, such as cationic liposomes (e.g., DOTMA, DOPE, DC cholesterol) or anionic liposomes. Liposomes can further comprise proteins to facilitate targeting a particular cell, if desired. A composition comprising a compound and a cationic liposome can be administered to the blood afferent to a target organ or inhaled into the respiratory tract to target cells of the respiratory tract. Regarding liposomes, see, e.g., Brigham, et al. (1989) Am J Resp Cell Mol Biol 1:95-100; Felgner, et al. (1987) Proc Natl Acad Sci USA 84:7413-7417; U.S. Pat. No. 4,897,355. Furthermore, the compound can be administered as a component of a microcapsule that can be targeted to specific cell types, such as macrophages, or where the diffusion of the compound or delivery of the compound from the microcapsule is designed for a specific rate or dosage.

[0132] In the methods described above which include the administration and uptake of exogenous DNA into the cells of a subject (i.e., gene transduction or transfection), delivery of the compositions to cells can be via a variety of mechanisms. As one example, delivery can be via a liposome, using commercially available liposome preparations such as LIPOFECTIN, LIPOFECTAMINE (GIBCO-BRL, Inc., Gaithersburg, Md.), SUPERFECT (Qiagen, Inc. Hilden, Germany) and TRANSFECTAM (Promega Biotec, Inc., Madison, Wis.), as well as other liposomes developed according to procedures standard in the art. In addition, the disclosed nucleic acid or vector can be delivered in vivo by electroporation, the technology for which is available from Genetronics, Inc. (San Diego, Calif.) as well as by means of a SONOPORATION machine (ImaRx Pharmaceutical Corp., Tucson, Ariz.).

[0133] The materials may be in solution or in suspension (for example, incorporated into microparticles, liposomes, or cells). These may be targeted to a particular cell type via antibodies, receptors, or receptor ligands. The following references are examples of the use of this technology to target specific proteins to tumor tissue (Senter, et al. (1991) Bioconjugate Chem. 2:447-451; Bagshawe, et al. (1998) Br J Cancer 60:275-281; Bagshawe, et al. (1988) Br J Cancer 58:700-703; Senter, et al. (1993) Bioconjugate Chem 4:3-9; Battelli, et al. (1992) Cancer Immunol Immunother 35:421-425; Pietersz, et al. (1992) Immunolog Rev 129:57-80; Roffler, et al. (1991) Biochem Pharmacol 42:2062-2065). These techniques can be used for a variety of other specific cell types. Suitable vehicles and approaches include "stealth" and other antibody conjugated liposomes (including lipid mediated drug targeting to colonic carcinoma), receptor mediated targeting of DNA through cell specific ligands, lymphocyte directed tumor targeting, and highly specific therapeutic retroviral targeting of murine glioma cells in vivo. The following references are further examples of the use of this technology to target specific proteins to tumor tissue (Hughes, et al. (1989) Cancer Res 49:6214-6220; Litzinger, et al. (1992) Biochimica et Biophysica Acta 1104:179-187). In general, receptors are involved in pathways of endocytosis, either constitutive or ligand induced. These receptors cluster in clathrin coated pits, enter the cell via clathrin coated vesicles, pass through an acidified endosome in which the receptors are sorted, and then either recycle to the cell surface, become stored intracellularly, or are degraded in lysosomes. The internalization pathways serve a variety of functions, such as nutrient uptake, removal of activated proteins, clearance of macromolecules, opportunistic entry of viruses and toxins, dissociation and degradation of ligand, and receptor level regulation. Many receptors follow more than one intracellular pathway, depending on the cell type, receptor concentration, type of ligand, ligand valency, and ligand concentration. Molecular and cellular mechanisms of receptor mediated endocytosis have been reviewed (see Brown, et al. (1991) DNA and Cell Biology 10:399-409).

[0134] Nucleic acids that are delivered to cells and are to be integrated into the host cell genome typically contain integration sequences. These sequences are often viral related sequences, particularly when viral based systems are used. These viral integration systems can also be incorporated into nucleic acids which are to be delivered using a non-nucleic acid based system of delivery, such as a liposome, so that the nucleic acid contained in the delivery system can become integrated into the host genome.

[0135] Other general techniques for integration into the host genome include, for example, systems designed to promote homologous recombination with the host genome. These systems typically rely on sequence flanking the nucleic acid to be expressed that has enough homology with a target sequence within the host cell genome that recombination between the vector nucleic acid and the target nucleic acid takes place, causing the delivered nucleic acid to be integrated into the host genome. These systems and the methods necessary to promote homologous recombination are known to those of skill in the art.

Expression

[0136] The nucleic acids that are delivered to cells typically contain expression controlling systems. For example, the inserted genes in viral and retroviral systems usually contain expression control sequences, i.e., promoters, and/or enhancers to help control the expression of the desired gene product. A promoter is generally a sequence or sequences of DNA that function when in a relatively fixed location in regard to the transcription start site. A promoter contains core elements required for basic interaction of RNA polymerase and transcription factors, and may contain upstream elements and response elements.

[0137] Viral Promoters and Enhancers

[0138] Preferred promoters controlling transcription from vectors in mammalian host cells may be obtained from various sources, for example, the genomes of viruses such as: polyoma, Simian Virus 40 (SV40), adenovirus, retroviruses, hepatitis B virus and most preferably cytomegalovirus, or from heterologous mammalian promoters, e.g., beta-actin promoter. The early and late promoters of the SV40 virus are conveniently obtained as an SV40 restriction fragment which also contains the SV40 viral origin of replication (Fiers, et al. (1978) Nature 273: 113). The immediate early promoter of the human cytomegalovirus is conveniently obtained as a HindIII E restriction fragment (see Greenway, et al. (1982) Gene 18:355-360). Of course, promoters from the host cell or related species also are useful herein.

[0139] Enhancer generally refers to a sequence of DNA that functions at no fixed distance from the transcription start site and can be either 5' (Laimins, et al. (1981) Proc Natl Acad Sci USA 78:993) or 3' (Lusky, et al. (1983) Mol Cell Bio 3:1108) to the transcription unit. Furthermore, enhancers can be within an intron (Banerji, et al. (1983) Cell 33:729) as well as within the coding sequence itself (Osborne, et al. (1984) Mol Cell Bio 4:1293). They are usually between 10 and 300 bp in length, and they function in cis. Enhancers function to increase transcription from nearby promoters. Enhancers also often contain response elements that mediate the regulation of transcription. Promoters can also contain response elements that mediate the regulation of transcription. Enhancers often determine the regulation of expression of a gene. While many enhancer sequences are now known from mammalian genes (e.g., globin, elastase, albumin, fetoprotein, and insulin), typically, one will use an enhancer from a eukaryotic cell virus for general expression. Preferred examples are the SV40 enhancer on the late side of the replication origin (bp 100-270), the cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.

[0140] The promoter and/or enhancer may be specifically activated either by light or by specific chemical events which trigger their function. Systems can be regulated by reagents such as tetracycline and dexamethasone. There are also ways to enhance viral vector gene expression by exposure to irradiation, such as gamma irradiation, or alkylating chemotherapy drugs.

[0141] In certain embodiments, the promoter and/or enhancer region can act as a constitutive promoter and/or enhancer to maximize expression of the region of the transcription unit to be transcribed. In certain constructs the promoter and/or enhancer region can be active in all eukaryotic cell types, even if it is only expressed in a particular type of cell at a particular time. A preferred promoter of this type is the CMV promoter (650 bases). Other preferred promoters are SV40 promoters, cytomegalovirus (full-length promoter), and retroviral vector LTR.

[0142] It has been shown that all specific regulatory elements can be cloned and used to construct expression vectors that are selectively expressed in specific cell types such as melanoma cells. For example, the glial fibrillary acidic protein (GFAP) promoter has been used to selectively express genes in cells of glial origin.

[0143] Expression vectors used in eukaryotic host cells (yeast, fungi, insect, plant, animal, human or nucleated cells) may also contain sequences necessary for the termination of transcription which may affect mRNA expression. These regions are transcribed as polyadenylated segments in the untranslated portion of the mRNA encoding tissue factor protein. The 3'-untranslated regions also include transcription termination sites. It is preferred that the transcription unit also contains a polyadenylation region. One benefit of this region is that it increases the likelihood that the transcribed unit will be processed and transported like mRNA. The identification and use of polyadenylation signals in expression constructs is well established. It is preferred that homologous polyadenylation signals be used in the transgene constructs. In certain transcription units, the polyadenylation region is derived from the SV40 early polyadenylation signal and consists of about 400 bases. It is also preferred that the transcribed units contain other standard sequences, alone or in combination with the above sequences, to improve expression from, or stability of, the construct.

[0144] Markers

[0145] The viral vectors can include a nucleic acid sequence encoding a marker product. This marker product is used to determine if the gene has been delivered to the cell and, once delivered, is being expressed. Preferred marker genes are the E. coli lacZ gene, which encodes beta-galactosidase, and the gene encoding green fluorescent protein (GFP).

[0146] Marker product, as used herein, is synonymous with "reporter protein." As used herein, a "reporter protein" is any protein that can be specifically detected when expressed. Reporter proteins are useful for detecting or quantifying expression from expression sequences. Many reporter proteins are known to one of skill in the art. These include, but are not limited to, beta-galactosidase, luciferase, and alkaline phosphatase that produce specific detectable products. Fluorescent reporter proteins can also be used, such as green fluorescent protein (GFP), green reef coral fluorescent protein (G-RCFP), cyan fluorescent protein (CFP), red fluorescent protein (RFP) and yellow fluorescent protein (YFP).

[0147] In some embodiments the marker or reporter protein may be a selectable marker. Examples of suitable selectable markers for mammalian cells are dihydrofolate reductase (DHFR), thymidine kinase, neomycin, neomycin analog G418, hygromycin, and puromycin. When such selectable markers are successfully transferred into a mammalian host cell, the transformed mammalian host cell can survive if placed under selective pressure. There are two widely used distinct categories of selective regimes. The first category is based on a cell's metabolism and the use of a mutant cell line which lacks the ability to grow independent of a supplemented media. Two examples are CHO DHFR- cells and mouse LTK- cells. These cells lack the ability to grow without the addition of such nutrients as thymidine or hypoxanthine. Because these cells lack certain genes necessary for a complete nucleotide synthesis pathway, they cannot survive unless the missing nucleotides are provided in a supplemented media. An alternative to supplementing the media is to introduce an intact DHFR or TK gene into cells lacking the respective genes, thus altering their growth requirements. Individual cells which were not transformed with the DHFR or TK gene will not be capable of survival in non-supplemented media.

[0148] The second category is dominant selection, which refers to a selection scheme used in any cell type and does not require the use of a mutant cell line. These schemes typically use a drug to arrest growth of a host cell. Those cells which have a novel gene would express a protein conveying drug resistance and would survive the selection. Examples of such dominant selection use the drugs neomycin (Southern, et al. (1982) J Molec Appl Genet 1:327), mycophenolic acid (Mulligan, et al. (1980) Science 209:1422) or hygromycin (Sugden, et al. (1985) Mol Cell Biol 5:410-413). The three examples employ bacterial genes under eukaryotic control to convey resistance to the appropriate drug G418 or neomycin (geneticin), xgpt (mycophenolic acid) or hygromycin, respectively. Others include the neomycin analog G418 and puromycin.

Cultured Cells

[0149] The materials, compositions, articles, devices, and methods disclosed herein, in one aspect, relate to a method of making a mutant Na.sub.v1.7 sodium channel alpha-subunit comprising culturing cells comprising vectors comprising mutant Na.sub.v1.7 nucleic acids under conditions allowing expression of the polypeptide encoded by the nucleic acid, wherein the polypeptide comprises a mutant Na.sub.v1.7 sodium channel.

Transgenic Animals

[0150] In one aspect, disclosed herein are transgenic animals that express one or more of the mutant Na.sub.v1.7 sodium channels described herein. For example, disclosed herein is a transgenic mouse comprising cells that encode a mutant Na.sub.v1.7 sodium channel alpha-subunit, wherein the mouse exhibits increased seizure activity as compared to the wild-type animal.

[0151] "Transgenic animal" is used herein to mean an animal comprising a transgene. By a "transgene" is meant a nucleic acid sequence that is inserted by artifice into a cell and becomes a part of the genome of that cell and its progeny. Such a transgene may be (but is not necessarily) partly or entirely heterologous (for example, derived from a different species) to the cell. A transgenic animal can be any non-human animal, such as a mouse, rat, guinea pig, sheep, pig, goat, and the like. Transgenic animals are made by techniques that are well known in the art. For example, a transgenic animal can be prepared by the method used in U.S. Pat. No. 4,736,866.

Mutant Na.sub.v1.7 Sodium Channel Alpha-Subunits

[0152] In one aspect, disclosed herein are mutant Na.sub.v1.7 sodium channel alpha-subunits and the use of such mutant Na.sub.v1.7 sodium channels to diagnose and treat disease states such as, for example, neurologic disorders associated with a sodium channel mutation. It was found that specific sites in the Na.sub.v1.7 sodium channel alpha-subunit are mutated, i.e., the amino acid at a specific position or at specific positions differs from that observed in the most commonly found Na.sub.v1.7 sodium channel.

[0153] As this specification discusses various amino acid sequences it is understood that the nucleic acids that can encode those amino acid sequences are also disclosed. This would include all degenerate sequences related to a specific amino acid sequence, i.e. all nucleic acids having a sequence that encodes one particular amino acid sequence as well as all nucleic acids, including degenerate nucleic acids, encoding the disclosed variants and derivatives of the amino acid sequences. Thus, while each particular nucleic acid sequence may not be written out herein, it is understood that each and every sequence is in fact disclosed and described herein through the disclosed amino acid sequence. For example, one of the many nucleic acid sequences that can encode the amino acid sequence of SEQ ID NO: 2 is set forth in SEQ ID NO: 8. Another nucleic acid sequence that encodes the amino acid sequence of SEQ ID NO: 3 is set forth in SEQ ID NO: 9. Another nucleic acid sequence that encodes the amino acid sequence of SEQ ID NO: 4 is set forth in SEQ ID NO: 10. Another nucleic acid sequence that encodes the amino acid sequence of SEQ ID NO: 5 is set forth in SEQ ID NO: 11. Another nucleic acid sequence that encodes the amino acid sequence of SEQ ID NO: 6 is set forth in SEQ ID NO: 12. Another nucleic acid sequence that encodes the amino acid sequence of SEQ ID NO: 7 is set forth in SEQ ID NO: 13. It is also understood that while no amino acid sequence indicates what particular DNA sequence encodes that protein within an organism, where particular variants of a disclosed protein are disclosed herein, the known nucleic acid sequence that encodes that amino acid sequence in the particular mutant Na.sub.v1.7 sodium channel alpha-subunit from which that amino acid sequence arises is also known and herein disclosed and described.
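As a rough illustration of how many degenerate nucleic acid sequences can encode a single amino acid sequence, the following Python sketch multiplies the number of codons available for each residue under the standard genetic code. The function name and the choice of the five-residue example PFVYG are illustrative assumptions and are not part of the disclosure.

```python
from math import prod

# Number of sense codons per amino acid in the standard genetic code
# (one-letter codes). The 20 values sum to 61 sense codons.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encoding_sequences(peptide: str) -> int:
    """Count the distinct coding sequences (stop codon excluded) for a peptide.

    Hypothetical helper: it assumes the standard genetic code and ignores
    codon usage bias, which matters in practice when choosing one sequence.
    """
    return prod(CODON_COUNTS[residue] for residue in peptide.upper())


if __name__ == "__main__":
    # P (4) x F (2) x V (4) x Y (2) x G (4) codons
    print(count_encoding_sequences("PFVYG"))
```

Even a five-residue fragment has hundreds of possible coding sequences, so a full-length alpha-subunit has far too many to list, which is why the sequences are disclosed through the amino acid sequence rather than written out individually.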

[0154] In one aspect, the mutant Na.sub.v1.7 sodium channel alpha-subunits described herein have one or more mutated sites. For example, in one aspect, disclosed herein is a mutant Na.sub.v1.7 sodium channel alpha-subunit where the amino acid at position 62 is not isoleucine (I) as is commonly found at position 62 but, rather, valine (V) (SEQ ID NO: 2). In another aspect, disclosed herein is a mutant Na.sub.v1.7 sodium channel alpha-subunit where the amino acid at position 149 is not proline (P) as is commonly found at position 149 but, rather, glutamine (Q) (SEQ ID NO: 3). In another aspect, disclosed herein is a mutant Na.sub.v1.7 sodium channel alpha-subunit where the amino acid at position 641 is not asparagine (N) as is commonly found at position 641 but, rather, tyrosine (Y) (SEQ ID NO: 4). In yet another aspect, disclosed herein is a mutant Na.sub.v1.7 sodium channel alpha-subunit where the amino acid at position 655 is not lysine (K) as is commonly found at position 655 but, rather, arginine (R) (SEQ ID NO: 5). In a further aspect, disclosed herein is a mutant Na.sub.v1.7 sodium channel alpha-subunit where the amino acid at position 739 is not isoleucine (I) as is commonly found at position 739 but, rather, valine (V) (SEQ ID NO: 6). In a still further aspect, disclosed herein is a mutant Na.sub.v1.7 sodium channel alpha-subunit where the amino acid at position 1123 is not leucine (L) as is commonly found at position 1123 but, rather, phenylalanine (F) (SEQ ID NO: 7).
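The substitutions above follow the common shorthand of wild-type residue, 1-based position, and mutant residue (e.g., I62V). The following Python sketch shows one way such labels could be parsed and applied to a reference sequence. The class and function names are hypothetical, and the short sequence in the usage example is a toy string, not the Na.sub.v1.7 sequence.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str  # residue commonly found at the position
    position: int   # 1-based amino acid position
    mutant: str     # residue found in the mutant


# The six substitutions named in this paragraph.
DISCLOSED_SUBSTITUTIONS = ["I62V", "P149Q", "N641Y", "K655R", "I739V", "L1123F"]


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'I62V' into its three parts."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a single-residue substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitution(sequence: str, sub: Substitution) -> str:
    """Return a copy of the sequence carrying the substitution.

    Raises ValueError if the position is out of range or the reference
    residue does not match the expected wild-type residue.
    """
    index = sub.position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {sub.position} is outside the sequence")
    if sequence[index] != sub.wild_type:
        raise ValueError(
            f"expected {sub.wild_type} at position {sub.position}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + sub.mutant + sequence[index + 1:]


if __name__ == "__main__":
    toy_reference = "MAIK"  # toy sequence for illustration only
    print(apply_substitution(toy_reference, parse_substitution("I3V")))
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors between a label and the reference sequence in use.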

[0155] Also contemplated are variants and derivatives of the disclosed mutant Na.sub.v1.7 amino acid sequences. It is understood that one way to define the variants and derivatives of the disclosed proteins herein is through defining the variants and derivatives in terms of homology/identity to specific known sequences. For example, SEQ ID NO: 2 sets forth a particular sequence of a mutant I62V Na.sub.v1.7 sodium channel alpha-subunit, SEQ ID NO: 3 sets forth a particular sequence of a mutant P149Q Na.sub.v1.7 sodium channel alpha-subunit, SEQ ID NO: 4 sets forth a particular sequence of a mutant N641Y Na.sub.v1.7 sodium channel alpha-subunit, SEQ ID NO: 5 sets forth a particular sequence of a mutant K655R Na.sub.v1.7 sodium channel alpha-subunit, SEQ ID NO: 6 sets forth a particular sequence of a mutant I739V Na.sub.v1.7 sodium channel alpha-subunit, and SEQ ID NO: 7 sets forth a particular sequence of a mutant L1123F Na.sub.v1.7 sodium channel alpha-subunit. Specifically disclosed are variants of these and other proteins herein disclosed which have at least 70% or 75% or 80% or 85% or 90% or 95% homology to the stated sequence. Also provided are amino acid sequences comprising the sequences of SEQ ID NOs: 2, 3, 4, 5, 6, and 7, or any fragment thereof, wherein the sequence comprises one or more conservative amino acid substitutions. Preferably, the amino acid sequence with conservative amino acid substitutions maintains sodium channel function. Examples of conservative amino acid substitutions are shown in Table 1. Those of skill in the art readily understand how to determine the homology of two proteins. For example, the homology can be calculated after aligning the two sequences so that the homology is at its highest level.

TABLE-US-00001 TABLE 1 Amino Acid Substitutions
Original Residue	Exemplary Conservative Substitutions (others are known in the art)
Ala	ser
Arg	lys or gln
Asn	gln or his
Asp	glu
Cys	ser
Gln	asn or lys
Glu	asp
Gly	pro
His	asn or gln
Ile	leu or val
Leu	ile or val
Lys	arg or gln
Met	leu or ile
Phe	met, leu or tyr
Ser	thr
Thr	ser
Trp	tyr
Tyr	trp or phe
Val	ile or leu
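Table 1 can be treated as a simple lookup. The following Python sketch encodes the rows of Table 1 as a dictionary and checks whether a given replacement is among the listed exemplary substitutions. The names are hypothetical, the Phe row is read as "met, leu or tyr", and, because the table states that other conservative substitutions are known in the art, a negative result here means only "not listed in Table 1".

```python
# Table 1 as a mapping: original residue -> exemplary conservative substitutions.
# Three-letter codes are used, as in the table. The table is directional:
# a pair is looked up only under its original residue.
TABLE_1 = {
    "Ala": {"Ser"},
    "Arg": {"Lys", "Gln"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn", "Lys"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_listed_conservative(original: str, replacement: str) -> bool:
    """Return True if Table 1 lists `replacement` for `original`."""
    return replacement.capitalize() in TABLE_1.get(original.capitalize(), set())


if __name__ == "__main__":
    for original, replacement in [("Ile", "Val"), ("Lys", "Arg"), ("Leu", "Phe")]:
        print(original, replacement, is_listed_conservative(original, replacement))
```

Read against this lookup, the Ile-to-Val and Lys-to-Arg changes appear in Table 1, whereas Pro-to-Gln, Asn-to-Tyr, and Leu-to-Phe do not.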

[0156] Another way of calculating homology can be performed by published algorithms. Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman (1981) Adv Appl Math 2:482, by the homology alignment algorithm of Needleman and Wunsch (1970) J Mol Biol 48:443, by the search for similarity method of Pearson and Lipman, (1988) Proc Natl Acad Sci USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by inspection.
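To make the alignment-then-score idea concrete, the following Python sketch implements a minimal global alignment in the style of Needleman and Wunsch and reports percent identity over the aligned length. The scoring values (match +1, mismatch -1, linear gap -1), the function names, and the use of identity as the homology measure are all simplifying assumptions; the packaged implementations named above use substitution matrices and more elaborate gap penalties.

```python
from typing import Tuple


def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -1) -> Tuple[str, str]:
    """Globally align two sequences by dynamic programming.

    Returns the two aligned strings, with '-' marking gaps.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns holding identical residues."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    print(percent_identity("PFIYG", "PFVYG"))
```

Dividing by the aligned length (gaps included) is one of several conventions; dividing by the shorter sequence length would give a different percentage for gapped alignments.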

[0157] Also, disclosed herein are isolated polypeptides and fragments of polypeptides comprising mutant Na.sub.v1.7 sodium channel alpha-subunit amino acid sequences. For example, disclosed herein are isolated polypeptides having amino acid sequences of SEQ ID NOs: 2, 3, 4, 5, 6, and 7. In another aspect, disclosed herein are fragments of such sequences. For example, disclosed herein are isolated polypeptides having amino acid sequences of SEQ ID NOs: 32, 33, 34, 35, 36, and 37.

[0158] Also, provided are fragments of at least 5 contiguous amino acid sequences corresponding to SEQ ID NOs: 2, 3, 4, 5, 6, and 7. Among these fragments are those comprising PFVYG (SEQ ID NO: 32), NPQDW (SEQ ID NO: 33), LPYGQ (SEQ ID NO: 34), IHRKR (SEQ ID NO: 35), LAVTI (SEQ ID NO: 36), and NPFPG (SEQ ID NO: 37).
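In each of the listed five-residue fragments, the middle residue matches the substituted amino acid of the corresponding mutant (for example, V in PFVYG for I62V), which is consistent with a window centered on the mutated position. Under that assumption, the following Python sketch extracts such a window from a sequence. The function name is hypothetical, and the sequence in the usage example is a toy string, not a disclosed SEQ ID NO.

```python
def fragment_around(sequence: str, position: int, length: int = 5) -> str:
    """Return `length` contiguous residues spanning a 1-based position.

    The window is centered on the position where possible and is shifted
    inward when the position lies too close to either end of the sequence.
    """
    if not 1 <= length <= len(sequence):
        raise ValueError("fragment length must fit within the sequence")
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    start = (position - 1) - length // 2
    # Clamp the window so it stays inside the sequence bounds.
    start = max(0, min(start, len(sequence) - length))
    return sequence[start:start + length]


if __name__ == "__main__":
    toy_mutant = "MAPFVYGKL"  # toy sequence; V at position 5 is the mutated site
    print(fragment_around(toy_mutant, 5))
```

Longer fragments spanning the same site can be obtained by passing a larger `length`, in keeping with "at least 5 contiguous amino acids".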

Methods of Synthesizing Polypeptides

[0159] The peptides, polypeptides, and polypeptide fragments disclosed herein can be chemically synthesized using currently available laboratory equipment using either Fmoc (9-fluorenylmethyloxycarbonyl) or Boc (tert-butyloxycarbonyl) chemistry (Applied Biosystems, Inc., Foster City, Calif.). One skilled in the art can readily appreciate that a peptide or polypeptide corresponding to the sodium channels disclosed herein, for example, can be synthesized by standard chemical reactions. For example, a peptide or polypeptide fragment can be synthesized and not cleaved from its synthesis resin whereas another peptide or polypeptide fragment can be synthesized and subsequently cleaved from the resin, thereby exposing a terminal group which is functionally blocked on the other fragment. By peptide condensation reactions, these two fragments can be covalently joined via a peptide bond at their carboxyl and amino termini, respectively, to form a sodium channel, or fragment thereof. (See Grant G A (1992) Synthetic Peptides: A User Guide. W.H. Freeman and Co., New York, N.Y. (1992); Bodansky M and Trost B., Ed. Principles of Peptide Synthesis. Springer-Verlag Inc., New York, N.Y. (1993)). Alternatively, the peptide or polypeptide is independently synthesized in vivo as described above.

[0160] For example, enzymatic ligation of cloned or synthetic peptide segments allows relatively short peptide fragments to be joined to produce larger peptide fragments, polypeptides or whole protein domains (Abrahmsen, et al. (1991) Biochemistry 30:4151). Alternatively, native chemical ligation of synthetic peptides can be utilized to synthetically construct large peptides or polypeptides from shorter peptide fragments. This method consists of a two-step chemical reaction (Dawson, et al. (1994) Science 266:776-779). The first step is the chemoselective reaction of an unprotected synthetic peptide-alpha-thioester with another unprotected peptide segment containing an amino-terminal Cys residue to give a thioester-linked intermediate as the initial covalent product. Without a change in the reaction conditions, this intermediate undergoes spontaneous, rapid intramolecular reaction to form a native peptide bond at the ligation site. Application of this native chemical ligation method to the total synthesis of a protein molecule is illustrated by the preparation of human interleukin 8 (IL-8) (Baggiolini, et al. (1992) FEBS Lett. 307:97-101; Clark-Lewis, et al. (1994) J Biol Chem 269:16075; Clark-Lewis, et al. (1991) Biochemistry 30:3128; Rajarathnam, et al. (1994) Biochemistry 33:6623-30).

[0161] Alternatively, unprotected peptide segments are chemically linked where the bond formed between the peptide segments as a result of the chemical ligation is an unnatural (non-peptide) bond (Schnolzer, et al. (1992) Science, 256:221). This technique has been used to synthesize analogs of protein domains as well as large amounts of relatively pure proteins with full biological activity (deLisle Milton, et al. Techniques in Protein Chemistry IV. Academic Press, New York, pp. 257-267 (1992)).

Antibodies to Mutant Na.sub.v1.7 Sodium Channels

[0162] The materials, compositions, articles, devices, and methods disclosed herein, in one aspect, relate to purified antibodies that selectively bind to an epitope of a mutant Na.sub.v1.7 sodium channel alpha-subunit. In one aspect, the purified antibody selectively binds to an epitope of the I62V mutant Na.sub.v1.7 sodium channel alpha-subunit. In another aspect, the purified antibody selectively binds to an epitope of the P149Q mutant Na.sub.v1.7 sodium channel alpha-subunit. In yet another aspect, the purified antibody selectively binds to an epitope of the N641Y mutant Na.sub.v1.7 sodium channel alpha-subunit. In a further aspect, the purified antibody selectively binds to an epitope of the K655R mutant Na.sub.v1.7 sodium channel alpha-subunit. In a still further aspect, the purified antibody selectively binds to an epitope of the I739V mutant Na.sub.v1.7 sodium channel alpha-subunit. In yet a further aspect, the purified antibody selectively binds to an epitope of the L1123F mutant Na.sub.v1.7 sodium channel alpha-subunit.

[0163] By "selectively binds" is meant that the antibody binds to the mutant Na.sub.v1.7 sodium channel without appreciably binding to the non-mutant Na.sub.v1.7 sodium channel. By "binding" is meant that the signal that indicates binding is at least about 1.5 times the signal for a non-binding control. Thus, "without appreciably binding" means a signal less than or equal to 1.5 times the background of a non-binding control.
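The signal criterion above can be sketched as a simple ratio test. The function names are hypothetical, the signal units are arbitrary, and the treatment of a ratio of exactly 1.5 as non-binding is an assumption made here to resolve the boundary between the two definitions.

```python
BINDING_RATIO = 1.5  # threshold stated in the text, relative to a non-binding control


def binds(signal, control_signal, ratio=BINDING_RATIO):
    """True if the signal exceeds `ratio` times the non-binding control signal.

    Assumption: a ratio of exactly 1.5 is treated as no appreciable binding.
    """
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return signal / control_signal > ratio


def selectively_binds(mutant_signal, wildtype_signal, control_signal):
    """True if the antibody binds the mutant channel but not the non-mutant channel."""
    return (binds(mutant_signal, control_signal)
            and not binds(wildtype_signal, control_signal))


if __name__ == "__main__":
    # Hypothetical assay readings in arbitrary units.
    print(selectively_binds(3.0, 1.2, 1.0))  # mutant 3.0x, wild type 1.2x
    print(selectively_binds(3.0, 2.0, 1.0))  # wild type also exceeds 1.5x
```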

[0164] The term "antibodies" is used herein in a broad sense and includes both polyclonal and monoclonal antibodies, as well as humanized, fully human, and non-human antibodies. Also provided are fragments of these antibodies wherein the fragments selectively bind with epitopes of mutant Na.sub.v1.7 sodium channel alpha-subunits. The antibodies can be tested for their desired binding activity using the in vitro assays described herein, or by analogous methods. Optionally, the antibodies are labeled directly or indirectly and can be used with imaging technologies to detect expression of the mutant Na.sub.v1.7.

[0165] The term "monoclonal antibody" as used herein refers to an antibody obtained from a substantially homogeneous population of antibodies, i.e., the individual antibodies within the population are identical except for possible naturally occurring mutations that may be present in a small subset of the antibody molecules. The monoclonal antibodies herein specifically include "chimeric" antibodies in which a portion of the heavy and/or light chain is identical with or homologous to corresponding sequences in antibodies derived from a particular species or belonging to a particular antibody class or subclass, while the remainder of the chain(s) is identical with or homologous to corresponding sequences in antibodies derived from another species or belonging to another antibody class or subclass, as well as fragments of such antibodies, as long as they exhibit the desired antagonistic activity (See, U.S. Pat. No. 4,816,567 and Morrison, et al. (1984) Proc Natl Acad Sci USA, 81:6851-6855).

[0166] The disclosed monoclonal antibodies can be made using any procedure which produces monoclonal antibodies. For example, disclosed monoclonal antibodies can be prepared using hybridoma methods, such as those described by Kohler, et al. (1975) Nature 256:495. In a hybridoma method, a mouse or other appropriate host animal is typically immunized with an immunizing agent to elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind to the immunizing agent. Alternatively, the lymphocytes may be immunized in vitro, e.g., using the mutant Na.sub.v1.7 channels described herein.

[0167] The monoclonal antibodies may also be made by recombinant DNA methods, such as those described in U.S. Pat. No. 4,816,567. DNA encoding the disclosed monoclonal antibodies can be readily isolated and sequenced using conventional procedures (e.g., by using oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and light chains of murine antibodies). Libraries of antibodies or active antibody fragments can also be generated and screened using phage display techniques, e.g., as described in U.S. Pat. Nos. 5,804,440 and 6,096,441.

[0168] In vitro methods are also suitable for preparing monovalent antibodies, including, for example, scFv antibodies. Digestion of antibodies to produce fragments thereof, particularly Fab fragments, can be accomplished using routine techniques known in the art. For instance, digestion can be performed using papain. Examples of papain digestion are described in WO 94/29348 and U.S. Pat. No. 4,342,566. Papain digestion of antibodies typically produces two identical antigen binding fragments, called Fab fragments, each with a single antigen binding site, and a residual Fc fragment. Pepsin treatment yields a fragment that has two antigen combining sites and is still capable of cross linking antigen.

[0169] The fragments, whether attached to other sequences or not, can also include insertions, deletions, substitutions, or other selected modifications of particular regions or specific amino acid residues, provided the activity of the antibody or antibody fragment is not significantly altered or impaired compared to the non-modified antibody or antibody fragment. These modifications can provide for some additional property, such as to remove/add amino acids capable of disulfide bonding, to increase its bio-longevity, to alter its secretory characteristics, etc. In any case, the antibody or antibody fragment must possess a bioactive property, such as specific binding to its cognate antigen. Functional or active regions of the antibody or antibody fragment may be identified by mutagenesis of a specific region of the protein, followed by expression and testing of the expressed polypeptide. Such methods are readily apparent to a skilled practitioner in the art and can include site-specific mutagenesis of the nucleic acid encoding the antibody or antibody fragment. (See Zoller M J (1992) Curr Opin Biotechnol 3:348-354).

[0170] As used herein, the term "antibody" or "antibodies" can also refer to a human antibody and/or a humanized antibody. Many non-human antibodies (e.g., those derived from mice, rats, or rabbits) are naturally antigenic in humans, and thus can give rise to undesirable immune responses when administered to humans. Therefore, the use of human or humanized antibodies in the methods serves to lessen the chance that an antibody administered to a human will evoke an undesirable immune response.

[0171] Human Antibodies

[0172] The disclosed human antibodies can be prepared using any technique. Examples of techniques for human monoclonal antibody production include those described by Cole et al. (Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, p. 77, 1985) and by Boerner, et al. (1991) J Immunol 147:86-95. Human antibodies (and fragments thereof) can also be produced using phage display libraries (see Hoogenboom, et al. (1991) J Mol Biol 227:381; Marks, et al. (1991) J Mol Biol 222:581).

[0173] The disclosed human antibodies can also be obtained from transgenic animals. For example, transgenic, mutant mice that are capable of producing a full repertoire of human antibodies, in response to immunization, have been described (see, e.g., Jakobovits, et al. (1993) Proc Natl Acad Sci USA 90:2551-2555; Jakobovits, et al. (1993) Nature 362:255-258; Bruggermann, et al. (1993) Year in Immunol 7:33). Specifically, the homozygous deletion of the antibody heavy chain joining region (J(H)) gene in these chimeric and germ line mutant mice results in complete inhibition of endogenous antibody production, and the successful transfer of the human germ line antibody gene array into such germ line mutant mice results in the production of human antibodies upon antigen challenge. Antibodies having the desired activity are selected using the mutant Na.sub.v1.7 sodium channels provided herein.

[0174] Humanized Antibodies

[0175] Antibody humanization techniques generally involve the use of recombinant DNA technology to manipulate the DNA sequence encoding one or more polypeptide regions of an antibody molecule. Accordingly, a humanized form of a non-human antibody (or a fragment thereof) is a chimeric antibody or antibody chain (or a fragment thereof, such as an Fv, Fab, Fab', or other antigen binding portion of an antibody) which contains a portion of an antigen binding site from a non-human (donor) antibody integrated into the framework of a human (recipient) antibody.

[0176] To generate a humanized antibody, residues from one or more complementarity determining regions (CDRs) of a recipient (human) antibody molecule are replaced by residues from one or more CDRs of a donor (non-human) antibody molecule that is known to have desired antigen binding characteristics (e.g., a certain level of specificity and affinity for the target antigen). In some instances, Fv framework (FR) residues of the human antibody are replaced by corresponding non-human residues. Humanized antibodies may also contain residues which are found neither in the recipient antibody nor in the imported CDR or framework sequences. Generally, a humanized antibody has one or more amino acid residues introduced into it from a source which is non-human. In practice, humanized antibodies are typically human antibodies in which some CDR residues and possibly some FR residues are substituted by residues from analogous sites in rodent antibodies. Humanized antibodies generally contain at least a portion of an antibody constant region (Fc), typically that of a human antibody (see Jones, et al. (1986) Nature 321:522-525; Riechmann, et al. (1988) Nature 332:323-327; Presta, (1992) Curr Opin Struct Biol 2:593-596).

[0177] Methods for humanizing non-human antibodies are well known in the art. For example, humanized antibodies can be generated according to the methods of Winter and co-workers (Jones, et al. (1986) Nature 321:522-525; Riechmann, et al. (1988) Nature 332:323-327; Verhoeyen, et al. (1988) Science 239:1534-1536), by substituting rodent CDRs or CDR sequences for the corresponding sequences of a human antibody. Methods that can be used to produce humanized antibodies are also described in U.S. Pat. Nos. 4,816,567, 5,565,332, 5,721,367, 5,837,243, 5,939,598, 6,130,364, and 6,180,377.

[0178] In one aspect, as a form of therapy, antibodies can be used to inactivate the function of a mutant protein.

Methods of Drug Screening and Delivery

[0179] The materials, compositions, articles, devices and methods disclosed herein, in one aspect, relate to a method of identifying a compound that modulates mutant Na.sub.v1.7 sodium channels comprising contacting, with a test compound, a cell containing a mutant Na.sub.v1.7 nucleic acid that encodes a mutant Na.sub.v1.7 sodium channel comprising one or more mutations at residue 62, residue 149, residue 641, residue 655, residue 739, or residue 1123 of the channel; detecting Na.sub.v1.7 sodium channel activity; and comparing the Na.sub.v1.7 sodium channel activity in the contacted cell with the amount of Na.sub.v1.7 sodium channel activity in a control cell, wherein the control cell is not contacted by the test compound, and wherein an increased or decreased Na.sub.v1.7 sodium channel activity in the contacted cell as compared to the control cell indicates a compound that modulates mutant Na.sub.v1.7 sodium channels. Detecting sodium channel activity can be accomplished by methods known in the art. For example, a suitable protocol for detecting sodium channel activity is described in Kausalia, et al. (2003) J. Neurophysiol. 10.1152/jn.00676.2003.
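The comparison step of such a screening method can be sketched as follows. The function name, the arbitrary activity units, and in particular the `tolerance` parameter (a minimum relative change below which the activities are treated as unchanged) are assumptions introduced for this sketch; the method as described does not specify a numerical cutoff.

```python
def classify_compound(contacted_activity, control_activity, tolerance=0.10):
    """Compare channel activity in a contacted cell with that in a control cell.

    Returns "increased", "decreased", or "unchanged". The 10% default
    tolerance is a hypothetical cutoff, not a value taken from the text.
    """
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    relative_change = (contacted_activity - control_activity) / control_activity
    if relative_change > tolerance:
        return "increased"
    if relative_change < -tolerance:
        return "decreased"
    return "unchanged"


def is_modulator(contacted_activity, control_activity, tolerance=0.10):
    """A compound is flagged as a modulator if activity rises or falls."""
    result = classify_compound(contacted_activity, control_activity, tolerance)
    return result != "unchanged"


if __name__ == "__main__":
    # Hypothetical activity readings in arbitrary units.
    print(classify_compound(150.0, 100.0))
    print(classify_compound(50.0, 100.0))
    print(is_modulator(105.0, 100.0))
```

In practice, replicate measurements and a statistical test would ordinarily replace the single fixed cutoff used here.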

[0180] The cell can express the mutant channel naturally or can be genetically modified to do so. Optionally, the cell is an oocyte that expresses the mutant sodium channel. The mutant sodium channel can be an I62V, P149Q, N641Y, K655R, I739V, or L1123F mutant. Optionally, a mutant channel can comprise one or more of the site mutations.

[0181] Optionally, channel activity is tested using intracellular or extracellular recording to assess changes in membrane potential associated with sodium ion flux. Alternatively, imaging technologies can be used to observe labeled ion flux. Expression can be assessed in Xenopus oocytes or mammalian cells such as CHO, HEK and tsA201. Mutations may result in errors of protein trafficking and protein interaction. As such, mutant channels can be assessed for their ability to form functional channels in the cell membrane as opposed to being retained in the endoplasmic reticulum by using labeled antibodies to the wild-type channel, or by attaching a common epitope to the channels and using a specific antibody to that epitope. Mutations that alter interactions with intracellular proteins, such as protein kinase A, protein kinase C or calmodulin kinase, or the sodium channel beta-subunits, can be identified through yeast 2-hybrid studies, co-immunoprecipitation experiments or electrophysiological experiments.

[0182] Also, the materials, compositions, articles, devices, and methods disclosed herein, in one aspect, relate to a method of preventing or reducing the effects of neurologic disorders like febrile seizures, afebrile seizures, or epilepsy by treating a subject at risk for neurologic disorders with a composition that modulates mutant Na.sub.v1.7 levels. Thus, a subject with a mutation(s) in Na.sub.v1.7 sodium channel alpha-subunits, consistent with a neurologic disorder or an increased risk of a neurologic disorder, can be treated with a composition comprising a mutant Na.sub.v1.7 modulator identified or manufactured using the methods taught herein.

[0183] The materials and compositions disclosed herein can be administered in vivo in a pharmaceutically acceptable carrier. By "pharmaceutically acceptable carrier" is meant a material that is not biologically or otherwise undesirable, i.e., the material may be administered to a subject, along with a modulator of Na.sub.v1.7 sodium channel function identified or made by the methods taught herein, without causing any undesirable biological effects or interacting in a deleterious manner with any of the other components of the pharmaceutical composition in which it is contained. The carrier would naturally be selected to minimize any degradation of the active ingredient and to minimize any adverse side effects in the subject, as would be well known to one of skill in the art.

[0184] The compositions may be administered orally, parenterally (e.g., intravenously), by intramuscular injection, by intraperitoneal injection, transdermally, extracorporeally, topically or the like, although topical intranasal administration or administration by inhalant is typically preferred. As used herein, "topical intranasal administration" means delivery of the compositions into the nose and nasal passages through one or both of the nares and can comprise delivery by a spraying mechanism or droplet mechanism, or through aerosolization of the composition. The latter may be effective when a large number of animals are to be treated simultaneously. Administration of the compositions by inhalant can be through the nose or mouth via delivery by a spraying or droplet mechanism. Delivery can also be directly to any area of the respiratory system (e.g., lungs) via intubation. The exact amount of the compositions required will vary from subject to subject, depending on the species, age, weight and general condition of the subject, the severity of the disorder being treated, the particular nucleic acid or modulator used, its mode of administration and the like. Thus, it is not possible to specify an exact amount for every composition. However, an appropriate amount can be determined by one of ordinary skill in the art using only routine experimentation given the teachings herein.

[0185] Parenteral administration of the composition, if used, is generally characterized by injection. Injectables can be prepared in conventional forms, either as liquid solutions or suspensions, solid forms suitable for solution or suspension in liquid prior to injection, or as emulsions. A more recently revised approach for parenteral administration involves use of a slow release or sustained release system such that a constant dosage is maintained. See, e.g., U.S. Pat. No. 3,610,795, which is incorporated by reference herein.

[0186] The materials may be in solution or suspension (for example, incorporated into microparticles, liposomes, or cells). These may be targeted to a particular cell type via antibodies, receptors, or receptor ligands.

[0187] Liposomes are vesicles comprised of one or more concentrically ordered lipid bilayers which encapsulate an aqueous phase. They are normally not leaky, but can become leaky if a hole or pore occurs in the membrane, if the membrane is dissolved or degrades, or if the membrane temperature is increased to the phase transition temperature. Current methods of drug delivery via liposomes require that the liposome carrier ultimately become permeable and release the encapsulated drug at the target site. This can be accomplished, for example, in a passive manner wherein the liposome bilayer degrades over time through the action of various agents in the body. Every liposome composition will have a characteristic half-life in the circulation or at other sites in the body and, thus, by controlling the half-life of the liposome composition, the rate at which the bilayer degrades can be somewhat regulated.

[0188] In contrast to passive drug release, active drug release involves using an agent to induce a permeability change in the liposome vesicle. Liposome membranes can be constructed so that they become destabilized when the environment becomes acidic near the liposome membrane (see, e.g., (1987) Proc Natl Acad Sci USA 84:7851; (1989) Biochemistry 28:908, which are hereby incorporated by reference in their entireties for their teachings of liposome construction and administration). When liposomes are endocytosed by a target cell, for example, they can be routed to acidic endosomes which will destabilize the liposome and result in drug release.

[0189] Alternatively, the liposome membrane can be chemically modified such that an enzyme is placed as a coating on the membrane which slowly destabilizes the liposome. Since control of drug release depends on the concentration of enzyme initially placed in the membrane, there is no real effective way to modulate or alter drug release to achieve "on demand" drug delivery. The same problem exists for pH-sensitive liposomes in that as soon as the liposome vesicle comes into contact with a target cell, it will be engulfed and a drop in pH will lead to drug release. This liposome delivery system can also be made to target B cells by incorporating into the liposome structure a ligand having an affinity for B cell-specific receptors.

[0190] Compositions including the liposomes in a pharmaceutically acceptable carrier are also contemplated.

[0191] Transdermal delivery devices have been employed for delivery of low molecular weight proteins by using lipid-based compositions (i.e., in the form of a patch) in combination with sonophoresis. However, as reported in U.S. Pat. No. 6,041,253, which is hereby incorporated by reference in its entirety for the methods taught therein, transdermal delivery can be further enhanced by the application of an electric field, for example, by iontophoresis or electroporation. Using low frequency ultrasound which induces cavitation of the lipid layers of the stratum corneum, higher transdermal fluxes, rapid control of transdermal fluxes, and drug delivery at lower ultrasound intensities can be achieved. Still further enhancement can be obtained using a combination of chemical enhancers and/or magnetic field along with the electric field and ultrasound.

[0192] Implantable or injectable protein depot compositions can also be employed, providing long-term delivery of the composition. For example, U.S. Pat. No. 6,331,311, which is hereby incorporated by reference in its entirety for protein depot compositions and uses, reports an injectable depot gel composition which includes a biocompatible polymer, a solvent that dissolves the polymer and forms a viscous gel, and an emulsifying agent in the form of a dispersed droplet phase in the viscous gel. Upon injection, such a gel composition can provide a relatively continuous rate of dispersion of the agent to be delivered, thereby avoiding an initial burst of the agent to be delivered.

[0193] The test compound and modulator taught herein can be, but is not limited to, antibodies, chemicals, small molecules, modified antisense RNAs, ions, siRNAs, receptor ligands, drugs and secreted proteins.

EXAMPLES

[0194] The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how the antibodies, polypeptides, nucleic acids, compositions, and/or methods claimed herein are made and evaluated, and are intended to be purely exemplary of the invention and are not intended to limit the scope of what the inventors regard as their invention. Efforts have been made to ensure accuracy with respect to numbers (e.g., amounts, temperature, etc.), but some errors and deviations should be accounted for.

Example 1

[0195] Febrile seizures are the most common seizure disorder of early childhood, exhibiting a prevalence of 2-5% in European and North American children and as high as 9% in the Japanese. The incidence of febrile seizures in first-degree relatives is 31% (Aicardi, Epilepsy in Children, Raven Press, New York, N.Y. (1994)), supporting a strong genetic etiology of febrile seizures. The impact of febrile seizures is considerable because individuals who experience febrile seizures have a 2-7% chance of developing afebrile seizures later in life (Annegers, et al. (1987) N Engl J Med 316:493-498). These later epileptic phenomena include cases of various generalized convulsive, as well as simple and complex partial seizures.

[0196] Linkage analysis of a febrile seizure kindred K4425 identified a 10 cM region of no recombination on chromosome 2q24 (FEB3, OMIM 604403), which contains five sodium channel alpha-subunit genes (see Peiffer, et al. (1999) Ann Neurol 46:671-678). Three of the sodium channel genes within this critical region, Na.sub.v1.1, Na.sub.v1.2, and Na.sub.v1.3, share over 85% identity and are highly expressed in brain (see Catterall, (2000) Neuron 26:13-25). Na.sub.v1.7, which also resides within this critical genetic interval, shares approximately 70-80% homology with Na.sub.v1.1, Na.sub.v1.2, and Na.sub.v1.3 (see Catterall, (2000) Neuron 26:13-25; Sangameswaran, et al. (1997) J Biol Chem 272:14805-14809), is expressed primarily in neurons of the dorsal root ganglia, and shows minimal to no expression in brain (Felts, et al. (1997) Brain Res Mol Brain Res 45:71-82; Toledo-Aral, et al. (1997) Proc Natl Acad Sci USA 94:1527-1532). Consequently, Na.sub.v1.7 has been classified as a peripheral nervous system channel (Catterall, (2000) Neuron 26:13-25; Goldin, et al. (2001) Annu Rev Physiol 63:871-894).

[0197] Recently, disease-causing mutations were identified in Na.sub.v1.1 and Na.sub.v1.2 in generalized epilepsy febrile seizure plus (GEFS+), a febrile seizure disorder that is subtly different from the phenotype described in kindred K4425 (see Singh, et al. (1999) Ann Neurol 45:75-81; Sugawara, et al. (2001) Proc Natl Acad Sci USA 98:6384-6389; Wallace, et al. (2001) Am J Hum Genet 68:859-865; Escayg, et al. (2000) Nat Genet 24:343-345). Sequence analysis of an affected individual in K4425 did not yield any disease-causing variants in either of these two genes, or in the closely related Na.sub.v1.3 gene. Sequence analysis of the Na.sub.v1.7 large intracellular loop between domains I and II revealed a missense change (N641Y) in all affected individuals of K4425 that was absent from 236 control chromosomes (see FIG. 2).

[0198] Na.sub.v1.7 was then sequenced in a panel of 32 sporadic and familial cases with seizures occurring in the setting of a febrile illness. Five additional Na.sub.v1.7 variants were identified that were not found in 180 ethnically matched control chromosomes (Table 2). These variants were identified in the intracellular N-terminus (I62V), the DI S1-S2 extracellular loop (P149Q), the DI-DII intracellular loop (K655R), the DII S1 transmembrane domain (I739V), and the DII-DIII intracellular loop (L1123F) (see FIG. 3).

TABLE-US-00002
TABLE 2
Amino acid conservation and clinical findings associated with Na.sub.v1.7 mutations

                  Amino Acid Conservation                      Clinical Findings
Exon  Mutation    Species         Gene family        Family    Presentation      Clinical Course
                  mou/rat/rabb    Nav1.1/1.2/1.3     History   (age)
1     I62V        identical       identical          --        FS (2 yr)         FS until 2 yr
3     P149Q       P/P/A           identical          --        FS (2 yr)         FS until 4 yr
11    N641Y       identical       V/A/T              +*        FS (mean 1.3 yr)  GTC, PC, SP, GT, GA until 6-16 yr
12    K655R       identical       R/R/R              +         FS (5 yr)         IGE until 6 yr
13    I739V       identical       identical          +         FS (1 yr)         IGE until 8 yr
17    L1123F      identical       A/A/L              --        epilepsy (5 mo)   Intractable seizures

Species: corresponding amino acid of Na.sub.v1.7 of the mouse, rat and rabbit; Gene family: corresponding amino acid of Na.sub.v1.1, Na.sub.v1.2, and Na.sub.v1.3; identical: amino acid is identical to human Na.sub.v1.7. Family history: --, negative; +, positive. FS, febrile seizures; GTC, generalized tonic-clonic; PC, partial complex; SP, simple partial; GT, generalized tonic; GA, generalized atonic; IGE, idiopathic generalized epilepsy; *family described in FIG. 2.

[0199] All variants, except proline 149, are conserved in the Na.sub.v1.7 gene of mouse, rat and rabbit. Proline 149 is conserved in mouse and rat, and is substituted with alanine in rabbit (Table 2). Less conservation of the mutant Na.sub.v1.7 residues is found among the Na.sub.v1.1, Na.sub.v1.2, and Na.sub.v1.3 genes.

[0200] A broad variety of neurologic manifestations is observed in patients with mutations in Na.sub.v1.7, suggestive of a wide clinical continuum. Illustrating the milder end of the continuum are two probands suffering only from infrequent febrile seizures before six years of age (Table 2: I62V, P149Q). An additional two such patients later developed rare generalized convulsive episodes (associated with generalized epileptiform discharges on EEG) that resolved by eight years of age (Table 2: K655R, I739V). All 21 affected individuals in K4425 experienced febrile seizures before six years of age (Table 2: N641Y). Eight of these individuals had later afebrile seizures, which remitted by the age of 16 in six individuals (Peiffer, et al. (1999) Ann Neurol 46:671-678). These patients with afebrile seizures that ultimately resolved suggest an intermediate phenotype. Lastly, one proband in our study experienced multiple generalized clonic seizures which were predominantly afebrile, beginning at five months of age. This patient, without a family history of seizures, progressed to have frequent episodes of status epilepticus and prolonged complex partial seizures by 16 months, and at 5 years old continues to have mixed seizures (including probable myoclonic and astatic seizures) in spite of resolute therapeutic intervention. This last case represents the severe end of the clinical spectrum, and may be characterized as similar to SMEI (Table 2: L1123F). There is now abundant evidence for an increasing range of epilepsy phenotypes in patients with mutations in Na.sub.v1.1 (see Nabbout, et al. (2003) Neurology 60:1961-1967; Fujiwara, et al. (2003) Brain 126:531-546). Electrophysiological characterization of these unique Na.sub.v1.7 mutations may help shed light on the variation in seizure manifestation observed in this group of patients.

[0201] To date, Na.sub.v1.1 and Na.sub.v.beta.1.1 are the most commonly mutated genes in the febrile seizure phenotype. However, in an Australian cohort of 36 unrelated GEFS+ samples, mutations in Na.sub.v1.1 and Na.sub.v.beta.1.1 account for only 17% of cases (see Wallace, et al. (2001) Am J Hum Genet 68:859-865). In our panel of 32 unrelated febrile seizure cases, only one Na.sub.v1.1 mutation, R946H, was identified. Na.sub.v1.1 is implicated as a major cause of SMEI or Dravet syndrome (see Nabbout, et al. (2003) Neurology 60:1961-1967; Fujiwara, et al. (2003) Brain 126:531-546; Claes, et al. (2003) Hum Mutat 21:615-621). Since GEFS+, and possibly SMEI, exhibit genetic heterogeneity, there can be a prevalence of Na.sub.v1.7 mutations in cohorts of both disorders.

Example 2

[0202] Experiments were conducted as described in Lossin, et al., (2003) J Neurosci 23(36):11289-11295. Results are shown in FIG. 4. Full-length wild-type SCN9A and mutant SCN9A (K655R and N641Y) constructs were transiently transfected into tsA201 cells. Currents were elicited by test pulses from -60 mV to +40 mV in 5 mV increments. At negative potentials, K655R has a higher current density than wild-type. At positive potentials, N641Y has reduced current density compared to wild-type, p<0.05.
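For illustration, the series of test potentials described above can be enumerated as follows. The function names are hypothetical. The normalization of peak current to cell capacitance (pA/pF) shown in `current_density` is a common convention assumed here; the procedure actually used is that of the cited reference, and the numerical inputs in the usage lines are made up for the demonstration.

```python
def pulse_voltages(start_mv=-60, stop_mv=40, step_mv=5):
    """Return the test potentials in mV, inclusive of both end points."""
    return list(range(start_mv, stop_mv + step_mv, step_mv))


def current_density(peak_current_pa, capacitance_pf):
    """Peak current normalized to cell capacitance, in pA/pF.

    Assumed convention; not a formula stated in the text.
    """
    if capacitance_pf <= 0:
        raise ValueError("capacitance must be positive")
    return peak_current_pa / capacitance_pf


if __name__ == "__main__":
    steps = pulse_voltages()
    print(len(steps), steps[0], steps[-1])
    # Hypothetical recording: -500 pA peak current in a 20 pF cell.
    print(current_density(-500.0, 20.0))
```

With the default arguments, the protocol spans 21 test potentials.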

[0203] Throughout this application, various publications are referenced. The disclosures of these publications in their entireties are hereby incorporated by reference into this application in order to more fully describe the state of the art to which this invention pertains.

[0204] It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention without departing from the scope or spirit of the invention. Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

Sequence CWU 1

1

3815934DNAHomo sapiens 1atggcaatgt tgcctccccc aggacctcag agctttgtcc atttcacaaa acagtctctt 60gccctcattg aacaacgcat tgctgaaaga aaatcaaagg aacccaaaga agaaaagaaa 120gatgatgatg aagaagcccc aaagccaagc agtgacttgg aagctggcaa acaactgccc 180ttcatctatg gggacattcc tcccggcatg gtgtcagagc ccctggagga cttggacccc 240tactatgcag acaaaaagac tttcatagta ttgaacaaag ggaaaacaat cttccgtttc 300aatgccacac ctgctttata tatgctttct cctttcagtc ctctaagaag aatatctatt 360aagattttag tacactcctt attcagcatg ctcatcatgt gcactattct gacaaactgc 420atatttatga ccatgaataa cccgccggac tggaccaaaa atgtcgagta cacttttact 480ggaatatata cttttgaatc acttgtaaaa atccttgcaa gaggcttctg tgtaggagaa 540ttcacttttc ttcgtgaccc gtggaactgg ctggattttg tcgtcattgt ttttgcgtat 600ttaacagaat ttgtaaacct aggcaatgtt tcagctcttc gaactttcag agtattgaga 660gctttgaaaa ctatttctgt aatcccaggc ctgaagacaa ttgtaggggc tttgatccag 720tcagtgaaga agctttctga tgtcatgatc ctgactgtgt tctgtctgag tgtgtttgca 780ctaattggac tacagctgtt catgggaaac ctgaagcata aatgttttcg aaattcactt 840gaaaataatg aaacattaga aagcataatg aataccctag agagtgaaga agactttaga 900aaatattttt attacttgga aggatccaaa gatgctctcc tttgtggttt cagcacagat 960tcaggtcagt gtccagaggg gtacacctgt gtgaaaattg gcagaaaccc tgattatggc 1020tacacgagct ttgacacttt cagctgggcc ttcttagcct tgtttaggct aatgacccaa 1080gattactggg aaaaccttta ccaacagacg ctgcgtgctg ctggcaaaac ctacatgatc 1140ttctttgtcg tagtgatttt cctgggctcc ttttatctaa taaacttgat cctggctgtg 1200gttgccatgg catatgaaga acagaaccag gcaaacattg aagaagctaa acagaaagaa 1260ttagaatttc aacagatgtt agaccgtctt aaaaaagagc aagaagaagc tgaggcaatt 1320gcagcggcag cggctgaata tacaagtatt aggagaagca gaattatggg cctctcagag 1380agttcttctg aaacatccaa actgagctct aaaagtgcta aagaaagaag aaacagaaga 1440aagaaaaaga atcaaaagaa gctctccagt ggagaggaaa agggagatgc tgagaaattg 1500tcgaaatcag aatcagagga cagcatcaga agaaaaagtt tccaccttgg tgtcgaaggg 1560cataggcgag cacatgaaaa gaggttgtct acccccaatc agtcaccact cagcattcgt 1620ggctccttgt tttctgcaag gcgaagcagc agaacaagtc tttttagttt caaaggcaga 1680ggaagagata taggatctga gactgaattt 
gccgatgatg agcacagcat ttttggagac 1740aatgagagca gaaggggctc actgtttgtg ccccacagac cccaggagcg acgcagcagt 1800aacatcagcc aagccagtag gtccccacca atgctgccgg tgaacgggaa aatgcacagt 1860gctgtggact gcaacggtgt ggtctccctg gttgatggac gctcagccct catgctcccc 1920aatggacagc ttctgccaga gggcacgacc aatcaaatac acaagaaaag gcgttgtagt 1980tcctatctcc tttcagagga tatgctgaat gatcccaacc tcagacagag agcaatgagt 2040agagcaagca tattaacaaa cactgtggaa gaacttgaag agtccagaca aaaatgtcca 2100ccttggtggt acagatttgc acacaaattc ttgatctgga attgctctcc atattggata 2160aaattcaaaa agtgtatcta ttttattgta atggatcctt ttgtagatct tgcaattacc 2220atttgcatag ttttaaacac attatttatg gctatggaac accacccaat gactgaggaa 2280ttcaaaaatg tacttgctat aggaaatttg gtctttactg gaatctttgc agctgaaatg 2340gtattaaaac tgattgccat ggatccatat gagtatttcc aagtaggctg gaatattttt 2400gacagcctta ttgtgacttt aagtttagtg gagctctttc tagcagatgt ggaaggattg 2460tcagttctgc gatcattcag actgctccga gtcttcaagt tggcaaaatc ctggccaaca 2520ttgaacatgc tgattaagat cattggtaac tcagtagggg ctctaggtaa cctcacctta 2580gtgttggcca tcatcgtctt catttttgct gtggtcggca tgcagctctt tggtaagagc 2640tacaaagaat gtgtctgcaa gatcaatgat gactgtacgc tcccacggtg gcacatgaac 2700gacttcttcc actccttcct gattgtgttc cgcgtgctgt gtggagagtg gatagagacc 2760atgtgggact gtatggaggt cgctggtcaa gctatgtgcc ttattgttta catgatggtc 2820atggtcattg gaaacctggt ggtcctaaac ctatttctgg ccttattatt gagctcattt 2880agttcagaca atcttacagc aattgaagaa gaccctgatg caaacaacct ccagattgca 2940gtgactagaa ttaaaaaggg aataaattat gtgaaacaaa ccttacgtga atttattcta 3000aaagcatttt ccaaaaagcc aaagatttcc agggagataa gacaagcaga agatctgaat 3060actaagaagg aaaactatat ttctaaccat acacttgctg aaatgagcaa aggtcacaat 3120ttcctcaagg aaaaagataa aatcagtggt tttggaagca gcgtggacaa acacttgatg 3180gaagacagtg atggtcaatc atttattcac aatcccagcc tcacagtgac agtgccaatt 3240gcacctgggg aatccgattt ggaaaatatg aatgctgagg aacttagcag tgattcggat 3300agtgaataca gcaaagtgag attaaaccgg tcaagctcct cagagtgcag cacagttgat 3360aaccctttgc ctggagaagg agaagaagca gaggctgaac ctatgaattc cgatgagcca 
3420gaggcctgtt tcacagatgg ttgtgtacgg aggttctcat gctgccaagt taacatagag 3480tcagggaaag gaaaaatctg gtggaacatc aggaaaacct gctacaagat tgttgaacac 3540agttggtttg aaagcttcat tgtcctcatg atcctgctca gcagtggtgc cctggctttt 3600gaagatattt atattgaaag gaaaaagacc attaagatta tcctggagta tgcagacaag 3660atcttcactt acatcttcat tctggaaatg cttctaaaat ggatagcata tggttataaa 3720acatatttca ccaatgcctg gtgttggctg gatttcctaa ttgttgatgt ttctttggtt 3780actttagtgg caaacactct tggctactca gatcttggcc ccattaaatc ccttcggaca 3840ctgagagctt taagacctct aagagcctta tctagatttg aaggaatgag ggtcgttgtg 3900aatgcactca taggagcaat tccttccatc atgaatgtgc tacttgtgtg tcttatattc 3960tggctgatat tcagcatcat gggagtaaat ttgtttgctg gcaagttcta tgagtgtatt 4020aacaccacag atgggtcacg gtttcctgca agtcaagttc caaatcgttc cgaatgtttt 4080gcccttatga atgttagtca aaatgtgcga tggaaaaacc tgaaagtgaa ctttgataat 4140gtcggacttg gttacctatc tctgcttcaa gttgcaactt ttaagggatg gacgattatt 4200atgtatgcag cagtggattc tgttaatgta gacaagcagc ccaaatatga atatagcctc 4260tacatgtata tttattttgt cgtctttatc atctttgggt cattcttcac tttgaacttg 4320ttcattggtg tcatcataga taatttcaac caacagaaaa agaagcttgg aggtcaagac 4380atctttatga cagaagaaca gaagaaatac tataatgcaa tgaaaaagct ggggtccaag 4440aagccacaaa agccaattcc tcgaccaggg aacaaaatcc aaggatgtat atttgaccta 4500gtgacaaatc aagcctttga tattagtatc atggttctta tctgtctcaa catggtaacc 4560atgatggtag aaaaggaggg tcaaagtcaa catatgactg aagttttata ttggataaat 4620gtggttttta taatcctttt cactggagaa tgtgtgctaa aactgatctc cctcagacac 4680tactacttca ctgtaggatg gaatattttt gattttgtgg ttgtgattat ctccattgta 4740ggtatgtttc tagctgattt gattgaaacg tattttgtgt cccctaccct gttccgagtg 4800atccgtcttg ccaggattgg ccgaatccta cgtctagtca aaggagcaaa ggggatccgc 4860acgctgctct ttgctttgat gatgtccctt cctgcgttgt ttaacatcgg cctcctgctc 4920ttcctggtca tgttcatcta cgccatcttt ggaatgtcca actttgccta tgttaaaaag 4980gaagatggaa ttaatgacat gttcaatttt gagacctttg gcaacagtat gatttgcctg 5040ttccaaatta caacctctgc tggctgggat ggattgctag cacctattct taacagtaag 5100ccacccgact gtgacccaaa aaaagttcat 
cctggaagtt cagttgaagg agactgtggt 5160aacccatctg ttggaatatt ctactttgtt agttatatca tcatatcctt cctggttgtg 5220gtgaacatgt acattgcagt catactggag aattttagtg ttgccactga agaaagtact 5280gaacctctga gtgaggatga ctttgagatg ttctatgagg tttgggagaa gtttgatccc 5340gatgcgaccc agtttataga gttctctaaa ctctctgatt ttgcagctgc cctggatcct 5400cctcttctca tagcaaaacc caacaaagtc cagctcattg ccatggatct gcccatggtt 5460agtggtgacc ggatccattg tcttgacatc ttatttgctt ttacaaagcg tgttttgggt 5520gagagtgggg agatggattc tcttcgttca cagatggaag aaaggttcat gtctgcaaat 5580ccttccaaag tgtcctatga acccatcaca accacactaa aacggaaaca agaggatgtg 5640tctgctactg tcattcagcg tgcttataga cgttaccgct taaggcaaaa tgtcaaaaat 5700atatcaagta tatacataaa agatggagac agagatgatg atttactcaa taaaaaagat 5760atggcttttg ataatgttaa tgagaactca agtccagaaa aaacagatgc cacttcatcc 5820accacctctc caccttcata tgatagtgta acaaagccag acaaagagaa atatgaacaa 5880gacagaacag aaaaggaaga caaagggaaa gacagcaagg aaagcaaaaa atag 593421977PRTHomo sapiens 2Met Ala Met Leu Pro Pro Pro Gly Pro Gln Ser Phe Val His Phe Thr1 5 10 15Lys Gln Ser Leu Ala Leu Ile Glu Gln Arg Ile Ala Glu Arg Lys Ser 20 25 30Lys Glu Pro Lys Glu Glu Lys Lys Asp Asp Asp Glu Glu Ala Pro Lys 35 40 45Pro Ser Ser Asp Leu Glu Ala Gly Lys Gln Leu Pro Phe Val Tyr Gly 50 55 60Asp Ile Pro Pro Gly Met Val Ser Glu Pro Leu Glu Asp Leu Asp Pro65 70 75 80Tyr Tyr Ala Asp Lys Lys Thr Phe Ile Val Leu Asn Lys Gly Lys Thr 85 90 95Ile Phe Arg Phe Asn Ala Thr Pro Ala Leu Tyr Met Leu Ser Pro Phe 100 105 110Ser Pro Leu Arg Arg Ile Ser Ile Lys Ile Leu Val His Ser Leu Phe 115 120 125Ser Met Leu Ile Met Cys Thr Ile Leu Thr Asn Cys Ile Phe Met Thr 130 135 140Met Asn Asn Pro Pro Asp Trp Thr Lys Asn Val Glu Tyr Thr Phe Thr145 150 155 160Gly Ile Tyr Thr Phe Glu Ser Leu Val Lys Ile Leu Ala Arg Gly Phe 165 170 175Cys Val Gly Glu Phe Thr Phe Leu Arg Asp Pro Trp Asn Trp Leu Asp 180 185 190Phe Val Val Ile Val Phe Ala Tyr Leu Thr Glu Phe Val Asn Leu Gly 195 200 205Asn Val Ser Ala Leu Arg Thr Phe Arg Val Leu Arg Ala Leu Lys Thr 210 215 
220Ile Ser Val Ile Pro Gly Leu Lys Thr Ile Val Gly Ala Leu Ile Gln225 230 235 240Ser Val Lys Lys Leu Ser Asp Val Met Ile Leu Thr Val Phe Cys Leu 245 250 255Ser Val Phe Ala Leu Ile Gly Leu Gln Leu Phe Met Gly Asn Leu Lys 260 265 270His Lys Cys Phe Arg Asn Ser Leu Glu Asn Asn Glu Thr Leu Glu Ser 275 280 285Ile Met Asn Thr Leu Glu Ser Glu Glu Asp Phe Arg Lys Tyr Phe Tyr 290 295 300Tyr Leu Glu Gly Ser Lys Asp Ala Leu Leu Cys Gly Phe Ser Thr Asp305 310 315 320Ser Gly Gln Cys Pro Glu Gly Tyr Thr Cys Val Lys Ile Gly Arg Asn 325 330 335Pro Asp Tyr Gly Tyr Thr Ser Phe Asp Thr Phe Ser Trp Ala Phe Leu 340 345 350Ala Leu Phe Arg Leu Met Thr Gln Asp Tyr Trp Glu Asn Leu Tyr Gln 355 360 365Gln Thr Leu Arg Ala Ala Gly Lys Thr Tyr Met Ile Phe Phe Val Val 370 375 380Val Ile Phe Leu Gly Ser Phe Tyr Leu Ile Asn Leu Ile Leu Ala Val385 390 395 400Val Ala Met Ala Tyr Glu Glu Gln Asn Gln Ala Asn Ile Glu Glu Ala 405 410 415Lys Gln Lys Glu Leu Glu Phe Gln Gln Met Leu Asp Arg Leu Lys Lys 420 425 430Glu Gln Glu Glu Ala Glu Ala Ile Ala Ala Ala Ala Ala Glu Tyr Thr 435 440 445Ser Ile Arg Arg Ser Arg Ile Met Gly Leu Ser Glu Ser Ser Ser Glu 450 455 460Thr Ser Lys Leu Ser Ser Lys Ser Ala Lys Glu Arg Arg Asn Arg Arg465 470 475 480Lys Lys Lys Asn Gln Lys Lys Leu Ser Ser Gly Glu Glu Lys Gly Asp 485 490 495Ala Glu Lys Leu Ser Lys Ser Glu Ser Glu Asp Ser Ile Arg Arg Lys 500 505 510Ser Phe His Leu Gly Val Glu Gly His Arg Arg Ala His Glu Lys Arg 515 520 525Leu Ser Thr Pro Asn Gln Ser Pro Leu Ser Ile Arg Gly Ser Leu Phe 530 535 540Ser Ala Arg Arg Ser Ser Arg Thr Ser Leu Phe Ser Phe Lys Gly Arg545 550 555 560Gly Arg Asp Ile Gly Ser Glu Thr Glu Phe Ala Asp Asp Glu His Ser 565 570 575Ile Phe Gly Asp Asn Glu Ser Arg Arg Gly Ser Leu Phe Val Pro His 580 585 590Arg Pro Gln Glu Arg Arg Ser Ser Asn Ile Ser Gln Ala Ser Arg Ser 595 600 605Pro Pro Met Leu Pro Val Asn Gly Lys Met His Ser Ala Val Asp Cys 610 615 620Asn Gly Val Val Ser Leu Val Asp Gly Arg Ser Ala Leu Met Leu Pro625 630 635 640Asn Gly Gln Leu Leu Pro Glu 
Gly Thr Thr Asn Gln Ile His Lys Lys 645 650 655Arg Arg Cys Ser Ser Tyr Leu Leu Ser Glu Asp Met Leu Asn Asp Pro 660 665 670Asn Leu Arg Gln Arg Ala Met Ser Arg Ala Ser Ile Leu Thr Asn Thr 675 680 685Val Glu Glu Leu Glu Glu Ser Arg Gln Lys Cys Pro Pro Trp Trp Tyr 690 695 700Arg Phe Ala His Lys Phe Leu Ile Trp Asn Cys Ser Pro Tyr Trp Ile705 710 715 720Lys Phe Lys Lys Cys Ile Tyr Phe Ile Val Met Asp Pro Phe Val Asp 725 730 735Leu Ala Ile Thr Ile Cys Ile Val Leu Asn Thr Leu Phe Met Ala Met 740 745 750Glu His His Pro Met Thr Glu Glu Phe Lys Asn Val Leu Ala Ile Gly 755 760 765Asn Leu Val Phe Thr Gly Ile Phe Ala Ala Glu Met Val Leu Lys Leu 770 775 780Ile Ala Met Asp Pro Tyr Glu Tyr Phe Gln Val Gly Trp Asn Ile Phe785 790 795 800Asp Ser Leu Ile Val Thr Leu Ser Leu Val Glu Leu Phe Leu Ala Asp 805 810 815Val Glu Gly Leu Ser Val Leu Arg Ser Phe Arg Leu Leu Arg Val Phe 820 825 830Lys Leu Ala Lys Ser Trp Pro Thr Leu Asn Met Leu Ile Lys Ile Ile 835 840 845Gly Asn Ser Val Gly Ala Leu Gly Asn Leu Thr Leu Val Leu Ala Ile 850 855 860Ile Val Phe Ile Phe Ala Val Val Gly Met Gln Leu Phe Gly Lys Ser865 870 875 880Tyr Lys Glu Cys Val Cys Lys Ile Asn Asp Asp Cys Thr Leu Pro Arg 885 890 895Trp His Met Asn Asp Phe Phe His Ser Phe Leu Ile Val Phe Arg Val 900 905 910Leu Cys Gly Glu Trp Ile Glu Thr Met Trp Asp Cys Met Glu Val Ala 915 920 925Gly Gln Ala Met Cys Leu Ile Val Tyr Met Met Val Met Val Ile Gly 930 935 940Asn Leu Val Val Leu Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe945 950 955 960Ser Ser Asp Asn Leu Thr Ala Ile Glu Glu Asp Pro Asp Ala Asn Asn 965 970 975Leu Gln Ile Ala Val Thr Arg Ile Lys Lys Gly Ile Asn Tyr Val Lys 980 985 990Gln Thr Leu Arg Glu Phe Ile Leu Lys Ala Phe Ser Lys Lys Pro Lys 995 1000 1005Ile Ser Arg Glu Ile Arg Gln Ala Glu Asp Leu Asn Thr Lys Lys 1010 1015 1020Glu Asn Tyr Ile Ser Asn His Thr Leu Ala Glu Met Ser Lys Gly 1025 1030 1035His Asn Phe Leu Lys Glu Lys Asp Lys Ile Ser Gly Phe Gly Ser 1040 1045 1050Ser Val Asp Lys His Leu Met Glu Asp Ser Asp Gly Gln Ser Phe 1055 
1060 1065Ile His Asn Pro Ser Leu Thr Val Thr Val Pro Ile Ala Pro Gly 1070 1075 1080Glu Ser Asp Leu Glu Asn Met Asn Ala Glu Glu Leu Ser Ser Asp 1085 1090 1095Ser Asp Ser Glu Tyr Ser Lys Val Arg Leu Asn Arg Ser Ser Ser 1100 1105 1110Ser Glu Cys Ser Thr Val Asp Asn Pro Leu Pro Gly Glu Gly Glu 1115 1120 1125Glu Ala Glu Ala Glu Pro Met Asn Ser Asp Glu Pro Glu Ala Cys 1130 1135 1140Phe Thr Asp Gly Cys Val Arg Arg Phe Ser Cys Cys Gln Val Asn 1145 1150 1155Ile Glu Ser Gly Lys Gly Lys Ile Trp Trp Asn Ile Arg Lys Thr 1160 1165 1170Cys Tyr Lys Ile Val Glu His Ser Trp Phe Glu Ser Phe Ile Val 1175 1180 1185Leu Met Ile Leu Leu Ser Ser Gly Ala Leu Ala Phe Glu Asp Ile 1190 1195 1200Tyr Ile Glu Arg Lys Lys Thr Ile Lys Ile Ile Leu Glu Tyr Ala 1205 1210 1215Asp Lys Ile Phe Thr Tyr Ile Phe Ile Leu Glu Met Leu Leu Lys 1220 1225 1230Trp Ile Ala Tyr Gly Tyr Lys Thr Tyr Phe Thr Asn Ala Trp Cys 1235 1240 1245Trp Leu Asp Phe Leu Ile Val Asp Val Ser Leu Val Thr Leu Val 1250 1255 1260Ala Asn Thr Leu Gly Tyr Ser Asp Leu Gly Pro Ile Lys Ser Leu 1265 1270 1275Arg Thr Leu Arg Ala Leu Arg Pro Leu Arg Ala Leu Ser Arg Phe 1280 1285 1290Glu Gly Met Arg Val Val Val Asn Ala Leu Ile Gly Ala Ile Pro 1295 1300 1305Ser Ile Met Asn Val Leu Leu Val Cys Leu Ile Phe Trp Leu Ile 1310 1315 1320Phe Ser Ile Met Gly Val Asn Leu Phe Ala Gly Lys Phe Tyr Glu 1325 1330 1335Cys Ile Asn Thr Thr Asp Gly Ser Arg Phe Pro Ala Ser Gln Val 1340 1345 1350Pro Asn Arg Ser Glu Cys Phe Ala Leu Met Asn Val Ser Gln Asn 1355 1360 1365Val Arg Trp Lys Asn Leu Lys Val Asn Phe Asp Asn Val Gly Leu 1370 1375 1380Gly Tyr Leu Ser Leu Leu Gln Val Ala Thr Phe Lys Gly Trp Thr 1385 1390 1395Ile Ile Met Tyr Ala Ala Val Asp Ser Val Asn Val Asp Lys Gln 1400 1405 1410Pro Lys Tyr Glu Tyr Ser Leu Tyr Met Tyr Ile Tyr Phe Val Val 1415 1420 1425Phe Ile Ile Phe Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gly 1430 1435 1440Val Ile Ile Asp Asn Phe Asn Gln Gln Lys Lys Lys Leu Gly Gly 1445 1450 1455Gln Asp Ile Phe Met Thr Glu Glu Gln Lys Lys Tyr Tyr Asn Ala 1460 
1465 1470Met Lys Lys Leu Gly Ser Lys Lys Pro Gln Lys Pro Ile Pro Arg 1475

1480 1485Pro Gly Asn Lys Ile Gln Gly Cys Ile Phe Asp Leu Val Thr Asn 1490 1495 1500Gln Ala Phe Asp Ile Ser Ile Met Val Leu Ile Cys Leu Asn Met 1505 1510 1515Val Thr Met Met Val Glu Lys Glu Gly Gln Ser Gln His Met Thr 1520 1525 1530Glu Val Leu Tyr Trp Ile Asn Val Val Phe Ile Ile Leu Phe Thr 1535 1540 1545Gly Glu Cys Val Leu Lys Leu Ile Ser Leu Arg His Tyr Tyr Phe 1550 1555 1560Thr Val Gly Trp Asn Ile Phe Asp Phe Val Val Val Ile Ile Ser 1565 1570 1575Ile Val Gly Met Phe Leu Ala Asp Leu Ile Glu Thr Tyr Phe Val 1580 1585 1590Ser Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Arg Ile Gly Arg 1595 1600 1605Ile Leu Arg Leu Val Lys Gly Ala Lys Gly Ile Arg Thr Leu Leu 1610 1615 1620Phe Ala Leu Met Met Ser Leu Pro Ala Leu Phe Asn Ile Gly Leu 1625 1630 1635Leu Leu Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gly Met Ser 1640 1645 1650Asn Phe Ala Tyr Val Lys Lys Glu Asp Gly Ile Asn Asp Met Phe 1655 1660 1665Asn Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gln Ile 1670 1675 1680Thr Thr Ser Ala Gly Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn 1685 1690 1695Ser Lys Pro Pro Asp Cys Asp Pro Lys Lys Val His Pro Gly Ser 1700 1705 1710Ser Val Glu Gly Asp Cys Gly Asn Pro Ser Val Gly Ile Phe Tyr 1715 1720 1725Phe Val Ser Tyr Ile Ile Ile Ser Phe Leu Val Val Val Asn Met 1730 1735 1740Tyr Ile Ala Val Ile Leu Glu Asn Phe Ser Val Ala Thr Glu Glu 1745 1750 1755Ser Thr Glu Pro Leu Ser Glu Asp Asp Phe Glu Met Phe Tyr Glu 1760 1765 1770Val Trp Glu Lys Phe Asp Pro Asp Ala Thr Gln Phe Ile Glu Phe 1775 1780 1785Ser Lys Leu Ser Asp Phe Ala Ala Ala Leu Asp Pro Pro Leu Leu 1790 1795 1800Ile Ala Lys Pro Asn Lys Val Gln Leu Ile Ala Met Asp Leu Pro 1805 1810 1815Met Val Ser Gly Asp Arg Ile His Cys Leu Asp Ile Leu Phe Ala 1820 1825 1830Phe Thr Lys Arg Val Leu Gly Glu Ser Gly Glu Met Asp Ser Leu 1835 1840 1845Arg Ser Gln Met Glu Glu Arg Phe Met Ser Ala Asn Pro Ser Lys 1850 1855 1860Val Ser Tyr Glu Pro Ile Thr Thr Thr Leu Lys Arg Lys Gln Glu 1865 1870 1875Asp Val Ser Ala Thr Val Ile Gln Arg Ala Tyr Arg Arg Tyr Arg 1880 
1885 1890Leu Arg Gln Asn Val Lys Asn Ile Ser Ser Ile Tyr Ile Lys Asp 1895 1900 1905Gly Asp Arg Asp Asp Asp Leu Leu Asn Lys Lys Asp Met Ala Phe 1910 1915 1920Asp Asn Val Asn Glu Asn Ser Ser Pro Glu Lys Thr Asp Ala Thr 1925 1930 1935Ser Ser Thr Thr Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys Pro 1940 1945 1950Asp Lys Glu Lys Tyr Glu Gln Asp Arg Thr Glu Lys Glu Asp Lys 1955 1960 1965Gly Lys Asp Ser Lys Glu Ser Lys Lys 1970 197531977PRTHomo sapiens 3Met Ala Met Leu Pro Pro Pro Gly Pro Gln Ser Phe Val His Phe Thr1 5 10 15Lys Gln Ser Leu Ala Leu Ile Glu Gln Arg Ile Ala Glu Arg Lys Ser 20 25 30Lys Glu Pro Lys Glu Glu Lys Lys Asp Asp Asp Glu Glu Ala Pro Lys 35 40 45Pro Ser Ser Asp Leu Glu Ala Gly Lys Gln Leu Pro Phe Ile Tyr Gly 50 55 60Asp Ile Pro Pro Gly Met Val Ser Glu Pro Leu Glu Asp Leu Asp Pro65 70 75 80Tyr Tyr Ala Asp Lys Lys Thr Phe Ile Val Leu Asn Lys Gly Lys Thr 85 90 95Ile Phe Arg Phe Asn Ala Thr Pro Ala Leu Tyr Met Leu Ser Pro Phe 100 105 110Ser Pro Leu Arg Arg Ile Ser Ile Lys Ile Leu Val His Ser Leu Phe 115 120 125Ser Met Leu Ile Met Cys Thr Ile Leu Thr Asn Cys Ile Phe Met Thr 130 135 140Met Asn Asn Pro Gln Asp Trp Thr Lys Asn Val Glu Tyr Thr Phe Thr145 150 155 160Gly Ile Tyr Thr Phe Glu Ser Leu Val Lys Ile Leu Ala Arg Gly Phe 165 170 175Cys Val Gly Glu Phe Thr Phe Leu Arg Asp Pro Trp Asn Trp Leu Asp 180 185 190Phe Val Val Ile Val Phe Ala Tyr Leu Thr Glu Phe Val Asn Leu Gly 195 200 205Asn Val Ser Ala Leu Arg Thr Phe Arg Val Leu Arg Ala Leu Lys Thr 210 215 220Ile Ser Val Ile Pro Gly Leu Lys Thr Ile Val Gly Ala Leu Ile Gln225 230 235 240Ser Val Lys Lys Leu Ser Asp Val Met Ile Leu Thr Val Phe Cys Leu 245 250 255Ser Val Phe Ala Leu Ile Gly Leu Gln Leu Phe Met Gly Asn Leu Lys 260 265 270His Lys Cys Phe Arg Asn Ser Leu Glu Asn Asn Glu Thr Leu Glu Ser 275 280 285Ile Met Asn Thr Leu Glu Ser Glu Glu Asp Phe Arg Lys Tyr Phe Tyr 290 295 300Tyr Leu Glu Gly Ser Lys Asp Ala Leu Leu Cys Gly Phe Ser Thr Asp305 310 315 320Ser Gly Gln Cys Pro Glu Gly Tyr Thr Cys Val Lys Ile Gly Arg 
Asn 325 330 335Pro Asp Tyr Gly Tyr Thr Ser Phe Asp Thr Phe Ser Trp Ala Phe Leu 340 345 350Ala Leu Phe Arg Leu Met Thr Gln Asp Tyr Trp Glu Asn Leu Tyr Gln 355 360 365Gln Thr Leu Arg Ala Ala Gly Lys Thr Tyr Met Ile Phe Phe Val Val 370 375 380Val Ile Phe Leu Gly Ser Phe Tyr Leu Ile Asn Leu Ile Leu Ala Val385 390 395 400Val Ala Met Ala Tyr Glu Glu Gln Asn Gln Ala Asn Ile Glu Glu Ala 405 410 415Lys Gln Lys Glu Leu Glu Phe Gln Gln Met Leu Asp Arg Leu Lys Lys 420 425 430Glu Gln Glu Glu Ala Glu Ala Ile Ala Ala Ala Ala Ala Glu Tyr Thr 435 440 445Ser Ile Arg Arg Ser Arg Ile Met Gly Leu Ser Glu Ser Ser Ser Glu 450 455 460Thr Ser Lys Leu Ser Ser Lys Ser Ala Lys Glu Arg Arg Asn Arg Arg465 470 475 480Lys Lys Lys Asn Gln Lys Lys Leu Ser Ser Gly Glu Glu Lys Gly Asp 485 490 495Ala Glu Lys Leu Ser Lys Ser Glu Ser Glu Asp Ser Ile Arg Arg Lys 500 505 510Ser Phe His Leu Gly Val Glu Gly His Arg Arg Ala His Glu Lys Arg 515 520 525Leu Ser Thr Pro Asn Gln Ser Pro Leu Ser Ile Arg Gly Ser Leu Phe 530 535 540Ser Ala Arg Arg Ser Ser Arg Thr Ser Leu Phe Ser Phe Lys Gly Arg545 550 555 560Gly Arg Asp Ile Gly Ser Glu Thr Glu Phe Ala Asp Asp Glu His Ser 565 570 575Ile Phe Gly Asp Asn Glu Ser Arg Arg Gly Ser Leu Phe Val Pro His 580 585 590Arg Pro Gln Glu Arg Arg Ser Ser Asn Ile Ser Gln Ala Ser Arg Ser 595 600 605Pro Pro Met Leu Pro Val Asn Gly Lys Met His Ser Ala Val Asp Cys 610 615 620Asn Gly Val Val Ser Leu Val Asp Gly Arg Ser Ala Leu Met Leu Pro625 630 635 640Asn Gly Gln Leu Leu Pro Glu Gly Thr Thr Asn Gln Ile His Lys Lys 645 650 655Arg Arg Cys Ser Ser Tyr Leu Leu Ser Glu Asp Met Leu Asn Asp Pro 660 665 670Asn Leu Arg Gln Arg Ala Met Ser Arg Ala Ser Ile Leu Thr Asn Thr 675 680 685Val Glu Glu Leu Glu Glu Ser Arg Gln Lys Cys Pro Pro Trp Trp Tyr 690 695 700Arg Phe Ala His Lys Phe Leu Ile Trp Asn Cys Ser Pro Tyr Trp Ile705 710 715 720Lys Phe Lys Lys Cys Ile Tyr Phe Ile Val Met Asp Pro Phe Val Asp 725 730 735Leu Ala Ile Thr Ile Cys Ile Val Leu Asn Thr Leu Phe Met Ala Met 740 745 750Glu His His Pro Met 
Thr Glu Glu Phe Lys Asn Val Leu Ala Ile Gly 755 760 765Asn Leu Val Phe Thr Gly Ile Phe Ala Ala Glu Met Val Leu Lys Leu 770 775 780Ile Ala Met Asp Pro Tyr Glu Tyr Phe Gln Val Gly Trp Asn Ile Phe785 790 795 800Asp Ser Leu Ile Val Thr Leu Ser Leu Val Glu Leu Phe Leu Ala Asp 805 810 815Val Glu Gly Leu Ser Val Leu Arg Ser Phe Arg Leu Leu Arg Val Phe 820 825 830Lys Leu Ala Lys Ser Trp Pro Thr Leu Asn Met Leu Ile Lys Ile Ile 835 840 845Gly Asn Ser Val Gly Ala Leu Gly Asn Leu Thr Leu Val Leu Ala Ile 850 855 860Ile Val Phe Ile Phe Ala Val Val Gly Met Gln Leu Phe Gly Lys Ser865 870 875 880Tyr Lys Glu Cys Val Cys Lys Ile Asn Asp Asp Cys Thr Leu Pro Arg 885 890 895Trp His Met Asn Asp Phe Phe His Ser Phe Leu Ile Val Phe Arg Val 900 905 910Leu Cys Gly Glu Trp Ile Glu Thr Met Trp Asp Cys Met Glu Val Ala 915 920 925Gly Gln Ala Met Cys Leu Ile Val Tyr Met Met Val Met Val Ile Gly 930 935 940Asn Leu Val Val Leu Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe945 950 955 960Ser Ser Asp Asn Leu Thr Ala Ile Glu Glu Asp Pro Asp Ala Asn Asn 965 970 975Leu Gln Ile Ala Val Thr Arg Ile Lys Lys Gly Ile Asn Tyr Val Lys 980 985 990Gln Thr Leu Arg Glu Phe Ile Leu Lys Ala Phe Ser Lys Lys Pro Lys 995 1000 1005Ile Ser Arg Glu Ile Arg Gln Ala Glu Asp Leu Asn Thr Lys Lys 1010 1015 1020Glu Asn Tyr Ile Ser Asn His Thr Leu Ala Glu Met Ser Lys Gly 1025 1030 1035His Asn Phe Leu Lys Glu Lys Asp Lys Ile Ser Gly Phe Gly Ser 1040 1045 1050Ser Val Asp Lys His Leu Met Glu Asp Ser Asp Gly Gln Ser Phe 1055 1060 1065Ile His Asn Pro Ser Leu Thr Val Thr Val Pro Ile Ala Pro Gly 1070 1075 1080Glu Ser Asp Leu Glu Asn Met Asn Ala Glu Glu Leu Ser Ser Asp 1085 1090 1095Ser Asp Ser Glu Tyr Ser Lys Val Arg Leu Asn Arg Ser Ser Ser 1100 1105 1110Ser Glu Cys Ser Thr Val Asp Asn Pro Leu Pro Gly Glu Gly Glu 1115 1120 1125Glu Ala Glu Ala Glu Pro Met Asn Ser Asp Glu Pro Glu Ala Cys 1130 1135 1140Phe Thr Asp Gly Cys Val Arg Arg Phe Ser Cys Cys Gln Val Asn 1145 1150 1155Ile Glu Ser Gly Lys Gly Lys Ile Trp Trp Asn Ile Arg Lys Thr 1160 
1165 1170Cys Tyr Lys Ile Val Glu His Ser Trp Phe Glu Ser Phe Ile Val 1175 1180 1185Leu Met Ile Leu Leu Ser Ser Gly Ala Leu Ala Phe Glu Asp Ile 1190 1195 1200Tyr Ile Glu Arg Lys Lys Thr Ile Lys Ile Ile Leu Glu Tyr Ala 1205 1210 1215Asp Lys Ile Phe Thr Tyr Ile Phe Ile Leu Glu Met Leu Leu Lys 1220 1225 1230Trp Ile Ala Tyr Gly Tyr Lys Thr Tyr Phe Thr Asn Ala Trp Cys 1235 1240 1245Trp Leu Asp Phe Leu Ile Val Asp Val Ser Leu Val Thr Leu Val 1250 1255 1260Ala Asn Thr Leu Gly Tyr Ser Asp Leu Gly Pro Ile Lys Ser Leu 1265 1270 1275Arg Thr Leu Arg Ala Leu Arg Pro Leu Arg Ala Leu Ser Arg Phe 1280 1285 1290Glu Gly Met Arg Val Val Val Asn Ala Leu Ile Gly Ala Ile Pro 1295 1300 1305Ser Ile Met Asn Val Leu Leu Val Cys Leu Ile Phe Trp Leu Ile 1310 1315 1320Phe Ser Ile Met Gly Val Asn Leu Phe Ala Gly Lys Phe Tyr Glu 1325 1330 1335Cys Ile Asn Thr Thr Asp Gly Ser Arg Phe Pro Ala Ser Gln Val 1340 1345 1350Pro Asn Arg Ser Glu Cys Phe Ala Leu Met Asn Val Ser Gln Asn 1355 1360 1365Val Arg Trp Lys Asn Leu Lys Val Asn Phe Asp Asn Val Gly Leu 1370 1375 1380Gly Tyr Leu Ser Leu Leu Gln Val Ala Thr Phe Lys Gly Trp Thr 1385 1390 1395Ile Ile Met Tyr Ala Ala Val Asp Ser Val Asn Val Asp Lys Gln 1400 1405 1410Pro Lys Tyr Glu Tyr Ser Leu Tyr Met Tyr Ile Tyr Phe Val Val 1415 1420 1425Phe Ile Ile Phe Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gly 1430 1435 1440Val Ile Ile Asp Asn Phe Asn Gln Gln Lys Lys Lys Leu Gly Gly 1445 1450 1455Gln Asp Ile Phe Met Thr Glu Glu Gln Lys Lys Tyr Tyr Asn Ala 1460 1465 1470Met Lys Lys Leu Gly Ser Lys Lys Pro Gln Lys Pro Ile Pro Arg 1475 1480 1485Pro Gly Asn Lys Ile Gln Gly Cys Ile Phe Asp Leu Val Thr Asn 1490 1495 1500Gln Ala Phe Asp Ile Ser Ile Met Val Leu Ile Cys Leu Asn Met 1505 1510 1515Val Thr Met Met Val Glu Lys Glu Gly Gln Ser Gln His Met Thr 1520 1525 1530Glu Val Leu Tyr Trp Ile Asn Val Val Phe Ile Ile Leu Phe Thr 1535 1540 1545Gly Glu Cys Val Leu Lys Leu Ile Ser Leu Arg His Tyr Tyr Phe 1550 1555 1560Thr Val Gly Trp Asn Ile Phe Asp Phe Val Val Val Ile Ile Ser 1565 
1570 1575Ile Val Gly Met Phe Leu Ala Asp Leu Ile Glu Thr Tyr Phe Val 1580 1585 1590Ser Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Arg Ile Gly Arg 1595 1600 1605Ile Leu Arg Leu Val Lys Gly Ala Lys Gly Ile Arg Thr Leu Leu 1610 1615 1620Phe Ala Leu Met Met Ser Leu Pro Ala Leu Phe Asn Ile Gly Leu 1625 1630 1635Leu Leu Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gly Met Ser 1640 1645 1650Asn Phe Ala Tyr Val Lys Lys Glu Asp Gly Ile Asn Asp Met Phe 1655 1660 1665Asn Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gln Ile 1670 1675 1680Thr Thr Ser Ala Gly Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn 1685 1690 1695Ser Lys Pro Pro Asp Cys Asp Pro Lys Lys Val His Pro Gly Ser 1700 1705 1710Ser Val Glu Gly Asp Cys Gly Asn Pro Ser Val Gly Ile Phe Tyr 1715 1720 1725Phe Val Ser Tyr Ile Ile Ile Ser Phe Leu Val Val Val Asn Met 1730 1735 1740Tyr Ile Ala Val Ile Leu Glu Asn Phe Ser Val Ala Thr Glu Glu 1745 1750 1755Ser Thr Glu Pro Leu Ser Glu Asp Asp Phe Glu Met Phe Tyr Glu 1760 1765 1770Val Trp Glu Lys Phe Asp Pro Asp Ala Thr Gln Phe Ile Glu Phe 1775 1780 1785Ser Lys Leu Ser Asp Phe Ala Ala Ala Leu Asp Pro Pro Leu Leu 1790 1795 1800Ile Ala Lys Pro Asn Lys Val Gln Leu Ile Ala Met Asp Leu Pro 1805 1810 1815Met Val Ser Gly Asp Arg Ile His Cys Leu Asp Ile Leu Phe Ala 1820 1825 1830Phe Thr Lys Arg Val Leu Gly Glu Ser Gly Glu Met Asp Ser Leu 1835 1840 1845Arg Ser Gln Met Glu Glu Arg Phe Met Ser Ala Asn Pro Ser Lys 1850 1855 1860Val Ser Tyr Glu Pro Ile Thr Thr Thr Leu Lys Arg Lys Gln Glu 1865 1870 1875Asp Val Ser Ala Thr Val Ile Gln Arg Ala Tyr Arg Arg Tyr Arg 1880 1885 1890Leu Arg Gln Asn Val Lys Asn Ile Ser Ser Ile Tyr Ile Lys Asp 1895 1900 1905Gly Asp Arg Asp Asp Asp Leu Leu Asn Lys Lys Asp Met Ala Phe 1910 1915 1920Asp Asn Val Asn Glu Asn Ser Ser Pro Glu Lys Thr Asp Ala Thr 1925 1930 1935Ser Ser Thr Thr Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys Pro 1940 1945 1950Asp

Lys Glu Lys Tyr Glu Gln Asp Arg Thr Glu Lys Glu Asp Lys 1955 1960 1965Gly Lys Asp Ser Lys Glu Ser Lys Lys 1970 197541977PRTHomo sapiens 4Met Ala Met Leu Pro Pro Pro Gly Pro Gln Ser Phe Val His Phe Thr1 5 10 15Lys Gln Ser Leu Ala Leu Ile Glu Gln Arg Ile Ala Glu Arg Lys Ser 20 25 30Lys Glu Pro Lys Glu Glu Lys Lys Asp Asp Asp Glu Glu Ala Pro Lys 35 40 45Pro Ser Ser Asp Leu Glu Ala Gly Lys Gln Leu Pro Phe Ile Tyr Gly 50 55 60Asp Ile Pro Pro Gly Met Val Ser Glu Pro Leu Glu Asp Leu Asp Pro65 70 75 80Tyr Tyr Ala Asp Lys Lys Thr Phe Ile Val Leu Asn Lys Gly Lys Thr 85 90 95Ile Phe Arg Phe Asn Ala Thr Pro Ala Leu Tyr Met Leu Ser Pro Phe 100 105 110Ser Pro Leu Arg Arg Ile Ser Ile Lys Ile Leu Val His Ser Leu Phe 115 120 125Ser Met Leu Ile Met Cys Thr Ile Leu Thr Asn Cys Ile Phe Met Thr 130 135 140Met Asn Asn Pro Pro Asp Trp Thr Lys Asn Val Glu Tyr Thr Phe Thr145 150 155 160Gly Ile Tyr Thr Phe Glu Ser Leu Val Lys Ile Leu Ala Arg Gly Phe 165 170 175Cys Val Gly Glu Phe Thr Phe Leu Arg Asp Pro Trp Asn Trp Leu Asp 180 185 190Phe Val Val Ile Val Phe Ala Tyr Leu Thr Glu Phe Val Asn Leu Gly 195 200 205Asn Val Ser Ala Leu Arg Thr Phe Arg Val Leu Arg Ala Leu Lys Thr 210 215 220Ile Ser Val Ile Pro Gly Leu Lys Thr Ile Val Gly Ala Leu Ile Gln225 230 235 240Ser Val Lys Lys Leu Ser Asp Val Met Ile Leu Thr Val Phe Cys Leu 245 250 255Ser Val Phe Ala Leu Ile Gly Leu Gln Leu Phe Met Gly Asn Leu Lys 260 265 270His Lys Cys Phe Arg Asn Ser Leu Glu Asn Asn Glu Thr Leu Glu Ser 275 280 285Ile Met Asn Thr Leu Glu Ser Glu Glu Asp Phe Arg Lys Tyr Phe Tyr 290 295 300Tyr Leu Glu Gly Ser Lys Asp Ala Leu Leu Cys Gly Phe Ser Thr Asp305 310 315 320Ser Gly Gln Cys Pro Glu Gly Tyr Thr Cys Val Lys Ile Gly Arg Asn 325 330 335Pro Asp Tyr Gly Tyr Thr Ser Phe Asp Thr Phe Ser Trp Ala Phe Leu 340 345 350Ala Leu Phe Arg Leu Met Thr Gln Asp Tyr Trp Glu Asn Leu Tyr Gln 355 360 365Gln Thr Leu Arg Ala Ala Gly Lys Thr Tyr Met Ile Phe Phe Val Val 370 375 380Val Ile Phe Leu Gly Ser Phe Tyr Leu Ile Asn Leu Ile Leu Ala Val385 
390 395 400Val Ala Met Ala Tyr Glu Glu Gln Asn Gln Ala Asn Ile Glu Glu Ala 405 410 415Lys Gln Lys Glu Leu Glu Phe Gln Gln Met Leu Asp Arg Leu Lys Lys 420 425 430Glu Gln Glu Glu Ala Glu Ala Ile Ala Ala Ala Ala Ala Glu Tyr Thr 435 440 445Ser Ile Arg Arg Ser Arg Ile Met Gly Leu Ser Glu Ser Ser Ser Glu 450 455 460Thr Ser Lys Leu Ser Ser Lys Ser Ala Lys Glu Arg Arg Asn Arg Arg465 470 475 480Lys Lys Lys Asn Gln Lys Lys Leu Ser Ser Gly Glu Glu Lys Gly Asp 485 490 495Ala Glu Lys Leu Ser Lys Ser Glu Ser Glu Asp Ser Ile Arg Arg Lys 500 505 510Ser Phe His Leu Gly Val Glu Gly His Arg Arg Ala His Glu Lys Arg 515 520 525Leu Ser Thr Pro Asn Gln Ser Pro Leu Ser Ile Arg Gly Ser Leu Phe 530 535 540Ser Ala Arg Arg Ser Ser Arg Thr Ser Leu Phe Ser Phe Lys Gly Arg545 550 555 560Gly Arg Asp Ile Gly Ser Glu Thr Glu Phe Ala Asp Asp Glu His Ser 565 570 575Ile Phe Gly Asp Asn Glu Ser Arg Arg Gly Ser Leu Phe Val Pro His 580 585 590Arg Pro Gln Glu Arg Arg Ser Ser Asn Ile Ser Gln Ala Ser Arg Ser 595 600 605Pro Pro Met Leu Pro Val Asn Gly Lys Met His Ser Ala Val Asp Cys 610 615 620Asn Gly Val Val Ser Leu Val Asp Gly Arg Ser Ala Leu Met Leu Pro625 630 635 640Tyr Gly Gln Leu Leu Pro Glu Gly Thr Thr Asn Gln Ile His Lys Lys 645 650 655Arg Arg Cys Ser Ser Tyr Leu Leu Ser Glu Asp Met Leu Asn Asp Pro 660 665 670Asn Leu Arg Gln Arg Ala Met Ser Arg Ala Ser Ile Leu Thr Asn Thr 675 680 685Val Glu Glu Leu Glu Glu Ser Arg Gln Lys Cys Pro Pro Trp Trp Tyr 690 695 700Arg Phe Ala His Lys Phe Leu Ile Trp Asn Cys Ser Pro Tyr Trp Ile705 710 715 720Lys Phe Lys Lys Cys Ile Tyr Phe Ile Val Met Asp Pro Phe Val Asp 725 730 735Leu Ala Ile Thr Ile Cys Ile Val Leu Asn Thr Leu Phe Met Ala Met 740 745 750Glu His His Pro Met Thr Glu Glu Phe Lys Asn Val Leu Ala Ile Gly 755 760 765Asn Leu Val Phe Thr Gly Ile Phe Ala Ala Glu Met Val Leu Lys Leu 770 775 780Ile Ala Met Asp Pro Tyr Glu Tyr Phe Gln Val Gly Trp Asn Ile Phe785 790 795 800Asp Ser Leu Ile Val Thr Leu Ser Leu Val Glu Leu Phe Leu Ala Asp 805 810 815Val Glu Gly Leu Ser Val 
Leu Arg Ser Phe Arg Leu Leu Arg Val Phe 820 825 830Lys Leu Ala Lys Ser Trp Pro Thr Leu Asn Met Leu Ile Lys Ile Ile 835 840 845Gly Asn Ser Val Gly Ala Leu Gly Asn Leu Thr Leu Val Leu Ala Ile 850 855 860Ile Val Phe Ile Phe Ala Val Val Gly Met Gln Leu Phe Gly Lys Ser865 870 875 880Tyr Lys Glu Cys Val Cys Lys Ile Asn Asp Asp Cys Thr Leu Pro Arg 885 890 895Trp His Met Asn Asp Phe Phe His Ser Phe Leu Ile Val Phe Arg Val 900 905 910Leu Cys Gly Glu Trp Ile Glu Thr Met Trp Asp Cys Met Glu Val Ala 915 920 925Gly Gln Ala Met Cys Leu Ile Val Tyr Met Met Val Met Val Ile Gly 930 935 940Asn Leu Val Val Leu Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe945 950 955 960Ser Ser Asp Asn Leu Thr Ala Ile Glu Glu Asp Pro Asp Ala Asn Asn 965 970 975Leu Gln Ile Ala Val Thr Arg Ile Lys Lys Gly Ile Asn Tyr Val Lys 980 985 990Gln Thr Leu Arg Glu Phe Ile Leu Lys Ala Phe Ser Lys Lys Pro Lys 995 1000 1005Ile Ser Arg Glu Ile Arg Gln Ala Glu Asp Leu Asn Thr Lys Lys 1010 1015 1020Glu Asn Tyr Ile Ser Asn His Thr Leu Ala Glu Met Ser Lys Gly 1025 1030 1035His Asn Phe Leu Lys Glu Lys Asp Lys Ile Ser Gly Phe Gly Ser 1040 1045 1050Ser Val Asp Lys His Leu Met Glu Asp Ser Asp Gly Gln Ser Phe 1055 1060 1065Ile His Asn Pro Ser Leu Thr Val Thr Val Pro Ile Ala Pro Gly 1070 1075 1080Glu Ser Asp Leu Glu Asn Met Asn Ala Glu Glu Leu Ser Ser Asp 1085 1090 1095Ser Asp Ser Glu Tyr Ser Lys Val Arg Leu Asn Arg Ser Ser Ser 1100 1105 1110Ser Glu Cys Ser Thr Val Asp Asn Pro Leu Pro Gly Glu Gly Glu 1115 1120 1125Glu Ala Glu Ala Glu Pro Met Asn Ser Asp Glu Pro Glu Ala Cys 1130 1135 1140Phe Thr Asp Gly Cys Val Arg Arg Phe Ser Cys Cys Gln Val Asn 1145 1150 1155Ile Glu Ser Gly Lys Gly Lys Ile Trp Trp Asn Ile Arg Lys Thr 1160 1165 1170Cys Tyr Lys Ile Val Glu His Ser Trp Phe Glu Ser Phe Ile Val 1175 1180 1185Leu Met Ile Leu Leu Ser Ser Gly Ala Leu Ala Phe Glu Asp Ile 1190 1195 1200Tyr Ile Glu Arg Lys Lys Thr Ile Lys Ile Ile Leu Glu Tyr Ala 1205 1210 1215Asp Lys Ile Phe Thr Tyr Ile Phe Ile Leu Glu Met Leu Leu Lys 1220 1225 1230Trp 
Ile Ala Tyr Gly Tyr Lys Thr Tyr Phe Thr Asn Ala Trp Cys 1235 1240 1245Trp Leu Asp Phe Leu Ile Val Asp Val Ser Leu Val Thr Leu Val 1250 1255 1260Ala Asn Thr Leu Gly Tyr Ser Asp Leu Gly Pro Ile Lys Ser Leu 1265 1270 1275Arg Thr Leu Arg Ala Leu Arg Pro Leu Arg Ala Leu Ser Arg Phe 1280 1285 1290Glu Gly Met Arg Val Val Val Asn Ala Leu Ile Gly Ala Ile Pro 1295 1300 1305Ser Ile Met Asn Val Leu Leu Val Cys Leu Ile Phe Trp Leu Ile 1310 1315 1320Phe Ser Ile Met Gly Val Asn Leu Phe Ala Gly Lys Phe Tyr Glu 1325 1330 1335Cys Ile Asn Thr Thr Asp Gly Ser Arg Phe Pro Ala Ser Gln Val 1340 1345 1350Pro Asn Arg Ser Glu Cys Phe Ala Leu Met Asn Val Ser Gln Asn 1355 1360 1365Val Arg Trp Lys Asn Leu Lys Val Asn Phe Asp Asn Val Gly Leu 1370 1375 1380Gly Tyr Leu Ser Leu Leu Gln Val Ala Thr Phe Lys Gly Trp Thr 1385 1390 1395Ile Ile Met Tyr Ala Ala Val Asp Ser Val Asn Val Asp Lys Gln 1400 1405 1410Pro Lys Tyr Glu Tyr Ser Leu Tyr Met Tyr Ile Tyr Phe Val Val 1415 1420 1425Phe Ile Ile Phe Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gly 1430 1435 1440Val Ile Ile Asp Asn Phe Asn Gln Gln Lys Lys Lys Leu Gly Gly 1445 1450 1455Gln Asp Ile Phe Met Thr Glu Glu Gln Lys Lys Tyr Tyr Asn Ala 1460 1465 1470Met Lys Lys Leu Gly Ser Lys Lys Pro Gln Lys Pro Ile Pro Arg 1475 1480 1485Pro Gly Asn Lys Ile Gln Gly Cys Ile Phe Asp Leu Val Thr Asn 1490 1495 1500Gln Ala Phe Asp Ile Ser Ile Met Val Leu Ile Cys Leu Asn Met 1505 1510 1515Val Thr Met Met Val Glu Lys Glu Gly Gln Ser Gln His Met Thr 1520 1525 1530Glu Val Leu Tyr Trp Ile Asn Val Val Phe Ile Ile Leu Phe Thr 1535 1540 1545Gly Glu Cys Val Leu Lys Leu Ile Ser Leu Arg His Tyr Tyr Phe 1550 1555 1560Thr Val Gly Trp Asn Ile Phe Asp Phe Val Val Val Ile Ile Ser 1565 1570 1575Ile Val Gly Met Phe Leu Ala Asp Leu Ile Glu Thr Tyr Phe Val 1580 1585 1590Ser Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Arg Ile Gly Arg 1595 1600 1605Ile Leu Arg Leu Val Lys Gly Ala Lys Gly Ile Arg Thr Leu Leu 1610 1615 1620Phe Ala Leu Met Met Ser Leu Pro Ala Leu Phe Asn Ile Gly Leu 1625 1630 1635Leu 
Leu Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gly Met Ser 1640 1645 1650Asn Phe Ala Tyr Val Lys Lys Glu Asp Gly Ile Asn Asp Met Phe 1655 1660 1665Asn Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gln Ile 1670 1675 1680Thr Thr Ser Ala Gly Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn 1685 1690 1695Ser Lys Pro Pro Asp Cys Asp Pro Lys Lys Val His Pro Gly Ser 1700 1705 1710Ser Val Glu Gly Asp Cys Gly Asn Pro Ser Val Gly Ile Phe Tyr 1715 1720 1725Phe Val Ser Tyr Ile Ile Ile Ser Phe Leu Val Val Val Asn Met 1730 1735 1740Tyr Ile Ala Val Ile Leu Glu Asn Phe Ser Val Ala Thr Glu Glu 1745 1750 1755Ser Thr Glu Pro Leu Ser Glu Asp Asp Phe Glu Met Phe Tyr Glu 1760 1765 1770Val Trp Glu Lys Phe Asp Pro Asp Ala Thr Gln Phe Ile Glu Phe 1775 1780 1785Ser Lys Leu Ser Asp Phe Ala Ala Ala Leu Asp Pro Pro Leu Leu 1790 1795 1800Ile Ala Lys Pro Asn Lys Val Gln Leu Ile Ala Met Asp Leu Pro 1805 1810 1815Met Val Ser Gly Asp Arg Ile His Cys Leu Asp Ile Leu Phe Ala 1820 1825 1830Phe Thr Lys Arg Val Leu Gly Glu Ser Gly Glu Met Asp Ser Leu 1835 1840 1845Arg Ser Gln Met Glu Glu Arg Phe Met Ser Ala Asn Pro Ser Lys 1850 1855 1860Val Ser Tyr Glu Pro Ile Thr Thr Thr Leu Lys Arg Lys Gln Glu 1865 1870 1875Asp Val Ser Ala Thr Val Ile Gln Arg Ala Tyr Arg Arg Tyr Arg 1880 1885 1890Leu Arg Gln Asn Val Lys Asn Ile Ser Ser Ile Tyr Ile Lys Asp 1895 1900 1905Gly Asp Arg Asp Asp Asp Leu Leu Asn Lys Lys Asp Met Ala Phe 1910 1915 1920Asp Asn Val Asn Glu Asn Ser Ser Pro Glu Lys Thr Asp Ala Thr 1925 1930 1935Ser Ser Thr Thr Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys Pro 1940 1945 1950Asp Lys Glu Lys Tyr Glu Gln Asp Arg Thr Glu Lys Glu Asp Lys 1955 1960 1965Gly Lys Asp Ser Lys Glu Ser Lys Lys 1970 197551977PRTHomo sapiens 5Met Ala Met Leu Pro Pro Pro Gly Pro Gln Ser Phe Val His Phe Thr1 5 10 15Lys Gln Ser Leu Ala Leu Ile Glu Gln Arg Ile Ala Glu Arg Lys Ser 20 25 30Lys Glu Pro Lys Glu Glu Lys Lys Asp Asp Asp Glu Glu Ala Pro Lys 35 40 45Pro Ser Ser Asp Leu Glu Ala Gly Lys Gln Leu Pro Phe Ile Tyr Gly 50 55 60Asp Ile Pro Pro Gly 
Met Val Ser Glu Pro Leu Glu Asp Leu Asp Pro65 70 75 80Tyr Tyr Ala Asp Lys Lys Thr Phe Ile Val Leu Asn Lys Gly Lys Thr 85 90 95Ile Phe Arg Phe Asn Ala Thr Pro Ala Leu Tyr Met Leu Ser Pro Phe 100 105 110Ser Pro Leu Arg Arg Ile Ser Ile Lys Ile Leu Val His Ser Leu Phe 115 120 125Ser Met Leu Ile Met Cys Thr Ile Leu Thr Asn Cys Ile Phe Met Thr 130 135 140Met Asn Asn Pro Pro Asp Trp Thr Lys Asn Val Glu Tyr Thr Phe Thr145 150 155 160Gly Ile Tyr Thr Phe Glu Ser Leu Val Lys Ile Leu Ala Arg Gly Phe 165 170 175Cys Val Gly Glu Phe Thr Phe Leu Arg Asp Pro Trp Asn Trp Leu Asp 180 185 190Phe Val Val Ile Val Phe Ala Tyr Leu Thr Glu Phe Val Asn Leu Gly 195 200 205Asn Val Ser Ala Leu Arg Thr Phe Arg Val Leu Arg Ala Leu Lys Thr 210 215 220Ile Ser Val Ile Pro Gly Leu Lys Thr Ile Val Gly Ala Leu Ile Gln225 230 235 240Ser Val Lys Lys Leu Ser Asp Val Met Ile Leu Thr Val Phe Cys Leu 245 250 255Ser Val Phe Ala Leu Ile Gly Leu Gln Leu Phe Met Gly Asn Leu Lys 260 265 270His Lys Cys Phe Arg Asn Ser Leu Glu Asn Asn Glu Thr Leu Glu Ser 275 280 285Ile Met Asn Thr Leu Glu Ser Glu Glu Asp Phe Arg Lys Tyr Phe Tyr 290 295 300Tyr Leu Glu Gly Ser Lys Asp Ala Leu Leu Cys Gly Phe Ser Thr Asp305 310 315 320Ser Gly Gln Cys Pro Glu Gly Tyr Thr Cys Val Lys Ile Gly Arg Asn 325 330 335Pro Asp Tyr Gly Tyr Thr Ser Phe Asp Thr Phe Ser Trp Ala Phe Leu 340 345 350Ala Leu Phe Arg Leu Met Thr Gln Asp Tyr Trp Glu Asn Leu Tyr Gln 355 360 365Gln Thr Leu Arg Ala Ala Gly Lys Thr Tyr Met Ile Phe Phe Val Val 370 375 380Val Ile Phe Leu Gly Ser Phe Tyr Leu Ile Asn Leu Ile Leu Ala Val385 390 395 400Val Ala Met Ala Tyr Glu Glu Gln Asn Gln Ala Asn Ile Glu Glu Ala 405 410 415Lys Gln Lys Glu Leu Glu Phe Gln Gln Met Leu Asp Arg Leu Lys Lys 420 425 430Glu Gln Glu Glu Ala Glu Ala Ile Ala Ala Ala Ala Ala Glu Tyr Thr 435 440 445Ser Ile Arg Arg
Ser Arg Ile Met Gly Leu Ser Glu Ser Ser Ser Glu 450 455 460Thr Ser Lys Leu Ser Ser Lys Ser Ala Lys Glu Arg Arg Asn Arg Arg465 470 475 480Lys Lys Lys Asn Gln Lys Lys Leu Ser Ser Gly Glu Glu Lys Gly Asp 485 490 495Ala Glu Lys Leu Ser Lys Ser Glu Ser Glu Asp Ser Ile Arg Arg Lys 500 505 510Ser Phe His Leu Gly Val Glu Gly His Arg Arg Ala His Glu Lys Arg 515 520 525Leu Ser Thr Pro Asn Gln Ser Pro Leu Ser Ile Arg Gly Ser Leu Phe 530 535 540Ser Ala Arg Arg Ser Ser Arg Thr Ser Leu Phe Ser Phe Lys Gly Arg545 550 555 560Gly Arg Asp Ile Gly Ser Glu Thr Glu Phe Ala Asp Asp Glu His Ser 565 570 575Ile Phe Gly Asp Asn Glu Ser Arg Arg Gly Ser Leu Phe Val Pro His 580 585 590Arg Pro Gln Glu Arg Arg Ser Ser Asn Ile Ser Gln Ala Ser Arg Ser 595 600 605Pro Pro Met Leu Pro Val Asn Gly Lys Met His Ser Ala Val Asp Cys 610 615 620Asn Gly Val Val Ser Leu Val Asp Gly Arg Ser Ala Leu Met Leu Pro625 630 635 640Asn Gly Gln Leu Leu Pro Glu Gly Thr Thr Asn Gln Ile His Arg Lys 645 650 655Arg Arg Cys Ser Ser Tyr Leu Leu Ser Glu Asp Met Leu Asn Asp Pro 660 665 670Asn Leu Arg Gln Arg Ala Met Ser Arg Ala Ser Ile Leu Thr Asn Thr 675 680 685Val Glu Glu Leu Glu Glu Ser Arg Gln Lys Cys Pro Pro Trp Trp Tyr 690 695 700Arg Phe Ala His Lys Phe Leu Ile Trp Asn Cys Ser Pro Tyr Trp Ile705 710 715 720Lys Phe Lys Lys Cys Ile Tyr Phe Ile Val Met Asp Pro Phe Val Asp 725 730 735Leu Ala Ile Thr Ile Cys Ile Val Leu Asn Thr Leu Phe Met Ala Met 740 745 750Glu His His Pro Met Thr Glu Glu Phe Lys Asn Val Leu Ala Ile Gly 755 760 765Asn Leu Val Phe Thr Gly Ile Phe Ala Ala Glu Met Val Leu Lys Leu 770 775 780Ile Ala Met Asp Pro Tyr Glu Tyr Phe Gln Val Gly Trp Asn Ile Phe785 790 795 800Asp Ser Leu Ile Val Thr Leu Ser Leu Val Glu Leu Phe Leu Ala Asp 805 810 815Val Glu Gly Leu Ser Val Leu Arg Ser Phe Arg Leu Leu Arg Val Phe 820 825 830Lys Leu Ala Lys Ser Trp Pro Thr Leu Asn Met Leu Ile Lys Ile Ile 835 840 845Gly Asn Ser Val Gly Ala Leu Gly Asn Leu Thr Leu Val Leu Ala Ile 850 855 860Ile Val Phe Ile Phe Ala Val Val Gly Met Gln Leu 
Phe Gly Lys Ser865 870 875 880Tyr Lys Glu Cys Val Cys Lys Ile Asn Asp Asp Cys Thr Leu Pro Arg 885 890 895Trp His Met Asn Asp Phe Phe His Ser Phe Leu Ile Val Phe Arg Val 900 905 910Leu Cys Gly Glu Trp Ile Glu Thr Met Trp Asp Cys Met Glu Val Ala 915 920 925Gly Gln Ala Met Cys Leu Ile Val Tyr Met Met Val Met Val Ile Gly 930 935 940Asn Leu Val Val Leu Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe945 950 955 960Ser Ser Asp Asn Leu Thr Ala Ile Glu Glu Asp Pro Asp Ala Asn Asn 965 970 975Leu Gln Ile Ala Val Thr Arg Ile Lys Lys Gly Ile Asn Tyr Val Lys 980 985 990Gln Thr Leu Arg Glu Phe Ile Leu Lys Ala Phe Ser Lys Lys Pro Lys 995 1000 1005Ile Ser Arg Glu Ile Arg Gln Ala Glu Asp Leu Asn Thr Lys Lys 1010 1015 1020Glu Asn Tyr Ile Ser Asn His Thr Leu Ala Glu Met Ser Lys Gly 1025 1030 1035His Asn Phe Leu Lys Glu Lys Asp Lys Ile Ser Gly Phe Gly Ser 1040 1045 1050Ser Val Asp Lys His Leu Met Glu Asp Ser Asp Gly Gln Ser Phe 1055 1060 1065Ile His Asn Pro Ser Leu Thr Val Thr Val Pro Ile Ala Pro Gly 1070 1075 1080Glu Ser Asp Leu Glu Asn Met Asn Ala Glu Glu Leu Ser Ser Asp 1085 1090 1095Ser Asp Ser Glu Tyr Ser Lys Val Arg Leu Asn Arg Ser Ser Ser 1100 1105 1110Ser Glu Cys Ser Thr Val Asp Asn Pro Leu Pro Gly Glu Gly Glu 1115 1120 1125Glu Ala Glu Ala Glu Pro Met Asn Ser Asp Glu Pro Glu Ala Cys 1130 1135 1140Phe Thr Asp Gly Cys Val Arg Arg Phe Ser Cys Cys Gln Val Asn 1145 1150 1155Ile Glu Ser Gly Lys Gly Lys Ile Trp Trp Asn Ile Arg Lys Thr 1160 1165 1170Cys Tyr Lys Ile Val Glu His Ser Trp Phe Glu Ser Phe Ile Val 1175 1180 1185Leu Met Ile Leu Leu Ser Ser Gly Ala Leu Ala Phe Glu Asp Ile 1190 1195 1200Tyr Ile Glu Arg Lys Lys Thr Ile Lys Ile Ile Leu Glu Tyr Ala 1205 1210 1215Asp Lys Ile Phe Thr Tyr Ile Phe Ile Leu Glu Met Leu Leu Lys 1220 1225 1230Trp Ile Ala Tyr Gly Tyr Lys Thr Tyr Phe Thr Asn Ala Trp Cys 1235 1240 1245Trp Leu Asp Phe Leu Ile Val Asp Val Ser Leu Val Thr Leu Val 1250 1255 1260Ala Asn Thr Leu Gly Tyr Ser Asp Leu Gly Pro Ile Lys Ser Leu 1265 1270 1275Arg Thr Leu Arg Ala Leu Arg Pro 
Leu Arg Ala Leu Ser Arg Phe 1280 1285 1290Glu Gly Met Arg Val Val Val Asn Ala Leu Ile Gly Ala Ile Pro 1295 1300 1305Ser Ile Met Asn Val Leu Leu Val Cys Leu Ile Phe Trp Leu Ile 1310 1315 1320Phe Ser Ile Met Gly Val Asn Leu Phe Ala Gly Lys Phe Tyr Glu 1325 1330 1335Cys Ile Asn Thr Thr Asp Gly Ser Arg Phe Pro Ala Ser Gln Val 1340 1345 1350Pro Asn Arg Ser Glu Cys Phe Ala Leu Met Asn Val Ser Gln Asn 1355 1360 1365Val Arg Trp Lys Asn Leu Lys Val Asn Phe Asp Asn Val Gly Leu 1370 1375 1380Gly Tyr Leu Ser Leu Leu Gln Val Ala Thr Phe Lys Gly Trp Thr 1385 1390 1395Ile Ile Met Tyr Ala Ala Val Asp Ser Val Asn Val Asp Lys Gln 1400 1405 1410Pro Lys Tyr Glu Tyr Ser Leu Tyr Met Tyr Ile Tyr Phe Val Val 1415 1420 1425Phe Ile Ile Phe Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gly 1430 1435 1440Val Ile Ile Asp Asn Phe Asn Gln Gln Lys Lys Lys Leu Gly Gly 1445 1450 1455Gln Asp Ile Phe Met Thr Glu Glu Gln Lys Lys Tyr Tyr Asn Ala 1460 1465 1470Met Lys Lys Leu Gly Ser Lys Lys Pro Gln Lys Pro Ile Pro Arg 1475 1480 1485Pro Gly Asn Lys Ile Gln Gly Cys Ile Phe Asp Leu Val Thr Asn 1490 1495 1500Gln Ala Phe Asp Ile Ser Ile Met Val Leu Ile Cys Leu Asn Met 1505 1510 1515Val Thr Met Met Val Glu Lys Glu Gly Gln Ser Gln His Met Thr 1520 1525 1530Glu Val Leu Tyr Trp Ile Asn Val Val Phe Ile Ile Leu Phe Thr 1535 1540 1545Gly Glu Cys Val Leu Lys Leu Ile Ser Leu Arg His Tyr Tyr Phe 1550 1555 1560Thr Val Gly Trp Asn Ile Phe Asp Phe Val Val Val Ile Ile Ser 1565 1570 1575Ile Val Gly Met Phe Leu Ala Asp Leu Ile Glu Thr Tyr Phe Val 1580 1585 1590Ser Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Arg Ile Gly Arg 1595 1600 1605Ile Leu Arg Leu Val Lys Gly Ala Lys Gly Ile Arg Thr Leu Leu 1610 1615 1620Phe Ala Leu Met Met Ser Leu Pro Ala Leu Phe Asn Ile Gly Leu 1625 1630 1635Leu Leu Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gly Met Ser 1640 1645 1650Asn Phe Ala Tyr Val Lys Lys Glu Asp Gly Ile Asn Asp Met Phe 1655 1660 1665Asn Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gln Ile 1670 1675 1680Thr Thr Ser Ala Gly Trp Asp Gly 
Leu Leu Ala Pro Ile Leu Asn 1685 1690 1695Ser Lys Pro Pro Asp Cys Asp Pro Lys Lys Val His Pro Gly Ser 1700 1705 1710Ser Val Glu Gly Asp Cys Gly Asn Pro Ser Val Gly Ile Phe Tyr 1715 1720 1725Phe Val Ser Tyr Ile Ile Ile Ser Phe Leu Val Val Val Asn Met 1730 1735 1740Tyr Ile Ala Val Ile Leu Glu Asn Phe Ser Val Ala Thr Glu Glu 1745 1750 1755Ser Thr Glu Pro Leu Ser Glu Asp Asp Phe Glu Met Phe Tyr Glu 1760 1765 1770Val Trp Glu Lys Phe Asp Pro Asp Ala Thr Gln Phe Ile Glu Phe 1775 1780 1785Ser Lys Leu Ser Asp Phe Ala Ala Ala Leu Asp Pro Pro Leu Leu 1790 1795 1800Ile Ala Lys Pro Asn Lys Val Gln Leu Ile Ala Met Asp Leu Pro 1805 1810 1815Met Val Ser Gly Asp Arg Ile His Cys Leu Asp Ile Leu Phe Ala 1820 1825 1830Phe Thr Lys Arg Val Leu Gly Glu Ser Gly Glu Met Asp Ser Leu 1835 1840 1845Arg Ser Gln Met Glu Glu Arg Phe Met Ser Ala Asn Pro Ser Lys 1850 1855 1860Val Ser Tyr Glu Pro Ile Thr Thr Thr Leu Lys Arg Lys Gln Glu 1865 1870 1875Asp Val Ser Ala Thr Val Ile Gln Arg Ala Tyr Arg Arg Tyr Arg 1880 1885 1890Leu Arg Gln Asn Val Lys Asn Ile Ser Ser Ile Tyr Ile Lys Asp 1895 1900 1905Gly Asp Arg Asp Asp Asp Leu Leu Asn Lys Lys Asp Met Ala Phe 1910 1915 1920Asp Asn Val Asn Glu Asn Ser Ser Pro Glu Lys Thr Asp Ala Thr 1925 1930 1935Ser Ser Thr Thr Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys Pro 1940 1945 1950Asp Lys Glu Lys Tyr Glu Gln Asp Arg Thr Glu Lys Glu Asp Lys 1955 1960 1965Gly Lys Asp Ser Lys Glu Ser Lys Lys 1970 197561977PRTHomo sapiens 6Met Ala Met Leu Pro Pro Pro Gly Pro Gln Ser Phe Val His Phe Thr1 5 10 15Lys Gln Ser Leu Ala Leu Ile Glu Gln Arg Ile Ala Glu Arg Lys Ser 20 25 30Lys Glu Pro Lys Glu Glu Lys Lys Asp Asp Asp Glu Glu Ala Pro Lys 35 40 45Pro Ser Ser Asp Leu Glu Ala Gly Lys Gln Leu Pro Phe Ile Tyr Gly 50 55 60Asp Ile Pro Pro Gly Met Val Ser Glu Pro Leu Glu Asp Leu Asp Pro65 70 75 80Tyr Tyr Ala Asp Lys Lys Thr Phe Ile Val Leu Asn Lys Gly Lys Thr 85 90 95Ile Phe Arg Phe Asn Ala Thr Pro Ala Leu Tyr Met Leu Ser Pro Phe 100 105 110Ser Pro Leu Arg Arg Ile Ser Ile Lys Ile Leu Val 
His Ser Leu Phe 115 120 125Ser Met Leu Ile Met Cys Thr Ile Leu Thr Asn Cys Ile Phe Met Thr 130 135 140Met Asn Asn Pro Pro Asp Trp Thr Lys Asn Val Glu Tyr Thr Phe Thr145 150 155 160Gly Ile Tyr Thr Phe Glu Ser Leu Val Lys Ile Leu Ala Arg Gly Phe 165 170 175Cys Val Gly Glu Phe Thr Phe Leu Arg Asp Pro Trp Asn Trp Leu Asp 180 185 190Phe Val Val Ile Val Phe Ala Tyr Leu Thr Glu Phe Val Asn Leu Gly 195 200 205Asn Val Ser Ala Leu Arg Thr Phe Arg Val Leu Arg Ala Leu Lys Thr 210 215 220Ile Ser Val Ile Pro Gly Leu Lys Thr Ile Val Gly Ala Leu Ile Gln225 230 235 240Ser Val Lys Lys Leu Ser Asp Val Met Ile Leu Thr Val Phe Cys Leu 245 250 255Ser Val Phe Ala Leu Ile Gly Leu Gln Leu Phe Met Gly Asn Leu Lys 260 265 270His Lys Cys Phe Arg Asn Ser Leu Glu Asn Asn Glu Thr Leu Glu Ser 275 280 285Ile Met Asn Thr Leu Glu Ser Glu Glu Asp Phe Arg Lys Tyr Phe Tyr 290 295 300Tyr Leu Glu Gly Ser Lys Asp Ala Leu Leu Cys Gly Phe Ser Thr Asp305 310 315 320Ser Gly Gln Cys Pro Glu Gly Tyr Thr Cys Val Lys Ile Gly Arg Asn 325 330 335Pro Asp Tyr Gly Tyr Thr Ser Phe Asp Thr Phe Ser Trp Ala Phe Leu 340 345 350Ala Leu Phe Arg Leu Met Thr Gln Asp Tyr Trp Glu Asn Leu Tyr Gln 355 360 365Gln Thr Leu Arg Ala Ala Gly Lys Thr Tyr Met Ile Phe Phe Val Val 370 375 380Val Ile Phe Leu Gly Ser Phe Tyr Leu Ile Asn Leu Ile Leu Ala Val385 390 395 400Val Ala Met Ala Tyr Glu Glu Gln Asn Gln Ala Asn Ile Glu Glu Ala 405 410 415Lys Gln Lys Glu Leu Glu Phe Gln Gln Met Leu Asp Arg Leu Lys Lys 420 425 430Glu Gln Glu Glu Ala Glu Ala Ile Ala Ala Ala Ala Ala Glu Tyr Thr 435 440 445Ser Ile Arg Arg Ser Arg Ile Met Gly Leu Ser Glu Ser Ser Ser Glu 450 455 460Thr Ser Lys Leu Ser Ser Lys Ser Ala Lys Glu Arg Arg Asn Arg Arg465 470 475 480Lys Lys Lys Asn Gln Lys Lys Leu Ser Ser Gly Glu Glu Lys Gly Asp 485 490 495Ala Glu Lys Leu Ser Lys Ser Glu Ser Glu Asp Ser Ile Arg Arg Lys 500 505 510Ser Phe His Leu Gly Val Glu Gly His Arg Arg Ala His Glu Lys Arg 515 520 525Leu Ser Thr Pro Asn Gln Ser Pro Leu Ser Ile Arg Gly Ser Leu Phe 530 535 540Ser Ala 
Arg Arg Ser Ser Arg Thr Ser Leu Phe Ser Phe Lys Gly Arg545 550 555 560Gly Arg Asp Ile Gly Ser Glu Thr Glu Phe Ala Asp Asp Glu His Ser 565 570 575Ile Phe Gly Asp Asn Glu Ser Arg Arg Gly Ser Leu Phe Val Pro His 580 585 590Arg Pro Gln Glu Arg Arg Ser Ser Asn Ile Ser Gln Ala Ser Arg Ser 595 600 605Pro Pro Met Leu Pro Val Asn Gly Lys Met His Ser Ala Val Asp Cys 610 615 620Asn Gly Val Val Ser Leu Val Asp Gly Arg Ser Ala Leu Met Leu Pro625 630 635 640Asn Gly Gln Leu Leu Pro Glu Gly Thr Thr Asn Gln Ile His Lys Lys 645 650 655Arg Arg Cys Ser Ser Tyr Leu Leu Ser Glu Asp Met Leu Asn Asp Pro 660 665 670Asn Leu Arg Gln Arg Ala Met Ser Arg Ala Ser Ile Leu Thr Asn Thr 675 680 685Val Glu Glu Leu Glu Glu Ser Arg Gln Lys Cys Pro Pro Trp Trp Tyr 690 695 700Arg Phe Ala His Lys Phe Leu Ile Trp Asn Cys Ser Pro Tyr Trp Ile705 710 715 720Lys Phe Lys Lys Cys Ile Tyr Phe Ile Val Met Asp Pro Phe Val Asp 725 730 735Leu Ala Val Thr Ile Cys Ile Val Leu Asn Thr Leu Phe Met Ala Met 740 745 750Glu His His Pro Met Thr Glu Glu Phe Lys Asn Val Leu Ala Ile Gly 755 760 765Asn Leu Val Phe Thr Gly Ile Phe Ala Ala Glu Met Val Leu Lys Leu 770 775 780Ile Ala Met Asp Pro Tyr Glu Tyr Phe Gln Val Gly Trp Asn Ile Phe785 790 795 800Asp Ser Leu Ile Val Thr Leu Ser Leu Val Glu Leu Phe Leu Ala Asp 805 810 815Val Glu Gly Leu Ser Val Leu Arg Ser Phe Arg Leu Leu Arg Val Phe 820 825 830Lys Leu Ala Lys Ser Trp Pro Thr Leu Asn Met Leu Ile Lys Ile Ile 835 840 845Gly Asn Ser Val Gly Ala Leu Gly Asn Leu Thr Leu Val Leu Ala Ile 850 855 860Ile Val Phe Ile Phe Ala Val Val Gly Met Gln Leu Phe Gly Lys Ser865 870 875 880Tyr Lys Glu Cys Val Cys Lys Ile Asn Asp Asp Cys Thr Leu Pro Arg 885 890 895Trp His Met Asn Asp Phe Phe His Ser Phe Leu Ile Val Phe Arg Val 900 905 910Leu Cys Gly Glu Trp Ile Glu Thr Met Trp Asp Cys Met Glu Val Ala 915 920 925Gly Gln
Ala Met Cys Leu Ile Val Tyr Met Met Val Met Val Ile Gly 930 935 940Asn Leu Val Val Leu Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe945 950 955 960Ser Ser Asp Asn Leu Thr Ala Ile Glu Glu Asp Pro Asp Ala Asn Asn 965 970 975Leu Gln Ile Ala Val Thr Arg Ile Lys Lys Gly Ile Asn Tyr Val Lys 980 985 990Gln Thr Leu Arg Glu Phe Ile Leu Lys Ala Phe Ser Lys Lys Pro Lys 995 1000 1005Ile Ser Arg Glu Ile Arg Gln Ala Glu Asp Leu Asn Thr Lys Lys 1010 1015 1020Glu Asn Tyr Ile Ser Asn His Thr Leu Ala Glu Met Ser Lys Gly 1025 1030 1035His Asn Phe Leu Lys Glu Lys Asp Lys Ile Ser Gly Phe Gly Ser 1040 1045 1050Ser Val Asp Lys His Leu Met Glu Asp Ser Asp Gly Gln Ser Phe 1055 1060 1065Ile His Asn Pro Ser Leu Thr Val Thr Val Pro Ile Ala Pro Gly 1070 1075 1080Glu Ser Asp Leu Glu Asn Met Asn Ala Glu Glu Leu Ser Ser Asp 1085 1090 1095Ser Asp Ser Glu Tyr Ser Lys Val Arg Leu Asn Arg Ser Ser Ser 1100 1105 1110Ser Glu Cys Ser Thr Val Asp Asn Pro Leu Pro Gly Glu Gly Glu 1115 1120 1125Glu Ala Glu Ala Glu Pro Met Asn Ser Asp Glu Pro Glu Ala Cys 1130 1135 1140Phe Thr Asp Gly Cys Val Arg Arg Phe Ser Cys Cys Gln Val Asn 1145 1150 1155Ile Glu Ser Gly Lys Gly Lys Ile Trp Trp Asn Ile Arg Lys Thr 1160 1165 1170Cys Tyr Lys Ile Val Glu His Ser Trp Phe Glu Ser Phe Ile Val 1175 1180 1185Leu Met Ile Leu Leu Ser Ser Gly Ala Leu Ala Phe Glu Asp Ile 1190 1195 1200Tyr Ile Glu Arg Lys Lys Thr Ile Lys Ile Ile Leu Glu Tyr Ala 1205 1210 1215Asp Lys Ile Phe Thr Tyr Ile Phe Ile Leu Glu Met Leu Leu Lys 1220 1225 1230Trp Ile Ala Tyr Gly Tyr Lys Thr Tyr Phe Thr Asn Ala Trp Cys 1235 1240 1245Trp Leu Asp Phe Leu Ile Val Asp Val Ser Leu Val Thr Leu Val 1250 1255 1260Ala Asn Thr Leu Gly Tyr Ser Asp Leu Gly Pro Ile Lys Ser Leu 1265 1270 1275Arg Thr Leu Arg Ala Leu Arg Pro Leu Arg Ala Leu Ser Arg Phe 1280 1285 1290Glu Gly Met Arg Val Val Val Asn Ala Leu Ile Gly Ala Ile Pro 1295 1300 1305Ser Ile Met Asn Val Leu Leu Val Cys Leu Ile Phe Trp Leu Ile 1310 1315 1320Phe Ser Ile Met Gly Val Asn Leu Phe Ala Gly Lys Phe Tyr Glu 1325 1330 
1335Cys Ile Asn Thr Thr Asp Gly Ser Arg Phe Pro Ala Ser Gln Val 1340 1345 1350Pro Asn Arg Ser Glu Cys Phe Ala Leu Met Asn Val Ser Gln Asn 1355 1360 1365Val Arg Trp Lys Asn Leu Lys Val Asn Phe Asp Asn Val Gly Leu 1370 1375 1380Gly Tyr Leu Ser Leu Leu Gln Val Ala Thr Phe Lys Gly Trp Thr 1385 1390 1395Ile Ile Met Tyr Ala Ala Val Asp Ser Val Asn Val Asp Lys Gln 1400 1405 1410Pro Lys Tyr Glu Tyr Ser Leu Tyr Met Tyr Ile Tyr Phe Val Val 1415 1420 1425Phe Ile Ile Phe Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gly 1430 1435 1440Val Ile Ile Asp Asn Phe Asn Gln Gln Lys Lys Lys Leu Gly Gly 1445 1450 1455Gln Asp Ile Phe Met Thr Glu Glu Gln Lys Lys Tyr Tyr Asn Ala 1460 1465 1470Met Lys Lys Leu Gly Ser Lys Lys Pro Gln Lys Pro Ile Pro Arg 1475 1480 1485Pro Gly Asn Lys Ile Gln Gly Cys Ile Phe Asp Leu Val Thr Asn 1490 1495 1500Gln Ala Phe Asp Ile Ser Ile Met Val Leu Ile Cys Leu Asn Met 1505 1510 1515Val Thr Met Met Val Glu Lys Glu Gly Gln Ser Gln His Met Thr 1520 1525 1530Glu Val Leu Tyr Trp Ile Asn Val Val Phe Ile Ile Leu Phe Thr 1535 1540 1545Gly Glu Cys Val Leu Lys Leu Ile Ser Leu Arg His Tyr Tyr Phe 1550 1555 1560Thr Val Gly Trp Asn Ile Phe Asp Phe Val Val Val Ile Ile Ser 1565 1570 1575Ile Val Gly Met Phe Leu Ala Asp Leu Ile Glu Thr Tyr Phe Val 1580 1585 1590Ser Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Arg Ile Gly Arg 1595 1600 1605Ile Leu Arg Leu Val Lys Gly Ala Lys Gly Ile Arg Thr Leu Leu 1610 1615 1620Phe Ala Leu Met Met Ser Leu Pro Ala Leu Phe Asn Ile Gly Leu 1625 1630 1635Leu Leu Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gly Met Ser 1640 1645 1650Asn Phe Ala Tyr Val Lys Lys Glu Asp Gly Ile Asn Asp Met Phe 1655 1660 1665Asn Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gln Ile 1670 1675 1680Thr Thr Ser Ala Gly Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn 1685 1690 1695Ser Lys Pro Pro Asp Cys Asp Pro Lys Lys Val His Pro Gly Ser 1700 1705 1710Ser Val Glu Gly Asp Cys Gly Asn Pro Ser Val Gly Ile Phe Tyr 1715 1720 1725Phe Val Ser Tyr Ile Ile Ile Ser Phe Leu Val Val Val Asn Met 1730 1735 
1740Tyr Ile Ala Val Ile Leu Glu Asn Phe Ser Val Ala Thr Glu Glu 1745 1750 1755Ser Thr Glu Pro Leu Ser Glu Asp Asp Phe Glu Met Phe Tyr Glu 1760 1765 1770Val Trp Glu Lys Phe Asp Pro Asp Ala Thr Gln Phe Ile Glu Phe 1775 1780 1785Ser Lys Leu Ser Asp Phe Ala Ala Ala Leu Asp Pro Pro Leu Leu 1790 1795 1800Ile Ala Lys Pro Asn Lys Val Gln Leu Ile Ala Met Asp Leu Pro 1805 1810 1815Met Val Ser Gly Asp Arg Ile His Cys Leu Asp Ile Leu Phe Ala 1820 1825 1830Phe Thr Lys Arg Val Leu Gly Glu Ser Gly Glu Met Asp Ser Leu 1835 1840 1845Arg Ser Gln Met Glu Glu Arg Phe Met Ser Ala Asn Pro Ser Lys 1850 1855 1860Val Ser Tyr Glu Pro Ile Thr Thr Thr Leu Lys Arg Lys Gln Glu 1865 1870 1875Asp Val Ser Ala Thr Val Ile Gln Arg Ala Tyr Arg Arg Tyr Arg 1880 1885 1890Leu Arg Gln Asn Val Lys Asn Ile Ser Ser Ile Tyr Ile Lys Asp 1895 1900 1905Gly Asp Arg Asp Asp Asp Leu Leu Asn Lys Lys Asp Met Ala Phe 1910 1915 1920Asp Asn Val Asn Glu Asn Ser Ser Pro Glu Lys Thr Asp Ala Thr 1925 1930 1935Ser Ser Thr Thr Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys Pro 1940 1945 1950Asp Lys Glu Lys Tyr Glu Gln Asp Arg Thr Glu Lys Glu Asp Lys 1955 1960 1965Gly Lys Asp Ser Lys Glu Ser Lys Lys 1970 197571977PRTHomo sapiens 7Met Ala Met Leu Pro Pro Pro Gly Pro Gln Ser Phe Val His Phe Thr1 5 10 15Lys Gln Ser Leu Ala Leu Ile Glu Gln Arg Ile Ala Glu Arg Lys Ser 20 25 30Lys Glu Pro Lys Glu Glu Lys Lys Asp Asp Asp Glu Glu Ala Pro Lys 35 40 45Pro Ser Ser Asp Leu Glu Ala Gly Lys Gln Leu Pro Phe Ile Tyr Gly 50 55 60Asp Ile Pro Pro Gly Met Val Ser Glu Pro Leu Glu Asp Leu Asp Pro65 70 75 80Tyr Tyr Ala Asp Lys Lys Thr Phe Ile Val Leu Asn Lys Gly Lys Thr 85 90 95Ile Phe Arg Phe Asn Ala Thr Pro Ala Leu Tyr Met Leu Ser Pro Phe 100 105 110Ser Pro Leu Arg Arg Ile Ser Ile Lys Ile Leu Val His Ser Leu Phe 115 120 125Ser Met Leu Ile Met Cys Thr Ile Leu Thr Asn Cys Ile Phe Met Thr 130 135 140Met Asn Asn Pro Pro Asp Trp Thr Lys Asn Val Glu Tyr Thr Phe Thr145 150 155 160Gly Ile Tyr Thr Phe Glu Ser Leu Val Lys Ile Leu Ala Arg Gly Phe 165 170 175Cys 
Val Gly Glu Phe Thr Phe Leu Arg Asp Pro Trp Asn Trp Leu Asp 180 185 190Phe Val Val Ile Val Phe Ala Tyr Leu Thr Glu Phe Val Asn Leu Gly 195 200 205Asn Val Ser Ala Leu Arg Thr Phe Arg Val Leu Arg Ala Leu Lys Thr 210 215 220Ile Ser Val Ile Pro Gly Leu Lys Thr Ile Val Gly Ala Leu Ile Gln225 230 235 240Ser Val Lys Lys Leu Ser Asp Val Met Ile Leu Thr Val Phe Cys Leu 245 250 255Ser Val Phe Ala Leu Ile Gly Leu Gln Leu Phe Met Gly Asn Leu Lys 260 265 270His Lys Cys Phe Arg Asn Ser Leu Glu Asn Asn Glu Thr Leu Glu Ser 275 280 285Ile Met Asn Thr Leu Glu Ser Glu Glu Asp Phe Arg Lys Tyr Phe Tyr 290 295 300Tyr Leu Glu Gly Ser Lys Asp Ala Leu Leu Cys Gly Phe Ser Thr Asp305 310 315 320Ser Gly Gln Cys Pro Glu Gly Tyr Thr Cys Val Lys Ile Gly Arg Asn 325 330 335Pro Asp Tyr Gly Tyr Thr Ser Phe Asp Thr Phe Ser Trp Ala Phe Leu 340 345 350Ala Leu Phe Arg Leu Met Thr Gln Asp Tyr Trp Glu Asn Leu Tyr Gln 355 360 365Gln Thr Leu Arg Ala Ala Gly Lys Thr Tyr Met Ile Phe Phe Val Val 370 375 380Val Ile Phe Leu Gly Ser Phe Tyr Leu Ile Asn Leu Ile Leu Ala Val385 390 395 400Val Ala Met Ala Tyr Glu Glu Gln Asn Gln Ala Asn Ile Glu Glu Ala 405 410 415Lys Gln Lys Glu Leu Glu Phe Gln Gln Met Leu Asp Arg Leu Lys Lys 420 425 430Glu Gln Glu Glu Ala Glu Ala Ile Ala Ala Ala Ala Ala Glu Tyr Thr 435 440 445Ser Ile Arg Arg Ser Arg Ile Met Gly Leu Ser Glu Ser Ser Ser Glu 450 455 460Thr Ser Lys Leu Ser Ser Lys Ser Ala Lys Glu Arg Arg Asn Arg Arg465 470 475 480Lys Lys Lys Asn Gln Lys Lys Leu Ser Ser Gly Glu Glu Lys Gly Asp 485 490 495Ala Glu Lys Leu Ser Lys Ser Glu Ser Glu Asp Ser Ile Arg Arg Lys 500 505 510Ser Phe His Leu Gly Val Glu Gly His Arg Arg Ala His Glu Lys Arg 515 520 525Leu Ser Thr Pro Asn Gln Ser Pro Leu Ser Ile Arg Gly Ser Leu Phe 530 535 540Ser Ala Arg Arg Ser Ser Arg Thr Ser Leu Phe Ser Phe Lys Gly Arg545 550 555 560Gly Arg Asp Ile Gly Ser Glu Thr Glu Phe Ala Asp Asp Glu His Ser 565 570 575Ile Phe Gly Asp Asn Glu Ser Arg Arg Gly Ser Leu Phe Val Pro His 580 585 590Arg Pro Gln Glu Arg Arg Ser Ser Asn 
Ile Ser Gln Ala Ser Arg Ser 595 600 605Pro Pro Met Leu Pro Val Asn Gly Lys Met His Ser Ala Val Asp Cys 610 615 620Asn Gly Val Val Ser Leu Val Asp Gly Arg Ser Ala Leu Met Leu Pro625 630 635 640Asn Gly Gln Leu Leu Pro Glu Gly Thr Thr Asn Gln Ile His Lys Lys 645 650 655Arg Arg Cys Ser Ser Tyr Leu Leu Ser Glu Asp Met Leu Asn Asp Pro 660 665 670Asn Leu Arg Gln Arg Ala Met Ser Arg Ala Ser Ile Leu Thr Asn Thr 675 680 685Val Glu Glu Leu Glu Glu Ser Arg Gln Lys Cys Pro Pro Trp Trp Tyr 690 695 700Arg Phe Ala His Lys Phe Leu Ile Trp Asn Cys Ser Pro Tyr Trp Ile705 710 715 720Lys Phe Lys Lys Cys Ile Tyr Phe Ile Val Met Asp Pro Phe Val Asp 725 730 735Leu Ala Ile Thr Ile Cys Ile Val Leu Asn Thr Leu Phe Met Ala Met 740 745 750Glu His His Pro Met Thr Glu Glu Phe Lys Asn Val Leu Ala Ile Gly 755 760 765Asn Leu Val Phe Thr Gly Ile Phe Ala Ala Glu Met Val Leu Lys Leu 770 775 780Ile Ala Met Asp Pro Tyr Glu Tyr Phe Gln Val Gly Trp Asn Ile Phe785 790 795 800Asp Ser Leu Ile Val Thr Leu Ser Leu Val Glu Leu Phe Leu Ala Asp 805 810 815Val Glu Gly Leu Ser Val Leu Arg Ser Phe Arg Leu Leu Arg Val Phe 820 825 830Lys Leu Ala Lys Ser Trp Pro Thr Leu Asn Met Leu Ile Lys Ile Ile 835 840 845Gly Asn Ser Val Gly Ala Leu Gly Asn Leu Thr Leu Val Leu Ala Ile 850 855 860Ile Val Phe Ile Phe Ala Val Val Gly Met Gln Leu Phe Gly Lys Ser865 870 875 880Tyr Lys Glu Cys Val Cys Lys Ile Asn Asp Asp Cys Thr Leu Pro Arg 885 890 895Trp His Met Asn Asp Phe Phe His Ser Phe Leu Ile Val Phe Arg Val 900 905 910Leu Cys Gly Glu Trp Ile Glu Thr Met Trp Asp Cys Met Glu Val Ala 915 920 925Gly Gln Ala Met Cys Leu Ile Val Tyr Met Met Val Met Val Ile Gly 930 935 940Asn Leu Val Val Leu Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe945 950 955 960Ser Ser Asp Asn Leu Thr Ala Ile Glu Glu Asp Pro Asp Ala Asn Asn 965 970 975Leu Gln Ile Ala Val Thr Arg Ile Lys Lys Gly Ile Asn Tyr Val Lys 980 985 990Gln Thr Leu Arg Glu Phe Ile Leu Lys Ala Phe Ser Lys Lys Pro Lys 995 1000 1005Ile Ser Arg Glu Ile Arg Gln Ala Glu Asp Leu Asn Thr Lys Lys 1010 
1015 1020Glu Asn Tyr Ile Ser Asn His Thr Leu Ala Glu Met Ser Lys Gly 1025 1030 1035His Asn Phe Leu Lys Glu Lys Asp Lys Ile Ser Gly Phe Gly Ser 1040 1045 1050Ser Val Asp Lys His Leu Met Glu Asp Ser Asp Gly Gln Ser Phe 1055 1060 1065Ile His Asn Pro Ser Leu Thr Val Thr Val Pro Ile Ala Pro Gly 1070 1075 1080Glu Ser Asp Leu Glu Asn Met Asn Ala Glu Glu Leu Ser Ser Asp 1085 1090 1095Ser Asp Ser Glu Tyr Ser Lys Val Arg Leu Asn Arg Ser Ser Ser 1100 1105 1110Ser Glu Cys Ser Thr Val Asp Asn Pro Phe Pro Gly Glu Gly Glu 1115 1120 1125Glu Ala Glu Ala Glu Pro Met Asn Ser Asp Glu Pro Glu Ala Cys 1130 1135 1140Phe Thr Asp Gly Cys Val Arg Arg Phe Ser Cys Cys Gln Val Asn 1145 1150 1155Ile Glu Ser Gly Lys Gly Lys Ile Trp Trp Asn Ile Arg Lys Thr 1160 1165 1170Cys Tyr Lys Ile Val Glu His Ser Trp Phe Glu Ser Phe Ile Val 1175 1180 1185Leu Met Ile Leu Leu Ser Ser Gly Ala Leu Ala Phe Glu Asp Ile 1190 1195 1200Tyr Ile Glu Arg Lys Lys Thr Ile Lys Ile Ile Leu Glu Tyr Ala 1205 1210 1215Asp Lys Ile Phe Thr Tyr Ile Phe Ile Leu Glu Met Leu Leu Lys 1220 1225 1230Trp Ile Ala Tyr Gly Tyr Lys Thr Tyr Phe Thr Asn Ala Trp Cys 1235 1240 1245Trp Leu Asp Phe Leu Ile Val Asp Val Ser Leu Val Thr Leu Val 1250 1255 1260Ala Asn Thr Leu Gly Tyr Ser Asp Leu Gly Pro Ile Lys Ser Leu 1265 1270 1275Arg Thr Leu Arg Ala Leu Arg Pro Leu Arg Ala Leu Ser Arg Phe 1280 1285 1290Glu Gly Met Arg Val Val Val Asn Ala Leu Ile Gly Ala Ile Pro 1295 1300 1305Ser Ile Met Asn Val Leu Leu Val Cys Leu Ile Phe Trp Leu Ile 1310 1315 1320Phe Ser Ile Met Gly Val Asn Leu Phe Ala Gly Lys Phe Tyr Glu 1325 1330 1335Cys Ile Asn Thr Thr Asp Gly Ser Arg Phe Pro Ala Ser Gln Val 1340 1345 1350Pro Asn Arg Ser Glu Cys Phe Ala Leu Met Asn Val Ser Gln Asn 1355 1360 1365Val Arg Trp Lys Asn Leu Lys Val Asn Phe Asp Asn Val Gly Leu 1370 1375 1380Gly Tyr Leu Ser Leu Leu Gln Val Ala Thr Phe Lys Gly Trp Thr 1385 1390 1395Ile Ile Met Tyr
Ala Ala Val Asp Ser Val Asn Val Asp Lys Gln 1400 1405 1410Pro Lys Tyr Glu Tyr Ser Leu Tyr Met Tyr Ile Tyr Phe Val Val 1415 1420 1425Phe Ile Ile Phe Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gly 1430 1435 1440Val Ile Ile Asp Asn Phe Asn Gln Gln Lys Lys Lys Leu Gly Gly 1445 1450 1455Gln Asp Ile Phe Met Thr Glu Glu Gln Lys Lys Tyr Tyr Asn Ala 1460 1465 1470Met Lys Lys Leu Gly Ser Lys Lys Pro Gln Lys Pro Ile Pro Arg 1475 1480 1485Pro Gly Asn Lys Ile Gln Gly Cys Ile Phe Asp Leu Val Thr Asn 1490 1495 1500Gln Ala Phe Asp Ile Ser Ile Met Val Leu Ile Cys Leu Asn Met 1505 1510 1515Val Thr Met Met Val Glu Lys Glu Gly Gln Ser Gln His Met Thr 1520 1525 1530Glu Val Leu Tyr Trp Ile Asn Val Val Phe Ile Ile Leu Phe Thr 1535 1540 1545Gly Glu Cys Val Leu Lys Leu Ile Ser Leu Arg His Tyr Tyr Phe 1550 1555 1560Thr Val Gly Trp Asn Ile Phe Asp Phe Val Val Val Ile Ile Ser 1565 1570 1575Ile Val Gly Met Phe Leu Ala Asp Leu Ile Glu Thr Tyr Phe Val 1580 1585 1590Ser Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Arg Ile Gly Arg 1595 1600 1605Ile Leu Arg Leu Val Lys Gly Ala Lys Gly Ile Arg Thr Leu Leu 1610 1615 1620Phe Ala Leu Met Met Ser Leu Pro Ala Leu Phe Asn Ile Gly Leu 1625 1630 1635Leu Leu Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gly Met Ser 1640 1645 1650Asn Phe Ala Tyr Val Lys Lys Glu Asp Gly Ile Asn Asp Met Phe 1655 1660 1665Asn Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gln Ile 1670 1675 1680Thr Thr Ser Ala Gly Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn 1685 1690 1695Ser Lys Pro Pro Asp Cys Asp Pro Lys Lys Val His Pro Gly Ser 1700 1705 1710Ser Val Glu Gly Asp Cys Gly Asn Pro Ser Val Gly Ile Phe Tyr 1715 1720 1725Phe Val Ser Tyr Ile Ile Ile Ser Phe Leu Val Val Val Asn Met 1730 1735 1740Tyr Ile Ala Val Ile Leu Glu Asn Phe Ser Val Ala Thr Glu Glu 1745 1750 1755Ser Thr Glu Pro Leu Ser Glu Asp Asp Phe Glu Met Phe Tyr Glu 1760 1765 1770Val Trp Glu Lys Phe Asp Pro Asp Ala Thr Gln Phe Ile Glu Phe 1775 1780 1785Ser Lys Leu Ser Asp Phe Ala Ala Ala Leu Asp Pro Pro Leu Leu 1790 1795 1800Ile Ala Lys Pro 
Asn Lys Val Gln Leu Ile Ala Met Asp Leu Pro 1805 1810 1815Met Val Ser Gly Asp Arg Ile His Cys Leu Asp Ile Leu Phe Ala 1820 1825 1830Phe Thr Lys Arg Val Leu Gly Glu Ser Gly Glu Met Asp Ser Leu 1835 1840 1845Arg Ser Gln Met Glu Glu Arg Phe Met Ser Ala Asn Pro Ser Lys 1850 1855 1860Val Ser Tyr Glu Pro Ile Thr Thr Thr Leu Lys Arg Lys Gln Glu 1865 1870 1875Asp Val Ser Ala Thr Val Ile Gln Arg Ala Tyr Arg Arg Tyr Arg 1880 1885 1890Leu Arg Gln Asn Val Lys Asn Ile Ser Ser Ile Tyr Ile Lys Asp 1895 1900 1905Gly Asp Arg Asp Asp Asp Leu Leu Asn Lys Lys Asp Met Ala Phe 1910 1915 1920Asp Asn Val Asn Glu Asn Ser Ser Pro Glu Lys Thr Asp Ala Thr 1925 1930 1935Ser Ser Thr Thr Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys Pro 1940 1945 1950Asp Lys Glu Lys Tyr Glu Gln Asp Arg Thr Glu Lys Glu Asp Lys 1955 1960 1965Gly Lys Asp Ser Lys Glu Ser Lys Lys 1970 197585934DNAArtificial SequenceSynthetic Construct 8atggcaatgt tgcctccccc aggacctcag agctttgtcc atttcacaaa acagtctctt 60gccctcattg aacaacgcat tgctgaaaga aaatcaaagg aacccaaaga agaaaagaaa 120gatgatgatg aagaagcccc aaagccaagc agtgacttgg aagctggcaa acaactgccc 180ttcgtctatg gggacattcc tcccggcatg gtgtcagagc ccctggagga cttggacccc 240tactatgcag acaaaaagac tttcatagta ttgaacaaag ggaaaacaat cttccgtttc 300aatgccacac ctgctttata tatgctttct cctttcagtc ctctaagaag aatatctatt 360aagattttag tacactcctt attcagcatg ctcatcatgt gcactattct gacaaactgc 420atatttatga ccatgaataa cccgccggac tggaccaaaa atgtcgagta cacttttact 480ggaatatata cttttgaatc acttgtaaaa atccttgcaa gaggcttctg tgtaggagaa 540ttcacttttc ttcgtgaccc gtggaactgg ctggattttg tcgtcattgt ttttgcgtat 600ttaacagaat ttgtaaacct aggcaatgtt tcagctcttc gaactttcag agtattgaga 660gctttgaaaa ctatttctgt aatcccaggc ctgaagacaa ttgtaggggc tttgatccag 720tcagtgaaga agctttctga tgtcatgatc ctgactgtgt tctgtctgag tgtgtttgca 780ctaattggac tacagctgtt catgggaaac ctgaagcata aatgttttcg aaattcactt 840gaaaataatg aaacattaga aagcataatg aataccctag agagtgaaga agactttaga 900aaatattttt attacttgga aggatccaaa gatgctctcc tttgtggttt cagcacagat 
960tcaggtcagt gtccagaggg gtacacctgt gtgaaaattg gcagaaaccc tgattatggc 1020tacacgagct ttgacacttt cagctgggcc ttcttagcct tgtttaggct aatgacccaa 1080gattactggg aaaaccttta ccaacagacg ctgcgtgctg ctggcaaaac ctacatgatc 1140ttctttgtcg tagtgatttt cctgggctcc ttttatctaa taaacttgat cctggctgtg 1200gttgccatgg catatgaaga acagaaccag gcaaacattg aagaagctaa acagaaagaa 1260ttagaatttc aacagatgtt agaccgtctt aaaaaagagc aagaagaagc tgaggcaatt 1320gcagcggcag cggctgaata tacaagtatt aggagaagca gaattatggg cctctcagag 1380agttcttctg aaacatccaa actgagctct aaaagtgcta aagaaagaag aaacagaaga 1440aagaaaaaga atcaaaagaa gctctccagt ggagaggaaa agggagatgc tgagaaattg 1500tcgaaatcag aatcagagga cagcatcaga agaaaaagtt tccaccttgg tgtcgaaggg 1560cataggcgag cacatgaaaa gaggttgtct acccccaatc agtcaccact cagcattcgt 1620ggctccttgt tttctgcaag gcgaagcagc agaacaagtc tttttagttt caaaggcaga 1680ggaagagata taggatctga gactgaattt gccgatgatg agcacagcat ttttggagac 1740aatgagagca gaaggggctc actgtttgtg ccccacagac cccaggagcg acgcagcagt 1800aacatcagcc aagccagtag gtccccacca atgctgccgg tgaacgggaa aatgcacagt 1860gctgtggact gcaacggtgt ggtctccctg gttgatggac gctcagccct catgctcccc 1920aatggacagc ttctgccaga gggcacgacc aatcaaatac acaagaaaag gcgttgtagt 1980tcctatctcc tttcagagga tatgctgaat gatcccaacc tcagacagag agcaatgagt 2040agagcaagca tattaacaaa cactgtggaa gaacttgaag agtccagaca aaaatgtcca 2100ccttggtggt acagatttgc acacaaattc ttgatctgga attgctctcc atattggata 2160aaattcaaaa agtgtatcta ttttattgta atggatcctt ttgtagatct tgcaattacc 2220atttgcatag ttttaaacac attatttatg gctatggaac accacccaat gactgaggaa 2280ttcaaaaatg tacttgctat aggaaatttg gtctttactg gaatctttgc agctgaaatg 2340gtattaaaac tgattgccat ggatccatat gagtatttcc aagtaggctg gaatattttt 2400gacagcctta ttgtgacttt aagtttagtg gagctctttc tagcagatgt ggaaggattg 2460tcagttctgc gatcattcag actgctccga gtcttcaagt tggcaaaatc ctggccaaca 2520ttgaacatgc tgattaagat cattggtaac tcagtagggg ctctaggtaa cctcacctta 2580gtgttggcca tcatcgtctt catttttgct gtggtcggca tgcagctctt tggtaagagc 2640tacaaagaat gtgtctgcaa gatcaatgat 
gactgtacgc tcccacggtg gcacatgaac 2700gacttcttcc actccttcct gattgtgttc cgcgtgctgt gtggagagtg gatagagacc 2760atgtgggact gtatggaggt cgctggtcaa gctatgtgcc ttattgttta catgatggtc 2820atggtcattg gaaacctggt ggtcctaaac ctatttctgg ccttattatt gagctcattt 2880agttcagaca atcttacagc aattgaagaa gaccctgatg caaacaacct ccagattgca 2940gtgactagaa ttaaaaaggg aataaattat gtgaaacaaa ccttacgtga atttattcta 3000aaagcatttt ccaaaaagcc aaagatttcc agggagataa gacaagcaga agatctgaat 3060actaagaagg aaaactatat ttctaaccat acacttgctg aaatgagcaa aggtcacaat 3120ttcctcaagg aaaaagataa aatcagtggt tttggaagca gcgtggacaa acacttgatg 3180gaagacagtg atggtcaatc atttattcac aatcccagcc tcacagtgac agtgccaatt 3240gcacctgggg aatccgattt ggaaaatatg aatgctgagg aacttagcag tgattcggat 3300agtgaataca gcaaagtgag attaaaccgg tcaagctcct cagagtgcag cacagttgat 3360aaccctttgc ctggagaagg agaagaagca gaggctgaac ctatgaattc cgatgagcca 3420gaggcctgtt tcacagatgg ttgtgtacgg aggttctcat gctgccaagt taacatagag 3480tcagggaaag gaaaaatctg gtggaacatc aggaaaacct gctacaagat tgttgaacac 3540agttggtttg aaagcttcat tgtcctcatg atcctgctca gcagtggtgc cctggctttt 3600gaagatattt atattgaaag gaaaaagacc attaagatta tcctggagta tgcagacaag 3660atcttcactt acatcttcat tctggaaatg cttctaaaat ggatagcata tggttataaa 3720acatatttca ccaatgcctg gtgttggctg gatttcctaa ttgttgatgt ttctttggtt 3780actttagtgg caaacactct tggctactca gatcttggcc ccattaaatc ccttcggaca 3840ctgagagctt taagacctct aagagcctta tctagatttg aaggaatgag ggtcgttgtg 3900aatgcactca taggagcaat tccttccatc atgaatgtgc tacttgtgtg tcttatattc 3960tggctgatat tcagcatcat gggagtaaat ttgtttgctg gcaagttcta tgagtgtatt 4020aacaccacag atgggtcacg gtttcctgca agtcaagttc caaatcgttc cgaatgtttt 4080gcccttatga atgttagtca aaatgtgcga tggaaaaacc tgaaagtgaa ctttgataat 4140gtcggacttg gttacctatc tctgcttcaa gttgcaactt ttaagggatg gacgattatt 4200atgtatgcag cagtggattc tgttaatgta gacaagcagc ccaaatatga atatagcctc 4260tacatgtata tttattttgt cgtctttatc atctttgggt cattcttcac tttgaacttg 4320ttcattggtg tcatcataga taatttcaac caacagaaaa agaagcttgg aggtcaagac 
4380atctttatga cagaagaaca gaagaaatac tataatgcaa tgaaaaagct ggggtccaag 4440aagccacaaa agccaattcc tcgaccaggg aacaaaatcc aaggatgtat atttgaccta 4500gtgacaaatc aagcctttga tattagtatc atggttctta tctgtctcaa catggtaacc 4560atgatggtag aaaaggaggg tcaaagtcaa catatgactg aagttttata ttggataaat 4620gtggttttta taatcctttt cactggagaa tgtgtgctaa aactgatctc cctcagacac 4680tactacttca ctgtaggatg gaatattttt gattttgtgg ttgtgattat ctccattgta 4740ggtatgtttc tagctgattt gattgaaacg tattttgtgt cccctaccct gttccgagtg 4800atccgtcttg ccaggattgg ccgaatccta cgtctagtca aaggagcaaa ggggatccgc 4860acgctgctct ttgctttgat gatgtccctt cctgcgttgt ttaacatcgg cctcctgctc 4920ttcctggtca tgttcatcta cgccatcttt ggaatgtcca actttgccta tgttaaaaag 4980gaagatggaa ttaatgacat gttcaatttt gagacctttg gcaacagtat gatttgcctg 5040ttccaaatta caacctctgc tggctgggat ggattgctag cacctattct taacagtaag 5100ccacccgact gtgacccaaa aaaagttcat cctggaagtt cagttgaagg agactgtggt 5160aacccatctg ttggaatatt ctactttgtt agttatatca tcatatcctt cctggttgtg 5220gtgaacatgt acattgcagt catactggag aattttagtg ttgccactga agaaagtact 5280gaacctctga gtgaggatga ctttgagatg ttctatgagg tttgggagaa gtttgatccc 5340gatgcgaccc agtttataga gttctctaaa ctctctgatt ttgcagctgc cctggatcct 5400cctcttctca tagcaaaacc caacaaagtc cagctcattg ccatggatct gcccatggtt 5460agtggtgacc ggatccattg tcttgacatc ttatttgctt ttacaaagcg tgttttgggt 5520gagagtgggg agatggattc tcttcgttca cagatggaag aaaggttcat gtctgcaaat 5580ccttccaaag tgtcctatga acccatcaca accacactaa aacggaaaca agaggatgtg 5640tctgctactg tcattcagcg tgcttataga cgttaccgct taaggcaaaa tgtcaaaaat 5700atatcaagta tatacataaa agatggagac agagatgatg atttactcaa taaaaaagat 5760atggcttttg ataatgttaa tgagaactca agtccagaaa aaacagatgc cacttcatcc 5820accacctctc caccttcata tgatagtgta acaaagccag acaaagagaa atatgaacaa 5880gacagaacag aaaaggaaga caaagggaaa gacagcaagg aaagcaaaaa atag 593495934DNAArtificial SequenceSynthetic Construct 9atggcaatgt tgcctccccc aggacctcag agctttgtcc atttcacaaa acagtctctt 60gccctcattg aacaacgcat tgctgaaaga aaatcaaagg aacccaaaga agaaaagaaa 
120gatgatgatg aagaagcccc aaagccaagc agtgacttgg aagctggcaa acaactgccc 180ttcatctatg gggacattcc tcccggcatg gtgtcagagc ccctggagga cttggacccc 240tactatgcag acaaaaagac tttcatagta ttgaacaaag ggaaaacaat cttccgtttc 300aatgccacac ctgctttata tatgctttct cctttcagtc ctctaagaag aatatctatt 360aagattttag tacactcctt attcagcatg ctcatcatgt gcactattct gacaaactgc 420atatttatga ccatgaataa cccgcaggac tggaccaaaa atgtcgagta cacttttact 480ggaatatata cttttgaatc acttgtaaaa atccttgcaa gaggcttctg tgtaggagaa 540ttcacttttc ttcgtgaccc gtggaactgg ctggattttg tcgtcattgt ttttgcgtat 600ttaacagaat ttgtaaacct aggcaatgtt tcagctcttc gaactttcag agtattgaga 660gctttgaaaa ctatttctgt aatcccaggc ctgaagacaa ttgtaggggc tttgatccag 720tcagtgaaga agctttctga tgtcatgatc ctgactgtgt tctgtctgag tgtgtttgca 780ctaattggac tacagctgtt catgggaaac ctgaagcata aatgttttcg aaattcactt 840gaaaataatg aaacattaga aagcataatg aataccctag agagtgaaga agactttaga 900aaatattttt attacttgga aggatccaaa gatgctctcc tttgtggttt cagcacagat 960tcaggtcagt gtccagaggg gtacacctgt gtgaaaattg gcagaaaccc tgattatggc 1020tacacgagct ttgacacttt cagctgggcc ttcttagcct tgtttaggct aatgacccaa 1080gattactggg aaaaccttta ccaacagacg ctgcgtgctg ctggcaaaac ctacatgatc 1140ttctttgtcg tagtgatttt cctgggctcc ttttatctaa taaacttgat cctggctgtg 1200gttgccatgg catatgaaga acagaaccag gcaaacattg aagaagctaa acagaaagaa 1260ttagaatttc aacagatgtt agaccgtctt aaaaaagagc aagaagaagc tgaggcaatt 1320gcagcggcag cggctgaata tacaagtatt aggagaagca gaattatggg cctctcagag 1380agttcttctg aaacatccaa actgagctct aaaagtgcta aagaaagaag aaacagaaga 1440aagaaaaaga atcaaaagaa gctctccagt ggagaggaaa agggagatgc tgagaaattg 1500tcgaaatcag aatcagagga cagcatcaga agaaaaagtt tccaccttgg tgtcgaaggg 1560cataggcgag cacatgaaaa gaggttgtct acccccaatc agtcaccact cagcattcgt 1620ggctccttgt tttctgcaag gcgaagcagc agaacaagtc tttttagttt caaaggcaga 1680ggaagagata taggatctga gactgaattt gccgatgatg agcacagcat ttttggagac 1740aatgagagca gaaggggctc actgtttgtg ccccacagac cccaggagcg acgcagcagt 1800aacatcagcc aagccagtag gtccccacca atgctgccgg 
tgaacgggaa aatgcacagt 1860gctgtggact gcaacggtgt ggtctccctg gttgatggac gctcagccct catgctcccc 1920aatggacagc ttctgccaga gggcacgacc aatcaaatac acaagaaaag gcgttgtagt 1980tcctatctcc tttcagagga tatgctgaat gatcccaacc tcagacagag agcaatgagt 2040agagcaagca tattaacaaa cactgtggaa gaacttgaag agtccagaca aaaatgtcca 2100ccttggtggt acagatttgc acacaaattc ttgatctgga attgctctcc atattggata 2160aaattcaaaa agtgtatcta ttttattgta atggatcctt ttgtagatct tgcaattacc 2220atttgcatag ttttaaacac attatttatg gctatggaac accacccaat gactgaggaa 2280ttcaaaaatg tacttgctat aggaaatttg gtctttactg gaatctttgc agctgaaatg 2340gtattaaaac tgattgccat ggatccatat gagtatttcc aagtaggctg gaatattttt 2400gacagcctta ttgtgacttt aagtttagtg gagctctttc tagcagatgt ggaaggattg 2460tcagttctgc gatcattcag actgctccga gtcttcaagt tggcaaaatc ctggccaaca 2520ttgaacatgc tgattaagat cattggtaac tcagtagggg ctctaggtaa cctcacctta 2580gtgttggcca tcatcgtctt catttttgct gtggtcggca tgcagctctt tggtaagagc 2640tacaaagaat gtgtctgcaa gatcaatgat gactgtacgc tcccacggtg gcacatgaac 2700gacttcttcc actccttcct gattgtgttc cgcgtgctgt gtggagagtg gatagagacc 2760atgtgggact gtatggaggt cgctggtcaa gctatgtgcc ttattgttta catgatggtc 2820atggtcattg gaaacctggt ggtcctaaac ctatttctgg ccttattatt gagctcattt 2880agttcagaca atcttacagc aattgaagaa gaccctgatg caaacaacct ccagattgca 2940gtgactagaa ttaaaaaggg aataaattat gtgaaacaaa ccttacgtga atttattcta 3000aaagcatttt ccaaaaagcc aaagatttcc agggagataa gacaagcaga agatctgaat 3060actaagaagg aaaactatat ttctaaccat acacttgctg aaatgagcaa aggtcacaat 3120ttcctcaagg aaaaagataa aatcagtggt tttggaagca gcgtggacaa acacttgatg 3180gaagacagtg atggtcaatc atttattcac aatcccagcc tcacagtgac agtgccaatt 3240gcacctgggg aatccgattt ggaaaatatg aatgctgagg aacttagcag tgattcggat 3300agtgaataca gcaaagtgag attaaaccgg tcaagctcct cagagtgcag cacagttgat 3360aaccctttgc ctggagaagg agaagaagca gaggctgaac ctatgaattc cgatgagcca 3420gaggcctgtt tcacagatgg ttgtgtacgg aggttctcat gctgccaagt taacatagag 3480tcagggaaag gaaaaatctg gtggaacatc aggaaaacct gctacaagat tgttgaacac 3540agttggtttg 
aaagcttcat tgtcctcatg atcctgctca gcagtggtgc cctggctttt 3600gaagatattt atattgaaag gaaaaagacc attaagatta tcctggagta tgcagacaag 3660atcttcactt acatcttcat tctggaaatg cttctaaaat ggatagcata tggttataaa 3720acatatttca ccaatgcctg gtgttggctg gatttcctaa ttgttgatgt ttctttggtt 3780actttagtgg caaacactct tggctactca gatcttggcc ccattaaatc ccttcggaca 3840ctgagagctt taagacctct aagagcctta tctagatttg aaggaatgag ggtcgttgtg 3900aatgcactca taggagcaat tccttccatc atgaatgtgc tacttgtgtg tcttatattc 3960tggctgatat tcagcatcat gggagtaaat ttgtttgctg gcaagttcta tgagtgtatt 4020aacaccacag atgggtcacg gtttcctgca agtcaagttc caaatcgttc cgaatgtttt 4080gcccttatga atgttagtca aaatgtgcga tggaaaaacc tgaaagtgaa ctttgataat 4140gtcggacttg gttacctatc tctgcttcaa gttgcaactt ttaagggatg gacgattatt 4200atgtatgcag cagtggattc tgttaatgta gacaagcagc ccaaatatga atatagcctc 4260tacatgtata tttattttgt cgtctttatc atctttgggt cattcttcac tttgaacttg 4320ttcattggtg tcatcataga taatttcaac caacagaaaa agaagcttgg aggtcaagac 4380atctttatga cagaagaaca gaagaaatac tataatgcaa tgaaaaagct ggggtccaag 4440aagccacaaa agccaattcc tcgaccaggg aacaaaatcc aaggatgtat atttgaccta 4500gtgacaaatc aagcctttga tattagtatc atggttctta tctgtctcaa catggtaacc 4560atgatggtag aaaaggaggg tcaaagtcaa catatgactg aagttttata ttggataaat 4620gtggttttta taatcctttt cactggagaa tgtgtgctaa aactgatctc cctcagacac 4680tactacttca ctgtaggatg gaatattttt gattttgtgg ttgtgattat ctccattgta 4740ggtatgtttc tagctgattt gattgaaacg tattttgtgt cccctaccct gttccgagtg 4800atccgtcttg ccaggattgg ccgaatccta cgtctagtca aaggagcaaa ggggatccgc 4860acgctgctct ttgctttgat gatgtccctt cctgcgttgt ttaacatcgg cctcctgctc 4920ttcctggtca tgttcatcta cgccatcttt ggaatgtcca actttgccta tgttaaaaag 4980gaagatggaa ttaatgacat gttcaatttt gagacctttg gcaacagtat gatttgcctg 5040ttccaaatta caacctctgc tggctgggat ggattgctag cacctattct taacagtaag 5100ccacccgact gtgacccaaa aaaagttcat cctggaagtt cagttgaagg agactgtggt 5160aacccatctg ttggaatatt ctactttgtt agttatatca tcatatcctt cctggttgtg 5220gtgaacatgt acattgcagt catactggag aattttagtg 
ttgccactga agaaagtact 5280gaacctctga gtgaggatga ctttgagatg ttctatgagg tttgggagaa gtttgatccc 5340gatgcgaccc agtttataga gttctctaaa ctctctgatt ttgcagctgc cctggatcct 5400cctcttctca tagcaaaacc caacaaagtc cagctcattg ccatggatct gcccatggtt 5460agtggtgacc ggatccattg
tcttgacatc ttatttgctt ttacaaagcg tgttttgggt 5520gagagtgggg agatggattc tcttcgttca cagatggaag aaaggttcat gtctgcaaat 5580ccttccaaag tgtcctatga acccatcaca accacactaa aacggaaaca agaggatgtg 5640tctgctactg tcattcagcg tgcttataga cgttaccgct taaggcaaaa tgtcaaaaat 5700atatcaagta tatacataaa agatggagac agagatgatg atttactcaa taaaaaagat 5760atggcttttg ataatgttaa tgagaactca agtccagaaa aaacagatgc cacttcatcc 5820accacctctc caccttcata tgatagtgta acaaagccag acaaagagaa atatgaacaa 5880gacagaacag aaaaggaaga caaagggaaa gacagcaagg aaagcaaaaa atag 5934105934DNAArtificial SequenceSynthetic Construct 10atggcaatgt tgcctccccc aggacctcag agctttgtcc atttcacaaa acagtctctt 60gccctcattg aacaacgcat tgctgaaaga aaatcaaagg aacccaaaga agaaaagaaa 120gatgatgatg aagaagcccc aaagccaagc agtgacttgg aagctggcaa acaactgccc 180ttcatctatg gggacattcc tcccggcatg gtgtcagagc ccctggagga cttggacccc 240tactatgcag acaaaaagac tttcatagta ttgaacaaag ggaaaacaat cttccgtttc 300aatgccacac ctgctttata tatgctttct cctttcagtc ctctaagaag aatatctatt 360aagattttag tacactcctt attcagcatg ctcatcatgt gcactattct gacaaactgc 420atatttatga ccatgaataa cccgccggac tggaccaaaa atgtcgagta cacttttact 480ggaatatata cttttgaatc acttgtaaaa atccttgcaa gaggcttctg tgtaggagaa 540ttcacttttc ttcgtgaccc gtggaactgg ctggattttg tcgtcattgt ttttgcgtat 600ttaacagaat ttgtaaacct aggcaatgtt tcagctcttc gaactttcag agtattgaga 660gctttgaaaa ctatttctgt aatcccaggc ctgaagacaa ttgtaggggc tttgatccag 720tcagtgaaga agctttctga tgtcatgatc ctgactgtgt tctgtctgag tgtgtttgca 780ctaattggac tacagctgtt catgggaaac ctgaagcata aatgttttcg aaattcactt 840gaaaataatg aaacattaga aagcataatg aataccctag agagtgaaga agactttaga 900aaatattttt attacttgga aggatccaaa gatgctctcc tttgtggttt cagcacagat 960tcaggtcagt gtccagaggg gtacacctgt gtgaaaattg gcagaaaccc tgattatggc 1020tacacgagct ttgacacttt cagctgggcc ttcttagcct tgtttaggct aatgacccaa 1080gattactggg aaaaccttta ccaacagacg ctgcgtgctg ctggcaaaac ctacatgatc 1140ttctttgtcg tagtgatttt cctgggctcc ttttatctaa taaacttgat cctggctgtg 1200gttgccatgg catatgaaga acagaaccag 
gcaaacattg aagaagctaa acagaaagaa 1260ttagaatttc aacagatgtt agaccgtctt aaaaaagagc aagaagaagc tgaggcaatt 1320gcagcggcag cggctgaata tacaagtatt aggagaagca gaattatggg cctctcagag 1380agttcttctg aaacatccaa actgagctct aaaagtgcta aagaaagaag aaacagaaga 1440aagaaaaaga atcaaaagaa gctctccagt ggagaggaaa agggagatgc tgagaaattg 1500tcgaaatcag aatcagagga cagcatcaga agaaaaagtt tccaccttgg tgtcgaaggg 1560cataggcgag cacatgaaaa gaggttgtct acccccaatc agtcaccact cagcattcgt 1620ggctccttgt tttctgcaag gcgaagcagc agaacaagtc tttttagttt caaaggcaga 1680ggaagagata taggatctga gactgaattt gccgatgatg agcacagcat ttttggagac 1740aatgagagca gaaggggctc actgtttgtg ccccacagac cccaggagcg acgcagcagt 1800aacatcagcc aagccagtag gtccccacca atgctgccgg tgaacgggaa aatgcacagt 1860gctgtggact gcaacggtgt ggtctccctg gttgatggac gctcagccct catgctcccc 1920tatggacagc ttctgccaga gggcacgacc aatcaaatac acaagaaaag gcgttgtagt 1980tcctatctcc tttcagagga tatgctgaat gatcccaacc tcagacagag agcaatgagt 2040agagcaagca tattaacaaa cactgtggaa gaacttgaag agtccagaca aaaatgtcca 2100ccttggtggt acagatttgc acacaaattc ttgatctgga attgctctcc atattggata 2160aaattcaaaa agtgtatcta ttttattgta atggatcctt ttgtagatct tgcaattacc 2220atttgcatag ttttaaacac attatttatg gctatggaac accacccaat gactgaggaa 2280ttcaaaaatg tacttgctat aggaaatttg gtctttactg gaatctttgc agctgaaatg 2340gtattaaaac tgattgccat ggatccatat gagtatttcc aagtaggctg gaatattttt 2400gacagcctta ttgtgacttt aagtttagtg gagctctttc tagcagatgt ggaaggattg 2460tcagttctgc gatcattcag actgctccga gtcttcaagt tggcaaaatc ctggccaaca 2520ttgaacatgc tgattaagat cattggtaac tcagtagggg ctctaggtaa cctcacctta 2580gtgttggcca tcatcgtctt catttttgct gtggtcggca tgcagctctt tggtaagagc 2640tacaaagaat gtgtctgcaa gatcaatgat gactgtacgc tcccacggtg gcacatgaac 2700gacttcttcc actccttcct gattgtgttc cgcgtgctgt gtggagagtg gatagagacc 2760atgtgggact gtatggaggt cgctggtcaa gctatgtgcc ttattgttta catgatggtc 2820atggtcattg gaaacctggt ggtcctaaac ctatttctgg ccttattatt gagctcattt 2880agttcagaca atcttacagc aattgaagaa gaccctgatg caaacaacct ccagattgca 
2940gtgactagaa ttaaaaaggg aataaattat gtgaaacaaa ccttacgtga atttattcta 3000aaagcatttt ccaaaaagcc aaagatttcc agggagataa gacaagcaga agatctgaat 3060actaagaagg aaaactatat ttctaaccat acacttgctg aaatgagcaa aggtcacaat 3120ttcctcaagg aaaaagataa aatcagtggt tttggaagca gcgtggacaa acacttgatg 3180gaagacagtg atggtcaatc atttattcac aatcccagcc tcacagtgac agtgccaatt 3240gcacctgggg aatccgattt ggaaaatatg aatgctgagg aacttagcag tgattcggat 3300agtgaataca gcaaagtgag attaaaccgg tcaagctcct cagagtgcag cacagttgat 3360aaccctttgc ctggagaagg agaagaagca gaggctgaac ctatgaattc cgatgagcca 3420gaggcctgtt tcacagatgg ttgtgtacgg aggttctcat gctgccaagt taacatagag 3480tcagggaaag gaaaaatctg gtggaacatc aggaaaacct gctacaagat tgttgaacac 3540agttggtttg aaagcttcat tgtcctcatg atcctgctca gcagtggtgc cctggctttt 3600gaagatattt atattgaaag gaaaaagacc attaagatta tcctggagta tgcagacaag 3660atcttcactt acatcttcat tctggaaatg cttctaaaat ggatagcata tggttataaa 3720acatatttca ccaatgcctg gtgttggctg gatttcctaa ttgttgatgt ttctttggtt 3780actttagtgg caaacactct tggctactca gatcttggcc ccattaaatc ccttcggaca 3840ctgagagctt taagacctct aagagcctta tctagatttg aaggaatgag ggtcgttgtg 3900aatgcactca taggagcaat tccttccatc atgaatgtgc tacttgtgtg tcttatattc 3960tggctgatat tcagcatcat gggagtaaat ttgtttgctg gcaagttcta tgagtgtatt 4020aacaccacag atgggtcacg gtttcctgca agtcaagttc caaatcgttc cgaatgtttt 4080gcccttatga atgttagtca aaatgtgcga tggaaaaacc tgaaagtgaa ctttgataat 4140gtcggacttg gttacctatc tctgcttcaa gttgcaactt ttaagggatg gacgattatt 4200atgtatgcag cagtggattc tgttaatgta gacaagcagc ccaaatatga atatagcctc 4260tacatgtata tttattttgt cgtctttatc atctttgggt cattcttcac tttgaacttg 4320ttcattggtg tcatcataga taatttcaac caacagaaaa agaagcttgg aggtcaagac 4380atctttatga cagaagaaca gaagaaatac tataatgcaa tgaaaaagct ggggtccaag 4440aagccacaaa agccaattcc tcgaccaggg aacaaaatcc aaggatgtat atttgaccta 4500gtgacaaatc aagcctttga tattagtatc atggttctta tctgtctcaa catggtaacc 4560atgatggtag aaaaggaggg tcaaagtcaa catatgactg aagttttata ttggataaat 4620gtggttttta taatcctttt cactggagaa 
tgtgtgctaa aactgatctc cctcagacac 4680tactacttca ctgtaggatg gaatattttt gattttgtgg ttgtgattat ctccattgta 4740ggtatgtttc tagctgattt gattgaaacg tattttgtgt cccctaccct gttccgagtg 4800atccgtcttg ccaggattgg ccgaatccta cgtctagtca aaggagcaaa ggggatccgc 4860acgctgctct ttgctttgat gatgtccctt cctgcgttgt ttaacatcgg cctcctgctc 4920ttcctggtca tgttcatcta cgccatcttt ggaatgtcca actttgccta tgttaaaaag 4980gaagatggaa ttaatgacat gttcaatttt gagacctttg gcaacagtat gatttgcctg 5040ttccaaatta caacctctgc tggctgggat ggattgctag cacctattct taacagtaag 5100ccacccgact gtgacccaaa aaaagttcat cctggaagtt cagttgaagg agactgtggt 5160aacccatctg ttggaatatt ctactttgtt agttatatca tcatatcctt cctggttgtg 5220gtgaacatgt acattgcagt catactggag aattttagtg ttgccactga agaaagtact 5280gaacctctga gtgaggatga ctttgagatg ttctatgagg tttgggagaa gtttgatccc 5340gatgcgaccc agtttataga gttctctaaa ctctctgatt ttgcagctgc cctggatcct 5400cctcttctca tagcaaaacc caacaaagtc cagctcattg ccatggatct gcccatggtt 5460agtggtgacc ggatccattg tcttgacatc ttatttgctt ttacaaagcg tgttttgggt 5520gagagtgggg agatggattc tcttcgttca cagatggaag aaaggttcat gtctgcaaat 5580ccttccaaag tgtcctatga acccatcaca accacactaa aacggaaaca agaggatgtg 5640tctgctactg tcattcagcg tgcttataga cgttaccgct taaggcaaaa tgtcaaaaat 5700atatcaagta tatacataaa agatggagac agagatgatg atttactcaa taaaaaagat 5760atggcttttg ataatgttaa tgagaactca agtccagaaa aaacagatgc cacttcatcc 5820accacctctc caccttcata tgatagtgta acaaagccag acaaagagaa atatgaacaa 5880gacagaacag aaaaggaaga caaagggaaa gacagcaagg aaagcaaaaa atag 5934115934DNAArtificial SequenceSynthetic Construct 11atggcaatgt tgcctccccc aggacctcag agctttgtcc atttcacaaa acagtctctt 60gccctcattg aacaacgcat tgctgaaaga aaatcaaagg aacccaaaga agaaaagaaa 120gatgatgatg aagaagcccc aaagccaagc agtgacttgg aagctggcaa acaactgccc 180ttcatctatg gggacattcc tcccggcatg gtgtcagagc ccctggagga cttggacccc 240tactatgcag acaaaaagac tttcatagta ttgaacaaag ggaaaacaat cttccgtttc 300aatgccacac ctgctttata tatgctttct cctttcagtc ctctaagaag aatatctatt 360aagattttag tacactcctt attcagcatg 
ctcatcatgt gcactattct gacaaactgc 420atatttatga ccatgaataa cccgccggac tggaccaaaa atgtcgagta cacttttact 480ggaatatata cttttgaatc acttgtaaaa atccttgcaa gaggcttctg tgtaggagaa 540ttcacttttc ttcgtgaccc gtggaactgg ctggattttg tcgtcattgt ttttgcgtat 600ttaacagaat ttgtaaacct aggcaatgtt tcagctcttc gaactttcag agtattgaga 660gctttgaaaa ctatttctgt aatcccaggc ctgaagacaa ttgtaggggc tttgatccag 720tcagtgaaga agctttctga tgtcatgatc ctgactgtgt tctgtctgag tgtgtttgca 780ctaattggac tacagctgtt catgggaaac ctgaagcata aatgttttcg aaattcactt 840gaaaataatg aaacattaga aagcataatg aataccctag agagtgaaga agactttaga 900aaatattttt attacttgga aggatccaaa gatgctctcc tttgtggttt cagcacagat 960tcaggtcagt gtccagaggg gtacacctgt gtgaaaattg gcagaaaccc tgattatggc 1020tacacgagct ttgacacttt cagctgggcc ttcttagcct tgtttaggct aatgacccaa 1080gattactggg aaaaccttta ccaacagacg ctgcgtgctg ctggcaaaac ctacatgatc 1140ttctttgtcg tagtgatttt cctgggctcc ttttatctaa taaacttgat cctggctgtg 1200gttgccatgg catatgaaga acagaaccag gcaaacattg aagaagctaa acagaaagaa 1260ttagaatttc aacagatgtt agaccgtctt aaaaaagagc aagaagaagc tgaggcaatt 1320gcagcggcag cggctgaata tacaagtatt aggagaagca gaattatggg cctctcagag 1380agttcttctg aaacatccaa actgagctct aaaagtgcta aagaaagaag aaacagaaga 1440aagaaaaaga atcaaaagaa gctctccagt ggagaggaaa agggagatgc tgagaaattg 1500tcgaaatcag aatcagagga cagcatcaga agaaaaagtt tccaccttgg tgtcgaaggg 1560cataggcgag cacatgaaaa gaggttgtct acccccaatc agtcaccact cagcattcgt 1620ggctccttgt tttctgcaag gcgaagcagc agaacaagtc tttttagttt caaaggcaga 1680ggaagagata taggatctga gactgaattt gccgatgatg agcacagcat ttttggagac 1740aatgagagca gaaggggctc actgtttgtg ccccacagac cccaggagcg acgcagcagt 1800aacatcagcc aagccagtag gtccccacca atgctgccgg tgaacgggaa aatgcacagt 1860gctgtggact gcaacggtgt ggtctccctg gttgatggac gctcagccct catgctcccc 1920aatggacagc ttctgccaga gggcacgacc aatcaaatac acaggaaaag gcgttgtagt 1980tcctatctcc tttcagagga tatgctgaat gatcccaacc tcagacagag agcaatgagt 2040agagcaagca tattaacaaa cactgtggaa gaacttgaag agtccagaca aaaatgtcca 2100ccttggtggt 
acagatttgc acacaaattc ttgatctgga attgctctcc atattggata 2160aaattcaaaa agtgtatcta ttttattgta atggatcctt ttgtagatct tgcaattacc 2220atttgcatag ttttaaacac attatttatg gctatggaac accacccaat gactgaggaa 2280ttcaaaaatg tacttgctat aggaaatttg gtctttactg gaatctttgc agctgaaatg 2340gtattaaaac tgattgccat ggatccatat gagtatttcc aagtaggctg gaatattttt 2400gacagcctta ttgtgacttt aagtttagtg gagctctttc tagcagatgt ggaaggattg 2460tcagttctgc gatcattcag actgctccga gtcttcaagt tggcaaaatc ctggccaaca 2520ttgaacatgc tgattaagat cattggtaac tcagtagggg ctctaggtaa cctcacctta 2580gtgttggcca tcatcgtctt catttttgct gtggtcggca tgcagctctt tggtaagagc 2640tacaaagaat gtgtctgcaa gatcaatgat gactgtacgc tcccacggtg gcacatgaac 2700gacttcttcc actccttcct gattgtgttc cgcgtgctgt gtggagagtg gatagagacc 2760atgtgggact gtatggaggt cgctggtcaa gctatgtgcc ttattgttta catgatggtc 2820atggtcattg gaaacctggt ggtcctaaac ctatttctgg ccttattatt gagctcattt 2880agttcagaca atcttacagc aattgaagaa gaccctgatg caaacaacct ccagattgca 2940gtgactagaa ttaaaaaggg aataaattat gtgaaacaaa ccttacgtga atttattcta 3000aaagcatttt ccaaaaagcc aaagatttcc agggagataa gacaagcaga agatctgaat 3060actaagaagg aaaactatat ttctaaccat acacttgctg aaatgagcaa aggtcacaat 3120ttcctcaagg aaaaagataa aatcagtggt tttggaagca gcgtggacaa acacttgatg 3180gaagacagtg atggtcaatc atttattcac aatcccagcc tcacagtgac agtgccaatt 3240gcacctgggg aatccgattt ggaaaatatg aatgctgagg aacttagcag tgattcggat 3300agtgaataca gcaaagtgag attaaaccgg tcaagctcct cagagtgcag cacagttgat 3360aaccctttgc ctggagaagg agaagaagca gaggctgaac ctatgaattc cgatgagcca 3420gaggcctgtt tcacagatgg ttgtgtacgg aggttctcat gctgccaagt taacatagag 3480tcagggaaag gaaaaatctg gtggaacatc aggaaaacct gctacaagat tgttgaacac 3540agttggtttg aaagcttcat tgtcctcatg atcctgctca gcagtggtgc cctggctttt 3600gaagatattt atattgaaag gaaaaagacc attaagatta tcctggagta tgcagacaag 3660atcttcactt acatcttcat tctggaaatg cttctaaaat ggatagcata tggttataaa 3720acatatttca ccaatgcctg gtgttggctg gatttcctaa ttgttgatgt ttctttggtt 3780actttagtgg caaacactct tggctactca gatcttggcc 
ccattaaatc ccttcggaca 3840ctgagagctt taagacctct aagagcctta tctagatttg aaggaatgag ggtcgttgtg 3900aatgcactca taggagcaat tccttccatc atgaatgtgc tacttgtgtg tcttatattc 3960tggctgatat tcagcatcat gggagtaaat ttgtttgctg gcaagttcta tgagtgtatt 4020aacaccacag atgggtcacg gtttcctgca agtcaagttc caaatcgttc cgaatgtttt 4080gcccttatga atgttagtca aaatgtgcga tggaaaaacc tgaaagtgaa ctttgataat 4140gtcggacttg gttacctatc tctgcttcaa gttgcaactt ttaagggatg gacgattatt 4200atgtatgcag cagtggattc tgttaatgta gacaagcagc ccaaatatga atatagcctc 4260tacatgtata tttattttgt cgtctttatc atctttgggt cattcttcac tttgaacttg 4320ttcattggtg tcatcataga taatttcaac caacagaaaa agaagcttgg aggtcaagac 4380atctttatga cagaagaaca gaagaaatac tataatgcaa tgaaaaagct ggggtccaag 4440aagccacaaa agccaattcc tcgaccaggg aacaaaatcc aaggatgtat atttgaccta 4500gtgacaaatc aagcctttga tattagtatc atggttctta tctgtctcaa catggtaacc 4560atgatggtag aaaaggaggg tcaaagtcaa catatgactg aagttttata ttggataaat 4620gtggttttta taatcctttt cactggagaa tgtgtgctaa aactgatctc cctcagacac 4680tactacttca ctgtaggatg gaatattttt gattttgtgg ttgtgattat ctccattgta 4740ggtatgtttc tagctgattt gattgaaacg tattttgtgt cccctaccct gttccgagtg 4800atccgtcttg ccaggattgg ccgaatccta cgtctagtca aaggagcaaa ggggatccgc 4860acgctgctct ttgctttgat gatgtccctt cctgcgttgt ttaacatcgg cctcctgctc 4920ttcctggtca tgttcatcta cgccatcttt ggaatgtcca actttgccta tgttaaaaag 4980gaagatggaa ttaatgacat gttcaatttt gagacctttg gcaacagtat gatttgcctg 5040ttccaaatta caacctctgc tggctgggat ggattgctag cacctattct taacagtaag 5100ccacccgact gtgacccaaa aaaagttcat cctggaagtt cagttgaagg agactgtggt 5160aacccatctg ttggaatatt ctactttgtt agttatatca tcatatcctt cctggttgtg 5220gtgaacatgt acattgcagt catactggag aattttagtg ttgccactga agaaagtact 5280gaacctctga gtgaggatga ctttgagatg ttctatgagg tttgggagaa gtttgatccc 5340gatgcgaccc agtttataga gttctctaaa ctctctgatt ttgcagctgc cctggatcct 5400cctcttctca tagcaaaacc caacaaagtc cagctcattg ccatggatct gcccatggtt 5460agtggtgacc ggatccattg tcttgacatc ttatttgctt ttacaaagcg tgttttgggt 5520gagagtgggg 
agatggattc tcttcgttca cagatggaag aaaggttcat gtctgcaaat 5580ccttccaaag tgtcctatga acccatcaca accacactaa aacggaaaca agaggatgtg 5640tctgctactg tcattcagcg tgcttataga cgttaccgct taaggcaaaa tgtcaaaaat 5700atatcaagta tatacataaa agatggagac agagatgatg atttactcaa taaaaaagat 5760atggcttttg ataatgttaa tgagaactca agtccagaaa aaacagatgc cacttcatcc 5820accacctctc caccttcata tgatagtgta acaaagccag acaaagagaa atatgaacaa 5880gacagaacag aaaaggaaga caaagggaaa gacagcaagg aaagcaaaaa atag 5934125934DNAArtificial SequenceSynthetic Construct 12atggcaatgt tgcctccccc aggacctcag agctttgtcc atttcacaaa acagtctctt 60gccctcattg aacaacgcat tgctgaaaga aaatcaaagg aacccaaaga agaaaagaaa 120gatgatgatg aagaagcccc aaagccaagc agtgacttgg aagctggcaa acaactgccc 180ttcatctatg gggacattcc tcccggcatg gtgtcagagc ccctggagga cttggacccc 240tactatgcag acaaaaagac tttcatagta ttgaacaaag ggaaaacaat cttccgtttc 300aatgccacac ctgctttata tatgctttct cctttcagtc ctctaagaag aatatctatt 360aagattttag tacactcctt attcagcatg ctcatcatgt gcactattct gacaaactgc 420atatttatga ccatgaataa cccgccggac tggaccaaaa atgtcgagta cacttttact 480ggaatatata cttttgaatc acttgtaaaa atccttgcaa gaggcttctg tgtaggagaa 540ttcacttttc ttcgtgaccc gtggaactgg ctggattttg tcgtcattgt ttttgcgtat 600ttaacagaat ttgtaaacct aggcaatgtt tcagctcttc gaactttcag agtattgaga 660gctttgaaaa ctatttctgt aatcccaggc ctgaagacaa ttgtaggggc tttgatccag 720tcagtgaaga agctttctga tgtcatgatc ctgactgtgt tctgtctgag tgtgtttgca 780ctaattggac tacagctgtt catgggaaac ctgaagcata aatgttttcg aaattcactt 840gaaaataatg aaacattaga aagcataatg aataccctag agagtgaaga agactttaga 900aaatattttt attacttgga aggatccaaa gatgctctcc tttgtggttt cagcacagat 960tcaggtcagt gtccagaggg gtacacctgt gtgaaaattg gcagaaaccc tgattatggc 1020tacacgagct ttgacacttt cagctgggcc ttcttagcct tgtttaggct aatgacccaa 1080gattactggg aaaaccttta ccaacagacg ctgcgtgctg ctggcaaaac ctacatgatc 1140ttctttgtcg tagtgatttt cctgggctcc ttttatctaa taaacttgat cctggctgtg 1200gttgccatgg catatgaaga acagaaccag gcaaacattg aagaagctaa acagaaagaa 1260ttagaatttc aacagatgtt 
agaccgtctt aaaaaagagc aagaagaagc tgaggcaatt 1320gcagcggcag cggctgaata tacaagtatt aggagaagca gaattatggg cctctcagag 1380agttcttctg aaacatccaa actgagctct aaaagtgcta aagaaagaag aaacagaaga 1440aagaaaaaga atcaaaagaa gctctccagt ggagaggaaa agggagatgc tgagaaattg 1500tcgaaatcag aatcagagga cagcatcaga agaaaaagtt tccaccttgg tgtcgaaggg 1560cataggcgag cacatgaaaa gaggttgtct acccccaatc agtcaccact cagcattcgt 1620ggctccttgt tttctgcaag gcgaagcagc agaacaagtc tttttagttt caaaggcaga 1680ggaagagata taggatctga gactgaattt gccgatgatg agcacagcat ttttggagac 1740aatgagagca gaaggggctc actgtttgtg ccccacagac cccaggagcg acgcagcagt 1800aacatcagcc aagccagtag gtccccacca atgctgccgg tgaacgggaa aatgcacagt 1860gctgtggact gcaacggtgt ggtctccctg gttgatggac gctcagccct catgctcccc 1920aatggacagc ttctgccaga gggcacgacc aatcaaatac acaagaaaag gcgttgtagt 1980tcctatctcc tttcagagga tatgctgaat gatcccaacc tcagacagag agcaatgagt 2040agagcaagca tattaacaaa cactgtggaa gaacttgaag agtccagaca aaaatgtcca 2100ccttggtggt acagatttgc acacaaattc ttgatctgga attgctctcc atattggata 2160aaattcaaaa agtgtatcta ttttattgta atggatcctt ttgtagatct tgcagttacc 2220atttgcatag ttttaaacac attatttatg gctatggaac accacccaat gactgaggaa 2280ttcaaaaatg tacttgctat aggaaatttg gtctttactg gaatctttgc agctgaaatg 2340gtattaaaac tgattgccat ggatccatat gagtatttcc aagtaggctg gaatattttt 2400gacagcctta ttgtgacttt aagtttagtg gagctctttc tagcagatgt ggaaggattg 2460tcagttctgc gatcattcag actgctccga gtcttcaagt tggcaaaatc ctggccaaca 2520ttgaacatgc tgattaagat cattggtaac tcagtagggg ctctaggtaa cctcacctta
2580gtgttggcca tcatcgtctt catttttgct gtggtcggca tgcagctctt tggtaagagc 2640tacaaagaat gtgtctgcaa gatcaatgat gactgtacgc tcccacggtg gcacatgaac 2700gacttcttcc actccttcct gattgtgttc cgcgtgctgt gtggagagtg gatagagacc 2760atgtgggact gtatggaggt cgctggtcaa gctatgtgcc ttattgttta catgatggtc 2820atggtcattg gaaacctggt ggtcctaaac ctatttctgg ccttattatt gagctcattt 2880agttcagaca atcttacagc aattgaagaa gaccctgatg caaacaacct ccagattgca 2940gtgactagaa ttaaaaaggg aataaattat gtgaaacaaa ccttacgtga atttattcta 3000aaagcatttt ccaaaaagcc aaagatttcc agggagataa gacaagcaga agatctgaat 3060actaagaagg aaaactatat ttctaaccat acacttgctg aaatgagcaa aggtcacaat 3120ttcctcaagg aaaaagataa aatcagtggt tttggaagca gcgtggacaa acacttgatg 3180gaagacagtg atggtcaatc atttattcac aatcccagcc tcacagtgac agtgccaatt 3240gcacctgggg aatccgattt ggaaaatatg aatgctgagg aacttagcag tgattcggat 3300agtgaataca gcaaagtgag attaaaccgg tcaagctcct cagagtgcag cacagttgat 3360aaccctttgc ctggagaagg agaagaagca gaggctgaac ctatgaattc cgatgagcca 3420gaggcctgtt tcacagatgg ttgtgtacgg aggttctcat gctgccaagt taacatagag 3480tcagggaaag gaaaaatctg gtggaacatc aggaaaacct gctacaagat tgttgaacac 3540agttggtttg aaagcttcat tgtcctcatg atcctgctca gcagtggtgc cctggctttt 3600gaagatattt atattgaaag gaaaaagacc attaagatta tcctggagta tgcagacaag 3660atcttcactt acatcttcat tctggaaatg cttctaaaat ggatagcata tggttataaa 3720acatatttca ccaatgcctg gtgttggctg gatttcctaa ttgttgatgt ttctttggtt 3780actttagtgg caaacactct tggctactca gatcttggcc ccattaaatc ccttcggaca 3840ctgagagctt taagacctct aagagcctta tctagatttg aaggaatgag ggtcgttgtg 3900aatgcactca taggagcaat tccttccatc atgaatgtgc tacttgtgtg tcttatattc 3960tggctgatat tcagcatcat gggagtaaat ttgtttgctg gcaagttcta tgagtgtatt 4020aacaccacag atgggtcacg gtttcctgca agtcaagttc caaatcgttc cgaatgtttt 4080gcccttatga atgttagtca aaatgtgcga tggaaaaacc tgaaagtgaa ctttgataat 4140gtcggacttg gttacctatc tctgcttcaa gttgcaactt ttaagggatg gacgattatt 4200atgtatgcag cagtggattc tgttaatgta gacaagcagc ccaaatatga atatagcctc 4260tacatgtata tttattttgt cgtctttatc 
atctttgggt cattcttcac tttgaacttg 4320ttcattggtg tcatcataga taatttcaac caacagaaaa agaagcttgg aggtcaagac 4380atctttatga cagaagaaca gaagaaatac tataatgcaa tgaaaaagct ggggtccaag 4440aagccacaaa agccaattcc tcgaccaggg aacaaaatcc aaggatgtat atttgaccta 4500gtgacaaatc aagcctttga tattagtatc atggttctta tctgtctcaa catggtaacc 4560atgatggtag aaaaggaggg tcaaagtcaa catatgactg aagttttata ttggataaat 4620gtggttttta taatcctttt cactggagaa tgtgtgctaa aactgatctc cctcagacac 4680tactacttca ctgtaggatg gaatattttt gattttgtgg ttgtgattat ctccattgta 4740ggtatgtttc tagctgattt gattgaaacg tattttgtgt cccctaccct gttccgagtg 4800atccgtcttg ccaggattgg ccgaatccta cgtctagtca aaggagcaaa ggggatccgc 4860acgctgctct ttgctttgat gatgtccctt cctgcgttgt ttaacatcgg cctcctgctc 4920ttcctggtca tgttcatcta cgccatcttt ggaatgtcca actttgccta tgttaaaaag 4980gaagatggaa ttaatgacat gttcaatttt gagacctttg gcaacagtat gatttgcctg 5040ttccaaatta caacctctgc tggctgggat ggattgctag cacctattct taacagtaag 5100ccacccgact gtgacccaaa aaaagttcat cctggaagtt cagttgaagg agactgtggt 5160aacccatctg ttggaatatt ctactttgtt agttatatca tcatatcctt cctggttgtg 5220gtgaacatgt acattgcagt catactggag aattttagtg ttgccactga agaaagtact 5280gaacctctga gtgaggatga ctttgagatg ttctatgagg tttgggagaa gtttgatccc 5340gatgcgaccc agtttataga gttctctaaa ctctctgatt ttgcagctgc cctggatcct 5400cctcttctca tagcaaaacc caacaaagtc cagctcattg ccatggatct gcccatggtt 5460agtggtgacc ggatccattg tcttgacatc ttatttgctt ttacaaagcg tgttttgggt 5520gagagtgggg agatggattc tcttcgttca cagatggaag aaaggttcat gtctgcaaat 5580ccttccaaag tgtcctatga acccatcaca accacactaa aacggaaaca agaggatgtg 5640tctgctactg tcattcagcg tgcttataga cgttaccgct taaggcaaaa tgtcaaaaat 5700atatcaagta tatacataaa agatggagac agagatgatg atttactcaa taaaaaagat 5760atggcttttg ataatgttaa tgagaactca agtccagaaa aaacagatgc cacttcatcc 5820accacctctc caccttcata tgatagtgta acaaagccag acaaagagaa atatgaacaa 5880gacagaacag aaaaggaaga caaagggaaa gacagcaagg aaagcaaaaa atag 5934135934DNAArtificial SequenceSynthetic Construct 13atggcaatgt tgcctccccc 
aggacctcag agctttgtcc atttcacaaa acagtctctt 60gccctcattg aacaacgcat tgctgaaaga aaatcaaagg aacccaaaga agaaaagaaa 120gatgatgatg aagaagcccc aaagccaagc agtgacttgg aagctggcaa acaactgccc 180ttcatctatg gggacattcc tcccggcatg gtgtcagagc ccctggagga cttggacccc 240tactatgcag acaaaaagac tttcatagta ttgaacaaag ggaaaacaat cttccgtttc 300aatgccacac ctgctttata tatgctttct cctttcagtc ctctaagaag aatatctatt 360aagattttag tacactcctt attcagcatg ctcatcatgt gcactattct gacaaactgc 420atatttatga ccatgaataa cccgccggac tggaccaaaa atgtcgagta cacttttact 480ggaatatata cttttgaatc acttgtaaaa atccttgcaa gaggcttctg tgtaggagaa 540ttcacttttc ttcgtgaccc gtggaactgg ctggattttg tcgtcattgt ttttgcgtat 600ttaacagaat ttgtaaacct aggcaatgtt tcagctcttc gaactttcag agtattgaga 660gctttgaaaa ctatttctgt aatcccaggc ctgaagacaa ttgtaggggc tttgatccag 720tcagtgaaga agctttctga tgtcatgatc ctgactgtgt tctgtctgag tgtgtttgca 780ctaattggac tacagctgtt catgggaaac ctgaagcata aatgttttcg aaattcactt 840gaaaataatg aaacattaga aagcataatg aataccctag agagtgaaga agactttaga 900aaatattttt attacttgga aggatccaaa gatgctctcc tttgtggttt cagcacagat 960tcaggtcagt gtccagaggg gtacacctgt gtgaaaattg gcagaaaccc tgattatggc 1020tacacgagct ttgacacttt cagctgggcc ttcttagcct tgtttaggct aatgacccaa 1080gattactggg aaaaccttta ccaacagacg ctgcgtgctg ctggcaaaac ctacatgatc 1140ttctttgtcg tagtgatttt cctgggctcc ttttatctaa taaacttgat cctggctgtg 1200gttgccatgg catatgaaga acagaaccag gcaaacattg aagaagctaa acagaaagaa 1260ttagaatttc aacagatgtt agaccgtctt aaaaaagagc aagaagaagc tgaggcaatt 1320gcagcggcag cggctgaata tacaagtatt aggagaagca gaattatggg cctctcagag 1380agttcttctg aaacatccaa actgagctct aaaagtgcta aagaaagaag aaacagaaga 1440aagaaaaaga atcaaaagaa gctctccagt ggagaggaaa agggagatgc tgagaaattg 1500tcgaaatcag aatcagagga cagcatcaga agaaaaagtt tccaccttgg tgtcgaaggg 1560cataggcgag cacatgaaaa gaggttgtct acccccaatc agtcaccact cagcattcgt 1620ggctccttgt tttctgcaag gcgaagcagc agaacaagtc tttttagttt caaaggcaga 1680ggaagagata taggatctga gactgaattt gccgatgatg agcacagcat ttttggagac 
1740aatgagagca gaaggggctc actgtttgtg ccccacagac cccaggagcg acgcagcagt 1800aacatcagcc aagccagtag gtccccacca atgctgccgg tgaacgggaa aatgcacagt 1860gctgtggact gcaacggtgt ggtctccctg gttgatggac gctcagccct catgctcccc 1920aatggacagc ttctgccaga gggcacgacc aatcaaatac acaagaaaag gcgttgtagt 1980tcctatctcc tttcagagga tatgctgaat gatcccaacc tcagacagag agcaatgagt 2040agagcaagca tattaacaaa cactgtggaa gaacttgaag agtccagaca aaaatgtcca 2100ccttggtggt acagatttgc acacaaattc ttgatctgga attgctctcc atattggata 2160aaattcaaaa agtgtatcta ttttattgta atggatcctt ttgtagatct tgcaattacc 2220atttgcatag ttttaaacac attatttatg gctatggaac accacccaat gactgaggaa 2280ttcaaaaatg tacttgctat aggaaatttg gtctttactg gaatctttgc agctgaaatg 2340gtattaaaac tgattgccat ggatccatat gagtatttcc aagtaggctg gaatattttt 2400gacagcctta ttgtgacttt aagtttagtg gagctctttc tagcagatgt ggaaggattg 2460tcagttctgc gatcattcag actgctccga gtcttcaagt tggcaaaatc ctggccaaca 2520ttgaacatgc tgattaagat cattggtaac tcagtagggg ctctaggtaa cctcacctta 2580gtgttggcca tcatcgtctt catttttgct gtggtcggca tgcagctctt tggtaagagc 2640tacaaagaat gtgtctgcaa gatcaatgat gactgtacgc tcccacggtg gcacatgaac 2700gacttcttcc actccttcct gattgtgttc cgcgtgctgt gtggagagtg gatagagacc 2760atgtgggact gtatggaggt cgctggtcaa gctatgtgcc ttattgttta catgatggtc 2820atggtcattg gaaacctggt ggtcctaaac ctatttctgg ccttattatt gagctcattt 2880agttcagaca atcttacagc aattgaagaa gaccctgatg caaacaacct ccagattgca 2940gtgactagaa ttaaaaaggg aataaattat gtgaaacaaa ccttacgtga atttattcta 3000aaagcatttt ccaaaaagcc aaagatttcc agggagataa gacaagcaga agatctgaat 3060actaagaagg aaaactatat ttctaaccat acacttgctg aaatgagcaa aggtcacaat 3120ttcctcaagg aaaaagataa aatcagtggt tttggaagca gcgtggacaa acacttgatg 3180gaagacagtg atggtcaatc atttattcac aatcccagcc tcacagtgac agtgccaatt 3240gcacctgggg aatccgattt ggaaaatatg aatgctgagg aacttagcag tgattcggat 3300agtgaataca gcaaagtgag attaaaccgg tcaagctcct cagagtgcag cacagttgat 3360aacccttttc ctggagaagg agaagaagca gaggctgaac ctatgaattc cgatgagcca 3420gaggcctgtt tcacagatgg ttgtgtacgg 
aggttctcat gctgccaagt taacatagag 3480tcagggaaag gaaaaatctg gtggaacatc aggaaaacct gctacaagat tgttgaacac 3540agttggtttg aaagcttcat tgtcctcatg atcctgctca gcagtggtgc cctggctttt 3600gaagatattt atattgaaag gaaaaagacc attaagatta tcctggagta tgcagacaag 3660atcttcactt acatcttcat tctggaaatg cttctaaaat ggatagcata tggttataaa 3720acatatttca ccaatgcctg gtgttggctg gatttcctaa ttgttgatgt ttctttggtt 3780actttagtgg caaacactct tggctactca gatcttggcc ccattaaatc ccttcggaca 3840ctgagagctt taagacctct aagagcctta tctagatttg aaggaatgag ggtcgttgtg 3900aatgcactca taggagcaat tccttccatc atgaatgtgc tacttgtgtg tcttatattc 3960tggctgatat tcagcatcat gggagtaaat ttgtttgctg gcaagttcta tgagtgtatt 4020aacaccacag atgggtcacg gtttcctgca agtcaagttc caaatcgttc cgaatgtttt 4080gcccttatga atgttagtca aaatgtgcga tggaaaaacc tgaaagtgaa ctttgataat 4140gtcggacttg gttacctatc tctgcttcaa gttgcaactt ttaagggatg gacgattatt 4200atgtatgcag cagtggattc tgttaatgta gacaagcagc ccaaatatga atatagcctc 4260tacatgtata tttattttgt cgtctttatc atctttgggt cattcttcac tttgaacttg 4320ttcattggtg tcatcataga taatttcaac caacagaaaa agaagcttgg aggtcaagac 4380atctttatga cagaagaaca gaagaaatac tataatgcaa tgaaaaagct ggggtccaag 4440aagccacaaa agccaattcc tcgaccaggg aacaaaatcc aaggatgtat atttgaccta 4500gtgacaaatc aagcctttga tattagtatc atggttctta tctgtctcaa catggtaacc 4560atgatggtag aaaaggaggg tcaaagtcaa catatgactg aagttttata ttggataaat 4620gtggttttta taatcctttt cactggagaa tgtgtgctaa aactgatctc cctcagacac 4680tactacttca ctgtaggatg gaatattttt gattttgtgg ttgtgattat ctccattgta 4740ggtatgtttc tagctgattt gattgaaacg tattttgtgt cccctaccct gttccgagtg 4800atccgtcttg ccaggattgg ccgaatccta cgtctagtca aaggagcaaa ggggatccgc 4860acgctgctct ttgctttgat gatgtccctt cctgcgttgt ttaacatcgg cctcctgctc 4920ttcctggtca tgttcatcta cgccatcttt ggaatgtcca actttgccta tgttaaaaag 4980gaagatggaa ttaatgacat gttcaatttt gagacctttg gcaacagtat gatttgcctg 5040ttccaaatta caacctctgc tggctgggat ggattgctag cacctattct taacagtaag 5100ccacccgact gtgacccaaa aaaagttcat cctggaagtt cagttgaagg agactgtggt 
5160aacccatctg ttggaatatt ctactttgtt agttatatca tcatatcctt cctggttgtg 5220gtgaacatgt acattgcagt catactggag aattttagtg ttgccactga agaaagtact 5280gaacctctga gtgaggatga ctttgagatg ttctatgagg tttgggagaa gtttgatccc 5340gatgcgaccc agtttataga gttctctaaa ctctctgatt ttgcagctgc cctggatcct 5400cctcttctca tagcaaaacc caacaaagtc cagctcattg ccatggatct gcccatggtt 5460agtggtgacc ggatccattg tcttgacatc ttatttgctt ttacaaagcg tgttttgggt 5520gagagtgggg agatggattc tcttcgttca cagatggaag aaaggttcat gtctgcaaat 5580ccttccaaag tgtcctatga acccatcaca accacactaa aacggaaaca agaggatgtg 5640tctgctactg tcattcagcg tgcttataga cgttaccgct taaggcaaaa tgtcaaaaat 5700atatcaagta tatacataaa agatggagac agagatgatg atttactcaa taaaaaagat 5760atggcttttg ataatgttaa tgagaactca agtccagaaa aaacagatgc cacttcatcc 5820accacctctc caccttcata tgatagtgta acaaagccag acaaagagaa atatgaacaa 5880gacagaacag aaaaggaaga caaagggaaa gacagcaagg aaagcaaaaa atag 59341415DNAArtificial SequenceSynthetic Construct 14gcccttcatc tatgg 151515DNAArtificial SequenceSynthetic Construct 15aacccgccgg actgg 151615DNAArtificial SequenceSynthetic Construct 16gctccccaat ggaca 151715DNAArtificial SequenceSynthetic Construct 17atacacaaga aaagg 151815DNAArtificial SequenceSynthetic Construct 18tcttgcaatt accat 151915DNAArtificial SequenceSynthetic Construct 19accctttgcc tggag 152021DNAArtificial SequenceSynthetic Construct 20gtcccgccca ttgcctgaca c 212125DNAArtificial SequenceSynthetic Construct 21ttctggtcat gatatggtta ttcac 252224DNAArtificial SequenceSynthetic Construct 22tgatagatgc gttgatgaca ttgg 242324DNAArtificial SequenceSynthetic Construct 23ttcataaatg cagtaacttc ctgg 242424DNAArtificial SequenceSynthetic Construct 24tgtttctttt aagtcagtac agag 242522DNAArtificial SequenceSynthetic Construct 25agagccattc acaagaccag ag 222621DNAArtificial SequenceSynthetic Construct 26actcagaaag gcagagaggt g 212723DNAArtificial SequenceSynthetic Construct 27ttgccatgtt atcaatgtct gtg 232824DNAArtificial SequenceSynthetic Construct 28gactgatttg tatctggtta ggag 
242924DNAArtificial SequenceSynthetic Construct 29gcaatgtaat taggaaggtg tgag 243026DNAArtificial SequenceSynthetic Construct 30tttgaatgaa ctctaaatga actacc 263125DNAArtificial SequenceSynthetic Construct 31taagtattag gcgttaagac aaacc 25325PRTArtificial SequenceSynthetic Construct 32Pro Phe Val Tyr Gly1 5335PRTArtificial SequenceSynthetic Construct 33Asn Pro Gln Asp Trp1 5345PRTArtificial SequenceSynthetic Construct 34Leu Pro Tyr Gly Gln1 5355PRTArtificial SequenceSynthetic Construct 35Ile His Arg Lys Arg1 5365PRTArtificial SequenceSynthetic Construct 36Leu Ala Val Thr Ile1 5375PRTArtificial SequenceSynthetic Construct 37Asn Pro Phe Pro Gly1 5381977PRTArtificial SequenceSynthetic Construct 38Met Ala Met Leu Pro Pro Pro Gly Pro Gln Ser Phe Val His Phe Thr1 5 10 15Lys Gln Ser Leu Ala Leu Ile Glu Gln Arg Ile Ala Glu Arg Lys Ser 20 25 30Lys Glu Pro Lys Glu Glu Lys Lys Asp Asp Asp Glu Glu Ala Pro Lys 35 40 45Pro Ser Ser Asp Leu Glu Ala Gly Lys Gln Leu Pro Phe Ile Tyr Gly 50 55 60Asp Ile Pro Pro Gly Met Val Ser Glu Pro Leu Glu Asp Leu Asp Pro65 70 75 80Tyr Tyr Ala Asp Lys Lys Thr Phe Ile Val Leu Asn Lys Gly Lys Thr 85 90 95Ile Phe Arg Phe Asn Ala Thr Pro Ala Leu Tyr Met Leu Ser Pro Phe 100 105 110Ser Pro Leu Arg Arg Ile Ser Ile Lys Ile Leu Val His Ser Leu Phe 115 120 125Ser Met Leu Ile Met Cys Thr Ile Leu Thr Asn Cys Ile Phe Met Thr 130 135 140Met Asn Asn Pro Pro Asp Trp Thr Lys Asn Val Glu Tyr Thr Phe Thr145 150 155 160Gly Ile Tyr Thr Phe Glu Ser Leu Val Lys Ile Leu Ala Arg Gly Phe 165 170 175Cys Val Gly Glu Phe Thr Phe Leu Arg Asp Pro Trp Asn Trp Leu Asp 180 185 190Phe Val Val Ile Val Phe Ala Tyr Leu Thr Glu Phe Val Asn Leu Gly 195 200 205Asn Val Ser Ala Leu Arg Thr Phe Arg Val Leu Arg Ala Leu Lys Thr 210 215 220Ile Ser Val Ile Pro Gly Leu Lys Thr Ile Val Gly Ala Leu Ile Gln225 230 235 240Ser Val Lys Lys Leu Ser Asp Val Met Ile Leu Thr Val Phe Cys Leu 245 250 255Ser Val Phe Ala Leu Ile Gly Leu Gln Leu Phe Met Gly Asn Leu Lys 260 265 270His Lys Cys Phe Arg Asn Ser Leu 
Glu Asn Asn Glu Thr Leu Glu Ser 275 280 285Ile Met Asn Thr Leu Glu Ser Glu Glu Asp Phe Arg Lys Tyr Phe Tyr 290 295 300Tyr Leu Glu Gly Ser Lys Asp Ala Leu Leu Cys Gly Phe Ser Thr Asp305 310 315 320Ser Gly Gln Cys Pro Glu Gly Tyr Thr Cys Val Lys Ile Gly Arg Asn 325 330 335Pro Asp Tyr Gly Tyr Thr Ser Phe Asp Thr Phe Ser Trp Ala Phe Leu 340 345 350Ala Leu Phe Arg Leu Met Thr Gln Asp Tyr Trp Glu Asn Leu Tyr Gln 355 360 365Gln Thr Leu Arg Ala Ala Gly Lys Thr Tyr Met Ile Phe Phe Val Val 370 375 380Val Ile Phe Leu Gly Ser Phe Tyr Leu Ile Asn Leu Ile Leu Ala Val385 390 395 400Val Ala Met Ala Tyr Glu Glu Gln Asn Gln Ala Asn Ile Glu Glu Ala 405 410 415Lys Gln Lys Glu Leu Glu Phe Gln Gln Met Leu Asp Arg Leu Lys Lys 420 425 430Glu Gln Glu Glu Ala Glu Ala Ile Ala Ala Ala Ala Ala Glu Tyr Thr 435 440 445Ser Ile Arg Arg Ser Arg Ile Met Gly Leu Ser Glu Ser Ser Ser Glu 450 455 460Thr Ser Lys Leu Ser Ser Lys Ser Ala Lys Glu Arg Arg Asn Arg Arg465 470 475 480Lys Lys Lys Asn Gln Lys Lys Leu Ser Ser Gly Glu Glu Lys Gly Asp 485 490 495Ala Glu Lys Leu Ser Lys Ser Glu Ser Glu Asp Ser Ile Arg Arg Lys 500 505 510Ser Phe His Leu Gly Val Glu Gly His Arg Arg Ala His Glu Lys Arg 515 520 525Leu Ser Thr Pro Asn Gln Ser Pro Leu Ser Ile Arg Gly Ser Leu Phe 530 535 540Ser Ala Arg Arg Ser Ser Arg Thr Ser Leu Phe Ser Phe Lys Gly Arg545 550 555 560Gly Arg Asp Ile Gly Ser Glu Thr Glu Phe
Ala Asp Asp Glu His Ser 565 570 575Ile Phe Gly Asp Asn Glu Ser Arg Arg Gly Ser Leu Phe Val Pro His 580 585 590Arg Pro Gln Glu Arg Arg Ser Ser Asn Ile Ser Gln Ala Ser Arg Ser 595 600 605Pro Pro Met Leu Pro Val Asn Gly Lys Met His Ser Ala Val Asp Cys 610 615 620Asn Gly Val Val Ser Leu Val Asp Gly Arg Ser Ala Leu Met Leu Pro625 630 635 640Asn Gly Gln Leu Leu Pro Glu Gly Thr Thr Asn Gln Ile His Lys Lys 645 650 655Arg Arg Cys Ser Ser Tyr Leu Leu Ser Glu Asp Met Leu Asn Asp Pro 660 665 670Asn Leu Arg Gln Arg Ala Met Ser Arg Ala Ser Ile Leu Thr Asn Thr 675 680 685Val Glu Glu Leu Glu Glu Ser Arg Gln Lys Cys Pro Pro Trp Trp Tyr 690 695 700Arg Phe Ala His Lys Phe Leu Ile Trp Asn Cys Ser Pro Tyr Trp Ile705 710 715 720Lys Phe Lys Lys Cys Ile Tyr Phe Ile Val Met Asp Pro Phe Val Asp 725 730 735Leu Ala Ile Thr Ile Cys Ile Val Leu Asn Thr Leu Phe Met Ala Met 740 745 750Glu His His Pro Met Thr Glu Glu Phe Lys Asn Val Leu Ala Ile Gly 755 760 765Asn Leu Val Phe Thr Gly Ile Phe Ala Ala Glu Met Val Leu Lys Leu 770 775 780Ile Ala Met Asp Pro Tyr Glu Tyr Phe Gln Val Gly Trp Asn Ile Phe785 790 795 800Asp Ser Leu Ile Val Thr Leu Ser Leu Val Glu Leu Phe Leu Ala Asp 805 810 815Val Glu Gly Leu Ser Val Leu Arg Ser Phe Arg Leu Leu Arg Val Phe 820 825 830Lys Leu Ala Lys Ser Trp Pro Thr Leu Asn Met Leu Ile Lys Ile Ile 835 840 845Gly Asn Ser Val Gly Ala Leu Gly Asn Leu Thr Leu Val Leu Ala Ile 850 855 860Ile Val Phe Ile Phe Ala Val Val Gly Met Gln Leu Phe Gly Lys Ser865 870 875 880Tyr Lys Glu Cys Val Cys Lys Ile Asn Asp Asp Cys Thr Leu Pro Arg 885 890 895Trp His Met Asn Asp Phe Phe His Ser Phe Leu Ile Val Phe Arg Val 900 905 910Leu Cys Gly Glu Trp Ile Glu Thr Met Trp Asp Cys Met Glu Val Ala 915 920 925Gly Gln Ala Met Cys Leu Ile Val Tyr Met Met Val Met Val Ile Gly 930 935 940Asn Leu Val Val Leu Asn Leu Phe Leu Ala Leu Leu Leu Ser Ser Phe945 950 955 960Ser Ser Asp Asn Leu Thr Ala Ile Glu Glu Asp Pro Asp Ala Asn Asn 965 970 975Leu Gln Ile Ala Val Thr Arg Ile Lys Lys Gly Ile Asn Tyr Val Lys 980 985 
990Gln Thr Leu Arg Glu Phe Ile Leu Lys Ala Phe Ser Lys Lys Pro Lys 995 1000 1005Ile Ser Arg Glu Ile Arg Gln Ala Glu Asp Leu Asn Thr Lys Lys 1010 1015 1020Glu Asn Tyr Ile Ser Asn His Thr Leu Ala Glu Met Ser Lys Gly 1025 1030 1035His Asn Phe Leu Lys Glu Lys Asp Lys Ile Ser Gly Phe Gly Ser 1040 1045 1050Ser Val Asp Lys His Leu Met Glu Asp Ser Asp Gly Gln Ser Phe 1055 1060 1065Ile His Asn Pro Ser Leu Thr Val Thr Val Pro Ile Ala Pro Gly 1070 1075 1080Glu Ser Asp Leu Glu Asn Met Asn Ala Glu Glu Leu Ser Ser Asp 1085 1090 1095Ser Asp Ser Glu Tyr Ser Lys Val Arg Leu Asn Arg Ser Ser Ser 1100 1105 1110Ser Glu Cys Ser Thr Val Asp Asn Pro Leu Pro Gly Glu Gly Glu 1115 1120 1125Glu Ala Glu Ala Glu Pro Met Asn Ser Asp Glu Pro Glu Ala Cys 1130 1135 1140Phe Thr Asp Gly Cys Val Arg Arg Phe Ser Cys Cys Gln Val Asn 1145 1150 1155Ile Glu Ser Gly Lys Gly Lys Ile Trp Trp Asn Ile Arg Lys Thr 1160 1165 1170Cys Tyr Lys Ile Val Glu His Ser Trp Phe Glu Ser Phe Ile Val 1175 1180 1185Leu Met Ile Leu Leu Ser Ser Gly Ala Leu Ala Phe Glu Asp Ile 1190 1195 1200Tyr Ile Glu Arg Lys Lys Thr Ile Lys Ile Ile Leu Glu Tyr Ala 1205 1210 1215Asp Lys Ile Phe Thr Tyr Ile Phe Ile Leu Glu Met Leu Leu Lys 1220 1225 1230Trp Ile Ala Tyr Gly Tyr Lys Thr Tyr Phe Thr Asn Ala Trp Cys 1235 1240 1245Trp Leu Asp Phe Leu Ile Val Asp Val Ser Leu Val Thr Leu Val 1250 1255 1260Ala Asn Thr Leu Gly Tyr Ser Asp Leu Gly Pro Ile Lys Ser Leu 1265 1270 1275Arg Thr Leu Arg Ala Leu Arg Pro Leu Arg Ala Leu Ser Arg Phe 1280 1285 1290Glu Gly Met Arg Val Val Val Asn Ala Leu Ile Gly Ala Ile Pro 1295 1300 1305Ser Ile Met Asn Val Leu Leu Val Cys Leu Ile Phe Trp Leu Ile 1310 1315 1320Phe Ser Ile Met Gly Val Asn Leu Phe Ala Gly Lys Phe Tyr Glu 1325 1330 1335Cys Ile Asn Thr Thr Asp Gly Ser Arg Phe Pro Ala Ser Gln Val 1340 1345 1350Pro Asn Arg Ser Glu Cys Phe Ala Leu Met Asn Val Ser Gln Asn 1355 1360 1365Val Arg Trp Lys Asn Leu Lys Val Asn Phe Asp Asn Val Gly Leu 1370 1375 1380Gly Tyr Leu Ser Leu Leu Gln Val Ala Thr Phe Lys Gly Trp Thr 1385 1390 
1395Ile Ile Met Tyr Ala Ala Val Asp Ser Val Asn Val Asp Lys Gln 1400 1405 1410Pro Lys Tyr Glu Tyr Ser Leu Tyr Met Tyr Ile Tyr Phe Val Val 1415 1420 1425Phe Ile Ile Phe Gly Ser Phe Phe Thr Leu Asn Leu Phe Ile Gly 1430 1435 1440Val Ile Ile Asp Asn Phe Asn Gln Gln Lys Lys Lys Leu Gly Gly 1445 1450 1455Gln Asp Ile Phe Met Thr Glu Glu Gln Lys Lys Tyr Tyr Asn Ala 1460 1465 1470Met Lys Lys Leu Gly Ser Lys Lys Pro Gln Lys Pro Ile Pro Arg 1475 1480 1485Pro Gly Asn Lys Ile Gln Gly Cys Ile Phe Asp Leu Val Thr Asn 1490 1495 1500Gln Ala Phe Asp Ile Ser Ile Met Val Leu Ile Cys Leu Asn Met 1505 1510 1515Val Thr Met Met Val Glu Lys Glu Gly Gln Ser Gln His Met Thr 1520 1525 1530Glu Val Leu Tyr Trp Ile Asn Val Val Phe Ile Ile Leu Phe Thr 1535 1540 1545Gly Glu Cys Val Leu Lys Leu Ile Ser Leu Arg His Tyr Tyr Phe 1550 1555 1560Thr Val Gly Trp Asn Ile Phe Asp Phe Val Val Val Ile Ile Ser 1565 1570 1575Ile Val Gly Met Phe Leu Ala Asp Leu Ile Glu Thr Tyr Phe Val 1580 1585 1590Ser Pro Thr Leu Phe Arg Val Ile Arg Leu Ala Arg Ile Gly Arg 1595 1600 1605Ile Leu Arg Leu Val Lys Gly Ala Lys Gly Ile Arg Thr Leu Leu 1610 1615 1620Phe Ala Leu Met Met Ser Leu Pro Ala Leu Phe Asn Ile Gly Leu 1625 1630 1635Leu Leu Phe Leu Val Met Phe Ile Tyr Ala Ile Phe Gly Met Ser 1640 1645 1650Asn Phe Ala Tyr Val Lys Lys Glu Asp Gly Ile Asn Asp Met Phe 1655 1660 1665Asn Phe Glu Thr Phe Gly Asn Ser Met Ile Cys Leu Phe Gln Ile 1670 1675 1680Thr Thr Ser Ala Gly Trp Asp Gly Leu Leu Ala Pro Ile Leu Asn 1685 1690 1695Ser Lys Pro Pro Asp Cys Asp Pro Lys Lys Val His Pro Gly Ser 1700 1705 1710Ser Val Glu Gly Asp Cys Gly Asn Pro Ser Val Gly Ile Phe Tyr 1715 1720 1725Phe Val Ser Tyr Ile Ile Ile Ser Phe Leu Val Val Val Asn Met 1730 1735 1740Tyr Ile Ala Val Ile Leu Glu Asn Phe Ser Val Ala Thr Glu Glu 1745 1750 1755Ser Thr Glu Pro Leu Ser Glu Asp Asp Phe Glu Met Phe Tyr Glu 1760 1765 1770Val Trp Glu Lys Phe Asp Pro Asp Ala Thr Gln Phe Ile Glu Phe 1775 1780 1785Ser Lys Leu Ser Asp Phe Ala Ala Ala Leu Asp Pro Pro Leu Leu 1790 1795 
1800Ile Ala Lys Pro Asn Lys Val Gln Leu Ile Ala Met Asp Leu Pro 1805 1810 1815Met Val Ser Gly Asp Arg Ile His Cys Leu Asp Ile Leu Phe Ala 1820 1825 1830Phe Thr Lys Arg Val Leu Gly Glu Ser Gly Glu Met Asp Ser Leu 1835 1840 1845Arg Ser Gln Met Glu Glu Arg Phe Met Ser Ala Asn Pro Ser Lys 1850 1855 1860Val Ser Tyr Glu Pro Ile Thr Thr Thr Leu Lys Arg Lys Gln Glu 1865 1870 1875Asp Val Ser Ala Thr Val Ile Gln Arg Ala Tyr Arg Arg Tyr Arg 1880 1885 1890Leu Arg Gln Asn Val Lys Asn Ile Ser Ser Ile Tyr Ile Lys Asp 1895 1900 1905Gly Asp Arg Asp Asp Asp Leu Leu Asn Lys Lys Asp Met Ala Phe 1910 1915 1920Asp Asn Val Asn Glu Asn Ser Ser Pro Glu Lys Thr Asp Ala Thr 1925 1930 1935Ser Ser Thr Thr Ser Pro Pro Ser Tyr Asp Ser Val Thr Lys Pro 1940 1945 1950Asp Lys Glu Lys Tyr Glu Gln Asp Arg Thr Glu Lys Glu Asp Lys 1955 1960 1965Gly Lys Asp Ser Lys Glu Ser Lys Lys 1970 1975

* * * * *