U.S. patent application number 12/036960 was filed with the patent office on 2008-02-25 and published on 2008-12-25 for glycerol as a predictor of glucose tolerance.
Invention is credited to Steve Arsenault, Mark Daly, Daniel Gaudet, Thomas J. Hudson, John D. Rioux.
Application Number | 20080319176 12/036960 |
Family ID | 32328644 |
Filed Date | 2008-02-25 |
United States Patent Application | 20080319176 |
Kind Code | A1 |
Gaudet; Daniel; et al. | December 25, 2008 |
GLYCEROL AS A PREDICTOR OF GLUCOSE TOLERANCE
Abstract
Novel alterations in the glycerol kinase gene are described.
Also described are methods of predicting or assisting in the
prediction of impaired glucose tolerance and type 2 diabetes
mellitus.
Inventors: |
Gaudet; Daniel (Chicoutimi, CA);
Rioux; John D. (Cambridge, MA);
Arsenault; Steve (Quebec City, CA);
Hudson; Thomas J. (Westmount, CA);
Daly; Mark (Arlington, MA) |
Correspondence
Address: |
MORSE, BARNES-BROWN & PENDLETON, P.C.;ATTN: PATENT MANAGER
RESERVOIR PLACE, 1601 TRAPELO ROAD, SUITE 205
WALTHAM
MA
02451
US
|
Family ID: |
32328644 |
Appl. No.: |
12/036960 |
Filed: |
February 25, 2008 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
10827131 | Apr 19, 2004 | |
12036960 | | |
09694088 | Oct 20, 2000 | 6743579 |
10827131 | | |
60161141 | Oct 22, 1999 | |
Current U.S.
Class: |
536/23.2 |
Current CPC
Class: |
C12N 9/1205 20130101;
G01N 2500/00 20130101; C12Q 1/6883 20130101; C12Y 207/0103
20130101; G01N 2333/91215 20130101; G01N 33/6893 20130101; C12Q
1/48 20130101; G01N 2800/52 20130101; G01N 33/66 20130101; C12Q
1/485 20130101; G01N 2800/042 20130101; C12Q 2600/156 20130101 |
Class at
Publication: |
536/23.2 |
International
Class: |
C07H 21/00 20060101
C07H021/00 |
Claims
1. An isolated nucleic acid molecule comprising a portion of SEQ ID
NO: 3, wherein said portion is at least 10 nucleotides in length
and includes nucleotide position 29 of exon 10 of a glycerol kinase
(GK) gene, and wherein said nucleic acid molecule comprises a
mutant allele of said GK gene at said nucleotide position 29.
Description
RELATED APPLICATIONS
[0001] This application is a continuation of U.S. application Ser.
No. 10/827,131, filed Apr. 19, 2004, which is a divisional of U.S.
application Ser. No. 09/694,088, filed Oct. 20, 2000, which claims
the benefit of U.S. Provisional Application No. 60/161,141, filed
Oct. 22, 1999. The entire teachings of the referenced applications
are incorporated herein by reference.
BACKGROUND OF THE INVENTION
[0002] Glycerol kinase (GK) catalyzes the entry of glycerol into
the glucose and triglyceride metabolic pathway. Impaired glucose
tolerance (IGT) and hypertriglyceridemia are associated with an
increased risk of diabetes mellitus (DM) and cardiovascular
disease. The relationship between glycerol and the risk of IGT,
however, is poorly understood.
SUMMARY OF THE INVENTION
[0003] Work described herein details the identification of
alterations in the glycerol kinase (GK) gene which result in severe
hyperglycerolemia and impaired glucose metabolism and body fat
distribution. Glycerol levels are shown to be highly heritable and
associated with significant variations in glucose tolerance. This
work indicates that glycerol is a potentially significant predictor
of the magnitude of glucose tolerance and thus of increased risk of
diabetes mellitus (DM) and cardiovascular disease.
[0004] Work described herein assessed the association of fasting
plasma glycerol concentration with 2-hour glucose following a 75 g
oral glucose tolerance test in a cohort of 1056 unrelated French
Canadians presenting with a family history of hypertriglyceridemia.
The familial resemblance of fasting glycerol in these subjects'
families has been estimated, and the GK gene was screened for the
presence of mutations.
[0005] Family screening in the initial cohort identified 18
individuals with severe hyperglycerolemia (values above 2.0
mmol/L). These individuals were shown to carry a missense mutation
(N288D) in exon 10 of the GK gene. Analysis of the biological
variables among the N288D carriers led to the observation that
variation in glycerolemia was a predictor of impaired glucose
metabolism and of abdominal fat accumulation. In the absence of
severe hyperglycerolemia, a significant familial resemblance for
fasting glycerol concentration (F ratio: 6.3; p<0.0001) was
observed. Furthermore, multivariate analyses performed in the
initial cohort revealed substantial variation in fasting
glycerolemia which was associated with significant differences in
glucose tolerance, independent of known covariates such as age,
gender and body mass index as well as fasting triglyceride,
glucose, insulin and free fatty acid concentrations. These results
suggest an important genetic connection between glycerol and
glucose homeostasis and indicate that assessment of glycerol levels
could be a clinically useful tool in the prediction of IGT.
[0006] The invention relates to a method of predicting or assisting
in the prediction of impaired glucose tolerance, diabetes mellitus,
hyperglycerolemia and/or cardiovascular disease in an individual,
comprising the steps of obtaining a biological sample from an
individual; and assessing the glycerol level in said sample,
wherein an increased level of glycerol in said sample as compared
with a control sample is predictive of impaired glucose tolerance,
diabetes mellitus, hyperglycerolemia and/or cardiovascular disease
in the individual. In one embodiment, the increased glycerol level
is greater than about 0.08 mmol/L. In another embodiment, the
biological sample is a blood sample. In one embodiment, the
glycerol level is a plasma glycerol level, and in one embodiment
the sample is a fasting sample.
[0007] The invention also relates to a method of predicting or
assisting in the prediction of impaired glucose tolerance, diabetes
mellitus, cardiovascular disease and/or hyperglycerolemia in an
individual, comprising the steps of obtaining a nucleic acid sample
from an individual; and determining the nucleotide present at
nucleotide position 29 of exon 10, wherein presence of a guanine at
said position is predictive of impaired glucose tolerance, diabetes
mellitus, cardiovascular disease and/or hyperglycerolemia in the
individual as compared with an individual having an adenosine at
said position.
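The genotype call described above can be illustrated with a minimal sketch. It relies on the standard genetic code, in which asparagine is encoded by AAT or AAC and aspartate by GAT or GAC, so that an adenine-to-guanine change at the first base of the codon is the single-base substitution that yields N288D. The third base of the codon is not given in the text, and the function names are hypothetical helpers introduced only for illustration.

```python
# Codons for the two residues involved (standard genetic code, DNA alphabet).
ASN_CODONS = {"AAT", "AAC"}  # asparagine (N)
ASP_CODONS = {"GAT", "GAC"}  # aspartate (D)


def residue_288(codon: str) -> str:
    """Return 'N', 'D', or '?' for a candidate codon at amino acid position 288."""
    codon = codon.upper()
    if codon in ASN_CODONS:
        return "N"
    if codon in ASP_CODONS:
        return "D"
    return "?"


def call_position_29(base: str) -> str:
    """Interpret the base at nucleotide position 29 of exon 10.

    Hypothetical helper: 'A' is treated as the wild-type allele and 'G' as
    the N288D allele, following the text above.
    """
    base = base.upper()
    if base == "A":
        return "wild-type (asparagine at 288)"
    if base == "G":
        return "N288D (aspartate at 288)"
    return "uninterpretable"


# Show the effect of the A-to-G change for both possible third bases.
for third in "TC":
    wild, mutant = "AA" + third, "GA" + third
    print(wild, residue_288(wild), "->", mutant, residue_288(mutant))
```

In practice the base at position 29 would come from sequencing or an allele-specific assay; the sketch only shows how the observed base maps to the predicted residue.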
[0008] The invention also relates to a method of predicting or
assisting in the prediction of impaired glucose tolerance, diabetes
mellitus, cardiovascular disease and/or hyperglycerolemia in an
individual, comprising the steps of obtaining a biological sample
comprising the glycerol kinase protein or portion thereof from an
individual; and determining the amino acid present at amino acid
position 288, wherein presence of an aspartate at said position is
predictive of impaired glucose tolerance, diabetes mellitus,
cardiovascular disease and/or hyperglycerolemia in the individual
as compared with an individual having an asparagine at said
position.
[0009] The invention further relates to a method of identifying an
agent which is an agonist of glycerol kinase, comprising the steps
of providing a recombinant host cell of the invention; contacting
said host cell with an agent to be tested; and assessing the
ability of the agent to increase glycerol kinase activity, wherein
an agent which increases glycerol kinase activity is an agonist of
glycerol kinase activity. In one embodiment, the step of assessing
is performed by determining the level of one or more downstream
effects of a glycerol metabolic pathway and comparing said level
with a level in an appropriate control.
[0010] The invention further relates to a method of predicting or
assisting in the prediction of impaired glucose tolerance, diabetes
mellitus, cardiovascular disease and/or hyperglycerolemia in an
individual, comprising the steps of obtaining a biological sample
from an individual; and assessing the level of glycerol kinase gene
expression in said sample, wherein a decreased glycerol kinase gene
expression level in said sample as compared with a control sample
is predictive of impaired glucose tolerance, diabetes mellitus,
cardiovascular disease and/or hyperglycerolemia in the
individual.
[0011] The invention also relates to a method of predicting or
assisting in the prediction of impaired glucose tolerance, diabetes
mellitus, cardiovascular disease and/or hyperglycerolemia in an
individual, comprising the steps of obtaining a biological sample
from an individual; and assessing the level of active glycerol
kinase in said sample, wherein a decreased level of active glycerol
kinase in said sample as compared with a control sample is
predictive of impaired glucose tolerance, diabetes mellitus,
cardiovascular disease and/or hyperglycerolemia in the
individual.
[0012] The invention also relates to an isolated nucleic acid
molecule comprising SEQ ID NOS: 1-4. The invention further relates
to an isolated nucleic acid molecule comprising a portion of SEQ ID
NOS: 1-4, wherein said portion is at least 10 nucleotides in length
and wherein said portion comprises a polymorphic nucleotide
position occupied by the alternate (non-wildtype) nucleotide. The
invention also relates to nucleic acid constructs and recombinant
host cells comprising the isolated nucleic acid molecules of the
invention. For example, the recombinant host cell can be selected
from the group consisting of adipocytes, lymphoblasts and
fibroblasts.
[0013] The invention further relates to gene products, e.g., mRNA
or polypeptides, encoded by the nucleic acid molecules of the
invention.
BRIEF DESCRIPTION OF THE DRAWINGS
[0014] FIGS. 1A-1C show pedigree drawings for three families with
hyperglycerolemia. Open squares indicate unaffected males; filled
squares indicate hyperglycerolemic males; open circles indicate
unaffected females; and filled circles indicate hyperglycerolemic
females.
[0015] FIG. 2 shows the exonic structure of the Xp GK gene and
location of sequence polymorphisms. The first PAC clone,
RPCI-5.931_C_24, containing exons 1 to 12, was used as
sequencing template for exons 9, 10 and 11. An insert of 394 base
pairs (bp) was found after the 36th nucleotide of exon 9,
suggesting that the originally described exon actually consists of
two exons (9A and 9B). These exons are 36 and 68 bases in length,
respectively, and the corresponding intron-exon boundaries have the
expected consensus splice site sequence as shown. When the sequence
obtained for intron 10 was aligned with the published cDNA
sequence, it was discovered that the splice junctions had been
incorrectly defined, so that the last 12 bases of exon 10 were in
fact encoded by exon 11. Furthermore, when the entire intron was
sequenced, rather than being greater than 8 kilobases (kb) in
length as originally believed, it was found to be 456 bp. Using
primers located in introns 16 and 18 (forward and reverse primers,
respectively), an amplicon was generated from the second clone,
RPCI-5.1150_B_8, and then sequenced to determine the sequence
of the 3' end of intron 7. Boxes show each exon and its length in
base pairs (intron length not drawn to scale). Primers used to
amplify each exon are shown over and under the exonic structure
(arrowheads). Exon-intron boundaries of exons 9, 10, 11 and 17 are
shown in the upper part of the diagram (uppercase=exon,
lowercase=intron), and the region covered by the two PAC clones is
illustrated by the two lines at the bottom of the figure. The
approximate location of the sequence polymorphisms, discovered in
the families with severe hyperglycerolemia, are indicated by the
arrows. The polymorphic base and surrounding sequence appear
beneath the arrows (SEQ ID NOS: 20-23).
[0016] FIGS. 3A and 3B show the N288D mutation and alignment of the
amino acid sequence with the wildtype amino acid sequences from
different organisms. FIG. 3A shows the location of the N288D
mutation. FIG. 3B shows the alignment of the amino acid sequence
with the wildtype amino acid sequences from different organisms
(SEQ ID NOS: 6-19). Abbreviations are as follows: pseae,
Pseudomonas aeruginosa; entca, Enterococcus casseliflavus; haein,
Haemophilus influenzae; bacsu, Bacillus subtilis; yeast,
Saccharomyces cerevisiae; mycge, Mycoplasma genitalium; entfa,
Enterococcus faecalis; mycpn, Mycoplasma pneumoniae; syny3,
Synechocystis PCC6803. Dashes represent gaps introduced to maximize
alignment.
[0017] FIGS. 4A-4C are graphs of glycerol levels versus plasma
glucose levels and waist girth, as well as mean plasma glycerol
concentrations versus glucose tolerance. FIGS. 4A and 4B illustrate
that among the 18 men carrying the N288D mutation, glycerol was a
significant correlate of 2-hour glucose following a 75 g oral load
(r²=0.689, p<0.0001) (4A) and waist girth (r²=0.452,
p<0.0001) (4B). Five men with previously-diagnosed type 2
diabetes mellitus did not undergo an oral glucose tolerance test
(OGTT). FIG. 4C shows mean plasma glycerol concentrations (±95%
confidence interval) according to the magnitude of glucose
tolerance in subjects with severe hyperglycerolemia due to the
N288D mutation (N=18), and within the initial cohort (non-GKD,
N=1051). NORM defines the category of subjects with normal glucose
tolerance (2-hour glucose <7.8 mmol/L following a 75 g oral
glucose absorption). IGT identifies impaired glucose tolerance
(2-hour glucose 7.8-11.0 mmol/L), whereas DM denotes the presence
of criteria of type 2 diabetes mellitus (2-hour glucose
≥11.1 mmol/L) during the OGTT.
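The three glucose tolerance categories used in FIG. 4C can be expressed as a short classification routine. The following is a minimal sketch, assuming 2-hour glucose is reported in mmol/L to one decimal place, so that any value below 11.1 that is not normal is treated as IGT. The function name is hypothetical.

```python
def classify_ogtt(two_hour_glucose_mmol_l: float) -> str:
    """Classify glucose tolerance from 2-hour glucose after a 75 g oral load.

    Cutoffs follow the categories in the text: NORM below 7.8 mmol/L,
    IGT from 7.8 to 11.0 mmol/L, DM at 11.1 mmol/L or above.
    """
    if two_hour_glucose_mmol_l < 0:
        raise ValueError("glucose concentration cannot be negative")
    if two_hour_glucose_mmol_l < 7.8:
        return "NORM"
    if two_hour_glucose_mmol_l < 11.1:
        return "IGT"
    return "DM"


# Example values chosen only to exercise each category.
for value in (5.6, 9.4, 12.3):
    print(value, classify_ogtt(value))
```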
[0018] FIG. 5 shows the familial resemblance of plasma glycerol
concentrations in the fasting state. Analyses were performed after
having excluded families showing evidence of X-linked transmission
of hyperglycerolemia due to a mutation in the GK gene. The age and
sex adjusted fasting glycerol concentration was calculated as the
residual from the regression model with covariates only, plus mean
glycerolemia for the whole sample. The families are ranked
according to plasma glycerol concentration in the fasting state.
The range of mean glycerolemia between and within families are
depicted by the hatched bars on the right. In the absence of GK
gene mutation, a highly significant (p<0.0001) F ratio of 6.3
was observed, suggesting that there is over 6 times more variance
between families than within them for plasma glycerol levels in the
fasting state. The maximal heritability of glycerolemia in the
fasting state has been estimated at 58% in the absence of severe
hyperglycerolemia. The dotted line denotes median and geometric
mean of plasma glycerol concentration (0.075 mmol/L) observed in
the initial cohort of 1056 individuals (the probands).
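The F ratio reported for FIG. 5 compares variance between families with variance within families. The following minimal sketch computes a one-way analysis-of-variance F ratio from grouped values, assuming each inner list holds the adjusted fasting glycerol values of one family. The example numbers are invented for illustration and are not data from the study, and the sketch does not reproduce the age and sex adjustment described above.

```python
from statistics import mean


def f_ratio(groups):
    """One-way ANOVA F ratio: between-group mean square over within-group mean square."""
    k = len(groups)                      # number of families
    n = sum(len(g) for g in groups)      # total number of individuals
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand = mean(x for g in groups for x in g)
    # Variation of family means around the grand mean, weighted by family size.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of individuals around their own family mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Invented example: three families, glycerol in mmol/L.
families = [[0.05, 0.06, 0.07], [0.09, 0.10, 0.11], [0.07, 0.08, 0.09]]
print(round(f_ratio(families), 2))
```

An F ratio well above 1 indicates that family membership accounts for more of the variation than would be expected if families did not differ.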
[0019] FIG. 6 shows partial nucleic acid sequences (SEQ ID NOS:
1-4, respectively) of the GK gene comprising specific polymorphic
sites, as well as the wild type and alternate nucleotides and the
amino acid change, if any.
[0020] FIGS. 7A-7D show the nucleic acid sequence of the GK gene
(SEQ ID NO: 5). Polymorphic sites are shown in brackets.
[0021] FIG. 8 is a table showing characteristics of carriers of the
N288D GK gene mutation and of their unaffected relatives.
[0022] FIG. 9 is a table showing the fasting plasma glycerol
concentration by risk factor of glucose intolerance and diabetes
mellitus.
[0023] FIG. 10 is a table showing a multivariate analysis of the
relationships of fasting plasma glycerol concentration with
impaired glucose tolerance.
DETAILED DESCRIPTION OF THE INVENTION
[0024] Glycerol is an important intermediate of glucose and lipid
metabolism by virtue of its ability to support glycogenesis in
various systems (Rognstad et al., Biochem J. 140(2):249-251
(1974)), as well as serving as a precursor of the synthesis of
triglycerides (TG) and other glycerolipids (Catron and Lewis, J.
Biol Chem 84:553-559 (1929); Shapiro, J. Biol Chem 108:373-387
(1935)). Administration of glycerol to healthy individuals has been
demonstrated to result in increased serum glucose levels and/or
gluconeogenesis (Sommer et al., Arzneimittel Forschung
43(7):744-747 (1993)), similar to the changes observed in various
pathological situations such as type 2 diabetes mellitus (DM)
(Guggenheim et al., Ann Neurol. 7:441-449 (1980); Frank et al.,
Pharmacotherapy, 1:147-160 (1980); Pelkonen et al., Diabetologia 3:
1-8 (1967)). It has also been shown that obese subjects have
increased levels of plasma glycerol and increased glycerol turnover
when compared with lean individuals (Jansson et al., J. Clin
Invest. 89: 1610-1617 (1992); Jansson et al., Am J. Physiol. 258:
E918-E922 (1990); Bjorntorp et al., Acta Med Scand. 179(2):221-227
(1966)). These observations indicated the potential importance of
glycerol homeostasis in healthy individuals as well as in patients
with abnormalities in glucose or lipid metabolism, who are at
higher risk for DM or coronary artery disease.
[0025] The glycerol kinase (GK) enzyme is a candidate for this
control since it mediates glycerol's entry into metabolic pathways.
Genetic abnormalities involving the GK gene, which is located on
chromosome Xp21.3 (Walker et al., Hum Mol Genet 2(2):107-114
(1993)), have been classified as either complex or isolated
deficiencies (Rose et al., J. Clin. Invest. 61(1):163-170 (1978);
McCabe et al., Adv. Exp. Med. Biol. 194:481-493 (1986); Blomquist
et al., Clin. Genet. 50(5):375-379 (1996)). The complex GK
deficiency (GKD) is a contiguous gene syndrome involving not only
the GK locus, but also the Duchenne muscular dystrophy and/or the
adrenal hypoplasia congenita gene loci (McCabe "Disorders of
Glycerol Metabolism" In the Metabolic Basis of Inherited Disease,
7th Edn. (ed. Scriver C R et al.) McGraw-Hill, New York, pp.
945-961 (1995); Walker et al., Hum. Mol. Genet. 1(8):579-585
(1992); Davies et al., Am. J. Med. Genet. 29(3):557-564 (1988);
Romero et al., Neuromuscul. Disord. 7(8):499-504 (1997)). In
contrast, isolated GK deficiencies, which include juvenile and
adult forms, result from either point mutations or small
rearrangements within the GK gene (Walker et al., Am. J. Hum.
Genet. 58(6):1205-1211 (1996); Sjarif et al., J. Med. Genet.
35(8):650-656 (1998)). The adult form is characterized by a
phenotype of hyperglycerolemia, often detected along with
pseudohypertriglyceridemia since the enzymatic measurement of TG is
generally inferred from that of glycerol generated as a product of
a lipolysis reaction. Apart from pseudohypertriglyceridemia,
however, the clinical expression of the adult form of isolated GK
deficiency is not well documented, mainly due to the small number
of clinically and genetically heterogeneous families described in
previous reports (Walker et al., Am. J. Hum. Genet. 58(6):1205-1211
(1996); Sjarif et al., J. Med. Genet. 35(8):650-656 (1998)). None
of these studies was designed, nor had the power, to describe the
metabolic phenotype in individuals having increased plasma glycerol
levels in the fasting state.
[0026] Work described herein reports the findings of clinical and
molecular genetic examinations of the largest group of individuals
with severe hyperglycerolemia ever reported, identified from a
cohort of 1,056 unrelated French Canadians. This work provides
evidence that fasting glycerolemia is a significant predictor of
impaired glucose tolerance (IGT), and can be a potentially
important genetic connection between plasma glycerol and glucose
homeostasis.
[0027] It is likely that there are many different genes involved in
the modulation of plasma glucose and lipid homeostasis. Among them
are genes involved in the regulation of glycerol metabolism, since
these pathways contribute directly or indirectly to cellular energy
metabolism by providing mitochondria with substrate for oxidative
phosphorylation (Saraste, Science 283(5407):1488-1493 (1999)). In
this regard GK plays a pivotal role, since it mediates the entry of
glycerol into metabolism, catalyzing the phosphorylation of
glycerol by adenosine triphosphate (ATP) to yield glycerol
3-phosphate (G3P) and adenosine diphosphate (ADP) (Thorner et al.,
J. Biol. Chem. 248(1):3922-3932 (1973)). Although glycerol is a
well accepted indicator of lipolysis and a gluconeogenic precursor,
the relationship between glycerol and glucose homeostasis is
complex and not yet elucidated. One way to further this knowledge
is to study cases of hyperglycerolemia, to establish the effect of
glycerol levels in this extreme phenotype on the other metabolic
pathways and then examine whether similar effects are observable in
normoglycerolemic individuals.
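The reaction catalyzed by GK, as described in the preceding paragraph, can be written out as follows. This is only a restatement of the text in equation form; the `\xrightarrow` command assumes the `amsmath` package.

```latex
\[
\text{glycerol} + \text{ATP}
\xrightarrow{\ \text{GK}\ }
\text{glycerol 3-phosphate (G3P)} + \text{ADP}
\]
```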
[0028] Following this approach, the molecular and clinical
characteristics of the largest sample of individuals with familial
hyperglycerolemia ever reported were studied. Importantly, all
families exhibiting this severe phenotype were identified through a
systematic screening of fasting glycerol levels in a large number
of individuals of French Canadian descent. The uniformity of this
group of patients is clearly demonstrated by the observation that
all affected individuals bear the same N288D mutation in the GK
enzyme which is present on a haplotype common to all GKD families.
The study of this rare deficiency in glycerol metabolism
demonstrated that although all N288D carriers were
hyperglycerolemic, significant inter-individual variations in
glycerolemia were observed and these differences were found to
explain an important part of the variance observed in glucose
tolerance and abdominal obesity, a feature that has not been
reported in previous studies on familial hyperglycerolemia.
[0029] In the subsequent examination of the large cohort of
normoglycerolemic individuals it was determined that, in absence of
the N288D mutation at the GK locus, fasting plasma glycerol
concentrations have an important familial component in humans. This
finding is notable since glycerol is usually only considered as an
intermediate metabolite, its concentration being affected by
multiple factors such as the degree of glycerol released by
lipolysis, the rate of glyconeogenesis or glycogenolysis, obesity,
starvation, exercise, the use of pharmaceutical preparations, and
numerous pathological conditions. Despite this variety of
environmental factors affecting glycerol concentrations, it was
found that the heritability of fasting glycerolemia could be as
high as 58% in humans, indicating an important genetic control.
Furthermore, it was also found that plasma glycerol was a predictor
of 2-hour glucose, independent of the variation in significant,
well recognized, covariates of IGT or DM. This relationship of
glycerol to 2-hour glucose was not linear across its distribution
and a threshold in the relationship of glycerol to IGT was
observed. Interestingly, in the absence of the N288D mutation, the
threshold for glycerol concentrations was relatively low, at the
level of the median of the studied population, so that even within
what is considered as a "normal range" of glycerol levels, a
moderate elevation in glycerol concentrations substantially
increased the odds of finding patients with IGT. The possibility
that the results of the OGTT can be predicted from the knowledge of
the glycerolemia is clinically relevant, considering that
measurement of plasma glycerol concentrations in the fasting state
is a cheap and widely available analysis. Results of multivariate
analyses clearly demonstrated that there are many other important
IGT predictors, such as impaired fasting glucose and FFA
concentrations. The association of glycerol with IGT, however, was
independent of FFA and of fasting glucose concentrations.
Furthermore, compared to FFA, plasma glycerol measurement in the
fasting state is cheaper and is not affected by qualitative factors
such as the degree of saturation.
[0030] Taken together, these results are most consistent with
glycerol playing a regulatory role in the pathogenesis of IGT and
DM. First, results from N288D carriers demonstrate that increased
levels of glycerol are observable in the context of normal glucose
tolerance. Indeed, even though the majority of men carrying a GK
gene mutation met criteria of IGT or DM, some of them, exhibiting
extremely elevated plasma glycerol concentrations (over 3.0
mmol/L), had normal 2-hour glucose values. Compared to N288D
carriers with IGT, however, these individuals were younger and less
obese. Furthermore, the majority of them also presented elevated
fasting insulin concentrations (above 30 mU/L) such that they are
possibly at a higher risk of IGT.
[0031] Second, the essential position of glycerol in both glucose
and glycerolipid metabolic pathways favors glycerol as a potential
causal factor. Indeed, it is recognized that the contribution of
glycerol to glucose production is directly correlated to its
release as a consequence of lipolysis (Prentki et al., J. Biol.
Chem., 267(9):5802-5810 (1992)). However, under normal
circumstances gluconeogenesis from glycerol accounts for only a
small percentage of total glucose production, and an important
proportion of glycerol metabolites is used for glycerolipid
synthesis and not for glucose production. Notwithstanding these
factors, variations in the glycerolemia among individuals with GK
deficiency explained 68.9% of the variance in 2-hour glucose, and
among non-carriers the prediction of 2-hour glucose by fasting
glycerolemia was independent of fasting glucose concentration,
suggesting that beyond glycerol-derived gluconeogenesis, glycerol
is likely to have a regulatory role.
[0032] Thus, the current study of a large sample of unrelated
individuals and of a homogeneous group of patients with a rare
deficiency in glycerol metabolism indicates an important genetic
connection between glycerol metabolism and the level of glucose
tolerance, and supports the usefulness of measuring fasting plasma
glycerol concentration in screening for the pre-diabetic
phenotype.
[0033] The present invention also pertains to diagnostic assays and
prognostic assays used for prognostic (predictive) purposes to
thereby treat an individual prophylactically. Accordingly, one
aspect of the present invention relates to diagnostic assays for
determining protein and/or nucleic acid expression as well as
activity of proteins of the invention, in the context of a
biological sample (e.g., blood, serum, cells, tissue) to thereby
determine whether an individual is afflicted with a disease or
disorder, or is at risk of developing a disorder, e.g., type 2
diabetes mellitus, cardiovascular disease, hyperglycerolemia and/or
impaired glucose tolerance, associated with aberrant expression or
activity. The invention also provides for prognostic (or
predictive) assays for determining whether an individual is at risk
of developing a disorder associated with activity or expression of
proteins or nucleic acids of the invention. Thus, such methods can
predict or aid in the prediction of an individual's increased
likelihood for developing a disorder, as well as assisting in the
diagnosis of existing disorders.
[0034] For example, the invention provides methods of predicting or
assisting in the prediction of diabetes mellitus, cardiovascular
disease, hyperglycerolemia and/or impaired glucose tolerance in an
individual, comprising the steps of obtaining a biological sample
from an individual and assessing glycerol levels in said sample,
wherein increased levels of glycerol in said sample as compared
with a control sample, e.g., from a normal individual, is
predictive of diabetes mellitus, cardiovascular disease,
hyperglycerolemia and/or impaired glucose tolerance in the
individual. In a preferred embodiment, the diabetes mellitus is
type 2 diabetes mellitus. In one embodiment, increased glycerol
levels are greater than about 0.08 mmol/L. Alternatively, one could
assess levels of GK gene expression or levels of active GK protein
present in the sample. Decreased levels as compared with a suitable
control are indicative of increased likelihood of diabetes mellitus
and/or IGT in the individual. In one embodiment, the biological
sample is a blood sample, such as a fasting blood sample. In a
preferred embodiment, the glycerol levels which are assessed are
plasma glycerol levels.
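The glycerol-based embodiment above reduces to comparing a fasting plasma glycerol value with a cutoff. The following minimal sketch treats "about 0.08 mmol/L" as a fixed cutoff of 0.08 mmol/L, which is a simplifying assumption. The function name and the option to pass a different cutoff (for example, one derived from a control sample) are hypothetical.

```python
# "About 0.08 mmol/L" in the text; treated here as a fixed cutoff (assumption).
GLYCEROL_CUTOFF_MMOL_L = 0.08


def glycerol_is_elevated(fasting_plasma_glycerol_mmol_l: float,
                         cutoff_mmol_l: float = GLYCEROL_CUTOFF_MMOL_L) -> bool:
    """Return True when fasting plasma glycerol exceeds the cutoff."""
    if fasting_plasma_glycerol_mmol_l < 0:
        raise ValueError("glycerol concentration cannot be negative")
    return fasting_plasma_glycerol_mmol_l > cutoff_mmol_l


# 0.075 mmol/L is the cohort median quoted for FIG. 5; 2.5 mmol/L falls in the
# range described as severe hyperglycerolemia (above 2.0 mmol/L).
for value in (0.075, 0.09, 2.5):
    print(value, glycerol_is_elevated(value))
```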
[0035] An exemplary method for detecting the presence or absence of
proteins or nucleic acids of the invention in a biological sample
involves obtaining a biological sample from a test subject and
contacting the biological sample with a compound or an agent
capable of detecting the protein (e.g., the glycerol protein or the
GK protein), or nucleic acid (e.g., mRNA, genomic DNA) that encodes
the GK protein, such that the presence of the protein or nucleic
acid is detected in the biological sample. A preferred agent for
detecting mRNA or genomic DNA is a labeled nucleic acid probe
capable of hybridizing to mRNA or genomic DNA sequences described
herein. The nucleic acid probe can be, for example, a full-length
nucleic acid, or a portion thereof, such as an oligonucleotide of
at least 15, 30, 50, 100, 250 or 500 nucleotides in length and
sufficient to specifically hybridize under stringent conditions to
appropriate mRNA or genomic DNA. Other suitable probes for use in
the diagnostic assays of the invention are described herein.
[0036] In another embodiment, the invention provides a method of
predicting or assisting in the prediction of diabetes mellitus or
impaired glucose tolerance in an individual, comprising the steps
of obtaining a nucleic acid sample from an individual and
determining the nucleotide present at nucleotide position 29 of
exon 10, wherein presence of a guanine at said position is
predictive of diabetes mellitus or impaired glucose tolerance in
the individual as compared with an appropriate control, e.g., an
individual having an adenosine at said position.
[0037] In one embodiment, the agent for detecting proteins of the
invention is an antibody capable of binding to the protein,
preferably an antibody with a detectable label. Antibodies can be
polyclonal, or more preferably, monoclonal. An intact antibody, or
a fragment thereof (e.g., Fab or F(ab')2) can be used. The
term "labeled", with regard to the probe or antibody, is intended
to encompass direct labeling of the probe or antibody by coupling
(i.e., physically linking) a detectable substance to the probe or
antibody, as well as indirect labeling of the probe or antibody by
reactivity with another reagent that is directly labeled. Examples
of indirect labeling include detection of a primary antibody using
a fluorescently labeled secondary antibody and end-labeling of a
DNA probe with biotin such that it can be detected with
fluorescently labeled streptavidin. In a preferred embodiment, the
antibody is able to distinguish between complete or nearly complete
proteins and truncated versions of the same protein.
[0038] The term "biological sample" is intended to include tissues,
cells and biological fluids isolated from a subject, as well as
tissues, cells and fluids present within a subject. For example,
the sample can be obtained from a tissue selected from the group
consisting of: brain tissue, CNS, lung, fetal lung, testis,
lymphocytes, adipose, fibroblasts, skeletal muscle, pancreas,
uterus, kidney, tonsil, embryo and isolated cells thereof. That is,
the detection method of the invention can be used to detect mRNA,
protein, or genomic DNA of the invention in a biological sample in
vitro as well as in vivo. For example, in vitro techniques for
detection of mRNA include Northern hybridizations and in situ
hybridizations. In vitro techniques for detection of protein
include enzyme linked immunosorbent assays (ELISAs), Western blots,
immunoprecipitations and immunofluorescence. In vitro techniques
for detection of genomic DNA include Southern hybridizations.
Furthermore, in vivo techniques for detection of protein include
introducing into a subject a labeled anti-protein antibody. For
example, the antibody can be labeled with a radioactive marker
whose presence and location in a subject can be detected by
standard imaging techniques.
[0039] In one embodiment, the biological sample contains protein
molecules from the test subject. Alternatively, the biological
sample can contain mRNA molecules from the test subject or genomic
DNA molecules from the test subject. A preferred biological sample
is a serum sample obtained by conventional means from a subject. A
nucleic acid sample is a sample, e.g., a biological sample, which
contains nucleic acid molecules.
[0040] The invention also encompasses kits for detecting the
presence of proteins or nucleic acid molecules of the invention in
a biological sample. For example, the kit can comprise a labeled
compound or agent capable of detecting protein or mRNA in a
biological sample; means for determining the amount of protein or
mRNA in the sample; and means for comparing the amount of protein or
mRNA in the sample with a
standard. The compound or agent can be packaged in a suitable
container. The kit can further comprise instructions for using the
kit to detect protein or nucleic acid.
[0041] In certain embodiments as described herein, it is valuable
to determine the genotype of an individual, particularly where a
specific allelic form of the GK gene has now been associated with
disease. For example, it will be valuable for purposes of diagnosis
to determine which allelic form of the N288D mutation an individual
has with respect to cardiovascular disease, hyperglycerolemia, IGT
or DM diagnosis.
[0042] Detection of the alteration can involve the use of a
probe/primer in a polymerase chain reaction (PCR) (see, e.g., U.S.
Pat. Nos. 4,683,195 and 4,683,202), such as anchor PCR or RACE PCR,
or, alternatively, in a ligation chain reaction (LCR) (see, e.g.,
Landegren et al. (1988) Science, 241:1077-1080; and Nakazawa et al.
(1994) PNAS, 91:360-364), the latter of which can be particularly
useful for detecting point mutations (see Abravaya et al. (1995)
Nucleic Acids Res., 23:675-682). This method can include the steps
of collecting a sample of cells from a patient, isolating nucleic
acid (e.g., genomic, mRNA or both) from the cells of the sample,
contacting the nucleic acid sample with one or more primers which
specifically hybridize to the gene under conditions such that
hybridization and amplification of the gene (if present) occurs,
and detecting the presence or absence of an amplification product,
or detecting the size of the amplification product and comparing
the length to a control sample. It is anticipated that PCR and/or
LCR may be desirable to use as a preliminary amplification step in
conjunction with any of the techniques used for detecting mutations
described herein. In one embodiment, allele-specific primers are
utilized.
[0043] Alternative amplification methods include: self sustained
sequence replication (Guatelli, J. C. et al. (1990) Proc. Natl.
Acad. Sci. USA, 87:1874-1878), transcriptional amplification system
(Kwoh, D. Y. et al., (1989) Proc. Natl. Acad. Sci. USA,
86:1173-1177), Q-Beta Replicase (Lizardi, P. M. et al., (1988)
Bio/Technology, 6:1197), or any other nucleic acid amplification
method, followed by the detection of the amplified molecules using
techniques well known to those of skill in the art. These detection
schemes are especially useful for the detection of nucleic acid
molecules if such molecules are present in very low numbers.
[0044] In an alternative embodiment, mutations in a given gene from
a sample cell can be identified by alterations in restriction
enzyme cleavage patterns. For example, sample and control DNA is
isolated, amplified (optionally), digested with one or more
restriction endonucleases, and fragment length sizes are determined
by gel electrophoresis and compared. Differences in fragment length
sizes between sample and control DNA indicate mutations in the
sample DNA. Moreover, sequence specific ribozymes (see, for
example, U.S. Pat. No. 5,498,531) can be used to score for the
presence of specific mutations by development or loss of a ribozyme
cleavage site.
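The fragment-length comparison described above can be sketched in a
few lines of Python. This is a minimal illustration under stated
assumptions, not a laboratory protocol: the two sequences are
hypothetical, the enzyme is modeled only by a recognition site
(GAATTC, the EcoRI site) and a cut offset, only one strand is
modeled, and gel electrophoresis is reduced to a sorted list of
fragment lengths.

```python
def digest(seq, site, cut_offset=1):
    """Return the fragment lengths produced by cutting seq at every
    occurrence of the recognition site.

    cut_offset is the position within the site where the strand is cut
    (1 models a cut after the first base of the site). Only one strand
    is modeled, which is a simplification.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def pattern_differs(sample, control, site):
    """True when sample and control yield different fragment lengths."""
    return sorted(digest(sample, site)) != sorted(digest(control, site))


# Hypothetical sequences: one base change in the sample destroys the site.
control_dna = "AAAAGAATTCTTTTTTGGGG"
sample_dna = "AAAAGAGTTCTTTTTTGGGG"

print(digest(control_dna, "GAATTC"))
print(digest(sample_dna, "GAATTC"))
print(pattern_differs(sample_dna, control_dna, "GAATTC"))
```

In this toy case the control is cut once and the sample is not cut at
all, so the two fragment patterns differ, which is the signal the
paragraph above describes.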
[0045] In other embodiments, genetic mutations can be identified by
hybridizing a sample and control nucleic acids, e.g., DNA or RNA,
to high density arrays containing hundreds or thousands of
oligonucleotide probes (Cronin, M. T. et al. (1996) Human Mutation,
7:244-255; Kozal, M. J. et al. (1996) Nature Medicine, 2:753-759).
For example, genetic mutations can be identified in two dimensional
arrays containing light-generated DNA probes as described in
Cronin, M. T. et al. supra. Briefly, a first hybridization array of
probes can be used to scan through long stretches of DNA in a
sample and control to identify base changes between the sequences
by making linear arrays of sequential overlapping probes. This step
allows the identification of point mutations. This step is followed
by a second hybridization array that allows the characterization of
specific mutations by using smaller, specialized probe arrays
complementary to all variants or mutations detected. Each mutation
array is composed of parallel probe sets, one complementary to the
wild-type gene and the other complementary to the mutant gene.
[0046] In yet another embodiment, any of a variety of sequencing
reactions known in the art can be used to directly sequence the
gene and detect mutations by comparing the sequence of the gene
from the sample with the corresponding wild-type (control) gene
sequence. Examples of sequencing reactions include those based on
techniques developed by Maxam and Gilbert ((1977) PNAS, 74:560) or
Sanger ((1977) PNAS, 74:5463). It is also contemplated that any of
a variety of automated sequencing procedures can be utilized when
performing the diagnostic assays ((1995) Biotechniques, 19:448),
including sequencing by mass spectrometry (see, e.g., PCT
International Publication No. WO 94/16101; Cohen et al. (1996) Adv.
Chromatogr., 36:127-162; and Griffin et al. (1993) Appl. Biochem.
Biotechnol., 38:147-159).
[0047] In other embodiments, alterations in electrophoretic
mobility will be used to identify mutations in genes. For example,
single strand conformation polymorphism (SSCP) may be used to
detect differences in electrophoretic mobility between mutant and
wild type nucleic acids (Orita et al. (1989) Proc. Natl. Acad. Sci.
USA, 86:2766, see also Cotton (1993) Mutat Res, 285:125-144; and
Hayashi (1992) Genet Anal. Tech. Appl., 9:73-79). Single-stranded
DNA fragments of sample and control nucleic acids will be denatured
and allowed to renature. The secondary structure of single-stranded
nucleic acids varies according to sequence; the resulting
alteration in electrophoretic mobility enables the detection of
even a single base change. The DNA fragments may be labeled or
detected with labeled probes. The sensitivity of the assay may be
enhanced by using RNA (rather than DNA), in which the secondary
structure is more sensitive to a change in sequence. In one
embodiment, the subject method utilizes heteroduplex analysis to
separate double stranded heteroduplex molecules on the basis of
changes in electrophoretic mobility (Keen et al. (1991) Trends
Genet., 7:5).
[0048] In yet another embodiment the movement of mutant or
wild-type fragments in polyacrylamide gels containing a gradient of
denaturant is assayed using denaturing gradient gel electrophoresis
(DGGE) (Myers et al. (1985) Nature, 313:495). When DGGE is used as
the method of analysis, DNA will be modified to ensure that it does
not completely denature, for example by adding a GC clamp of
approximately 40 bp of high-melting GC-rich DNA by PCR. In a
further embodiment, a temperature gradient is used in place of a
denaturing gradient to identify differences in the mobility of
control and sample DNA (Rosenbaum and Reissner (1987) Biophys.
Chem., 265:12753).
[0049] Examples of other techniques for detecting point mutations
include, but are not limited to, selective oligonucleotide
hybridization, selective amplification, or selective primer
extension. For example, oligonucleotide primers may be prepared in
which the known mutation is placed centrally and then hybridized to
target DNA under conditions which permit hybridization only if a
perfect match is found (Saiki et al. (1986) Nature, 324:163; Saiki
et al. (1989) Proc. Natl. Acad. Sci. USA, 86:6320). Such
allele-specific oligonucleotides are hybridized to PCR amplified
target DNA or a number of different mutations when the
oligonucleotides are attached to the hybridizing membrane and
hybridized with labeled target DNA.
[0050] Alternatively, allele specific amplification technology that
depends on selective PCR amplification may be used in conjunction
with the instant invention. Oligonucleotides used as primers for
specific amplification may carry the mutation of interest in the
center of the molecule (so that amplification depends on
differential hybridization) (Gibbs et al. (1989) Nucleic Acids
Res., 17:2437-2448) or at the extreme 3' end of one primer where,
under appropriate conditions, mismatch can prevent or reduce
polymerase extension (Prossner (1993) Tibtech, 11:238). In addition
it may be desirable to introduce a novel restriction site in the
region of the mutation to create cleavage-based detection
(Gasparini et al. (1992) Mol. Cell. Probes, 6:1). It is anticipated
that in certain embodiments amplification may also be performed
using Taq ligase for amplification (Barany (1991) Proc. Natl. Acad.
Sci. USA, 88:189). In such cases, ligation will occur only if there
is a perfect match at the 3' end of the 5' sequence making it
possible to detect the presence of a known mutation at a specific
site by looking for the presence or absence of amplification.
Single base extension (SBE) and SBE fluorescence resonance energy
transfer (SBE-FRET) can also be used to identify the specific
nucleotide which occupies a given position in a nucleic acid
molecule.
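The logic of allele-specific amplification with the mutation at the
extreme 3' end of one primer can be sketched as follows. This is an
idealized model under stated assumptions: the sequences, primers and
function names are hypothetical, each primer is written in the same
5' to 3' sense as the strand it is compared with, and extension is
assumed to occur only on a perfect match, whereas real polymerases
tolerate some mismatches depending on the reaction conditions.

```python
def primer_extends(primer, strand, snp_index):
    """Idealized allele-specific extension.

    The primer's 3'-terminal base is placed on snp_index (0-based) and
    extension is assumed to occur only when the whole primer matches.
    """
    start = snp_index - len(primer) + 1
    if start < 0:
        raise ValueError("primer is longer than the sequence upstream of the site")
    return strand[start:snp_index + 1] == primer


def call_genotype(strands, ref_primer, alt_primer, snp_index):
    """Classify a sample by which allele-specific primers extend."""
    ref = any(primer_extends(ref_primer, s, snp_index) for s in strands)
    alt = any(primer_extends(alt_primer, s, snp_index) for s in strands)
    if ref and alt:
        return "heterozygous"
    if ref:
        return "homozygous reference"
    if alt:
        return "homozygous alternate"
    return "no call"


# Hypothetical alleles differing only at index 9 (G versus A).
SNP_INDEX = 9
ref_allele = "ACGTTGCAAGCTAGGCTA"
alt_allele = "ACGTTGCAAACTAGGCTA"
ref_primer = "GTTGCAAG"  # 3' end matches the reference base
alt_primer = "GTTGCAAA"  # 3' end matches the alternate base

print(call_genotype([ref_allele, alt_allele], ref_primer, alt_primer, SNP_INDEX))
```

Running two reactions per sample, one with each primer, is what lets
the presence or absence of product be read as a genotype.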
[0051] The methods described herein may be performed, for example,
by utilizing pre-packaged diagnostic kits comprising at least one
probe nucleic acid molecule or antibody reagent described herein,
which may be conveniently used, e.g., in clinical settings to
diagnose patients exhibiting symptoms or family history of a
disease or illness involving a gene of the present invention. Any
cell type or tissue in which the gene is expressed may be utilized
in the prognostic assays described herein.
[0052] The invention also relates to isolated nucleic acid
molecules comprising SEQ ID NOS: 1-4. SEQ ID NOS: referred to
herein are as follows. SEQ ID NO: 1 refers to the nucleic acid
sequence of the GK gene having a polymorphic site at nucleotide
position 13 of exon 3 as shown in FIG. 6. SEQ ID NO: 2 refers to
the nucleic acid sequence of the GK gene having a polymorphic site
at nucleotide position 17 of intron 8 as shown in FIG. 6. SEQ ID
NO: 3 refers to the nucleic acid sequence of the GK gene having a
polymorphic site at nucleotide position 29 of exon 10 as shown in
FIG. 6. SEQ ID NO: 4 refers to the nucleic acid sequence of the GK
gene having a polymorphic site at nucleotide position 22 of intron 12
as shown in FIG. 6. In one embodiment, SEQ ID NOS: 1-4 comprise the
reference (first) nucleotide at the polymorphic site. In another
embodiment, SEQ ID NOS: 1-4 comprise the alternate (second)
nucleotide at the polymorphic site. SEQ ID NO: 5 refers to the
complete coding nucleic acid sequence of the GK gene, particularly
as shown in FIGS. 7A-7D.
[0053] As appropriate, the isolated nucleic acid molecules of the
present invention can be RNA, for example, mRNA, or DNA, such as
cDNA and genomic DNA. DNA molecules can be double-stranded or
single-stranded; single stranded RNA or DNA can be either the
coding, or sense, strand or the non-coding, or antisense, strand.
The nucleic acid molecule can include all or a portion of the
coding sequence of a gene and can further comprise additional
non-coding sequences such as introns and non-coding 3' and 5'
sequences (including regulatory sequences, for example).
Additionally, the nucleic acid molecule can be fused to a marker
sequence, for example, a sequence that encodes a polypeptide to
assist in isolation or purification of the polypeptide. Such
sequences include, but are not limited to, those which encode a
glutathione-S-transferase (GST) fusion protein and those which
encode a hemagglutinin A (HA) polypeptide marker from influenza. As
used herein, "isolated" is intended to mean that the isolated item
is not in the form or environment in which it exists in nature. For
example, an "isolated" nucleic acid molecule, as used herein, is
one that is separated from nucleic acid which normally flanks the
nucleic acid molecule in nature. With regard to genomic DNA, the
term "isolated" refers to nucleic acid molecules which are
separated from the chromosome with which the genomic DNA is
naturally associated. For example, the isolated nucleic acid
molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb,
0.5 kb or 0.1 kb of nucleotides which flank the nucleic acid
molecule in the genomic DNA of the cell from which the nucleic acid
is derived.
[0054] Moreover, an isolated nucleic acid of the invention, such as
a cDNA or RNA molecule, can be substantially free of other cellular
material, or culture medium when produced by recombinant
techniques, or chemical precursors or other chemicals when
chemically synthesized. However, the nucleic acid molecule can be
fused to other coding or regulatory sequences and still be
considered isolated. In some instances, the isolated material will
form part of a composition (for example, a crude extract containing
other substances), buffer system or reagent mix. In other
circumstances, the material may be purified to essential
homogeneity, for example as determined by PAGE or column
chromatography such as HPLC. Preferably, an isolated nucleic acid
comprises at least about 50, 80 or 90% (on a molar basis) of all
macromolecular species present.
[0055] Further, recombinant DNA contained in a vector is included
in the definition of "isolated" as used herein. Also, isolated
nucleic acid molecules include recombinant DNA molecules in
heterologous host cells, as well as partially or substantially
purified DNA molecules in solution. "Isolated" nucleic acid
molecules also encompass in vivo and in vitro RNA transcripts of
the DNA molecules of the present invention produced in a
heterologous host cell. The present invention also provides
isolated nucleic acids that contain a fragment or portion of SEQ ID
NOS: 1-4 described herein and the complements of SEQ ID NOS: 1-4.
Preferred fragments comprise a polymorphic site, and in a
preferred embodiment the polymorphic site is occupied by the
alternate nucleotide. The nucleic acid fragments of the invention
are at least about 15, preferably at least about 18, 20, 23 or 25
consecutive nucleotides, and can be 30, 40, 50, 100, 200 or more
nucleotides in length. Longer fragments, for example, 30 or more
nucleotides in length, which encode antigenic proteins or
polypeptides described herein are useful.
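As an illustration of what a fragment that comprises a polymorphic
site means in practice, the following sketch enumerates every
contiguous fragment of a chosen length that includes a given
position. The sequence, the site index and the function name are
hypothetical and are not taken from SEQ ID NOS: 1-4.

```python
def fragments_spanning(seq, site_index, length):
    """Return every contiguous fragment of the given length that
    includes the base at site_index (0-based)."""
    if not 0 <= site_index < len(seq):
        raise ValueError("site_index is outside the sequence")
    if length > len(seq):
        return []
    first = max(0, site_index - length + 1)
    last = min(site_index, len(seq) - length)
    return [seq[s:s + length] for s in range(first, last + 1)]


# Hypothetical 40-nucleotide sequence with the site of interest at index 20.
sequence = "ACGT" * 10
fragments = fragments_spanning(sequence, 20, 15)
print(len(fragments))
```

For a site far enough from either end of the sequence, the number of
such fragments equals the fragment length, since the site can occupy
any position within the fragment.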
[0056] In a related aspect, the nucleic acid fragments of the
invention are used as probes or primers in assays such as those
described herein. "Probes" are oligonucleotides that hybridize in a
base-specific manner to a complementary strand of nucleic acid.
Such probes include polypeptide nucleic acids, as described in
Nielsen et al., Science, 254, 1497-1500 (1991). Typically, a probe
comprises a region of nucleotide sequence that hybridizes under
highly stringent conditions to at least about 15, typically about
20-25, and more typically about 40, 50 or 75 consecutive
nucleotides of a nucleic acid molecule of the invention. More
typically, the probe further comprises a label, e.g., radioisotope,
fluorescent compound, enzyme, or enzyme co-factor.
[0057] As used herein, the term "primer" refers to a
single-stranded oligonucleotide which acts as a point of initiation
of template-directed DNA synthesis using well-known methods (e.g.,
PCR, LCR) including, but not limited to those described herein. The
appropriate length of the primer depends on the particular use, but
typically ranges from about 15 to 30 nucleotides. The term "primer
site" refers to the area of the target DNA to which a primer
hybridizes. The term "primer pair" refers to a set of primers
including a 5' (upstream) primer that hybridizes with the 5' end of
the nucleic acid sequence to be amplified and a 3' (downstream)
primer that hybridizes with the complement of the sequence to be
amplified.
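The relationship between the two members of a primer pair can be
sketched as follows. The sketch assumes the common convention in
which the forward primer is identical to the 5' end of the target's
sense strand and the reverse primer is the reverse complement of its
3' end. The target sequence and function names are hypothetical, and
practical primer design also weighs factors such as melting
temperature and GC content, which are ignored here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_pair(target, length=20):
    """Return (forward, reverse) primers flanking the target.

    forward is the first `length` bases of the sense strand; reverse is
    the reverse complement of the last `length` bases.
    """
    if length < 1 or 2 * length > len(target):
        raise ValueError("target is too short for two primers of this length")
    return target[:length], reverse_complement(target[-length:])


# Hypothetical 41-nucleotide target, written 5' to 3' on the sense strand.
target = "ATGGCTAGCTAGGATCCGATCGATCGGCTAAGCTTGCATGC"
forward, reverse = primer_pair(target, length=15)
print(forward)
print(reverse)
```

Both primers are written 5' to 3', which is why the downstream primer
appears as a reverse complement rather than as a plain copy of the
target's 3' end.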
[0058] The nucleic acid molecules of the invention such as those
described above can be identified and isolated using standard
molecular biology techniques and the sequence information provided
herein. For example, nucleic acid molecules can be amplified and
isolated by the polymerase chain reaction using synthetic
oligonucleotide primers designed based on one or more of the
sequences provided herein and the complements thereof. See
generally PCR Technology: Principles and Applications for DNA
Amplification (ed. H. A. Erlich, Freeman Press, NY, N.Y., 1992);
PCR Protocols: A Guide to Methods and Applications (Eds. Innis, et
al., Academic Press, San Diego, Calif., 1990); Mattila et al.,
Nucleic Acids Res., 19:4967 (1991); Eckert et al., PCR Methods and
Applications, 1:17 (1991); PCR (eds. McPherson et al., IRL Press,
Oxford); and U.S. Pat. No. 4,683,202. The nucleic acid molecules
can be amplified using cDNA, mRNA or genomic DNA as a template,
cloned into an appropriate vector and characterized by DNA sequence
analysis.
[0059] Other suitable amplification methods include the ligase
chain reaction (LCR) (see Wu and Wallace, Genomics, 4:560 (1989);
Landegren et al., Science, 241:1077 (1988)), transcription
amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA, 86:1173
(1989)), self-sustained sequence replication (Guatelli et al.,
Proc. Natl. Acad. Sci. USA, 87:1874 (1990)) and nucleic acid based
sequence amplification (NASBA). The latter two amplification
methods involve isothermal reactions based on isothermal
transcription, which produce both single stranded RNA (ssRNA) and
double stranded DNA (dsDNA) as the amplification products in a
ratio of about 30 or 100 to 1, respectively.
[0060] The amplified DNA can be radiolabelled and used as a probe
for screening a cDNA library derived from mRNA in zap express,
ZIPLOX or other suitable vector. Corresponding clones can be
isolated, DNA can be obtained following in vivo excision, and the
cloned insert can be sequenced in either or both orientations by
art recognized methods to identify the correct reading frame
encoding a protein of the appropriate molecular weight. For
example, the direct analysis of the nucleotide sequence of nucleic
acid molecules of the present invention can be accomplished using
well-known methods that are commercially available. See, for
example, Sambrook et al., Molecular Cloning, A Laboratory Manual
(2nd Ed., CSHP, New York 1989); Zyskind et al., Recombinant DNA
Laboratory Manual, (Acad. Press, 1988). Using these or similar
methods, the protein(s) and the DNA encoding the protein can be
isolated, sequenced and further characterized.
[0061] Antisense nucleic acids of the invention can be designed
using the nucleotide sequences described herein, and constructed
using chemical synthesis and enzymatic ligation reactions using
procedures known in the art. For example, an antisense nucleic acid
(e.g., an antisense oligonucleotide) can be chemically synthesized
using naturally occurring nucleotides or variously modified
nucleotides designed to increase the biological stability of the
molecules or to increase the physical stability of the duplex
formed between the antisense and sense nucleic acids, e.g.,
phosphorothioate derivatives and acridine substituted nucleotides
can be used.
[0062] In general, the isolated nucleic acid sequences can be used
as molecular weight markers on Southern gels, and as chromosome
markers which are labeled to map related gene positions. The
nucleic acid sequences can also be used to compare with endogenous
DNA sequences in patients to identify genetic disorders, and as
probes, such as to hybridize and discover related DNA sequences or
to subtract out known sequences from a sample. The nucleic acid
sequences can further be used to derive primers for genetic
fingerprinting, to raise anti-protein antibodies using DNA
immunization techniques, and as an antigen to raise anti-DNA
antibodies or elicit immune responses. Additionally, the nucleotide
sequences of the invention can be used to identify and express
recombinant proteins for analysis, characterization or therapeutic
use, or as markers for tissues in which the corresponding protein
is expressed, either constitutively, during tissue differentiation,
or in diseased states.
[0063] The invention also relates to constructs which comprise a
vector into which a sequence of the invention has been inserted in
a sense or antisense orientation. As used herein, the term "vector"
refers to a nucleic acid molecule capable of transporting another
nucleic acid to which it has been linked. One type of vector is a
"plasmid", which refers to a circular double stranded DNA loop into
which additional DNA segments can be ligated. Another type of
vector is a viral vector, wherein additional DNA segments can be
ligated into the viral genome. Certain vectors are capable of
autonomous replication in a host cell into which they are
introduced (e.g., bacterial vectors having a bacterial origin of
replication and episomal mammalian vectors). Other vectors (e.g.,
non-episomal mammalian vectors) are integrated into the genome of a
host cell upon introduction into the host cell, and thereby are
replicated along with the host genome. Moreover, certain vectors,
expression vectors, are capable of directing the expression of
genes to which they are operably linked. In general, expression
vectors of utility in recombinant DNA techniques are often in the
form of plasmids (vectors). However, the invention is intended to
include such other forms of expression vectors, such as viral
vectors (e.g., replication defective retroviruses, adenoviruses and
adeno-associated viruses) that serve equivalent functions.
[0064] Preferred recombinant expression vectors of the invention
comprise a nucleic acid of the invention in a form suitable for
expression of the nucleic acid in a host cell. This means that the
recombinant expression vectors include one or more regulatory
sequences, selected on the basis of the host cells to be used for
expression, which are operably linked to the nucleic acid sequence
to be expressed. Within a recombinant expression vector, "operably
linked" is intended to mean that the nucleotide sequence of
interest is linked to the regulatory sequence(s) in a manner which
allows for expression of the nucleotide sequence (e.g., in an in
vitro transcription/translation system or in a host cell when the
vector is introduced into the host cell). The term "regulatory
sequence" is intended to include promoters, enhancers and other
expression control elements (e.g., polyadenylation signals). Such
regulatory sequences are described, for example, in Goeddel, Gene
Expression Technology: Methods in Enzymology 185, Academic Press,
San Diego, Calif. (1990). Regulatory sequences include those which
direct constitutive expression of a nucleotide sequence in many
types of host cell and those which direct expression of the
nucleotide sequence only in certain host cells (e.g.,
tissue-specific regulatory sequences). It will be appreciated by
those skilled in the art that the design of the expression vector
can depend on such factors as the choice of the host cell to be
transformed, the level of expression of protein desired, etc.
[0065] The expression vectors of the invention can be introduced
into host cells to thereby produce proteins or peptides, including
fusion proteins or peptides, encoded by nucleic acids as described
herein. The recombinant expression vectors of the invention can be
designed for expression of a polypeptide of the invention in
prokaryotic or eukaryotic cells, e.g., bacterial cells such as E.
coli, insect cells (using baculovirus expression vectors), yeast
cells or mammalian cells. Suitable host cells are discussed further
in Goeddel, supra. Alternatively, the recombinant expression vector
can be transcribed and translated in vitro, for example using T7
promoter regulatory sequences and T7 polymerase.
[0066] Another aspect of the invention pertains to host cells into
which a recombinant expression vector of the invention has been
introduced. The terms "host cell" and "recombinant host cell" are
used interchangeably herein. It is understood that such terms refer
not only to the particular subject cell but also to the progeny or
potential progeny of such a cell. Because certain modifications may
occur in succeeding generations due to either mutation or
environmental influences, such progeny may not, in fact, be
identical to the parent cell, but are still included within the
scope of the term as used herein.
[0067] A host cell can be any prokaryotic or eukaryotic cell. For
example, a nucleic acid of the invention can be expressed in
bacterial cells (e.g., E. coli), insect cells, yeast or mammalian
cells (such as Chinese hamster ovary cells (CHO) or COS cells).
Other suitable host cells are known to those skilled in the art.
For example, suitable cells can be derived from tissues such as
adipocytes, lymphoblasts and fibroblasts.
[0068] Vector DNA can be introduced into prokaryotic or eukaryotic
cells via conventional transformation or transfection techniques.
As used herein, the terms "transformation" and "transfection" are
intended to refer to a variety of art-recognized techniques for
introducing foreign nucleic acid (e.g., DNA) into a host cell,
including calcium phosphate or calcium chloride co-precipitation,
DEAE-dextran-mediated transfection, lipofection, or
electroporation. Suitable methods for transforming or transfecting
host cells can be found in Sambrook, et al. (supra), and other
laboratory manuals.
[0069] A host cell of the invention, such as a prokaryotic or
eukaryotic host cell in culture, can be used to produce (i.e.,
express) a polypeptide of the invention. Accordingly, the invention
further provides methods for producing a polypeptide using the host
cells of the invention. In one embodiment, the method comprises
culturing the host cell of the invention (into which a recombinant
expression vector encoding a polypeptide of the invention has been
introduced) in a suitable medium such that the polypeptide is
produced. In another embodiment, the method further comprises
isolating the polypeptide from the medium or the host cell.
[0070] The host cells of the invention can also be used to produce
nonhuman transgenic animals. For example, in one embodiment, a host
cell of the invention is a fertilized oocyte or an embryonic stem
cell into which a nucleic acid of the invention has been
introduced. Such host cells can then be used to create non-human
transgenic animals in which exogenous nucleotide sequences have
been introduced into their genome or homologous recombinant animals
in which endogenous nucleotide sequences have been altered. Such
animals are useful for studying the function and/or activity of the
nucleotide sequence and polypeptide encoded by the sequence and for
identifying and/or evaluating modulators of their activity. As used
herein, a "transgenic animal" is a non-human animal, preferably a
mammal, more preferably a rodent such as a rat or mouse, in which
one or more of the cells of the animal includes a transgene. Other
examples of transgenic animals include non-human primates, sheep,
dogs, cows, goats, chickens, amphibians, etc. A transgene is
exogenous DNA which is integrated into the genome of a cell from
which a transgenic animal develops and which remains in the genome
of the mature animal, thereby directing the expression of an
encoded gene product in one or more cell types or tissues of the
transgenic animal. As used herein, an "homologous recombinant
animal" is a non-human animal, preferably a mammal, more preferably
a mouse, in which an endogenous gene has been altered by homologous
recombination between the endogenous gene and an exogenous DNA
molecule introduced into a cell of the animal, e.g., an embryonic
cell of the animal, prior to development of the animal.
[0071] A transgenic animal of the invention can be created by
introducing a nucleic acid of the invention into the male pronuclei
of a fertilized oocyte, e.g., by microinjection or retroviral
infection, and allowing the oocyte to develop in a pseudopregnant
female foster animal. The sequence can be introduced as a transgene
into the genome of a non-human animal. Intronic sequences and
polyadenylation signals can also be included in the transgene to
increase the efficiency of expression of the transgene. A
tissue-specific regulatory sequence(s) can be operably linked to
the transgene to direct expression of a polypeptide in particular
cells. Methods for generating transgenic animals via embryo
manipulation and microinjection, particularly animals such as mice,
have become conventional in the art and are described, for example,
in U.S. Pat. Nos. 4,736,866 and 4,870,009, U.S. Pat. No. 4,873,191
and in Hogan, Manipulating the Mouse Embryo (Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, N.Y., 1986). Similar methods
are used for production of other transgenic animals. A transgenic
founder animal can be identified based upon the presence of the
transgene in its genome and/or expression of mRNA in tissues or
cells of the animals. A transgenic founder animal can then be used
to breed additional animals carrying the transgene. Moreover,
transgenic animals carrying the transgene can
further be bred to other transgenic animals carrying other
transgenes.
[0072] The host cells of the invention can also be used as an in
vitro model to assess the ability of agents to act as agonists of
glycerol kinase or the glycerol kinase-mediated pathway of glycerol
metabolism. For example, a suitable host cell can be transfected
with a nucleic acid molecule encoding SEQ ID NO: 3 comprising the
alternate nucleotide at the polymorphic site, which results in a
defective glycerol metabolism pathway. Such cells can then be
contacted with one or more agents to test their ability to overcome
this defect, i.e., to act as agonists of glycerol kinase. As used
herein, an agonist is an agent which increases or enhances the
activity or effect of glycerol kinase. For example, an agent which
mediates phosphorylation of glycerol by adenosine triphosphate
(ATP) to yield glycerol 3-phosphate (G3P) and adenosine diphosphate
(ADP) can be an agonist of glycerol kinase. The ability of an agent
to act as an agonist can be tested, for example, using the level of
a molecule downstream of glycerol kinase in the glycerol metabolic
path as an indicator. For example, one could assess the agent's
ability to increase G3P or ADP production relative to a suitable
control, e.g., a cell which has not been contacted with the
agent.
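The comparison against a suitable control can be sketched as a simple
fold-change calculation. Everything specific in this sketch is an
assumption made for illustration: the readings are hypothetical
values in arbitrary units, and the 1.5-fold threshold is an arbitrary
cutoff, not a value taken from this description.

```python
from statistics import mean


def fold_change(treated, control):
    """Ratio of the mean treated reading to the mean control reading."""
    control_mean = mean(control)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return mean(treated) / control_mean


def is_candidate_agonist(treated, control, threshold=1.5):
    """Flag an agent whose readout rises by at least `threshold`-fold.

    The threshold is an illustrative assumption, not a validated cutoff.
    """
    return fold_change(treated, control) >= threshold


# Hypothetical G3P readings (arbitrary units) from replicate wells.
control_g3p = [10.0, 12.0, 8.0]
treated_g3p = [20.0, 22.0, 18.0]

print(fold_change(treated_g3p, control_g3p))
print(is_candidate_agonist(treated_g3p, control_g3p))
```

In practice the decision would rest on replicate variability and an
appropriate statistical test rather than on a fixed ratio alone.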
[0073] The present invention also provides isolated polypeptides
and variants and fragments thereof that are encoded by the nucleic
acid molecules of the invention. For example, as described above,
the nucleotide sequences can be used to design primers to clone and
express cDNAs encoding the polypeptides of the invention. In one
embodiment, a polypeptide of the invention has an amino acid
sequence encoded by SEQ ID NO: 5. In another embodiment, the
polypeptide has the amino acid sequence of the wild type GK protein
(e.g., comprising SEQ ID NO: 6) except that the protein comprises
an aspartate as the tenth amino acid encoded by exon 10.
[0074] As used herein, a polypeptide is said to be "isolated" or
"purified" when it is substantially free of cellular material when
it is isolated from recombinant and non-recombinant cells, or free
of chemical precursors or other chemicals when it is chemically
synthesized. A polypeptide, however, can be joined to another
polypeptide with which it is not normally associated in a cell and
still be "isolated" or "purified."
[0075] The polypeptides of the invention can be purified to
homogeneity. It is understood, however, that preparations in which
the polypeptide is not purified to homogeneity are useful and
considered to contain an isolated form of the polypeptide. The
critical feature is that the preparation allows for the desired
function of the polypeptide, even in the presence of considerable
amounts of other components. Thus, the invention encompasses
various degrees of purity. In one embodiment, the language
"substantially free of cellular material" includes preparations of
the polypeptide having less than about 30% (by dry weight) other
proteins (i.e., contaminating protein), less than about 20% other
proteins, less than about 10% other proteins, or less than about 5%
other proteins.
[0076] When a polypeptide is recombinantly produced, it can also be
substantially free of culture medium, i.e., culture medium
represents less than about 20%, less than about 10%, or less than
about 5% of the volume of the protein preparation. The language
"substantially free of chemical precursors or other chemicals"
includes preparations of the polypeptide in which it is separated
from chemical precursors or other chemicals that are involved in
its synthesis. In one embodiment, the language "substantially free
of chemical precursors or other chemicals" includes preparations of
the polypeptide having less than about 30% (by dry weight) chemical
precursors or other chemicals, less than about 20% chemical
precursors or other chemicals, less than about 10% chemical
precursors or other chemicals, or less than about 5% chemical
precursors or other chemicals.
[0077] The invention also includes polypeptide fragments or
portions of the polypeptides of the invention, as well as fragments
of the variants of the polypeptides described herein. As used
herein, a fragment comprises at least 6 contiguous amino acids.
Useful fragments include those that retain one or more of the
biological activities of the polypeptide as well as fragments that
can be used as an immunogen to generate polypeptide specific
antibodies. Particularly preferred polypeptides are those which
comprise an alternate amino acid encoded by a polymorphic nucleic
acid.
[0078] Biologically active fragments (peptides which are, for
example, 6, 9, 12, 15, 20, 30, 35, 36, 37, 38, 39, 40, 50, 100 or
more amino acids in length) can comprise a domain, segment, or
motif that has been identified by analysis of the polypeptide
sequence using well-known methods, e.g., signal peptides,
extracellular domains, one or more transmembrane segments or loops,
ligand binding regions, zinc finger domains, DNA binding domains,
acylation sites, glycosylation sites, or phosphorylation sites.
Preferred fragments or portions comprise an amino acid encoded by a
codon containing a polymorphic site, e.g., as shown in FIGS. 6 and
7A-7D. In a preferred embodiment, the amino acid is the alternate
amino acid.
[0079] The invention also provides fragments with immunogenic
properties. These contain an epitope-bearing portion of the
polypeptides and variants of the invention. These epitope-bearing
peptides are useful to raise antibodies that bind specifically to a
polypeptide or region or fragment. These peptides can contain at
least 6, 7, 8, 9, 12, at least 14, or between at least about 15 to
about 30 amino acids. The epitope-bearing peptide and polypeptides
may be produced by any conventional means (Houghten, R. A., Proc.
Natl. Acad. Sci. USA, 82:5131-5135 (1985)). Simultaneous multiple
peptide synthesis is described in U.S. Pat. No. 4,631,211.
[0080] Fragments can be discrete (not fused to other amino acids or
polypeptides) or can be within a larger polypeptide. Further,
several fragments can be comprised within a single larger
polypeptide. In one embodiment a fragment designed for expression
in a host can have heterologous pre- and pro-polypeptide regions
fused to the amino terminus of the polypeptide fragment and an
additional region fused to the carboxyl terminus of the
fragment.
[0081] The invention thus provides chimeric or fusion proteins.
These comprise a polypeptide of the invention operatively linked to
a heterologous protein having an amino acid sequence not
substantially homologous to the polypeptide. "Operatively linked"
indicates that the polypeptide protein and the heterologous protein
are fused in-frame. The heterologous protein can be fused to the
N-terminus or C-terminus of the polypeptide. In one embodiment the
fusion protein does not affect function of the polypeptide per se.
For example, the fusion protein can be a GST-fusion protein in
which the polypeptide sequences are fused to the C-terminus of the
GST sequences. The isolated polypeptide can be purified from cells
that naturally express it, such as from mammary epithelium,
purified from cells that have been altered to express it
(recombinant), or synthesized using known protein synthesis
methods.
[0082] In one embodiment, the protein is produced by recombinant
DNA techniques. For example, a nucleic acid molecule encoding the
polypeptide is cloned into an expression vector, the expression
vector introduced into a host cell and the protein expressed in the
host cell. The protein can then be isolated from the cells by an
appropriate purification scheme using standard protein purification
techniques.
[0083] Polypeptides often contain amino acids other than the 20
amino acids commonly referred to as the 20 naturally-occurring
amino acids. Further, many amino acids, including the terminal
amino acids, may be modified by natural processes, such as
processing and other post-translational modifications, or by
chemical modification techniques well known in the art. Common
modifications that occur naturally in polypeptides are described in
basic texts, detailed monographs, and the research literature, and
they are well known to those of skill in the art.
[0084] Accordingly, the polypeptides also encompass derivatives or
analogs in which a substituted amino acid residue is not one
encoded by the genetic code, in which a substituent group is
included, in which the mature polypeptide is fused with another
compound, such as a compound to increase the half-life of the
polypeptide (for example, polyethylene glycol), or in which the
additional amino acids are fused to the mature polypeptide, such as
a leader or secretory sequence or a sequence for purification of
the mature polypeptide or a pro-protein sequence.
[0085] In general, polypeptides or proteins of the present
invention can be used as a molecular weight marker on SDS-PAGE gels
or on molecular sieve gel filtration columns using art-recognized
methods. The polypeptides of the present invention can be used to
raise antibodies or to elicit an immune response. The polypeptides
can also be used as a reagent, e.g., a labeled reagent, in assays
to quantitatively determine levels of the protein or a molecule to
which it binds (e.g., a receptor or a ligand) in biological fluids.
The polypeptides can also be used as markers for tissues in which
the corresponding protein is preferentially expressed, either
constitutively, during tissue differentiation, or in a diseased
state. The polypeptides can be used to isolate a corresponding
binding partner, e.g., receptor or ligand, such as, for example, in
an interaction trap assay, and to screen for peptide or small
molecule antagonists or agonists of the binding interaction.
[0086] In another aspect, the invention provides antibodies to the
polypeptides and polypeptide fragments of the invention. The term
"antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin molecules, i.e.,
molecules that contain an antigen binding site that specifically
binds an antigen. A molecule that specifically binds to a
polypeptide of the invention is a molecule that binds to that
polypeptide or a fragment thereof, but does not substantially bind
other molecules in a sample, e.g., a biological sample, which
naturally contains the polypeptide. Examples of immunologically
active portions of immunoglobulin molecules include F(ab) and
F(ab').sub.2 fragments which can be generated by treating the
antibody with an enzyme such as pepsin. The invention provides
polyclonal and monoclonal antibodies that bind to a polypeptide of
the invention; such antibodies can be made using methods known in
the art. The term "monoclonal antibody" or "monoclonal antibody
composition", as used herein, refers to a population of antibody
molecules that contain only one species of an antigen binding site
capable of immunoreacting with a particular epitope of a
polypeptide of the invention. A monoclonal antibody composition
thus typically displays a single binding affinity for a particular
polypeptide of the invention with which it immunoreacts.
[0087] Additionally, recombinant antibodies, such as chimeric and
humanized monoclonal antibodies, comprising both human and
non-human portions, which can be made using standard recombinant
DNA techniques, are within the scope of the invention. Such
chimeric and humanized monoclonal antibodies can be produced by
recombinant DNA techniques known in the art, for example using
methods described in PCT Publication No. WO 87/02671; European
Patent Application 184,187; European Patent Application 171,496;
European Patent Application 173,494; PCT Publication No. WO
86/01533; U.S. Pat. No. 4,816,567; European Patent Application
125,023; Better et al. (1988) Science, 240:1041-1043; Liu et al.
(1987) Proc. Natl. Acad. Sci. USA, 84:3439-3443; Liu et al. (1987)
J. Immunol., 139:3521-3526; Sun et al. (1987) Proc. Natl. Acad.
Sci. USA, 84:214-218; Nishimura et al. (1987) Canc. Res.,
47:999-1005; Wood et al. (1985) Nature, 314:446-449; Shaw et
al. (1988) J. Natl. Cancer Inst., 80:1553-1559; Morrison (1985)
Science, 229:1202-1207; Oi et al. (1986) Bio/Techniques, 4:214;
U.S. Pat. No. 5,225,539; Jones et al. (1986) Nature, 321:522-525;
Verhoeyen et al. (1988) Science, 239:1534; and Beidler et al.
(1988) J. Immunol., 141:4053-4060.
[0088] In general, antibodies of the invention (e.g., a monoclonal
antibody) can be used to isolate a polypeptide of the invention by
standard techniques, such as affinity chromatography or
immunoprecipitation. A polypeptide specific antibody can facilitate
the purification of natural polypeptide from cells and of
recombinantly produced polypeptide expressed in host cells.
Moreover, an antibody specific for a polypeptide of the invention
can be used to detect the polypeptide (e.g., in a cellular lysate,
cell supernatant, or tissue sample) in order to evaluate the
abundance and pattern of expression of the polypeptide. Antibodies
can be used diagnostically to monitor protein levels in tissue as
part of a clinical testing procedure, e.g., to determine the
efficacy of a given treatment regimen. Detection can
be facilitated by coupling the antibody to a detectable substance.
Examples of detectable substances include various enzymes,
prosthetic groups, fluorescent materials, luminescent materials,
bioluminescent materials, and radioactive materials. Examples of
suitable enzymes include horseradish peroxidase, alkaline
phosphatase, .beta.-galactosidase, or acetylcholinesterase;
examples of suitable prosthetic group complexes include
streptavidin/biotin and avidin/biotin; examples of suitable
fluorescent materials include umbelliferone, fluorescein,
fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine
fluorescein, dansyl chloride or phycoerythrin; an example of a
luminescent material includes luminol; examples of bioluminescent
materials include luciferase, luciferin, and aequorin, and examples
of suitable radioactive material include .sup.125I, .sup.131I,
.sup.35S or .sup.3H.
[0089] The invention will now be described by the following
non-limiting examples. The teachings of all references cited herein
are incorporated herein by reference in their entirety.
EXAMPLES
Methods
Subjects:
[0090] All individuals appearing in the various pedigrees included
in this study were derived from a large cohort of 1,056 unrelated
individuals (the probands) of French Canadian descent aged
.gtoreq.18 years who presented at the Chicoutimi Hospital Lipid
Clinic for lipid screening and who had hypertriglyceridemia or a
positive family history of hypertriglyceridemia, defined as a
fasting triglyceride concentration above the 50.sup.th age- and
sex-specific percentile according to the Lipid Research Clinic
Program (LRCP) criteria (Gaudet et al., Circulation 97(9):871-877
(1998)). Patients taking drugs known to affect plasma glycerol
concentrations (McCabe, "Disorders of Glycerol Metabolism" in The
Metabolic Basis of Inherited Disease, 7.sup.th Edn. (ed. Scriver C
R et al.) McGraw-Hill, New York, pp. 945-961 (1995)), as well as
individuals presenting a medical condition potentially associated
with secondary hyperglycerolemia, such as previously diagnosed DM,
thyroid disorders or renal insufficiency, were excluded.
Linkage to the Xp21.3 Locus:
[0091] A total of twelve microsatellite markers in the region of
the GK gene were genotyped in the five families with
hyperglycerolemia. These markers were: DXS989, DXS8039, DXS1214,
DXS1036, DXS1067, DXS1219, DXS997, DXS8090, DXS8025, DXS8113,
DXS8042, and DXS8012. Genotypes for these markers were obtained by
polymerase chain reaction (PCR) using fluorescently-labeled
primers. The fluorescent genotyping gels were analyzed in an
automated system developed at the Whitehead Institute/MIT Center
for Genome Research as previously described (Kruglyak et al., Am.
J. Hum Genet. 58:1347-1363 (1996)).
[0092] Multipoint parametric linkage analysis of genotype data was
performed using the GENEHUNTER software package (Rioux et al., Am.
J. Hum. Genet. 63(4):1086-1094 (1998)). Marker order and genetic
distances used in the analysis were based on an integration of the
published genetic map (CEPH-Genethon Database) and radiation hybrid
mapping information obtained using the Genebridge 4 hybrid panel
(Rioux et al., Am J. Hum Genet. 63(4):1086-1094 (1998)). The GK
disease-allele frequency was estimated at 0.001 (McCabe et al., Am
J Hum Genet. 51(6):1277-1285 (1992)), while values for male
penetrance of 0.999, and female penetrance of 0.900 and 0.999
(heterozygotes and homozygotes, respectively) were used.
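As a simplified illustration of what a LOD score measures, the following Python sketch computes a two-point LOD score for phase-known meioses. It is not the multipoint parametric algorithm implemented in GENEHUNTER, and it does not model the penetrance values or disease-allele frequency given above. The function name and the counts in the usage line are hypothetical.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known meioses (simplified sketch).

    LOD(theta) = log10( L(theta) / L(0.5) ), where
    L(theta) = theta**recombinants * (1 - theta)**non_recombinants.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")

    non_recombinants = meioses - recombinants

    # log10 likelihood under linkage at recombination fraction theta
    if theta == 0.0:
        if recombinants > 0:
            return float("-inf")  # any recombinant excludes theta = 0
        log_l_theta = 0.0
    else:
        log_l_theta = (recombinants * math.log10(theta)
                       + non_recombinants * math.log10(1.0 - theta))

    # log10 likelihood under free recombination (theta = 0.5)
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical example: 10 informative meioses, no recombinants
example_lod = two_point_lod(recombinants=0, meioses=10, theta=0.0)
```

In this simplified setting each non-recombinant meiosis contributes log10(2), about 0.30, to the score at a recombination fraction of zero.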
Genomic Structure of the GK Gene:
[0093] Genomic sequences were sought for the intronic regions
surrounding exons 9, 10, 11, and 17. PAC clone
RPCI-5.931_C.sub.--24 containing exons 9, 10, and 11 was identified
using primer pairs GK08 and GK12, and PAC clone RPCI-5.1150
containing exon 17 was identified using primers GK17F and GK17R.
All details regarding primer sequences and annealing temperatures
are available on the Chicoutimi Hospital Lipid Research Group and
Whitehead Institute/MIT Center for Genome Research GK websites.
Direct sequencing of introns 9 and 10 from clone
RPCI-5.931_C.sub.--24 using specific exonic primers (GK9F, GK10F,
and GK10R), was carried out with the Big Dye terminator cycle
sequencing kit (PE Applied BioSystems, Foster City, Calif.), and
run on ABI377 automated sequencers.
[0094] To obtain the genomic sequence from intron 17, a single
colony of clone RPCI-5.1150_E.sub.--8 was diluted in 100 .mu.l of
water and used as template for PCR amplification. An amplicon
covering exon 17 through exon 18 was obtained with primers GK17_F
and GK18_R (FIG. 2), using the Platinum Taq High Fidelity (Life
Technologies, Rockville, Md.). The PCR product was purified using
the solid phase reversible immobilization (SPRI) method (Hawkins et
al., Nucleic Acids Research 22:4543-4544 (1994)), and then
sequenced using the DYEnamic Energy Transfer primer kit (Amersham
Pharmacia Biotech Ltd., Cleveland, Ohio).
GK Mutation Screening:
[0095] The screening for mutations in the GK gene was first
performed by resequencing this gene in 9 affected individuals, 4
obligate carriers, and 3 unaffected relatives from the five
families described above. Intronic primers used were previously
published (Sargent et al., Hum Mol Genet. 3(8):1317-1324 (1994)) or
designed from the sequence determined in the present study using
the Primer 3.0 software available on the Whitehead Institute/MIT
Center for Genome Research server. Sequencing reactions and gels
were prepared and analyzed on ABI377 sequencers. Regions in which
sequence polymorphisms were discovered were resequenced in 9 other
affected individuals, 10 obligate carriers, and unaffected
relatives from the GK families.
Plasma Glycerol and Other Biological Measurements:
[0096] Blood samples were drawn at rest after a 12-hour overnight
fast from an antecubital vein into tubes containing EDTA. Specimens
were centrifuged within one hour, and the separated plasma frozen
(-80.degree. C.) until analysis. TG and free fatty acid (FFA)
levels were measured using enzymatic assays (McNamara et al., Clin
Chim Acta 166:11-8 (1987)). Plasma glycerol concentrations were
measured using an analyzer (Technicon RA-500 Bayer Corporation,
Tarrytown, N.Y.) and enzymatic reagents obtained from Randox
(Randox Laboratories Ltd., Crumlin, UK). Glycerol measurements were
calibrated with reference standards purchased from Sigma (Sigma
Diagnostics, St. Louis, USA). Waist and hip circumferences
(Standardization of Anthropometric Measurements. In: Lohman T. G., et
al., eds, The Airlie (VA) Consensus Conference, Human Kinetics
1988:39-80), body weight, height and BMI were recorded. The % body
fat was estimated by bio-electrical impedance (Baumgartner et al.,
Exerc Sport Sci Rev. 18:193-224 (1990)). Family history of DM was
defined as the presence of a confirmed diagnosis in a first degree
relative. An oral glucose tolerance test (OGTT) was performed in
the original cohort of 1,056 individuals and in the families of the
five GK carrier probands using a 75 g glucose load as previously
described (Report of the Expert Committee on the Diagnosis and
Classification of Diabetes Mellitus, Diabetes Care
20:1183-1197 (1997)), and plasma glucose concentration was
enzymatically measured (Richterich et al., Schweiz Med Wochenschr
101(17):615-618 (1971)). IGT and DM were defined according to the
World Health Organization. Fasting insulinemia was measured by RIA
with polyethylene glycol separation. (Desbuquois et al., J. Clin
Endocrinol Metab 33(5):732-738 (1971)).
Calculation of Familial Resemblance of Fasting Glycerol
Concentration:
[0097] After having excluded families of subjects bearing the N288D
mutation, calculation of familial resemblance of plasma glycerol
concentrations in the fasting state was performed for a total of
653 individuals arising from the nuclear families of 174 randomly
selected patients of the initial cohort representing all deciles of
fasting glycerol values. Before analyses, glycerol data were
adjusted for age using sex-specific regressions, and the residuals
from these regressions were standardized to a mean of zero and
standard deviation of 1. The standardized residuals were used to
assess the degree of familial resemblance by computing the
intraclass correlations (r) as previously described (Perusse et
al., Arterioscler Thromb Vasc Biol. 17(11):3270-3277 (1997)). This
correlation was calculated by computing the ratio of the between
family variance over the sum of the within- and between-family
variances estimated using a random effect model of analysis of
variance (ANOVA) (Bogardus et al., N Engl J. Med 315(2):96-100
(1986)).
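The variance ratio described above can be sketched in Python as follows. This is a minimal one-way random-effects ANOVA estimator for families of unequal size, written under stated assumptions: each inner list holds the standardized residuals of one family, negative between-family variance estimates are truncated at zero, and the function name and example values are hypothetical. It is not the authors' actual analysis code.

```python
def intraclass_correlation(families):
    """Intraclass correlation from a one-way random-effects ANOVA.

    families: list of lists, one list of standardized residuals per family.
    Returns between-family variance / (between + within variance).
    """
    k = len(families)
    if k < 2:
        raise ValueError("at least two families are required")
    sizes = [len(f) for f in families]
    n_total = sum(sizes)

    grand_mean = sum(sum(f) for f in families) / n_total
    family_means = [sum(f) / len(f) for f in families]

    # Sums of squares between and within families
    ss_between = sum(n * (m - grand_mean) ** 2
                     for n, m in zip(sizes, family_means))
    ss_within = sum((x - m) ** 2
                    for f, m in zip(families, family_means) for x in f)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Effective family size for unbalanced data
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    # Variance components (between-family estimate truncated at zero)
    var_between = max((ms_between - ms_within) / n0, 0.0)
    var_within = ms_within
    return var_between / (var_between + var_within)


# Hypothetical standardized residuals for three families
example_icc = intraclass_correlation([[0.9, 1.1, 1.0], [-0.2, 0.1], [-1.0, -1.2, -0.8]])
```

A value near 1 indicates that members of the same family resemble each other far more than members of different families; a value near 0 indicates no familial resemblance.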
Statistical Analysis:
[0098] Group differences for plasma glycerol concentrations and
other continuous variables were examined by the Student's unpaired
two-tailed t-test. Linear regression models were used to assess the
relationship between the dependent variables (2-hour glucose
following a 75 g oral absorption or correlates of body fat
accumulation) and fasting glycerolemia. To specifically study the
ability of glycerol to predict IGT or DM (defined as 2-hour glucose
.gtoreq.7.8 mmol/L following a 75 g oral glucose load), multiple
logistic regression models were constructed. In a multiple
regression analysis estimates were provided after adjustment for
significant covariates such as age, gender, the BMI, fasting
glucose, insulin, FFA and TG concentrations. The distribution of
plasma TG, insulin, and glycerol levels was normalized by log-10
transformation.
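To make the logistic regression step concrete, the following Python sketch fits a logistic model with a single predictor by Newton-Raphson iteration. It is an illustration only: the models described above were adjusted for several covariates, which this sketch omits, and the function name and data are hypothetical. For a continuous predictor expressed in standard-deviation units, exp(b1) is the odds ratio per 1-SD increase.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit logit(P(y=1)) = b0 + b1*x by Newton-Raphson (single predictor).

    x: list of predictor values; y: list of 0/1 outcomes.
    Returns (b0, b1). Assumes the data are not perfectly separated.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # entries of the information matrix
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: binary exposure (0/1) and binary outcome (e.g., IGT yes/no)
exposure = [0] * 40 + [1] * 40
outcome = [1] * 10 + [0] * 30 + [1] * 20 + [0] * 20
intercept, slope = fit_logistic(exposure, outcome)
odds_ratio = math.exp(slope)
```

With a single binary predictor, the fitted odds ratio coincides with the cross-product ratio of the corresponding 2x2 table, which gives a simple check on the fit.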
Results
Severe Hyperglycerolemia Families:
[0099] From the sample of 1,056 subjects screened, five male
individuals presented with plasma glycerol values above 2.0 mmol/L.
Screening of their families identified a total of 18 males
demonstrating extremely elevated plasma glycerol levels (range
2.9-6.2 mmol/L). Based on the pedigree data shown in FIG. 1, it was
clear that the severe hyperglycerolemia phenotype segregated as a
simple X-linked trait. In addition, 14 obligate female carriers
were found to be dysglycerolemic, presenting intermediate plasma
glycerol levels ranging from 0.01 to 0.82 mmol/L, whereas all other
family members showed plasma glycerol concentrations below 0.2
mmol/L.
Linkage to Xp21.3:
[0100] 12 microsatellite markers from Xp21.3 were genotyped among
the affected pedigrees. Multipoint parametric linkage analysis of
the genotype data resulted in a peak LOD score of 3.46 centered at
marker DXS8039. As all families originate from a population with a
proven founder effect (Perusse et al., Arterioscler Thromb Vasc
Biol 17(11):3270-3277 (1997)), a common disease haplotype was
sought. A six-marker haplotype consisting of markers DXS8039,
DXS1214, DXS1036, DXS1067, DXS1219 and DXS997 (alleles 151, 21,
145, 222, 230, 107) was observed in all families. This haplotype
extended over a region of 5.5 cM.
Genomic Structure of GK Gene:
[0101] Intronic sequences surrounding exons 9, 10, 11, and 17 were
pursued in order to design primers to complete the set of
previously reported oligonucleotides (Sargent et al., Hum Mol Genet
3(8):1317-1324 (1994)). In addition, when the sequence obtained for
intron 10 was aligned with the published cDNA sequence, it was
discovered that the splice junctions had been incorrectly defined,
such that the last 12 bases of exon 10 were in fact encoded by exon
11.
Identification of a Missense Mutation in Exon 10 Within Families
With Severe Hyperglycerolemia:
[0102] All 20 GK exons, and their corresponding intron-exon
boundaries, were screened for mutations. Two polymorphisms were
discovered within the introns, and two within the exons (FIG. 2).
Neither of the intronic polymorphisms is expected to lead to a
functional difference. Based on the predicted amino acid sequence
for this gene, the polymorphism in exon 3 is silent, whereas the
polymorphism in exon 10 results in a missense mutation.
Specifically, this latter nucleotide change is a transition
from an adenine (A) to a guanine (G), and this mutation (N288D) leads
to the substitution of a negatively charged aspartic acid for a
small polar asparagine (FIG. 3). Screening of the remaining family
members demonstrated that this mutation was restricted to the 18
affected males and 14 obligate female carriers. This was not true
of the other three polymorphisms since they were found in
normoglycerolemic controls at frequencies greater than 10%. It is
important to note that asparagine 288 is extremely well conserved
in many different species, including H. influenzae, M. pneumoniae,
E. coli, yeast, and mice, as well as man (FIG. 3) (Pettigrew et
al., Arch Biochem Biophys 349(2):236-245 (1998); Pettigrew et al.,
J Biol Chem 263(1):135-139 (1988); Nevoigt et al., FEMS Microbiol
Rev, 21(3):231-241 (1997)).
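The effect of the exon 10 polymorphism on the encoded amino acid can be illustrated with a short Python sketch. It uses the 94-nucleotide sequence of SEQ ID NO: 3, in which the IUPAC ambiguity code r (A or G) marks the polymorphic site. The reading frame is an assumption: the polymorphic base is taken to be the first position of its codon, which is the frame consistent with an A-to-G change converting asparagine to aspartic acid. The function and variable names are hypothetical, and only the four codons needed here are tabulated.

```python
# SEQ ID NO: 3 as given in the sequence listing (spaces removed)
SEQ_ID_NO_3 = (
    "ttcattctcc cttcaaccat aggtatggaa caggatgttt cttactatgt ratacaggcc "
    "ataaggttgg tttttaataa aaatgattaa gtca"
).replace(" ", "")

# IUPAC code r stands for a purine: a or g
IUPAC_R = ("a", "g")

# Subset of the standard genetic code (asparagine and aspartic acid codons)
CODON_TO_AMINO_ACID = {"aat": "Asn", "aac": "Asn", "gat": "Asp", "gac": "Asp"}


def polymorphic_codon_alleles(sequence):
    """Return the 1-based position of the r site and both codon alleles.

    Assumes the polymorphic base is the first base of its codon.
    """
    site = sequence.index("r")            # 0-based index of the ambiguity code
    codon = sequence[site:site + 3]       # e.g. "rat"
    alleles = [base + codon[1:] for base in IUPAC_R]
    return site + 1, [(a, CODON_TO_AMINO_ACID[a]) for a in alleles]


position, alleles = polymorphic_codon_alleles(SEQ_ID_NO_3)
```

Under this assumed frame, the reference allele (A) gives the codon aat (asparagine) and the alternate allele (G) gives gat (aspartic acid), matching the N288D designation.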
Phenotypic Expression of the N288D Mutation and Association of
Fasting Glycerol Concentration With Impaired Glucose Tolerance and
Abdominal Obesity:
[0103] The 18 affected males and the 14 obligate female carriers
identified were matched for age (.+-.5 years) and sex with
unaffected relatives; their characteristics are presented in FIG.
8. Monitoring of plasma glycerol levels at 3-6 month intervals in
N288D carriers demonstrated that the hyperglycerolemia was
permanent, resulting in values greater than 2.5 mmol/L in men and
0.2 mmol/L in women. Carrying a GK gene mutation was also associated
with a significantly higher BMI, waist circumference and total body
fat, as well as with a higher mean of 2-hour glucose concentration
following an OGTT.
[0104] Further analysis of the association between glycerol and
plasma glucose homeostasis as well as anthropometric indices of
abdominal obesity in men carrying a N288D mutation showed that 12
of the 18 affected men met the criteria of either DM or IGT (FIG.
2). Among the six subjects with normal 2-hour glucose, four men
showed elevated fasting insulinemia values (above 30 mU/L), which
suggests that they were insulin-resistant. There was strong
evidence that fluctuations in glycerolemia among carriers were
important correlates of body fat accumulation and glucose
concentrations. As illustrated in FIGS. 4A and 4B, plasma glycerol
levels in affected males were related to variations in the waist
circumference and 2-hour glucose levels following a 75 g oral
absorption, such that 68.9% of the variance in 2-hour glucose
values (p<0.0001) and 43% of the variance in waist circumference
(p<0.001) were explained by the variance in glycerolemia among
these subjects.
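The phrase "percentage of variance explained" corresponds to the coefficient of determination of a simple linear regression. The following Python sketch shows that calculation for one predictor and one dependent variable; the function name and the example values are hypothetical and are not data from this study.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x.

    For a single predictor this equals the squared Pearson correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Hypothetical values: log10 fasting glycerol and 2-hour glucose (mmol/L)
log_glycerol = [0.45, 0.52, 0.60, 0.66, 0.71, 0.79]
two_hour_glucose = [6.1, 7.4, 7.0, 8.9, 9.6, 11.2]
variance_explained = r_squared(log_glycerol, two_hour_glucose)
```

Multiplying the returned value by 100 gives the percentage of variance in the dependent variable accounted for by the predictor.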
Plasma Glycerol Concentrations in the Original Cohort:
[0105] A similar trend was observed between the mean glycerol
concentration and the degree of glucose intolerance in GK carriers
as well as among subjects of the initial cohort with "normal"
glycerol concentrations (FIG. 4C). As shown in FIG. 9, significant
differences in fasting glycerol concentrations were also noted in
the initial cohort in presence of impaired fasting glucose (values
between 6.0-6.9 mmol/L), hyperinsulinemia, increased FFA
concentrations, hypertriglyceridemia and obesity (defined as a BMI
above 30 kg/m.sup.2). Menopause, which characterized 59.6% of
women, was associated with higher plasma glycerol values. Further
stratification for the use of hormonal replacement therapy (HRT)
showed an additional hormonal effect on the glycerolemia. For these
reasons, appropriate adjustment for the effect of gender, menopause
and HRT was performed in the different multivariate analyses.
Association of Fasting Glycerol Concentration With Impaired Glucose
Tolerance in the Absence of Severe Hyperglycerolemia:
[0106] In multivariate analyses, after having excluded subjects
with severe hyperglycerolemia and DM, a 1-standard deviation (SD)
increase in log-glycerol was associated with a 2.5-fold increase in
the risk of having 2-hour glucose between 7.8-11.0 mmol/L after a
75 g oral glucose challenge (FIG. 10). Furthermore, as illustrated
in FIG. 5, the relative odds (OR) of having 2-hour glucose above
7.8 mmol/L after a 75 g oral glucose challenge was substantially
increased among patients with glycerol concentration above the
median (.gtoreq.0.075 mmol/L) compared to those in the first decile
(p<0.0001), suggesting a threshold for glycerol concentrations
above which there may be an increased risk of IGT.
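For readers unfamiliar with the odds ratio, the following Python sketch computes an unadjusted odds ratio and an approximate 95% confidence interval (Woolf's logit method) from a 2x2 table. The estimates reported above came from multivariate models with covariate adjustment, which this sketch does not reproduce; the function name and all counts are hypothetical.

```python
import math


def odds_ratio_with_ci(exposed_cases, exposed_noncases,
                       unexposed_cases, unexposed_noncases, z=1.96):
    """Unadjusted odds ratio with Woolf (logit) confidence interval.

    All four cell counts must be greater than zero.
    Returns (odds_ratio, lower_limit, upper_limit).
    """
    a, b = exposed_cases, exposed_noncases
    c, d = unexposed_cases, unexposed_noncases
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")

    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: glycerol above the median vs. first decile,
# cases = 2-hour glucose of 7.8 mmol/L or higher
or_estimate, ci_lower, ci_upper = odds_ratio_with_ci(20, 20, 10, 30)
```

An interval that excludes 1 suggests an association between the exposure category and the outcome at the corresponding significance level.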
Familial Resemblance of Plasma Glycerol Concentrations in the
Absence of Severe Hyperglycerolemia:
[0107] Analyses of familial resemblance of plasma glycerol
concentrations were performed on a sample of 652 individuals,
probands and family members from 174 randomly-selected individuals
from the original cohort, covering all deciles of fasting glycerol
concentration. Overall, there was six times more variance in
fasting plasma glycerol levels between than within families (FIG.
6). If it is assumed that the resemblance explained by belonging to
the same pedigree is entirely defined by genetic factors, the
maximal heritability of glycerolemia in the fasting state has been
estimated at 58% in the absence of the GK gene N288D mutation.
[0108] While this invention has been particularly shown and
described with references to preferred embodiments thereof, it will
be understood by those skilled in the art that various changes in
form and details may be made therein without departing from the
scope of the invention encompassed by the appended claims.
Sequence CWU 1
1
Number of SEQ ID NOs: 23
SEQ ID NO: 1 (length 60; DNA; Unknown)
Partial nucleic acid sequence of the GK gene comprising a
polymorphic site at nucleotide position 13 of exon 3
atgccttctt ttgtcaaaga tgggtggaac argaccctaa ggaaattcta cattctgtct 60
SEQ ID NO: 2 (length 48; DNA; Unknown)
Partial nucleic acid sequence of the GK gene comprising a
polymorphic site at nucleotide position 17 of intron 8
taatggtaaa aaacaaacaa amaaacaaaa aacacaccaa aaaaccaa 48
SEQ ID NO: 3 (length 94; DNA; Unknown)
Partial nucleic acid sequence of the GK gene comprising a
polymorphic site at nucleotide position 29 of exon 10
ttcattctcc cttcaaccat aggtatggaa caggatgttt cttactatgt ratacaggcc 60
ataaggttgg tttttaataa aaatgattaa gtca 94
SEQ ID NO: 4 (length 58; DNA; Unknown)
Partial nucleic acid sequence of the GK gene comprising a
polymorphic site at nucleotide position 22 of intron 12
gaaattggtg agtgtgttct aacaaaagkt tagaaaatct gaaaaatgac acatttca 58
SEQ ID NO: 5 (length 8079; DNA; Unknown)
Glycerol kinase gene
ggttcagcgg acgcgcgcgg cctcggtctc tggactcgtc acctgcccct
ccccctcccg 60ccgccgtcac ccaggaaacc ggccgcaatc gccggccgac ctgaagctgg
tttcatggca 120gcctcaaaga aggcagtttt ggggccattg gtgggggcgg
tggaccaggg caccagttcg 180acgcgctttt tggtgagccc ggggtgacat
gtgaagaggc gctgagctgt aaaacgacgg 240ccagtcatcc ttgatatctg
cctgcatttt tacattaata ttacaatatc tttttcaggt 300tttcaattca
aaaacagctg aactacttag tcatcatcaa gtagaaataa aacaagagtt
360cccaagagaa gggtatgttt cctaatttaa tatgtaaaga cacattatgt
ttgttagtcc 420atctcaccca acttgcccca atgccttctt ttgtcaaaga
tgggtggaac argaccctaa 480ggaaattcta cattctgtct atgagtgtat
agagaaaaca tgtgagaaac ttggacagct 540
caatattgat atttccaaca taaaaggtat tttagtagaa tattttaccc acatgtaaaa 600
cgacggccag ttgagagctg ttttcctgaa gtagttccta cttgttaaat ttttgacttc 660
cttctgttta actttctctt taaagctatt ggtgtcagca accagaggga aaccactgta 720
gtctgggaca agataactgg agagcctctc tacaatgctg tgggtaagct gtcatgcatg 780
gatgtcaaat gtagggcctt tcttcacatt gcaatgtaaa acgacggcca gttccttgat 840
agtgatttca gtaagttctt atttttttaa atgaagtttt tcatgtatat tattttattt 900
tggtctatag tgtggcttga tctaagaacc cagtctaccg ttgagagtct tagtaaaaga 960
attccaggaa ataataactt tgtcaaggta agaatttctt cagaagtata ctataagaat 1020
gtttcttttt ttaaaaaaag tttgcagatt tcactagaaa gaagcatctt atggtacaat 1080
agttatttga tacaatttat agaatctttt tcccggataa ttgaggcctg taaaacgacg 1140
gccagtttct tttgtttggt ggttttgttt taaactgtta cacttttcat ttgctaactg 1200
aacttcacaa ctgcttttag tccaagacag gccttccact tagcacttac ttcagtgcag 1260
tgaaacttcg ttggctcctt gacaatgtga gaaaagttca aaaggccgtt gaagaaaaac 1320
gagctctttt tgggactatt gattcatggc ttatttgggt atgtttaaat ataatggata 1380
tatggagaat tttttcagaa attttttcta gactgccttg cctattgttt ctactagcag 1440
gtcagacttt ttaattagca tgtaaaacga cggccagttg tgctctgctg attatgaccc 1500
ttaacaatat gtaaattaaa ttgccaataa gtacaaattt aacctgattt ttttactctg 1560
cctagagttt gacaggagga gtcaatggag gtgtccactg tacagatgta acaaatgcaa 1620
gtaggactat gcttttcaac attcattctt tggaatggga taaacaactc tgcgagtaag 1680
ttctgttttg ctctaaatat agttttccca atacactacc tatttataac cgaaatctta 1740
atattttcag atgtcagtgg agcatgtaaa acgacggcca gtacagtgtt aaatacccaa 1800
tcttcttgtt tttcagattt tttggaattc caatggaaat tcttccaaat gtccggagtt 1860
cttctgagat ctatggccta atggtaaaaa acaaacaaam aaacaaaaaa cacaccaaaa 1920
aaccaaaaaa caaacaaaaa aaaacctaat aattaaagtt tttttattac aaaacaagtt 1980
tactattcat aattcaaaag tcaactgtgt tatgttttgt gacttaaaaa ctttacagtc 2040
ctttttacaa tggaaagctg gggccttgga aggtgtgcca atatctgggg taagtttcat 2100
caccaagtgt ctccccatcc ccacccttcc ccatgttatg gctttcctcc tcttagttca 2160
tcagtgtgcc tctttttaaa ctagggaaaa caagtaaaag ttgcaaaatt ggannnntct 2220
tgttcttaca tgtcatactg tgggccattg agaatctttt gaataaatta attttaactc 2280
tcccttccca tacctattat cttacatatt aacaaatggt attaacaaat ggggaaaatg 2340
gccaaatgga gaaaatgcaa ggaaatagac agttcattct ttgataaata aaaaatgaaa 2400
aataaatcct atggctcttc taaaaagaaa gttaatacta ttgtattagt cagtgttctt 2460
tattgtcatt tatactttca gtgtttaggg gaccagtctg ctgcattggt gggacaaatg 2520
tgcttccaga ttggacaagc caaaaatacg tgagtttaag aaacagactt aaaaaccaat 2580
gctgttttgt tttttctact tggtgctttg aataaggaaa agcttttgaa gttcatccag 2640
gatgaaaatc aatagcttaa tagctccaat atgcatatat acacttttta ccattttttt 2700
atatctttaa ataaaataca aaatgccata tatatgcaca ctgatgaagc ttataaagac 2760
ctaaatttgt aggctgggcg cggttatttg ctttcaataa aattgtcttc tattcattct 2820
cccttcaacc ataggtatgg aacaggatgt ttcttactat gtratacagg ccataaggtt 2880
ggttttttaa attaaaaaat tgatttaaaa gtctaagttc atctaaataa tgcttgaaca 2940
taatttacta ttaaacaact tttagtcttt agcttttact taatctttat cagggtttaa 3000
tttagagctc aatacaaaat ttgaatcgtt ctaataagaa ccattttaga ctctttgaat 3060
tttatatgtg tgtttttaat tgtgctgggg ggaaatctag actgagacct catcaaattc 3120
ttaatgcaaa tctaatttga aacaaggaat aaacttttta tacagcttaa atgtgttctt 3180
aattctgatc gttttgactg taaggattta ttttaaaaat tggtttattg attgcattat 3240
tttgtaccta tgttatttta actttaaaaa aaagttctca tgttatcttt tcattttcca 3300
ctactgaaat cttttttttt tctttcttac agtgtgtatt ttctgatcat ggccttctca 3360
ccacagtggc ttacaaactt ggcagagaca aaccagtata ttatgctttg gaagtaagtt 3420
ctttttaatc aatatggata atatgacaaa cattcaaagc taataaaaat cacagagttt 3480
tctaacactt ttctggtaaa tcttaataca gaggactcaa aaagttctgc tttcttggca 3540
tttgattgag ttgaaggaac ctgaaactga tctgggtgtc aggactcaca ggagaccttg 3600
attagattgg ttcctcagtt cttatgccaa ttaatcatgt caccttaggc atattacttg 3660
agagctctac aatgtgaggt tttttttttt tttatctcta aagtttaatc ggattaacgt 3720
gctctctaac atttctttca tcttgaaaat tctttgattt tataaataaa atgctccagt 3780
gttccaaaga gaaccctggg cacaaatagg cagaacaact ctcttcactt gtctcctcat 3840
aaaaataaat tttgtgtaac attttgatat agaaaagaaa gcgacgagat ttatgccact 3900
tatcactgga aacatttgtt tcaaacattt ttgtatgtta tagtaggaat atgccagcct 3960
aagcctatat tttattagtg acttagataa aactatgttt gtattagaag acctagttta 4020
catatttgtc ggagtctcaa aatggaaact gaattctgtc catctgattg tgtcatacac 4080
agaatatgct caataaaaac cttggatagt gataaaatat attctgtctt gaattccttt 4140
ttttctttag ggttctgtag ctatagctgg tgctgttatt cgctggctaa gagacaatct 4200
tggaattata aagacctcag aagaaattgg tgagtgtgtt ctaacaaaag kttagaaaat 4260
ctgaaaaatg acacatttca gtattttatc tctgcaaagt aaatatcgat gctttgcccc 4320
aaatgtgatc cagttgtgtg atttttgttt tgttttgttt taatgttaga aaaacttgct 4380
aaagaagtag gtacttctta tggctgctac ttcgtcccag cattttcggg gtaatatgca 4440
ccttattggg agcccagcgc aagagggtaa gtattgaaaa tatggagtgc ttttggggat 4500
cttgatttat tgtaaaacga cggccagttg attatgtcca attttctctt cctggacatt 4560
tctgtctacc aaatttgacc ttttcatatt tgagatattt caaattgatt ggtttatatc 4620
attctaatct gaaaatcttt gtgcgtattt ttaggataat ctgtggactc actcagttca 4680
ccaataaatg ccatattgct tttgctgcat tagaagctgt ttgtttccaa actcgagagg 4740
taacaaatat gggcctgttt tcttgtactt agttcacttt tatcactctt aagttatatg 4800
ttaacacccg agatttattc agtactgaaa atgtagttaa tcaaatatta aggctgccta 4860
aatactaatc taaatataag cagggttttc cccctttttc cagctgtcat taccttctaa 4920
gttcctgttc cctgtcaggc actgggaaat ttatggttgt ggggaggctg agtggcacac 4980
attaggcaaa ggaaacagca caaacatagg catcaaggca gaaaaacagg gtgcaaaata 5040
gagttgtata gcttagctga atatcaaggt gaatgcagag gtgtagtgag agaaaaggtt 5100
ggctgtgacc agatcaaaga gggcttagaa gaccagaata agaagtctca atttattcca 5160
taggctcttg gaagctcttg agagtttctg agtggaggat tgccattttc agagatgtta 5220
ctatgaaata gatttataac attaattgca ctggtttatt taagattttg gatgccatga 5280
atcgagactg tggaattcca ctcagtcatt tgcaggtaga tggaggaatg accagcaaca 5340
aaattcttat gcagctacaa gcagacattc tgtatatacc agtaggttag taagtcttca 5400
ttcctttaaa ctcccagagt aatgtttctt gtggaataac tagttctttg ggtgtaaaac 5460
gacggccagt tcccagagta atgtttcttg tggaataact agttctttgg gcatatgtaa 5520
ccacaaagat attgatggaa ctctctctcc tcagtgaagc cctcaatgcc cgaaaccact 5580
gcactgggtg cggctatggc ggcaggggct gcagaaggag tcggcgtatg gagtctcgaa 5640
cccgaggatt tgtctgccgt cacgatggag cggtttgaac ctcagattaa tgcggagggt 5700
acatttaaag aatgaaatgt tcagtgatat actgtgaaaa cgaccttagt gcacgggagt 5760
tttgtttttc tgtttagtta aaagttaagg aaccaagtaa aatagtaaat gttatcattg 5820
cagattcggc tgccaagcat attgggcttt actgaataaa tgtgaatgag agaaatcgtt 5880
gcttatcaaa agaacttcta aaatcacttt ttaaaaatca tttgtaaaac gacggccagt 5940
agccctactg cagtttaatg tgtcaataat ttgtcaagaa tgttgagtga tcataagtat 6000
ggtactaaga acatctcagc aaactacctt tcgttatgtg ttttttctac cttctaattc 6060
tagaaagtga aattcgttat tctacatgga agaaagctgt gatgaagtca atgggttggg 6120
ttacaactca atctccagaa agtggtaaaa atgtttttgt ttattattgt cacattttct 6180
tagtatatta aatagttatt taagtatcta ggcatttaca catagccagg ctgctctgaa 6240
gaaaagcatt atcatatgtc cagagattct gacattttga aaacacttta aagttctaaa 6300
cacaaaatgt aaattatcag gtgttgtaaa acgacggcca gttggtttgg tttgcttgac 6360
tggaatctct tctgcttgga tgaccacagg tgaccctagt atcttctgta gtctgccctt 6420
gggctttttt atagtgagta gcatggtaat gttaatcgga gcaaggtaca tctcaggtta 6480
gttactcttt aaattagaca actctattag ttagctttaa tgttttcgtg tataacttag 6540
cagaaatttt tcagtgtttt tcattctttc tgtgtctagg aagctggaaa atcaattaaa 6600
ggtctaatta gttagaccaa ttaatctttg ggggcagtta gaagtaagaa ctgtgactct 6660
gcttaccctt tttaaatttt taatgtgatg acttctttaa gagggactac attctgctgt 6720
cagctgcagc aataagcaaa agtgaaaata ctaatattta aatgacagga ctttcagact 6780
gactgctgaa agttaaagta tacttaaaat tactggctta aatggaaatg atgcttctta 6840
ttctgtatgt tcccatgaaa gtgaaactta aaaaaaaaat tcatgattag ggtttcatga 6900
aaaggccttg tttctatgaa aattgagaca ggttgcatct ctctaagcta aaagatgggc 6960
tatgtgtcta gagtcttaga cttctaaaat gcatgtggtc actatatgta ggttatctct 7020
tcggtgacat acactgcaat ttgagagggc tggaaattgt ttgccttggt aaacgattag 7080
caacagtggc aatatttgtt aattttggaa ttggccctgt ttgttgcatt ttaattgtga 7140
ggcatgattt agaaatcata tggactttct agcttaataa atgattgaat catctgcatt 7200
gctttaactc ctgaattgta tgcatgtatt attgacatat atggtttttg ttccccattt 7260
caggtattcc ataaaaccta ccaactcatg gattcccaag atgtgagctt tttacataat 7320
gaaagaaccc agcaattctg tctcttaatg caatgacact attcatagac tttgatttta 7380
tttataagcc acttgctgca tgaccctcca agtagacctg tggcttaaaa taaagaaaat 7440
gcagcaaaaa gaatgctata gaaatatttg gtggtttttt ttttttttaa acatccacag 7500
ttaaggttgg gccagctacc tttggggctg accccctcca ttgccataac atcctgctcc 7560
attccctcta agatgtagga agaattcgga tccttaccat tggaatcttc catcgaacat 7620
actcaaacac ttttggacca ggatttgagt ctctgcatga catatacttg attaaaaggt 7680
tattactaac ctgttaaaaa tcagcagctc tttgctttta agagacaccc taaaagtctt 7740
cttttctaca tagttgaaga cagcaacatc ttcactgaat gtttgaatag aaacctctac 7800
taaattatta aaatagacat ttagtgttct cacagcttgg atatttttct gaaaagttat 7860
ttgccaaaac tgaaatcctt cagatgtttt ccatggtccc actaattata atgactttct 7920
gtctgggtct tataggaaaa gatactttct tttttcttcc atctttcctt tttatatttt 7980
ttactttgta tgtataacat acatgcctat atattttata cactgaggga gcccatttat 8040
aaataaagag cacattatat
tcagaaggtt ctaacaggg 8079

SEQ ID NO: 6 | Length: 41 | Type: PRT | Organism: Unknown | Description: GK N288D mutant
Phe Gln Ile Gly Gln Ala Lys Asn Thr Tyr Gly Thr Gly Cys Phe Leu 16
Leu Cys Asp Thr Gly His Lys Cys Val Phe Ser Asp His Gly Leu Leu 32
Thr Thr Val Ala Tyr Lys Leu Gly Arg 41

SEQ ID NO: 7 | Length: 41 | Type: PRT | Organism: Homo sapiens
Phe Gln Ile Gly Gln Ala Lys Asn Thr Tyr Gly Thr Gly Cys Phe Leu 16
Leu Cys Asn Thr Gly His Lys Cys Val Phe Ser Asp His Gly Leu Leu 32
Thr Thr Val Ala Tyr Lys Leu Gly Arg 41

SEQ ID NO: 8 | Length: 41 | Type: PRT | Organism: Unknown | Description: Rat
Phe Gln Asp Gly Gln Ala Lys Asn Thr Tyr Gly Thr Gly Cys Phe Leu 16
Leu Cys Asn Thr Gly His Lys Cys Val Phe Ser Glu His Gly Leu Leu 32
Thr Thr Val Ala Tyr Lys Leu Gly Arg 41

SEQ ID NO: 9 | Length: 41 | Type: PRT | Organism: Unknown | Description: Mouse
Phe Gln Asp Gly Gln Ala Lys Asn Thr Tyr Gly Thr Gly Cys Phe Leu 16
Leu Cys Asn Thr Gly His Lys Cys Val Phe Ser Glu His Gly Leu Leu 32
Thr Thr Val Ala Tyr Lys Leu Gly Arg 41

SEQ ID NO: 10 | Length: 39 | Type: PRT | Organism: E. coli
Val Lys Glu Gly Met Ala Lys Asn Thr Tyr Gly Thr Gly Cys Phe Met 16
Leu Met Asn Thr Gly Glu Lys Ala Val Lys Ser Glu Asn Gly Leu Leu 32
Thr Thr Ile Ala Cys Gly Pro 39

SEQ ID NO: 11 | Length: 39 | Type: PRT | Organism: Pseudomonas aeruginosa
Val Glu Pro Gly Gln Ala Lys Asn Thr Tyr Gly Thr Gly Cys Phe Leu 16
Leu Met His Thr Gly Asp Lys Ala Val Lys Ser Thr His Gly Leu Leu 32
Thr Thr Ile Ala Cys Gly Pro 39

SEQ ID NO: 12 | Length: 39 | Type: PRT | Organism: Enterococcus casseliflavus
Phe Glu Lys Gly Met Ile Lys Asn Thr Tyr Gly Thr Gly Ala Phe Ile 16
Val Met Asn Thr Gly Glu Glu Pro Gln Leu Ser Asp Asn Asp Leu Leu 32
Thr Thr Ile Gly Tyr Gly Ile 39

SEQ ID NO: 13 | Length: 41 | Type: PRT | Organism: Haemophilus influenzae
Val His Ala Gly Gln Ala Lys Asn Thr Tyr Gly Thr Gly Cys Phe Met 16
Leu Leu His Thr Gly Asn Lys Ala Ile Thr Ser Lys Asn Gly Leu Leu 32
Thr Thr Ile Ala Cys Asn Ala Lys Gly 41

SEQ ID NO: 14 | Length: 39 | Type: PRT | Organism: Bacillus subtilis
Phe Glu Glu Gly Met Gly Lys Asn Thr Tyr Gly Thr Gly Cys Phe Met 16
Leu Met Asn Thr Gly Glu Lys Ala Ile Lys Ser Glu His Gly Leu Leu 32
Thr Thr Ile Ala Trp Gly Ile 39

SEQ ID NO: 15 | Length: 41 | Type: PRT | Organism: Saccharomyces cerevisiae
Tyr Lys Pro Gly Ala Ala Lys Cys Thr Tyr Gly Thr Gly Cys Phe Leu 16
Leu Tyr Asn Thr Gly Thr Lys Lys Leu Ile Ser Gln His Gly Ala Leu 32
Thr Thr Leu Ala Phe Trp Phe Pro His 41

SEQ ID NO: 16 | Length: 41 | Type: PRT | Organism: Mycoplasma genitalium
Thr Glu Pro Gly Met Val Lys Asn Thr Tyr Gly Thr Gly Cys Phe Val 16
Leu Met Asn Ile Gly Asp Lys Pro Thr Leu Ser Lys His Asn Leu Leu 32
Thr Thr Val Ala Trp Gln Leu Glu Asn 41

SEQ ID NO: 17 | Length: 39 | Type: PRT | Organism: Enterococcus faecalis
Phe Glu Pro Gly Met Val Lys Asn Thr Tyr Gly Thr Gly Ser Phe Ile 16
Val Met Asn Thr Gly Glu Glu Pro Gln Leu Ser Lys Asn Asn Leu Leu 32
Thr Thr Ile Gly Tyr Gly Ile 39

SEQ ID NO: 18 | Length: 41 | Type: PRT | Organism: Mycoplasma pneumoniae
Val Glu Pro Ala Met Val Lys Asn Thr Tyr Gly Thr Gly Cys Phe Met 16
Leu Met Asn Ile Gly Asn Glu Leu Lys Tyr Ser Gln His Asn Leu Leu 32
Thr Thr Val Ala Trp Gln Leu Glu Asn 41

SEQ ID NO: 19 | Length: 41 | Type: PRT | Organism: Synechocystis PCC6803
Asp Arg Pro Gly Leu Leu Lys Cys Thr Tyr Gly Thr Gly Ala Phe Leu 16
Val Ala Asn Thr Gly Gln Thr Val Thr Arg Ser Gln His Arg Leu Leu 32
Ser Thr Val Ala Trp Thr Gln Thr Asn 41

SEQ ID NO: 20 | Length: 12 | Type: DNA | Organism: Artificial Sequence | Description: GK gene polymorphism
ggacargacc ct 12

SEQ ID NO: 21 | Length: 16 | Type: DNA | Organism: Artificial Sequence | Description: GK gene polymorphism
aaacaaahaa acaaaa 16

SEQ ID NO: 22 | Length: 13 | Type: DNA | Organism: Artificial Sequence | Description: GK gene polymorphism
actatgtrat aca 13

SEQ ID NO: 23 | Length: 16 | Type: DNA | Organism: Artificial Sequence | Description: GK gene polymorphism
aacaaaagkt tagaaa 16
* * * * *