U.S. patent application number 12/208263, for genomic markers of hepatitis B virus associated with hepatocellular carcinoma, was filed with the patent office on 2008-09-10 and published on 2009-03-05.
This patent application is currently assigned to The Chinese University of Hong Kong. Invention is credited to Angeline Bartholomeusz, Lik Yuen Chan, King Hong Lee, Kwong Sak Leung, Wai Yee Leung, Shu Kam Mok, Jao Yiu SUNG, Kwok Wing Tsui.
Publication Number | 20090061419 |
Application Number | 12/208263 |
Family ID | 36034475 |
Filed Date | 2008-09-10 |
United States Patent Application | 20090061419 |
Kind Code | A1 |
SUNG; Jao Yiu; et al. | March 5, 2009 |
GENOMIC MARKERS OF HEPATITIS B VIRUS ASSOCIATED WITH HEPATOCELLULAR
CARCINOMA
Abstract
The present invention provides methods of predicting a
pre-disposition of HBV-infected individuals to develop
hepatocellular carcinoma (HCC).
Inventors: |
SUNG; Jao Yiu; (Ma On Shan, NT, HK)
Chan; Lik Yuen; (Shatin, HK)
Tsui; Kwok Wing; (Ma On Shan, NT, HK)
Leung; Kwong Sak; (Shatin, NT, HK)
Mok; Shu Kam; (Shatin, NT, HK)
Bartholomeusz; Angeline; (Victoria, AU)
Leung; Wai Yee; (Happy Valley, HK)
Lee; King Hong; (Hung Hom, HK) |
Correspondence
Address: |
TOWNSEND AND TOWNSEND AND CREW, LLP
TWO EMBARCADERO CENTER, EIGHTH FLOOR
SAN FRANCISCO
CA
94111-3834
US
|
Assignee: |
The Chinese University of Hong
Kong
Shatin
HK
Hospital Authority
Kowloon
HK
|
Family ID: |
36034475 |
Appl. No.: |
12/208263 |
Filed: |
September 10, 2008 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
11019426 | Dec 20, 2004 | 7439020 |
12208263 | | |
10937987 | Sep 10, 2004 | |
11019426 | | |
Current U.S. Class: | 435/5; 702/19 |
Current CPC Class: | C12Q 1/706 20130101 |
Class at Publication: | 435/5; 702/19 |
International Class: | C12Q 1/70 20060101; G06F 19/00 20060101 |
Claims
1. A method of determining a pre-disposition of an individual
infected with hepatitis B virus (HBV) to develop hepatocellular
carcinoma (HCC), the method comprising: a) determining a nucleotide
in the genome of HBV isolated from the individual at least at the
position corresponding to nucleotide 1613 of SEQ ID NO: 1; and b)
determining the presence or absence of 1613A in the HBV genome,
wherein if the HBV genome has 1613A, the individual has a
predisposition to develop HCC.
2. The method of claim 1, wherein the determining step b) comprises
aligning the determined nucleotides to the HBV genomic sequence to
determine the position of the nucleotide corresponding to position
1613 of SEQ ID NO: 1 and comparing the nucleotide corresponding to
position 1613 with 1613A.
3. The method of claim 1, wherein the aligning step is performed on
a computer.
4. The method of claim 1, further comprising c) providing a
prognosis of HCC predisposition based on the results of step
b).
5. The method of claim 4, wherein the HBV genome has 1613A.
6. The method of claim 5, further comprising testing the individual
for the presence of HCC.
7. The method of claim 1, further comprising determining the
genotype of the HBV from the individual.
8. The method of claim 1, the method further comprising determining
nucleotides in the genome of a genotype C HBV isolated from the
individual at positions corresponding to nucleotides 53, 312, 799,
961, 1499, 1899, 2170, or 2441; and comparing the determined
nucleotides to nucleotides associated with a pre-disposition to
cause HCC, wherein the nucleotides associated with a
pre-disposition to cause HCC comprise: 53C, 312C, 799G, 961G,
1499G, 1899A, 2170C, 2170G, or 2441C.
9. The method of claim 8, the method comprising a) determining the
subtype of a genotype C HBV from the individual, wherein: subtype
C3 comprises nucleotides 2733C, 1856C, 1009C, and 2892T; b) if the
HBV is genotype C3, further determining the nucleotides at
positions corresponding to nucleotides 312 or 961 or 1899 of SEQ ID
NO:1; and c) comparing the determined nucleotides to nucleotides at
the positions associated with a pre-disposition to cause HCC,
wherein the nucleotides associated with a pre-disposition to cause
HCC in subtype C3 comprise: 312C; or 961G; or 1899A.
10. The method of claim 9, wherein the determining step comprises
nucleotide sequencing the HBV genome flanking the nucleotides at
positions corresponding to nucleotides 312, 961, 1613 and 1899 of
SEQ ID NO:1.
11. The method of claim 9, wherein the determining step comprises
amplifying at least a portion of the HBV genome to produce one or
more amplification products comprising the nucleotides at the
positions corresponding to nucleotides 312, 961, 1613 and 1899 of
SEQ ID NO:1.
12. The method of claim 11, comprising contacting the one or more
amplification products with one or more probes that hybridize to
HCC-associated nucleotides: 312C; or 961G; or 1613A; or 1899A;
under conditions to allow for hybridization of a probe to an
amplification product only if the amplification product comprises a
complementary nucleotide at the position of the HCC-associated
nucleotide.
13. The method of claim 12, wherein the hybridization is performed
as a line probe assay.
14. The method of claim 9, further comprising determining the
genotype of the HBV from the individual.
15. The method of claim 1, wherein the determining step b) is
performed on a computer, the computer including a computer readable
medium comprising, a) code for receiving information describing:
nucleotides at positions corresponding to nucleotides 1613 of SEQ
ID NO:1; b) code for comparing the nucleotides received in a) to
nucleotides associated with a pre-disposition to cause HCC; and c)
code for providing a determination of the pre-disposition of the
HBV to cause HCC, wherein nucleotides associated with a
pre-disposition to cause HCC comprise: 1613A.
16. A kit for detecting HBV isolates that are associated with the
development of hepatocellular carcinoma (HCC), comprising one or more
probes which, when contacted to an HBV genome, selectively
hybridize to the genome if the genome comprises an A at a position
corresponding to position 1613 of SEQ ID NO:1.
17. The kit of claim 16, wherein the probe is linked to a solid
support.
18. The kit of claim 16, further comprising primers for
amplification of at least a portion of the HBV genome.
19. A computer readable medium comprising, a) code for receiving
information describing: nucleotides at positions corresponding to
nucleotides 1613 of SEQ ID NO:1; b) code for comparing the
nucleotides received in a) to nucleotides associated with a
pre-disposition to cause HCC; and c) code for providing a
determination of the pre-disposition of the HBV to cause HCC,
wherein nucleotides associated with a pre-disposition to cause HCC
comprise: 1613A.
Description
CROSS-REFERENCE TO RELATED PATENT APPLICATIONS
[0001] The present patent application is a divisional of U.S.
patent application Ser. No. 11/019,426, filed Dec. 20, 2004, which
is a continuation-in-part of U.S. patent application Ser. No.
10/937,987, filed Sep. 10, 2004, the disclosure of each is herein
incorporated by reference in its entirety.
BACKGROUND OF THE INVENTION
[0002] Hepatitis B virus (HBV) infects over 300 million people
worldwide. For those individuals with high levels of viral
replication, chronic active hepatitis with progression to
cirrhosis, liver failure and hepatocellular carcinoma (HCC) is
common.
[0003] The natural progression of chronic HBV infection over a 10
to 20 year period leads to cirrhosis in 20 to 50% of patients, and
progression of HBV infection to hepatocellular carcinoma has been
well documented. There have been no studies that have determined
sub-populations of hepatitis B virus that are most likely to cause
hepatocellular carcinoma; thus, to date, all hepatitis B viruses have
been considered to pose an equal risk of hepatocarcinogenesis.
[0004] It is important to note that the survival for patients
diagnosed with hepatocellular carcinoma is only 0.9 to 12.8 months
from initial diagnosis (Takahashi et al., American Journal of
Gastroenterology 88:240-243 (1993)). Treatment of hepatocellular
carcinoma with chemotherapeutic agents has not proven effective and
only 10% of patients will benefit from surgery due to extensive
tumor invasion of the liver (Trinchet et al., Presse Medicine
23:831-833 (1994)). Given the aggressive nature of primary
hepatocellular carcinoma, the only viable treatment alternative to
surgery is liver transplantation (Pichlmayr et al., Hepatology
20:33S-40S (1994)).
BRIEF SUMMARY OF THE INVENTION
[0005] The present invention provides for methods of determining a
pre-disposition of an individual infected with hepatitis B virus
(HBV) to develop hepatocellular carcinoma (HCC). In some
embodiments, the methods comprise:
[0006] (a) determining nucleotides in the genome of HBV isolated
from the individual at positions corresponding to nucleotides 31,
53, 312, 799, 961, 1165, 1499, 1613, 1762, 1764, 1899, 2170, 2441,
2525, and/or 2712 of SEQ ID NO:1; and
[0007] (b) comparing the determined nucleotides to nucleotides
associated with a pre-disposition to cause HCC, wherein the
nucleotides associated with a pre-disposition to cause HCC
comprise: 31C, 53C, 312C, 799G, 961G, 1165T, 1499G, 1613A, 1762T,
1764A, 1899A, 2170C, 2170G, 2441C, 2525C, 2712C, 2712A, and/or
2712G.
[0008] In some embodiments, the methods comprise:
[0009] (a) determining nucleotides in the genome of a genotype B
HBV isolated from the individual at positions corresponding to
nucleotides 1165, 1762, 1764, 2525 or 2712 of SEQ ID NO:1; and
[0010] (b) comparing the determined nucleotides to nucleotides
associated with a pre-disposition to cause HCC, wherein the
nucleotides associated with a pre-disposition to cause HCC
comprise: 1165T, 1762T, 1764A, 2525C, 2712C, 2712A, or 2712G.
[0011] In some embodiments, the methods comprise:
[0012] (a) determining nucleotides in the genome of a genotype B
HBV isolated from the individual at positions corresponding to
nucleotides 1165, 1762, 1764, 2525 and 2712 of SEQ ID NO:1; and
[0013] (b) comparing the determined nucleotides to nucleotides
associated with a pre-disposition to cause HCC, wherein the
nucleotides associated with a pre-disposition to cause HCC in
genotype B comprise:
[0014] 1762T and 1764A and 2712A; or
[0015] 1762T and 1764A and 2712C; or
[0016] 1762T and 1764A and 2712G; or
[0017] 1762T and 1764A and 2712T and 2525C; or
[0018] 1762A and 1764G and 1165T.
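The genotype B comparison in paragraphs [0013] to [0018] is a test for any one of five nucleotide combinations. The following is a minimal sketch of that comparison, not a definitive implementation. The names `GENOTYPE_B_PATTERNS`, `matching_b_patterns` and `has_b_hcc_pattern`, and the choice of a dictionary keyed by SEQ ID NO:1 position as input, are illustrative assumptions that do not appear in the application.

```python
# HCC-associated nucleotide combinations for genotype B, transcribed from
# paragraphs [0014]-[0018]. Keys are positions relative to SEQ ID NO:1.
# All names in this sketch are hypothetical.
GENOTYPE_B_PATTERNS = [
    {1762: "T", 1764: "A", 2712: "A"},
    {1762: "T", 1764: "A", 2712: "C"},
    {1762: "T", 1764: "A", 2712: "G"},
    {1762: "T", 1764: "A", 2712: "T", 2525: "C"},
    {1762: "A", 1764: "G", 1165: "T"},
]


def matching_b_patterns(calls):
    """Return every listed combination that `calls` matches in full.

    `calls` maps a SEQ ID NO:1 position to the nucleotide determined at the
    corresponding position of the isolate. A position that is missing from
    `calls` never matches.
    """
    return [
        pattern
        for pattern in GENOTYPE_B_PATTERNS
        if all(calls.get(pos) == nt for pos, nt in pattern.items())
    ]


def has_b_hcc_pattern(calls):
    """True if at least one HCC-associated combination is present."""
    return bool(matching_b_patterns(calls))
```

A combination counts only when every position in it matches, which is why 2712T alone does not qualify unless 2525C is also present.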
[0019] In some embodiments, the method comprises determining the
genotype of the HBV from the individual.
[0020] In some embodiments, the determining step comprises
nucleotide sequencing the HBV genome flanking the nucleotides at
positions corresponding to nucleotides 1165, 1762, 1764, 2525 and
2712 of SEQ ID NO:1.
[0021] In some embodiments, the determining step comprises
amplifying at least a portion of the HBV genome to produce one or
more amplification products comprising the nucleotides at the
positions corresponding to nucleotides 1165, 1762, 1764, 2525 and
2712 of SEQ ID NO: 1. In some embodiments, the method comprises
contacting the one or more amplification products with one or more
probes that hybridize to HCC-associated nucleotides:
[0022] 1762T and 1764A and 2712A; or
[0023] 1762T and 1764A and 2712C; or
[0024] 1762T and 1764A and 2712G; or
[0025] 1762T and 1764A and 2712T and 2525C; or
[0026] 1762A and 1764G and 1165T;
under conditions to allow for hybridization of a probe to an
amplification product only if the amplification product comprises a
complementary nucleotide at the position of the HCC-associated
nucleotide. In some embodiments, the hybridization is performed as
a line probe assay.
[0027] In some embodiments, the method comprises:
[0028] (a) determining nucleotides in the genome of a genotype C
HBV isolated from the individual at positions corresponding to
nucleotides 31, 53, 312, 799, 961, 1499, 1613, 1899, 2170, or 2441;
and
[0029] (b) comparing the determined nucleotides to nucleotides
associated with a pre-disposition to cause HCC, wherein the
nucleotides associated with a pre-disposition to cause HCC
comprise: 31C, 53C, 312C, 799G, 961G, 1499G, 1613A, 1899A, 2170C,
2170G, or 2441C.
[0030] In some embodiments, the method comprises
[0031] a) determining the subtype of a genotype C HBV from the
individual, wherein:
[0032] subtype C1 comprises nucleotides 2733A, 1856C, 1009T and
2892T,
[0033] subtype C2 comprises nucleotides 2733C, 1856T, 1009T and
2892T, and
[0034] subtype C3 comprises nucleotides 2733C, 1856C, 1009C and
2892T;
[0035] b1) if the HBV is genotype C1, determining the nucleotides
at positions corresponding to nucleotides 31, 53 and 1499 of SEQ ID
NO:1; or
[0036] b2) if the HBV is genotype C2, determining the nucleotides
at positions corresponding to nucleotides 799, 2441 and 2170 of SEQ
ID NO:1; or
[0037] b3) if the HBV is genotype C3, determining the nucleotides
at positions corresponding to nucleotides 312, 961, 1613, 1899 of
SEQ ID NO:1; and
[0038] c) comparing the determined nucleotides to nucleotides at
the positions associated with a pre-disposition to cause HCC,
wherein the nucleotides associated with a pre-disposition to cause
HCC in subtype C1 comprise:
[0039] 31C; and/or
[0040] 53C; and/or
[0041] 1499G; and
[0042] the nucleotides associated with a pre-disposition to cause
HCC in subtype C2 comprise:
[0043] 2170C; and/or
[0044] 2170G; and/or
[0045] 2441C; and/or
[0046] 799G; and
[0047] the nucleotides associated with a pre-disposition to cause
HCC in subtype C3 comprise:
[0048] 312C; and/or
[0049] 961G; and/or
[0050] 1613A; and/or
[0051] 1899A.
[0052] In some embodiments, the determining step comprises
nucleotide sequencing the HBV genome flanking the nucleotides at
positions corresponding to nucleotides 31, 53, and 1499 of SEQ ID
NO: 1. In some embodiments, the determining step comprises
nucleotide sequencing the HBV genome flanking the nucleotides at
positions corresponding to nucleotides 799, 2441, and 2170 of SEQ
ID NO:1. In some embodiments, the determining step comprises
amplifying at least a portion of the HBV genome to produce one or
more amplification products comprising the nucleotides at the
positions corresponding to nucleotides 31, 53, and 1499 of SEQ ID
NO:1. In some embodiments, the determining step comprises
nucleotide sequencing the HBV genome flanking the nucleotides at
positions corresponding to nucleotides 312, 961, 1613, and 1899 of
SEQ ID NO:1.
[0053] In some embodiments, the determining step comprises
amplifying at least a portion of the HBV genome to produce one or
more amplification products comprising the nucleotides at the
positions corresponding to nucleotides 799, 2441, and 2170 of SEQ
ID NO: 1. In some embodiments, the determining step comprises
amplifying at least a portion of the HBV genome to produce one or
more amplification products comprising the nucleotides at the
positions corresponding to nucleotides 312, 961, 1613, and 1899 of
SEQ ID NO:1.
[0054] In some embodiments, the method comprises contacting the one
or more amplification products with one or more probes that
hybridize to HCC-associated nucleotides:
[0055] 31C; and/or
[0056] 53C; and/or
[0057] 1499G;
under conditions to allow for hybridization of a probe to an
amplification product only if the amplification product comprises a
complementary nucleotide at the position of the HCC-associated
nucleotide.
[0058] In some embodiments, the hybridization is performed as a
line probe assay.
[0059] In some embodiments, the method comprises contacting the one
or more amplification products with probes that hybridize to
HCC-associated nucleotides:
[0060] 2170G; and/or
[0061] 2441C; and/or
[0062] 799G;
under conditions to allow for hybridization of the probes to the
amplification product only if the amplification product comprises a
complementary nucleotide at the position of the HCC-associated
nucleotide. In some embodiments, the hybridization is performed as
a line probe assay.
[0063] In some embodiments, the method comprises contacting the one
or more amplification products with probes that hybridize to
HCC-associated nucleotides:
[0064] 312C; and/or
[0065] 961G; and/or
[0066] 1613A; and/or
[0067] 1899A;
[0068] under conditions to allow for hybridization of the probes to
the amplification product only if the amplification product
comprises a complementary nucleotide at the position of the
HCC-associated nucleotide. In some embodiments, the hybridization
is performed as a line probe assay.
[0069] In some embodiments, the method further comprises
determining the genotype of the HBV from the individual.
[0070] In some embodiments, the method comprises:
[0071] determining the genotype of the HBV, wherein genotype B
comprises 2733C, 1856C, 1009T and 2892T, genotype C1 comprises
2733A, 1856C, 1009T and 2892T, genotype C2 comprises 2733C, 1856T,
1009T and 2892T and genotype C3 comprises 2733C, 1856C, 1009C and
2892T;
[0072] determining nucleotides 1165, 1762, 1764, 2525 and 2712 of
the HBV genome if the HBV is genotype B; and/or
[0073] determining nucleotides 31 and/or 53 and/or 1499 of the HBV
genome if the HBV is C1; and/or
[0074] determining nucleotides 2170 and/or 2441 and/or 799 of the
HBV genome if the HBV is C2; and/or
[0075] determining nucleotides 312 and/or 961 and/or 1613 and/or
1899 of the HBV genome if the HBV is C3; and
[0076] comparing the determined nucleotides to nucleotides
associated with a pre-disposition to cause HCC,
[0077] wherein nucleotides associated with a pre-disposition to
cause HCC in genotype B comprise:
[0078] 1762T and 1764A and 2712A; or
[0079] 1762T and 1764A and 2712C; or
[0080] 1762T and 1764A and 2712G; or
[0081] 1762T and 1764A and 2712T and 2525C; or
[0082] 1762A and 1764G and 1165T;
[0083] wherein nucleotides associated with a pre-disposition to
cause HCC in genotype C1 comprise:
[0084] 31C; and/or
[0085] 53C; and/or
[0086] 1499G; and
[0087] wherein nucleotides associated with a pre-disposition to
cause HCC in genotype C2 comprise:
[0088] 2170C; and/or
[0089] 2170G; and/or
[0090] 2441C; and/or
[0091] 799G;
[0092] wherein nucleotides associated with a pre-disposition to
cause HCC in genotype C3 comprise:
[0093] 312C; and/or
[0094] 961G; and/or
[0095] 1613A; and/or
[0096] 1899A;
[0097] thereby determining the pre-disposition of the individual to
develop HCC.
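The stratified method of paragraphs [0070] to [0097] can be read as two lookups: first the genotype or subtype from four signature positions, then the markers listed for that genotype. The sketch below illustrates this reading under stated assumptions. The table and function names (`SIGNATURES`, `C_MARKERS`, `B_PATTERNS`, `classify`, `hcc_markers_found`) and the dictionary input format are hypothetical; only the positions and nucleotides are taken from the text.

```python
# Signature nucleotides per genotype/subtype, transcribed from [0071].
# Positions are relative to SEQ ID NO:1. All names here are hypothetical.
SIGNATURES = {
    "B":  {2733: "C", 1856: "C", 1009: "T", 2892: "T"},
    "C1": {2733: "A", 1856: "C", 1009: "T", 2892: "T"},
    "C2": {2733: "C", 1856: "T", 1009: "T", 2892: "T"},
    "C3": {2733: "C", 1856: "C", 1009: "C", 2892: "T"},
}

# Single-position markers for the genotype C subtypes ([0083]-[0096]).
C_MARKERS = {
    "C1": {31: {"C"}, 53: {"C"}, 1499: {"G"}},
    "C2": {2170: {"C", "G"}, 2441: {"C"}, 799: {"G"}},
    "C3": {312: {"C"}, 961: {"G"}, 1613: {"A"}, 1899: {"A"}},
}

# Nucleotide combinations for genotype B ([0078]-[0082]).
B_PATTERNS = [
    {1762: "T", 1764: "A", 2712: "A"},
    {1762: "T", 1764: "A", 2712: "C"},
    {1762: "T", 1764: "A", 2712: "G"},
    {1762: "T", 1764: "A", 2712: "T", 2525: "C"},
    {1762: "A", 1764: "G", 1165: "T"},
]


def classify(calls):
    """Return the genotype/subtype whose signature matches `calls`, else None."""
    for name, signature in SIGNATURES.items():
        if all(calls.get(pos) == nt for pos, nt in signature.items()):
            return name
    return None


def hcc_markers_found(calls):
    """Return (subtype, matched HCC-associated markers) for one isolate.

    `calls` maps SEQ ID NO:1 positions to the nucleotides determined in the
    isolate. Genotype B hits are reported as whole combinations.
    """
    subtype = classify(calls)
    if subtype is None:
        return None, []
    if subtype == "B":
        hits = [
            "+".join(f"{pos}{nt}" for pos, nt in sorted(pattern.items()))
            for pattern in B_PATTERNS
            if all(calls.get(pos) == nt for pos, nt in pattern.items())
        ]
    else:
        hits = [
            f"{pos}{calls[pos]}"
            for pos, risky in sorted(C_MARKERS[subtype].items())
            if calls.get(pos) in risky
        ]
    return subtype, hits
```

For example, an isolate carrying the C3 signature together with 312C and 1613A would be reported as `("C3", ["312C", "1613A"])`. An isolate that matches none of the four signatures is returned as `(None, [])` rather than being forced into a genotype.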
[0098] The present invention also provides kits for detecting HBV
isolates that are associated with the development of hepatocellular
carcinoma (HCC).
[0099] In some embodiments, the kits comprise: one or more probes
which, when contacted to an HBV genome, selectively hybridize to
the genome if the genome comprises at least one of the following
nucleotides: 31C, 53C, 312C, 799G, 961G, 1165T, 1499G, 1613A,
1762T, 1762A, 1764A, 1764G, 1899A, 2441C, 2170C, 2170G, 2712A,
2712C, 2712G, or 2525C.
[0100] In some embodiments, the probe is linked to a solid
support.
[0101] In some embodiments, the probe selectively hybridizes
to:
[0102] 1762T and 1764A and 2712A; and/or
[0103] 1762T and 1764A and 2712C; and/or
[0104] 1762T and 1764A and 2712G; and/or
[0105] 1762T and 1764A and 2712T and 2525C; and/or
[0106] 1762A and 1764G and 1165T.
[0107] In some embodiments, the probe selectively hybridizes
to:
[0108] 31C; and/or
[0109] 53C; and/or
[0110] 1499G.
[0111] In some embodiments, the probe selectively hybridizes
to:
[0112] 2170C; and/or
[0113] 2170G; and/or
[0114] 2441C; and/or
[0115] 799G.
[0116] In some embodiments, the probe selectively hybridizes
to:
[0117] 312C; and/or
[0118] 961G; and/or
[0119] 1613A; and/or
[0120] 1899A.
[0121] In some embodiments, the kits further comprise primers for
amplification of at least a portion of the HBV genome.
[0122] The present invention also provides a computer readable
medium for determining whether an HBV sequence is likely to result
in the development of HCC. In some embodiments, the computer
readable form comprises:
[0123] a) code for receiving information describing: nucleotides at
positions corresponding to nucleotides 31, 53, 312, 799, 961, 1165,
1499, 1613, 1762, 1764, 1899, 2170, 2441, 2525, or 2712 of SEQ ID
NO:1;
[0124] b) code for comparing the nucleotides received in a) to
nucleotides associated with a pre-disposition to cause HCC; and
[0125] c) code for providing a determination of the pre-disposition
of the HBV to cause HCC,
[0126] wherein nucleotides associated with a pre-disposition to
cause HCC comprise: 31C, 53C, 312C, 799G, 961G, 1165T, 1499G,
1613A, 1762T, 1764A, 1899A, 2170C, 2170G, 2441C, 2525C, 2712C,
2712A, or 2712G.
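The three code elements a), b) and c) recited above can be sketched as three small functions. This is an illustration under stated assumptions rather than the code contemplated by the application: the JSON input format, the function names `receive`, `compare` and `determine`, and the shape of the returned result are all hypothetical. The marker table is transcribed from paragraph [0126].

```python
import json

# HCC-associated nucleotides by SEQ ID NO:1 position, from [0126].
HCC_ASSOCIATED = {
    31: {"C"}, 53: {"C"}, 312: {"C"}, 799: {"G"}, 961: {"G"},
    1165: {"T"}, 1499: {"G"}, 1613: {"A"}, 1762: {"T"}, 1764: {"A"},
    1899: {"A"}, 2170: {"C", "G"}, 2441: {"C"}, 2525: {"C"},
    2712: {"C", "A", "G"},
}


def receive(payload):
    """a) Parse a JSON object such as '{"1613": "A"}' into {position: nucleotide}.

    The JSON format is an assumption made for this sketch.
    """
    raw = json.loads(payload)
    return {int(pos): nt.upper() for pos, nt in raw.items()}


def compare(calls):
    """b) List the received nucleotides that appear in the HCC-associated table."""
    return [
        f"{pos}{nt}"
        for pos, nt in sorted(calls.items())
        if nt in HCC_ASSOCIATED.get(pos, set())
    ]


def determine(payload):
    """c) Provide a determination based on the comparison."""
    hits = compare(receive(payload))
    return {"hcc_associated": bool(hits), "markers": hits}
```

This flat lookup treats each listed nucleotide independently, following the wording of [0126]; it does not apply the genotype-specific combinations described earlier in the summary.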
DEFINITIONS
[0127] A probe "selectively hybridizes" to a viral genome
comprising a particular nucleotide when the probe hybridizes to the
genome when the particular nucleotide (at the specified position)
is present, but does not hybridize if the nucleotide at the
specified position is different or absent. Conditions to allow for
hybridization of a probe to a particular DNA molecule only if a
complementary nucleotide is present in a particular target DNA are
generally "stringent hybridization conditions."
[0128] The phrase "stringent hybridization conditions" refers to
conditions under which a probe will hybridize to its target
subsequence, typically in a complex mixture of nucleic acid, but to
no other sequences, or at least to no other sequences at which a
particular position is anything but one particular nucleotide.
Stringent conditions are sequence-dependent and will be different
in different circumstances. Longer sequences hybridize specifically
at higher temperatures. An extensive guide to the hybridization of
nucleic acids is found in Tijssen, Techniques in Biochemistry and
Molecular Biology--Hybridization with Nucleic Acid Probes, "Overview of
principles of hybridization and the strategy of nucleic acid
assays" (1993). Generally, stringent conditions are selected to be
about 5-10° C. lower than the thermal melting point
(Tm) for the specific sequence at a defined ionic strength and pH.
The Tm is the temperature (under defined ionic strength, pH,
and nucleic acid concentration) at which 50% of the probes complementary
to the target hybridize to the target sequence at equilibrium (as
the target sequences are present in excess, at Tm, 50% of the
probes are occupied at equilibrium). Stringent conditions for
Southern hybridization are generally those in which the salt
concentration is less than about 1.0 M sodium ion, typically about
0.01 to 1.0 M sodium ion concentration (or other salts) at pH 7.0
to 8.3 and the temperature is at least about 30° C. for
short probes (e.g., 10 to 50 nucleotides) and at least about
60° C. for long probes (e.g., greater than 50 nucleotides).
Stringent conditions may also be achieved with the addition of
destabilizing agents such as formamide. For selective
hybridization, a positive signal is at least two times background,
optionally 10 times background hybridization, i.e., hybridization
to another nucleotide sequence with a different nucleotide at the
position of interest. Exemplary stringent hybridization conditions
can be as follows: 50% formamide, 5×SSC, and 1% SDS,
incubating at 42° C., or 5×SSC, 1% SDS, incubating at
65° C., with wash in 0.2×SSC, and 0.1% SDS at
65° C. Such washes can be performed for 5, 15, 30, 60, 120,
or more minutes.
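Two of the numerical rules in the definition above lend themselves to a short worked example: the temperature window of about 5 to 10° C. below the Tm, and the signal threshold of at least two times (optionally ten times) background. The sketch below assumes the Tm has already been measured or estimated by some other means. The function names and the default `fold` value are illustrative assumptions.

```python
def stringent_temperature_window(tm_celsius):
    """Return (low, high) in degrees C, i.e. 10 and 5 degrees below the Tm."""
    return tm_celsius - 10.0, tm_celsius - 5.0


def is_positive_signal(signal, background, fold=2.0):
    """True if `signal` is at least `fold` times `background`.

    fold=2.0 mirrors the "at least two times background" criterion;
    pass fold=10.0 for the optional ten-fold criterion.
    """
    return signal >= fold * background
```

With a Tm of 65° C., for instance, the window is 55 to 60° C., and a signal of 250 units over a background of 100 passes the two-fold test but not the ten-fold one.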
[0129] "Determining nucleotides in the genome of HBV at positions
corresponding to" particular nucleotides of a reference sequence
(e.g., SEQ ID NO: 1) refers to identifying a position in an
isolated HBV genome that occurs in a position that is the
equivalent of the particular position in the reference sequence.
The variants identified in the present invention are not limited to
predicting sequence pre-disposition of variants of SEQ ID NO: 1,
but instead apply to any HBV strain carrying particular
corresponding nucleotides. Thus, when the genome of an HBV isolate
differs from SEQ ID NO: 1 (e.g., by changes in nucleotides or
addition or deletion of nucleotides), it may be that a particular
nucleotide associated with the development of HCC will not be in
exactly the same position as it is in SEQ ID NO: 1. For example,
the nucleotide corresponding to nucleotide 31C of SEQ ID NO: 1 may
occur at position 32 of a particular HBV strain due to a one
nucleotide insertion at an earlier position in the strain's genome.
Nevertheless, position 32 of the HBV strain would correspond to
position 31 of SEQ ID NO: 1, which can be readily illustrated in an
alignment of the two sequences. As described herein, the
corresponding nucleotide in the genome of an HBV isolate can be
determined using an alignment algorithm such as BLAST.
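The position-32 versus position-31 example above can be made concrete once a pairwise alignment is available. The sketch below takes two already-aligned, gapped sequences (as any pairwise aligner would produce) and finds the isolate position that corresponds to a given reference position. It does not call BLAST itself; the function name `map_reference_position` and the use of "-" as the gap character are assumptions of this sketch.

```python
def map_reference_position(aligned_ref, aligned_isolate, ref_pos):
    """Map a 1-based reference position to the isolate through an alignment.

    `aligned_ref` and `aligned_isolate` are equal-length strings in which
    "-" marks a gap. Returns (isolate_pos, isolate_nt). If the isolate has
    a deletion at that reference position, returns (None, "-").
    """
    if len(aligned_ref) != len(aligned_isolate):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0  # ungapped reference positions seen so far
    iso_count = 0  # ungapped isolate positions seen so far
    for ref_char, iso_char in zip(aligned_ref, aligned_isolate):
        if ref_char != "-":
            ref_count += 1
        if iso_char != "-":
            iso_count += 1
        if ref_char != "-" and ref_count == ref_pos:
            if iso_char == "-":
                return None, "-"
            return iso_count, iso_char
    raise ValueError("ref_pos lies outside the aligned reference")
```

In a toy alignment of reference `ACG-TAC` against isolate `ACGGTAC`, the isolate carries a one-nucleotide insertion after position 3, so reference position 4 maps to isolate position 5, the same shift as in the 31-to-32 example.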
BRIEF DESCRIPTION OF THE DRAWINGS
[0130] FIG. 1 illustrates the locations of various primers used for
amplification of HBV and the resulting amplified fragments relative
to the HBV genome, represented as a line at the bottom of the
figure.
[0131] FIGS. 2A and 2B illustrate the genome (SEQ ID NO: 1) of an
exemplary HBV genotype B isolate comprising highlighted nucleotides
associated with the development of HCC.
[0132] FIGS. 3A and 3B illustrate the genome (SEQ ID NO:2) of an
exemplary HBV genotype C1 isolate comprising highlighted
nucleotides associated with the development of HCC.
[0133] FIGS. 4A and 4B illustrate the genome (SEQ ID NO:3) of an
exemplary HBV genotype C2 isolate comprising highlighted
nucleotides associated with the development of HCC.
DETAILED DESCRIPTION OF THE INVENTION
I. Introduction
[0134] The present invention is based on the discovery that certain
sequence variants of HBV are associated with the development of
hepatocellular carcinoma (HCC) in individuals infected with HBV.
Specifically, the presence of the following nucleotides in an HBV
genome is associated with the development of HCC: 31C, 53C, 312C,
799G, 961G, 1165T, 1499G, 1613A, 1762T, 1764A, 1899A, 2170C, 2170G,
2441C, 2525C, 2712C, 2712A, or 2712G. Accordingly, the invention
provides for methods of determining whether an individual infected
with HBV has a predisposition for HCC by detecting the nucleotide
sequence of the HBV variant infecting the individual. The invention
also provides for kits comprising reagents to detect any of the
specific variants associated with HCC and computer readable forms
for applying the methods of the invention.
II. Detecting HBV Variants Associated with HCC
[0135] Any number of methods may be used to determine the
nucleotides at the positions corresponding to nucleotides at
positions 31, 53, 312, 799, 961, 1165, 1499, 1613, 1762, 1764,
1899, 2170, 2441, 2525, and/or 2712 of SEQ ID NO:1 and/or other
positions as described herein.
[0136] In some embodiments, nucleotide sequencing is used to
determine the nucleotides at particular positions of the HBV
genome. Without intending to limit the invention, examples of
nucleotide sequencing include chain termination sequencing (see,
e.g., Sanger et al., Proc. Nat. Acad. Sci. USA 74:5463-5467 (1977);
Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd ed.
1989); Kriegler, Gene Transfer and Expression: A Laboratory Manual
(1990); and Current Protocols in Molecular Biology (Ausubel et al.,
eds., 1994)). Sequencing may be performed following amplification
of the HBV genome or a fragment thereof. Direct sequencing of PCR
generated amplicons by selectively incorporating boronated nuclease
resistant nucleotides into the amplicons during PCR and digestion
of the amplicons with a nuclease to produce sized template
fragments may also be performed (Porter et al., Nucleic Acids
Research 25(8):1611-1617 (1997)). Alternatively, microfluidic
techniques such as those described in U.S. Patent Publication No.
2003/0215862 may be used. See also U.S. Patent Publication No.
2003/0152996 describing alternate sequencing methods.
[0137] Specific probes that bind to nucleotides at particular
positions in the HBV genome may also be used to detect nucleotides
in the HBV genome. Probes that detect the particular nucleotides
associated with HCC may be used in a reverse hybridization assay
format using immobilized oligonucleotide probes present at distinct
locations on a solid support. More particularly, the Line Probe
Assay (LiPA) may be used. The LiPA is a reverse hybridization assay
using oligonucleotide probes immobilized as parallel lines on a
solid support strip. See, e.g., PCT Publication No. WO 94/12670. In
this assay, specific oligonucleotides may be immobilized at known
locations on membrane strips and hybridized under strictly
controlled conditions with the labeled PCR product. Different
probes may be designed such that each probe on the strip comprises
an HBV nucleotide sequence, or complement thereof, but contains a
different nucleotide at a particular position. Amplifying an HBV
genome, or fragment thereof, and hybridizing the amplification
product to one or more probes specific for a particular variant
will result in complete or at least preferential hybridization of
one of the probes to the product, thereby indicating which
nucleotide at the particular position is contained in the amplified
genome. Hybridization conditions using this assay are generally set
at a high stringency such that only one probe binds to the
amplification product. Exemplary conditions may include, e.g.,
standard hybridization and washing conditions (e.g., 1×SSC
buffer containing 0.1% sodium dodecyl sulfate at 62° C.).
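Reading a line probe strip amounts to deciding which of several probes, differing only at the position of interest, shows complete or preferential hybridization. The sketch below illustrates one way to make that call from per-probe signal intensities. The function name `call_nucleotide`, the intensity inputs, and the `min_ratio` threshold of 2.0 are assumptions chosen for illustration; the application does not specify a numerical cutoff for preferential hybridization between probes.

```python
def call_nucleotide(probe_signals, min_ratio=2.0):
    """Call the nucleotide whose probe hybridizes preferentially.

    `probe_signals` maps a nucleotide ("A", "C", "G", "T") to the signal of
    the probe carrying that nucleotide at the position of interest.
    Returns the nucleotide of the strongest probe if its signal is at least
    `min_ratio` times the next strongest, otherwise None (ambiguous).
    """
    if not probe_signals:
        return None
    ranked = sorted(probe_signals.items(), key=lambda item: item[1], reverse=True)
    best_nt, best_signal = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    if best_signal <= 0:
        return None
    if best_signal >= min_ratio * runner_up:
        return best_nt
    return None
```

Returning `None` for an ambiguous strip, rather than the strongest probe, keeps a mixed or weak result from being reported as a definite nucleotide.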
[0138] Amplification of HBV
[0139] The HBV genome or a portion thereof may be amplified before
the nucleotides at positions associated with HCC are determined. An
"amplification" refers to any chemical, including enzymatic,
reaction that results in increased copies of a template nucleic
acid sequence. Amplification reactions include polymerase chain
reaction (PCR) and ligase chain reaction (LCR) (see U.S. Pat. Nos.
4,683,195 and 4,683,202; PCR Protocols: A Guide to Methods and
Applications (Innis et al., eds., 1990)), strand displacement
amplification (SDA) (Walker et al., Nucleic Acids Res. 20(7):1691-6
(1992); Walker, PCR Methods Appl. 3(1):1-6 (1993)),
transcription-mediated amplification (Pfyffer et al., J. Clin.
Microbiol. 34:834-841 (1996); Vuorinen et al., J. Clin. Microbiol.
33:1856-1859 (1995)), nucleic acid sequence-based amplification
(NASBA) (Compton, Nature 350(6313):91-2 (1991)), rolling circle
amplification (RCA) (Lisby, Mol. Biotechnol. 12(1):75-99 (1999);
Hatch et al., Genet. Anal. 15(2):35-40 (1999)) and branched DNA
signal amplification (bDNA) (see, e.g., Iqbal et al., Mol. Cell.
Probes 13(4):315-320 (1999)).
[0140] Amplified portions of the HBV genome (optionally labeled)
may be hybridized to DNA comprising one or more HCC-associated
nucleotides, or a complement thereof, thereby allowing for
determination of the identity of nucleotides at a nucleotide
position of interest. Alternatively, the probes may detect
non-HCC-associated nucleotides, thereby allowing for detection of
HCC-associated HBV variants by detecting a lack of
hybridization.
[0141] In some embodiments, the amplified fragment of the genome
will comprise more than one HCC-associated nucleotide. Thus, in
some embodiments, the fragment will comprise any combination of
positions corresponding to nucleotides at positions 31, 53, 312,
799, 961, 1165, 1499, 1613, 1762, 1764, 1899, 2170, 2441, 2525,
and/or 2712 of SEQ ID NO: 1. In some embodiments, the fragment will
comprise positions corresponding to nucleotides 1165, 1762, 1764,
2525 and 2712 of SEQ ID NO:1. In some embodiments, the fragment
will comprise positions corresponding to nucleotides 31, 53, 312,
799, 961, 1499, 1613, 1899, 2170, and 2441 of SEQ ID NO:1.
[0142] In some cases, more than one fragment of HBV is amplified.
In these cases, the sum of all fragments amplified may comprise any
combination of positions corresponding to nucleotides at positions
31, 53, 312, 799, 961, 1165, 1499, 1613, 1762, 1764, 1899, 2170,
2441, 2525, and/or 2712 of SEQ ID NO:1. For example, one fragment
may comprise positions 31, 53, 312, 799, 961, 1165, 1499, 1613,
1762, 1764 and a second fragment may comprise positions 1899, 2170,
2441, 2525, or 2712. In some embodiments, the sum of all amplified
fragments will comprise positions corresponding to nucleotides
1165, 1762, 1764, 2525 and 2712 of SEQ ID NO: 1. In some
embodiments, the sum of all amplified fragments will comprise
positions corresponding to nucleotides 31, 53, 312, 799, 961, 1499,
1613, 1899, 2170, and 2441.
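By way of illustration only, the coverage requirement of the two preceding paragraphs can be checked with a short script. The following Python sketch is not part of the disclosed methods: the function names and the example fragment coordinates are hypothetical, and fragments are assumed to be given as inclusive start and end positions in the numbering of SEQ ID NO: 1, without wrapping around the circular genome.

```python
# Marker positions listed in the text (numbering relative to SEQ ID NO: 1).
HCC_POSITIONS = [31, 53, 312, 799, 961, 1165, 1499, 1613, 1762, 1764,
                 1899, 2170, 2441, 2525, 2712]


def covered_positions(fragments, positions=HCC_POSITIONS):
    """Return the marker positions that lie inside at least one fragment.

    `fragments` is a list of (start, end) pairs, inclusive, with
    start <= end (hypothetical coordinates; wrap-around is not handled).
    """
    return [p for p in positions
            if any(start <= p <= end for start, end in fragments)]


def missing_positions(fragments, positions=HCC_POSITIONS):
    """Return the marker positions that no fragment covers."""
    found = set(covered_positions(fragments, positions))
    return [p for p in positions if p not in found]


# Hypothetical two-fragment design, chosen only to exercise the functions.
example_fragments = [(1, 1800), (1850, 2800)]
print(missing_positions(example_fragments))
```

A design in which `missing_positions` returns an empty list covers every listed position with the sum of its fragments.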
[0143] In some embodiments, amplification and detection methods are
used in combination, and sometimes in the same reaction vessel, to
detect HBV polynucleotides using detectably-labeled probes that
distinguish between HCC-associated nucleotides and nucleotides not
associated with HCC. Binding of a probe to its complementary
hybridization sequence allows the user to quantify the accumulation
of a particular sequence without necessarily removing the contents
from the reaction vessel. In general, any type of label that allows
for the detection and differentiation of different probes can be
used according to the methods of the invention.
[0144] Accumulation of amplified product can be quantified by any
method known to those in the art. For instance, fluorescence from a
probe can be detected by measurement of light at a particular
frequency. Similarly, the accumulation of various chemical products
created via an enzymatic reaction linked to the probe can be
measured, for instance, by measuring absorbance of light at a
particular wavelength. In other embodiments, amplification
reactions can be quantified directly by blotting them onto a solid
support and hybridizing with a detectably-labeled nucleic acid
probe. Once unbound probe is washed away, the amount of probe can
be quantified by measuring radioactivity as is known to those of
skill in the art. Other variations of this technique employ the use
of chemiluminescence to detect hybridization events.
[0145] Measurement of amplification products can be performed after
the reaction has been completed or can be measured in "real time"
(i.e., as the reaction occurs). If measurement of accumulated
amplified product is performed after amplification is complete,
then detection reagents (e.g. probes) can be added after the
amplification reaction. Alternatively, probes can be added to the
reaction prior to or during the amplification reaction, thus allowing
for measurement of the amplified products either after completion
of amplification or in real time. Real time measurements can be
particularly useful because they allow for measurement at any given
cycle of the reaction and thus provide more information about
accumulation of products throughout the reaction. For measurement
of amplification product in real time, fluorescent probes are often
used.
[0146] One amplification assay utilizing a FRET pair to detect an
amplification product is the "TaqMan®" assay described in
Gelfand et al. U.S. Pat. No. 5,210,015, and Livak et al. U.S. Pat.
No. 5,538,848. The probe is a single-stranded oligonucleotide
labeled with a FRET pair. In a TaqMan® assay, a DNA polymerase
releases single or multiple nucleotides by cleavage of the
oligonucleotide probe when it is hybridized to a target strand.
That release provides a way to separate the quencher label and the
fluorophore label of the FRET pair.
[0147] Another type of nucleic acid hybridization probe assay
utilizing FRET pairs is described in Tyagi et al. U.S. Pat. No.
5,925,517, which utilizes labeled oligonucleotide probes, which are
referred to as "molecular beacons." See Tyagi, S. and Kramer, F.
R., Nature Biotechnology 14: 303-308 (1996). A molecular beacon
probe is an oligonucleotide whose end regions hybridize with one
another in the absence of target but are separated if the central
portion of the probe hybridizes to its target sequence. The
rigidity of the probe-target hybrid precludes the simultaneous
existence of both the probe-target hybrid and the intramolecular
hybrid formed by the end regions. Consequently, the probe undergoes
a conformational change in which the smaller hybrid formed by the
end regions disassociates, and the end regions are separated from
each other by the rigid probe-target hybrid. For molecular beacon
probes, a central target-recognition sequence is flanked by arms
that hybridize to one another when the probe is not hybridized to a
target strand, forming a "hairpin" structure, in which the
target-recognition sequence (which is commonly referred to as the
"probe sequence") is in the single-stranded loop of the hairpin
structure, and the arm sequences form a double-stranded stem
hybrid. When the probe hybridizes to a target, that is, when the
target-recognition sequence hybridizes to a complementary target
sequence, a relatively rigid helix is formed, causing the stem
hybrid to unwind and forcing the arms apart.
[0148] One of skill will recognize that a large number of different
fluorophores can be used to label probes useful in the invention.
Some fluorophores useful in the methods and composition of the
invention include: fluorescein, fluorescein isothiocyanate (FITC),
carboxy tetrachloro fluorescein (TET), NHS-fluorescein, 5 and/or
6-carboxy fluorescein (FAM), 5-(or 6-) iodoacetamidofluorescein,
5-{[2(and 3)-5-(Acetylmercapto)-succinyl]amino}fluorescein
(SAMSA-fluorescein), and other fluorescein derivatives, rhodamine,
Lissamine rhodamine B sulfonyl chloride, Texas red sulfonyl
chloride, 5 and/or 6 carboxy rhodamine (ROX) and other rhodamine
derivatives, coumarin, 7-amino-methyl-coumarin,
7-Amino-4-methylcoumarin-3-acetic acid (AMCA), and other coumarin
derivatives, BODIPY™ fluorophores, Cascade Blue™ fluorophores
such as 8-methoxypyrene-1,3,6-trisulfonic acid trisodium salt,
Lucifer yellow fluorophores such as
3,6-Disulfonate-4-amino-naphthalimide, phycobiliprotein
derivatives, Alexa fluor dyes (available from Molecular Probes,
Eugene, Oreg.) and other fluorophores known to those of skill in
the art. For a general listing of useful fluorophores, see
Hermanson, G. T., BIOCONJUGATE TECHNIQUES (Academic Press, San
Diego, 1996). Thus, each probe used in a reaction may fluoresce at
a different wavelength and can be individually detected without
interference from the other probes. This is useful, for example, if
probes that detect different nucleotides at a particular position
are used in a reaction. Thus, for example, one wavelength may
indicate binding of a probe that detects 31T while a probe
comprising a label with a different wavelength will detect 31C.
[0149] Preparing HBV from a Test Sample
[0150] The presence or amount of HBV nucleic acids in a test sample
can be determined by amplifying the target regions within the HBV
gene. Thus, any liquid or solid material believed to comprise HBV
nucleic acids can be an appropriate sample. Preferred sample
tissues include plasma, serum, whole blood, blood cells, lymphatic
fluid, cerebrospinal fluid, synovial fluid and others.
[0151] As used herein, the term "test sample" refers to any liquid
or solid material believed to comprise HBV nucleic acids. A test
sample may be obtained from a biological source, such as cells in
culture or a tissue sample from an animal, e.g., a human. Sample
tissues of the instant invention may include, but are not limited
to, plasma, serum, whole blood, blood cells, lymphatic fluid,
cerebrospinal fluid, synovial fluid, urine, saliva, and skin or
other organs (e.g. liver biopsy material).
[0152] Such samples will often be taken from patients suspected of
having HBV infection, or having any of the wide spectrum of liver
diseases related to HBV infection.
[0153] Nucleic acids representing the HBV gene of interest may be
extracted from tissue samples. Various commercial nucleic acid
purification kits, such as the QIAamp 96 Virus BioRobot Kit and
Qiagen's BioRobot 9604, are known to the skilled artisan and can be
used to isolate HBV nucleic acids from samples.
III. Determination of HBV Genotype
[0154] The present methods may also involve a determination of the
genotype of HBV in an individual. For example, particular
nucleotide variants identified herein may have a stronger
predisposition to cause HCC if the variants are found in one
genotype than in another. In this context, "genotype" refers to any
of the at least 8 genotypes of HBV, designated A to H, that have
been deduced from genome comparisons.
See, e.g., Westland C. Hepatology 36: 2-8 (2002);
Borchani-Chabchoub I, et al., Microbes Infect 2: 607-12 (2000);
Grandjacques C, et al., J Hepatol 33: 430-9 (2000); Kato H, et al.,
J Virol Methods 98: 153-9 (2001); Ashton-Rickardt P G, et al., J
Med Virol 29: 204-14 (1989). Thus, by detecting nucleotides at
particular positions identified to occur only in a specific
genotype, one may determine the genotype of HBV. Of course, other
methods such as serological methods may also be used.
[0155] In some embodiments, the presence or absence of the B or C
genotype of HBV will be determined. In some embodiments, genotype B
comprises 2733C, 1856C, 1009T and 2892T. Further, the subtype of a
genotype may also be determined. For example, in some embodiments,
subtype C1 is characterized by 2733A, 1856C, 1009T and 2892T. In
some embodiments, subtype C2 is identified by 2733C, 1856T, 1009T
and 2892T. In some embodiments, subtype C3 is identified by 2733C,
1856C, 1009C and 2892T. The details are shown in the table
below:
TABLE-US-00001
                 2733   1856   1009   2892
B                C      C      T      T
C1               A      C      T      T
C2               C      T      T      T
C3               C      C      C      T
Minor-cluster    C      C      C      C
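For illustration only, the table above can be read as a lookup from four nucleotide positions to a genotype or subtype. The following Python sketch is one hypothetical way to encode it; the names `GENOTYPE_SIGNATURES` and `classify_genotype` are assumptions made for the example, and the input is assumed to be a mapping from position (in the numbering of SEQ ID NO: 1) to the observed base.

```python
# Signature nucleotides per genotype/subtype, transcribed from the table.
GENOTYPE_SIGNATURES = {
    "B":             {2733: "C", 1856: "C", 1009: "T", 2892: "T"},
    "C1":            {2733: "A", 1856: "C", 1009: "T", 2892: "T"},
    "C2":            {2733: "C", 1856: "T", 1009: "T", 2892: "T"},
    "C3":            {2733: "C", 1856: "C", 1009: "C", 2892: "T"},
    "Minor-cluster": {2733: "C", 1856: "C", 1009: "C", 2892: "C"},
}


def classify_genotype(nucleotides):
    """Return the first table row matching all four positions, else None.

    `nucleotides` maps position -> base, e.g. {2733: "A", 1856: "C", ...}.
    """
    for name, signature in GENOTYPE_SIGNATURES.items():
        if all(nucleotides.get(pos, "").upper() == base
               for pos, base in signature.items()):
            return name
    return None  # pattern not listed in the table


print(classify_genotype({2733: "A", 1856: "C", 1009: "T", 2892: "T"}))
```

Patterns that do not appear in the table return `None` rather than a guess, so such isolates can be typed by another method.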
[0156] The nucleotides associated with a particular genotype may be
detected by any method useful for detecting nucleotide sequences,
including all of those described herein (e.g., amplification,
nucleotide sequencing and/or probes, etc.).
IV. Comparing Nucleotides of HBV with Nucleotides Associated with
HCC
[0157] Nucleotide sequence information regarding an isolate from an
individual may be compared to nucleotides associated with HCC by
any method.
[0158] Where a nucleotide sequence of the isolate is determined,
the sequence may be aligned with SEQ ID NO: 1 or another HBV
genomic sequence to determine the position of the specific
nucleotides of interest. Methods of alignment of sequences for
comparison are well-known in the art. Optimal alignment of
sequences for comparison can be conducted, e.g., by the local
homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482
(1981), by the homology alignment algorithm of Needleman &
Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity
method of Pearson & Lipman, Proc. Nat'l Acad. Sci. USA 85:2444
(1988), by computerized implementations of these algorithms (GAP,
BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software
Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.),
or by manual alignment and visual inspection (see, e.g., Current
Protocols in Molecular Biology (Ausubel et al., eds. 1995
supplement)).
[0159] Examples of algorithms that are suitable for aligning
sequences and determining percent sequence identity and sequence
similarity are the BLAST and BLAST 2.0 algorithms, which are
described in Altschul et al., J. Mol. Biol. 215:403-410 (1990) and
Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997),
respectively. BLAST and BLAST 2.0 may be used, with the parameters
described herein, to determine an optimal alignment. Software for
performing BLAST analyses is publicly available through the
National Center for Biotechnology Information. This algorithm
involves first identifying high scoring sequence pairs (HSPs) by
identifying short words of length W in the query sequence, which
either match or satisfy some positive-valued threshold score T when
aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul et
al., supra). These initial neighborhood word hits act as seeds for
initiating searches to find longer HSPs containing them. The word
hits are extended in both directions along each sequence for as far
as the cumulative alignment score can be increased. Cumulative
scores are calculated using, for nucleotide sequences, the
parameters M (reward score for a pair of matching residues; always
>0) and N (penalty score for mismatching residues; always
<0). For amino acid sequences, a scoring matrix is used to
calculate the cumulative score. Extension of the word hits in each
direction is halted when: the cumulative alignment score falls off
by the quantity X from its maximum achieved value; the cumulative
score goes to zero or below, due to the accumulation of one or more
negative-scoring residue alignments; or the end of either sequence
is reached. The BLAST algorithm parameters W, T, and X determine
the sensitivity and speed of the alignment. The BLASTN program (for
nucleotide sequences) uses as defaults a wordlength (W) of 11, an
expectation (E) of 10, M=5, N=-4 and a comparison of both strands.
It is generally useful to turn off the complexity filter.
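The seed-and-extend procedure described above may be sketched as follows. This is a simplified, ungapped illustration in Python and not the BLAST implementation itself: it uses exact word matches only (no neighborhood threshold T), extends to the right only, and the function names and the default drop-off value `x=20` are assumptions made for the example. The scores M=5 and N=-4 are the BLASTN defaults stated in the text.

```python
def find_seeds(query, subject, w=11):
    """Return (query_start, subject_start) for every exact word match of length w."""
    index = {}
    for i in range(len(subject) - w + 1):
        index.setdefault(subject[i:i + w], []).append(i)
    seeds = []
    for j in range(len(query) - w + 1):
        for i in index.get(query[j:j + w], []):
            seeds.append((j, i))
    return seeds


def extend_right(query, subject, q_pos, s_pos, start_score, m=5, n=-4, x=20):
    """Extend a hit to the right without gaps.

    q_pos and s_pos are the first positions after the seed. Extension
    stops when the score drops by x from its maximum, when the score
    reaches zero or below, or when either sequence ends. Returns
    (best_score, number_of_residues_kept).
    """
    score = best = start_score
    best_len = 0
    k = 0
    while q_pos + k < len(query) and s_pos + k < len(subject):
        score += m if query[q_pos + k] == subject[s_pos + k] else n
        k += 1
        if score > best:
            best, best_len = score, k
        if best - score >= x or score <= 0:
            break
    return best, best_len


# Toy sequences (hypothetical) with a short word length for readability.
query, subject = "GATTACAGATTC", "CCGATTACAGGGTC"
seeds = find_seeds(query, subject, w=7)
q0, s0 = seeds[0]
print(seeds, extend_right(query, subject, q0 + 7, s0 + 7, start_score=7 * 5))
```

A smaller drop-off value stops the extension sooner and so trades sensitivity for speed, which is the role the text assigns to the parameter X.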
[0160] Positions of nucleotides of interest are provided throughout
this application with reference to the first C of the first EcoR1
cleavage site (GAACTCC) that generally occurs in the HBV genome. The
first "C" is position 1 of SEQ ID NO: 1. Thus, following alignment
of a sequence of interest with SEQ ID NO: 1, a particular
nucleotide of the sequence of interest may be assigned a position
relative to the corresponding position in the alignment with SEQ ID
NO: 1.
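As an illustration of this position assignment, the following Python sketch reads the base of an isolate at a given position of SEQ ID NO: 1 from a pairwise alignment. It is a minimal example under stated assumptions: the two sequences have already been aligned by any of the methods mentioned above, gaps are written as "-", and the function name and toy sequences are hypothetical.

```python
def base_at_reference_position(aligned_ref, aligned_query, ref_position, gap="-"):
    """Return the query base aligned to the 1-based reference position.

    Returns None if the query has a gap (deletion) at that position.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    seen = 0  # number of reference residues passed so far
    for ref_base, query_base in zip(aligned_ref, aligned_query):
        if ref_base == gap:
            continue  # insertion in the query; no reference position here
        seen += 1
        if seen == ref_position:
            return None if query_base == gap else query_base.upper()
    raise ValueError("ref_position lies beyond the aligned reference")


# Toy alignment (hypothetical): one insertion and one deletion in the query.
ref_aln = "CTCC-ACC"
qry_aln = "CT-CGATC"
print(base_at_reference_position(ref_aln, qry_aln, 6))
```

Counting only non-gap reference columns keeps the numbering tied to SEQ ID NO: 1 even when the isolate carries insertions or deletions.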
[0161] The presence of any of the following nucleotides is
indicative of a pre-disposition for HCC: 31C, 53C, 312C, 799G,
961G, 1165T, 1499G, 1613A, 1762T, 1764A, 1899A, 2170C, 2170G,
2441C, 2525C, 2712C, 2712A, or 2712G. While those of skill in the
art will recognize that any number of algorithms may be useful for
predicting a predisposition for developing HCC, as described in the
Example, particularly good sensitivity and specificity may be
obtained using the following algorithm:
[0162] For genotype B HBV, the presence of:
[0163] 1762T and 1764A and 2712A; or
[0164] 1762T and 1764A and 2712C; or
[0165] 1762T and 1764A and 2712G; or
[0166] 1762T and 1764A and 2712T and 2525C; or
[0167] 1762A and 1764G and 1165T, indicates a pre-disposition for
HCC.
[0168] For genotype C1 HBV, the presence of:
[0169] 31C; and/or
[0170] 53C; and/or
[0171] 1499G, indicates a pre-disposition for HCC.
[0172] For genotype C2 HBV, the presence of:
[0173] 2170C; and/or
[0174] 2170G; and/or
[0175] 2441C; and/or
[0176] 799G, indicates a pre-disposition for HCC.
[0177] For genotype C3 HBV, the presence of:
[0178] 312C; and/or
[0179] 961G; and/or
[0180] 1613A; and/or
[0181] 1899A, indicates a pre-disposition for HCC.
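The genotype-specific rules listed above may, for example, be expressed in computer readable form as follows. This Python sketch is illustrative only: the function and variable names are hypothetical, the genotype is assumed to have been determined beforehand, and the input is assumed to be a mapping from position (in the numbering of SEQ ID NO: 1) to the observed base.

```python
# Single-position markers for the C subtypes, as listed in the text.
C_SUBTYPE_MARKERS = {
    "C1": [(31, "C"), (53, "C"), (1499, "G")],
    "C2": [(2170, "C"), (2170, "G"), (2441, "C"), (799, "G")],
    "C3": [(312, "C"), (961, "G"), (1613, "A"), (1899, "A")],
}


def predisposed_to_hcc(genotype, nucleotides):
    """Apply the rule set for the given genotype; return True or False."""
    def has(position, base):
        return nucleotides.get(position, "").upper() == base

    if genotype == "B":
        if has(1762, "T") and has(1764, "A"):
            # 2712A, 2712C or 2712G together with the 1762T/1764A pair
            if has(2712, "A") or has(2712, "C") or has(2712, "G"):
                return True
            # 2712T requires 2525C in addition
            if has(2712, "T") and has(2525, "C"):
                return True
        return has(1762, "A") and has(1764, "G") and has(1165, "T")
    if genotype in C_SUBTYPE_MARKERS:
        return any(has(pos, base) for pos, base in C_SUBTYPE_MARKERS[genotype])
    raise ValueError("no rule set listed for genotype %r" % genotype)


print(predisposed_to_hcc("B", {1762: "T", 1764: "A", 2712: "C"}))
```

Genotypes for which no rules are listed raise an error instead of returning a negative result, so that an untyped or unsupported isolate is not mistaken for a low-risk one.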
[0182] In some embodiments of the invention, it is useful to apply
the above-listed algorithms in a computer readable form. The code
for performing any of the functions described herein can be
executed by digital computers and may be stored on any suitable
computer readable media. Examples of computer readable media
include magnetic, electronic, or optical disks, tapes, sticks,
chips, etc. The code for performing any of the functions described
herein may also be written in any suitable computer programming
language including, for example, Fortran, C, C++, etc. The
graphical user interfaces and functions underlying the graphical
user interfaces can be created using an object oriented programming
language such as Java.
V. Benefits of Identifying Individuals Pre-Disposed for HCC
[0183] The conventional methods of surveillance for HCC are testing
an infected person's serum alfa-fetoprotein levels (see, e.g., Liaw
Y F et al., Gastroenterology, 30:263-267 (1986); Colombo M. et al.,
N. Engl. J. Med., 325:675-680 (1991); Oka H. et al., Hepatology
12:680-687 (1990)) or subjecting the person to abdominal
ultrasound scanning. Another method for diagnosis of HCC is
detecting des-gamma-carboxy prothrombin (Chan C Y et al., J Hepatol.
13:21-24 (1991); Weitz I C et al., Hepatology 18:990-997 (1993)).
Another marker for HCC is TGF-β1. See, e.g., US Patent
Publication No. 2004/0121414.
[0184] However, without information regarding which patients may be
pre-disposed for HCC, it is necessary to screen every person
infected with HBV on a regular basis to catch HCC as early as
possible. Unfortunately, given the large number of people infected
with HBV, as well as the finite resources available to screen
individuals, it is impossible to perform all of the necessary
screens. The present invention addresses this problem, by
indicating which individuals should have intense surveillance for
the initial signs of HCC and which individuals do not require such
intense surveillance. Thus, the present invention provides for
detecting those individuals carrying HBV that is pre-disposed to
cause HCC and then further testing those individuals on a regular
basis for the presence of HCC and optionally, only rarely or never
testing those individuals lacking HCC-associated HBV variants.
VI. Kits
[0185] Kits comprising the components needed in the methods
(typically in an unmixed form), together with kit components such as
packaging materials, instructions for using the components and/or
the methods, and one or more containers (reaction tubes, columns,
etc.) for holding the components, are a feature of the present
invention.
Kits of the present invention may contain reagents for detecting
any one or more of the following nucleotide variants in an HBV
genome: 31C, 53C, 312C, 799G, 961G, 1165T, 1499G, 1613A, 1762T,
1764A, 1899A, 2170C, 2170G, 2441C, 2525C, 2712C, 2712A, and/or
2712G. For example, the kits of the invention may comprise
combinations of primers and/or probes as described herein for the
detection of nucleotide variants associated with HCC. Optionally,
the kits may contain reagents for amplification, including but not
limited to, thermostable polymerases such as Taq polymerase,
nucleotides, buffers, etc.
EXAMPLE
[0186] Our goal was to discover genetic markers of HCC cases from
HBV DNA sequences. In other words, we built a classification
model based on HBV DNA to predict cancer. Several classification
models, including Naive Bayes, Decision Tree, Neural Networks, and
Rule Learning Using Evolutionary Algorithm, were applied to
classify the DNA datasets. The experimental results showed that the
Rule Learning Using Evolutionary Algorithm has the best
performance. In this section, we present the results of applying
the Rule Learning Using Evolutionary Algorithm to classify the HBV
DNA data in liver cancer (HCC) and normal cases.
Experimental Methodology
[0187] For each experiment, 90% of the samples are selected randomly
as the training set and the remaining 10% form the testing set. For
each dataset, the experiment is repeated 10 times.
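The sampling scheme of this paragraph can be sketched in Python as follows. The sketch is illustrative and is not the code used in the study: the function names, the rounding of the test-set size and the fixed random seed are assumptions made for the example.

```python
import random


def random_split(samples, test_fraction=0.10, rng=None):
    """Shuffle the samples and split them into (training, testing) lists."""
    rng = rng or random.Random()
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def repeated_splits(samples, repeats=10, test_fraction=0.10, seed=0):
    """Return `repeats` independent random 90%/10% splits."""
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    return [random_split(samples, test_fraction, rng) for _ in range(repeats)]


# Hypothetical sample identifiers standing in for one dataset of 86 samples.
splits = repeated_splits(range(86))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Because each repetition draws a new random split, a sample may appear in the testing set of more than one repetition; this differs from 10-fold cross-validation, in which each sample is tested exactly once.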
[0188] In medical diagnosis and disease prediction problems, the
performance of an algorithm or model is judged not only by accuracy
but also by sensitivity and specificity. Sensitivity is generally
more important than specificity and accuracy in medical diagnoses
because doctors and patients prefer not to miss any patients with
diseases. Extra diagnosis and tests can be performed to confirm
a positive prediction and remove initial false positives.
[0189] We evaluated our model on all three of these measures:

$$\text{Accuracy} = \frac{\text{True Positive} + \text{True Negative}}{\text{True Positive} + \text{True Negative} + \text{False Positive} + \text{False Negative}}$$

$$\text{Sensitivity} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}}$$

$$\text{Specificity} = \frac{\text{True Negative}}{\text{True Negative} + \text{False Positive}}$$
[0190] The true positive is the number of all the patients with the
disease and a positive test result, whereas the true negative is
the number of all the patients without the disease and a negative
test result. The false positive is the number of all the patients
without the disease but a positive test result, whereas the false
negative is the number of all the patients with the disease but a
negative test result. In medical diagnosis, a false negative is the
most undesirable case.
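The three measures can be computed directly from the four counts defined above. The following Python sketch is illustrative; the function name and the example counts are hypothetical and are not data from the study, and each denominator is assumed to be non-zero.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Return accuracy, sensitivity and specificity from the four counts.

    tp, tn, fp, fn are the numbers of true positives, true negatives,
    false positives and false negatives.
    """
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts chosen only to exercise the function.
print(diagnostic_metrics(tp=30, tn=40, fp=20, fn=10))
```

Reducing the number of false negatives raises sensitivity without changing specificity, which is why the text treats the false negative as the least desirable outcome.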
Results
Data Description
[0191] Genotype B and genotype C data were separated for analysis.
The proportion of patients in each genotype or C subtype is shown
in Table 2. "CON" refers to "control," i.e., no HCC.
TABLE-US-00002
TABLE 2
Datasets   CON   HCC   Total   %
B          49    37    86      43.8776
C1         10    16    26      13.2653
C2         18    22    40      20.4082
C3         19    25    44      22.4490
Total      96    100   196     100
Genotype B
[0192] Table 3 shows the details of the markers for HBV genotype
B.
TABLE-US-00003
TABLE 3 HBV markers for HCC of genotype B
Markers       Normal value   HCC-related value
1762, 1764    AG             TA
1165          C              T
2712          T              C (A, G)
2525          A, T           C
[0193] The classification rules based on the applied data cleansing
process for genotype B are as follows:
If 1762A and 1764G and 1165T are present in genotype B, then HCC
is likely to occur.
If 1762T and 1764A and 2712A, 2712C or 2712G are present in
genotype B, then HCC is likely to occur.
If 1762T and 1764A and 2712T and 2525C are present in genotype B,
then HCC is likely to occur.
[0194] The experimental results for the genotype B dataset are
shown in Table 4.
TABLE-US-00004
TABLE 4 Results of genotype B HBV dataset to predict HCC
Results       Training set (STD)   Testing set (STD)
Sensitivity   0.75029 (0.05361)    0.75 (0.16667)
Specificity   0.68 (0.06215)       0.66 (0.13499)
Accuracy      0.7093 (0.02615)     0.70 (0.07499)
C1 Subgroup
[0195] Table 5 shows the details of the markers for C1
subgroup.
TABLE-US-00005
TABLE 5 HCC related markers for C1 subgroup
Markers   Normal value   HCC-related value
31        T              C
53        T              C
1499      A              G
[0196] The classification rules based on the applied data cleansing
process for C1 subgroup are as follows:
If 31C or 53C or 1499G is present in genotype C1, then HCC is
likely to occur.
[0197] The experimental results for the C1 subgroup are shown in
Table 6.
TABLE-US-00006
TABLE 6 Results of genotype C1 HBV dataset to predict HCC
Results       Training set (STD)   Testing set (STD)
Sensitivity   0.80769 (0.04054)    0.75 (0.26252)
Specificity   0.7875 (0.06038)     0.7 (0.48305)
Accuracy      0.8 (0.03012)        0.7333 (0.21082)
C2 Subgroup
[0198] Table 7 shows the details of the markers for C2
subgroup.
TABLE-US-00007
TABLE 7 HCC related markers for C2 subgroup
Markers   Normal value   HCC-related value
2170      T              C, G
2441      T              C
799       A              G
[0199] The classification rules based on the applied data cleansing
process for C2 subgroup are as follows:
If 2170C or 2170G or 2441C or 799G is present in genotype C2, then
HCC is likely to occur.
[0200] The experimental results for the C2 subgroup are shown in
Table 8.
TABLE-US-00008
TABLE 8 Results of C2 genotype dataset to predict HCC
Results       Training set (STD)   Testing set (STD)
Sensitivity   0.84706 (0.06323)    0.85 (0.24152)
Specificity   0.97857 (0.0345)     1 (0.00000)
Accuracy      0.90645 (0.0355)     0.925 (0.12076)
[0201] The classification rules based on the applied data cleansing
process for C3 subgroup are as follows:
If 312C or 961G or 1613A or 1899A is present in genotype C3, then
HCC is likely to occur.
[0202] The experimental results for the C3 subgroup are shown in
Table 9.
TABLE-US-00009
TABLE 9 Results of C3 genotype dataset to predict HCC
Results       Training set (STD)   Testing set (STD)
Sensitivity   0.75 (0.0044)        0.77 (0.22)
Specificity   0.81 (0.0040)        0.80 (0.26)
Accuracy      0.77 (0.0024)        0.78 (0.18)
Patients and Methods
Patients
[0203] Residual serum samples of one hundred chronic hepatitis B
patients suffering from hepatocellular carcinoma (HCC) and one
hundred age-matched control patients who had chronic hepatitis B
but without hepatocellular carcinoma were studied. Consecutive
patients with confirmed diagnosis of HCC who had positive HBsAg
attending the Joint Hepatoma Clinic, Prince of Wales Hospital from
July 1999 to 2001 were included. A confirmed diagnosis of HCC was
defined by either histology or radiological evidence of a hepatic
mass with a serum alpha-fetoprotein (AFP) of 500 µg/l or more.
Patients who had positive anti-HCV or a history of alcoholism were
excluded. Informed consent to provide serum samples for experimental
study was routinely obtained from patients in the Joint Hepatoma
Clinic. Relevant clinical information of enrolled patients was
collected retrospectively.
[0204] Age-matched control patients were identified from the cohort
of chronic hepatitis B patients prospectively followed up in the
Hepatitis Clinic since December 1997. Patients who had other
possible causes of hepatitis or liver cirrhosis including
autoimmune liver disease, primary biliary cirrhosis, Wilson's
disease and hemochromatosis were also excluded. At initial
presentation, abdominal ultrasounds were performed to exclude any
pre-existing HCC. Patients were prospectively followed up every 6
months, or more frequently if clinically indicated, with
monitoring of liver biochemistry, HBeAg and anti-HBe status as well
as alfa-fetoprotein levels. Abdominal ultrasounds, computerized
tomography, hepatic angiogram and/or liver biopsy were performed
whenever the alfa-fetoprotein level was higher than 50 µg/l or on a
rising trend over 20 µg/l to confirm the diagnosis of HCC. For
patients with normal alfa-fetoprotein levels, abdominal ultrasound
was performed every 1-2 years.
Laboratory Method
Extraction of DNA
[0205] Serum viral DNA was extracted using QIAamp DNA Blood Mini
Kit (Qiagen, CA, USA) according to the manufacturer's
instructions.
Amplification of HBV DNA
[0206] To obtain the full-length HBV DNA sequence, a long distance
semi-nested PCR was performed to amplify three overlapping
fragments (A, B and C). Relative positions of these PCR fragments
to the map of HBV genome are shown in FIG. 1 and the nucleotide
sequences of the PCR primers can be found in Table 1.
TABLE-US-00010
TABLE 1 The sequences of primers used for amplifying and sequencing the HBV DNA

Name   Nucleotide sequence (5'→3')     Nt positions   Direction

Primers used for PCR (SEQ ID NOS: 4-12)
P1     TTTTTCACCTCTGCCTAATCA           1821-1841      sense
P2     CCCTAGAAAATTGAGAGAAGTC          262-283        antisense
P3^a   CCACTGCATGGCCTGAGGATG           3193-3213      antisense
P4     GCCTCATTTTGTGGGTCACCATA         2801-2824      sense
P5     TTCTTTGACATACTTTCCA             979-997        antisense
P6^a   TTGGGGTGGAGCCCTCAGGCT           3070-3090      sense
P7^a   TTGGCCAAAATTCGCAGTC             300-318        sense
P8^a   CCCCACTGTTTGGCTTTCAG            714-734        sense
P9^a   GTTGATAAGATAGGGGCATTTGGTGG      2299-2325      antisense

Primers used for sequencing (SEQ ID NOS: 13-21)
S1     CTCCGGAACATTGTTCACCT            2031-2050      sense
S2     AAGGTGGGAAACTTTACTGGGC          2469-2490      sense
S3     GCTGACGCAACCCCCACTGG            1186-1205      sense
S4     TCGCATGGAGACCACCGTGA            1604-1623      sense
S5     GGCAAAAACGAGAGTAACTC            1940-1959      antisense
S6     GGGTCGTCCGCGGGATTCAG            1441-1460      antisense
S7     GACATACTTTCCAATCAATAGG          970-991        antisense
S8     GAAGATGAGGCATAGCAGCAGG          411-433        antisense
S9     CATGCTGTAGCTCTTGTTCC            2831-2850      antisense

^a These primers were also used for sequencing.
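Because the HBV genome is circular, a fragment whose sense primer lies at a higher position number than its antisense primer wraps past the origin. The following Python sketch estimates the span of each primer pair from the positions in Table 1. It rests on two assumptions that are not stated in the table itself: that the positions use the numbering of SEQ ID NO: 1, and that the genome length is 3215 nucleotides, the length given for SEQ ID NO: 1 in the sequence listing. The function name is hypothetical.

```python
GENOME_LENGTH = 3215  # assumed: length given for SEQ ID NO: 1


def circular_span(sense_start, antisense_end, genome_length=GENOME_LENGTH):
    """Length from the first base of the sense primer to the last base
    of the antisense primer, allowing wrap-around on a circular genome."""
    return (antisense_end - sense_start) % genome_length + 1


# (sense primer start, antisense primer end) taken from Table 1.
primer_pairs = {
    "A, first round (P1/P2)": (1821, 283),
    "A, semi-nested (P1/P3)": (1821, 3213),
    "B, first round (P4/P5)": (2801, 997),
    "B, semi-nested (P6/P5)": (3070, 997),
    "C, first round (P7/P9)": (300, 2325),
    "C, semi-nested (P8/P9)": (714, 2325),
}
for name, (start, end) in primer_pairs.items():
    print(name, circular_span(start, end))
```

Under these assumptions each semi-nested product is shorter than the corresponding first-round product, as expected when one primer of the pair is moved inward.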
Fragment A
[0207] When amplifying fragment A, 5 µl of the extracted DNA was
subjected to PCR in the presence of 50 mM KCl, 1.5 mM MgCl₂,
10 mM Tris-HCl, 200 µM of each dNTP, 1.25 units Taq DNA
polymerase (Amersham Biosciences), 1.5 units Pfu DNA polymerase
(Promega), and 10 pmol each of primers P1 and P2 in a final
volume of 50 µl. PCR was carried out with a 5-min initial
denaturation at 95 °C, followed by 10 cycles of amplification
(94 °C, 36 sec; 60 °C, 36 sec; 72 °C, 2.5 min), then 30 cycles of
amplification (94 °C, 36 sec; 50 °C, 36 sec; 72 °C, 2.5 min) and a
7-min final extension at 72 °C.
[0208] The PCR product was further amplified in a semi-nested PCR.
One microliter of the product was subjected to PCR in the presence
of 50 mM KCl, 1.5 mM MgCl₂, 10 mM Tris-HCl, 200 µM of each
dNTP, 2.5 units Taq DNA polymerase (Amersham Biosciences) and 10
pmol each of primers P1 and P3 in a final volume of 50 µl.
PCR was carried out with a 5-min initial denaturation at
95 °C, followed by 10 cycles of amplification (94 °C, 36 sec;
60 °C, 36 sec; 72 °C, 2 min), then 30 cycles of amplification
(94 °C, 36 sec; 52 °C, 36 sec; 72 °C, 2 min) and a 7-min final
extension at 72 °C. Finally, the quality and quantity of the PCR
product were examined on a 1.0% agarose/EtBr gel run in 1× TBE
buffer.
Fragment B
[0209] When amplifying fragment B, 5 µl of the extracted DNA was
subjected to PCR in the presence of 50 mM KCl, 1.5 mM MgCl₂,
10 mM Tris-HCl, 200 µM of each dNTP, 1.25 units Taq DNA
polymerase (Amersham Biosciences), 1.5 units Pfu DNA polymerase
(Promega), and 10 pmol each of primers P4 and P5 in a final
volume of 50 µl. PCR was carried out with a 5-min initial
denaturation at 95 °C, followed by 10 cycles of amplification
(94 °C, 36 sec; 60 °C, 36 sec; 72 °C, 90 sec), then 30 cycles of
amplification (94 °C, 36 sec; 50 °C, 36 sec; 72 °C, 90 sec) and a
7-min final extension at 72 °C.
[0210] The PCR product was further amplified in a semi-nested PCR.
One microliter of the product was subjected to PCR in the presence
of 50 mM KCl, 1.5 mM MgCl₂, 10 mM Tris-HCl, 200 µM of each
dNTP, 2.5 units Taq DNA polymerase (Amersham Biosciences) and 10
pmol each of primers P5 and P6 in a final volume of 50 µl.
PCR was carried out with a 5-min initial denaturation at
95 °C, followed by 10 cycles of amplification (94 °C, 36 sec;
60 °C, 36 sec; 72 °C, 90 sec), then 30 cycles of amplification
(94 °C, 36 sec; 52 °C, 36 sec; 72 °C, 90 sec) and a 7-min final
extension at 72 °C. Finally, the quality and quantity of the PCR
product were examined on a 1.0% agarose/EtBr gel run in 1× TBE
buffer.
Fragment C
[0211] When amplifying fragment C, 5 µl of the extracted DNA was
subjected to PCR in the presence of 50 mM KCl, 1.5 mM MgCl₂,
10 mM Tris-HCl, 200 µM of each dNTP, 1.25 units Taq DNA
polymerase (Amersham Biosciences), 1.5 units Pfu DNA polymerase
(Promega), and 10 pmol each of primers P7 and P9 in a final
volume of 50 µl. PCR was carried out with a 5-min initial
denaturation at 95 °C, followed by 10 cycles of amplification
(94 °C, 36 sec; 60 °C, 36 sec; 72 °C, 2 min and 15 sec), then 30
cycles of amplification (94 °C, 36 sec; 50 °C, 36 sec; 72 °C, 2 min
and 15 sec) and a 7-min final extension at 72 °C.
[0212] The PCR product was further amplified in a semi-nested PCR.
One microliter of the product was subjected to PCR in the presence
of 50 mM KCl, 1.5 mM MgCl₂, 10 mM Tris-HCl, 200 µM of each
dNTP, 2.5 units Taq DNA polymerase (Amersham Biosciences) and 10
pmol each of primers P8 and P9 in a final volume of 50 µl.
PCR was carried out with a 5-min initial denaturation at
95 °C, followed by 10 cycles of amplification (94 °C, 36 sec;
60 °C, 36 sec; 72 °C, 1 min and 50 sec), then 30 cycles of
amplification (94 °C, 36 sec; 52 °C, 36 sec; 72 °C, 1 min and
50 sec) and a 7-min final extension at 72 °C. Finally, the quality
and quantity of the PCR product were examined on a 1.0%
agarose/EtBr gel run in 1× TBE buffer.
DNA Sequencing
[0213] All semi-nested PCR products (plus and minus strands) were
directly sequenced with the Cycling Sequencing Kit DYEnamic ET Dye
terminator for MegaBACE (Amersham Biosciences).
[0214] Primers for the sequencing of the three HBV DNA fragments
(primer sequences are listed in Table 1):
Fragment A: S1, S2, P3, S9
Fragment B: P6, P7, S7, S8
Fragment C: P8, S3, S4, P9, S5, S6
[0215] One microliter of unpurified PCR product was used as the DNA
template for cycle sequencing. It was subjected to the sequencing
reaction in the presence of 8 µl of DYEnamic ET reagent premix
and 10 pmol primer in a final volume of 20 µl. The sequencing
reaction mix was subjected to a 2-min initial denaturation at
95 °C, followed by 30 cycles of 95 °C, 25 sec; 52 °C, 30 sec;
60 °C, 60 sec.
[0216] The sequencing products were purified by post-reaction
clean-up using ethanol precipitation. To each reaction tube, 2 .mu.l
of 7.5 M ammonium acetate and 2.5 volumes (55 .mu.l) of 100% ethanol
were added so that the final concentration of ethanol was
approximately 70%. The mixture was then centrifuged at 4,000 rpm for
30 min at 14.degree. C. Afterwards, the supernatant was drawn off by
performing a brief inverted spin (1 min at 500 rpm). The DNA pellet
was washed with 100 .mu.l of 70% ethanol and centrifuged at 4,000 rpm
for 15 min at 14.degree. C., and the supernatant was again drawn off
by performing a brief inverted spin (1 min at 500 rpm). The DNA
pellet was then allowed to air dry and was resuspended in 10 .mu.l of
loading buffer (70% formamide and 1 mM EDTA). The samples were stored
at 4.degree. C. before gel electrophoresis analysis using the
MegaBACE 1000 DNA sequencer.
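The volumes in the precipitation step can be checked with a short calculation. This is a sketch under simplifying assumptions: volumes are treated as additive (the slight contraction when ethanol and water mix is ignored), and the function and key names are hypothetical.

```python
def ethanol_precipitation(reaction_ul, acetate_ul, acetate_stock_molar,
                          ethanol_volumes, ethanol_stock_fraction=1.0):
    """Compute volumes and final concentrations for an ethanol precipitation.

    Volumes are assumed to be additive, which is only approximately true.
    """
    aqueous_ul = reaction_ul + acetate_ul          # sample plus salt
    ethanol_ul = ethanol_volumes * aqueous_ul      # "2.5 volumes" of ethanol
    total_ul = aqueous_ul + ethanol_ul
    return {
        "ethanol_ul": ethanol_ul,
        "total_ul": total_ul,
        "ethanol_fraction": ethanol_ul * ethanol_stock_fraction / total_ul,
        "acetate_molar": acetate_stock_molar * acetate_ul / total_ul,
    }


if __name__ == "__main__":
    # 20 ul sequencing reaction, 2 ul of 7.5 M ammonium acetate,
    # 2.5 volumes of 100% ethanol.
    result = ethanol_precipitation(20.0, 2.0, 7.5, 2.5)
    for key, value in result.items():
        print(f"{key}: {value:.3f}")
```

With the stated inputs, 2.5 volumes of the 22 .mu.l mixture is 55 .mu.l of ethanol, the total volume is 77 .mu.l, and the final ethanol fraction is about 0.71, consistent with the approximately 70% given in the text. The ammonium acetate ends up at roughly 0.19 M.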
[0217] It is understood that the examples and embodiments described
herein are for illustrative purposes only and that various
modifications or changes in light thereof will be suggested to
persons skilled in the art and are to be included within the spirit
and purview of this application and scope of the appended claims.
All publications, patents, and patent applications cited herein are
hereby incorporated by reference in their entirety for all
purposes.
Sequence Listing
Number of sequences: 21
SEQ ID NO: 1 (length 3215, DNA, Hepatitis B virus)
Description: exemplary HBV genotype B isolate
1ctccaccact ttccaccaaa ctcttcaaga tcccagagtc agggccctgt actttcctgc
60tggtggctcc agttcagaaa cagtgagccc tgctcagaat actgtctctg ccatatcgtc
120aatcttatcg aagactgggg accctgtacc gaacatggag aacatcgcat
caggactcct 180aggacccctg ctcgtgttac aggcggggtt tttctcgttg
acaaaaatcc tcacaatacc 240acagagtcta gactcgtggt ggacttctct
caattttcta ggggaaacac ccgtgtgtct 300tggccaaaat tcgcagtccc
aaatctccag tcactcacca acctgttgtc ctccaatttg 360tcctggttat
cgctggatgt gtctgcggcg ttttatcatc ttcctctgca tcctgctgct
420atgcctcatc ttcttgttgg ttcttctgga ctatcaaggt atgttgcccg
tttgtcctct 480aattccagga tcatcaacga ccagcaccgg accatgcaaa
acctgcacaa ctcctgctca 540aggaacctct atgtttccct catgttgctg
tacaaaacct acggacggaa attgcacctg 600tattcccatc ccatcatctt
gggctttcgc aaaataccta tgggagtggg cctcagtccg 660tttctcttgg
ctcagtttac tagtgccatt tgttcagtgg ttcgtagggc tttcccccac
720tgtctggctt tcagttatat ggatgatgtg gttttggggg ccaagtctgt
acaacatctt 780gagtcccttt atgccgctgt taccaatttt cttttgtctt
tgggtataca tttaaaccct 840cacaaaacaa aaagatgggg atattccctt
aacttcatgg gatatgtaat tgggagttgg 900ggcacattgc cacaggaaca
tattgtacaa aaaatcaaaa tgtgttttcg gaaacttcct 960gtaaatagac
ctattgattg gaaagtatgt caacgaattg tgggtctttt ggggtttgcc
1020gcccctttca cgcaatgtgg atatcctgct ttaatgcctt tatatgcatg
tattcaagca 1080aaacaggctt ttactttctc gccaacttac aaggcctttc
taagtaaaca gtatctgaac 1140ctttaccccg ttgctcggca acggcctggt
ctgtgccaag tgtttgctga cgcaaccccc 1200actggttggg gcttggccat
aggccatcag cgcatgcgtg gaacctttgg gtctcctctg 1260ccgatccata
ctgcggaact cctagccgct tgttttgctc gcagcaggtc tggagcaaga
1320ctcatcggga ctgacaattc tgtcgtgctc tcccgcaagt atacatcatt
tccatggctg 1380ctaggctgtg ctgccaactg gatcctacgc gggacgtcct
ttgtttacgt cccgtcggcg 1440ctgaatcccg cggacgaccc ctcccggggc
cgcttggggc tctaccgccc gcttctccgc 1500ctattgtacc gaccgaccac
ggggcgcacc tctctttacg cggactcccc gtctgtgcct 1560tctcatctgc
cggaccgtgt gcacttcgct tcacctctgc acgtcgcatg gagaccaccg
1620tgaacgccca cgggaacctg cccaaggtct tgcataagag aactcttgga
ctttcagcaa 1680tgtcaacgac cgaccttgag gcatacttca aagactgtgt
gtttactgag tgggaggagt 1740tgggggagga gattaggtta aaggtctttg
tactaggagg ctgtaggcat aaattggtgt 1800gttcaccagc accatgcaac
tttttcacct ctgcctaatc atctcatgtt catgtcctac 1860tgttcaagcc
tccaagctgt gccttgggtg gctttggggc atggacattg acccgtataa
1920agaatttgga gcttctgtgg agttactctc ttttttgcct tctgacttct
ttccttctat 1980tcgagatctc ctcgacaccg cctctgctct gtatcgggag
gccttagagt ctccggaaca 2040ttgttcacct caccatacgg cactcaggca
agctattctg tgttggggtg agttaatgaa 2100tctagccacc tgggtgggaa
gtaatttgga agatccagca tccagggaat tagtagtcag 2160ctatgtcaac
gttaatatgg gcctaaaact cagacaaata ttgtggtttc acatttcctg
2220tcttactttt gggagagaaa ctgttcttga atatttggtg tcttttggag
tgtggattcg 2280cactcctcct gcatatagac cacaaaatgc ccctatctta
tcaacacttc cggaaactac 2340tgttgttaga cgaagatgca ggtcccctag
aagaagaact ccctcgcctc gcagacgaag 2400gtctcaatcg ccgcgtcgca
gaagatctca atctcgggaa tctcaatgtt agtattcctt 2460ggacacataa
ggtgggaaac tttacggggc tttattcttc tacggtacct tgctttaatc
2520ctaaatggca aactccttct tttcctgaca ttcatttgca ggaggacatt
gttgatagat 2580gcaagcaatt tgtggggccc cttacagtaa atgaaaacag
gagactaaaa ttaattatgc 2640ctgctaggtt ttatcccaat gttactaaat
atttgccctt agataaaggg atcaaaccgt 2700attatccaga gtatgtagtt
aatcattact ttcagacgcg acattattta cacactcttt 2760ggaaggcggg
gatcttatat aaaagagagt ccacacgtag cgcctcattt tgcgggtcac
2820catattcttg ggaacaagat ctacagcatg ggaggttggt cttccaaacc
tcgaaaaggc 2880atggggacaa atctttctgt ccccaatccc ctgggattct
tccccgatca tcagttggac 2940cctgcattca aagccaactc agaaaatcca
gattgggacc tcaacccgca caaggacaac 3000tggccggacg ccaacaaggt
gggagtggga gcattcgggc cagggttcac ccctccccat 3060gggggactgt
tggggtggag ccctcaagct cagggcctac tcacaactgt gccagcagct
3120cctcctcctg cctccaccaa tcggcagtca ggaaggcagc ctactccctt
atctccacct 3180ctaagggaca ctcatcctca ggccatgcag tggaa
3215
SEQ ID NO: 2 (length 3215, DNA, Hepatitis B virus)
Description: exemplary HBV genotype C1 isolate
2ctccagcaca ttccaccaag ctctgctaga tcccagagtg aggggcctat accttcctgc
60tggtggctcc agttccggaa cagtaaaccc tgttccgact actgcctctc ccatatcgtc
120aatcttctcg aggactgggg accctgcacc gaatatggag agcaccacat
caggattcct 180aggacccctg ctcgtgttac aggcggggtt tttcttgttg
acaagaatcc tcacaatacc 240acagagtcta gactcgtggt ggacttctct
caattttcta gggggagcac ccacgtgtcc 300tggccaaaat ttgcagtccc
caacctccaa tcactcacca acctcttgtc ctccaatttg 360tcctggttat
cgctggatgt gtctgcggcg ttttatcatc ttcctcttca tcctgctgct
420atgcctcatc ttcttgttgg ttcttctgga ctaccaaggt atgttgcccg
tttgtcctct 480acttccagga acatcaacta ccagcacggg accatgcaag
acctgcacga ttcctgctca 540aggaacctct atgtttccct cttgttgctg
tacaaaacct tcggacggaa attgcacttg 600tattcccatc ccatcatctt
gggctttcgc aagattccta tgggagtggg cctcagtccg 660tttctcctgg
ctcagtttac tagtgccatt tgttcagtgg ttcgtagggc tttcccccac
720tgtttggctt tcagttatat ggatgatgtg gtattggggg ccaagtctgt
acaacatctt 780gaatcccttt ataccgctat taccaatttt cttgtgtctt
tgggtataca tttaaaccct 840aataaaacca aacgttgggg ctactccctt
aacttcatgg gatatgtaat tggaagttgg 900ggtaccttgc cacaggaaca
tattgtacaa aaaatcaaac aatgttttcg aaaacttcct 960ataaatagac
ctattgattg gaaagtatgt caaagaattg tgggtctttt gggttttgcc
1020gctcccttta cacaatgtgg ttacccagca ttaatgcctt tatatgcatg
tatacaagct 1080aaacaggctt tcactttctc gccaacttac aaggcctttc
tgtataaaca atatctgaac 1140ctttaccccg ttgctcggca acggtcaggt
ctttgccaag tgtttgctga cgcaaccccc 1200actggttggg gcttggccat
gggccatcag cgcatgcgtg gaacctttgt ggctcctctg 1260ccgatccata
ctgcggaact cctagcagct tgttttgctc gcagccggtc tggagcaaac
1320cttatcggca ccgacaactc tgttgtcctc tctcggaaat acacctcttt
tccatggctg 1380ctaggctgtg ctgccaactg gatcctgcgc gggacgtcct
ttgtctacgt cccgtcggcg 1440ctgaatcccg cggacgaccc gtctcggggt
cgtttgggac tctaccgtcc ccttctccgt 1500ctgccgttcc ggccgaccac
ggggcgcacc tctctttacg cggactcccc gtctgtgcct 1560tctcatctgc
cggaccgtgt gcacttcgct tcacctctgc acgtcgcatg gagaccaccg
1620tgaacgcccg ccaggtcttg cctaaggtct tacataagag gactcttgga
ctctcagcaa 1680tgtcaacgac cgaccttgag gcatacttca aagactgtgt
atttaaggac tgggaggagt 1740tgggggagga gattaggtta atgatctttg
tactgggagg ctgtaggcat aaattggtct 1800gttcaccagc accatgcaac
tttttcacct ctgcctaatc atctcatgtt catgttccac 1860tgttcaagcc
tccaagctgt gccttgggtg gctttggggc atggacattg acccgtataa
1920agaatttgga gcttctgtgg agttactctc ttttttgcct tctgactttt
ttccttctat 1980tcgtgatctc ctcgacaccg cctctgctct gtatcgggag
gccttagagt ctccggaaca 2040ttgttcacct caccatacag cactaaggca
agctattctg tgttggggtg agttgatgaa 2100tctggccacc tgggtgggaa
gtaatttgga agacccggca tccagggaat tagtagtaag 2160ctatgtcaat
gttaatatgg gcctaaaaat cagacaacta ttgtggtttc acatttcctg
2220tcttactttt ggaagagaaa ctgttcttga gtatttggtg tcttttggag
tgtggattcg 2280cactcctccc gcttacagac caccaaatgc ccctatctta
tcaacacttc cggaaactac 2340tgttgttaga cgacgaggca ggtcccctag
aagaagaact ccctcgcctc gcagacgaag 2400gtctcaatcg ccgcgtcgca
gaagatctca gtctcgggaa tctcaatgtt agtatccctt 2460ggactcataa
ggtgggaaac tttactgggc tttattcttc tactgtacct gtctttaatc
2520ctgaatggca aactccctct tttcctcaca ttcatttgaa agaggatatt
atcaatagat 2580gtcaacaata tgtgggccct cttacagtta acgaaaaaag
gagattaaaa ttgatcatgc 2640ctgctaggtt ctatcctaac cttactaaat
atttgccctt agataaaggc atcaaacctt 2700attatcctga acatatagtt
aatcattact tccaaactag gcattattta catactctgt 2760ggaaggctgg
tattttatat aagagagaaa ctactcgcag cgcctcattt tgtgggtcac
2820catattcttg ggaacaagag ctacagcatg ggaggttggt cttccaaacc
tcgacaaggc 2880atggggacga atctttccgt tcccaatcct ctgggattct
ttcccggtca ccagttggac 2940ccgacattcg gagccaattc aaacaatcca
gattgggact tcaaccccaa caaggatcaa 3000tggccagcgg caaaccaggt
aggagtggga tcattcgggc cggggttcac tccaccacac 3060ggcaatcttt
tggggtggag ccctcaggct cagggcatat tgacaacagt accagcagcg
3120cctcctcctg cctccaccaa tcggcagtca ggaagaaagc ctactcccat
ctctccacct 3180ctaagagaca gtcatcctca ggccatgcaa tggaa
3215
SEQ ID NO: 3 (length 3215, DNA, Hepatitis B virus)
Description: exemplary HBV genotype C2 isolate
3ctccagcaca ttccaccaag ctctgctaga tcccagagtg aggggcctat accttcctgc
60tggtggctcc agttccggaa cagtaaaccc tgttccgact actgcctctc ccatatcgtc
120aatcttctcg aggactgggg accctgcacc gaatatggag agcaccacat
caggattcct 180aggacccctg ctcgtgttac aggcggggtt tttcttgttg
acaagaatcc tcacaatacc 240acagagtcta gactcgtggt ggacttctct
caattttcta gggggagcac ccacgtgtcc 300tggccaaaat ttgcagtccc
caacctccaa tcactcacca acctcttgtc ctccaatttg 360tcctggttat
cgctggatgt gtctgcggcg ttttatcatc ttcctcttca tcctgctgct
420atgcctcatc ttcttgttgg ttcttctgga ctaccaaggt atgttgcccg
tttgtcctct 480acttccagga acatcaacta ccagcacggg accatgcaag
acctgcacga ttcctgctca 540aggaacctct atgtttccct cttgttgctg
tacaaaacct tcggacggaa attgcacttg 600tattcccatc ccatcatctt
gggctttcgc aagattccta tgggagtggg cctcagtccg 660tttctcctgg
ctcagtttac tagtgccatt tgttcagtgg ttcgtagggc tttcccccac
720tgtttggctt tcagttatat ggatgatgtg gtattggggg ccaagtctgt
acaacatctt 780gaatcccttt ataccgctat taccaatttt cttgtgtctt
tgggtataca tttaaaccct 840aataaaacca aacgttgggg ctactccctt
aacttcatgg gatatgtaat tggaagttgg 900ggtaccttgc cacaggaaca
tattgtacaa aaaatcaaac aatgttttcg aaaacttcct 960ataaatagac
ctattgattg gaaagtatgt caaagaattg tgggtctttt gggttttgcc
1020gctcccttta cacaatgtgg ttacccagca ttaatgcctt tatatgcatg
tatacaagct 1080aaacaggctt tcactttctc gccaacttac aaggcctttc
tgtataaaca atatctgaac 1140ctttaccccg ttgctcggca acggtcaggt
ctttgccaag tgtttgctga cgcaaccccc 1200actggttggg gcttggccat
gggccatcag cgcatgcgtg gaacctttgt ggctcctctg 1260ccgatccata
ctgcggaact cctagcagct tgttttgctc gcagccggtc tggagcaaac
1320cttatcggca ccgacaactc tgttgtcctc tctcggaaat acacctcttt
tccatggctg 1380ctaggctgtg ctgccaactg gatcctgcgc gggacgtcct
ttgtctacgt cccgtcggcg 1440ctgaatcccg cggacgaccc gtctcggggt
cgtttgggac tctaccgtcc ccttctccgt 1500ctgccgttcc ggccgaccac
ggggcgcacc tctctttacg cggactcccc gtctgtgcct 1560tctcatctgc
cggaccgtgt gcacttcgct tcacctctgc acgtcgcatg gagaccaccg
1620tgaacgcccg ccaggtcttg cctaaggtct tacataagag gactcttgga
ctctcagcaa 1680tgtcaacgac cgaccttgag gcatacttca aagactgtgt
atttaaggac tgggaggagt 1740tgggggagga gattaggtta atgatctttg
tactgggagg ctgtaggcat aaattggtct 1800gttcaccagc accatgcaac
tttttcacct ctgcctaatc atctcatgtt catgttccac 1860tgttcaagcc
tccaagctgt gccttgggtg gctttggggc atggacattg acccgtataa
1920agaatttgga gcttctgtgg agttactctc ttttttgcct tctgactttt
ttccttctat 1980tcgtgatctc ctcgacaccg cctctgctct gtatcgggag
gccttagagt ctccggaaca 2040ttgttcacct caccatacag cactaaggca
agctattctg tgttggggtg agttgatgaa 2100tctggccacc tgggtgggaa
gtaatttgga agacccggca tccagggaat tagtagtaag 2160ctatgtcaat
gttaatatgg gcctaaaaat cagacaacta ttgtggtttc acatttcctg
2220tcttactttt ggaagagaaa ctgttcttga gtatttggtg tcttttggag
tgtggattcg 2280cactcctccc gcttacagac caccaaatgc ccctatctta
tcaacacttc cggaaactac 2340tgttgttaga cgacgaggca ggtcccctag
aagaagaact ccctcgcctc gcagacgaag 2400gtctcaatcg ccgcgtcgca
gaagatctca gtctcgggaa tctcaatgtt agtatccctt 2460ggactcataa
ggtgggaaac tttactgggc tttattcttc tactgtacct gtctttaatc
2520ctgaatggca aactccctct tttcctcaca ttcatttgaa agaggatatt
atcaatagat 2580gtcaacaata tgtgggccct cttacagtta acgaaaaaag
gagattaaaa ttgatcatgc 2640ctgctaggtt ctatcctaac cttactaaat
atttgccctt agataaaggc atcaaacctt 2700attatcctga acatatagtt
aatcattact tccaaactag gcattattta catactctgt 2760ggaaggctgg
tattttatat aagagagaaa ctactcgcag cgcctcattt tgtgggtcac
2820catattcttg ggaacaagag ctacagcatg ggaggttggt cttccaaacc
tcgacaaggc 2880atggggacga atctttccgt tcccaatcct ctgggattct
ttcccggtca ccagttggac 2940ccgacattcg gagccaattc aaacaatcca
gattgggact tcaaccccaa caaggatcaa 3000tggccagcgg caaaccaggt
aggagtggga tcattcgggc cggggttcac tccaccacac 3060ggcaatcttt
tggggtggag ccctcaggct cagggcatat tgacaacagt accagcagcg
3120cctcctcctg cctccaccaa tcggcagtca ggaagaaagc ctactcccat
ctctccacct 3180ctaagagaca gtcatcctca ggccatgcaa tggaa
3215
SEQ ID NO: 4 (length 21, DNA, Artificial Sequence)
Description: long distance semi-nested amplification PCR sense primer P1
Sequence: tttttcacct ctgcctaatc a
SEQ ID NO: 5 (length 22, DNA, Artificial Sequence)
Description: long distance semi-nested amplification PCR antisense
primer P2
Sequence: ccctagaaaa ttgagagaag tc
SEQ ID NO: 6 (length 21, DNA, Artificial Sequence)
Description: long distance semi-nested amplification PCR antisense
primer and Fragment A sequencing primer P3
Sequence: ccactgcatg gcctgaggat g
SEQ ID NO: 7 (length 23, DNA, Artificial Sequence)
Description: long distance semi-nested amplification PCR sense primer P4
Sequence: gcctcatttt gtgggtcacc ata
SEQ ID NO: 8 (length 19, DNA, Artificial Sequence)
Description: long distance semi-nested amplification PCR antisense
primer P5
Sequence: ttctttgaca tactttcca
SEQ ID NO: 9 (length 21, DNA, Artificial Sequence)
Description: long distance semi-nested amplification PCR sense primer
and Fragment B sequencing primer P6
Sequence: ttggggtgga gccctcaggc t
SEQ ID NO: 10 (length 19, DNA, Artificial Sequence)
Description: long distance semi-nested amplification PCR sense primer
and Fragment B sequencing primer P7
Sequence: ttggccaaaa ttcgcagtc
SEQ ID NO: 11 (length 20, DNA, Artificial Sequence)
Description: long distance semi-nested amplification PCR sense primer
and Fragment C sequencing primer P8
Sequence: ccccactgtt tggctttcag
SEQ ID NO: 12 (length 26, DNA, Artificial Sequence)
Description: long distance semi-nested amplification PCR antisense
primer and Fragment C sequencing primer P9
Sequence: gttgataaga taggggcatt tggtgg
SEQ ID NO: 13 (length 20, DNA, Artificial Sequence)
Description: Fragment A sense sequencing primer S1
Sequence: ctccggaaca ttgttcacct
SEQ ID NO: 14 (length 22, DNA, Artificial Sequence)
Description: Fragment A sense sequencing primer S2
Sequence: aaggtgggaa actttactgg gc
SEQ ID NO: 15 (length 20, DNA, Artificial Sequence)
Description: Fragment C sense sequencing primer S3
Sequence: gctgacgcaa cccccactgg
SEQ ID NO: 16 (length 20, DNA, Artificial Sequence)
Description: Fragment C sense sequencing primer S4
Sequence: tcgcatggag accaccgtga
SEQ ID NO: 17 (length 20, DNA, Artificial Sequence)
Description: Fragment C antisense sequencing primer S5
Sequence: ggcaaaaacg agagtaactc
SEQ ID NO: 18 (length 20, DNA, Artificial Sequence)
Description: Fragment C antisense sequencing primer S6
Sequence: gggtcgtccg cgggattcag
SEQ ID NO: 19 (length 22, DNA, Artificial Sequence)
Description: Fragment B antisense sequencing primer S7
Sequence: gacatacttt ccaatcaata gg
SEQ ID NO: 20 (length 22, DNA, Artificial Sequence)
Description: Fragment B antisense sequencing primer S8
Sequence: gaagatgagg catagcagca gg
SEQ ID NO: 21 (length 20, DNA, Artificial Sequence)
Description: Fragment A antisense sequencing primer S9
Sequence: catgctgtag ctcttgttcc
* * * * *