U.S. patent application number 12/500151 was filed with the patent office on 2009-07-09 and published on 2009-10-29 for storage medium including text-based caption information, reproducing apparatus and reproducing method thereof.
This patent application is currently assigned to Samsung Electronics Co., Ltd. Invention is credited to Hyun-kwon Chung, Sung-wook Park.
Application Number | 20090268090 12/500151 |
Family ID | 36284171 |
Filed Date | 2009-07-09 |
United States Patent Application | 20090268090 |
Kind Code | A1 |
Chung; Hyun-kwon ; et al. | October 29, 2009 |
STORAGE MEDIUM INCLUDING TEXT-BASED CAPTION INFORMATION,
REPRODUCING APPARATUS AND REPRODUCING METHOD THEREOF
Abstract
A storage medium including moving picture data and subtitle data
to be output as a graphic overlapping an image based on the moving
picture data, wherein the subtitle data includes text data to
generate pixel data converted into a bitmap image, and control
information to control the pixel data to be output in real time,
and a reproducing apparatus and reproducing method using the
storage medium.
Inventors: | Chung; Hyun-kwon; (Seoul, KR) ; Park; Sung-wook; (Seoul, KR) |
Correspondence Address: | STEIN MCEWEN, LLP, 1400 EYE STREET, NW, SUITE 300, WASHINGTON, DC 20005, US |
Assignee: | Samsung Electronics Co., Ltd., Suwon-si, KR |
Family ID: | 36284171 |
Appl. No.: | 12/500151 |
Filed: | July 9, 2009 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
10954356 | Oct 1, 2004 | |
12500151 | | |
Current U.S. Class: | 348/468 ; 348/E7.001 |
Current CPC Class: | H04N 9/8042 20130101;
H04N 5/85 20130101; H04N 9/8205 20130101; H04N 9/8227 20130101;
H04N 21/42653 20130101; G11B 27/3027 20130101; H04N 5/44 20130101;
H04N 9/8233 20130101; H04N 21/4312 20130101; H04N 21/4325 20130101;
G11B 2220/2541 20130101; G11B 27/105 20130101; H04N 5/445 20130101;
H04N 21/8146 20130101; H04N 21/4884 20130101; H04N 21/42646
20130101; G11B 2220/2562 20130101; H04N 9/8063 20130101 |
Class at Publication: | 348/468 ; 348/E07.001 |
International Class: | H04N 7/00 20060101 H04N007/00 |
Foreign Application Data
Date | Code | Application Number |
Oct 1, 2003 | KR | 2003-68336 |
Dec 4, 2003 | KR | 2003-87554 |
Claims
1. A method of reproducing information from a storage medium
comprising moving picture data and subtitle data to be output as a
graphic overlapping on an image based on the moving picture data,
the method comprising: reading the subtitle data including text
data and control information from the storage medium; decoding the
text data, parsing caption contents and output style information,
and converting the caption contents into pixel data formed as a
bitmap image based on the parsed style information; decoding the
control information, parsing time information to control the pixel
data to be output in real time, and parsing position information to
control a position at which a caption is output; and outputting the
converted pixel data in real time according to the parsed time
information and position information.
2. A storage medium comprising subtitle information to display a
caption, the subtitle information comprising: text data to generate
pixel data converted into a bitmap image; and control information
to control the pixel data to be output in real time.
3. An apparatus to reproduce information from a storage medium
having subtitle information, the apparatus comprising: a decoder to
decode text data from the subtitle information and generate a
bitmap image, and to decode and parse control information from the
subtitle data to control a caption to be output in real time.
4. The apparatus of claim 3, further comprising a graphic
controller to control the caption to be output in real time
according to the control information.
5. A method of reproducing subtitle information from a storage
medium, the method comprising: decoding text data and control
information from the subtitle information; parsing caption
contents, output style information, time information, and position
information to control a position at which the caption is output
from the text data and control information; converting the caption
contents into the caption based on the output style information;
and outputting the caption in real time according to the time
information and position information.
6. A text caption decoder of an apparatus to reproduce information
from a storage medium comprising moving picture data and subtitle
data, the text decoder comprising: a text caption parser to decode
and parse text data and control information from the subtitle data;
and a font renderer to convert the parsed text data into a bitmap
image so that the parsed text data is output as a graphic
overlapping an image based on the moving picture data.
7. The text caption decoder of claim 6, wherein the text data as
the graphic overlapping the image is output in real time.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation application of U.S.
application Ser. No. 10/954,356 filed Oct. 1, 2004, now pending,
and claims the priorities of Korean Patent Application No.
2003-68336, filed on Oct. 1, 2003, and No. 2003-87554, filed on
Dec. 4, 2003 in the Korean Intellectual Property Office, the
disclosures of which are incorporated herein in their entireties by
reference.
BACKGROUND OF THE INVENTION
[0002] 1. Field of the Invention
[0003] The present invention relates to reproduction of data on a
storage medium, and, more particularly, to a storage medium
containing text-based caption information compatible with the
subpicture method of a digital versatile disc (DVD) and the
presentation method of a Blu-ray disc, and a reproducing apparatus
and reproducing method thereof.
[0004] 2. Description of the Related Art
[0005] Among conventional caption technologies, there exist
text-based caption technologies, which are mainly used in a
personal computer (PC), and a subpicture-graphic-based caption
technology, which is used in a DVD.
[0006] First, as examples of the conventional text-based caption
technologies mainly used in a PC, there are Synchronized Accessible
Media Interchange (SAMI) of Microsoft, and Real-text technology of
RealNetworks. The conventional text-based caption technologies have
a structure in which a caption is output on the basis of
synchronization information in relation to a file in which video
stream data is recorded, or video stream data provided on a
network.
[0007] FIG. 1 is a diagram illustrating the structure of a caption
file used in a text-based caption technology mainly used in the
conventional PC.
[0008] Referring to FIG. 1, there is a text-based caption file for
video stream data, and a caption for video stream data is output on
the basis of synchronization time information, for example,
<sync time=00:00>, contained in the caption file. An example
of a caption file constructed assuming continuous reproduction of
the video stream data is shown.
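By way of a non-limiting illustration, the lookup of a caption by synchronization time may be sketched in Python as follows. The one-cue-per-line layout, the MM:SS time form, and the names `parse_captions` and `caption_at` are assumptions made only for this sketch; the actual caption file of FIG. 1 may differ.

```python
import re
from bisect import bisect_right
from typing import List, Tuple

# Assumed cue layout for this sketch: "<sync time=MM:SS>caption text"
SYNC_RE = re.compile(r"<sync time=(\d+):(\d+)>(.*)")


def parse_captions(text: str) -> List[Tuple[int, str]]:
    """Return (start_second, caption) pairs sorted by start time."""
    cues = []
    for line in text.splitlines():
        match = SYNC_RE.match(line.strip())
        if match:
            minutes, seconds, body = match.groups()
            cues.append((int(minutes) * 60 + int(seconds), body.strip()))
    cues.sort()
    return cues


def caption_at(cues: List[Tuple[int, str]], playback_second: int) -> str:
    """Return the caption whose sync time most recently passed, or ''."""
    times = [start for start, _ in cues]
    index = bisect_right(times, playback_second) - 1
    return cues[index][1] if index >= 0 else ""
```

The sketch selects a caption purely from the synchronization time, which matches the continuous-reproduction assumption stated above.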
[0009] FIG. 2 is a diagram illustrating the structure of an
apparatus reproducing the conventional text-based captions.
[0010] Referring to FIG. 2, a text caption file is read from a
storage medium 200, stored in a text caption data and font data
buffer 220, and then converted into bitmap image graphic data by a
text caption decoder 222. By control of a graphic controller 224,
the converted graphic data is output on the screen 232 overlapping
video frame data from a video frame buffer 214 that has been
decoded in a video decoder 212.
[0011] However, as shown in FIG. 2, the conventional text-based
caption file structure considers only synchronization time
(<sync time=00:00>) by which a caption is displayed on the
screen, and the type, size, and color of font when a caption is
output on the screen, but does not consider how long a bitmap image
is kept in a buffer after the bitmap is generated by decoding text
caption data. Accordingly, a reproducing apparatus using a low
speed processor cannot output a caption on the screen in real time
in the way that the conventional DVD reproducing apparatus
reproduces data.
[0012] Meanwhile, the subpicture-graphic-based caption technology
used in the conventional DVD will now be explained.
[0013] A DVD uses a bitmap image for a subtitle. Subtitle data of a
bitmap image is losslessly encoded and recorded on a DVD. A maximum
of 32 losslessly encoded bitmap images are recorded on a DVD.
[0014] FIG. 3 is a diagram illustrating the data structure of the
conventional DVD explaining the structure of a caption file used in
a subpicture-graphic-based caption technology used in the
conventional DVD.
[0015] Referring to FIG. 3, in a DVD, the disc area is divided into
a video manager (VMG) area and a plurality of video title set (VTS)
areas. Title information and information on title menus are stored
in the VMG area, and information on the title is stored in the
plurality of VTS areas. The VMG area is formed with 2 to 3
files, and each of the VTS areas is formed with 3 to 12 files.
The VMG area includes a VMGI area storing additional information on
the VMG, a video object set (VOBS) area storing moving picture
information (video objects) on a menu, and a backup area (BUP) of the VMGI.
These areas are stored as one file and among them the presence of
the VOBS area is optional.
[0016] In a VTS area, information on a title that is a reproduction
unit, and a VOBS having moving picture data is stored. In one VTS,
at least one title is recorded. The VTS area includes video title
set information (VTSI), a VOBS having moving picture data for a
menu screen, a VOBS having moving picture data of a video title
set, and backup data of the VTSI. The presence of the VOBS to
display the menu screen is optional. Each VOBS is further divided
into VOBs and Cells, which are recording units.
One VOB is formed with a plurality of Cells. The smallest recording
unit mentioned in the present invention is the Cell.
[0017] FIG. 4 is a diagram illustrating a detailed structure of the
VOBS having moving picture data in the data structure of the
conventional DVD shown in FIG. 3.
[0018] Referring to FIG. 4, one VOBS is formed with a plurality of
VOBs, and one VOB is formed with a plurality of Cells. A Cell is
again formed with a plurality of video object units (VOBUs). A VOBU
is data encoded by a moving picture experts group (MPEG) method
that is a moving picture coding method used in a DVD. According to
the MPEG, since images are coded through spatiotemporal
compression, a previous or succeeding image is required to decode a
predetermined image. Accordingly, in order to support a random
access function by which reproduction starts from an arbitrary
position, intra coding that does not require a previous or
succeeding image is performed in each predetermined interval. In
the MPEG, this is referred to as an intra picture or I picture, and
pictures between this I picture and the next I picture are referred
to as a group of pictures (GOP). Usually, a GOP is formed with
12 to 15 images.
[0019] Meanwhile, the MPEG defines system coding (ISO/IEC13818-1)
to combine video data and audio data into one bitstream. The system
coding defines two multiplexing methods: a program stream (PS)
multiplexing method optimized to generate one program and
store it in an information storage medium, and a transport stream (TS)
multiplexing method appropriate for generating a plurality of programs
for transmission. The conventional DVD employs the PS coding
method.
[0020] According to the PS coding method, video data or audio data
is divided into units referred to as a pack (PCK) and multiplexed
through a time division method. Data other than video data and
audio data defined by the MPEG is named as a private stream and
also is contained in the PCKs such that the private stream can be
multiplexed together with the video data and audio data.
[0021] A VOBU is formed with a plurality of packs (PCK). The first
pack (PCK) among the plurality of packs (PCK) is a navigation pack
(NV_PCK), and the remaining packs include video packs (V_PCKs),
audio packs (A_PCKs), and subpicture packs (SP_PCKs). Video data
contained in a video pack is formed with a plurality of GOPs.
[0022] The subpicture pack (SP_PCK) is used for 2-dimensional
graphic data and caption data. That is, in a DVD, caption data
displayed overlapping a video image is encoded by the same method
as for 2-dimensional graphic data. In the case of DVD, a separate
encoding method to support multiple languages is not employed and
each caption data is converted into graphic data and then processed
and recorded by one encoding method. The graphic data for a caption
is referred to as a subpicture. The subpicture is formed with
subpicture units (SPUs). A subpicture unit corresponds to one sheet
of graphic data.
[0023] FIG. 5 is a diagram illustrating the correlation of a
subpicture pack (SP_PCK) and a subpicture unit (SPU) in the
structure of the VOBS having moving picture data shown in FIG.
4.
[0024] Referring to FIG. 5, one subpicture unit (SPU) includes a
subpicture unit header (SPUH), pixel data (PXD), and a subpicture
display control sequence table (SP_DCSQT). These are sequentially
divided and recorded in subpicture packs (SP_PCK) each with a size
of 2048 bytes. At this time, if the last data of the subpicture
unit (SPU) cannot fill one subpicture pack (SP_PCK) fully, the
remainder of the last subpicture pack (SP_PCK) is filled with
padding data. As a result, one subpicture unit (SPU) is formed with
a plurality of subpicture packs (SP_PCKs).
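By way of a non-limiting illustration, the division of one SPU into fixed-size packs with padding may be sketched in Python as follows. The sketch is simplified: real packs also carry headers, so less than 2048 bytes of payload fit in each, and headers are ignored here. The function name `split_spu_into_packs` and the padding byte value 0xFF are assumptions made only for this sketch.

```python
from typing import List

PACK_SIZE = 2048  # size of one subpicture pack (SP_PCK) stated in the text


def split_spu_into_packs(spu: bytes, pad_byte: int = 0xFF) -> List[bytes]:
    """Split one SPU (SPUH + PXD + SP_DCSQT) into 2048-byte units.

    Pack headers are ignored, and pad_byte is an assumed value.
    """
    packs = []
    for start in range(0, len(spu), PACK_SIZE):
        chunk = spu[start:start + PACK_SIZE]
        if len(chunk) < PACK_SIZE:
            # The last unit is filled up with padding data.
            chunk += bytes([pad_byte]) * (PACK_SIZE - len(chunk))
        packs.append(chunk)
    return packs
```

For example, a 5000-byte SPU yields three units, the last holding 904 bytes of SPU data followed by 1144 bytes of padding.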
[0025] Recorded in the subpicture unit header (SPUH) are the size
of the entire subpicture unit (SPU) and the location from which the
subpicture display control sequence table (SP_DCSQT) having display
control information in the subpicture unit (SPU) starts. The pixel
data (PXD) is coded data obtained by compression coding a
subpicture. The pixel data (PXD) forming a subpicture can have four
types of values, including background, pattern pixel, emphasis
pixel-1, and emphasis pixel-2. The values can be expressed by two
bits, and have binary values, 00, 01, 10, and 11, respectively.
Accordingly, the subpicture can be regarded as a set of data formed
with a plurality of lines and having four types of pixel values.
Encoding is performed for each line.
[0026] FIG. 6 is a diagram illustrating a run-length coding method
among methods of encoding the subpicture unit shown in FIG. 5.
[0027] Referring to FIG. 6, in the run-length coding method, when
one to three instances of an identical pixel data value continue,
the number of continued pixels (No_P) is expressed by 2 bits and
after that, a 2-bit pixel data value (PD) is recorded. When 4 to 15
instances of an identical pixel data value continue, the first 2
bits are recorded as 0s, 4 bits are used to record the No_P, and 2
bits are used to record the PD. When 16 to 63 instances of an
identical pixel data value continue, the first 4 bits are recorded
as 0s, 6 bits are used to record the No_P, and 2 bits are used to
record the PD. When 64 to 255 instances of an identical pixel data
value continue, the first 6 bits are recorded as 0s, 8 bits are
used to record the No_P, and 2 bits are used to record the PD. When
a run of identical pixel data values continues to the end of a
line, the first 14 bits are recorded as 0s, and 2 bits are used to
record PD. When encoding of one line is thus finished, if byte-unit
alignment is not achieved, 4 bits of 0s are recorded. The number of
encoded data bits in one line cannot exceed 1440 bits.
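By way of a non-limiting illustration, the run-length rules above may be sketched in Python as follows. The names `encode_run` and `encode_line` are hypothetical. Two choices are assumptions of this sketch and are not taken from the text: the final run of a line is always written with the end-of-line code, and a run longer than 255 pixels that does not reach the end of the line is split into several codes.

```python
from itertools import groupby
from typing import Sequence

MAX_LINE_BITS = 1440  # limit on encoded bits per line stated in the text


def encode_run(count: int, pd: int) -> str:
    """Return the bit string for one run of `count` pixels of value `pd`."""
    if not 0 <= pd <= 3:
        raise ValueError("pixel data value must fit in 2 bits")
    if 1 <= count <= 3:
        return f"{count:02b}{pd:02b}"          # 2-bit No_P + PD
    if 4 <= count <= 15:
        return f"00{count:04b}{pd:02b}"        # 2 zero bits + 4-bit No_P + PD
    if 16 <= count <= 63:
        return f"0000{count:06b}{pd:02b}"      # 4 zero bits + 6-bit No_P + PD
    if 64 <= count <= 255:
        return f"000000{count:08b}{pd:02b}"    # 6 zero bits + 8-bit No_P + PD
    raise ValueError("run length must be between 1 and 255")


def encode_line(pixels: Sequence[int]) -> bytes:
    """Encode one line of 2-bit pixel values into byte-aligned data."""
    if not pixels or any(p not in (0, 1, 2, 3) for p in pixels):
        raise ValueError("a line needs at least one pixel with value 0..3")
    runs = [(value, len(list(group))) for value, group in groupby(pixels)]
    bits = []
    for pd, count in runs[:-1]:
        while count > 255:                     # assumed handling of long runs
            bits.append(encode_run(255, pd))
            count -= 255
        bits.append(encode_run(count, pd))
    last_pd = runs[-1][0]
    bits.append("0" * 14 + f"{last_pd:02b}")   # run continues to end of line
    stream = "".join(bits)
    if len(stream) % 8:
        stream += "0000"                       # byte-unit alignment
    if len(stream) > MAX_LINE_BITS:
        raise ValueError("encoded line exceeds 1440 bits")
    return int(stream, 2).to_bytes(len(stream) // 8, "big")
```

Because every code is a multiple of 4 bits long, a line is either already byte-aligned or exactly 4 bits short, which is why a single 4-bit group of 0s suffices for alignment.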
[0028] FIG. 7 is a diagram illustrating the data structure of the
SP_DCSQT having output control information of pixel data (PXD)
shown in FIG. 5.
[0029] Referring to FIG. 7, the SP_DCSQT contains output control
information for outputting the pixel data (PXD) described above.
The SP_DCSQT is formed with a plurality of subpicture display
control sequences (SP_DCSQ). One SP_DCSQ is a set of output control
commands (SP_DCCMDs) performed at one time, and is formed with an
SP_DCSQ_STM indicating a starting time, an SP_NXT_DCSQ_SA
containing position information of the next SP_DCSQ, and a
plurality of SP_DCCMDs.
[0030] The SP_DCCMD includes control information on how the pixel
data (PXD) described above is combined with a video image and
output, and includes color information of the pixel data,
transparency information (or contrast information) of the video
data, information on an output starting time, and an output
finishing time.
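By way of a non-limiting illustration, the relationship between the SP_DCSQT, its SP_DCSQs, and their SP_DCCMDs may be modeled in Python as follows. The class names, field names, command labels, and the helper `commands_due` are hypothetical and chosen only for this sketch; the on-disc binary layout is not represented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DisplayCommand:
    """Stands in for one SP_DCCMD; `name` is a hypothetical label."""
    name: str
    value: object = None


@dataclass
class DisplaySequence:
    """Stands in for one SP_DCSQ."""
    start_time: int      # corresponds to SP_DCSQ_STM
    next_offset: int     # corresponds to SP_NXT_DCSQ_SA
    commands: List[DisplayCommand] = field(default_factory=list)


def commands_due(table: List[DisplaySequence], now: int) -> List[DisplayCommand]:
    """Collect, in start-time order, the commands of every sequence
    whose starting time has been reached."""
    due = []
    for sequence in sorted(table, key=lambda s: s.start_time):
        if sequence.start_time <= now:
            due.extend(sequence.commands)
    return due
```

In this model the list passed to `commands_due` plays the role of the SP_DCSQT, and each element groups the commands that are performed at one time.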
[0031] FIG. 8 is a diagram illustrating the output result of a
subpicture together with moving picture data according to the data
structure described above.
[0032] Referring to FIG. 8, the pixel data itself is losslessly
encoded. Information on the subpicture display area, which is the
area within the video display area where a subpicture is output,
and information on the output starting time and
finishing time are contained in the SP_DCSQT as output control
information.
[0033] In a DVD, subpicture data for caption data of a maximum of
32 different languages can be multiplexed together with moving
picture data and recorded. These languages are distinguished by a
stream id provided by the MPEG system coding method, and a sub
stream id defined by the DVD. Accordingly, if a user selects one
language, the subpicture unit (SPU) is extracted by taking only
subpicture packs (SP_PCK) having the stream id and sub stream id
corresponding to the language, and then, by decoding the subpicture
unit (SPU), caption data is extracted and, according to output
control information, the output is controlled.
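By way of a non-limiting illustration, the selection of subpicture packs for one language may be sketched in Python as follows. The `Pack` class and the function `extract_spu_bytes` are hypothetical, pack headers are reduced to two identifier fields, and the identifier values used in the usage note are example values only.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Pack:
    """Simplified pack: only the identifiers and the payload are modeled."""
    stream_id: int
    sub_stream_id: int
    payload: bytes


def extract_spu_bytes(packs: Iterable[Pack], stream_id: int,
                      sub_stream_id: int) -> bytes:
    """Concatenate, in order, the payloads of the packs that carry the
    selected language's stream id and sub stream id."""
    return b"".join(
        pack.payload
        for pack in packs
        if pack.stream_id == stream_id and pack.sub_stream_id == sub_stream_id
    )
```

For example, given packs tagged (0xBD, 0x20) and (0xBD, 0x21) interleaved with video packs, selecting (0xBD, 0x20) returns only the first language's payloads, which would then be decoded as a subpicture unit.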
[0034] This caption technology based on the subpicture graphic
formed with bitmap images used in the conventional DVD has the
following problems.
[0035] First, if bitmap based caption data is multiplexed with
moving picture data and recorded, when the moving picture data is
encoded, the bit generation amount occupied by subpicture data
should be considered in advance. That is, by converting the caption
data into graphic data, the amount of data generated in each
language is different and the entire amount is huge. Usually,
moving picture data is encoded only once, and subpicture data for
each language is then multiplexed with the encoded output to
manufacture a DVD appropriate to each region.
However, for some languages the amount of subpicture data is so
large that, when the subpicture data is multiplexed with the moving
picture data, the total generated
bit amount exceeds the maximum limit. Also, since the subpicture
data is multiplexed between each moving picture data unit, the
starting position of each VOBU becomes different in each region. In
a DVD, since the starting position of a VOBU is separately managed,
whenever a multiplexing process begins, this information should
also be updated.
[0036] Secondly, since the contents of each subpicture cannot be
known, it cannot be used for a separate purpose such as outputting
two languages at the same time, or outputting only caption data
without moving picture data to use for language learning.
[0037] As described above, since the text-based caption technology
used in a PC and the caption technology using subpicture graphics
as in a DVD are designed differently, if text-based caption data
information is applied to the DVD reproducing apparatus without
change, such problems as difficulties in guaranteeing real time
reproduction or managing a subpicture data buffer occur.
SUMMARY OF THE INVENTION
[0038] The present invention provides an information storage medium
including text-based caption information to solve these and/or
other problems of the text-based caption technology and the
subpicture-graphic-based caption technology used in a DVD, and a
reproducing apparatus and a reproducing method thereof.
[0039] Additional aspects and/or advantages of the invention will
be set forth in part in the description which follows and, in part,
will be obvious from the description, or may be learned by practice
of the invention.
[0040] According to an aspect of the present invention, there is
provided a storage medium including: moving picture data; and
subtitle data to be output as a graphic overlapping an image based
on the moving picture data, wherein the subtitle data includes:
text data to generate pixel data converted into a bitmap image; and
control information to control the pixel data to be output in real
time.
[0041] The text data may generate the pixel data to be converted
into the bitmap image such that caption contents are output as the
graphic overlapping the image.
[0042] The text data may further include style information to
specify the style of the caption output as the graphic overlapping
the image, wherein the style information may include at least one
of a pixel data area, a background color, a starting point at which
a first letter of text begins, line spacing information, an output
direction, a type of a font, font color, and a character code.
[0043] The control information may include time information
indicating a time at which the pixel data is generated in a buffer
memory and a time at which the pixel data is deleted in the buffer
memory, and position information recording a position at which the
pixel data is output.
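By way of a non-limiting illustration, the control information described above may be modeled in Python as follows. The class name `CaptionControl`, its field names, the use of seconds as the time unit, and the half-open interval convention are assumptions made only for this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CaptionControl:
    """Control information for one caption's pixel data."""
    create_time: float   # time at which pixel data is generated in the buffer
    delete_time: float   # time at which pixel data is deleted from the buffer
    x: int               # horizontal output position
    y: int               # vertical output position


def in_buffer(control: CaptionControl, now: float) -> bool:
    """True while the pixel data should exist in the buffer memory.
    The interval is treated as [create_time, delete_time)."""
    return control.create_time <= now < control.delete_time


def active_captions(controls: List[CaptionControl],
                    now: float) -> List[CaptionControl]:
    """Return the captions whose pixel data is held at time `now`."""
    return [control for control in controls if in_buffer(control, now)]
```

Recording both a creation time and a deletion time lets a reproducing apparatus know how long each bitmap occupies the buffer, which is the information the conventional text-based caption file lacks.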
[0044] The subtitle data may include the text data corresponding to
pixel data (PXD) contained in subpicture information and the
control information corresponding to display control information
(SP_DCSQT). The subtitle data may be in a text format or a packet
format.
[0045] The subtitle data may include the text data corresponding to
a presentation composition segment (PCS) contained in presentation
data, and the control information corresponding to an object
definition segment (ODS). The subtitle data may be in a text format
or in a packet format.
[0046] According to another aspect of the present invention, there
is provided an apparatus to reproduce information from a storage
medium including moving picture data and subtitle data to be output
as a graphic overlapping on an image based on the moving picture
data, the apparatus including: a text caption decoder to decode
text data contained in the subtitle data and generate pixel data
converted into a bitmap image, and decode and parse control
information contained in the subtitle data to control a caption to
be output in real time; and a graphic controller to control the
pixel data to be output in real time using the control
information.
[0047] The text caption decoder may include: a text caption parser
to decode and parse the text data and the control information; and
a font renderer to convert the parsed text data into a bitmap image
so that the parsed text is output as the graphic overlapping the
image.
[0048] The text caption parser may decode and parse style
information from the text data and specify an output style of the
caption, and the font renderer may convert the text data into the
bitmap image reflecting the parsed style information.
[0049] The text caption parser may parse the text data and transfer
the parsed text data to the font renderer. The text caption parser
may parse time information indicating a time at which the pixel
data is generated in a buffer memory and a time at which the pixel
data is deleted in the buffer memory, and position information
recording a position at which the pixel data is output, from the
control information, and transfer the parsed information to the
graphic controller, and the graphic controller may control the
pixel data to be output in real time by using the parsed time
information and position information.
[0050] The subtitle data may include the text data corresponding to
pixel data contained in subpicture information of a DVD formed by a
bitmap image reproducing method, and the control information
corresponding to display control information (SP_DCSQT). The text
caption parser may transfer the text data to the font renderer, and
the control information to the graphic controller, and the graphic
controller may control the pixel data (PXD) to be output in real
time by using the transferred control information.
[0051] The subtitle data may include the text data corresponding to
a PCS contained in presentation data of a Blu-ray disc formed by a
bitmap image reproducing method, and the control information
corresponding to an ODS. The text caption parser may transfer the
text data to the font renderer, and the control information to the
graphic controller, and the graphic controller may control the
pixel data to be output in real time by using the transferred
control information.
[0052] According to still another aspect of the present invention,
there is provided a method of reproducing information from a
storage medium including moving picture data and subtitle data to
be output as a graphic overlapping on an image based on the moving
picture data, the method including: reading the subtitle data
including text data and control information from the storage
medium; decoding the text data, parsing caption contents and output
style information, and converting the caption contents into pixel
data formed as a bitmap image based on the parsed style
information; decoding the control information, parsing time
information to control the pixel data to be output in real time,
and parsing position information to control a position at which a
caption is output; and outputting the converted pixel data in real
time according to the parsed time information and position
information.
BRIEF DESCRIPTION OF THE DRAWINGS
[0053] These and/or other aspects and advantages of the invention
will become apparent and more readily appreciated from the
following description of the embodiments, taken in conjunction with
the accompanying drawings of which:
[0054] FIG. 1 is a diagram illustrating the structure of a caption
file used in a text-based caption technology used in the
conventional personal computer (PC);
[0055] FIG. 2 is a diagram illustrating the structure of a
reproducing apparatus reproducing the conventional text-based
captions;
[0056] FIG. 3 is a diagram illustrating the data structure of the
conventional DVD explaining the structure of a caption file used in
a subpicture-graphic-based caption technology used in the
conventional DVD;
[0057] FIG. 4 is a diagram illustrating a detailed structure of
video object set (VOBS) having moving picture data in the data
structure of the conventional DVD shown in FIG. 3;
[0058] FIG. 5 is a diagram illustrating the correlation of a
subpicture pack (SP_PCK) and a subpicture unit (SPU) in the
structure of the VOBS having moving picture data shown in FIG.
4;
[0059] FIG. 6 is a diagram illustrating a run-length coding method
among methods of encoding the subpicture unit shown in FIG. 5;
[0060] FIG. 7 is a diagram illustrating the data structure of the
SP_DCSQT having output control information of pixel data (PXD)
shown in FIG. 5;
[0061] FIG. 8 is a diagram illustrating the output result of a
subpicture together with moving picture data according to the data
structure described above;
[0062] FIG. 9 is a block diagram of a reproducing apparatus
processing a text caption according to an embodiment of the present
invention;
[0063] FIG. 10 is a detailed block diagram of the reproducing
apparatus shown in FIG. 9;
[0064] FIG. 11A is an example of text data to generate pixel data
according to an embodiment of the present invention;
[0065] FIG. 11B is an example of graphic control information to
control real time display of a caption according to an embodiment
of the present invention;
[0066] FIG. 12 is a diagram of an embodiment of subtitle data
according to the present invention using a subpicture data
structure of a DVD;
[0067] FIG. 13 is a diagram of an embodiment of subtitle data
according to the present invention using a presentation data
structure of a Blu-ray disc;
[0068] FIG. 14 is a diagram of an embodiment of subtitle data in a
text format that can be applied to a DVD;
[0069] FIG. 15 is a diagram of an embodiment of subtitle data in a
text format that can be applied to a Blu-ray disc;
[0070] FIG. 16 is a diagram illustrating the output result of
caption data according to an embodiment of the present invention;
and
[0071] FIG. 17 is a flowchart illustrating operations performed in
a method of processing a text caption according to an embodiment of
the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0072] Reference will now be made in detail to the embodiments of
the present invention, examples of which are illustrated in the
accompanying drawings, wherein like reference numerals refer to the
like elements throughout. The embodiments are described below to
explain the present invention by referring to the figures.
[0073] Referring to FIG. 9, the reproducing apparatus processing a
text-based caption according to an embodiment of the present
invention includes buffer units 902 and 906, a video data
processing unit 910, a text caption data processing unit 920, an
audio data processing unit 930, and a blender 940.
[0074] According to the types of data to be stored, the buffer
units 902 and 906 include an AV data buffer 902 storing moving
picture data, and a text caption data and font data buffer 906
storing text-based caption data. Data read from a variety of
storage media 900, including a removable storage medium such as an
optical disc, a local storage, and storages on the Internet, is
temporarily stored in each buffer according to the type of the
data.
[0075] The video data processing unit 910 includes a video decoder
914 and a video frame buffer 916. The video decoder 914 receives
compression coded moving picture data from the AV data buffer 902
and decodes the data. The decoded video data is output to the
screen 942 through the video frame buffer 916.
[0076] The text caption data processing unit 920 includes a text
caption decoder 922, a subpicture decoder 924, and a graphic
controller 926. The reproducing apparatus according to the present
invention has the subpicture decoder 924 to process a subtitle in
the conventional multiplexed subpicture type, and, in addition, has
the text caption decoder 922 so that text-based caption data
according to an embodiment of the present invention can be
processed. From the subtitle data, the text caption decoder 922
decodes the text data used to generate a bitmap image for a caption
and the control information used to control real time reproduction
of the caption, and generates pixel data converted into a bitmap image. The control
information among the decoded data is transferred to the graphic
controller 926 such that the generated pixel data is controlled to
be output in real time.
[0077] The audio data processing unit 930 has an audio decoder that
decodes audio data, and the decoded audio data is output
through a speaker 932.
[0078] The blender 940 superimposes a bitmap image obtained by
rendering caption data on video data obtained by decoding moving
picture data, and outputs the data to the screen 942.
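By way of a non-limiting illustration, the superimposing performed by a blender may be sketched in Python as follows. The function name `blend`, the representation of a frame as a list of rows, and the use of `None` to mark a transparent caption pixel are assumptions made only for this sketch; an actual blender may apply contrast or transparency values instead.

```python
from typing import List, Optional

Frame = List[List[int]]
Caption = List[List[Optional[int]]]


def blend(frame: Frame, caption: Caption, x0: int, y0: int) -> Frame:
    """Return a copy of `frame` with `caption` drawn at (x0, y0).

    Caption pixels equal to None are treated as transparent, and
    caption pixels falling outside the frame are skipped.
    """
    output = [row[:] for row in frame]
    for dy, caption_row in enumerate(caption):
        for dx, pixel in enumerate(caption_row):
            y, x = y0 + dy, x0 + dx
            if pixel is None:
                continue
            if 0 <= y < len(output) and 0 <= x < len(output[y]):
                output[y][x] = pixel
    return output
```

The input frame is left unmodified so that the same decoded video frame could be blended with a different caption bitmap.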
[0079] FIG. 10 is a detailed block diagram of the reproducing
apparatus to process a text-based caption shown in FIG. 9.
[0080] Referring to FIG. 10, the structure of the text caption data
processing unit 920 illustrated in FIG. 9 is shown in detail.
[0081] The reproducing apparatus according to this embodiment of
the present invention includes buffer units 1010, 1020, 1030, and
1040, a video data processing unit 910, a text caption data
processing unit 920, and a blender 1039. Explanation of the audio
processing unit described in FIG. 9 will be omitted.
[0082] The buffer units 1010, 1020, 1030, and 1040 include a video
data buffer 1010, a subpicture data buffer 1020, a text caption
data buffer 1030, and a font data buffer 1040. Moving picture data
and subtitle data are read from a variety of storage media 1000,
including a removable storage medium such as an optical disc, a
local storage, and storages on the Internet, and, according to the
type of data, stored in respective buffers temporarily. The moving
picture data (AV data) is de-multiplexed and, temporarily, video
data is stored in the video data buffer 1010, subpicture data for a
subtitle is stored in the subpicture data buffer 1020, and audio
data is stored in the audio data buffer (not shown). Meanwhile,
text data to generate pixel data and control information to control
a caption to be output in real time as subtitle data for a
text-based caption are temporarily stored in the text caption data
buffer 1030, and font data for a subtitle is temporarily stored in
the font data buffer 1040. The video data processing unit 910
includes a video decoder 1012 and a video frame buffer 1014, and is
the same as explained with reference to FIG. 9.
[0083] The text caption data processing unit 920 includes a text
caption parser 1031, a font renderer 1034, a subpicture decoder
1033, a graphic controller 1038, a variety of buffers 1032, 1035,
and 1036, and a color lookup table (CLUT) 1037.
[0084] The text caption parser 1031 decodes and parses text data
and control information included in subtitle data. Also, it decodes
and parses style information specifying an output style of a
caption further included in text data. The parsed text data is
transferred to the font renderer 1034 along path 2.
[0085] The font renderer 1034 generates a bitmap image so that the
parsed text data can be output as a graphic for overlapping. At
this time, by reflecting the parsed style information, a bitmap
image is generated and the generated graphic data is temporarily
stored in the pixel data buffer 1035 along path 3. The data
structure of the text data and style information will be explained
later.
[0086] The subpicture decoder 1033 decodes subpicture data for a
subtitle de-multiplexed from the moving picture data. This is
provided for compatibility with caption data of the conventional
DVD subpicture method. According to another embodiment of the
present invention, when text-based subtitle data according to the
present invention is packetized and included in moving picture
data, text data and control information are de-multiplexed and
transferred to the text caption parser 1031 along path 9.
[0087] The graphic controller 1038 controls the caption to be
output in real time by using control information. In the case of
the conventional text-based caption technology such as SAMI of
Microsoft described above, only a time for a caption to be output
is specified, and therefore, if the caption is reproduced in a
hardware device using a low speed processor, real time
reproduction, in which moving picture data and caption data are
synchronized and output, may not be guaranteed.
[0088] However, in the reproducing apparatus according to the
present invention, the control information described above is
parsed to obtain time information regarding when the pixel data
converted into a bitmap image is generated in and deleted from the
buffer memory, and position information regarding the position at
which the pixel data is output, and the output of the pixel data
buffer is controlled accordingly. By doing so, moving picture data
and captions can be synchronized and reproduced in real time.
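The use of time information to gate the output of pixel data might
be sketched as follows. This is a minimal Python sketch under stated
assumptions: the ControlInfo record, its field names, the use of
integer clock ticks, and the half-open interval (start inclusive,
end exclusive) are all hypothetical and are not specified in the
text.

```python
from dataclasses import dataclass


@dataclass
class ControlInfo:
    """Assumed fields; times are in arbitrary clock ticks."""
    start_time: int   # when the pixel data becomes available for output
    end_time: int     # when the pixel data is deleted from the buffer
    x: int            # output position on the screen (assumed pixels)
    y: int


def captions_to_output(controls, now):
    """Return the entries whose pixel data should be on screen at `now`."""
    return [c for c in controls if c.start_time <= now < c.end_time]
```

Because the decision depends only on the parsed times and the
current clock value, it does not depend on how long the text took
to render, which is the property the paragraph above relies on.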
[0089] The variety of buffers 1032, 1035, and 1036 include a
graphic control information buffer 1032, a pixel data buffer 1035,
and a subpicture frame buffer 1036.
[0090] The graphic control information buffer 1032 temporarily
stores control information parsed in the text caption parser 1031,
and the pixel data buffer 1035 temporarily stores graphic data
converted into a bitmap image.
[0091] The subpicture frame buffer 1036 temporarily stores pixel
data so that the subpicture for a caption can be output by
controlling the output of the pixel data according to the time
information that is included in the control information from the
graphic controller 1038.
[0092] The color lookup table (CLUT) 1037 controls the color of a
caption to be output by using palette information included in
control information.
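A color lookup of this kind might be sketched as follows. The
palette contents, the name PALETTE, and the representation of the
caption bitmap as rows of palette indices are assumptions made for
illustration only.

```python
# Hypothetical palette: index -> (R, G, B). Values are illustrative only.
PALETTE = {0: (0, 0, 0), 1: (255, 255, 255), 2: (255, 255, 0)}


def apply_clut(indexed_rows, palette):
    """Replace each palette index in the bitmap with its color."""
    return [[palette[i] for i in row] for row in indexed_rows]
```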
[0093] The blender 1039 superimposes the graphic image of a caption
output from the text caption data processing unit 920 on an image
output from the video data processing unit 910, and outputs the
overlapping images on the screen 1041.
[0094] The operation of each block of the reproducing apparatus
according to the embodiment of the present invention illustrated in
FIG. 10 and described above can be summarized as follows.
[0095] First, moving picture data read from the storage medium 1000
is de-multiplexed, and the video data is decoded by the video
decoder 1012 after passing through the video data buffer 1010.
After being output through the video frame buffer 1014, the decoded
video data is output together with the graphic data of a caption
output from the text caption data processing unit 920, with the graphic
data overlapping the video data. Audio data in the moving picture
data is decoded by the audio decoder of the audio data processing
unit 930 and output through the speaker 932.
[0096] Meanwhile, text-based subtitle data according to this
embodiment of the present invention which is read from the storage
medium 1000 is parsed into text data and control information in the
text caption parser 1031 after passing through the text caption
data buffer 1030. The parsed text data is transferred to the font
renderer 1034 along path 2. Here, the text data is converted into
graphic data in which caption contents are formed as a bitmap
image, and the graphic data is stored in the pixel data buffer
1035.
[0097] Meanwhile, control information, parsed into time information
to output a caption in real time and output position information of
the caption, is transferred through the graphic control information
buffer 1032 along path 1 to the graphic controller 1038 along path
7. The graphic controller 1038 adjusts the output speed of the
graphic data stored in the pixel data buffer 1035 by using control
information, outputs the graphic data to the subpicture frame
buffer 1036, and, by referring to the color lookup table 1037,
reflects color. The graphic controller 1038 superimposes the
graphic data on the moving picture data through the blender 1039
and outputs the data to the screen.
[0098] Meanwhile, as another embodiment of the present invention,
when text-based subtitle data is packetized and multiplexed with
moving picture data, subtitle data is decoded by the subpicture
decoder 1033 and transferred to the text caption parser 1031 along
path 9. The processing of the subtitle data thereafter is the same
as described above.
[0099] As an embodiment of the present invention, when subtitle
data includes the text data corresponding to pixel data (PXD) among
subpicture information of a DVD formed by a bitmap data
reproduction method, and the control information corresponding to
display control information (SP_DCSQT), the subtitle data decoded
by the subpicture decoder 1033 is transferred to the text caption
parser 1031, and here, text data is transferred to the font
renderer 1034 and control information is transferred to the graphic
controller 1038 such that, by using the control information
transferred to the graphic controller 1038, a caption is controlled
to be output in real time.
[0100] As another embodiment of the present invention, when
subtitle data includes the text data corresponding to an object
definition segment (ODS) among presentation data of a Blu-ray disc
formed by a bitmap data reproduction method, and the control
information corresponding to a presentation composition segment
(PCS), the subtitle data decoded by the subpicture decoder 1033 is
transferred to the graphic controller 1038 such that, by using the
control information transferred to the graphic controller 1038, a
caption is controlled to be output in real time.
[0101] A storage medium on which text-based subtitle data according
to an embodiment of the present invention is recorded will now be
described.
[0102] The storage medium according to this embodiment of the
present invention includes moving picture data and subtitle data
that is output as a graphic overlapping an image based on the
moving picture. The subtitle data includes text data to generate
pixel data and control information to control a caption to be
output in real time.
[0103] Text data is utilized to convert caption contents into a
bitmap image to be output as a graphic for overlapping. Text data
further includes style information specifying the style of a font.
Preferably, though not necessarily, the style information includes
at least one of a pixel data area, a background color, the starting
point at which the first letter of text begins, line spacing
information, an output direction, the type of a font, font color,
and a character code.
[0104] Meanwhile, the control information includes time information
regarding when the pixel data obtained by rendering text data is
generated and deleted in the buffer memory, and position
information regarding a position at which pixel data is output.
[0105] As an embodiment of a storage medium according to the
present invention, subtitle data may include text data
corresponding to pixel data (PXD) among subpicture information, and
control information corresponding to display control information
(SP_DCSQT) such that predetermined contents similar to the
subpicture information of a DVD formed by a bitmap data
reproduction method can be included. Subtitle data may be
implemented in a text format or may be implemented as data in the
form of packets.
[0106] Also, as another embodiment of a storage medium according to
the present invention, subtitle data may include text data
corresponding to an ODS among presentation data, and control
information corresponding to a PCS such that predetermined
contents similar to presentation data of a Blu-ray disc formed by a
bitmap data reproduction method can be included. Subtitle data may
be implemented in a text format or may be implemented as data in
the form of packets.
[0107] FIG. 11A is an example of text data to generate pixel data
according to an embodiment of the present invention.
[0108] Referring to FIG. 11A, in a text data area, text information
includes caption contents and style information required to
generate a bitmap image of pixel data.
[0109] That is, text information includes, for example, the
contents of a caption to be output and style information specifying
the output style of the caption. As style information, when
multiple lines of text are output, information on line spacing is
included, and information indicating the output direction of text
(left to right, right to left, or top to bottom) can be included.
Also, information on a font, such as the size of text, bold,
Italic, and underline, is included, and information on line change
to render text to begin from the next line, and information on the
color of text can be included. In addition, character code
information for encoding can be included, for example, information
on whether a character code to be used is 8859-1 or UTF-16 can be
included.
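The effect of the character code information might be sketched in
Python as follows. The numeric flag values and the choice of
big-endian byte order for UTF-16 are assumptions of this sketch;
the text above names only the two character codes.

```python
# Assumed mapping from a character-code flag to a Python codec name.
# The UTF-16 byte order is not stated in the text; big-endian is assumed.
CODECS = {0: "iso-8859-1", 1: "utf-16-be"}


def decode_caption(raw, code=0):
    """Decode caption bytes using the codec selected by `code`."""
    return raw.decode(CODECS[code])
```

For example, the single byte 0xE9 decodes to an accented letter e
under the first codec, whereas the second codec consumes two bytes
per basic character.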
[0110] This text information is an example according to this
embodiment of the present invention and can be modified and
implemented to fit the characteristic of a medium, such as a DVD
and a Blu-ray disc, to which the present invention is applied.
[0111] FIG. 11B is an example of graphic control information to
control real time display of a caption according to an embodiment
of the present invention.
[0112] Referring to FIG. 11B, control information to control output
of the pixel data converted into a bitmap image is shown.
[0113] That is, in order to indicate the size of the pixel data
area in which the text data is converted into a bitmap image and
rendered, information on the width and height of the pixel data
area can be recorded. Also, information on the color of the
background of the pixel data, time information regarding when the
pixel data is generated and deleted in the pixel data buffer
memory, and starting point information indicating a position at
which the first line of text characters begins can be recorded.
These data items are included in subtitle data as control
information, and play the role of controlling a caption to be
output in real time.
[0114] Also, when control data is applied to a Blu-ray disc, in
order to collect and output a plurality of pixel data items in one
screen, construction information collecting a plurality of data
areas into one page can also be included. A color lookup table
including information to be used for the background color and
foreground color of caption text used in the page can be included.
Since a specified area among pixel data information is output on
the screen, area specifying information ((Xs, Ys), the width and
height information of a pixel data area, or information on starting
point (Xs, Ys) and end point (Xe, Ye)) can be included. Also,
starting point information in a pixel data area corresponding to
the first starting point of a subpicture display area explained
with reference to FIG. 8 can also be included. Meanwhile,
preferably, though not necessarily, time information is included
which indicates a time when pixel data temporarily stored in a
buffer is output, and a time when the pixel data is deleted.
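The two forms of area specifying information mentioned above carry
the same information and might be converted into each other as in
the following sketch. The function names are hypothetical, and the
sketch assumes that the end point (Xe, Ye) is the last pixel inside
the area (an inclusive end point), which the text does not state.

```python
def area_from_size(xs, ys, width, height):
    """Convert (start point, width, height) to (start point, end point)."""
    return (xs, ys), (xs + width - 1, ys + height - 1)


def area_from_corners(xs, ys, xe, ye):
    """Convert (start point, end point) to (start point, width, height)."""
    return (xs, ys), xe - xs + 1, ye - ys + 1
```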
[0115] This control information is but one example according to an
embodiment of the present invention, and can be modified and
implemented to fit the characteristic of a medium, such as a DVD
and a Blu-ray disc, to which the present invention is applied.
[0116] FIG. 12 is a diagram of an embodiment of subtitle data
according to the present invention using a subpicture data
structure of a DVD.
[0117] Referring to FIG. 12, subtitle data according to this
embodiment of the present invention can be implemented in a packet
format of the MPEG method that is the construction method of a
subpicture data stream of a DVD. That is, in a packetized element
stream (PES) structure, in addition to a SPUH having header
information, text caption data according to this embodiment of the
present invention can be made to be recorded in a PXD area for
pixel data, and control information according to this embodiment of
the present invention can be made to be recorded in an SP_DCSQT
area for output control information. Obviously, subtitle data
according to this embodiment of the present invention can be
implemented as binary data in the form of a packet, and can also be
implemented in a text format including contents similar to the
subpicture data stream described above. Any data in a text format
or in a binary format can be parsed by the text caption parser 1031
described with reference to FIG. 10. Parsed text data is
transferred to the font renderer 1034 along path 2, and control
information is transferred to the graphic controller 1038 along
path 1 such that based on the control information, a caption
converted into a bitmap image can be output in real time.
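Separating such a unit into its text caption part and its control
information part might be sketched as follows. The byte layout used
here is an assumption: a four-byte header holding two big-endian
16-bit fields (the total unit size and the offset of the control
area), followed by the PXD area and then the SP_DCSQT area. The text
does not specify the layout, and the function names are
hypothetical.

```python
import struct


def split_unit(unit):
    """Split one unit into (text caption bytes, control information bytes)."""
    total_size, control_offset = struct.unpack(">HH", unit[:4])
    if not 4 <= control_offset <= total_size <= len(unit):
        raise ValueError("inconsistent header fields")
    return unit[4:control_offset], unit[control_offset:total_size]


def build_unit(text, control):
    """Build a unit with the same assumed layout (for illustration)."""
    control_offset = 4 + len(text)
    total_size = control_offset + len(control)
    return struct.pack(">HH", total_size, control_offset) + text + control
```

In terms of FIG. 10, the first returned value would go to the font
renderer 1034 and the second to the graphic controller 1038.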
[0118] FIG. 13 is a diagram of an embodiment of subtitle data
according to the present invention using a presentation data
structure of a Blu-ray disc.
[0119] Referring to FIG. 13, subtitle data according to this
embodiment of the present invention can be implemented in a packet
format of the MPEG method that is the construction method of a
presentation data stream of a Blu-ray disc. That is, in a PES
structure, control information can be recorded to correspond to a
PCS area and text caption data can be recorded to correspond to an
ODS. In addition, a palette definition segment (PDS) and an end
segment (END) can be further included. Obviously, subtitle data
according to this embodiment of the present invention can be
implemented as binary data in the form of a packet, and can also be
implemented in a text format including contents similar to the
presentation data stream described above.
[0120] Any data in a text format or in a binary format can be
parsed by the text caption parser 1031 described with reference to
FIG. 10. Parsed text data is transferred to the font renderer 1034
along path 2, and control information is transferred to the graphic
controller 1038 along path 1 such that based on the control
information, a caption converted into a bitmap image can be output
in real time.
[0121] FIGS. 14 and 15 illustrate examples of embodiments of
subtitle data implemented in a text format. In particular, FIG. 14
illustrates an example of an embodiment of subtitle data in a text
format that can be applied to a DVD, and the subtitle data includes
text and control information. Also, FIG. 15 illustrates an example
of an embodiment of subtitle data in a text format that can be
applied to a Blu-ray disc, and the subtitle data includes text data
and control information and can further include color information.
FIGS. 14 and 15 are just examples of the data structure of a
storage medium according to embodiments of the present invention,
and the data structure can be modified and implemented in a variety
of ways.
[0122] In order to specify the style of subtitle data according to
the embodiment of the present invention described above, the
following character strings can be used:
[0123] \c[n]\: specifies a color to be used in text. The basic value
is 0.
[0124] \b[n]\: specifies a background color to be used for the
background of text. This should be used at the front of a character
string, and the basic value is 0.
[0125] \f[n]\: specifies the type of font to be used in text. The
basic value is 0.
[0126] \s[n]\: specifies the size of font to be used in text. The
unit is a pixel and the basic value is 0.
[0127] \e[n]\: specifies a character code to be used for encoding
text. The encoding method can be changed. If 0, ISO-8859-1 is used,
and if 1, UTF-16 is used, and the basic value is 0.
[0128] \o[n]\: specifies the position of a starting point from
which text is rendered in a pixel data area.
[0129] \I[n]\: specifies a line space when a line change for a text
character string is performed. The unit for n is a pixel and the
basic value is 0.
[0130] \d[n]\: specifies the output direction of text. If n is 0,
text is output from left to right in the horizontal direction, and
if n is 1, text is output from right to left in the horizontal
direction. If n is 2, text is output in the vertical direction, and
if there is a line change, the line change is performed from right
to left. If n is 3, text is output in the vertical direction, and
if there is a line change, the line change is performed from left
to right. The basic value is 0.
[0131] \b[n]\: selects the weight of a text character as bold or
normal. Bold is 1 and normal is 0, and the basic value is 0.
[0132] \i[n]\: selects the shape of a text character as Italic or
normal. Italic is 1 and normal is 0, and the basic value is 0.
[0133] \u[n]\: specifies whether or not to underline a text
character. To underline is 1 and no underline is 0, and the basic
value is 0.
[0134] \n\: performs line change. The basic value is 0.
[0135] \\\: outputs a \ character. The basic value is 0.
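A parser might separate the character strings listed above from
the caption contents as in the following Python sketch. It assumes
that the strings appear in the stream without square brackets, with
n written as decimal digits, as in \o2000\ or \b1\; the regular
expression and the function name tokenize are hypothetical. The
final string in the list, which outputs a character, is not handled
by this sketch.

```python
import re

# One control string: backslash, a single letter, optional digits, backslash.
CODE = re.compile(r"\\([A-Za-z])(\d*)\\")


def tokenize(s):
    """Split `s` into ('code', letter, value) and ('text', run) tuples."""
    tokens, pos = [], 0
    for m in CODE.finditer(s):
        if m.start() > pos:
            tokens.append(("text", s[pos:m.start()]))
        value = int(m.group(2)) if m.group(2) else None
        tokens.append(("code", m.group(1), value))
        pos = m.end()
    if pos < len(s):
        tokens.append(("text", s[pos:]))
    return tokens
```

A string without a number, such as the line change string, yields a
value of None in this sketch.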
[0136] FIG. 16 is a diagram illustrating the output result of
caption data according to an embodiment of the present
invention.
[0137] Referring to FIG. 16, for example, when the following
character string is used as style information, the output result on
the screen is shown. That is, when style information, \o2000\ \b0\
\c1\ \f0\ \I20\Hello, \b1\Subtitle\b0\ \i1\ \n\World, is used, the
output result of pixel data generated by parsing this information
is shown.
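How a renderer might track style state while walking through such a
character string is sketched below. The token list is written out
by hand from the example string above, with the spaces between
adjacent control strings dropped on the assumption that they are
typographic. The tuple format and the function name style_runs are
hypothetical, and the result is not claimed to reproduce FIG. 16.

```python
def style_runs(tokens):
    """Group text into runs tagged with bold/italic state and line index."""
    state = {"b": 0, "i": 0, "u": 0}  # the stated basic values are 0
    line, runs = 0, []
    for tok in tokens:
        if tok[0] == "text":
            runs.append((tok[1], bool(state["b"]), bool(state["i"]), line))
        elif tok[1] in state:
            state[tok[1]] = tok[2]
        elif tok[1] == "n":
            line += 1  # line change: later text goes on the next line
        # other codes (o, c, f, I, ...) are ignored in this sketch
    return runs


# Hand-derived tokens for the example string (an assumption, see above).
TOKENS = [
    ("code", "o", 2000), ("code", "b", 0), ("code", "c", 1),
    ("code", "f", 0), ("code", "I", 20), ("text", "Hello, "),
    ("code", "b", 1), ("text", "Subtitle"), ("code", "b", 0),
    ("code", "i", 1), ("code", "n", None), ("text", "World"),
]
```

Under these assumptions, style_runs(TOKENS) yields three runs: a
normal run and a bold run on the first line, and an italic run on
the second line.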
[0138] For information regarding a font used in text data, font
information recorded separately from subtitle data is received from
a disc or a network, and related font information is stored in a
font buffer memory such that the font information can be used.
[0139] A method of processing a text caption based on the
structures of the reproducing apparatus and the storage medium
described above will now be explained.
[0140] FIG. 17 is a flowchart illustrating operations performed in
a method of processing a text caption according to an embodiment of
the present invention.
[0141] Referring to FIG. 17, in order to reproduce data on a
storage medium including moving picture data and subtitle data that
is output as a graphic overlapping an image based on the moving
picture data, first, subtitle data including text data and control
information is read from the storage medium in operation 1502. The
read text data is decoded, caption contents and output style
information are parsed, and based on the parsed style information,
caption contents are converted into pixel data in operation 1504.
The read control information is decoded, and time information and
position information to control a caption to be output in real time
are parsed in operation 1506. According to the parsed time
information and position information, the converted pixel data is
output in real time in operation 1508.
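The sequence of operations 1502 through 1508 might be sketched as
follows. The dictionary layout of the subtitle data, the render and
output callables, and the function name process_text_caption are all
assumptions made for illustration; rendering itself is left to the
caller.

```python
def process_text_caption(subtitle, now, render, output):
    """Run the four operations once; return True if a caption was output."""
    # Operation 1502: subtitle data holding text data and control information.
    text, control = subtitle["text"], subtitle["control"]
    # Operation 1504: convert caption contents to pixel data using the style.
    pixels = render(text["contents"], text["style"])
    # Operation 1506: obtain time information and position information.
    start, end = control["start_time"], control["end_time"]
    position = control["position"]
    # Operation 1508: output the pixel data only inside its time window.
    if start <= now < end:
        output(pixels, position)
        return True
    return False
```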
[0142] The present invention can also be embodied as computer
readable codes on a computer readable recording medium. The
computer readable recording medium is any data storage device that
can store data which can be thereafter read by a computer system.
Examples of the computer readable recording medium include
read-only memory (ROM), random-access memory (RAM), CD-ROMs,
magnetic tapes, floppy disks, and optical data storage devices. The
computer readable recording medium can also be distributed over
network coupled computer systems so that the computer readable code
is stored and executed in a distributed fashion.
[0143] According to the present invention as described above, an
information storage medium including text-based caption information
to alleviate the discussed and/or other problems of the text-based
caption technology and the subpicture-graphic-based caption
technology used in a DVD, and a reproducing apparatus and a
reproducing method thereof, are provided.
[0144] Accordingly, management of a buffer becomes convenient, and
captions in more than two different languages can be output at the
same time, or only captions can be output separately without moving
picture information. In addition, real time reproduction of
captions controlled by hardware can be guaranteed. Furthermore,
since the amount of encoded data of the subtitle data according to
the present invention is relatively smaller than that of the
conventional subpicture type caption data based on a bitmap image,
address management of a VOBU is easier even when encoding is again
performed in order to process multiple languages.
[0145] Although a few embodiments of the present invention have
been shown and described, it would be appreciated by those skilled
in the art that changes may be made in these embodiments without
departing from the principles and spirit of the invention, the
scope of which is defined in the claims and their equivalents.
* * * * *