U.S. patent application number 16/665370 was filed with the patent office on 2019-10-28 and published on 2020-02-27 for reception apparatus, reception method, and transmission apparatus.
This patent application is currently assigned to SONY CORPORATION. The applicant listed for this patent is SONY CORPORATION. Invention is credited to Kazuhiko Takabayashi, Ikuo Tsukagoshi.
Application Number | 20200068247 16/665370 |
Family ID | 59398039 |
Filed Date | 2019-10-28 |
United States Patent Application | 20200068247 |
Kind Code | A1 |
Tsukagoshi; Ikuo; et al. | February 27, 2020 |
RECEPTION APPARATUS, RECEPTION METHOD, AND TRANSMISSION APPARATUS
Abstract
An object is to make it possible to perform caption display
satisfactorily in a case where the caption display position is
designated as a relative position. The video stream is decoded to
obtain video data, and the subtitle stream including the caption
information is decoded to obtain bitmap data of the caption. The
caption display position in the caption display position
information included in the caption information is designated as a
relative position with respect to the caption display range. In a
case where the aspect ratio of the video area is different from the
aspect ratio of the display video area, the caption display
position is determined with the display video area defined as the
caption display range, further resize processing is performed, and
the display position control is performed onto the bitmap data of
the caption on the basis of the caption display position that has
undergone the resize processing. Bitmap data of the caption that
has undergone display position control is superimposed on the video
data to obtain display video data.
Inventors: | Tsukagoshi; Ikuo (Tokyo, JP); Takabayashi; Kazuhiko (Tokyo, JP) |
Applicant: |
Name | City | State | Country | Type |
SONY CORPORATION | Tokyo | | JP | |
Assignee: | SONY CORPORATION, Tokyo, JP |
Family ID: | 59398039 |
Appl. No.: | 16/665370 |
Filed: | October 28, 2019 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
16070815 | Jul 18, 2018 | 10511882 |
PCT/JP2017/001438 | Jan 17, 2017 | |
16665370 | | |
Current U.S. Class: | 1/1 |
Current CPC Class: | H04N 9/8233 20130101; H04N 7/0885 20130101; H04N 21/435 20130101; H04N 21/4355 20130101; H04N 21/4884 20130101; H04N 21/2358 20130101; H04N 21/431 20130101; H04N 9/8205 20130101 |
International Class: | H04N 21/431 20060101 H04N021/431; H04N 21/435 20060101 H04N021/435; H04N 7/088 20060101 H04N007/088; H04N 9/82 20060101 H04N009/82; H04N 21/488 20060101 H04N021/488; H04N 21/235 20060101 H04N021/235 |
Foreign Application Data
Date | Code | Application Number |
Jan 26, 2016 | JP | 2016-012856 |
Claims
1. A reception apparatus, comprising: processing circuitry
configured to receive a video stream including video data and a
subtitle stream including caption information, the caption
information including caption display position information that
designates a caption display position by a relative position with
respect to a reference point of a caption display range, decode the
subtitle stream to obtain caption bitmap data, and superimpose the
caption bitmap data at the caption display position on the video
data based on the caption display position information, wherein the
caption display range in a video area is set based on aspect ratio
information indicating the caption display range in the video area
when the aspect ratio information exists in the caption information
and when a display of the reception apparatus is in a mode of
displaying the entire video area.
2. The reception apparatus according to claim 1, wherein the video
area is set as the caption display range when the aspect ratio
information does not exist in the caption information.
3. The reception apparatus according to claim 1, wherein the
caption information is in a timed text markup language (TTML) or
TTML-derived format.
4. The reception apparatus according to claim 3, wherein the aspect
ratio information is in a root container of the TTML or the
TTML-derived format.
5. The reception apparatus according to claim 1, wherein the
processing circuitry is further configured to determine a caption
display area in the caption display range based on the caption
display position, wherein the caption information includes
information related to a resizing process of the caption display
area based on the aspect ratio information.
6. A reception method, comprising: receiving a video stream
including video data and a subtitle stream including caption
information, the caption information including caption display
position information that designates a caption display position by
a relative position with respect to a reference point of a caption
display range; decoding the subtitle stream to obtain caption
bitmap data; and superimposing the caption bitmap data at the
caption display position on the video data based on the caption
display position information, wherein the caption display range in
a video area is set based on aspect ratio information indicating
the caption display range in the video area when the aspect ratio
information exists in the caption information and when a display of
the reception apparatus is in a mode of displaying the entire video
area.
7. The reception method according to claim 6, wherein the video
area is set as the caption display range when the aspect ratio
information does not exist in the caption information.
8. The reception method according to claim 6, wherein the subtitle
information is in a timed text markup language (TTML) or
TTML-derived format.
9. The reception method according to claim 8, wherein the aspect
ratio information exists in a root container of the TTML or the
TTML-derived format.
10. The reception method according to claim 6, further comprising:
determining a caption display area in the caption display range
based on the caption display position, wherein the caption
information includes information related to a resizing process of
the caption display area based on the aspect ratio information.
11. A non-transitory computer readable medium including executable
instructions, which when executed by a computer cause the computer
to execute a method for a reception apparatus, the method
comprising: receiving a video stream including video data and a
subtitle stream including caption information, the caption
information including caption display position information that
designates a caption display position by a relative position with
respect to a reference point of a caption display range; decoding
the subtitle stream to obtain caption bitmap data; and
superimposing the caption bitmap data at the caption display
position on the video data based on the caption display position
information, wherein the caption display range in a video area is
set based on aspect ratio information indicating the caption
display range in the video area when the aspect ratio information
exists in the caption information and when a display of the
reception apparatus is in a mode of displaying the entire video
area.
12. The method according to claim 11, wherein the video area is set
as the caption display range when the aspect ratio information does
not exist in the caption information.
13. The method according to claim 11, wherein the subtitle
information is in a timed text markup language (TTML) or
TTML-derived format.
14. The method according to claim 11, wherein the aspect ratio
information exists in a root container of the TTML or the
TTML-derived format.
15. The method according to claim 11, further comprising:
determining a caption display area in the caption display range
based on the caption display position, wherein the caption
information includes information related to a resizing process of
the caption display area based on the aspect ratio information.
Description
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This application is a continuation of U.S. patent
application Ser. No. 16/070,815, filed Jul. 18, 2018, which is a
National Stage application of PCT/JP 2017/001438, filed Jan. 17,
2017, which is based upon and claims the benefit of priority from
Japanese Patent Application No. 2016-012856 filed Jan. 26, 2016.
The entire contents of the above-identified applications are
incorporated herein by reference.
TECHNICAL FIELD
[0002] The present technology relates to a reception apparatus, a
reception method, and a transmission apparatus, and particularly
relates to a transmission apparatus or the like that transmits
caption information together with video data.
BACKGROUND ART
[0003] Conventionally, in broadcasting such as digital video
broadcasting (DVB), caption information has been transmitted as
bitmap data. In recent years, transmission of the caption
information in text character codes, that is, transmission on a
text basis, has been proposed. As a format for this text
information, for example, the timed text markup language (TTML) has
been proposed by the World Wide Web Consortium (W3C) (refer to
Patent Document 1).
[0004] Conventionally, there is a known technique of designating a
caption display position in caption display position information
included in caption information by a relative position with respect
to a video area, for example. In this case, when the aspect ratio
of the video area does not match the aspect ratio of the display
area, part of the caption might extend beyond the display area and
not be displayed, depending on the display method.
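The situation described above can be illustrated with a small sketch. It assumes a 16:9 video area of 1920x1080 pixels shown on a 4:3 display by cropping the left and right edges; the `Rect` type, the helper names, and all numbers are hypothetical and are not taken from the specification.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: float  # left edge in pixels
    y: float  # top edge in pixels
    w: float  # width in pixels
    h: float  # height in pixels


def region_in(range_rect: Rect, rel: Rect) -> Rect:
    """Map a region given as fractions (0..1) of range_rect to pixels."""
    return Rect(
        range_rect.x + rel.x * range_rect.w,
        range_rect.y + rel.y * range_rect.h,
        rel.w * range_rect.w,
        rel.h * range_rect.h,
    )


def is_fully_visible(region: Rect, visible: Rect) -> bool:
    """True when region lies entirely inside the visible rectangle."""
    return (
        region.x >= visible.x
        and region.y >= visible.y
        and region.x + region.w <= visible.x + visible.w
        and region.y + region.h <= visible.y + visible.h
    )


# Hypothetical numbers: the display shows only the center 1440 columns
# of the 1920-column video area (a 4:3 center crop).
video_area = Rect(0, 0, 1920, 1080)
display_area = Rect(240, 0, 1440, 1080)

# Caption region: origin (5%, 80%), extent 90% x 15% of the video area.
caption = region_in(video_area, Rect(0.05, 0.80, 0.90, 0.15))

print(caption)
print("fully visible:", is_fully_visible(caption, display_area))
```

With these assumed numbers the caption region starts to the left of the cropped display area, so part of it would not be shown.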
CITATION LIST
Patent Document
Patent Document 1: Japanese Patent Application Laid-Open No.
2012-169885
SUMMARY OF THE INVENTION
Problems to be Solved by the Invention
[0005] In order to avoid the case where part of the caption extends
beyond the display area and is not displayed depending on the
display method as described above, it is conceivable for the
receiving side to display the caption at a relative position with
respect to the video area actually displayed on the monitor
(display), that is, the display video area, rather than at the
relative position with respect to the video area. In that case,
when the aspect ratio of the video area is different from the
aspect ratio of the display video area, the caption display
position might be compressed solely in the horizontal direction,
and this might result in caption display that gives a sense of
discomfort to the viewer.
[0006] An object of the present technology is to make it possible
to perform caption display satisfactorily in a case where the
caption display position is designated as a relative position.
Solutions to Problems
[0007] A concept of the present technology is a reception apparatus
including:
[0008] a reception unit that receives a container containing a
video stream including video data and a subtitle stream including
caption information including caption display position information
that designates a caption display position by a relative position
with respect to a caption display range; and
[0009] a control unit that controls: video decode processing of
decoding the video stream to obtain video data; subtitle decode
processing of decoding the subtitle stream to obtain bitmap data of
a caption; display position control processing, performed in a case
where an aspect ratio of a video area is different from an aspect
ratio of the display video area, of determining a caption display
position on the basis of the caption display position information
with a display video area defined as a caption display range,
performing resize processing on the determined caption display
position, and performing display position control on the bitmap
data of the caption on the basis of the caption display position
that has undergone the resize processing; and video superimposition
processing of superimposing the caption bitmap data that has
undergone the display position control, on the video data.
[0010] In the present technology, a reception unit receives a
container containing a video stream including video data and a
subtitle stream including caption information. Here, the caption
display position in the caption display position information
included in the caption information is designated as a relative
position with respect to the caption display range. The control
unit controls video decode processing, subtitle decode processing,
display position control processing, and superimposition
processing.
[0011] The video decode processing decodes a video stream to obtain
video data. The subtitle decode processing decodes subtitle streams
to obtain bitmap data of captions. In the display position control
processing, in a case where the aspect ratio of the video area is
different from the aspect ratio of the display video area, the
display video area is defined as the caption display range, and the
caption display position is determined on the basis of the caption
display position information.
[0012] In the display position control processing, resize
processing is performed on the determined caption display position,
and display position control is performed on the caption bitmap
data on the basis of the caption display position that has
undergone the resize processing. In the video superimposition
processing, the bitmap data of the caption that has undergone the
display position control is superimposed on the video data. For
example, in the resize processing, in a case where the determined
caption display position has been compressed solely in the
horizontal direction, the position is also compressed in the
vertical direction in the same proportion.
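The resize step described above can be sketched as follows. This is a minimal illustration, not the implementation of the specification: it assumes a 1920x1080 video area, a 1440x1080 display video area, a region given as fractions of the caption display range, and a top line that stays fixed during the vertical compression. All names and numbers are hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: float  # left edge in pixels
    y: float  # top edge in pixels
    w: float  # width in pixels
    h: float  # height in pixels


def region_in(range_rect: Rect, rel: Rect) -> Rect:
    """Map a region given as fractions (0..1) of range_rect to pixels."""
    return Rect(
        range_rect.x + rel.x * range_rect.w,
        range_rect.y + rel.y * range_rect.h,
        rel.w * range_rect.w,
        rel.h * range_rect.h,
    )


def resize_vertical(region: Rect, ratio: float) -> Rect:
    """Scale the height by ratio; the top line stays fixed (assumption)."""
    return Rect(region.x, region.y, region.w, region.h * ratio)


video_area = Rect(0, 0, 1920, 1080)            # 16:9 (assumed)
display_video_area = Rect(240, 0, 1440, 1080)  # 4:3 center crop (assumed)
rel = Rect(0.10, 0.80, 0.80, 0.10)             # relative caption region

# Region as authored, relative to the whole video area.
original = region_in(video_area, rel)
# Region with the display video area taken as the caption display range:
# the width shrinks while the height is unchanged.
squeezed = region_in(display_video_area, rel)
# Compress the height in the same proportion as the width was compressed.
h_ratio = squeezed.w / original.w
resized = resize_vertical(squeezed, h_ratio)

for name, r in (("original", original), ("squeezed", squeezed), ("resized", resized)):
    print(f"{name:9s} width/height = {r.w / r.h:.3f}")
```

After the vertical compression the region has the same width-to-height ratio as the authored region, which is the sense in which the original shape is maintained.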
[0013] In this manner, in a case where the aspect ratio of the
video area is different from the aspect ratio of the display video
area in the present technology, the caption display position
determined with the display video area defined as the caption
display range further undergoes resize processing. Therefore, even
in a case where the aspect ratio of the video area is different
from the aspect ratio of the display video area, the original shape
can be maintained as the caption display position, making it
possible to perform display of captions satisfactorily without
giving the viewer a sense of discomfort.
[0014] Note that in the present technology, for example, in a case
where the display position control processing compresses the size
in the vertical direction by the resize processing of the caption
display position, the compression may be performed with a
predetermined line position fixed. With the compression performed
with the predetermined line position fixed in this manner, in a
case where there are two caption display positions, for example, it
is possible to maintain the interval in the vertical direction
between the two caption display positions even when the resize
processing is performed.
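One way to picture compression with a fixed line position is the sketch below. The choice of "top", "center", and "bottom" as selectable fixed lines, the function name, and the pixel values are assumptions made for illustration only.

```python
from typing import Tuple


def compress_vertical(y: float, h: float, ratio: float, fixed: str) -> Tuple[float, float]:
    """Compress the span [y, y + h] by ratio while keeping one line fixed.

    fixed is "top", "center", or "bottom" (hypothetical labels).
    Returns the new (y, h).
    """
    new_h = h * ratio
    if fixed == "top":
        return y, new_h
    if fixed == "center":
        return y + (h - new_h) / 2, new_h
    if fixed == "bottom":
        return y + (h - new_h), new_h
    raise ValueError(f"unknown fixed line: {fixed!r}")


# Two stacked caption regions with a 20-pixel gap (assumed values).
upper_y, upper_h = 700.0, 100.0
lower_y, lower_h = 820.0, 100.0
ratio = 0.75

# Fix the bottom line of the upper region and the top line of the
# lower region, so the facing edges do not move.
new_upper_y, new_upper_h = compress_vertical(upper_y, upper_h, ratio, "bottom")
new_lower_y, new_lower_h = compress_vertical(lower_y, lower_h, ratio, "top")
gap_fixed = new_lower_y - (new_upper_y + new_upper_h)

# For comparison: fixing the top line of both regions widens the gap.
alt_upper_y, alt_upper_h = compress_vertical(upper_y, upper_h, ratio, "top")
gap_top_only = new_lower_y - (alt_upper_y + alt_upper_h)

print("gap with facing lines fixed:", gap_fixed)
print("gap with both top lines fixed:", gap_top_only)
```

In this sketch the interval between the two regions is preserved only when the lines that face each other are the ones held fixed.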
[0015] Moreover, in the present technology, for example, caption
information contained in a subtitle stream may include information
related to the resize processing, and the display position control
processing may use the information related to the resize processing
to perform the resize processing on the determined caption display
position. For example, the information related to the resize
processing may be information indicating a line position to be set
as a fixed position in a case where the size in the vertical
direction is compressed in the resize processing of the caption
display position. With the resize processing performed on the basis
of the information related to the resize processing in this manner,
it is possible to easily perform the resize processing
appropriately.
[0016] In addition, another concept of the present technology
is
[0017] a reception apparatus including:
[0018] a reception unit that receives a container containing a
video stream including video data and a subtitle stream including
caption information including caption display position information
that designates a caption display position by a relative position
with respect to a caption display range; and
[0019] a control unit that controls: video decode processing of
decoding the video stream to obtain video data; subtitle decode
processing of decoding the subtitle stream to obtain caption bitmap
data; display position control processing performed in a case where
an aspect ratio of a video area is different from an aspect ratio
of the display video area and being processing of setting a caption
display range in the display video area, determining a caption
display position on the basis of the caption display position
information, and performing display position control on the caption
bitmap data on the basis of the determined caption display
position; and video superimposition processing of superimposing the
caption bitmap data that has undergone the display position
control, on the video data.
[0020] In the present technology, a reception unit receives a
container containing a video stream including video data and a
subtitle stream including caption information. Here, the caption
display position in the caption display position information
included in the caption information is designated as a relative
position with respect to the caption display range. The control
unit controls video decode processing, subtitle decode processing,
display position control processing, and superimposition
processing. The video decode processing decodes a video stream to
obtain video data. The subtitle decode processing decodes subtitle
streams to obtain bitmap data of captions.
[0021] In a case where the aspect ratio of the video area is
different from the aspect ratio of the display video area, the
display position control processing sets the caption display range
in the display video area, determines the caption display position
on the basis of the caption display position information, and
performs display position control on the caption bitmap data on the
basis of the determined caption display position. For example, the
caption display range having the same aspect ratio as the aspect
ratio of the video area is set in the display video area. The video
superimposition processing superimposes the caption bitmap data
that has undergone the display position control on the video data.
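Setting a caption display range with the same aspect ratio as the video area inside the display video area can be sketched as below. The sketch assumes the range is the largest such rectangle and that it is centered; the specification also allows the range to be designated by reference point information, which is not modeled here. All names and numbers are hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: float  # left edge in pixels
    y: float  # top edge in pixels
    w: float  # width in pixels
    h: float  # height in pixels


def fit_range(display: Rect, aspect_w: int, aspect_h: int) -> Rect:
    """Largest aspect_w:aspect_h rectangle inside display, centered (assumption)."""
    scale = min(display.w / aspect_w, display.h / aspect_h)
    w, h = aspect_w * scale, aspect_h * scale
    return Rect(display.x + (display.w - w) / 2, display.y + (display.h - h) / 2, w, h)


def region_in(range_rect: Rect, rel: Rect) -> Rect:
    """Map a region given as fractions (0..1) of range_rect to pixels."""
    return Rect(
        range_rect.x + rel.x * range_rect.w,
        range_rect.y + rel.y * range_rect.h,
        rel.w * range_rect.w,
        rel.h * range_rect.h,
    )


# 4:3 display video area cropped from a 16:9 video area (assumed values).
display_video_area = Rect(240, 0, 1440, 1080)

# Caption display range with the 16:9 aspect ratio of the video area.
caption_range = fit_range(display_video_area, 16, 9)

# Caption region: origin (10%, 80%), extent 80% x 10% of the range.
region = region_in(caption_range, Rect(0.10, 0.80, 0.80, 0.10))

print(caption_range)
print(region)
```

Because the caption display range keeps the aspect ratio of the video area, a region placed in it keeps its authored shape without a separate resize step.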
[0022] In this manner, according to the present technology, in a
case where the aspect ratio of the video area is different from the
aspect ratio of the display video area, the caption display range
is set in the display video area and the caption display position
is determined on the basis of the caption display position
information. Therefore, even in a case where the aspect ratio of
the video area is different from the aspect ratio of the display
video area, the original shape can be maintained as the caption
display position, making it possible to perform display of captions
satisfactorily without giving the viewer a sense of discomfort.
[0023] Note that in the present technology, for example, caption
information contained in a subtitle stream may include information
indicating a caption display range and the display position control
processing may use the information indicating a caption display
range to set the caption display range in the display video area.
In this case, for example, the information indicating a caption
display range may be reference point information and aspect ratio
information of a caption display range, or reference point
information of a caption display range. With the setting of the
caption display range performed on the basis of the information
indicating a caption display range in this manner, it is possible
to easily set the caption display range appropriately in the
display video area.
[0024] In addition, another concept of the present technology
is
[0025] a transmission apparatus including
[0026] a transmission unit that transmits a container containing a
video stream including video data and a subtitle stream including
caption information,
[0027] in which the caption display position in the caption display
position information included in the caption information is
designated by a relative position with respect to a caption display
range, and
[0028] the caption information includes
[0029] information related to resize processing on the caption
display position determined on the basis of the caption display
position information, performed on a receiving side in a case where
the aspect ratio of the video area is different from the aspect
ratio of the display video area.
[0030] In the present technology, a transmission unit transmits a
container containing a video stream including video data and a
subtitle stream including caption information. Examples of the
container correspond to containers of various formats such as an
MPEG-2 TS transport stream, an MMT transport stream adopted in the
digital broadcasting standard, and the ISOBMFF (MP4) used for
distribution via the Internet.
[0031] The caption display position in the caption display position
information included in the caption information is designated as a
relative position with respect to the caption display range. The
caption information includes information related to the resize
processing of the caption display position determined on the basis
of the caption display position information, performed on the
receiving side in a case where the aspect ratio of the video area
is different from the aspect ratio of the display video area. For
example, the information related to the resize processing may be
information indicating a line position to be set as a fixed
position in a case where the size in the vertical direction is
compressed in the resize processing of the caption display
position.
[0032] In this manner, the present technology is provided such that
the caption information includes information related to the resize
processing performed on the receiving side. Since the caption
information includes the information related to the resize
processing in this manner, the receiving side can perform the
resize processing on the basis of this information, making it
possible to easily perform the resize processing appropriately.
[0033] In addition, another concept of the present technology
is
[0034] a transmission apparatus including
[0035] a transmission unit that transmits a container containing a
video stream including video data and a subtitle stream including
caption information,
[0036] in which the caption display position in caption display
position information included in the caption information is
designated by a relative position with respect to a caption display
range, and
[0037] the caption information includes
[0038] information indicating the caption display range.
[0039] In the present technology, a transmission unit transmits a
container containing a video stream including video data and a
subtitle stream including caption information. The caption display
position in the caption display position information included in
the caption information is designated as a relative position with
respect to the caption display range. The caption information
includes information indicating a caption display range. For
example, the information indicating a caption display range may be
reference point information and aspect ratio information of a
caption display range, or reference point information of a caption
display range.
[0040] In this manner, in the present technology, caption
information includes information indicating a caption display
range. With the configuration in which information indicating a
caption display range is included in the caption information and
with the setting of the caption display range performed on the
basis of the information in this manner, it is possible on the
receiving side to easily set the caption display range
appropriately in the display video area.
Effects of the Invention
[0041] According to the present technology, it is possible to
perform caption display satisfactorily in a case where the caption
display position is designated as a relative position. Note that
effects described here in the present specification are provided
for purposes of exemplary illustration and are not intended to be
limiting. Still other additional effects may also be
contemplated.
BRIEF DESCRIPTION OF DRAWINGS
[0042] FIG. 1 is a block diagram illustrating an exemplary
configuration of a transmission-reception system according to an
embodiment.
[0043] FIG. 2 is a diagram illustrating an example of a caption
display position (region) determined by caption display position
information.
[0044] FIG. 3 is a diagram illustrating an exemplary structure of
TTML (one caption display position).
[0045] FIG. 4 is a diagram illustrating main information contained
in the TTML structure.
[0046] FIG. 5 is a diagram illustrating an exemplary structure of
TTML (two caption display positions).
[0047] FIG. 6 is a diagram illustrating a caption display example
(one caption display position) in a case where the aspect ratio of
the video area is the same as the aspect ratio of the display video
area.
[0048] FIG. 7 is a diagram illustrating a caption display example
(two caption display positions) in a case where the aspect ratio of
the video area is the same as the aspect ratio of the display video
area.
[0049] FIG. 8 is a diagram illustrating an example of caption
display (one caption display position) in a case where the aspect
ratio of the video area is different from the aspect ratio of the
display video area and in a case where the display video area is
defined as the caption display range and the caption display
position is determined on the basis of caption display position
information (first method).
[0050] FIG. 9 is a diagram illustrating a display example in a case
where resize processing is performed.
[0051] FIG. 10 is a diagram illustrating an example of caption
display (two caption display positions) in a case where the aspect
ratio of the video area is different from the aspect ratio of the
display video area and in a case where the display video area is
defined as the caption display range and the caption display
position is determined on the basis of caption display position
information, and resize processing is further performed (first
method).
[0052] FIG. 11 is a diagram illustrating an example of caption
display (one caption display position) in a case where the aspect
ratio of the video area is different from the aspect ratio of the
display video area and in a case where the caption display range is
set in the display video area and the caption display position is
determined on the basis of caption display position information
(second method).
[0053] FIG. 12 is a diagram illustrating an example of caption
display (two caption display positions) in a case where the aspect
ratio of the video area is different from the aspect ratio of the
display video area and in a case where the caption display range is
set in the display video area and the caption display position is
determined on the basis of caption display position information
(second method).
[0054] FIG. 13 is a block diagram illustrating an exemplary
configuration of a stream generation unit of a broadcast delivery
system.
[0055] FIG. 14 is a block diagram illustrating an exemplary
configuration of a television receiver.
[0056] FIG. 15 is a flowchart illustrating an exemplary procedure
of determining a caption display position and performing resize
processing in a CPU of a television receiver.
[0057] FIG. 16 is a diagram illustrating an example of an aspect
ratio of a video area and an aspect ratio of a monitor
(display).
[0058] FIG. 17 is a diagram illustrating exemplary determination as
to whether the mode is a mode for displaying an entire video
area.
[0059] FIG. 18 is a diagram illustrating determination of the
caption display position in the mode of displaying the entire video
area and in a case where the caption display range is not
designated.
[0060] FIG. 19 is a diagram illustrating the determination of the
caption display position in the mode of displaying the entire video
area and in a case where the caption display range is
designated.
[0061] FIG. 20 is a diagram illustrating the determination of the
caption display position in the mode not displaying the entire
video area and in a case where the caption display range is not
designated.
[0062] FIG. 21 is a diagram illustrating an exemplary structure
(one caption display position) of TTML in a case where reference
point information (RPoffset) alone is included as information
indicating a caption display range.
[0063] FIG. 22 is a diagram illustrating an exemplary structure
(two caption display positions) of TTML in a case where reference
point information (RPoffset) alone is included as information
indicating a caption display range.
[0064] FIG. 23 is a diagram for illustrating how the CPU of the
television receiver sets the caption display range in a case where
the reference point information (RPoffset) alone is given.
[0065] FIG. 24 is a flowchart illustrating another example of a
procedure of determining a caption display position and performing
resize processing in a CPU of a television receiver.
MODE FOR CARRYING OUT THE INVENTION
[0066] Embodiments of the present invention (hereinafter,
"embodiment(s)") will be described below. Note that the
description will be presented in the following order.
1. Embodiment
2. Modifications
1. Embodiment
[Exemplary Configuration of Transmission-Reception System]
[0067] FIG. 1 illustrates an exemplary configuration of a
transmission-reception system 10 according to an embodiment. The
transmission-reception system 10 includes a broadcast delivery
system 100 and a television receiver 200. The broadcast delivery
system 100 places a transport stream of MPEG-2 TS (hereinafter
simply referred to as "transport stream TS") as a container
(multiplexed stream) on a broadcast wave and transmits it.
[0068] The transport stream TS contains a video stream including
video data and a subtitle stream including caption (subtitle)
information. Herein, the caption information is text information of
captions of a predetermined format. While the text information
includes, for example, TTML or a TTML derived format or the like,
the embodiment is a case where TTML is used as the text information
format. The caption display position (region) in caption display
position information included in the TTML is designated by a
relative position (proportional value) with respect to a caption
display range.
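As an illustration of a position designated by proportional values, the sketch below converts percentage pairs into a pixel rectangle. It assumes the region is written with the TTML `tts:origin` and `tts:extent` attributes in percentage form; the function names and the sample values are hypothetical.

```python
from typing import Tuple


def parse_percent_pair(value: str) -> Tuple[float, float]:
    """Parse a pair such as "10% 80%" into fractions (0.1, 0.8)."""
    parts = value.split()
    if len(parts) != 2 or not all(p.endswith("%") for p in parts):
        raise ValueError(f"expected two percentages, got {value!r}")
    first, second = (float(p[:-1]) / 100.0 for p in parts)
    return first, second


def region_pixels(
    origin: str,
    extent: str,
    range_w: float,
    range_h: float,
    range_x: float = 0.0,
    range_y: float = 0.0,
) -> Tuple[float, float, float, float]:
    """Return (x, y, width, height) in pixels for a relative region.

    origin and extent are percentage pairs relative to the caption
    display range whose top-left corner is (range_x, range_y).
    """
    ox, oy = parse_percent_pair(origin)
    ew, eh = parse_percent_pair(extent)
    return (range_x + ox * range_w, range_y + oy * range_h, ew * range_w, eh * range_h)


# Sample values (assumed): tts:origin="10% 80%", tts:extent="80% 10%",
# with a 1920x1080 caption display range.
print(region_pixels("10% 80%", "80% 10%", 1920, 1080))
```

The same relative values give different pixel rectangles for different caption display ranges, which is why the choice of range matters on the receiving side.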
[0069] The TTML includes information related to the resize
processing of the caption display position determined on the basis
of the caption display position information, performed on the
receiving side in a case where the aspect ratio of the video area
is different from the aspect ratio of the display video area.
According to the present embodiment, the information related to the
resize processing is information indicating a line position to be
set as a fixed position in a case where the size in the vertical
direction is compressed in the resize processing on the caption
display position.
[0070] In addition, this TTML includes information indicating a
caption display range. According to the present embodiment, the
information indicating a caption display range is reference point
information and aspect ratio information of the caption display
range, or reference point information of the caption display
range.
[0071] The television receiver 200 receives the transport stream TS
sent from the broadcast delivery system 100. The television
receiver 200 performs decode processing on the video stream
including video data to obtain video data, and performs decode
processing on the subtitle stream including caption information to
obtain caption bitmap data. As described above, the caption display
position in the caption display position information included in
the caption information is designated as a relative position with
respect to the caption display range.
[0072] The television receiver 200 determines the caption display
position on the basis of the caption display position information
and performs display position control on caption bitmap data on the
basis of the determined caption display position. The television
receiver 200 superimposes the caption bitmap data that has
undergone the display position control on the video data to obtain
video data for display.
[0073] As the display position control for the caption bitmap data
in a case where the aspect ratio of the video area is different from
the aspect ratio of the display video area (the video area displayed
on the monitor), the television receiver 200 selectively performs
one of a first method and a second method described below.
[0074] With the first method, in a case where the aspect ratio of
the video area is different from the aspect ratio of the display
video area, the television receiver 200 defines the display video
area as the caption display range, determines the caption display
position on the basis of the caption display position information,
performs resize processing on the determined caption display
position, and performs display position control on the caption
bitmap data on the basis of the resized caption display position.
[0075] The resize processing restores the original shape of the
caption display position. For example, in a case where the
determined caption display position is compressed solely in the
horizontal direction, the position is also compressed in the
vertical direction in the same proportion. In a
case where the size in the vertical direction is compressed by the
resize processing, compression is performed in a state where a
predetermined line position such as a top line (upper line), a
bottom line (lower line), or a middle line (intermediate line) is
fixed.
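As an illustrative sketch only, the vertical compression with a fixed line position can be written in Python as follows. The function name `resize_vertical`, its parameters, and the string values "top", "bottom", and "middle" are assumptions introduced here for illustration and are not taken from the embodiment.

```python
def resize_vertical(top, height, scale, fixed_line="top"):
    """Compress a region vertically by `scale`, keeping one line fixed.

    `top` and `height` use any consistent unit (percent or pixels).
    `fixed_line` names the line that must not move; the three accepted
    values are illustrative assumptions.
    """
    new_height = height * scale
    if fixed_line == "top":
        new_top = top                                # upper edge stays put
    elif fixed_line == "bottom":
        new_top = top + height - new_height          # lower edge stays put
    elif fixed_line == "middle":
        new_top = top + (height - new_height) / 2    # center line stays put
    else:
        raise ValueError(f"unknown fixed_line: {fixed_line!r}")
    return new_top, new_height


top_fixed = resize_vertical(100, 40, 0.75, "top")
bottom_fixed = resize_vertical(100, 40, 0.75, "bottom")
middle_fixed = resize_vertical(100, 40, 0.75, "middle")
```

With a region starting at 100 and 40 units high, a scale of 0.75 leaves the top at 100 when the top line is fixed, moves it to 110 when the bottom line is fixed, and moves it to 105 when the middle line is fixed.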
[0076] With appropriate selection of the predetermined line
position, for example, in a case where there are two caption
display positions, it is possible to maintain an interval in the
vertical direction between the two caption display positions even
when the resize processing is performed. When the caption
information included in the subtitle stream includes, as information
related to the resize processing, information indicating the line
position to be set as a fixed position, the television receiver 200
can use that information.
[0077] With the second method, in a case where the aspect ratio of
the video area is different from the aspect ratio of the display
video area, the television receiver 200 sets a caption display
range in the display video area, determines the caption display
position on the basis of the caption display position information,
and performs display position control on the caption bitmap data on
the basis of the determined caption display position. In this case,
a caption display range having the same aspect ratio as the aspect
ratio of the video area is set in the display video area, for
example.
[0078] In a case where the caption information contained in the
subtitle stream includes information indicating a caption display
range, the television receiver 200 can appropriately set the
caption display range using the information. For example, the
television receiver 200 selects the second method when the caption
information contained in the subtitle stream includes information
indicating a caption display range, and selects the first method
when the information is not included.
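The selection rule in this example can be sketched as follows. This is a minimal sketch that assumes the decoded caption information is held in a dictionary and that the caption display range information, when present, is stored under the hypothetical key `display_range`; neither the data structure nor the key name comes from the embodiment.

```python
def select_method(caption_info):
    """Return "second" when caption display range information is present,
    otherwise "first" (the rule given as an example in the text)."""
    if caption_info.get("display_range") is not None:
        return "second"
    return "first"


with_range = select_method({"display_range": {"dispasp": "16:9"}})
without_range = select_method({"regions": ["r1"]})
```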
[0079] FIG. 2 illustrates an example of a caption display position
(region) determined by caption display position information. This
example illustrates a case of the TTML in which the caption display
position information is given by information indicating a base
point (origin) "origin=" OH % OV %, and by information indicating
an area (extent) of the caption display position "extent="EH % EV
%". The sign "RP" indicates a reference point which is the top-left
of the caption display range.
[0080] FIG. 2(a) illustrates an example in a case where the aspect
ratio of the video area is the same as the aspect ratio of the
display video area. In this example, the aspect ratio of the video
area is 16:9, the aspect ratio of the monitor is 16:9, and
the aspect ratio of the display video area is therefore 16:9. In this case,
the display video area is defined as the caption display range, and
the caption display position is determined on the basis of the
caption display position information designated by the relative
position with respect to the display video area.
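The conversion from the relative values of "origin" and "extent" to a pixel rectangle can be sketched as follows. The `Rect` type, the function name, and the concrete percentages (10, 80, 80, 10) are assumptions chosen for illustration; the text gives the values only symbolically as OH, OV, EH, and EV.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: float
    y: float
    width: float
    height: float


def region_in_pixels(display_range, origin_pct, extent_pct):
    """Map percentage origin/extent onto a caption display range.

    The reference point RP is the top-left corner of `display_range`.
    """
    oh, ov = origin_pct
    eh, ev = extent_pct
    return Rect(
        x=display_range.x + display_range.width * oh / 100,
        y=display_range.y + display_range.height * ov / 100,
        width=display_range.width * eh / 100,
        height=display_range.height * ev / 100,
    )


# A 16:9 display video area used as the caption display range.
full_hd = Rect(0, 0, 1920, 1080)
region = region_in_pixels(full_hd, origin_pct=(10, 80), extent_pct=(80, 10))
```

Under these assumed values the region starts at (192, 864) and measures 1536 by 108 pixels.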
[0081] FIG. 2(b) is an exemplary case where the aspect ratio of the
video area is different from the aspect ratio of the display video
area and where the display video area is defined as the caption
display range and the caption display position is determined on the
basis of the caption display position information (first method).
In this example, the aspect ratio of the video area is 16:9, the
aspect ratio of the monitor is 4:3, and the display method is
center-cut, so that the aspect ratio of the display video area is
4:3. In this case, the caption display position has the same width
in the vertical direction as in the case of FIG. 2(a), but its width
is compressed in the horizontal direction. As a result, the shape of
the caption display position is different from the case of FIG.
2(a).
[0082] FIG. 2(c) illustrates an exemplary case where the aspect
ratio of the video area is different from the aspect ratio of the
display video area, the caption display range is set in the display
video area, and the caption display position is determined on the
basis of the caption display position information (second method).
In this example, the aspect ratio of the video area is 16:9, the
aspect ratio of the monitor is 4:3, and the display method is
center-cut, so that the aspect ratio of the display video area is
4:3. In this case, the caption display position is compressed in
width in both the vertical and horizontal directions as compared
with the case of FIG. 2(a). In a case where the aspect ratio of the
caption display range to be set is 16:9, the shape of the caption
display position is the same as in the case of FIG. 2(a).
[0083] FIG. 3 illustrates an exemplary TTML structure. This example
is an exemplary case where there is one caption display position
(region). TTML is described on the basis of XML. In the tt root
container, language and namespace are defined. The namespace is
defined as a unique element name that can be uniquely identified in
all elements in a system or a standard system. Moreover, in
<tt>, "tts:extent" first declares a target area of video
100% as a source of the caption position information. "Fullvideo"
represents an entire video with a resolution of 3840 (H) × 2160
(V) in a case where 4K video is the target, and represents an
entire video with a resolution of 1920 (H) × 1080 (V) in a case
where 2K (full HD) video is the target.
[0084] While detailed description of the namespaces
"xmlns=http://www.w3.org/ns/ttml",
"xmlns:ttp=http://www.w3.org/ns/ttml#parameter", and
"xmlns:tts=http://www.w3.org/ns/ttml#styling" will be omitted, they
are namespaces, such as those for parameters and styling, that W3C
has reserved in advance as attribute classes of TTML.
[0085] "xmlns:dto=http://www.example.org/ns/displaytextoverlay" is
a newly defined namespace. This namespace is used for inserting
information indicating a caption display range. Specifically,
"dto:dispasp="16:9"" and "dto:RPoffset="Ax %, By %"" constitute the
information indicating a caption display range.
[0086] "dto:dispasp="16:9"" indicates the aspect ratio information
of the caption display range, and that the caption display range is
the area of aspect ratio 16:9. While the illustrated example
illustrates that the aspect ratio of the caption display range is
16:9, the aspect ratio of the caption display range may be
designated 4:3, 21:9, or the like, as illustrated in FIG. 4.
[0087] "dto:RPoffset="Ax %, By %"" indicates the reference point
information of the caption display range. As illustrated in FIG. 4,
with each of the horizontal and vertical dimensions of the display
video area taken as 100%, the position of the reference point (RP)
of the caption display range is indicated as a percentage offset
from the top-left of the display video area.
[0088] A header (head) contains an element of layout. The region ID
is indicated by "r1", and the starting point (origin) of the
caption display position and the area (extent) are illustrated by
relative positions as the caption display position information.
That is, "origin="OH % OV %"" indicates a base point of the caption
display position, indicating that the starting point is OH % from the
left and OV % from the top. In addition, "extent="EH % EV %""
indicates an area of the caption display position, indicating that
the horizontal width is EH % and the vertical width is EV %.
[0089] In the body, XML ID is indicated by "p1" and region ID is
indicated by "r1", while text data of caption (subtitle) is
described. Here, the text data is represented by "ABCDE".
"dto:scalingjustify=top" constitutes information related to the
resize processing, and indicates a line position to be set as a
fixed position in a case where the size in the vertical direction
is to be compressed by the resize processing of the caption display
position. While the illustrated example is a case where the line
position to be set as the fixed position is the top line (upper
line), it is also possible to designate the bottom line (lower
line), the middle line (intermediate line) or the like as
illustrated in FIG. 4.
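How a receiver might read these attributes can be sketched with Python's standard XML parser. The TTML fragment below is a hypothetical reconstruction: FIG. 3 is not reproduced here, so the placement of the "dto" attributes on the tt and p elements, the use of "xml:id" for the region and paragraph IDs, and all concrete percentage values are assumptions. Only the namespace URIs and attribute names are taken from the text.

```python
import xml.etree.ElementTree as ET

TT = "http://www.w3.org/ns/ttml"
TTS = "http://www.w3.org/ns/ttml#styling"
DTO = "http://www.example.org/ns/displaytextoverlay"
XML = "http://www.w3.org/XML/1998/namespace"

# Hypothetical document; values and attribute placement are assumptions.
SAMPLE = """<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:tts="http://www.w3.org/ns/ttml#styling"
    xmlns:dto="http://www.example.org/ns/displaytextoverlay"
    dto:dispasp="16:9" dto:RPoffset="0%, 12.5%">
  <head>
    <layout>
      <region xml:id="r1" tts:origin="10% 80%" tts:extent="80% 10%"/>
    </layout>
  </head>
  <body>
    <div>
      <p xml:id="p1" region="r1" dto:scalingjustify="top">ABCDE</p>
    </div>
  </body>
</tt>"""


def percents(value):
    """Parse 'A% B%' or 'A%, B%' into a tuple of floats."""
    cleaned = value.replace("%", " ").replace(",", " ")
    return tuple(float(part) for part in cleaned.split())


root = ET.fromstring(SAMPLE)

# Caption display range information (aspect ratio and reference point).
aspect = root.get(f"{{{DTO}}}dispasp")
rp_offset = percents(root.get(f"{{{DTO}}}RPoffset"))

# Caption display position information for each region.
regions = {
    r.get(f"{{{XML}}}id"): {
        "origin": percents(r.get(f"{{{TTS}}}origin")),
        "extent": percents(r.get(f"{{{TTS}}}extent")),
    }
    for r in root.iter(f"{{{TT}}}region")
}

# Caption text with its region and the line to keep fixed when resizing.
captions = [
    (p.get("region"), p.get(f"{{{DTO}}}scalingjustify"), p.text)
    for p in root.iter(f"{{{TT}}}p")
]
```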
[0090] FIG. 5 also illustrates an exemplary TTML structure. This
example is an exemplary case where there are two caption display
positions (regions). The tt root container is similar to the case
of FIG. 3, and thus description will be omitted.
[0091] A header (head) contains an element of layout. The region ID
of the first caption display position is indicated by "r1", and the
starting point (origin) of the caption display position and the
area (extent) are illustrated by relative positions as the caption
display position information. That is, "origin="OH1% OV1%""
indicates that the starting point is OH1% from the left and OV1%
from the top. In addition, "extent="EH1% EV1%"" indicates that the
horizontal width of the area is EH1% and the vertical width of the
area is EV1%.
[0092] Moreover, the region ID of the second caption display
position is indicated by "r2", and the starting point (origin) and
the area (extent) of the caption display position are illustrated
by relative positions as the caption display position information.
That is, "origin="OH2% OV2%"" indicates that the starting point is
OH2% from the left and OV2% from the top. In addition,
"extent="EH2% EV2%"" indicates that the horizontal width of the
area is EH2% and the vertical width of the area is EV2%.
[0093] In the body, in relation with the first caption position,
XML ID is indicated by "p1" and region ID is indicated by "r1",
while text data of caption (subtitle) is described. Here, the text
data is represented by "ABCDE". "dto:scalingjustify=bottom"
constitutes information related to the resize processing, and
indicates a line position to be set as a fixed position in a case
where the size in the vertical direction is to be compressed by the
resize processing of the caption display position. The illustrated
example illustrates a case where the line position to be set as the
fixed position is the bottom line (lower line).
[0094] Moreover, in the body, in relation with the second caption
position, XML ID is indicated by "p2" and region ID is indicated by
"r2", while text data of caption (subtitle) is described. Here, the
text data is represented by "FGH". "dto:scalingjustify=top"
constitutes information related to the resize processing, and
indicates a line position to be set as a fixed position in a case
where the size in the vertical direction is to be compressed by the
resize processing of the caption display position. In the
illustrated example, the line position to be set as the fixed
position is the top line (upper line).
[0095] FIG. 6 illustrates a display example of captions (subtitles)
in a case where the aspect ratio of the video area and the aspect
ratio of the display video area (video area displayed on the
monitor) are the same. The illustrated example is an exemplary case
where the aspect ratio of the video area is 16:9 and the aspect
ratio of the monitor is also 16:9, having the TTML structure (one
caption display position) as illustrated in FIG. 3.
[0096] In the illustrated example, the video area is indicated by a
broken line frame, while the monitor area is indicated by a solid
line frame. In this case, as illustrated by a one-dot chain line
frame, the display video area is defined as a caption display
range, and the caption display position (region) is determined on
the basis of caption display position information ("origin="OH % OV
%"", "extent="EH % EV %"") designated by a relative position with
respect to the range. The sign "RP" indicates a reference point
which is the top-left of the caption display range.
[0097] The caption "ABCDE" in text data is displayed at the caption
display position determined in this manner. Note that while in the
illustrated example, the frames indicating the video area, the
monitor area, and the caption display range are not aligned in
display, this illustration is presented for clearly displaying
individual frames, and the frames are aligned with each other in
practice. The following drawings use the same presentation, and
this explanation is not repeated.
[0098] FIG. 7 also illustrates a display example of captions
(subtitles) in a case where the aspect ratio of the video area and
the aspect ratio of the display video area (video area displayed on
the monitor) are the same. The illustrated example is an exemplary
case where the aspect ratio of the video area is 16:9 and the
aspect ratio of the monitor is also 16:9, having the TTML structure
(two caption display positions) as illustrated in FIG. 5.
[0099] In the illustrated example, the video area is indicated by a
broken line frame, while the monitor area is indicated by a solid
line frame. In this case, as illustrated by a one-dot chain line
frame, the display video area is defined as a caption display
range, and the first and second caption display positions (regions)
are determined on the basis of caption display position information
("origin="OH1% OV1%"", "extent="EH1% EV1%"", "origin="OH2% OV2%"",
and "extent="EH2% EV2%"") designated by a relative position with
respect to the range.
[0100] Then, the caption "ABCDE" in text data is displayed in the
first caption display position (first region), while the caption
"FGH" in text data is displayed in the second caption display
position (second region). In this case, the interval between the
two caption display positions (regions) is 10 lines, for
example.
[0101] FIG. 8 is a diagram illustrating an example of displaying
caption (subtitle) in a case where the aspect ratio of the video
area is different from the aspect ratio of the display video area
(video area displayed on the monitor) and in a case where the
display video area is defined as the caption display range and the
caption display position is determined on the basis of the caption
display position information (first method). The illustrated
example is an exemplary case where the aspect ratio of the video
area is 16:9 and the aspect ratio of the monitor is 4:3, having the
TTML structure (one caption display position) as illustrated in
FIG. 3.
[0102] In the illustrated example, the video area is indicated by a
broken line frame, while the monitor area is indicated by a solid
line frame. In this case, as illustrated by a one-dot chain line
frame, the display video area is defined as a caption display
range, and the caption display position (region) is determined on
the basis of caption display position information ("origin="OH % OV
%"", "extent="EH % EV %"") designated by a relative position with
respect to the range. In addition, the caption "ABCDE" in text data
is displayed at the caption display position.
[0103] In this case, while the caption display position has the
same width in the vertical direction as compared with the case of
FIG. 6, its width is compressed in the horizontal direction. In
this case, together with the compression of the width of the
caption display position, the font size of the caption is also
adjusted to a smaller size. As illustrated in the drawing, while
adjustment of the font size of the caption allows the relation
between the caption display position and the caption displayed in
the position to be aligned in the horizontal direction, the
relation between the caption display position and the caption
displayed in the position is not aligned in the vertical direction
in which the width of the caption display position is not
compressed. This gives the viewer a sense that the black area of
the caption display position is protruding beyond the caption.
[0104] In view of the above, the first method as described above
performs the resize processing on the determined caption display
position, so as to achieve alignment in the relationship between
the caption display position and the caption displayed on the
caption display position not solely in the horizontal direction but
also in the vertical direction. FIG. 9 illustrates a display
example in which resize processing is performed. In this case, the
caption display position compressed solely in the horizontal
direction by the determination based on the caption display
position information ("origin="OH % OV %"" and "extent="EH % EV
%"") is compressed by the resize processing in the same proportion
also in the vertical direction. In this case, as a result, the
caption display position is determined by caption display position
information ("origin="OH % OV %"", and "extent="EH % EVu %""). In
this case, the relationship would be EVu=3/4*EV.
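The factor 3/4 can be derived as follows, on the assumption that the horizontal compression equals the ratio of the 4:3 display video area width to the 16:9 video area width at equal height. The symbol s for the scale factor is introduced here for illustration.

```latex
s = \frac{4/3}{16/9} = \frac{4}{3}\cdot\frac{9}{16} = \frac{3}{4},
\qquad
EV_u = s \cdot EV = \frac{3}{4}\,EV
```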
[0105] When the resize processing is performed in this manner,
compression of the width in the vertical direction is performed in
a state where the predetermined line position is fixed. The
illustrated example is an example in which a predetermined line
position is set as a top line (upper line) on the basis of
information of "dto:scalingjustify=top" included in TTML. Note that
in the illustrated example, the broken line frame illustrates the
caption display position before compression of the width in the
vertical direction is performed.
[0106] FIG. 10 is also a diagram illustrating an example of
displaying a caption (subtitle) in a case where the aspect ratio of
the video area is different from the aspect ratio of the display
video area (video area displayed on the monitor) and in a case
where the display video area is defined as the caption display
range and the caption display position is determined on the basis
of the caption display position information (first method). The
illustrated example is an exemplary case of display where the
aspect ratio of the video area is 16:9 and the aspect ratio of the
monitor is 4:3, having the TTML structure (two caption display
positions) as illustrated in FIG. 5, with resize processing
performed.
[0107] In the illustrated example, the video area is indicated by a
broken line frame, while the monitor area is indicated by a solid
line frame. In this case, as illustrated by a one-dot chain line
frame, the display video area is defined as a caption display
range, and the first and second caption display positions (regions)
are determined on the basis of caption display position information
("origin="OH1% OV1%"", "extent="EH1% EV1%"", "origin="OH2% OV2%"",
and "extent="EH2% EV2%"") designated by a relative position with
respect to the range, and thereafter, resize processing is further
performed.
[0108] In this case, as a result, the first caption display
position (first region) is determined by caption display position
information ("origin="OH1% OV1%"" and "extent="EH1% EV1u %""). In
this case, the relationship would be EV1u=3/4*EV1. Similarly, in
this case, as a result, the second caption display position (second
region) is determined by caption display position information
("origin="OH2% OV2%"" and "extent="EH2% EV2u %""). In this case,
the relationship would be EV2u=3/4*EV2.
[0109] Then, the caption "ABCDE" in text data is displayed in the
first caption display position (first region), while the caption
"FGH" in text data is displayed in the second caption display
position (second region). In this case, the font size of the
caption is adjusted so as to be aligned in accordance with the
compression of the caption display position (region).
[0110] When the resize processing is performed, compression of the
width in the vertical direction is performed in a state where the
predetermined line position is fixed. The illustrated example is an
exemplary case where the predetermined line position is set to the
bottom line (lower line) with relation to the first caption display
position (first region) on the basis of the information of
"dto:scalingjustify=bottom" included in the TTML. Moreover, this is
an exemplary case where the predetermined line position is set to
the top line (upper line) with relation to the second caption
display position (second region) on the basis of the information of
"dto:scalingjustify=top" included in the TTML.
[0111] Selecting the predetermined line position for each of the
first and second caption display positions in this manner maintains,
for example, 10 lines as the interval between the first and second
caption display positions, similarly to the display example of
FIG. 7. This makes it possible to substantially maintain the
perceptibility of captions (subtitles) on the display image by the
viewer.
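The effect of the line selection on the interval can be checked numerically. The sketch below assumes two regions 100 lines high, separated by 10 lines, and a vertical scale of 3/4; the coordinates and helper names are hypothetical, and only the top and bottom options are handled.

```python
def compress(top, height, scale, fixed):
    """Return (top, bottom) after vertical compression with one edge fixed."""
    new_height = height * scale
    if fixed == "bottom":
        top = top + height - new_height  # keep the lower edge in place
    return top, top + new_height


def gap(upper, lower):
    """Interval between the bottom of `upper` and the top of `lower`."""
    return lower[0] - upper[1]


scale = 3 / 4
r1 = (300, 100)  # first region: lines 300 to 400
r2 = (410, 100)  # second region: lines 410 to 510, 10 lines below r1

# First region fixed at its bottom line, second at its top line.
kept = gap(compress(*r1, scale, "bottom"), compress(*r2, scale, "top"))

# Both regions fixed at their top lines, for comparison.
widened = gap(compress(*r1, scale, "top"), compress(*r2, scale, "top"))
```

With the facing edges fixed, the interval stays at 10 lines; fixing the top line of both regions would widen it to 35 lines.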
[0112] FIG. 11 is a diagram illustrating an example of displaying a
caption (subtitle) in a case where the aspect ratio of the video
area is different from the aspect ratio of the display video area
(video area displayed on the monitor) and in a case where the
caption display range is set in the display video area and the
caption display position is determined on the basis of the caption
display position information (second method). The illustrated
example is an exemplary case of display where the aspect ratio of
the video area is 16:9 and the aspect ratio of the monitor is 4:3,
having the TTML structure (one caption display position) as
illustrated in FIG. 3.
[0113] In the illustrated example, the video area is indicated by a
broken line frame, while the monitor area is indicated by a solid
line frame. In this case, as illustrated by a one-dot chain line
frame, the caption display range is set in the display video area
and the caption display position (region) is determined on the
basis of caption display position information ("origin="OH % OV
%"", "extent="EH % EV %"") designated by a relative position with
respect to the range. In addition, the caption "ABCDE" in text data
is displayed at the caption display position. In this case, the
font size of the caption is adjusted so as to be aligned in
accordance with the compression of the caption display position
(region).
[0114] In this case, a caption display range having the same aspect
ratio as the aspect ratio of the video area is set in the display
video area, for example. The illustrated example is an exemplary
case where the caption display range with the aspect ratio of 16:9
is set in the display video area on the basis of information
indicating a caption display range included in the TTML, that is,
the reference point information ("dto:RPoffset="Ax %, By %"") of
the caption display range and the aspect ratio information
("dto:dispasp="16:9"").
[0115] In this case, the caption display position is compressed in
width in both the vertical direction and the horizontal direction,
so that its shape is the same as in the case of FIG. 6. Thus, there
is no need to perform caption display position adjustment (resize
processing) in accordance with the adjustment of the font size of
the caption.
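One way to set the caption display range from the reference point and aspect ratio information is sketched below. The text does not state how the size of the range is derived, so the sketch assumes the range is the largest rectangle of the signaled aspect ratio that starts at the reference point and fits inside the display video area. The 1440 x 1080 display video area, the offset (0 %, 12.5 %), and the region percentages are hypothetical values.

```python
def caption_display_range(display_area, rp_offset_pct, aspect):
    """Return (x, y, width, height) of the caption display range.

    Assumption: the range is the largest rectangle of the given aspect
    ratio that starts at the reference point and fits in `display_area`.
    """
    x, y, w, h = display_area
    ax, by = rp_offset_pct
    num, den = (int(n) for n in aspect.split(":"))
    rp_x = x + w * ax / 100
    rp_y = y + h * by / 100
    avail_w = x + w - rp_x
    avail_h = y + h - rp_y
    range_w = min(avail_w, avail_h * num / den)
    range_h = range_w * den / num
    return rp_x, rp_y, range_w, range_h


# 4:3 display video area (center-cut of 16:9 video on a 4:3 monitor).
display_area = (0, 0, 1440, 1080)
rng = caption_display_range(display_area, (0, 12.5), "16:9")

# Region with extent 80% x 10% inside the 16:9 range and inside full HD.
shape_in_range = (rng[2] * 0.80) / (rng[3] * 0.10)
shape_in_full_hd = (1920 * 0.80) / (1080 * 0.10)
```

Under these assumptions the range is 1440 by 810 pixels starting at (0, 135), and a region laid out inside it has the same width-to-height ratio as the same region laid out on a 16:9 display video area.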
[0116] FIG. 12 also is a diagram illustrating an example of
displaying a caption (subtitle) in a case where the aspect ratio of
the video area is different from the aspect ratio of the display
video area (video area displayed on the monitor) and in a case
where the caption display range is set in the display video area
and the caption display position is determined on the basis of the
caption display position information (second method). The
illustrated example is an exemplary case of display where the
aspect ratio of the video area is 16:9 and the aspect ratio of the
monitor is 4:3, having the TTML structure (two caption display
positions) as illustrated in FIG. 5.
[0117] In the illustrated example, the video area is indicated by a
broken line frame, while the monitor area is indicated by a solid
line frame. In this case, as illustrated by a one-dot chain line
frame, the caption display range is set in the display video
area, and the first and second caption display positions (regions)
are determined on the basis of caption display position information
("origin="OH1% OV1%"", "extent="EH1% EV1%"", "origin="OH2% OV2%"",
and "extent="EH2% EV2%"") designated by a relative position with
respect to the range. Then, the caption "ABCDE" in text data is
displayed in the first caption display position (first region),
while the caption "FGH" in text data is displayed in the second
caption display position (second region).
[Exemplary Configuration of Stream Generation Unit of Broadcast
Delivery System]
[0118] FIG. 13 illustrates an exemplary configuration of a stream
generation unit 110 of the broadcast delivery system 100. The
stream generation unit 110 includes a control unit 111, a video
encoder 112, an audio encoder 113, a text format converter 114, a
subtitle encoder 115, and a TS formatter (multiplexer) 116.
[0119] The control unit 111 includes a central processing unit
(CPU), for example, and controls the operation of each portion of
the stream generation unit 110. The video encoder 112 receives video
data DV, encodes the video data DV, and generates a video stream
(PES stream) formed with a video PES packet having encoded video
data in the payload.
[0120] The audio encoder 113 receives audio data DA, encodes the
audio data DA, and generates an audio stream (PES stream) formed
with an audio PES packet having encoded audio data. The text format
converter 114 receives text data (character code) DT and converts
it into Timed Text Markup Language (TTML) as caption information
(refer to FIGS. 3 and 5).
[0121] The caption display position (region) in caption display
position information included in the TTML is designated by a
relative position (proportional value) with respect to a caption
display range. Moreover, this TTML includes information related to
resize processing of the caption display position to be performed
on the receiving side in a case where the aspect ratio of the video
area is different from the aspect ratio of the display video area,
for example, information indicating the line position to be set as
the fixed position in a case where the size in the vertical
direction is compressed by the resize processing of the caption
display position. In addition, this TTML includes information
indicating a caption display range (reference point information of
the caption display range and aspect ratio information).
[0122] The subtitle encoder 115 converts the TTML obtained by the
text format converter 114 into various segments, and generates a
subtitle stream (PES stream) formed with the subtitle PES packet
arranging these segments (caption information) in the payload.
[0123] The TS formatter 116 packetizes the video stream generated
by the video encoder 112, the audio stream generated by the audio
encoder 113, and the subtitle stream generated by the subtitle
encoder 115, into transport packets and multiplexes the packetized
streams, thereby obtaining a transport stream TS as a container
(multiplexed stream).
[0124] Operation of the stream generation unit 110 illustrated in
FIG. 13 will be briefly described. The video data DV is supplied to
the video encoder 112. The video encoder 112 encodes the video data
DV and generates a video stream (PES stream) formed with the video
PES packet having encoded video data in the payload. This video
stream is supplied to the TS formatter 116.
[0125] The audio data DA is also supplied to the audio encoder 113.
The audio encoder 113 encodes the audio data DA and generates an
audio stream (PES stream) formed with an audio PES packet having
encoded audio data. This audio stream is supplied to the TS
formatter 116.
[0126] Moreover, the text data (character code) DT is supplied to
the text format converter 114. This text format converter 114
obtains TTML as caption information (refer to FIGS. 3 and 5). The
TTML is supplied to the subtitle encoder 115. The subtitle encoder
115 converts the TTML into various segments and generates a
subtitle stream formed with subtitle PES packets, in each of
which these segments are arranged in the payload. This subtitle
stream is supplied to the TS formatter 116.
[0127] The TS formatter 116 packetizes the video stream generated
by the video encoder 112, the audio stream generated by the audio
encoder 113, and the subtitle stream generated by the subtitle
encoder 115, into transport packets and multiplexes the packetized
streams, thereby generating the transport stream TS as a container
(multiplexed stream).
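The packetizing step can be illustrated with a simplified sketch. It splits each PES packet into 188-byte MPEG-2 TS packets and pads the last one with an adaptation field. It is not the TS formatter 116 itself: the PID values are hypothetical, dummy byte strings stand in for real PES packets, the streams are simply concatenated rather than interleaved by timing, and program tables (PAT/PMT) and clock references are omitted.

```python
TS_PACKET_SIZE = 188              # fixed MPEG-2 TS packet length in bytes
SYNC_BYTE = 0x47                  # first byte of every TS packet
PAYLOAD_MAX = TS_PACKET_SIZE - 4  # bytes left after the 4-byte header


def packetize(pes, pid, counter=0):
    """Split one PES packet into TS packets; return (packets, next_counter)."""
    packets = []
    for pos in range(0, len(pes), PAYLOAD_MAX):
        chunk = pes[pos:pos + PAYLOAD_MAX]
        stuffing = PAYLOAD_MAX - len(chunk)
        # adaptation_field_control: 0b01 payload only, 0b11 adaptation + payload
        afc = 0b11 if stuffing else 0b01
        pusi = 0x40 if pos == 0 else 0x00  # payload_unit_start_indicator
        header = bytes([
            SYNC_BYTE,
            pusi | ((pid >> 8) & 0x1F),
            pid & 0xFF,
            (afc << 4) | counter,
        ])
        adaptation = b""
        if stuffing:
            # adaptation_field_length counts the bytes that follow it
            adaptation = bytes([stuffing - 1])
            if stuffing > 1:
                adaptation += b"\x00" + b"\xff" * (stuffing - 2)
        packets.append(header + adaptation + chunk)
        counter = (counter + 1) & 0x0F  # 4-bit continuity counter
    return packets, counter


def multiplex(streams):
    """Concatenate the TS packets of every stream (PID -> list of PES)."""
    out = bytearray()
    for pid, pes_packets in streams.items():
        counter = 0
        for pes in pes_packets:
            packets, counter = packetize(pes, pid, counter)
            out += b"".join(packets)
    return bytes(out)


# Dummy byte strings stand in for real PES packets; PIDs are hypothetical.
ts = multiplex({
    0x100: [bytes(400)],  # "video"
    0x101: [bytes(10)],   # "audio"
    0x102: [bytes(184)],  # "subtitle"
})
```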
[Exemplary Configuration of Television Receiver]
[0128] FIG. 14 illustrates an exemplary configuration of the
television receiver 200. The television receiver 200 includes a
reception unit 201, a TS analysis unit (demultiplexer) 202, a video
decoder 203, a video superimposing unit 204, a panel drive circuit
205, and a display panel 206 as a monitor (display). Moreover, the
television receiver 200 includes an audio decoder 207, an audio
output circuit 208, a speaker 209, and a subtitle decoder 210.
Moreover, the television receiver 200 includes a CPU 221, a flash
ROM 222, a DRAM 223, an internal bus 224, a remote control
reception unit 225, and a remote control transmitter 226.
[0129] The CPU 221 controls the operation of each portion of the
television receiver 200. The flash ROM 222 stores control software
and data. The DRAM 223 constitutes a work area of the CPU 221. The
CPU 221 loads the software and data read from the flash ROM 222
into the DRAM 223 to start the software, and controls each
portion of the television receiver 200.
[0130] The remote control reception unit 225 receives a remote
control signal (remote control code) transmitted from the remote
control transmitter 226, and supplies the received signal to the
CPU 221. The CPU 221 controls each portion of the television
receiver 200 on the basis of this remote control code. The CPU 221,
the flash ROM 222, and the DRAM 223 are connected to the internal
bus 224.
[0131] The reception unit 201 receives the transport stream TS sent
from the broadcast delivery system 100 over the broadcast waves. As
described above, the transport stream TS includes the video stream,
the audio stream, and the subtitle stream. The TS analysis unit 202
extracts the PES packet of each of the video stream, the audio
stream, and the subtitle stream, from the transport stream TS.
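The extraction performed by the TS analysis unit 202 can be sketched as follows. This is a minimal illustration, assuming fixed 188-byte MPEG-2 TS packets. The function name split_by_pid is hypothetical and does not appear in the embodiment. A practical demultiplexer would additionally parse the PAT/PMT tables to learn which PID carries which stream, check continuity counters, and reassemble PES packets from the grouped payloads.

```python
from collections import defaultdict

TS_PACKET_SIZE = 188  # MPEG-2 TS packets are 188 bytes long
SYNC_BYTE = 0x47      # every TS packet starts with this byte


def split_by_pid(ts: bytes) -> dict:
    """Group TS packet payloads by PID (hypothetical helper, not from the source)."""
    payloads = defaultdict(bytearray)
    for off in range(0, len(ts) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = ts[off:off + TS_PACKET_SIZE]
        if pkt[0] != SYNC_BYTE:
            raise ValueError(f"lost sync at byte offset {off}")
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]  # 13-bit packet identifier
        afc = (pkt[3] >> 4) & 0x3              # adaptation_field_control
        start = 4
        if afc & 0x2:                          # adaptation field present: skip it
            start += 1 + pkt[4]
        if afc & 0x1:                          # payload present
            payloads[pid] += pkt[start:]
    return {pid: bytes(data) for pid, data in payloads.items()}
```

The payload bytes collected for the video, audio, and subtitle PIDs would then be handed to the video decoder 203, the audio decoder 207, and the subtitle decoder 210, respectively.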
[0132] The audio decoder 207 performs decode processing on the
audio PES packet obtained by the TS analysis unit 202 and then
obtains audio data. The audio output circuit 208 performs required
processing such as D/A conversion and amplification on the audio
data, and supplies the processed data to the speaker 209. The video
decoder 203 performs decode processing on the video PES packet
obtained by the TS analysis unit 202 and then obtains video data.
Note that the video decoder 203 also performs resolution conversion
of video data as appropriate in accordance with the display mode or
the like. For example, in a case where the aspect ratio of the
video area is 16:9 and the aspect ratio of the monitor (display) is
4:3, and the display mode is the letterbox, the resolution
conversion of the video data is performed.
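The letterbox case mentioned above can be illustrated with a short calculation. The following is a sketch only: the function name letterbox and the panel size of 1024 x 768 pixels used in the usage note are assumptions chosen for illustration, not values from the embodiment.

```python
from fractions import Fraction


def letterbox(video_aspect: Fraction, panel_w: int, panel_h: int):
    """Return (x, y, w, h) of the scaled video inside the panel so that the
    entire video area stays visible (bars are added where needed)."""
    panel_aspect = Fraction(panel_w, panel_h)
    if video_aspect >= panel_aspect:
        # Video is relatively wider: fill the width, bars above and below.
        w = panel_w
        h = int(panel_w / video_aspect)
    else:
        # Video is relatively taller: fill the height, bars left and right.
        h = panel_h
        w = int(panel_h * video_aspect)
    return ((panel_w - w) // 2, (panel_h - h) // 2, w, h)
```

For a 16:9 video on a 4:3 panel of 1024 x 768 pixels, letterbox(Fraction(16, 9), 1024, 768) yields a 1024 x 576 picture placed 96 lines below the top of the panel.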
[0133] The subtitle decoder 210 performs decode processing on the
subtitle PES packet obtained by the TS analysis unit 202 to convert
text data (font data) of caption (subtitle) of each of the caption
display positions (regions) included in the TTML into bitmap data
(binary image information). In this case, the font size of the
caption is adjusted appropriately from the font size designated in
TTML under the control of the CPU 221 in accordance with the size
of the caption display position determined by caption display
position information or obtained by further resize processing.
[0134] Moreover, the subtitle decoder 210 extracts various types of
information from the TTML and supplies it to the CPU 221. This
information also includes attribute information defined by
<tt> and <head>. The CPU 221 determines the caption
display position on the basis of the caption display position
information and further performs resize processing on the
determined caption display position as necessary. Details of the
procedure of determination and resize processing on the caption
display position in the CPU 221 will be further described
below.
[0135] The video superimposing unit 204 superimposes the bitmap
data of the caption at each of the caption display positions
obtained from the subtitle decoder 210, on the video data obtained
by the video decoder 203 so as to obtain display video data. In
this case, the CPU 221 performs control so that the caption bitmap
data is superimposed at the caption display position determined on
the basis of the caption display position information or obtained
by the further resize processing, as described above.
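The superimposition itself can be pictured with the following sketch, which assumes that the frame is a list of rows of luminance values and that the caption bitmap is binary (1 marks a caption pixel). The function name superimpose and the color parameter are hypothetical. An actual video superimposing unit would typically blend in hardware and handle color and transparency.

```python
def superimpose(frame, bitmap, x, y, color=255):
    """Return a copy of `frame` with the set pixels of `bitmap` drawn so
    that the bitmap's top-left corner lands at column x, row y."""
    out = [row[:] for row in frame]
    for j, bitmap_row in enumerate(bitmap):
        for i, bit in enumerate(bitmap_row):
            fy, fx = y + j, x + i
            # Skip unset pixels and anything falling outside the frame.
            if bit and 0 <= fy < len(out) and 0 <= fx < len(out[fy]):
                out[fy][fx] = color
    return out
```

Here (x, y) corresponds to the top-left corner of the caption display position decided by the CPU 221.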
[0136] The panel drive circuit 205 drives the display panel 206 on
the basis of the display video data obtained by the video
superimposing unit 204. The display panel 206 includes a liquid
crystal display (LCD), an organic electroluminescence (EL) display,
and the like, for example.
[0137] Operation of the television receiver 200 illustrated in FIG.
14 will be briefly described. The reception unit 201 receives the
transport stream TS sent from the broadcast delivery system 100
over the broadcast waves. The transport stream TS includes the
video stream, the audio stream, and the subtitle stream. The
transport stream TS is supplied to the TS analysis unit 202. The TS
analysis unit 202 extracts the PES packet of each of the video
stream, the audio stream, and the subtitle stream, from the
transport stream TS.
[0138] The video PES packet extracted by the TS analysis unit 202
is supplied to the video decoder 203. In the video decoder 203,
decode processing is performed on the video PES packet so as to
obtain video data. In this case, the video decoder 203
appropriately converts the resolution of the video data according
to the display mode or the like.
[0139] Moreover, the subtitle PES packet extracted by the TS
analysis unit 202 is supplied to the subtitle decoder 210. The
subtitle decoder 210 performs decode processing on the subtitle PES
packet obtained by the TS analysis unit 202 and thus, bitmap data
of caption for each of the caption display positions to be
superimposed on the video data is obtained on the basis of the text
data included in TTML.
[0140] Moreover, the subtitle decoder 210 extracts various types of
information from the TTML and supplies it to the CPU 221. This
information also includes attribute information defined by
<tt> and <head>. The CPU 221 determines the caption
display position on the basis of the caption display position
information and further performs resize processing on the
determined caption display position as necessary.
[0141] The bitmap data of each of the caption display positions
output from subtitle decoder 210 is supplied to the video
superimposing unit 204. The video superimposing unit 204
superimposes the bitmap data of the caption at each of the caption
display positions, obtained from the subtitle decoder 210, on the
video data obtained by the video decoder 203 so as to obtain
display video data. In this case, the CPU 221 performs control so
that the caption bitmap data is superimposed at the caption display
position determined on the basis of the caption display position
information or obtained by the further resize processing.
[0142] The display video data obtained by the video superimposing
unit 204 is supplied to the panel drive circuit 205. The panel
drive circuit 205 drives the display panel 206 on the basis of the
display video data. With this configuration, an image on which a
caption (subtitle) is superimposed on each of the caption display
positions (regions) is displayed on the display panel 206.
[0143] Moreover, the audio PES packet extracted by the TS analysis
unit 202 is supplied to the audio decoder 207. The audio decoder
207 performs decode processing on the audio PES packet and then
obtains audio data. This audio data is supplied to the audio output
circuit 208. The audio output circuit 208 performs necessary
processing such as D/A conversion and amplification on the audio
data. Then, the processed audio data is supplied to the speaker
209. With this configuration, an audio output corresponding to the
display image of the display panel 206 is obtained from the speaker
209.
"Procedure of Determination and Resize Processing on Caption
Display Position"
[0144] The procedure of determination and resize processing on the
caption display position in the CPU 221 will be described in
detail. The flowchart of FIG. 15 illustrates an exemplary procedure
of determination and resize processing on the caption display
position in the CPU 221.
[0145] In this example, the aspect ratio of the video area is
assumed to be 16:9. There are two assumed cases, namely, a case
where the caption display range is not designated by TTML, as
illustrated in FIG. 16(a), and a case where the caption display
range is designated by TTML, as illustrated in FIGS. 16(b) and
16(c). Note that while this is an example in which the aspect
ratios of the designated caption display range are 16:9 and 4:3,
the aspect ratio of the designated caption display range is not
limited to these ratios. In addition, two cases are assumed for
the aspect ratio of the monitor (display), namely, 16:9 and 4:3.
[0146] The CPU 221 starts processing in step ST1, and then proceeds
to processing in step ST2. In this step ST2, the CPU 221 determines
whether the receiver display is in a mode of displaying the entire
video area. For example, in a case where the aspect ratio of the
monitor is 16:9 (refer to FIG. 17(a)) or in a case where the aspect
ratio of the monitor is 4:3 and the letterbox display method is
adopted (refer to FIG. 17(b)), it is determined that the mode is a mode of
displaying the entire video area. Moreover, for example, in a case
where the aspect ratio of the monitor is 4:3 and the center-cut
display method is adopted (refer to FIG. 17(c)), it is determined
that the mode is a mode of not displaying the entire video
area.
[0147] When the CPU 221 determines that the mode is the mode of
displaying the entire video area, the CPU 221 proceeds to the
processing in step ST3. In this step ST3, the CPU 221 determines
whether the caption display range is designated. For example, in a
case where reference point information (RPoffset) and aspect ratio
information (dispasp) of the caption display range exist in the tt
root container of TTML, it is determined that the caption display
range is designated.
[0148] When the caption display range is not designated, the CPU
221 proceeds to the processing of step ST4. In this step ST4, the
CPU 221 determines the caption display position (region) with the
display video area defined as the caption display range. At this
time, the CPU 221 defines the top-left of the display video area as
the reference point RP and determines the caption display position
(region) in accordance with an instruction of the caption display
position information ("origin="OH % OV %"" and "extent="EH % EV
%"") designated by the relative position with respect to the
caption display range.
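The determination of the caption display position (region) from the relative values can be sketched as follows. The Rect type and the function name region_in_range are hypothetical, and the percentages are assumed to have been parsed from the origin and extent values beforehand.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: float
    y: float
    w: float
    h: float


def region_in_range(rng: Rect, origin_pct, extent_pct) -> Rect:
    """Resolve origin="OH% OV%" and extent="EH% EV%" against a caption
    display range whose top-left corner is the reference point RP."""
    oh, ov = origin_pct
    eh, ev = extent_pct
    return Rect(
        x=rng.x + rng.w * oh / 100.0,
        y=rng.y + rng.h * ov / 100.0,
        w=rng.w * eh / 100.0,
        h=rng.h * ev / 100.0,
    )
```

With the display video area of 1920 x 1080 pixels used as the caption display range, origin "10% 80%" and extent "80% 10%" give a region at (192, 864) that is 1536 pixels wide and 108 pixels high.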
[0149] FIG. 18(a) illustrates an exemplary case where the monitor
has an aspect ratio of 16:9. FIG. 18(b) illustrates an exemplary
case where the aspect ratio of the monitor is 4:3 and the letterbox
display method is adopted. Note that the sign "RP" indicates a
reference point which is the top-left of the caption display
range.
[0150] After the processing of step ST4, the CPU 221 finishes the
processing in step ST5.
[0151] When the caption display range is designated in the
above-described step ST3, the CPU 221 proceeds to the processing of
step ST6. In this step ST6, the CPU 221 determines the caption
display position (region) in the designated caption display range.
At this time, the CPU 221 uses the aspect ratio information
(dispasp) as the information indicating a caption display range,
and sets a caption display range in the display video area.
Subsequently, the CPU 221 sets the top-left of the caption display
range as the reference point RP and determines the caption display
position (region) in accordance with the instruction of the caption
display position information ("origin="OH % OV %"" and "extent="EH
% EV %"") designated by the relative position with respect to the
caption display range.
[0152] FIG. 19(a) illustrates an exemplary case where the aspect
ratio of the monitor is 16:9 and the aspect ratio indicated by the
aspect ratio information (dispasp) is 16:9. FIG. 19(b) illustrates
an exemplary case where the aspect ratio of the monitor is 16:9 and
the aspect ratio indicated by the aspect ratio information
(dispasp) is 4:3.
[0153] Note that in a case where the aspect ratio indicated by the
aspect ratio information (dispasp) is different from the aspect
ratio of the monitor in this manner, the CPU 221 sets, at the
center of the display video area, a caption display range that has
the aspect ratio indicated by the aspect ratio information
(dispasp) and whose width in either the vertical direction or the
horizontal direction matches that of the display video area. In
the illustrated example, since the aspect ratio of the monitor is
16:9 and the aspect ratio indicated by the aspect ratio
information (dispasp) is 4:3, the widths in the vertical direction
match.
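The centering rule above can be expressed as a small calculation. This is a sketch under the assumption that the caption display range is the largest rectangle of the designated aspect ratio that fits in the display video area. The function name centered_range is hypothetical.

```python
from fractions import Fraction


def centered_range(area_w: int, area_h: int, dispasp: Fraction):
    """Return (x, y, w, h) of a rectangle with aspect ratio `dispasp`,
    centered in the display video area and matching it in one direction."""
    area_aspect = Fraction(area_w, area_h)
    if dispasp <= area_aspect:
        # Range is relatively narrower: vertical widths match.
        h = Fraction(area_h)
        w = h * dispasp
    else:
        # Range is relatively wider: horizontal widths match.
        w = Fraction(area_w)
        h = w / dispasp
    return (float((area_w - w) / 2), float((area_h - h) / 2),
            float(w), float(h))
```

For a display video area of 1920 x 1080 pixels and dispasp of 4:3, the result is a 1440 x 1080 range offset 240 pixels from the left edge, which corresponds to the situation of FIG. 19(b).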
[0154] FIG. 19(c) illustrates an exemplary case where the aspect
ratio of the monitor is 4:3, the letterbox display method is
adopted, and the aspect ratio indicated by the aspect ratio
information (dispasp) is 16:9. FIG. 19(d) illustrates an exemplary
case where the aspect ratio of the monitor is 4:3, the letterbox
display method is adopted, and the aspect ratio indicated by the
aspect ratio information (dispasp) is 4:3.
[0155] After the processing of step ST6, the CPU 221 finishes the
processing in step ST5.
[0156] When it is determined in the above-described step ST2 that
the mode is a mode of not displaying the entire video area, the
processing proceeds to step ST7. In this step ST7, the CPU 221
determines whether the caption display range is designated. For
example, in a case where reference point information (RPoffset) and
aspect ratio information (dispasp) of the caption display range
exist in the tt root container of TTML, it is determined that the
caption display range is designated.
[0157] When the caption display range is not designated, the CPU
221 proceeds to the processing of step ST8. In this step ST8, the
CPU 221 determines the caption display position (region) with the
display video area defined as the caption display range. At this
time, the CPU 221 defines the top-left of the display video area as
the reference point RP and determines the caption display position
(region) in accordance with an instruction of the caption display
position information ("origin="OH % OV %"" and "extent="EH % EV
%"") designated by the relative position with respect to the
caption display range.
[0158] The caption display position determined in this manner has a
width compressed solely in the horizontal direction. Therefore, the
CPU 221 further performs resize processing on the determined
caption display position to compress the width in the vertical
direction as well, so as to obtain a final caption display position.
In this case, the CPU 221 compresses the width in the vertical
direction in a state where a predetermined line position is fixed
on the basis of the information of "dto:scalingjustify=top"
included in the TTML, for example.
[0159] FIG. 20(a) illustrates an exemplary case where the aspect
ratio of the monitor is 4:3 and the center-cut display method is
adopted. The width of the caption display position in the vertical
direction is compressed from EV % to EVu % by resize
processing.
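The resize processing from EV % to EVu % can be sketched as follows. The compression factor is not stated numerically in this passage. The sketch assumes that it equals the ratio of the display video area aspect ratio to the video area aspect ratio, so that the region keeps its original shape. The function name resize_vertical is hypothetical, and the justify values "center" and "bottom" are assumptions added for illustration, since only "top" is mentioned in the description.

```python
from fractions import Fraction


def resize_vertical(ov: float, ev: float, video_aspect: Fraction,
                    display_aspect: Fraction, justify: str = "top"):
    """Compress a region's vertical extent (percent units) by the factor by
    which it was already compressed horizontally.

    Assumption: factor = display_aspect / video_aspect.
    Returns the resized (origin, extent) pair in the vertical direction.
    """
    factor = float(display_aspect / video_aspect)
    ev_u = ev * factor
    if justify == "top":
        ov_u = ov                    # top line stays fixed
    elif justify == "bottom":
        ov_u = ov + (ev - ev_u)      # bottom line stays fixed (hypothetical value)
    elif justify == "center":
        ov_u = ov + (ev - ev_u) / 2  # center line stays fixed (hypothetical value)
    else:
        raise ValueError(f"unknown justify value: {justify!r}")
    return ov_u, ev_u
```

Under this assumption, a 16:9 video area shown center-cut on a 4:3 monitor gives a factor of 0.75, so a region with OV = 80 and EV = 16 becomes EVu = 12 with the top line kept at 80 %.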
[0160] After the processing of step ST8, the CPU 221 finishes the
processing in step ST5.
[0161] When the caption display range is designated in the
above-described step ST7, the CPU 221 proceeds to the processing of
step ST9. In this step ST9, the CPU 221 determines the caption
display position (region) in the designated caption display range.
At this time, the CPU 221 uses the information indicating a caption
display range (reference point information (RPoffset) and aspect
ratio information (dispasp)) so as to set a caption display range
on the display video area.
[0162] In this case, the CPU 221 sets the position shifted from the
top-left of the display video area by the reference point
information (RPoffset) as the top-left of the caption display
range, and then, sets the range corresponding to the aspect ratio
indicated by the aspect ratio information (dispasp). In this case,
the horizontal direction width of the caption display range matches
the horizontal direction width of the display video area.
[0163] Subsequently, the CPU 221 sets the top-left of the caption
display range that has been set as above as the reference point RP
and determines the caption display position (region) in accordance
with the instruction of the caption display position information
("origin="OH % OV %"" and "extent="EH % EV %"") designated by the
relative position with respect to the caption display range.
[0164] FIG. 20(b) illustrates an exemplary case where the aspect
ratio of the monitor is 4:3, the center-cut display method is
adopted, and the aspect ratio indicated by the aspect ratio
information (dispasp) is 16:9. FIG. 20(c) illustrates an exemplary
case where the aspect ratio of the monitor is 4:3, the center-cut
display method is adopted, and the aspect ratio indicated by the
aspect ratio information (dispasp) is 4:3.
[0165] After the processing of step ST9, the CPU 221 ends the
processing in step ST5.
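The branching of the flowchart of FIG. 15 described in steps ST2 to ST9 can be summarized in the following sketch. The function names and the dictionary keys used to represent the contents of the tt root container are illustrative assumptions. Only the step labels are taken from the description.

```python
def is_range_designated(tt_attrs: dict) -> bool:
    """The caption display range counts as designated when both the
    reference point information and the aspect ratio information exist
    in the tt root container (key names here are illustrative)."""
    return "RPoffset" in tt_attrs and "dispasp" in tt_attrs


def select_step(shows_entire_video: bool, range_designated: bool) -> str:
    """Map the decisions of step ST2 and of step ST3 or ST7 to the step
    that determines the caption display position."""
    if shows_entire_video:                           # ST2: entire video area shown
        return "ST6" if range_designated else "ST4"  # decision of ST3
    return "ST9" if range_designated else "ST8"      # decision of ST7
```

Only the branch leading to step ST8 involves the additional resize processing. In the other three branches the caption display position follows directly from the caption display range and the relative position values.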
[0166] As described above, in a case where the aspect ratio of the
video area is different from the aspect ratio of the display video
area in the transmission-reception system 10 illustrated in FIG. 1,
the television receiver 200 either obtains the final caption
display position by further performing resize processing on the
caption display position determined on the basis of caption display
position information with the display video area defined as the
caption display range, or sets the caption display range in the
display video area and determines the caption display position on
the basis of the caption display position information. Therefore,
even in a case where the aspect ratio of the video area is
different from the aspect ratio of the display video area, the
original shape can be maintained as the caption display position,
making it possible to perform display of captions satisfactorily
without giving the viewer a sense of discomfort.
[0167] Moreover, in the transmission-reception system 10
illustrated in FIG. 1, the broadcast delivery system 100 includes,
in the TTML as caption information, information related to resize
processing to be performed on the receiving side such as
information indicating the line position to be a fixed position in
a case where the vertical direction size is compressed by the
resize processing of the caption display position. Therefore, this
enables the receiving side to easily perform the resize processing
appropriately on the basis of this information.
[0168] Moreover, in the transmission-reception system 10
illustrated in FIG. 1, the broadcast delivery system 100 includes
information indicating a caption display range in TTML as caption
information. Therefore, with the setting of the caption display
range on the basis of the information, it is possible on the
receiving side to easily set the caption display range
appropriately in the display video area.
2. Modification
[0169] Note that the above-described embodiment is an example in
which the broadcast delivery system 100 includes the reference
point information (RPoffset) and the aspect ratio information
(dispasp) as the information indicating a caption display range in
the TTML. It is, however, conceivable that the broadcast delivery
system 100 includes the reference point information (RPoffset)
alone as the information indicating a caption display range, in the
TTML. FIG. 21 and FIG. 22 illustrate an example of the TTML
structure in this case. While the exemplary TTML structures are not
described in detail, the structures are similar to the exemplary
TTML structures illustrated in FIGS. 3 and 5 except that there is
no aspect ratio information (dispasp) of the caption display
range.
[0170] An example of how the CPU 221 of the television receiver 200
sets the caption display range in a case where the reference point
information (RPoffset) alone is given will be described with
reference to FIG. 23. The illustrated example is a case where the
aspect ratio of the video area is 16:9 while the aspect ratio of
the display video area is 4:3.
[0171] On the basis of the reference point information (RPoffset),
the CPU 221 initially sets the position shifted from the top-left
of the display video area by the reference point information
(RPoffset) as the reference point RP of the caption display range.
The center position of the display video area is defined as OP, and
the position point-symmetric to the reference point RP with respect
to OP is defined as TP. Moreover, the position line-symmetric to
the reference point RP with respect to a horizontal line JK passing
through OP is defined as VP. Moreover, the position line-symmetric
to the reference point RP with respect to a vertical line ST
passing through OP is defined as HP. Then, a rectangular area
surrounded by RP-HP-TP-VP is set as the caption display range.
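The construction of the four corner points can be sketched as follows. The sketch assumes that the offset given by the reference point information (RPoffset) has already been converted to pixels relative to the top-left of the display video area. The function name range_from_reference_point is hypothetical.

```python
def range_from_reference_point(area_w: float, area_h: float,
                               rp_dx: float, rp_dy: float):
    """Derive the corners of the caption display range from RP alone.

    (rp_dx, rp_dy) is the offset of RP from the top-left of the display
    video area, assumed here to be expressed in pixels.
    """
    op = (area_w / 2, area_h / 2)                # center OP of the display video area
    rp = (rp_dx, rp_dy)
    hp = (2 * op[0] - rp[0], rp[1])              # mirror of RP across vertical line ST
    vp = (rp[0], 2 * op[1] - rp[1])              # mirror of RP across horizontal line JK
    tp = (2 * op[0] - rp[0], 2 * op[1] - rp[1])  # point reflection of RP through OP
    return {"RP": rp, "HP": hp, "TP": tp, "VP": vp}
```

For a display video area of 1440 x 1080 pixels and RP at (120, 90), the range spans from (120, 90) to (1320, 990), so that it is centered on OP by construction.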
[0172] In this manner, in a case where the broadcast delivery
system 100 sends solely the reference point information (RPoffset)
as the information indicating a caption display range, it is
possible to designate the caption display range more flexibly compared
to the case where both the reference point information (RPoffset)
and the aspect ratio information (dispasp) are sent.
[0173] The flowchart of FIG. 24 illustrates an exemplary procedure
of determination and resize processing on the caption display
position in the CPU 221 of the television receiver 200 in a case
where solely the reference point information (RPoffset) is sent as
the information indicating a caption display range. In FIG. 24,
portions corresponding to those in FIG. 15 are denoted by the same
reference numerals.
[0174] When it is determined in step ST2 that the mode is a mode of
displaying the entire video area, the CPU 221 determines in step
ST4 the caption display position (region) with the display video
area defined as the caption display range. At this time, the CPU
221 defines the top-left of the display video area as the reference
point RP and determines the caption display position (region) in
accordance with an instruction of the caption display position
information ("origin="OH % OV %"" and "extent="EH % EV %"")
designated by the relative position with respect to the caption
display range.
[0175] After the processing of step ST4, the CPU 221 finishes the
processing in step ST5.
[0176] While detailed description is omitted, the other steps of
the flowchart of FIG. 24 are similar to the steps of the flowchart
of FIG. 15.
[0177] Moreover, the above-described embodiment is an example of
using TTML as text information of a caption in a predetermined
format. The present technology, however, is not limited to this,
and it is conceivable to use other text information having
information equivalent to TTML. For example, a derived format of
TTML may be used.
[0178] Moreover, while the above-described embodiment illustrates a
case where the transmission-reception system 10 includes the
broadcast delivery system 100 and the television receiver 200, the
configuration of the transmission-reception system to which the
present technology can be applied is not limited to this. For
example, the portion of the television receiver 200 may be
configured as a set top box and a monitor connected via a digital
interface such as a high-definition multimedia interface (HDMI).
Note that "HDMI" is a
registered trademark.
[0179] Moreover, the above-described embodiment illustrates an
example in which the container is a transport stream of MPEG-2 TS.
Needless to say, the present technology can be similarly applied to
the case where the container is a transport stream of MMT, a
DASH/ISOBMFF stream, or the like.
[0180] Moreover, the present technology may also be configured as
below.
[0181] (1) A reception apparatus including:
[0182] a reception unit that receives a container of a
predetermined format containing a video stream including video data
and a subtitle stream including caption information;
[0183] a video decoding unit that performs decode processing on the
video stream to obtain video data; and
[0184] a subtitle decoding unit that performs decode processing on
the subtitle stream to obtain bitmap data of a caption;
[0185] in which a caption display position is designated by a
relative position with respect to a caption display range in
caption display position information included in the caption
information,
[0186] the reception apparatus further including:
[0187] a display control unit that, in a case where an aspect ratio
of a video area is different from an aspect ratio of the display
video area, determines a caption display position on the basis of
the caption display position information with a display video area
defined as a caption display range, performs resize processing on
the determined caption display position, and performs display
position control on the bitmap data of the caption on the basis of
the caption display position that has undergone the resize
processing; and
[0188] a video superimposing unit that superimposes the bitmap data
of the caption that has undergone the display position control, on
the video data.
[0189] (2) The reception apparatus according to (1),
[0190] in which in a case where the size in the vertical direction
is compressed by the resize processing of the caption display
position, the display control unit performs compression in a state
where a predetermined line position is fixed.
[0191] (3) The reception apparatus according to (1) or (2),
[0192] in which the caption information contained in the subtitle
stream includes information related to the resize processing, and
the display control unit uses the information related to the resize
processing to perform the resize processing on the determined
caption display position.
[0193] (4) A reception method including:
[0194] a reception step, executed by a reception unit, of receiving
a container of a predetermined format containing a video stream
including video data and a subtitle stream including caption
information;
[0195] a video decoding step of performing decode processing on the
video stream to obtain video data; and
[0196] a subtitle decoding step of performing decode processing on
the subtitle stream to obtain bitmap data of a caption;
[0197] in which a caption display position is designated by a
relative position with respect to a caption display range in
caption display position information included in the caption
information,
[0198] the reception method further including:
[0199] a display control step, performed in a case where an aspect
ratio of a video area is different from an aspect ratio of the
display video area, of determining a caption display position on
the basis of the caption display position information with a
display video area defined as a caption display range, performing
resize processing on the determined caption display position, and
performing display position control on the bitmap data of the
caption on the basis of the caption display position that has
undergone the resize processing; and
[0200] a video superimposing step of superimposing the bitmap data
of the caption that has undergone the display position control, on
the video data.
[0201] (5) A reception apparatus including:
[0202] a reception unit that receives a container of a
predetermined format containing a video stream including video data
and a subtitle stream including caption information;
[0203] a video decoding unit that performs decode processing on the
video stream to obtain video data; and
[0204] a subtitle decoding unit that performs decode processing on
the subtitle stream to obtain bitmap data of a caption;
[0205] in which a caption display position is designated by a
relative position with respect to a caption display range in
caption display position information included in the caption
information,
[0206] the reception apparatus further including:
[0207] a display control unit that, in a case where an aspect ratio
of a video area is different from an aspect ratio of the display
video area, sets a caption display range in the display video area,
determines a caption display position on the basis of the caption
display position information, and performs display position control
on the bitmap data of the caption on the basis of the determined
caption display position; and
[0208] a video superimposing unit that superimposes the bitmap data
of the caption that has undergone the display position control, on
the video data.
[0209] (6) The reception apparatus according to (5),
[0210] in which the caption information contained in the subtitle
stream includes information indicating the caption display range,
and
[0211] the display control unit sets
[0212] a caption display range in the display video area using the
information indicating the caption display range.
[0213] (7) The reception apparatus according to (6),
[0214] in which the information indicating the caption display
range is reference point information and aspect ratio information
of the caption display range, or reference point information of the
caption display range.
[0215] (8) A reception method including:
[0216] a reception step, executed by a reception unit, of receiving
a container of a predetermined format containing a video stream
including video data and a subtitle stream including caption
information;
[0217] a video decoding step of performing decode processing on the
video stream to obtain video data; and
[0218] a subtitle decoding step of performing decode processing on
the subtitle stream to obtain bitmap data of a caption;
[0219] in which a caption display position is designated by a
relative position with respect to a caption display range in
caption display position information included in the caption
information,
[0220] the reception method further including:
[0221] a display control step, performed in a case where an aspect
ratio of a video area is different from an aspect ratio of the
display video area, of setting a caption display range in the
display video area, determining a caption display position on the
basis of the caption display position information, and performing
display position control on the bitmap data of the caption on the
basis of the determined caption display position; and
[0222] a video superimposing step of superimposing the bitmap data
of the caption that has undergone the display position control, on
the video data.
[0223] (9) A transmission apparatus including a transmission unit
that transmits a container of a predetermined format containing a
video stream including video data and a subtitle stream including
caption information,
[0224] in which the caption display position in the caption display
position information included in the caption information is
designated by a relative position with respect to the caption
display range, and
[0225] the caption information includes
[0226] information related to resize processing on the caption
display position determined on the basis of the caption display
position information, performed on a receiving side in a case where
the aspect ratio of the video area is different from the aspect
ratio of the display video area.
[0227] (10) The transmission apparatus according to (9),
[0228] in which the information related to the resize processing is
information indicating a line position to be set as a fixed
position in a case where the size in the vertical direction is
compressed in the resize processing of the caption display
position.
[0229] (11) A transmission apparatus including a transmission unit
that transmits a container of a predetermined format containing a
video stream including video data and a subtitle stream including
caption information,
[0230] in which the caption display position in the caption display
position information included in the caption information is
designated by a relative position with respect to a caption display
range, and
[0231] the caption information includes
[0232] information indicating the caption display range.
[0233] (12) The transmission apparatus according to (11),
[0234] in which the information indicating the caption display
range is reference point information and aspect ratio information
of the caption display range, or reference point information of the
caption display range.
[0235] Main features of the present technology include capability,
in a case where the aspect ratio of the video area is different
from the aspect ratio of the display video area, of obtaining a
final caption display position by further performing resize
processing on a caption display position determined on the basis of
caption display position information with the display video area
defined as the caption display range, or setting the caption
display range in the display video area and determining the caption
display position on the basis of the caption display position
information. With this configuration, it is possible to maintain an
original shape as the caption display position in a case where the
aspect ratio of the video area is different from the aspect ratio
of the display video area, enabling display of captions
satisfactorily without giving a viewer a sense of discomfort (refer
to FIG. 20).
REFERENCE SIGNS LIST
[0236] 10 Transmission-reception system
[0237] 100 Broadcast delivery system
[0238] 110 Stream generation unit
[0239] 111 Control unit
[0240] 112 Video encoder
[0241] 113 Audio encoder
[0242] 114 Text format converter
[0243] 115 Subtitle encoder
[0244] 116 TS formatter
[0245] 200 Television receiver
[0246] 201 Reception unit
[0247] 202 TS analysis unit
[0248] 203 Video decoder
[0249] 204 Video superimposing unit
[0250] 205 Panel drive circuit
[0251] 206 Display panel
[0252] 207 Audio decoder
[0253] 208 Audio output circuit
[0254] 209 Speaker
[0255] 210 Subtitle decoder
[0256] 221 CPU
* * * * *