U.S. patent application number 15/860494 was filed with the patent office on 2018-01-02 and published on 2018-07-05 as publication number 20180189980, for a method and system for providing virtual reality (VR) video transcoding and broadcasting.
This patent application is currently assigned to Black Sails Technology Inc. The applicant listed for this patent is Black Sails Technology Inc. The invention is credited to Chia-Chi Chang, Yongtao Tang, Zhuo Wang, Ruoxi Zhao, and Haoyan Zu.
Publication Number | 20180189980
Application Number | 15/860494
Family ID | 62711388
Publication Date | 2018-07-05
United States Patent Application 20180189980, Kind Code A1
Wang; Zhuo; et al.
July 5, 2018
Method and System for Providing Virtual Reality (VR) Video
Transcoding and Broadcasting
Abstract
Disclosed are a method and a system for providing virtual reality
(VR) video transcoding and broadcasting. The method comprises:
obtaining a user's viewport; processing a VR video data into a basic
video set and an enhancement video set in accordance with the
user's viewport, wherein the basic video set comprises a plurality
of basic video segments, the enhancement video set comprises a
plurality of enhancement video segments, and the playback effect of
the sum of the basic video segments and the enhancement video
segments is better than that of the basic video segments; downloading
the basic video segments and the enhancement video segments; and
displaying a sum of two video data obtained by adding the basic
video segments and the enhancement video segments in accordance
with the user's viewport. According to the embodiments of the
present disclosure, the VR video data is processed into a basic
video set and an enhancement video set, and a video data obtained
by adding the basic video segments and the enhancement video
segments in accordance with the user's viewport is displayed. Thus,
viewing experience is ensured while the amount of downloaded data
is reduced and transmission efficiency is improved.
Inventors: Wang; Zhuo (Sunnyvale, CA); Tang; Yongtao (San Leandro, CA); Zhao; Ruoxi (San Jose, CA); Zu; Haoyan (Newark, CA); Chang; Chia-Chi (San Jose, CA)
Applicant: Black Sails Technology Inc., Sunnyvale, CA, US
Assignee: Black Sails Technology Inc., Sunnyvale, CA
Family ID: 62711388
Appl. No.: 15/860494
Filed: January 2, 2018
Related U.S. Patent Documents
Application Number: 62/441,936 | Filing Date: Jan 3, 2017
Current U.S. Class: 1/1
Current CPC Class: H04N 21/2335 20130101; H04N 5/76 20130101; H04N 13/161 20180501; H04N 21/234381 20130101; G06T 9/001 20130101; H04N 13/275 20180501; G06T 15/04 20130101; H04L 43/0888 20130101; H04N 9/8715 20130101; H04N 21/234345 20130101; H04N 13/398 20180501; H04N 13/378 20180501; H04N 21/231 20130101; H04N 13/344 20180501; H04N 13/122 20180501; H04N 13/189 20180501; H04N 13/117 20180501; G06F 3/012 20130101; H04L 67/38 20130101; G06T 3/40 20130101; G06T 15/205 20130101; H04L 65/601 20130101; H04N 13/332 20180501; H04N 21/816 20130101; H04N 19/44 20141101; H04N 19/70 20141101; H04N 21/234363 20130101; H04N 19/40 20141101; G06F 3/011 20130101; G06T 2215/16 20130101; H04N 13/383 20180501; H04N 13/139 20180501
International Class: G06T 9/00 20060101 G06T009/00; G06F 3/01 20060101 G06F003/01; G06T 15/04 20060101 G06T015/04; G06T 15/20 20060101 G06T015/20
Claims
1. A method for providing virtual reality (VR) video transcoding
and broadcasting, comprising: obtaining a user's viewport;
processing a VR video data into a basic video set and an
enhancement video set in accordance with the user's viewport,
wherein the basic video set comprises a plurality of basic video
segments, the enhancement video set comprises a plurality of
enhancement video segments, and the playback effect of the sum of
the basic video segments and the enhancement video segments is
better than that of the basic video segments; downloading the basic
video segments and the enhancement video segments; and displaying a
sum of two video data obtained by adding the basic video segments
and the enhancement video segments in accordance with the user's
viewport.
2. The method according to claim 1, wherein the user's viewport is
relevant to the specification and parameters of a head-up display
device.
3. The method according to claim 1, wherein the step of processing
a VR video data into a basic video set and an enhancement video set
in accordance with the user's viewport comprises: dividing a
projection area of the VR video data into a plurality of grid
blocks; determining which grid blocks among the plurality of grid
blocks constitute a viewport block in accordance with the user's
viewport; and processing the VR video data into the basic video set
and the enhancement video set in accordance with the grid blocks
constituting the viewport block.
4. The method according to claim 3, wherein the step of processing
the VR video data into the basic video set and the enhancement
video set in accordance with the grid blocks constituting the
viewport block comprises: obtaining an audio data set and a first
frame data set by decoding the VR video data; obtaining a second
frame data set by scaling down the first frame data set losslessly
to a target resolution; obtaining a third frame data set by
decreasing a resolution of the first frame data set to a basic
resolution and then increasing it to the target resolution by using
an interpolation algorithm; obtaining a basic video set by combining
the audio data set and the second frame data set and segmenting the
combination; encoding and segmenting an enhancement data set,
obtained by performing a subtraction between the second frame data
set and the third frame data set, to obtain a plurality of video
segments in accordance with the plurality of grid blocks; and
assigning some of the plurality of video segments into the
enhancement video set in accordance with the grid blocks
constituting the viewport block.
5. The method according to claim 3, wherein the step of dividing a
projection area of the VR video data into a plurality of grid
blocks comprises dividing the projection area into the plurality of
grid blocks with equal areas.
6. The method according to claim 1, further comprising: obtaining a
plurality of combinations of resolution and bitrate, wherein each
of the plurality of combinations comprises a basic resolution, an
enhancement resolution, a basic bitrate and an enhancement bitrate;
wherein the step of processing a VR video data into a basic video set
and an enhancement video set comprises: processing the VR video data
into the basic video set and the enhancement video set in
accordance with the plurality of combinations of resolution and
bitrate.
7. The method according to claim 6, wherein the step of
downloading the basic video segments and the enhancement video
segments comprises: calculating an average download speed;
selecting from the plurality of combinations of resolution and
bitrate in accordance with the average download speed; and
downloading corresponding basic video segments and enhancement
video segments in accordance with a selected combination of
resolution and bitrate.
8. The method according to claim 1, wherein the step of displaying
a sum of two video data obtained by adding the basic video segments
and the enhancement video segments comprises: displaying the sum of
two video data respectively in panoramic mode and binocular
mode.
9. The method according to claim 8, wherein the step of displaying
the sum of two video data in panoramic mode comprises: building a
basic video model and an enhancement video model respectively;
initializing UV coordinates of the basic video model and the
enhancement video model; obtaining basic video segments and
enhancement video segments; obtaining pixel information of the
basic video segments and the enhancement video segments by
decoding; generating a basic video texture according to the pixel
information of the basic video segments and the UV coordinates of
the basic video model, and an enhancement video texture according
to the pixel information of the enhancement video segments and the
UV coordinates of the enhancement video model; determining UV
alignment coordinates of the enhancement video texture according to
a user's viewport; generating reconstructed pixel information by
adding the basic video texture and the enhancement video texture
according to UV alignment coordinates; and drawing an image
according to the reconstructed pixel information.
10. The method according to claim 8, wherein the step of displaying
the sum of two video data in binocular mode comprises: obtaining
relevant parameters including a camera matrix, a projection matrix,
a model matrix and a center position of lens distortion; creating a
three-dimensional model and obtaining an original coordinate data
of the three-dimensional model; obtaining a first coordinate data
in accordance with the relevant parameters and the original
coordinate data of the three-dimensional model; performing lens
distortion on the first coordinate data based on the center
position of lens distortion to obtain a second coordinate data;
rasterizing the second coordinate data to obtain pixel units; and
drawing an image in accordance with a VR video data and the pixel
units.
11. A system for providing virtual reality (VR) video transcoding
and broadcasting, comprising: an obtaining module configured to
obtain a user's viewport; a data transcoding module configured
to process a VR video data into a basic video set and an
enhancement video set in accordance with the user's viewport,
wherein the basic video set comprises a plurality of basic video
segments, the enhancement video set comprises a plurality of
enhancement video segments, and the playback effect of the sum of
the basic video segments and the enhancement video segments is better
than that of the basic video segments; a downloading module
configured to download the basic video segments and the enhancement
video segments; and a playing module configured to display a sum of
two video data obtained by adding the basic video segments and the
enhancement video segments in accordance with the user's
viewport.
12. The system according to claim 11, wherein the user's viewport
is relevant to the specification and parameters of a head-up display
device.
13. The system according to claim 11, wherein the data transcoding
module comprises: a division unit configured to divide a projection
area of the VR video data into a plurality of grid blocks; a
cutting unit configured to determine which grid blocks among the
plurality of grid blocks constitute a viewport block in accordance
with the user's viewport; and a processing unit configured to
process the VR video data into the basic video set and the
enhancement video set in accordance with the grid blocks
constituting the viewport block.
14. The system according to claim 13, further comprising: a mapping
table generating unit configured to obtain a plurality of
combinations of resolution and bitrate, wherein each of the
plurality of combinations comprises a basic resolution, an
enhancement resolution, a basic bitrate and an enhancement bitrate;
wherein the processing unit is configured to process a VR video data
into the basic video set and the enhancement video set in accordance
with the plurality of combinations of resolution and bitrate.
15. The system according to claim 14, wherein the downloading
module comprises: a speed calculation unit configured to calculate
an average download speed; a selection unit configured to select
from the plurality of combinations of resolution and bitrate in
accordance with the average download speed; and an execution unit
configured to download corresponding basic video segments and
enhancement video segments in accordance with a selected
combination of resolution and bitrate.
16. The system according to claim 11, wherein the playing module
displays a sum of two video data respectively in two different
modes: panoramic mode and binocular mode.
Description
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This application claims the benefit of U.S. provisional
application 62/441,936, filed on Jan. 3, 2017, which is
incorporated herein by reference in its entirety.
BACKGROUND OF THE DISCLOSURE
Field of the Disclosure
[0002] The present disclosure relates to video processing
technology, and more particularly, to a method and a system for
providing virtual reality (VR) video transcoding and
broadcasting.
Background of the Disclosure
[0003] Virtual Reality (VR) is a computer simulation technology for
creating and experiencing a virtual world. For example, a
three-dimensional real-time image can be presented based on a
technology which tracks a user's head, eyes or hand. For a
network-based virtual reality technology, full-view video data is
pre-stored on a server, and then transmitted to a display device,
such as glasses. A video is displayed on the display device in
accordance with a viewing angle of the user.
[0004] However, high-resolution video data occupies a large
transmission bandwidth and requires the display device to have a
high data processing capability. As a result, the existing video
processing technology places high requirements on networks and
terminals. Moreover, it is difficult to present high-resolution,
real-time image display.
[0005] Therefore, it is desirable to further improve the video
processing and rendering methods of a VR playback system, so as to
save transmission bandwidth, reduce performance requirements for
the display device, and present real-time image display
smoothly.
SUMMARY OF THE DISCLOSURE
[0006] In view of this, the present disclosure provides a method
and a system for providing virtual reality (VR) video transcoding
and broadcasting, to solve the above problems.
[0007] According to a first aspect of the present disclosure, there
is provided a method for providing virtual reality (VR) video
transcoding and broadcasting, comprising:
[0009] obtaining a user's viewport;
[0010] processing a VR video data into a basic video set and an
enhancement video set in accordance with the user's viewport,
wherein the basic video set comprises a plurality of basic video
segments, the enhancement video set comprises a plurality of
enhancement video segments, and the playback effect of the sum of
the basic video segments and the enhancement video segments is
better than that of the basic video segments;
[0011] downloading the basic video segments and the enhancement
video segments; and
[0012] displaying a sum of two video data obtained by adding the
basic video segments and the enhancement video segments in
accordance with the user's viewport.
[0013] Preferably, the user's viewport is relevant to the
specification and parameters of a head-up display device.
[0014] Preferably, the step of processing a VR video data
into a basic video set and an enhancement video set in accordance
with the user's viewport comprises:
[0015] dividing a projection area of the VR video data into a
plurality of grid blocks;
[0016] determining which grid blocks among the plurality of grid
blocks constitute a viewport block in accordance with the
user's viewport; and
[0017] processing the VR video data into the basic video set and
the enhancement video set in accordance with the grid blocks
constituting the viewport block.
[0018] Preferably, the step of processing the VR video data into
the basic video set and the enhancement video set in accordance
with the grid blocks constituting the viewport block comprises:
[0019] obtaining an audio data set and a first frame data set by
decoding the source VR video data;
[0020] obtaining a second frame data set by scaling down the first
frame data set losslessly to a target resolution;
[0021] obtaining a third frame data set by decreasing a resolution
of the first frame data set to a basic resolution and then
increasing it to a target resolution by using an interpolation
algorithm;
[0022] obtaining a basic video set by combining the audio data set
and the second frame data set and segmenting the combination;
[0023] encoding and segmenting the second frame data set minus the
third frame data set to obtain a plurality of video segments in
accordance with the plurality of grid blocks; and
[0024] assigning some of the plurality of video segments into the
enhancement video set in accordance with the grid blocks
constituting the viewport block.
[0025] Preferably, the step of dividing a projection area of the VR
video data into a plurality of grid blocks comprises dividing the
projection area into the plurality of grid blocks with equal
areas.
[0026] Preferably, the method further comprises:
[0027] obtaining a plurality of combinations of resolution and
bitrate, wherein each of the plurality of combinations comprises a
basic resolution, an enhancement resolution, a basic bitrate and an
enhancement bitrate;
[0028] wherein the step of processing a VR video data into a basic
video set and an enhancement video set comprises:
[0029] processing the VR video data into the basic video set and
the enhancement video set in accordance with the plurality of
combinations of resolution and bitrate.
[0030] Preferably, the step of downloading the basic video segments
and the enhancement video segments comprises:
[0031] calculating an average download speed;
[0032] selecting from the plurality of combinations of resolution
and bitrate in accordance with the average download speed; and
[0033] downloading corresponding basic video segments and
enhancement video segments in accordance with a selected
combination of resolution and bitrate.
[0034] Preferably, the step of displaying a sum of two video data
obtained by adding the basic video segments and the enhancement
video segments comprises:
[0035] displaying the sum of two video data respectively in
panoramic mode and binocular mode.
[0036] Preferably, the step of displaying the sum of two video data
in panoramic mode comprises:
[0037] building a basic video model and an enhancement video model
respectively;
[0038] initializing UV coordinates of the basic video model and the
enhancement video model;
[0039] obtaining basic video segments and enhancement video
segments;
[0040] obtaining pixel information of the basic video segments and
the enhancement video segments by decoding;
[0041] generating a basic video texture according to the pixel
information of the basic video segments and the UV coordinates of
the basic video model, and an enhancement video texture according
to the pixel information of the enhancement video segments and the
UV coordinates of the enhancement video model;
[0042] determining UV alignment coordinates of the enhancement
video texture according to a user's viewport;
[0043] generating reconstructed pixel information by adding the
basic video texture and the enhancement video texture according to
UV alignment coordinates; and
[0044] drawing an image according to the reconstructed pixel
information.
[0045] Preferably, the step of displaying the sum of two video data
in binocular mode comprises:
[0046] obtaining relevant parameters including a camera matrix, a
projection matrix, a model matrix and a center position of lens
distortion;
[0047] creating a three-dimensional model and obtaining an original
coordinate data of the three-dimensional model;
[0048] obtaining a first coordinate data in accordance with the
relevant parameters and the original coordinate data of the
three-dimensional model;
[0049] performing lens distortion on the first coordinate data
based on the center position of lens distortion to obtain a second
coordinate data;
[0050] rasterizing the second coordinate data to obtain pixel
units; and
[0051] drawing an image in accordance with a VR video data and the
pixel units.
[0052] According to a second aspect of the disclosure, there is
provided a system for providing virtual reality (VR) video
transcoding and broadcasting, comprising:
[0053] an obtaining module configured to obtain a user's
viewport;
[0054] a data transcoding module configured to process a VR video
data into a basic video set and an enhancement video set in
accordance with the user's viewport, wherein the basic video set
comprises a plurality of basic video segments, the enhancement
video set comprises a plurality of enhancement video segments, and
the playback effect of the sum of the basic video segments and the
enhancement video segments is better than that of the basic video
segments;
[0055] a downloading module configured to download the basic video
segments and the enhancement video segments; and
[0056] a playing module configured to display a sum of two video
data obtained by adding the basic video segments and the
enhancement video segments in accordance with the user's
viewport.
[0057] Preferably, the user's viewport is relevant to the
specification and parameters of a head-up display device.
[0058] Preferably, the data transcoding module comprises:
[0059] a division unit configured to divide a projection area of
the VR video data into a plurality of grid blocks;
[0060] a cutting unit configured to determine which grid blocks
among the plurality of grid blocks constitute a viewport block in
accordance with the user's viewport; and
[0061] a processing unit configured to process the VR video data
into the basic video set and the enhancement video set in
accordance with the grid blocks constituting the viewport
block.
[0062] Preferably, the system further comprises: a mapping table
generating unit configured to obtain a plurality of combinations of
resolution and bitrate, wherein each of the plurality of
combinations comprises a basic resolution, an enhancement
resolution, a basic bitrate and an enhancement bitrate;
[0063] wherein the processing unit is configured to perform:
[0064] processing a VR video data into the basic video set and the
enhancement video set in accordance with the plurality of
combinations of resolution and bitrate.
[0065] Preferably, the downloading module comprises:
[0066] a speed calculation unit configured to calculate an average
download speed;
[0067] a selection unit configured to select from the plurality of
combinations of resolution and bitrate in accordance with the
average download speed; and
[0068] an execution unit configured to download corresponding basic
video segments and enhancement video segments in accordance with a
selected combination of resolution and bitrate.
[0069] Preferably, the playing module displays a sum of two video
data respectively in two different modes: panoramic mode and
binocular mode.
[0070] According to the embodiments of the present disclosure, the
VR video data is processed into a basic video set and an
enhancement video set in accordance with a user's viewport and the
sum of two video data obtained by adding the basic video segments
and the enhancement video segments in accordance with the user's
viewport is displayed. As a result, viewing experience is ensured
while the amount of downloaded data is reduced and transmission
efficiency is improved.
BRIEF DESCRIPTION OF THE DRAWINGS
[0071] The above and other objects, features and advantages of the
present disclosure will become more apparent by describing the
embodiments of the present disclosure with reference to the
following drawings, in which:
[0072] FIG. 1 is a diagram illustrating an example network of a VR
playback system;
[0073] FIG. 2 is a flowchart diagram showing a method for providing
virtual reality (VR) video transcoding and broadcasting according
to an embodiment of the disclosure;
[0074] FIG. 3 is a specific flowchart diagram of step S200 in FIG.
2;
[0075] FIG. 4 is a specific flowchart diagram of step S203 of FIG.
3;
[0076] FIG. 5 is a specific flowchart diagram of step S300 of FIG.
2;
[0077] FIG. 6 is a specific flowchart diagram of displaying a sum
of two video data in panoramic mode of step S400 shown in FIG.
2;
[0078] FIG. 7 is a specific flowchart diagram of displaying a sum
of two video data in binocular mode of step S400 shown in FIG.
2;
[0079] FIG. 8 is a schematic diagram of a system for providing
virtual reality (VR) video transcoding and broadcasting according
to an embodiment of the disclosure;
[0080] FIG. 9 is a specific schematic diagram of the data transcoding
module 802 in FIG. 8; and
[0081] FIG. 10 is a specific schematic diagram of the downloading
module 803 in FIG. 8.
DETAILED DESCRIPTION OF THE DISCLOSURE
[0082] Exemplary embodiments of the present disclosure will be
described in more detail below with reference to the accompanying
drawings. In the drawings, like reference numerals denote like
members. The figures are not drawn to scale, for the sake of
clarity. Moreover, some well-known parts may not be shown.
[0083] FIG. 1 is a diagram illustrating an example network of a VR
playback system. The VR playback system 10 includes a server 100, a
display device 120 coupled to the server 100 through a network 110,
and a VR device 130. For example, the server 100 may be a
stand-alone computer server or a server cluster. The server 100 is
used to store various video data and the applications that process
these video data. For example, various daemons run on the server
100 in real time, so as to process the video data on the server 100
and to respond to requests from VR devices and the display device
120. The network 110 may be a selected one or selected ones from
the group consisting of the Internet, a local area network, an
Internet of Things, and the like. The display device 120 may be any
computing device having an independent display screen and a
processing capability, such as a personal computer, a laptop
computer, a computer workstation, a server, a mainframe computer, a
palmtop computer, a personal digital assistant, a smart phone, an
intelligent electrical apparatus, a game console, an iPad/iPhone, a
video player, a DVD recorder/player, a television, or a home
entertainment system. The display device 120 may store VR player
software as a VR player. When the VR player is started, it requests
and downloads various video data from the server 100, and renders
and plays the video data on the display device. In this example,
the VR device 130 is a stand-alone head-up display device that can
interact with the display device 120 and the server 100, so as to
communicate the user's current information with the display device
120 and/or the server 100 through signaling. The user's current
information includes, for example, parameters relevant to the
user's viewport and changes in eye gaze. According to this
information, the display device 120 can flexibly process the
currently played video data. In some embodiments, when a user's
viewport changes, the display device 120 determines that the core
viewing region for the user has changed and starts to play
high-resolution video data in the changed core viewing region.
[0084] In the above embodiment, the VR device 130 is a stand-alone
head-up display device. However, those skilled in the art should
understand that the VR device 130 is not limited thereto; it may
also be an all-in-one head-up display device. An all-in-one head-up
display device has its own display screen, so it is not necessary
to connect it to an external display device. For example, if an
all-in-one head-up display device is used as the VR device in this
example, the display device 120 may be omitted. In this case, the
all-in-one head-up display device is configured to obtain video
data from the server 100 and to perform the playback operation, and
is also configured to detect the user's current viewport and to
adjust the playback operation.
[0085] FIG. 2 is a flowchart diagram showing a method for providing
virtual reality (VR) video transcoding and broadcasting according
to an embodiment of the disclosure. The method includes the
following steps.
[0086] In step S100, a user's viewport is obtained.
[0087] In step S200, a VR video data is processed into a basic
video set and an enhancement video set in accordance with the
user's viewport. The basic video set includes a plurality of basic
video segments, and the enhancement video set includes a plurality
of enhancement video segments.
[0088] In step S300, the basic video segments and the enhancement
video segments are downloaded in accordance with the user's
viewport.
[0089] In step S400, a sum of two video data obtained by adding the
basic video segments and the enhancement video segments is
displayed.
[0090] In the scenario where a user uses a VR head-up display
device, the user's viewport can be obtained from the specification
and parameters of the head-up display device and the screen size.
VR video data stored on the server is generally obtained by
capturing 360-degree panoramic images of the real world, but what a
specific user can see is only the video image within the viewport.
Therefore, in this embodiment, the VR video data is processed into
the basic video set and the enhancement video set according to the
viewport; after the basic video segments and the enhancement video
segments are downloaded in accordance with the user's viewport, a
video data obtained by adding the basic video segments and the
enhancement video segments is displayed. As a result,
higher-resolution video data can be displayed within the viewport
for a better viewing experience, while basic video data is
displayed outside the viewport to reduce the amount of downloaded
data.
[0091] FIG. 3 is a specific flowchart diagram of step S200 in FIG.
2. It specifically includes the following steps.
[0092] In step S201, a projection area of the VR video data is
divided into a plurality of grid blocks. The projection area is
generally divided into a number of grid blocks with equal
areas.
[0093] In step S202, it is determined which grid blocks among the
plurality of grid blocks constitute a viewport block in accordance
with the user's viewport.
[0094] In step S203, the VR video data is processed into the basic
video set and the enhancement video set in accordance with the grid
blocks constituting the viewport block.
[0095] FIG. 4 is a specific flowchart diagram of step S203 of FIG.
3. It specifically includes the following steps.
[0096] In step S2031, the VR video data is decoded into an audio
data set 1 and a frame data set 1.
[0097] In step S2032, a frame data set 2 is obtained by scaling
down the frame data set 1 losslessly to a target resolution.
[0098] In step S2033, a frame data set 3 is obtained by decreasing
a resolution of the frame data set 1 to a basic resolution and then
increasing to a target resolution by using interpolation
algorithm.
[0099] In step S2034, an enhancement data set is obtained by a
subtraction between the frame data set 2 and the frame data set
3.
[0100] In step S2035, the basic video set is obtained by combining
the frame data set 2 and the audio data set 1 and segmenting the
combination.
[0101] In step S2036, a plurality of video segments are obtained
by encoding, compressing and segmenting the enhancement data
set.
[0102] In step S2037, some of the plurality of video segments are
assigned into the enhancement video set in accordance with the grid
blocks constituting the viewport block.
[0103] The above embodiments specifically describe the process of
processing a specific VR video data into a basic video set and an
enhancement video set according to a user's viewport. For ease of
understanding, the following example is provided for further
explanation. Assume the frame data set 1 has an original resolution
of 12,600×6,000 pixels; the frame data set 2 and the frame data set
3, each with a target resolution of 6,300×3,000 pixels, are
obtained according to steps S2032 and S2033. The frame data set 2
is obtained by losslessly scaling the 12,600×6,000 pixels of the
frame data set 1 down to 6,300×3,000 pixels, and the frame data set
3 is obtained by decreasing the 12,600×6,000 pixels of the frame
data set 1 to, for example, 798×1024 pixels and then performing
interpolation and amplification. A subtraction is performed between
the frame data set 2 and the frame data set 3 to obtain an
enhancement data set, and the enhancement data set is encoded and
compressed to obtain compressed data; the compressed data is
segmented in accordance with the grid blocks to obtain a plurality
of video segments; finally, some of the plurality of video segments
are assigned into the enhancement video set in accordance with the
correspondence between the viewport block and the grid blocks, that
is, the enhancement video set contains the video segments
corresponding to the viewport block.
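A compact numerical sketch of steps S2032 to S2034 in NumPy follows. Box averaging stands in for the lossless target-resolution scaler, pixel repetition stands in for the interpolation algorithm, the frame size is reduced from the example above so the sketch runs quickly, and the 8-bit range [L, H] = [0, 255] is an assumption.

    import numpy as np

    def downscale(frame, fy, fx):
        """Reduce resolution by integer factors via box averaging
        (a stand-in for the high-quality scaler of step S2032)."""
        h, w, c = frame.shape
        return frame.reshape(h // fy, fy, w // fx, fx, c).mean(axis=(1, 3))

    def upscale(frame, fy, fx):
        """Increase resolution by pixel repetition (a stand-in for the
        interpolation algorithm of step S2033)."""
        return frame.repeat(fy, axis=0).repeat(fx, axis=1)

    L, H = 0, 255  # assumed 8-bit pixel value range
    set1 = np.random.randint(L, H + 1, (600, 1260, 3)).astype(float)  # frame data set 1

    set2 = downscale(set1, 2, 2)                 # step S2032: target resolution
    set3 = upscale(downscale(set2, 5, 5), 5, 5)  # step S2033: basic resolution, interpolated back up
    enhancement = set2 - set3 + (H - L) / 2      # step S2034: offset pixel difference

The enhancement data set would then be encoded, compressed and segmented per grid block as in steps S2035 to S2037.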
[0104] Of course, the present disclosure is not limited thereto,
and the basic video set and the enhancement video set can also be
obtained by other methods.
[0105] The above video data processing steps are generally executed
on the server. When the server serves multiple head-up display
devices or display devices, the VR video data is processed
according to the respective viewports of multiple users. Therefore,
a number of correspondences between viewport blocks and grid blocks
can be established. For ease of processing, the grid blocks should
be set to an appropriate size, so as to correspond to a plurality
of different viewports.
[0106] In the above embodiment, step S200 may process the video
data in accordance with a preset resolution and a preset bitrate;
that is, after processing, the basic video segments in the basic
video set each have a preset basic resolution and a preset basic
bitrate, and the enhancement video segments in the enhancement
video set each have a preset enhancement resolution and a preset
enhancement bitrate. Therefore, a plurality of basic video sets and
a plurality of enhancement video sets with different combinations
of resolution and bitrate may be generated for one specific video
data, and when a display device or a head-up display device
requests video data from the server, a basic video segment and an
enhancement video segment with an appropriate combination of
resolution and bitrate are selected according to the current
network conditions.
[0107] FIG. 5 is a specific flowchart diagram of step S300 of FIG.
2.
[0108] In step S301, an average download speed is calculated.
[0109] In step S302, from the plurality of combinations of
resolution and bitrate, a combination of resolution and bitrate is
selected in accordance with the average download speed.
[0110] In step S303, corresponding basic video segments and
enhancement video segments are downloaded in accordance with the
selected combination of resolution and bitrate.
[0111] In this embodiment, a combination of resolution and bitrate
is determined in accordance with the average download speed, and
the corresponding basic video segments and enhancement video
segments are obtained, thereby optimizing data transmission.
Alternatively, a plurality of combinations of resolution and
bitrate can be established based on an initial combination of
resolution and bitrate. For example, N levels of combinations of
resolution and bitrate are established, and the combinations at
each level have a preset corresponding relationship. When the
average download speed falls within a certain interval, the basic
video segments and enhancement video segments whose resolution and
bitrate are at the corresponding level are selected, as shown in
the sketch below.
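The sketch below illustrates this selection logic. The level table, the speed thresholds and the concrete resolution and bitrate values are illustrative assumptions; the disclosure does not fix them.

    # Hypothetical N-level mapping table: (minimum average speed in Mbps,
    # basic resolution, enhancement resolution, basic bitrate in kbps,
    # enhancement bitrate in kbps).
    LEVELS = [
        (0.0,  (798, 1024),  (3150, 1500), 800,  2000),
        (5.0,  (798, 1024),  (6300, 3000), 800,  4000),
        (20.0, (1596, 2048), (6300, 3000), 1600, 8000),
    ]

    def average_speed(samples_mbps):
        """Step S301: average of recent download speed measurements."""
        return sum(samples_mbps) / len(samples_mbps)

    def select_combination(avg_mbps):
        """Step S302: pick the highest level whose speed interval contains avg_mbps."""
        chosen = LEVELS[0]
        for level in LEVELS:
            if avg_mbps >= level[0]:
                chosen = level
        return chosen

    # Step S303 would then download the segments encoded at the chosen level.
    print(select_combination(average_speed([6.2, 7.8, 5.9])))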
[0112] Further, step S400 includes panoramic mode and binocular
mode.
[0113] FIG. 6 is a specific flowchart diagram of displaying a sum
of two video data in panoramic mode of step S400 shown in FIG.
2.
[0114] In step S401, pixel information of the basic video segments
and the enhancement video segments is obtained by decoding.
[0115] In this step, the basic video segments and the enhancement
video segments are decoded by a suitable decoder to obtain their
respective pixel information. The decoding process may also include
a decompression process for decompressing compressed video data.
The pixel components corresponding to each color space are
extracted; for example, the R, G and B components are extracted for
the RGB color space.
[0116] In step S402, a basic video model and an enhancement video
model are built, respectively.
[0117] In this step, suitable three-dimensional models can be
created in accordance with requirements. For example, two polygonal
spheres can be created as the three-dimensional basic video model
and enhancement video model.
[0118] In step S403, UV coordinates of the basic video model and
the enhancement video model are initialized.
[0119] Here, UV coordinates are u, v texture mapping coordinates,
analogous to the X, Y, Z axes of a spatial model. They define, for
each point on the plane image, the position that corresponds to the
three-dimensional model. Through the UV coordinates, each point on
the image can be accurately mapped onto the three-dimensional
model. In this step, each UV coordinate point on the basic video
model and the enhancement video model is created and initialized,
as sketched below.
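As a concrete illustration of steps S402 and S403, the sketch below builds a polygonal sphere and assigns each vertex a (u, v) coordinate into an equirectangular frame. The tessellation density and the equirectangular mapping are assumptions of the sketch.

    import math

    def build_sphere_with_uv(lat_steps=32, lon_steps=64):
        """Steps S402-S403: vertices of a polygonal sphere model and their
        initialized UV coordinates into an equirectangular frame."""
        vertices, uvs = [], []
        for i in range(lat_steps + 1):
            theta = math.pi * i / lat_steps           # polar angle, 0..pi
            for j in range(lon_steps + 1):
                phi = 2.0 * math.pi * j / lon_steps   # azimuth, 0..2*pi
                vertices.append((math.sin(theta) * math.cos(phi),
                                 math.cos(theta),
                                 math.sin(theta) * math.sin(phi)))
                # u runs around the equator, v runs from pole to pole
                uvs.append((j / lon_steps, i / lat_steps))
        return vertices, uvs

    basic_model = build_sphere_with_uv()        # basic video model
    enhancement_model = build_sphere_with_uv()  # enhancement video model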
[0120] In step S404, a basic video texture and an enhancement video
texture are generated.
[0121] In this step, the basic video texture is generated in
accordance with the pixel information of the basic video segments
and the UV coordinates of the basic video model, and the
enhancement video texture is generated in accordance with the pixel
information of the enhancement video segments and the UV
coordinates of the enhancement video model.
[0122] In step S405, UV alignment coordinates of the enhancement
video texture are determined.
[0123] In this step, the UV alignment coordinates of the
enhancement video texture may be determined in accordance with the
user's current viewport.
[0124] In step S406, a reconstructed pixel information is generated
by adding the basic video texture and the enhancement video texture
with each other in accordance with the UV alignment
coordinates.
[0125] In this step, the reconstructed pixel information is
obtained based on the relationship between the basic video segments
and the enhancement video segments.
[0126] For ease of understanding, the following example further
illustrates this step.
[0127] Assume that $P_{x,y}^{Original} = (r, g, b)^T$ is the pixel
information at coordinates $(x, y)$ in the original video data,
with $r, g, b \in [L, H]$, and that $P_{x,y}^{ScaledBase} = (r',
g', b')^T$ is the pixel information at coordinates $(x, y)$ in the
basic video, with $r', g', b' \in [L, H]$.
[0128] For all $(x, y)$, the following difference generation
equation (1) is provided:

$$P_{x,y}^{NormalizedResidual} = P_{x,y}^{Original} - P_{x,y}^{ScaledBase} + \frac{H - L}{2} \quad (1)$$

[0129] $P_{x,y}^{NormalizedResidual}$ represents the pixel
difference. For all pixels $(x, y)$, the following difference
reconstruction equation (2) is provided:

$$P_{x,y}^{Reconstructed} = P_{x,y}^{ScaledBase} + P_{x,y}^{NormalizedResidual} - \frac{H - L}{2} \quad (2)$$
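The following is a minimal NumPy sketch of equations (1) and (2), assuming an 8-bit pixel range [L, H] = [0, 255] and lossless float arithmetic; in practice the residual is clipped and encoded as video, which the sketch omits.

    import numpy as np

    L, H = 0, 255  # assumed 8-bit pixel value range

    def normalized_residual(original, scaled_base):
        """Equation (1): difference generation."""
        return original - scaled_base + (H - L) / 2

    def reconstruct(scaled_base, residual):
        """Equation (2): difference reconstruction."""
        return scaled_base + residual - (H - L) / 2

    orig = np.random.randint(L, H + 1, (4, 4, 3)).astype(float)
    base = np.random.randint(L, H + 1, (4, 4, 3)).astype(float)
    res = normalized_residual(orig, base)
    # With lossless arithmetic the reconstruction recovers the original exactly.
    assert np.allclose(reconstruct(base, res), orig)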
[0130] In step S407, an image is drawn in accordance with the
reconstructed pixel information.
[0131] FIG. 7 is a specific flowchart diagram of displaying a sum
of two video data in binocular mode of step S400 shown in FIG.
2.
[0132] In step S411, relevant parameters are obtained.
[0133] For example, the relevant parameters are calculated based on
the specification and parameters of a head-up display device and
the screen size. The relevant parameters include, for example, the
field-of-view parameters of the left and right lenses, the camera
matrix, the projection matrix and the center position of lens
distortion. A head-up display device generally includes a stand and
left and right lenses on the stand, and the human eyes obtain
images from the left and right viewable regions through the left
and right lenses. Because the left and right viewable regions
provide slightly different images, the human mind, after receiving
this differing information, produces a three-dimensional sense.
Different types of head-up devices have different specifications
and parameters; generally, the specification and parameters can be
obtained by querying a website or built-in parameter files, and
then the relevant parameters required in the rendering process can
be calculated accordingly.
[0134] In step S412, a three-dimensional model is created and the
original coordinate data of the three-dimensional model is
obtained.
[0135] In this step, a suitable three-dimensional model can be
created in accordance with requirements. For example, a polygonal
sphere can be created as the three-dimensional model and the
original coordinate data can be obtained based on the polygonal
sphere.
[0136] In step S413, the first coordinate data is obtained in
accordance with the relevant parameters and the original coordinate
data of the three-dimensional model.
[0137] In step S414, lens distortion is performed on the first
coordinate data based on the center position of lens distortion to
obtain second coordinate data.
[0138] In step S413, vector calculation on the original coordinate
data is performed in accordance with the camera matrix, the
projection matrix and the model matrix to obtain the calculated
coordinate data as the first coordinate data, and in step S414, the
first coordinate data is further distorted to obtain the second
coordinate data.
[0139] In step S415, the second coordinate data is rasterized to
obtain pixel units.
[0140] In this step, the second coordinate data is processed into
pixel units in a two-dimensional plane.
[0141] In step S416, an image is drawn based on the VR video data
and the pixel units.
[0142] In this step, the VR video data downloaded from the server
is decoded to obtain the pixel information therein, the pixel units
are assigned in accordance with the pixel information, and finally
the image is drawn. A sketch of the coordinate pipeline of steps
S413 and S414 follows.
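To make the coordinate flow concrete, the sketch below transforms the model's vertices by the model, camera and projection matrices (step S413) and then applies a radial lens distortion about the distortion center (step S414). The distortion polynomial and its coefficients are illustrative assumptions; an actual head-up display device supplies its own parameters.

    import numpy as np

    def first_coordinates(vertices, model, camera, projection):
        """Step S413: transform the original 3-D coordinates by the model,
        camera and projection matrices, then perform the perspective divide."""
        v = np.hstack([vertices, np.ones((len(vertices), 1))])  # homogeneous coordinates
        clip = v @ model.T @ camera.T @ projection.T
        return clip[:, :2] / clip[:, 3:4]

    def second_coordinates(xy, center, k1=0.22, k2=0.24):
        """Step S414: radial lens distortion about the lens distortion center;
        k1 and k2 are illustrative distortion coefficients."""
        d = xy - center
        r2 = (d ** 2).sum(axis=1, keepdims=True)
        return center + d * (1.0 + k1 * r2 + k2 * r2 ** 2)

    verts = np.array([[0.5, 0.5, -1.0], [-0.5, 0.2, -1.0]])
    identity = np.eye(4)  # placeholder model/camera/projection matrices
    first = first_coordinates(verts, identity, identity, identity)
    second = second_coordinates(first, center=np.zeros(2))
    # Step S415 would rasterize `second` into pixel units for drawing.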
[0143] The embodiments of the disclosure provide two viewing modes:
panoramic mode and binocular mode. In panoramic mode, a
three-dimensional model is created, the UV alignment coordinates of
the basic video segments and the enhancement video segments are
determined in accordance with the user's current viewport, and then
the reconstructed pixel information is assigned to the
three-dimensional model in accordance with the UV alignment
coordinates, so as to achieve a three-dimensional panoramic viewing
effect. In binocular mode, a three-dimensional model is created,
lens distortion is performed on the coordinate data of the
three-dimensional model, and then the basic video segments and the
enhancement video segments are added and displayed on the distorted
three-dimensional model, so as to achieve a binocular-mode VR
immersive viewing effect. In binocular mode, the rendering of video
data is completed in a single pass, which improves rendering
efficiency.
[0144] FIG. 8 is a schematic diagram of a system for providing
virtual reality (VR) video transcoding and broadcasting according
to an embodiment of the disclosure. The system includes an
obtaining module 801, a data transcoding module 802, a downloading
module 803 and a playing module 804.
[0145] The obtaining module 801 is configured to obtain the user's
viewport.
[0146] The data transcoding module 802 is configured to process the
VR video data into a basic video set and an enhancement video set
in accordance with the user's viewport, the basic video set
includes a plurality of basic video segments, the enhancement video
set includes a plurality of enhancement video segments, and the
playback effect of the sum of the basic video segments and the
enhancement video segments is better than that of the basic video
segments.
[0147] The downloading module 803 is configured to download basic
video segments and enhancement video segments.
[0148] The playing module 804 is configured to display a sum of two
video data obtained by adding the basic video segments and the
enhancement video segments in accordance with the user's
viewport.
[0149] FIG. 9 is a specific schematic diagram of the data transcoding
module 802 in FIG. 8.
[0150] The data transcoding module includes a division unit 8021, a
cutting unit 8022, and a processing unit 8023. The division unit
8021 is configured to divide a projection area of the VR video data
into a plurality of grid blocks. The cutting unit 8022 is
configured to determine which grid blocks among the plurality of
grid blocks constitute a viewport block in accordance with the
user's viewport. The processing unit 8023 is configured to
process the VR video data into the basic video set and the
enhancement video set in accordance with the grid blocks
constituting the viewport block.
[0151] In an alternative embodiment, the system further includes a
mapping table generating unit which is configured to obtain a
plurality of combinations of resolution and bitrate, each
combination of resolution and bitrate includes a basic resolution,
an enhancement resolution, a basic bitrate and an enhancement
bitrate. At this point, the processing unit 8023 processes the VR
video data into a basic video set and an enhancement video set in
accordance with the plurality of combinations of resolution and
bitrate. Preferably, a plurality of combinations of resolution and
bitrate corresponding to a plurality of levels can be established
based on an initial combination of resolution and bitrate; the
resolution and the bitrate at neighboring levels have a specific
proportional relationship.
[0152] In another alternative embodiment, referring to FIG. 10, the
downloading module 803 includes a speed calculation unit 8031, a
selection unit 8032, and an execution unit 8033. The speed
calculation unit 8031 is configured to calculate an average
download speed. The selection unit 8032 is configured to select
from the plurality of combinations of resolution and bitrate in
accordance with the average download speed. The execution unit 8033
is configured to download corresponding basic video segments and
enhancement video segments in accordance with a selected
combination of resolution and bitrate.
[0153] According to the present disclosure, at the server, the VR
video data is processed into the basic video set and the
enhancement video set in accordance with the viewport, and at the
head-up display device or the display device, the basic video
segments and the enhancement video segments are downloaded and the
sum of two video data obtained by adding the basic video segments
and the enhancement video segments in accordance with the user's
viewport is displayed. Thus, the viewing experience can be ensured
while transmission bandwidth and performance requirements for the
display device are reduced.
[0154] In an alternative embodiment, after an appropriate
resolution and bitrate are selected, the corresponding video
segments are downloaded, so that the amount of downloaded data can
be adjusted dynamically, thereby optimizing data
transmission.
[0155] In another alternative embodiment, panoramic and binocular
viewing modes are provided, and the sum of two video data is
displayed in the different viewing modes.
[0156] Although the embodiments of the present disclosure have been
described above with reference to the preferred embodiments, they
are not intended to limit the claims. Any modifications and
variations may be made by those skilled in the art without
departing from the spirit and scope of the present disclosure.
Therefore, the protection scope of the present disclosure should be
based on the scope of the claims of the present disclosure.
[0157] The foregoing descriptions of specific embodiments of the
present disclosure have been presented, but are not intended to
limit the disclosure to the precise forms disclosed. It will be
readily apparent to one skilled in the art that many modifications
and changes may be made in the present disclosure. Any
modifications, equivalents or variations of the preferred embodiments
can be made without departing from the doctrine and spirit of the
present disclosure.
* * * * *