U.S. patent application number 12/982248 was filed with the patent office on 2011-06-30 for interpolation of three-dimensional video content.
This patent application is currently assigned to BROADCOM CORPORATION. Invention is credited to James D. Bennett, Jeyhan Karaoguz.
Application Number: 12/982248 (Publication No. 20110157315)
Family ID: 43797724
Filed Date: 2011-06-30

United States Patent Application 20110157315
Kind Code: A1
Inventors: Bennett; James D.; et al.
Published: June 30, 2011
INTERPOLATION OF THREE-DIMENSIONAL VIDEO CONTENT
Abstract
Techniques are described herein for interpolating
three-dimensional video content. Three-dimensional video content is
video content that includes portions representing respective frame
sequences that provide respective perspective views of a given
subject matter over the same period of time. For example, the
three-dimensional video content may be analyzed to identify one or
more interpolation opportunities. If an interpolation opportunity
is identified, frame data that is associated with the interpolation
opportunity may be replaced with an interpolation marker. In
another example, a frame that is not directly represented by data
in the three-dimensional video content may be identified. For
instance, the frame may be represented by an interpolation marker
or corrupted data. The interpolation marker or corrupted data may
be replaced with an interpolated representation of the frame.
Inventors: Bennett; James D. (Hroznetin, CZ); Karaoguz; Jeyhan (Irvine, CA)
Assignee: BROADCOM CORPORATION (Irvine, CA)
Family ID: 43797724
Appl. No.: 12/982248
Filed: December 30, 2010
Related U.S. Patent Documents

Application Number | Filing Date  | Patent Number
61291818           | Dec 31, 2009 |
61303119           | Feb 10, 2010 |
Current U.S. Class: 348/46; 348/E13.074
Current CPC Class: G03B 35/24 20130101; H04N 13/359 20180501; G09G 3/003 20130101; G09G 2320/028 20130101; H04N 13/189 20180501; H04N 13/139 20180501; H04S 7/303 20130101; G09G 2300/023 20130101; H04N 13/312 20180501; G02B 6/00 20130101; H04N 13/383 20180501; H04N 13/398 20180501; G09G 5/003 20130101; H04N 13/31 20180501; H04N 2013/403 20180501; H04N 13/315 20180501; G06F 3/14 20130101; G09G 5/14 20130101; H04N 21/235 20130101; H04N 21/4122 20130101; H04N 13/00 20130101; H04N 13/161 20180501; H04N 13/194 20180501; H04N 13/361 20180501; H04N 13/366 20180501; H04N 13/305 20180501; H04N 2013/405 20180501; G09G 2370/04 20130101; H04N 13/332 20180501; H04N 21/435 20130101; G09G 3/20 20130101; H04N 13/351 20180501; G06F 3/0346 20130101
Class at Publication: 348/46; 348/E13.074
International Class: H04N 13/02 20060101 H04N013/02
Claims
1. An encoding system servicing three-dimensional video content,
the three-dimensional video content having both a first portion
representing a first sequence of frames that provide a first
perspective view and a second portion representing a second
sequence of frames that provide a second perspective view, the
encoding system comprising: processing circuitry; input circuitry
through which the processing circuitry receives both the first
portion that represents the first sequence of frames that provide
the first perspective view and the second portion that represents
the second sequence of frames that provide the second perspective
view; the processing circuitry encodes the first portion and the
second portion received, the encoding involving at least in part
analyzing the first portion and the second portion to identify an
interpolation opportunity, and, upon so identifying, the processing
circuitry replaces frame data with an interpolation marker; and
output circuitry through which the processing circuitry delivers an
encoded representation of the three-dimensional video content.
2. The encoding system of claim 1, wherein the processing circuitry
compares a current frame with frames that neighbor the current
frame to identify the interpolation opportunity.
3. The encoding system of claim 2, wherein the interpolation
opportunity is identified in a first frame of the first portion
while the neighboring frames include a second frame from the second
portion.
4. The encoding system of claim 1, wherein the interpolation marker
is accompanied by an interpolation instruction.
5. The encoding system of claim 1, wherein the processing circuitry
determines that an accuracy of an estimate of the frame data is
greater than a threshold accuracy; and wherein the processing
circuitry analyzes the first portion and the second portion to
identify the interpolation opportunity in response to determination
that the accuracy of the estimate is greater than the threshold
accuracy.
6. The encoding system of claim 1, wherein the processing circuitry
determines that an error occurs with respect to the frame data; and
wherein the processing circuitry analyzes the first portion and the
second portion to identify the interpolation opportunity in
response to determination that the error occurs.
7. The encoding system of claim 1, wherein the processing circuitry
determines that a source that generates the three-dimensional video
content has at least one specified characteristic; and wherein the
processing circuitry analyzes the first portion and the second
portion to identify the interpolation opportunity in response to
determination that the source has the at least one specified
characteristic.
8. The encoding system of claim 1, wherein the processing circuitry
determines that a communication channel via which the
three-dimensional video content is to be transmitted has at least
one specified characteristic; and wherein the processing circuitry
analyzes the first portion and the second portion to identify the
interpolation opportunity in response to determination that the
communication channel has the at least one specified
characteristic.
9. The encoding system of claim 1, wherein the interpolation marker
specifies a type of interpolation to be performed to generate the
frame data.
10. A decoding system servicing encoded three-dimensional video
content, the encoded three-dimensional video content having both a
first encoded portion of a first encoded sequence of frames that
represent a first perspective view and a second encoded portion of
a second encoded sequence of frames that represent a second
perspective view, the decoding system comprising: processing
circuitry; input circuitry through which the processing circuitry
receives both the first encoded portion of the first encoded
sequence of frames that represent the first perspective view and
the second encoded portion of the second encoded sequence of frames
that represent the second perspective view; the processing
circuitry decodes the first encoded portion and the second encoded
portion received, the decoding involving responding to an
interpolation marker by generating frame data to replace the
interpolation marker; and output circuitry through which the
processing circuitry delivers a decoded representation of the
encoded three-dimensional video content.
11. The decoding system of claim 10, wherein the processing
circuitry determines that a number of perspective views that a
display is capable of processing is greater than a number of
perspective views that is initially represented by the encoded
three-dimensional video content; wherein the processing circuitry
provides an interpolation request to an encoder through the output
circuitry, the interpolation request requesting inclusion of the
interpolation marker in the encoded three-dimensional video
content, in response to determination that the number of
perspective views that the display is capable of processing is
greater than the number of perspective views that is initially
represented by the encoded three-dimensional video content; and
wherein the processing circuitry interpolates between a decoded
version of the first encoded portion and a decoded version of the
second encoded portion to generate the frame data that corresponds
to a third sequence of frames that represent a third perspective
view, the third perspective view not being initially represented by
the encoded three-dimensional video content.
12. The decoding system of claim 10, wherein the processing
circuitry receives an interpolation instruction from an upstream
device through the input circuitry; and wherein the processing
circuitry generates the frame data in accordance with the
interpolation instruction.
13. The decoding system of claim 10, wherein the processing
circuitry decodes the first encoded portion to provide a first
decoded portion of a first decoded sequence of frames that
represents the first perspective view; wherein the processing
circuitry decodes the second encoded portion to provide decoded
data that represents the second perspective view, the decoded data
including the interpolation marker; and wherein the processing
circuitry interpolates between the first decoded portion and a
third decoded portion of a third decoded sequence of frames that
represents a third perspective view to generate the frame data to
replace the interpolation marker in the decoded data.
14. The decoding system of claim 13, wherein the processing
circuitry receives a weight indicator from an upstream device
through the input circuitry, the weight indicator specifying an
extent to which the first decoded portion is to be weighed with
respect to the third decoded portion; and wherein the processing
circuitry generates the frame data based on the extent that is
specified by the weight indicator.
15. A method used in decoding encoded three-dimensional video
content, the encoded three-dimensional video content having both
first encoded data relating to a first sequence of frames
representing a first perspective view and second encoded data
relating to a second sequence of frames representing a second
perspective view, the method comprising: retrieving at least a
portion of the first encoded data that relates to the first
sequence of frames representing the first perspective view;
retrieving at least a portion of the second encoded data that
relates to the second sequence of frames representing the second
perspective view; identifying a first frame within the first
sequence of frames not directly represented by the first encoded
data retrieved; and producing an interpolation of the first
frame.
16. The method of claim 15, wherein the interpolation is based at
least in part on the second encoded data.
17. The method of claim 15, wherein identifying the first frame
comprises: identifying an interpolation marker that is associated
with the first frame.
18. The method of claim 17, wherein the interpolation marker is
accompanied by interpolation instructions.
19. The method of claim 15, wherein the first frame comprises a
missing frame.
20. The method of claim 15, wherein producing the interpolation of
the first frame comprises: producing the interpolation of the first
frame based on at least the portion of the first encoded data and
at least a portion of third encoded data that relates to a third
sequence of frames representing a third perspective view based on a
weight indicator, the weight indicator specifying an extent to
which at least the portion of the first encoded data is to be
weighed with respect to at least the portion of the third encoded
data.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional
Application No. 61/291,818, filed Dec. 31, 2009, which is
incorporated by reference herein in its entirety. This application
also claims the benefit of U.S. Provisional Application No.
61/303,119, filed Feb. 10, 2010, which is incorporated by reference
herein in its entirety.
[0002] This application is also related to the following U.S.
Patent Applications, each of which also claims the benefit of U.S.
Provisional Patent Application Nos. 61/291,818 and 61/303,119 and
each of which is incorporated by reference herein:
[0003] U.S. patent application Ser. No. 12/845,409, filed on Jul.
28, 2010, and entitled "Display with Adaptable Parallax
Barrier";
[0004] U.S. patent application Ser. No. 12/845,440, filed on Jul.
28, 2010, and entitled "Adaptable Parallax Barrier Supporting Mixed
2D and Stereoscopic 3D Display Regions";
[0005] U.S. patent application Ser. No. 12/845,461, filed on Jul.
28, 2010, and entitled "Display Supporting Multiple Simultaneous 3D
Views"; and
[0006] U.S. patent application Ser. No. ______ (Attorney Docket No.
A05.01330000), filed on even date herewith and entitled
"Hierarchical Video Compression Supporting Selective Delivery of
Two-Dimensional and Three-Dimensional Video Content."
BACKGROUND OF THE INVENTION
[0007] 1. Field of the Invention
[0008] The present invention relates to techniques for processing
video images.
[0009] 2. Background Art
[0010] Images may be transmitted for display in various forms. For
instance, television (TV) is a widely used telecommunication medium
for transmitting and displaying images in monochromatic ("black and
white") or color form. Conventionally, images are provided in
analog form and are displayed by display devices in the form of
two-dimensional images. More recently, images are being provided in
digital form for display in two-dimensions on display devices
having improved resolution. Even more recently, images capable of
being displayed in three-dimensions are being provided.
[0011] Conventional displays may use a variety of techniques to
achieve three-dimensional image viewing functionality. For example,
various types of glasses have been developed that may be worn by
users to view three-dimensional images displayed by a conventional
display. Examples of such glasses include glasses that utilize
color filters or polarized filters. In each case, the lenses of the
glasses pass two-dimensional images of differing perspective to the
user's left and right eyes. The images are combined in the visual
center of the brain of the user to be perceived as a
three-dimensional image. In another example, synchronized left eye,
right eye LCD (liquid crystal display) shutter glasses may be used
with conventional two-dimensional displays to create a
three-dimensional viewing illusion. In still another example, LCD
display glasses may be used to display three-dimensional images to
a user. The lenses of the LCD display glasses include corresponding
displays that provide images of differing perspective to the user's
eyes, to be perceived by the user as three-dimensional.
[0012] Some displays are configured for viewing three-dimensional
images without the user having to wear special glasses, such as by
using techniques of autostereoscopy. For example, a display may
include a parallax barrier that has a layer of material with a
series of precision slits. The parallax barrier is placed proximal
to a display so that a user's eyes each see a different set of
pixels to create a sense of depth through parallax. Another type of
display for viewing three-dimensional images is one that includes a
lenticular lens. A lenticular lens includes an array of magnifying
lenses configured so that when viewed from slightly different
angles, different images are magnified. Displays are being
developed that use lenticular lenses to enable autostereoscopic
images to be generated.
[0013] Each technique for achieving three-dimensional image viewing
functionality involves transmitting three-dimensional video content
to a display device, so that the display device can display
three-dimensional images that are represented by the
three-dimensional video content to a user. A variety of issues may
arise with respect to such transmission. For example, errors that
occur during the transmission may cause frame data in the video
content to become corrupted. In another example, a source of the
video content and/or the channels through which the video content
is transferred may become temporarily unable to handle a load that
is imposed by the video content. In yet another example, the
display device may be capable of processing frame data of a greater
number of perspectives than the source is capable of providing.
BRIEF SUMMARY OF THE INVENTION
[0014] Methods, systems, and apparatuses are described for
interpolating three-dimensional video content as shown in and/or
described herein in connection with at least one of the figures, as
set forth more completely in the claims.
BRIEF DESCRIPTION OF THE DRAWINGS/FIGURES
[0015] The accompanying drawings, which are incorporated herein and
form part of the specification, illustrate embodiments of the
present invention and, together with the description, further serve
to explain the principles involved and to enable a person skilled
in the relevant art(s) to make and use the disclosed
technologies.
[0016] FIG. 1 is a block diagram of an exemplary system for
generating three-dimensional video content that may be encoded in
accordance with an embodiment.
[0017] FIG. 2 is a block diagram of an exemplary display system
according to an embodiment.
[0018] FIG. 3 depicts an exemplary implementation of an encoding
system shown in FIG. 2 in accordance with an embodiment.
[0019] FIGS. 4-9 show flowcharts of exemplary methods for encoding
portions of three-dimensional video content for subsequent
interpolation according to embodiments.
[0020] FIG. 10 depicts an exemplary implementation of a decoding
system shown in FIG. 2 in accordance with an embodiment.
[0021] FIGS. 11-16 show flowcharts of exemplary methods for
decoding portions of encoded three-dimensional video content using
interpolation according to embodiments.
[0022] FIGS. 17-20 illustrate exemplary interpolation techniques
according to embodiments.
[0023] FIG. 21 is a block diagram of an exemplary electronic device
according to an embodiment.
[0024] The features and advantages of the disclosed technologies
will become more apparent from the detailed description set forth
below when taken in conjunction with the drawings, in which like
reference characters identify corresponding elements throughout. In
the drawings, like reference numbers generally indicate identical,
functionally similar, and/or structurally similar elements. The
drawing in which an element first appears is indicated by the
leftmost digit(s) in the corresponding reference number.
DETAILED DESCRIPTION OF THE INVENTION
[0025] I. Introduction
[0026] The following detailed description refers to the
accompanying drawings that illustrate exemplary embodiments of the
present invention. However, the scope of the present invention is
not limited to these embodiments, but is instead defined by the
appended claims. Thus, embodiments beyond those shown in the
accompanying drawings, such as modified versions of the illustrated
embodiments, may nevertheless be encompassed by the present
invention.
[0027] References in the specification to "one embodiment," "an
embodiment," "an example embodiment," or the like, indicate that
the embodiment described may include a particular feature,
structure, or characteristic, but every embodiment may not
necessarily include the particular feature, structure, or
characteristic. Moreover, such phrases are not necessarily
referring to the same embodiment. Furthermore, when a particular
feature, structure, or characteristic is described in connection
with an embodiment, it is submitted that it is within the knowledge
of one skilled in the relevant art(s) to implement such feature,
structure, or characteristic in connection with other embodiments
whether or not explicitly described.
[0028] Furthermore, it should be understood that spatial
descriptions (e.g., "above," "below," "up," "left," "right,"
"down," "top," "bottom," "vertical," "horizontal," etc.) used
herein are for purposes of illustration only, and that practical
implementations of the structures described herein can be spatially
arranged in any orientation or manner.
[0029] II. Example Embodiments
[0030] Example embodiments relate to interpolation of
three-dimensional video content. Three-dimensional video content is
video content that includes portions representing respective frame
sequences that provide respective perspective views of a given
subject matter over the same period of time. In accordance with
some embodiments, an upstream device analyzes the three-dimensional
video content to identify one or more interpolation opportunities.
An interpolation opportunity occurs when a target perspective view
that is associated with the three-dimensional video content is
between reference perspective views that are associated with the
three-dimensional video content. The target perspective view and
the reference perspective views are perspective views of a common
video event that are provided by respective sequences of frames
(alternatively referred to herein as "images" or "pictures") that
are represented by respective portions of the three-dimensional
video content.
[0031] For example, assume that three-dimensional video content
includes portions PA, PB, and PC that represent respective
perspective views VA, VB, and VC for illustrative purposes. Further
assume that VB is between VA and VC. In accordance with this
example, an interpolation opportunity is said to occur for
providing an interpolated representation of PB based on PA and
PC.
[0032] If an interpolation opportunity is identified, frame data
that is associated with the interpolation opportunity may be
replaced with an interpolation marker. In accordance with the
example mentioned above, the upstream device may replace PB with
the interpolation marker.
[0033] When a downstream device receives the three-dimensional
video content, which includes the interpolation marker, the
downstream device may replace the interpolation marker with an
interpolated representation of the frame data that the
interpolation marker replaced. For instance, the downstream device
may interpolate between the portions of the three-dimensional video
content that represent the sequences of frames that provide the
reference perspective views to generate an interpolated
representation (a.k.a. an interpolation) of the portion of the
three-dimensional video content that represents the sequence of
frames that provides the target perspective view. In accordance
with the example mentioned above, the downstream device may
interpolate between PA and PC to generate an interpolated
representation of PB.
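By way of a minimal sketch (assuming frames are numpy arrays and that VB lies midway between VA and VC, so a simple per-pixel blend of corresponding PA and PC frames approximates the missing PB frames; the function name and weights are illustrative, not from the application):

```python
import numpy as np

def interpolate_view(frame_a: np.ndarray, frame_c: np.ndarray,
                     weight_a: float = 0.5) -> np.ndarray:
    """Estimate a frame of target view VB from the corresponding frames of
    reference views VA and VC by weighted per-pixel blending; weight_a = 0.5
    assumes VB sits midway between VA and VC."""
    blended = (weight_a * frame_a.astype(np.float64)
               + (1.0 - weight_a) * frame_c.astype(np.float64))
    return np.clip(blended, 0, 255).astype(np.uint8)

# Hypothetical usage: reconstruct every frame of portion PB from PA and PC.
pa = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(3)]
pc = [np.full((480, 640, 3), 255, dtype=np.uint8) for _ in range(3)]
pb_estimate = [interpolate_view(fa, fc) for fa, fc in zip(pa, pc)]
```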
[0034] In some embodiments, the downstream device identifies a
frame that is not directly represented by data that is included in
the three-dimensional video content. For example, the frame may be
represented by an interpolation marker. However, in such
embodiments, the downstream device may perform an interpolation
operation with respect to portions of the three-dimensional video
content even in the absence of an interpolation marker. For
example, the data may be corrupted. In accordance with this
example, the frame may be missing from the data, or a portion of
the data that corresponds to the frame may include erroneous data.
Accordingly, the interpolation need not necessarily be performed in
response to an interpolation marker.
[0035] The embodiments described herein have a variety of benefits
as compared to conventional techniques for processing video
content. For example, the embodiments may increase the likelihood
that a source of the video content and/or the channels through
which the video content is transferred are capable of handling a
load that is imposed by the video content. In another example, the
embodiments may be capable of increasing the number of perspectives
that are provided by the video content. In yet another example, the
embodiments may be capable of correcting corrupted data that is
included in the video content based on other data in the video
content. For instance, the corrupted data may be corrected on the
fly using one or more of the techniques described herein.
[0036] The following subsections describe a variety of example
embodiments of the present invention. It will be apparent to
persons skilled in the relevant art that various changes in form
and detail can be made to the embodiments described herein without
departing from the spirit and scope of the invention. Thus, the
breadth and scope of the present invention should not be limited by
any of the example embodiments described herein.
[0037] A. Example Display System and Method Embodiments
[0038] In accordance with embodiments described herein,
three-dimensional video content is represented as a plurality of
separate portions (a.k.a. digital video streams). Each portion
represents a respective frame sequence that provides a respective
perspective view of a video event. This is illustrated by FIG. 1,
which is a diagram of an exemplary system 100 for generating
three-dimensional video content that may be encoded in accordance
with an embodiment. As shown in FIG. 1, system 100 includes a
plurality of video cameras 102A-102N that are directed at and
operate to record images of the same subject matter 104 from
different perspectives over the same period of time. This results
in the generation of three-dimensional video content 106, which
includes N different portions 108A-108N that provide different
perspective views of subject matter 104 over the same period of
time.
[0039] Of course, techniques other than utilizing video cameras may
be used to produce the different portions 108A-108N. For example,
one or more of the portions 108A-108N may be created in a manual or
automated fashion by digital animators using advanced graphics and
animation tools. Additionally, at least one of the portions
108A-108N may be created by using a manual or automated
interpolation process that creates a portion based on analysis of
at least two of the other portions. For example, with reference to
FIG. 1, if camera 102B were absent, a digital video stream
corresponding to the perspective view of subject matter 104
provided by that camera could nevertheless be created by performing
an interpolation process on the portions of the three-dimensional
video content 106 produced by camera 102A and another of the
cameras. Still other techniques not described herein may be used to
produce one or more of the different digital video streams.
[0040] Display systems have been described that can display a
single image of certain subject matter to provide a two-dimensional
view thereof and that can also display two images of the same
subject matter viewed from different perspectives in an integrated
manner to provide a three-dimensional view thereof. Such
two-dimensional (2D)/three-dimensional (3D) display systems can
further display a multiple of two images (e.g., four images, eight
images, etc.) of the same subject matter viewed from different
perspectives in an integrated manner to simultaneously provide
multiple three-dimensional views thereof, wherein the particular
three-dimensional view perceived by a viewer is determined based at
least in part on the position of the viewer. Examples of such 2D/3D
display systems are described in the following commonly-owned,
co-pending U.S. Patent Applications: U.S. patent application Ser.
No. 12/845,409, filed on Jul. 28, 2010, and entitled "Display with
Adaptable Parallax Barrier"; U.S. patent application Ser. No.
12/845,440, filed on Jul. 28, 2010, and entitled "Adaptable
Parallax Barrier Supporting Mixed 2D and Stereoscopic 3D Display
Regions"; and U.S. patent application Ser. No. 12/845,461, filed on
Jul. 28, 2010, and entitled "Display Supporting Multiple
Simultaneous 3D Views." The entirety of each of these applications
is incorporated by reference herein.
[0041] The portions 108A-108N produced by system 100 can be
obtained and provided to a 2D/3D display system as described above
in order to facilitate the presentation of a two-dimensional view
of subject matter 104, a single three-dimensional view of subject
matter 104, or multiple three-dimensional views of subject matter
104.
[0042] FIG. 2 is a block diagram of an exemplary display system 200
according to an embodiment. Generally speaking, display system 200
operates to transmit three-dimensional video content, such as
three-dimensional video content 106 of FIG. 1, to a display device,
so that the display device can display three-dimensional images
that are represented by the three-dimensional video content to
user(s). According to embodiments, display system 200 interpolates
between portions of the three-dimensional video content that
correspond to respective perspective views to provide frame data
that corresponds to another perspective view. As shown in FIG. 2,
display system 200 includes source(s) 202 and a display device 204.
Source(s) 202 provide three-dimensional video content 206. Source(s)
202 can include any number of sources, including one, two, three,
etc. Each source provides one or more portions of the
three-dimensional video content 206. Examples of a source include
but are not limited to a computer storage disc (e.g., a digital
video disc (DVD) or a Blu-Ray.RTM. disc), local storage on a
display device, a remote server (i.e., a server that is located
remotely from the display device), a gaming system, a satellite, a
cable headend, and a point-to-point system.
[0043] Some of the portions of the three-dimensional video content
206 may serve as reference portions, while others serve as
supplemental portions, though the scope of the embodiments is not
limited in this respect. For instance, the supplemental portions
may be used to increase the number of perspective views that are
included in the three-dimensional video content beyond the number
of perspective views that are represented by the reference
portions. The reference portions may include 2D data, 3D2 data, 3D4
data, 3D8 data, etc. Supplemental portions may include
auto-interpolated 2D-3D2 (single stream) data, manually generated
interpolation 3D2 data, A-I 3D4 (3 stream) data, M-G-I 3D4 (3
stream) data, etc.
[0044] As shown in FIG. 2, source(s) 202 includes an encoding
system 208. Encoding system 208 encodes the three-dimensional video
content 206 to provide encoded three-dimensional video content 210.
For example, encoding system 208 may replace frame data in the
three-dimensional video content 206 with an interpolation marker.
The interpolation marker may indicate that interpolation is to be
performed between portions of the three-dimensional video content
in order to generate an interpolated representation of the frame
data that is replaced with the interpolation marker. The
interpolation marker may be accompanied by instructions for
generating the interpolated representation. It will be recognized,
however, that encoding system 208 need not necessarily replace
frame data in the three-dimensional video content 206 with an
interpolation marker. Regardless, encoding system 208 transmits the
encoded three-dimensional video content 210 toward display device
204 via communication channels 212.
[0045] It will be further recognized that source(s) 202 need not
necessarily include encoding system 208. For example, source(s) 202
may store the encoded three-dimensional video content 210, rather
than generating the encoded three-dimensional video content 210
based on the three-dimensional video content 206.
[0046] Communication channels 212 may include one or more local
device pathways, point-to-point links, and/or pathways in a hybrid
fiber coaxial (HFC) network, a wide-area network (e.g., the
Internet), a local area network (LAN), another type of network, or
a combination thereof. Communication channels 212 may support
wired and/or wireless transmission media, including satellite,
terrestrial (e.g., fiber optic, copper, twisted pair, coaxial, or
the like), radio, microwave, free-space optics, and/or any other
form or method of transmission.
[0047] Display device 204 displays images to user(s) upon receipt
of the encoded three-dimensional video content 210. Display device
204 may be implemented in various ways. For instance, display
device 204 may be a television display (e.g., a liquid crystal
display (LCD) television, a plasma television, etc.), a computer
monitor, a projection system, or any other type of display
device.
[0048] Display device 204 includes an interpolation-enabled
decoding system 214, display circuitry 216, and a screen 218.
Decoding system 214 decodes the encoded three-dimensional video
content 210 to provide decoded three-dimensional video content 220.
For instance, decoding system 214 may interpolate between portions
of a decoded representation of the encoded three-dimensional video
content 210 to generate one or more of the portions of the decoded
three-dimensional video content 220. In one example, decoding
system 214 may interpolate in response to detecting an
interpolation indicator in the encoded three-dimensional video
content 210. In another example, decoding system 214 may
interpolate in response to determining that a frame that is
included in the decoded representation of the encoded
three-dimensional video content 210 is not directly represented by
data in the decoded representation. For instance, decoding system
214 may determine that the frame is replaced by an interpolation
marker, that the frame is missing from the data, or that a portion
of the data that corresponds to the frame includes erroneous data.
Interpolation that is performed by decoding system 214 may be
incorporated into a decoding process or may be performed after such
a decoding process on raw data.
[0049] In an embodiment, decoding system 214 maintains
synchronization of the portions that are included in the decoded
three-dimensional video content 220. For instance, such
synchronization may be maintained during inter-reference frame
periods, during screen reconfiguration, etc. If decoding system 214
is unable to maintain synchronization with respect to one or more
portions of the decoded three-dimensional video content 220,
decoding system 214 may perform interpolation to generate
interpolated representations of those portion(s) until
synchronization is re-established. Decoding system 214 may
synchronize 3DN adjustments with reference frame occurrence, where
N can be any positive integer greater than or equal to two. A 3DN
adjustment may include the addition of frame data corresponding to
a perspective view, for example. For each additional perspective
that is represented by the decoded three-dimensional video content
220, N is incremented by one.
[0050] Display circuitry 216 directs display of one or more of the
frame sequences that are represented by the decoded
three-dimensional video content 220 toward screen 218, as indicated
by arrow 222, for presentation to the user(s). It will be
recognized that although display circuitry 216 is labeled as such,
the functionality of display circuitry 216 may be implemented in
hardware, software, firmware, or any combination thereof.
[0051] Screen 218 displays the frame sequence(s) that are received
from display circuitry 216 to the user(s). Screen 218 may be any
suitable type of screen, including but not limited to an LCD
screen, a plasma screen, a light emitting device (LED) screen
(e.g., an OLED (organic LED) screen), etc.
[0052] It will be recognized that encoding system 208 may be
external to source(s) 202. Moreover, decoding system 214 may be
external to display device 204. For instance, encoding system 208
and decoding system 214 may be implemented in a common device, such
as a transcoder that is coupled between source(s) 202 and display
device 204.
[0053] It will be further recognized that feedback may be provided
from communication channels 212 and/or display device 204 to any
one or more of the source(s) 202. For example, display device 204
may provide feedback to indicate an error that occurs with respect
to frame data that is included in encoded three-dimensional video
content 210, one or more characteristics that are associated with
display device 204, etc. Examples of such characteristics include
but are not limited to a load that is associated with display
device 204 and a number of perspective views that display device
204 is capable of processing. In another example, channels 212 may
provide feedback to indicate an error that occurs with respect to
frame data that is included in encoded three-dimensional video
content 210, one or more characteristics (e.g., a load) that are
associated with the channels 212, etc.
[0054] B. Example Encoding Embodiments
[0055] FIG. 3 depicts a block diagram of an encoding system 300,
which is an exemplary implementation of encoding system 208 of FIG.
2, in accordance with an embodiment. As shown in FIG. 3, encoding
system 300 includes input circuitry 302, processing circuitry 304,
and output circuitry 306. Input circuitry 302 serves as an input
interface for encoding system 300. Processing circuitry 304
receives a plurality of portions 310A-310N of three-dimensional
video content 308 through input circuitry 302. Each of the portions
310A-310N represents a respective sequence of frames that provides
a respective perspective view of a video event. Processing
circuitry 304 encodes the portions 310A-310N to provide encoded
portions 314A-314N.
[0056] Processing circuitry 304 analyzes at least some of the
portions 310A-310N to identify one or more interpolation
opportunities. An interpolation opportunity occurs when a target
perspective view that is associated with the three-dimensional
video content 308 is between reference perspective views that are
associated with the three-dimensional video content 308. The target
perspective view and the reference perspective views are provided
by respective sequences of frames that are represented by
respective portions of the three-dimensional video content 308. For
each identified interpolation opportunity, processing circuitry 304
replaces frame data that is included in the corresponding portion
of the three-dimensional video content 308 with an interpolation
marker. For example, if processing circuitry 304 identifies an
interpolation opportunity in each of first portion 310A and second
portion 310B, processing circuitry 304 replaces frame data that is
included in first portion 310A with an interpolation marker and
replaces frame data that is included in second portion 310B with
another interpolation marker.
[0057] Any one or more of the interpolation marker(s) may be
accompanied by an interpolation instruction. For instance, a first
interpolation instruction that corresponds to a first interpolation
marker may specify which of the portions 310A-310N of the
three-dimensional video content 308 are to be used for generating
an interpolated representation of the frame data that the first
interpolation marker replaces. A second interpolation instruction
that corresponds to a second interpolation marker may specify which
of the portions 310A-310N are to be used for generating an
interpolated representation of the frame data that the second
interpolation marker replaces, and so on.
[0058] Each interpolation marker may specify a type of
interpolation to be performed to generate an interpolated
representation of the frame data that the interpolation marker
replaces. For instance, a first type of interpolation may assign a
first weight to a first reference portion of the three-dimensional
video content 308 and a second weight that is different from the
first weight to a second reference portion of the three-dimensional
video content 308 for generating an interpolated representation of
frame data. A second type of interpolation may assign equal weights
to the first and second reference portions of the three-dimensional
video content 308. Other exemplary types of interpolation include
but are not limited to linear interpolation, polynomial
interpolation, and spline interpolation.
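As a hedged sketch of how a decoder might honor such a type indication (the type names and the marker format below are assumptions for illustration; the application does not fix a syntax):

```python
import numpy as np

def apply_interpolation(kind: str, ref_a: np.ndarray, ref_b: np.ndarray,
                        weight_a: float = 0.5) -> np.ndarray:
    """Dispatch on a hypothetical interpolation-type field: 'weighted'
    blends the two reference portions unequally, while 'equal' uses a
    50/50 blend, mirroring the first and second types described above."""
    a = ref_a.astype(np.float64)
    b = ref_b.astype(np.float64)
    if kind == "weighted":
        out = weight_a * a + (1.0 - weight_a) * b
    elif kind == "equal":
        out = 0.5 * (a + b)
    else:
        raise ValueError(f"unsupported interpolation type: {kind}")
    return np.clip(out, 0, 255).astype(np.uint8)
```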
[0059] Output circuitry 306 serves as an output interface for
encoding system 300. Processing circuitry 304 delivers encoded
three-dimensional video content 312 that includes encoded portions
314A-314N through output circuitry 306.
[0060] Portions of three-dimensional video content, such as
portions 310A-310N, may be encoded in any of a variety of ways.
FIGS. 4-9 show flowcharts 400, 500, 600, 700, 800, and 900 of
exemplary methods for encoding portions of three-dimensional video
content for subsequent interpolation according to embodiments.
Flowcharts 400, 500, 600, 700, 800, and 900 may be performed by
encoding system 300 shown in FIG. 3, for example. However, the
methods of flowcharts 400, 500, 600, 700, 800, and 900 are not
limited to that embodiment. Further structural and operational
embodiments will be apparent to persons skilled in the relevant
art(s) based on the discussion regarding flowcharts 400, 500, 600,
700, 800, and 900. Flowcharts 400, 500, 600, 700, 800, and 900 are
described as follows.
[0061] In all of FIGS. 4-9, the basic approach involves encoder
processing of at least a first sequence of frames and a second
sequence of frames, wherein the first sequence represents a first
perspective view (e.g., a right eye view) while the second sequence
represents a second perspective view (e.g., a left eye view). As an
output of such encoder processing, many frames will be encoded
based on the frame itself (no referencing to other frames),
internal referencing (referencing frames within the same sequence
of frames), and external referencing (referencing frames outside of
the current frame's sequence of frames). In addition, whenever an
interpolation opportunity presents itself, instead of sending
encoded data for such a frame, the encoded data will either be (i)
merely deleted (forcing a decoder to perform interpolation based on
its determination that the encoded frame data is missing), or (ii)
replaced with interpolation information. Such interpolation
information may be nothing more than an indicator or marker (an
"interpolation marker") but may also contain interpolation
instructions, data and parameters.
[0062] In the encoder processing, a determination is made as to the
hierarchical importance of the current frame under consideration.
That is, the extent to which the current frame will be referenced
by other frames is determined. For example, if the current frame is a
primary reference frame (e.g., an I-Frame) that will be referenced
by many other frames, applying interpolation may not be
justifiable. If, on the other hand, the current frame will be
referenced by no (or few) other frame(s), it may be a prime
candidate for considering interpolation. In addition, a current
frame is encoded to determine the size of the resultant encoded
frame data. If the size is less than an established threshold,
interpolation may not be applied. But if, for example, the current
frame offers justifiable data savings without being referenced by
other frames, it is a prime candidate for interpolation.
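The candidate test just described might be sketched as follows; the parameter names and threshold values are illustrative assumptions rather than values given in the application:

```python
def is_interpolation_candidate(times_referenced: int,
                               encoded_size_bytes: int,
                               ref_count_limit: int = 2,
                               size_threshold_bytes: int = 4096) -> bool:
    """A frame is a prime candidate for interpolation when it is referenced
    by no (or few) other frames AND its encoded form is large enough that
    dropping it yields a justifiable data savings."""
    rarely_referenced = times_referenced <= ref_count_limit
    worth_dropping = encoded_size_bytes >= size_threshold_bytes
    return rarely_referenced and worth_dropping
```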
[0063] Once a candidate frame has been identified, the encoder
processing involves applying at least one but, depending on the
embodiment, may apply multiple interpolation approaches (along with
various underlying parameter variations). If only one approach is
applied, a determination is made as to whether such interpolation
can be used to yield a visually acceptable output. When multiple
approaches are available, a selection is made therefrom of (i) a
best match that is also determined to be visually acceptable, (ii)
the first match that can be used to yield something visually
acceptable, or (iii) an acceptable match selected at least in part
based on the ease of decoding and/or the size of the interpolation
information.
If best and/or acceptable interpolation information is identified
which saves justifiable amounts of data, the encoder processing
involves selecting to use the interpolation information (or use
nothing to force default interpolation by a decoder) instead of the
encoded frame data in subsequent storage and/or transmissions.
[0064] For example, in a three frame sequence, a camera is fixed
and only a relatively small object within the field of view moves
relatively slowly therein, while the background remains practically
unchanged. An interpolation opportunity might involve replacing the
middle frame in the sequence with nothing at all, to force the
decoder to interpolate between the first frame and the third
frame. Alternatively, a marker (an interpolation
marker) might be used instead of the second frame's encoded data.
Upon identification of such marker, a decoder might either (i)
substitute the first or the third frame data for the missing second
frame data, which is likely to not be noticed by a viewer due to
the relatively short frame rate period, (ii) create a substitute
for the missing second frame data by creating an average between
the first frame and the third frame (e.g., a 50/50 "weighted"
addition), or (iii) otherwise create a substitute based on weighted
addition percentage or using some other interpolation approach that
may utilize interpolation parameters, filters and other data.
[0065] In another example, when a camera is panning, to interpolate
a missing middle frame, the first and third frames might be
stitched together and then cropped to produce a substitute for the
missing middle frame data. Of course, in a panning scene, some
objects such as a moving car may appear stationary at least in some
areas within the field of view, so multiple interpolation approaches
within a single frame may be applied.
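One way to picture the stitch-and-crop substitution for a horizontal pan is sketched below; the known pan offset and the half-step placement of the middle frame are illustrative assumptions:

```python
import numpy as np

def stitch_and_crop(first: np.ndarray, third: np.ndarray,
                    pan_px: int) -> np.ndarray:
    """Approximate a missing middle frame during a horizontal pan.
    Assumes 0 < pan_px < frame width and that the middle frame sits
    half a pan step into the panorama formed by the two frames."""
    h, w = first.shape[:2]
    # Panorama: the first frame plus the newly revealed right-hand strip
    # of the third frame.
    panorama = np.concatenate([first, third[:, w - pan_px:]], axis=1)
    start = pan_px // 2  # midpoint of the pan
    return panorama[:, start:start + w]
```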
[0066] Likewise, although the above two examples of interpolation
opportunities were applied to a single camera view's frame
sequence, interpolation with reference to other camera view frame
sequences may also be performed. For example, an object moving in a
frame of a first camera's frame sequence might have strong
correlation with the same object a short time later captured in a
frame of the second camera's frame sequence. Thus, if the
correlating frame of the second camera's frame sequence is
discarded or replaced, at least the frame in the first camera's
frame sequence can be used by a decoder to recreate the missing
data. In addition, a single frame (or frame portion) alone or along
with other frames (or frame portions) from either or both camera
sequences can be used by the decoder to recreate the
substitute.
[0067] Thus, by sending no interpolation information (i.e., no
replacement for deleted frame data), a decoder will conclude that
interpolation is needed and respond by either repeating an adjacent
frame (e.g., if the frames are substantially different) or creating a
middling alternative based on both preceding and subsequent frame
data using a single camera's frame sequence. If the interpolation
information contains only a marker, the decoder will immediately do
the same as above without having to indirectly reach the conclusion
that interpolation is needed. The interpolation information may
also contain further items that either direct or assist a decoder
in performing a desired interpolation. That is, for example, the
interpolation information may also contain interpolation
instructions, frame reference identifiers (that identify a frame or
frames from which a decoder can base its interpolation),
interpolation parameters (weighting factors, interpolation
approaches to be used, regional/area definitions, etc.), filters
(to be applied in the interpolation process) and any accompanying
data (e.g., texture maps, etc.) that may enhance the interpolation
process.
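Concretely, such interpolation information could be modeled as a small record; the field names below are hypothetical, since the application does not prescribe a format:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class InterpolationInfo:
    """Hypothetical container for the interpolation information described
    above: a bare marker plus optional items that direct or assist the
    decoder in performing a desired interpolation."""
    is_marker: bool = True                       # bare "interpolation marker"
    instructions: Optional[str] = None           # e.g., "blend", "stitch_crop"
    reference_frames: list = field(default_factory=list)  # frame identifiers
    weights: list = field(default_factory=list)     # weighting factors
    regions: list = field(default_factory=list)     # regional/area definitions
    filters: list = field(default_factory=list)     # filters to apply
    extra_data: dict = field(default_factory=dict)  # e.g., texture maps
```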
[0068] For example, images captured by one camera might be very
close to those captured a brief time later by another camera.
Thus, instead of using merely adjacent reference frames for
interpolation (such as the three-frame sequence with a missing
middle frame approach mentioned above), the encoder may choose to
send interpolation information that identifies for use in the
interpolation process one or more frames selected from other
camera's frame sequences and other possibly non-adjacent frames
from within the same camera's frame sequence. The interpolation
information may also include the various interpolation parameters
mentioned above, interpolation approaches to be used, regional
definitions in which such approaches and frames are used, filters
and data.
[0069] A single encoder can perform all or any portion of the above
in association with a full frame or sections thereof. For instance,
a single frame can be broken down into regions and interpolation
per region can be different from that of another region.
[0070] FIGS. 4-7 are flow charts that illustrate several of many
approaches for carrying out at least a portion of such encoder
interpolation processing. More specifically, as shown in FIG. 4,
flowchart 400 begins with step 402. In step 402, both a first portion of
three-dimensional video content and a second portion of the
three-dimensional video content are received. The first portion
corresponds to data that represents at least one frame from a first
sequence of frames that provide a first perspective view. The
second portion corresponds to data that represents at least one
frame from a second sequence of frames that provide a second
perspective view. Although not shown, a third portion that
corresponds to data that represents at least one other frame from
either the first or the second sequences of frames could also be
gathered and considered in the interpolation process. Of course,
many other portions from various other frames can also be gathered
and used.
[0071] In the implementation example of FIG. 3, the processing
circuitry 304 receives all portions, including both the first
portion and the second portion of the three-dimensional video
content through the input circuitry 302.
[0072] At step 404, the first portion and the second portion are
encoded. The encoding involves at least in part analyzing the first
portion and the second portion to identify an interpolation
opportunity. In the implementation example of FIG. 3, the
processing circuitry 304 encodes the first portion and the second
portion.
[0073] At step 406, frame data is replaced with an interpolation
marker. In the implementation example of FIG. 3, the processing
circuitry 304 replaces the frame data with the interpolation
marker.
[0074] At step 408, an encoded representation of the
three-dimensional video content is delivered. In the implementation
example of FIG. 3, the processing circuitry 304 delivers the
encoded representation of the three-dimensional video content
(e.g., encoded three-dimensional video content 312) through the
output circuitry 306.
[0075] In some embodiments, one or more of the steps 402, 404, 406,
and/or 408 of the flowchart 400 may not be performed. Moreover,
other steps in addition to or in lieu of the steps 402, 404, 406,
and/or 408 may be performed.
[0076] FIG. 5 shows a flowchart 500 that illustrates one of many
possible implementations of the step 404 of the flowchart 400 in
FIG. 4 in accordance with an embodiment of the present invention.
Similarly, as shown in FIG. 5, flowchart 500 includes step 502 that
may be applied in the step 404 of the flowchart 400 in FIG. 4, for
example. In step 502, a current frame is compared with frames that
neighbor the current frame to identify the interpolation
opportunity. For example, the frames that neighbor the current
frame may be included in respective portions of the
three-dimensional video content that correspond to respective
reference perspective views. In accordance with this example, the
current frame may be included in a portion of the three-dimensional
video content that corresponds to a perspective view that is
between the reference perspective views. In an embodiment, the
interpolation opportunity is identified in a first frame of the
first portion while the neighboring frames include a second frame
from the second portion. In the implementation example of FIG. 3,
the processing circuitry 304 may compare the current frame with the
frames that neighbor the current frame (neighbors within either or
both of the current camera view frame sequence and other camera
view's frame sequences) to identify the interpolation
opportunity.
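A minimal sketch of such a comparison, using mean absolute pixel difference as a stand-in similarity metric (both the metric and the threshold are assumptions, not specified by the application):

```python
import numpy as np

def is_interpolation_opportunity(current: np.ndarray,
                                 neighbors: list,
                                 max_mean_abs_diff: float = 8.0) -> bool:
    """Flag an interpolation opportunity when the current frame is close
    enough to all of its neighbors (temporal neighbors in the same view
    and/or frames from other camera views) that a blend of the neighbors
    could plausibly stand in for it."""
    for neighbor in neighbors:
        diff = np.mean(np.abs(current.astype(np.float64)
                              - neighbor.astype(np.float64)))
        if diff > max_mean_abs_diff:
            return False
    return bool(neighbors)
```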
[0077] In some embodiments, step 404 of flowchart 400 may be
performed in response to any one or more of the steps shown in
flowcharts 600, 700, 800, and/or 900 shown in FIGS. 6-9. As shown
in FIG. 6, flowchart 600 includes step 602. In step 602, a
determination is made that an accuracy of an estimate of the frame
data is greater than a threshold accuracy. In the implementation
example of FIG. 3, the processing circuitry 304 determines that the
accuracy of the estimate is greater than the threshold accuracy.
For example, the processing circuitry 304 may perform an
interpolation operation with respect to the first portion and/or
the second portion to generate the estimate of the frame data. In
accordance with this example, processing circuitry 304 may compare
the estimate to the frame data to determine the accuracy of the
estimate. Processing circuitry 304 may compare the accuracy to the
threshold accuracy to determine whether the accuracy of the estimate
is greater than the threshold accuracy. For instance, processing
circuitry 304 may be configured to replace the frame data with an
interpolation marker at step 406 if the accuracy of the estimate is
greater than the threshold accuracy, but not if it is less than the
threshold accuracy.
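For instance, the accuracy test might generate the estimate by blending two reference frames and score it against the actual frame data with PSNR; the choice of PSNR and the threshold value are illustrative assumptions:

```python
import numpy as np

def estimate_is_accurate_enough(actual: np.ndarray,
                                ref_a: np.ndarray,
                                ref_b: np.ndarray,
                                min_psnr_db: float = 35.0) -> bool:
    """Build an interpolated estimate of the frame from two reference
    frames, then accept the interpolation opportunity only if the
    estimate's PSNR against the actual frame exceeds the threshold."""
    estimate = 0.5 * (ref_a.astype(np.float64) + ref_b.astype(np.float64))
    mse = np.mean((actual.astype(np.float64) - estimate) ** 2)
    if mse == 0:
        return True  # perfect estimate
    psnr = 10.0 * np.log10((255.0 ** 2) / mse)
    return psnr >= min_psnr_db
```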
[0078] As shown in FIG. 7, a determination is made that an error
occurs with respect to the frame data. For instance, it may be
desirable to avoid sending frame data with respect to which an
error is known to have occurred.
[0079] As shown in FIG. 8, a determination is made that a source
that generates the three-dimensional video content has at least one
specified characteristic. For example, a load that is associated
with the source may be greater than a threshold load. In another
example, the source may not support a viewing format that is
associated with the frame data.
[0080] As shown in FIG. 9, a determination is made that a
communication channel via which the three-dimensional video content
is to be transmitted has at least one specified characteristic. For
instance, a load that is associated with the communication channel
may be greater than a threshold load.
[0081] C. Example Decoding Embodiments
[0082] FIG. 10 depicts a block diagram of a decoding system 1000,
which is an exemplary implementation of interpolation-enabled
decoding system 214 of FIG. 2, in accordance with an embodiment. As
shown in FIG. 10, decoding system 1000 includes input circuitry
1002, processing circuitry 1004, and output circuitry 1006. Input
circuitry 1002 serves as an input interface for decoding system
1000. Processing circuitry 1004 receives a plurality of encoded
portions 1010A-1010N of encoded three-dimensional video content
1008 through input circuitry 1002. Each of the encoded portions
1010A-1010N represents a respective sequence of frames that
provides a respective perspective view of a video event. Processing
circuitry 1004 decodes the encoded portions 1010A-1010N to provide
decoded portions 1014A-1014M, which are included in decoded
three-dimensional video content 1012. The decoded three-dimensional
video content 1012 is also referred to as a decoded representation
of the encoded three-dimensional video content 1008. It will be
recognized that the number of encoded portions "N" need not
necessarily be equal to the number of decoded portions "M". For
instance, processing circuitry 1004 may interpolate between any of the
encoded portions 1010A-1010N to generate one or more of the decoded
portions 1014A-1014M.
[0083] In some embodiments, processing circuitry 1004 responds to
one or more interpolation markers by generating frame data to
replace the respective interpolation marker(s). For instance,
processing circuitry 1004 may respond to a first interpolation
marker by generating first frame data to replace the first
interpolation marker. Processing circuitry 1004 may respond to a second
interpolation marker by generating second frame data to replace the
second interpolation marker, and so on. The interpolation marker(s)
are included in the encoded three-dimensional video content 1008.
The instance(s) of frame data that replace the respective
interpolation marker(s) are included in the decoded
three-dimensional video content 1012.
[0084] Any one or more of the interpolation marker(s) may be
accompanied by an interpolation instruction. For instance,
processing circuitry 1004 may use a first subset of the encoded
portions 1010A-1010N that is specified by a first interpolation
instruction that corresponds to a first interpolation marker to
generate first frame data to replace the first interpolation
marker. Processing circuitry 1004 may use a second subset of the
encoded portions 1010A-1010N that is specified by a second
interpolation instruction that corresponds to a second
interpolation marker to generate second frame data to replace the
second interpolation marker, and so on. Each interpolation
instruction (or the interpolation marker that it accompanies) may
specify a type of interpolation to be performed to generate the
frame data that the interpolation marker replaces.
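By way of illustration only, the marker-and-instruction handling
described above might be organized as in the following Python
sketch. The sketch is hypothetical rather than an implementation of
processing circuitry 1004; the names InterpolationMarker,
InterpolationInstruction, decode_fn, and interpolate_fn are
assumptions.

    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class InterpolationInstruction:
        portion_indices: List[int]       # subset of portions to reference
        approach: str = "average"        # type of interpolation to perform

    @dataclass
    class InterpolationMarker:
        # A bare marker may carry no instruction at all.
        instruction: Optional[InterpolationInstruction] = None

    def decode_with_markers(encoded_portions, decode_fn, interpolate_fn):
        """Decode portions in order, replacing each interpolation marker
        with frame data generated from the portions it references."""
        decoded = []
        for index, item in enumerate(encoded_portions):
            if isinstance(item, InterpolationMarker):
                # Without an instruction, default to the preceding portion.
                instruction = item.instruction or InterpolationInstruction(
                    portion_indices=[index - 1])
                references = [decoded[i] for i in instruction.portion_indices]
                decoded.append(interpolate_fn(references, instruction.approach))
            else:
                decoded.append(decode_fn(item))
        return decoded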
[0085] In other embodiments, processing circuitry 1004 identifies
one or more frames that are not directly represented by one or more
respective encoded portions of the encoded three-dimensional video
content 1008. For example, a frame is not directly represented if
the frame is replaced with an interpolation marker in the encoded
three-dimensional video content 1008. In another example, a frame
is not directly represented if the frame is missing from the
encoded three-dimensional video content 1008. In yet another
example, a frame is not directly represented if the frame is
represented by erroneous data in the encoded three-dimensional
video content. Missing frames and erroneous frame data may occur,
for example, because of (i) defects in storage media or storage
process, and (ii) losses or unacceptable delays encountered in a
less than perfect communication pathway. Another example resulting
in a need for interpolation occurs when referenced frame data
cannot be found or is itself erroneous (corrupted). That is,
current frame data is correct but to decode it, one or more other
portions of frame data (such portions directly associated with
different frames) happen to be missing or contain erroneous data.
In such a case, without an ability to decode the present, correct
frame data, interpolation may be performed to generate the current
frame as an alternative. Processing circuitry 1004 produces
interpolation(s) of the respective frame(s) that are not directly
represented.
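By way of illustration only, the three cases above (marker, missing,
erroneous) might be distinguished as in the following hypothetical
sketch; the CRC test merely stands in for whatever error detection
the storage or communication pathway actually provides, and
InterpolationMarker is the hypothetical class from the sketch above.

    import zlib

    def classify_frame(index, encoded_frames, expected_crc):
        """Decide whether a frame is directly represented or must be
        produced by interpolation."""
        entry = encoded_frames.get(index)
        if entry is None:
            return "missing"              # absent from the received content
        if isinstance(entry, InterpolationMarker):
            return "marker"               # deliberately replaced upstream
        if zlib.crc32(entry) != expected_crc.get(index):
            return "erroneous"            # corrupted in storage or transit
        return "direct"                   # decodable as received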
[0086] Output circuitry 1006 serves as an output interface for
decoding system 1000. Processing circuitry 1004 delivers the
decoded three-dimensional video content 1012 through output
circuitry 1006.
[0087] Portions of encoded three-dimensional video content, such as
encoded portions 1010A-1010N, may be decoded in any of a variety of
ways. FIGS. 11-16 show flowcharts 1100, 1200, 1300, 1400, 1500, and
1600 of exemplary methods for decoding portions of encoded
three-dimensional video content using interpolation according to
embodiments. Flowcharts 1100, 1200, 1300, 1400, 1500, and 1600 may
be performed by decoding system 1000 shown in FIG. 10, for example.
However, the methods of flowcharts 1100, 1200, 1300, 1400, 1500, and
1600 are not limited to that embodiment. Further structural and
operational embodiments will be apparent to persons skilled in the
relevant art(s) based on the discussion regarding flowcharts 1100,
1200, 1300, 1400, 1500, and 1600. Flowcharts 1100, 1200, 1300,
1400, 1500, and 1600 are described as follows.
[0088] In all of FIGS. 11-16, the basic approach involves decoder
processing of at least a first sequence of frames and a second
sequence of frames, wherein the first sequence represents a first
perspective view (e.g., a right eye view) while the second sequence
represents a second perspective view (e.g., a left eye view). The
decoder receives many frames that are encoded based on the frame
itself (no referencing to other frames), internal referencing
(referencing frames within the same sequence of frames), and
external referencing (referencing frames outside of the current
frame's sequence of frames). In addition, for some frames, instead
of receiving encoded data, the decoder finds that such encoded data
has been either (i) deleted or (ii) replaced with interpolation
information. Upon determining that encoded data has been deleted or
replaced with interpolation information, the decoder performs
interpolation to generate substitute frame data. Interpolation
information may be nothing
more than an indicator or marker (an "interpolation marker") but
may also contain interpolation instructions, data and parameters.
The decoder processing involves applying at least one
interpolation approach but, depending on the embodiment, may apply
multiple such approaches (along with variations of various
underlying parameters).
[0089] For example, in a three frame sequence, a camera is fixed
and only a relatively small object within the field of view moves
relatively slowly therein, while the background remains practically
unchanged. In accordance with this example, an encoder may replace
the middle frame in the sequence with nothing at all. The decoder
detects that the middle frame is missing and interpolates between
the first frame and the third frame of the sequence.
Alternatively, the encoder may use a marker (an interpolation
marker) instead of the second frame's encoded data. Upon
identification of such marker, the decoder might either (i)
substitute the first or the third frame data for the missing second
frame data, which is unlikely to be noticed by a viewer due to
the relatively short frame period, (ii) create a substitute
for the missing second frame data by creating an average between
the first frame and the third frame (e.g., a 50/50 "weighted"
addition), or (iii) otherwise create a substitute based on weighted
addition percentage or using some other interpolation approach that
may utilize interpolation parameters, filters and other data.
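By way of illustration only, option (ii) might be sketched as
follows, with frames assumed to be numpy arrays; the helper name
average_substitute is hypothetical.

    import numpy as np

    def average_substitute(first_frame: np.ndarray, third_frame: np.ndarray,
                           weight: float = 0.5) -> np.ndarray:
        """Blend two decoded frames; weight=0.5 gives the 50/50 case of
        option (ii), other weights the case of option (iii)."""
        blended = (weight * first_frame.astype(np.float32)
                   + (1.0 - weight) * third_frame.astype(np.float32))
        return blended.round().astype(first_frame.dtype)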
[0090] In another example, when a camera is panning, to interpolate
a missing middle frame, the decoder may stitch the preceding and
following frames together and then crop the stitched image to
produce a substitute for the missing middle frame data. Of course
in a panning scene, some objects such as a moving car may appear
stationary at least in areas within the field of view so multiple
interpolation approaches within a single frame may be applied.
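By way of illustration only, the stitch-and-crop approach might look
like the following sketch under the simplifying assumption of a
purely horizontal, constant-rate pan of shift pixels per frame;
estimating the actual pan is a separate problem not shown here.

    import numpy as np

    def stitch_and_crop(prev_frame: np.ndarray, next_frame: np.ndarray,
                        shift: int) -> np.ndarray:
        """Approximate the missing middle frame of a rightward pan. The
        later frame's content sits 2*shift pixels to the right of the
        earlier frame's within a common panorama, so the missing middle
        frame is the window shift pixels into that panorama."""
        h, w = prev_frame.shape[:2]
        assert 0 <= 2 * shift <= w, "neighboring frames must overlap"
        pano_w = w + 2 * shift
        pano = np.zeros((h, pano_w) + prev_frame.shape[2:], np.float32)
        count = np.zeros((h, pano_w) + (1,) * (prev_frame.ndim - 2), np.float32)
        pano[:, :w] += prev_frame            # stitch in the earlier frame
        count[:, :w] += 1.0
        pano[:, 2 * shift:] += next_frame    # stitch in the later frame
        count[:, 2 * shift:] += 1.0
        pano /= count                        # average where the frames overlap
        return pano[:, shift:shift + w].astype(prev_frame.dtype)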
[0091] Likewise, although the above two interpolation examples were
applied to a single camera view's frame sequence, interpolation
with reference to other camera view frame sequences may also be
performed. For example, an object moving in a frame of a first
camera's frame sequence might have strong correlation with the same
object a short time later captured in a frame of the second
camera's frame sequence. Thus, if the correlating frame of the
second camera's frame sequence is discarded or replaced, at least
the frame in the first camera's frame sequence can be used by the
decoder to recreate the missing data. In addition, a single frame
(or frame portion) alone or along with other frames (or frame
portions) from either or both camera sequences can be used by the
decoder to recreate the substitute.
[0092] Thus, if frame data is missing and no interpolation
information (i.e., no replacement for the missing frame data) is
received by the decoder, the decoder will conclude that
interpolation is needed and respond by either repeating an adjacent
frame (e.g., if the frames are substantially different, where
blending would cause visible ghosting) or creating a
middling alternative based on both preceding and subsequent frame
data using a single camera's frame sequence. If the interpolation
information contains only a marker, the decoder will immediately do
the same as above without having to indirectly reach the conclusion
that interpolation is needed. The interpolation information may
also contain further items that either direct or assist the decoder
in performing a desired interpolation. That is, for example, the
interpolation information may also contain interpolation
instructions, frame reference identifiers (that identify a frame or
frames from which the decoder can base its interpolation),
interpolation parameters (weighting factors, interpolation
approaches to be used, regional/area definitions, etc.), filters
(to be applied in the interpolation process) and any accompanying
data (e.g., texture maps, etc.) that may enhance the interpolation
process.
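By way of illustration only, the items that interpolation
information may carry, together with the no-information fallback
just described, might be sketched as follows; every field name is an
assumption, and average_substitute is the hypothetical helper from
the earlier sketch.

    from dataclasses import dataclass, field
    from typing import Callable, List, Optional, Tuple

    @dataclass
    class InterpolationInfo:
        """A bare marker may carry none of these optional items."""
        frame_refs: List[int] = field(default_factory=list)    # frame reference identifiers
        approach: Optional[str] = None                         # interpolation approach to use
        weights: List[float] = field(default_factory=list)     # weighting factors
        regions: List[Tuple[int, int, int, int]] = field(default_factory=list)
        filters: List[Callable] = field(default_factory=list)  # filters to apply
        aux_data: Optional[bytes] = None                       # e.g., texture maps

    def fallback_interpolate(prev_frame, next_frame, difference, threshold):
        """No interpolation information received at all: repeat an adjacent
        frame when the neighbors differ substantially (blending would
        ghost); otherwise produce a middling blend of both neighbors."""
        if difference(prev_frame, next_frame) > threshold:
            return prev_frame.copy()
        return average_substitute(prev_frame, next_frame)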
[0093] For example, images captured by one camera might be very
close to those captured a brief time later by another camera.
Thus, instead of using merely adjacent reference frames for
interpolation (such as the three frame sequences with a missing
middle frame approach mentioned above), an encoder may choose to
send interpolation information that identifies for use by the
decoder one or more frames selected from other camera's frame
sequences and other possibly non-adjacent frames from within the
same camera's frame sequence. The interpolation information may
also include the various interpolation parameters mentioned above,
interpolation approaches to be used, regional definitions in which
such approaches and frames are used, filters and data.
[0094] A decoder can perform all or any portion of the above in
association with a full frame or sections thereof. For instance,
the decoder can perform interpolation operations on respective
regions of a single frame, and interpolation per region can be
different from that of another region.
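By way of illustration only, per-region interpolation within a
single frame might be sketched as follows; the rectangular region
layout and approach names are assumptions, and average_substitute is
the hypothetical helper from the earlier sketch.

    import numpy as np

    def interpolate_by_region(prev_frame, next_frame, regions):
        """regions: a list of ((y0, y1, x0, x1), approach) pairs tiling the
        frame; each region may use a different interpolation approach."""
        out = np.empty_like(prev_frame)
        for (y0, y1, x0, x1), approach in regions:
            if approach == "repeat":           # e.g., stationary background
                out[y0:y1, x0:x1] = prev_frame[y0:y1, x0:x1]
            elif approach == "average":        # e.g., slowly moving object
                out[y0:y1, x0:x1] = average_substitute(
                    prev_frame[y0:y1, x0:x1], next_frame[y0:y1, x0:x1])
            else:
                raise ValueError("unknown interpolation approach: " + approach)
        return out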
[0095] FIGS. 11-16 are flow charts that illustrate several of many
approaches for carrying out at least a portion of such decoder
interpolation processing. More specifically, as shown in FIG. 11,
flowchart 1100 begins at step 1102. In step 1102, both a first encoded
portion of a first encoded sequence of frames that represent a
first perspective view and a second encoded portion of a second
encoded sequence of frames that represent a second perspective view
are received. In the implementation example of FIG. 10, the
processing circuitry 1004 receives the first encoded portion and
the second encoded portion through the input circuitry 1002.
[0096] At step 1104, the first encoded portion and the second
encoded portion are decoded. The decoding involves responding to an
interpolation marker by generating frame data to replace the
interpolation marker. In the implementation example of FIG. 10, the
processing circuitry 1004 decodes the first encoded portion and the
second encoded portion.
[0097] At step 1106, a decoded representation of the encoded
three-dimensional video content is delivered. In the implementation
example of FIG. 10, the processing circuitry 1004 delivers the
decoded representation of the encoded three-dimensional video
content (e.g., decoded three-dimensional video content 1012)
through the output circuitry 1006.
[0098] In some example embodiments, one or more of steps 1102, 1104,
and/or 1106 of flowchart 1100 may not be performed. Moreover, steps
in addition to or in lieu of steps 1102, 1104, and/or 1106 may be
performed.
[0099] Instead of performing step 1104 of flowchart 1100, the steps
shown in flowchart 1200, flowchart 1300, or flowchart 1400 of
respective FIGS. 12-14 may be performed. As shown in FIG. 12,
flowchart 1200 begins at step 1202. In step 1202, a determination
is made that a number of perspective views that a display is
capable of processing is greater than a number of perspective views
that is initially represented by the encoded three-dimensional
video content. In the implementation example of FIG. 10, the
processing circuitry 1004 determines that the number of perspective
views that the display is capable of processing is greater than the
number of perspective views that is initially represented by the
encoded three-dimensional video content.
[0100] At step 1204, an interpolation request is provided to an
encoder. The interpolation request requests inclusion of an
interpolation marker in the encoded three-dimensional video
content. In the implementation example of FIG. 10, the processing
circuitry 1004 provides the interpolation request through the
output circuitry 1006.
[0101] At step 1206, an interpolation is performed between a
decoded version of the first encoded portion and a decoded version
of the second encoded portion to generate frame data that
corresponds to a third sequence of frames that represent a third
perspective view to replace the interpolation marker. The third
perspective view is not initially represented by the encoded
three-dimensional video content. In the implementation example of
FIG. 10, the processing circuitry 1004 interpolates between the
decoded version of the first encoded portion and the decoded
version of the second encoded portion to generate the frame
data.
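By way of illustration only, the flow of steps 1202-1206 might be
sketched as follows; the linear blend between two decoded views is a
stand-in for whatever view interpolation the decoder actually
implements, and all names are assumptions (average_substitute is the
earlier hypothetical helper).

    def synthesize_extra_views(display_views, content_views,
                               decoded_first, decoded_second, request_fn):
        """Compare view counts (step 1202), request marker insertion
        upstream (step 1204), and interpolate evenly spaced intermediate
        perspective views (step 1206)."""
        if display_views <= content_views:
            return []
        request_fn("include-interpolation-marker")
        extra = display_views - content_views
        views = []
        for k in range(1, extra + 1):
            alpha = k / (extra + 1)       # position between the two views
            views.append(average_substitute(decoded_first, decoded_second,
                                            weight=1.0 - alpha))
        return views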
[0102] As shown in FIG. 13, flowchart 1300 begins at step 1302. In
step 1302, an interpolation instruction is received from an
upstream device. In the implementation example of FIG. 10, the
processing circuitry 1004 receives the interpolation instruction
through the input circuitry 1002.
[0103] At step 1304, the first encoded portion and the second
encoded portion are decoded. The decoding involves responding to an
interpolation marker by generating frame data to replace the
interpolation marker in accordance with the interpolation
instruction. In the implementation example of FIG. 10, the
processing circuitry 1004 decodes the first encoded portion and the
second encoded portion.
[0104] As shown in FIG. 14, flowchart 1400 begins at step 1402. In
step 1402, the first encoded portion is decoded to provide a first
decoded portion of a first decoded sequence of frames that
represents the first perspective view. In the implementation
example of FIG. 10, the processing circuitry 1004 decodes the first
encoded portion.
[0105] At step 1404, the second encoded portion is decoded to
provide decoded data that represents the second perspective view,
the decoded data including an interpolation marker. In the
implementation example of FIG. 10, the processing circuitry 1004
decodes the second encoded portion.
[0106] At step 1406, an interpolation is performed between the
first decoded portion and a third decoded portion of a third
decoded sequence of frames that represents a third perspective view
to generate frame data to replace the interpolation marker in the
decoded data. In the implementation example of FIG. 10, the
processing circuitry 1004 interpolates between the first decoded
portion and the third decoded portion to generate the frame
data.
[0107] Instead of performing step 1406 of flowchart 1400, the steps
shown in flowchart 1500 of FIG. 15 may be performed. As shown
in FIG. 15, flowchart 1500 begins at step 1502. In step 1502, a
weight indicator is received from an upstream device. The weight
indicator specifies an extent to which the first decoded portion is
to be weighted with respect to the third decoded portion. In the
implementation example of FIG. 10, the processing circuitry 1004
receives the weight indicator from the upstream device through
input circuitry 1002.
[0108] At step 1504, an interpolation is performed between the
first decoded portion and a third decoded portion of a third
decoded sequence of frames that represents a third perspective view
to generate frame data to replace the interpolation marker in the
decoded data based on the extent that is specified by the weight
indicator. In the implementation example of FIG. 10, the processing
circuitry 1004 interpolates between the first decoded portion and
the third decoded portion to generate the frame data.
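By way of illustration only, steps 1502 and 1504 might be sketched
as follows; the 8-bit encoding of the weight indicator is an
assumption, and average_substitute is the earlier hypothetical
helper.

    def apply_weight_indicator(first_portion, third_portion, weight_indicator):
        """weight_indicator (assumed 0..255) specifies the extent to which
        the first decoded portion is weighted against the third."""
        weight = weight_indicator / 255.0
        return average_substitute(first_portion, third_portion, weight=weight)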
[0109] As shown in FIG. 16, flowchart 1600 begins at step 1602. In
step 1602, at least a portion of first encoded data is retrieved
that relates to a first sequence of frames representing a first
perspective view. In the implementation example of FIG. 10, the
processing circuitry 1004 retrieves the at least one portion of the
first encoded data.
[0110] At step 1604, at least a portion of second encoded data is
received that relates to a second sequence of frames representing a
second perspective view. In the implementation example of FIG. 10,
the processing circuitry 1004 receives the at least one portion of
the second encoded data.
[0111] At step 1606, a first frame is identified within the first
sequence of frames that is not directly represented by the first
encoded data retrieved. For example, an interpolation marker that
is associated with the first frame may be identified. In accordance
with this example, the interpolation marker may be accompanied by
interpolation instructions. In another example, the first frame
includes a missing frame. In the implementation example of FIG. 10,
the processing circuitry 1004 identifies the first frame.
[0112] At step 1608, an interpolation of the first frame is
produced. For example, the interpolation may be based at least in
part on the second encoded data. In another example, production of
the interpolation of the first frame may be based on at least the
portion of the first encoded data and at least a portion of third
encoded data that relates to a third sequence of frames
representing a third perspective view based on a weight indicator.
In accordance with this example, the weight indicator specifies an
extent to which at least the portion of the first encoded data is
to be weighted with respect to at least the portion of the third
encoded data. In the implementation example of FIG. 10, the
processing circuitry 1004 produces the interpolation of the first
frame.
[0113] FIGS. 17-20 illustrate exemplary interpolation techniques
1700, 1800, 1900, and 2000 according to embodiments. Each of the
interpolation techniques 1700, 1800, 1900, and 2000 is described
with reference to exemplary instances of video content. The
instances of video content may be 2D video content, 3D2 video
content, 3D4 video content, 3D8 video content, etc. It will be
recognized that 2D video content represents one perspective view of
a video event, 3D2 video content represents two perspective views
of the video event, 3D4 represents four perspective views of the
video event, 3D8 video content represents eight perspective views
of the video event, and so on. The numbers of views represented by
the instances of video content that are used to describe techniques
1700, 1800, 1900, and 2000 are provided for illustrative purposes
and are not intended to be limiting. It will be recognized that
techniques 1700, 1800, 1900, and 2000 are applicable to video
content that represents any suitable number of views. In the
following discussion, instances of video content are referred to
simply as "content" for convenience.
[0114] Referring to FIG. 17, technique 1700 is directed to staging
3D8 content from 2D up according to an embodiment. Technique 1700
will be described with reference to original 3D8 content 1702, 2D
content 1704, 3D2 content 1706, 3D4 content 1708, and 3D8 content
1710. The original content used to illustrate technique 1700 is 3D8
content, which includes eight video streams (labeled as 1-8) that
represent respective views of a video event.
[0115] A single stream of the original 3D8 content 1702 may be used
to provide 2D content. As shown in FIG. 17, stream 3 of the
original 3D8 content 1702 is used to provide 2D content 1704 for
illustrative purposes. Internal interframe compression referencing
is used with respect to stream 3 of the original 3D8 content 1702
to generate stream 3 of the 2D content 1704. However, no other
streams that are included in the original 3D8 content 1702 are
referenced to generate stream 3 of the 2D content 1704.
[0116] Internal interframe compression referencing is a technique
in which differences between frames (e.g., adjacent frames) that
are included in a stream of video content are used to represent the
frames in that stream. For example, a first frame in the stream may
be designated as a reference frame with other frames in the stream
being designated as dependent frames. In accordance with this
example, the reference frame may be represented by data that is
sufficient to independently define the reference frame, while the
dependent frames may be represented by difference data. The
difference data that represents each dependent frame may use data
that represents one or more of the other frames in the stream to
generate data that is sufficient to independently define that
dependent frame.
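By way of illustration only, the reference-plus-difference scheme
just described might be sketched as follows; for simplicity every
dependent frame is differenced against the single reference frame,
whereas the description above also permits difference data that
references other frames in the stream.

    import numpy as np

    def encode_stream(frames):
        """First frame is the reference; dependent frames become diffs."""
        reference = frames[0]
        diffs = [frame.astype(np.int16) - reference.astype(np.int16)
                 for frame in frames[1:]]      # assumes 8-bit samples
        return reference, diffs

    def decode_stream(reference, diffs):
        """Reconstruct each dependent frame from the reference and its diff."""
        frames = [reference]
        for diff in diffs:
            frames.append((reference.astype(np.int16) + diff)
                          .astype(reference.dtype))
        return frames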
[0117] Two streams of the original 3D8 content 1702 may be used to
provide 3D2 content. As shown in FIG. 17, streams 3 and 7 of the
original 3D8 content 1702 are used to provide 3D2 content 1706 for
illustrative purposes. Internal interframe compression referencing
is used with respect to stream 3 of the original 3D8 content 1702
to generate stream 3 of the 3D2 content 1706, as described above
with reference to 2D content 1704. Stream 7 of the 3D2 content 1706
is generated using internal interframe compression referencing with
respect to stream 7 of the original 3D8 content 1702, together with
stream 3 of the 2D content 1704 for external referencing.
[0118] Four streams of the original 3D8 content 1702 may be used to
provide 3D4 content. As shown in FIG. 17, streams 3 and 7 of the
3D4 content 1708 are generated as described above with reference to
3D2 content 1706. Streams 1 and 5 of the 3D4 content 1708 are
generated using streams 1 and 5 of the original 3D8 content 1702
and streams 3 and 7 of the 3D2 content 1706 for referencing.
[0119] All eight streams of the original 3D8 content 1702 may be
used to provide 3D8 content. As shown in FIG. 17, streams 1, 3, 5,
and 7 of the 3D8 content 1710 are generated as described above with
reference to 3D4 content 1708. Streams 2, 4, 6, and 8 of the 3D8
content 1710 are generated using any of a plurality of streams,
which includes streams 1, 3, 5, and 7 of the 3D4 content 1708 and
streams 2, 4, 6, and 8 of the original 3D8 content 1702, for
referencing.
[0120] Referring to FIG. 18, technique 1800 is directed to a
limited referencing configuration according to an embodiment.
Technique 1800 will be described with reference to original 3D8
content 1802, 2D content 1804, 3D2 content 1806, 3D4 content 1808,
and 3D8 content 1810. The original content used to illustrate
technique 1800 is 3D8 content, which includes eight video streams
(labeled as 1-8) that represent respective views of a video event.
The 2D content 1804 and the 3D2 content 1806 are generated in the
same manner as the 2D content 1704 and the 3D2 content 1706
described above with reference to FIG. 17. However, the manner in
which the 3D4 content 1808 and the 3D8 content 1810 are generated
differs from the manner in which the 3D4 content 1708 and the 3D8
content 1710 are generated.
[0121] As shown in FIG. 18, streams 3 and 7 of the 3D4 content 1808
are generated as described above with reference to 3D4 content 1708
of FIG. 17. However, stream 1 of the 3D4 content 1808 is generated
using stream 1 of the original 3D8 content 1802 and streams 3 and 7
of the 3D2 content 1806 for referencing. Furthermore, stream 5 of
the 3D4 content 1808 is generated using stream 5 of the original
3D8 content 1802 and streams 3 and 7 of the 3D2 content 1806 for
referencing.
[0122] Streams 1, 3, 5, and 7 of the 3D8 content 1810 are generated
as described above with reference to the 3D4 content 1808. Streams
2, 4, 6, and 8 of the 3D8 content 1810 are generated using internal
interframe compression referencing and streams 1, 3, 5, and 7 of
the 3D4 content 1808 for referencing.
[0123] Referring to FIG. 19, technique 1900 is directed to
interpolation of lost frame data to maintain image stability
according to an embodiment. Technique 1900 will be described with
reference to original 3D8 content 1902, 2D content 1904, and 3D2
content 1906. The original content used to illustrate technique
1900 is 3D8 content, which includes eight video streams (labeled as
1-8) that represent respective views of a video event. The 2D
content 1904 is generated in the same manner as the 2D content 1704
described above with reference to FIG. 17. However, the manner in
which the 3D2 content 1906 is generated differs from the manner in
which the 3D2 content 1706 is generated.
[0124] As shown in FIG. 19, stream 3 of the 3D2 content 1906 is
generated as described above with reference to 3D2 content 1706 of
FIG. 17. Moreover, stream 7 of the 3D2 content 1906 is generated
using stream 3 of the 2D content 1904 for referencing. Stream 7 of
the 3D2 content 1906 is further generated using internal interframe
compression referencing if a previous frame and/or a future frame
of stream 3 of the 2D content 1904 is similar to a current frame of
stream 3 of the 2D content 1904.
[0125] Referring to FIG. 20, technique 2000 is directed to
interpolation to provide a number of views that is greater than a
number of views that are represented by received video content
according to an embodiment. Technique 2000 will be described with
reference to original 3D4 content 2002, 2D content 2004, 3D2
content 2006, 3D4 content 2008, and 3D8 content 2010. The original
content used to illustrate technique 2000 is 3D4 content, which
includes four video streams (labeled as 1-4) that represent
respective views of a video event.
[0126] A single stream of the original 3D4 content 2002 may be used
to provide 2D content. As shown in FIG. 20, stream 3 of the
original 3D4 content 2002 is used to provide 2D content 2004 for
illustrative purposes. Internal interframe compression referencing
is used with respect to stream 3 of the original 3D4 content 2002
to generate stream 3 of the 2D content 2004. However, no other
streams that are included in the original 3D4 content 2002 are
referenced to generate stream 3 of the 2D content 2004.
[0127] Two streams of the original 3D4 content 2002 may be used to
provide 3D2 content. As shown in FIG. 20, streams 1 and 3 of the
original 3D4 content 2002 are used to provide 3D2 content 2006 for
illustrative purposes. Stream 3 of the 3D2 content 2006 is
generated as described above with reference to the 2D content 2004.
Stream 1 of the 3D2 content 2006 is generated using internal
interframe compression referencing with respect to stream 1 of the
original 3D4 content 2002, together with stream 3 of the 2D content
2004 for external referencing.
[0128] All four streams of the original 3D4 content 2002 may be
used to provide 3D4 content. As shown in FIG. 20, streams 1 and 3
of the 3D4 content 2008 are generated as described above with
reference to 3D2 content 2006. Streams 2 and 4 of the 3D4 content
2008 are generated using streams 2 and 4 of the original 3D4
content 2002 and streams 1 and 3 of the 3D2 content 2006 for
referencing.
[0129] All four streams of the original 3D4 content 2002 may be
used to provide 3D8 content. As shown in FIG. 20, streams 1-4 of
the 3D8 content 2010 are generated as described above with
reference to 3D4 content 2008. Streams 5-8 of the 3D8 content 2010
are entirely interpolated using nearest neighbor streams.
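By way of illustration only, synthesis of streams 5-8 from nearest
neighbor streams might be sketched as follows; because the spatial
ordering of the stream labels is content-specific, the neighbor
pairing is taken as an input rather than assumed, and
average_substitute is the earlier hypothetical helper.

    def synthesize_streams(decoded_streams, neighbor_map):
        """decoded_streams: {stream_index: [frames]}. neighbor_map:
        {missing_stream: (left_neighbor, right_neighbor)} naming each
        missing stream's nearest decoded neighbors."""
        out = dict(decoded_streams)
        for stream, (left, right) in neighbor_map.items():
            out[stream] = [average_substitute(a, b)
                           for a, b in zip(decoded_streams[left],
                                           decoded_streams[right])]
        return out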
[0130] Streams that are used by a decoder for purposes of
interpolation must be available to the decoder. For example, when
external interpolation referencing is used to encode data, only
frame sequences (perspective views) that are allowed to be
referenced for encoding purposes may be used for interpolation
purposes. External interpolation referencing involves referencing
frames to be used for interpolation that can be found in frame
sequences outside of a current frame sequence (i.e., from a
different perspective view).
[0131] In some embodiments, streams are encoded using hierarchical
encoding techniques, as described in commonly-owned, co-pending
U.S. patent application Ser. No. ______ (Atty. Docket No.
A05.01330000), filed on even date herewith and entitled
"Hierarchical Video Compression Supporting Selective Delivery of
Two-Dimensional and Three-Dimensional Video Content," the entirety
of which is incorporated by reference herein. Such embodiments
enable a subset of a total number of streams to be received and
decoded at the decoder, wherein some of the streams received may be
decoded by referencing some of the other streams received. In
accordance with these embodiments, none of the received streams
rely on non-received streams for decoding. Accordingly, in these
embodiments, interpolation referencing is limited to received
streams.
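By way of illustration only, the constraint that interpolation
referencing be limited to received streams might be enforced with a
guard such as the following hypothetical sketch.

    def check_interpolation_references(referenced_streams, received_streams):
        """Reject interpolation instructions that reference streams the
        decoder never received and therefore cannot use."""
        missing = set(referenced_streams) - set(received_streams)
        if missing:
            raise ValueError("interpolation references non-received "
                             "streams: " + str(sorted(missing)))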
[0132] III. Exemplary Electronic Device Implementations
[0133] Embodiments may be implemented in hardware, software,
firmware, or any combination thereof. For example, encoding system
208, decoding system 214, display circuitry 216, input circuitry
302, processing circuitry 304, output circuitry 306, input
circuitry 1002, processing circuitry 1004, and/or output circuitry
1006 may be implemented as hardware logic/electrical circuitry. In
another example, encoding system 208, decoding system 214, display
circuitry 216, input circuitry 302, processing circuitry 304,
output circuitry 306, input circuitry 1002, processing circuitry
1004, and/or output circuitry 1006 may be implemented as computer
program code configured to be executed in one or more
processors.
[0134] For instance, FIG. 21 shows a block diagram of an exemplary
implementation of electronic device 2100 according to an
embodiment. In embodiments, electronic device 2100 may include one
or more of the elements shown in FIG. 21. As shown in the example
of FIG. 21, electronic device 2100 may include one or more
processors (also called central processing units, or CPUs), such as
a processor 2104. Processor 2104 is connected to a communication
infrastructure 2102, such as a communication bus. In some
embodiments, processor 2104 can simultaneously operate multiple
computing threads.
[0135] Electronic device 2100 also includes a primary or main
memory 2106, such as random access memory (RAM). Main memory 2106
has stored therein control logic 2128A (computer software), and
data.
[0136] Electronic device 2100 also includes one or more secondary
storage devices 2110. Secondary storage devices 2110 include, for
example, a hard disk drive 2112 and/or a removable storage device
or drive 2114, as well as other types of storage devices, such as
memory cards and memory sticks. For instance, electronic device
2100 may include an industry standard interface, such as a universal
serial bus (USB) interface for interfacing with devices such as a
memory stick. Removable storage drive 2114 represents a floppy disk
drive, a magnetic tape drive, a compact disk drive, an optical
storage device, tape backup, etc.
[0137] Removable storage drive 2114 interacts with a removable
storage unit 2116. Removable storage unit 2116 includes a computer
useable or readable storage medium 2124 having stored therein
computer software 2128B (control logic) and/or data. Removable
storage unit 2116 represents a floppy disk, magnetic tape, compact
disk, DVD, optical storage disk, or any other computer data storage
device. Removable storage drive 2114 reads from and/or writes to
removable storage unit 2116 in a well known manner.
[0138] Electronic device 2100 further includes a communication or
network interface 2118. Communication interface 2118 enables the
electronic device 2100 to communicate with remote devices. For
example, communication interface 2118 allows electronic device 2100
to communicate over communication networks or mediums 2122
(representing a form of a computer useable or readable medium),
such as LANs, WANs, the Internet, etc. Network interface 2118 may
interface with remote sites or networks via wired or wireless
connections.
[0139] Control logic 2128C may be transmitted to and from
electronic device 2100 via the communication medium 2122.
[0140] Any apparatus or manufacture comprising a computer useable
or readable medium having control logic (software) stored therein
is referred to herein as a computer program product or program
storage device. This includes, but is not limited to, electronic
device 2100, main memory 2106, secondary storage devices 2110, and
removable storage unit 2116. Such computer program products, having
control logic stored therein that, when executed by one or more
data processing devices, causes such data processing devices to
operate as described herein, represent embodiments of the
invention.
[0141] Devices in which embodiments may be implemented may include
storage, such as storage drives, memory devices, and further types
of computer-readable media. Examples of such computer-readable
storage media include a hard disk, a removable magnetic disk, a
removable optical disk, flash memory cards, digital video disks,
random access memories (RAMs), read only memories (ROM), and the
like. As used herein, the terms "computer program medium" and
"computer-readable medium" are used to generally refer to the hard
disk associated with a hard disk drive, a removable magnetic disk,
a removable optical disk (e.g., CDROMs, DVDs, etc.), zip disks,
tapes, magnetic storage devices, MEMS (micro-electromechanical
systems) storage, nanotechnology-based storage devices, as well as
other media such as flash memory cards, digital video discs, RAM
devices, ROM devices, and the like. Such computer-readable storage
media may store program modules that include computer program logic
for encoding system 208, decoding system 214, display circuitry
216, input circuitry 302, processing circuitry 304, output
circuitry 306, input circuitry 1002, processing circuitry 1004,
output circuitry 1006, flowchart 400, flowchart 500, flowchart 600,
flowchart 700, flowchart 800, flowchart 900, flowchart 1100,
flowchart 1200, flowchart 1300, flowchart 1400, flowchart 1500,
flowchart 1600 (including any one or more steps of flowcharts 400,
500, 600, 700, 800, 900, 1100, 1200, 1300, 1400, 1500, and 1600),
and/or further embodiments of the present invention described
herein. Embodiments of the invention are directed to computer
program products comprising such logic (e.g., in the form of
program code or software) stored on any computer useable medium.
Such program code, when executed in one or more processors, causes
a device to operate as described herein.
[0142] The invention can be put into practice using software,
firmware, and/or hardware implementations other than those
described herein. Any software, firmware, and hardware
implementations suitable for performing the functions described
herein can be used.
[0143] As described herein, electronic device 2100 may be
implemented in association with a variety of types of display
devices. For instance, electronic device 2100 may be one of a
variety of types of media devices, such as a stand-alone display
(e.g., a television display such as flat panel display, etc.), a
computer, a game console, a set top box, a digital video recorder
(DVR), other electronic device mentioned elsewhere herein, etc.
Media content that is delivered in two-dimensional or
three-dimensional form according to embodiments described herein
may be stored locally or received from remote locations. For
instance, such media content may be locally stored for playback
(replay TV, DVR), may be stored in removable memory (e.g., DVDs,
memory sticks, etc.), may be received on wireless and/or wired
pathways through a network such as a home network, through Internet
download streaming, through a cable network, a satellite network,
and/or a fiber network, etc. For instance, FIG. 21 shows a first
media content 2130A that is stored in hard disk drive 2112, a
second media content 2130B that is stored in storage medium 2124 of
removable storage unit 2116, and a third media content 2130C that
may be remotely stored and received over communication medium 2122
by communication interface 2118. Media content 2130 may be stored
and/or received in these manners and/or in other ways.
[0144] IV. Conclusion
[0145] While various embodiments have been described above, it
should be understood that they have been presented by way of
example only, and not limitation. It will be understood by those
skilled in the relevant art(s) that various changes in form and
details may be made to the embodiments described herein without
departing from the spirit and scope of the invention. Accordingly,
the breadth and scope of the present invention should not be
limited by any of the above-described exemplary embodiments, but
should be defined only in accordance with the following claims and
their equivalents.
* * * * *