U.S. patent application number 14/261730 was filed with the patent office on 2014-04-25 and published on 2014-12-25 for an interactive wide-angle video server.
This patent application is currently assigned to Grandeye Ltd. The applicant listed for this patent is Grandeye Ltd. Invention is credited to Bartu Ahiska, Yavuz Ahiska, Mark Davey.
Publication Number | 20140375761
Application Number | 14/261730
Family ID | 38089015
Filed Date | 2014-04-25
Publication Date | 2014-12-25
United States Patent Application | 20140375761
Kind Code | A1
Ahiska; Bartu; et al. | December 25, 2014
Interactive wide-angle video server
Abstract
An interactive video server which enables multiple clients to
independently and interactively extract views from one or more
wide-angle imagery sources is disclosed.
Inventors: | Ahiska; Bartu (Surrey, GB); Davey; Mark (Kent, GB); Ahiska; Yavuz (Surrey, GB)
Applicant: | Name | City | State | Country | Type
 | Grandeye Ltd. | London | | GB |
Assignee: | Grandeye Ltd., London, GB
Family ID: | 38089015
Appl. No.: | 14/261730
Filed: | April 25, 2014
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number
11287465 | Nov 23, 2005 | 8723951
14261730 | |
Current U.S. Class: | 348/36
Current CPC Class: | H04N 21/4402 20130101; H04N 21/6377 20130101; H04N 21/21805 20130101; H04N 21/24 20130101; H04N 7/17318 20130101; H04N 5/23238 20130101; H04N 5/2628 20130101; H04N 5/23206 20130101
Class at Publication: | 348/36
International Class: | H04N 5/232 20060101 H04N005/232
Claims
1-18. (canceled)
19. A method for remotely viewing wide-angle video comprising the
steps of: outputting from a first location one or more portions of
a wide-angle video image with positional information of the one or
more portions in relation to said wide-angle image; and receiving
at a second location said one or more portions and said positional
information, and generating one or more corresponding reduced
distortion views from the one or more portions using said
positional information.
20. A method as recited in claim 19 wherein said view has
associated coordinates, and first location outputs said
coordinates.
21. A method as recited in claim 19 wherein said view generation is
based on 3D graphics techniques, using texture-mapping for
stereographic projection or linear perspective projection or
cylindrical projection.
22. A method as recited in claim 19 wherein said view generation is
based on tabular distortion-correction.
23. A method as recited in claim 19 wherein said view generation is
based on 2D transform mapping using orthogonal transform
algorithms.
24-29. (canceled)
Description
BACKGROUND AND SUMMARY OF THE INVENTION
[0001] The present application relates to video transmission, and
more particularly to an interactive wide-angle video server.
DESCRIPTION OF BACKGROUND ART
[0002] Real-time video surveillance systems have become
increasingly popular in security monitoring applications. In
particular, the ability to monitor a wide-angle field of view (FOV)
is important because it provides broad situational awareness of an
environment. A camera can be used with a wide-angle optical system
such as a fisheye lens to capture wide-angle video, typically with
a field of view of approximately 180 degrees. The constant
improvements in the imaging technology used to capture the video
are responsible for an observed increase in output resolution, and
consequently the corresponding output data-rates.
[0003] The ability to remotely monitor wide-angle video
applications is becoming particularly important. The U.S. Pat. No.
6,603,502, entitled "System for Omnidirectional Image Viewing at a
Remote Location Without the Transmission of Control Signals to
Select Viewing Parameters," which is hereby incorporated by
reference, describes a system for achieving perspective corrected
views at a location removed from the creation site of a distorted
wide-angle image. A notable advantage of this system is that it
operates without the transmission of control signals from the
removed locations. The system transmits the wide-angle output from
a camera to multiple sites. This process will require significant
bandwidth due to the increasing resolution of wide-angle images.
Although the patent acknowledges the requirement for compressing
imagery when transmitting data over bandwidth-limited telephone
lines, any such compression unavoidably leads to a loss of quality.
This is called the "available transmission bandwidth problem".
[0004] U.S. Prov. Pat. App. No. 60/627,531, entitled "Interactive
Media Server," which is hereby incorporated by reference, describes
a web-based server servicing a fixed number of clients with
media-streams in response to received view-requests. The streams
consist of distortion-corrected views extracted from a wide-angle
video source by using image-processing circuitry. Through streaming
transformed views corresponding to requested portions of the
wide-angle video, the system offers a solution to the available
transmission bandwidth problem. The clients are lightweight web
clients, not requiring sophisticated graphics hardware. These
advantages are achieved by compromising the simplicity of the
server, which now requires powerful dedicated image processing and
client-handling hardware.
[0005] When a video camera is used with a conventional fisheye
lens, the image output by the camera is distorted. This distortion
is typically circular for a circular imaging system, but can be of
other shapes, depending on the lens system implemented. This
distortion needs to be alleviated in real-time to allow correct
viewing. Systems and methods for transforming a wide-angle image
from one perspective form to another have been implemented using
different techniques, and generally may be divided into three
separate categories:
[0006] (1) tabular distortion-correction systems and methods;
[0007] (2) three-dimensional (3D) projection systems and methods;
and
[0008] (3) two-dimensional (2D) transform mapping systems and
methods.
[0009] The first category includes U.S. patent application Ser. No.
10/837,012, entitled "Correction of Optical Distortion by Image
Processing," which is hereby incorporated by reference. The
distortion is corrected by reference to a stored table that
indicates the mapping between pixels of the distorted image and
pixels on the corrected image. The table is typically one of two
types: it may be a forward table in which the mapping from
distorted image to corrected image is held, or it may be a reverse
table holding the mapping from corrected image to distorted image.
On the other hand, U.S. patent application Ser. No. 10/186,915,
entitled "Real-Time Wide-Angle Image Correction System and Method
for Computer Image Viewing," which is hereby incorporated by
reference, generates warp tables from pixel coordinates of a
wide-angle image and applies the warp table to create a corrected
image. The corrections are performed using a parametric class of
warping functions that include Spatially Varying Uniform (SVU)
functions.
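As an illustration of the reverse-table approach described above, the
following minimal Python sketch builds a corrected image by looking up,
for every corrected pixel, the distorted-image pixel that feeds it. The
function name, the list-of-rows image layout and the fill value are
assumptions made for this example; they are not taken from the cited
applications.

```python
def apply_reverse_table(distorted, reverse_table, fill=0):
    """Return the corrected image described by a reverse table.

    distorted      -- list of rows of pixel values (the wide-angle image)
    reverse_table  -- list of rows; entry [y][x] is the (sx, sy) position
                      in the distorted image that feeds corrected pixel (x, y)
    fill           -- value used when the table points outside the image
    """
    height = len(distorted)
    width = len(distorted[0]) if height else 0
    corrected = []
    for table_row in reverse_table:
        out_row = []
        for sx, sy in table_row:
            if 0 <= sx < width and 0 <= sy < height:
                out_row.append(distorted[sy][sx])
            else:
                out_row.append(fill)
        corrected.append(out_row)
    return corrected
```

A reverse table guarantees that every corrected pixel receives a value,
whereas a forward table can leave holes in the corrected image that then
need to be filled.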
[0010] The second category of systems and methods use 3D computer
graphics techniques to alleviate the distortion. For example, U.S.
Pat. No. 6,243,099, entitled "Method for Interactive Viewing
Full-Surround Image Data and Apparatus Therefor," which is hereby
incorporated by reference, discloses a method of projecting a
full-surround image onto a surface. The full-surround image data is
texture-mapped onto a computer graphics representation of a surface
to model the visible world. A portion of this visible world is
projected onto a plane to achieve one of a variety of perspectives.
Stereographic projection is implemented by using a spherical
surface and one-to-one projecting each point on the sphere to
points on an infinite plane by rays from a point antipodal to the
sphere and the plane's intersection.
[0011] The third category includes U.S. Pat. No. Re 36,207,
entitled "Omniview Motionless Camera Orientation System," which is
hereby incorporated by reference, which discloses a system and
method of perspective correcting views from a hemispherical image
using 2D transform mapping. The correction is achieved by an
image-processor implementing an orthogonal set of transform
algorithms. The transformation is predictable and based on lens
characteristics.
[0012] These transformations alleviate the typical distortion and
perception problems in a wide-angle image. One or more views can be
generated and steered about the wide-angle video in real-time. A
new class of camera replaces the mechanical Pan-Tilt-Zoom (PTZ)
functions with a wide-angle optical system and image processing, as
discussed in U.S. patent application Ser. No. 10/837,019 entitled
"Method of Simultaneously Displaying Multiple Views for Video
Surveillance," which is hereby incorporated by reference. This
class of camera is further discussed in U.S. patent application
Ser. No. 10/837,325 entitled "Multiple View Processing in
Wide-Angle Video Camera," which is hereby incorporated by
reference. This type of camera monitors a wide field of view and
selects regions from it to transmit to a base station; in this way
it emulates the behaviour of a mechanical PTZ camera. The
wide-angle optics introduces distortion into the captured image,
and processing algorithms are used to correct the distortion and
produce a view with a projection similar to that of a mechanical
PTZ camera.
[0013] Interactive Wide-Angle Video Server
The present innovations include, in one class of embodiments, an
interactive wide-angle video server that receives requests and
information from clients, and sends to the clients distorted
portions of the wide-angle video which are preferably modified by
the client. In preferred embodiments, the server feeds, over time,
selected uncorrected portions of wide-angle video to clients based
on their requests. The available transmission bandwidth problem is
addressed, but at the expense of computation within the clients and
the transmission of request-signals from the clients to the server.
The server is preferably not used to produce distortion-corrected
views from a wide-angle video. The clients preferably have the task
of computing the views by transforming said requested portions of
the wide-angle image. In preferred embodiments, the present
innovations generate on-demand PTZ views at a remote client by
generating view-requests which are sent to a server.
BRIEF DESCRIPTION OF THE DRAWINGS
[0014] The disclosed inventions will be described with reference to
the accompanying drawings, which show important sample embodiments
of the invention and which are incorporated in the specification
hereof by reference, wherein:
[0015] FIG. 1 shows a flowchart implementing process steps
consistent with a preferred embodiment of the present
innovations.
[0016] FIG. 2 shows one example method consistent with a preferred
embodiment of the present innovations.
[0017] FIG. 3 shows another example method consistent with a
preferred embodiment of the present innovations.
[0018] FIG. 4 shows another example method consistent with a
preferred embodiment of the present innovations.
[0019] FIG. 5 shows another example method consistent with a
preferred embodiment of the present innovations.
[0020] FIG. 6 shows another example method consistent with a
preferred embodiment of the present innovations.
[0021] FIGS. 7A and 7B show another example method consistent with
a preferred embodiment of the present innovations.
[0022] FIG. 8 shows another example method consistent with a
preferred embodiment of the present innovations.
[0023] FIG. 9 shows another example method consistent with a
preferred embodiment of the present innovations.
[0024] FIG. 10 shows another example method consistent with a
preferred embodiment of the present innovations.
[0025] FIG. 11 shows another example method consistent with a
preferred embodiment of the present innovations.
[0026] FIG. 12 shows another example method consistent with a
preferred embodiment of the present innovations.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0027] The numerous innovative teachings of the present application
will be described with particular reference to the presently
preferred embodiment (by way of example, and not of
limitation).
[0028] One class of preferred embodiments includes a web-based
server connected to one or more wide-angle video sources (such as a
wide-angle video camera or a video storage device). The sources
may, for example, be indirectly connected to the server through the
web, or directly connected through a Local Area Network (LAN), or
connected by other means. Many typical connections between the
server and a client will not have sufficient bandwidth to feed a
full size wide-angle video sequence in the form received from a
source. To alleviate the bandwidth limitations, the server is
preferably capable of extracting a number (one or more) of portions
of an input wide-angle video and distributing them in real-time to
one or more clients based on view-requests issued by the clients to
the server.
[0029] FIG. 9 shows one example system consistent with implementing
a preferred embodiment of the present innovations, containing
multiple wide-angle video sources. In this preferred embodiment,
wide-angle sources 10, 11, 12 can be remotely connected to the
server 14 through a network connection 18 such as the web or by a
LAN connection. These sources 10, 11, 12 may be wide-angle video
cameras or other suppliers of wide-angle video, such as a storage
device 13. The LAN may be arranged in different topologies,
including star, bus, collapsed backbone or ring. The server is
preferably connected to one or more clients 15, 16, 17 through a
network connection 19. The wide-angle video cameras are preferably
implemented as cameras outfitted with wide-angle optical lenses,
such as a fisheye or purpose-built lens designed to enhance
peripheral vision, typically with a field of view in the region of
170 to 180 degrees. Other wide-angle input devices are of course
also consistent with the present innovations. In preferred
embodiments, the clients receive inputs, such as local
inputs from an operator or software (such as motion detection
software), that serve as PTZ controls determining what view the
client is requesting. Preferably, processing such as mapping pixel
coordinates to wide-angle image coordinates is done at the client.
The client preferably sends a view-request containing information
identifying the image source (such as one or more of sources 10,
11, 12) and the portions or segments of that source that are being
requested. The server 14 preferably extracts the requested segments
from the wide-angle video of the source, as determined by the
view-request, and feeds the segments to the client. In preferred
embodiments, the data is compressed before being transmitted to the
client, to save bandwidth. The client preferably transforms the
received data (such as segments or portions) on local software and
hardware, creating a view for display. The transforms can, for
example, include 3D projection.
[0030] The response-time of the server is a key performance factor
for interactivity. Clients serviceable by said server preferably
have associated hardware and are capable of receiving and
transforming a portion received from the server to generate a
transformed-view for display. By feeding these limited portions to
clients, the available bandwidth problem is alleviated. In a
preferred embodiment the demand for bandwidth is further reduced by
compressing the portions using known compression techniques, such
as MPEG4 or JPEG compression. Portions may be represented with a
finite number of segments. For example, FIG. 5 shows one example
way to segment and apportion image data. In this example, the image
data is shown in a Cartesian grid 500. A selected portion 502 is
made up of a selected number of one or more subdivisions, such as
segment 504. Of course, other implementations are possible within
the scope of the present innovations, including but not limited to
using different coordinate spaces, and greater numbers of
subdivisions (such as sub-segments, etc.) or no subdivisions of
portions at all.
[0031] FIG. 4 shows a close-up of portion 502 from FIG. 5. The
desired area in this example is shown to cover several segments,
which together give a view of a particular region as captured by the
camera or image source.
[0032] In preferred embodiments, a view-request is a request
specifying the generation of portions. It preferably includes
information identifying a wide-angle video source (source ID) and
identifying the required portions to be extracted from it. Clients
are preferably capable of issuing view-requests over time in
response to computation on local device input such as mouse,
keyboard or suitably adapted TV remote control input, or outputs
from software (such as motion detection software). The view-request
will typically be generated by processing PTZ commands from an
operator, or from software performing motion analysis such as
motion detection, moving region tracking or object tracking. In
preferred embodiments, locally input PTZ commands are capable of
steering the distortion-alleviated field-of-view (FOV) displayed by
the client (client view).
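One way to picture a view-request is as a small record carrying the
source ID, an identifier for the required segments, and an optional
resolution. The Python sketch below is illustrative only: the class
name, the field names and the use of JSON as a wire format are
assumptions made for this example and are not specified in the
application.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class ViewRequest:
    """Hypothetical view-request record sent from a client to the server."""
    source_id: str                                   # which wide-angle source
    segments: List[List[int]]                        # [col, row] grid positions
    segment_resolution: Optional[List[int]] = None   # [width, height] in pixels

    def to_json(self) -> str:
        # Serialise for transmission; JSON is an assumed encoding.
        return json.dumps(asdict(self))

    @staticmethod
    def from_json(text: str) -> "ViewRequest":
        # Rebuild the record on the receiving side.
        return ViewRequest(**json.loads(text))
```

For instance, ViewRequest("source-10", [[5, 3], [6, 3]], [160, 120])
would ask for two adjacent segments from a source with the hypothetical
ID "source-10".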
[0033] The client view is preferably defined by a pan, tilt, and
horizontal and vertical FOVs. As the client view is steered (e.g.,
by panning and tilting) different areas of the wide-angle scene
will be visible, possibly requiring a different portion from the
server. As the client view is zoomed-out, increasing the FOVs, a
different larger portion may be required for a similar reason.
Zooming-in, a function decreasing the FOVs, will result in a
smaller portion being required. If portions are represented as
segments, zooming-out will typically result in an increased total
number of required segments, while zooming-in will typically result
in fewer segments being required. As a client view will have a
fixed output resolution, increasing the FOVs results in a lower
number of pixels being allocated to each segment. The segments are
therefore not required at as high a resolution. Conversely,
decreasing the FOVs results in segments being required at a higher
resolution.
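The relationship between FOV and required segment resolution can be
made concrete with a short calculation. The following Python sketch
assumes, for illustration only, that the client view spreads its fixed
output width evenly over the horizontal FOV; the function name is
hypothetical.

```python
def pixels_per_degree(view_width_px, hfov_deg):
    """Average number of output pixels available per degree of horizontal FOV."""
    if hfov_deg <= 0:
        raise ValueError("hfov_deg must be positive")
    return view_width_px / hfov_deg
```

With a 640-pixel-wide client view, a 40 degree FOV gives 16 pixels per
degree, while zooming out to 80 degrees gives 8 pixels per degree, so
the segments covering the wider view can be requested at roughly half
the linear resolution.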
[0034] The clients may be capable of conveying characteristics to
the server (such as client characteristics). The client
characteristics can include, for example, required portion or
segment resolution, available bandwidth and capability of graphics
hardware, or other information. In a preferred embodiment, the
server is capable of receiving and/or determining or estimating
client characteristics and extracting portions with properties in
response thereto. In the preferred embodiment the server can
extract segments to correspond to a resolution required by a
client. In a further embodiment the server may alter the data-rate
of the fed segments to reflect the typically fixed bandwidth of the
communication path between the server and a client, which may be
overwhelmed in the instances where many segments are requested.
[0035] In a preferred embodiment, the server is capable of
receiving and processing multiple wide-angle videos from a number
of sources. The maximum number of sources the server can handle will
depend on its capabilities, such as the size of its processing
circuitry. In the preferred embodiment, each source is given a
source ID. The clients are capable of requesting portions extracted
from a particular video feed using the appropriate source ID. The
server preferably has access to, or holds, a frequently updated
database of its source connections, their associated source IDs and
a description of their physical geographic locations. Each client
is preferably capable of requesting a search function to locate the
source ID of a wide-angle video source most suited to its
requirement. The search keywords may be obtained from a client
operator through local device input. The returned source ID is then
used in subsequent view-requests. In theory each source may service
any number and combination of clients.
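The search function described above could, for example, match operator
keywords against the stored location descriptions. The Python sketch
below is a deliberately simple illustration; the dictionary layout, the
function name and the word-overlap scoring are assumptions, not a
method stated in the application.

```python
def find_source(sources, keywords):
    """Return the source ID whose description matches the most keywords.

    sources  -- dict mapping source ID to a free-text location description
    keywords -- iterable of search words supplied by the client operator
    Returns None when no description contains any keyword.
    """
    wanted = [k.lower() for k in keywords]
    best_id, best_score = None, 0
    for source_id, description in sources.items():
        words = description.lower().split()
        score = sum(1 for k in wanted if k in words)
        if score > best_score:
            best_id, best_score = source_id, score
    return best_id
```

The returned source ID would then be placed in subsequent
view-requests.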
[0036] As the number of clients increases, portion extraction will
place an increasing demand on the request-handling and processing
hardware. In a possible embodiment using segments, the server
extracts only the segments defined by each view-request and any
client characteristics. In contrast, the preferred embodiment
comprises a server capable of servicing an indefinite number of
view-requests by generating a finite number of segments. The number
of segments depends upon the number of wide-angle video sources and
the segmenting policy applied to each one. The policy may be
different for every source connected to the server. In the
preferred embodiment the wide-angle video image from a source is
segmented in a regular rectangular grid, wherein the grid is
defined by the number of segments in the horizontal and vertical
directions. For example, FIG. 3 shows one possible segmentation of
the wide-angle video.
[0037] In the preferred embodiment, part of a client's behavior can
be conceptualized in a "virtual camera" (VCAM). It represents the
extraction of a transformed view from a distorted wide-angle image.
The VCAM may be controlled with electronic PTZ control to emulate
the motion of a mechanical PTZ camera (the design of a mechanically
steerable camera can be found in U.S. Pat. No. 4,728,839, entitled
"Motorized Pan/Tilt Head for Remote Control," which is hereby
incorporated by reference).
[0038] FIG. 1 shows a flowchart implementing process steps
consistent with a preferred embodiment of the present innovations.
In this embodiment, the client is displaying video based on a
portion extracted from the wide-angle video output from a
particular server-connected source with known source ID. Portions
are represented as a plurality of segments. In this example flow,
much of the processing burden is on the client, though variations
in the distribution of burdens (and the specifics of the burdens
themselves) are still within the scope of the present
innovations.
[0039] First, the client receives information on the source
segmenting policy and imaging characteristics from the server (step
102). The client then receives PTZ control signals, for example,
from local device input or from software output (step 104). The
client steers to a view based on the PTZ controls (step 106). The
client then maps view pixel coordinates to wide-angle image
coordinates using intermediate world coordinates (step 108). The
client generates a segment-identifying binary bitmap for all
segments intersecting the chosen view (step 110). The client sends
a view-request containing the requested segment identifier, source
ID, and segment resolution to the server (step 112). The server
extracts the respective segments from the wide-angle video of the
source determined in the view request (step 114). The server
compresses the data and feeds segments to the client (step 116).
The client transforms the input segment data by 3D projection on
local graphics hardware and creates a view for display (step 118).
The process then returns to the point at which the client receives
further PTZ controls and proceeds.
[0040] Before the first view-request is made, the client preferably
requests information on the lens/imaging characteristics of the
source and the segmenting policy applied to it. In the preferred
embodiment the server responds with information defining the
regular rectangular grid used to divide the source video,
consisting of the number of segments in the horizontal and vertical
directions (See, for example, FIG. 3). It also responds with a
table and numbers describing the imaging characteristics (see
below, for example, description of LensTable, circleXCentre and
circleYCentre).
[0041] The client receives PTZ control signals from a local device
input, such as a joystick controller. The control signals can also
be obtained from output of a software program, for example. The
client view is defined by a pan, tilt and horizontal and vertical
fields-of-view (FOV), and can be steered by said PTZ control. The
defined client view has associated 2D screen coordinates.
[0042] A function for mapping between every pixel (p) in the client
view and an associated coordinate in the planar wide-angle image
coordinate system is preferably used. In the preferred embodiment,
this 2D-to-2D coordinate mapping is performed by introducing
intermediate spherical-polar "world-coordinates" (see, for example,
Mathworld: Coordinate Geometry, "Spherical Coordinates," Wolfram
Research at http://mathworld.wolfram.com/spherical
coordinates.html). 3D computer graphics techniques are used to
project any pixel p onto a triangulated partial sphere surface with
unity radius. For example, FIG. 10 shows example code consistent
with this objective.
[0043] In an example implementation, pixel p lies on a plane
(representing the client view) with size xSize, ySize. See, for
example, FIG. 2 which shows geometry of a camera setup, including
camera 202, a field of view, and the x- and y-extents of that field
of view, labeled as xSize 204 and ySize 206. It is noted that this
is only one possible geometric description consistent with an
embodiment of the present innovations. The plane 208 is tangential
to the partial sphere 210, with point of intersection c at the
centre of the plane (determined by theta and phi: the pan and tilt
angle of the virtual camera). The origin of the world coordinate
system (O) is located in the centre of the sphere. A camera 202
exists along a line connecting point c with O. The camera is at a
distance trans on the opposite side of O from c. If this distance is
unity (same as the radius of the sphere), the client is performing
stereographic projection; whereas a distance of zero entails linear
perspective projection. A ray generated by connecting the camera to
the point p will intersect the partial sphere at a point q, with
coordinates (qTheta, qPhi, 1). In this way, the qTheta and qPhi of
every associated point p may be calculated.
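The geometry of this paragraph can be sketched in Python as follows.
This is not the code of FIG. 10; it is an independent illustration
under stated assumptions: the world z-axis is taken as the lens axis,
tilt is measured from that axis, pan is the azimuth about it, and (u, v)
are the offsets of pixel p from the tangent point c within the plane
(so u ranges over plus or minus xSize/2 and v over plus or minus
ySize/2). The function name is hypothetical, and trans is expected to
lie between 0 and 1.

```python
import math


def pixel_to_sphere(u, v, pan, tilt, trans):
    """Project plane point p onto the unit sphere; return (qTheta, qPhi)."""
    sin_t, cos_t = math.sin(tilt), math.cos(tilt)
    sin_p, cos_p = math.sin(pan), math.cos(pan)
    # Tangent point c on the unit sphere and an orthonormal basis of the plane.
    c = (sin_t * cos_p, sin_t * sin_p, cos_t)
    right = (-sin_p, cos_p, 0.0)
    up = (cos_t * cos_p, cos_t * sin_p, -sin_t)
    # Pixel p on the tangent plane, camera on the far side of the origin O.
    p = tuple(c[i] + u * right[i] + v * up[i] for i in range(3))
    cam = tuple(-trans * c[i] for i in range(3))
    d = tuple(p[i] - cam[i] for i in range(3))
    # Solve |cam + t*d|^2 = 1 and keep the intersection on the plane side.
    a = sum(x * x for x in d)
    b = 2.0 * sum(cam[i] * d[i] for i in range(3))
    k = sum(x * x for x in cam) - 1.0
    t = (-b + math.sqrt(b * b - 4.0 * a * k)) / (2.0 * a)
    q = tuple(cam[i] + t * d[i] for i in range(3))
    # Spherical-polar coordinates of q (radius is 1 by construction).
    q_phi = math.acos(max(-1.0, min(1.0, q[2])))
    q_theta = math.atan2(q[1], q[0])
    return q_theta, q_phi
```

Setting trans to 1 places the camera on the sphere opposite c
(stereographic projection); setting it to 0 places the camera at O
(linear perspective projection). In both cases the centre pixel
(u = v = 0) maps back to the pan and tilt of the virtual camera.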
[0044] A function is preferably used to obtain the 2D coordinates
on the wide-angle fisheye image corresponding to q (and therefore
corresponding to p). This function, in some embodiments, depends on
the characteristics of the lens used to capture the wide-angle
image (see, for example, FIG. 11). In the preferred embodiment, a
fisheye lens is used with a linear relationship between the
captured FOV and the radial distance from the centre of the
corresponding fisheye circle. The characteristics of imaging
through the lens are stored in a table (LensTable). LensTable returns
a radius (rad in pixels) when given qPhi as input. The imaging
process may result in a fisheye circle that is not in the centre of
the fisheye image. For example, FIG. 12 shows an example
characterized by circleXCentre and circleYCentre. Using this
information, together with rad and qTheta, the 2D Cartesian
coordinates (x,y) of the point on the fisheye image corresponding
to point p are calculated.
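A minimal Python sketch of this lookup follows. It is not the code of
FIG. 11; it assumes a lens table stored as (angle, radius) pairs with
linear interpolation between entries, and an image whose x-axis
corresponds to qTheta = 0. The helper names are hypothetical; only
LensTable, circleXCentre, circleYCentre, rad, qTheta and qPhi come from
the description.

```python
import math


def make_linear_lens_table(circle_radius_px, half_fov_rad, entries=91):
    """Table for a lens whose image radius grows linearly with qPhi."""
    step = half_fov_rad / (entries - 1)
    return [(i * step, circle_radius_px * i / (entries - 1))
            for i in range(entries)]


def lens_lookup(table, q_phi):
    """Return rad (pixels) for angle q_phi, interpolating between entries."""
    if q_phi <= table[0][0]:
        return table[0][1]
    for (a0, r0), (a1, r1) in zip(table, table[1:]):
        if q_phi <= a1:
            w = (q_phi - a0) / (a1 - a0)
            return r0 + w * (r1 - r0)
    return table[-1][1]  # clamp beyond the edge of the fisheye circle


def sphere_to_fisheye(q_theta, q_phi, table, circle_x_centre, circle_y_centre):
    """Map spherical coordinates of q to Cartesian (x, y) on the fisheye image."""
    rad = lens_lookup(table, q_phi)
    x = circle_x_centre + rad * math.cos(q_theta)
    y = circle_y_centre + rad * math.sin(q_theta)
    return x, y
```

For a 180 degree lens with a 500-pixel fisheye circle centred at
(640, 480), a point at qPhi of 45 degrees and qTheta of zero lands
about 250 pixels to the right of the circle centre.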
[0045] The client can determine which segment intersects the point
(x,y) on the wide-angle image corresponding to p. In the preferred
embodiment consisting of regular rectangular segmentation,
identifying the segment of each pixel is straightforward (see,
for example, FIG. 4). The width and height of the planar wide-angle
image are divided by the number of segments in the horizontal and
vertical directions respectively, resulting in the width and height
of the identically shaped segments. The (x,y) coordinate is used to
calculate a grid position, measured from a corner of the image,
which in turn identifies a segment.
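The grid calculation just described can be sketched in a few lines of
Python. The function and parameter names are assumptions made for this
example, and the grid position is measured from the top-left corner of
the image.

```python
def segment_of_point(x, y, image_width, image_height, cols, rows):
    """Return the (col, row) grid position of the segment containing (x, y).

    Returns None when the point lies outside the wide-angle image.
    """
    if not (0 <= x < image_width and 0 <= y < image_height):
        return None
    seg_w = image_width / cols    # width of each identically shaped segment
    seg_h = image_height / rows   # height of each identically shaped segment
    col = min(int(x // seg_w), cols - 1)
    row = min(int(y // seg_h), rows - 1)
    return col, row
```

For a 1280 by 960 image divided into an 8 by 6 grid, each segment is
160 by 160 pixels, so the point (890, 480) falls in column 5, row 3.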
[0046] In preferred embodiments, this function is applied to every
point in the wide-angle image derived from all the p pixels in the
client view. Any segment which is identified by at least one point
is required to create a complete client view picture. A
view-request is prepared and transmitted containing the source ID
and an identifier for the required portion. In the preferred
embodiment, segments contributing to the required portion can be
identified by sending a 1 bit bitmap image, such as that shown in
the example of FIG. 5. The bitmap has the same number of horizontal
and vertical pixels as the number of segments in the horizontal and
vertical directions of the segmenting grid (which was shown in FIG.
3). Each bit preferably represents an associated segment in the
same grid position. The bit associated with any required segment is
set HIGH (1), with all other bits set LOW (0). In other
embodiments, a list of segment grid-position identifiers can be
sent as a vector list or run-length coded. The client may also be
capable of conveying client characteristics to the server. In the
preferred embodiment the client can request a resolution for all
requested segments.
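The 1 bit bitmap and the run-length alternative mentioned above can be
illustrated with the following Python sketch. It assumes the wide-angle
image points for the client view have already been computed; the
function names and the (bit, run length) pair encoding are assumptions
made for this example.

```python
def build_segment_bitmap(points, image_width, image_height, cols, rows):
    """Set bit [row][col] to 1 for every segment hit by at least one point."""
    bitmap = [[0] * cols for _ in range(rows)]
    for x, y in points:
        if 0 <= x < image_width and 0 <= y < image_height:
            col = min(int(x * cols / image_width), cols - 1)
            row = min(int(y * rows / image_height), rows - 1)
            bitmap[row][col] = 1
    return bitmap


def run_length_encode(bitmap):
    """Encode the bitmap, scanned row by row, as (bit, run length) pairs."""
    runs = []
    for row in bitmap:
        for bit in row:
            if runs and runs[-1][0] == bit:
                runs[-1][1] += 1
            else:
                runs.append([bit, 1])
    return [tuple(run) for run in runs]
```

Because a client view usually touches a compact cluster of segments,
the bitmap tends to contain long runs of zeros, which is why a
run-length code can be a compact alternative.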
[0047] In the preferred embodiment, the server is capable of
receiving the view-requests and extracts all of the requested
segments from all of the requested wide-angle video sources. In the
preferred embodiment, these segments are prepared at the requested
resolution. In another embodiment, the server extracts all possible
segments from all possible wide-angle video source(s) at a
frame-rate(s) it can handle. These extracted segments are extracted
at a fixed number of resolutions. As all the possible segments are
available, the server may serve an indefinite number of clients.
The server sends the requested segment(s) to the appropriate
client(s). In a preferred embodiment, segments are sent as a
compressed sequence.
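On the server side, extracting the requested segments amounts to
cropping the grid cells whose bits are set. The Python sketch below
illustrates this for a single frame held as a list of pixel rows; the
function name and data layout are assumptions, and resolution scaling
and compression are omitted.

```python
def extract_segments(frame, bitmap):
    """Crop every segment whose bit is set; return {(col, row): pixel rows}."""
    rows, cols = len(bitmap), len(bitmap[0])
    height, width = len(frame), len(frame[0])
    segments = {}
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            if bitmap[r][c]:
                x0, x1 = c * width // cols, (c + 1) * width // cols
                segments[(c, r)] = [line[x0:x1] for line in frame[y0:y1]]
    return segments
```

In the embodiment that pre-extracts all possible segments, the same
cropping would run once per frame with every bit set, and each
view-request would then simply select from the stored results.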
[0048] The client receives the segments and preferably applies a
transformation to generate a client view image (virtual camera
view). The transformation is based on any of a number of
techniques, for example, possibly one of:
(1) tabular distortion-correction methods; (2) 3D projection
methods; and (3) 2D transform mapping methods.
[0049] Other methods can also be implemented. The preferred
embodiment uses client 3D graphics hardware to implement the second
category of transformation. The sources provide video captured
using a fisheye lens in the preferred embodiment. The received
segment data is used to create a fisheye image consisting of
partial useful information (partial image). The partial image is
texture mapped onto a triangulated surface representing a partial
sphere with unity radius, such as that shown in the example of FIG.
6. The undefined areas of the texture, for which no segments were
received, are preferably filled with an arbitrary pixel color. The
mapping preferably uses a well-known 3D computer graphics technique
and can be implemented by storing information representing an
approximately circular grid of triangles to assign the texture.
Other means of mapping can also be implemented. In the preferred
embodiment the mapped texture is stereographically projected onto a
plane (image plane) representing the client view, a method
disclosed in U.S. Pat. No. 6,243,099, entitled "Method for
Interactive Viewing Full-Surround Image Data and Apparatus
Therefor," which is hereby incorporated by reference. The model is
described by a spherical polar coordinate system with an origin in
the centre of the partial sphere. The client view is defined by the
known pan, tilt and horizontal and vertical fields-of-view (FOV).
The image plane is tangential to the partial sphere and intersects
at a point defined by the pan and tilt, as shown in FIGS. 7A and
7B. The intersection point and the FOVs preferably define the
points that need to be projected to create the client view.
Projecting rays from the antipodal point of the plane/sphere
intersection results in a stereographic projection; projecting rays
from the centre of the partial sphere results in a linear
perspective projection. In a further embodiment, the projection
point can be moved between these two positions in response to zoom
PTZ commands. This represents a hybrid use of stereographic and
linear perspective projection (and states between), a method as
suggested in U.S. Prov. Pat. App. No. 60/681,109, entitled
"Stereographic Correction in a Camera," which is hereby
incorporated by reference.
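The relationship between the two projections in paragraph [0049] can be illustrated with a minimal sketch. It is not taken from the application or from the incorporated references. It assumes a local frame in which the unit sphere is centred on the origin, the image plane is z = 1, and the projection point lies at (0, 0, -d). The function name plane_to_sphere and the parameter d are hypothetical: d = 0 places the projection point at the centre of the sphere (linear perspective) and d = 1 places it at the antipodal point (stereographic).

```python
import math


def plane_to_sphere(u, v, d):
    """Map the image-plane point (u, v, 1) onto the unit sphere.

    Assumed local frame (illustrative only): sphere centred on the origin,
    image plane tangent at (0, 0, 1), projection point at (0, 0, -d).
    d = 0 -> centre of the sphere (linear perspective projection)
    d = 1 -> antipodal point (stereographic projection)
    """
    if not 0.0 <= d <= 1.0:
        raise ValueError("d must lie between 0 (centre) and 1 (antipode)")
    # Direction of the ray from the projection point through (u, v, 1).
    dx, dy, dz = u, v, 1.0 + d
    # Solve |P + t*D|^2 = 1 and keep the root on the viewing side.
    a = dx * dx + dy * dy + dz * dz
    b = -2.0 * d * dz
    c = d * d - 1.0
    t = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return (t * dx, t * dy, -d + t * dz)
```

Intermediate values of d give the states between the two projections. Rotating this local frame by the pan and tilt angles, and converting the sphere point to texture coordinates, are omitted from the sketch.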
[0050] Alternative transformation techniques can be used to
alleviate the distortion. A preferred embodiment may use a
transformation belonging to one of the other categories if a client
does not have 3D graphics hardware. The first category includes
U.S. patent application Ser. No. 10/837,012, entitled "Correction
of Optical Distortion by Image Processing," which is hereby
incorporated by reference. The distortion is corrected by reference
to a stored table that indicates the mapping between pixels in the
distorted wide-angle image and pixels in the corrected image. On
the other hand, U.S. patent application Ser. No. 10/186,915,
entitled "Real-Time Wide-Angle Image Correction System and Method
for Computer Image Viewing," which is hereby incorporated by
reference, generates warp tables from pixel coordinates of a
wide-angle image and applies the warp tables to create a corrected
image. A third category includes U.S. Pat. No. Re 36,207, entitled
"Omniview Motionless Camera Orientation System," which is hereby
incorporated by reference, which discloses a system and method of
perspective correcting views from a hemispherical image using 2D
transform mapping. The correction is achieved by an image-processor
implementing an orthogonal set of transform algorithms. The
transformation is predictable and based on lens characteristics.
These examples are only intended to be illustrative, and do not
limit the potential application of other methods of transformation
to the present innovations.
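The table-driven correction described in paragraph [0050] can be sketched as follows. This is a generic nearest-neighbour lookup, not the method of any of the cited applications. The names build_table and apply_table are hypothetical, and the mapping function passed to build_table stands in for a real lens model.

```python
def build_table(out_w, out_h, mapping):
    """Precompute, for every corrected pixel (x, y), its source pixel (sx, sy).

    `mapping` is a caller-supplied function (x, y) -> (sx, sy); a real system
    would derive it from the lens characteristics.
    """
    return [[mapping(x, y) for x in range(out_w)] for y in range(out_h)]


def apply_table(src, table, fill=0):
    """Create the corrected image by looking up each pixel in the table.

    `src` is a list of rows of pixel values. Table entries that point outside
    the source image are given the `fill` value.
    """
    src_h = len(src)
    src_w = len(src[0])
    out = []
    for table_row in table:
        out_row = []
        for sx, sy in table_row:
            if 0 <= sx < src_w and 0 <= sy < src_h:
                out_row.append(src[sy][sx])
            else:
                out_row.append(fill)
        out.append(out_row)
    return out
```

Because the table depends only on the lens and the view, it can be built once and reused for every frame until the view changes.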
[0051] In a preferred embodiment, each client has multiple virtual
cameras capable of viewing portions from one or more wide-angle
video sources. A client can issue view-requests for the one or more
sources using their unique source IDs. A portion (group of
segments) is received for each view-request, wherein each portion
is processed and distortion-reduced to generate a different view. A
composite video is generated from these multiple views, wherein
each view occupies a part of said composite video and the composite
can be output for display.
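One possible way to place several views in a composite frame is a near-square grid of equal tiles. The application does not specify a layout, so the sketch below is an assumption, and the function name tile_layout is hypothetical.

```python
import math


def tile_layout(n_views, width, height):
    """Return one (x, y, w, h) rectangle per view inside a composite frame.

    Views are placed row by row in a grid with ceil(sqrt(n_views)) columns.
    """
    if n_views < 1:
        raise ValueError("at least one view is required")
    cols = math.isqrt(n_views - 1) + 1   # ceil(sqrt(n_views))
    rows = -(-n_views // cols)           # ceil(n_views / cols)
    tile_w, tile_h = width // cols, height // rows
    return [((i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
            for i in range(n_views)]
```

Each distortion-reduced view would then be scaled into its rectangle before the composite is output for display.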
[0052] In another embodiment, the server is capable of storing the
latest view-request issued by a client. The server continues
sending the same portion(s) to said client until either a new
view-request is sent, or a specified time-out occurs (to ensure
that portions are not indefinitely sent to a client which has
disconnected since issuing a view-request).
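The behaviour in paragraph [0052] can be sketched as a small store keyed by client. The class name ViewRequestStore, its method names, and the injectable clock are assumptions made for illustration; the application does not describe a particular data structure.

```python
import time


class ViewRequestStore:
    """Remember the latest view-request per client, with a time-out."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock            # injectable so the logic can be tested
        self._latest = {}              # client_id -> (segment_ids, timestamp)

    def update(self, client_id, segment_ids):
        """Record a new view-request, replacing any earlier one."""
        self._latest[client_id] = (tuple(segment_ids), self._clock())

    def segments_to_send(self, client_id):
        """Return the segments to keep sending, or () once timed out."""
        entry = self._latest.get(client_id)
        if entry is None:
            return ()
        segment_ids, stamp = entry
        if self._clock() - stamp > self._timeout_s:
            # The client has been silent too long; stop sending to it.
            del self._latest[client_id]
            return ()
        return segment_ids
```

A server loop would call segments_to_send once per frame for each known client and call update whenever a view-request arrives.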
[0053] In other embodiments, the server is capable of dividing
wide-angle video images into non-regular and/or non-rectangular
segments. An image may be segmented using a "quadtree", as depicted
in FIG. 8, as mentioned in U.S. Pat. No. 6,526,176, entitled
"Efficient Processing of Quadtree Data," which is hereby
incorporated by reference. Possible embodiments may segment based
on concentric circular rings, or another mathematical segmentation
model.
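A quadtree segmentation of the kind mentioned in paragraph [0053] can be sketched as a recursive split of a rectangle into four quadrants. This is a generic illustration and not the scheme of U.S. Pat. No. 6,526,176. The function name quadtree_segments and the should_split predicate are hypothetical; the predicate stands in for whatever criterion a server would use, such as image detail.

```python
def quadtree_segments(x, y, w, h, should_split, max_depth):
    """Return the leaf rectangles (x, y, w, h) of a quadtree over a region.

    A region is split into four quadrants while `should_split` returns True,
    depth remains, and the region is at least 2 pixels in each dimension.
    """
    if max_depth == 0 or w < 2 or h < 2 or not should_split(x, y, w, h):
        return [(x, y, w, h)]
    hw, hh = w // 2, h // 2
    quadrants = (
        (x, y, hw, hh),                    # top-left
        (x + hw, y, w - hw, hh),           # top-right
        (x, y + hh, hw, h - hh),           # bottom-left
        (x + hw, y + hh, w - hw, h - hh),  # bottom-right
    )
    leaves = []
    for qx, qy, qw, qh in quadrants:
        leaves.extend(
            quadtree_segments(qx, qy, qw, qh, should_split, max_depth - 1))
    return leaves
```

The resulting leaves are non-regular in size but still tile the whole image, so every pixel belongs to exactly one segment.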
[0054] In a further embodiment, the server sends information with
the portions (or segments) describing the position in the
wide-angle image from which each portion (or segment) has been
extracted. This information can take the form of a tag associated
with each segment. The server may also be capable of sending client
view coordinates (e.g. pan, tilt, horizontal and vertical FOVs)
corresponding to viewing these portions. This functionality is
particularly useful in guiding a client view to a region of
interest, notably when the client first connects to a new video
source.
[0055] An additional embodiment implements a server capable of
distributing a copy of every segment from the requested source to
each requesting client, wherein the segments requested by a client
are fed to said client at a high resolution, and other segments are
fed at a lower resolution. This method and system give the
client full situational awareness while still retaining the
advantage of reduced bandwidth demand.
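The mixed-resolution distribution in paragraph [0055] can be sketched as a per-segment plan. The function names and the relative cost figures used in the example are hypothetical; the application does not give resolutions or data rates.

```python
def plan_resolutions(all_segments, requested):
    """Assign "high" to requested segments and "low" to all other segments."""
    requested = set(requested)
    return {seg: ("high" if seg in requested else "low")
            for seg in all_segments}


def relative_cost(plan, high_cost, low_cost):
    """Sum an assumed per-segment cost over the whole plan."""
    return sum(high_cost if level == "high" else low_cost
               for level in plan.values())
```

For example, with 16 segments of which 4 are requested, and an assumed cost of 4 units per high-resolution segment and 1 unit per low-resolution segment, the plan costs 28 units, against 64 units if every segment were sent at high resolution.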
[0056] In other embodiments, more sophisticated methods can be used
to identify the required segments, in which only part of the view
pixels p need be used. In an embodiment, only the points in the
perimeter of the client view and one point in the centre of the
view are used to generate associated segment identifiers. The
server is capable of determining any segments which lie within said
closed perimeter of segments. In an embodiment the server uses
"filling" to determine these unspecified segments, wherein the
segment associated with the point in the view centre is used as the
"seed", which is a method that will be familiar to those skilled in
the art. In a further embodiment the clients request additional
predicted segments based on extrapolating past PTZ commands. This
can assist in creating a more real-time experience.
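The "filling" step and the prediction step in paragraph [0056] can be sketched as follows. The sketch assumes that segments are identified by (column, row) positions on a regular grid, that filling is 4-connected, and that prediction is a simple linear extrapolation from the two most recent pan and tilt values. None of these choices is stated in the application, and the function names are hypothetical.

```python
from collections import deque


def fill_segments(perimeter, seed, cols, rows):
    """Return the perimeter segments plus every segment enclosed by them.

    Filling starts at `seed` (the segment under the view centre) and spreads
    to 4-connected neighbours until it meets the perimeter or the grid edge.
    """
    required = set(perimeter)
    queue = deque()
    if seed not in required:
        required.add(seed)
        queue.append(seed)
    while queue:
        c, r = queue.popleft()
        for n in ((c + 1, r), (c - 1, r), (c, r + 1), (c, r - 1)):
            if 0 <= n[0] < cols and 0 <= n[1] < rows and n not in required:
                required.add(n)
                queue.append(n)
    return required


def predict_next(previous, current):
    """Linearly extrapolate the next (pan, tilt) from the last two values."""
    return tuple(2 * c - p for p, c in zip(previous, current))
```

The grid bounds keep the fill finite even if the perimeter is not fully closed, although in that case the result would include segments outside the view.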
[0057] An additional embodiment implements a server capable of
transforming any input wide-angle video to generate a panoramic
video, possibly using a transformation engine with cylindrical
projection, wherein said panoramic video is distributed to the
clients alongside portions. Another embodiment generates the
panoramic video in transformation engines in the wide-angle
sources, and feeds them to the server together with the normal
wide-angle video. The server again has the capability to distribute
both segments and panoramic video sequence(s) to the multiple
clients in response to view-requests and possible client
characteristics. Other projection styles may be implemented in the
transformation, such as cylindrical projection.
Modifications and Variations
[0058] As will be recognized by those skilled in the art, the
innovative concepts described in the present application can be
modified and varied over a tremendous range of applications, and
accordingly the scope of patented subject matter is not limited by
any of the specific exemplary teachings given.
[0059] Additional general background, which helps to show
variations and implementations, may be found in the following
publications, all of which are hereby incorporated by reference:
[0060] U.S. Pat. No. 6,603,502, entitled "System for
Omnidirectional Image Viewing at a Remote Location Without the
Transmission of Control Signals to Select Viewing Parameters,"
which is hereby incorporated by reference.
[0061] U.S. Pat. No. Re 36,207, entitled "Omniview Motionless
Camera Orientation System," which is hereby incorporated by
reference.
[0062] U.S. Pat. No. 4,728,839, entitled "Motorized Pan/Tilt Head
for Remote Control," which is hereby incorporated by reference.
[0063] U.S. Pat. No. 6,243,099, entitled "Method for Interactive
Viewing Full-Surround Image Data and Apparatus Therefor," which is
hereby incorporated by reference.
[0064] U.S. Pat. No. 6,526,176, entitled "Efficient Processing of
Quadtree Data," which is hereby incorporated by reference.
U.S. patent application Ser. No. 10/837,012 (Attorney Docket No.
GRND-13), filed Apr. 30, 2004, entitled "Correction of Optical
Distortion by Image Processing," which is hereby incorporated by
reference.
[0065] U.S. patent application Ser. No. 10/837,325 (Attorney Docket
No. GRND14), filed Apr. 30, 2004, entitled "Multiple View
Processing in Wide-Angle Video Camera," which is hereby
incorporated by reference.
[0066] U.S. patent application Ser. No. 10/837,019 (Attorney Docket
No. GRND16), filed Apr. 30, 2004, entitled "Method of
Simultaneously Displaying Multiple Views for Video Surveillance,"
which is hereby incorporated by reference.
[0067] U.S. patent application Ser. No. 10/186,915, entitled
"Real-Time Wide-Angle Image Correction System and Method for
Computer Image Viewing," which is hereby incorporated by reference.
[0068] U.S. Provisional Patent Application Ser. No. 60/627,531
(Attorney Docket No. GRND-11P), filed Nov. 12, 2004, entitled
"Interactive Media Server," which is hereby incorporated by
reference.
[0069] U.S. Provisional Patent Application Ser. No. 60/681,109
(Attorney Docket No. GRND-21P), filed May 13, 2005, entitled
"Stereographic Correction in a Camera," which is hereby
incorporated by reference.
[0070] "Fundamentals of Digital Image Processing" by Anil Jain,
Prentice-Hall, NJ, 1988, which is hereby incorporated by reference;
Mathworld: Coordinate Geometry, "Spherical Coordinates",
[0071] Wolfram Research [Nov. 18, 2005]
[0072] http://mathworld.wolfram.com/SphericalCoordinates.html
[0073] None of the description in the present application should be
read as implying that any particular element, step, or function is
an essential element which must be included in the claim scope: THE
SCOPE OF PATENTED SUBJECT MATTER IS DEFINED ONLY BY THE ALLOWED
CLAIMS. Moreover, none of these claims are intended to invoke
paragraph six of 35 USC section 112 unless the exact words "means
for" are followed by a participle.
[0074] The claims as filed are intended to be as comprehensive as
possible, and NO subject matter is intentionally relinquished,
dedicated, or abandoned.
* * * * *
References