U.S. patent application number 14/203471 was filed with the patent office on 2014-03-10 for a method for shooting a performance using an unmanned aerial vehicle.
This patent application is currently assigned to THOMSON LICENSING. The applicant listed for this patent is THOMSON LICENSING. The invention is credited to Caroline BAILLARD, Francois LE CLERC and Nicolas MOLLET.
Application Number: 14/203471 (Publication No. 20140267777)
Document ID: /
Family ID: 48049921
Publication Date: 2014-09-18

United States Patent Application 20140267777
Kind Code: A1
LE CLERC; Francois; et al.
September 18, 2014
METHOD FOR SHOOTING A PERFORMANCE USING AN UNMANNED AERIAL
VEHICLE
Abstract
The present invention discloses a method for shooting a
performance making use of unmanned aerial vehicles, such as drones,
to provide the physical markers that are needed to give a
physical actor indications on the positioning of virtual elements
to be inserted later in the scene, and with which s/he needs to
interact.
Inventors: LE CLERC; Francois; (L'Hermitage, FR); BAILLARD; Caroline; (St. Sulpice La Foret, FR); MOLLET; Nicolas; (Meillac, FR)
Applicant: THOMSON LICENSING, Issy de Moulineaux, FR
Assignee: THOMSON LICENSING, Issy de Moulineaux, FR
Family ID: 48049921
Appl. No.: 14/203471
Filed: March 10, 2014
Current U.S. Class: 348/169
Current CPC Class: H04N 5/2224 (20130101); G03B 15/07 (20130101); G03B 15/006 (20130101); H04N 5/28 (20130101); G05D 1/101 (20130101)
Class at Publication: 348/169
International Class: H04N 5/28 (20060101) H04N005/28

Foreign Application Data
Date: Mar 12, 2013; Code: EP; Application Number: 13305269.6
Claims
1. Unmanned aerial vehicle, wherein it comprises a marker
indicating a position of an interaction intended to occur between
an actor and a virtual element moving along a trajectory.
2. Unmanned aerial vehicle according to claim 1, wherein the marker
is on an extremity of an object rigidly attached to the unmanned
aerial vehicle.
3. Unmanned aerial vehicle according to claim 2, which comprises
propellers, wherein the object is a stick rigidly attached to the
unmanned aerial vehicle in such a way that the extremity of the stick
can be accessed without danger of getting hurt by the propellers.
4. Unmanned aerial vehicle according to claim 1, wherein its
position and attitude with respect to a 3D coordinate system is
controllable using a navigation control method.
5. Unmanned aerial vehicle according to claim 4 wherein the
navigation control method estimates the 3D position of its center
of mass (r(t)) from the measurements provided by an optical motion
capture system through non-coplanar retro-reflective markers
attached to the unmanned aerial vehicle.
6. Method for shooting a performance in which at least one actor
interacts with a virtual element moving along a trajectory, wherein
the method makes use of an unmanned aerial vehicle navigation
control capability.
7. Method according to claim 6, wherein a position of an
interaction intended to occur between an actor and the moving
virtual element is materialized by a marker on the unmanned aerial
vehicle.
8. Method according to claim 6, wherein when several unmanned
aerial vehicles are used to shoot a scene, a minimal separation
distance between these unmanned aerial vehicles is maintained at
all times.
9. Apparatus comprising means to specify a 3D position of an
unmanned aerial vehicle according to a determined trajectory,
wherein said means are configured in order that a marker on the
unmanned aerial vehicle follows the trajectory at a predefined
speed, said trajectory being determined in order to allow
interactions intended to occur between an actor and a virtual
element moving along a trajectory.
10. Film shooting studio wherein it is equipped with at least one
unmanned aerial vehicle according to claim 1.
Description
1. FIELD OF INVENTION
[0001] This invention generally relates to a method for shooting a
performance in which at least one actor interacts with a virtual
element moving along a determined motion trajectory.
[0002] The invention relies on a specific unmanned aerial vehicle,
an apparatus and a film shooting studio.
2. TECHNICAL BACKGROUND
[0003] Computer-Generated Imagery is increasingly present in film
and TV production. Dedicated techniques are needed to ensure
seamless compositing and interaction between the virtual and real
elements of a scene.
[0004] In a typical scenario, the performance of real actors is
composited with a virtual background. This is, for instance, the
situation in a virtual TV studio, where the news presenter is
filmed against a green background, and the furniture and background
of the studio are inserted later as virtual elements. Chroma keying
is used to matte out the silhouette of the journalist for
compositing with the virtual elements in the scene.
[0005] It may also be that all the elements in the scene are
virtual, but the animated parts (humans, creatures) are obtained
from the performance of actors in a TV or film shooting studio.
[0006] A TV or film shooting studio is usually equipped with an
optical motion capture system which consists of a camera setup and
an acquisition system.
[0007] The camera setup consists of a set of calibrated cameras
placed around a capture volume. Typically, the actors wear
dedicated suits where physical markers are placed at the location
of the main body articulations. The actors play the role of the
film characters or virtual creatures inside the capture volume, as
defined by the scenario.
[0008] The optical motion capture system tracks the locations of
the physical markers in the images captured by the cameras. This
data is fed into animation and rendering software that generates
the appearance of virtual characters or creatures at each frame of
the target production.
[0009] In the simplest situations, there is no interaction at all
between the real and virtual elements in the scene, and the spatial
separation between these elements is easy to achieve. This is for
instance the case in a virtual TV news studio, where the only
virtual element is the background located behind the presenter.
[0010] Even in the absence of interaction between the real and
virtual elements, the compositing becomes more complex when real
elements are partially occluded by virtual elements placed in front
of them, as seen by the camera. Some form of real-time depth keying
is then required to ensure proper management of the occlusions, so
that, say, the leg of the presenter that should normally be masked
by a virtual table in front of him does not appear in the composited
image in front of the table.
[0011] Interactions between real and virtual elements are even more
difficult to manage. Imagine, for instance, that a news presenter is
asked to lay his hand on a virtual table. The table is not
physically present when the presenter is filmed making the hand
gesture in the green-screen environment. A marking on the floor of
the virtual studio may tell him where to stand in order to be
correctly positioned with respect to the table, but telling him/her
where exactly the hand should be placed in order to lie exactly on
the surface of the table after it has been inserted in the picture
would require a marker "floating in air". This is impractical.
[0012] Arguably, a misplacement of the presenter's hand in this
case could be fixed during the compositing phase by tweaking the
viewpoint of the virtual camera. However, this solution would not
be applicable to multiple interactions occurring with elements of a
rigid virtual layout, since the adjustments would need to be
different for each interaction.
[0013] The complexity of managing interactions between real and
virtual elements is maximal when they are both moving. An example
of such a situation would be, for instance, a film character
represented by a real actor attempting to step into a virtual
train, with the train already in motion. The actor filmed in the
green screen environment would need to simulate grasping a handle
in the door of a carriage while this door is translating and
accelerating. Adjusting the desired location of the actor's hand
would require following over time some marking of the predefined
trajectory of the carriage door handle in 3D space. No solution to
this problem, other than ad-hoc fixes in the compositing phase, has
been found in the prior art.
3. SUMMARY OF THE INVENTION
[0014] The present invention solves the aforementioned drawbacks by
using unmanned aerial vehicles, such as drones, to
provide the physical markers that are needed to give the real
actors indications on the positioning of virtual elements to be
inserted later in the scene, and with which they need to
interact.
[0015] More precisely, according to one of its aspects, the
invention concerns an unmanned aerial vehicle which is
characterized in that a part of said unmanned aerial vehicle
follows a determined motion trajectory of a contact location of a
virtual element in a scene that it materializes.
[0016] Said part of the unmanned aerial vehicle is then a physical
marker "floating in the air" that allows an interaction to occur
between an actor and a virtual element of a scene. Multiple
unmanned aerial vehicles may be used and each of them may be
controlled with different adjustments to reproduce interactions
between real and/or virtual elements even when these elements are
moving along different motion trajectories.
[0017] According to another aspect, the invention concerns a
method for shooting a performance in which at least one actor
interacts with a virtual element moving along a determined motion
trajectory. The method is characterized in that it makes use of an
unmanned aerial vehicle navigation control capability.
[0018] According to another aspect, the invention concerns an
apparatus comprising means to specify a 3D position of an unmanned
aerial vehicle according to a determined motion trajectory. The
apparatus is characterized in that said means are configured in
order that a part of the unmanned aerial vehicle follows the motion
trajectory at a predefined speed, said motion trajectory being
determined in order to allow interactions occurring between real
and/or virtual elements of a scene.
[0019] According to another aspect, the invention concerns a film
shooting studio which is characterized in that it is equipped with
at least one unmanned aerial vehicle as previously disclosed and an
apparatus as previously disclosed.
[0020] The specific nature of the invention as well as other
objects, advantages, features and uses of the invention will become
evident from the following description of a preferred embodiment
taken in conjunction with the accompanying drawings.
4. LIST OF FIGURES
[0021] The embodiments will be described with reference to the
following figures:
[0022] FIG. 1 shows schematically an example of a TV or film
shooting studio,
[0023] FIG. 2 shows schematically a diagram illustrating a possible
control scheme of the attitude and position of a drone, and
[0024] FIG. 3 shows an example of an internal architecture of an
apparatus configured to control the navigation of an unmanned
aerial vehicle.
5. DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT OF THE
INVENTION
[0025] FIG. 1 shows an example of a TV or film shooting studio. The
invention is not limited to this single example but may extend to
any indoor or outdoor environment which is adapted to capture the
optical motion of an object from images of physical markers.
[0026] A TV or film shooting studio is a room equipped with an
optical motion capture system which comprises a camera setup and an
acquisition system.
[0027] The camera setup comprises cameras, here four referenced C1
to C4, and light sources, here three referenced L1 to L3.
[0028] The TV or film shooting studio is surrounded, at least
partially, by walls which are painted in a uniform green or blue
colour, so that actors or props filmed in the studio can be easily
segmented out from the background of the studio using chroma
keying. The studio needs to be large enough to hold the camera
setup and make sure that the volume captured by this setup, called
the capture volume, allows sufficient room for the props and the
performance of the actors.
[0029] The cameras, here C1-C4, are positioned all around the
capture volume, usually in the center of the room, in such a way
that any point within this volume is seen by a minimum of 3
cameras, and preferably more. The cameras must be synchronized,
typically from an external genlock signal, and operate at
sufficiently high frame rates (to avoid motion blur) and with
sufficient resolution to accurately estimate the motion
trajectories of physical markers used for motion capture.
Furthermore, the cameras are calibrated, both with respect to their
intrinsic and extrinsic parameters, so that the location on a
camera image of the projection of any 3D point of the motion
capture volume in its viewing frustum, referenced in some 3D
coordinate system S.sub.MC, can be accurately predicted.
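For illustration only, the prediction referred to above can be sketched with the standard pinhole camera model; the intrinsic matrix K and the extrinsic parameters R and t below are placeholders standing in for the calibration data, and all names are purely illustrative, not part of the original disclosure.

```python
# Illustrative sketch: predicting the image location of a 3D point expressed
# in S_MC for a calibrated camera, using the pinhole model. K, R and t are
# assumed to come from the calibration step described above.
import numpy as np

def project_point(K, R, t, X_mc):
    """Project a 3D point X_mc (in S_MC) to pixel coordinates (u, v)."""
    X_cam = R @ X_mc + t              # world (S_MC) -> camera coordinates
    x, y, z = X_cam
    if z <= 0:
        return None                   # behind the camera, outside the frustum
    u, v, w = K @ np.array([x / z, y / z, 1.0])
    return np.array([u / w, v / w])

# Example with made-up calibration data:
K = np.array([[1200.0, 0.0, 960.0],
              [0.0, 1200.0, 540.0],
              [0.0, 0.0, 1.0]])       # intrinsic parameters
R = np.eye(3)                         # extrinsic rotation
t = np.array([0.0, 0.0, 5.0])         # extrinsic translation (metres)
print(project_point(K, R, t, np.array([0.2, -0.1, 1.0])))
```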
[0030] Lighting in the TV or film shooting studio relies on a set
of fixed light sources, here L1 to L3, that provide ideally
diffuse and uniform lighting within the capture volume.
[0031] The time-stamped video signals captured by the camera setup
are transferred from each of the cameras and recorded to a storage
device, typically hard disk drives, by the acquisition
system (not represented in FIG. 1). The acquisition system also
features a user interface and software for controlling the
operation of the cameras and visualizing their outputs.
[0032] Tracking the motion of an object equipped with physical
markers using such an optical motion capture system is well-known
from prior art, and follows the principles described for instance
by G. B. Guerra-Filho in "Optical Motion Capture: Theory and
Implementation", published in the Journal of Theoretical and
Applied Informatics in 2005.
[0033] The tracking method comprises detecting the locations of the
physical markers in the images of the cameras. This is
straightforward, as markers, owing to their high reflectivity,
appear as bright spots in the images. Next, spatial correspondences
between the detected marker locations across camera images are
established. A 3D point in the 3D coordinate system S.sub.MC having
generated a detected location in a camera image lies on a viewing
line going through this location in the camera image plane and the
camera projection centre. Spatial correspondences between detected
locations across camera views, corresponding to the projections in
the views of physical markers, can be determined by the fact that
the above-defined viewing lines for each considered camera
intersect at the location of the physical marker in 3D space. The
locations and orientations of the image plane and projection center
for each camera are known from the camera calibration data. Next,
the detected marker locations set in correspondence, and thus
corresponding to the projections of physical markers, are tracked
over time for each camera image. Temporal tracking typically relies
on non-rigid point set registration techniques, wherein a global
mapping is determined between the distributions of marker locations
in two consecutive images of the same camera. Next, the marker
tracks are labeled. This can be performed
manually, or alternatively the labels can be set automatically.
Automatic labeling can benefit from a known initial layout of
markers, for instance, in the case of body motion capture, the
"T-stance" where the person stands with legs apart and both arms
stretched away from the body. Next, the captured data is
post-processed, especially in order to fill holes caused by marker
occlusion. This can be automated to some extent using priors from
a model of the captured object (e.g., an articulated body model)
that constrains the locations of the missing markers when most of
the marker locations are known, but needs to be performed manually
if too many marker locations are missing.
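As a minimal sketch of the spatial-correspondence step described above (the intersection of the viewing lines), the 3D position of one physical marker can be recovered from its detected locations in several calibrated views by linear triangulation. The 3x4 projection matrices P_i assumed here would come from the camera calibration data; the code is an illustrative sketch only.

```python
# Illustrative linear triangulation (DLT) of one marker from its detected
# 2D locations in several calibrated views.
import numpy as np

def triangulate_marker(projections, detections):
    """projections: list of 3x4 camera matrices P_i (S_MC -> image).
    detections: list of (u, v) marker locations, one per camera.
    Returns the least-squares 3D marker position in S_MC."""
    rows = []
    for P, (u, v) in zip(projections, detections):
        rows.append(u * P[2] - P[0])   # u * (p3 . X) = p1 . X
        rows.append(v * P[2] - P[1])   # v * (p3 . X) = p2 . X
    A = np.stack(rows)
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]                # de-homogenize
```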
[0034] Optionally, specifically for body motion capture, an
articulated human body is fitted to the 3D locations of physical
markers at each frame, thus providing data for animating a virtual
character (possibly after retargeting if the anthropometric
proportions of the actor and the virtual character are
different).
[0035] At least four non-coplanar physical markers M detectable by
the optical motion capture system are located on an unmanned aerial
vehicle UAV schematically represented in FIG. 1, where the unmanned
aerial vehicle UAV is represented by the four ovals and the
markers M are represented by black filled disks.
[0036] The non-coplanar physical markers define a 3D coordinate
system S.sub.UAV for the unmanned aerial vehicle UAV, whose
relative translation and rotation with respect to the 3D coordinate
system S.sub.MC can be computed using straightforward 3D geometry,
the locations of the markers in S.sub.MC being determined by the
optical motion capture system.
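The "straightforward 3D geometry" mentioned above can, for instance, be carried out with the classical absolute-orientation (Kabsch) computation. The sketch below assumes the marker coordinates are available both in S.sub.UAV (by construction) and in S.sub.MC (from the motion capture system), in corresponding order; it is purely illustrative.

```python
# Illustrative sketch: estimating the rotation R and translation T that map
# marker coordinates known in S_UAV onto their positions measured in S_MC
# (absolute orientation / Kabsch algorithm). Inputs are Nx3 arrays in
# corresponding order.
import numpy as np

def estimate_pose(markers_uav, markers_mc):
    c_uav = markers_uav.mean(axis=0)
    c_mc = markers_mc.mean(axis=0)
    H = (markers_uav - c_uav).T @ (markers_mc - c_mc)   # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T               # rotation S_UAV -> S_MC
    T = c_mc - R @ c_uav             # translation S_UAV -> S_MC
    return R, T
```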
[0037] According to the invention, a part of the unmanned aerial
vehicle UAV follows a determined motion trajectory of a contact
location of a virtual element in a scene that it materializes.
[0038] Advantageously, a stick S is rigidly attached to the
unmanned aerial vehicle UAV, as represented on FIG. 1, in such a
way that its extremity can be accessed without danger of getting
hurt by the unmanned aerial vehicle propellers. The location of the
extremity of the stick S mounted on the unmanned aerial vehicle UAV
is fixed and known in the 3D coordinate system S.sub.UAV, and can
therefore easily be computed in the 3D coordinate system S.sub.MC.
The extremity of the stick S is then the part of the unmanned aerial
vehicle which follows the determined motion trajectory of a contact
location of a virtual element in a scene that it materializes.
[0039] Complex scenes may require several unmanned aerial vehicles
UAV, on each of which at least four physical markers are
located.
[0040] According to an embodiment of the invention, when several
unmanned aerial vehicles UAV are used to shoot a scene, a minimal
separation distance between these unmanned aerial vehicles UAV is
maintained at all times, to avoid aerodynamic interference.
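By way of illustration, this separation constraint can be enforced with a simple pairwise check on the target positions before they are sent to the unmanned aerial vehicles; the threshold d_min below is an arbitrary placeholder value.

```python
# Illustrative pairwise separation check between UAV target positions.
import itertools
import numpy as np

def separation_ok(target_positions, d_min=1.5):
    """target_positions: list of 3D target positions, one per UAV."""
    for p, q in itertools.combinations(target_positions, 2):
        if np.linalg.norm(np.asarray(p) - np.asarray(q)) < d_min:
            return False               # constraint violated
    return True
```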
[0041] According to another embodiment, the unmanned aerial vehicle
is a drone.
[0042] A drone is a lightweight unmanned aerial vehicle powered by
multiple rotors, typically 4 to 8, running on batteries. The drone
is equipped with onboard electronics including processing means, an
Inertial Measurement Unit and additional position and velocity
sensors for navigation, and with means for wireless communication
with a remote apparatus.
[0043] The navigation of a drone can be controlled by a so-called
navigation control method usually implemented on a remote station
over a dedicated Application Programming Interface (API) which may
provide access to low-level controls, such as the speeds of the
rotors, and/or to higher-level features such as a target drone
attitude, elevation speed or rotation speed around the vertical
axis passing through the drone center of mass.
[0044] The navigation control method can be developed on top of
this API in order to control the displacements of the drone in
real-time. The control can be performed manually from a user
interface, for instance relying on graphical pads on a mobile
device display. Alternatively, the navigation of the drone can be
constrained programmatically to follow a determined motion
trajectory. This motion trajectory defines a target 3D position of
the center of mass of the drone in some reference 3D coordinate
system at each time instant after a reference start time.
[0045] The navigation control method can benefit from the
positional estimates of the drone provided by an optical motion
capture system. Such a closed-loop feedback control of a drone
using an optical motion capture system is described, for example,
in the paper entitled "The GRASP Multiple Micro UAV
Testbed", by N. Michael et al., published in the September
2010 issue of the IEEE Robotics and Automation Magazine. In this
paper, the control of the drone relies on two nested
feedback loops, as shown on FIG. 2. The purpose of the loops is to
ensure that the actual attitude and position values of the drone,
as computed from the IMU and positional sensors measurements, match
the target values determined by a target trajectory. Typically,
this is obtained by continuously adjusting the control loop
parameters in order to minimize the error between the measured and
target values, as in well-known PID controllers (see the Wikipedia
page on PID controllers,
http://en.wikipedia.org/wiki/PID_controller).
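For reference, a textbook PID controller of the kind mentioned above can be sketched as follows; this is an illustrative, discrete-time sketch, and the gains kp, ki and kd would have to be tuned for the actual drone.

```python
# Illustrative discrete-time PID controller: the output is a weighted sum of
# the error, its integral and its derivative.
class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```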
[0046] In more detail, with reference to FIG. 2, the Position
Control module takes as input, at each time instant t, the target
3D position of the drone center of mass r.sub.T(t) and its
estimated position r(t) in the coordinate system of the motion
capture volume S.sub.MC. According to the invention, the accurate
estimates of r(t) provided by the motion capture system, owing to
the non-coplanar retro-reflective markers attached to the drone,
can advantageously be fed into the navigation control method, in
order to improve the stability and accuracy of the motion
trajectory following.
[0047] More precisely, a control loop within the position control
module generates, as a function of the positional error
r.sub.T(t)-r(t), the desired values of the attitude angles
.phi..sub.des(t), .theta..sub.des(t) and .psi..sub.des(t) of the
roll, pitch and yaw angles respectively, that stabilize the
attitude of the drone and ensure the desired linear displacement
that compensates for the positional error. The Attitude Control
module is a second, inner, control loop that generates the
increments of the moments .DELTA..omega..sub..phi.,
.DELTA..omega..sub..theta., .DELTA..omega..sub..psi., to be
produced by the drone rotors along the roll, pitch and yaw axes
respectively, in order to obtain the desired attitude values. In
addition, the position control module feeds the motor dynamics
module with an extra moment .DELTA..omega..sub.F that results in a
net force along the vertical axis at the center of gravity of the
drone, allowing the control of its altitude. The Motor Dynamics
module translates .DELTA..omega..sub..phi.,
.DELTA..omega..sub..theta., .DELTA..omega..sub..psi. and
.DELTA..omega..sub.F into set point values for the rotor speeds,
that are transmitted to the drone via its communication means, so
that the rotor speeds are updated over the API. Using a model of
the drone motors, the Motor Dynamics module translates the updates
of the rotor speeds into net forces T.sub.i applied to the drone
along the vertical axes at the location of each rotor, as well as
into angular moments M.sub.i along these same axes. From these
forces and angular moments, a model of the drone dynamics makes it
possible to compute, in the Rigid Body Dynamics module, the linear acceleration
of the drone {umlaut over (r)} and its angular accelerations {dot
over (p)}(t), {dot over (q)}(t) and {dot over (r)}(t) in its body
frame. These accelerations are fed back to the Position Control and
Attitude Control modules, respectively, to provide the inputs to
the control loops implemented in these two modules.
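The structure of the two nested loops can be illustrated with the following sketch, which is a simplified stand-in for the controller described above and in the cited paper: the outer loop turns the positional error r.sub.T(t)-r(t) into desired attitude angles and an altitude-control term, and the inner loop turns the attitude error into moment increments. The small-angle mapping, the loop rates and the gains are assumptions made only for this example; it reuses the PID sketch given earlier.

```python
# Illustrative nested position/attitude control, reusing the PID class above.
import numpy as np

def position_control(r_target, r_meas, pid_xyz):
    """Outer loop: positional error -> desired attitude angles + altitude term."""
    err = np.asarray(r_target) - np.asarray(r_meas)
    ax, ay, az = (pid.update(e, dt=0.01) for pid, e in zip(pid_xyz, err))
    phi_des = -ay            # simplified small-angle mapping (illustrative only)
    theta_des = ax
    psi_des = 0.0
    d_omega_F = az           # extra moment controlling the altitude
    return (phi_des, theta_des, psi_des), d_omega_F

def attitude_control(att_des, att_meas, pid_rpy):
    """Inner loop: attitude error -> moment increments along roll, pitch, yaw."""
    return tuple(pid.update(d - m, dt=0.002)
                 for pid, d, m in zip(pid_rpy, att_des, att_meas))
```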
[0048] Note that the Position Control and Attitude Control loops
use measurements, not represented on FIG. 2, from the Inertial
Measurement Unit and the positional sensors mounted on the drone,
in order to estimate the drone position and attitude at their
inputs.
[0049] The invention also concerns a method for shooting a
performance in which at least one actor interacts with a virtual
element moving along a determined motion trajectory. The method
comprises two phases, both making use of an unmanned aerial vehicle
UAV navigation control capability.
[0050] In a first initialization phase, prior to the start of the
shooting, a part of the unmanned aerial vehicle UAV, such as the
extremity of the stick S, is moved to the initial position of a
determined motion trajectory of a contact location of a virtual
element in the scene that it materializes. Upon a trigger signal
synchronized with the action taking place during the shooting, for
instance provided by a member of the on-set staff, the part of the
unmanned aerial vehicle UAV is moved along said determined motion
trajectory, either manually from a control interface, or
programmatically.
[0051] In a second phase, triggered by a signal synchronized with
the captured performance, which may be provided for instance by a
member of the on-set staff, the unmanned aerial vehicle UAV is
displaced so that its part which materializes the contact location
of the virtual element follows said determined motion
trajectory.
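The two phases can be summarized by the following sketch; the `uav` object with its goto() command, the `trajectory` object and the trigger callback are hypothetical placeholders for the navigation control capability and the on-set trigger signal described above.

```python
# Illustrative two-phase shooting loop: initialization, then trajectory following.
import time

def shoot(uav, trajectory, wait_for_trigger):
    uav.goto(trajectory.position_at(0.0))   # initialization phase: initial position
    wait_for_trigger()                       # trigger signal from the on-set staff
    t0 = time.time()
    while time.time() - t0 < trajectory.duration:
        uav.goto(trajectory.position_at(time.time() - t0))
        time.sleep(0.02)                     # illustrative update rate
```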
[0052] For the purpose of a motion capture session involving an
interaction of at least one of the actors in the studio with at
least one element of a virtual scene, a 3D model of the virtual
scene is assumed known and registered with the 3D coordinate system
S.sub.MC. The motion trajectories of all moving virtual elements
within the 3D virtual scene model are predefined from the scenario
of the performance to be captured. These motion trajectories are
represented by a temporal sequence of 3D locations in the 3D
coordinate system S.sub.MC, defined with reference to a predefined
start time t.sub.ref, typically set to the starting time of the
performance to be captured. The sampling frequency of this sequence
is chosen, for example, so as to be compatible with the rate at
which the target 3D position of the drone center of mass r.sub.T(t)
can be estimated.
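One possible, purely illustrative realization of such a temporal sequence of 3D locations, including the interpolation needed to obtain the target position r.sub.T(t) at an arbitrary time offset from t.sub.ref, is sketched below; it could also back the hypothetical `trajectory` object used in the previous sketch.

```python
# Illustrative trajectory: a temporal sequence of 3D locations in S_MC, sampled
# at a fixed frequency relative to t_ref, with linear interpolation.
import numpy as np

class Trajectory:
    def __init__(self, samples, rate_hz):
        self.samples = np.asarray(samples)   # shape (N, 3), expressed in S_MC
        self.rate = rate_hz

    @property
    def duration(self):
        return (len(self.samples) - 1) / self.rate

    def position_at(self, t):
        """Target 3D position at time offset t seconds after t_ref."""
        i = min(max(t, 0.0) * self.rate, len(self.samples) - 1)
        lo, hi = int(np.floor(i)), int(np.ceil(i))
        frac = i - lo
        return (1 - frac) * self.samples[lo] + frac * self.samples[hi]
```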
[0053] According to the invention, the location of contact on each
of the moving virtual elements of the performance where an actor
should interact with the element, for instance by
placing a hand on this location, is materialized by a part of an
unmanned aerial vehicle UAV such as, according to an embodiment,
the extremity of a stick S. As the 3D coordinate system S.sub.UAV
is registered with respect to the 3D coordinate system S.sub.MC,
the coordinates of this location of contact on the unmanned aerial
vehicle UAV can be expressed in the 3D coordinate system S.sub.MC
via a straightforward change of coordinate system, and therefore
matched at any time against the target location of the virtual
element, also expressed in the 3D coordinate system S.sub.MC.
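This change of coordinate system can be sketched as follows, reusing the rotation R and translation T from S.sub.UAV to S.sub.MC estimated by the motion capture system (for instance with the pose-estimation sketch given earlier); stick_tip_uav and target_mc are illustrative names only.

```python
# Illustrative check of the contact location against the virtual element's target.
import numpy as np

def contact_error(stick_tip_uav, R, T, target_mc):
    """Distance, in S_MC, between the marker materializing the contact location
    and the contact point of the virtual element."""
    stick_tip_mc = R @ np.asarray(stick_tip_uav) + T   # S_UAV -> S_MC
    return np.linalg.norm(stick_tip_mc - np.asarray(target_mc))
```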
[0054] FIG. 3 shows an apparatus 300 that can be used in a Film or
TV studio to control an unmanned aerial vehicle. The apparatus
comprises the following components, interconnected by a digital
data- and address bus 30: [0055] a processing unit 33 (or CPU for
Central Processing Unit); [0056] a memory 35; [0057] a network
interface 34, for interconnection of apparatus 300 to other devices
connected in a network via connection 31.
[0058] Processing unit 33 can be implemented as a microprocessor, a
custom chip, a dedicated (micro-) controller, and so on. Memory 35
can be implemented in any form of volatile and/or non-volatile
memory, such as a RAM (Random Access Memory), hard disk drive,
non-volatile random-access memory, EPROM (Erasable Programmable
ROM), and so on.
[0059] The processing unit 33, the memory 35 and the network
interface 34 are configured to control the navigation of an
unmanned aerial vehicle such as a drone, i.e. they are configured
to specify a target position of the unmanned aerial vehicle at each
time instant, corresponding to a determined motion trajectory in
the 3D coordinate system S.sub.UAV. It is then possible to control
the unmanned aerial vehicle (a drone for example) in such a way
that a part of it follows a motion trajectory in the 3D coordinate
system S.sub.MC at a predefined speed, said motion trajectory being
determined in order to allow interactions to occur between real
and/or virtual elements of a scene. This form of control makes it
possible to combine the navigation of the unmanned aerial vehicle UAV with
other features, for instance, related to the remote operation of a
camera mounted on the unmanned aerial vehicle UAV.
[0060] According to a variant, the apparatus comprises a Graphical
User Interface 32 which is configured to allow a user to specify
the target position of the unmanned aerial vehicle UAV at each time
instant. The unmanned aerial vehicle UAV trajectory control is then
operated from the Graphical User Interface 32 that can take the
form for example of a joystick or a tactile interface, e.g., on a
tablet.
[0061] In FIGS. 2 and 3, the modules are functional units, which
may or may not correspond to distinguishable physical units. For
example, these modules or some of them may be brought together in a
unique component or circuit, or contribute to the functionalities of
a piece of software. Conversely, some modules may potentially be
composed of separate physical entities. The apparatuses which are
compatible with the invention are implemented using either pure
hardware, for example dedicated hardware such as an ASIC
("Application Specific Integrated Circuit"), an FPGA
("Field-Programmable Gate Array") or VLSI ("Very Large Scale
Integration"), or several integrated electronic components embedded
in a device, or a blend of hardware and software components.
[0062] While not explicitly described, the present embodiments and
variants may be employed in any combination or sub-combination.
* * * * *