U.S. patent application number 17/493710, for methods and systems for
generating an animation control rig including posing of non-rigid areas,
was filed with the patent office on 2021-10-04 and published on 2022-01-27.
This patent application is currently assigned to Weta Digital Limited. The
applicant listed for this patent is Weta Digital Limited. Invention is
credited to Andrew R. Phillips, Thomas Stevenson, and Edward Sun.

Application Number: 17/493710
Publication Number: 20220028152
Filed: 2021-10-04
Published: 2022-01-27

United States Patent Application 20220028152
Kind Code: A1
Stevenson; Thomas; et al.
January 27, 2022
METHODS AND SYSTEMS FOR GENERATING AN ANIMATION CONTROL RIG
INCLUDING POSING OF NON-RIGID AREAS
Abstract
In an embodiment, an animator is provided with an indication
when a model's component such as a joint or limb is being moved or
twisted in a way that would be unnatural and cause unusual stress
on the model component. For example, as a shoulder joint is
stressed by moving an arm into an extreme position, a yellow bar or
coloring of the shoulder, arm or other component can grow
increasingly bright and shift to red just before a breaking point
is reached. An animator can choose to go past the breaking point
and the breaking can be modeled and incorporated into the
animation.
Inventors: Stevenson; Thomas; (Wellington, NZ); Phillips; Andrew R.;
(Wellington, NZ); Sun; Edward; (Wellington, NZ)

Applicant: Weta Digital Limited, Wellington, NZ

Assignee: Weta Digital Limited, Wellington, NZ

Appl. No.: 17/493710

Filed: October 4, 2021
Related U.S. Patent Documents

Application Number | Filing Date  | Patent Number
17120065           | Dec 11, 2020 | 11170553
17493710           |              |
63056420           | Jul 24, 2020 |
International Class: G06T 13/40 (20060101); G06T 13/80 (20060101);
G06T 19/20 (20060101)
Claims
1. A computer-implemented method for generating an animation
control rig configured to manipulate a non-rigid portion of a
digital character, the method comprising: associating a control
shape with the non-rigid portion of the digital character;
accepting a signal from a user input control to move the control
shape to specify a new pose for the non-rigid portion of the
digital character; traversing a hierarchical node graph
representing a plurality of animation control points associated
with the new pose; identifying a plurality of nodes that are
implicated in moving the digital character to the new pose; and
using the implicated nodes to display the new pose for the
non-rigid portion of the digital character.
2. The method of claim 1, wherein the digital character includes a
skeleton, wherein the control shape is adapted to handle an area of
motion.
3. The method of claim 1, wherein the non-rigid portion includes a
belly.
4. The method of claim 1, wherein the non-rigid portion includes a
tail.
5. The method of claim 1, wherein the non-rigid portion includes
lips.
6. The method of claim 5, wherein the non-rigid portion includes a
face.
7. An apparatus including a processor configured to perform the
actions recited in claim 1.
8. One or more non-transitory processor-readable media including
instructions executable by one or more processors to perform the
actions recited in claim 1.
Description
CROSS REFERENCES TO RELATED APPLICATIONS
[0001] This application is a continuation of the following
application, U.S. patent application Ser. No. 17/120,065, entitled
METHODS AND SYSTEMS FOR GENERATING AN ANIMATION CONTROL RIG, filed
on Dec. 11, 2020, which claims the benefit of U.S. Provisional
Patent Application Ser. No. 63/056,420, entitled METHODS AND
SYSTEMS FOR GENERATING AN ANIMATION CONTROL RIG, filed on Jul. 24,
2020, which is hereby incorporated by reference as if set forth in
full in this application for all purposes.
FIELD
[0002] The present disclosure generally relates to methods and
systems for generating an animation using an indication of a stress
level being exerted on a component of an animation model.
BACKGROUND
[0003] Visual content generation systems are used to generate
imagery in the form of still images and/or video sequences of
images. The still images and/or video sequences of images include
live action scenes obtained from a live action capture system,
computer generated scenes obtained from an animation creation
system, or a combination thereof.
[0004] An animation artist is provided with tools that allow them
to specify what is to go into that imagery. Where the imagery
includes computer generated scenes, the animation artist may use
various tools to specify the positions in a scene space such as a
three-dimensional coordinate system of objects. Some objects are
articulated, having multiple limbs and joints that are movable with
respect to each other.
[0005] The animation artist may retrieve a representation of an
articulated object and generate an animation sequence movement of
the articulated object, or part thereof. Animation sequence data
representing an animation sequence may be stored in data storage,
such as animation sequence storage described below.
[0006] Animation sequence data might be in the form of time series
of data for control points of an articulated object having
attributes that are controllable. Generating animation sequence
data has the potential to be a complicated task when a scene calls
for animation of an articulated object.
SUMMARY
[0007] In accordance with an aspect, a computer-implemented method
for creating an animation includes: displaying an animation model
including a component; displaying a control point corresponding to
at least one component of the animation model; accepting a signal
from a user input device to move the control point; determining a
stress amount in response to the moved control point; indicating to
a user the stress amount; and creating an animation using a
movement derived from the moved control point.
[0008] The term "comprising" as used in this specification means
"consisting at least in part of". When interpreting each statement
in this specification that includes the term "comprising", features
other than that or those prefaced by the term may also be present.
Related terms such as "comprise" and "comprises" are to be
interpreted in the same manner.
[0009] In an embodiment, the method further includes: determining a
stress level of a component of the digital character resulting from
movement toward the new pose; and displaying an indicator to
visually indicate the stress level.
[0010] In an embodiment, the visual indicator includes a color.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] Various embodiments in accordance with the present
disclosure will be described with reference to the drawings, in
which:
[0012] FIG. 1 shows an example of a control rig configured to
enable an artist to create animation sequence data.
[0013] FIG. 2 shows an example of an animation skeleton obtained
from an animation system that is matched to the skeleton of the
control rig of FIG. 1.
[0014] FIG. 3 shows examples of animation control points associated
with the control rig of FIG. 1.
[0015] FIG. 4 shows an example of a hierarchical node graph
suitable for implementing the control rig of FIG. 1.
[0016] FIG. 5 shows an example method for generating the animation
control rig of FIG. 1.
[0017] FIG. 6 is a block diagram illustrating an example computer
system upon which computer systems of the systems illustrated
herein may be implemented.
[0018] FIG. 7 illustrates an example visual content generation
system as might be used to generate imagery in the form of still
images and/or video sequences of images.
DETAILED DESCRIPTION
[0019] In the following description, various embodiments will be
described. For purposes of explanation, specific configurations and
details are set forth in order to provide a thorough understanding
of the embodiments. However, it will also be apparent to one
skilled in the art that the embodiments may be practiced without
the specific details. Furthermore, well-known features may be
omitted or simplified in order not to obscure the embodiment being
described.
[0020] Described below are methods and systems for generating an
animation control rig configured to manipulate a skeleton of a
digital character (an "animation skeleton").
[0021] FIG. 1 shows an example of a control rig 100, or animated
skeleton. Control rig 100 is configured to enable an artist to
create animation sequence data. Animation sequence data is
typically in the form of time series of data for control points of
an object that has attributes that are controllable. In some
examples the object includes a humanoid character with limbs and
joints that are movable in manners similar to typical human
movements.
[0022] Here, control rig 100 represents a humanoid character, but
may be configured to represent a plurality of different characters.
In an embodiment control rig 100 includes a hierarchical set of
interconnected bones, connected by joints forming a kinematic
chain.
[0023] For example, control rig 100 includes thigh 102, knee 104,
lower leg 106, ankle 108, and foot 110, connected by joints 112,
114 to form a leg appendage 118. Control rig 100 may be employed to
individually move individual bones and joints using forward
kinematics to pose a character. Moving thigh 102 causes a movement
of lower leg 106, as lower leg 106 is connected to thigh 102 via
knee 104. Thigh 102 and lower leg 106, for example, are in a
parent-child relationship. Movement of lower leg 106 is a product
of movement of thigh 102 as well as movement of lower leg 106
itself. Control rig 100 may also use inverse kinematics, in which
an artist moves ankle 108 for example. If an artist moves ankle 108
upwards, knee 104 consequently bends and moves upwards to
accommodate a pose in which ankle 108 is at a user specified
location.
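The parent-child behavior described above can be illustrated with a
minimal forward-kinematics sketch. This is not the patent's
implementation; the 2D simplification and all names are assumptions
made purely for illustration:

```python
import math

class Bone:
    """A bone in a kinematic chain; `angle` is relative to the parent bone."""
    def __init__(self, name, length, angle=0.0, parent=None):
        self.name, self.length, self.angle, self.parent = name, length, angle, parent

    def world_angle(self):
        # Forward kinematics: a child's world angle accumulates its parents' angles.
        return self.angle + (self.parent.world_angle() if self.parent else 0.0)

    def end_position(self):
        # The tip of this bone, found by walking up the chain to the root.
        x, y = self.parent.end_position() if self.parent else (0.0, 0.0)
        a = self.world_angle()
        return (x + self.length * math.cos(a), y + self.length * math.sin(a))

# Thigh -> lower leg, mirroring the parent-child relationship in the text.
thigh = Bone("thigh", length=0.45, angle=math.radians(-80))
lower_leg = Bone("lower_leg", length=0.42, angle=math.radians(20), parent=thigh)

# Rotating the thigh moves the lower leg too, because the child's world
# transform is derived from its parent's.
thigh.angle = math.radians(-60)
print(lower_leg.end_position())
```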
[0024] Control rig 100 may be formed using a plurality of data
points. Control rig 100 may be matched to a skeleton obtained from
an animation system, or from, for example, motion capture markers
or other means on real-life actors. A live action scene of a human
actor is captured by live action capture system 702 (see FIG. 7)
while wearing mo-cap fiducials, for example high-contrast markers
outside actor clothing. The movement of those fiducials is
determined by live action processing system 722. Animation driver
generator 744 may convert that movement data into specifications of
how joints of an articulated character are to move over time.
[0025] FIG. 2 shows an example of a skeleton 200 obtained from an
animation system such as visual content generation system 700.
[0026] The motions of control rig 100 are able to correspond to the
motions of motion captured skeleton 200 when matched. Control rig
100 may also be controlled freely by an animator to produce motions
beyond the motions of a real-life skeleton, such as the real-life
skeleton of a human. Control rig 100 may represent a character of a
different size to the skeleton of a real-life actor.
[0027] As shown in FIG. 3, control rig 100 includes a plurality of
animation control points, or control points. Examples of control
points are indicated at 120, 122 and 124 respectively. For example,
in an embodiment control rig 100 includes control point 120 at the
ankle that allows an animator to control the motion of a leg of
control rig 100. In another example, control point 122 is
positioned at lower leg 106 of rig 100 and/or control point 124 is
positioned at thigh 102. Different parts of the control rig 100
have associated to them respective control points.
[0028] In an embodiment, an artist may create an animation sequence
by selecting a control point on control rig 100. Control rig 100
may be displayed, for example, on display 612 (see FIG. 6). The
artist selects a control point using input device 614 and/or cursor
616. The control points may be displayed as extending from a
character represented by control rig 100. Displaying the control
points in this manner enables the artist to select a control point
easily.
[0029] In an embodiment, operating animation control(s) causes an
animated object to be generated. The animated object may be
displayed, for example, on display 612 as an image sequence.
[0030] The artist may, for example, select control point 122 for
the lower leg or control point 124 for the upper leg of control rig
100. The artist selects a position and/or location of the control
point that is different to the current position and/or location of
the control point. This process is known as key-framing. The artist
moves controls to new positions at given times, thereby creating
key poses in an animation sequence. Interpolation is performed
between key poses.
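As a rough sketch of that workflow, interpolation between key poses
might look like the following. Linear interpolation and the data
layout are assumptions for illustration; production rigs typically
interpolate with splines:

```python
def interpolate_pose(keyframes, t):
    """Linearly interpolate a control-point value between key poses.

    keyframes: sorted list of (frame, value) pairs set by the artist.
    """
    if t <= keyframes[0][0]:
        return keyframes[0][1]
    for (t0, v0), (t1, v1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            u = (t - t0) / (t1 - t0)   # normalized position between the two keys
            return v0 + u * (v1 - v0)
    return keyframes[-1][1]

# The artist keys the lower-leg control at frames 0, 12, and 24:
keys = [(0, 0.0), (12, 45.0), (24, 10.0)]
print(interpolate_pose(keys, 6))   # 22.5 degrees, halfway to the first key pose
```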
[0031] In different embodiments, different types of controls can be
made available to the artist. Typically, graphical user interface
(GUI) controls called "handles" are presented to an artist for
manipulation. A handle might correspond to movement of a part of a
finger, for example, or to a foot, an arm bone, or any other component
or set of components in an animation skeleton or other underlying rig.
Usually, the handles are simplified subsets of the component
definitions and control points and will not all line up exactly
with skeleton components. In other embodiments, there can be a
one-to-one correspondence of one or more handles to control points.
Or the control points themselves may be presented to an artist or
animator for manipulation.
[0032] In an embodiment, there can be one-off handles or other
controls (e.g., "control shapes") that can allow custom movement of
animation models for complex or non-rigid structures such as for a
fat belly, a tail, lips, etc. Often animation of caricatures will
not closely follow a skeleton and so handles are adapted to handle
areas of motion, or other organized systems of movement. Animating
face models may not require an underlying skeleton in order to
animate the model, or digital character.
[0033] In an embodiment, an animator is provided with a visual
indication when a joint or limb is being moved or twisted in a way
that would be unnatural and cause unusual stress on the model
component, were the digital character being modeled to exist in the
real world. So, for example, as a shoulder joint is stressed by
moving an arm into an extreme position, a yellow bar or coloring of a
model component can grow increasingly bright and shift to red just
before a breaking point or failure point is reached. An animator
can choose to go past the breaking point and the breaking can be
modeled and incorporated into the animation.
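One way such an indicator could be realized is a mapping from a
normalized stress amount to an indicator color. The thresholds and the
yellow-to-red ramp below are illustrative assumptions, not behavior
specified by the patent:

```python
def stress_color(stress, breaking_point=1.0):
    """Map a normalized stress amount to an RGB indicator color.

    Below roughly half of the breaking point the indicator is a yellow
    that grows increasingly bright; approaching the breaking point it
    shifts toward red. Thresholds here are hypothetical.
    """
    s = max(0.0, min(stress / breaking_point, 1.0))
    if s < 0.5:
        k = s / 0.5                # dim-to-bright yellow as stress grows
        return (k, k, 0.0)
    k = (s - 0.5) / 0.5            # yellow -> red just before breaking
    return (1.0, 1.0 - k, 0.0)

print(stress_color(0.3))   # dim yellow
print(stress_color(0.95))  # nearly red: the joint is close to breaking
```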
[0034] In an embodiment, control points may be used to control more
than one bone, joint, etc. For example, a control point may be used
to control the upper arm and lower arm at the same time.
[0035] In an embodiment, at least one inverse kinematics operation
is performed in order to generate the animation sequence specified
by the artist. For example, the artist may wish to specify that
ankle 108 is to move from a location within control rig 100 shown
in FIG. 1 to a location within control rig shown in FIG. 3. The
artist manipulates control point 120 to specify a desired change in
ankle location.
[0036] A series of calculations is performed to determine what
changes in location and/or orientation of parts of control rig 100
are required to result in an orientation of control rig shown in
FIG. 3. For example, the new location of control point 120 selected
by the artist may require a change in location and/or orientation
of at least thigh 102, knee 104, lower leg 106, ankle 108 and foot
110. The changes in location and/or orientation that are required
to achieve a goal of the artist are then determined.
[0037] FIG. 4 shows an example of a hierarchical node graph 400
suitable for implementing control rig 100 of FIG. 1. Node graph 400
includes a plurality of nodes, examples of which are shown at 402,
404 and 406 respectively. At least some of the nodes are associated
with at least one input and at least one output.
[0038] In an embodiment one or more nodes of the hierarchical node
graph 400 represent respective animation control points of control
rig 100. Outputs from individual nodes include the solved positions
of each joint angle and bone position in the kinematic chain. In
inverse kinematics, the new joint angles and positions are
determined relative to the control input. Inputs to the individual
nodes include the new position of a member that is then used to
calculate the position of the other members of the skeleton and the
associated joint angles. For example, moving the hand from a first
position resting on the ground to a new position above the ground
will be used to determine the position of the forearm, upper arm,
elbow, and shoulder.
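A minimal sketch of such a hierarchical node graph, with each node's
output feeding its children as described for FIG. 4, might look like
the following. The node names and the toy solver functions are
hypothetical stand-ins:

```python
class RigNode:
    """A node representing one animation control point in the hierarchy."""
    def __init__(self, name, solve=None):
        self.name = name
        self.solve = solve        # optional per-node solver function
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child

def traverse(node, value):
    """Feed each node's output into its children: the output of node 402
    becomes the input to node 404, whose output feeds node 406."""
    out = node.solve(value) if node.solve else value
    print(f"{node.name}: in={value} out={out}")
    for child in node.children:
        traverse(child, out)

# Hypothetical three-node chain mirroring nodes 402 -> 404 -> 406:
n402 = RigNode("node402", solve=lambda v: v + 1)
n404 = n402.add(RigNode("node404", solve=lambda v: v * 2))
n404.add(RigNode("node406", solve=lambda v: v - 3))
traverse(n402, 10)
```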
[0039] In an embodiment, at least one node in hierarchical node
graph 400 is inversely solvable through an analytical approach. For
example, node 404 has associated to it an inverse kinematics
function. The artist selects a position and/or location of control
point 120 that is different to the current position and/or location
of the control point. Node 402 in hierarchical node graph 400 is
identified as corresponding to control point 120. The inverse
kinematics function associated to node 402 is applied to control
point 120. The output of node 402 becomes an input to node 404. The
output of node 404 then becomes the input of node 406.
[0040] The result is that the positions and angles associated with the
nodes farther from the control point are adjusted so that the position
and/or location of control point 120 corresponds to the position
and/or location selected by the artist. In an example, a node that
is inversely solvable may involve a problem that has three points
of a limb that are analytically solvable using a single solution,
such as a trigonometric solution.
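A single trigonometric solution for three points of a limb is
conventionally a two-bone analytic inverse-kinematics solve based on
the law of cosines. The 2D sketch below, with assumed sign
conventions, shows one such solution; it is illustrative rather than
the patent's implementation:

```python
import math

def two_bone_ik(l1, l2, tx, ty):
    """Analytic IK for a three-point limb (e.g. hip-knee-ankle) in 2D.

    Returns (root_angle, bend_angle): the first segment's world angle
    and the second segment's angle relative to the first, derived from
    the law of cosines in a single trigonometric solution.
    """
    d = math.hypot(tx, ty)
    # Clamp to the reachable annulus so acos stays defined.
    d = max(min(d, l1 + l2 - 1e-9), abs(l1 - l2) + 1e-9)
    cos_gamma = (l1**2 + l2**2 - d**2) / (2 * l1 * l2)  # interior joint angle
    bend = math.pi - math.acos(max(-1.0, min(1.0, cos_gamma)))
    cos_inner = (l1**2 + d**2 - l2**2) / (2 * l1 * d)
    root = math.atan2(ty, tx) - math.acos(max(-1.0, min(1.0, cos_inner)))
    return root, bend

root, knee = two_bone_ik(0.45, 0.42, 0.5, -0.4)
print(math.degrees(root), math.degrees(knee))  # one valid knee-bend solution
```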
[0041] In an embodiment at least one node in hierarchical node
graph 400 is not inversely solvable through an analytical approach.
For those nodes that are not inversely solvable, there is no
associated inverse kinematics function. An inverse kinematics
function cannot be applied to a control point so that the position
and/or location of the control point corresponds to a position
and/or location selected by the artist.
[0042] In an embodiment control rig 100 and hierarchical node graph
400 are subject to at least one constraint. For example, there may
be a finite number of positions and/or orientations that an artist
may select for knee 104. Some positions may violate physical
constraints that apply to a physical skeleton associated to control
rig 100. These physical constraints include biomechanical
limitations of movement that apply to a physical skeleton and that
are modeled by control rig 100.
[0043] Furthermore, some positions may involve applying
transformations to at least one node in the hierarchical node graph
400 for which there is no associated inverse kinematics function.
Such transformations violate a computational constraint. For
example, there may be cases where a node does not have an inverse
kinematics function.
[0044] In an embodiment, a series of forward kinematics operations
are applied to at least one node in hierarchical node graph 400 for
which there is no associated inverse kinematics function. The
resulting location is compared to the desired location selected by
the artist. Any difference in location is compared to a tolerance.
If the difference is within the tolerance then the transformation
is completed. On the other hand, if the difference in locations is
outside the tolerance then a further forward kinematics operation
is performed on the node with the intention of achieving a
difference between the calculated location and the selected
location that is within the tolerance.
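The repeated forward-kinematics adjustment described above resembles
iterative techniques such as cyclic coordinate descent (CCD). The
sketch below is an assumption, since the patent does not name a
specific algorithm; it rotates each joint in turn and stops once the
end effector is within tolerance of the selected location:

```python
import math

def ccd_solve(lengths, angles, target, tol=1e-3, max_iters=50):
    """Iteratively approximate a pose when no analytic inverse exists.

    After each pass the end-effector position is recomputed with forward
    kinematics and compared to the desired location; solving stops once
    the difference is within the tolerance.
    """
    def joint_positions():
        pts, x, y, a = [(0.0, 0.0)], 0.0, 0.0, 0.0
        for length, ang in zip(lengths, angles):
            a += ang
            x += length * math.cos(a)
            y += length * math.sin(a)
            pts.append((x, y))
        return pts

    for _ in range(max_iters):
        for i in reversed(range(len(angles))):
            pts = joint_positions()
            jx, jy = pts[i]
            ex, ey = pts[-1]
            # Rotate joint i so the end effector swings toward the target.
            cur = math.atan2(ey - jy, ex - jx)
            want = math.atan2(target[1] - jy, target[0] - jx)
            angles[i] += want - cur
        ex, ey = joint_positions()[-1]
        if math.hypot(target[0] - ex, target[1] - ey) <= tol:
            break                      # difference is within the tolerance
    return angles

angles = ccd_solve([0.45, 0.42], [0.3, 0.3], target=(0.5, -0.4))
print([math.degrees(a) for a in angles])
```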
[0045] In an embodiment, an iterative solve is required where there
are overlapping solves associated with a change in a control node.
It is possible that some nodes may have more than one associated
inverse kinematics function related to multiple parent nodes. Most
of the solves are based on trigonometry, so there may be more than
one way to solve the equation.
[0046] Trigonometric solutions are generally directly solvable. A
problem may arise where there are solves that require more than one
solve at a time. For example, a solve may be straightforward to
determine for a hand position, elbow position, and shoulder
position if the hand was fixed relative to the forearm. In other
words, a problem that involves three points of a limb is
analytically solvable using a single trigonometric solution.
[0047] However, the artist may wish to, for example, bend a wrist
back and then move the lower arm. There is a choice of
trigonometric solutions. A first solution may involve the bent
position of the hand, forearm, elbow, and upper arm. A second
solution may involve the bent position of the hand and just the end
of the forearm. In other words, when more than three points of a
limb are involved in a problem, multiple trigonometric solutions
involving the same points of a limb may exist. Approximation
involving iterative solves may be required in such cases.
[0048] A control node selected by the artist may influence multiple
parent nodes. The relationship between the control node and each of
the multiple parent nodes has an associated inverse kinematics
function. Therefore, a changed position and/or orientation of the
control node would require multiple inverse kinematics functions to
be solved iteratively to determine the positions and/or
orientations of the multiple parent nodes.
[0049] In an example, when a series of movements of control rig 100
is required to match with a series of movements from a motion
captured skeleton, the control points of control rig 100 are
controlled by the corresponding positions of the motion captured
skeleton. For example, an ankle of the motion captured skeleton
drives a corresponding ankle of the control rig. The inverse
kinematics operations associated with the ankle node of the control
rig are then used to derive the positions and/or orientations of at
least thigh 102, knee 104 and lower leg 106.
[0050] FIG. 5 shows an example method 500 for generating animation
control rig 100 of FIG. 1. Method 500 uses inverse solves at an
appendage level where possible to determine where controls should
be placed and oriented. In an embodiment, control rig 100 is
configured to manipulate a skeleton of an animated character.
[0051] A plurality of animation control points is associated 502 to
an animated skeleton. One example of an animated skeleton is motion
captured skeleton 200 (see FIG. 2). Examples of control points
include control points 120, 122, and 124 (see FIG. 3). In an
embodiment at least one of the control points is associated to at
least two members of skeleton 200. For example, a single control
point may be associated to at least two bones of skeleton 200.
[0052] Method 500 includes traversing a node graph representing the
plurality of animation control points of the animated skeleton. One
example of a node graph is hierarchical node graph 400 (see FIG. 4)
in which nodes in node graph 400 are associated with respective
control points such as 120, 122 and 124. It will be appreciated
that method 500 can be applied to either directed or undirected
graphs, and can be applied to weighted or unweighted graphs.
[0053] A first node in node graph 400 is selected 504 for analysis
to determine if the first node is inversely solvable. In an
embodiment, selecting a node in node graph 400 has the effect of
selecting an animation control point represented by the selected
node.
[0054] If 506 the selected node is inversely solvable then an
inverse solve is performed 508 to determine where the animation
control point represented by the selected node should be placed and
oriented. An operation of the selected node is modified by updating
510 the active skeleton.
[0055] In an embodiment, if the selected node is not inversely
solvable then an approximation 509 of an inverse solve is obtained.
Method 500 attempts to perform a solve as close as possible to the
desired position. For example, a desired limb position may not be
achievable within the limitations of an animated skeleton. In such
cases, a limb position can be determined, within the limitations of
the animated skeleton, that is as close as possible to the desired
limb position. This approximation may be obtained, for example,
using an iterative solve. An operation of the selected node is
modified by updating 510 the active skeleton with the
approximation.
[0056] Once the animated skeleton has been updated 510, a stopping
condition is checked 512.
[0057] In an embodiment, method 500 is performed at an appendage
level. Stopping condition 512 is satisfied when all appendages have
been checked. Alternatively, a stopping condition is satisfied when
a predetermined number of appendages have been checked.
Alternatively, a stopping condition is satisfied when a sufficient
number of appendages within a predetermined region of the animated
skeleton have been checked.
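Putting steps 502 through 512 together, a highly simplified rendering
of method 500 might look like the following; every class and helper
name here is a hypothetical placeholder:

```python
class ControlNode:
    """Hypothetical stand-in for a node in hierarchical node graph 400."""
    def __init__(self, appendage, solvable):
        self.appendage, self.solvable = appendage, solvable

    def inverse_solve(self):
        return f"exact pose for {self.appendage}"        # step 508

    def approximate_inverse_solve(self):
        return f"approximate pose for {self.appendage}"  # step 509

def generate_rig(nodes, skeleton):
    """Sketch of method 500 at the appendage level."""
    checked = set()
    for node in nodes:                                   # steps 502/504: traverse
        pose = (node.inverse_solve() if node.solvable
                else node.approximate_inverse_solve())
        skeleton[node.appendage] = pose                  # step 510: update skeleton
        checked.add(node.appendage)
        if checked == {n.appendage for n in nodes}:      # stopping condition 512
            break
    return skeleton

print(generate_rig([ControlNode("leg", True), ControlNode("tail", False)], {}))
```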
[0058] According to one embodiment, the techniques described herein
are implemented by one or more generalized computing systems programmed
to perform the techniques pursuant to program instructions in
firmware, memory, other storage, or a combination. Special-purpose
computing devices may be used, such as desktop computer systems,
portable computer systems, handheld devices, networking devices or
any other device that incorporates hard-wired and/or program logic
to implement the techniques.
[0059] For example, FIG. 6 is a block diagram that illustrates
computer system 600 upon which visual content generation system 700
(see FIG. 7) may be implemented. Computer system 600 includes bus
602 or other communication mechanism for communicating information,
and processor 604 coupled with bus 602 for processing information.
Processor 604 may be, for example, a general purpose
microprocessor.
[0060] The computer system 600 also includes main memory 606, such
as a random access memory (RAM) or other dynamic storage device,
coupled to bus 602 for storing information and instructions to be
executed by the processor 604. The main memory 606 may also be used
for storing temporary variables or other intermediate information
during execution of instructions to be executed by the processor
604. Such instructions, when stored in non-transitory storage media
accessible to the processor 604, render the computer system 600
into a special-purpose machine that is customized to perform the
operations specified in the instructions.
[0061] The computer system 600 further includes a read only memory
(ROM) 608 or other static storage device coupled to the bus 602 for
storing static information and instructions for the processor 604.
A storage device 610, such as a magnetic disk or optical disk, is
provided and coupled to the bus 602 for storing information and
instructions.
[0062] The computer system 600 may be coupled via the bus 602 to a
display 612, such as a computer monitor, for displaying information
to a computer user. An input device 614, including alphanumeric and
other keys, is coupled to the bus 602 for communicating information
and command selections to the processor 604. Another type of user
input device is a cursor control 616, such as a mouse, a trackball,
or cursor direction keys for communicating direction information
and command selections to the processor 604 and for controlling
cursor movement on the display 612. This input device typically has
two degrees of freedom in two axes, a first axis (e.g., x) and a
second axis (e.g., y), that allows the device to specify positions
in a plane.
[0063] The computer system 600 may implement the techniques
described herein using customized hard-wired logic, one or more
ASICs or FPGAs, firmware and/or program logic which in combination
with the computer system causes or programs the computer system 600
to be a special-purpose machine. According to one embodiment, the
techniques herein are performed by the computer system 600 in
response to the processor 604 executing one or more sequences of
one or more instructions contained in the main memory 606. Such
instructions may be read into the main memory 606 from another
storage medium, such as the storage device 610. Execution of the
sequences of instructions contained in the main memory 606 causes
the processor 604 to perform the process steps described herein. In
alternative embodiments, hard-wired circuitry may be used in place
of or in combination with software instructions.
[0064] The term "storage media" as used herein refers to any
non-transitory media that store data and/or instructions that cause
a machine to operate in a specific fashion. Such storage media may
include non-volatile media and/or volatile media. Non-volatile
media includes, for example, optical or magnetic disks, such as the
storage device 610. Volatile media includes dynamic memory, such as
the main memory 606. Common forms of storage media include, for
example, a floppy disk, a flexible disk, hard disk, solid state
drive, magnetic tape, or any other magnetic data storage medium, a
CD-ROM, any other optical data storage medium, any physical medium
with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM,
NVRAM, any other memory chip or cartridge.
[0065] Storage media is distinct from but may be used in
conjunction with transmission media. Transmission media
participates in transferring information between storage media. For
example, transmission media includes coaxial cables, copper wire,
and fiber optics, including the wires that include the bus 602.
Transmission media can also take the form of acoustic or light
waves, such as those generated during radio-wave and infra-red data
communications.
[0066] Various forms of media may be involved in carrying one or
more sequences of one or more instructions to the processor 604 for
execution. For example, the instructions may initially be carried
on a magnetic disk or solid state drive of a remote computer. The
remote computer can load the instructions into its dynamic memory
and send the instructions over a network connection. A modem or
network interface local to the computer system 600 can receive the
data. The bus 602 carries the data to the main memory 606, from
which the processor 604 retrieves and executes the instructions.
The instructions received by the main memory 606 may optionally be
stored on the storage device 610 either before or after execution
by the processor 604.
[0067] The computer system 600 also includes a communication
interface 818 coupled to the bus 602. The communication interface
818 provides a two-way data communication coupling to a network
link 820 that is connected to a local network 822. For example, the
communication interface 818 may be an integrated services digital
network (ISDN) card, cable modem, satellite modem, or a modem to
provide a data communication connection to a corresponding type of
telephone line. Wireless links may also be implemented. In any such
implementation, the communication interface 818 sends and receives
electrical, electromagnetic, or optical signals that carry digital
data streams representing various types of information.
[0068] The network link 820 typically provides data communication
through one or more networks to other data devices. For example,
the network link 820 may provide a connection through the local
network 822 to a host computer 824 or to data equipment operated by
an Internet Service Provider (ISP) 826. The ISP 826 in turn
provides data communication services through the world wide packet
data communication network now commonly referred to as the
"Internet" 828. The local network 822 and Internet 828 both use
electrical, electromagnetic, or optical signals that carry digital
data streams. The signals through the various networks and the
signals on the network link 820 and through the communication
interface 818, which carry the digital data to and from the
computer system 600, are example forms of transmission media.
[0069] The computer system 600 can send messages and receive data,
including program code, through the network(s), the network link
820, and communication interface 818. In the Internet example, a
server 830 might transmit a requested code for an application
program through the Internet 828, ISP 826, local network 822, and
communication interface 818. The received code may be executed by
the processor 604 as it is received, and/or stored in the storage
device 610, or other non-volatile storage for later execution.
[0070] For example, FIG. 7 illustrates the example visual content
generation system 700 as might be used to generate imagery in the
form of still images and/or video sequences of images. The visual
content generation system 700 might generate imagery of live action
scenes, computer generated scenes, or a combination thereof. In a
practical system, users are provided with tools that allow them to
specify, at high levels and low levels where necessary, what is to
go into that imagery. For example, a user might be an animation
artist (like the artist 142 illustrated in FIG. 1) and might use
the visual content generation system 700 to capture interaction
between two human actors performing live on a sound stage and
replace one of the human actors with a computer-generated
anthropomorphic non-human being that behaves in ways that mimic the
replaced human actor's movements and mannerisms, and then add in a
third computer-generated character and background scene elements
that are computer-generated, all in order to tell a desired story
or generate desired imagery.
[0071] Still images that are output by the visual content
generation system 700 might be represented in computer memory as
pixel arrays, such as a two-dimensional array of pixel color
values, each associated with a pixel having a position in a
two-dimensional image array. Pixel color values might be
represented by three or more (or fewer) color values per pixel,
such as a red value, a green value, and a blue value (e.g., in RGB
format). Dimensions of such a two-dimensional array of pixel color
values might correspond to a preferred and/or standard display
scheme, such as 1920 pixel columns by 1280 pixel rows. Images might
or might not be stored in a compressed format, but either way, a
desired image may be represented as a two-dimensional array of
pixel color values. In another variation, images are represented by
a pair of stereo images for three-dimensional presentations and in
other variations, some or all of an image output might represent
three-dimensional imagery instead of just two-dimensional
views.
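For illustration, a still image of the kind described can be held as a
two-dimensional array of RGB pixel color values; a minimal sketch
using NumPy, an assumed tooling choice:

```python
import numpy as np

# A 1920x1280 image as a two-dimensional array of pixel color values,
# here with 8-bit red, green, and blue channels; a solid blue frame.
width, height = 1920, 1280
image = np.zeros((height, width, 3), dtype=np.uint8)
image[:, :] = (0, 0, 255)      # (R, G, B) value stored per pixel
print(image.shape)             # (1280, 1920, 3)
```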
[0072] A stored video sequence might include a plurality of images
such as the still images described above, but where each image of
the plurality of images has a place in a timing sequence and the
stored video sequence is arranged so that when each image is
displayed in order, at a time indicated by the timing sequence, the
display presents what appears to be moving and/or changing imagery.
In one representation, each image of the plurality of images is a
video frame having a specified frame number that corresponds to an
amount of time that would elapse from when a video sequence begins
playing until that specified frame is displayed. A frame rate might
be used to describe how many frames of the stored video sequence
are displayed per unit time. Example video sequences might include
24 frames per second (24 FPS), 50 FPS, 140 FPS, or other frame
rates. In some embodiments, frames are interlaced or otherwise
presented for display, but for the purpose of clarity of
description, in some examples, it is assumed that a video frame has
one specified display time and it should be understood that other
variations are possible.
[0073] One method of creating a video sequence is to simply use a
video camera to record a live action scene, i.e., events that
physically occur and can be recorded by a video camera. The events
being recorded can be events to be interpreted as viewed (such as
seeing two human actors talk to each other) and/or can include
events to be interpreted differently due to clever camera
operations (such as moving actors about a stage to make one appear
larger than the other despite the actors actually being of similar
build, or using miniature objects with other miniature objects so
as to be interpreted as a scene containing life-sized objects).
[0074] Creating video sequences for story-telling or other purposes
often calls for scenes that cannot be created with live actors,
such as a talking tree, an anthropomorphic object, space battles,
and the like. Such video sequences might be generated
computationally rather than capturing light from live scenes. In
some instances, an entirety of a video sequence might be generated
computationally, as in the case of a computer-animated feature
film. In some video sequences, it is desirable to have some
computer-generated imagery and some live action, perhaps with some
careful merging of the two.
[0075] While computer-generated imagery might be creatable by
manually specifying each color value for each pixel in each frame,
this is likely too tedious to be practical. As a result, a creator
uses various tools to specify the imagery at a higher level. As an
example, an artist (e.g., the artist 142 illustrated in FIG. 1)
might specify the positions in a scene space, such as a
three-dimensional coordinate system, of objects and/or lighting, as
well as a camera viewpoint, and a camera view plane. Taking all of
that as inputs, a rendering engine may compute each of the pixel
values in each of the frames. In another example, an artist
specifies position and movement of an articulated object having
some specified texture rather than specifying the color of each
pixel representing that articulated object in each frame.
[0076] In a specific example, a rendering engine performs ray
tracing wherein a pixel color value is determined by computing
which objects lie along a ray traced in the scene space from the
camera viewpoint through a point or portion of the camera view
plane that corresponds to that pixel. For example, a camera view
plane might be represented as a rectangle having a position in the
scene space that is divided into a grid corresponding to the pixels
of the ultimate image to be generated, and if a ray defined by the
camera viewpoint in the scene space and a given pixel in that grid
first intersects a solid, opaque, blue object, that given pixel is
assigned the color blue. Of course, for modern computer-generated
imagery, determining pixel colors--and thereby generating
imagery--can be more complicated, as there are lighting issues,
reflections, interpolations, and other considerations.
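As a small illustration of the ray setup described above, with an
assumed camera at the origin and a unit view plane one unit in front
of it, the ray direction through a given cell of the view-plane grid
can be computed as follows; the intersection and shading steps are
omitted:

```python
import math

def pixel_ray(grid_w, grid_h, px, py):
    """Direction of a ray traced from the camera viewpoint (the origin)
    through the view-plane cell for pixel (px, py); hypothetical geometry
    for illustration only."""
    u = (px + 0.5) / grid_w - 0.5      # horizontal grid position, centered
    v = 0.5 - (py + 0.5) / grid_h      # vertical, flipped so +v is up
    n = math.sqrt(u * u + v * v + 1.0)
    return (u / n, v / n, 1.0 / n)

# If the first object this ray intersects is solid, opaque, and blue,
# the pixel is assigned the color blue (intersection test not shown).
print(pixel_ray(1920, 1280, 960, 640))
```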
[0077] As illustrated in FIG. 7, a live action capture system 702
captures a live scene that plays out on a stage 704. The live
action capture system 702 is described herein in greater detail,
but might include computer processing capabilities, image
processing capabilities, one or more processors, program code
storage for storing program instructions executable by the one or
more processors, as well as user input devices and user output
devices, not all of which are shown.
[0078] In a specific live action capture system, cameras 706(1) and
706(2) capture the scene, while in some systems, there might be
other sensor(s) 708 that capture information from the live scene
(e.g., infrared cameras, infrared sensors, motion capture
("mo-cap") detectors, etc.). On the stage 704, there might be human
actors, animal actors, inanimate objects, background objects, and
possibly an object such as a green screen 710 that is designed to
be captured in a live scene recording in such a way that it is
easily overlaid with computer-generated imagery. The stage 704
might also contain objects that serve as fiducials, such as
fiducials 712(1)-(3), that might be used post-capture to determine
where an object was during capture. A live action scene might be
illuminated by one or more lights, such as an overhead light
714.
[0079] During or following the capture of a live action scene, the
live action capture system 702 might output live action footage to
a live action footage storage 720. A live action processing system
722 might process live action footage to generate data about that
live action footage and store that data into a live action metadata
storage 724. The live action processing system 722 might include
computer processing capabilities, image processing capabilities,
one or more processors, program code storage for storing program
instructions executable by the one or more processors, as well as
user input devices and user output devices, not all of which are
shown. The live action processing system 722 might process live
action footage to determine boundaries of objects in a frame or
multiple frames, determine locations of objects in a live action
scene, where a camera was relative to some action, distances
between moving objects and fiducials, etc. Where elements are
sensored or detected, the metadata might include location, color,
and intensity of the overhead light 714, as that might be useful in
post-processing to match computer-generated lighting on objects
that are computer-generated and overlaid on the live action
footage. The live action processing system 722 might operate
autonomously, perhaps based on predetermined program instructions,
to generate and output the live action metadata upon receiving and
inputting the live action footage. The live action footage can be
camera-captured data as well as data from other sensors.
[0080] An animation creation system 730 is another part of the
visual content generation system 700. The animation creation system
730 might include computer processing capabilities, image
processing capabilities, one or more processors, program code
storage for storing program instructions executable by the one or
more processors, as well as user input devices and user output
devices, not all of which are shown. The animation creation system
730 might be used by animation artists, managers, and others to
specify details, perhaps programmatically and/or interactively, of
imagery to be generated. From user input and data from a database
or other data source, indicated as a data store 732, the animation
creation system 730 might generate and output data representing
objects (e.g., a horse, a human, a ball, a teapot, a cloud, a light
source, a texture, etc.) to an object storage 734, generate and
output data representing a scene into a scene description storage
736, and/or generate and output data representing animation
sequences to an animation sequence storage 738.
[0081] Scene data might indicate locations of objects and other
visual elements, values of their parameters, lighting, camera
location, camera view plane, and other details that a rendering
engine 750 might use to render CGI imagery. For example, scene data
might include the locations of several articulated characters,
background objects, lighting, etc. specified in a two-dimensional
space, three-dimensional space, or other dimensional space (such as
a 2.5-dimensional space, three-quarter dimensions, pseudo-3D
spaces, etc.) along with locations of a camera viewpoint and view
plane from which to render imagery. For example, scene data might
indicate that there is to be a red, fuzzy, talking dog in the right
half of a video and a stationary tree in the left half of the
video, all illuminated by a bright point light source that is above
and behind the camera viewpoint. In some cases, the camera
viewpoint is not explicit, but can be determined from a viewing
frustum. In the case of imagery that is to be rendered to a
rectangular view, the frustum would be a truncated pyramid. Other
shapes for a rendered view are possible and the camera view plane
could be different for different shapes.
[0082] The animation creation system 730 might be interactive,
allowing a user to read in animation sequences, scene descriptions,
object details, etc. and edit those, possibly returning them to
storage to update or replace existing data. As an example, an
operator might read in objects from object storage into a baking
processor that would transform those objects into simpler forms and
return those to the object storage 734 as new or different objects.
For example, an operator might read in an object that has dozens of
specified parameters (movable joints, color options, textures,
etc.), select some values for those parameters and then save a
baked object that is a simplified object with now fixed values for
those parameters.
[0083] Rather than have to specify each detail of a scene, data
from the data store 732 might be used to drive object presentation.
For example, if an artist is creating an animation of a spaceship
passing over the surface of the Earth, instead of manually drawing
or specifying a coastline, the artist might specify that the
animation creation system 730 is to read data from the data store
732 in a file containing coordinates of Earth coastlines and
generate background elements of a scene using that coastline
data.
[0084] Animation sequence data might be in the form of time series
of data for control points of an object that has attributes that
are controllable. For example, an object might be a humanoid
character with limbs and joints that are movable in manners similar
to typical human movements. An artist can specify an animation
sequence at a high level, such as "the left hand moves from
location (X1, Y1, Z1) to (X2, Y2, Z2) over time T1 to T2", at a
lower level (e.g., "move the elbow joint 2.5 degrees per frame") or
even at a very high level (e.g., "character A should move,
consistent with the laws of physics that are given for this scene,
from point P1 to point P2 along a specified path").
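A high-level instruction like the hand-move example above can be
expanded into per-frame time-series data for a control point. The
sketch below assumes linear motion and 24 FPS sampling purely for
illustration:

```python
def sample_channel(p1, p2, t1, t2, fps=24):
    """Expand 'the left hand moves from p1 to p2 over time t1 to t2'
    into per-frame time-series data for one control point."""
    frames = []
    n = max(1, round((t2 - t1) * fps))
    for f in range(n + 1):
        u = f / n                        # normalized time within the move
        frames.append(tuple(a + u * (b - a) for a, b in zip(p1, p2)))
    return frames

series = sample_channel((0.0, 1.0, 0.2), (0.3, 1.4, 0.2), t1=0.0, t2=0.5)
print(len(series), series[0], series[-1])   # 13 samples at 24 FPS
```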
[0085] Animation sequences in an animated scene might be specified
by what happens in a live action scene. An animation driver
generator 744 might read in live action metadata, such as data
representing movements and positions of body parts of a live actor
during a live action scene, and generate corresponding animation
parameters to be stored in the animation sequence storage 738 for
use in animating a CGI object. This can be useful where a live
action scene of a human actor is captured while wearing mo-cap
fiducials (e.g., high-contrast markers outside actor clothing,
high-visibility paint on actor skin, face, etc.) and the movement
of those fiducials is determined by the live action processing
system 722. The animation driver generator 744 might convert that
movement data into specifications of how joints of an articulated
CGI character are to move over time.
[0086] A rendering engine 750 can read in animation sequences,
scene descriptions, and object details, as well as rendering engine
control inputs, such as a resolution selection and a set of
rendering parameters. Resolution selection might be useful for an
operator to control a trade-off between speed of rendering and
clarity of detail, as speed might be more important than clarity
for a movie maker to test a particular interaction or direction,
while clarity might be more important than speed for a movie maker
to generate data that will be used for final prints of feature
films to be distributed. The rendering engine 750 might include
computer processing capabilities, image processing capabilities,
one or more processors, program code storage for storing program
instructions executable by the one or more processors, as well as
user input devices and user output devices, not all of which are
shown.
[0087] The visual content generation system 700 can also include a
merging system 760 that merges live footage with animated content.
The live footage might be obtained and input by reading from the
live action footage storage 720 to obtain live action footage, by
reading from the live action metadata storage 724 to obtain details
such as presumed segmentation in captured images segmenting objects
in a live action scene from their background (perhaps aided by the
fact that the green screen 710 was part of the live action scene),
and by obtaining CGI imagery from the rendering engine 750.
[0088] The merging system 760 might also read data from a ruleset
for merging/combining storage 762. A very simple example of a rule
in a ruleset might be "obtain a full image including a
two-dimensional pixel array from live footage, obtain a full image
including a two-dimensional pixel array from the rendering engine
750, and output an image where each pixel is a corresponding pixel
from the rendering engine 750 when the corresponding pixel in the
live footage is a specific color of green, otherwise output a pixel
value from the corresponding pixel in the live footage."
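That example rule can be sketched as a per-pixel merge. The
tolerance-based green test below is an added assumption, since
captured green-screen pixels are never one exact color:

```python
import numpy as np

def merge_green_screen(live, cgi, key=(0, 255, 0), tol=40):
    """Output the rendered pixel wherever the live footage is (near) the
    key green, otherwise keep the live-action pixel.

    `live` and `cgi` are HxWx3 uint8 arrays; `tol` is a hypothetical
    per-channel tolerance.
    """
    diff = np.abs(live.astype(int) - np.array(key)).max(axis=-1)
    mask = diff < tol                  # True where the screen shows through
    out = live.copy()
    out[mask] = cgi[mask]
    return out

live = np.zeros((4, 4, 3), np.uint8)
live[:2] = (0, 255, 0)                          # top half is green screen
cgi = np.full((4, 4, 3), 128, np.uint8)
merged = merge_green_screen(live, cgi)
print(merged[0, 0], merged[3, 3])               # CGI on top, live below
```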
[0089] The merging system 760 might include computer processing
capabilities, image processing capabilities, one or more
processors, program code storage for storing program instructions
executable by the one or more processors, as well as user input
devices and user output devices, not all of which are shown. The
merging system 760 might operate autonomously, following
programming instructions, or might have a user interface or
programmatic interface over which an operator can control a merging
process. In some embodiments, an operator can specify parameter
values to use in a merging process and/or might specify specific
tweaks to be made to an output of the merging system 760, such as
modifying boundaries of segmented objects, inserting blurs to
smooth out imperfections, or adding other effects. Based on its
inputs, the merging system 760 can output an image to be stored in
a static image storage 770 and/or a sequence of images in the form
of video to be stored in an animated/combined video storage
772.
[0090] Thus, as described, the visual content generation system 700
can be used to generate video that combines live action with
computer-generated animation using various components and tools,
some of which are described in more detail herein. While the visual
content generation system 700 might be useful for such
combinations, with suitable settings, it can be used for outputting
entirely live action footage or entirely CGI sequences. The code
may also be provided and/or carried by a transitory computer
readable medium, e.g., a transmission medium such as in the form of
a signal transmitted over a network.
[0091] Operations of processes described herein can be performed in
any suitable order unless otherwise indicated herein or otherwise
clearly contradicted by context. Processes described herein (or
variations and/or combinations thereof) may be performed under the
control of one or more computer systems configured with executable
instructions and may be implemented as code (e.g., executable
instructions, one or more computer programs or one or more
applications) executing collectively on one or more processors, by
hardware or combinations thereof. The code may be stored on a
computer-readable storage medium, for example, in the form of a
computer program comprising a plurality of instructions executable
by one or more processors. The computer-readable storage medium may
be non-transitory.
[0092] Conjunctive language, such as phrases of the form "at least
one of A, B, and C," or "at least one of A, B and C," unless
specifically stated otherwise or otherwise clearly contradicted by
context, is otherwise understood with the context as used in
general to present that an item, term, etc., may be either A or B
or C, or any nonempty subset of the set of A and B and C. For
instance, in the illustrative example of a set having three
members, the conjunctive phrases "at least one of A, B, and C" and
"at least one of A, B and C" refer to any of the following sets:
{A}, {B}, {C}, {A, B}, {A, C}, {B, C}, {A, B, C}. Thus, such
conjunctive language is not generally intended to imply that
certain embodiments require at least one of A, at least one of B
and at least one of C each to be present.
[0093] The use of any and all examples, or exemplary language
(e.g., "such as") provided herein, is intended merely to better
illuminate embodiments of the invention and does not pose a
limitation on the scope of the invention unless otherwise claimed.
No language in the specification should be construed as indicating
any non-claimed element as essential to the practice of the
invention.
[0094] In the foregoing specification, embodiments of the invention
have been described with reference to numerous specific details
that may vary from implementation to implementation. The
specification and drawings are, accordingly, to be regarded in an
illustrative rather than a restrictive sense. The sole and
exclusive indicator of the scope of the invention, and what is
intended by the applicants to be the scope of the invention, is the
literal and equivalent scope of the set of claims that issue from
this application, in the specific form in which such claims issue,
including any subsequent correction.
[0095] Further embodiments can be envisioned to one of ordinary
skill in the art after reading this disclosure. In other
embodiments, combinations or sub-combinations of the
above-disclosed invention can be advantageously made. The example
arrangements of components are shown for purposes of illustration
and it should be understood that combinations, additions,
re-arrangements, and the like are contemplated in alternative
embodiments of the present invention. Thus, while the invention has
been described with respect to exemplary embodiments, one skilled
in the art will recognize that numerous modifications are
possible.
[0096] For example, the processes described herein may be
implemented using hardware components, software components, and/or
any combination thereof. The specification and drawings are,
accordingly, to be regarded in an illustrative rather than a
restrictive sense. It will, however, be evident that various
modifications and changes may be made thereunto without departing
from the broader spirit and scope of the invention as set forth in
the claims and that the invention is intended to cover all
modifications and equivalents within the scope of the following
claims.
[0097] All references, including publications, patent applications,
and patents, cited herein are hereby incorporated by reference to
the same extent as if each reference were individually and
specifically indicated to be incorporated by reference and were set
forth in its entirety herein.
* * * * *