U.S. patent application number 13/250857 was filed with the patent office on 2011-09-30 and published on 2013-03-21 for audio meters and parameter controls.
The applicant listed for this patent is Aaron M. Eppolito, Brian Meaney, Colleen Pendergast, Michaelle Stikich. Invention is credited to Aaron M. Eppolito, Brian Meaney, Colleen Pendergast, Michaelle Stikich.
Application Number | 20130073960 13/250857 |
Document ID | / |
Family ID | 47881820 |
Filed Date | 2011-09-30 |
United States Patent Application | 20130073960 |
Kind Code | A1 |
Eppolito; Aaron M. ; et al. | March 21, 2013 |
AUDIO METERS AND PARAMETER CONTROLS
Abstract
Some embodiments provide a media editing application that
displays the audio level of a set of one or more clips that has
been mixed with other clips. To indicate the audio level of the set
of clips that has been mixed with other clips, the media editing
application of some embodiments routes a combined audio signal of
the set of clips over a meter bus in order to determine the audio
level of the combined audio signal. Alternatively, the media
editing application of some embodiments extracts metering
information from each clip in a set of clips prior to mixing the
clips. The metering information is then used to estimate the audio
level of one or more clips in the composite presentation.
Inventors: |
Eppolito; Aaron M.; (Santa
Cruz, CA) ; Meaney; Brian; (Livermore, CA) ;
Pendergast; Colleen; (Livermore, CA) ; Stikich;
Michaelle; (El Cerrito, CA) |
Applicant: |
Name | City | State | Country | Type |
Eppolito; Aaron M. | Santa Cruz | CA | US | |
Meaney; Brian | Livermore | CA | US | |
Pendergast; Colleen | Livermore | CA | US | |
Stikich; Michaelle | El Cerrito | CA | US | |
Family ID: | 47881820 |
Appl. No.: | 13/250857 |
Filed: | September 30, 2011 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
61537041 | Sep 20, 2011 | |
61537567 | Sep 21, 2011 | |
Current U.S. Class: | 715/716 ; 381/56 |
Current CPC Class: | G11B 27/034 20130101; G11B 27/322 20130101; G11B 27/34 20130101 |
Class at Publication: | 715/716 ; 381/56 |
International Class: | G06F 3/01 20060101 G06F003/01; H04R 29/00 20060101 H04R029/00 |
Claims
1. A non-transitory machine readable medium storing a program that
when executed by at least one processing unit outputs audio content
for a composite presentation defined by a plurality of media clips,
the program comprising sets of instructions for: identifying a
sequence of clips that define the composite presentation;
extracting audio data from a set of clips in the sequence of clips;
estimating audio level of the set of clips based on the audio data;
and indicating the estimated audio level of the set of clips when
playing a mix of the sequence of clips.
2. The non-transitory machine readable medium of claim 1, wherein
the set of instructions for indicating the estimated audio level
comprises a set of instructions for displaying the audio level in
one or more meters.
3. The non-transitory machine readable medium of claim 1, wherein
the set of instructions for extracting audio data comprises
extracting audio signal data from each clip in the set of clips,
wherein the set of instructions for estimating the audio level
comprises a set of instructions for summing the audio signal data
of the set of clips.
4. The non-transitory machine readable medium of claim 3, wherein
the audio signal data are summed by adding a power contribution of
each clip in the set of clips.
5. The non-transitory machine readable medium of claim 1, wherein
the set of instructions for estimating the audio level comprises a
set of instructions for identifying a contribution of each clip in
the set of clips to the mix of the sequence and estimating the
audio level based on the identification.
6. The non-transitory machine readable medium of claim 5, wherein
the sequence of clips is associated with a series of operations,
wherein the set of instructions for estimating the audio level
comprises a set of instructions for processing down the series of
operations to identify the contribution of each clip to the mix of
the sequence of clips.
7. The non-transitory machine readable medium of claim 5, wherein
the set of instructions for estimating the audio level comprises a
set of instructions for scaling the audio level based on the
identification.
8. The non-transitory machine readable medium of claim 1, wherein
the program further comprises a set of instructions for identifying
a tag associated with the set of clips, wherein the audio data are
extracted from the set of clips based on the identification of the
tag.
9. The non-transitory machine readable medium of claim 1, wherein
the set of clips includes a compound clip that is defined by two or
more clips, wherein the computer program further comprises a set of
instructions for identifying each tag of the compound clip and the
compound clip's inner clips, and determining, based on the
identification, whether to indicate the audio level of the compound
clip or one or more of the compound clip's inner clips.
10. A method of outputting audio content for a composite
presentation defined by a plurality of media clips, the method
comprising: identifying a sequence of clips that define the
composite presentation; determining audio level of a set of clips
by sending the set of clips' audio data over a meter bus; and
indicating the estimated audio level of the set of clips when
playing a mix of the sequence of clips.
11. The method of claim 10, wherein indicating the estimated audio
level comprises displaying the audio level in one or more
meters.
12. The method of claim 10, wherein the audio data comprises audio
signal data from each clip in the set of clips, wherein determining
the audio level comprises summing the audio signal data of the set
of clips.
13. The method of claim 10, wherein the sequence of clips is
associated with a series of operations, wherein estimating the
audio level comprises processing down the series of operations to
identify the contribution of each clip in the set of clips to the
mix of the sequence of clips.
14. The method of claim 13, wherein the set of instructions for
estimating the audio level comprises scaling the audio level based
on the identification.
15. The method of claim 10 further comprising identifying a tag
associated with the set of clips, wherein the set of clips' audio
data is sent over the bus based on the identification of the
tag.
16. The non-transitory machine readable medium of claim 10, wherein
the set of clips includes a compound clip that is defined by two or
more clips, wherein the computer program further comprises a set of
instructions for identifying each tag of the compound clip and the
compound clip's inner clips, and determining, based on the
identification, whether to indicate the audio level of the compound
clip or one or more of the compound clip's inner clips.
17. A non-transitory machine readable medium storing a program that
when executed by at least one processing unit outputs a composite
presentation defined by a plurality of media clips, the program
comprising sets of instructions for: displaying the plurality of
media clips for defining the composite presentation, wherein at
least some of the plurality of media clips are tagged with different
tags; providing a set of controls for each particular tag that is
associated with one or more media clips; and modifying, in response
to an adjustment of the set of controls, a set of parameters
associated with each media clip tagged with the particular tag.
18. The non-transitory machine readable medium of claim 17, wherein
the set of controls includes audio controls and the set of
parameters includes audio level, wherein the set of instructions for
modifying comprises modifying the audio level of the one or more
clips tagged with the particular tag.
19. The non-transitory machine readable medium of claim 17, wherein
the program further comprises a set of instructions for outputting
audio content for the composite presentation based on the
modification.
20. The non-transitory machine readable medium of claim 17, wherein
the plurality of media clips comprises a compound clip that
includes multiple inner clips.
21. The non-transitory machine readable medium of claim 20, wherein
the set of instructions for adjusting the set of parameters
comprises a set of instructions for identifying each tag of the
compound clip and the compound clip's inner clips, and determining,
based on the identification, whether to adjust a set of parameters
associated with the compound clip or one or more of the compound
clip's inner clips.
22. The non-transitory machine readable medium of claim 17, wherein
the set of parameters relates to an effect or filter associated
with one or more of the tagged clips.
Description
CLAIM OF BENEFIT TO PRIOR APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional
Application 61/537,041, filed Sep. 20, 2011, and U.S. Provisional
Application 61/537,567, filed Sep. 21, 2011. U.S. Provisional
Application 61/537,041 and U.S. Provisional Application 61/537,567
are incorporated herein by reference.
BACKGROUND
[0002] To date, many media editing applications exist for creating
a composite media presentation by compositing several pieces of
media content such as video, audio, animation, still image, etc. In
some cases, a media editing application combines a composite of two
or more clips with one or more other clips to output (e.g., play,
export) the composite presentation.
[0003] There are a number of different problems that can occur when
outputting such a composite presentation. For example, some movie
studios require a particular content (e.g., dialog content, music
content) of a composite presentation to be separate from other
content. The content separation allows the movie studios to easily
replace the composite presentation's dialog in one language with a
dialog in another language. The problem with providing separate
content is that, once several pieces of media content are mixed as
one mixed content, the mixed content cannot be un-mixed to provide
the separate content.
[0004] As another example, displaying the audio levels of different
media clips during playback of a composite presentation is useful
as the audio levels indicate how much audio one or more of the
different media clips are contributing to the overall mix. The
problem with this is similar to the example described above. That
is, a mix of the different media clips cannot be un-mixed during
playback to provide metering information for the different media
clips.
[0005] In addition, some media editing applications apply one or
more different effects (e.g., reverb effect, echo effect, blur
effect, distort effect, etc.) to a set of clips when outputting a
composite presentation. Several of these effects are applied using
a "send" (i.e., "send and return") that entails routing audio
signals of different clips over an auxiliary ("aux") bus to an
effects unit. For a typical media editing application, a "send"
effect is applied with the user manually adding an input aux track,
specifying an effect for the aux track, specifying an input bus for
the aux track, creating the "send", and identifying the specified
bus to route the audio signals of different clips. In this manner,
several audio signals of different clips can be routed over one aux
bus in order to apply a same effect (e.g., an echo effect) to a
combined audio signal of the different clips. However, the "send"
technique becomes increasingly complicated as additional aux buses
are added to route audio signals of multiple different clips.
[0006] Furthermore, several of the media editing applications
described above allow users to view metadata associated with media
content and/or perform organizing operations using the metadata.
However, these media editing applications lack the tools or the
functionality to perform different editing operations by using one
or more pieces of metadata that are associated with the media
content.
[0007] The concepts described in this section have not necessarily
been previously conceived, or implemented in any prior approach.
Therefore, unless otherwise indicated, it should not be assumed
that any concepts described in this section qualify as prior art
merely by virtue of their inclusion in this section.
BRIEF SUMMARY
[0008] Some embodiments provide a media editing application that
uses metadata or metadata tags associated with media content to
facilitate editing operations. In some embodiments, the editing
operations are performed on the media content at various different
stages of the editing process in order to create a composite
presentation. In creating the composite presentation, one or more
effects are associated with a metadata tag. Once the effects are
associated, the media editing application applies the effects to
different pieces of media content tagged with the metadata tag in
order to create the composite presentation.
[0009] Different embodiments provide different schemes for
specifying one or more effects to apply to media content that has
been associated with a metadata tag. For instance, in some
embodiments, the media editing application allows an effect chain
or an effect list to be specified for each type or category of
metadata tag. In some embodiments, the media editing application
allows its user to specify effect properties for the effects in the
effect list. These effect properties define how the corresponding
effect is applied to the media content.
[0010] Based on metadata associated with different clips, the media
editing application of some embodiments applies a set of effects
(e.g., echo effect, reverb effect) by using a "send" or a "send and
return". In some embodiments, the "send" is performed automatically
such that the routing of audio signals of the different clips to an
effect module is transparent to the application's user. That is,
the user does not have to add an input auxiliary ("aux") track,
specify an effect for the aux track, specify an input bus for the
aux track, create the "send", and identify the specified bus to
route the audio signals of the different clips. Instead, the user
can simply specify a particular effect for a metadata tag. The
media editing application then applies the particular effect using
the "send" to a combined audio signal of each clip tagged with the
metadata tag.
[0011] The media editing application of some embodiments applies
one or more effects directly on each clip without using the "send".
One example of such a technique is applying an effect as an "insert"
effect that processes (e.g., filters, distorts) an incoming audio
signal and outputs the processed audio signal. For example, when a
metadata tag is associated with a particular effect, the media
editing application of some embodiments automatically applies the
particular effect to each audio signal of the different clips
tagged with the metadata tag.
[0012] In some embodiments, when playing a composite presentation,
the media editing application displays the audio level of a set of
one or more clips that has been mixed with other clips. For
example, the audio signals of the set of clips can be mixed with
other clips in order to play the composite presentation. To
indicate the audio level of the set of clips that has been mixed
with other clips, the media editing application of some embodiments
routes a combined audio signal of the set of clips over a meter bus
in order to determine the audio level of the combined audio signal.
In some embodiments, the media editing application scales (i.e.,
reduces or increases) the audio level of one or more clips by
processing down a signal chain or sequence of operations and
identifying what one or more of the clips are contributing to the
overall mix.
[0013] Alternatively, the media editing application of some
embodiments extracts metering information from each clip in a set
of clips prior to mixing the clips. The metering information is
then used to estimate the audio level of one or more clips in the
composite presentation. Similar to sending the audio signal over
the meter bus, the media editing application of some embodiments
scales the estimated audio level by identifying what one or more of
the clips are contributing to the overall mix.
[0014] In some embodiments, the media editing application allows a
composite presentation to be exported to different tracks (e.g.,
different files). To export the composite presentation, the media
editing application of some embodiments performs multiple rendering
passes on a sequence of clips while muting one or more of the clips
in the sequence. In some such embodiments, the composite
presentation is output to different tracks based on metadata
associated with the clips. For example, with these metadata tags, a
multi-track output can be specified as a first track for each clip
tagged as dialog, a second track for each clip tagged as music,
etc. In this manner, the editor or a movie studio can easily
replace one track with another track.
[0015] The media editing application of some embodiments uses
metadata to provide user interface controls. In some such
embodiments, these controls are used to display properties of
tagged clips and/or specify parameters that affect the tagged
clips. Examples of such user interface controls include audio
meters, volume controls, different controls for modifying (e.g.,
distorting, blurring, changing color) images, etc.
[0016] The preceding Summary is intended to serve as a brief
introduction to some embodiments of the invention. It is not meant
to be an introduction or overview of all inventive subject matter
disclosed in this document. The Detailed Description that follows
and the Drawings that are referred to in the Detailed Description
will further describe the embodiments described in the Summary as
well as other embodiments. Accordingly, to understand all the
embodiments described by this document, a full review of the
Summary, Detailed Description and the Drawings is needed. Moreover,
the claimed subject matters are not to be limited by the
illustrative details in the Summary, Detailed Description, and the
Drawings, but rather are to be defined by the appended claims,
because the claimed subject matters can be embodied in other
specific forms without departing from the spirit of the subject
matters.
BRIEF DESCRIPTION OF THE DRAWINGS
[0017] The novel features of the invention are set forth in the
appended claims. However, for purpose of explanation, several
embodiments of the invention are set forth in the following
figures.
[0018] FIG. 1 conceptually illustrates a process that some
embodiments use to apply effects.
[0019] FIG. 2 shows a signal flow diagram that conceptually
illustrates how some embodiments apply the reverb effect.
[0020] FIG. 3 shows a signal flow diagram of some embodiments that
conceptually illustrates application of an effect on multiple
clips.
[0021] FIG. 4 shows a signal flow diagram that conceptually
illustrates how some embodiments apply an effect chain with
multiple different effects.
[0022] FIG. 5 shows a signal flow diagram that conceptually
illustrates how some embodiments apply a particular effect or a
particular filter as an insert effect.
[0023] FIG. 6 illustrates an example of specifying an effect for a
compound clip.
[0024] FIG. 7 shows a signal flow diagram of some embodiments that
conceptually illustrates the application of the reverb effect on
the compound clip.
[0025] FIG. 8 shows a signal flow diagram of some embodiments that
conceptually illustrates the application of the reverb effect on an
inner clip of a compound clip.
[0026] FIG. 9 shows a signal flow diagram that conceptually
illustrates how some embodiments route a combined audio signal of
several clips over a particular aux bus based on the clips'
association with a metadata tag.
[0027] FIG. 10 conceptually illustrates a process that some
embodiments use to apply one or more effects to a compound clip
and/or the compound clip's nested clips.
[0028] FIG. 11 illustrates an example of how some embodiments
perform editing operations based on a compound clip's tag.
[0029] FIG. 12 illustrates example meters that indicate audio
levels of several clips that have been mixed with other clips.
[0030] FIG. 13 shows a signal flow diagram that conceptually
illustrates sending an audio signal of a clip over a meter bus in
order to display the clip's audio level during playback of a mixed
audio signal of a composite presentation.
[0031] FIG. 14 shows a signal flow diagram that conceptually
illustrates routing a combined audio signal of several clips over a
meter bus for the purposes of displaying the clips' audio
level.
[0032] FIG. 15 conceptually illustrates a process that some
embodiments use to estimate audio levels of clips that are tagged
with metadata tags.
[0033] FIG. 16 conceptually illustrates a process that some
embodiments use to construct user interface controls based on
metadata tags.
[0034] FIG. 17 shows a data flow diagram that conceptually
illustrates an example of adjusting parameters of several clips at
different levels of a hierarchy, in some embodiments.
[0035] FIG. 18 illustrates how some embodiments output audio
content to different tracks based on metadata that is associated
with different clips.
[0036] FIG. 19 provides an illustrative example of an output tool
for the media editing application.
[0037] FIG. 20A illustrates the problem with outputting a composite
presentation to different tracks.
[0038] FIG. 20B illustrates outputting a composite presentation to
different audio files, in some embodiments.
[0039] FIG. 21 conceptually illustrates a process that some
embodiments use to output a composite presentation based on
metadata tags associated with one or more output tracks.
[0040] FIG. 22 illustrates a graphical user interface of a media
editing application of some embodiments.
[0041] FIG. 23 conceptually illustrates the software architecture
of a media editing application of some embodiments.
[0042] FIG. 24 conceptually illustrates example data structures for
several objects associated with a media editing application of some
embodiments.
[0043] FIG. 25 illustrates an electronic system with which some
embodiments of the invention are implemented.
DETAILED DESCRIPTION
[0044] In the following detailed description of the invention,
numerous details, examples, and embodiments of the invention are
set forth and described. However, it will be clear and apparent to
one skilled in the art that the invention is not limited to the
embodiments set forth and that the invention may be practiced
without some of the specific details and examples discussed.
[0045] Some embodiments provide a media editing application that
uses metadata or metadata tags associated with media content to
facilitate editing operations. In some embodiments, the editing
operations are performed on the media content at various different
stages of the editing process in order to create a composite
presentation. In creating the composite presentation, one or more
effects are associated with a metadata tag. Once the effects are
associated, the media editing application applies the effects to
different pieces of media content tagged with the metadata tag in
order to create the composite presentation.
[0046] Different embodiments provide different schemes for
specifying one or more effects to apply to media content that has
been associated with a metadata tag. For instance, in some
embodiments, the media editing application allows an effect chain
or an effect list to be specified for each type or category of
metadata tag. In some embodiments, the media editing application
allows its user to specify effect properties for the effects in the
effect list. These effect properties define how the corresponding
effect is applied to the media content.
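By way of illustration only, the association between a metadata tag and an effect chain could be held in a simple mapping from tag to an ordered list of effects, each carrying its own properties. The sketch below is not taken from the described application; the names `Effect`, `EFFECT_CHAINS`, and `effects_for_clip`, as well as the property keys, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Effect:
    """One entry in an effect chain (hypothetical structure)."""
    name: str
    properties: dict = field(default_factory=dict)


# Hypothetical effect chains keyed by metadata tag; order is the chain order.
EFFECT_CHAINS = {
    "Dialog": [
        Effect("equalizer", {"low_cut_hz": 80}),
        Effect("reverb", {"type": "room", "level_db": -12.0}),
    ],
    "Music": [Effect("echo", {"delay_ms": 250})],
}


def effects_for_clip(clip_tags):
    """Collect, in chain order, the effects specified for each of a clip's tags."""
    chain = []
    for tag in clip_tags:
        chain.extend(EFFECT_CHAINS.get(tag, []))
    return chain
```

A clip tagged "Dialog" would pick up the equalizer followed by the reverb, while a clip whose tag has no chain picks up nothing.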
[0047] Based on metadata associated with different clips, the media
editing application of some embodiments applies a set of effects
(e.g., echo effect, reverb effect) by using a "send" or a "send and
return". In some embodiments, the "send" is performed automatically
such that the routing of audio signals of the different clips to an
effect module is transparent to the application's user. That is,
the user does not have to add an input auxiliary ("aux") track,
specify an effect for the aux track, specify an input bus for the
aux track, create the "send", and identify the specified bus to
route the audio signals of the different clips. Instead, the user
can simply specify a particular effect for a metadata tag. The
media editing application then applies the particular effect using
the "send" to a combined audio signal of each clip tagged with the
metadata tag.
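The automatic "send" described above can be sketched, under simplifying assumptions, as summing the signals of every clip carrying a tag onto one bus, applying the effect once to that combined signal, and summing the returned signal with the unprocessed mix. Audio is modeled here as equal-length lists of floats, and `mix`, `apply_send_effect`, and `send_gain` are hypothetical names rather than parts of the described application.

```python
def mix(signals):
    """Sum equal-length sample lists sample by sample."""
    return [sum(frame) for frame in zip(*signals)]


def apply_send_effect(clips, tag, effect, send_gain=1.0):
    """Mix all clips and add `effect` applied once to the tagged clips' combined signal.

    clips: list of (tags, samples) pairs; effect: function from samples to samples.
    """
    dry = mix([samples for _, samples in clips])
    tagged = [samples for tags, samples in clips if tag in tags]
    if not tagged:
        return dry
    # Combined signal of the tagged clips, as it would travel over the aux bus.
    bus = [send_gain * s for s in mix(tagged)]
    # The effect runs once on the combined signal, not once per clip.
    wet = effect(bus)
    # The returned signal is summed with the unprocessed signals at the output.
    return mix([dry, wet])
```

The user-facing input in this sketch is only the tag and the effect; the bus is created and fed without any further configuration.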
[0048] The media editing application of some embodiments applies
one or more effects directly on each clip without using the "send".
One example of such a technique is applying an effect as an "insert"
effect that processes (e.g., filters, distorts) an incoming audio
signal and outputs the processed audio signal. For example, when a
metadata tag is associated with a particular effect, the media
editing application of some embodiments automatically applies the
particular effect to each audio signal of the different clips
tagged with the metadata tag.
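In contrast to a "send", an insert effect processes each tagged clip's own signal in place. A minimal sketch, assuming audio is modeled as lists of floats and using a hard clipper as a stand-in distortion, might look as follows; `hard_clip` and `apply_insert_effects` are hypothetical names.

```python
def hard_clip(samples, limit=0.5):
    """A stand-in distortion: clamp every sample to the range [-limit, limit]."""
    return [max(-limit, min(limit, s)) for s in samples]


def apply_insert_effects(clips, tag, effect):
    """Run `effect` on each clip carrying `tag`; pass other clips through unchanged.

    clips: list of (tags, samples) pairs. Returns a new list of the same shape.
    """
    return [
        (tags, effect(samples) if tag in tags else samples)
        for tags, samples in clips
    ]
```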
[0049] In some embodiments, when playing a composite presentation,
the media editing application displays the audio level of a set of
one or more clips that has been mixed with other clips. For
example, the audio signals of the set of clips can be mixed with
other clips in order to play the composite presentation. To
indicate the audio level of the set of clips that has been mixed
with other clips, the media editing application of some embodiments
routes a combined audio signal of the set of clips over a meter bus
in order to determine the audio level of the combined audio signal.
In some embodiments, the media editing application scales (i.e.,
reduces or increases) the audio level of one or more clips by
processing down a signal chain or sequence of operations and
identifying what one or more of the clips are contributing to the
overall mix.
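As a rough sketch of the meter-bus approach, the signals of the tagged clips can be scaled by their contribution to the mix, summed onto a separate bus, and measured there. The example below assumes audio is a list of floats with full scale at 1.0, uses an RMS measurement, and treats the per-clip `gain` as a stand-in for whatever scaling the signal chain applies; all function names are hypothetical.

```python
import math


def level_dbfs(samples):
    """RMS level of a block in dB relative to full scale (1.0); -inf for silence."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def meter_bus_level(clips, tag):
    """Level of the combined signal of all clips carrying `tag`.

    clips: list of (tags, gain, samples); gain models the clip's scaling in the mix.
    """
    tagged = [[gain * s for s in samples]
              for tags, gain, samples in clips if tag in tags]
    # The meter bus carries only the tagged clips, separate from the main mix.
    bus = [sum(frame) for frame in zip(*tagged)]
    return level_dbfs(bus)
```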
[0050] Alternatively, the media editing application of some
embodiments extracts metering information from each clip in a set
of clips prior to mixing the clips. The metering information is
then used to estimate the audio level of one or more clips in the
composite presentation. Similar to sending the audio signal over
the meter bus, the media editing application of some embodiments
scales the estimated audio level by identifying what one or more of
the clips are contributing to the overall mix.
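The estimation approach can be illustrated by measuring each clip's power before mixing and then adding the scaled power contributions of the tagged clips. The sketch below makes assumptions not stated in the text: power is the mean square of a block of samples, each clip's contribution is scaled by the square of a linear `gain`, and the clips are treated as uncorrelated so that their powers add. The function names are hypothetical.

```python
import math


def clip_power(samples):
    """Mean-square power of one clip's block, measured before mixing."""
    return sum(s * s for s in samples) / len(samples) if samples else 0.0


def estimate_level_db(metered, tag):
    """Estimated level, in dB, of the clips carrying `tag`.

    metered: list of (tags, gain, power), where power comes from clip_power and
    gain is the clip's linear contribution to the overall mix.
    """
    total = sum(gain * gain * power
                for tags, gain, power in metered if tag in tags)
    return 10.0 * math.log10(total) if total > 0 else float("-inf")
```

Because only per-clip measurements are combined, no additional bus is needed; the trade-off is that the result is an estimate rather than a measurement of the actual summed signal.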
[0051] In some embodiments, the media editing application allows a
composite presentation to be exported to different tracks (e.g.,
different files). To export the composite presentation, the media
editing application of some embodiments performs multiple rendering
passes on a sequence of clips while muting one or more of the clips
in the sequence. In some such embodiments, the composite
presentation is output to different tracks based on metadata
associated with the clips. For example, with these metadata tags, a
multi-track output can be specified as a first track for each clip
tagged as dialog, a second track for each clip tagged as music,
etc. In this manner, the editor or a movie studio can easily
replace one track with another track.
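The multi-pass export can be sketched as one rendering pass per output track, with every clip that does not carry the track's tag muted during that pass. In this simplified illustration each clip has a single tag and starts at time zero, audio is a list of floats, and `render_pass` and `export_tracks` are hypothetical names.

```python
def render_pass(clips, audible_tag):
    """Mix one pass in which every clip not tagged `audible_tag` is muted.

    clips: list of (tag, samples). The output spans the longest clip so that
    all exported tracks stay aligned with one another.
    """
    length = max((len(samples) for _, samples in clips), default=0)
    out = [0.0] * length
    for tag, samples in clips:
        if tag != audible_tag:
            continue  # muted in this pass
        for i, s in enumerate(samples):
            out[i] += s
    return out


def export_tracks(clips, output_tags):
    """Run one rendering pass per requested tag; returns {tag: mixed samples}."""
    return {tag: render_pass(clips, tag) for tag in output_tags}
```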
[0052] The media editing application of some embodiments uses
metadata to provide user interface controls. In some such
embodiments, these controls are used to display properties of
tagged clips and/or specify parameters that affect the tagged
clips. Examples of such user interface controls include audio
meters, volume controls, different controls for modifying (e.g.,
distorting, blurring, changing color) images, etc.
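As one possible illustration, a volume control could be constructed for each distinct tag in the presentation, with an adjustment of that control propagated to every clip carrying the tag. The clip representation (a dictionary with `tags` and `volume_db` keys) and the function names below are assumptions made for the example only.

```python
def build_controls(clips):
    """Create one volume control entry per distinct tag found on the clips."""
    return {tag: {"volume_db": 0.0} for clip in clips for tag in clip["tags"]}


def adjust_volume(clips, controls, tag, volume_db):
    """Record the new control value and apply it to every clip carrying `tag`."""
    controls[tag]["volume_db"] = volume_db
    for clip in clips:
        if tag in clip["tags"]:
            clip["volume_db"] = volume_db
```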
[0053] Several more example editing operations are described
below. Section I describes several examples of applying effects to
different tagged clips. Section II then introduces compound clips
and provides several examples of applying effects to the compound
clips. Section III then describes examples of metering clips that
have previously been mixed. Section IV then describes constructing
user interface controls and propagating parameters specified
through the user interface controls. Section V then describes using
metadata tags to output a composite presentation to different
tracks. Section VI describes an example graphical user interface
and software architecture of a media editing application of some
embodiments. Section VI also describes several example data
structures for the media editing application of some embodiments.
Finally, Section VII describes an electronic system which
implements some embodiments of the invention.
I. Applying Effects to Clips Based on Metadata
[0054] In some embodiments, the media editing application applies
one or more effects to clips in a composite presentation based on
metadata (i.e., metadata tags) associated with the clips. In
creating the composite presentation, one or more effects are
associated with a metadata tag. Once the effects are associated,
the media editing application applies the effects to different
pieces of media content tagged with the metadata tag in order to
create the composite presentation.
[0055] There are many different effects or filters that can be
associated with metadata to facilitate editing operations. Although
this list is non-exhaustive, several example audio effects include
an equalizer for modifying the signal strength of a clip within
specified frequency ranges, an echo effect for creating an echo
sound, and a reverb effect for creating a reverberation effect that
emulates a particular acoustic environment. Several example video
effects or image effects include color filters that operate on
color values, different filters that sharpen, stylize, distort, or
blur an image, and fade-in/fade-out effects for creating
transitions between scenes.
[0056] FIG. 1 conceptually illustrates a process 100 that some
embodiments use to apply effects to different clips based on
metadata. Specifically, this figure illustrates process 100 that
applies effects to the different clips in a composite presentation
when outputting the composite presentation. In some embodiments,
process 100 is performed by a media editing application. This
process 100 will be described by reference to FIGS. 3-5 that
illustrate application of effects on a set of clips based on the
association of the effects to metadata and the association of the
metadata to the set of clips.
[0057] As shown, process 100 identifies (at 105) each clip tagged
with a particular metadata tag having an associated effect. FIG. 2
shows a signal flow diagram 200 that conceptually illustrates
application of an effect. Specifically, the signal flow diagram 200
illustrates an example of how the audio signal of a clip 215 is
routed to output a mixed audio signal with a specified reverb
effect. As shown, the figure includes the clip 215, a master 210,
and a reverb effect ("FX") module 205. The reverb FX module 205
receives an audio signal of one or more clips, applies the reverb
effect to the received audio signal, and outputs an audio signal
containing the reverb effect. The master 210 defines the output
audio level of a composite presentation.
[0058] In the example illustrated in FIG. 2, process 100 identifies the
clip 215 as a clip tagged with a "Dialog" tag having an associated
effect. In some embodiments, the identification is initiated based
on user input to output a composite presentation based on a
sequence of clips that define the composite presentation.
Alternatively, in some embodiments, the media editing application
performs rendering and/or mixing operations in the background in
order to output the composite presentation (e.g., to play a preview
of the composite presentation in real-time).
[0059] Process 100 then identifies (at 110) the effect that is
associated with the particular metadata tag. As shown in FIG. 2, the
clip 215 is associated with a "Dialog" tag. This piece of metadata
is associated with a reverb effect. Process 100 then determines (at
115) whether the effect requires data of one or more clips to be
routed to one effect creation unit (e.g., by using send and
return).
[0060] When the effect does not require data of one or more clips to
be routed, process 100 proceeds to 120, which is described below.
Otherwise, process 100 defines (at 135) a bus for the
particular metadata tag. In some embodiments, the process creates
this bus to send a combined audio signal of each clip tagged with
the particular metadata tag. In the example illustrated in FIG. 2,
an aux send bus is defined to send an audio signal of each clip
tagged with the "Dialog" tag.
[0061] Process 100 then sends (at 140) an audio signal of each
identified clip over the aux send bus. Process 100 identifies (at
145) parameters of the identified effect. Different effects can be
associated with different parameters. For example, a reverb effect
can have one set of parameters including the output audio level of
the reverberation effect, the type of reverberation (e.g., room,
hall, space), etc. Different from the reverb effect, an image
distortion effect can have a different set of settings or
parameters for distorting images.
[0062] The process 100 then applies (at 150) the effect to each
identified clip based on the identified parameters. As shown in
FIG. 2, the clip 215 is tagged with a "Dialog" tag. This "Dialog"
tag is associated with a reverb effect. Based on the association,
the reverb effect is applied to the clip 215 using the send and
return. Specifically, the audio signal of the clip 215 is routed to
the reverb FX module 205. The same audio signal is also routed directly to
the master 210. The reverb FX module 205 then applies the reverb
effect to the received audio signal and returns an audio signal
containing the reverb effect to the master 210. The "+" symbol in
this and other figures indicates that audio signals are being
combined (i.e., mixed, summed). Hence, the master 210 receives the
mixed audio signal and outputs a resulting mixed audio signal for
the composite presentation.
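The send-and-return routing of FIG. 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the application's implementation: signals are plain lists of samples, and `toy_reverb`, `mix`, and `send_and_return` are hypothetical names. The toy reverb is only a delayed, attenuated copy standing in for reverb FX module 205.

```python
from typing import Callable, List

Signal = List[float]  # one float per audio sample


def mix(*signals: Signal) -> Signal:
    """Sum equal-length signals sample by sample (the "+" in the figures)."""
    return [sum(samples) for samples in zip(*signals)]


def toy_reverb(signal: Signal, delay: int = 2, gain: float = 0.5) -> Signal:
    """Hypothetical stand-in for reverb FX module 205.

    Returns only the processed (wet) signal: a delayed, attenuated copy.
    """
    return [gain * signal[i - delay] if i >= delay else 0.0
            for i in range(len(signal))]


def send_and_return(dry: Signal, effect: Callable[[Signal], Signal]) -> Signal:
    """Send the clip to the effect, then mix the return with the direct signal."""
    wet = effect(dry)      # send: the effect module receives the clip's signal
    return mix(dry, wet)   # return: wet and dry signals are summed for the master


clip_215 = [1.0, 0.0, 0.0, 0.0]
master_out = send_and_return(clip_215, toy_reverb)
print(master_out)  # [1.0, 0.0, 0.5, 0.0]
```

The output keeps the original impulse at the first sample and adds the returned reverb signal two samples later, which mirrors the direct path and the return path both arriving at the master 210.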
[0063] In the example illustrated in FIG. 2, the audio signal of
the clip 215 and the audio signal containing the reverb effect are
mixed because a reverb effect represents one type of effect that
typically mixes back in the original audio signal. An echo effect
is another example of such type of effect. For example, the output
of the reverb FX module 205 for a clip with dialog is the
reverberation of that dialog (e.g., in a theatre, in a hallway).
Therefore, the audio signal of the clip is mixed back in such that
the audience can hear the dialog and not just the reverberation of
that dialog.
[0064] In the example illustrated in FIG. 2, the routing of the
audio signal to the reverb effect module 205 is transparent to the
application's user. The user does not have to add an auxiliary
track, insert the reverb effect to the auxiliary track, specify a
bus for the auxiliary track, etc. The user can simply associate the
clip's metadata tag with the reverb effect. The media editing
application then automatically applies the reverb effect to the
clip 215 using the "send and return".
[0065] One reason for utilizing the "send" technique is that it
allows a combined audio signal of multiple clips to be processed
through the same effects unit. In most cases, the "send" operation
is used to efficiently process multiple audio signals as one
composite audio signal. In other words, as multiple audio signals
are mixed and processed together, the "send" technique can be less
computationally expensive than applying an effect to each
individual audio signal.
[0066] FIG. 3 shows a signal flow diagram 300 that conceptually
illustrates an example of applying an effect to multiple different
clips. Specifically, this figure illustrates an example of how
audio signals of clips 305-315 are routed to output a mixed audio
signal for a composite presentation. As shown, the figure includes
clips 305-315, an echo FX module 305, and the master 210. The
master 210 is the same as the one described above by reference to
FIG. 2. The echo FX module 305 receives an audio signal of one or
more audio clips, applies the echo effect to the received audio
signal, and outputs an audio signal containing the echo effect.
[0067] As shown in FIG. 3, the audio signals of clips 305-310 are
sent to the echo FX module 305. The audio signals of clips 305-315
are sent to the master 210. The echo FX module 305 receives a mixed
audio signal of clips 305-310, processes the received audio signal,
and returns an audio signal containing the echo effect to the
master 210. The master 210 receives a mixed audio signal of clips
305-315 and the audio signal containing the echo effect from the echo
FX module 305. The master 210 then outputs a resulting mixed audio
signal for the sequence of clips 305-315. Here, the resulting mixed
audio signal is a composite audio signal of clips 305-315 and
includes the echo effect applied to clips 305 and 310. In some
embodiments, the duration of this composite audio signal is the
duration of the composite presentation.
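As a rough sketch of the routing in FIG. 3, the following Python mixes two clips onto one send, runs the effect once on the combined signal, and sums everything at the master. The `CountingEcho` class and the sample values are hypothetical; the call counter is there only to show that the effect module processes one combined signal rather than one signal per clip.

```python
from typing import List

Signal = List[float]


def mix(*signals: Signal) -> Signal:
    """Sum equal-length signals sample by sample."""
    return [sum(samples) for samples in zip(*signals)]


class CountingEcho:
    """Hypothetical echo module that records how many times it runs."""

    def __init__(self, delay: int = 1, gain: float = 0.5) -> None:
        self.delay = delay
        self.gain = gain
        self.calls = 0

    def __call__(self, signal: Signal) -> Signal:
        self.calls += 1
        return [self.gain * signal[i - self.delay] if i >= self.delay else 0.0
                for i in range(len(signal))]


clip_305 = [1.0, 0.0, 0.0]
clip_310 = [0.0, 1.0, 0.0]
clip_315 = [0.0, 0.0, 1.0]

echo = CountingEcho()
send_bus = mix(clip_305, clip_310)   # only clips 305 and 310 are sent
wet = echo(send_bus)                 # one pass over the combined signal
master_out = mix(clip_305, clip_310, clip_315, wet)

print(echo.calls)   # 1
print(master_out)   # [1.0, 1.5, 1.5]
```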
[0068] In the examples described above, one effect is applied to
one or more clips. FIG. 4 shows a signal flow diagram 400 that
conceptually illustrates applying an effect chain with multiple
different effects. In some embodiments, the effect chain represents
an ordered sequence or series of effects that is specified for a
particular metadata tag and applied to one or more clips tagged
with the particular metadata tag.
[0069] As shown in FIG. 4, the clip 215 is tagged with the "Dialog"
tag, and a chain of effects has been specified for this tag. The
chain of effects includes a reverb effect and an echo effect. As
the echo effect is being applied to the clip 215, the signal flow
diagram 400 includes an echo FX module 305.
[0070] In the example illustrated in FIG. 4, the clip's audio
signal is first routed to the reverb FX module 205. This is because
the reverb effect is the first effect in the chain of effects.
Here, the reverb FX module 205 applies the reverb effect to the
incoming audio signal and outputs an audio signal containing the
reverb effect. To continue the chain of effects, the audio signal
containing the reverb effect is received at the echo FX module 305.
The echo FX module 305 processes the incoming audio signal and
outputs a processed audio signal. As indicated by the "+" symbol,
the audio signal from the echo FX module 305 is then mixed with the
audio signal of the clip 215. The master 210 receives the mixed
audio signal and outputs a resulting mixed audio signal.
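The ordered effect chain of FIG. 4 amounts to function composition: each effect receives the output of the previous one. The sketch below assumes effects are plain functions over sample lists; `delayed`, `apply_chain`, and the gain and delay values are hypothetical illustrations, not actual reverb or echo algorithms.

```python
from functools import reduce
from typing import Callable, List, Sequence

Signal = List[float]
Effect = Callable[[Signal], Signal]


def delayed(gain: float, delay: int) -> Effect:
    """Build a toy effect that outputs a delayed, attenuated copy of its input."""
    def effect(signal: Signal) -> Signal:
        return [gain * signal[i - delay] if i >= delay else 0.0
                for i in range(len(signal))]
    return effect


def apply_chain(signal: Signal, chain: Sequence[Effect]) -> Signal:
    """Feed the signal through each effect in order, first to last."""
    return reduce(lambda current, effect: effect(current), chain, signal)


toy_reverb = delayed(gain=0.5, delay=1)   # first effect in the chain
toy_echo = delayed(gain=0.5, delay=2)     # second effect in the chain
dialog_chain = [toy_reverb, toy_echo]

clip_215 = [1.0, 0.0, 0.0, 0.0]
chain_out = apply_chain(clip_215, dialog_chain)
master_out = [dry + wet for dry, wet in zip(clip_215, chain_out)]

print(chain_out)    # [0.0, 0.0, 0.0, 0.25]
print(master_out)   # [1.0, 0.0, 0.0, 0.25]
```

Keeping the chain as an ordered list makes the order explicit; reversing the list would apply the echo before the reverb.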
[0071] Referring back to FIG. 1, when the effect does not require
data of one or more clips to be routed, process 100 identifies (at
120) properties of the identified effect. As mentioned, different
effects can have different properties. For example, an image
distortion effect can have one set of parameters for distorting an
image, while an echo effect can have another set of parameters for
adding the echo to an audio signal.
[0072] The process 100 then applies (at 125) the effect to each
identified clip. Specifically, each particular effect is applied to
the clip based on the properties of the particular effect. The
media editing application of some embodiments applies one or more
effects directly on each clip without using the "send". One example
of such technique is applying effects as "insert" effects.
Different from the "send" effect, an "insert" effect simply
processes the incoming audio signal and outputs a processed audio
signal. In using this technique, the audio signals of different
clips are not routed over an auxiliary bus to an effect module to
be processed as one combined audio signal. Also, the output of an
effect module is not mixed back in with one or more original audio
signals. For example, the output audio data of a filter or an
effect that compresses or distorts input audio data does not need
to be mixed back in with the original uncompressed or undistorted
audio data. Similarly, the output of an equalizer that reduces the
bass of a clip does not need to be mixed back in with the original
clip, as doing so would defeat the purpose of reducing the bass in the
first place. Many different audio effects or audio filters (e.g.,
equalizers, compressors, band-pass filters) are applied as "insert"
effects, in some embodiments.
[0073] FIG. 5 shows a signal flow diagram 500 that conceptually
illustrates compressing audio signals of clips 305-310 based on the
clips' association with a "Music" tag. Specifically, this figure
illustrates how the media editing application of some embodiments
compresses the audio signals of clips 305-310 as insert effects
instead of routing the audio signals using the "send and return".
As shown, the figure includes a set of compression modules 505. In
some embodiments, the set of compression modules 505 represents
separate instances of the same compression module that are linked
parametrically. For example, the output of these instances can be
based on the same set of compression parameters or settings.
[0074] As shown in FIG. 5, the audio signals of clips 305-310 are
individually compressed by the set of compression modules 505. The
compressed audio signals of clips 305-310 are then output to the
master 210. As indicated by the "+" symbol, the compressed audio
signals of the clips 305 and 310, and the audio signal of clip 315
are then mixed. This mixed audio signal is received at the master
210 that defines the output audio signal for the composite
presentation.
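One way to picture "separate instances of the same compression module that are linked parametrically" is several objects that hold a reference to one shared settings object. The sketch below is an assumption-laden illustration: `CompressorParams`, `Compressor`, and the simple threshold-and-ratio curve are hypothetical and are not taken from the application. Each instance processes its own clip as an insert, so the processed signal replaces the original instead of being mixed back with it.

```python
import math
from dataclasses import dataclass
from typing import List

Signal = List[float]


@dataclass
class CompressorParams:
    """Settings shared by every linked compressor instance (hypothetical)."""
    threshold: float = 0.5
    ratio: float = 2.0


class Compressor:
    """Toy insert effect: levels above the threshold are reduced by the ratio."""

    def __init__(self, params: CompressorParams) -> None:
        self.params = params  # a shared reference, not a copy

    def process(self, signal: Signal) -> Signal:
        p = self.params
        out = []
        for sample in signal:
            level = abs(sample)
            if level > p.threshold:
                level = p.threshold + (level - p.threshold) / p.ratio
            out.append(math.copysign(level, sample))
        return out


shared = CompressorParams()
comp_305 = Compressor(shared)   # instance for clip 305
comp_310 = Compressor(shared)   # instance for clip 310

clip_305 = [1.0, 0.25]
clip_310 = [-1.0, 0.5]
clip_315 = [0.0, 0.25]          # not tagged "Music", so not compressed

out_305 = comp_305.process(clip_305)   # [0.75, 0.25]
out_310 = comp_310.process(clip_310)   # [-0.75, 0.5]
master_out = [a + b + c for a, b, c in zip(out_305, out_310, clip_315)]
print(master_out)  # [0.0, 1.0]

shared.ratio = 4.0                     # one change affects every instance
print(comp_310.process([1.0]))         # [0.625]
```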
[0075] Referring back to FIG. 1, process 100 determines (at 130)
whether any other clip is tagged with a different tag having an
associated effect. When another clip is tagged with a different
tag, process 100 returns to 105. Otherwise, the process outputs
(at 155) the composite presentation. For example, the media editing
application may output the composite presentation by playing a
real-time preview. Alternatively, the media editing application
renders and/or mixes the composite presentation to storage (e.g.,
for playback at another time). The process then ends.
[0076] Some embodiments perform variations on process 100. For
instance, process 100 of some embodiments identifies each effect in
an effect chain. Specifically, before identifying a next tag with
an effect, process 100 applies each effect in the chain to a set of
tagged clips. Also, some embodiments might take into account that a
clip can be a compound clip (described below). In some such
embodiments, process 100 identifies each outer metadata tag of the
compound clip and each inner tag of the compound clip's nested
clips. Process 100 then applies one or more effects to the compound
clip and/or the inner clips according to this identification.
Several examples of applying effects to compound clips are described
below by reference to FIGS. 6-10.
[0077] In the examples described above, different effects are
applied using different techniques. In some embodiments, the media
editing application automatically determines whether to apply an
effect by using an "insert" or by using the "send and return". For
instance, the media editing application of some embodiments
automatically applies a first type of effect (e.g., reverb, echo)
using the "send and return", while applying a second type of effect
(e.g., compressor, equalizer) as an "insert" effect. In conjunction
with this automatic determination, or instead of it, the media
editing application of some embodiments provides one or more
user-selectable items for specifying whether to apply an effect as
a "send" effect or an "insert" effect.
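The automatic choice between "send and return" and "insert", with an optional user override, can be sketched as a small lookup. The effect names come from the examples in the text; the function name, the string values, and the fallback to "insert" for unlisted effects are assumptions made only for this illustration.

```python
from typing import Optional

# Effect types named in the text as typical of each technique.
SEND_EFFECTS = {"reverb", "echo"}
INSERT_EFFECTS = {"compressor", "equalizer"}


def routing_for(effect: str, user_choice: Optional[str] = None) -> str:
    """Return "send" or "insert" for an effect.

    A user selection, when given, takes precedence over the automatic choice.
    Falling back to "insert" for unlisted effects is an assumption.
    """
    if user_choice is not None:
        if user_choice not in ("send", "insert"):
            raise ValueError("user_choice must be 'send' or 'insert'")
        return user_choice
    if effect in SEND_EFFECTS:
        return "send"
    return "insert"


print(routing_for("reverb"))                        # send
print(routing_for("equalizer"))                     # insert
print(routing_for("reverb", user_choice="insert"))  # insert
```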
II. Applying Effects to Compound Clips
[0078] The media editing application of some embodiments allows
users to create compound clips from multiple different clips. In
some embodiments, a compound clip is any combination of clips
(e.g., in a composite display area or in a clip browser as
described below by reference to FIG. 22) and nests clips within
other clips. Compound clips, in some embodiments, contain video and
audio clips as well as other compound clips. As such, each compound
clip can be considered a mini project or a mini composite
presentation, with its own distinct project settings. In some
embodiments, compound clips function just like any other clips.
That is, the application's user can add the compound clips to a
project or composite display area, trim them, tag them, retime
them, and add effects and transitions.
[0079] FIG. 6 illustrates an example of specifying an effect for a
compound clip. Specifically, this figure illustrates (1) creating a
compound clip from multiple different clips, (2) tagging the
compound clip with a metadata tag, and (3) specifying an effect for
the metadata tag. Five operational stages 605-625 of the GUI are
shown in this figure.
[0080] As shown, the figure includes a composite display area 660
and a tag display area 665. The composite display area 660 provides
a visual representation of the composite presentation (or project)
being created with the media editing application. Specifically, it
displays one or more geometric shapes that represent one or more
media clips that are part of the composite presentation. In some
embodiments, the tag display area 665 displays one or more pieces
of metadata associated with different media clips.
[0081] The first stage 605 shows the tag display area 665 and the
composite display area 660. The tag display area 665 includes a
metadata tag 655 that is associated with an add effect control 660.
The composite display area 660 displays representations of three
clips 630-640 that are not tagged with the metadata tag 655. In
this first stage, the user selects the clip 630 by selecting a
corresponding representation in the composite display area 660.
[0082] The second stage 610 shows the creation of a compound clip
from clips 630 and 635. Specifically, after selecting these two
clips, the user selects a selectable option 640 (e.g., context menu
item) to create the compound clip 650 as illustrated in the third
stage 615. In some embodiments, the media editing application
provides several different controls (not shown) for creating the
compound clip. Several examples of such controls include (1) a text
field for inputting a name for the compound clip, (2) a first set
of controls for specifying video properties (e.g., automatically
based on the properties of the first video clip, custom), and (3) a
second set of controls for specifying audio properties (e.g.,
default settings, custom).
[0083] The third stage 615 illustrates tagging the compound clip
650 with the first metadata tag 655. Here, a tagging option 645 is
used to tag the compound clip 650. However, different embodiments
provide different ways for tagging a compound clip. The fourth
stage 620 illustrates the selection of an add effect control 660.
The selection causes an add effect window 665 with a list of
effects to appear as illustrated in the fifth stage 625. As shown
in the fifth stage 625, the add effect window 665 displays several
different effects from which the user can choose to associate
with the first metadata tag 655. The user then selects the reverb
effect to associate it with the first metadata tag 655.
[0084] Once the effect is set, the media editing application
applies the reverb effect to the compound clip 650 in order to
produce a resulting composite presentation. For example, the media
editing application of some embodiments applies the reverb effect
to the compound clip 650 to play a real-time preview of the
presentation. Alternatively, the media editing application renders
or outputs the sequence in the composite display area 660 to
storage for playback at another time.
[0085] FIG. 7 shows a signal flow diagram 700 that conceptually
illustrates the application of the reverb effect on the compound
clip 650, in some embodiments. Specifically, this figure
illustrates an example of how audio signals of clips 630-640 (in
the composite display area 660 of FIG. 6) are routed to output a
mixed audio signal with the reverb effect. As shown, the figure
includes the clips 630-640, the master 210, and the reverb FX
module 205.
[0086] As shown in FIG. 7, the audio signals of clips 630 and 635
are mixed for the compound clip 650. The mixed audio signal of the
compound clip 650 is sent to the reverb FX module 205. The reverb
FX module 205 processes the received audio signal and returns the
audio signal containing the reverb effect to the master 210. The
master 210 receives a mixed audio signal containing the audio
signal of the compound clip 650, the audio signal of clip 640, and
the audio signal containing reverb effect. The master 210 then
outputs a resulting mixed audio signal.
[0087] In the example illustrated in FIG. 7, the "send" is
performed on the audio signal of the compound clip 650.
Alternatively, the media editing application of some embodiments
allows its users to add "insert" effects for compound clips tagged
with a metadata tag. Several examples of such "insert" effects are
described above by reference to FIG. 5.
[0088] In the previous example, a compound clip is tagged with a
metadata tag that is associated with an effect. Also, the nested
clips of the compound clip are not tagged with this metadata tag.
Accordingly, the effect associated with the compound clip's tag is
applied to the audio signal of the compound clip. In some cases,
one or more inner clips of the compound clip are tagged with a
metadata tag. In order to simplify the discussion below, a compound
clip's tag will be referred to as an outer tag, while the tag of
the inner clip of the compound clip will be referred to as an inner
tag. Also, in several examples below, the outermost tag refers to
the tag of the compound clip that is not contained by another
compound clip.
[0089] FIG. 8 shows a signal flow diagram 800 that conceptually
illustrates the application of the reverb effect on an inner clip
805 of a compound clip 820. Specifically, this figure illustrates
an example of how audio signals of clips 805, 815, and 810 (e.g.,
in the composite display area 660 of FIG. 6) are routed to output a
mixed audio signal with the reverb effect. In this example, the
inner clip 805 has been tagged with a "Dialog" tag with a reverb
effect, while the compound clip 820 is not tagged with any tag.
[0090] As shown, the audio signal of clip 805 is routed to the
reverb FX module 205. This is because the clip 805 is tagged with
the "Dialog" tag that is associated with a reverb effect. In other
words, even though the clip 805 is a nested clip of the compound
clip 820, the media editing application of some embodiments
identifies each inner tag of the compound clip's nested clips to
apply one or more effects. Here, the reverb FX module 205 applies
the reverb effect to the received audio signal and returns an audio
signal containing the reverb effect to the master 210. As indicated
by the "+" symbol, the audio signals of clips 805 and 810 are
combined for the compound clip 820. The audio signal of the
compound clip 820, the audio signal containing the reverb effect
for clip 805, and the audio signal of clip 815 are then mixed. The
master 210 receives the mixed audio signal and outputs a resulting
mixed audio signal.
[0091] In the example described above, the output of the reverb FX
module 205 is sent to the master 210 instead of being mixed in as
part of the compound clip 820. This is because the media editing
application of some embodiments defines a separate auxiliary
("aux") bus or virtual pathway for one or more effects associated
with a metadata tag. In some embodiments, this aux bus always
outputs to the master.
[0092] FIG. 9 shows a signal flow diagram 900 that conceptually
illustrates how audio signals of several clips are routed to a
particular aux bus based on the clips' association with a metadata
tag. This example is similar to FIG. 8. However, the "Dialog" tag
is associated with a chain of effects that includes a reverb effect
and an echo effect. Also, the clips 805 and 815 are both tagged
with the "Dialog" tag.
[0093] As shown in FIG. 9, the audio signal of each clip that is
tagged with the "Dialog" tag is routed to the "Dialog" aux bus.
Specifically, the audio signals of clips 805 and 815 are both
routed to this aux bus. The audio signals are routed to the aux bus
regardless of whether the clip is a nested clip (as in clip 805) or
a non-nested clip (as in clip 815). The audio signals of clips 805
and 815 are then combined and sent over the aux bus in order to
apply the chain of effects to the clips 805 and 815.
[0094] In the example illustrated in FIG. 9, the combined audio
signal is first routed to the reverb FX module 205. To continue the
chain of effects, the audio signal containing the reverb effect is
received at the echo FX module 305. The echo FX module 305
processes the incoming audio signal and outputs a processed audio
signal. The output of the echo FX module 305 is returned to the
master 210. As indicated by the "+" symbol, the audio signals of
clips 805 and 810 are combined for the compound clip 820. The audio
signal of the compound clip 820, the audio signal from the echo FX
module 305, and the audio signal of clip 815 are then mixed. The
master 210 receives the mixed audio signal and outputs a resulting
mixed audio signal.
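Collecting every clip that belongs on a tag's aux bus, whether nested or not, is essentially a tree walk. The sketch below models the arrangement of FIG. 9 with a hypothetical `Clip` data class and a hypothetical `clips_for_bus` helper. Stopping the descent at a compound clip that itself carries the tag is an assumption of this sketch, chosen so that the same signal is not sent to the bus twice.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Clip:
    """Hypothetical clip model: a compound clip is a clip with children."""
    name: str
    tag: Optional[str] = None
    children: List["Clip"] = field(default_factory=list)


def clips_for_bus(clips: List[Clip], tag: str) -> List[Clip]:
    """Return every clip whose signal is routed to the aux bus for `tag`."""
    routed: List[Clip] = []
    for clip in clips:
        if clip.tag == tag:
            routed.append(clip)            # send this clip's signal as a whole
        else:
            routed.extend(clips_for_bus(clip.children, tag))  # look inside
    return routed


# The arrangement of FIG. 9: compound clip 820 nests clips 805 and 810.
clip_805 = Clip("805", tag="Dialog")
clip_810 = Clip("810")
clip_820 = Clip("820", children=[clip_805, clip_810])
clip_815 = Clip("815", tag="Dialog")

dialog_bus = clips_for_bus([clip_820, clip_815], "Dialog")
print([clip.name for clip in dialog_bus])  # ['805', '815']
```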
[0095] In some cases, a compound clip is tagged with the same tag
as one or more of the compound clip's inner clips. In some
embodiments, the media editing application identifies an
appropriate level of a compound clip to apply the effect such that
the effect is not reapplied at another level. For example, when the
inner clip's tag is the same as the compound clip's outer tag, the
media editing application of some embodiments identifies the
compound clip's outer tag and performs the editing operations based
on the compound clip's outer tag. This prevents the same effect
being applied to the compound and one or more of the compound
clip's nested clips.
[0096] FIG. 10 conceptually illustrates a process 1000 that some
embodiments use to apply one or more effects to a compound clip
and/or the compound clip's nested clips. In some embodiments,
process 1000 is performed by a media editing application. Process
1000 may be performed in conjunction with several other processes
(e.g., including FIGS. 15, 16, and 20 described below). Process
1000 will be described by reference to FIG. 11 that illustrates
applying an effect to a compound clip based on the compound clip's
outer tag.
[0097] As shown, process 1000 identifies (at 1005) a clip tagged
with a particular metadata tag in a composite presentation. Process
1000 then determines (at 1010) whether the clip tagged with the
particular metadata tag is a compound clip. In the example
illustrated in FIG. 11, the clip 820 is a compound clip tagged with
a particular metadata tag. Specifically, the compound clip 820 is
associated with a "Dialog" tag having a reverb effect.
[0098] When the clip is not a compound clip, process 1000 proceeds
to 1035, which is described below. Otherwise, process 1000
identifies (at 1015) the particular metadata tag of the compound
clip and each inner tag of the compound clip's nested clips.
Process 1000 then determines (at 1020) whether any inner tag of the
compound clip's nested clips is different from the outer tag of the
compound clip.
[0099] When no inner tag is different than the outer tag or no
nested clip is tagged with a tag associated with an effect, process
1000 performs (at 1025) one or more operations based on the outer
tag. In the example illustrated in FIG. 11, as the compound clip's
outer tag takes precedence over the inner tag, the effect
associated with the inner tag of inner clip 805 is not applied to
this inner clip. Instead, the effect is applied to the mixed audio
signal of the compound clip 820. Specifically, the audio signals of
clips 810 and 805 are mixed as a mixed audio signal for the
compound clip 820. As the compound clip is tagged with the "Dialog"
tag, the mixed audio signal of the compound clip is then sent to
the reverb FX module 205. Although the clip 805 is also tagged with
the "Dialog" tag, the clip's audio signal is not sent to the reverb
FX module. This is because the media editing application identified
that the compound clip's outer tag is the same as the inner tag of
the nested clip 805.
[0100] The reverb FX module 205 applies the reverb effect to the
received audio signal and returns an audio signal containing the
reverb effect to the master 210. The mixed audio signal of the
compound clip 820, the audio signal containing the reverb effect
from the reverb FX module 205, and the audio signal of clip 815 are
then mixed. The master 210 then receives the mixed audio signal and
outputs a resulting audio signal.
[0101] Referring back to FIG. 10, when one or more inner tags of
the compound clip's nested clips are different, process 1000
performs (at 1030) one or more operations based on each different
inner tag. Process 1000 also performs (at 1030) one or more
operations based on the compound clip's outer tag. In some
embodiments, process 1000 applies the different effects following a
sequence of operations (e.g., as represented in a signal chain or
render graph). For example, when a reverb effect is associated with
a first metadata tag of a compound clip's nested clip, process 1000
of some embodiments first applies the reverb effect to the compound
clip's nested clip. Once the effect is applied to the nested clip,
process 1000 combines the nested clip with one or more other clips
in order to apply a second effect associated with a second metadata
tag of the compound clip.
[0102] Process 1000 then determines (at 1035) whether there is any
other tagged clip in the composite presentation. When there is
another tagged clip, process 1000 returns to 1005 which was
described above. Otherwise, process 1000 ends.
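The decision logic of process 1000 for a single clip can be sketched as a function that returns an ordered plan of (clip, tag) pairs. The `Clip` class and `plan_effects` are hypothetical names. The sketch handles one level of nesting only, and its treatment of a non-compound clip (applying the clip's own tag) is an assumption, since the text only states that the process moves on to the next tagged clip in that case.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Clip:
    """Hypothetical clip model: a compound clip is a clip with children."""
    name: str
    tag: Optional[str] = None
    children: List["Clip"] = field(default_factory=list)


def plan_effects(clip: Clip) -> List[Tuple[str, str]]:
    """Return (clip name, tag) pairs in the order the effects would be applied."""
    if not clip.children:                       # 1010: not a compound clip
        return [(clip.name, clip.tag)] if clip.tag else []

    # 1015/1020: collect inner tags that differ from the outer tag.
    plan = [(inner.name, inner.tag)
            for inner in clip.children
            if inner.tag and inner.tag != clip.tag]   # 1030: inner effects first

    if clip.tag:
        plan.append((clip.name, clip.tag))      # 1025/1030: outer tag last
    return plan


# FIG. 11: the outer "Dialog" tag matches the inner tag of clip 805.
fig_11 = Clip("820", tag="Dialog",
              children=[Clip("805", tag="Dialog"), Clip("810")])
print(plan_effects(fig_11))   # [('820', 'Dialog')]

# Different inner tag: the inner effect is applied before the outer effect.
mixed = Clip("820", tag="SFX",
             children=[Clip("805", tag="Dialog"), Clip("810")])
print(plan_effects(mixed))    # [('805', 'Dialog'), ('820', 'SFX')]
```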
[0103] Some embodiments perform variations on process 1000. For
example, the specific operations of process 1000 may not be
performed in the exact order shown and described. The specific
operations may not be performed in one continuous series of
operations, and different specific operations may be performed in
different embodiments.
III. Audio Meters
[0104] In many of the examples described above, the audio signals of
several clips are mixed and output as one combined audio signal for
a composite presentation. In some cases, the mixed audio signal of
a compound clip is again combined with an audio signal of another
clip to output a composite presentation. In some embodiments, the
media editing application displays the audio level of a set of one
or more clips even though the set of clips has been mixed with
other clips.
[0105] A. Displaying Audio Levels
[0106] FIG. 12 illustrates an example of displaying the audio
levels of several mixed clips. Specifically, this figure
illustrates meters that indicate the audio levels of the clips even
though the clips have been mixed with other clips. Three
operational stages 1205-1215 are shown in this figure. The
composite display area 660 is the same as the one described above
by reference to FIG. 6. The figure also includes an audio mixer
1220.
[0107] In some embodiments, the media editing application provides
audio meters and/or audio controls for metadata tags associated
with different clips. An example of this is illustrated in FIG. 12.
Specifically, the audio mixer 1220 includes a corresponding audio
meter (1245 or 1255) and a level control (1240 or 1250) for each of
a first metadata tag specified as "Dialog" and a second metadata
tag specified as "SFX". Several other examples of providing
different controls (e.g., audio meters, audio controls) for
different metadata tags are described below by reference to FIG.
16.
[0108] The first stage 1205 shows the composite display area 660
and the audio mixer 1220 prior to playing the composite
presentation. As shown, clip 1225 is tagged with the "Dialog" tag.
Compound clip 1235 includes several nested clips 1260 and 1265. The
compound clip 1235 is tagged with the "SFX" tag. The nested clips
1260 and 1265, and clip 1230 are not tagged with the "Dialog" tag
or the "SFX" tag.
[0109] The second stage 1210 shows the playback of the composite
presentation represented in the composite display area 660 at a
first instance in time. To output the composite presentation's
mixed audio signal, the audio signals of the nested clips 1260 and
1265 have been mixed for the compound clip 1235. In addition, the
audio signals of the clips 1225 and 1230 have been mixed with the
audio signal of the compound clip 1235. In this second stage 1210,
the audio meter 1245 displays the audio level of the clip 1225 even
though the clips in the composite display area 660 have been mixed
to play the composite presentation.
[0110] The third stage 1215 shows the playback of the composite
presentation at a second instance in time. Similar to the previous
stage, the audio meter 1255 displays the audio level of the
compound clip 1235 even though the compound clip has been mixed
with the clips 1225 and 1230.
[0111] B. Sending an Audio Signal Over a Meter Bus
[0112] In the example illustrated in FIG. 12, the media editing
application provides meters that indicate the audio levels of clips
that have been mixed with other clips. To display audio levels of
these clips, the media editing application of some embodiments
creates one or more meter buses and routes audio signals of the
clips over the meter buses.
[0113] FIG. 13 shows a signal flow diagram 1300 that conceptually
illustrates sending an audio signal of a clip over a meter bus 1305
in order to display the clip's audio level during playback of a
mixed audio signal of a composite presentation. As shown, the
figure includes clips 1310 and 1315 that are mixed as a compound
clip 1320. The figure also includes the meter bus 1305 and the
master 210. The master 210 is the same as the one described above
by reference to FIG. 2.
[0114] As shown in FIG. 13, the signal flow 1300 includes a chain or a
sequence of operations that is performed on the clips 1310 and
1315. However, this signal chain does not include a place to
determine the audio level of clip 1310 once it has been summed with
clip 1315. Specifically, in the signal chain, the clips 1310 and
1315 are mixed as the compound clip 1320. The mixed audio signal of
clips 1310 and 1315 is then output through the master 210. Also,
the mixed audio signal cannot be used to determine how much the
clip 1310 contributed to the overall mix.
[0115] As the audio level of clip 1310 cannot be determined using
the mixed audio signal, the clip's audio signal is sent over the
meter bus 1305. This meter bus 1305 is not for playing sound but
for metering. Specifically, in the example illustrated in FIG. 13,
the meter bus 1305 is for displaying the audio level of each clip
tagged with the "Dialog" tag. In some embodiments, the clip's audio
is routed over the meter bus 1305 to a component (not shown) that
translates the audio signal to one or more meters. For example, the
media editing application of some embodiments determines a set of
decibel (dB) values. The set of dB values is then used to meter the
audio level of the clip 1310.
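The text does not specify how the audio signal on the meter bus is translated into dB values. One common approach, shown here purely as an assumption, is to take the peak sample of each short block and convert it to decibels relative to full scale with 20·log10(peak). The function names, the block size, and the -120 dB floor used for silence are all hypothetical.

```python
import math
from typing import List

Signal = List[float]


def peak_db(block: Signal, floor_db: float = -120.0) -> float:
    """Peak level of a block in dB relative to full scale (1.0)."""
    peak = max((abs(sample) for sample in block), default=0.0)
    if peak == 0.0:
        return floor_db                      # log10(0) is undefined
    return max(floor_db, 20.0 * math.log10(peak))


def meter_values(signal: Signal, block_size: int) -> List[float]:
    """One dB value per block of the signal carried on the meter bus."""
    return [peak_db(signal[start:start + block_size])
            for start in range(0, len(signal), block_size)]


meter_bus_signal = [1.0, -0.5, 0.0, 0.0]
print(meter_values(meter_bus_signal, block_size=2))  # [0.0, -120.0]
```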
[0116] FIG. 14 shows a signal flow diagram 1400 that conceptually
illustrates routing a combined audio signal of several clips over
the meter bus 1305 for the purposes of displaying the clips' audio
level. This example is similar to FIG. 13. However, in addition to
the clip 1310, the figure includes a clip 1405 that is tagged with
the "Dialog" tag.
[0117] As shown in FIG. 14, the clips 1310 and 1315 are mixed as
the compound clip 1320. The mixed audio signal of the clips 1310
and 1315 is then combined with the audio signal of the clip 1405.
The composite audio signal of the clips 1310, 1315, and 1405 is then
output through the master 210. To display the audio level of the
"Dialog" clips 1310 and 1405, the audio signals of these clips are
combined and sent over the meter bus 1305. Similar to the example
of FIG. 13, the combined audio signal is then translated into a set
of decibel values by the media editing application.
[0118] In some embodiments, the media editing application takes
into account other factors when displaying the audio level of clips
that have been mixed with other clips. The media editing application of
some embodiments scales (i.e., reduces or increases) the audio
level of one or more clips by processing later down the signal
chain. For example, in the example illustrated in FIG. 14, the
audio meter for the "Dialog" clips 1310 and 1405 should reflect the
audio level of the entire mix (e.g., as defined by the master 210).
That is, when the master's volume is set at a particular dB, the
combined audio signal of clips 1310 and 1405 should be scaled or
synchronized such that the audio meter does not indicate an audio level
that is higher than the particular dB.
[0119] C. Estimating the Volume
[0120] In the previous example, a combined audio signal of several
clips is sent over a meter bus to display the audio level of
several clips. Alternatively, the media editing application of some
embodiments estimates the audio level of the one or more clips.
That is, instead of routing the audio signal over the meter bus,
the media editing application numerically estimates the audio level
by extracting metering information from the clips prior to mixing
the clips.
[0121] FIG. 15 conceptually illustrates a process 1500 that some
embodiments use to estimate audio levels of clips that are tagged
with metadata tags. In some embodiments, process 1500 is performed
by a media editing application. As shown, process 1500 identifies
(at 1505) each clip, in a composite presentation, that is tagged
with a particular metadata tag. Process 1500 then extracts (at
1510) metering information (e.g., audio level) from each clip
tagged with the particular metadata tag.
[0122] Process 1500 then determines (at 1515) whether any other
clip is tagged with a different tag. When no other clip is tagged
with a different tag, process 1500 proceeds to 1520. Otherwise,
process 1500 returns to 1505 which was described above.
[0123] At 1520, process 1500 determines the audio level of one or
more clips based on the metering information. In the example
described above in FIG. 14, the audio level of several clips is
determined by summing the clips' audio signals and sending the
summed audio signal over the meter bus. Here, as the audio signals
are being mixed later in the signal chain, process 1500 estimates
the audio level based on the metering information (e.g., audio
signal data) extracted from each of the clips. In other words,
process 1500 estimates what the audio level would be when two or
more audio signals of different clips are added together.
[0124] In some embodiments, process 1500 estimates the audio level
by adding the power contribution of each clip. One example of such
addition is adding about 3 dB for every doubling of equal input
sources. For example, if the audio signals of two clips have an
identical volume of -10 dB, then the sum of the two signals is
estimated to be about 3 dB higher. As such, the estimated sum of
the two signals is about -7 dB. If there are four audio signals
that have the identical volume, then the sum of these signals will
be estimated to be about 6 dB higher, and so on. One example
formula for adding sound pressure levels of multiple sound sources
is shown below:
$$L_{\Sigma} = 10 \log_{10}\left(10^{L_1/10} + 10^{L_2/10} + \cdots + 10^{L_n/10}\right)\ \text{dB}$$
[0125] Here, $L_{\Sigma}$ equals the total level, and $L_1$,
$L_2$, ..., $L_n$ equal the sound pressure level (SPL) of the
separate sources in dB SPL. The formula above translates to about 3
dB per doubling of equal sources. One of ordinary skill in the art
would realize that other formulas can be used to differently sum
two or more audio signals in order to estimate the audio level.
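The formula above can be sketched in Python as follows. The function
name sum_levels_db is hypothetical, and the sketch assumes that the
sources are uncorrelated so that their powers add, which is the
assumption behind the rule of about 3 dB per doubling of equal
sources.

```python
import math


def sum_levels_db(levels_db):
    """Estimate the combined level of several sources given in dB.

    Each level is converted to a relative power, the powers are
    added, and the total is converted back to dB.
    """
    total_power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_power) if total_power > 0 else float("-inf")


# Two equal sources at -10 dB sum to about -7 dB (about 3 dB higher).
print(round(sum_levels_db([-10, -10]), 2))
# Four equal sources at -10 dB sum to about -4 dB (about 6 dB higher).
print(round(sum_levels_db([-10, -10, -10, -10]), 2))
```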
[0126] Returning to FIG. 15, process 1500 displays (at 1525) the
audio level of one or more tagged clips based on each estimated
audio level. Specifically, process 1500 displays the audio level
when playing the mixed audio signal of the composite presentation.
Process 1500 then ends.
[0127] In some embodiments, process 1500 takes into account other
factors when displaying the audio level of clips that have been
previously mixed. For example, process 1500 of some embodiments
scales (i.e., reduces or increases) the audio level of one or more
clips by processing later down the signal chain. In some
embodiments, the process estimates the audio level of the mixed
clips by identifying what each clip is contributing to the overall
mix and numerically estimating the audio level based on the
identification and the extracted metering information. For example,
when a compound clip is muted, the media editing application should
not display the audio level of the compound clip's nested clip, as
the nested clip is also muted.
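One way to picture the estimation performed by process 1500,
including the downstream factors just described, is the sketch below.
The Clip structure, its field names, and the function
estimate_levels_by_tag are hypothetical. The sketch assumes that
metering information is a single dB value per clip, that later
processing can be summarized as one gain in dB, and that a nested
clip is flagged as muted whenever its enclosing compound clip is
muted.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Clip:
    name: str
    tag: str                          # metadata tag, e.g. "Dialog"
    level_db: float                   # metering info extracted before mixing
    downstream_gain_db: float = 0.0   # assumed gain applied later in the chain
    muted: bool = False               # True if the clip or its compound is muted


def estimate_levels_by_tag(clips):
    """Estimate one audio level per tag by adding power contributions."""
    power_by_tag = defaultdict(float)
    for clip in clips:
        if clip.muted:
            # A muted clip contributes nothing, but its tag is still listed.
            power_by_tag[clip.tag] += 0.0
            continue
        effective_db = clip.level_db + clip.downstream_gain_db
        power_by_tag[clip.tag] += 10 ** (effective_db / 10)
    return {
        tag: (10 * math.log10(power) if power > 0 else float("-inf"))
        for tag, power in power_by_tag.items()
    }


clips = [
    Clip("clip A", "Dialog", -10.0),
    Clip("clip B", "Dialog", -10.0),
    Clip("clip C", "Music", -20.0, muted=True),
]
print(estimate_levels_by_tag(clips))
```

Because only numbers are combined, no audio has to be routed anywhere
to produce the per-tag estimate.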
[0128] In some cases, estimating the audio level has several
advantages over routing audio signals over meter buses. For
example, this technique can be less computationally expensive than
using meter buses. This is because the meter buses do not have to
be created and the audio signals of different clips do not have to
be routed over these meter buses.
IV. Parameter Controls and Propagation
[0129] The media editing application of some embodiments uses
metadata to provide user interface controls. In some embodiments,
these controls are used to display properties of tagged clips
and/or specify parameters that affect the tagged clips. Examples of
such user interface controls include audio meters, volume controls,
different controls for modifying images (e.g., distorting,
blurring, changing color), etc.
[0130] FIG. 16 conceptually illustrates a process 1600 that some
embodiments use to construct user interface controls based on
metadata tags. In some embodiments, process 1600 is performed by a
media editing application. As shown, process 1600 identifies (at
1605) each clip tagged with a particular metadata tag. In some
embodiments, one or more clips are categorized with a particular
role or category. For example, several clips may be assigned one
audio role of "Dialog", "Music", or "SFX". Process 1600 then
provides one or more user interface controls. Here, the user
interface controls are also associated with the tagged clips. That
is, the user interface controls are associated so that these
controls can be used to display or modify properties of the tagged
clips.
[0131] Process 1600 then determines (at 1615) whether any other
clip is tagged with a different tag. When no other clip is tagged
with a different tag, process 1600 proceeds to 1620. Otherwise,
process 1600 returns to 1605 which was described above. Process
1600 then receives (at 1620) adjustment of parameters through one
or more corresponding user interface controls. Process 1600 then
outputs (at 1625) the sequence of clips in the composite
presentation by propagating the adjusted parameter to one or more
of corresponding tagged clips. Process 1600 then ends.
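The operations of process 1600 can be sketched as grouping clips by
tag, associating one control with each group, and propagating an
adjusted parameter to every clip in the group. The Clip structure,
the function names, and the representation of a control as a list of
associated clips are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    name: str
    tag: str
    params: dict = field(default_factory=dict)


def build_controls(clips):
    """Associate one control (here, a list of clips) with each tag."""
    controls = {}
    for clip in clips:
        controls.setdefault(clip.tag, []).append(clip)
    return controls


def propagate(controls, tag, param, value):
    """Apply an adjusted parameter to every clip tagged with `tag`."""
    for clip in controls.get(tag, []):
        clip.params[param] = value


clips = [Clip("clip A", "Dialog"), Clip("clip B", "Music"), Clip("clip C", "Dialog")]
controls = build_controls(clips)
propagate(controls, "Dialog", "volume_db", -6.0)
print([(c.name, c.params) for c in clips])
```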
[0132] Some embodiments allow a compound clip to be tagged with the
same tag as one or more of the compound clip's inner clips. In some
embodiments, the media editing application identifies an
appropriate level in a render graph or signal chain to adjust
parameters such that the parameters are not readjusted at another
level. For example, when the inner clip's tag is the same as the
compound clip's outer tag, the media editing application of some
embodiments identifies the compound clip's outer tag and performs
the adjustment based on the compound clip's outer tag. This
prevents the same adjustment being applied at multiple different
levels.
[0133] FIG. 17 shows a data flow diagram 1700 that conceptually
illustrates an example of adjusting parameters of several clips at
different levels of a hierarchy, in some embodiments. As shown in
FIG. 17, clips 1705 and 1710 are combined for the compound clip
1715. The nested clips 1705 and 1710 are tagged with the "Dialog"
tag, while the compound clip 1715 is not tagged with this tag. As
the compound clip 1715 is not tagged with the same tag as its
nested clips 1705 and 1710, the parameter adjustment occurs for
this compound clip at the level of the nested clips. Similarly, the
adjustment for compound clip 1740 occurs at the level of the nested
clips 1735 and 1730.
[0134] In the example illustrated in FIG. 17, the clips 1720 and
1725 are combined for the compound clip 1730. Here, the nested clip
1720 and the compound clip 1730 are tagged with the "Dialog" tag,
while the nested clip 1725 is not tagged with this tag. As the compound
clip 1730 includes the same tag as one of its nested clips (i.e.,
the clip 1720), the adjustment occurs for this compound clip at the
level of the compound clip. Similarly, the adjustment for compound
clip 1755 occurs at the level of the compound clip. This is because
the nested clips 1745 and 1755 are tagged with the same "Dialog"
tag as the compound clip 1755.
[0135] In some cases, the compound clip's outer tag can be
different from one or more tags of its inner clips. When the
compound clip's outer tag is different from the inner clip's tag,
the media editing application of some embodiments adjusts one set
of parameters associated with the inner clip based on the inner
clip's tag. Also,
the media editing application adjusts another set of parameters
associated with the compound clip based on the compound clip's
tag.
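The rule for choosing the level at which an adjustment is applied
can be sketched as a walk over the clip hierarchy that stops at the
first clip carrying the tag in question. The Node structure and the
function adjustment_targets are hypothetical; the clip numbers in the
usage lines follow the examples of FIG. 17 described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    tag: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def adjustment_targets(node, tag):
    """Return the names of clips at which a `tag` adjustment is applied.

    If a clip carries the tag, the adjustment is applied there and the
    walk does not descend further, so the same adjustment is never
    applied again at a lower level.
    """
    if node.tag == tag:
        return [node.name]
    targets = []
    for child in node.children:
        targets.extend(adjustment_targets(child, tag))
    return targets


# Compound clip 1715 is untagged; its nested clips are tagged "Dialog".
c1715 = Node("1715", None, [Node("1705", "Dialog"), Node("1710", "Dialog")])
# Compound clip 1730 and nested clip 1720 are both tagged "Dialog".
c1730 = Node("1730", "Dialog", [Node("1720", "Dialog"), Node("1725")])
print(adjustment_targets(c1715, "Dialog"))
print(adjustment_targets(c1730, "Dialog"))
```

When the outer tag differs from an inner tag, calling the function
once per tag yields the compound clip for its own tag and the inner
clip for the inner tag.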
[0136] In some embodiments, the media editing application does not
support tagging compound clips. In some such embodiments, the
adjustment is only made at the nested clip level. For example, when
several nested clips of a compound clip are tagged with a "Dialog"
tag, an adjustment to a control relating to the "Dialog" tag will
adjust parameters associated with these nested clips and not the
combined clip of the compound clip.
V. Outputting Content to Different Tracks
[0137] The media editing application of some embodiments allows a
composite presentation to be output to different tracks (e.g.,
different files) based on metadata associated with media content.
Outputting content to different tracks is particularly useful
because one track can easily be replaced with another track. For
example, when audio content is mixed, a movie studio cannot replace
a dialog track in one language with another dialog track in another
language. With audio content output to different tracks (e.g.,
audio files), the movie studio can easily replace one dialog track
with another such that the dialog is in a different language.
[0138] A. Specifying Output Tracks
[0139] FIG. 18 illustrates specifying output tracks for clips based
on metadata that is associated with the clips. Six operational stages
1805-1830 of the GUI are shown in this figure. In this example, the
tag display area 665 includes an output control (1835, 1840, or
1845) for each metadata tag (1850, 1855, or 1860). The output
control allows the application's user to specify an output track or
stem for each clip associated with a metadata tag.
[0140] The first stage 1805 shows the tag display area 665 and the
composite display area 660. The tag display area 665 includes a
list of metadata tags. This list includes a first metadata tag 1850
specified as "Dialog", a second metadata tag 1855 specified as
"Music", and a third metadata tag 1860 specified as "SFX". The
first metadata tag 1850 is associated with a first output control
1835, the second metadata tag 1855 with a second output control
1840, and the third metadata tag 1860 with a third output control
1845.
[0141] The composite display area 660 displays representations of
five clips 1865-1885. The clips 1865 and 1885 are tagged with the
first metadata tag 1850, the clip 1870 is tagged with the second
metadata tag 1855, and the clips 1875 and 1880 are tagged with the
third metadata tag 1860. To specify an output track for the clips
1865 and 1885 that are tagged with the first metadata tag 1850, the
user selects the output control 1835. The selection causes a track
control 1890 to appear as illustrated in the second stage 1810.
[0142] The second stage 1810 illustrates specifying an output track
for the clips 1865 and 1885 tagged with the first metadata tag
1850. Specifically, the user specifies the output track to be
"Track 1" by using the track control 1890. In some embodiments, the
media editing application provides various different options for
outputting content. Several example output options include
compression type and settings, bit rate, bit size, mono or stereo,
name of file, etc. For instance, when outputting audio content
containing dialog to a separate file, the media editing application
of some embodiments displays different user interface items that
allow the application's user to define the output audio clip such
as the type of audio file, compression settings, etc.
[0143] The third and fourth stages 1815 and 1820 illustrate
specifying an output track for the clip 1870 that is tagged with
the second metadata tag 1855. To specify the output track, the user
selects the output control 1840 that is associated with the second
metadata tag 1855. The selection causes the track control 1890 to
appear, as illustrated in the fourth stage 1820. In the fourth
stage 1820, the application's user specifies the output track to be
"Track 2" by using the track control 1890.
[0144] The fifth and sixth stages 1825 and 1830 are similar to the
previous stages. However, in these stages 1825 and 1830, an output
track is specified for the clips 1875 and 1880 that are tagged with
the third metadata tag 1860. To specify the output track, the user
selects the output control 1845 that is associated with the third
metadata tag 1860. The selection causes the track control 1890 to
appear, as illustrated in the sixth stage 1830. In the sixth stage
1830, the user specifies the output track to be "Track 3" by using
the track control 1890. Once the output tracks are specified for
the metadata tags, the user can select an output or export option
(not shown) to start the output of clips based on the clip's
association with a particular metadata tag.
[0145] In the example described above, several output tracks are
associated with metadata tags. In some embodiments, the media
editing application allows a user to associate metadata tags with
output tracks. FIG. 19 provides an illustrative example of an
output tool 1900 for the media editing application. As shown, the
figure includes several user-selectable items (e.g., drop-down
lists) 1905-1920. Each selectable item represents a particular
output track for a composite presentation. The user can use any one
of these items 1905-1920 to associate one or more roles with a
particular track. For instance, two different roles have been
specified with the selectable item 1920. This is different from
FIG. 18 where a particular output track is associated with one
particular metadata tag (e.g., a role).
[0146] As shown in FIG. 19, several of these selectable items
1905-1920 are associated with other user interface items 1925-1935.
A user of the application can select any one of these items 1925-1935
to associate a particular output setting (e.g., mono, stereo,
surround) with a corresponding output track. The user can then
select a button 1940 to specify a multi-track output for the
composite presentation.
[0147] B. Performing Multiple Passes
[0148] In some embodiments, the media editing application performs
multiple passes on a render graph or signal chain to output a
composite presentation to different tracks. FIG. 20A shows a signal
flow diagram 2000 that conceptually illustrates the problem of
outputting the composite presentation to different tracks in a
single pass. To simplify the discussion, this signal flow diagram
2000 represents a scenario where only one compound clip 2020 that
nests clips 2005 and 2010 is in the composite presentation.
[0149] As shown in FIG. 20A, the inner clip 2005 is tagged with a
first metadata tag, and inner clip 2010 is tagged with a second
metadata tag. The output track for the first metadata tag has been
specified as "Track 1", and the output track for the second
metadata tag has been specified as "Track 2". Here, the audio
signals of clips 2005 and 2010 are mixed as one mixed audio signal
for the compound clip 2020. This prevents the audio signal of clip
2005 from being played through one channel while the audio signal
of clip 2010 is played through another channel. As such, the
mixed audio signal cannot be unmixed to output the audio signal of
clip 2005 to the first track and the audio signal of clip 2010 to
the second track.
[0150] Although the composite presentation cannot be unmixed during
playback, the media editing application allows the composite
presentation to be output to different audio files by performing
multiple passes on a render graph or signal chain. FIG. 20B
illustrates outputting the composite presentation to different audio
files. Two example stages 20B05 and 20B10 of the media editing
application are illustrated in this figure. Specifically, these
stages illustrate performing multiple rendering passes to output
the composite presentation to different files.
[0151] The first stage 20B05 illustrates a first pass that is
performed to output the audio content of clip 2005 to "Track 1".
The audio signals of clips 2005 and 2010 are mixed for the compound
clip 2020. However, in this first pass, the audio signal of clip
2010 is disabled (e.g., muted or silenced). As the audio clip 2010
is muted, the mixed audio signal includes only the audio signal of
the clip 2005.
[0152] The second stage 20B10 illustrates a second pass that is
performed to output the audio content of clip 2010 to "Track 2".
Similar to the first stage 20B05, the audio signals of clips 2005
and 2010 are mixed for the compound clip 2020. However, in this
second pass, the audio signal of clip 2005 is disabled (e.g.,
muted). As the audio clip 2005 is disabled, the mixed audio signal
includes only the audio signal of the clip 2010. In some
embodiments, each output file has the same duration as the
composite presentation. For example, if the duration of the
composite presentation (e.g., represented in the composite display
area) is one hour and each of the clips 2005 and 2010 includes
thirty minutes of sound, then each output file will be one hour in
duration with thirty minutes of sound.
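The multiple-pass approach can be sketched as follows: each pass
mixes the whole presentation but disables every clip whose tag does
not belong to the track being written, so each output spans the full
duration. The dictionary layout of a clip, the function names
render_pass and render_tracks, and the use of sample indices as the
timeline are assumptions made for illustration.

```python
def render_pass(clips, active_tag, duration):
    """Mix the presentation with every clip not tagged `active_tag` muted."""
    mix = [0.0] * duration  # the output spans the whole presentation
    for clip in clips:
        if clip["tag"] != active_tag:
            continue  # disabled (muted) for this pass
        for offset, sample in enumerate(clip["samples"]):
            position = clip["start"] + offset
            if position < duration:
                mix[position] += sample
    return mix


def render_tracks(clips, track_for_tag, duration):
    """Perform one rendering pass per output track."""
    return {
        track: render_pass(clips, tag, duration)
        for tag, track in track_for_tag.items()
    }


clips = [
    {"tag": "Dialog", "start": 0, "samples": [1.0, 1.0]},
    {"tag": "Music", "start": 2, "samples": [0.5, 0.5]},
]
tracks = render_tracks(clips, {"Dialog": "Track 1", "Music": "Track 2"}, 4)
print(tracks["Track 1"])  # [1.0, 1.0, 0.0, 0.0]
print(tracks["Track 2"])  # [0.0, 0.0, 0.5, 0.5]
```

Both outputs have four samples even though each clip supplies only
two, mirroring the one-hour files with thirty minutes of sound in the
example above.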
[0153] In the example described above, multiple rendering passes
are performed to output the audio content to different tracks. The
media editing application of some embodiments performs these
multiple passes simultaneously. In some such embodiments, the media
editing application generates multiple copies of one or more render
objects (e.g., render graphs, render files) for rendering the
sequence of clips in the composite display area. The media editing
application then performs the multiple passes such that these
passes occur at least partially at the same time. By simultaneously
performing these passes, the media editing application saves time
in that it does not need to wait for one pass to end to start
another. This also saves time as files (e.g., source clips) are
read out of disk or loaded in memory once instead of multiple
times.
[0154] The preceding section described and illustrated various ways
to use metadata to facilitate output operations. FIG. 21
conceptually illustrates a process 2100 that some embodiments use
to output a composite presentation to different tracks. In some
embodiments, process 2100 is performed by a media editing
application. As shown, process 2100 receives (at 2105) input to
output a composite presentation (e.g., a sequence of clips in the
composite display area). Process 2100 then identifies (at 2110) a
track to process.
[0155] At 2115, process 2100 identifies each clip tagged with a tag
(e.g., role) that is associated with the identified track. An
example of associating one or more roles to a particular output
track is described above by reference to FIG. 19.
[0156] Process 2100 then adds (at 2120) each identified clip to a
render list for that track. Process 2100 then determines (at 2125)
whether there are any more tracks. When there is another track,
process 2100 returns to 2110 that was described above. Otherwise,
process 2100 renders the composite presentation based on one or
more render lists. For example, process 2100 of some embodiments
renders the composite presentation by identifying clips in a render
list, combining any two or more clips in the list, and outputting
the combined clip to a particular track. Process 2100 then
ends.
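The render-list construction of process 2100 can be sketched as
below. The function name build_render_lists and the dictionary
representation of clips and tracks are hypothetical; the clip numbers
and roles follow the example of FIG. 18, and a track may be
associated with more than one role as in FIG. 19.

```python
def build_render_lists(clips, roles_for_track):
    """Build one render list per track from the roles assigned to it."""
    render_lists = {}
    for track, roles in roles_for_track.items():
        # Add each clip whose role is associated with this track.
        render_lists[track] = [
            clip["name"] for clip in clips if clip["role"] in roles
        ]
    return render_lists


clips = [
    {"name": "1865", "role": "Dialog"},
    {"name": "1870", "role": "Music"},
    {"name": "1875", "role": "SFX"},
    {"name": "1880", "role": "SFX"},
    {"name": "1885", "role": "Dialog"},
]
roles_for_track = {
    "Track 1": {"Dialog"},
    "Track 2": {"Music"},
    "Track 3": {"SFX"},
}
print(build_render_lists(clips, roles_for_track))
```

Rendering would then combine the clips in each list and write the
result to the corresponding track.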
[0157] Some embodiments perform variations on process 2100. For
example, the operations of process 2100 might be performed by two
or more separate processes. Also, the specific operations of the
process may not be performed in the exact order shown and
described.
VI. Software Architecture
[0158] A. Example Media Editing Application
[0159] Having described several example editing operations above,
an example media editing application that implements several
editing features will now be described. FIG. 22 illustrates a
graphical user interface (GUI) 2200 of a media editing application
of some embodiments. One of ordinary skill will recognize that the
GUI 2200 is only one of many possible GUIs for such a media editing
application. In fact, the GUI 2200 includes several display areas
which may be adjusted in size, opened or closed, replaced with
other display areas, etc. As shown, the GUI 2200 includes a clip
library 2205 (also referred to as an event library), a clip browser
2210 (also referred to as a clip browser), a composite display area
2215, a preview display area 2220, an inspector display area 2225,
and a toolbar 2235.
[0160] The clip library 2205 includes a set of folder-like or
bin-like representations through which a user accesses media clips
that have been imported into the media editing application.
[0161] Some embodiments organize the media clips according to the
device (e.g., physical storage device such as an internal or
external hard drive, virtual storage device such as a hard drive
partition, etc.) on which the media represented by the clips are
stored. Some embodiments also enable the user to organize the media
clips based on the date the media represented by the clips was
created (e.g., recorded by a camera).
[0162] Within the clip library 2205, users can group the media
clips into "events" or organized folders of media clips. For
instance, a user might give the events descriptive names that
indicate what kind of media is stored in the event (e.g., the "New
Event 2-5-11" event shown in clip library 2205 might be renamed
"European Vacation" as a descriptor of the content). In some
embodiments, the media files corresponding to these clips are
stored in a file storage structure that mirrors the folders shown
in the clip library.
[0163] In some embodiments, the clip library 2205 enables users to
perform various clip management actions. These clip management
actions include moving clips between bins (e.g., events), creating
new bins, merging two bins together, duplicating bins (which, in
some embodiments, creates a duplicate copy of the media to which the
clips in the bin correspond), deleting bins, etc.
[0164] As shown in FIG. 22, the clip library 2205 displays several
keywords 2202 and 2204. To categorize a clip or associate the clip
with a particular keyword, the application's user can drag and drop
the clip onto the particular keyword. The same technique is used in
some embodiments to associate multiple clips with the particular
keyword by simultaneously dragging and dropping the clips onto the
keyword. In some embodiments, the keywords 2202 and 2204 are
represented as keyword collections (e.g., keyword bin or keyword
folder) in the clip library 2205. That is, the keyword collection
acts as a virtual bin or virtual folder that the user can drag and
drop items onto in order to create keyword associations. In some
embodiments, upon selection of a keyword collection, the media
editing application filters the clip browser 2210 to only display
those clips associated with a particular keyword of the keyword
collection.
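A keyword collection of this kind can be sketched as a mapping from
each keyword to the set of clips dropped onto it. The class name
KeywordCollections and its methods are hypothetical and only
illustrate the association and filtering behavior described above.

```python
from collections import defaultdict


class KeywordCollections:
    """Virtual bins that associate clips with keywords."""

    def __init__(self):
        self._clips_by_keyword = defaultdict(set)

    def drop(self, keyword, *clip_names):
        """Associate one or more clips with a keyword (drag and drop)."""
        self._clips_by_keyword[keyword].update(clip_names)

    def filter(self, browser_clips, keyword):
        """Return only the browser clips associated with the keyword."""
        tagged = self._clips_by_keyword.get(keyword, set())
        return [name for name in browser_clips if name in tagged]


collections = KeywordCollections()
collections.drop("beach", "clip A", "clip C")
print(collections.filter(["clip A", "clip B", "clip C"], "beach"))
```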
[0165] The clip browser 2210 allows the user to view clips from a
selected folder or collection (e.g., an event, a sub-folder, etc.)
of the clip library 2205. In the example illustrated in FIG. 22,
the collection "New Event 2-5-11" is selected in the clip library
2205, and the clips belonging to that folder are displayed in the
clip browser 2210. Some embodiments display the clips as thumbnail
filmstrips (i.e., filmstrip representations). These thumbnail
filmstrips are similar to the representations in the composite
display area 2215.
[0166] By moving a position indicator (e.g., through a cursor,
through the application's user touching a touch screen) over one of
the thumbnails, the user can skim through the clip. For example,
when the user places the position indicator at a particular
horizontal location within the thumbnail filmstrip, the media
editing application associates that horizontal location with a time
in the associated media file, and displays the image from the media
file for that time. In addition, the user can command the
application to play back the media file in the thumbnail filmstrip.
In some embodiments, the selection and movement is received through
a user selection input such as input received from a cursor
controller (e.g., a mouse, touchpad, trackpad, etc.), from a
touchscreen (e.g., a user touching a user interface (UI) item on a
touchscreen), from the keyboard, etc. In some embodiments, one
example of such a user selection input is the position indicator
that indicates the user's interaction (e.g., with the cursor, the
touchscreen, etc.). The term user selection input is used
throughout this specification to refer to at least one of the
preceding ways of making a selection, moving a control, or pressing
a button through a user interface.
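The association between a horizontal location in a thumbnail
filmstrip and a time in the media file can be sketched as a linear
mapping. The function skim_time and its parameters are hypothetical,
and the linear relationship between position and time is an
assumption; the description above does not state how the association
is computed.

```python
def skim_time(x, strip_left, strip_width, clip_duration):
    """Map a horizontal position over a filmstrip to a media time.

    x, strip_left, and strip_width are in pixels; clip_duration and
    the returned time are in seconds.
    """
    if strip_width <= 0:
        raise ValueError("strip_width must be positive")
    fraction = (x - strip_left) / strip_width
    # Clamp so that positions just outside the strip map to its ends.
    fraction = min(max(fraction, 0.0), 1.0)
    return fraction * clip_duration


# A position one quarter of the way across a 60-second clip.
print(skim_time(150, 100, 200, 60.0))  # 15.0
```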
[0167] In the example illustrated in FIG. 22, the thumbnails for
the clips in the clip browser 2210 display an audio waveform
underneath the clip that represents the audio of the media file. In
some embodiments, as a user skims through or plays back the
thumbnail filmstrip, the audio portion plays as well. Many of the
features of the clip browser are user-modifiable. For instance, the
user can modify one or more of the thumbnail size, the percentage
of the thumbnail occupied by the audio waveform, whether audio
plays back when the user skims through the media files, etc. In
addition, some embodiments enable the user to view the clips in the
clip browser 2210 in a list view. In this view, the clips are
presented as a list (e.g., with clip name, duration, metadata,
etc.). Some embodiments also display a selected clip from the list
in a filmstrip view at the top of the clip browser 2210 so that the
user can skim through or play back the selected clip. The clip
browser in some embodiments allows users to select different ranges
of a media clip and/or navigate to different sections of the media
clip.
[0168] In some embodiments, the media editing application displays
content differently based on its association with one or more
metadata tags (e.g., keywords). This allows users to quickly assess
a large group of media clips and see which ones are associated or
not associated with any metadata tags. For example, in FIG. 22, a
horizontal bar is displayed across each of the clips 2240-2250.
This indicates to the application's user that these clips are
tagged with one or more metadata tags.
[0169] In some embodiments, the media editing application allows
the user to tag a portion of a clip with a metadata tag. To
associate a metadata tag with a portion of a clip, the user can
select the portion of the clip (e.g., using a range selector on a
clip's filmstrip representation in the clip browser 2210), and drag
and drop the selected portion onto the metadata tag (e.g., 2202 or
2204). For example, a user can specify that an audio clip includes
crowd noise starting at one point in time and ending at another
point, and then tag that range as "crowd noise". When a portion of
a clip is associated with a metadata tag, the media editing
application of some embodiments indicates this by marking a portion
of the clip's representation in the clip browser 2210. For example,
a horizontal bar is displayed across only the portion of the clip's
filmstrip representation associated with a particular metadata tag,
in some embodiments.
[0170] The composite display area 2215 provides a visual
representation of a composite presentation (or project) being
created by the user of the media editing application. As mentioned
above, the composite display area 2215 displays one or more
geometric shapes that represent one or more media clips that are
part of the composite presentation. In some embodiments, the
composite display area 2215 spans a displayed timeline 2226 which
displays time (e.g., the elapsed time of clips displayed on the
composite display area). The composite display area 2215 of some
embodiments includes a primary lane 2216 (also called a "spine",
"primary compositing lane", or "central compositing lane") as well
as one or more secondary lanes (also called "anchor lanes"). The
spine represents a primary sequence of media which, in some
embodiments, does not have any gaps. The clips in the anchor lanes
are anchored to a particular position along the spine (or along a
different anchor lane). Anchor lanes (e.g., the anchor lane 2218)
may be used for compositing (e.g., removing portions of one video
and showing a different video in those portions), B-roll cuts
(i.e., cutting away from the primary video to a different video
whose clip is in the anchor lane), audio clips, or other composite
presentation techniques.
[0171] The user can select different media clips from the clip
browser 2210, and drag and drop them into the composite display
area 2215 in order to add the clips to a composite presentation
represented in the composite display area 2215. Alternatively, the
user can select the different media clips and select a shortcut
key, a tool bar button, or a menu item to add them to the composite
display area 2215. Within the composite display area 2215, the user
can perform further edits to the media clips (e.g., move the clips
around, split the clips, trim the clips, apply effects to the
clips, etc.). The length (i.e., horizontal expanse) of a clip in
the composite display area is a function of the length of the media
represented by the clip. As the timeline 2226 is broken into
increments of time, a media clip occupies a particular length of
time in the composite display area. As shown, in some embodiments,
the clips within the composite display area are shown as a series
of images or filmstrip representations. The number of images
displayed for a clip varies depending on the length of the clip
(e.g., in relation to the timeline 2226), as well as the size of
the clips (as the aspect ratio of each image will stay constant).
As with the clips in the clip browser, the user can skim through
the composite presentation or play back the composite presentation.
In some embodiments, the playback (or skimming) is not shown in the
composite display area's clips, but rather in the preview display
area 2220.
[0172] The preview display area 2220 (also referred to as a
"viewer") displays images from media files which the user is
skimming through, playing back, or editing. These images may be
from a composite presentation in the composite display area 2215 or
from a media clip in the clip browser 2210. In the example of FIG.
22, the user is playing the composite presentation in the composite
display area 2215. Hence, an image from the start of the composite
presentation is displayed in the preview display area 2220. As
shown, some embodiments will display the images as large as
possible within the display area while maintaining the aspect ratio
of the image.
[0173] The inspector display area 2225 displays detailed properties
about a selected item and allows a user to modify some or all of
these properties. The selected item might be a clip, a composite
presentation, an effect, etc. As shown in FIG. 22, the inspector
display area 2225 displays information about the audio clip 2250.
To display the information, the application's user might have
selected the audio clip 2250 from the clip browser 2210. In this
case, the information about the selected media clip 2250 includes
name, notes, codec, audio channel count, and sample rate. However,
depending on the type of media clip, the inspector display area
2225 can display other information such as file format, file
location, frame rate, date created, etc. In some embodiments, the
inspector display area 2225 displays different metadata tags
associated with a clip. For example, the inspector display area
2225 includes a text box 2206 for displaying and/or modifying one
or more metadata tags.
[0174] The toolbar 2235 includes various selectable items for
editing, modifying items that are displayed in one or more display
areas, etc. The illustrated toolbar 2235 includes items for video
effects, visual transitions between media clips, photos, titles,
generators and backgrounds, etc. The toolbar 2235 also includes
selectable items for media management and editing. Selectable items
are provided for adding clips from the clip browser 2210 to the
composite display area 2215. In some embodiments, different
selectable items may be used to add a clip to the end of the spine,
add a clip at a selected point in the spine (e.g., at the location
of a playhead), add an anchored clip at the selected point, perform
various trim operations on the media clips in the composite display
area, etc. The media management tools of some embodiments allow a
user to mark selected clips as favorites, among other options.
[0175] The audio mixer 2255 provides different audio mixing tools
that the application's user can use to define the output audio of
the composite presentation represented in the composite display
area 2215. The audio mixer 2255 includes several level controls
(2260, 2270, and 2280) and several audio meters (2265, 2275, and
2285). The level control 2280 and the audio meters 2285 are related
to the master that represents the output audio. Specifically, the
master's level control 2280 raises or lowers the combined output
level of the entire sequence of clips in the composite display area at
the same time. That is, the control 2280 affects output levels during
playback, export to a file, etc. Hence, the level control 2280
adjusts the level of the output audio, and the meters 2285 display
that audio level. In the example illustrated in FIG. 22, the audio
meters 2285 include a first meter that represents the left channel
of the output audio, and a second meter that represents the right
channel of the output audio. In some embodiments, the master
includes an audio meter for each output track (e.g., channel, file)
specified for the sequence. For example, when the sequence of the
clips has four output tracks, there are four corresponding audio
meters for the master in the audio mixer 2255. Several examples of
specifying different output tracks by using metadata tags are
described above in Section V.
[0176] As shown in FIG. 22, the audio meters 2285 provide visual
representations of the level of the output audio. Specifically,
each meter displays a fluctuating or moving bar in accord with the
audio level. In some embodiments, the fluctuating bar changes color
when the audio level exceeds a particular threshold. For example,
the color of the bar may change from one color to another color
when the volume goes over a predetermined threshold decibel
value.
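As a minimal sketch of the threshold-based color change described
above, the following Python fragment converts a linear amplitude to
decibels and picks a bar color. The function names, the -6 dB
threshold, and the color names are assumptions chosen for
illustration and are not taken from the figures.

```python
import math


def level_to_db(amplitude: float) -> float:
    """Convert a linear amplitude (1.0 = full scale) to decibels."""
    if amplitude <= 0.0:
        return float("-inf")  # silence has no finite decibel value
    return 20.0 * math.log10(amplitude)


def meter_color(level_db: float, threshold_db: float = -6.0) -> str:
    """Return the bar color for a level; the threshold is hypothetical."""
    return "red" if level_db > threshold_db else "green"
```

A meter could call meter_color(level_to_db(peak)) each time it
redraws its bar.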
[0177] The level control 2260 and the audio meter 2265 are related
to the keyword 2202. The level control 2270 and the audio meter 2275 are
related to the keyword 2204. In some embodiments, the audio meters
2265 and 2275 display the audio levels of the clips associated with
the corresponding keywords. For example, when a clip tagged with
"Dialog" is being output, the audio meter 2265 fluctuates to
indicate the level of the clip's audio. Similarly, the audio level
control 2260 controls the audio level of each clip that is tagged
with the keyword 2202, and the audio level control 2270 controls
the audio level of each clip tagged with the keyword 2204.
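The per-keyword metering described above might be approximated as
follows. This is a simplified sketch: it ignores mixing entirely and
reports the largest sample magnitude among the clips that carry a
given keyword. The Clip class, its fields, and the function name are
hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Clip:
    name: str
    tags: set      # keywords associated with the clip, e.g. {"Dialog"}
    samples: list  # linear amplitudes in the range [-1.0, 1.0]


def keyword_peak_db(clips, keyword):
    """Peak level, in decibels, over all clips tagged with keyword."""
    peak = 0.0
    for clip in clips:
        if keyword in clip.tags:
            for sample in clip.samples:
                peak = max(peak, abs(sample))
    return 20.0 * math.log10(peak) if peak > 0.0 else float("-inf")
```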
[0178] As shown in FIG. 22, the level controls 2260, 2270, and 2280
are represented as channel faders, while the audio meters 2265,
2275, and 2285 are represented as fluctuating bars. Alternatively,
the media editing application of some embodiments provides
different types of controls. For example, any one of the level
controls can be provided as a dial knob that is rotated to adjust
the gain or volume of each clip that is tagged with a particular
keyword. Also, in different embodiments, the audio levels at
different instances in time are represented as a graph, numerically
by displaying different decibel values, etc. In some embodiments, the
audio controls (e.g., audio level controls 2260 and 2270) are not
used to control absolute audio levels but are used to make relative
adjustments. In some such embodiments, the media editing
application provides a wheel or a knob that can be turned
infinitely to add or subtract gain to all clips tagged with a
particular metadata tag.
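A relative adjustment of this kind could be sketched as below. Each
turn of the wheel adds a decibel offset to every clip carrying the
tag, leaving the clips' gain differences relative to one another
unchanged. The Clip class, the gain_db field, and the function name
are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    name: str
    tags: frozenset
    gain_db: float = 0.0  # hypothetical per-clip gain, in decibels


def nudge_gain(clips, tag, delta_db):
    """Add delta_db to every clip tagged with tag (relative change)."""
    for clip in clips:
        if tag in clip.tags:
            clip.gain_db += delta_db
```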
[0179] In the example illustrated in FIG. 22, the audio mixer 2255
includes other controls associated with the master and the keywords
2202 and 2204. For example, the keyword 2204 includes (1) a mute
button 2208 for muting all clips associated with the keyword, (2) a
solo button 2212 for muting all other clips except those associated
with the keyword, and (3) a pan control 2214 for controlling the
spread of audio. The same set of controls is provided for the
keyword 2202. In addition, the master includes a mute button 2222
for muting all channels and a downmix button 2224 for mixing down
all output channels to a single stereo output. In some embodiments,
when the downmix is activated, all audio outputs in the composite
display area's sequence are mixed down to stereo during playback,
export to one or more files, etc. Instead of, or in conjunction
with these controls, the media editing application of some
embodiments provides other controls such as two separate controls
for controlling the gain and the volume, two separate faders or
knobs for individually controlling the audio levels of left and
right channels, a record button for recording the audio, etc.
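The interaction between the mute and solo buttons can be sketched as
a small function that decides which keywords remain audible. The
function name is hypothetical, and the rule that mute takes
precedence over solo is an assumption; the description above does not
specify how the two buttons interact.

```python
def audible_keywords(keywords, muted, soloed):
    """Return the set of keywords whose clips should be heard."""
    if soloed:
        # Solo silences every keyword that is not soloed.
        candidates = {k for k in keywords if k in soloed}
    else:
        candidates = set(keywords)
    # Assumption: a muted keyword stays silent even when soloed.
    return {k for k in candidates if k not in muted}
```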
[0180] One of ordinary skill will also recognize that the arrangement of
display areas shown in the GUI 2200 is one of many possible
configurations for the GUI of some embodiments. For instance, in
some embodiments, the presence or absence of many of the display
areas can be toggled through the GUI (e.g., the inspector display
area 2225, clip library 2205, etc.). In addition, some embodiments
allow the user to modify the size of the various display areas
within the GUI. For instance, when the mixer 2255 is removed, the
composite display area 2215 can increase in size to include that
area. Similarly, the preview display area 2220 increases in size
when the inspector display area 2225 is removed.
[0181] B. Example Software Architecture
[0182] In some embodiments, the processes described above are
implemented as software running on a particular machine, such as a
computer or a handheld device, or stored in a machine readable
medium. FIG. 23 conceptually illustrates the software architecture
of a media editing application 2300 of some embodiments. In some
embodiments, the media editing application is a stand-alone
application or is integrated into another application, while in
other embodiments the application is implemented within an
operating system. Furthermore, in some embodiments, the application
is provided as part of a server-based solution. In some such
embodiments, the application is provided via a thin client. That
is, the application runs on a server while a user interacts with
the application via a separate machine remote from the server. In
other such embodiments, the application is provided via a thick
client. That is, the application is distributed from the server to
the client machine and runs on the client machine.
[0183] The media editing application 2300 includes a user interface
(UI) interaction and generation module 2305, a media ingest module
2310, editing modules 2315, effects modules 2340, output components
2308, a playback module 2325, a metadata association module 2335,
and an effects association module 2330.
[0184] The figure also illustrates stored data associated with the
media editing application: source files 2350, event data 2355,
project data 2360, and other data 2365. In some embodiments, the
source files 2350 store media files (e.g., video files, audio
files, combined video and audio files, etc.) imported into the
application. The source files 2350 of some embodiments also store
transcoded versions of the imported files as well as analysis data
(e.g., people detection data, shake detection data, color balance
data, etc.). The event data 2355 stores the events information used
by some embodiments to populate the thumbnails view (e.g., in a
clip browser). The event data 2355 may be a set of clip object data
structures stored as one or more SQLite database (or other format)
files in some embodiments. The project data 2360 stores the project
information used by some embodiments to specify a composite
presentation in the composite display area 2345. The project data
2360 may also be a set of clip object data structures stored as one
or more SQLite database (or other format) files in some
embodiments.
[0185] In some embodiments, the four sets of data 2350-2365 are
stored in a single physical storage (e.g., an internal hard drive,
external hard drive, etc.). In some embodiments, the data may be
split between multiple physical storages. For instance, the source
files might be stored on an external hard drive with the event
data, project data, and other data on an internal drive. Some
embodiments store event data with their associated source files and
render files in one set of folders, and the project data with
associated render files in a separate set of folders.
[0186] FIG. 23 also illustrates an operating system 2370 that
includes input device driver(s) 2375, display module 2380, and
media import module 2385. In some embodiments, as illustrated, the
input device drivers 2375, display module 2380, and media import
module 2385 are part of the operating system 2370 even when the
media editing application 2300 is an application separate from the
operating system 2370.
[0187] The input device drivers 2375 may include drivers for
translating signals from a keyboard, mouse, touchpad, tablet,
touchscreen, etc. A user interacts with one or more of these input
devices, each of which sends signals to its corresponding device
driver. The device driver then translates the signals into user
input data that is provided to the UI interaction and generation
module 2305.
[0188] The present application describes a graphical user interface
that provides users with numerous ways to perform different sets of
operations and functionalities. In some embodiments, these
operations and functionalities are performed based on different
commands that are received from users through different input
devices (e.g., keyboard, trackpad, touchpad, mouse, etc.). For
example, the present application illustrates the use of a cursor in
the graphical user interface to control (e.g., select, move)
objects in the graphical user interface. However, in some
embodiments, objects in the graphical user interface can also be
controlled or manipulated through other controls, such as touch
control. In some embodiments, touch control is implemented through
an input device that can detect the presence and location of touch
on a display of the input device. An example of a device with such
functionality is a touch screen device (e.g., as incorporated into
a smart phone, a tablet computer, etc.). In some embodiments, with
touch control, a user directly manipulates objects by interacting
with the graphical user interface that is displayed on the display
of the touch screen device. For instance, a user can select a
particular object in the graphical user interface by simply
touching that particular object on the display of the touch screen
device. As such, when touch control is utilized, a cursor may not
even be provided for enabling selection of an object of a graphical
user interface in some embodiments. However, when a cursor is
provided in a graphical user interface, touch control can be used
to control the cursor in some embodiments.
[0189] The display module 2380 translates the output of a user
interface for a display device. That is, the display module 2380
receives signals (e.g., from the UI interaction and generation
module 2305) describing what should be displayed and translates
these signals into pixel information that is sent to the display
device. The display device may be an LCD, plasma screen, CRT
monitor, touchscreen, etc.
[0190] The media import module 2385 receives media files (e.g.,
audio files, video files, etc.) from storage devices (e.g.,
external drives, recording devices, etc.) through one or more ports
(e.g., a USB port, Firewire port, etc.) of the device on which the
application 2300 operates and translates this media data for the
media editing application or stores the data directly onto a
storage of the device.
[0191] The UI interaction and generation module 2305 of the media
editing application 2300 interprets the user input data received
from the input device drivers 2375 and passes it to various
modules, including the editing modules 2315, the rendering engine
2320, the playback module 2325, the metadata association module
2335, and the effects association module 2330. The UI interaction
and generation module 2305 also manages the display of the UI, and
outputs this display information to the display module 2380. This
UI display information may be based on information from the editing
modules 2315, the playback module 2325, and the data 2350-2365. In
some embodiments, the UI interaction and generation module 2305
generates a basic GUI and populates the GUI with information from
the other modules and stored data.
[0192] As shown, the UI interaction and generation module 2305 of
some embodiments provides a number of different UI elements. In
some embodiments, these elements include a tag display area 2306, a
composite display area 2345, an effects association tool 2304, an
audio mixing tool 2318, and a preview display area 2312. All of
these UI elements are described in detail above by reference to
FIG. 22.
[0193] The media ingest module 2310 manages the import of source
media into the media editing application 2300. Some embodiments, as
shown, receive source media from the media import module 2385 of
the operating system 2370. The media ingest module 2310 receives
instructions through the UI interaction and generation module 2305
as to which files should be imported, then instructs the media
import module 2385 to enable this import (e.g., from an external
drive, from a camera, etc.). The media ingest module 2310 stores
these source files 2350 in specific file folders associated with
the application. In some embodiments, the media ingest module 2310
also manages the creation of event data structures upon import of
source files and the creation of the clip and asset data structures
contained in the events. In some embodiments, the media ingest
module 2310 tags the imported media clip with one or more metadata
tags. For example, when a media clip is imported from a music
library, the media ingest module 2310 might tag the clip with a
"Music" tag. Alternatively, when the media clip is imported from a
folder named "Dialog", the media ingest module 2310 might tag the
clip with a "Dialog" tag.
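Folder-based tagging at import time might look like the following
sketch. The folder-to-tag table and the function name are
hypothetical; the description above gives only the "Music" and
"Dialog" examples.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from folder names to metadata tags.
FOLDER_TAGS = {"dialog": "Dialog", "music": "Music", "sfx": "SFX"}


def tags_for_import(path: str) -> set:
    """Derive tags from the names of the folders that contain a file."""
    tags = set()
    for part in PurePosixPath(path).parent.parts:
        tag = FOLDER_TAGS.get(part.lower())
        if tag is not None:
            tags.add(tag)
    return tags
```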
[0194] The editing modules 2315 include a variety of modules for
editing media in the clip browser as well as in the composite
display area. The editing modules 2315 handle the creation of
projects, addition and subtraction of clips from projects, trimming
or other editing processes within the composite display area, or
other editing processes. In some embodiments, the editing modules
2315 create and modify project and clip data structures in both the
event data 2355 and the project data 2360.
[0195] The effects association module 2330 of some embodiments
associates an effect with a metadata tag. In some embodiments, the
effects association module 2330 defines an effect chain with one or
more effects for the metadata tag. The effects modules 2340
represent the various different effects, filters, transitions, etc.
As mentioned above, there are many different effects or filters
that can be associated with metadata to facilitate editing
operations. Although this list is non-exhaustive, several example
audio effects include different equalizers for modifying the signal
strength of a clip within specified frequency ranges, a
compressor/limiter for reducing the clip's dynamic range by
attenuating parts of the audio signal above a particular threshold,
an echo effect for creating an echo sound, and a reverb effect for
creating a reverberation effect that emulates a particular acoustic
environment. Several example video effects or image effects include
color filters that operate on color values, different filters that
sharpen, stylize, distort, or blur an image, and fade-in/fade-out
effects for creating transitions between scenes. Several of these
effect modules are associated with one or more settings or
properties that the application's user can specify to edit media
content.
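One way to picture an effect chain associated with a metadata tag is
a table that maps each tag to an ordered list of effect functions.
The sketch below is illustrative only: the names are hypothetical,
and the hard limiter is a crude stand-in for the compressor/limiter
mentioned above.

```python
def gain(factor):
    """Return an effect that scales every sample by factor."""
    def apply(samples):
        return [s * factor for s in samples]
    return apply


def hard_limiter(ceiling):
    """Return an effect that clamps samples to [-ceiling, ceiling]."""
    def apply(samples):
        return [max(-ceiling, min(ceiling, s)) for s in samples]
    return apply


# Hypothetical association of a tag with an effect chain.
EFFECT_CHAINS = {"Dialog": [gain(2.0), hard_limiter(0.8)]}


def apply_tag_effects(samples, tag):
    """Run the samples through the tag's effects in chain order."""
    for effect in EFFECT_CHAINS.get(tag, []):
        samples = effect(samples)
    return samples
```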
[0196] In some embodiments, the output components 2308 generate the
resulting output composite presentation based on one or more clips
in the composite display area 2345. As shown, the output components
2308 include a rendering engine 2320 and a mixer 2314. However,
depending on the type of output, the media editing application of
some embodiments includes other components (e.g., encoders,
decoders, etc.). The rendering engine 2320 handles the rendering of
images for the media editing application. In some embodiments, the
rendering engine 2320 manages the creation of images for the media
editing application. When an image is requested by a destination
within the application (e.g., the playback module 2325) the
rendering engine 2320 outputs the requested image according to the
project or event data. The rendering engine 2320 retrieves the
project data or event data that identifies how to create the
requested image and generates a render graph that is a series of
nodes indicating either images to retrieve from the source files or
operations to perform on the source files. In some embodiments, the
rendering engine 2320 schedules the retrieval of the necessary
images through disk read operations and the decoding of those
images.
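The render graph described above, in which each node either retrieves
an image or performs an operation, can be sketched with two node
types. The class names, the list-of-numbers image stand-in, and the
blend operation are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SourceNode:
    image: list  # stand-in for pixel data decoded from a source file

    def evaluate(self):
        return self.image


@dataclass
class OpNode:
    op: Callable   # operation to perform on the evaluated inputs
    inputs: List   # upstream SourceNode or OpNode objects

    def evaluate(self):
        return self.op(*[node.evaluate() for node in self.inputs])


def blend(a, b):
    """Average two images value by value (a simple blend operation)."""
    return [(x + y) / 2 for x, y in zip(a, b)]
```

Evaluating the root node pulls images through the graph, so a blend
of two sources is OpNode(blend, [SourceNode(a), SourceNode(b)]).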
[0197] In some embodiments, the rendering engine 2320 performs
various operations to generate an output image. In some
embodiments, these operations include blend operations, effects
(e.g., blur or other pixel value modification operations), color
space conversions, resolution transforms, etc. In some embodiments,
one or more of these processing operations are actually part of the
operating system and are performed by a GPU or CPU of the device on
which the application 2300 operates. The output of the rendering
engine (a rendered image) may be stored as render files in storage
2365 or sent to a destination for additional processing or output
(e.g., playback).
[0198] In some embodiments, the mixer 2314 receives several audio
signals of different clips and outputs a mixed audio signal. The
mixer 2314 of some embodiments is utilized in a number of different
instances during the non-linear editing process. For example, the
mixer may be utilized in generating a composite presentation from
multiple different clips. The mixer can also act as the master to
output a mixed audio signal, as described in many of the examples
above. In some embodiments, the media editing application includes
different types of mixers for mixing audio. For example, the media
editing application can include a first mixer for mixing one type
of audio file and a second mixer for mixing another type of audio
file.
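At its simplest, a mixer of this kind sums the samples of its input
signals, optionally scaling each input first. The sketch below
assumes signals are lists of linear sample values at a common sample
rate; the function name and the gains parameter are hypothetical.

```python
def mix(signals, gains=None):
    """Sum several signals sample by sample into one mixed signal."""
    if not signals:
        return []
    if gains is None:
        gains = [1.0] * len(signals)
    # The output is as long as the longest input; shorter inputs
    # contribute silence once they run out of samples.
    mixed = [0.0] * max(len(s) for s in signals)
    for signal, g in zip(signals, gains):
        for i, sample in enumerate(signal):
            mixed[i] += g * sample
    return mixed
```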
[0199] The playback module 2325 handles the playback of images
(e.g., in a preview display area 2312 of the user interface). Some
embodiments do not include a playback module and the rendering
engine directly outputs its images for integration into the GUI, or
directly to the display module 2380 for display at a particular
portion of the display device.
[0200] In some embodiments, the metadata association module 2335
associates clips with metadata tags. Different embodiments provide
different ways of associating media clips with metadata tags. In
some embodiments, the metadata tags indicate pre-defined categories
(e.g., dialog, music) that an editor can select to categorize
different clips. Instead of, or in conjunction with, these
categories, some embodiments allow the editor to specify one or
more keywords to associate with the media clips. For instance, in
some such embodiments, the media editing application provides a
keyword association tool that displays different keywords for
tagging the media content. To tag a clip, the application's user
drags and drops the clip onto a particular keyword in the keyword
association tool. The same technique is used in some embodiments to
associate multiple clips by simultaneously dragging and dropping
the clips onto the particular keyword.
[0201] In addition, some embodiments automatically associate one or
more metadata tags with a media clip. In some such embodiments,
this automatic association is based on a number of different
factors, including the source of the media clip (e.g., the
library or camera from which the clip was imported) and an
analysis of the media clip (e.g., color balance analysis,
image stabilization analysis, audio channel analysis, etc.). For
example, the media editing application might tag one set of clips
from a music library as "Music" and tag another set of clips from a
sound effects library as "SFX". Alternatively, the automatic
association can be based on an analysis of the media content (e.g.,
based on color balance analysis, image stabilization analysis,
audio channel analysis, people analysis, etc.). As mentioned above,
in some embodiments, the media ingest module 2310 can also perform
at least some of the metadata association task when importing media
content into the media editing application 2300. In some
embodiments, the media editing application includes one or more
analysis modules for analyzing the number of people (e.g., one
person, two persons, group, etc.) in a clip and/or a type of shot
(e.g., a close-up, medium, or wide shot). Other types of analysis
modules can include image stabilization analysis modules (e.g., for
camera movements), color balance analysis modules, audio analysis
modules (e.g., for mono, stereo, silent channels), metadata
analysis, etc. In some embodiments, metadata tags represent
metadata that are embedded in media content. For example, some
video cameras embed frame rate, creation date, and encoding info
into video clips that they capture. In addition some devices embed
other metadata such as location data, audio channel count, sample
rate, file type, camera type, exposure info, etc.
[0202] While many of the features of the media editing application
2300 have been described as being performed by one module (e.g.,
the UI interaction and generation module 2305, the media ingest
module 2310, etc.), one of ordinary skill in the art will recognize
that the functions described herein might be split up into multiple
modules. Similarly, functions described as being performed by
multiple different modules might be performed by a single module in
some embodiments (e.g., the playback module 2325 might be part of
the UI interaction and generation module 2305).
[0203] C. Example Data Structure
[0204] FIG. 24 conceptually illustrates example data structures for
several objects associated with the media editing application of
some embodiments. Specifically, this figure illustrates a sequence
2435 that references a primary collection data structure 2440.
Here, the primary collection data structure 2440 is in itself a
group of one or more clip objects or collection objects. As shown,
the figure illustrates (1) a clip object 2405, (2) a component
object 2410, (3) a tag object 2420, (4) an effect object 2430, (5)
the sequence object 2435, (6) the primary collection object 2440,
and (7) an asset object 2445.
[0205] As shown in FIG. 24, the sequence 2435 includes a sequence
ID and sequence attributes. The sequence ID identifies the sequence
2435. In some embodiments, the application's user sets the sequence
attributes for the project represented in the composite display
area. For example, the user might have specified several settings
that correspond to these sequence attributes when creating the
project. The sequence 2435 also includes a pointer to a primary
collection 2440.
[0206] The primary collection 2440 includes the collection ID and
the array of clips. The collection ID identifies the primary
collection. The array references several clips (i.e., clip 1 to
clip N). These represent clips or collections that have been added
to the composite display area. In some embodiments, the array is
ordered based on the locations of media clips in the composite
display area and only references clips in the primary lane of the
primary collection. An example of one or more clips in the primary
lane of the composite display area is described above by reference
to FIG. 22.
[0207] The clip object 2405 or collection object, in some
embodiments, is an ordered array of clip objects. The clip object
2405 references one or more component clips (e.g., the component
object 2410) in the array. In addition, the clip object 2405 stores
a clip ID that is a unique identifier for the clip object. In some
embodiments, the clip object 2405 is a collection object that can
reference component clip objects as well as additional collection
objects. An example of such collection object is a compound clip
that references multiple different clips. In some embodiments, the
clip object 2405 or collection object only references the video
component clip in the array, and any additional components
(generally one or more audio components) are then anchored to that
video component.
[0208] As shown in FIG. 24, the clip object 2405 is associated with
one or more metadata tags (i.e., tags 1-N). In some embodiments, these
tags represent those that are associated by the application's user.
Alternatively, one or more of these tags can be tags specified by
the media editing application. For example, when a media clip is
imported from a music library, the media editing application might
tag the clip with a "Music" tag. Alternatively, when the media clip
is imported from a folder named "Dialog", the media editing
application might tag the clip with a "Dialog" tag.
[0209] The component object 2410 includes a component ID, an asset
reference, and anchored components. The component ID identifies the
component. The asset reference of some embodiments uniquely
identifies a particular asset object. In some embodiments, the
asset reference is not a direct reference to the asset but rather
is used to locate the asset when needed. For example, when the
media editing application needs to identify a particular asset, the
application uses an event ID to locate an event object (not shown)
that contains the asset, and then the asset ID to locate the
particular desired asset. Several examples of clips associated with
an event or an event folder are described above by reference to
FIG. 22.
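The two-step lookup described above, first by event ID and then by
asset ID, can be sketched as follows. The class and field names are
assumptions; the event object in particular is not shown in FIG. 24.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    asset_id: str
    source_file: str


@dataclass
class Event:
    event_id: str
    assets: dict = field(default_factory=dict)  # asset_id -> Asset


@dataclass(frozen=True)
class AssetReference:
    event_id: str
    asset_id: str


def resolve(ref, events):
    """Locate the event first, then the asset inside that event."""
    event = events.get(ref.event_id)
    if event is None:
        return None
    return event.assets.get(ref.asset_id)
```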
[0210] In some embodiments, the clip object 2405 only stores the
video component clip in its array, and any additional components
(generally one or more audio components) are then anchored to that
video component. This is illustrated in FIG. 24 as the component
object 2410 includes a set of one or more anchored components 2415
(e.g., audio components). In some embodiments, each component that
is anchored to another clip or collection stores an anchor offset
that indicates a particular instance in time along the range of the
other clip or collection. That is, the anchor offset may indicate
that the component is anchored x number of seconds and/or frames
into the other clip or collection. In some embodiments, the offset
refers to the trimmed ranges of the clips.
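As a small worked example of an anchor offset expressed in seconds
and frames, the sketch below computes the timeline frame at which an
anchored component begins. The class, the field names, and the
integer frame rate are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AnchoredComponent:
    name: str
    offset_seconds: int  # whole seconds into the parent clip
    offset_frames: int   # additional frames beyond those seconds


def timeline_frame(parent_start_frame, anchored, fps):
    """Frame on the timeline at which the anchored component starts."""
    return (parent_start_frame
            + anchored.offset_seconds * fps
            + anchored.offset_frames)
```

For instance, a component anchored 2 seconds and 12 frames into a
clip that starts at frame 240 of a 24 frames-per-second sequence
begins at frame 240 + 48 + 12 = 300.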
[0211] As shown, the asset object 2445 includes an asset ID,
reference to a source file, and a set of source file metadata. The
asset ID identifies the asset, while the source file reference is a
pointer to the original media file. The set of source file metadata
is different for different media clips. Examples of source file
metadata include the file type (e.g., audio, video, movie, still
image, etc.), the file format (e.g., ".mov", ".avi", etc.),
different video properties, audio properties, etc.
[0212] In the example illustrated in FIG. 24, the tag object 2420
includes a tag ID that identifies the tag, a tag name that
represents the metadata tag, and an effect list 2425 that
represents the one or more effects associated with the metadata
tag. In some embodiments, the tag object 2420 includes an output
track that represents the output track associated with the metadata
tag. Several examples of such output tracks are described above by
reference to FIGS. 18-20.
[0213] As shown in FIG. 24, the effect object 2430 includes an
effect ID and effect properties. The effect ID identifies the
effect. In some embodiments, the effect properties are based on
parameters specified using an effect properties tool. The
properties tool can include different user interface items to
specify different parameters or settings for the effect.
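The objects of FIG. 24 could be modeled along the following lines.
This is a sketch only: the field names and types are assumptions
inferred from the description, and the asset object is reduced to a
string reference.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Effect:
    effect_id: str
    properties: dict = field(default_factory=dict)


@dataclass
class Tag:
    tag_id: str
    name: str
    effects: List[Effect] = field(default_factory=list)
    output_track: Optional[str] = None  # present in some embodiments


@dataclass
class Component:
    component_id: str
    asset_ref: str  # used to locate the asset when needed
    anchored: List["Component"] = field(default_factory=list)


@dataclass
class Clip:
    clip_id: str
    components: List[Component] = field(default_factory=list)
    tags: List[Tag] = field(default_factory=list)


@dataclass
class PrimaryCollection:
    collection_id: str
    clips: List[Clip] = field(default_factory=list)


@dataclass
class Sequence:
    sequence_id: str
    attributes: dict
    primary_collection: PrimaryCollection


def effects_for_sequence(seq):
    """List (clip ID, tag name, effect ID) for every tagged clip."""
    return [(clip.clip_id, tag.name, effect.effect_id)
            for clip in seq.primary_collection.clips
            for tag in clip.tags
            for effect in tag.effects]
```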
[0214] One of ordinary skill will also recognize that the data
structures shown in FIG. 24 are just a few of the many different
possible configurations for implementing the editing features
described above. For instance, in some embodiments, instead of
multiple tags per clip, only one tag (e.g., role, category) is
assigned to the clip. For example, several clips may be assigned
one audio role of "Dialog", "Music", or "SFX". When multiple tags
per clip are supported, the media editing application applies
different sets of effects in parallel. For example, if a clip is
tagged with both first and second tags, the media editing
application applies a first set of effects associated with the
first tag, and applies a second set of effects in parallel with the
first set of effects. Also, in the example illustrated in FIG. 24,
the tag object 2420 can be associated with the component object
2410 or the asset object 2445 instead of the clip object 2405.
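Applying two sets of effects in parallel might be sketched as below.
Each chain processes its own copy of the clip's original samples. The
description above does not say how the parallel results are
recombined, so summing the branches is an assumption, as are the
function names.

```python
def apply_chain(samples, chain):
    """Apply per-sample effect functions one after another."""
    for effect in chain:
        samples = [effect(s) for s in samples]
    return samples


def apply_parallel(samples, chains):
    """Feed the same input to every chain and sum the branch outputs."""
    if not chains:
        return list(samples)
    branches = [apply_chain(samples, chain) for chain in chains]
    # Assumption: parallel branches are recombined by summation.
    return [sum(values) for values in zip(*branches)]
```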
VII. Electronic System
[0215] Many of the above-described features and applications are
implemented as software processes that are specified as a set of
instructions recorded on a computer readable storage medium (also
referred to as computer readable medium). When these instructions
are executed by one or more processing unit(s) (e.g., one or more
processors, cores of processors, or other processing units), they
cause the processing unit(s) to perform the actions indicated in
the instructions. Examples of computer readable media include, but
are not limited to, CD-ROMs, flash drives, random access memory
(RAM) chips, hard drives, erasable programmable read only memories
(EPROMs), electrically erasable programmable read-only memories
(EEPROMs), etc. The computer readable media do not include
carrier waves and electronic signals passing wirelessly or over
wired connections.
[0216] In this specification, the term "software" is meant to
include firmware residing in read-only memory or applications
stored in magnetic storage, which can be read into memory for
processing by a processor. Also, in some embodiments, multiple
software inventions can be implemented as sub-parts of a larger
program while remaining distinct software inventions. In some
embodiments, multiple software inventions can also be implemented
as separate programs. Finally, any combination of separate programs
that together implement a software invention described here is
within the scope of the invention. In some embodiments, the
software programs, when installed to operate on one or more
electronic systems, define one or more specific machine
implementations that execute and perform the operations of the
software programs.
[0217] FIG. 25 conceptually illustrates an electronic system 2500
with which some embodiments of the invention are implemented. The
electronic system 2500 may be a computer (e.g., a desktop computer,
personal computer, tablet computer, etc.), phone (e.g., smart
phone), PDA, or any other sort of electronic device. Such an
electronic system includes various types of computer readable media
and interfaces for various other types of computer readable media.
Electronic system 2500 includes a bus 2505, processing unit(s)
2510, a graphics processing unit (GPU) 2515, a system memory 2520,
a network 2525, a read-only memory 2530, a permanent storage device
2535, input devices 2540, and output devices 2545.
[0218] The bus 2505 collectively represents all system, peripheral,
and chipset buses that communicatively connect the numerous
internal devices of the electronic system 2500. For instance, the
bus 2505 communicatively connects the processing unit(s) 2510 with
the read-only memory 2530, the GPU 2515, the system memory 2520,
and the permanent storage device 2535.
[0219] From these various memory units, the processing unit(s) 2510
retrieves instructions to execute and data to process in order to
execute the processes of the invention. The processing unit(s) may
be a single processor or a multi-core processor in different
embodiments. Some instructions are passed to and executed by the
GPU 2515. The GPU 2515 can offload various computations or
complement the image processing provided by the processing unit(s)
2510.
[0220] The read-only-memory (ROM) 2530 stores static data and
instructions that are needed by the processing unit(s) 2510 and
other modules of the electronic system. The permanent storage
device 2535, on the other hand, is a read-and-write memory device.
This device is a non-volatile memory unit that stores instructions
and data even when the electronic system 2500 is off. Some
embodiments of the invention use a mass-storage device (such as a
magnetic or optical disk and its corresponding disk drive) as the
permanent storage device 2535.
[0221] Other embodiments use a removable storage device (such as a
floppy disk or flash memory device, and its corresponding drive)
as the permanent storage device. Like the permanent storage
device 2535, the system memory 2520 is a read-and-write memory
device. However, unlike storage device 2535, the system memory 2520
is a volatile read-and-write memory, such as a random access
memory. The system memory 2520 stores some of the instructions and
data that the processor needs at runtime. In some embodiments, the
invention's processes are stored in the system memory 2520, the
permanent storage device 2535, and/or the read-only memory 2530.
For example, the various memory units include instructions for
processing multimedia clips in accordance with some embodiments.
From these various memory units, the processing unit(s) 2510
retrieves instructions to execute and data to process in order to
execute the processes of some embodiments.
[0222] The bus 2505 also connects to the input and output devices
2540 and 2545. The input devices 2540 enable the user to
communicate information to, and select commands for, the electronic
system. The input devices 2540 include alphanumeric keyboards and
pointing devices (also called "cursor control devices"), cameras
(e.g., webcams), microphones or similar devices for receiving voice
commands, etc. The output devices 2545 display images generated by
the electronic system or otherwise output data. The output devices
2545 include printers and display devices, such as cathode ray
tubes (CRT) or liquid crystal displays (LCD), as well as speakers
or similar audio output devices. Some embodiments include devices
such as a touchscreen that function as both input and output
devices.
[0223] Finally, as shown in FIG. 25, bus 2505 also couples
electronic system 2500 to a network 2525 through a network adapter
(not shown). In this manner, the computer can be a part of a
network of computers (such as a local area network ("LAN"), a wide
area network ("WAN"), or an Intranet), or a network of networks,
such as the Internet. Any or all components of electronic system
2500 may be used in conjunction with the invention.
[0224] Some embodiments include electronic components, such as
microprocessors, storage and memory that store computer program
instructions in a machine-readable or computer-readable medium
(alternatively referred to as computer-readable storage media,
machine-readable media, or machine-readable storage media). Some
examples of such computer-readable media include RAM, ROM,
read-only compact discs (CD-ROM), recordable compact discs (CD-R),
rewritable compact discs (CD-RW), read-only digital versatile discs
(e.g., DVD-ROM, dual-layer DVD-ROM), a variety of
recordable/rewritable DVDs (e.g., DVD-RAM, DVD-RW, DVD+RW, etc.),
flash memory (e.g., SD cards, mini-SD cards, micro-SD cards, etc.),
magnetic and/or solid state hard drives, read-only and recordable
Blu-Ray® discs, ultra density optical discs, any other optical
or magnetic media, and floppy disks. The computer-readable media
may store a computer program that is executable by at least one
processing unit and includes sets of instructions for performing
various operations. Examples of computer programs or computer code
include machine code, such as is produced by a compiler, and files
including higher-level code that are executed by a computer, an
electronic component, or a microprocessor using an interpreter.
[0225] While the above discussion primarily refers to
microprocessors or multi-core processors that execute software, some
embodiments are performed by one or more integrated circuits, such
as application specific integrated circuits (ASICs) or field
programmable gate arrays (FPGAs). In some embodiments, such
integrated circuits execute instructions that are stored on the
circuit itself.
[0226] As used in this specification and any claims of this
application, the terms "computer", "server", "processor", and
"memory" all refer to electronic or other technological devices.
These terms exclude people or groups of people. For the purposes of
the specification, the terms "display" and "displaying" mean displaying
on an electronic device. As used in this specification and any
claims of this application, the terms "computer readable medium,"
"computer readable media," and "machine readable medium" are
entirely restricted to tangible, physical objects that store
information in a form that is readable by a computer. These terms
exclude any wireless signals, wired download signals, and any other
ephemeral signals.
[0227] While the invention has been described with reference to
numerous specific details, one of ordinary skill in the art will
recognize that the invention can be embodied in other specific
forms without departing from the spirit of the invention. In
addition, a number of the figures (including FIGS. 1, 10, 15, 16,
and 21) conceptually illustrate processes. The specific operations
of these processes may not be performed in the exact order shown
and described. The specific operations may not be performed in one
continuous series of operations, and different specific operations
may be performed in different embodiments. Furthermore, the process
could be implemented using several sub-processes, or as part of a
larger macro process. In addition, some embodiments execute
software stored in programmable logic devices (PLDs), ROM, or RAM
devices.
[0228] In addition, many of the user interface controls described
above relate to controlling audio. However, one of ordinary skill
in the art would recognize that similar controls can be provided
for image effects or filters. For example, one or more user
interface controls (e.g., sliders, knobs, buttons) can be provided
for each metadata tag to control the effect settings (e.g.,
brightness, sharpness, amount of distortion, fade-in effect,
fade-out effect, etc.).
[0229] In many of the examples described herein, a media editing
application uses metadata to facilitate editing operations.
However, one of ordinary skill in the art would recognize that the
metadata features can be provided for different types of
applications or programs (e.g., an image organizing application, a
server-side web application, an operating system framework). For
instance, the metadata features can be provided in an image
application that allows the application's user to associate
different items with keywords, and apply one or more effects to
those items based on the association of the keywords, and/or output
those items to different tracks (e.g., files, channels) based on
the association of the keywords. Thus, one of ordinary skill in the
art would understand that the invention is not to be limited by the
foregoing illustrative details, but rather is to be defined by the
appended claims.
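By way of a non-limiting sketch, the keyword-based output described above might be modeled as follows. The function route_items_by_keyword, the item names, and the track names are hypothetical and are used for illustration only. The sketch assumes that each keyword maps to at most one output track and that items with no mapped keyword are not output.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def route_items_by_keyword(
    items: Iterable[Tuple[str, Set[str]]],
    keyword_to_track: Dict[str, str],
) -> Dict[str, List[str]]:
    """Group item names by the output track their keywords map to."""
    tracks: Dict[str, List[str]] = defaultdict(list)
    for name, keywords in items:
        # sorted() keeps the visiting order deterministic for multi-keyword items
        for keyword in sorted(keywords):
            track = keyword_to_track.get(keyword)
            if track is not None and name not in tracks[track]:
                tracks[track].append(name)
    return dict(tracks)


items = [
    ("beach.jpg", {"vacation"}),
    ("memo.wav", {"work", "vacation"}),
    ("scan.png", {"unsorted"}),
]
mapping = {"vacation": "track_a", "work": "track_b"}
routed = route_items_by_keyword(items, mapping)
```

An item associated with several keywords is placed on every track that those keywords map to, which mirrors the "and/or" language of the paragraph above.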
* * * * *