U.S. patent application number 10/887072 was filed with the patent office on 2004-07-08 and published on 2004-12-30 as publication number 20040268224 for "Authoring system for combining temporal and nontemporal digital media." Invention is credited to Peter A. Balkus, T. Winton Crofton, Glenn McElhoe, and Thomas C. Purcell.
Application Number: 20040268224 (Ser. No. 10/887072)
Family ID: 24152483
Kind Code: A1
Filed: 2004-07-08
Published: 2004-12-30

United States Patent Application 20040268224
Balkus, Peter A.; et al.
December 30, 2004

Authoring system for combining temporal and nontemporal digital media
Abstract
An authoring tool has a graphical user interface enabling
interactive authoring of a multimedia presentation including
temporal and nontemporal media. The graphical user interface
enables specification of the temporal and spatial relationships
among the media and playback of the presentation with the specified
temporal and spatial relationships. The spatial and temporal
relationships among the media may be changed independently of each
other. The presentation may be viewed interactively under the
control of the author during the authoring process without encoding
the audio and video data into a streaming media data file for
combination with the other media, simulating behavior of a browser
that would receive a streaming media data file. The multimedia
presentation may include elements that initiate playback of the
presentation from a specified point in time. After authoring of the
presentation is completed, the authoring tool assists in encoding
and transferring the presentation for distribution. Information
about the distribution format and location may be stored as
user-defined profiles. Communication with the distribution location
may be tested, and the presentation and the distribution information may
be audited prior to encoding and transfer to reduce errors. A
presentation is encoded according to the defined temporal and
spatial relationships and the distribution format and location
information to produce an encoded presentation. The encoded
presentation and any supporting media data are transferred to the
distribution location, such as a server. A streaming media server
may be used for streaming media, whereas other data may be stored
on a conventional data server. Accounts may be provided for a
streaming media server for authors to publish their presentations.
The authoring tool may be associated with a service that uses the
streaming media server. Such streaming media servers also may be a
source of stock footage for use by authors.
Inventors: Balkus, Peter A. (Acton, MA); McElhoe, Glenn (Arlington, MA); Crofton, T. Winton (Newton, MA); Purcell, Thomas C. (Northwood, NH)

Correspondence Address:
PETER J. GORDON, PATENT COUNSEL
AVID TECHNOLOGY, INC.
ONE PARK WEST
TEWKSBURY, MA 01876 US

Family ID: 24152483
Appl. No.: 10/887072
Filed: July 8, 2004
Related U.S. Patent Documents

Application Number    Filing Date     Patent Number
10887072              Jul 8, 2004
09539749              Mar 31, 2000
Current U.S. Class: 715/203; 707/E17.009; 715/204
Current CPC Class: H04N 21/854 (20130101); G11B 27/105 (20130101); G11B 27/34 (20130101); G06F 16/4393 (20190101); G11B 27/034 (20130101)
Class at Publication: 715/500.1
International Class: G06F 017/00
Claims
What is claimed is:
1. A method for publishing a streaming media presentation
containing temporal media and events associated with references to
nontemporal media combined according to a timeline and a layout
specification, comprising: confirming availability of all of the
data files including the temporal and nontemporal media in the
streaming media presentation; encoding the streaming media
presentation; transferring the streaming media presentation to a
first streaming media server; and transferring the nontemporal
media data files to a second server.
2. The method of claim 1, further comprising previewing the
streaming media presentation from the first streaming media
server.
3. The method of claim 1, further comprising previewing the
streaming media presentation before transferring the streaming
media presentation.
4. The method of claim 1, further comprising setting up a profile
indicating account access information, a pathname for reading, and a
pathname for writing for each of the first and second servers, the
profile being associated with a name.
5. The method of claim 4, wherein encoding uses the profile to
create the streaming media presentation.
6. A system for providing a service to an author for publishing a
multimedia presentation, comprising: an encoder having a first
input for receiving a timeline comprising one or more first tracks
for temporal media and one or more second tracks for nontemporal
media, a second input for receiving a layout specification
indicating an association between each of the one or more first
tracks and one or more second tracks and a display location and
having an output for providing a streaming media presentation
containing the temporal media and the nontemporal media combined
according to the timeline and the layout specification; a transfer
tool for transferring the streaming media presentation file to a
first media server and the nontemporal media to a second media
server; wherein the user has a first account for the streaming
media server; wherein the user has a second account for the second
media server; and wherein the authoring tool has an association
with a service that provides the streaming media server.
7. A system for providing a service to authors for creating and
publishing multimedia presentations, accessible remotely by an
authoring tool capable of transferring data between the authoring
tool and the system, comprising: an account management system
enabling multiple users to register, each with a username and
password and billing information; a server including computer
readable storage media having storage space allocated for each of
the registered users, for publishing multimedia presentations for
access through a publicly accessible computer network; a media
publication management system for interacting with the authoring
tool to enable transfer of streaming media from multimedia
presentations from the authoring tool to the server; and a media
access management system accessible by each registered user and
enabling each registered user to transfer multimedia data from the
system to the authoring tool for use in a multimedia
presentation.
8. A method for publishing a presentation specified by a timeline
including a plurality of tracks and a layout defining a spatial
relationship among media in the plurality of tracks, comprising:
receiving an indication of a distribution format for the
presentation and one or more destination storage locations; for
each file referred to in the timeline of the presentation, creating a
file name for the file in the one or more destination storage
locations; encoding the presentation in the distribution format
using the file names in the one or more destination storage
locations and indicating the spatial relationship; and transferring
the encoded presentation and each file to the one or more
destination storage locations.
9. The method of claim 8, further comprising verifying connections
with the destination storage location before transferring.
10. The method of claim 8, wherein the one or more destination
storage locations includes a first media streaming server for the
encoded presentation and a second server for each file referred to
in the timeline of the presentation.
Description
CROSS REFERENCE TO RELATED APPLICATION
[0001] This Application is a divisional application of U.S. patent
application Ser. No. 09/539,749, filed Mar. 31, 2000, now pending,
which is hereby incorporated by reference.
BACKGROUND
[0002] A variety of systems are used for authoring multimedia
presentations such as motion pictures, television shows,
advertisements for television, presentations on digital versatile
disks (DVDs), interactive hypermedia, and other presentations. Such
authoring systems generally provide a user interface and a process
through which multimedia data is captured and stored, and through
which the multimedia presentation is created, reviewed and
published for distribution. The user interface and process for
authoring generally depend on the kind of presentation being
created and what the system developer believes is intuitive and
enables an author to work creatively, flexibly and quickly.
[0003] Some multimedia presentations are primarily nontemporal
presentations. That is, any change in the presentation typically
depends on user activity or other event, instead of the passage of
time. Some nontemporal multimedia presentations may include
temporal components. For example, a user may cause a video to be
displayed that is related to a text document by selecting a
hyperlink to the video in the document.
[0004] Other multimedia presentations are primarily temporal
presentations incorporating audio and/or video material, and
optionally other media related to the temporal media. Primarily
temporal media presentations that are well known today include
streaming media formats such as QuickTime, Real Media, Windows
Media Technology and SMIL, and formats that encode data in the
vertical blanking interval of a television signal, such as used by
WebTV, ATVEF, and other similar formats.
[0005] A variety of authoring tools have been developed for
different kinds of presentations. Tools for processing combined
temporal and nontemporal media include those described in PCT
Publication No. WO99/52045, corresponding to U.S. Pat. No.
6,426,778, and PCT Publication No. WO96/31829, corresponding to
U.S. Pat. No. 5,892,507, and U.S. Pat. No. 5,659,793 and U.S. Pat.
No. 5,428,731.
SUMMARY
[0006] An authoring tool has a graphical user interface enabling
interactive authoring of a multimedia presentation including
temporal and nontemporal media. The graphical user interface
enables specification of the temporal and spatial relationships
among the media and playback of the presentation with the specified
temporal and spatial relationships. The spatial and temporal
relationships among the media may be changed independently of each
other. The presentation may be viewed interactively under the
control of the author during the authoring process without encoding
the audio and video data into a streaming media data file for
combination with the other media, simulating behavior of a browser
that would receive a streaming media data file. The multimedia
presentation may include elements that initiate playback of the
presentation from a specified point in time. After authoring of the
presentation is completed, the authoring tool assists in encoding
and transferring the presentation for distribution. Information
about the distribution format and location may be stored as
user-defined profiles. Communication with the distribution location
may be tested, and the presentation and the distribution information may
be audited prior to encoding and transfer to reduce errors. A
presentation is encoded according to the defined temporal and
spatial relationships and the distribution format and location
information to produce an encoded presentation. The encoded
presentation and any supporting media data are transferred to the
distribution location, such as a server. A streaming media server
may be used for streaming media, whereas other data may be stored
on a conventional data server. Accounts may be provided for a
streaming media server for authors to publish their presentations.
The authoring tool may be associated with a service that uses the
streaming media server. Such streaming media servers also may be a
source of stock footage for use by authors. These various
functions, and combinations thereof, of the authoring tool are each
aspects of the present invention that may be embodied as a computer
system, a computer program product or a computer implemented
process that provides these functions.
[0007] In one embodiment, the spatial relationship may be defined
by a layout specification that indicates an association of one or
more tracks of temporal media and one or more tracks of nontemporal
media with a corresponding display location. If the temporal media
is not visible, such as audio, the spatial relationship may be
defined among the nontemporal media. One kind of temporal
relationship between nontemporal data and temporal media is
provided by a table of contents track. The nontemporal media of the
elements associated with points in time on the table of contents
track of a presentation is combined and displayed for the duration
of the presentation. If a user selects one of the elements from the
table of contents track, presentation of the temporal media data is
initiated from the point in time associated with that element on
the table of contents track.
[0008] It is also possible to associate a streaming media
presentation with another streaming media presentation. For
example, an event in one streaming media presentation may be used
to initiate playback of another subsequent streaming media
presentation. The two presentations may have different layout
specifications. A document in a markup language may be created to
include a hyperlink to each of the plurality of streaming media
presentations.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] FIG. 1 is an illustration of an example multimedia
presentation;
[0010] FIG. 2 is an illustration of a relationship among multiple
presentations;
[0011] FIG. 3 is an illustration of a timeline for defining a
multimedia presentation;
[0012] FIG. 4 illustrates example layouts for a multimedia
presentation;
[0013] FIG. 5 is an illustration of an example graphical user
interface for specifying a layout;
[0014] FIG. 6 is an illustration of an example graphical user
interface for specifying a mapping between frames in a layout and
tracks in a timeline;
[0015] FIG. 7A is a data flow diagram illustrating a relationship
of parts of a system for authoring and publishing a multimedia
presentation;
[0016] FIG. 7B is an illustration of an example graphical user
interface for interactively authoring and viewing a
presentation;
[0017] FIG. 8A illustrates an architecture for implementing an
editing viewer of FIG. 7A;
[0018] FIG. 8B illustrates an architecture for implementing a
display manager of FIG. 8A;
[0019] FIG. 8C is a flowchart describing how a graphical user
interface may be constructed;
[0020] FIG. 8D is a flowchart describing how a display manager may
display contents and its corresponding portion of the editing
interface;
[0021] FIG. 8E is a flowchart describing how the table of contents
display may be updated;
[0022] FIG. 8F is a flowchart describing how a new table of
contents file may be generated;
[0023] FIG. 9 is a flowchart describing how a presentation may be
published;
[0024] FIG. 10 illustrates a graphical user interface for managing
a transfer process of a multimedia presentation;
[0025] FIG. 11A is a flowchart describing how a presentation may be
encoded;
[0026] FIG. 11B is a flowchart describing, in one implementation,
how a program may be encoded;
[0027] FIG. 11C is a flowchart describing how a presentation may be
transferred;
[0028] FIG. 12 is a data flow diagram illustrating interaction of a
transfer tool with a streaming server and a data server; and
[0029] FIG. 13 is a data flow diagram illustrating a relationship
of multiple editing and transfer systems with a streaming
server.
DETAILED DESCRIPTION
[0030] In this description, all patent applications and published
patent documents referred to herein are hereby incorporated by
reference.
[0031] Referring to FIG. 1, an example of a multimedia
presentation, which may be created using an authoring system to be
described herein, will now be described. In general, a multimedia
presentation is a combination of temporal media, such as video,
audio and computer-generated animation, and nontemporal media, such
as still images, text, hypertext documents, etc. Some temporal
media, such as animations in the GIF format or the Macromedia Flash
formats, may be used as if they were nontemporal media.
[0032] The temporal and nontemporal media may be combined in many
different ways. For example, a multimedia presentation may include
audio and/or video combined with multimedia slides that are time
synchronized with the audio and/or video. The presentation also may
include advertisements and/or an index of the temporal media. In
general, there is a temporal relationship and a spatial
relationship among the temporal and nontemporal media. In some
presentations, only a temporal relationship exists between certain
temporal media, such as audio, and the nontemporal media. An
example presentation, shown in FIG. 1, includes video 100, HTML
events 102, a table of contents 104, and an advertisement 106.
[0033] FIG. 2 illustrates a more complex multimedia presentation
format. This multimedia presentation includes a hypermedia document
200, for example in a markup language, including hyperlinks to one
or more streaming media presentations, as indicated at 202, 204,
and 206. Upon selection of a hyperlink, the corresponding streaming
multimedia presentation 208, 210 or 212 may be played. An event at
or near the end of a streaming multimedia presentation may be used
to initiate playback of the subsequent multimedia presentation. The
different presentations may have different specified spatial
relationships.
[0034] There are many ways in which such multimedia presentations
may be stored. For example, various streaming media formats, such
as Real Media, Microsoft Windows Media Technology, QuickTime and
SMIL, may be used. The temporal media also may be encoded in a
television signal, with nontemporal media encoded in a
vertical-blanking interval of the television signal, such as used
by WebTV, ATVEF and other formats.
[0035] Creating such a multimedia presentation involves creating a
temporal relationship between each element of nontemporal media and
the temporal media. Such a relationship may be visualized using a
timeline, an example of which is shown in FIG. 3. In general, a
timeline has one or more tracks of temporal media, and one or more
tracks of nontemporal media. For example, there may be one video
track, one audio track, and an event track. The presentation of the
media on all the tracks is synchronized by the positions of the
elements in the timeline. These positions may be specified
graphically through a graphical user interface. Various data
structures may be used to represent such a timeline, such as those
described in U.S. Pat. No. 5,584,006 (Reber), U.S. Pat. No.
5,724,605 (Wissner) and PCT Publication No. WO98/05034.
[0036] The timeline is a time based representation of a
composition. The horizontal dimension represents time, and the
vertical dimension represents the tracks of the composition. Each
track has a row in the timeline which it occupies. The size of a
displayed element in a graphical user interface is determined as a
function of the duration of the segment it represents and a
timeline scale. Each element in each track of the timeline has a
position (determined by its start time within the presentation), a
title, associated data, and optionally a duration.
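By way of illustration, a minimal JavaScript sketch of such a timeline structure, using hypothetical field names rather than any structure from the patents cited above, might be:

  // Each track occupies one row of the timeline; each element carries a
  // start time, a title, optionally a duration, and a media reference.
  const timeline = {
    duration: 60, // seconds
    tracks: [
      { id: "V1", type: "video",
        elements: [{ start: 0, duration: 60, title: "Main", media: "main.mov" }] },
      { id: "E1", type: "event",
        elements: [{ start: 5, title: "Slide 1", media: "slide1.htm" }] }
    ]
  };

  // The displayed size of an element is a function of the duration of the
  // segment it represents and the timeline scale (pixels per time unit).
  function elementWidth(element, pixelsPerSecond) {
    return (element.duration || 0) * pixelsPerSecond;
  }

  console.log(elementWidth(timeline.tracks[0].elements[0], 4)); // 240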
[0037] FIG. 3 illustrates an example timeline which includes two
audio tracks 300, two video tracks 302, two event tracks 304, a
title track 306, and a table of contents track 308. Each of these
tracks will now be described.
[0038] An audio track 300 or a video track 302 is for placement of
temporal media. Such tracks commonly are used in video editing
applications, such as shown in PCT Publication No. WO98/05034,
which corresponds to U.S. patent application Ser. Nos. 08/687,926
and 08/691,985. Similarly, a title track 306 commonly is used to
create title effects for movies, such as scrolling credits. As
such, titles commonly are considered temporal media because they
have parameters that are animated over time and that are combined
with video data. Each track supports defining a sequence of
segments of media data. A segment references, either directly or
indirectly, the media data for the segment.
[0039] In the timeline shown herein, event tracks 304 associate
nontemporal media with a particular point in time, thus creating a
temporal relationship with the temporal media in tracks 300, 302,
and 306. Each event track is a list of events. Each event includes
a time and references a data file or a uniform resource locator,
either directly or indirectly, from which media data for the event
may be received.
[0040] The table of contents track 308 associates a table of
contents entry with a point in time. The table of contents may be
used as an index to the temporal media. Each entry includes a time
and associated content, typically text, entered by the author. As
described in more detail below, the table of contents entries are
combined into a single document for display. If a user selects an
element in the table of contents as displayed, the presentation is
displayed starting at the point in time corresponding to the
selected element.
[0041] The spatial relationship of the elements in the timeline as
presented also may be specified by the author. In one simple
example, a layout specification indicates a combination of frames
of a display area, of which one or more frames is associated with one
or more of the tracks in the timeline. Some tracks might not be
associated with a display frame. Some frames might be associated
directly with static media and not with a track. In general a frame
is associated with only one track and a track is associated with
only one frame.
[0042] The possible combinations and arrangements of the various
tracks in a timeline are unlimited, and are not limited to visual
media. As shown in the examples in FIG. 4, the visual display may
be merely a table of contents 400, or an event track 402, or both
404, for example, in combination with audio. These examples are
merely illustrative. In some cases, the audio has a corresponding
visual component that may be displayed, such as volume and position
controls. Video may be displayed, for example, with an event track
406, or a table of contents track 408, or both 410, such as shown
in FIG. 4.
[0043] A graphical user interface, an example of which is
described in connection with FIG. 5, enables a user to select from
among several layout specifications that have been stored as
templates. A graphical user interface, an example of which is
described in connection with FIG. 6, enables an author to make
assignments between tracks in the timeline and frames in the
display.
[0044] In FIG. 5, a graphical user interface 500 illustrates
templates in a template window 502. A template defines a mapping
between frames and tracks and a display arrangement of the frames
such as described in FIG. 4. A selected template such as 504 is
viewed in a preview pane 506. A user may browse the file system to
identify other templates by selecting a button 508 as in
conventional user interfaces. A template may be defined using the
hypertext markup language (HTML), for example by using frame set
definitions. A template may be authored using any conventional HTML
authoring tool, word processor or text editor. In the user
interface, a template file may be accessed to determine its frame
set definitions to generate an appropriate icon for display.
Similarly, the preview pane 506 is generated by accessing the frame
set definition within the selected template file. The mapping
between frames and tracks also is stored in the template file.
[0045] An example template file follows:
<HTML>
<AVIDPUB tagtype="framemap" framename="Frame_A" feature="MOVIE" originalurl="static.htm">
<AVIDPUB tagtype="framemap" framename="Frame_B" feature="EVENTTRACK" featurenum="1">
<AVIDPUB tagtype="framemap" framename="Frame_C" feature="EVENTTRACK" featurenum="2">
<AVIDPUB tagtype="framemap" framename="Frame_D" feature="TOC" originalurl="static.htm">
<AVIDPUB tagtype="framemap" framename="Frame_E" feature="EVENTTRACK" featurenum="3">
<AVIDPUB tagtype="framemap" framename="Frame_Top" feature="STATICHTML" featurenum="0" originalurl="header.htm">
<FRAMESET cols="40%,60%" bordercolor="blue" frameborder=yes framespacing=2>
  <FRAMESET rows="70,40%,*">
    <FRAME SRC="header.htm" name="Frame_Top">
    <FRAME SRC="AvidVid.htm" name="Frame_A">
    <FRAME SRC="AvidPubToc.html" name="Frame_D">
  </FRAMESET>
  <FRAMESET rows="33%,34%,*">
    <FRAME SRC="static.htm" name="Frame_B">
    <FRAME SRC="static.htm" name="Frame_C">
    <FRAME SRC="static.htm" name="Frame_E">
  </FRAMESET>
</FRAMESET>
</HTML>
[0046] The first few lines of this template include
"<AVIDPUB>" HTML elements. These elements keep track of the
mappings between frames and tracks. Following these elements, a
frame set definition is provided using the "<FRAMESET>"
element. Each frame has a source file name (SRC="filename") and a
name (name="name") associated with it. Each <AVIDPUB> element
maps a frame name to a "feature," which is a name of a type of a
track, and a feature number indicating which track of that type is
mapped to the frame.
[0047] A template may include other content and structure beyond
that shown in the example. For example, a company may want all of
its presentations to use the same logo in the same position. This
consistency may be provided by adding a reference to the logo to
the template.
[0048] By selecting the next button 510 in FIG. 5, the mapping
between frames and tracks may be defined. A user interface such as
shown in FIG. 6 is then displayed. The system uses the template
HTML file to generate a view 600. Also, the frame names are
extracted from the selected template and are listed in a region
602. The available tracks for a presentation are accessed, possibly
using the timeline, to generate menus such as indicated at 604. The
name of each track is put into a menu associated with each frame
name to enable a user to select that track and associate it with
the corresponding frame. If a track is associated with a frame, the
<AVIDPUB> element for that frame has its feature attribute
modified to indicate the track is associated with that frame. A
check may be performed to ensure that a track is not associated
with more than one frame.
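A sketch of how such a mapping might be read back out of a template file, together with the uniqueness check, follows; the function name and the regular-expression approach are illustrative assumptions rather than the HTML programming interface mentioned in paragraph [0049] below.

  // Read frame-to-track mappings from the <AVIDPUB> elements of a template
  // and verify that no track is associated with more than one frame.
  function readFrameMap(templateHtml) {
    const map = {};         // framename -> { feature, featurenum }
    const seen = new Set(); // tracks already assigned to a frame
    for (const [, attrs] of templateHtml.matchAll(/<AVIDPUB\s+([^>]*)>/gi)) {
      const a = {};
      for (const [, name, value] of attrs.matchAll(/(\w+)="([^"]*)"/g)) {
        a[name] = value;
      }
      const trackKey = a.feature + ":" + (a.featurenum || "");
      if (a.feature !== "STATICHTML" && seen.has(trackKey)) {
        throw new Error("track assigned to more than one frame: " + trackKey);
      }
      seen.add(trackKey);
      map[a.framename] = { feature: a.feature, featurenum: a.featurenum };
    }
    return map;
  }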
[0049] In this and other processes described below in which an HTML
file is read and accessed, an application programming interface
provided by Microsoft Corporation may be used to read and write
data in HTML files.
[0050] Having now described examples of data structures for
timelines and layout specifications, how they may be defined, and how
they may be associated with each other, authoring and publishing of
such presentations will now be described.
[0051] FIG. 7A is a data flow diagram illustrating a relationship
of parts of a system for authoring and publishing a multimedia
presentation. Using an editing graphical user interface (GUI) 700
described below with FIG. 7B and a layout GUI 702, described above
with FIG. 6, timeline activity 704 and a layout specification 706
are defined. This data is provided to an editing manager 708 to
enable viewing of the presentation during editing. The editing
manager, given a point in time 722 on the timeline and optionally a
playback rate 724 from the editing GUI 700, generates video data
714 and other visible data 710 for display in the editing GUI 700,
in an arrangement defined by the layout specification 706, using
media files 712. An example implementation of the editing manager
is described below in connection with FIGS. 8A-F. After the author has
completed creating the presentation, the publisher 718 is invoked
to process the timeline 716, layout specification 706, and media
files 712 to generate the published presentation 720.
[0052] An example GUI for the editing GUI of FIG. 7A will now be
described in connection with FIG. 7B. In FIG. 7B, the timeline
region 700 includes an index track 702, a video track 704, a titles
track 706, two audio tracks 708 and 710, three event tracks 712,
714 and 716 and the timeline scale 718. The timeline scale
determines the number of pixels that represents a time unit.
Increasing or decreasing this time scale allows the user to focus
on a particular location in the composition, or to have more of
an overview of the composition. A viewer window 720 displays the
video data and other visual information. A display controller 722
includes a position indicator 724 which points to the present
position within the multimedia presentation which is being viewed.
Forward and backward skip buttons 726 and 728 and play buttons 730
also may be provided. The position indicator 724 is associated with
a position indicator 736 in the timeline 700. The buttons 726, 728
and 730, and position indicator 724 may be used to control the
viewing of the multimedia presentation during authoring. Frame
boundaries, as indicated at 732 and 734, correspond to the frame set
definitions in the layout specification. The frame boundaries 732
and 734 may be made adjustable using a cursor positioning device,
such as a mouse or touchpad. Such adjustments may be transformed
into edits of the layout specification. The various kinds of
operations that may be performed to edit the audio and video and to
add titles are described in more detail in PCT Publication No.
WO98/05034.
[0053] How entries in the index or table of contents track 702 and
event tracks 712 through 716 are added or modified will now be
described. A region 740 illustrates available multimedia data for
insertion into events. Buttons 742, 744 and 746 enable different
views of the information presented in region 740. Button 742
selects a mode in which the system displays a picture of the data.
Button 744 selects a mode in which the system displays a detailed
list including a small picture, filename, and timestamp of the data
file or resource. Button 746 selects a mode in which the system
displays only titles. Other modes are possible and the invention is
not limited to these. The names displayed are for those files found
in the currently active path in the file system used by the
authoring tool or other resources available to the system. The list
operation, for example, may involve a directory lookup performed by
the computer on its file system. A user may select an indicated
data file or resource and drag its icon to an event timeline either
to create a new event, or to replace media in an existing event, or
to add media to an existing event.
[0054] On the event timeline, an event 750 indicates a data file or
other resource associated with a particular point in time. Event
752 indicates that no file or resource is associated with the event
at this time. In response to a user selection of a point on an
event track, a new event may be created, if one is not already
there, or the selected event may be opened. Whether a new event is
created, or an existing event is opened, the user may be presented
with a properties dialog box to enable entry of information, such
as a name for the event, or a file name or resource locator for the
associated media, for storage into the event data structure. An
event that is created may be empty, i.e., might not refer to any
data file or resource.
[0055] The elements on the event track may be illustrated as having
a width corresponding to the amount of time it would take to
download the data file over a specified network connection. To
achieve this kind of display, the number of bytes of a data file is
divided by the byte-per-second rate of the network connection to
determine a time value, in seconds, which is used to determine the
width of the icon for the event to be displayed on the event track.
Displaying the temporal width of an object provides information to
the author about whether enough time is available at the location
of distribution to download the data and to display the data at the
desired time.
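The byte-per-second computation just described can be sketched as follows; all of the numbers are made up for illustration.

  // Width of an event's icon: estimated download time of its data file,
  // converted to pixels with the timeline scale.
  function eventIconWidth(fileSizeBytes, connectionBytesPerSecond, pixelsPerSecond) {
    const downloadSeconds = fileSizeBytes / connectionBytesPerSecond;
    return downloadSeconds * pixelsPerSecond;
  }

  // A 56,000-byte slide over a 3,500 byte/s connection takes 16 seconds;
  // at 4 pixels per second the icon is drawn 64 pixels wide.
  console.log(eventIconWidth(56000, 3500, 4)); // 64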
[0056] Similar to the events, a user may select an element on the
table of contents track as indicated at 754. An item may be added
by selecting a point on the table of contents track with a cursor
control device. Upon selection, a dialog window is displayed
through which the user may enter text for the selected element.
Each of the elements in the table of contents track 702 is
displayed in the frame 756 in the viewer 720.
[0057] To display the presentation to the author, for a given point
in time of the presentation, the system determines which contents
should be displayed. In the example shown in FIG. 7B, event 758 is
currently being displayed from the event track in viewer frame 760.
The video is being shown in frame 762. The table of contents
elements are shown in frame 756. A viewer such as shown in FIGS. 7A
and 7B may be implemented in many ways, depending on the
availability of preexisting program components to be used, and the
platform on which the viewer is implemented. An example
implementation will now be described in connection with FIGS. 8A through
8E for use with a platform as specified below. In this
implementation, the viewer uses an Internet Explorer browser
component, available from Microsoft Corporation, to render the
nontemporal media. Currently available browser components are
capable of processing encoded streaming media files but not video
and audio data defined using a timeline. Thus, the temporal media,
in particular the audio and video, is rendered in a manner typical
in video editing systems, such as described in PCT Publication No.
WO98/05034. The viewer described herein reads a presentation and
accesses data, audio and video files to produce the presentation
without an encoded streaming media file, thus simulating the
operation of a browser that uses streaming media files.
[0058] Referring now to FIG. 8A, an architecture for this
implementation is illustrated. This architecture includes an asset
manager 8100 which manages access to data files 8102 used in the
presentation. A clip manager 8104 maintains the timeline data
structure 8106 in response to instructions from the user via the
graphical user interface. Requests for access to information from
the timeline 8106 by the presentation manager 8108 and display
manager 8110 also are managed by the clip manager 8104. The
presentation manager 8108 maintains the layout specification 8112
and other display files 8114. The other display files include files
in a markup language that define the table of contents frame and
the video frames. An example layout was described above in
connection with FIG. 6. An example table of contents file and
example video frame files, for the Real Media and Windows Media
technology formats, are provided in Appendices I-III, the
interrelationship of which will now be described.
[0059] There are several ways in which the table of contents may be
constructed to allow actions on a table of contents entry to cause
a change in the playback position in the video frame. One example
is provided by the source code and Appendices I-III. In the table
of contents page, a JavaScript function called "seekToEPMarker"
takes either a marker number (for Windows Media technology) or a
time in milliseconds (for Real Media) and calls a function
"seekToVideoMarker" of its parent frame in its frame set. This
function call actually calls the JavaScript function of the child
frame of the table of contents' parent frame that includes the
video player. That function receives the marker and the time in
milliseconds and generates the appropriate commands to the media
player to initiate playback of the streaming media from the
designated position.
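The patent names these two functions but does not reproduce their bodies. A minimal JavaScript sketch, assuming the video player lives in a frame named "Frame_A" and using placeholder player methods rather than a real player API, might be:

  // In the table of contents frame: forward the request to the parent
  // frame set. markerNum serves Windows Media technology; timeMs serves
  // Real Media.
  function seekToEPMarker(markerNum, timeMs) {
    parent.seekToVideoMarker(markerNum, timeMs);
  }

  // In the parent frame set: relay the request to the child frame that
  // contains the video player.
  function seekToVideoMarker(markerNum, timeMs) {
    frames["Frame_A"].seekPlayer(markerNum, timeMs);
  }

  // In the video frame: issue the appropriate commands to the embedded
  // media player; the two seek methods below are placeholders.
  function seekPlayer(markerNum, timeMs) {
    const player = document.getElementById("player");
    if (markerNum != null) player.seekToMarker(markerNum); // Windows Media
    else player.seekToTimeMs(timeMs);                      // Real Media
  }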
[0060] Turning again to FIG. 8A, the display managers 8110 each are
associated with a display window in the viewer and control
displaying content in their respective windows. In general, the
display managers access data from the presentation manager 8108 and
clip manager 8104 to provide data to the graphical user interface
8116, in response to events that modify the timeline or the
presentation of data in the timeline as received from the graphical
user interface or the clip manager. The graphical user interface
8116 communicates with the clip manager, presentation manager and
display manager to create and maintain the view of the timeline and
the presentation in response to user inputs.
[0061] A display manager, in one implementation, is described in
more detail in connection with FIG. 8B. The display manager
includes a controller module 8200 which communicates with the
graphical user interface, presentation manager and clip manager. To
display a data file, the controller instructs a browser component
8202 to render data for display. The output of the browser
component is processed by an image scaling module 8204 that scales
the result to fit within the appropriate display region in the
viewer.
[0062] Referring now to FIG. 8C, how the display of the
presentation in the viewer may be created will now be described. In
particular, the layout of the presentation is defined by the layout
specification 8112. This layout specification is parsed 8300 to
generate a tree-like representation of the layout. In particular,
as shown in the example layout specification provided above, some
frames are defined as subframes of other frame sets. This
hierarchical definition of frames translates into a tree-like
representation. For each nonleaf node in the tree, a splitter
window is created 8302 in the presentation display region on the
user interface. For each leaf node in the tree, a display window is
created 8304 within its associated splitter window. This display
window is instructed 8306 to display its content at time zero,
i.e., the beginning, in the presentation to initialize the display.
The display window has an associated display manager 8110.
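A sketch of this recursive construction over the example template's frame-set tree follows; the console output stands in for the actual creation of splitter and display windows.

  // Nonleaf nodes of the parsed layout tree become splitter windows;
  // leaf nodes become display windows initialized to time zero.
  function buildView(node, depth = 0) {
    const indent = "  ".repeat(depth);
    if (node.children && node.children.length > 0) {
      console.log(indent + "splitter window (" + node.split + ")");
      node.children.forEach((child) => buildView(child, depth + 1));
    } else {
      console.log(indent + "display window " + node.frameName + ": displayAtTime(0)");
    }
  }

  buildView({
    split: "cols 40%,60%",
    children: [
      { split: "rows 70,40%,*",
        children: [{ frameName: "Frame_Top" }, { frameName: "Frame_A" },
                   { frameName: "Frame_D" }] },
      { split: "rows 33%,34%,*",
        children: [{ frameName: "Frame_B" }, { frameName: "Frame_C" },
                   { frameName: "Frame_E" }] }
    ]
  });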
[0063] How the display manager displays data given a specified time
in the presentation will now be described in connection with FIG.
8D. In particular, the display manager receives 8400 a time T. For
event tracks, the event that has most recently occurred in the
presentation prior to time T is identified 8402. The data file for
that event is then obtained 8404. The browser component is then
instructed 8406 to render the received data file. The image scaling
module scales the image produced by the browser component, in 8408,
which is then displayed 8410 in the associated window. For video
information, this process involves identifying the sample from the
data file for the segment that is in the presentation at the
specified time. This sample is scaled and displayed. Because the
table of contents file is not time dependent, it is simply
rendered, scaled and displayed and step 8402 may be omitted.
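The lookup in step 8402 can be sketched as follows, with hypothetical event records:

  // Return the event that has most recently occurred at or before time T,
  // or null if no event has occurred yet.
  function eventAtTime(events, t) {
    let current = null;
    for (const e of events) { // e: { time, dataFile }
      if (e.time <= t && (current === null || e.time > current.time)) {
        current = e;
      }
    }
    return current;
  }

  const events = [
    { time: 0, dataFile: "slide1.htm" },
    { time: 12, dataFile: "slide2.htm" },
    { time: 30, dataFile: "slide3.htm" }
  ];
  console.log(eventAtTime(events, 20).dataFile); // "slide2.htm"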
[0064] After initialization, each display manager acts as a
"listener" process that responds to messages from other components,
such as the clip manager and graphical user interface, to update
the display. One kind of update is generated if display controls in
the graphical user interface are manipulated. For example, a user
may modify the position bar on either the timeline or the viewer to
initiate display from a different point in time T. In response to
such a change, the graphical user interface or the clip manager may
issue a message requesting the display managers to update the
display given a different time T. Similarly, during editing,
changes to the timeline data structure at a given point in time T
cause the clip manager to instruct the display managers to update
the display with the new presentation information at that point in
time T.
[0065] Playback may be implemented using the same display
mechanism. During either forward or reverse playback at a
continuous or user-controlled rate, a stream of instructions to
update the display at different points in time T may be sent to the
display managers. Each display manager updates its region of the
display at each of the specified times T which it receives from the
clip manager or graphical user interface.
[0066] Although the table of contents generally is a single file
without time dependency, during editing it may be modified, after
which the display is updated. One implementation for modifying the
table of contents display will now be described in connection with
FIGS. 8E and 8F. In FIG. 8E, a display manager for the table of
contents receives 8500 a message from the clip manager that a table
of contents entry has been added to the table of contents track.
The display manager requests 8502 the presentation manager for a
new table of contents file. After receiving 8504 the indication of
the new table of contents file, the browser component is instructed
8504 to render the data file. The rendered data file is then scaled
8506 and displayed 8508 in the window.
[0067] How the presentation manager generates a new table of
contents file is described in FIG. 8F. The presentation manager
receives 8600 a message requesting a new table of contents file.
The presentation manager requests 8602 the table of contents track
information from the clip manager. HTML data is generated 8604 for
each table of contents entry. Referring to the sample table of
contents file in Appendix I, a list of items is created for each
entry in the table of contents track. The table of contents file is
then modified 8606 with the newly generated HTML, for example, by
overwriting the table of contents information in the existing table
of contents file. Although the identity of the table of contents
file is known by the display manager, the presentation manager may
return the name of the data file to confirm completion of the
generation of the table of contents.
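The generation step 8604 can be sketched as follows; the markup is illustrative and simpler than the sample file in Appendix I, and the entries link to the seek function described earlier.

  // Build the body of the table of contents page: one list item per entry.
  // Both a marker number (for Windows Media technology) and a millisecond
  // time (for Real Media) are passed; the published page uses whichever
  // its format requires.
  function generateTocHtml(entries) {
    const items = entries.map((entry, i) =>
      '<li><a href="javascript:seekToEPMarker(' + (i + 1) + ',' +
      entry.timeMs + ')">' + entry.text + "</a></li>");
    return "<ul>\n" + items.join("\n") + "\n</ul>";
  }

  console.log(generateTocHtml([
    { timeMs: 0, text: "Introduction" },
    { timeMs: 45000, text: "Demonstration" }
  ]));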
[0068] In one implementation, the display manager for each frame
also may permit display of a zoomed version of the frame. In this
implementation, selection of a frame for zooming causes the
graphical user interface to display the data for this frame in the
full display region. For video and events tracks, the zoom
instruction merely changes the image scaling performed on the image
to be displayed. For the table of contents track, the zoomed
version may be provided by a display that enables editing of the
table of contents. Modifications to the entries in the table of
contents in the zoomed interface are passed back to the clip
manager to update the timeline data structures.
[0069] After completing editing of a presentation, it may be
published to its desired distribution format. A variety of
operations may be performed and assisted by the publishing component
of this system to prepare a presentation for distribution.
Operations that may be performed to publish a multimedia
presentation will now be described in more detail in connection
with FIG. 9.
[0070] First, the author provides setup data, which is accepted 900
through a GUI, to define the distribution format and other
information used to encode and transfer the presentation.
[0071] For example, the selected output format may be a streaming
media format, such as RealG2, Windows Media Technology, QuickTime
or SMIL. Other settings for the encoder may include the streaming
data file type, the video width, the video height, a title, author,
copyright and keyword data.
[0072] For transferring the presentation, various information may
be used to specify characteristics of one or more servers to which
the presentation will be sent and any account information for those
servers. Transfer settings may include a transfer protocol, such as
file transfer protocol (FTP) or a local or LAN connection, for
sending the presentation data files to the server. The server name,
a directory at the server in which the media files will be copied,
and optionally a user name and password also may be provided. A
default file name for the server, and the HTTP address or URL of
the server from which a user will access the published
presentation, also may be provided. The server information may be
separate for both data files and streaming media files.
[0073] This encoding and transfer information may be stored by the
transfer tool as a named profile for later retrieval for
transferring other presentations. Such profile data may include,
for example, the data defining settings for encoding, and the data
defining settings for transfer of encoded data files.
[0074] When setting up each of the connections for transfer, the
connection also may be tested to confirm its operation. This test
process involves transferring a small file to the destination and
confirming the ability of the system to read the file from the
destination.
[0075] After setup, the presentation may be audited 901 to reduce
the number of errors that may otherwise result during the encoding
and/or transfer processes. Profile information, described below,
the presentation, and other information may be reviewed for likely
sources of errors. For example, titles and/or other effects may be
checked to determine whether the title and/or effect has been
rendered. The timeline data structure may be searched to identify
the data files related to each event, segment, table of contents
entry, etc., to determine if any file is missing. The events in the
timeline may be compared to the video or audio or other temporal
data track to determine if any events occur after the end of the
video or audio or other temporal data track. The layout
specification also may be compared to the timeline data structure
to ensure that no events or other data have been defined on tracks
that are not referred to in the layout specification. Results of
these various tests on the layout and timeline data structures may
be provided to the user. Information about the profile used for the
transfer process also may be audited. For example, whether
passwords are required on the target server and other
information about the accessibility of the target server may be
checked. The target directory also may be checked to ensure that no
files in the native file format of the authoring tool are present
in the target directory. Various other tests may be performed in an
audit process and the invention is not limited thereto.
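Three of these checks are sketched below over hypothetical timeline and layout structures; none of the names are taken from the patent.

  const fs = require("fs");

  // Flag missing local data files, events past the end of the temporal
  // media, and populated tracks that the layout never displays.
  function auditPresentation(timeline, layoutTrackIds) {
    const problems = [];
    for (const track of timeline.tracks) {
      for (const el of track.elements) {
        if (el.media && !el.media.startsWith("http") && !fs.existsSync(el.media)) {
          problems.push("missing file: " + el.media);
        }
        if (track.type === "event" && el.start > timeline.duration) {
          problems.push("event after end of temporal media: " + el.title);
        }
      }
      if (track.elements.length > 0 && !layoutTrackIds.includes(track.id)) {
        problems.push("track not referenced in layout: " + track.id);
      }
    }
    return problems; // reported to the author before encoding begins
  }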
[0076] After optional auditing, the presentation is encoded 902 by
transforming the timeline data structures into a format used by a
standard encoder, such as provided for the Real Media Player or
Windows Media Technology. Such encoding is described in more detail
below in connection with FIGS. 11A and 11B. The encoded
presentation optionally may be previewed 904. To support preview,
during encoding the files used to encode, and that will ultimately
be transferred to each server, are collected locally. The
presentation may be encoded first to support preview by referring
to the local files. The files for the presentation then are
transferred 906 to each server. Before transfer, if the
presentation was encoded for local preview, the references to local
files are translated into references to files on the destination
servers. For example, the encoded streaming media file generally is
provided to a streaming media server, whereas other data files
referred to by the streaming media file are provided to a standard
hypertext transfer protocol daemon (HTTPD) or web server. The
transfer process is described in more detail below in connection
with FIG. 11C. Finally, the transferred presentation may be
previewed 908 from the remote site.
[0077] A graphical user interface for facilitating the publishing
process described in FIG. 9 will now be described in connection
with FIG. 10. A user may set profile data by selecting setup or
options 1000. During set up, a profile may be recalled, created or
edited, and the user may specify the file folder and server on
which the presentation will be stored. In response to selection of
the "do it" menu item 1002, the screen shown in FIG. 10 is
displayed. First the presentation and profile data are audited as
shown at 1004. After the auditing step is complete, a checkmark
appears in an icon 1006. Next, encoding of the presentation may be
started at 1008. A user may optionally select to preview the
encoded presentation locally prior to transfer. By selecting button
1010, a preview of the presentation may be initiated. After
preview, the icon 1012 includes a checkmark. During transfer, a
user may select to overwrite files that have the same name on the
destination server, as indicated at 1014. The user may initiate the
transfer by selecting the button indicated at 1016. After
completion, the icon 1018 includes a checkmark. Finally, after
transfer, the user may view the presentation as transferred from
the destination server by selecting button 1020.
[0078] Referring to FIG. 11A, encoding of a presentation will now
be described. In general, most encoders have an application
programming interface that generates an encoded file in response to
commands to add samples of media to the presentation. The commands
for adding samples generally include the type of media, the time in
the presentation in which the media is to be added and the media
data itself as inputs to the command. The sample for video data is
usually a frame. The sample of audio data is usually several
samples defining a fraction of a second. The data also may be, for
example, a uniform resource locator (URL) or other data.
[0079] More particularly, an API has functions that: 1) enable
opening the component, 2) optionally present the user with a dialog
box interface to configure the component, 3) set settings of the
component that control its behavior, 4) connect the component to a
user visible progress bar and to the source of the data, 5)
initiate the component to start translating the data into the
desired format, 6) write the desired format to a file, and 7) close
the component if the process is complete. On the receiving side of
the API, the system has code to respond to requests for data from
the export or encode component. The export component generally
accesses the time, track number, and file or URL specified by the
user, which are obtained from the timeline data structure. To the
extent that data interpretation or project-specific settings are
used by the encoder, this information also may be made available
through an API.
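The calling sequence implied by this list can be sketched as follows; the encoder object and its method names are assumptions standing in for the API functions enumerated above.

  // The seven steps mirror the enumerated API functions.
  function exportPresentation(encoder, source, settings, progressBar, outPath) {
    encoder.open();                                      // 1) open the component
    if (settings.showDialog) encoder.showConfigDialog(); // 2) optional dialog
    encoder.setSettings(settings);                       // 3) control behavior
    encoder.connect(progressBar, source);                // 4) progress bar and data source
    encoder.start();                                     // 5) begin translating the data
    encoder.writeTo(outPath);                            // 6) write the desired format
    encoder.close();                                     // 7) close when complete
  }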
[0080] The video and audio may be encoded 1100 separately using
standard techniques. The table of contents and event tracks are
then processed. In particular, a list of event assets is generated
1102. An event asset is defined by its filename, track, and time in
the presentation. The frame set is then accessed 1104 to obtain a
list of tracks and frame names. The items in the event tracks are
then added to the streaming media file using the filename for the
event and the frame name for the event, at the indicated time for
the event, in 1106. The filename for the event is its full path
including either a full URL for remote files or an indicator of the
disk volume for files that are accessed locally or over a local
area network (LAN). In step 1106, the filenames and frame names
inserted into the streaming media file are those in the destination
to which the media file is being transferred. Therefore, the
encoding is dependent in part on the transfer parameters. The list
created in step 1102 may be sorted or unsorted.
[0081] Using Real Media, the table of contents track does not
affect the streaming media file. Using Windows Media technology,
however, marker codes are inserted for each table of contents
entry, although no marker codes are inserted for events.
[0082] Referring to FIG. 11B, an implementation using the Real
Media encoder will now be described. A Real Media encoder 112
issues requests 1122 for samples at a specified time. In response
to these requests, a presentation processor 1124 implements the
process described in FIG. 11A, and returns a sample 1126 from an
event that occurs in the presentation at a time closest to and
after the requested time. The response 1126 also indicates a time
at which the encoder 112 should request the next sample. This time
is the time corresponding to the sample which was returned by the
presentation processor 1124. The list of event assets created in
1102 in FIG. 11A may be sorted prior to initiating encoding with
the encoder 112, or may be sorted on the fly in response to
requests 1122 from the encoder 112. After the end of the
presentation is reached, the encoded presentation 1128 is
available.
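The request/response loop of FIG. 11B can be sketched as follows; the event records are hypothetical, and a real encoder would drive the loop through its own API rather than a while loop.

  // Answer a request for a sample at a given time with the event closest
  // to and after that time; the returned time tells the encoder when to
  // request the next sample.
  function nextSample(sortedEvents, requestedTime) {
    for (const e of sortedEvents) { // sorted ascending by time
      if (e.time > requestedTime) {
        return { sample: e, nextRequestTime: e.time };
      }
    }
    return null; // end of the presentation reached
  }

  const sorted = [
    { time: 0, url: "slide1.htm", frame: "Frame_B" },
    { time: 12, url: "slide2.htm", frame: "Frame_C" }
  ];
  let t = -1, r;
  while ((r = nextSample(sorted, t)) !== null) {
    console.log("add sample at t=" + r.sample.time + ": " + r.sample.url);
    t = r.nextRequestTime;
  }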
[0083] The process of transferring data to the servers will now be
described in connection with FIG. 11C. After setup and encoding
have been completed, the transfer of the presentation starts with
preparing 1130 lists of files or resources of the presentation. A
first list includes the table of contents file, the video frame
file and the index or template file and all of the files that these
three files directly reference. A second list is all files destined
for the streaming media server. A third list is all of the files
and resources in events and all of the files and resources these
events reference directly. Resources that are not directly
available at the local machine may be omitted from the list. This
third list uses the complete path name or URL for the file or
resource. For the drives or servers used for the files in the third
list, a base path is found 1132. New directories on the destination
servers are then created 1134 using the base paths as
subdirectories of the target directory on the server. Files in all
three lists are then transferred 1136 to their respective
destinations.
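The base-path computation of step 1132 can be sketched as follows, with made-up file paths; recreating each file's path relative to the base under the target directory preserves the directory structure on the server.

  const { posix } = require("path");

  // Find the common base path of the local files in the third list.
  function basePath(files) {
    let prefix = posix.dirname(files[0]);
    for (const f of files.slice(1)) {
      while (prefix !== "/" && !f.startsWith(prefix + "/")) {
        prefix = posix.dirname(prefix);
      }
    }
    return prefix;
  }

  const eventFiles = ["/media/show/slides/a.htm", "/media/show/img/logo.gif"];
  const base = basePath(eventFiles); // "/media/show"
  for (const f of eventFiles) {
    console.log(f + " -> " + posix.join("/target", posix.relative(base, f)));
  }
  // /media/show/slides/a.htm -> /target/slides/a.htm
  // /media/show/img/logo.gif -> /target/img/logo.gif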
[0084] A computer system with which the various elements of the
system described above, either individually or in combination, may
be implemented typically includes at least one main unit connected
to both one or more output devices which store information,
transmit information or display information to one or more users or
machines and one or more input devices which receive input from
one or more users or machines. The main unit may include one or
more processors connected to a memory system via one or more
interconnection mechanisms. Any input device and output device also
are connected to the processor and memory system via the
interconnection mechanism.
[0085] The computer system may be a general purpose computer system
which is programmable using a computer programming language.
Computer programming languages suitable for implementing such a
system include procedural programming languages, object-oriented
programming languages, combinations of the two, or other languages.
The computer system may also be specially programmed, special
purpose hardware, or an application specific integrated circuit
(ASIC).
[0086] In a general purpose computer system, the processor is
typically a commercially available processor which executes a
program called an operating system which controls the execution of
other computer programs and provides scheduling, debugging,
input/output control, accounting, compilation, storage assignment,
data management and memory management, and communication control
and related services. The processor and operating system define a
computer platform for which application programs in other computer
programming languages are written. The invention is not limited to
any particular processor, operating system or programming
language.
[0087] A memory system typically includes a computer readable and
writeable nonvolatile recording medium in which signals are stored
that define a program to be executed by the processor or
information stored on the disk to be processed by the program.
Typically, in operation, the processor causes data to be read from
the nonvolatile recording medium into another memory that allows
for faster access to the information by the processor than does the
disk. This memory is typically a volatile, random access memory
such as a dynamic random access memory (DRAM) or static memory
(SRAM). The processor generally manipulates the data within the
integrated circuit memory and may copy the data to the disk if
processing is completed. A variety of mechanisms are known for
managing data movement between the disk and the integrated circuit
memory element, and the invention is not limited thereto. The
invention is not limited to a particular memory system.
[0088] Such a system may be implemented in software or hardware or
firmware, or any combination thereof. The various elements of this
system, either individually or in combination, may be implemented
as a computer program product including a computer-readable medium
on which instructions are stored for access and execution by a
processor. Various steps of the process may be performed by a
computer processor executing instructions stored on a
computer-readable medium to perform functions by operating on input
and generating output.
[0089] Additionally, the computer system may be a multiprocessor
computer system or may include multiple computers connected over a
computer network. Various possible configurations of computers in a
network permit access to the system by multiple users using
multiple instances of the programs even if they are dispersed
geographically. Each program or step shown in the figures and the
substeps or subparts shown in the figures may correspond to
separate modules of a computer program, or may be separate computer
programs. Such modules may be operable on one or more separate
computers or other devices. The data produced by these components
may be stored in a memory system or transmitted between computer
systems or devices. The plurality of computers or devices may be
interconnected by a communication network, such as a public
switched telephone network or other circuit switched network, or a
packet switched network such as an Internet protocol (IP) network.
The network may be wired or wireless, and may be public or
private.
[0090] A suitable platform for implementing software to provide
such an authoring system includes a processor, operating system, a
video capture device, a Creative Labs Sound Blaster or compatible
sound card, CD-ROM drive, and 64 Megabytes of RAM minimum. For
analog video capture, the video capture device may be the
Osprey-100 PCI Video Capture Card or the Eskape MyCapture II USB
Video Capture Device. The processor may be a 230 megahertz Pentium
II or Pentium III processor, or Intel equivalent processor with MMX
Technology, such as the AMD-K6-III, or Celeron Processor with 128K
cache, and may be used with an operating system such as the
Windows98/98SE or Millennium operating systems. For digital video
capture, the video capture device may be an IEEE 1394 Port (OHCI
compliant or Sony ILink). The processor may be a 450 megahertz
Pentium II or Pentium III processor, or Intel equivalent processor
with MMX Technology, such as the AMD-K6-III, or Celeron processor
with 128K cache.
[0091] Given an authoring tool such as described above, the use of
multiple authoring tools by multiple authors for publishing data to
a public or private computer network for access by other users will
now be described in connection with FIGS. 12 and 13. In particular,
an encoded presentation 1200 and associated data files 1202 may be
transferred by a transfer tool 1204 to a streaming media server
1206 and a data server 1208. The transfer tool also may store
preference data 1210 for the author with a profile manager 1212.
The streaming media server 1206 and data server 1208 may be
publicly accessible web servers accessible by web browsers 1214.
Other kinds of distributed libraries of digital media, instead of a
web server, also may be used to publish the presentation. If
additional transfer tools 1216 are used by other authors, these
transfer tools 1216 may transfer the streaming media to the same or
a different streaming media data server 1206 as the other transfer
tool 1204, but may have a separate data server 1218. Use of the
same streaming media data server is possible where each transfer
tool has access to the streaming media server 1206. Such access may
be built into either the transfer tool or the authoring tool. The
transfer tool and/or the authoring tool may be provided by the same
entity or another entity related to the entity that owns or
distributes the streaming media server 1206. The streaming media
server may be implemented, for example, as described in U.S. patent
application Ser. No. 09/054,761, which corresponds to PCT
Publication No. WO99/34291. The streaming media server 1206 may
charge authors for access to and/or for the amount of data stored
on the streaming media server 1206.
[0092] In addition to publishing presentations to the media server,
an authoring tool may use the media server or data server as a
source of content for presentations. As shown in FIG. 13, for
example, the editing system 1300, and optionally the transfer
system 1302, may have access to one or more streaming servers 1304.
The editing system may acquire stock footage 1306 from the
streaming media server 1304 or other content from a data server
1312. Such stock footage, for example, may be purchased from the
entity maintaining or owning the streaming server 1304. An author
may add such stock footage to the presentation. The completed
presentation 1308 may be in turn published by the transfer system
1302 to the streaming media server 1304 (as indicated by
presentation 13), with data files 1310 stored on a data server
1312. Tools used by other publishers and authors, as indicated at
1314, also may access the streaming server 1304 for receiving stock
footage or for publishing presentations. Such authors and
publishers may use a separate data server 1316 for storing
nontemporal data related to the temporal data published on the
streaming server 1304.
[0093] Having now described a few embodiments, it should be
apparent to those skilled in the art that the foregoing is merely
illustrative and not limiting, having been presented by way of
example only. Numerous modifications and other embodiments are
within the scope of the invention.
* * * * *