Low Latency And Low Defect Media File Transcoding Using Optimized Storage, Retrieval, Partitioning, And Delivery Techniques Luthra; Tanooj ; et al. [Box, Inc.]

Patent Application Summary

U.S. patent application number 15/140357 was filed with the patent office on 2016-11-03 for low latency and low defect media file transcoding using optimized storage, retrieval, partitioning, and delivery techniques. This patent application is currently assigned to Box, Inc. The applicant listed for this patent is Box, Inc. Invention is credited to Bryan Huh, Tanooj Luthra, Ritik Malhotra.

Application Number	20160323351 15/140357
Family ID	57204093
Filed Date	2016-11-03

United States Patent Application	20160323351
Kind Code	A1
Luthra; Tanooj ; et al.	November 3, 2016

LOW LATENCY AND LOW DEFECT MEDIA FILE TRANSCODING USING OPTIMIZED STORAGE, RETRIEVAL, PARTITIONING, AND DELIVERY TECHNIQUES

Abstract

Systems, methods and computer program products for high-performance, low latency start-up of large shared media files. A method for low latency startup with low defect playback commences upon identifying a first media file having a first format to be converted to a second media file having a second format. A scheduler divides the first media file into multiple partitions separated by partition boundaries. The method continues by converting the partitions into respective converted partitions that comport with the second format. Determinations as to the position of the partition boundaries are made based on measurable conditions present at a particular moment in time. Different formats receive different treatment based on the combination of characteristics of the first format, characteristics of the second format, as well as on characteristics of measurable conditions present at the moment in time just before conversion of a segment.

Inventors:

Luthra; Tanooj; (San Diego, CA) ; Malhotra; Ritik; (San Jose, CA) ; Huh; Bryan; (San Jose, CA)

Applicant:

Name	City	State	Country	Type
Box, Inc.	Redwood City	CA	US

Assignee:

Box, Inc.
Redwood City
CA

Family ID:

57204093

Appl. No.:

15/140357

Filed:

April 27, 2016

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
62154658	Apr 29, 2015

Current U.S. Class:	1/1
Current CPC Class:	G06F 2212/154 20130101; G06F 16/183 20190101; G06F 16/185 20190101; G06F 2212/657 20130101; H04L 67/34 20130101; G06F 2212/463 20130101; H04L 67/1097 20130101; G06F 16/113 20190101; G06F 16/1774 20190101; H04L 65/607 20130101; G06F 16/1727 20190101; G06F 2212/1016 20130101; H04L 65/602 20130101; G06F 9/46 20130101; G06F 16/188 20190101; G06F 16/22 20190101; G06F 16/196 20190101; G06F 16/172 20190101; G06F 12/122 20130101; G06F 12/0891 20130101; G06F 16/23 20190101; G06F 2212/60 20130101; H04L 67/06 20130101; G06F 16/1748 20190101; H04N 19/40 20141101; G06F 12/1081 20130101; G06F 16/2443 20190101; G06F 16/182 20190101; G06F 16/9574 20190101; G06F 2212/1044 20130101; H04L 63/0428 20130101; H04L 65/80 20130101
International Class:	H04L 29/06 20060101 H04L029/06; H04N 19/40 20060101 H04N019/40

Claims

1. A method comprising: identifying a first media file having a first format to be converted to a second media file having a second format; partitioning the first media file into two or more partitions separated by one or more partition boundaries, wherein the one or more partition boundaries are at a determined key frame position; converting into the second format, the two or more partitions to respective two or more converted partitions, wherein the respective two or more partitions are converted by respective two or more computing devices; and assembling the respective two or more converted partitions to comprise the second media file.

2. The method of claim 1, wherein the determined key frame position is a closest key frame location to a respective one of the partition boundaries.

3. The method of claim 1, further comprising delivering at least a portion of the second media file to a requestor to be viewed on a media player.

4. The method of claim 1, wherein the two or more partitions are characterized by a set of progressively increasing quality levels.

5. The method of claim 1, wherein at least one of the respective two or more converted partitions comprise an attribute dataset.

6. The method of claim 5, wherein the attribute dataset is at least one of, an atom, a movie atom, and a moov atom.

7. The method of claim 1, wherein steps for converting into the second format comprise a timecode correction.

8. The method of claim 1, wherein at least one of the respective two or more partitions corresponds to a beginning portion of the first media file.

9. A computer readable medium, embodied in a non-transitory computer readable medium, the non-transitory computer readable medium having stored thereon a sequence of instructions which, when stored in memory and executed by a processor causes the processor to perform a set of acts, the acts comprising: identifying a first media file having a first format to be converted to a second media file having a second format; partitioning the first media file into two or more partitions separated by one or more partition boundaries, wherein the one or more partition boundaries are at a determined key frame position; converting into the second format, the two or more partitions to respective two or more converted partitions, wherein the respective two or more partitions are converted by respective two or more computing devices; and assembling the respective two or more converted partitions to comprise the second media file.

10. The computer readable medium of claim 9, wherein the determined key frame position is a closest key frame location to a respective one of the partition boundaries.

11. The computer readable medium of claim 9, further comprising instructions which, when stored in memory and executed by the processor cause the processor to perform acts of delivering at least a portion of the second media file to a requestor to be viewed on a media player.

12. The computer readable medium of claim 9, wherein the two or more partitions are characterized by a set of progressively increasing quality levels.

13. The computer readable medium of claim 9, wherein at least one of the respective two or more converted partitions comprise an attribute dataset.

14. The computer readable medium of claim 13, wherein the attribute dataset is at least one of, an atom, a movie atom, and a moov atom.

15. The computer readable medium of claim 9, wherein steps for converting into the second format comprise a timecode correction.

16. The computer readable medium of claim 9, wherein at least one of the respective two or more partitions corresponds to a beginning portion of the first media file.

17. A system comprising: a storage medium having stored thereon a sequence of instructions; and a processor or processors that execute the instructions to cause the processor or processors to perform a set of acts, the acts comprising, identifying a first media file having a first format to be converted to a second media file having a second format; partitioning the first media file into two or more partitions separated by one or more partition boundaries, wherein the one or more partition boundaries are at a determined key frame position; converting into the second format, the two or more partitions to respective two or more converted partitions, wherein the respective two or more partitions are converted by respective two or more computing devices; and assembling the respective two or more converted partitions to comprise the second media file.

18. The system of claim 17, wherein the determined key frame position is a closest key frame location to a respective one of the partition boundaries.

19. The system of claim 17, wherein the two or more partitions are characterized by a set of progressively increasing quality levels.

20. The system of claim 17, wherein at least one of the respective two or more converted partitions comprise an attribute dataset.

Description

RELATED APPLICATIONS

[0001] The present application claims the benefit of priority to co-pending U.S. Provisional Patent Application Ser. No. 62/154,658 titled, "METHOD MECHANISM TO IMPLEMENT A VIRTUAL FILE SYSTEM FROM REMOTE CLOUD STORAGE" (Attorney Docket No. BOX-2015-0012-US00-PRO), filed Apr. 29, 2015, and this application claims the benefit of priority to co-pending U.S. Provisional Patent Application Ser. No. 62/154,022, titled, "LOW LATENCY AND LOW DEFECT MEDIA FILE TRANSCODING USING OPTIMIZED STORAGE, RETRIEVAL, PARTITIONING, AND DELIVERY TECHNIQUES" (Attorney Docket No. BOX-2015-0013-US00-PRO), filed Apr. 28, 2015, both of which are hereby incorporated by reference in their entirety.

FIELD

[0002] This disclosure relates to the field of file sharing of large media files, and more particularly to techniques for low latency and low defect media file transcoding using optimized storage, partitioning, and delivery techniques.

BACKGROUND

[0003] In today's "always on, always connected" world, people often share video and other media files on multiple devices (e.g., smart phones, tablets, laptops, etc.) for various purposes (e.g., collaboration, social interaction, entertainment, etc.). In some situations, the format (e.g., encoding, container, etc.) of a particular media file needs to be converted (e.g., transcoded) into some other format. There are many reasons why such a conversion or transcoding is needed. For example, a collaborator might have a video file in a first encoding or format, and would want to compress it so as to consume less storage space and/or consume less transmission bandwidth when it is shared (e.g., delivered to collaborating recipients). In many cases, a collaborator would want to view a video as soon as it is posted, however, due to the aforementioned reasons why such a conversion or transcoding might be needed, the video would need to be converted before being made available for previewing or sharing. Further transcoding may be needed for viewing the video using the various media players available on the various devices of the collaborators.

[0004] Legacy approaches to the problem of reducing the latency between availability of an original media file (e.g., in a first format) and availability of a transcoded media file (e.g., in a second format) can be improved. In one legacy case, an original media file is sent to an extremely high-powered computer, with the expectation that the transcoding can complete sooner. In other legacy cases, an original media file in a first format is divided into equally sized partitions, and each partition is transcoded in parallel with each other partition. While such partitioning and parallel processing techniques serve to reduce the latency time to a first viewing of a transcoded media file, such an approach is naive, at least to the extent that many of the resulting transcoded partitions exhibit defects.

[0005] What is needed is a technique or techniques to improve over legacy and/or over other considered approaches. Some of the approaches described in this background section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.

[0006] What is needed is a technique or techniques to reduce the first-view latency time incurred when transcoding a media file in a first format to a second format while reducing or eliminating defects in the resulting transcoded media file. The problem to be solved is rooted in technological limitations of the legacy approaches. Improvements, in particular improved design, and improved implementation and application of the related technologies, are needed.

SUMMARY

[0007] The present disclosure provides improved systems, methods, and computer program products suited to address the aforementioned issues with legacy approaches. More specifically, the present disclosure provides a detailed description of techniques used in systems, methods, and in computer program products for low latency and low defect media file transcoding using optimized partitioning. Certain embodiments are directed to technological solutions for exploiting parallelism when transcoding from a first format to a second format by determining partition boundaries based on the first format. The disclosed techniques and devices within the shown environments as depicted in the figures provide advances in the technical field of high-performance computing as well as advances in the technical fields of distributed computing and distributed storage.

[0008] Further details of aspects, objectives, and advantages of the disclosure are described below and in the detailed description, drawings, and claims. Both the foregoing general description of the background and the following detailed description are exemplary and explanatory, and are not intended to be limiting as to the scope of the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] The drawings described below are for illustration purposes only. The drawings are not intended to limit the scope of the present disclosure.

[0010] FIG. 1 depicts an environment for using and transcoding media files.

[0011] FIG. 2A presents a diagram depicting a full file transcoding approach for comparison of techniques for performing low latency and low defect media file transcoding using optimized partitioning.

[0012] FIG. 2B presents a diagram illustrating techniques for low latency and low defect media file transcoding using optimized partitioning, according to an embodiment.

[0013] FIG. 3A is a chart showing media file partitioning as implemented in systems for low latency and low defect media file transcoding using optimized partitioning, according to an embodiment.

[0014] FIG. 3B1 and FIG. 3B2 are block diagrams showing media file reformatting as implemented in systems for low latency and low defect media file transcoding using optimized partitioning, according to an embodiment.

[0015] FIG. 3C depicts media file reformatting as implemented in systems for low latency and low defect media, according to an embodiment.

[0016] FIG. 3D depicts an on-the-fly watermarking system as implemented in systems for low latency and low defect media file transcoding, according to an embodiment.

[0017] FIG. 4A presents a processing timeline to show full file transcoding latency for comparison to techniques for low latency and low defect media file transcoding using optimized partitioning, according to an embodiment.

[0018] FIG. 4B presents a latency timeline illustrating a low first-view latency technique used in systems implementing low latency and low defect media file transcoding using optimized partitioning, according to an embodiment.

[0019] FIG. 5 is a flow diagram illustrating a system for low latency and low defect media file transcoding using optimized partitioning, according to an embodiment.

[0020] FIG. 6A is a flow diagram illustrating caching of an initial clip as used in systems for low latency and low defect media file transcoding using optimized partitioning, according to an embodiment.

[0021] FIG. 6B is a flow diagram illustrating pre-transcoding of an initial clip as used in systems for low latency and low defect media file transcoding using optimized partitioning, according to an embodiment.

[0022] FIG. 6C is a flow diagram illustrating frame-by-frame delivery of a video clip as used in systems for low latency and low defect media file transcoding using optimized partitioning, according to an embodiment.

[0023] FIG. 6D1 is a flow diagram illustrating playlist generation from a video clip as used in systems for low latency and low defect media file transcoding using optimized partitioning, according to an embodiment.

[0024] FIG. 6D2 is a flow diagram illustrating generation of URLs for video clips as used in systems for low latency and low defect media file transcoding using optimized partitioning, according to an embodiment.

[0025] FIG. 6D3 is a flow diagram illustrating generation of URLs for video clips as used in systems for low latency and low defect media file transcoding using optimized partitioning, according to an embodiment.

[0026] FIG. 6D4 is a flow diagram illustrating timecode correction techniques used when delivering video clips to viewers as used in systems for low latency and low defect media file transcoding using optimized partitioning, according to an embodiment.

[0027] FIG. 6E is a flow diagram illustrating techniques for accessing media files through a custom virtual file system as used when delivering video clips to collaborators, according to an embodiment.

[0028] FIG. 7 depicts a system as an arrangement of computing modules that are interconnected so as to operate cooperatively to implement certain of the herein-disclosed embodiments.

[0029] FIG. 8A and FIG. 8B depict exemplary architectures of components suitable for implementing embodiments of the present disclosure, and/or for use in the herein-described environments.

DETAILED DESCRIPTION

[0030] Some embodiments of the present disclosure address the problem of reducing the first-view latency time incurred when transcoding a media file in a first format to a second format, while reducing or eliminating defects in the resulting transcoded file. Some embodiments are directed to approaches for exploiting parallelism when transcoding from a first format to a second format by determining partition boundaries based on the first format. More particularly, disclosed herein and in the accompanying figures are exemplary environments, systems, methods, and computer program products for low latency and low defect media file transcoding using optimized partitioning.

Overview

[0031] In today's "always on, always connected" world, people often share video and other media files on multiple devices (e.g., smart phones, tablets, laptops, etc.) for various purposes (e.g., collaboration, social interaction, etc.). In some situations, the format (e.g., encoding, container, etc.) of a particular media file needs to be converted (e.g., transcoded) into some other format. However, a person may want to immediately view and/or share her media file, yet may need to wait for the media file to be converted or transcoded. To address the need to reduce the first-view latency time incurred when transcoding a media file in a first format to a second format while reducing or eliminating defects in the resulting transcoded media file, the techniques described herein receive and analyze an original media file to determine optimized partitions for transcoding, and techniques described herein operate in conjunction with cloud-based remote file storage. For example, a custom file system can be employed and/or optimized partitions can be based in part on the target format or formats (e.g., encoding scheme, codec, container, etc.) and/or available computing resources (e.g., storage, processing, communications bandwidth, etc.). Specifically, in one or more embodiments, the partition boundaries can be selected with respect to key frames (e.g., I-frames). For example, a leading edge partition boundary can be selected to be precisely at a key frame, and a trailing edge boundary can be adjacent to a next key frame. When the partitions and partition boundaries have been determined, the partitions can be assigned to computing resources for simultaneous transcoding of the respective partitions. The transcoded media file partitions can then be assembled into a single transcoded video file (e.g., container) and delivered for viewing. In some embodiments, the partitions can include attribute datasets (e.g., moov atoms) such that a first or beginning transcoded partition can be delivered and viewed in advance of the availability and assemblage of the remaining transcoded partitions, thus further reducing the first-view latency time.

[0032] Various embodiments are described herein with reference to the figures. It should be noted that the figures are not necessarily drawn to scale and that the elements of similar structures or functions are sometimes represented by like reference numerals throughout the figures. It should also be noted that the figures are only intended to facilitate the description of the disclosed embodiments--they are not representative of an exhaustive treatment of all possible embodiments, and they are not intended to impute any limitation as to the scope of the claims. In addition, an illustrated embodiment need not portray all aspects or advantages of usage in any particular environment. An aspect or an advantage described in conjunction with a particular embodiment is not necessarily limited to that embodiment and can be practiced in any other embodiments even if not so illustrated. Also, reference throughout this specification to "some embodiments" or "other embodiments" means that a particular feature, structure, material, or characteristic described in connection with the embodiments is included in at least one embodiment. Thus, the appearances of the phrase "in some embodiments" or "in other embodiments" in various places throughout this specification are not necessarily referring to the same embodiment or embodiments.

DEFINITIONS

[0033] Some of the terms used in this description are defined below for easy reference. The presented terms and their respective definitions are not rigidly restricted to these definitions--a term may be further defined by the term's use within this disclosure. The term "exemplary" is used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects or designs. Rather, use of the word exemplary is intended to present concepts in a concrete fashion. As used in this application and the appended claims, the term "or" is intended to mean an inclusive "or" rather than an exclusive "or". That is, unless specified otherwise, or is clear from the context, "X employs A or B" is intended to mean any of the natural inclusive permutations. That is, if X employs A, X employs B, or X employs both A and B, then "X employs A or B" is satisfied under any of the foregoing instances. The articles "a" and "an" as used in this application and the appended claims should generally be construed to mean "one or more" unless specified otherwise or is clear from the context to be directed to a singular form.

[0034] Reference is now made in detail to certain embodiments. The disclosed embodiments are not intended to be limiting of the claims.

Descriptions of Exemplary Embodiments

[0035] FIG. 1 depicts an environment 100 for using and transcoding media files. As an option, one or more instances of environment 100 or any aspect thereof may be implemented in the context of the architecture and functionality of the embodiments described herein. The environment 100 or any aspect thereof may be implemented in any desired environment.

[0036] As shown, the environment 100 supports access to workspaces (e.g., workspace 122.sub.1 and workspace 122.sub.2) by a plurality of users (e.g., collaborators 120) through a variety of computing devices (e.g., user devices 102). For example, the collaborators 120 can comprise a user collaborator 123, an administrator collaborator 124, and a creator collaborator 125. In addition, for example, the user devices 102 can comprise one or more instances of a laptop 102.sub.1 and laptop 102.sub.5, one or more instances of a tablet 102.sub.2, one or more instances of a smart phone 102.sub.3, and one or more instances of a workstation (e.g., workstation 102.sub.4 and workstation 102.sub.6). As shown, the workspaces can present to the collaborators 120 a set of documents accessible by each collaborator (e.g., based on permissions). For example, the workspaces can provide certain groups of the collaborators 120 access to a set of media files (e.g., with container file extensions .mov, .mp4, .wmv, .flv, etc.) for various collaboration activities (e.g., creating, sharing, viewing, listening, editing, etc.).

[0037] The environment 100 further illustrates that the content (e.g., media files) represented in the workspaces can be managed (e.g., converted, transcoded, etc.) and stored on a server farm 110. For example, the server farm 110 can be a cloud-based and/or distributed computing and storage network(s) comprising one or more instances of a host server 112, one or more instances of a sync server 113, one or more instances of a notification server 114, one or more instances of a collaboration server 116, one or more instances of a content server 117, and one or more instances of an origin server 118. In certain embodiments, other combinations of computing devices and storage devices can comprise the server farm 110. The collaborators 120 interact with the workspaces to upload media files (e.g., original media file 132) through an upload path 127 to the server farm 110. The collaborators 120 can further interact with the workspaces to download media files (e.g., transcoded media file 134) through a download path 129 from the server farm 110.

[0038] As an example, the creator collaborator 125 may have just posted a new video (e.g., original media file 132 over the upload path 127) that is shared with the user collaborator 123 in the workspace 122.sub.1, and the user collaborator 123 selected the new video for viewing on laptop 102.sub.1. However, a media player 103 on laptop 102.sub.1 and/or the associated computing resource constraints (e.g., of laptop 102.sub.1, of download path 129, etc.) may demand the original media file 132 be transcoded to the transcoded media file 134 having a video playback format (e.g., advanced systems format (ASF) file) that is different than the original media file 132 format (e.g., MP4). In this case, one or more servers in the server farm 110 can perform the transcoding using various approaches, where the choice of approach will impact a first-view latency time 104 (e.g., the time from a request to view to the start of viewing) and extent of viewing quality defects experienced by the user collaborator 123 and other users. A comparison of one approach shown in FIG. 2A to the herein disclosed approach shown in FIG. 2B is described in the following.

[0039] FIG. 2A presents a diagram 2A00 depicting a full file transcoding approach for comparison of techniques for performing low latency and low defect media file transcoding using optimized partitioning. As an option, one or more instances of diagram 2A00 or any aspect thereof may be implemented in the context of the architecture and functionality of the embodiments described herein. The diagram 2A00 or any aspect thereof may be implemented in any desired environment.

[0040] One approach to transcoding a media file is shown in diagram 2A00. For example, a video file may need to be transcoded for viewing by a user. As shown, the approach receives an original media file (see step 202) and proceeds to process (e.g., transcode) the entire original media file (see step 204). In this legacy full file transcoding approach, the user desiring to view the media file will need to wait until the entire media file is transcoded before being able to view the transcoded media file. In some cases, processing the original media file can comprise two steps of first converting to an intermediate format and then converting to a target format. A first-view latency time 221 using the approach shown in diagram 2A00 can be improved by using high-powered computing resources, yet the first-view latency time 221 can remain long. For example, a 30-minute video can take 20 minutes to be transcoded and made ready for viewing using the full file transcoding approach shown in FIG. 2A. The first-view latency time 221 will further increase as the demanded resolution of the video increases.

[0041] In some cases, an original media file is sent to an extremely high-powered computer, with the expectation that the transcoding can complete sooner. In other cases, an original media file in a first format is divided into equally sized partitions, and each partition is transcoded in parallel with each other partition. For example, when an original media file in a first format is divided into N equally sized partitions, then the time to complete the transcoding can theoretically be reduced to a time proportional to 1/N. While this divide by N partitioning and parallel processing technique serves to reduce the latency time to a first viewing of a transcoded media file, such an approach can be improved upon, at least as pertains to the aspect that many of the resulting transcoded partitions exhibit defects. For example, many of the resulting transcoded partitions exhibit image distortions brought about by generating clip boundaries according to strict divide by N partitioning.
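The strict divide by N partitioning described above can be illustrated with a minimal sketch. The function names, the 1800-second duration, and the 7-second key frame spacing are assumptions chosen for illustration and do not come from the disclosure. The sketch only shows that equally sized partitions tend to place their boundaries between key frames.

```python
def equal_partitions(duration_s, n):
    """Split [0, duration_s] into n equal (start, end) spans, in seconds."""
    if n < 1:
        raise ValueError("n must be at least 1")
    step = duration_s / n
    return [
        (i * step, duration_s if i == n - 1 else (i + 1) * step)
        for i in range(n)
    ]


def misaligned_boundaries(partitions, key_frame_times):
    """Return interior partition boundaries that fall on no key frame."""
    key_frames = set(key_frame_times)
    # The first partition starts at time zero, so only later starts are checked.
    return [start for start, _ in partitions[1:] if start not in key_frames]


# Hypothetical input: a 30-minute file with a key frame every 7 seconds.
parts = equal_partitions(1800.0, 4)
key_frames = [float(t) for t in range(0, 1800, 7)]
bad = misaligned_boundaries(parts, key_frames)
print(parts)
print(bad)
```

In this hypothetical input every interior boundary lands between key frames, which is the condition the description associates with defects at clip boundaries. Exact float comparison is used only because the sample values are whole numbers; real timestamps would need a tolerance.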

[0042] One improved approach implemented in the herein disclosed techniques for low latency and low defect media file transcoding using optimized partitioning is described as pertains to FIG. 2B.

[0043] FIG. 2B presents a diagram 2B00 illustrating techniques for low latency and low defect media file transcoding using optimized partitioning. As an option, one or more instances of diagram 2B00 or any aspect thereof may be implemented in the context of the architecture and functionality of the embodiments described herein. The diagram 2B00 or any aspect thereof may be implemented in any desired environment.

[0044] The approach illustrated in diagram 2B00 implements an optimized partitioning of a media file for low latency and low defect transcoding. Specifically, the set of steps describing such an approach begins with receiving an original media file (see step 202) and analyzing the original media file to determine optimized partitions for transcoding (see step 206). For example, optimized partitions can be based in part on the target format or formats (e.g., encoding scheme, codec, container, etc.) and/or available computing resources (e.g., storage, processing, communications bandwidth, etc.). In some cases the size of a partition might vary with environmental considerations. Specifically, in one or more embodiments, the leading-edge partition boundaries can be at encoding key frames (e.g., I-frames). When the partitions and partition boundaries have been determined, the partitions can be assigned to computing resources for transcoding (see step 208). The computing resources (e.g., server farm 110) can then transcode the original media file partitions to respective transcoded media file partitions (see parallel steps of step 210.sub.1, step 210.sub.2, to step 210.sub.N). The transcoded media file partitions can then be assembled into a single transcoded video file (e.g., container) (see step 212).
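The sequence of steps 206 through 212 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation of the disclosure: the function names are hypothetical, `transcode_partition` is a placeholder that returns a label instead of invoking a real transcoder, and a thread pool stands in for the server farm. Candidate boundaries are moved to the closest key frame, the resulting partitions are processed in parallel, and the results are collected in order for assembly.

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor


def snap_to_key_frames(candidates, key_frames):
    """Move each candidate boundary to the closest key frame time.

    key_frames must be non-empty and sorted in ascending order.
    """
    snapped = []
    for t in candidates:
        i = bisect_left(key_frames, t)
        # Compare the key frame just before and just after the candidate.
        neighbors = key_frames[max(i - 1, 0):i + 1]
        snapped.append(min(neighbors, key=lambda k: abs(k - t)))
    # Two candidates may snap to the same key frame; keep each one once.
    return sorted(set(snapped))


def plan_partitions(duration_s, n, key_frames):
    """Return (start, end) spans whose interior boundaries are key frames."""
    step = duration_s / n
    candidates = [i * step for i in range(1, n)]
    interior = [
        b for b in snap_to_key_frames(candidates, key_frames)
        if 0.0 < b < duration_s
    ]
    edges = [0.0] + interior + [duration_s]
    return list(zip(edges[:-1], edges[1:]))


def transcode_partition(span):
    """Placeholder for a real transcoder call (hypothetical)."""
    start, end = span
    return f"converted[{start:g}-{end:g}]"


def transcode_in_parallel(spans, workers=4):
    """Transcode all spans concurrently, preserving partition order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, which keeps assembly simple.
        return list(pool.map(transcode_partition, spans))


# Hypothetical 40-second file with a key frame every 7 seconds.
key_frames = [0.0, 7.0, 14.0, 21.0, 28.0, 35.0]
spans = plan_partitions(40.0, 4, key_frames)
converted = transcode_in_parallel(spans)
print(spans)
print(converted)
```

Because the boundaries follow the key frames, the partitions are no longer equal in length, which is consistent with the statement that partition size may vary.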

[0045] The herein disclosed approach and technique presented in diagram 2B00 has several advantages. For example, partitioning the media file (e.g., into N partitions) for parallel transcoding across a distributed computing system (e.g., N servers) can reduce a first-view latency time (e.g., to approximately 1/N of the original time) as compared to a full file transcoding approach. Further, by determining optimal partitions and partition boundaries (e.g., aligned with key frames), defects in the resulting transcoded file can be minimized or eliminated. In addition, a user can start viewing the transcoded media file when the first partition has been transcoded or the first set of partitions have been transcoded, such that viewing can begin before the transcoded file has been assembled. For example, a reduced first-view latency time 222 for a 30-minute video using the herein disclosed approach shown in diagram 2B00 can be a few seconds (e.g., when the first partition has been transcoded and delivered). More details regarding the partitioning of media files are shown and described as pertains to FIG. 3A, FIG. 3B1, FIG. 3B2, FIG. 3C and FIG. 3D.
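The latency figures above can be related through a rough model. The sketch assumes, purely for illustration, that transcoding time scales linearly with media duration and that scheduling, transfer, and assembly overheads are zero. The rate is derived from the earlier example of a 30-minute video taking 20 minutes to transcode; the 10-second first partition is a hypothetical value, not one stated in the disclosure.

```python
def transcode_time_s(media_duration_s, rate):
    """Estimated transcoding time under a linear-cost assumption."""
    return media_duration_s * rate


# Rate taken from the 30-minute / 20-minute example: seconds of work
# per second of media.
rate = (20 * 60) / (30 * 60)

full_file = transcode_time_s(30 * 60, rate)          # wait for the whole file
four_way = transcode_time_s(30 * 60 / 4, rate)       # 4 equal partitions, ideal
first_clip = transcode_time_s(10, rate)              # hypothetical 10 s first partition

print(f"full file:        {full_file:.0f} s")
print(f"4-way parallel:   {four_way:.0f} s")
print(f"first clip ready: {first_clip:.1f} s")
```

Under these assumptions, dividing by N alone still leaves a wait of minutes, whereas a first-view latency of a few seconds follows from delivering a short first partition as soon as it has been transcoded.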

[0046] FIG. 3A is a chart 3A00 showing media file partitioning as implemented in systems for low latency and low defect media file transcoding using optimized partitioning. As an option, one or more instances of chart 3A00 or any aspect thereof may be implemented in the context of the architecture and functionality of the embodiments described herein. The chart 3A00 or any aspect thereof may be implemented in any desired environment.

[0047] The chart 3A00 shows a time-based representation of an original media file 302 in a first encoding format or a first set of encoding formats. As shown, for example, the original media file 302 can be a video file packaged in a container that comprises a moov atom at the end. The moov atom, also referred to as a movie atom, is an attribute dataset comprising information about the original media file 302 such as the timescale, duration, display characteristics of the video, sub-atoms containing information associated with each track in the video, and other attributes. As shown, an original moov atom 303 is present at the end of the original media file 302. When transcoding of the original media file 302 is demanded, the original media file 302 can be analyzed to determine a set of candidate partitions 304 for parallel processing. For example, in legacy approaches, the candidate partitions can be determined by equally dividing (e.g., into units of time or duration) the original media file 302 by the number of computing resources (e.g., servers) available for parallel transcoding operations. In this case, however, the candidate partitions 304 may have partition boundaries that result in defects in the playback of the assembled transcoded media file. Such defects can comprise subjective quality as perceived by a user (e.g., blockiness, blurriness, ringing artifacts, added high frequency content, picture outages, freezing at a particular frame and then skipping forward by a few seconds, etc.). In many cases, objective metrics that characterize one or more aspects of playback quality can be computed (e.g., frame-by-frame comparison of an original object and a transcoded object).

[0048] In one embodiment, the herein disclosed techniques can determine a set of optimized partitions 305 for transcoding the original media file 302 to reduce first-view latency and reduce or eliminate transcoding defects. As shown, for example, a set of key frame locations (e.g., key frame location 306.sub.1, key frame location 306.sub.2, key frame location 306.sub.3, key frame location 306.sub.4, key frame location 306.sub.5) can be used to define the boundaries of the file partitions (e.g., P1, P2, P3, and P4). In some embodiments, the key frame locations can be defined in the original media file 302, and in certain embodiments, the key frame locations can be based in part on the target transcoding format or formats. In one or more embodiments, the candidate partitions 304 (e.g., based on an optimal set of computing resources to deliver low latency) can be aligned to the closest key frame location (e.g., based on an optimal partitioning boundary to deliver low defects). For example, key frame location 306.sub.2 is chosen as a partition boundary over key frame location 306.sub.5 as being closer to an instance of a candidate partition 304. In some cases and embodiments, a set of moov atoms can be included at various positions in one or more partitions based in part on the known and/or expected delivery and playback method. For example, each instance of a partition in the optimized partitions 305 is shown to have a moov atom at the beginning of the partition (e.g., see moov atom 307.sub.1 in partition P1 and moov atom 307.sub.2 in partition P2). Positioning the moov atom at the beginning of the partitions can reduce the first-view latency by enabling the user media player to start decoding and playing the first partition (e.g., P1) independently of the transcoding completion status and delivery of the other partitions (e.g., for video streaming, progressive downloading, etc.).
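
Strictly as an illustrative sketch, aligning candidate partitions to the closest key frame location might be expressed as follows. The function names, the use of seconds as the unit, and the sample key frame times are assumptions made for illustration only; they are not taken from the disclosed embodiments.

```python
from bisect import bisect_left


def candidate_boundaries(duration, n_workers):
    """Equal-duration split points (seconds), as in the legacy approach."""
    return [duration * i / n_workers for i in range(1, n_workers)]


def align_to_key_frames(candidates, key_frames):
    """Snap each candidate boundary to the closest key frame time."""
    key_frames = sorted(key_frames)
    aligned = []
    for c in candidates:
        i = bisect_left(key_frames, c)
        # Only the key frames on either side of the candidate can be closest.
        neighbors = key_frames[max(i - 1, 0):i + 1]
        best = min(neighbors, key=lambda k: abs(k - c))
        # Skip a boundary that would duplicate or precede the previous one.
        if not aligned or best > aligned[-1]:
            aligned.append(best)
    return aligned


# Hypothetical key frame times for a 30-second file split across 3 workers.
key_frame_times = [0.0, 4.0, 9.5, 14.0, 21.0, 30.0]
boundaries = align_to_key_frames(candidate_boundaries(30, 3), key_frame_times)
```

In this sketch the candidate boundaries at 10 and 20 seconds move to the key frames at 9.5 and 21.0 seconds, so each partition begins on a key frame.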

[0049] FIG. 3B1 presents a block diagram 3B100 showing media file reformatting as implemented in systems for low latency and low defect media file transcoding using optimized partitioning. The techniques for media file reformatting can be practiced in any environment.

[0050] As shown, a media file is laid out in a first format (e.g., the shown source format) where the file is organized into multiple adjacent partitions. The adjacent partitions comprise playlist data (e.g., playlist.sub.F1 322.sub.F1), video data (e.g., video extent 324), and audio data (e.g., audio extent 323). One way to convert from a source format 342.sub.1 to a target format 352.sub.1 is to use a multi-format transcoder where the multi-format transcoder accepts a media file in a source format and produces a media file in a target format. Another way to convert from certain source formats to certain target formats is to use a non-transcoding segment reformatter 320. Such a non-transcoding segment reformatter 320 segments a video stream into a plurality of video segments (e.g., video segment1 326 through video segmentN 328). In some cases, and as shown, a particular video segment may have corresponding audio data (e.g., the soundtrack for the particular video segment).

[0051] A non-transcoding segment reformatter 320 can combine (e.g., interleave) a particular video segment with corresponding audio data. The act of combining can include producing a series of extents (e.g., 512 byte blocks or 1 k byte blocks) that can be stored as a file. As shown an extent includes both video data and audio data. The combination into the extent can involve interleaving. Interleaving can be implemented where one extent comprises a video segment (e.g., video segment1 326, or video segmentN 328) as well as an audio segment (e.g., audio segment1 327, or audio segmentN 329). Different interleaving techniques interleave within an extent at different degrees of granularity. For example, the combination of video data and audio data within the extent can involve interleaving at a block-level degree of granularity, or at a timecode degree of granularity, or at a byte-by-byte or word-by-word degree of granularity.

[0052] In some embodiments, a non-transcoding segment reformatter 320 can combine video in a particular video format (e.g., in an HTTP live streaming (HLS) format or a dynamic adaptive streaming over HTTP (DASH) format) with corresponding audio data in a particular audio format or encoding (e.g., as an advanced audio coding (AAC) stream or as an MP3 stream). In some cases interleaving can be implemented by merely moving video or audio segments from one location (e.g., from a location in a source format) to a location in a target format without performing signal-level transcoding operations. In some cases interleaving can be implemented by merely moving video segments from one location (e.g., from a location in a source format) to a location in a target format without performing any video signal-level transcoding operations. Audio data can be segmented and transcoded as needed (e.g., using the shown audio transcoder 331) to meet the specification of the target format. For example, in situations where the video is already being delivered in a standard H.264 encoding, the video does not need to be re-transcoded; however, if the audio is encoded using (for example) the free lossless audio codec (FLAC), the audio might need to be transcoded into an appropriate target format (e.g., MP3, high-efficiency advanced audio coding (HE-AAC), or AC-3).

[0053] Some formats include a playlist. Such a playlist might identify titles or chapters or other positions in a corresponding stream. For example, the playlist.sub.F1 322.sub.F1 in the depicted source format includes a series of markers (e.g., titles or chapters or other positions). For each marker, the playlist.sub.F1 322.sub.F1 includes two pointers or offsets: (1) into the video data extent 324, and (2) into the audio data extent 323. Strictly as an additional example, the playlist.sub.F2 322.sub.F2 in the depicted target format includes a series of markers (e.g., titles or chapters or other positions) where, for each marker, the playlist.sub.F2 322.sub.F2 includes one pointer to an extent. The degree of interleaving is assumed or inherent or can be determined from characteristics of the target format.

[0054] FIG. 3B2 presents a block diagram 3B200 showing a variation of the media file reformatting as depicted in FIG. 3B1. One way to convert from a source format 342.sub.2 to a target format 352.sub.2 is to use a multi-format transcoder where the multi-format transcoder accepts a media file in a source format (e.g., in an interleaved format) and produces a media file in a target format (e.g., comprising a video extent and an audio extent).

[0055] FIG. 3C depicts a media file reformatting system 3C00 as implemented in systems for low latency and low defect media file transcoding. As shown, one or more media files comprise a video portion, a respective audio portion, and a respective playlist. Each of the shown video files is encoded and/or formatted in a source format 342.sub.3. A non-transcoding media file reformatter 350 serves to process each video portion, its respective audio portion, and its respective playlist so as to generate a media file in the target format 354.sub.3.

[0056] In some embodiments the non-transcoding media file reformatter 350 moves video data from its position in the media file of the source format (e.g., preceding the audio portion) to a position in the target format (e.g., following the audio portion). As shown, the video portions of a media file (e.g., video portion 346.sub.1, video portion 346.sub.2, video portion 346.sub.N) are moved to different positions in the media file of the target format. Also as shown, the audio portions of a media file (e.g., audio portion 348.sub.1, audio portion 348.sub.2, audio portion 348.sub.N) are moved to different positions in the media file of the target format. Playlists of media files in the source format (e.g., playlist.sub.S1, playlist.sub.S2, . . . playlist.sub.SN) are converted using the non-transcoding media file reformatter 350 such that the playlists of media files in the target format (e.g., playlist.sub.T1, playlist.sub.T2, . . . playlist.sub.TN) are adjusted so as to point to the same titles, chapters, locations, etc. as were present in the playlists of media files in the source format.

[0057] FIG. 3D depicts an on-the-fly watermarking system 3D00 as implemented in systems for low latency and low defect media file transcoding. As shown, the system includes a watermark generator 380 and a watermark applicator 381. On-the-fly watermarking is performed as follows: A server video cache 396 holds video segments that are made ready for delivery to a user/viewer. At some moment before delivery to the user/viewer, a selector 398 determines a next video segment (e.g., segment1 S.sub.1, segment2 S.sub.2, segment3 S.sub.3, . . . , segmentN S.sub.N) to send over a network (e.g., using sending unit 388) to a device destination prescribed by the user/viewer.

[0058] The watermark generator 380 has access (e.g., through a data access module 390) to a media file repository 382 that stores media files 386 as well as media playlist files 384. The watermark generator 380 further has access to an environmental data repository 385. A watermark can be generated based on any combination of data retrieved from any source. For example, a watermark can contain a user's unique information (e.g., name or nickname or email alias, etc.), and/or the name or identification of the user's device, the time of day, copyright notices, logos, etc. The watermark can be placed over the entire video and/or in a small section, and/or move around every X frames or Y seconds, etc. The watermark itself can also update as the video stream progresses.

[0059] In exemplary embodiments, a watermark can be generated by combining a segment from the server video cache with environmental data 394 retrieved from the environmental data repository 385. More specifically, a video segment can be watermarked by applying an image over one or more frames in the selected video segment. The aforementioned image might be an aspect of the requesting user, possibly including the requesting user's userID, either based on a plain text version of the userID, or based on an obfuscated version of the userID. In some scenarios, the aforementioned image might include an aspect of a time (e.g., a timestamp), and/or some branding information (e.g., a corporate logo).

[0060] In some cases, the watermark generator 380 performs only watermark generation so as to produce a generated watermark 393 and passes the generated watermark to the watermark applicator 381. The watermark applicator in turn can apply the generated watermark 393 to a video segment so as to produce watermarked selected segments (e.g., watermarked selected segment 389.sub.1, watermarked selected segment 389.sub.2). In some cases a watermark can be included in or on an added frame. In such cases, the length of the video segment is changed (e.g., by the added frame). As such, the playlist regenerator 387 can retrieve and process a selected segment playlist (e.g., see selected segment playlist 392.sub.1 and selected segment playlist 392.sub.2) as well as instructions from the watermark applicator (e.g., "added one frame immediately after frame 0020") so as to produce a regenerated media file playlist 389 that corresponds to the media file from which the selected segment was cached.

[0061] In the context of watermarking video streams as well as in other video streams, some or all of the files in the file system may actually be located in a remote storage location (e.g., in a collaborative cloud-based storage system). To avoid incurring delays in the downloading and processing of that data, a client-side video cache 397 can be implemented. As such, serving of video segments is enhanced.

[0062] For example, consider the situation when a video file in a file system is selected to be played on a client-local video player, but the file is actually located across the network at another network location (e.g., on a server). If network conditions are perfect and download speeds are high enough, then it is quite possible for the video to be streamed across the network without any stalls or interruptions in the display of the video data. However, situations often exist where video downloads still need to perform even when the network conditions are not ideal. Consider if the system is configured such that the video starts being displayed as soon as portions of the video data are received at the local client. In this situation, there are likely to be intermittent interruptions in the video display, where a portion of the video is played, followed by an annoying stall in video playing (as network conditions cause delays in data downloads), followed by more of the video being displayed as additional data is downloaded.

[0063] The present embodiment of the invention provides an improved approach to displaying data in a virtual file system that significantly reduces or eliminates these intermittent interruptions. In the present embodiment, the data is not always immediately queued for display to the user. Instead, chunks (e.g., segments) of data are continuously requested by a client-local module (e.g., a client-local video player), and the data starts being displayed only when a sufficient amount of data (e.g., a sufficient number of segments) has been locally received to ensure smooth display of the data. This approach therefore avoids the problems associated with immediate playback of data, since there should always be enough data on hand to smooth out any changes in network conditions.
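
Strictly as an illustrative sketch, the deferred start of display might be modeled as a buffer that reports readiness only after a threshold number of segments has been received. The class name and the threshold value are assumptions for illustration; the embodiments do not prescribe a particular threshold.

```python
class StartupBuffer:
    """Hold received segments until enough are on hand to begin display."""

    def __init__(self, min_segments=3):
        # min_segments is a hypothetical tuning parameter.
        self.min_segments = min_segments
        self.segments = []
        self.playing = False

    def receive(self, segment):
        """Store a segment and return True once display may begin."""
        self.segments.append(segment)
        if not self.playing and len(self.segments) >= self.min_segments:
            self.playing = True
        return self.playing


buffer = StartupBuffer(min_segments=2)
states = [buffer.receive(s) for s in ("S1", "S2", "S3")]
```

A larger threshold trades a longer start-up delay for more tolerance of changes in network conditions.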

[0064] FIG. 3D includes one approach to implement aspects of video segment caching. In the depicted system, a request originating from a user device is received by the server. The requested data may correspond to any granularity of data (e.g., block, segment, chapter, title, etc.). For example, an entire media file or title may be requested; or, only a range of blocks or chapter of the media can be requested. Further, the requested data may pertain to any type of structure or format.

[0065] FIG. 4A presents a processing timeline 4A00 to show full file transcoding latency for comparison to techniques for low latency and low defect media file transcoding using optimized partitioning. As an option, one or more instances of processing timeline 4A00 or any aspect thereof may be implemented in the context of the architecture and functionality of the embodiments described herein. The processing timeline 4A00 or any aspect thereof may be implemented in any desired environment.

[0066] Processing timeline 4A00 shows an original media file01 404.sub.1 transcoded into a transcoded media file01 406.sub.1 using a full file transcoding approach 410. As shown, such an approach introduces a setup time 402 for a computing device or resource to prepare for transcoding an entire media file (e.g., a two-hour movie). The full file transcoding approach 410 then proceeds to transcode the original media file01, requiring that a user desiring to view the transcoded file wait until the entire file is transcoded, thus experiencing a full file transcoding approach first-view latency 412. For example, transcoding a one-hour video to certain combinations of formats and resolutions can result in the full file transcoding approach first-view latency 412 being one to two hours. For comparison to the full file transcoding approach 410, FIG. 4B illustrates the herein disclosed techniques for reducing the first-view latency of transcoded media files, while minimizing or eliminating defects in the transcoded media files.

[0067] FIG. 4B presents a latency timeline 4B00 illustrating a low first-view latency technique used in systems implementing low latency and low defect media file transcoding using optimized partitioning. As an option, one or more instances of latency timeline 4B00 or any aspect thereof may be implemented in the context of the architecture and functionality of the embodiments described herein. The latency timeline 4B00 or any aspect thereof may be implemented in any desired environment.

Optimized Partitioning Approach Using Increasing Chunk Size

[0068] Latency timeline 4B00 shows an original media file01 404.sub.1 transcoded into a transcoded media file01 406.sub.1 using an optimized partitioning approach 420 and a progressive optimized partitioning approach 430 according to the herein disclosed techniques for low latency and low defect media file transcoding using optimized partitioning. Specifically, the optimized partitioning approach 420 can determine optimized partitioning of the original media file01 based in part on the current and/or target format (e.g., key frame location) and/or available computing resources. The resulting partitions are delivered to a set of computing resources for parallel processing (e.g., transcoding) and assembled into a file container. As shown, in some embodiments, the transcoded file can be viewed by a user when the first partition (e.g., P1) has been processed and delivered, resulting in an optimized partitioning first-view latency 422. In the embodiment implementing the progressive optimized partitioning approach 430, the first partition is relatively small to enable a faster transcoding processing time for an initial clip. A shorter progressive optimized partitioning first-view latency 432 can be implemented to improve still further over the earlier-described latency of optimized partitioning first-view latency 422. In both approaches shown in FIG. 4B, the first-view latencies can be substantially shorter (e.g., seconds) than the full file transcoding approach first-view latency 412 as shown in FIG. 4A (e.g., hours). The initial relatively small clip enables a faster transcoding processing time for an initial clip. Successively larger chunks (e.g., clip size) can be determined as transcoding proceeds through the original media file. More particularly, although the initial, relatively small clip is transcoded and presented to the requestor with low latency, the successively larger clip sizes can improve overall performance of the system as a whole. 
Use of progressive chunk size (e.g., even for just a few of the chunks after the first one) facilitates delivering more and better quality upgrades (e.g., via more and better quality levels) at a faster rate. Consider a system that implements equal chunk sizes of 10 seconds. In such a case, a quality upgrade can only happen at a chunk boundary. It follows then that it would take at least 10 seconds of playback before the quality can be upgraded. Initially, the client starts with a default quality (e.g., a very low one such as 360p). If the client can actually support 1080p, which may be four quality jumps away, it will take four chunk times (40 seconds) to reach that quality. By using progressive chunk sizes initially, the amount of time it takes to upgrade to the maximum available quality is shortened. An alternative technique of making all the chunks small (e.g., 2 seconds each) so as to reduce the chunk time quantum would result in a very large number of chunks, and each individual network request (e.g., request for a chunk) introduces communication latency and processing overhead, resulting in undesired slower retrieval of data. Increasing the chunk size strikes a balance between use of system resources and user experience.
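
Strictly as an illustrative sketch, the arithmetic above can be checked by comparing equal 10-second chunks against a progressive schedule. The doubling growth factor, the 2-second first chunk, and the 10-second cap are assumptions chosen for illustration; the sketch also assumes one quality upgrade per chunk boundary.

```python
def progressive_durations(first, growth, cap, total):
    """Chunk durations (seconds) that start small and grow up to a cap."""
    durations, covered, nominal = [], 0, first
    while covered < total:
        # Never exceed the cap or run past the end of the media.
        duration = min(nominal, cap, total - covered)
        durations.append(duration)
        covered += duration
        nominal *= growth
    return durations


def time_to_upgrade(durations, quality_jumps):
    """Playback seconds before the given number of quality jumps can occur."""
    return sum(durations[:quality_jumps])


equal = [10] * 12                               # 120 seconds, equal chunks
progressive = progressive_durations(2, 2, 10, 120)

equal_wait = time_to_upgrade(equal, 4)          # 10 + 10 + 10 + 10
progressive_wait = time_to_upgrade(progressive, 4)  # 2 + 4 + 8 + 10
```

Under these assumptions the progressive schedule reaches the fourth quality jump after 24 seconds rather than 40, while using 14 chunks rather than the 60 that uniform 2-second chunks would require.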

[0069] FIG. 5 is a flow diagram 500 illustrating a system for low latency and low defect media file transcoding using optimized partitioning. As an option, one or more instances of flow diagram 500 or any aspect thereof may be implemented in the context of the architecture and functionality of the embodiments described herein. The flow diagram 500 or any aspect thereof may be implemented in any desired environment.

[0070] Flow diagram 500 in FIG. 5 shows one embodiment of representative modules of, and flows through, a system for implementing techniques for low latency and low defect media file transcoding using optimized partitioning. Specifically, an intermediate server 506 can comprise a partitioner module 508, a partition workload assigner 510, and a transcoder partition assembler 512. The intermediate server 506 can further communicate with a cloud-computing provider 520 comprising a plurality of workload nodes (e.g., workload node 524.sub.1, workload node 524.sub.2, workload node 524.sub.3, to workload node 524.sub.N) to perform various distributed computing and storage operations. Specifically, as pertains to the herein disclosed techniques, the partitioner module 508 of the intermediate server 506 can receive an original media file 502 from a storage facility for transcoding. The partitioner module 508 can analyze the original media file 502 to determine optimized partitions for transcoding (e.g., based in part on the target format or formats and/or available computing resources at cloud computing provider 520). When the partitions and partition boundaries have been determined, the partition workload assigner 510 can assign the respective partitions to the workload nodes at the cloud computing provider 520 to transcode the partitioned original media file segments to respective transcoded media file segments. The transcoder partition assembler 512 can then assemble the transcoded media file segments into a transcoded media video file 504.
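
Strictly as an illustrative sketch, the flow from partition assignment through assembly might be arranged as follows. A thread pool stands in for the workload nodes, and the transcoding step is a placeholder that merely upper-cases a string; both are assumptions for illustration and do not represent an actual transcoder or cloud provider interface.

```python
from concurrent.futures import ThreadPoolExecutor


def transcode_partition(partition):
    """Placeholder for the work performed by one workload node."""
    return partition.upper()


def transcode_in_parallel(partitions, max_workers=4):
    """Assign each partition to a worker and collect results in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, so the assembler
        # receives transcoded partitions in playback order.
        return list(pool.map(transcode_partition, partitions))


def assemble(transcoded_partitions):
    """Join transcoded partitions into a single output."""
    return "".join(transcoded_partitions)


result = assemble(transcode_in_parallel(["p1-", "p2-", "p3"]))
```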

Caching

[0071] FIG. 6A is a flow diagram 6A00 illustrating caching of an initial clip as used in systems for low latency and low defect media file transcoding using optimized partitioning. Performance of previewing and other delivery of media to a collaborator can be improved by pre-transcoding an initial portion of a media file upon initial posting. For example, and as shown in FIG. 6A, when a user requests to view a media clip (e.g., by selecting from a workspace media preview icon), the system checks in a cache 602 for the presence of the media clip (or portion thereof). If the requested media clip is already available in the cache, then the clip is served. If the requested media clip is not yet available in the cache, then at least an initial portion of the media can be transcoded, stored in the cache, and served to the requestor. Various techniques for pre-transcoding an initial portion of a media file are shown and discussed as pertains to FIG. 6B.
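
Strictly as an illustrative sketch, the cache check described above might be expressed as follows. The dictionary-based cache and the transcode_initial callable are hypothetical stand-ins for the cache 602 and the transcoder.

```python
def serve_clip(media_id, cache, transcode_initial):
    """Serve a clip from the cache, transcoding an initial portion on a miss."""
    clip = cache.get(media_id)
    if clip is None:
        # Cache miss: transcode at least an initial portion, then store it.
        clip = transcode_initial(media_id)
        cache[media_id] = clip
    return clip


transcode_calls = []


def fake_transcode_initial(media_id):
    """Hypothetical transcoder that records each invocation."""
    transcode_calls.append(media_id)
    return f"initial-clip-{media_id}"


clip_cache = {}
first = serve_clip("m1", clip_cache, fake_transcode_initial)
second = serve_clip("m1", clip_cache, fake_transcode_initial)
```

The second request is served from the cache, so the transcoder is invoked only once.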

Pre-Transcoding

[0072] FIG. 6B is a flow diagram 6B00 illustrating pre-transcoding of an initial clip as used in systems for low latency and low defect media file transcoding using optimized partitioning. As shown, a creator collaborator 125 might provide a media file for uploading to the remote collaborative cloud-based storage system. At that time, the system can detect that the media file is new to the system and, in accordance with some pre-transcoding techniques, the new file can be preprocessed. Many preprocessing operations can be performed, in particular pre-transcoding of an initial clip. When a collaborator requests to view that media, a preprocessed initial clip of the newly received file is available and can be served to the requestor upon request, resulting in low-latency delivery of the requested media and providing a good user experience. In some embodiments, the media file can be transcoded into multiple initial clips corresponding to various initial lengths (e.g., 10 seconds, 20 seconds, 30 seconds, etc.) and corresponding to various qualities (e.g., 480p, 720p, 1080p, etc.).

Frame-by-Frame Delivery Using a Delivery Pipeline

[0073] FIG. 6C is a flow diagram 6C00 illustrating frame-by-frame delivery of a video clip as used in systems for low latency and low defect media file transcoding using optimized partitioning. In some cases, an initial transcoded clip or other requested transcoded clip might not be pre-transcoded (e.g., might not yet be stored in the cache, and might not be stored in the remote collaborative cloud-based storage system). For example, if the transcoder needs to send a chunk to the client but does not have the chunk transcoded, it will start the transcoding operation for that chunk and, as each frame is transcoded, send the frame down to the client, keeping the connection open for subsequent frames. If the transcoder does have the chunk already transcoded (e.g., because it is running with more resources than it needs and is further ahead in the transcoding operation than the client is in the viewing operation), then it simply sends the entire chunk down at once (e.g., without using a frame-by-frame pipeline).
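
Strictly as an illustrative sketch, the choice between frame-by-frame delivery and whole-chunk delivery might be expressed as a generator. The transcode_frames callable and the dictionary of already transcoded chunks are hypothetical; an actual system would write each yielded payload to an open network connection.

```python
def deliver_chunk(chunk_id, transcoded_chunks, transcode_frames):
    """Yield the payloads to send to the client for one chunk."""
    if chunk_id in transcoded_chunks:
        # Already transcoded: send the entire chunk at once.
        yield ("chunk", transcoded_chunks[chunk_id])
        return
    frames = []
    for frame in transcode_frames(chunk_id):
        # Send each frame as soon as it has been transcoded.
        frames.append(frame)
        yield ("frame", frame)
    # Keep the finished chunk so that later requests skip the pipeline.
    transcoded_chunks[chunk_id] = frames


def fake_transcode_frames(chunk_id):
    """Hypothetical transcoder producing two frames per chunk."""
    for n in range(2):
        yield f"{chunk_id}-f{n}"


store = {}
first_delivery = list(deliver_chunk(7, store, fake_transcode_frames))
second_delivery = list(deliver_chunk(7, store, fake_transcode_frames))
```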

[0074] In some situations a low latency preview can be facilitated by transcoding a very small portion for initial delivery (e.g., see the technique of FIG. 4B). In other cases, such as when sufficient computing resources needed to transcode an entire initial clip cannot be reserved, a technique of establishing a frame-by-frame pipeline for transcoding and delivery can be employed. When sufficient computing resources needed to transcode become available, the frame-by-frame pipeline can be flushed, and other transcoding and delivery techniques can be pursued.

Frame-by-Frame Delivery Using Automatic Pre-Generation of a Playlist

[0075] FIG. 6D1 is a flow diagram illustrating playlist generation from a video clip as used in systems for low latency and low defect media file transcoding using optimized partitioning. In many situations, a playlist (e.g., an HLS playlist) or manifest (e.g., a DASH manifest) is delivered before delivery of a media clip or portion thereof. In many cases it is felicitous to pre-generate a playlist or manifest upon receipt of a request to access (e.g., watch) a media file. The requester can see various characteristics of the entire media file, and a media player can present navigation controls that are substantially accurate vis-a-vis the entire media file. In some cases many different sized clips, possibly using different qualities of video, can be delivered. In some such cases the timecode of the different clips is corrected (see FIG. 6D4, below).

Generation of URLs Used for Retrieval of Media Clips

[0076] FIG. 6D2 is a flow diagram illustrating generation of URLs for video clips as used in systems for low latency and low defect media file transcoding using optimized partitioning. Generating a playlist or manifest for a clip or series of clips involves relating a time or time range with a media file. Strictly as an example, a playlist corresponding to a series of successively larger chunks (e.g., such as discussed in the foregoing FIG. 4B) can be determined as transcoding proceeds through the original media file. The playlist can refer to the initial media file location (e.g., by URL) for immediate playback, and a playlist can identify the successively larger clips by referring to the media file location of respective successively larger clips. Multiple playlists can be generated upon presentation of a media file, and the resulting playlists or manifests can be stored or cached for fast retrieval.
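
Strictly as an illustrative sketch, a playlist relating successively larger chunks to media file locations might be generated as follows. The output uses standard HLS media playlist tags; the base URL, the clip naming scheme, and the chunk durations are assumptions for illustration.

```python
import math


def build_playlist(chunk_durations, base_url):
    """Build an HLS media playlist with one entry per chunk."""
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        # The target duration must be at least the longest chunk duration.
        f"#EXT-X-TARGETDURATION:{math.ceil(max(chunk_durations))}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    for index, duration in enumerate(chunk_durations):
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(f"{base_url}/clip_{index}.ts")
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines)


playlist = build_playlist([2, 4, 8], "http://example.com/media")
```

Because every entry carries its own duration, a media player can present navigation controls for the entire media file before any of the larger clips has been transcoded.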

Generation of Stateful URLs Used for Retrieval of Media Clips

[0077] FIG. 6D3 is a flow diagram illustrating generation of URLs for video clips as used in systems for low latency and low defect media file transcoding using optimized partitioning. Generating a playlist or manifest for a clip or series of clips involves relating a time or time range with a media file. Strictly as an example, a playlist corresponding to a series of successively larger chunks (e.g., such as discussed in the foregoing FIG. 4B) can be determined as transcoding proceeds through the original media file. As shown, the processing proceeds as follows: (1) a user requests video media from a workspace, (2) the video media metadata is retrieved, and (3) a playlist is generated that contains state-encoded URLs.

[0078] In exemplary embodiments, there is one URL generated for each chunk in the playlist, and each URL (e.g., a state-encoded URL entry) that is generated corresponds to an independent transcoding job for that specific chunk. When the URL is accessed by the player, such as after the player reads the playlist file and requests a specific chunk, the state-encoding in the URL pertaining to that chunk communicates state variables (e.g., variable values) to the transcoding server, which then operates using the state variables and other encoded information necessary to provide the appropriate transcoded chunk. The transcoded chunk data is then delivered to the requesting client. An example of a stateful URL is:

TABLE-US-00001 "http://transcode-001.streem.com/1080.ts?start=20&chunkDuration=10&totalDuration=120&orientation=180&mediaId=184719719371&userId=38205818491&jobId=5112485"

[0079] The URL itself comprises information delivered to the transcoder. In the example above, the start location (e.g., "start=20"), the chunk duration (e.g., "chunkDuration=10"), and other information can be known by the transcoder, merely by receiving and parsing the stateful URL.
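
Strictly as an illustrative sketch, a transcoder might recover the state variables from the example stateful URL using standard URL parsing. The parameter names are those of the example URL (with the line-wrap spaces removed); reading the "1080" in the path as the requested quality is an assumption, as is the shape of the returned dictionary.

```python
from urllib.parse import parse_qs, urlsplit


def parse_stateful_url(url):
    """Extract transcoding state variables from a state-encoded URL."""
    parts = urlsplit(url)
    query = {key: values[0] for key, values in parse_qs(parts.query).items()}
    # Assumption: the file name in the path (e.g., "1080.ts") names the quality.
    quality = parts.path.rsplit("/", 1)[-1].split(".")[0]
    return {
        "quality": quality,
        "start": int(query["start"]),
        "chunk_duration": int(query["chunkDuration"]),
        "total_duration": int(query["totalDuration"]),
        "orientation": int(query["orientation"]),
        "media_id": query["mediaId"],
        "user_id": query["userId"],
        "job_id": query["jobId"],
    }


state = parse_stateful_url(
    "http://transcode-001.streem.com/1080.ts?start=20&chunkDuration=10"
    "&totalDuration=120&orientation=180&mediaId=184719719371"
    "&userId=38205818491&jobId=5112485"
)
```

Because all of the state travels in the URL, any transcoding server that receives the request can service it without consulting per-session state.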

Timecode Correction Techniques to Compensate for Variable Communication Channels

[0080] FIG. 6D4 is a flow diagram illustrating timecode correction techniques used when transcoding or delivering video clips to viewers as used in systems for low latency and low defect media file transcoding using optimized partitioning. As heretofore described, the environment in which a video clip can be served might vary widely depending on the situation (e.g., serving to a client with a hardline Internet connection, serving to a mobile device over a wireless channel, etc.). Moreover, the environment in a particular situation can vary in real-time. For example, the bit rate available to and from a mobile device might vary substantially while the mobile device user traverses between cellular towers. More generally, many variations in the communication fabric to and from a user device can occur at any time. Various techniques ameliorate this situation by selecting a next clip to deliver to a mobile device where the next clip is selected on the basis of determined characteristics of the available communication channels. For example, during periods where the communication channel is only able to provide (for example) 500 Kbps of bandwidth, a relatively low quality of video (e.g., 480p) can be delivered. When the communication channel improves so as to be able to provide (for example) higher bandwidth, then a relatively higher quality of video (e.g., 1080p) can be delivered to the viewer. Variations of quality vis-a-vis available bandwidth can occur dynamically over time, and any time duration for re-measuring available bandwidth can be relatively shorter or relatively longer. One artifact of such dynamic selection of a particular quality of a video clip is that the timecode of the next delivered clip needs to be corrected, depending on the selected quality. In particular, the metadata of the clip has to be compared with the playlist so that the playlist will still serve for navigation (e.g., fast forward, rewind, etc.) purposes.
Further, absent timecode correction, a succession of chunks exhibit artifacts and glitches (e.g., image freezing, skipping around, missing frames, etc.). In exemplary embodiments, transcoding is performed "on-the-fly" using a "dynamic transcoder". A dynamic transcoder starts and stops frequently based on incoming requests. In some implementations, the starting and stopping of the transcoder resets the timecode of the chunks it generates. One or more timecode correction techniques are applied to make each chunk indistinguishable from a chunk that had been made in a single continuous transcoding job.
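
Strictly as an illustrative sketch, the bandwidth-driven selection of the next clip's quality described in paragraph [0080] might look like the following. The ladder thresholds, the name `QUALITY_LADDER`, and the function `select_quality` are assumptions made for this example; the disclosure gives only the 500 Kbps/480p pairing and does not prescribe specific thresholds.

```python
# Hypothetical bitrate ladder: (quality label, minimum bandwidth in Kbps).
# Only the 500 Kbps -> 480p pairing comes from the text; the other
# thresholds are illustrative assumptions.
QUALITY_LADDER = [
    ("1080p", 5000),
    ("720p", 2500),
    ("480p", 500),
]


def select_quality(measured_kbps: float) -> str:
    """Return the highest quality whose threshold the measured bandwidth meets."""
    for label, min_kbps in QUALITY_LADDER:
        if measured_kbps >= min_kbps:
            return label
    # Below every threshold: fall back to the lowest rung rather than stall.
    return QUALITY_LADDER[-1][0]
```

A caller would re-measure bandwidth at whatever interval suits the deployment and call `select_quality` before requesting each next clip.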

[0081] Strictly as one example, when a transcoding job is interrupted, a timecode correction is needed. A transcoding job might be interrupted whenever a request for a chunk that the transcoder does not have transcoded at that time is received. Such a condition can occur when the client requests a different quality than the previous chunk and/or when the client requests a chunk that is determined to be a forward chunk (e.g., a "seek") in the video rather than a next chunk as would be received during continuous playback of the video.
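
One possible form of timecode correction, sketched under stated assumptions, is to shift the timestamps of a chunk produced by a restarted transcoder back onto the continuous timeline. The `Frame` type, the function `correct_timecodes`, and the assumption that every preceding chunk has the same fixed duration are all hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    pts: float  # presentation timestamp, in seconds


def correct_timecodes(frames: List[Frame], chunk_index: int,
                      chunk_duration: float) -> List[Frame]:
    """Shift a restarted chunk's frames onto the continuous timeline.

    Assumes every chunk before this one lasts `chunk_duration` seconds,
    so the chunk's first frame belongs at chunk_index * chunk_duration.
    """
    if not frames:
        return []
    expected_start = chunk_index * chunk_duration
    # The restarted transcoder stamped the first frame near zero; the
    # difference from where it belongs is the correction to apply.
    offset = expected_start - frames[0].pts
    return [Frame(pts=f.pts + offset) for f in frames]
```

For example, a chunk with index 2 and a 10-second chunk duration whose frames were stamped 0.0, 0.5, and 1.0 would be re-stamped 20.0, 20.5, and 21.0, as if it had come from a single continuous transcoding job.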

Access Techniques Using a Virtual File System

[0082] FIG. 6E is a flow diagram 6E00 illustrating techniques for accessing media files through a custom virtual file system as used when delivering video clips to collaborators. In certain situations, the remote collaborative cloud-based storage systems rely on a particular file system (e.g., NTFS, CIFS, etc.), and in some situations, the characteristics of such a particular file system might not map conveniently to the functional requirements of a transcoding and delivery service. One technique to ameliorate differences between the functional requirements of a transcoding and delivery service and a back-end file system is to provide a custom virtual file system, which can be used as a video prefetcher 620. The video prefetcher is used for various purposes, including:

[0083] 1. Listing the available video files on the user's remote cloud account. The file listing presents actual, readily available files so as to eliminate or reduce the need to write custom code to interact with the APIs of the cloud provider.

[0084] 2. Handling the downloading of video file content (e.g., frames, bytes, etc.).

[0085] 3. Prefetching frames or bytes of the video file that are most likely to be accessed in the near future by the transcoder. This reduces network load by reducing or eliminating the need to fetch data every time the transcoder needs new portions of the video file. As such, throughput and latency are greatly improved.

[0086] 4. Providing a local repository (e.g., a cache) of the video file that is being requested so that subsequent accesses to the same part of the video file by the transcoder can be retrieved from the local repository rather than from cloud-based storage.
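
The prefetching and local-repository purposes listed above can be illustrated with a minimal read-ahead cache. This is a sketch only: the class `PrefetchingReader`, the block-indexed `fetch_block` callable, and the assumption of mostly sequential access are hypothetical, and a production prefetcher would likely fetch ahead in the background rather than synchronously as shown here.

```python
from typing import Callable, Dict


class PrefetchingReader:
    """Block-level read-ahead cache in front of a slow remote fetch."""

    def __init__(self, fetch_block: Callable[[int], bytes], read_ahead: int = 2):
        self._fetch_block = fetch_block  # hypothetical remote call, by block index
        self._read_ahead = read_ahead    # blocks to prefetch past the requested one
        self._cache: Dict[int, bytes] = {}

    def _load(self, index: int) -> bytes:
        # Go to remote storage only on a cache miss.
        if index not in self._cache:
            self._cache[index] = self._fetch_block(index)
        return self._cache[index]

    def read_block(self, index: int) -> bytes:
        data = self._load(index)
        # Sequential access is assumed, so the following blocks are the
        # ones most likely to be requested next.
        for ahead in range(index + 1, index + 1 + self._read_ahead):
            self._load(ahead)
        return data
```

Repeated reads of the same block, or reads of a block that was already prefetched, are then served from the local cache without another remote fetch.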

[0087] The video prefetcher 620 serves to fetch the predicted next parts of the original video that the transcoder is predicted to process soon. This improves transcoding throughput by pipelining the downloading and the transcoding into adjacent pipeline phases. Further, the video prefetcher can be configured to cache recently downloaded videos so that the transcoder doesn't need to re-download the original when transcoding into another quality level or when re-transcoding for another user. In exemplary embodiments, the video prefetcher provides an abstraction layer to other elements of the system, thus allowing the transcoder to remain independent from all network requests. The transcoder is relieved of tasks and operations pertaining to downloading, authentication, identifying and transferring metadata, etc.
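
The pipelining of downloading and transcoding into adjacent phases can be sketched with a bounded queue between a downloader thread and the transcoding loop. The function `pipelined_transcode` and its `download` and `transcode` callables are hypothetical stand-ins for illustration; the disclosure does not specify this mechanism.

```python
import queue
import threading
from typing import Callable, Iterable, List


def pipelined_transcode(part_ids: Iterable[int],
                        download: Callable[[int], bytes],
                        transcode: Callable[[bytes], bytes],
                        depth: int = 2) -> List[bytes]:
    """Overlap the download of later parts with the transcoding of earlier ones."""
    buffer = queue.Queue(maxsize=depth)  # bounds how far downloading runs ahead
    done = object()                      # sentinel marking the end of the stream

    def downloader() -> None:
        try:
            for part_id in part_ids:
                buffer.put(download(part_id))  # blocks while the buffer is full
        finally:
            buffer.put(done)  # always signal completion, even after an error

    worker = threading.Thread(target=downloader, daemon=True)
    worker.start()

    results = []
    while True:
        item = buffer.get()
        if item is done:
            break
        results.append(transcode(item))
    worker.join()
    return results
```

Because there is a single producer and a single consumer on a FIFO queue, the transcoded parts come out in the same order as `part_ids`. The transcoding loop itself makes no network requests, in keeping with the abstraction described above.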

Additional Embodiments of the Disclosure

Additional Practical Application Examples

[0088] FIG. 7 depicts a system 700 as an arrangement of computing modules that are interconnected so as to operate cooperatively to implement certain of the herein-disclosed embodiments. The partitioning of system 700 is merely illustrative and other partitions are possible.

[0089] FIG. 7 depicts a block diagram of a system to perform certain functions of a computer system. As an option, the system 700 may be implemented in the context of the architecture and functionality of the embodiments described herein. Of course, however, the system 700 or any operation therein may be carried out in any desired environment. The system 700 comprises at least one processor and at least one memory, the memory serving to store program instructions corresponding to the operations of the system. As shown, an operation can be implemented in whole or in part using program instructions accessible by a module. The modules are connected to a communication path 705, and any operation can communicate with other operations over communication path 705. The modules of the system can, individually or in combination, perform method operations within system 700. Any operations performed within system 700 may be performed in any order unless as may be specified in the claims. The shown embodiment implements a portion of a computer system, presented as system 700, comprising a computer processor to execute a set of program code instructions (see module 710) and modules for accessing memory to hold program code instructions to perform: identifying a first media file having a first format to be converted to a second media file having a second format (see module 720); partitioning the first media file into two or more partitions separated by one or more partition boundaries, wherein the one or more partition boundaries are at a determined key frame position (see module 730); converting into the second format, the two or more partitions to respective two or more converted partitions, wherein the respective two or more partitions are converted by respective two or more computing devices (see module 740); and assembling the respective two or more converted partitions to comprise the second media file (see module 750).
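
As a hedged illustration of the partitioning operation of module 730, tentative partition boundaries can be moved to the closest key frame positions. The function `snap_to_key_frames`, the use of times in seconds, and the requirement that both input lists be sorted in ascending order are assumptions made for this sketch.

```python
from bisect import bisect_left
from typing import List


def snap_to_key_frames(boundaries: List[float],
                       key_frames: List[float]) -> List[float]:
    """Move each tentative boundary to the closest key frame time.

    Both lists are assumed sorted ascending; `key_frames` must be non-empty.
    """
    snapped: List[float] = []
    for b in boundaries:
        i = bisect_left(key_frames, b)
        # Only the key frames just before and just after b can be closest.
        candidates = key_frames[max(i - 1, 0):i + 1]
        nearest = min(candidates, key=lambda k: abs(k - b))
        # Skip boundaries that collapse onto an already chosen key frame.
        if not snapped or nearest > snapped[-1]:
            snapped.append(nearest)
    return snapped
```

With key frames at 0.0, 4.0, 9.5, 12.0, and 20.0 seconds, tentative boundaries at 5.0, 10.0, and 15.0 seconds snap to 4.0, 9.5, and 12.0 seconds. Each resulting partition then starts on a key frame and can be converted independently by a separate computing device.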

[0090] Variations include:

[0091] Variations where the determined key frame position is a closest key frame location to a respective one of the partition boundaries.

[0092] Variations that comprise storing at least a portion of the second media file into a cache.

[0093] Variations that comprise retrieving at least a portion of the second media file from a cache.

[0094] Variations that comprise delivering at least a portion of the second media file to a requestor to be viewed on a media player.

[0095] Variations that comprise assigning the two or more partitions to the respective two or more computing devices.

[0096] Variations where the two or more partitions are characterized by a set of progressively increasing durations.

[0097] Variations where the two or more partitions are characterized by a set of progressively decreasing durations.

[0098] Variations where the two or more partitions are characterized by a set of progressively increasing quality levels.

[0099] Variations where the partitioning is based at least in part on a total computing resource availability.

[0100] Variations where the partitioning is based at least in part on at least one of, the first format and the second format.

[0101] Variations where the respective two or more computing devices comprise a cloud-based computing system.

[0102] Variations where at least one of the respective two or more converted partitions comprises an attribute dataset.

[0103] Variations where the attribute dataset is at least one of, an atom, a movie atom, and a moov atom.

[0104] Variations where steps for converting into the second format comprise a timecode correction.

[0105] Variations that comprise generating a playlist based at least in part on the second media file.

[0106] Variations where at least some playlist entries comprise a state-encoded URL.

[0107] Variations where the respective two or more partitions are selected based on a length.

[0108] Variations where at least one of the respective two or more partitions is stored before an assembling step.

[0109] Variations where at least one of the respective two or more partitions corresponds to a beginning portion of the first media file.
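
The variation with progressively increasing partition durations can be sketched as a geometric schedule, in which a short first partition is ready quickly and later partitions grow up to a cap. The function `progressive_durations` and its default values for `first`, `growth`, and `cap` are illustrative assumptions and are not taken from the disclosure.

```python
from typing import List


def progressive_durations(total: float, first: float = 2.0,
                          growth: float = 2.0, cap: float = 30.0) -> List[float]:
    """Split `total` seconds into partitions whose durations grow geometrically."""
    if first <= 0 or growth < 1:
        raise ValueError("first must be positive and growth must be at least 1")
    durations: List[float] = []
    remaining = total
    current = first
    while remaining > 0:
        piece = min(current, remaining)  # the last partition takes what is left
        durations.append(piece)
        remaining -= piece
        current = min(current * growth, cap)  # grow, but never beyond the cap
    return durations
```

For a 60-second file with the default values, the schedule is 2, 4, 8, 16, and 30 seconds. A decreasing schedule, as in the adjacent variation, could be obtained by reversing the list.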

System Architecture Overview

Additional System Architecture Examples

[0110] FIG. 8A depicts a block diagram of an instance of a computer system 8A00 suitable for implementing embodiments of the present disclosure. Computer system 8A00 includes a bus 806 or other communication mechanism for communicating information. The bus interconnects subsystems and devices such as a CPU, or a multi-core CPU (e.g., processor 807), a system memory (e.g., main memory 808, or an area of random access memory RAM), a non-volatile storage device or area (e.g., ROM 809), an internal or external storage device 810 (e.g., magnetic or optical), a data interface 833, a communications interface 814 (e.g., PHY, MAC, Ethernet interface, modem, etc.). The aforementioned components are shown within processing element partition 801, however other partitions are possible. The shown computer system 8A00 further comprises a display 811 (e.g., CRT or LCD), various input devices 812 (e.g., keyboard, cursor control), and an external data repository 831.

[0111] According to an embodiment of the disclosure, computer system 8A00 performs specific operations by processor 807 executing one or more sequences of one or more program code instructions contained in a memory. Such instructions (e.g., program instructions 802.sub.1, program instructions 802.sub.2, program instructions 802.sub.3, etc.) can be contained in or can be read into a storage location or memory from any computer readable/usable medium such as a static storage device or a disk drive. The sequences can be organized to be accessed by one or more processing entities configured to execute a single process or configured to execute multiple concurrent processes to perform work. A processing entity can be hardware-based (e.g., involving one or more cores) or software-based, and/or can be formed using a combination of hardware and software that implements logic, and/or can carry out computations and/or processing steps using one or more processes and/or one or more tasks and/or one or more threads or any combination thereof.

[0112] According to an embodiment of the disclosure, computer system 8A00 performs specific networking operations using one or more instances of communications interface 814. Instances of the communications interface 814 may comprise one or more networking ports that are configurable (e.g., pertaining to speed, protocol, physical layer characteristics, media access characteristics, etc.) and any particular instance of the communications interface 814 or port thereto can be configured differently from any other particular instance. Portions of a communication protocol can be carried out in whole or in part by any instance of the communications interface 814, and data (e.g., packets, data structures, bit fields, etc.) can be positioned in storage locations within communications interface 814, or within system memory, and such data can be accessed (e.g., using random access addressing, or using direct memory access DMA, etc.) by devices such as processor 807.

[0113] The communications link 815 can be configured to transmit (e.g., send, receive, signal, etc.) communications packets 838 comprising any organization of data items. The data items can comprise a payload data area 837, a destination address 836 (e.g., a destination IP address), a source address 835 (e.g., a source IP address), and can include various encodings or formatting of bit fields to populate the shown packet characteristics 834. In some cases the packet characteristics include a version identifier, a packet or payload length, a traffic class, a flow label, etc. In some cases the payload data area 837 comprises a data structure that is encoded and/or formatted to fit into byte or word boundaries of the packet.

[0114] In some embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement aspects of the disclosure. Thus, embodiments of the disclosure are not limited to any specific combination of hardware circuitry and/or software. In embodiments, the term "logic" shall mean any combination of software or hardware that is used to implement all or part of the disclosure.

[0115] The term "computer readable medium" or "computer usable medium" as used herein refers to any medium that participates in providing instructions to processor 807 for execution. Such a medium may take many forms including, but not limited to, non-volatile media and volatile media. Non-volatile media includes, for example, optical or magnetic disks such as disk drives or tape drives. Volatile media includes dynamic memory such as a random access memory.

[0116] Common forms of computer readable media include, for example, floppy disk, flexible disk, hard disk, magnetic tape, or any other magnetic medium; CD-ROM or any other optical medium; punch cards, paper tape, or any other physical medium with patterns of holes; RAM, PROM, EPROM, FLASH-EPROM, or any other memory chip or cartridge, or any other non-transitory computer readable medium. Such data can be stored, for example, in any form of external data repository 831, which in turn can be formatted into any one or more storage areas, and which can comprise parameterized storage 839 accessible by a key (e.g., filename, table name, block address, offset address, etc.).

[0117] Execution of the sequences of instructions to practice certain embodiments of the disclosure is performed by a single instance of the computer system 8A00. According to certain embodiments of the disclosure, two or more instances of computer system 8A00 coupled by a communications link 815 (e.g., LAN, PSTN, or wireless network) may perform the sequence of instructions required to practice embodiments of the disclosure using two or more instances of components of computer system 8A00.

[0118] The computer system 8A00 may transmit and receive messages such as data and/or instructions organized into a data structure (e.g., communications packets 838). The data structure can include program instructions (e.g., application code 803), communicated through communications link 815 and communications interface 814. Received program code may be executed by processor 807 as it is received and/or stored in the shown storage device or in or upon any other non-volatile storage for later execution. Computer system 8A00 may communicate through a data interface 833 to a database 832 on an external data repository 831. Data items in a database can be accessed using a primary key (e.g., a relational database primary key).

[0119] The processing element partition 801 is merely one sample partition. Other partitions can include multiple data processors, and/or multiple communications interfaces, and/or multiple storage devices, etc. within a partition. For example, a partition can bound a multi-core processor (e.g., possibly including embedded or co-located memory), or a partition can bound a computing cluster having a plurality of computing elements, any of which computing elements are connected directly or indirectly to a communications link. A first partition can be configured to communicate to a second partition. A particular first partition and particular second partition can be congruent (e.g., in a processing element array) or can be different (e.g., comprising disjoint sets of components).

[0120] A module as used herein can be implemented using any mix of any portions of the system memory and any extent of hard-wired circuitry including hard-wired circuitry embodied as a processor 807. Some embodiments include one or more special-purpose hardware components (e.g., power control, logic, sensors, transducers, etc.). A module may include one or more state machines and/or combinational logic used to implement or facilitate the performance characteristics of low latency and low defect media file transcoding using optimized partitioning.

[0121] Various implementations of the database 832 comprise storage media organized to hold a series of records or files such that individual records or files are accessed using a name or key (e.g., a primary key or a combination of keys and/or query clauses). Such files or records can be organized into one or more data structures (e.g., data structures used to implement or facilitate aspects of low latency and low defect media file transcoding using optimized partitioning). Such files or records can be brought into and/or stored in volatile or non-volatile memory.

[0122] FIG. 8B depicts a block diagram of an instance of a cloud-based environment 8B00. Such a cloud-based environment supports access to workspaces through the execution of workspace access code (e.g., workspace access code 853.sub.1 and workspace access code 853.sub.2). Workspace access code can be executed on any of the shown user devices 852 (e.g., laptop device 852.sub.4, workstation device 852.sub.5, IP phone device 852.sub.3, tablet device 852.sub.2, smart phone device 852.sub.1, etc.). A group of users can form a collaborator group 858, and a collaborator group can comprise any types or roles of users. For example, and as shown, a collaborator group can comprise a user collaborator, an administrator collaborator, a creator collaborator, etc. Any user can use any one or more of the user devices, and such user devices can be operated concurrently to provide multiple concurrent sessions and/or other techniques to access workspaces through the workspace access code.

[0123] A portion of workspace access code can reside in and be executed on any user device. In addition, a portion of the workspace access code can reside in and be executed on any computing platform (e.g., computing platform 860), including in a middleware setting. As shown, a portion of the workspace access code (e.g., workspace access code 853.sub.3) resides in and can be executed on one or more processing elements (e.g., processing element 862.sub.1). The workspace access code can interface with storage devices such as the shown networked storage 866. Workspaces and/or any constituent files or objects, and/or any other code or scripts or data can be stored in any one or more storage partitions (e.g., storage partition 864.sub.1). In some environments, a processing element includes forms of storage, such as RAM and/or ROM and/or FLASH, and/or other forms of volatile and non-volatile storage.

[0124] A stored workspace can be populated via an upload (e.g., an upload from a user device to a processing element over an upload network path 857). One or more constituents of a stored workspace can be delivered to a particular user and/or shared with other particular users via a download (e.g., a download from a processing element to a user device over a download network path 859).

[0125] In the foregoing specification, the disclosure has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the disclosure. For example, the above-described process flows are described with reference to a particular ordering of process actions. However, the ordering of many of the described process actions may be changed without affecting the scope or operation of the disclosure. The specification and drawings are to be regarded in an illustrative sense rather than in a restrictive sense.

* * * * *
