United States Patent Application 20140362919
Kind Code: A1
Zhou; Xiaosong; et al.
Published: December 11, 2014

U.S. patent application number 13/913,169, "Coherence Groups: Region Descriptors for Low Bit Rate Encoding," was filed with the patent office on June 7, 2013 and published on December 11, 2014. The applicant listed for this patent is Apple Inc. The invention is credited to Chris Y. Chung, David R. Conrad, Albert E. Keinath, Jae Hoon Kim, Hsi-Jung Wu, Dazhong Zhang, Yunfei Zheng, and Xiaosong Zhou.
COHERENCE GROUPS: REGION DESCRIPTORS FOR LOW BIT RATE ENCODING
Abstract
The invention is directed to an efficient way of encoding and decoding video. Embodiments include identifying different coding units that share a similar characteristic. The characteristic can be, for example: quantization values, modes, block sizes, color space, motion vectors, depth, facial and non-facial regions, and filter values. An encoder may group such units together as a coherence group and may create a table or other data structure describing the coding units. The encoder may then extract the commonly repeating characteristic or attribute from the coding units and transmit the coherence groups along with the data structure and any coding units that were not part of a coherence group. The decoder may receive the data, store the shared characteristic locally in cache for faster repeated decoding, and decode the coherence group together.
Inventors: Zhou; Xiaosong (Campbell, CA); Wu; Hsi-Jung (San Jose, CA); Chung; Chris Y. (Sunnyvale, CA); Keinath; Albert E. (Sunnyvale, CA); Conrad; David R. (Sunnyvale, CA); Zheng; Yunfei (Cupertino, CA); Zhang; Dazhong (Milpitas, CA); Kim; Jae Hoon (San Jose, CA)

Applicant: Apple Inc., Cupertino, CA, US

Family ID: 52005461
Appl. No.: 13/913,169
Filed: June 7, 2013
Current U.S. Class: 375/240.16
Current CPC Class: H04N 19/172 20141101; H04N 19/51 20141101; H04N 19/46 20141101; H04N 19/23 20141101; H04N 19/159 20141101; H04N 19/119 20141101; H04N 19/14 20141101; H04N 19/136 20141101
Class at Publication: 375/240.16
International Class: H04N 19/583 20060101 H04N019/583
Claims
1. A coding method, comprising: parsing a source frame into a
plurality of coding units, coding the coding units according to
motion compensated predictive coding, searching among the coded
coding units for common characteristics, assigning select coded
coding units to a coherence group according to identified common
characteristics, transmitting, in a channel, data representing the
coherence group that includes: a map identifying the coding units
that are assigned to the coherence group, data representing the
common characteristics of the coding units assigned to the
coherence group, and transmitting, in the channel, remaining data
of the coded coding units.
2. The method of claim 1, wherein the common characteristics are
omitted from transmitted data of the coded coding units.
3. The method of claim 1, wherein the transmitted data of the coded
coding units are coded differentially with respect to the common
characteristic data.
4. The method of claim 1, wherein the data of the coded coding
units is transmitted in a common syntactic element as the coherence
group.
5. The method of claim 1, wherein the data of the coded coding
units is transmitted in separate syntactic elements from the
coherence group.
6. The method of claim 1, wherein the coherence group includes
coding units of a common frame.
7. The method of claim 1, wherein the coherence group includes
coding units of a plurality of frames.
8. The method of claim 1, wherein the coherence group includes
coding units of a plurality of views of stereoscopic video.
9. The method of claim 1, wherein the coherence group includes
coding units of a plurality of coding layers of scalability-coded
video.
10. The method of claim 1, wherein the common characteristic is a
coding mode type.
11. The method of claim 1, wherein the common characteristic is a
motion vector.
12. The method of claim 1, wherein the common characteristic is a
quantization parameter.
13. The method of claim 1, wherein the common characteristic is a
block size.
14. The method of claim 1, wherein the common characteristic is a
filter value.
15. The method of claim 1, wherein the common characteristic is an
identification of a facial or non-facial region.
16. A computer readable storage device storing program instructions
that, when executed, cause an executing device to perform a method,
comprising: coding coding units of an input frame according to
motion compensated predictive coding, searching among the coded
coding units for common characteristics, assigning select coded
coding units to a coherence group according to identified common
characteristics, transmitting, in a channel, data representing the
coherence group that includes: a map identifying the coding units
that are assigned to the coherence group, data representing the
common characteristics of the coding units assigned to the
coherence group, and transmitting, in the channel, remaining data
of the coded coding units.
17. The storage device of claim 16, wherein the instructions
further cause the executing device to omit the common
characteristics from transmitted data of the coded coding
units.
18. The storage device of claim 16, wherein the instructions
further cause the executing device to transmit data of the coded
coding units differentially with respect to the common
characteristic data.
19. The storage device of claim 16, wherein the instructions
further cause the executing device to transmit the coded coding
units in a common syntactic element as the coherence group.
20. The storage device of claim 16, wherein the instructions
further cause the executing device to transmit the coded coding
units in separate syntactic elements from the coherence group.
21. A video coding system, comprising: a predictive video coder to
code source frames of video having been parsed into coding units, a
transmit buffer to store coded video data prior to transmission,
and a controller, to search among the coding units for common
characteristics, assign select coded coding units to a coherence
group according to identified common characteristics, transmit, in
a channel, data representing the coherence group that includes: a
map identifying the coding units that are assigned to the coherence
group, data representing the common characteristics of the coding
units assigned to the coherence group, and transmit, in the
channel, remaining data of the coded coding units.
22. The system of claim 21, wherein the common characteristics are
omitted from transmitted data of the coded coding units.
23. The system of claim 21, wherein the transmitted data of the
coded coding units are coded differentially with respect to the
common characteristic data.
24. The system of claim 21, wherein the data of the coded coding
units is transmitted in a common syntactic element as the coherence
group.
25. The system of claim 21, wherein the data of the coded coding
units is transmitted in separate syntactic elements from the
coherence group.
26. The system of claim 21, wherein the coherence group includes
coding units of a common frame.
27. The system of claim 21, wherein the coherence group includes
coding units of a plurality of frames.
28. The system of claim 21, wherein the coherence group includes
coding units of a plurality of views of stereoscopic video.
29. The system of claim 21, wherein the coherence group includes
coding units of a plurality of coding layers of scalability-coded
video.
30. A decoding method, comprising: receiving coded video data from
a channel, the coded video data including data representing a
coherence group and data representing coded coding units,
identifying, from a map contained in the coherence group data,
coding units that are assigned to the coherence group, decoding the
coding units according to motion compensated predictive coding
using coded data of the respective coding units, wherein, for the
coding units assigned to the coherence group, the decoding also is
performed using coding data contained in the coherence group; and
assembling a recovered frame from the decoded coding units.
31. The method of claim 30, wherein the coherence group contains
data representing common characteristics of the coding units that
are assigned to the coherence group, the common characteristic data
having been omitted from coded data of the coding units.
32. The method of claim 30, wherein the coherence group contains
data representing common characteristics of the coding units that
are assigned to the coherence group, and the coded data of the
coding units respectively contain data that is coded differentially
with respect to the common characteristic data.
33. The method of claim 30, wherein the data of the coded coding
units is received, from the channel, in a common syntactic element
as the coherence group.
34. The method of claim 30, wherein the data of the coded coding
units is received, from the channel, in separate syntactic elements
from the coherence group.
35. A computer readable storage device storing program instructions
that, when executed, cause an executing device to perform a method,
comprising: receiving coded video data from a channel, the coded
video data including data representing a coherence group and data
representing coded coding units, identifying, from a map contained
in the coherence group data, coding units that are assigned to the
coherence group, decoding the coding units according to motion
compensated predictive coding using coded data of the respective
coding units, wherein, for the coding units assigned to the
coherence group, the decoding also is performed using coding data
contained in the coherence group; and assembling a recovered frame
from the decoded coding units.
36. A video decoding system, comprising: a receive buffer to
receive coded video data from a channel, the coded video data
including data representing a coherence group and data representing
coded coding units, a controller, to identify, from a map contained
in the coherence group data, coding units that are assigned to the
coherence group, a predictive video decoder to decode the coding
units, using coded data of the respective coding units, wherein,
for the coding units assigned to the coherence group, the decoder
also decodes the assigned coding units using coding data contained
in the coherence group.
37. The video decoding system of claim 36, wherein the predictive
video decoder is a parallel processing system and, for the
coherence group, a single processor from the parallel processing
system decodes the coding units assigned to the coherence group.
Description
BACKGROUND
[0001] In video coding systems, a conventional encoder may code a source video sequence into a coded representation that has a smaller bit rate than the source video, thereby achieving data compression. A decoder may then invert the coding processes performed by the encoder to retrieve the source video.
[0002] Modern block-based encoders tessellate spatial regions into non-overlapping coding units that are encoded atomically, although they are coded in relation to neighboring coding units. This scheme presents several issues. First, for large coherent regions, block-based encoders incur signaling overhead per coding unit and rely on entropy coding (usually performed in some form of raster-scan ordering of the coding units) to reduce that overhead. Additionally, for bit streams that exhibit temporal correlation over a large spatial region, encoders and decoders incur computational overhead as they process one coding unit at a time. Thus, block-based encoders can lose some of the efficiencies that otherwise could be achieved when coding large temporally-correlated image information.
[0003] The inventors perceive a need in the art for a block-based
coding protocol that permits efficient coding of
temporally-correlated image information in source video.
BRIEF DESCRIPTION OF THE DRAWINGS
[0004] FIG. 1 is a simplified block diagram of an exemplary video
coder/decoder system suitable for use with the present
invention.
[0005] FIGS. 2-3 illustrate coherence groups for an exemplary frame
of video, according to an embodiment of the present invention.
[0006] FIG. 4 is a flow diagram illustrating a method according to
an embodiment of the present invention.
[0007] FIGS. 5-6 illustrate syntaxes for video transmissions
according to embodiments of the present invention.
[0008] FIG. 7 is a functional block diagram of a video decoder
according to an embodiment of the present invention.
[0009] FIG. 8 is a flow diagram illustrating another method
according to an embodiment of the present invention.
DETAILED DESCRIPTION
[0010] An enhanced video coding and decoding algorithm is described that may search a video frame, or a sequence of frames, for repeating coding units that share a coding characteristic or attribute. The coding units may be grouped together as a coherence group. A packing method and syntax are described that enable transport of this video packing format. Furthermore, decoding methods are described that capitalize on the more efficient data packing and streaming, exploiting the reduced data redundancy by reusing the shared characteristic and caching it for fast access.
[0011] FIG. 1 is a simplified block diagram of an encoder/decoder
system 100 according to an embodiment of the present invention. The
system 100 may include first and second terminals 110, 120
interconnected via a network 130. The first terminal 110 may
include an encoder that may generate video data representing
locally-captured image information and may code it for delivery
over the network 130. The network 130 may deliver the coded video
to a second terminal 120, which may include a decoder to recover the video data.
Some coding protocols involve lossy coding techniques, in which
case, the decoder 120 may generate a recovered video sequence that
represents an approximation of the source video. Other coding
protocols may be lossless, in which case, the decoder 120 may
generate a recovered video sequence that replicates the source
video. In either case, the decoder 120 may output the recovered
video sequence for local viewing.
[0012] In FIG. 1, the encoder 110 and decoder 120 may be provided
within a variety of computing platforms, including servers,
personal computers, laptop computers, tablet computers, smart
phones, media players and/or dedicated video conferencing
equipment. The network 130 represents any number of networks that
convey coded video data among the encoder 110 and decoder 120,
including, for example, wireline and/or wireless communication
networks. A communication network may exchange data in
circuit-switched and/or packet-switched channels. Representative
networks include telecommunications networks, local area networks,
wide area networks and/or the Internet. For the purposes of the present discussion, the architecture and topology of the network 130 are immaterial to the operation of the present invention unless explained hereinbelow.
[0013] The encoder 110 may include a video source 111, a video
coder 112, a transmit buffer 113 and a controller 114. The video
source 111 may generate the video sequence for coding. Typical
video sources 111 include cameras that generate video from
locally-captured image information and storage devices or screen
buffers (not shown) in which video may be stored, e.g., for media
serving applications. The video coder 112 may code frames of video
data according to different coding modes. The transmit buffer 113
may store coded video data as it is output by the video coder 112
and awaiting transmission via the network 130. The controller 114
may manage communication of video data to a decoder 120 over a
network channel.
[0014] The decoder 120 may include a rendering unit 121, a video
decoder 122, a receive buffer 123 and a controller 124. These
components may invert coding operations performed by the encoder
110. The receive buffer 123, for example, may store the received
data, may parse the data into component data streams and may
forward coded video data to the decoding engine 122. The decoding
engine 122 may invert coding processes applied by the coding engine
112 and generate decoded video therefrom. The decoding engine 122
may output the recovered video data to the rendering unit 121 for
consumption. The rendering unit 121 may be a display, a storage
device or scaler (not shown) to which recovered video data may be
output.
[0015] As shown, the video coder/decoder system 100 supports video
coding and decoding in one direction only. For bidirectional
communication, an encoder and decoder may each be implemented at
each terminal 110, 120 such that each terminal may capture video
data at a local location and code the video data for transmission
to the other terminal via the network. Each terminal may receive
the coded video data of the other terminal from the network, decode
the coded data and display video data recovered therefrom.
[0016] FIG. 1 illustrates a simplified functional block diagram of
a video encoder 112 according to an embodiment of the invention.
The video encoder 112 may include a pre-processor 115, a coding
engine 116 and a reference picture cache 117. The pre-processor 115
may perform statistical analyses of frames received from the video
source 111 for use in parsing the frames into coding units and,
perhaps, filtering the video data. The coding engine 116 may code
the video data according to a predetermined coding protocol. The
coding engine 116 may output coded data representing coded
pictures, as well as data representing coding modes and parameters
selected for coding the pictures, to a transmit buffer 113 for
output to a channel 131. The reference picture cache 117 may store
decoded data of reference pictures previously coded by the coding
engine 116; the picture data stored in the reference picture cache
117 may represent sources of prediction for later-received pictures
input to the video encoder 112.
[0017] FIG. 1 also illustrates a simplified functional block
diagram of a video decoder 122 according to an embodiment of the
invention. The video decoder 122 may include a post-processor 125,
a decoding engine 126 and a reference picture cache 127. The
decoding engine 126 may decode coded video data according to the
coding operations performed by its counterpart in the video coder
112. The decoding engine 126 may output decoded data to the
post-processor 125 and to the reference picture cache 127. The
reference picture cache 127 may store decoded data of reference
pictures output by the decoding engine 126; the picture data stored
in the reference picture cache 127 may represent sources of
prediction for later-received coded pictures input to the video
decoder 122. The post-processor 125 may perform filtering
operations on frames received from the decoding engine 126 prior to
outputting the decoded video data to a rendering unit 121.
[0018] During operation, the pre-processor 115 may parse input frames into different "coding units" for processing by the coding engine 116. Coding units respectively may represent groups of pixels of various sizes. For example, a coding unit may include a 4×4, 8×8, 16×16 or 32×32 sized array of pixel data. Further, pixel data may be parsed into color component data prior to being processed by the coding engine 116. Moreover, a frame may be parsed into coding units of different sizes prior to being processed by the coding engine 116.
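To make this parsing step concrete, the following is a minimal sketch, assuming a numpy luma plane and a simple variance split test; the 32×32-to-16×16 split rule and the threshold are illustrative choices, not the patent's partitioning algorithm.

```python
import numpy as np

def parse_into_coding_units(luma, split_threshold=500.0):
    """Return a list of (x, y, size) coding units covering the frame."""
    h, w = luma.shape
    units = []
    for y in range(0, h, 32):
        for x in range(0, w, 32):
            block = luma[y:y + 32, x:x + 32]
            if block.var() > split_threshold:
                # Complex content: split into four 16x16 coding units.
                for dy in (0, 16):
                    for dx in (0, 16):
                        units.append((x + dx, y + dy, 16))
            else:
                units.append((x, y, 32))
    return units

frame = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
print(parse_into_coding_units(frame)[:4])
```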
[0019] Embodiments of the present invention may build a coding artifact, called a "coherence group" herein, for video coding. A coherence group may include multiple coding units that share coding properties such that the coding engine 116 determines they can be grouped together into a common syntactic element to conserve bandwidth. The coding units may, for example, share common motion properties, quantization parameters, prediction references, coding objects or other properties that can be represented together. By presenting coding data representing these shared characteristics in a common syntactic element, the encoder may conserve bandwidth in the communication channel. Illustrative data structures for such a group are sketched below.
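For the sketches in this section, a coherence group and its coding units might be represented with structures like the following; the field names are assumptions made for illustration, not the patent's syntax.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class CodingUnit:
    frame_id: int              # frame (or view/layer) the unit belongs to
    x: int                     # position of the unit within its frame
    y: int
    size: int                  # edge length of the pixel array, e.g. 16
    params: Dict[str, object]  # e.g. {"mv": (3, -1), "qp": 28, "mode": "inter"}
    residual: bytes = b""      # remaining coded data of the unit

@dataclass
class CoherenceGroup:
    shared_params: Dict[str, object]  # characteristics common to all members
    # Index of member units as (frame_id, x, y, size), so a group may span
    # multiple frames, views or layers.
    index: List[Tuple[int, int, int, int]] = field(default_factory=list)
```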
[0020] As part of its processing, an encoder 110 may search input video data for common characteristics that can be coded in a coherence group. The encoder 110 may search for such characteristics at various points during its operation, for example, searching input data that has yet to be coded, searching again among the input data after prediction references have been selected, and searching again after coding has been applied to input data, (optionally) revising prior coding selections to take advantage of additional coding efficiencies that might be achieved by adding other coding units to a coherence group. For example, a coherence group can be a group of coding units with shared characteristics, such as:

[0021] Motion vectors;

[0022] Coding mode assignments (e.g., intra-coding, inter-coding, skip coding and merge mode coding) applied to the coding units;

[0023] Sizes of the coding units;

[0024] Gradients among video data, which may identify regions of smooth image data, video data that contacts edges or regions of video data that contact textures;

[0025] Changes among gradients in video data, which also may identify regions of smooth image data, video data that contacts edges or regions of video data that contact textures;

[0026] Coding unit complexity in a spatial domain;

[0027] Coding unit complexity in a temporal domain;

[0028] Correlation among coding units and, further, a direction of correlation among the coding units;

[0029] A distinction between natural video content and computer-generated video content;

[0030] A distinction between video content that is classified as a facial region and content that is a non-facial region; and

[0031] Depth of video content, for example bit-depth and/or 3D depth.

The coherence group may be built from coding units of a single frame or, alternatively, from coding units of multiple frames. Still further, the coherence group can be formed of a group of coding units that incur similar processing complexities on decode. A simple grouping pass along these lines is sketched below.
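A minimal sketch of such a grouping pass, continuing the illustrative structures above: units are bucketed by one characteristic (here the motion vector) and sufficiently large buckets are promoted to coherence groups. The minimum group size is an assumed tuning parameter.

```python
from collections import defaultdict

def build_coherence_groups(units, key="mv", min_size=4):
    """Group coded units that share one characteristic (default: motion vector)."""
    buckets = defaultdict(list)
    for unit in units:
        buckets[unit.params.get(key)].append(unit)
    groups = []
    for value, members in buckets.items():
        if value is None or len(members) < min_size:
            continue  # these units stay outside any coherence group
        group = CoherenceGroup(shared_params={key: value})
        for u in members:
            group.index.append((u.frame_id, u.x, u.y, u.size))
            del u.params[key]  # the shared value is now carried by the group
        groups.append(group)
    return groups
```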
[0032] A coherence group may include coding units of a single
frame, coding units of multiple frames, coding units of multiple
views in the case of stereoscopic video or coding units of multiple
layers in the case of scalability-coded video. In embodiments where a coherence group includes coding units from multiple frames, views
or layers, the coherence group's index may include identifiers
indicating the frame(s), view(s) or layer(s) to which each coding
unit belongs, in addition to data identifying the coding units'
locations.
[0033] In another embodiment, a coding syntax may be defined to include multiple levels of coherence groups. For example, the coding syntax may include provision for frame-level coherence groups that group video content of multiple frames together, slice-level coherence groups that group video content of multiple slices together, and coding unit-level coherence groups.
[0034] Below, exemplary use cases for coherence group coding are
presented.
[0035] FIG. 2 illustrates application of coherence groups to an
exemplary frame 210 according to an embodiment of the present
invention. The frame 210 is illustrated as having been parsed into
a plurality of coding units according to image content within the
frame. Each coding unit includes an array of pixels from the frame
210. Four sizes of coding units 212, 214, 216 and 218 are
illustrated in the example of FIG. 2. Although the different coding
units 212-218 are illustrated as squares, the principles of the
present invention apply to coding units of different shapes (for
example, rectangles).
[0036] Two coherence groups are illustrated in FIG. 2(a), groups
220 and 230. Coherence group 220 may be identified, for example,
based on common motion characteristics that are observable in a
video sequence to which the frame 210 belongs. In the illustrated
example, the coherence group 220 corresponds to a background region of the image whose coding units may exhibit common motion properties to each other
(which may be no motion, as the case may be). Alternatively, these
"background" coding units may be identified if they exhibit common
spatial complexity to each other. In either case, a video encoder
may identify these common properties and assign coding units that
have such properties to a common coherence group 220 for
coding.
[0037] The second coherence group 230 may be identified from object
detection applied by a preprocessor. For example, a preprocessor
may apply facial recognition processes to input video. When a
facial region is identified, a controller may cause video data of
the facial region to be coded at a higher coding quality than other
regions, for example, the background. As part of this process, the
controller may cause the video coder to parse the image data
corresponding to the facial region into small-sized coding units
and also may assign relatively lower quantization parameters to the
coding units to preserve image fidelity. As part of this process,
the controller may cause those coding units to be assigned to a
coherence group 230 for coding.
[0038] FIG. 2(b) illustrates another coherence group 240 according
to an embodiment of the present invention. In the illustrated
example, background regions of the image may be allocated to a
coherence group on the basis of image content within the regions'
coding units, for example, spatial complexity, motion or the like.
In the example of FIG. 2(b), the coherence group 240 is formed of
coding units that are not contiguous within the spatial area of the
frame 210. Thus, the principles of the present invention permit
construction of a coherence group from non-contiguous regions of
coding units.
[0039] FIG. 3 illustrates application of coherence groups to
another exemplary frame 310 according to an embodiment of the
present invention. Again, the frame 310 may be parsed into a
plurality of coding units 312, 314, 316, 318 of various sizes
according to image content within the frame 310. FIG. 3 again
illustrates a pair of coherence groups 320, 330, which may be
identified based on common characteristics among coding units. For
example, coherence group 320 may be identified based on motion characteristics of content in the frame 310, while coherence group 330 may be identified based on spatial complexity of other image content of the frame 310 (in addition to or without regard to motion characteristics).
[0040] FIG. 4 is a flow diagram of a method 400 for creating a
coherence group according to an embodiment of the present
invention. The method 400 may be performed during coding of a new
frame of video data. The method 400 may identify coding units from
the frame which have similar characteristics to each other (block
410). When coding units are identified that have common
characteristics, the coding units may be assigned to the coherence
group (block 420) and location information identifying the assigned
coding units may be added to an index that identifies the coding
units that belong to the coherence group (block 430). The method
400 also may code the assigned coding units with reference to
coding parameters that will be transmitted in the coherence group
(block 440).
[0041] The method 400 may build a transmission sequence for the
coherence group (block 450) that includes a header indicating the
onset of the coherence group, an index map and parameter data that
is to be applied during decoding of the coherence group. The method
also may transmit the coding units that belong to the coherence
group after transmission of the coherence group itself (block 460).
In one embodiment, the assigned coding units may be transmitted
immediately following transmission of the coherence group. In other
embodiments, however, the assigned coding units may be transmitted
in a transmission order determined by other factors, for example,
in a raster scan order or in an order determined by a coding tree
to which the coding units belong. In either case, coded data of the
coding units may be coded differentially with respect to parameter
data of the coherence group or, alternatively, simply may omit
fields that correspond to parameter data presented in the coherence
group.
[0042] Once an encoder signals coherence groups, there is no need,
at each coding unit within the group, to signal redundant
information. For example, once a motion vector group is signaled,
motion vectors at each coding unit need not be signaled.
Additionally, coding units within a coherence group can signal
differences from the shared information. For example, a coding unit
in a motion vector coherence group can signal a small motion vector
difference from the shared motion vector. This still helps the
processing, as relevant pixels for the small difference are likely
to have been fetched along with the pixels of the rest of the
coherence group.
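A sketch of this differential signaling, again using the illustrative structures above:

```python
def encode_mv_delta(unit_mv, group):
    """Signal only the unit's difference from the group's shared vector."""
    base = group.shared_params["mv"]
    return (unit_mv[0] - base[0], unit_mv[1] - base[1])

def decode_mv(delta, group):
    base = group.shared_params["mv"]
    return (base[0] + delta[0], base[1] + delta[1])

group = CoherenceGroup(shared_params={"mv": (12, -4)})
delta = encode_mv_delta((13, -4), group)  # signals (1, 0) instead of (13, -4)
assert decode_mv(delta, group) == (13, -4)
```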
[0043] FIG. 5 illustrates a transmission sequence for a coherence
group 500 according to an embodiment of the present invention. As
indicated, the coherence group 500 may include a header field 510,
an index table 520, a parameters field 530 and the coded coding
units 540.1-540.N that belong to the coherence group. The header
field 510 may include content that identifies the presence of the
coherence group 500 within a larger coded video sequence. The index
table 520 may include data that identifies which coding units in
the frame are members of the coherence group 500. The parameters
field 530 may include coding data that is common to the coded
coding units 540.1-540.N of the coherence group 500. For example,
the parameters field 530 may include data representing the coding
units' motion vector, quantization parameter, coding mode,
frequency distribution and/or either in-loop or out-of-loop
filtering parameters, such as deblocking filter parameters and/or
sample adaptive offset parameters. Data of the coded coding units 540.1-540.N may include other coded data of the coding units.
In this regard, the coding unit data 540.1-540.N may include
headers H to indicate the onset of the coding unit and other data
from which a decoder may generate recovered video data of the
coding unit.
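The ordering of FIG. 5 might be serialized as follows; the byte markers, field widths and JSON parameter coding are assumptions made for readability, not the patent's bitstream format.

```python
import json
import struct

GROUP_MARKER = b"CG"  # stands in for header field 510
UNIT_MARKER = b"H"    # stands in for each coding unit header H

def serialize_coherence_group(group, units):
    out = bytearray(GROUP_MARKER)
    # Index table 520: a member count suffices here because the member
    # units follow immediately (see paragraph [0044]).
    out += struct.pack(">H", len(units))
    # Parameters field 530: shared characteristics, JSON-coded for clarity.
    params = json.dumps(group.shared_params).encode()
    out += struct.pack(">H", len(params)) + params
    # Coded coding units 540.1-540.N, each with its own header H.
    for u in units:
        out += UNIT_MARKER
        out += struct.pack(">IHHB", u.frame_id, u.x, u.y, u.size)
        out += struct.pack(">H", len(u.residual)) + u.residual
    return bytes(out)
```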
[0044] In this embodiment, because the coded coding units
540.1-540.N of the coherence group appear in transmission order
immediately following the coherence group header 510, index table
520 and coding parameters 530, it may be sufficient for the index
table to identify a number of coding units that belong to the
coherence group 500. A decoder may count the number of coding unit
headers H that follow the coherence group's header 510 to identify
the coded coding units 540.1-540.N that appear within the coherence group 500.
[0045] FIG. 6 illustrates a transmission sequence for a coherence
group 600 according to another embodiment of the present invention.
In this embodiment, the coherence group 600 may include a header
field 602, an index table 604 and a parameters field 606. The
header field 602 may include content that identifies the presence
of the coherence group 600 within a larger coded video sequence.
The index table 604 may include data that identifies which coding
units in the frame are members of the coherence group 600. The
parameters field 606 may include coding data that is common to
the coding units of the coherence group 600. Again, the parameters
field 606 may include data representing the coding units' motion
vector, quantization parameter, coding mode, frequency distribution
and/or deblocking filter parameters.
[0046] The transmission sequence may include coded data
representing coding units 610.1-610.N of the frame, some of which
may be members of the coherence group and others of which may not
be members of the coherence group. In the example illustrated in
FIG. 6, the coding units 610.2 and 610.N are illustrated as members
of the coherence group (marked "[GC]" in the figure), whereas
coding unit 610.1 is not. In this embodiment, the coded coding
units may be provided in the transmission sequence according to a
predetermined coding order, which may be determined by a coding
tree, a raster scan or other protocol by which the encoder and
decoder operate. As a decoder decodes a given coding unit, the
decoder may determine with reference to the index table 604 whether
the coding unit is a member of the coherence group 600. In this embodiment, the index table may include
identifier(s) of the coding units 610.2, 610.N that belong to the
coherence group that can be compared to the coding units
610.1-610.N as they are received and decoded.
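A decode-side sketch of this arrangement, using the earlier illustrative structures:

```python
def decode_units_in_order(units, group):
    """Decode units in coding order, merging shared parameters for members."""
    members = set(group.index)  # index table 604: identifiers of member units
    recovered = []
    for u in units:
        params = dict(u.params)
        if (u.frame_id, u.x, u.y, u.size) in members:
            params.update(group.shared_params)  # reuse parameters field 606
        recovered.append((u, params))
    return recovered
```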
[0047] In implementation, it is expected that encoders and decoders
will operate according to a predefined transmission protocol that
codifies coherence groups in its syntax. For example, it may find
application in the HEVC coding standard that, at the time of this
writing, is under development as ISO/IEC 23008-2 MPEG-H Part 2 and
ITU-T H.265. Coherence groups need not be codified in every case,
however. In other implementations, for example, coherence group
information may be embedded into supplemental enhancement
information (SEI) messages, which allow definition of signaling
that is out of scope for a governing protocol. Thus, in the
embodiment illustrated in FIG. 6, the coherence group 600 is
illustrated as embedded in an SEI message 608.
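As a hypothetical packaging, serialized coherence group data could ride in a "user data unregistered" SEI payload (payloadType 5 in H.264/HEVC). The sketch below omits NAL wrapping and emulation prevention, and the UUID namespace is a placeholder:

```python
import uuid

# Placeholder namespace; a real deployment would register its own UUID.
COHERENCE_GROUP_UUID = uuid.UUID(int=0).bytes

def wrap_in_sei(coherence_group_bytes: bytes) -> bytes:
    payload = COHERENCE_GROUP_UUID + coherence_group_bytes
    out = bytearray([5])          # payloadType 5 = user_data_unregistered
    size = len(payload)
    while size >= 255:            # payload_size uses 0xFF continuation bytes
        out.append(255)
        size -= 255
    out.append(size)
    out += payload
    return bytes(out)
```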
[0048] Use of coherence groups can lead to resource conservation at
decoders in various use cases. FIG. 7 illustrates one such use case
using a multi-threaded decoder. FIG. 7 is a functional block
diagram of a video decoder 700 according to an embodiment of the
present invention in which the video decoder 700 includes a coded
video buffer 710, a plurality of processors 720.1-720.3, a decoded
video buffer 730 and a scheduler 740. The coded video buffer 710
may store coded video data from a channel. The scheduler 740 may
review the coded video data and assign portions of the coded video
data to be decoded by the processors 720.1-720.3. The processors
720.1-720.3 may decode the coded video data and store decoded video
in the decoded video buffer 730. In so doing, the processors
720.1-720.3 each may perform operations described above in
connection with the video decoder 122 (FIG. 1). The decoded video
may be retrieved from the decoded video buffer 730 and output to a
display 750 or other rendering device.
[0049] In implementation, the coded video buffer 710 and the
decoded video buffer 730 may be provided as respective portions of
a memory system in the decoder. The processors 720.1-720.3 may be
provided as separate processing systems (e.g., separate processor
integrated circuits, separate processing cores or separate
processing threads in a common integrated circuit).
[0050] During operation, the scheduler 740 may assign data of a
common coherence group to a single processor (say, processor 720.1)
for decoding. The coherence group data may be stored in memory in
common areas of the memory system to which the coded video buffer
710 is assigned. Thus, decoding of the coherence group may be
performed efficiently because the processor 720.1 may use and reuse
a common set of data in the memory space as it performs its
decoding operations. When, for example, the coherence group uses
common motion vectors, the processor 720.1 may refer to common
reference picture data in the memory space to process the coding
units in the coherence group. Thus, it is expected that processing
of the coherence group will lead to conservation of resources at
decoders, particularly for those decoders that include multiple
processors.
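This scheduling policy can be sketched with a thread pool that assigns one task per coherence group; `decode_unit` is a stand-in for the full decoding path and is an assumption of the sketch.

```python
from concurrent.futures import ThreadPoolExecutor

def decode_group(group, units, decode_unit):
    shared = group.shared_params  # fetched once, reused for every member
    return [decode_unit(u, shared) for u in units]

def schedule(groups_with_units, decode_unit, workers=3):
    # One task per coherence group, mirroring processors 720.1-720.3.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(decode_group, g, us, decode_unit)
                   for g, us in groups_with_units]
        return [f.result() for f in futures]
```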
[0051] FIG. 8 is a simplified flow diagram illustrating an
exemplary method 800 for decoding a coherence group according to an
embodiment of the present invention. In exemplary method 800, a
decoder may receive a packet or video stream (block 810). The
decoder may then recognize a coherence group with a shared
characteristic (block 820).
[0052] In one embodiment, a decoder may relay either all video streams or only coherence group streams to a control module (block 830) for handling specific to coherence-group-coded data. For example, a decoder may detect that a group is a coherence group from a standard format, an SEI message, or other side information and, if detected, forward it to the control module.
The control module may then extract the common characteristic for
fast memory swapping (block 840). The control module may, for
example, put the common characteristic in cache (block 850) or RAM.
Additionally, the decoder may permit some of the video data to be
reused. In one embodiment, the decoder, using cache or RAM, may
keep the common data and decode each coding unit in one coherence
group consecutively. The decoder may then read data from the data structure identifying the sequence of coding units and their respective locations, traversing the data structure to determine where each coding unit is placed in time as well as its location within a frame. The decoder may next decode the coding units, repeatedly using the common characteristic (block 860).
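Putting blocks 840-860 together, a minimal sketch of the cache-and-reuse loop (the dict cache stands in for fast local memory, and the data structures follow the earlier illustrations):

```python
def decode_coherence_group(group, units_by_location, decode_unit):
    cache = group.shared_params  # blocks 840-850: keep shared data hot
    recovered = {}
    for key in group.index:      # traverse the index of member units
        unit = units_by_location[key]          # (frame_id, x, y, size)
        recovered[key] = decode_unit(unit, cache)
    return recovered             # keyed by frame and location for placement
```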
[0053] Although primarily described with reference to a video encoding system, the above-described methods may be applicable to the capture of video and still images that may be stored directly in a memory system without being coded for compression. Some embodiments
may be implemented, for example, using a non-transitory
computer-readable storage medium or article which may store an
instruction or a set of instructions that, if executed by a
processor, may cause the processor to perform a method in
accordance with the disclosed embodiments. The exemplary methods
and computer program instructions may be embodied on a
non-transitory machine readable storage medium. In addition, a
server or database server may include machine readable media
configured to store machine executable program instructions. The
features of the embodiments of the present invention may be
implemented in hardware, software, firmware, or a combination
thereof and utilized in systems, subsystems, components or
subcomponents thereof. The machine readable storage media may
include any medium that can store information. Examples of a
machine readable storage medium include electronic circuits,
semiconductor memory devices, ROM, flash memory, erasable ROM
(EROM), floppy diskette, CD-ROM, optical disk, hard disk, fiber
optic medium, or any electromagnetic or optical storage device.
[0054] Several embodiments of the invention are specifically
illustrated and/or described herein. However, it will be
appreciated that modifications and variations of the invention are
covered by the above teachings and within the purview of the
appended claims without departing from the spirit and intended
scope of the invention.
* * * * *