U.S. patent application number 11/941048 was filed with the patent office on 2007-11-15 and published on 2008-10-30 for a method and an apparatus for decoding an audio signal.
This patent application is currently assigned to LG ELECTRONICS INC. Invention is credited to Yang-Won Jung, Hyen-O Oh.
Application Number | 20080269929 11/941048 |
Family ID | 39401874 |
Filed Date | 2007-11-15 |
United States Patent Application | 20080269929 |
Kind Code | A1 |
Oh; Hyen-O ; et al. | October 30, 2008 |
Method and an Apparatus for Decoding an Audio Signal
Abstract
A method of decoding an audio signal comprises the steps of
receiving a downmix of an audio signal, object information, and
mix information, the object information including object level
information, object correlation information, and object gain
information; generating downmix processing information using the
object information and the mix information; and processing the
downmix of the audio signal using the downmix processing
information. Various embodiments of the present invention provide a
method and an apparatus for decoding multi-object audio signals
quickly and efficiently by reducing processing time and computing
resources, thereby relieving resource requirements such as wide
bandwidth. The object parameters according to the embodiments of the
present invention can provide backward compatibility with the
channel-oriented decoding process.
Inventors: | Oh; Hyen-O; (Goyang-si, KR) ; Jung; Yang-Won; (Seoul, KR) |
Correspondence Address: | FISH & RICHARDSON P.C., PO BOX 1022, MINNEAPOLIS, MN 55440-1022, US |
Assignee: | LG ELECTRONICS INC., Seoul, KR |
Family ID: | 39401874 |
Appl. No.: | 11/941048 |
Filed: | November 15, 2007 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
60865908 | Nov 15, 2006 | |
60869077 | Dec 7, 2006 | |
60869080 | Dec 7, 2006 | |
60889715 | Feb 13, 2007 | |
60955395 | Aug 13, 2007 | |
60883567 | Jan 5, 2007 | |
Current U.S. Class: | 700/94 ; 704/E19.005 |
Current CPC Class: | G10L 19/008 20130101 |
Class at Publication: | 700/94 |
International Class: | G06F 17/00 20060101 G06F017/00 |
Claims
1. A method of decoding an audio signal, comprising: receiving a
downmix of an audio signal, object information and mix information,
the object information including object level information, object
correlation information, and object gain information, the object
level information being generated by normalizing object levels
corresponding to object signals using one of the object levels as
reference information, the object correlation information provided
by combination of at least two selected object signals, the object
gain information comprising at least one of object gain value
information and object gain ratio information; generating downmix
processing information using the object information and the mix
information; and processing a downmix of the audio signal using the
downmix processing information.
2. The method of claim 1, wherein the reference information is the
largest object level of the object levels.
3. The method of claim 1, wherein the number of object levels is
the same as the number of object signals in the downmix of the
audio signal.
4. The method of claim 1, wherein the object correlation
information comprises a relation information representing different
object signals of a same origin.
5. The method of claim 1, wherein object correlation information is
indicated in a bit stream received by a decoder.
6. The method of claim 1, wherein the object correlation
information comprises a default value.
7. The method of claim 1, wherein the object gain value information
comprises gain value to be applied to object for generation of the
downmix of the audio signal.
8. The method of claim 1, wherein the object gain ratio information
comprises a gain ratio for relatively contributing to at least two
channels of the downmix of the audio signal.
9. The method of claim 1, wherein the object information further
comprises reference information.
10. The method of claim 1, wherein the object information further
comprises a correlation flag.
11. The method of claim 1, further comprising: obtaining the
processed downmix of the audio signal as an output signal.
12. The method of claim 1, further comprising: upmixing the
processed downmix using a multi-channel parameter.
13. The method of claim 1, wherein the downmix of the audio signal
is received as a broadcast signal.
14. The method of claim 1, wherein the downmix of the audio signal
is received on a digital medium.
15. A computer-readable medium having instructions stored thereon,
which, when executed by a decoder, causes the processor to perform
operations, comprising: receiving a downmix of an audio signal,
object information, and mix information, the object information
including object level information, object correlation information,
and object gain information, the object level information being
generated by normalizing object levels corresponding to object
signals using one of the object levels as reference information,
the object correlation information provided by combination of at
least two selected objects, the object gain information including
at least one of object gain ratio information and object gain value
information; generating downmix processing information using the
object information and the mix information; and processing a
downmix of the audio signal using the downmix processing
information.
16. An apparatus for decoding an audio signal, comprising: an
information generating unit receiving an object information and a
mix information, the object information including an object level
information, an object correlation information, and an object gain
information, the object level information being generated by
normalizing object level corresponding to object using one of the
object level as a reference information, the object correlation
information provided from combination of two selected objects, the
object gain information comprising at least one of an object gain
value information and an object gain ratio information, and
generating a downmix processing information using the object
information and the mix information; and a downmix processing unit
receiving the downmix of the audio signal and the downmix
processing information, and processing the downmix of the audio
signal using the downmix processing information.
17. A method of encoding for an audio signal, comprising: receiving
a multi-object audio signal; and generating a downmix of an audio
signal and an object information including an object level
information, an object gain information, and an object correlation,
the object level information and the object correlation information
from the multi-object audio signal, the object level information
being generated by normalizing object level corresponding to object
using one of the object level as a reference information, the
object correlation information provided from combination of two
selected objects, the object gain information comprising at least
one of an object gain value information and an object gain ratio
information.
18. The method of claim 17, wherein the reference information
comprises the largest object level among all of the object levels.
19. The method of claim 17, wherein the number of the object level
information is the same as the number of the objects in the downmix of
the audio signal.
20. The method of claim 17, wherein the object correlation
information comprises a relation information representing a
different object of same origin.
21. An apparatus for encoding an audio signal, comprising: a
downmixing unit generating a downmix of an audio signal from a
multi-object audio signal; and an object information unit
extracting an object information including an object level
information, an object gain information, and an object correlation
information from the multi-object audio signal, the object level
information and the object correlation information from the
multi-object audio signal, the object level information being
generated by normalizing object level corresponding to object using
one of the object level as a reference information, the object
correlation information provided from combination of two selected
objects, the object gain information comprising at least one of an
object gain value information and an object gain ratio information.
Description
RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional
Patent Application Nos. 60/865,908, 60/869,077, 60/869,080,
60/889,715, 60/955,395, and 60/883,567, filed on Nov. 15, 2006,
Dec. 7, 2006, Dec. 7, 2006, Feb. 13, 2007, Aug. 13, 2007, and Jan.
5, 2007, respectively, each of which is hereby incorporated by
reference.
BACKGROUND
[0002] 1. Field of the Invention
[0003] The present invention relates to a method and an apparatus
for decoding an audio signal, and more particularly, to a method
and an apparatus for decoding an audio signal received via various
digital media.
[0004] 2. Discussion of the Related Art
[0005] While downmixing several audio objects to a mono or a stereo
audio signal, information (e.g., parameters) from individual object
signals of the audio signal can be extracted. This information can
be used in a decoder for decoding of the audio signal.
[0006] An MCU (Multipoint Control Unit) is a device that can be used
in a teleconference to relay signals provided from remote
locations through the conference call.
[0007] A conventional MCU combiner generally combines signals
into multi-channel audio signals. However, when multi-channel
audio signals having only multi-channel parameters are used in the
MCU, the MCU can control only the gain and panning of one of the
channels and cannot control the gain and panning of individual
object signals.
[0008] A decoder receives a downmix signal and side information,
and can generate an output signal using the side information. The
output signal may be rendered based on other input information such
as a user control or a playback configuration. In order to control
the individual object signals, the decoder may receive multi-object
signals and decode them.
[0009] However, an apparatus and method for decoding multi-object
signals needs a wide bandwidth. Accordingly, a new apparatus and
method for decoding multi-object signals is needed to relieve the
resource requirement of a wide bandwidth. Moreover, for backward
compatibility with channel-oriented decoding, a new apparatus and
method is needed for providing side information corresponding to
audio objects which can be converted to multi-channel
parameters.
SUMMARY
[0010] Various embodiments of the present invention are directed to
a method and an apparatus for decoding an audio signal that
substantially mitigate the disadvantages of the related art and
obviate one or more problems of the related art.
[0011] An object of the present invention is to provide a method
for decoding an audio signal by using object information, including
an object level information and an object gain information, to
modify the downmix of an audio signal by changing the contribution
of each object signal to each downmix channel.
[0012] Another object of the present invention is to provide an
apparatus for decoding an audio signal by using object information,
including an object level information and an object gain
information to modify the downmix of an audio signal by changing
the contribution of each object signal to each downmix channel.
[0013] Another object of the present invention is to provide a
method and an apparatus for decoding an audio signal, comprising a
downmix signal and a combined object parameter to be made in a MCU
combiner, to control object gain and output in a teleconference or
other application.
[0014] Additional advantages, objects, and features of the
invention will be set forth in part in the description which
follows and in part will become apparent to those having ordinary
skill in the art upon examination of the following or may be
learned from practice of the invention. The objectives and other
advantages of the invention may be realized and attained by the
structure particularly pointed out in the written description and
claims hereof as well as the appended drawings.
DESCRIPTION OF DRAWINGS
[0015] The accompanying drawings, which are included to provide a
further understanding of the invention, illustrate the preferred
embodiments of the invention, and together with the description,
serve to explain the principles of the present invention. In the
drawings:
[0016] FIG. 1 is an exemplary block diagram of an apparatus for
decoding an audio signal according to one embodiment of the present
invention.
[0017] FIG. 2 is a flow chart illustrating an audio signal decoding
method in accordance with an embodiment of the present
invention.
[0018] FIG. 3 is an exemplary block diagram of an apparatus for
decoding an audio signal according to another embodiment of the
present invention.
[0019] FIG. 4 is an exemplary block diagram of a parameter
generating unit according to one embodiment of the present
invention.
[0020] FIG. 5 is an exemplary block diagram of an object gain
information generating unit according to one embodiment of the
present invention.
[0021] FIG. 6 is an exemplary block diagram of a parameter
generating unit according to another embodiment of the present
invention.
[0022] FIG. 7 is an exemplary block diagram of an apparatus for
processing an audio signal according to another embodiment of the
present invention.
[0023] FIG. 8 is an exemplary block diagram of an MCU combining unit
according to one embodiment of the present invention.
[0024] FIG. 9 is an exemplary block diagram of a combined object
parameter encoding unit according to one embodiment of the present
invention.
DETAILED DESCRIPTION
[0025] Reference will now be made in detail to the preferred
embodiment of the present invention, examples of which are
illustrated in the accompanying drawings. Wherever possible, the
same reference numbers will be used throughout the drawings to
refer to the same or like parts.
[0026] Prior to describing the present invention, it should be
noted that most terms disclosed in the present invention correspond
to general terms well known in the art, but some terms have been
selected by the applicant as necessary and will be
disclosed in the following description of the present invention.
Therefore, it is preferable that the terms defined by the applicant
be understood on the basis of their meanings in the present
invention.
[0027] FIG. 1 is an exemplary block diagram of an apparatus 1000
for decoding an audio signal according to one embodiment of the
present invention. FIG. 3 is an exemplary block diagram of an
apparatus 2000 for decoding an audio signal according to another
embodiment of the present invention.
[0028] The two embodiments of the apparatus 1000 and 2000 differ in
that the apparatus 1000 has a multi-channel decoder 1300 while the
apparatus 2000 does not. Other elements, such as the parameter
generating units 1100 and 2100 and the downmix processing units
1200 and 2200, are the same, as shown in
FIGS. 1 and 3.
[0029] Referring to FIG. 1, an apparatus 1000 for decoding an audio
signal (hereinafter also referred to as `a decoder 1000`) includes
a parameter generating unit 1100, a downmix processing unit 1200,
and a multi-channel decoder 1300. The parameter generating unit
1100 is configured to receive object information and mix
information from a user control or a bitstream, and to generate
downmix processing information.
[0030] The object information can include object level information,
object correlation information, and object gain information. The
object level information can be generated by normalizing an object
level corresponding to each object using one of the object levels
as reference information. The object correlation information can be
provided from a combination of two selected objects. The object
gain information can include object gain value information or
object gain ratio information. The downmix processing information
can include a parameter for controlling object gain and object
panning, which is input to the downmix processing unit 1200.
[0031] The downmix processing unit 1200 can be configured to
receive a downmix of an audio signal with the downmix processing
information from the parameter generating unit 1100. The downmix
processing unit 1200 can process the downmix using the downmix
processing information, thereby generating the processed downmix
signal. For example, the downmix processing unit 1200 can apply the
downmix processing information to the downmix of the audio signal
in order to change one or more of object gain and object position
of the downmix of the audio signal to generate the processed
downmix.
[0032] The processed downmix may be input to the multi-channel
decoder 1300 to be upmixed and output by an output device such as a
speaker. A multi-channel parameter output from the parameter
generating unit may also be input to the multi-channel decoder 1300.
In some embodiments of the present invention, the multi-channel
decoder 1300 can be the same as a decoder of an MPEG Surround
system.
[0033] Alternatively, the processed downmix signal may be directly
transmitted to and output by the output device, as in the device 2000
shown in FIG. 3. In order to directly output the processed signal
via speakers, the downmix processing unit 2200 may include a
synthesis filter bank and output PCM data. Based on user selection,
the unit 2200 may also choose whether to output the signal directly
as a PCM signal or to input it to the multi-channel decoder.
[0034] FIG. 2 is a flow diagram of an example decoding method for an
audio signal in accordance with the present invention. Reference
will also be made to FIG. 1. In step S110, a downmix of an audio
signal, object information, and mix information are received. Step
S120 generates downmix processing information using the object
information and the mix information. In steps S130 and S140, a
processed downmix is generated by processing the downmix of the
audio signal using the downmix processing information.
[0035] The configuration of the parameter generating unit 1100
shall be explained in detail with reference to FIG. 4 to FIG. 6.
1. Object Information
1.1 Reference Information and Object Level Information
[0036] FIG. 4 is a block diagram of an exemplary apparatus for
processing an audio signal according to one embodiment of the present
invention, in particular, a block diagram of a parameter generating
unit 1100. The parameter generating unit 1100 can be configured to
receive object information and to generate downmix processing
information using the object parameter.
[0037] The parameter generating unit 1100 can include an object level
information decoding unit 1110a, an object gain information generating
unit 1120a, and an object correlation information generating unit
1130a.
[0038] The downmix of an audio signal includes a number of object
signals, and the object signals each have an associated object
level.
[0039] The object level information can be generated by normalizing
the object level using reference information, which may include a
reference object level. In some embodiments, the reference object
level can be the largest object level among a number of object
levels.
[0040] For example, a downmix of an audio signal can include
objects_i, where the object level of each of the objects_i is given
by Ps_i, and where i is a positive integer index that runs from 1 to
the total number of object signals in the audio signal.
[0041] If object level energies are transmitted as is to encode an
object parameter, the object parameter can include object
information as follows:
[0042] Ps_i can be obtained in various ways. For example, Ps_i
may be s_i(n)^2 or E[s_i(n)^2]. Ps_i may be transmitted as
information corresponding to each object level information. In this
example, s_i(n) refers to an ith object signal, and s_i(n) can be
either a time domain signal or a subband signal within a given
band.
[0043] However, if the object level information corresponding to
each object signal is transmitted as the value itself, the object
level of an object signal may be difficult to quantize due to an
excessive increase in the variation of its dynamic range.
[0044] Thus, the object level information may be normalized using
reference information, such as the largest object level energy of
all object energies. The object level information may be
transmitted as in Formula 1 below:
$$\frac{E[s_i(n)^2]}{E[r_1(n)^2]}$$ [Formula 1]
where r_1(n) denotes the reference information.
[0045] In some embodiments, the object level information includes a
range of values that are less than or equal to 1.
[0046] Therefore, dynamic range can be compressed enough to encode
an audio signal.
[0047] Additionally, the object level information may include
reference information, default information, and original object level
energy for use in other signal processes. The object level
information corresponds to each object signal, and object level
information can include an object level for each object signal in
the downmix signal.
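The normalization of Formula 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the function name `normalize_object_levels` is hypothetical, Ps_i is estimated as the mean of the squared samples (one way of realizing E[s_i(n)^2]), the largest object level is assumed to serve as the reference, and every signal is assumed to be a non-empty list of samples.

```python
def normalize_object_levels(signals):
    """Return (normalized_levels, reference_index) for a list of object signals.

    Hypothetical helper: each signal is a non-empty list of samples.
    """
    # Estimate Ps_i = E[s_i(n)^2] as the mean of the squared samples.
    energies = [sum(x * x for x in s) / len(s) for s in signals]
    # Assumption: the largest object level is used as the reference.
    ref_index = max(range(len(energies)), key=lambda i: energies[i])
    ref = energies[ref_index]
    if ref == 0.0:
        # All objects are silent; avoid dividing by zero.
        return [0.0] * len(energies), ref_index
    # Formula 1: divide every level by the reference, so each result is <= 1.
    return [e / ref for e in energies], ref_index


levels, ref = normalize_object_levels([[1.0, -1.0], [0.5, 0.5], [2.0, 2.0]])
print(levels, ref)
```

Because the reference is the maximum, every normalized level falls in the range from 0 to 1, which is the range compression described in paragraphs [0045] and [0046].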
1.2 Object Gain Information
[0048] The object parameter comprises object gain information
including at least one of object gain value information and
object gain ratio information. FIG. 5 is a block diagram of an
exemplary apparatus for processing an audio signal according to one
embodiment of the present invention, in particular, a block diagram of
an object gain information generating unit 1120a of the parameter
generating unit 1100.
[0049] The object gain information generating unit 1120a can
include an object gain value information generating unit 1121 and
an object gain ratio information generating unit 1122. The object
gain information relates to a downmix method where one object
signal is used to generate a downmix signal having more than one
channel.
1.2.1 Object Gain Value Information
[0050] The object gain value information can include a gain value
of an object. In some embodiments of the present invention, the
object gain is applied to each object signal before generating the
processed downmix.
[0051] For example, when the downmix of an audio signal includes a
plurality of objects, the object gain value information
corresponding to each object is multiplied by the object level of
that object to generate a gained object, and all of the gained
objects are summed to generate the processed downmix, as described
by Formula 2.
$$X = \sum_i a_i s_i$$ [Formula 2]
where X is a processed downmix signal to be transmitted to a mono
channel, s_i is an object level, and a_i is object gain value
information of an object contributing to each channel.
1.2.2 Object Gain Ratio Information
[0052] The object gain information can include object gain ratio
information as well as object gain value information. The object
gain ratio information can include a ratio value between the gains
of each object signal contributing to each channel of the processed
downmix signal.
[0053] The object gain ratio information can be used to process the
downmix signal by the Downmix Processing Unit 1200, thereby
obtaining the processed downmix signal to be transmitted through
two (e.g., stereo) or more channels. In the case of a stereo
channel, a processed downmix to be transmitted through each of the
stereo channels is shown by Formula 3. The object gain ratio
information can be obtained from Formula 4.
$$x_1 = \sum_i a_i s_i, \qquad x_2 = \sum_i b_i s_i$$ [Formula 3]
where x_1 and x_2 are the processed downmixes to be
transmitted through each channel, respectively, s_i is an object
level, and a_i and b_i are the object gain value information of an
object contributing to each channel of the stereo signal. Formula 4
is as follows:
$$m_i = \frac{a_i}{b_i}$$ [Formula 4]
where m_i is the object gain ratio information of each object.
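Formulas 3 and 4 can be illustrated with a short sketch. The function names `stereo_downmix` and `gain_ratios` are hypothetical, and the sketch assumes that each s_i is available as a list of samples of equal length (the text calls s_i an object level; treating it as a sample sequence is an assumption made here so that the sum can be evaluated per sample).

```python
def stereo_downmix(objects, a, b):
    """Formula 3: x_1 = sum_i a_i * s_i and x_2 = sum_i b_i * s_i, per sample.

    objects: list of equal-length sample lists (assumed representation of s_i).
    a, b: per-object gains for the first and second channel.
    """
    n = len(objects[0])
    x1 = [sum(a_i * s[k] for a_i, s in zip(a, objects)) for k in range(n)]
    x2 = [sum(b_i * s[k] for b_i, s in zip(b, objects)) for k in range(n)]
    return x1, x2


def gain_ratios(a, b):
    """Formula 4: m_i = a_i / b_i (assumes every b_i is nonzero)."""
    return [a_i / b_i for a_i, b_i in zip(a, b)]


x1, x2 = stereo_downmix([[1.0, 2.0], [3.0, 4.0]], a=[0.5, 1.0], b=[1.0, 0.5])
print(x1, x2, gain_ratios([0.5, 1.0], [1.0, 0.5]))
```

The mono case of Formula 2 corresponds to using only the first returned channel.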
[0054] The object gain information, e.g., the object gain value
information (a_i and b_i) and the object gain ratio information
(m_i), can be transmitted to a parameter generating unit 1100 in
various combinations of the object gain information contained in a
bitstream. The combinations can include, for example, (a_i, b_i),
(m_i, a_i) and (m_i, b_i). The parameter generating unit 1100 can
decode the combinations to reconstruct the original object
information. It can be understood that decoding of the combinations
performed by the parameter generating unit 1100 can be adapted to
other decoders, for example a multi-channel decoder 1300.
[0055] Alternatively, when the object gain information is
transmitted to the parameter generating unit 1100 as a combination
of object gain value information (a_i, b_i), the object gain value
information can be scaled. If there is a convention that b_i is
scaled to 1, then even though only the object level information and
a_i (as the object gain information) are transmitted, the parameter
generating unit 1100 can reconstruct the original object information
according to the convention. By scaling the object gain value, the
number of parameters to be transmitted to the parameter generating
unit 1100 can be reduced.
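The reconstruction of (a_i, b_i) from the transmitted combinations can be sketched as below. The source does not define a bitstream syntax for signalling which combination is used, so the `mode` labels and the function name `reconstruct_gains` are purely hypothetical; the arithmetic follows from m_i = a_i / b_i and from the b_i = 1 convention described above.

```python
def reconstruct_gains(mode, p, q=None):
    """Return (a_i, b_i) from one transmitted combination.

    mode is a hypothetical label for the combination in use:
      "a_b"    -> p = a_i, q = b_i
      "m_a"    -> p = m_i, q = a_i  (assumes m_i is nonzero)
      "m_b"    -> p = m_i, q = b_i
      "a_only" -> p = a_i, with b_i fixed to 1 by convention
    """
    if mode == "a_b":
        return p, q
    if mode == "m_a":
        m, a = p, q
        return a, a / m      # b_i = a_i / m_i
    if mode == "m_b":
        m, b = p, q
        return m * b, b      # a_i = m_i * b_i
    if mode == "a_only":
        return p, 1.0        # convention: b_i scaled to 1
    raise ValueError(f"unknown combination: {mode}")


print(reconstruct_gains("m_a", 2.0, 1.0))
print(reconstruct_gains("a_only", 0.7))
```

The "a_only" branch shows why the scaling convention saves a parameter: only one gain per object needs to be sent.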
[0056] Alternatively, the object gain ratio information (m_i) can
be obtained in various ways, as in Formula 5:
$$m_i = \frac{a_i}{b_i}, \qquad m_i = \frac{a_i + \alpha}{b_i + \beta}, \qquad m_i = \frac{a_i s_i}{b_i s_i}$$ [Formula 5]
where \alpha and \beta are small numbers that prevent the numerator
and the denominator from being zero.
[0057] In cases where the object gain ratio information includes
s_i, the same m_i value may not include the same value of s_i. For
example, in case of 1) a_i=0.5, b_i=0.5, or 2) a_i=2, b_i=2, each
of these cases has the same m_i (=1) and different values of a_i,
b_i.
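A small sketch of the offset form of Formula 5 makes the point of the example above concrete. The function name `gain_ratio` and the offset value 1e-9 are assumptions chosen for illustration; the source only says that the offsets are small numbers.

```python
def gain_ratio(a, b, alpha=0.0, beta=0.0):
    """Formula 5, second form: m_i = (a_i + alpha) / (b_i + beta).

    With alpha = beta = 0 this reduces to m_i = a_i / b_i, which
    requires b_i to be nonzero.
    """
    return (a + alpha) / (b + beta)


# Two different gain pairs share the same ratio, so m_i alone
# does not determine the absolute gains.
print(gain_ratio(0.5, 0.5), gain_ratio(2.0, 2.0))

# Small offsets (hypothetical value 1e-9) keep the ratio defined
# even when both gains are zero.
print(gain_ratio(0.0, 0.0, alpha=1e-9, beta=1e-9))
```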
[0058] To obtain the processed downmix to be transmitted through
each channel, a new method can be used as described by Formula 6:
$$x_1 = \sum_i a_i'(n)\, s_i'(n), \qquad x_2 = \sum_i b_i'(n)\, s_i'(n)$$ [Formula 6]
wherein a_i' and b_i' are values that satisfy one of the following
conditions: (a_i' + b_i' = C), (a_i'^2 + b_i'^2 = C), or (a_i' = C or
b_i' = C), and wherein s_i' = g_i * s_i.
[0059] Finally, the object gain ratio information can be
transmitted as m_i' (= a_i'/b_i'), so that the number of parameters to
be transmitted to the parameter generating unit 1100 can be reduced.
To prevent distortion of an audio signal in the decoder 1000 or
2000, m_i can be transmitted.
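One consequence of the constraints in Formula 6 is that a single ratio m_i' is enough to recover both gains. The source does not spell out this recovery step, so the sketch below is an assumption derived by solving m_i' = a_i'/b_i' together with each constraint; the function name `gains_from_ratio` and the `constraint` labels are hypothetical, and non-negative gains are assumed.

```python
import math


def gains_from_ratio(m, c=1.0, constraint="sum"):
    """Recover (a', b') from m = a'/b' and one constraint of Formula 6.

    "sum":    a' + b' = c       ->  b' = c / (1 + m),          a' = c - b'
    "energy": a'^2 + b'^2 = c   ->  b' = sqrt(c / (1 + m^2)),  a' = m * b'
    Assumes m >= 0 and c > 0.
    """
    if constraint == "sum":
        b = c / (1.0 + m)
        return c - b, b
    if constraint == "energy":
        b = math.sqrt(c / (1.0 + m * m))
        return m * b, b
    raise ValueError(f"unknown constraint: {constraint}")


print(gains_from_ratio(3.0, c=2.0))
print(gains_from_ratio(1.0, c=2.0, constraint="energy"))
```

In both cases the returned pair reproduces the transmitted ratio, which is why only one parameter per object needs to be sent under such a convention.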
1.3 Object Correlation Information
[0060] Referring to FIG. 4, the parameter generating unit 1100
receives object correlation information. The object correlation
information is estimated between two objects and represents the
correlation/coherence between the two objects.
[0061] When the two objects originate from the same
channel but are transmitted through different channels, object
correlation information can exist.
[0062] First, if the object signal includes stereo objects, a mono
object may be generated by downmixing the stereo objects, together
with a descendant object parameter indicating
relations between channels of the stereo objects (hereinafter, this
method is also referred to as the `mono method`). In this case, the
object level information is generated using the object level energy
of the mono object.
[0063] Second, the stereo objects may be treated as two individual
mono object signals. In this case, the object level information is
generated using the two individual mono object levels (hereinafter,
this method is also referred to as the `stereo method`). The amount of
information to be transmitted using the second method can be greater
than with the first method.
[0064] To process a stereo object, for example, a first channel
signal of the stereo object may be s_i and a second channel signal of
the stereo object may be s_j, each treated as a mono object signal.
[0065] The object levels of the above channel signals may be Ps_i and
Ps_j.
[0066] In the case of a stereo object, the object information
representing the L and R channels of a given object is similar.
So, the object correlation information can be used to
represent the similarity between the object information.
[0067] Therefore, to encode Ps_i and Ps_j, each mono object using
the stereo method is considered as constituting the same
object.
[0068] The object correlation information includes one channel
power as a representative, for example, that of the left channel of
the stereo object, and a power value normalized by the
representative, as described in Formula 7:
$$Ps_j' = \frac{Ps_j}{Ps_i} \quad \text{or} \quad Ps_j' = 10\log_{10}(Ps_j) - 10\log_{10}(Ps_i) = 10\log_{10}\left(\frac{Ps_j}{Ps_i}\right)$$ [Formula 7]
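Both forms of Formula 7 can be computed with a few lines. The function name `relative_power` and the `in_db` switch are hypothetical, and both powers are assumed to be strictly positive so that the ratio and the logarithm are defined.

```python
import math


def relative_power(ps_i, ps_j, in_db=False):
    """Formula 7: power of channel j normalized by representative channel i.

    Returns Ps_j / Ps_i, or 10*log10(Ps_j / Ps_i) when in_db is True.
    Assumes ps_i > 0 and ps_j > 0.
    """
    ratio = ps_j / ps_i
    return 10.0 * math.log10(ratio) if in_db else ratio


print(relative_power(4.0, 1.0))
print(relative_power(1.0, 100.0, in_db=True))
```

For a stereo object whose two channels have similar power, the linear value stays close to 1 and the decibel value close to 0, so the normalized value varies over a narrow range.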
[0069] To reduce the number of transmitted bits of object
information, it can be effective to use object correlation
information.
[0070] The object correlation information can also be generated
using the representation described in Formula 8:
$$(Ps_i',\ Ps_j') = \frac{(Ps_i,\ Ps_j)}{\sqrt{Ps_i \cdot Ps_j}}$$ [Formula 8]
[0071] The object correlation information can represent a relation
between objects, whether or not the objects are both channels of
the same stereo or multi-channel object, that is, each object can
be a different channel of same origin.
[0072] Additionally, regarding the relation between two objects,
differential information can be used.
[0073] The differential information can include a sum or
difference signal of the stereo object as described in Formula 9:
$$M = \frac{L+R}{2}, \quad S = \frac{L-R}{2}, \quad Ps_M = \frac{Ps_L + Ps_R}{2}, \quad Ps_S = \frac{Ps_L - Ps_R}{2}$$ [Formula 9]
[0074] The object correlation information including the above M and
Ps_M can improve transmission efficiency and make it easy to
perform an error balance.
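The sum and difference signals of Formula 9 can be sketched as follows. The function names are hypothetical; the inverse mapping (L = M + S, R = M - S) is not stated in the source and is included here only because it follows algebraically from the definitions of M and S.

```python
def to_mid_side(left, right):
    """Formula 9: M = (L + R) / 2 and S = (L - R) / 2, per sample."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Algebraic inverse of Formula 9: L = M + S and R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


mid, side = to_mid_side([1.0, 3.0], [1.0, -1.0])
print(mid, side)
print(from_mid_side(mid, side))
```

When the two channels are similar, S is close to zero, which is one way to see why this representation can reduce the amount of information to be transmitted.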
[0075] The amount of object correlation information can vary
adaptively, according to which objects constitute the same object, in
order to reduce the bit rate of an object parameter. Flag information
`correlation_flag`, indicating whether an object is part of a stereo
or multi-channel object, can be received from the object
information. The correlation_flag can be included in the object
information and received by the information generating unit 1100.
[0076] An example meaning of the flag information `correlation_flag`
is shown in Table 1.
TABLE 1
Correlation_flag | Meaning
1 | Correlation
0 | No correlation
[0077] When `correlation_flag` is equal to 0, the object
correlation information is not transmitted to the object
correlation information decoding unit 1130a. When the
`correlation_flag` is not received by the decoder 1000 or 2000, a
default value can be used to process the downmix of the audio
signal. Otherwise (`correlation_flag` is equal to 1), the object
correlation information is transmitted to the object correlation
information decoding unit 1130a and represents a similarity between
the selected two objects.
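The three cases described for `correlation_flag` can be sketched as a small lookup. Everything about the data layout is an assumption: the object information is modelled as a dictionary, the key `object_correlation` and the function name `read_object_correlation` are hypothetical, and the default value of 0.0 is a placeholder because the source does not state what the default is.

```python
DEFAULT_CORRELATION = 0.0  # hypothetical default; the source gives no value


def read_object_correlation(object_info):
    """Return the correlation to use for a pair of objects.

    object_info is a dict standing in for decoded object information.
    - flag absent:  fall back to the default value
    - flag == 0:    no correlation information was transmitted (None)
    - flag == 1:    use the transmitted correlation value
    """
    flag = object_info.get("correlation_flag")
    if flag is None:
        return DEFAULT_CORRELATION
    if flag == 0:
        return None
    return object_info["object_correlation"]


print(read_object_correlation({}))
print(read_object_correlation({"correlation_flag": 0}))
print(read_object_correlation({"correlation_flag": 1, "object_correlation": 0.8}))
```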
[0078] The object information can further include reference
information separately. When the reference information exists, the
reference information can be an identifier for an MCU combiner, for
example.
[0079] In some embodiments, a method of encoding an audio
signal according to the present invention comprises the step of
receiving a multi-object audio signal and the step of generating a
downmix of an audio signal and object information, including
object level information, object gain information, and object
correlation information, from the multi-object audio signal. The
characteristics of the object level information, the object gain
information, and the object correlation information are the same as
in the decoding method. The method of encoding an audio signal
according to the present invention may not be limited to the above
description.
[0080] Additionally, an apparatus for encoding an audio signal
according to the present invention comprises a downmixing unit
generating a downmix of an audio signal from a multi-object audio
signal, and an object information unit extracting object
information including object level information, object gain
information, and object correlation information from the
multi-object audio signal. The apparatus for encoding an audio
signal may not be limited to the above description.
MCU Combiner
[0081] An audio signal comprising multi-object signals can be used
by an MCU combiner to control object gain and output in, for
example, a remote conference. When the audio signal comprises
multi-object signals, it may be effective to control object gain
and panning according to the characteristics of each object
signal.
[0082] For example, suppose a multi-channel audio signal includes
vocal sound, background music (BGM), and narration sound. With such
a signal, a particular kind of object signal cannot be detected or
controlled when the occasion demands, for example when listening
only to the background music without the vocal sound and narration
sound, or when communicating only with a particular person in a
teleconference.
[0083] Additionally, the method of decoding according to the present
invention using object information may be applied to an enhanced
karaoke system.
[0084] FIG. 6 is an exemplary block diagram of an apparatus for
processing an audio signal according to an embodiment of the present
invention. Referring to FIG. 6, the apparatus for processing an
audio signal according to the embodiment may comprise an encoder 1
3100, an encoder 2 4100, and a combining unit 5000 including an MCU
combining unit 5100 and a downmixer 5200. The encoder 1 3100 can be
configured to receive an audio signal_1 and to generate a downmix_1
and an object information_1, and the encoder 2 4100 can be
configured to receive an audio signal_2 and to generate a downmix_2
and an object information_2.
[0085] The combining unit 5000 can be configured to receive the
downmix_1 and the object information_1 from the encoder 1 3100, the
downmix_2 and the object information_2 from the encoder 2 4100, and
a control information from user control, and to generate a downmix
and a combined object information.
[0086] The downmix, which is an output signal of the combining unit
5000, can be generated by a conventional downmixing unit. Therefore,
details of the elements of the downmixer 5200 are omitted.
2.1 Combined Object Parameter
[0087] FIG. 7 is an exemplary block diagram of an apparatus for
processing an audio signal according to an embodiment of the present
invention, in particular, an exemplary block diagram of the MCU
combining unit 5100. Referring to FIG. 7, the MCU combining unit
5100 can be configured to generate a combined object information
using the object information_1, the object information_2, and the
control information. The combined object information includes all
information corresponding to the downmix_1 from the encoder 1 3100
and the downmix_2 from the encoder 2 4100.
[0088] The MCU combining unit 5100 includes an object information
decoding unit 5110 and a combined object information encoding unit
5120. The object information decoding unit 5110 can be configured
to receive the object information_1 from the encoder 1 3100 and the
object information_2 from the encoder 2 4100, and to generate a
reference value_1, an object level information_1, and an object
gain information_1 from the object information_1, and a reference
value_2, an object level information_2, and an object gain
information_2 from the object information_2. The reference values,
the object level information, and the object gain information are
the same as those described with reference to FIG. 1 to FIG. 6.
Therefore, details of the method of generating this information are
omitted.
[0089] The MCU combining unit 5100 can also be configured to
receive at least two sets of object information from multiple
encoders, without limitation on the input signals, and to generate
the combined object information comprising the information
corresponding to the downmix.
2.2 Control Information
[0090] FIG. 8 is an exemplary block diagram of an apparatus for
processing an audio signal according to an embodiment of the present
invention, in particular, an exemplary block diagram of a combined
object information encoding unit 5120. Referring to FIG. 8, the
combined object information encoding unit 5120 can be configured to
receive the information described above and a control information
from user control, and to generate a combined object information to
be input to a decoder (not shown).
[0091] The control information may be used to process the object
information_1 and the object information_2, and is applied to the
combination of the object information_1 and the object
information_2 in the combined object information encoding unit
5120. The combined object information may be generated by applying
the control information, which indicates which objects are to be
combined to constitute the combined object information and how
object gain is to be controlled in the combination of the object
information.
[0092] The control information includes an object control
information, a gain control information, and a destination
information. Each of the object control information, the gain
control information, and the destination information is explained
below.
2.2.1 Object Control Information
[0093] The object control information may determine target objects
to generate the combined object information. The object control
information can determine a required subset of audio objects of
object information_1 or object information_2.
[0094] The object control information may be applied to the
object level information in the object level information encoding
unit 5122. The combined object information may include information
corresponding to the objects determined by the object control
information, and can be used for several purposes.
[0095] For example, suppose the object information_1 describes music
including vocal, piano, and guitar object signals, and the object
information_2 describes violin and vocal object signals. To generate
an audio signal comprising the piano, guitar, and violin object
signals, the combined object information can be obtained without
the vocal object signals by using the object control information
from user control.
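The selection in the example of paragraph [0095] can be sketched in
Python as follows. This is a minimal illustration under stated
assumptions: each object information is reduced to a list of object
names, the object control information is modeled as a set of names to
exclude, and the function name `select_objects` is hypothetical.

```python
def select_objects(object_info_1, object_info_2, excluded_names):
    """Pick the subset of objects that enters the combined object information.

    Returns (source index, object name) pairs, where the source index is
    1 for object information_1 and 2 for object information_2.
    """
    selected = []
    for source_index, names in enumerate((object_info_1, object_info_2), start=1):
        for name in names:
            # Assumed form of the object control information: an exclusion set.
            if name not in excluded_names:
                selected.append((source_index, name))
    return selected


# Object lists taken from the example in the text.
combined = select_objects(
    ["vocal", "piano", "guitar"],
    ["violin", "vocal"],
    excluded_names={"vocal"},
)
print(combined)  # [(1, 'piano'), (1, 'guitar'), (2, 'violin')]
```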
2.2.2 Gain Control Information
[0096] The object gain information encoding unit 5123 can be
configured to receive a gain information_1 from the object
information_1, a gain information_2 from the object information_2,
a gain control information, and a destination information, and to
generate an object gain information of the combined object
information.
[0097] The gain control information may be used to control object
gain for the MCU combiner. Unlike the object control information,
the gain control information may be applied in the object gain
information encoding unit 5123 to the object information that has
been selected using the object control information in the object
level information encoding unit 5122. The gain control information
may be a value within the range of 0 to 1.
2.2.3 Destination Information
[0098] Within the range of the gain control information, if the gain
control information corresponding to object information_i is 0, that
object information is not included in the combined object
information. When the gain control information is 0 or 1, the gain
control information defines a destination information. The
destination information may include the special gain control
information having a value of 0 or 1 and indicators of the
destinations to which the downmix is to be output.
[0099] The destination information can be used for special
functions, for example, a whisper function or a secret meeting, and
for controlling the destination of an object signal.
[0100] Referring to FIG. 8, the destination information may be
input into the object gain information encoding unit 5123 and used
to process the gain information_1 and the gain information_2 so as
to control the object gain of the combined object information. If
an MCU combiner has three ports, the destination information may
include a gain value (0 or 1) for each output port.
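As an illustration of the three-port case, the following Python
sketch routes objects to output ports from per-port 0/1 values. The
data layout (a dictionary mapping an object name to a tuple with one
0/1 value per port), the object names, and the function name
`objects_per_port` are assumptions made for the example.

```python
def objects_per_port(destination):
    """List, for each output port, the objects whose destination value is 1.

    `destination` maps an object name to a tuple of 0/1 values, one per
    output port of the MCU combiner (hypothetical representation).
    """
    if not destination:
        return []
    num_ports = len(next(iter(destination.values())))
    return [
        [name for name, flags in destination.items() if flags[port] == 1]
        for port in range(num_ports)
    ]


# Whisper-like case: talker_a is not sent to the third port.
routing = objects_per_port({
    "talker_a": (1, 1, 0),
    "talker_b": (1, 1, 1),
})
print(routing)
# [['talker_a', 'talker_b'], ['talker_a', 'talker_b'], ['talker_b']]
```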
[0101] The gain control information and the destination information
may be input into the object gain information encoding unit 5123
either together or separately.
2.3 Process of Generating a Combined Object Information
[0102] FIG. 8 is an exemplary block diagram of the combined object
information encoding unit 5120. Referring to FIG. 8, the combined
object information encoding unit 5120 can be configured to receive
a reference value_1, a reference value_2, an object level
information_1, an object level information_2, an object gain
information_1, an object gain information_2, an object control
information, a gain control information, and a destination
information, and to generate a combined object information using
the object control information, the gain control information, and
the destination information.
2.3.1 Determining of Reference Information
[0103] Again referring to FIG. 8, the combined object information
encoding unit 5120 includes a reference value generating unit 5121,
an object level information encoding unit 5122, and an object gain
information encoding unit 5123.
[0104] To generate the combined object information, a reference
information of the combined object information may first be
estimated. Each object information_i may include reference
information used to normalize each object level and to generate an
object level information. However, when at least two sets of object
information are combined to generate a combined object information,
a reference information may need to be determined for the combined
object information in order to normalize the object levels that
constitute its object level information.
[0105] The reference information of the combined object information
may be determined by several methods. For example, the reference
information of the combined object information may be the reference
information_1 or the largest reference information among the object
information_i.
[0106] Alternatively, instead of changing the reference information,
the combined object information may use the object level information
of the object information_i as its own object level information.
2.3.2 Object Level Information of the Combined Object
Information
[0107] The reference information generating unit 5121 may estimate
the reference information of the combined object information by the
above method. Before the reference information is changed to that of
the combined object information, the object level information_i is
normalized by the reference information_i.
[0108] We assume that the object level information of the object
information_1 is given by [Formula 10], and the object level
information of the combined object information is given by
[Formula 11].
OL_1n = EO_1n / (reference information of the object information_1)
[Formula 10]
[0109] (OL_1n is the n-th object level information of the object
information_1, and EO_1n is the n-th object level energy of the
object information_1.)
OL_k = OL_1n * (reference information of the object information_1) /
(reference information of the combined object information)
[Formula 11]
[0110] (OL_k is the k-th object level information of the combined
object information.)
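The renormalization expressed by Formulas 10 and 11 can be sketched
in Python as follows. This is a minimal illustration under stated
assumptions: the largest reference information is chosen as the
reference of the combined object information (one of the options in
paragraph [0105]), the denominator of Formula 11 is taken to be that
combined reference, and the function name and numeric values are
hypothetical.

```python
def combine_object_levels(levels_per_source, references):
    """Re-normalize object levels to a common reference.

    levels_per_source[i] holds the object levels of object information_i,
    each already normalized by references[i] as in Formula 10.
    """
    # Assumed choice: the largest reference becomes the combined reference.
    combined_reference = max(references)
    combined_levels = []
    for levels, reference in zip(levels_per_source, references):
        for level in levels:
            # Formula 11: scale by the old reference, divide by the new one.
            combined_levels.append(level * reference / combined_reference)
    return combined_reference, combined_levels


# Illustrative values only.
reference, levels = combine_object_levels(
    levels_per_source=[[1.0, 0.5], [1.0, 0.25]],
    references=[4.0, 8.0],
)
print(reference, levels)  # 8.0 [0.5, 0.25, 1.0, 0.25]
```

Objects from the source that already uses the largest reference keep
their levels, while objects from the other source are scaled down by
the ratio of the two references.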
2.3.3 Object Gain Information
[0111] The object gain information encoding unit 5123 can be
configured to receive an object gain_1, an object gain_2, a gain
control information, and a destination information, and to generate
an object gain information using the gain control information and
the destination information. When the destination information from
user control indicates on/off of the object information, that is,
when the destination information is 0 or 1, the object gain
information of the object information_i is 0 or 1. When the gain
control information is input from user control, the object gain
information_1 and the object gain information_2 can be changed
using the gain control information.
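The two cases of paragraph [0111] can be sketched in Python as
follows. The text does not state how the gain control information
changes an object gain, so the multiplication used here is an
assumption, as are the function name `update_object_gain` and its
parameters.

```python
def update_object_gain(object_gain, gain_control=None, destination=None):
    """Return the object gain to write into the combined object information."""
    if destination is not None:
        if destination not in (0, 1):
            raise ValueError("destination information must be 0 or 1")
        # On/off case: the object gain information becomes 0 or 1.
        return float(destination)
    if gain_control is not None:
        if not 0.0 <= gain_control <= 1.0:
            raise ValueError("gain control information must lie in [0, 1]")
        # Assumption: the gain control scales the original gain.
        return object_gain * gain_control
    # No user control: the original gain is kept.
    return object_gain


print(update_object_gain(0.8, destination=0))     # 0.0
print(update_object_gain(0.8, gain_control=0.5))  # 0.4
```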
2.3.4 Object Correlation Information
[0112] The object correlation information indicates
similarity/dissimilarity between the channels of a stereo object or
a multi-channel object, so the object correlation information may
be affected by combining object information in the MCU combining
unit 5100.
[0113] The combined object information may include the object
correlation information of the object information_i as it is.
[0114] It will be apparent to those skilled in the art that various
modifications and variations can be made in the present invention
without departing from the spirit or scope of the inventions. Thus,
it is intended that the present invention covers the modifications
and variations of this invention provided they come within the
scope of the appended claims and their equivalents.
* * * * *