U.S. patent application number 12/619653 was filed with the patent office on 2009-11-16 and published on 2010-06-24 for dynamic volume control and multi-spatial processing protection.
This patent application is currently assigned to THAT CORPORATION. Invention is credited to Gregory Benulis, Christopher Hanna, Scott Skinner.
Application Number | 20100158259 12/619653 |
Document ID | / |
Family ID | 41665272 |
Filed Date | 2009-11-16 |
United States Patent
Application |
20100158259 |
Kind Code |
A1 |
Hanna; Christopher ; et
al. |
June 24, 2010 |
DYNAMIC VOLUME CONTROL AND MULTI-SPATIAL PROCESSING PROTECTION
Abstract
A disclosed system and method dynamically controls the perceived
volume of a stereo audio program including left and right channel
signals. The system comprises: a dynamic volume control configured
and arranged so as to maintain a perceived constant volume level of
the stereo audio program; and an excessive spatial processing
protection processor configured and arranged for controlling the
level of a difference signal, created as a function of the right
channel signal subtracted from the left channel signal (L-R),
relative to the level of a sum signal, created as a function of the
right channel signal plus the left channel signal; wherein the
excessive spatial processing protection processor processes the
audio signals so as to control the difference (L-R) signal relative
to the sum (L+R) signal. A system and method are also provided for
dynamically controlling the perceived volume of a stereo audio
program including left and right channel signals, comprising: a
dynamic volume control configured and arranged so as to maintain a
perceived constant volume level of the stereo audio program; and a
program change detector configured and arranged to provide a
program change signal indicating that the volume of the left and
right channel signals has dropped below a threshold level for at
least a threshold time period so as to anticipate a possible change
in the sound level of the left and right channel signals; wherein
the dynamic volume control is responsive to the program change
signal.
Inventors: |
Hanna; Christopher;
(Arlington, MA) ; Benulis; Gregory; (Boxford,
MA) ; Skinner; Scott; (Westborough, MA) |
Correspondence
Address: |
McDermott Will & Emery
600 13th Street, NW
Washington
DC
20005-3096
US
|
Assignee: |
THAT CORPORATION
Milford
MA
|
Family ID: |
41665272 |
Appl. No.: |
12/619653 |
Filed: |
November 16, 2009 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
61114777 | Nov 14, 2008 | |
61114684 | Nov 14, 2008 | |
Current U.S.
Class: |
381/28 ; 381/107;
381/109 |
Current CPC
Class: |
H03G 3/3005 20130101;
H04R 5/04 20130101; H04R 3/007 20130101; H04S 1/00 20130101; H03G
3/3089 20130101; H04R 2430/01 20130101; H03G 3/341 20130101 |
Class at
Publication: |
381/28 ; 381/107;
381/109 |
International
Class: |
H03G 3/00 20060101
H03G003/00; H04R 5/00 20060101 H04R005/00 |
Claims
1. A system for dynamically controlling the perceived volume of a
stereo audio program including left and right channel signals,
comprising: a dynamic volume control configured and arranged so as
to maintain a perceived constant volume level of the stereo audio
program; and a program change detector configured and arranged to
provide a program change signal indicating that the volume of the
left and right channel signals has dropped below a threshold level
for at least a threshold time period so as to anticipate a possible
change in the sound level of the left and right channel signals;
wherein the dynamic volume control is responsive to the program
change signal.
2. A system according to claim 1, wherein the dynamic volume
control includes a compressor configured and arranged to have
adjustable speeds of an attack time and a release time, wherein if
a program change signal is detected the speeds of the attack and
release times are increased.
3. A system according to claim 2, wherein the dynamic volume
control is configured and arranged so that if a loud channel is
changed to a quiet channel, then the increased release time permits
a faster gain increase to meet a target sound output level, and if
a quiet channel is changed to a loud channel, then the increased
attack time permits a faster gain decrease to meet the target sound
output level.
4. A system according to claim 3, wherein the dynamic volume
control is configured and arranged so that if the sound level rises
above the threshold level before the time expires, then a program
change is not detected.
5. A system according to claim 1, wherein the threshold time period
and the threshold level are fixed.
6. A system according to claim 1, wherein the threshold time period
and the threshold level are adjustable.
7. A system according to claim 1, wherein the threshold time period
and the threshold level are variable.
8. A system according to claim 1, wherein the program change
detector is configured and arranged so as to respond automatically
by detecting a mute condition during a channel change.
9. A system according to claim 1, wherein the program change
detector is configured and arranged so as to respond by detecting a
channel change condition from a user's remote control when the user
is changing a channel.
10. A system according to claim 1, wherein the program change
detector is configured and arranged so as to respond to a channel
change condition communicated via a host processor.
11. A method of dynamically controlling the perceived volume of a
stereo audio program including left and right channel signals,
comprising: dynamically controlling the volume of the left and
right channel signals in response to a program change signal so as
to maintain a perceived constant volume level of the stereo audio
program; and generating the program change signal in response to
detecting a program change signal indicating that the volume of the
left and right channel signals has dropped below a threshold level
for at least a threshold time period so as to anticipate a possible
change in the sound level of the left and right channel
signals.
12. A system for dynamically controlling the perceived volume of a
stereo audio program including left and right channel signals,
comprising: a dynamic volume control configured and arranged so as
to maintain a perceived constant volume level of the stereo audio
program, the dynamic volume control including at least one
compressor responsive to high and low attack and release ratio
thresholds so as to define quiet, normal and loud perceived volume
levels.
13. A system according to claim 12, wherein the compressor applies
a different compression ratio to each of the quiet, normal and loud
perceived volume levels.
14. A system according to claim 13, wherein the compression ratio
for quiet perceived volume levels is set to amplify the left and
right signals, the compression ratio for normal volume levels is
set to preserve the left and right signals, and the compression
ratio for loud volume levels is set to attenuate the left and right
signals.
Description
RELATED APPLICATIONS
[0001] This application is related to and claims priority to U.S.
Provisional Application Nos. 61/114,684 filed on 14 Nov. 2008 in
the names of Christopher M. Hanna, Gregory Benulis and Scott
Skinner; and 61/114,777 filed on 14 Nov. 2008 in the names of
Christopher M. Hanna and Gregory Benulis, both applications being
herein incorporated by reference. This application is also related
to copending U.S. application Ser. No. ______ (Attorney's Docket
No. 56233-428-THAT-27) contemporaneously filed with the present
application in the names of Christopher M. Hanna and Gregory
Benulis, and assigned to the present assignee.
TECHNICAL FIELD
[0002] The present application relates to audio signal processing,
and more particularly to audio signal volume control and
multi-spatial processing protection.
BACKGROUND
[0003] During television viewing, volume changes can be irritating
and often involve manual volume adjustments by the viewer. One
example is the perceived volume change that often occurs when
changing channels on a television. Another example would be the
perceived volume change that can occur between the broadcast of a
television program and a commercial. These large relative changes
are typically attributed to lack of level control at the point of
broadcast or signal compression introduced during production. A
little-known cause of increased perceived loudness is
multiple spatial processing. The audio in some program material is
processed, in the studio, to introduce surround spatial effects
(pseudo-surround) in two-channel systems. If this type of broadcast
audio is then processed in the television to introduce two-channel
surround effects, as is currently done in many television models,
the perceived level change can be dramatic. This additional spatial
processing can cause the center image (typically dialogue) to be
almost unintelligible. In all cases automatic volume control
technology can minimize listener discomfort and maintain a more
consistent volume level. While much attention has been paid to
leveling the audio volume at the point of broadcast, it seems to
have done little to alleviate the problem. In fact, with the advent
of high dynamic range DTV broadcasts, wider loudness differences can
now be perceived by the television viewer.
SUMMARY
[0004] In accordance with one aspect of the disclosed system and
method, a system is provided for dynamically controlling the
perceived volume of a stereo audio program including left and right
channel signals, comprising: a dynamic volume control configured
and arranged so as to maintain a perceived constant volume level of
the stereo audio program; and an excessive spatial processing
protection processor configured and arranged for controlling the
level of a difference signal created as a function of the right
channel signal subtracted from the left channel signal (L-R)
relative to the level of a sum signal created as a function of the
right channel signal plus the left channel signal; wherein the
excessive spatial processing protection processor processes the
audio signals so as to control the difference (L-R) signal
enhancement.
[0005] In accordance with another aspect, a system is provided for
dynamically controlling the perceived volume of a stereo audio
program including left and right channel signals, comprising: a
dynamic volume control configured and arranged so as to maintain a
perceived constant volume level of the stereo audio program; and a
program change detector configured and arranged to provide a
program change signal indicating that the volume of the left and
right channel signals has dropped below a threshold level for at
least a threshold time period so as to anticipate a possible change
in the sound level of the left and right channel signals; wherein
the dynamic volume control is responsive to the program change
signal.
[0006] In accordance with yet another aspect, a system is provided
for dynamically controlling the perceived volume of a stereo audio
program including left and right channel signals, comprising: a
dynamic volume control configured and arranged so as to maintain a
perceived constant volume level of the stereo audio program, the
dynamic volume control including at least one compressor responsive
to high and low attack and release ratio thresholds so as to define
quiet, normal and loud perceived volume levels.
[0007] In accordance with still another aspect, a system is
provided for dynamically controlling the perceived volume of a
stereo audio program including left and right channel signals,
comprising: an excessive spatial processing protection processor
configured and arranged for controlling the level of a difference
signal created from subtracting the right channel signal from the
left channel signal (L-R), and a contour filter for shaping the
difference signal.
[0008] In accordance with yet another aspect, a system is provided
for dynamically controlling the perceived volume of a stereo audio
program including left and right channel signals. The system
comprises: an excessive spatial processing protection processor
configured and arranged for controlling the level of a difference
signal created from subtracting the right channel signal from the
left channel signal (L-R), and a contour filter for shaping the
difference signal.
GENERAL DESCRIPTION OF THE DRAWINGS
[0009] The drawings disclose illustrative embodiments. They do not
set forth all embodiments. Other embodiments may be used in
addition or instead. Details that may be apparent or unnecessary
may be omitted to save space or for more effective illustration.
Conversely, some embodiments may be practiced without all of the
details that are disclosed. When the same numeral appears in
different drawings, it refers to the same or like components or
steps.
[0010] Aspects of the disclosure may be more fully understood from
the following description when read together with the accompanying
drawings, which are to be regarded as illustrative in nature, and
not as limiting. The drawings are not necessarily to scale,
emphasis instead being placed on the principles of the disclosure.
In the drawings:
[0011] FIG. 1 is a simplified block diagram of one embodiment of a
dynamic volume control system;
[0012] FIG. 2 is a state diagram illustrating one embodiment of the
operation of a program change detector;
[0013] FIG. 3 is a simplified block diagram of one embodiment of a
single band of a dynamic volume control system;
[0014] FIG. 4 is a simplified block diagram of one embodiment of a
multi-band dynamic volume control system;
[0015] FIGS. 5-7 graphically illustrate frequency responses of a
multi-band dynamic volume control system;
[0016] FIG. 8 is a simplified block diagram of one embodiment of a
double processing protection system;
[0017] FIG. 9 is a simplified block diagram of one embodiment of an
arrangement of a combined system including both a dynamic volume
control system and a double processing protection system; and
[0018] FIG. 10 is a simplified block diagram of a second embodiment
of an arrangement of a combined system including both a dynamic
volume control system and a double processing protection system.
DETAILED DESCRIPTION OF THE DRAWINGS
[0019] Illustrative embodiments are now discussed. Other
embodiments may be used in addition or instead. Details that may be
apparent or unnecessary may be omitted to save space or for a more
effective presentation. Conversely, some embodiments may be
practiced without all of the details that are disclosed.
[0020] Dynamic Volume Control (DVC) System
[0021] A DVC system is described for dynamically controlling the
volume of an audio signal. The system is configured and arranged so
as to dynamically manipulate and modify sound volume when sudden
changes occur. The embodiments described herein are configured and
arranged so as to maintain a perceived constant volume level for
audio band applications. The DVC system can be entirely digital and
can be implemented economically in software (C, assembler etc.) or
digital hardware (HDL description), although it should be evident
that the system can be entirely analog, or a hybrid analog/digital
system. Market applications include television audio, DVD player
audio, set top box audio, radio audio and other hifi and non-hifi
audio products. Without a DVC system of the type described herein,
perceived volume levels can vary dramatically as program material
changes within a given broadcast/source or as the audio
broadcast/source changes. These volume changes can be irritating
and often involve manual volume adjustments by the listener. One
specific example would be the volume changes that occur when
changing channels on a television. Another example would be the
volume changes between a television program and a television
commercial. In both examples the DVC system would eliminate
listener discomfort and maintain a more consistent volume
level.
[0022] FIG. 1 shows one embodiment of such a DVC system 100. The
system 100 receives two input signals, a left signal L at input 102
and a right signal R at input 104. In the embodiments described the
DVC system architecture is based upon a digital implementation of a
classic compressor design (THAT Corporation Design Note 118) with
flexibility and additional modifications that are only possible in
a digital implementation. System 100 includes an RMS level detector
110 for providing a signal representative of the sum of the RMS
averages of the left and right signals L and R, log conversion
block 112, and a signal averaging AVG block 114. Log conversion
block 112 converts the output of the RMS level detector 110 from
the linear to the logarithmic domain. System 100 is responsive to a
number of control signals, each indicative of whether a certain
condition exists requiring a response from the system. The system
100 also includes a host processor (not shown) configured and
arranged for carrying out the operation of the DVC system 100. The
illustrated embodiment is responsive to a number of control signals
including: a target level signal provided by the target signal
generating device 116, an attack threshold signal generated by the
attack threshold signal device 118, a release threshold (not
shown), a gate threshold signal generated by the gate threshold
signal device 120, an attack ratio threshold (not shown), a release
ratio threshold (not shown), a ratio signal generated by the ratio
signal device 122, and a mute hold signal generated by mute hold
device 124 responsive to a program change detector (PCD, not shown).
Devices 116, 118, 120, 122 can simply be adjustable user controls
accessible to the user. Device 124 can be arranged to receive a
signal from the TV controls when the channel changes or from a mute
detector (not shown) that detects if inputs 102 and 104 have both
been muted. The target signal level 116 represents the level in dB,
relative to a full scale input, that is the target volume. The
attack threshold 118 represents the number of dB that REF must be
above AVG before the attack time is reduced by a factor of N, where
N can be any number. In one illustrated embodiment N=10. The
release threshold signal represents the number of dB that REF must
be below AVG before the release time is reduced by a factor of M,
where M can be any number, and in one illustrated embodiment M=10.
The gate threshold 120 represents the amount, a negative dB number,
that REF can go below AVG before all left and right gain
adjustments are frozen. The attack ratio threshold represents the
absolute amount, in dB, that REF can go above the target signal
level 116 before the volume control begins attenuating the input
signal. The release ratio threshold represents the absolute amount,
in dB, that REF can go below the target signal level 116 before the
volume control begins adding gain to the input signal. The ratio
signal 122 adjusts the AVG value by the desired compression
ratio.
[0023] Target level signal 116 is subtracted from the output of log
conversion block 112 by signal summer 126 so as to provide the REF
signal to the signal averaging AVG block 114, a comparator 128 and
a second comparator 130. The REF signal represents the volume level
of the input signal relative to the desired listening threshold.
The REF signal can also be thought of as the instantaneous (prior
to attack/release processing) ideal gain recommendation. The output
of the signal averaging block 114 is the AVG signal, which is a
signal that is a function of the average of the REF signal. The AVG
signal is applied to the signal summer 132 where it is added to the
attack threshold signal 118. In a similar manner (not shown) the
AVG signal is summed with a release threshold. The AVG signal is
also applied to the signal summer 134 where it is added to the gate
threshold signal 120. The output of signal summer 132 is applied to
attack threshold comparator 128 where it is compared to the REF
signal, while the output of signal summer 134 is applied to gate
threshold comparator 130 where it is compared to the REF signal.
The AVG signal is also multiplied by the ratio signal 122 by the
signal multiplier 136. The output of comparator 128 is applied to
the attack/release selection block 138, which in turn provides
either an Att (attack) signal, or a Rel (release) signal to the
signal averaging block 114, dependent on and responsive to the
status of the mute hold signal 124. The output of the release
threshold AVG summer (not shown) is also compared to the REF signal
and is applied to the attack/release selection block. The
comparator 130 provides an output to the HOLD input of signal
averaging block 114. Finally, the signal multiplier 136 provides an
output to a log-to-linear signal converter 140, which in turn
provides an output which is applied to each of the signal
multipliers 142 and 144, wherein it respectively scales the left
and right signal provided at the corresponding inputs 102 and 104
so as to provide the output modified left and right signals Lo and
Ro.
[0024] Referring to FIG. 1, the RMS level detector 110 senses the
sound level of the input signal. It should be noted that while an
RMS level detector is shown, any type of signal level detector can
be used. For example, a peak detector, average detector, perception
based level detector (such as the ITU 1770 loudness detector or the
CBS loudness detector), or other detector can be used to sense the
sound level. These level detectors usually have time constants
which are dynamically and independently adjustable. One method of
adjusting these time constants is to base them on the envelope or
general shape of the input signal so that the time constants vary
with the signal. In other embodiments, the time constants are
fixed. For ease of data processing, the sound level can be
converted into the log domain, as shown, using log conversion block
112. In a multi-band system, a separate RMS detector can be used
for each band. The signal averaging block 114 is configured and
arranged so as to compute the average of REF relative to the attack
and release times. The output signal AVG of the signal averaging
block 114 is adjusted by the desired compression ratio, via
multiplier 136, to create the gain value to be applied. Finally the
gain is converted back into the linear domain by the log-to-linear
converter 140 for application to the left and right signals L and R
so as to produce the modified left and right signals Lo and Ro.
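The signal path of FIG. 1 described above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the implementation of the disclosed system: the names `mean_square` and `dvc_block`, the block-based processing, the default values, and the use of a simple one-pole smoother for averaging block 114 are all hypothetical choices made here for illustration.

```python
import math

def mean_square(samples):
    """Mean of the squared samples of one channel block."""
    return sum(s * s for s in samples) / len(samples)

def dvc_block(left, right, avg_db, target_db=-20.0, ratio=4.0, coeff=0.5):
    """Process one block of L and R samples; returns (Lo, Ro, new AVG in dB).

    All parameter names and defaults are hypothetical.
    """
    # Level detector 110 and log conversion 112 (assumed: summed L and R power, in dB).
    power = mean_square(left) + mean_square(right)
    level_db = 10.0 * math.log10(max(power, 1e-12))
    # Summer 126: REF is the level relative to the target level 116.
    ref_db = level_db - target_db
    # Averaging block 114, modeled here as a one-pole smoother (assumption).
    avg_db = avg_db + coeff * (ref_db - avg_db)
    # Multiplier 136: scale AVG by -(1 - 1/ratio) to obtain the gain in dB.
    gain_db = -(1.0 - 1.0 / ratio) * avg_db
    # Log-to-linear converter 140, then multipliers 142 and 144.
    gain = 10.0 ** (gain_db / 20.0)
    return [gain * s for s in left], [gain * s for s in right], avg_db
```

With `coeff=1.0` the smoother settles in a single block, which makes it easy to check that a level above the target ends up at one quarter of its original distance, in dB, from the target for a 4:1 ratio.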
[0025] A target output level represented by the target level signal
116 is subtracted from the sensed level at the output of the log
conversion block 112 to determine the difference between the actual
and desired sound level. This difference, which represents the
level of the input signal relative to the target level signal 116,
is known as the reference (REF) signal. The target level signal can
be a user input, such as a simple knob or other pre-set setting, so
as to control the level of sound desired. This threshold can be
fixed or it can be changed as a function of the input signal level
to better position the compression relative to the input dynamic
range. Once REF signal is obtained, it is provided as an input to
the averaging block 114, attack threshold comparator 128 and gate
threshold comparator 130. The output of attack threshold comparator
128 is applied to the attack/release select block 138, which in
turn receives a MuteHold signal 124 from a program change
detector.
[0026] The gate threshold signal 120 when added to the current
average AVG represents the lowest value REF is able to achieve
before left and right gain adjustments (142 and 144) are frozen. The
gate threshold comparator 130 receives the instantaneous signal
level (REF) signal and determines if the sound level represented by
REF drops below the given aforementioned threshold. If the
instantaneous signal level (REF) is more than the amount of the
gate threshold below the averaged signal level (AVG) appearing at
the output of block 114, the gain applied to the signal in the
signal path is held constant until the signal level rises above the
threshold. The intent is to keep the system 100 from applying
increased gain to very low level input signals such as noise. In an
infinite hold system, the gain can be constant forever until the
signal level rises. In a leaky hold system, the gain can be
increased at a gradual pace (much slower than the release time). In
one embodiment, this gate hold threshold is adjustable, while in
another embodiment the threshold set by gate threshold 120 is
fixed.
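The gate behavior just described, including the infinite and leaky hold variants, might be sketched as follows. The function names, the default threshold of -30 dB, and the `leak` coefficient are hypothetical and chosen only for illustration; the averaging is again modeled as a one-pole smoother, which is an assumption.

```python
def gate_hold(ref_db, avg_db, gate_threshold_db=-30.0):
    """True when REF is below AVG by more than the (negative) gate threshold."""
    return ref_db < avg_db + gate_threshold_db

def update_avg(avg_db, ref_db, coeff, gate_threshold_db=-30.0, leak=0.0):
    """Advance AVG by one step, freezing it while the gate is active.

    leak == 0.0 models an infinite hold; a small positive leak models a
    leaky hold that drifts toward REF much more slowly than coeff would.
    """
    if gate_hold(ref_db, avg_db, gate_threshold_db):
        return avg_db + leak * (ref_db - avg_db)
    return avg_db + coeff * (ref_db - avg_db)
```

Because the gain is derived from AVG, holding AVG constant holds the gain constant, so very low level input such as noise does not pull the gain upward.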
[0027] The program change detector, or mute-hold, senses when the
input is "silent." When a user changes a television (TV) channel,
the sound level between the two channels can change, either
increasing or decreasing significantly. Typically, a television
manufacturer will mute the audio briefly while changing channels to
protect the viewer from irritating audio transients. The program
change detector is designed to check for this sort of muting by
determining if the sound level drops below a predetermined
threshold (MuteLev) for a predetermined amount of time (MuteTime).
If the instantaneous sound level (REF) is below the threshold for a
certain period of time, or "mute time," then a program change is
detected. If a program change is detected, the attack and release
are sped up (described in further detail below), that is, the
attack and release times are shortened. With this change, if a loud
channel is changed to a quiet channel, then the faster release
permits a faster gain increase to meet the target sound output
level. Conversely, if a quiet channel is changed to a loud channel,
then the faster attack permits a faster gain decrease to meet the
target. If
the sound level rises above the threshold before the "mute time"
expires, then a program change is not detected. In alternative
embodiments, the "mute time" and the mute threshold can be fixed,
user adjustable, variable, or otherwise.
[0028] FIG. 2 illustrates one embodiment of a state diagram of a
mute detection algorithm for the operation of the program change
detector. The operation 200 includes three states, the MUTE OFF
state 202, the MUTE ON state 208 and the MUTE HOLD state 212. In
the MUTE OFF state 202 the REF signal at the output of the signal
summer 126 is periodically compared to the MuteLev threshold level
at 204 to determine whether REF>MuteLev or REF<MuteLev. If
REF>MuteLev, then the operation remains in state 202. In this
state, MUTE ON=0, MUTE HOLD=0, and the attack and release times are
at their normal settings. If, however, REF<MuteLev, a mute is
detected and the operation transitions at 206 to the MUTE ON state
208. Once transitioned to state 208, MUTE ON=1, and the program
change detector next determines whether the mute condition remains
for a predetermined time. If the mute condition does not last long
enough and REF>MuteOffLev occurs before the expiration of the
timer, the detector transitions back to the state 202. This might
occur where there is a pause in a program during which the audio
portion is silent. However, where the timer determines that the
Mute Time has expired, a program change has occurred. In this
state, when REF>MuteOffLev returns, the detector will transition at
210 to the MUTE HOLD state 212. In this state, the attack and
release times are sped up so that a relatively loud signal is made
softer, and a relatively soft signal is made louder, for a
predetermined time limit (Mute Time). In FIG. 2 the timer setting
in state 208 is shown to be the same as in state 212, but the two
settings can also have different values. While in state 212, if REF
decreases below the MuteLev setting (i.e., REF<MuteLev) prior to
the expiration of the Mute Time, the state transitions at 214 back
to state 208. If, however, the Mute Time does expire, the detector
will transition at 216 back to the state 202.
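The three-state operation of FIG. 2 can be illustrated with a small state machine. This is a sketch under assumptions rather than the disclosed algorithm: the class name, the threshold defaults, the use of a step counter as the timer, and the choice to restart the timer on every state change are all hypothetical.

```python
from enum import Enum

class State(Enum):
    MUTE_OFF = 0   # state 202
    MUTE_ON = 1    # state 208
    MUTE_HOLD = 2  # state 212

class ProgramChangeDetector:
    """Hypothetical sketch; times are counted in calls to step()."""

    def __init__(self, mute_lev=-60.0, mute_off_lev=-55.0, mute_time=3, hold_time=3):
        self.mute_lev = mute_lev
        self.mute_off_lev = mute_off_lev
        self.mute_time = mute_time
        self.hold_time = hold_time
        self.state = State.MUTE_OFF
        self.timer = 0

    def step(self, ref_db):
        if self.state is State.MUTE_OFF:
            if ref_db < self.mute_lev:            # transition 206
                self.state, self.timer = State.MUTE_ON, 0
        elif self.state is State.MUTE_ON:
            self.timer += 1
            if ref_db > self.mute_off_lev:        # audio has returned
                expired = self.timer >= self.mute_time
                # Long mute: program change (210). Short mute: just a pause.
                self.state = State.MUTE_HOLD if expired else State.MUTE_OFF
                self.timer = 0
        else:  # MUTE_HOLD: attack and release run fast while in this state
            self.timer += 1
            if ref_db < self.mute_lev:            # transition 214
                self.state, self.timer = State.MUTE_ON, 0
            elif self.timer >= self.hold_time:    # transition 216
                self.state, self.timer = State.MUTE_OFF, 0
        return self.state
```

A caller would treat `State.MUTE_HOLD` as the condition under which the faster attack and release settings are selected.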
[0029] In one implementation the MuteTime and MuteLev (mute level)
are adjustable. The mute time and mute level can also be fixed in a
given implementation. The mute threshold is set lower than the gate
threshold. The mute detection algorithm can function in an
automatic or manual mode. In automatic mode the system 100 detects
the mute condition during a channel change. The program change
detector can also operate in a manual mode, where a "muting" signal
is received from a television or other device indicating that a
channel is being changed. Further, the program change detector can
also receive signals from a user's remote control to interpret
whether the user is changing a channel. The system 100 can also
operate using attack and release thresholds. If, in a given time
window, a sound level jumps to the extent that the attack threshold
118 is traversed, then the system 100 can operate in "fast attack"
mode. In one embodiment, if REF exceeds AVG by the attack
threshold, this fast attack mode shortens the attack time constant
to quickly reduce the gain of this increased sound level.
Similarly, if the release threshold is traversed, then the system
operates in fast release mode, where the gain is increased quickly.
These attack and release time constants can be adjusted
independently of each other and also independently between high and
low bands in a multi-band system.
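The selection between normal and fast attack or release can be sketched as a single function. The function name, the default times and thresholds, and the assumption that the mute hold condition uses the same speed-up factors N and M as the threshold crossings are hypothetical and made here only for illustration.

```python
def select_time_constant(ref_db, avg_db, attack_s=1.0, release_s=2.0,
                         attack_thresh_db=10.0, release_thresh_db=10.0,
                         n=10.0, m=10.0, mute_hold=False):
    """Return the time constant (seconds) the averaging block should use."""
    rising = ref_db > avg_db
    tau = attack_s if rising else release_s
    if rising and (mute_hold or ref_db > avg_db + attack_thresh_db):
        tau /= n   # fast attack: attack time reduced by a factor of N
    elif not rising and (mute_hold or ref_db < avg_db - release_thresh_db):
        tau /= m   # fast release: release time reduced by a factor of M
    return tau
```

Keeping the attack and release parameters separate, as here, reflects the statement that the two time constants can be adjusted independently; a multi-band system could simply call the function with a different parameter set per band.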
[0030] In some implementations the maximum gain applied to the
input signal may be limited. This would limit the amount of gain
applied to a quiet audio passage. If a loud passage (thunder in a
movie) immediately followed the quiet audio passage, unlimited gain
could result in significant audio overshoot before the gain could
be reduced over the attack time.
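The effect of a maximum gain limit can be seen with a short sketch. The function name and the 12 dB limit are hypothetical; the gain formula is the -(1-ratio) scaling of the AVG signal used elsewhere in this description.

```python
def gain_db(avg_db, ratio=4.0, max_gain_db=None):
    """Gain in dB for an averaged level avg_db (relative to the target)."""
    g = -(1.0 - 1.0 / ratio) * avg_db
    # Optionally cap the boost applied to quiet passages.
    return g if max_gain_db is None else min(g, max_gain_db)
```

For a passage 40 dB below the target, a 4:1 ratio without a limit asks for 30 dB of gain, whereas a 12 dB limit leaves 18 dB less overshoot for the attack to remove if a loud passage follows immediately.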
[0031] Averaging block 114 receives the REF, attack, release and
hold signals and determines the average (AVG) of the REF signal
based on and as a function of the attack, release, and hold
signals. The AVG signal is then adjusted by the compression ratio
to be applied to the original signal for volume control. The AVG
signal represents the REF signal processed with the Attack/Release
time constants. Once a change in REF ripples through the averaging
block 114 to affect the AVG signal, it first needs to be adjusted
by the desired compression ratio. It should be appreciated that
system 100 does not compress infinitely. To adjust the value of the
AVG signal by the compression ratio, the AVG signal is multiplied
by -(1-ratio), where ratio is the reciprocal of the compression
ratio, via ratio setting device 122 and multiplier 136. Thus, by
way of example, a 4:1 compression ratio
would multiply the AVG signal by -(1-1/4) or -3/4. So if the audio
is 20 dB above the threshold value, the AVG signal would equal 20
dB (after the attack time constant has elapsed). Multiplying 20 dB
by -3/4 yields a value of -15 dB. As a result the audio that is 20
dB over the threshold is attenuated to 5 dB after the -15 dB gain
is applied. 20/5=4 which is a 4:1 compression ratio.
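The arithmetic of this example can be written in general form. As a sketch, let R denote the compression ratio (4 in the example), AVG the averaged level in dB relative to the threshold, and G the resulting gain in dB; these symbols are introduced here for illustration only.

```latex
\begin{align*}
G &= -\left(1 - \frac{1}{R}\right)\mathrm{AVG} \\
\mathrm{AVG} + G &= \frac{\mathrm{AVG}}{R} \\
R = 4,\ \mathrm{AVG} = 20\ \mathrm{dB}: \quad
G &= -\tfrac{3}{4}\cdot 20 = -15\ \mathrm{dB}, \qquad 20 - 15 = 5\ \mathrm{dB}
\end{align*}
```

The second line shows why the scaling produces an R:1 compression: after the gain is applied, the level sits at 1/R of its original distance, in dB, from the threshold.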
[0032] The compression ratio applied to the signal can be a single
sloped ratio. For example, a 4:1 ratio can be applied to the
incoming signal, depending on the level threshold. If AVG is above
the threshold, then the level of the signal above the threshold, in
dB, would be reduced by a factor of four (at the attack rate).
Conversely, if AVG is below the threshold, then the signal would be
amplified so that its level below the threshold, in dB, is reduced
by a factor of four (at the release rate).
[0033] In another embodiment, the compression ratio can be
different, depending on whether the AVG signal is above or below
the Target Level threshold provided by device 116. For example, if
the AVG signal is above the Target Level threshold, then the signal
can be reduced by a factor of four, as in the previous example. In
contrast, however, if AVG is below the threshold, then a different
ratio can be applied to amplify the input signal, say a 1.5:1
ratio. This arrangement permits the compression of loud signals
above the ratio threshold, but also preserves the sound level for
quiet dialogue, such as whispers. The arrangement described above
could be thought of as a movie mode; it takes the jarring edge off
of loud sounds but allows the quiet sounds (leaves rustling etc.)
to maintain their original level. This is a good mode for loud
volume settings. Thus, a fuller dynamic range can be achieved while
still compressing loud annoying signals. Another arrangement
involves heavy compression (for example 10:1) for AVG values above
and below the Level threshold. Heavy compression is referred to
herein as a "night mode" since you can hear all sounds in the
program (both loud and soft) without having to turn the volume up
(for soft sounds) and down (for loud sounds). Night mode is good
for low volume settings, which are often preferred by television
viewers during the late night hours.
[0034] Even further, another embodiment contemplates the use of
high and low attack and release ratio thresholds. In such an
embodiment, the two thresholds define three regions of a loudness
space: quiet, normal, and loud. In each of these windows, a
different compression ratio can be applied. For example, a 1.5:1
ratio can be used to amplify quiet signals, a 1:1 ratio can be used
to preserve normal signals, and a 4:1 ratio can be used to
attenuate loud signals. With this multi-windowed system, the
original dynamic range can more accurately be preserved while
fringe loud and soft signals can be attenuated and amplified
respectively.
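One possible reading of the multi-windowed arrangement is sketched below.
The disclosure does not state how the gain is referenced in each region;
this sketch assumes each outer region is measured from its nearest
threshold so that the gain curve is continuous. The function name and the
threshold values used with it are hypothetical.

```python
def windowed_gain_db(avg_db, low_db, high_db, quiet_ratio=1.5, loud_ratio=4.0):
    """Gain in dB for a three-region (quiet / normal / loud) compressor.

    Assumption: excursions are measured from the nearest threshold, and the
    normal region uses a 1:1 ratio, i.e. no gain change.
    """
    if avg_db > high_db:
        # Loud region: attenuate the excursion above the high threshold.
        return -(1.0 - 1.0 / loud_ratio) * (avg_db - high_db)
    if avg_db < low_db:
        # Quiet region: amplify the excursion below the low threshold.
        return -(1.0 - 1.0 / quiet_ratio) * (avg_db - low_db)
    # Normal region: 1:1 ratio preserves the original level.
    return 0.0
```

With hypothetical thresholds of -40 dB and -20 dB, a level of -8 dB is
attenuated, a level of -30 dB is left unchanged, and a level of -49 dB is
amplified.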
[0035] Lastly, if the processing is performed in the log domain,
then the calculated compression ratio is "linearized" at 140 before
applying the gain to the input signal.
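The linearization step can be illustrated as follows. The sketch assumes
the log-domain gain is expressed in amplitude decibels (20 log10); the
disclosure does not specify the log base, and the function names are
illustrative only.

```python
def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB (amplitude convention) to a linear multiplier."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain_db):
    """Scale each input sample by the linearized gain."""
    g = db_to_linear(gain_db)
    return [g * s for s in samples]
```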
[0036] FIG. 3 shows a single band system 300 wherein one DVC system
302 can apply the same gain to each of the left (L) and right (R)
signals applied to the respective inputs 304 and 306. Specifically,
as seen in FIG. 3, the output of the DVC system 302 (provided by
the log-to-linear signal converter 140) dynamically sets the gain
of each of the amplifiers 308 and 310 respectively, which in turn
amplify the corresponding left and right signals applied to the
two inputs of the system 300, providing the Lout and Rout signals at
the outputs 316 and 318. The DVC system 302 can be responsive to
the entire frequency range of each of the L and R signals, or only
a selected band of each. As shown in FIG. 3, for example, high pass
filters 312 and 314 each only pass a high frequency portion of the
respective L and R signals to the DVC system 302, so that the
latter only responds to high frequency content of each of the
signals.
[0037] Alternatively, a multi-band system can be configured so that
each select band is individually processed by its own DVC system
so the L and R signals are independently controlled. As shown in
FIG. 4, for example, a two band system 400 employs two DVC systems
406 and 408, each for the L and R signals, so that the L and R
signals applied to the inputs 402 and 404 enjoy independent gain
control. As shown, the L signal is applied to a high pass filter
410 and low pass filter 412, while the R signal is applied to the
high pass filter 414 and low pass filter 416. In the two band system
of FIG. 4 with high and low bands, each DVC system (406 and 408)
applies a gain to the L and R signals in one band by applying
its output to the respective outputs of the high or low pass
filters. Specifically, the output of DVC system 406 is
applied to control the gain of each of the amplifiers 418 and 420,
which receive and amplify the high frequency outputs of high pass
filters 410 and 414. Similarly, the output of DVC system 408 is
applied to control the gain of each of the amplifiers 422 and 424,
which receive and amplify the low frequency outputs of the low pass
filters 412 and 416. The outputs of amplifiers 418 and 420 are
added at signal summer 426 so as to produce the output signal Lout
at output 428, while the outputs of amplifiers 422 and 424 are
added by the signal summer 430 so as to produce the output signal
Rout at output 432.
[0038] In another embodiment, if independent gain control of each L
and R signal in a multi-band signal is desired, then a separate DVC
system can be used for each band of each of the L and R signals.
Further, instead of a multi-band system, a high pass filter can be
used to eliminate low frequencies for systems unresponsive to low
frequencies such as shown in FIG. 3.
[0039] Regarding the filters used with the multi-band DVC system,
the cross over frequency between each contiguous band (in the two
band system this would be the low and high pass bands) can be
adjustable. It is also possible to leave the cross over frequency
fixed. One example is a crossover based upon a digital
implementation of a derived filter. Derived filters are described
in THAT Corporation Application Note 104 from THAT Corporation of
Milford, Mass., and in Bohn, D. (Ed.), Audio Handbook (National
Semiconductor Corporation, Santa Clara, Calif. 1976) §5.2.4.
In one example of a derived filter implementation, the crossover
uses a 2nd order Butterworth LPF and a derived HPF which sum
to unity as shown in FIG. 5. In another example, the crossover is a
traditional digital 2nd order design with Q=0.5, with the HPF inverted so
the bands sum to unity as shown in FIG. 6. In yet another example,
the crossover is based on 4th order Linkwitz-Riley filters which
sum to unity as shown in FIG. 7. In the single band volume control
a high pass filter controls the input of the RMS detector.
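The derived-filter crossover can be sketched as below: the high band is
formed by subtracting the low-pass output from the input, so the two
bands sum back to the input by construction. The biquad coefficients
follow the widely used bilinear-transform low-pass design with
Q = 1/sqrt(2); they are an assumption of this sketch, since the
disclosure does not give coefficients. The function names, the 1 kHz
crossover, and the 48 kHz sample rate used with them are hypothetical.

```python
import math


def butterworth_lpf_coeffs(fc, fs):
    """2nd-order Butterworth low-pass biquad coefficients (Q = 1/sqrt(2))."""
    w0 = 2.0 * math.pi * fc / fs
    alpha = math.sin(w0) / (2.0 * (1.0 / math.sqrt(2.0)))
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 - cos_w0) / 2.0 / a0, (1.0 - cos_w0) / a0, (1.0 - cos_w0) / 2.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def biquad(x, b, a):
    """Direct-form I biquad filter over a list of samples."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def derived_crossover(x, fc, fs):
    """Split x into a low band (Butterworth LPF) and a derived high band."""
    b, a = butterworth_lpf_coeffs(fc, fs)
    low = biquad(x, b, a)
    # Derived HPF: input minus low band, so low + high equals the input.
    high = [xn - ln for xn, ln in zip(x, low)]
    return low, high
```

Because the high band is derived by subtraction, the unity-sum property
holds regardless of the crossover frequency chosen.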
[0040] Multi-Spatial Processing Protection (MPP)
[0041] Television manufacturers often include virtual surround
(pseudosurround) technology (e.g., SRS Tru-Surround, Spatializer
etc.) in the two-channel television audio output path. This
two-channel television audio may go to speakers external to the
television or to speakers mounted in the television enclosure.
These virtual surround technologies create the illusion of surround
sound by manipulating and enhancing the difference channel (L-R)
present in stereo broadcasts. The listener still perceives an
intact center image (L+R) but also often hears the difference
channel (L-R) either widened over a broad soundstage or as a point
source located somewhere other than the speaker locations. Often
this type of spatial enhancement is done during the production of
the audio programming. This is especially true of television
commercials which are enhanced to grab the listener's attention.
When an audio program has two cascaded stages of spatial
enhancement (for example at the point of production and in a
television's audio processing) there can be significant degradation
in the audio quality. The preprocessed audio tends to have
significant L-R energy relative to L+R energy. The second, cascaded
stage of spatial enhancement processing tends to increase the
amount of L-R energy even more. Recent studies have shown that
an excessive amount of L-R enhancement is one of the top factors in
listener fatigue. There also can be a significant volume
increase.
[0042] Accordingly, in accordance with one aspect of the invention,
an MPP system is provided. In one embodiment the MPP is a double
processing protection (DPP) system that is a part of a television
audio signal reception and playback system, prior to the
television's stereo enhancement technology. The MPP system is
hereinafter referred to as a pseudosurround signal processor. The
exemplary DPP system processes the audio signals so as to minimize
the difference (L-R) enhancement (i.e., minimizing the energy level
of the difference (L-R) signal relative to the sum (L+R) signal)
introduced at the point of production. This allows the television's
spatial enhancement technology to process the audio signals in a
manner that is psychoacoustically pleasing to the listener. The
cascade of the DPP system before the television's spatial
enhancement audio processing has proven to be quite effective in
mitigating the harsh effects of double spatial processing. In one
embodiment the DPP system is entirely digital and can be
implemented economically in software (C, assembler etc.) or digital
hardware (HDL description). It should be appreciated that the DPP
system can also be all analog, or a hybrid of analog and digital
components.
[0043] In one embodiment the DPP system reduces L-R enhancement
relative to the corresponding L+R level. The embodiment reduces the
effects of multiple 2 channel spatial effects processing. One
embodiment of such a system is shown in FIG. 8 at 800. The left
signal L and the right signal R are respectively applied to the
inputs 802 and 804 of system 800. The L and R signals are applied
to matrices represented by the two signal summers 806 and 808.
Signal summers 806 and 808 constitute the matrix which provides the
SUM (L+R) and DIF (L-R) signals.
[0044] In the sum (L+R) path, the signal is generally untouched.
The SUM signal usually contains audio content which does not
necessarily need to be localized. However, in alternate
embodiments, frequency contour shaping can be performed to enhance
audio content such as dialogue. As shown, the SUM signal is
multiplied by a Center constant at signal multiplier 810 prior to
being provided to matrices illustrated as signal summers 812 and 814.
The Center constant allows the level of the center image (L+R) to
be adjusted, if desired, to aid in intelligibility of dialogue.
Adding the L+R and L-R signals provides the left output signal Lo
at output 816, while subtracting the L-R from the L+R provides the
right output signal Ro at output 818.
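The sum/difference matrixing and the recombination into Lo and Ro can be
sketched as follows. The parameter dif_gain stands in for whatever gain
is applied in the difference path, and the 0.5 output scale is an
assumption added here so that unity settings return the original L and R;
the disclosure does not state a scale factor. All function and parameter
names are illustrative.

```python
def encode_sum_dif(left, right):
    """Form the SUM (L+R) and DIF (L-R) signals for one sample pair."""
    return left + right, left - right


def decode_outputs(sum_sig, dif_sig, center=1.0, dif_gain=1.0, scale=0.5):
    """Recombine SUM and DIF into the left and right output samples.

    Assumption: scale=0.5 restores unity gain when center and dif_gain are 1.
    """
    s = center * sum_sig      # Center constant adjusts the center image
    d = dif_gain * dif_sig    # difference-path gain
    lo = scale * (s + d)      # adding SUM and DIF gives the left output
    ro = scale * (s - d)      # subtracting DIF from SUM gives the right output
    return lo, ro
```

Setting dif_gain to zero collapses both outputs to the center image,
which illustrates why attenuating the difference path narrows the stereo
image.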
[0045] In the illustrated embodiment of FIG. 8, most of the
processing occurs in the DIF path. L+R and L-R are compared to
determine the level of the L-R signal relative to L+R. Before
comparison, these two SUM and DIF signals can be each passed
through a respective high pass filter 820 and 822, such as in
circumstances where the speaker frequency response does not include
low frequencies. The L-R DIF signal can further be passed through a
multi-band equalizer 824 to accentuate the ear's most sensitive
frequencies, namely mid-range frequencies, to compensate for the
perceived loudness level of the L-R signal. Equalizer 824 allows
the difference channel level detection to be frequency dependent.
For example, low frequency signals may be minimized when processing
for inexpensive television speakers with limited bass response.
High frequencies may be minimized to limit the response to
transient audio events. Typically, mid-range frequencies, where the
ear is most sensitive, are equalized to dominate the difference
level detection. Once the levels of the difference and sum signals
are calculated, the DIF/SUM ratio is determined.
[0046] Each of these signals is then run through a respective
signal level detector 828 and 830. An RMS level detector can be
used, although any type of level detector (such as the ones
described above) can be used. Also, the processing can all be
performed in the log domain to increase efficiency by passing the
detected levels through the log domain processing blocks 832 and
834.
[0047] The outputs of the blocks 832 and 834 are applied to the
signal summer wherein the processed SUM signal is subtracted from
the processed DIF signal. Subtracting one signal from the other in
the log domain is the same as providing a signal that is the ratio
of the processed DIF signal to the processed SUM signal in the
linear domain. Once the L+R and L-R signal levels are calculated, where
the L-R signal level may have been equalized prior to level
detection to increase the mid-range frequencies, these two signal
levels are compared by the comparator 838 to a preset threshold
840. The ratio between the two signals ((L-R)/(L+R)) is compared to
a threshold ratio by comparator 838 in order to determine the
recommended L-R signal gain adjustment. A limiter stage 842 may be
used to limit the amount and direction of gain applied to the L-R
signal. The illustrated embodiment limits the gain at 0 dB hence
only allowing attenuation of the L-R signal, although in some
applications, there may be a desire to amplify the L-R signal. An
averaging stage 844 averages, with a relatively long time constant,
the output of the limiter stage 842 so as to prevent the DPP system
from tracking brief transient audio events. After conversion back
to the linear domain by linear domain block 846, the level of the
L-R signal is correspondingly adjusted by the signal multiplier 848
to achieve that target ratio.
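The gain computation described in this paragraph can be sketched, under
stated assumptions, as a block-based routine: RMS levels are taken in dB,
the log-domain difference gives the (L-R)/(L+R) ratio, the comparison to
the threshold gives the recommended gain, the limiter caps it at 0 dB,
and a one-pole average smooths it before conversion to a linear
multiplier. The class name, the -6 dB threshold, the smoothing
coefficient, and the use of a one-pole average are all assumptions of
this sketch rather than details taken from the disclosure.

```python
import math


def rms_db(block, floor=1e-12):
    """RMS level of a block of samples, in dB (floored to avoid log of zero)."""
    mean_sq = sum(v * v for v in block) / len(block)
    return 10.0 * math.log10(max(mean_sq, floor))


class DppGain:
    """Feed-forward gain computer for the difference (L-R) path."""

    def __init__(self, threshold_db=-6.0, smoothing=0.05):
        self.threshold_db = threshold_db  # target (L-R)/(L+R) ratio in dB
        self.smoothing = smoothing        # small value = long time constant
        self.avg_gain_db = 0.0

    def update(self, sum_block, dif_block):
        # Log-domain subtraction yields the DIF/SUM ratio in dB.
        ratio_db = rms_db(dif_block) - rms_db(sum_block)
        # Comparator: gain needed to bring the ratio down to the threshold.
        gain_db = self.threshold_db - ratio_db
        # Limiter: cap at 0 dB so the DIF signal is only ever attenuated.
        gain_db = min(gain_db, 0.0)
        # Averaging stage: one-pole smoothing of the limited gain.
        self.avg_gain_db += self.smoothing * (gain_db - self.avg_gain_db)
        # Convert back to the linear domain for the DIF multiplier.
        return 10.0 ** (self.avg_gain_db / 20.0)
```

The returned value would multiply the DIF signal; a ratio already below
the threshold leaves the gain at unity because of the 0 dB limit.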
[0048] Even in the absence of multiple stages of spatial
preprocessing the target (L-R)/(L+R) ratio can be set low to allow,
for example, an increased intelligibility of program dialogue.
[0049] Another method and system for double processing protection
is to "predict" the preprocessing performed on the L-R signal and
compensate for the preprocessing from the prediction. For example,
if SRS Tru-Surround is known to be used on L-R, then the signal can
correspondingly be compensated to remove the L-R enhancement.
Alternatively, the signal energy can be monitored over time to
deduce the pre-processing performed on the L-R signal. From this
deduction, the L-R signal can be compensated to remove any such L-R
enhancements. Preprocessing could change the frequency response of
the difference (and sum for that matter) channel as well as the
(L-R)/(L+R) ratio. The inverse filter of the preprocessor could be
applied to each path while the existing (L-R)/(L+R) ratio adjustment
still remains in use.
[0050] Further, while the DPP system of FIG. 8 is shown as a feed
forward system wherein the DIF signal is sensed prior to the
variable gain control amplifier 848, a feedback system, wherein the
sum and difference signal levels are detected after the variable
gain control amplifier, is also possible.
[0051] Combining DVC and DPP
[0052] Since each of the DVC and MPP provides an improved listening
experience, the two can be combined to obtain the advantages of
both. There are a number of ways of combining DVC and DPP blocks.
One example of a useful topology places the DPP block 902 first,
followed by a DVC block 904 in a cascaded design, as shown in FIG.
9. In this embodiment, the L and R signals are applied to the
inputs 906 and 908 of the DPP block 902. The L' and R' outputs of
the DPP block 902 at outputs 910 and 912 are applied to the two
inputs 914 and 916 of the DVC block 904. The outputs 918 and 920 of
the DVC block 904 provide the respective output signals Lo and Ro. The
cascaded design allows the DPP block to remove the difference (L-R)
signal enhancement first, then maintain the perceived constant
level of the stereo audio program with the DVC block without
ambient energy being present.
[0053] Another example of a topology places the DPP block 1004 in a
feedback path of the DVC block 1002, as shown in FIG. 10. The L and
R inputs are applied to the inputs 1006 and 1008, respectively. The
two signals are applied to matrices (represented by signal summers
1010 and 1012) so as to produce the SUM (L+R) signal and the DIF
(L-R) signal. The outputs 1014 and 1016 of the DVC block 1002
provide the output signals Lo and Ro. The two outputs 1014 and
1016 provide the two feedback signals of the feedback path.
Specifically, the Lo and Ro signals are applied to matrices shown
as signal summers 1018 and 1020 so that the Lo+Ro forms one
input of the DPP block 1004, and the Lo-Ro forms the other input of
the DPP block 1004. The output of the DPP block 1004 represents the
corrected gain, which is then applied to the DIF signal by signal
multiplier 1022. The latter can be in the form of a variable gain
control amplifier. It should be appreciated that while two
embodiments of the combined DVC and DPP blocks are illustrated in
FIGS. 9 and 10, other combinations are possible.
[0054] Accordingly, embodiments of the present disclosure can
provide for improved performance of audio signal reproduction which
reduces the effects of undesirable volume changes in audio
programming.
[0055] The components, steps, features, benefits and advantages
that have been discussed are merely illustrative. None of them, nor
the discussions relating to them, are intended to limit the scope
of protection in any way. Numerous other embodiments are also
contemplated. Additionally, embodiments of the present disclosure
can have fewer, additional, and/or different components, steps,
features, benefits and advantages than as expressly described
herein. These also include embodiments in which the components
and/or steps are arranged and/or ordered differently.
[0056] Unless otherwise stated, all measurements, values, ratings,
positions, magnitudes, sizes, and other specifications that are set
forth in this specification, including in the claims that follow,
are approximate, not exact. They are intended to have a reasonable
range that is consistent with the functions to which they relate
and with what is customary in the art to which they pertain.
[0057] All articles, patents, patent applications, and other
publications which have been cited in this disclosure are hereby
incorporated herein by reference.
[0058] The phrase "means for" if and when used in a claim is
intended to and should be interpreted to embrace the corresponding
structures and materials that have been described and their
equivalents. Similarly, the phrase "step for" if and when used in a
claim embraces the corresponding acts that have been described and
their equivalents. The absence of these phrases means that the
claim is not intended to and should not be interpreted to be
limited to any of the corresponding structures, materials, or acts
or to their equivalents.
[0059] Nothing that has been stated or illustrated is intended or
should be interpreted to cause a dedication of any component, step,
feature, object, benefit, advantage, or equivalent to the public,
regardless of whether it is recited in the claims.
[0060] The scope of protection is limited solely by the claims that
now follow. That scope is intended and should be interpreted to be
as broad as is consistent with the ordinary meaning of the language
that is used in the claims when interpreted in light of this
specification and the prosecution history that follows and to
encompass all structural and functional equivalents.
* * * * *