Ambient Noise Suppressor Patent Grant Mitchell, et al. February 22, 1972 [Bell Telephone Laboratories, Incorporated]

U.S. patent number 3,644,674 [Application Number 04/837,699] was granted by the patent office on 1972-02-22 for ambient noise suppressor. This patent grant is currently assigned to Bell Telephone Laboratories, Incorporated. Invention is credited to Olga M. M. Mitchell, Carolyn Ross, Robert L. Wallace, Jr..

United States Patent	3,644,674
Mitchell, et al.	February 22, 1972


AMBIENT NOISE SUPPRESSOR

Abstract

Signals from a desired source, such as a person speaking, are enhanced relative to unwanted ambient sound by a speech processor that includes an array of microphones arranged at equal distances from the desired source. The unwanted sound, being "off-center," arrives nonconcurrently at the individual microphones. The processor continuously arranges the instantaneous microphone outputs in order of their relative energy contained, and selects as its output some one of the microphone outputs that is intermediate in the instantaneous ranking.

Inventors:	Mitchell; Olga M. M. (Summit, NJ), Ross; Carolyn (Berkeley Heights, NJ), Wallace, Jr.; Robert L. (Warren Township, Somerset County, NJ)
Assignee:	Bell Telephone Laboratories, Incorporated (Murray Hill, NJ)
Family ID:	25275171
Appl. No.:	04/837,699
Filed:	June 30, 1969

Current U.S. Class:	379/392; 379/416; 381/92
Current CPC Class:	H04R 3/005 (20130101); H04M 9/001 (20130101); G10K 2210/1082 (20130101); G10K 2210/3045 (20130101); G10K 2210/111 (20130101); G10K 2210/108 (20130101); G10K 2210/3012 (20130101)
Current International Class:	G10K 11/178 (20060101); H04M 9/00 (20060101); G10K 11/00 (20060101); H04R 3/00 (20060101); H04b 015/00 ()
Field of Search:	;179/1.8,1P ;325/476,475,474,473,472,304 ;324/77A,77E ;328/115-117

References Cited

U.S. Patent Documents


3044062	July 1962	Katzin
3057960	October 1962	Kaiser
3109066	October 1963	David

Primary Examiner: Claffy; Kathleen H.
Assistant Examiner: Leaheey; Jon Bradford

Claims

What is claimed is:

1. A circuit for enhancing a desired signal in the presence of an undesired signal comprising:

first and second channels mutually arranged to receive simultaneously said desired signal and to receive an undesired signal nonconcurrently;

first means for deriving a signal representing the value of the amplitude difference between the signals respectively present in said first and second channels, thus canceling in the derived signal the components of said desired signal while leaving substantially intact the components of said undesired signal;

second means for full-wave rectifying the components of said undesired signal; and

third means for adding the output of said second means to the signals respectively present in said first and second channels, the output of said third means consisting of the desired signal substantially undistorted and the undesired signal half-wave rectified.

2. A signal processor for enhancing a desired signal in the presence of an undesired signal comprising:

first and second circuits each constructed in accordance with the circuit in claim 1 and each having an output from said third means in which the desired signal components are substantially equal in amplitude;

means for deriving a signal representing the value of the amplitude difference between the two said third means outputs, thus canceling in the just-derived signal the components of desired signal while leaving substantially intact the components of half-wave-rectified undesired signal;

means for full-wave rectifying the half-wave-rectified components of said undesired signal but in a sense opposite to that employed with said second means; and

means for adding the output of said last-named full-wave rectifying means to each of the third means outputs, the resulting sum consisting of the desired signal substantially undistorted and the undesired signal substantially eliminated.

3. The circuit in accordance with claim 1, wherein said undesired signal is impulsive in nature.

4. A signal processor in accordance with claim 2, wherein said undesired signal is impulse-type noise.

5. In a distant-talking two-way telephone system, apparatus for increasing the signal-to-noise ratio comprising:

a first and a second microphone each separated from the same noise source by unequal distances;

first means for placing in phase desired signals received from an information source by each said microphone;

means for deriving a signal representative of the value of the amplitude difference between the output signals of the respective said microphones, the components of said desired signals combining to substantially cancel each other, while the noise components remain substantially intact;

means for full-wave rectifying the noise components; and

means for additionally combining said derived signal with the signals received by said first and said second microphones, to produce an output consisting of the desired signal substantially undistorted and the noise signal half-wave rectified.

6. A system for processing speech, comprising:

first and second circuits each constructed in accordance with the apparatus described in claim 5, each circuit having an output consisting of the said desired signal substantially undistorted and the noise signal half-wave rectified;

means for deriving a signal representing the value of the amplitude difference between said outputs of said first and second circuits, thus canceling in the just-derived signal the components of said desired signal while leaving substantially intact the noise components;

means for full-wave rectifying the noise components but in a sense opposite to the rectifying means of claim 5; and

means for adding the resulting full-wave-rectified signal to said outputs of said first and second circuits, the sum consisting of the undistorted desired signal and the noise components substantially eliminated.

7. A signal processor for discriminating against the instantaneously strongest of several unequal input signals in a microphone array comprising:

a pair of first stages, each first stage connected to a pair of microphonic inputs;

means for selecting as the instantaneous output of each said first stage the greater of the stage's two instantaneous inputs;

a second stage including means connecting the first stage's instantaneous outputs thereinto; and

means for selecting as the instantaneous output of said second stage the lesser of its two said inputs, whereby the processor output is never the strongest instantaneous input to said microphone array.

Description

FIELD OF THE INVENTION

This invention pertains broadly to the field of signal processing and in particular relates to signal discrimination techniques. As a principal object, the invention seeks to improve the intelligibility, at the receiving end of an electrical communications path, of a desired signal which was acoustically originated in the presence of noise.

BACKGROUND OF THE INVENTION

In telephony and elsewhere, numerous situations arise in which desired acoustical signals require enhancement relative to some unwanted signals. The desired speech signals in "hands-free" telephony for example, are often generated in an environment that includes other speech as well as typewriter clatter, chair scraping and many other background noises impulsive in character. The randomness of the noise source in relation to the relatively stationary talker and microphone locations complicates the problem's general solution, as does the reverberance of most conference rooms and offices.

The present invention in one aspect is a scheme for canceling impulse-type noise or, if the noise is of a continuous nature such as speech, for rendering it relatively unintelligible. In the latter respect, the invention draws uniquely upon the capability of the human ear to disregard unintelligible signals.

SUMMARY OF THE INVENTION

In its broadest aspect, the invention in essence is that the instantaneous outputs of a plurality of arrayed microphones, for example four, are first correlated with respect to the signal from a desired source and then arranged or ranked by a processor in an algebraically ascending order of amplitude value. As its instantaneous output, the processor, in accordance with one aspect of the invention, selects some one of the inputs or some averaged combination of inputs which at that instant occupies a desired location in the referred-to ranking.

In one embodiment, the processor is arranged to select an output which in the ranking is intermediate in amplitude value; but the processor never selects either the algebraically greatest or smallest output. The processor so programmed discriminates against a signal which is stronger at one input than at the others.

In a specific case, the desired signal source may be a person speaking from a location equidistant from each of the four microphones, or "on-center." An offcenter impulse signal such as the striking of a typewriter key, if of duration such that the signal impinges sequentially on successive microphones without overlap, will appear nonconcurrently at each microphone as a peak. Since the signal from the desired source is the same at all of the microphones, the microphone that at any instant contains a noise component will be either algebraically the largest or the smallest signal. But because the processor never selects as its output the algebraically greatest input from the microphones, the output of each successive microphone on which the impulse noise instantaneously impinges is never selected as the output of the processor.

The invention, its further objects, features and advantages are further delineated in the more detailed descriptions which follow.

THE DRAWING

FIG. 1 is a waveform diagram of the first stage process;

FIG. 2 is a functional block diagram of the first stage process;

FIG. 3 is a schematic diagram of one microphone arrangement practicing the invention;

FIG. 4 is a functional block diagram depicting the two-stage process;

FIG. 5 is a waveform diagram of the two-stage process;

FIG. 6 is a functional block diagram depicting an equivalent two-stage process; and

FIG. 7 is a diagram of one circuit for achieving the process of FIG. 6.

DETAILED DESCRIPTION OF AN ILLUSTRATIVE EMBODIMENT

A principal aspect of the inventive signal enhancement process is illustrated in the waveform diagram of FIG. 1, taken in conjunction with the process block diagram of FIG. 2. The environment depicted in FIG. 3 exemplifies the condition of a speech source or talker 1 speaking in the presence of a noise source 2 which, for example, may be the impulselike clacking of a typewriter. The talker 1 is located equidistant from each of a plurality of microphones $M_1$, $M_2$, $M_3$, $M_4$. His speech signals, denoted S, therefore reach each microphone simultaneously. For unequal microphone distances, an appropriate delay function such as delay 3 in FIG. 2 is employed to place the desired signals S in phase at all microphones. The undesired signal is denoted $N_1$, $N_2$, $N_3$, or $N_4$, in accordance with which microphone it impinges on; its source is located "off-center," that is, at unequal distances from the microphones.

Consider first the processing (in what is hereinafter called the first stage) by the microphone pair $M_1$, $M_2$ of a simultaneous burst from speech source 1 and noise source 2, each burst consisting of one cycle. As depicted in FIG. 1, and assuming equal amplitudes, the signal S reaches microphones $M_1$ and $M_2$ in phase or simultaneously. In the instance of this illustration, the noise signal $N_1$ and the noise signal $N_2$ thereafter reach the respective microphones at different times.

In the first stage, in accordance with this aspect of the invention, the absolute value of the difference between the $M_1$ and $M_2$ microphone outputs is added to the sum of the outputs of microphones $M_1$ and $M_2$. This process is accomplished by first subtracting the signals $S+N_1$ and $S+N_2$ in difference amplifier 4, which cancels the desired signal S altogether. The remaining signal, $N_1 - N_2$, is full-wave rectified in rectifier 5, producing the absolute value of the difference signal, or $|N_1 - N_2|$. The latter then is added in adder 6 to the microphone output signals $S+N_1$ and $S+N_2$ and attenuated by a factor of two, in attenuator 10a, yielding the first stage output signal:

$$E_A = S + \tfrac{1}{2}\left(N_1 + N_2 + |N_1 - N_2|\right) \tag{1}$$

For the special case in which signals $N_1$ and $N_2$ do not overlap in time, the output of the first stage is $S + N_1^+ + N_2^+$ (where $N_1^+$ and $N_2^+$ are the positive-going portions of $N_1$ and $N_2$). This term is seen to consist of the undistorted desired signal and the unwanted signal half-wave rectified.
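The nonoverlapping special case can be checked numerically. The following is a minimal sketch, not the patent's circuit: the function names and the sample values are hypothetical, chosen only so that at most one of the two noise terms is nonzero at each instant.

```python
def first_stage(e1, e2):
    """Equation 1: half of (sum of both channels + rectified difference)."""
    return 0.5 * (e1 + e2 + abs(e1 - e2))


def positive_part(x):
    """Positive-going portion of a sample (half-wave rectification)."""
    return x if x > 0 else 0.0


# Hypothetical instantaneous samples (S, N1, N2). S is common to both
# channels; N1 and N2 never overlap (at most one is nonzero per sample).
samples = [
    (0.3, 0.5, 0.0),
    (0.3, -0.5, 0.0),
    (-0.2, 0.0, 0.4),
    (-0.2, 0.0, -0.4),
    (0.1, 0.0, 0.0),
]

for s, n1, n2 in samples:
    out = first_stage(s + n1, s + n2)
    expected = s + positive_part(n1) + positive_part(n2)
    # The stage output matches S plus the half-wave-rectified noise.
    assert abs(out - expected) < 1e-12
    print(f"S={s:+.1f} N1={n1:+.1f} N2={n2:+.1f} -> E_A={out:+.2f}")
```

Negative-going noise samples drop out entirely, while positive-going ones pass through added to S, which is the half-wave rectification described above.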

So long as the talker 1 remains substantially on-center, the first stage output consists of undistorted desired speech and rectified noise. If the offcenter signal is speech, the result of rectification is to distort it greatly and thus render it unintelligible. Rectified speech contains many more high harmonics than unrectified speech; and accordingly use of a low-pass filter (not shown) on the output signal of the first stage can improve the signal-to-noise ratio.

The two-stage process employing the process steps described above is, pursuant to the invention, used with four microphone outputs to both attenuate and distort an unwanted signal while reproducing the wanted speech signal undistorted. FIG. 3 shows the third and fourth microphones $M_3$ and $M_4$, the outputs of which are processed exactly as in the first stage fashion described with respect to the outputs of microphones $M_1$ and $M_2$. FIG. 4 depicts the block diagram of the two-stage process including first-stage processor A and first-stage processor B. Delay functions 3a, 3b and 3c are used. FIG. 5 shows exemplary waveforms. The second-stage process is identical in steps to the first stages except that the sign of the full-wave rectifier is opposite to that of the first stages.

Accordingly, the output $S+N_3$ of microphone $M_3$ and the output $S+N_4$ of microphone $M_4$ are processed in first-stage processor B, whose output is expressed:

$$E_B = S + \tfrac{1}{2}\left(N_3 + N_4 + |N_3 - N_4|\right) \tag{2}$$

If signals $N_3$ and $N_4$ do not overlap in time, $E_B = S + N_3^+ + N_4^+$. This term is subtracted in difference amplifier 7 from the signal $S + N_1^+ + N_2^+$, which is the output of first-stage processor A. This operation cancels the wanted signal S. The resulting signal then is full-wave rectified in rectifier 8, which yields the signal $-|N_1^+ + N_2^+ - N_3^+ - N_4^+|$. This latter term is the absolute value of the difference between the outputs of the two first stages, taken with a negative sign. To this term is added, in adder 9, the aforementioned respective two outputs of the first-stage processes A and B. This sum then is attenuated by a factor of two in attenuator 10b, and the result is passed to the output of the second stage.

It is seen in FIG. 5 that as a result of the foregoing processing, cancellation of the unwanted noise signal N is achieved through cancellation of its manifestations $N_1$, $N_2$, $N_3$, and $N_4$. The cancellation is complete if, as in the example, the unwanted signals from the first two stages do not overlap in time. However, the unwanted signals can overlap the desired signal since the initial subtractive operation of each stage results in elimination of the wanted signal in any case.
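The complete cancellation in the nonoverlapping case can be illustrated with a small simulation. This is a sketch under stated assumptions: the one-cycle sine bursts, their start times, the 0.7 Hz "speech" tone and all function names are hypothetical stand-ins, not waveforms taken from the figures.

```python
import math


def first_stage(e1, e2):
    """First-stage process: positive-sense rectifier."""
    return 0.5 * (e1 + e2 + abs(e1 - e2))


def second_stage(ea, eb):
    """Second-stage process: rectifier of opposite sense."""
    return 0.5 * (ea + eb - abs(ea - eb))


def two_stage(e1, e2, e3, e4):
    return second_stage(first_stage(e1, e2), first_stage(e3, e4))


def burst(t, start, length=1.0):
    """One cycle of a unit sine beginning at `start`, zero elsewhere."""
    if start <= t < start + length:
        return math.sin(2 * math.pi * (t - start) / length)
    return 0.0


def max_output_error():
    """Largest deviation of the processor output from the wanted signal S."""
    worst = 0.0
    for k in range(600):
        t = k * 0.01
        s = 0.5 * math.sin(2 * math.pi * 0.7 * t)  # wanted signal, in phase
        # The same noise burst reaches the four microphones at different,
        # nonoverlapping times (assumed arrival times).
        noise = [burst(t, start) for start in (0.5, 1.75, 3.0, 4.25)]
        out = two_stage(*(s + n for n in noise))
        worst = max(worst, abs(out - s))
    return worst


print(max_output_error())
```

The printed deviation stays at the level of floating-point rounding, even though each noise burst overlaps the wanted signal in time.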

The two-stage process just described was found to produce undistorted speech output when the talker was on-center. For a talker located off-center, his processed output was distorted as well as attenuated, and because of the distortion the subjective suppression of the unwanted off-center source was even greater than that due to the amplitude attenuation alone.

A test was conducted to compare the suppression observed for speech with that for noise. A static source was substituted for the noise source N in FIG. 3. The static consisted of 0.5 ms pulses of negative polarity occurring randomly in time, with a repetition frequency of approximately $50\ \mathrm{s}^{-1}$. Under these conditions, the probability of overlap is only about 2 percent. The peak amplitude used was about the same as that present in the speech used. Since the signal was of only one polarity, only one stage of processing, such as first stage A, was required. The static source was connected to a conventional acoustic delay line, and output taps at 1 and 4 ms were connected to the two inputs of first stage A. For negligible overlap, the static was entirely eliminated at the output of the processor.
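A rough software analogue of this test is sketched below. It is only an illustration: the 0.1 ms sampling step, the random seed, the unit pulse height and the helper names are assumptions, and the wanted signal is omitted so that only the static reaches the stage.

```python
import random


def first_stage(e1, e2):
    return 0.5 * (e1 + e2 + abs(e1 - e2))


def make_static(n_samples, step_ms, rate_per_s, pulse_ms, rng):
    """Train of unit negative pulses with random start times."""
    x = [0.0] * n_samples
    pulse_len = int(round(pulse_ms / step_ms))
    p_start = rate_per_s * step_ms / 1000.0  # start probability per sample
    for i in range(n_samples):
        if rng.random() < p_start:
            for j in range(i, min(i + pulse_len, n_samples)):
                x[j] = -1.0
    return x


def delayed(x, delay_samples):
    """Delay-line tap: the same signal shifted by `delay_samples`."""
    return [0.0] * delay_samples + x[:len(x) - delay_samples]


rng = random.Random(0)            # fixed seed, arbitrary choice
step_ms = 0.1                     # assumed sampling step
static = make_static(100_000, step_ms, 50, 0.5, rng)  # 10 s of static
tap1 = delayed(static, int(round(1 / step_ms)))       # 1 ms tap
tap4 = delayed(static, int(round(4 / step_ms)))       # 4 ms tap
out = [first_stage(a, b) for a, b in zip(tap1, tap4)]

# Static survives only at instants where both taps carry a pulse.
residual = sum(1 for y in out if y != 0.0)
both = sum(1 for a, b in zip(tap1, tap4) if a != 0.0 and b != 0.0)
print(f"residual samples: {residual} of {len(out)}")
```

In this model the output is nonzero only at samples where pulses on the two taps coincide, which is the overlap condition the text refers to.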

The large suppression observed for an unwanted speech signal implies that overlap in time of the unwanted speech manifestations $N_1$, $N_2$, $N_3$, $N_4$ is not a problem. Consider again the output of the first stage of processing as in first stage A: $E_A = S + \tfrac{1}{2}(N_1 + N_2 + |N_1 - N_2|)$. The contribution to the output from the unwanted source is $N = \tfrac{1}{2}(N_1 + N_2 + |N_1 - N_2|)$. For nonoverlapping signals, N is positive. For overlapping signals, there are four cases: $N_1$ and $N_2$ both positive; $N_1$ positive and $N_2$ negative; $N_1$ negative and $N_2$ positive; and $N_1$ and $N_2$ both negative. Inspection of the expression for N shows that N must be positive if $N_1$ and $N_2$ are both positive, and negative if $N_1$ and $N_2$ are both negative. However, in the two cases where $N_1$ and $N_2$ are of opposite sign, $|N_1 - N_2| = |N_1| + |N_2| > |N_1 + N_2|$, so the rectified difference outweighs the sum whatever the sign of the sum. Hence N is positive in these two cases. Thus even if the unwanted signal overlaps at the two inputs, N has the same sign at the output as in the nonoverlapping case about 75 percent of the time. This result, along with the fact that speech is nonoverlapping to a large extent at the processor inputs, particularly for voiced sounds, is believed to account for the large observed suppression of the unwanted signal.
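The sign argument can be tallied by brute force. The sketch below assumes, as the 75 percent figure implicitly does, that the four sign combinations are equally likely; the symmetric grid of sample values is a hypothetical choice made only to enforce that assumption.

```python
from itertools import product


def noise_term(n1, n2):
    """Contribution of the unwanted source to the first-stage output."""
    return 0.5 * (n1 + n2 + abs(n1 - n2))


# Symmetric grid of nonzero values, so each sign combination is equally common.
values = [v / 4 for v in range(-8, 9) if v != 0]
pairs = list(product(values, repeat=2))

positive = sum(1 for n1, n2 in pairs if noise_term(n1, n2) > 0)
fraction = positive / len(pairs)
print(fraction)  # 0.75
```

Only the quadrant in which both samples are negative yields a negative contribution, which accounts for the remaining quarter of the cases.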

The foregoing discussion has assumed that the wanted signal S was of equal amplitude at all of the inputs of the signal processor, so that complete cancellation of the wanted signal would result from the first subtraction in each stage. If the signals are not of the same amplitude, part of the signal will be distorted by the subsequent full-wave rectification, and this distortion will be added to the output signal. This problem can be overcome by adjustment of the gains in the four channels to make the signal amplitudes the same.

The output of the processor of FIG. 4 seems at first to be a complicated function of all four of the inputs $S+N_1$, $S+N_2$, $S+N_3$, and $S+N_4$. It has been realized, however, that if only the instantaneous response of the FIG. 4 two-stage processor to instantaneous inputs is considered, then the instantaneous processor output is always exactly equal to some one of the four inputs. The two-stage processor of FIG. 4, in effect, looks at only one input at a time and ignores all the others.

This result can be seen by inspecting the output of any one of the stages in the process. The inputs to the two-stage processor in terms of the signals received by microphones $M_1$, $M_2$, $M_3$, $M_4$ can be expressed:

$$S + N_1 = E_1 \tag{3}$$

$$S + N_2 = E_2 \tag{4}$$

$$S + N_3 = E_3 \tag{5}$$

$$S + N_4 = E_4 \tag{6}$$

The inputs to the first stage A are $E_1$ and $E_2$. The output of this stage then is:

$$E_A = \tfrac{1}{2}\left(E_1 + E_2 + |E_1 - E_2|\right) \tag{7}$$

It can be easily seen that $E_A$ is equal to the greater of its two inputs $E_1$ and $E_2$. Similarly $E_B$, the output of the first stage B, is equal to the greater of its two inputs $E_3$ and $E_4$.

The inputs to the second stage are $E_A$ and $E_B$ and the output is:

$$E = \tfrac{1}{2}\left(E_A + E_B - |E_A - E_B|\right) \tag{8}$$

It is evident that E, the output of the second stage, is equal to the lesser of its two inputs $E_A$ and $E_B$.

Each of the stages, then, selects as output either the maximum or the minimum of its two inputs depending on whether the sign of the rectifier used is positive or negative.
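The maximum and minimum readings of Equations 7 and 8 follow from a short case split on the sign of the difference. The derivation below is a sketch for two arbitrary real inputs; the symbols a and b are introduced here only for illustration.

```latex
\begin{align*}
a \ge b:&\quad \tfrac{1}{2}\bigl(a + b + \lvert a - b\rvert\bigr)
          = \tfrac{1}{2}(a + b + a - b) = a = \max(a, b),\\
a < b:&\quad \tfrac{1}{2}\bigl(a + b + \lvert a - b\rvert\bigr)
          = \tfrac{1}{2}(a + b + b - a) = b = \max(a, b),\\
a \ge b:&\quad \tfrac{1}{2}\bigl(a + b - \lvert a - b\rvert\bigr)
          = \tfrac{1}{2}(a + b - a + b) = b = \min(a, b),\\
a < b:&\quad \tfrac{1}{2}\bigl(a + b - \lvert a - b\rvert\bigr)
          = \tfrac{1}{2}(a + b - b + a) = a = \min(a, b).
\end{align*}
```

The positive rectifier sense therefore gives a maximum selector and the negative sense a minimum selector, for any pair of instantaneous inputs.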

A study of Equations 7 and 8 will reveal that the input which the processor selects as output at any particular time is the algebraically lesser of two quantities, $E_A$ and $E_B$, where $E_A$ is the greater of the inputs $E_1$ and $E_2$ to first-stage processor A and $E_B$ is the greater of the inputs $E_3$ and $E_4$ to first-stage processor B.

Assuming that no two inputs are equal, if the two greatest inputs occur in the same first stage processor, then the input selected is the third greatest. Otherwise the selected input is the second greatest. Significantly, neither the greatest nor the smallest input is ever chosen. If the inputs are uncorrelated gaussian signals of equal standard deviation, the two-stage processor selects the greater of the "middle two" inputs two-thirds of the time and the lesser of the two, one-third of the time.
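The ranking behaviour can be confirmed by enumerating every ordering of four unequal inputs. This sketch assumes all 24 orderings are equally likely, which holds for independent, identically distributed continuous inputs; the function name and the use of the integers 1 to 4 as stand-in amplitudes are illustrative choices.

```python
from collections import Counter
from itertools import permutations


def two_stage(e1, e2, e3, e4):
    """Lesser of (greater of E1, E2) and (greater of E3, E4)."""
    return min(max(e1, e2), max(e3, e4))


counts = Counter()
for order in permutations([1, 2, 3, 4]):
    selected = two_stage(*order)
    rank_from_top = 5 - selected  # 1 = greatest input, 4 = smallest
    counts[rank_from_top] += 1

print(dict(counts))  # {2: 16, 3: 8}
```

The second greatest input is chosen in 16 of the 24 orderings and the third greatest in the remaining 8, matching the two-thirds and one-third split, and neither extreme is ever selected.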

The processor thus is effective in discriminating against a signal which at the instant is much stronger at one microphone than at the others. If the signal from a noise source is "off-center" as the noise source 2 in FIG. 3, and is of sufficient magnitude when it reaches a given one of the microphones to cause that microphone's output to be instantaneously the greatest or least (algebraically) of the four outputs, that microphone output is not chosen. Rather, one of the "middle two" is chosen.

The FIG. 4 block diagram of the two-stage processor is functionally equivalent to the simplified block diagram of FIG. 6 where $E_1$, $E_2$ constitute one input pair and $E_3$, $E_4$ constitute the second. The first and second steps are replaced by maximum and minimum operations respectively. Again, the output E is given by Equations 7 and 8 and is equal to the lesser of two quantities, $E_A$ and $E_B$, where $E_A$ is the greater of $E_1$ and $E_2$; and $E_B$ is the greater of $E_3$ and $E_4$.

A circuit that will quite simply perform the desired two-stage process is schematically illustrated in FIG. 7, which functionally follows the FIG. 6 diagram. The circuit is basically three sets of diode OR gates. Diodes $D_1$ and $D_2$ are connected respectively to the inputs $E_1$ and $E_2$. Diodes $D_3$ and $D_4$ are connected respectively to the inputs $E_3$ and $E_4$. Forward current bias $I_1$ is applied to the diodes $D_1$, $D_2$ and $D_3$, $D_4$. The two diode gates are arranged conventionally to pass only the greater of their respective inputs. The latter, termed again $E_A$ and $E_B$, are respective inputs to the third gate consisting of diodes $D_A$ and $D_B$, which are conventionally biased by current sources $I_1$ and $I_2$ to pass the lesser of $E_A$ and $E_B$. The biasing current $I_1$ is approximately $2I_2$. The instantaneous output E of the FIG. 7 processor is identical to the output of the FIG. 4 processor.

To recapitulate, the four-input, two-stage processor thus far described completely eliminates impulsive noise provided that the length of the impulse is short enough and that the location of the noise source relative to the microphone array is such that only one microphone at a time is excited. When the interfering noise is speech from an offcenter location, the process both distorts and attenuates it, thus greatly reducing its intelligibility. The processing is also effective in discriminating against a source of sound which is much closer to one of the four microphones than it is to the others.

It has, however, been further realized that the described processor is actually one member of a general class of processors. Each class member performs the ranking of the instantaneous (algebraic) values of the microphone signals in ascending order. However, the rule for selecting as the processor output some one or some combination of the instantaneous inputs, differs from species to species.

For example, in the processing scheme described by Equation 7, when the "on-center" talker is silent, the processor still will select either the second or third greatest of the four inputs. During these periods it may be desirable for the processor to select the microphone output that is closest to zero. The undesired noise or talkers then would be at a minimum during pauses of the wanted talker. This process has been found to distort very severely any offcenter sound source and render offcenter speech entirely unintelligible.
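The closest-to-zero selection rule can be written in one line. The sketch below is only illustrative; the function name and the sample values are hypothetical.

```python
def closest_to_zero(inputs):
    """Select the instantaneous input with the smallest magnitude."""
    return min(inputs, key=abs)


# Hypothetical microphone samples taken while the on-center talker is silent.
sample = [0.02, -0.6, 0.05, -0.01]
print(closest_to_zero(sample))  # -0.01
```

Because the rule keeps the sign of the chosen input, it still returns one of the actual microphone outputs rather than a rectified value.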

A further processor is envisioned which, instead of sometimes selecting the second largest input and otherwise selecting the third largest, always selects one or the other of these inputs.

A still further case is the processor which computes the average of the "middle two" of four unequal inputs. Advantageously the median of five inputs can be selected and a further improvement is realized by averaging the "middle three" of five inputs.

In terms of eliminating impulse-type noise, an advantage is gained by using the median of five inputs, since then an arbitrary amount of noise can be added at any two microphones with no effect on the output. If the median of seven microphone outputs is used, noise at any three of the microphones is eliminated.
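The median and middle-average variants can be sketched as follows. The function names, the common signal value and the noise amplitudes are hypothetical; the example only shows that large noise on two of five (or three of seven) inputs leaves the selected value unchanged.

```python
import statistics


def median_select(inputs):
    """Median of an odd number of instantaneous inputs."""
    return statistics.median(inputs)


def middle_average(inputs, keep):
    """Average of the `keep` central values after ranking the inputs."""
    ranked = sorted(inputs)
    drop = (len(ranked) - keep) // 2
    middle = ranked[drop:drop + keep]
    return sum(middle) / keep


s = 0.25  # wanted signal sample common to every microphone (assumed value)

five = [s + 9.0, s, s, s - 4.0, s]                # noise at two of five
seven = [s + 9.0, s - 4.0, s + 2.5, s, s, s, s]   # noise at three of seven

assert median_select(five) == s
assert median_select(seven) == s
assert middle_average(five, 3) == s               # "middle three" of five
assert middle_average([s + 9.0, s, s, s - 4.0], 2) == s  # "middle two" of four
```

Averaging the central values trades a little of this robustness for smoothing among the remaining inputs, since each kept input contributes to the output.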

The processor of FIG. 4 can eliminate impulsive offcenter noise provided the impulse does not embrace more than one-third of the microphone array at any instant; and provided further that the duty cycle of the impulsive noise is not more than one-fourth. Processing of the median output of, for example, five microphones allows a longer impulse and a longer duty cycle. It is obvious that the inventive signal-enhancing techniques described are applicable in any environment where a desired signal is sought to be received in the presence of an undesired signal or of noise. Such fields of application include underwater sound detection, mobile radio telephony, medicoacoustics and others.

The spirit of the invention is embraced in the scope of the claims to follow.

* * * * *