U.S. patent number 3,743,787 [Application Number 05/068,237] was granted by the patent office on 1973-07-03 for speech signal transmission systems utilizing a non-linear circuit in the base band channel.
Invention is credited to Hiroya Fujisaki, Nobuyuki Goto, Masahiro Iwasaki, Shigeo Nagashima.
United States Patent |
3,743,787 |
Fujisaki , et al. |
July 3, 1973 |
**Please see images for:
( Certificate of Correction ) ** |
SPEECH SIGNAL TRANSMISSION SYSTEMS UTILIZING A NON-LINEAR CIRCUIT
IN THE BASE BAND CHANNEL
Abstract
In a speech signal transmission system there are provided means
including a non-linear circuit and a bandpass filter to produce
from a speech signal a base band signal representing the spectral
fine structures of the speech signal and to transmit the base band
signal to the receiving side, a vocoder channel analyzer to convert
the speech signal into a second signal representing the spectral
envelope of the original speech signal, a second non-linear circuit
on the receiving side to convert the base band signal into an
exciting signal and a vocoder synthesizer which acts to synthesize
the original speech signal from the exciting signal and the second
signal.
Inventors: |
Fujisaki; Hiroya (Shibuya-ku,
Tokyo, JA), Nagashima; Shigeo (Nagano-ku, Tokyo,
JA), Iwasaki; Masahiro (Tsurumi-ku, Yokohama,
JA), Goto; Nobuyuki (Kohoku-ku, Yokohama,
JA) |
Family
ID: |
26410394 |
Appl.
No.: |
05/068,237 |
Filed: |
August 31, 1970 |
Foreign Application Priority Data
Sep 2, 1969 [JA] | 44/69196 |
Sep 2, 1969 [JA] | 44/69197 |
Current U.S.
Class: |
704/207 |
Current CPC
Class: |
G10L
19/02 (20130101) |
Current International
Class: |
G10L
19/00 (20060101); G10L 19/02 (20060101); G10l
001/00 () |
Field of
Search: |
;179/15.55R,1SA |
Primary Examiner: Claffy; Kathleen H.
Assistant Examiner: Leaheey; Jon Bradford
Description
BACKGROUND OF THE INVENTION
This invention relates to a speech signal transmission system and
more particularly to a speech signal transmission system employing
a novel speech band compression system.
Band compression of speech is important for communication systems
utilizing expensive transmission circuits, such as communication
systems utilizing satellites or submarine cables. A channel vocoder
is a typical speech band compression system. According to this
system, the frequency spectrum of the speech signal is analyzed on
the transmission side into signals representing its spectral
envelope and its spectral fine structures. The spectral envelope is
detected by more than 10 bandpass filters, rectifiers and low pass
filters, while the spectral fine structures are detected by
determining whether the sound is voiced or unvoiced and, in the
case of a voiced sound, by extracting the pitch frequency. The
information regarding the spectral envelope and the spectral fine
structures is sent from the transmission side as a plurality of
band compressed signals. On the receiving side a frequency
spectrum approximating that of the original speech is reproduced
from these signals. While this system provides a band compression
ratio of more than ten to one, it is difficult to detect the
spectral fine structures of the speech on the transmission side,
so that the reproduced speech lacks articulation and
naturalness.
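The envelope detection described above (one bandpass filter,
rectifier and low pass filter per channel) can be sketched as
follows. This is only a minimal illustration under stated
assumptions, not the circuit of the patent: the sampling rate, the
12 equal-width bands, the brick-wall FFT filter, the moving-average
smoother, the test signal and all function names are hypothetical
choices made for brevity.

```python
import numpy as np

def bandpass_fft(x, fs, lo, hi):
    """Ideal (brick-wall) bandpass filter applied in the frequency domain."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < lo) | (freqs >= hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def moving_average(x, width):
    """Crude low pass filter: moving average over `width` samples."""
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="same")

def channel_envelopes(x, fs, edges, smooth_ms=20.0):
    """One slowly varying envelope per band.

    Each channel is: bandpass filter -> full-wave rectifier -> low pass filter.
    """
    width = max(1, int(fs * smooth_ms / 1000.0))
    envelopes = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = bandpass_fft(x, fs, lo, hi)
        rectified = np.abs(band)
        envelopes.append(moving_average(rectified, width))
    return np.array(envelopes)

# Hypothetical voiced test signal: 100 Hz pitch with decaying harmonics.
fs = 8000
t = np.arange(fs) / fs
x = sum(np.sin(2 * np.pi * 100 * k * t) / k for k in range(1, 33))

edges = np.linspace(200, 3200, 13)   # 12 equal-width bands (assumed layout)
env = channel_envelopes(x, fs, edges)
print(env.shape)                      # (number of channels, number of samples)
print(env.mean(axis=1).round(3))      # average envelope level per channel
```

Each row of `env` varies far more slowly than the speech waveform
itself, which is why such signals can be sent over a much narrower
band than the original speech.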
To eliminate these difficulties, a voice excited vocoder (VEV) has
been developed wherein determination of the spectral fine
structures is not performed on the transmission side. According to
this system a portion of the speech band near the lower end of the
speech spectrum (hereinafter termed the base band) is
transmitted directly to the receiving side thus eliminating the
necessity of the detection of the voiced sound and unvoiced sound
as well as the extraction of the pitch frequency.
FIG. 1 shows diagrammatically the construction of this system. A
speech signal ranging from f.sub.1 to f.sub.2 (f.sub.1 < f.sub.2)
is impressed upon an input terminal 1. Components in the base band
f.sub.1 to f.sub.3 (f.sub.1 < f.sub.3 < f.sub.2) are separated by a
bandpass filter 2 having the pass band f.sub.1 to f.sub.3, and the
separated components are transmitted to the receiving side as the
base band signal through one of the transmission lines 4. The
remaining components of the input speech, in the frequency band
f.sub.3 to f.sub.2, are converted by a vocoder channel analyzer 3
into a plurality of signals with their frequency bands compressed,
and the converted signals are transmitted to the receiving side
over the other transmission lines.
On the receiving side, the received base band signal is sent to an
output terminal 8 through an adder 7. The received base band signal
is also supplied to a non-linear circuit 5 to regenerate, by the
action of the non-linear circuit, components in the frequency band
f.sub.3 to f.sub.2 which have been removed on the transmission
side, thus providing an exciting signal in the frequency band
f.sub.1 to f.sub.2 containing fine structures of the original
speech spectrum. Signals from the vocoder channel analyzer 3 are
supplied to a vocoder synthesizer 6 and combined therein with the
exciting signal to reproduce components of frequency band f.sub.3
to f.sub.2 of the original speech. The reproduced components are
sent to the output terminal 8 via adder 7 thus reproducing all
components of f.sub.1 to f.sub.2 of the original speech at the
output terminal 8.
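The regeneration of removed components by a non-linear circuit can
be illustrated numerically. The sketch below is an assumption-laden
example, not the circuit of FIG. 1: the text does not specify the
non-linearity, so a memoryless square-law device is used here
because its output can be checked by hand, and the sampling rate,
pitch frequency, amplitudes and the helper `amplitude_at` are all
hypothetical.

```python
import numpy as np

fs = 8000                  # sampling rate in Hz (assumed)
f0 = 125                   # hypothetical pitch frequency in Hz
t = np.arange(fs) / fs     # exactly one second, so FFT bins are 1 Hz apart

# Received base band signal: only the 2nd and 3rd harmonics of the pitch
# are present; the pitch itself and all higher harmonics were removed.
base = np.sin(2 * np.pi * 2 * f0 * t) + 0.7 * np.sin(2 * np.pi * 3 * f0 * t + 0.4)

# Assumed non-linear circuit: a memoryless square-law device.
excited = base ** 2

def amplitude_at(signal, freq_hz):
    """Amplitude of the sinusoidal component at an integer frequency in Hz."""
    spectrum = np.fft.rfft(signal)
    return 2.0 * np.abs(spectrum[int(freq_hz)]) / len(signal)

# Compare the line spectra of the base band signal and the exciting signal.
for k in range(1, 7):
    print(f"{k * f0:4d} Hz  base={amplitude_at(base, k * f0):.3f}  "
          f"excited={amplitude_at(excited, k * f0):.3f}")
```

With a square-law device the sum and difference frequencies of the
two transmitted harmonics appear in the output, so components at
f0, 4f0, 5f0 and 6f0 are created although none of them was
transmitted. Other non-linearities, such as a rectifier, would
produce a different set of harmonic amplitudes.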
As described above, since in the VEV system a portion of the
original speech spectrum is transmitted without being processed in
any way, the qualities of the reproduced speech, such as
articulation and naturalness, are excellent, but this system
requires a wide transmission band, thus decreasing the band
compression ratio. The band width required to transmit the base
band signal is determined in the following manner. Since the base
band signal serves to transmit information on the pitch
frequency:
1. As long as the pitch frequency is included in the base band, the
base band is not required to contain higher harmonic components of
the pitch frequency.
2. Where the pitch frequency is not included in the base band it is
necessary that at least two adjacent higher harmonic components of
the pitch frequency should be included in the base band.
The pitch frequency of ordinary speech generally ranges from about
50 to 450 Hz, so that a base band which always satisfies at least
one of the two conditions mentioned above is determined in the
following manner. Denote the pitch frequency of the speech by
f.sub.0, the lower limit of its variation by f.sub.01, the upper
limit by f.sub.02, the base band by f.sub.L to f.sub.U (where
f.sub.L < f.sub.U) and its band width by f.sub.B (= f.sub.U -
f.sub.L). Then
from condition 1 : f.sub.L .ltoreq. f.sub.0 .ltoreq. f.sub.U
from condition 2 : 2f.sub.0 .ltoreq. f.sub.U - f.sub.L = f.sub.B
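The two conditions can be expressed as a small check. The sketch
below is illustrative only; the function names and the example
frequencies are assumptions. Condition 2 is tested by counting the
higher harmonics that actually fall inside the base band. Harmonics
inside a band are always consecutive, so two or more of them are
necessarily adjacent, and a band width of at least 2f.sub.0
guarantees that two are present.

```python
import math

def pitch_in_band(f0, f_low, f_up):
    """Condition 1: the pitch frequency itself lies in the base band."""
    return f_low <= f0 <= f_up

def harmonics_in_band(f0, f_low, f_up):
    """Harmonic numbers n >= 2 for which f_low <= n * f0 <= f_up."""
    first = max(2, math.ceil(f_low / f0))
    last = math.floor(f_up / f0)
    return list(range(first, last + 1))

def band_carries_pitch(f0, f_low, f_up):
    """True if condition 1 or condition 2 holds for this pitch frequency."""
    if pitch_in_band(f0, f_low, f_up):
        return True
    # Condition 2: at least two adjacent higher harmonics inside the band.
    return len(harmonics_in_band(f0, f_low, f_up)) >= 2

# Hypothetical base band of 200 to 450 Hz.
print(band_carries_pitch(300, 200, 450))   # pitch lies inside the band
print(harmonics_in_band(120, 200, 450))    # harmonics of a 120 Hz pitch
print(band_carries_pitch(120, 200, 450))
print(band_carries_pitch(180, 200, 450))   # only one harmonic (360 Hz) fits
```

In this example the 180 Hz pitch fails both conditions: it lies
below the band, and only its second harmonic falls inside it.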
With reference to FIG. 3, the shaded area shows the range of
f.sub.B which satisfies at least one of these conditions. The solid
line in FIG. 3 shows the necessary minimum value of f.sub.B for a
given f.sub.L, or the lower limit of the base band, when f.sub.02
is greater than 3f.sub.01 as in the case of conversational speech
of an indefinite number of talkers covering a wide range of
variations in the pitch frequency. In this case, when f.sub.L is
selected to be equal to (1/3)f.sub.02 and f.sub.U is selected to be
equal to f.sub.02, the base band width will be minimum, and thus the
minimum value (2/3)f.sub.02 is obtained. Where a particular talker is
specified, the range of the pitch frequency would be narrowed to
satisfy the condition of f.sub.02 <3f.sub.01, so that a small
shaded triangular range 9 shown in FIG. 3 will be added with the
result that the minimum value of f.sub.B is realized when f.sub.L
is selected to be equal to f.sub.01, and f.sub.U is selected to be
equal to f.sub.02 and thus the minimum value f.sub.02 minus
f.sub.01 is obtained. As above described since the pitch
frequency ...
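The minimum base band stated in the preceding paragraph can be
checked numerically. The following is a sketch under assumptions:
the function names, the 0.25 Hz test grid and the second example
range are hypothetical, while the 50 to 450 Hz range is the one
quoted in the text for ordinary speech.

```python
import math

def carries_pitch(f0, f_low, f_up):
    """Condition 1, or at least two adjacent higher harmonics in the band."""
    if f_low <= f0 <= f_up:
        return True
    first = max(2, math.ceil(f_low / f0))
    last = math.floor(f_up / f0)
    return last - first + 1 >= 2

def minimum_base_band(f01, f02):
    """Band edges (f_L, f_U) of minimum width, following the two cases in the text."""
    if f02 >= 3 * f01:
        return f02 / 3, f02      # wide pitch range: width (2/3) * f02
    return f01, f02              # narrow pitch range: width f02 - f01

def covers_range(f_low, f_up, f01, f02, step=0.25):
    """Test every pitch frequency from f01 to f02 on a grid of `step` Hz."""
    n = int(round((f02 - f01) / step))
    return all(carries_pitch(f01 + i * step, f_low, f_up) for i in range(n + 1))

# Wide range of talkers: pitch from 50 to 450 Hz.
f_low, f_up = minimum_base_band(50, 450)
print(f_low, f_up, f_up - f_low)                 # band edges and width in Hz
print(covers_range(f_low, f_up, 50, 450))        # the stated band
print(covers_range(f_low + 1, f_up, 50, 450))    # lower edge raised by 1 Hz
print(covers_range(f_low, f_up - 1, 50, 450))    # upper edge lowered by 1 Hz

# Hypothetical single talker with a pitch range of 100 to 250 Hz.
print(minimum_base_band(100, 250))
```

For the 50 to 450 Hz range the sketch yields a base band of 150 to
450 Hz. Narrowing that band from either side leaves some pitch
frequency uncovered: a pitch just above 150 Hz when the lower edge
is raised, and the 450 Hz pitch when the upper edge is lowered.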
* * * * *