U.S. patent number 3,662,115 [Application Number 05/079,430] was granted by the patent office on 1972-05-09 for audio response apparatus using partial autocorrelation techniques.
This patent grant is currently assigned to Nippon Telegraph and Telephone Public Corporation. Invention is credited to Fumitada Itakura, Tsunehiko Koike, Masaaki Nishikawa, Shuzo Saito.
United States Patent 3,662,115
Saito, et al.
May 9, 1972
AUDIO RESPONSE APPARATUS USING PARTIAL AUTOCORRELATION
TECHNIQUES
Abstract
The audio response apparatus comprises means for storing speech
parameters including partial autocorrelation coefficients between
two closely adjacent time instants of speech signal, which are
derived by removing the redundant components from the actual speech
signal levels of the two adjacent instants in consideration of the
effect of intermediate sample levels between them and an excitation
source information determined from sampled values at remotely
spaced time instants, a memory to store the speech parameters, read
out means to read out the speech parameters from the memory which
are designated by an electronic computer, and a speech synthesizer
to reconstruct the speech signal from the output of the readout
means. The synthesizer is composed of high speed logic elements
and operates to synthesize multichannel audio outputs on the time
division basis.
Inventors: Saito; Shuzo (Tokyo, JA), Itakura; Fumitada (Tokyo, JA),
Nishikawa; Masaaki (Tokyo, JA), Koike; Tsunehiko (Tokyo, JA)
Assignee: Nippon Telegraph and Telephone Public Corporation (Tokyo, JA)
Family ID: 26346352
Appl. No.: 05/079,430
Filed: October 9, 1970
Foreign Application Priority Data
Feb 7, 1970 [JA] 45/10992
Feb 7, 1970 [JA] 45/10993
Current U.S. Class: 704/200; 708/318; 380/35; 704/E13.002
Current CPC Class: G10L 13/02 (20130101)
Current International Class: G06F 3/16 (20060101); C10L 1/00 (20060101);
H04M 11/00 (20060101); H04M 3/00 (20060101); G06F 15/00 (20060101);
C10l 001/00 (); H04m 011/00 ()
Field of Search: ;179/1SA,15.55 ;324/77 ;340/148,152
Primary Examiner: Claffy; Kathleen H.
Assistant Examiner: Leaheey; Jon Bradford
Claims
What is claimed is:
1. An audio response apparatus comprising means for previously
storing speech parameters including partial autocorrelation
coefficients between two closely adjacent time instants of a speech
signal required for answering and excitation source informations,
said coefficients being determined by calculating, with respect to
a plurality of sampling instants, partial autocorrelation
coefficients of said two instants representing the correlation of
the difference between the error value predicted by the least
squares method from sampled values at said two instants and the
actual values of the speech signal at said two points, said
excitation source informations being obtained by determining the
autocorrelation between remotely separated sampled values; an
electronic computer for supplying a command signal for designating
the speech parameters of a speech signal to be synthesized; means
to read out said speech parameters designated by said command
signal from said memory means; and a speech synthesizer responsive to
the output from said read out means to synthesize a desired speech
signal.
2. The audio response apparatus according to claim 1 which further
includes a speech parameter extractor comprising an autocorrelation
coefficient extractor having a plurality of cascade connected
partial autocorrelation coefficient detector stages, each of said
stages including a delay network connected to receive a speech
signal, a correlation coefficient calculator receiving the output
from said delay network and for directly receiving said speech
signal, a first multiplier connected to receive the output from
said delay network and the output from said correlation calculator,
a second multiplier connected to directly receive the output from
said correlation coefficient calculator and said speech signal, a
first adder for adding the output from said delay network and the
output from said second multiplier, a second adder for adding the
output from said first multiplier and said speech signal, and a
quantizer to quantize the output from said correlation coefficient
calculator to provide a partial autocorrelation coefficient between
said two instants; an autocorrelator connected to one output
terminal of the last detector stage of said extractor; and a
maximum value selecting means for determining the period and
amplitude of an excitation source signal from a group of outputs
from said autocorrelator.
3. The audio response apparatus according to claim 1 wherein said
speech signal synthesizer comprises a pulse generator and a white
noise generator which are controlled by the fundamental pitch
period of the speech, an amplitude controller connected to said
generators and controlled by the fundamental amplitude information
of the excitation source, and means for controlling the output from
said amplitude controller in accordance with the partial
autocorrelation coefficient designated by said electronic computer
for reconstructing the speech signal by the correlation between a
group of said correlation coefficients.
4. An audio response apparatus comprising means for deriving speech
parameters from partial autocorrelation coefficients and an
excitation source information of respective speech signals
regarding a plurality of speech signals required for answering;
memory means for storing said speech parameters; an electronic
computer for sending a command signal designating the speech
parameters for respective output channels to send answers to a
plurality of output channels; a plurality of read out means to read
out speech parameters designated by said electronic computer from
said memory means, a single speech synthesizer connected to receive
a plurality of sets of the speech parameters from said read out
means on the time division basis to form a group of digital codes
representing respective sets of designated speech wave from the
excitation source signals corresponding to the excitation source
informations and said partial autocorrelation coefficients of
respective sets of speech parameter; means to read out on the time
division basis a group of digital codes from said speech
synthesizer and to convert said digital codes into pulse amplitude
modulated signals; and timing gate means for distributing said
modulated signals among a plurality of output channels.
5. The audio response apparatus according to claim 4 comprising
cyclic memory means for storing speech parameters including
excitation source informations and said partial autocorrelation
coefficients regarding a plurality of speech units of a
predetermined constant length to be required to send an answer, in
a plurality of cyclic store arrangements each divided into a
plurality of frames, a parameter buffer memory for temporarily
storing the speech parameters in respective frames of the speech
unit read out from said cyclic memory means; a speech synthesizer
including a purely digital logic means responsive to the speech units
designated by the electronic computer and to be answered to a
plurality of output channels for correlating the excitation source
signals corresponding to the speech informations of the speech
parameters selectively read out by said parameter buffer memory
means under the control of said partial autocorrelation
coefficients whereby to convert said speech parameters into a group
of digital codes representing the waveforms of respective speech
signals designated; an output buffer memory for temporarily storing
the group of said digital codes from said speech synthesizer; and
means for converting said digital codes read out from said output
buffer memory into analogue signals.
6. The audio response apparatus according to claim 4 which
comprises means for successively storing vacant addresses of a
memory speech parameters each including an excitation source
information and a partial autocorrelation coefficient regarding a
plurality of speech units required for sending an answer of a
predetermined length; a parameter buffer memory for temporarily
storing the speech parameters of a speech unit read out from an
address in said memory corresponding to the speech unit designated
by said electronic computer and to be answered to a plurality of
output channels; and means for successively reading out said speech
parameters from said parameter buffer memory and to apply said read
out speech parameters to said speech synthesizer.
Description
BACKGROUND OF THE INVENTION
This invention relates to audio response apparatus utilizing an
electronic computer to present various information services, and
more particularly to novel audio response apparatus wherein speech
signals to be responded are memorized in the form of speech
parameters which are read out according to the command from the
electronic computer to reconstruct speech by means of a
synthesizer.
In prior art apparatus of the type referred to above, the so-called
compiling method of prerecorded speech has been used, wherein speech
segments (hereinafter termed "speech units") in the form of, e.g.,
word speech units are stored in a memory and the stored speech
units are successively selected in a suitable order in response to
the command from the electronic computer to reconstruct or compile
a speech message. In this method, speech units are generally
recorded directly in the form of audio waveforms, and the recording
medium generally used is a low speed analogue magnetic drum
having a period of revolution equal to the time length of one
speech unit so as to record one speech unit in each track. With
this construction, however, it is difficult not only to increase
the capacity of the analogue magnetic drum but also to increase the
number of speech units that can be recorded to 100 to 200 or
more.
To eliminate these problems of the compiling method of prerecorded
speech units, a method has been proposed wherein, instead of
directly recorded speech signals, compressed signal information
is recorded and speech signals are reconstructed from it by means of
a speech synthesizer. One example of the audio response apparatus
constructed according to this principle is the apparatus utilizing
the principle of a channel vocoder (See, for example, R. H. BURON:
I.E.E.E. Trans. AU-16, 1, 1968). However, when a channel
vocoder is used, the quality of the audio output is poor. Moreover, it is
necessary to install an expensive speech synthesizer on each output
channel.
SUMMARY OF THE INVENTION
It is an object of this invention to provide a novel audio response
apparatus according to which a speech signal is represented by a
new parameter which is termed "a partial autocorrelation
coefficient" and the parameter is used to form a number of speech
units whereby to produce speech outputs of excellent quality.
Another object of this invention is to provide a novel speech
parameter extracting device for forming a partial autocorrelation
coefficient and an excitation source information.
Another object of this invention is to provide an inexpensive
cyclic memory device which can store a plurality of parameters in
the form of partial autocorrelation coefficients as the speech
units.
A further object of this invention is to provide a simple speech
synthesizer comprising a plurality of cascade connected digital
filters for reconstructing a speech from a number of speech units
selected from the memory device.
Still another object of this invention is to provide a novel audio
response apparatus in which a single speech synthesizer can be used
in common for a plurality of output channels.
According to this invention there is provided means by which
speech signal levels at two closely adjacent time instants
are selected and the intermediate signal levels between these time
instants are used to predict, by the least squares method, the signal
levels at these two instants. The correlation between the differences
of the predicted and the actual signal levels is the partial
autocorrelation coefficient. Further, means is provided to vary the
time interval between said two time instants to determine the partial
autocorrelation coefficient at two new time instants. By repeating
these operations it is possible to determine a plurality of partial
autocorrelation coefficients. Since these coefficients are closely
related to the frequency spectrum envelope of the speech signal, it
is possible to synthesize speech from them together with such
excitation source information as the fundamental frequency, its
amplitude and the noise amplitude, which are extracted from the
speech signal. More particularly, there is provided an excitation
source generator controlled by the excitation source information, and
the output signal from the generator is controlled by the partial
autocorrelation coefficients to reproduce the frequency spectrum
envelope.
Further, in accordance with this invention, the parameter memory
device for storing a plurality of partial autocorrelation
coefficients may be an inexpensive memory of large capacity and the
digital speech synthesizer for reproducing the frequency spectrum
envelope is constructed to be utilized on the time division
basis.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention can be more fully understood from the following
detailed description when taken in conjunction with the
accompanying drawings in which:
FIG. 1 is a schematic connection diagram to show the principle of
the novel audio response apparatus;
FIG. 2 shows a speech signal curve for explaining the partial
autocorrelation coefficient;
FIG. 3 shows a connection diagram of an extracting apparatus for
extracting the partial autocorrelation coefficient and the
excitation source information;
FIG. 4 is a diagram of apparatus for determining the correlation
coefficient utilized in this invention;
FIG. 5 is a connection diagram of one example of an autocorrelation
apparatus;
FIG. 6 is a connection diagram of one example of a speech
synthesizer;
FIG. 7 is a connection diagram of the novel audio response
apparatus in which the synthesizer is utilized in multiplex on the
time division basis;
FIG. 8 shows a cyclic store arrangement of the speech parameters on
a magnetic drum of the embodiment shown in FIG. 7;
FIG. 9 is a block diagram of the word synchronizer utilized in the
embodiment shown in FIG. 7;
FIG. 10 is a time chart of control signals recorded on the magnetic
drum;
FIGS. 11 and 12 show a block diagram and a diagram of the time
relationship of the sequence control utilized in FIG. 7;
FIG. 13 shows a block diagram of the input control shown in FIG.
7;
FIG. 14 is a block diagram of a modified audio response apparatus
embodying this invention, and
FIG. 15 is a block diagram of the input control shown in FIG.
14.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
With reference now to FIG. 1 of the accompanying drawings, a
request of an information service from a terminal telephone set 1
is coupled to an electronic computer 3 through an exchange
equipment 2. Once this connection is established, electronic
computer 3 is controlled by the terminal telephone set 1 and the
output from the electronic computer is supplied to the audio
response device 4, in the form of a code train of speech units to
be answered. The audio response device 4 has memories of the
partial autocorrelation coefficients and the excitation source
information which are necessary to synthesize the answer speech and
these memories are read out in response to the output from
electronic computer 3 whereby to synthesize the speech. The
synthesized speech signal is supplied to the terminal telephone set
1 via the exchange equipment 2. As shown in FIG. 1, a speech
parameter extractor 5 is connected to the audio response device 4
for extracting the speech parameters from the speech, that is the
partial autocorrelation coefficient and the excitation source
information which are to be stored in the audio response device 4.
The extractor 5 functions to check, when desired, the speech
parameters being stored in the audio response device 4 or to
replace such parameters with new ones.
The partial autocorrelation coefficient, which is one of the
parameters utilized to synthesize speech according to this
invention, is defined as follows. As shown in
FIG. 2, when the speech signal is sampled at a frequency of 8 kHz,
for example, the partial autocorrelation between the values of the
speech signal at two relatively close sampling instants $t_0$ and
$t_3$ is expressed by the correlation of the differences
$\Delta X_0$ and $\Delta X_3$ between the values $\hat{X}_0$ and
$\hat{X}_3$, predicted by the least squares method from the sampled
values $X_1$ and $X_2$ lying in the interval between time instants
$t_0$ and $t_3$, and the actual sample values $X_0$ and $X_3$. The
interval between sampling
times is varied successively to T, 2T, 3T, 4T . . . and the partial
autocorrelation coefficients for these different intervals are
determined. The partial autocorrelation coefficient is expressed by
the following equation
where nT represents the interval between sampling times.
Denoting the prediction errors $\Delta X_0$ and $\Delta X_n$ by
using a delay operator D, we have ##SPC1## where $\alpha_i$ and
$\beta_i$ are selected so as to minimize the values of
$E\{(\Delta X_n)^2\}$ and $E\{(\Delta X_0)^2\}$, and
D represents the delay operator defined by the equation
$D^i X_n = X_{n-i}$. $A_{n-1}(D)$ and
$B_{n-1}(D)$ are prediction error operators. Then the partial
autocorrelation coefficient $K_n$ is expressed as follows
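The equations at this point appear to have been printed as images and are not reproduced in this text. The block below is a reconstruction that is consistent with the surrounding definitions (a normalized correlation of the two prediction errors, with least squares predictor coefficients $\alpha_i$ and $\beta_i$). The numbering (1) to (4) is inferred from the later equations (5) and (6), and the exact form in the printed patent may differ.

```latex
% Assumed reconstruction, not the printed equations.
K_n = \frac{E\{\Delta X_0\,\Delta X_n\}}
           {\sqrt{E\{(\Delta X_0)^2\}\;E\{(\Delta X_n)^2\}}} \qquad (1)

\Delta X_n = X_n - \sum_{i=1}^{n-1} \alpha_i X_{n-i} = A_{n-1}(D)\,X_n \qquad (2)

\Delta X_0 = X_0 - \sum_{i=1}^{n-1} \beta_i X_{n-i} = B_{n-1}(D)\,X_n \qquad (3)

K_n = \frac{E\{A_{n-1}(D)X_n \cdot B_{n-1}(D)X_n\}}
           {\sqrt{E\{(A_{n-1}(D)X_n)^2\}\;E\{(B_{n-1}(D)X_n)^2\}}} \qquad (4)
```

With these forms, $A_0(D) = 1$ and $B_0(D) = D$, which agrees with the recursion of equations (5) and (6): $A_1(D) = 1 - K_1 D$ and $B_1(D) = D^2 - K_1 D$.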
It can be proved that the following equations hold among $A_n(D)$,
$B_n(D)$ and $K_n$:
$$A_n(D) = A_{n-1}(D) - K_n B_{n-1}(D) \qquad (5)$$
$$B_n(D) = D\,[B_{n-1}(D) - K_n A_{n-1}(D)] \qquad (6)$$
Thus, if $A_{n-1}(D)$ and $B_{n-1}(D)$ are determined,
then $K_n$ and hence $A_n(D)$ and $B_n(D)$ can be
determined. In this manner, it is possible to determine the partial
autocorrelation coefficients one after another. As this coefficient
varies relatively gradually with time, it is determined once in each
period, the period being long enough to extract the necessary speech
parameter while still preserving the nature of the speech, for
example every 15 milliseconds, and the derived coefficient is
encoded and stored.
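The recursion of equations (5) and (6) can be sketched in a few lines of Python. This is a minimal illustration, not the patent's circuit: it assumes $A_0(D) = 1$ and $B_0(D) = D$, it estimates each $K_n$ as the normalized correlation of the two prediction errors over one whole frame, and the function name `parcor_analysis` is hypothetical.

```python
import math

def parcor_analysis(x, order):
    """Estimate partial autocorrelation coefficients K_1..K_order of frame x.

    Returns (ks, residual), where residual is the forward prediction
    error A_order(D) x that remains after all stages.
    """
    f = [float(v) for v in x]   # upper path, starts as A_0(D) x = x
    g = list(f)                 # lower path before the stage delay
    ks = []
    for _ in range(order):
        b = [0.0] + g[:-1]      # one-sample delay gives B_{n-1}(D) x
        num = sum(p * q for p, q in zip(f, b))
        den = math.sqrt(sum(p * p for p in f) * sum(q * q for q in b))
        k = num / den if den > 0.0 else 0.0   # normalized correlation
        ks.append(k)
        # Equations (5) and (6); the factor D in (6) is applied by the
        # delay at the top of the next pass.
        f, g = ([p - k * q for p, q in zip(f, b)],
                [q - k * p for p, q in zip(f, b)])
    return ks, f
```

For a geometrically decaying sequence such as `[0.5 ** t for t in range(60)]`, the first coefficient comes out very close to 0.5 and the second very close to zero, since one stage already removes all of the correlation between adjacent samples.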
FIG. 3 shows one example of the extractor 5 for extracting a
plurality of partial autocorrelation coefficients and the excitation
source information from a speech signal. The extractor shown in
FIG. 3 comprises n partial autocorrelation coefficient detector
stages 14a through 14n which are connected in cascade. Since the
respective detector stages have the same construction, only the
construction of the stage 14a will be described in the
following. Each partial autocorrelation
coefficient detector stage comprises a delay network 7 for delaying
the speech signal by one sampling interval T, a correlation
coefficient calculator 8, multipliers 9 and 11, adders 10 and 12
and a quantizer 13. A terminal 6 to the left of the detector stage
14a receives the speech signal and the terminal 15 of the quantizer
13 provides the partial autocorrelation coefficient quantized in
each stage. In the final detector stage 14n the output terminal of
adder 12 is left open, whereas the output terminal of adder 10 is
connected to an autocorrelator 16. The outputs from this
autocorrelator 16 are
supplied to a maximum value selector 17 which in turn is connected
to quantizers 18 and 20.
In operation, the speech signal impressed upon input terminal 6 is
divided into two portions, one portion thereof being applied to
adder 10 through correlation coefficient calculator 8 and
multiplier 9 after being delayed by delay network 7 by one sampling
period T. The other portion of the speech signal is supplied to
adder 12 through correlation coefficient calculator 8 and
multiplier 11. FIG. 4 shows one example of the circuit construction
of the correlation coefficient calculator 8 comprising adders 22,
squaring devices 23, adders 24, low pass filters 25 and a division
or ratio circuit 26.
Assuming now two inputs $B_{n-1}(D)X_n$ and
$A_{n-1}(D)X_n$ for the correlation coefficient
calculator 8, the inputs to two low pass filters 25 will be
expressed respectively by ##SPC2##
Low pass filters 25 determine mean values of these inputs over a
short time. Since the mean values of $(A_{n-1}(D)X_n)^2$
and $(B_{n-1}(D)X_n)^2$ are approximately equal, the
following equation holds
$$\text{average value of } 2\{(A_{n-1}(D)X_n)^2 + (B_{n-1}(D)X_n)^2\} \cong \text{average value of } 4\{(A_{n-1}(D)X_n)^2 \cdot (B_{n-1}(D)X_n)^2\}^{1/2} \qquad (9)$$
whereby the value of $K_n$ is given by the output of ratio
circuit 26. The output from the ratio circuit 26 is applied to
multipliers 9 and 11 to produce a predicted value $\hat{X}_1$ of
$X_1$ on the output of multiplier 11. Adder 10 provides the
difference $(X_1 - \hat{X}_1)$ between the predicted value $\hat{X}_1$
and the actual value $X_1$. Further, multiplier 9 produces on its
output the predicted value $\hat{X}_0$ of $X_0$ and adder 12 provides
the difference $(X_0 - \hat{X}_0)$. A portion of the output from
the correlation coefficient calculator 8 is supplied to quantizer
13 to produce a quantized output of the partial autocorrelation
coefficient at terminal 15.
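The two filter-input expressions (##SPC2##) are not reproduced in this text. One arrangement that fits the listed components (adders 22, squaring devices 23, adders 24, ratio circuit 26) and equation (9) is to square the sum and the difference of the two inputs, so that their difference accumulates $4ab$ and their sum accumulates $2(a^2 + b^2)$. The sketch below assumes that arrangement and replaces the low pass filters with a plain sum over the frame; the function name is hypothetical.

```python
def correlation_coefficient(a, b):
    """Ratio-circuit estimate of K from two equal-length sample sequences.

    a: samples of A_{n-1}(D) X_n, b: samples of B_{n-1}(D) X_n.
    """
    sum_sq = [(p + q) ** 2 for p, q in zip(a, b)]    # adder, then squarer
    diff_sq = [(p - q) ** 2 for p, q in zip(a, b)]   # adder, then squarer
    num = sum(s - d for s, d in zip(sum_sq, diff_sq))  # accumulates 4ab
    den = sum(s + d for s, d in zip(sum_sq, diff_sq))  # accumulates 2(a^2 + b^2)
    return num / den if den else 0.0
```

When the two inputs have equal energy, as equation (9) assumes, this ratio equals the usual normalized correlation. The appeal of the arrangement is that it needs only additions, squarings and a single division.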
Similar processing is also performed by the other detector stages
succeeding the detector stage 14a. More particularly, adders 10 and
12 of the second partial autocorrelation coefficient detector stage
14b provide the differences $(X_2 - \hat{X}_2)$ and
$(X_0 - \hat{X}_0)$. In the same manner, adders 10 and 12 of the
last detector stage 14n provide the differences $(X_p - \hat{X}_p)$
and $(X_0 - \hat{X}_0)$, respectively, where $X_p$ represents the
sampled value of the audio waveform at a sampling time $t_p$ which
is the pth point starting from $t_0$, and $\hat{X}_p$ and $\hat{X}_0$
represent the sampled values at $t_p$ and $t_0$, respectively, as
predicted from the sampled values lying between the two instants
$t_0$ and $t_p$. In this manner, quantized values of the partial
autocorrelation coefficients for time intervals of T, 2T,
3T . . . pT are produced at the output terminals of the
quantizers 13 in the respective detector stages 14a, 14b . . . 14n. As
the input speech signal passes through the cascade connected
detector stages 14a, 14b . . . 14n toward the last stage, the
correlation between closely adjacent sampled values of the speech
signal is removed, whereby the autocorrelation corresponding
to the formants of the speech is eliminated. However, the
correlation corresponding to the fundamental frequency of the
speech is preserved. For this reason, when
the output of adder 10 of the last partial autocorrelation
coefficient detector stage 14n is applied to the autocorrelator 16
to determine its autocorrelation, a significant peak is formed
at a time delay corresponding to the period of the fundamental
frequency when the input speech signal is a voiced sound, whereas
when the input speech signal is an unvoiced sound no peak is
formed. Consequently, when the input speech signal is a voiced
sound, the output from the autocorrelator 16 is supplied to the
maximum value selector 17 and the fundamental pitch period of the
speech is obtained by measuring the interval between two adjacent
maxima of the autocorrelation.
As is illustrated in FIG. 5, for example, the autocorrelator 16
comprises a plurality of delay networks 27, a plurality of
multipliers 28 and a plurality of low pass filters 29. The
fundamental pitch period of the speech obtained by the maximum
value selector 17 is quantized by quantizer 18 and is then sent to
output terminal 19.
On the other hand, when the input speech signal is an unvoiced
sound, the fundamental pitch period of the speech will not appear
at terminal 19. In such a case the absence of a pitch period is
itself used as excitation source information indicating a white
noise excitation.
The excitation source amplitude derived from the amplitude value of
the input signal to the autocorrelator 16 is quantized by the
quantizing circuit 20 and is then applied to output terminal
21.
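The pitch and amplitude extraction described above can be sketched as follows. This is an illustration under stated assumptions rather than the circuit of FIG. 5: the lag range `min_lag` to `max_lag`, the voicing `threshold` and the function name are hypothetical, the amplitude is taken as the RMS value of the residual, and the pitch period is taken as the lag of the largest normalized autocorrelation value away from zero lag.

```python
import math

def pitch_and_amplitude(residual, min_lag=20, max_lag=160, threshold=0.3):
    """Return (pitch_period_or_None, rms_amplitude) for one residual frame.

    min_lag, max_lag and threshold are illustrative values, not taken
    from the patent. None marks a frame treated as unvoiced.
    """
    n = len(residual)
    energy = sum(v * v for v in residual)
    if n == 0 or energy == 0.0:
        return None, 0.0
    amplitude = math.sqrt(energy / n)
    best_lag, best_r = None, threshold
    for lag in range(min_lag, min(max_lag, n - 1) + 1):
        # Autocorrelation at this lag, normalized by the zero-lag value.
        r = sum(residual[t] * residual[t + lag] for t in range(n - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, amplitude
```

An impulse train with one impulse every 50 samples yields a period of 50, while a frame whose autocorrelation never exceeds the threshold yields `None`, which plays the role of the missing pitch output at terminal 19.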
In this manner, the partial autocorrelation coefficients and the
excitation source information which are necessary for the synthesis
of the speech are produced at output terminals 15, 19 and 21,
respectively. Since the temporal variation of the excitation
signal is relatively gradual, just like that of the partial
autocorrelation coefficients, it is sufficient to determine it every
15 milliseconds, for example, and the derived information is encoded
and stored.
According to this invention, a plurality of partial autocorrelation
coefficients of the speech, together with the fundamental pitch
period and the amplitude of the speech which are utilized as the
excitation source information, all obtained by the above described
operations, are stored in the audio response device 4 shown in
FIG. 1. When the audio response device 4 receives from the
electronic computer 3 a command in the form of a code train
regarding the speech to be synthesized, the device 4 sequentially
selects, in accordance with the command, the partial
autocorrelation coefficients and the excitation source
information which have been stored beforehand in the memory, and
thereby synthesizes the designated
speech.
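The selection step can be pictured as a table lookup driven by the computer's code train. The sketch below is only a schematic of that idea: the `Frame` record, the use of string codes as keys and the function name `select_frames` are all assumptions made for illustration, and a Python dictionary stands in for the parameter memory.

```python
from typing import Dict, List, NamedTuple, Optional

class Frame(NamedTuple):
    ks: List[float]              # partial autocorrelation coefficients
    pitch_period: Optional[int]  # None marks an unvoiced frame
    amplitude: float             # excitation source amplitude

def select_frames(memory: Dict[str, List[Frame]],
                  code_train: List[str]) -> List[Frame]:
    """Concatenate stored frames in the order given by the code train."""
    frames: List[Frame] = []
    for code in code_train:
        frames.extend(memory[code])   # KeyError if a unit was never stored
    return frames
```

Each speech unit is stored once, however often it is used, and the computer only has to send the short codes that name the units.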
FIG. 6 shows a block diagram of the device employed to synthesize
the designated speech. The device comprises a pulse generator 30 for
the voiced sound, a white noise generator 31 for the unvoiced sound
and an amplitude controller 32. Operations of the pulse
generator 30, white noise generator 31 and amplitude controller 32
are controlled by signals applied to respective input terminals
33 and 34. Terminal 33 receives one item of the
excitation source information previously stored in
the audio response device 4 and selected by the electronic
computer 3, namely the information regarding the fundamental pitch
period of the speech, so that the pulse generator 30 produces an
impulse train of unit power having the same period as the
fundamental pitch period. During an interval in which the
fundamental pitch period information is not applied to the control
terminal 33 (that is, an unvoiced sound interval), the white noise
generator 31 provides a white noise signal output of unit power. In
the same manner, the amplitude controller 32 receives the
signal amplitude information, also part of the
excitation source information, from control terminal 34 to
control the amplitude of the output signal.
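A minimal sketch of the excitation source follows. The text specifies only unit power and the pitch period, so several details here are assumptions: the impulse height is set to the square root of the period so that the mean power of the train is one, Gaussian noise stands in for the white noise generator, the pulse phase restarts at the beginning of every frame, and the function name is hypothetical.

```python
import math
import random

def excitation_frame(n_samples, pitch_period, amplitude, rng=None):
    """Generate one frame of excitation.

    pitch_period: samples between impulses, or None for an unvoiced frame.
    """
    if pitch_period is None:
        rng = rng or random.Random()
        # Gaussian noise with unit variance, i.e. unit power on average.
        return [amplitude * rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    # Impulses of height sqrt(period) give a mean power of 1 before scaling.
    height = amplitude * math.sqrt(pitch_period)
    return [height if t % pitch_period == 0 else 0.0 for t in range(n_samples)]
```

A real implementation would carry the pulse phase across frame boundaries so that the pitch pulses stay evenly spaced when the frame length is not a multiple of the period.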
The output from the amplitude controller 32 is applied to a number
of cascade connected digital filters 35n . . . 35b and 35a. Each of
these digital filters has the same construction and comprises
adders 36, 38 and 39, a delay network 40 and a multiplier 37. A
partial autocorrelation coefficient previously stored and selected
by the electronic computer is applied to multiplier 37 through a
terminal 41. One terminal of the delay network 40 of the digital
filter 35n is left open, whereas the output terminal 42 of digital
filter 35a delivers the synthesized speech output, a portion
thereof being fed back to adder 39 in digital filter 35a via a
delay network 43. The digital filters 35n . . . 35b and 35a
correspond one to one to the partial autocorrelation coefficient
detector stages 14n . . . 14b and 14a shown in FIG. 3. Thus, the
partial autocorrelation coefficient (selected by the electronic
computer) applied to the input terminal 41 of digital filter 35n is
the partial autocorrelation coefficient that has been produced by
the detector stage 14n shown in FIG. 3 and stored. In the same
manner, the partial autocorrelation coefficient applied to the
input terminal of the digital filter 35a has been previously
produced by the detector stage 14a shown in FIG. 3. It will thus be
noted that the transfer functions of the digital filters are
inverse to those of the partial autocorrelation coefficient
detector stages, so that the correlation between speech waveforms
that was eliminated by the corresponding detector stage is
restored to the output from the amplitude controller 32. Accordingly,
as this output passes progressively through digital filters 35n . .
. 35b and 35a, the frequency spectrum envelope gradually
approaches the envelope of the original speech.
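The inverse relationship between the detector stages and the synthesizer filters can be checked with a short sketch. It is not a model of the FIG. 6 hardware: for readability it uses two multiplications per stage instead of the one multiplier and three adders of each digital filter, the function names are hypothetical, and `lattice_analyze` is included only so that the round trip can be verified.

```python
def lattice_analyze(x, ks):
    """Inverse-filter x through fixed detector stages, equations (5) and (6)."""
    delayed = [0.0] * len(ks)   # lower-path input of each stage, one sample ago
    residual = []
    for sample in x:
        f = g = float(sample)
        for n, k in enumerate(ks):
            b = delayed[n]
            delayed[n] = g      # remember the current value for the next sample
            f, g = f - k * b, b - k * f
        residual.append(f)
    return residual

def lattice_synthesize(excitation, ks):
    """Rebuild the signal from the excitation by undoing each stage in turn."""
    p = len(ks)
    delayed = [0.0] * p
    out = []
    for e in excitation:
        f = float(e)
        upper = [0.0] * p       # upper-path input of each stage at this sample
        for n in range(p - 1, -1, -1):
            f += ks[n] * delayed[n]   # inverts f_out = f_in - k * b
            upper[n] = f
        g = f                   # f is now the synthesized output sample
        for n in range(p):
            b = delayed[n]
            delayed[n] = g
            g = b - ks[n] * upper[n]
        out.append(f)
    return out
```

Passing any sequence through `lattice_analyze` and then through `lattice_synthesize` with the same coefficients returns the original sequence up to rounding error. The synthesis is stable as long as every coefficient has a magnitude below one.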
Although in this embodiment of the speech synthesizing device
digital circuits are shown to constitute the filters
controlled by the partial autocorrelation coefficients, it will be
clear that the filters can also be constructed from analogue
circuits. Where digital circuits are used, the use of
high speed elements makes it possible to operate the speech
synthesizing device on the time division basis, whereby multiplexing
of the answer speech becomes easy, as will be described later.
In order to produce synthesized speech of excellent quality in
accordance with this invention, the maximum value of the time
difference between the two instants of a partial autocorrelation
coefficient may be about 8T. When the partial autocorrelation
coefficient for each time
interval is encoded into a five bit code and extracted at a frame
period of 15 milliseconds, the information capacity of the partial
autocorrelation coefficients will be 2667 bits per second. On the
other hand, when the information of the excitation source signal is
given at a rate of 15 bits every 15 milliseconds, the total
capacity amounts to 3667 bits per second. The term "frame period"
as used herein means the period at which the speech parameters are
stored in a memory, and is to be distinguished from the sampling
interval.
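The two figures can be checked with a few lines of arithmetic. The sketch assumes eight coefficients per frame, one for each time difference from T to 8T, which is the count that reproduces the stated 2667 bits per second; the function and variable names are illustrative.

```python
def bits_per_second(bits_per_frame, frame_period_ms):
    """Information rate for a fixed number of bits sent once per frame."""
    return bits_per_frame * 1000.0 / frame_period_ms

coefficient_bits = 8 * 5      # eight coefficients (T to 8T), five bits each
excitation_bits = 15          # excitation source information per frame
frame_ms = 15.0

coefficient_rate = bits_per_second(coefficient_bits, frame_ms)
total_rate = bits_per_second(coefficient_bits + excitation_bits, frame_ms)
```

Rounded to whole bits, `coefficient_rate` is 2667 and `total_rate` is 3667, matching the figures in the text.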
This information capacity amounts to about 1/15 of that of the
speech waveform. It is therefore possible to obtain
synthesized speech of high quality by means of control signals of
small capacity, and with the novel audio response
apparatus it is possible to increase the number of words that can
be synthesized by a factor of 15 when compared with the
conventional apparatus.
Each one of the digital filter stages shown in FIG. 6 comprises one
multiplier and three adders. Thus, when the operations of the
multiplier and adders are controlled by a clock frequency of 10 MHz,
the operation time per stage will be approximately 1.8
microseconds. Assuming a maximum of 8T for the time difference of
the partial autocorrelation coefficients, one sampled value of the
synthesized speech will be formed within an interval of about 14.4
microseconds, but since each stage completes its operation every
1.8 microseconds it is possible to give excitation source
information to the input of the digital filters every 1.8
microseconds, thus producing synthesized speech outputs every 1.8
microseconds. Consequently, the above described period of 14.4
microseconds acts as a pure delay time necessary to synthesize one
sample of speech output. Thus, assuming a sampling frequency of 8 kHz
for the synthesized speech, it becomes possible to multiplex about 64
channels.
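The timing budget behind the figure of about 64 channels can be written out as follows. The stage time of 1.8 microseconds, the eight stages and the 8 kHz sampling frequency come from the text; expressing the stage time as 18 clock periods is a derived figure, not one stated in the patent.

```latex
t_{\text{clock}} = \frac{1}{10\ \text{MHz}} = 0.1\ \mu\text{s}, \qquad
t_{\text{stage}} \approx 1.8\ \mu\text{s} \approx 18\ \text{clock periods}

t_{\text{delay}} = 8 \times 1.8\ \mu\text{s} = 14.4\ \mu\text{s}

T = \frac{1}{8\ \text{kHz}} = 125\ \mu\text{s}, \qquad
\frac{T}{t_{\text{stage}}} = \frac{125}{1.8} \approx 69
```

Roughly 69 output samples can therefore be produced in each sampling interval, so 64 channels fit with a small margin.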
In the novel audio response apparatus, the fundamental pitch period
may be extracted by any well known means other than
that which has been described. Further, while in the foregoing
description the partial autocorrelation coefficient was obtained
from sampled values of the audio waveform, it is to be understood
that this coefficient can be determined by predicting the values at
two closely adjacent instants from the signal present between these
two instants and then determining the correlation of the
differences between the predicted values and the corresponding
actual values. Although in the
foregoing embodiment a plurality of digital filter stages were
connected in cascade, it will be clear that a single digital filter
may be used repeatedly to provide the desired synthesized
speech.
The audio response apparatus described hereinabove comprises a
memory to store the partial autocorrelation coefficients of a
speech signal together with the fundamental pitch period and signal
amplitude which are utilized as the excitation source information,
and a speech synthesizer which operates, in response to a command of
an electronic computer, to select the speech parameters stored in
the memory to synthesize speech. In the novel audio response
apparatus, when the speech synthesizer is utilized in multiplex on
the time division basis, it is possible to simultaneously synthesize
a plurality of different speeches and to simultaneously send
them out to respective output channels.
An improved audio response apparatus capable of sending out a
plurality of different speeches to a number of output channels at
the same time will be described hereunder.
There are many types of memory devices which can store speech
parameters such as magnetic core type, magnetic drum type and
magnetic disc type and so forth. Where it is desired to store
several thousands of words, inexpensive and large capacity magnetic
drum or magnetic disc type memories are preferred. For this reason,
in the following two embodiments of the audio response apparatus,
magnetic drum type memories are used to store speech parameters
so as to simultaneously give answers to 64 output channels.
FIG. 7 shows the connection diagram of one of such embodiments in
which each speech unit or the speech information of a word is
recorded on a magnetic drum in the form of a speech parameter and
in a sequence such that the speech parameters of a plurality of
words are read out on the time division basis.
FIG. 8 shows a typical arrangement of respective speech parameters
on the magnetic drum. With reference first to this arrangement, one
set of speech parameters is recorded in each block 73 shown in
FIG. 8. Each block 73 comprises the number of bits required for
recording a set of speech parameters. Left hand numerals in the
blocks designate the word numbers (speech unit numbers) whereas
right hand numerals designate the frame numbers. Taking a word "1"
for example, respective speech parameters which have been extracted
at a frame period of 15 milliseconds are recorded in separate blocks
"1, 1"; "1, 2"; . . . at every 15 milliseconds so that, assuming a
word duration of L seconds, the last speech parameter thereof
will be recorded in a block "1, N" spaced apart from the block "1,
1" by L seconds. As shown in FIG. 8 there is a relation
N = (L/15) × 10^3. Respective speech parameters of word "2"
are recorded in blocks "2, 1"; "2, 2"; . . . "2, N" of the same
cyclic store arrangement on the same magnetic drum, these blocks
being displaced by one block from the blocks for storing the word
"1." Words up to a word "M" are recorded in the same manner. In the
same way, respective speech parameters of a plurality of other
words are recorded in the other cyclic store arrangements of the
magnetic drum. The number of words M that can be recorded in
multiplex in the same cyclic store arrangement of the magnetic drum
in the manner as above described is limited by the frame period of
15 milliseconds and the bit rate of the magnetic drum.
In the following description, use is made of a magnetic drum for
recording speech parameters having a period of rotation of 20
milliseconds, a bit rate of 2048 KHz, 40960 bits per track
and 800 tracks. It is further assumed
that each block in the cyclic store arrangements contains 64 bits.
(Although the size of the block 73 may be 55 bits which is equal to
the magnitude of one set of speech parameters, 64 bits are selected
for the purpose of description). In such a case the number M
amounts to 480 and, if a word length of about 2 seconds were
assumed, the number N would be about 133. In the case of a
word length of about 2 seconds, it is impossible to record in a
single track all speech parameters which constitute the cyclic store
arrangement shown in FIG. 8. Accordingly in such a case tracks are
sequentially switched at each revolution of 20 milliseconds of the
drum whereby to form a long cyclic store arrangement as shown in
FIG. 8 with a plurality of tracks. In other words, in this case the
speech parameters of a word of duration of 2 seconds are recorded
in 100 tracks which are switched sequentially. More strictly, in
order to sequentially switch the tracks of a magnetic drum of a
rotary period of 20 milliseconds for recording at every 15
milliseconds, and to assure a cyclic store arrangement to be
perfectly cyclic, the duration of the word should be a common
multiple of 20 milliseconds and 15 milliseconds. For this reason,
in the following description, it is assumed that a word of duration
of 1.98 seconds is to be recorded on 99 tracks which are switched
sequentially. In this case the number N shown in FIG. 8 equals
132. If the number of tracks equals 800, 8 cyclic store
arrangements (FIG. 8) can be formed. As described above, since the
number of words M recorded in multiplex in one cyclic store
arrangement (comprising 99 tracks) equals 480, it is possible
to record speech parameters for a total of 480 × 8 = 3840 words
in eight cyclic store arrangements.
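The capacity figures in this paragraph can be reproduced from the drum
constants. The following is a minimal sketch under the stated
assumptions (20 millisecond rotation, 2048 KHz bit rate, 800 tracks,
64-bit blocks, 15 millisecond frames); the function name `layout` is
hypothetical.

```python
from math import gcd

ROTATION_MS = 20      # period of rotation of the drum
BIT_RATE_KHZ = 2048   # equivalently, bits per millisecond
TRACKS = 800
BLOCK_BITS = 64
FRAME_MS = 15

bits_per_track = ROTATION_MS * BIT_RATE_KHZ                    # 40960
words_per_arrangement = FRAME_MS * BIT_RATE_KHZ // BLOCK_BITS  # M


def layout(word_ms):
    """Return (tracks per arrangement, N, arrangements, total words).

    The word duration must be a common multiple of the rotation period
    and the frame period for the arrangement to be perfectly cyclic.
    """
    lcm = ROTATION_MS * FRAME_MS // gcd(ROTATION_MS, FRAME_MS)
    if word_ms % lcm:
        raise ValueError("duration is not a common multiple of 20 and 15 ms")
    tracks_per_arrangement = word_ms // ROTATION_MS
    frames = word_ms // FRAME_MS                               # N
    arrangements = TRACKS // tracks_per_arrangement
    return (tracks_per_arrangement, frames, arrangements,
            arrangements * words_per_arrangement)


print(bits_per_track, words_per_arrangement, layout(1980))
```

Calling `layout(2000)` raises an error, which mirrors the remark that a
2 second word does not give a perfectly cyclic arrangement.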
Speech parameters of each word are cyclically read out from left
to right as viewed in FIG. 8 by means of reproducing circuits, one
for each cyclic store arrangement. More particularly, with
reference to cyclic store arrangement 1, speech parameters of the
first set comprising words "1", "2" . . . "480" will appear
sequentially in the reproducing circuit within one frame period,
that is 15 milliseconds. Thereafter, the speech parameters of the
second set comprising words "1", "2" . . . "480" will appear
sequentially. In the same manner, successive sets of speech
parameters are successively reproduced. Thus, in the case of a word
length of 1.98 seconds, one cycle of operation is completed when
speech parameters of the words of the 132nd set appear.
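One way to picture this read-out order is as a mapping from a (word,
frame) pair to a position on the drum. The sketch below assumes that
blocks are packed contiguously in read-out order, that tracks and bit
offsets are counted from zero, and that an arrangement starts at the
beginning of its first track; none of these details is specified in
the text, and `block_location` is a hypothetical helper.

```python
WORDS = 480             # M, words multiplexed in one arrangement
FRAMES = 132            # N, frames per word of 1.98 seconds
BLOCK_BITS = 64
BITS_PER_TRACK = 40960


def block_location(word, frame):
    """Map 1-based (word, frame) to (track index, bit offset in track).

    Blocks are read in the order (1,1), (2,1), ..., (480,1), (1,2), ...
    so consecutive frames of one word lie 480 blocks (15 ms) apart.
    """
    if not (1 <= word <= WORDS and 1 <= frame <= FRAMES):
        raise ValueError("word or frame number out of range")
    index = (frame - 1) * WORDS + (word - 1)
    bit = index * BLOCK_BITS
    return bit // BITS_PER_TRACK, bit % BITS_PER_TRACK


print(block_location(1, 1), block_location(1, 2), block_location(480, 132))
```

Under these assumptions the last block of the arrangement ends exactly
at the end of the 99th track, so the arrangement closes on itself.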
The embodiment shown in FIG. 7 comprises a magnetic drum 60 for
recording respective speech parameters of respective words in the
cyclic store arrangement shown in FIG. 8, and track selection
matrices (61-1) . . . (61-8) to switch the tracks on the magnetic
drum storage 60 at each revolution thereof for forming 8 cyclic
store arrangements with a period of 1.98 seconds each. Each of the
track selection matrices is provided for 99 tracks and the outputs
from the track selection matrices are supplied to serial-parallel
converters (63-1) . . . (63-8) respectively through read amplifiers
(each including an appropriate pulse shaping circuit) (62-1) . . .
(62-8). Successively read out speech parameters are converted into
sets of parallel signals (each comprising 55 bits) by
the action of the serial-parallel converters (63-1) . . . (63-8)
and are then written in parameter buffer memories (64-1) . . .
(64-8) capable of storing one set (55 bits) of speech parameters
for each word in the respective cyclic store arrangement. Each of
the parameter buffer memories includes a read-write control circuit
and generally comprises two planes for simultaneously writing into
one and reading out from the other. The speech parameters
selectively read out from the parameter buffer memories are then
supplied to the aforementioned digital speech synthesizer 65.
Speech signals supplied by the digital synthesizer 65 in the form
of PCM are written in an output buffer memory 66 provided for each
output channel to store during one frame period (15 milliseconds).
Similar to the parameter buffer memories (64-1) . . . (64-8), the
output buffer memory has two planes as well as a read-write control
circuit. The output buffer memory 66 supplies to a D-A converter
67 the PCM code of one sample for each output channel, and the D-A
converter converts these PCM codes into PAM signals. The output from the
D-A converter 67 is supplied to low pass filters (69-1), (69-2) . .
. (69-64) through PAM gates (68-1), (68-2) . . . (68-64), one for
each output channel, to be converted to a continuous speech wave.
There is also provided an input control 71 which is connected to
the electronic computer to receive information representing the
word numbers of the words to be sent to each output channel. In
order to control on the time division basis the flow of the signal
from the parameter buffer memories (64-1) . . . (64-8) to PAM gates
(68-1), (68-2) . . . (68-64) for each output channel, there is
provided a sequence control 72. Further a word synchronizer 70 is
provided for issuing a transfer request to the electronic
computer and for designating the write address in parameter buffer
memories (64-1) . . . (64-8). In addition to the word
synchronizer 70 it is necessary to provide a magnetic drum
read-write control, but this is not shown in FIG. 7.
The magnetic drum 60, track selection matrices (61-1) . . . (61-8),
read amplifiers (62-1) . . . (62-8) and serial-parallel converters
(63-1) . . . (63-8) shown in FIG. 7 may be conventional ones
commonly used in digital electronic computers. Further, the
parameter buffer memories (64-1) . . . (64-8) and the output buffer
memory 66 may be magnetic core memories which are widely used in
ordinary electronic computers as the main memories. Furthermore,
the D-A converter 67, PAM gates (68-1), (68-2) . . . (68-64) and
low pass filters (69-1), (69-2) . . . (69-64) may also be
conventional ones commonly used in PCM transmission systems.
The details of the word synchronizer 70, the input control 71 and
the sequence control 72 are as follows.
FIG. 9 shows one example of the construction of the word
synchronizer 70. Two input signals TIMING and MARK shown on the
lefthand side of FIG. 9 represent control signals that have been
recorded on particular tracks of the magnetic drum storage 60. The
time chart of these control signals is shown by FIG. 10. As shown
the signal TIMING is generated at each complete revolution of the
magnetic drum whereas the signal MARK marks the block 73
corresponding to one set of the speech parameter shown in FIG. 8.
In the example shown in FIG. 8, each block includes 64 bits, and a
set of the speech parameters (55 bits) is recorded in one block.
While another signal CLOCK is also shown in FIG. 10, this signal
represents the bit position on the track of the magnetic drum, and
in this example the signal is a pulse sequence having a frequency
of 2048 KHz. As described above, since it is necessary to
successively switch the tracks to read the records thereon at each
revolution of the magnetic drum, in the circuit shown in FIG. 9
the TIMING signals are counted by a 99 step counter 75 and the
count is decoded by a decoder 74 so as to select a track to be read.
The output from the decoder 74 is supplied in parallel to respective
track switching circuits (61-1) . . . (61-8). An overflow signal 78
provided by counter 75 means that the period of 1.98 seconds has
elapsed, so that this overflow signal 78 is used to send a transfer
request signal to the electronic computer. In response to this
signal the electronic computer begins to transfer the information
designating the words to be sent out on respective output channels.
On the other hand, MARK pulses are counted for the purpose of
indicating the addresses at which speech parameters successively
read out from the magnetic drum are written into respective
parameter buffer memories. As described above, in the example shown
in FIG. 8, since the value of M equals 480, the MARK pulses are
counted by a 480 step counter 76 and the resulting counts
are used to indicate write addresses of respective parameter buffer
memories. Further, since each parameter buffer memory has two
planes, it is necessary to determine the plane to be written. For
this reason a flip-flop circuit 77 is provided to receive the
overflow signal 79 from the 480 step counter 76. The flip-flop
circuit 77 reverses the polarity of its output each time said
counter 76 counts up 480 MARK pulses in 15 milliseconds, to indicate
the plane into which the information should be written.
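The behaviour of the word synchronizer can be modelled in software as
two modulo counters and a toggle. This is only a behavioural sketch:
the class and method names are hypothetical, counts start from zero by
assumption, and the actual circuit of FIG. 9 is hardware.

```python
class WordSynchronizer:
    """Behavioural model of counters 75 and 76 and flip-flop 77."""

    TRACKS_PER_ARRANGEMENT = 99
    BLOCKS_PER_FRAME = 480

    def __init__(self):
        self.track = 0     # state of the 99 step counter 75
        self.address = 0   # state of the 480 step counter 76
        self.plane = 0     # state of flip-flop 77

    def on_timing(self):
        """Handle one TIMING pulse (one drum revolution).

        Returns True when the counter overflows, which corresponds to
        the transfer request (signal 78) after 1.98 seconds.
        """
        self.track = (self.track + 1) % self.TRACKS_PER_ARRANGEMENT
        return self.track == 0

    def on_mark(self):
        """Handle one MARK pulse and return the (plane, address) to write."""
        target = (self.plane, self.address)
        self.address += 1
        if self.address == self.BLOCKS_PER_FRAME:
            # Overflow (signal 79): one frame written, switch planes.
            self.address = 0
            self.plane ^= 1
        return target


sync = WordSynchronizer()
overflows = sum(sync.on_timing() for _ in range(99))
targets = [sync.on_mark() for _ in range(960)]
print(overflows, targets[0], targets[479], targets[480])
```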
One example of the construction of the sequence control 72 is
illustrated in FIG. 11 while the time relationships between various
signals are shown in FIG. 12. The sequence control 72 is operated
by the clock signal of a frequency of 2048 KHz of the magnetic
drum. The clock signal is converted into a signal 87 of a frequency
of 512 KHz by means of a 4 step counter 80, and the signal 87 is
supplied to a counting circuit including cascade connected 64 step
counter 84 and a 120 step counter 85, the contents of these
counters indicating the address of the output buffer memory 66 to
be read at that time. The address is sent to the output buffer
memory 66 to read the content corresponding to the address, and the
read out content is converted into an analogue signal by means of
D-A converter 67. At the same time decoder 86 operates to decode
the output from the 64 step counter 84 to produce gate signals (G-1),
(G-2), . . . (G-64) for opening PAM gates (68-1), (68-2) . . .
(68-64) in the output channels with the time relationships as shown
in FIG. 12. In this manner the signal which has been read out from
the output buffer memory 66 and converted to analogue form by the
D-A converter 67 is sent to the output channel designated by
counter 84. The output signal 87 from 4 step counter 80 is also
supplied to a counting circuit comprised by cascade connected 120
step counter 81 and 64 step counter 82. The contents of these
counters indicate the address in which the PCM code synthesized at
this time by synthesizer 65 is to be written in the output buffer
memory 66. As shown in FIG. 12, the overflow signal 88 of the 120
step counter 81 is generated at every 234 microseconds and supplied
to the following 64 step counter 82. Signal 88 is also used to
start input control 71. The overflow signal 89 from 64 step counter
82, which is generated every 120 × 64 / 512 KHz = 15
milliseconds, is supplied to the flip-flop circuit 83. The binary
output from this flip-flop circuit indicates which one of two
planes of the output buffer memory 66 should be written or read
out. The output signal 87 from 4 step counter 80 is sent to
synthesizer 65 for the purpose of operating it in synchronism with
the writing and read out operations of the output buffer memory
66.
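The timing relationships of the sequence control follow from the
counter chain. The sketch below derives them and models the two
address generators; the zero-based tick numbering and the function
names are assumptions made for illustration.

```python
DRUM_CLOCK_HZ = 2_048_000
SAMPLES_PER_FRAME = 120
CHANNELS = 64

# 4 step counter 80 divides the drum clock down to signal 87.
slot_clock_hz = DRUM_CLOCK_HZ // 4

# Overflow period of the 120 step counter 81 (signal 88), in microseconds.
channel_slot_us = SAMPLES_PER_FRAME / slot_clock_hz * 1e6

# Overflow period of the 64 step counter 82 (signal 89), in milliseconds.
frame_ms = SAMPLES_PER_FRAME * CHANNELS / slot_clock_hz * 1e3

# On the read side each channel is visited once every 64 ticks.
per_channel_read_rate_hz = slot_clock_hz / CHANNELS


def write_address(tick):
    """Counters 81 then 82: all 120 samples of a channel, then the next."""
    return (tick // SAMPLES_PER_FRAME) % CHANNELS, tick % SAMPLES_PER_FRAME


def read_address(tick):
    """Counters 84 then 85: one sample for each channel, then the next."""
    return tick % CHANNELS, (tick // CHANNELS) % SAMPLES_PER_FRAME


print(slot_clock_hz, channel_slot_us, frame_ms, per_channel_read_rate_hz)
```

Both functions return a (channel, sample) pair. The write side fills
one channel at a time, whereas the read side interleaves channels so
that each output receives a sample at the 8 KHz rate.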
FIG. 13 shows one example of the construction of the input control
71. When a transfer request is sent to the electronic computer by
the signal 78 from word synchronizer 70, information designating
the word numbers of words to be sent to respective output channels
CH-1, CH-2 . . . CH-64 is transferred from the electronic computer
and this information is temporarily stored in registers (93-1),
(93-2) . . . (93-64), respectively, corresponding to respective
output channels. After elapse of the word length, 1.98 seconds, the
word synchronizer 70 sends a request signal to the electronic
computer as above described, but signal 78 is also supplied to
gates (92-1), (92-2) . . . (92-64) of the input control 71 as the
gate signal to transfer the contents of registers (93-1), (93-2) .
. . (93-64) into registers (91-1), (91-2) . . . (91-64)
respectively. As above described, the sequence control 72 provides
a start signal 88 to the input control 71 at every 234 microseconds
which is counted by the 64 step counter 95. The content of counter
95 is decoded by decoder 94 to produce gate signals (96-1), (96-2)
. . . (96-64) for gate circuits (90-1), (90-2) . . . (90-64)
respectively. By the action of these gate signals, the contents of
registers (91-1), (91-2) . . . (91-64) are transferred as read
addresses successively to the parameter buffer memories at an
interval of 234 microseconds to read the same. Assuming a word
length of 1.98 seconds, when the contents of respective registers
(91-1), (91-2) . . . (91-64) are sent 132 times (132 frames) to the
parameter buffer memory, the information designating the next
word, which has been transferred from the electronic computer and
is being stored in respective registers (93-1), (93-2) . . .
(93-64), is transferred to registers (91-1), (91-2) . . . (91-64),
respectively, by the signal 78 generated by the word synchronizer 70
at that time. The above described operations are repeated in
synchronism with the duration of the words.
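The two ranks of registers in the input control act as a pending set
and an active set. A minimal behavioural sketch follows; the class and
method names are hypothetical and the channel index is counted from
zero by assumption.

```python
class InputControl:
    """Behavioural model of registers 93, registers 91 and counter 95."""

    def __init__(self, channels=64):
        self.pending = [None] * channels  # registers (93-1) ... (93-64)
        self.active = [None] * channels   # registers (91-1) ... (91-64)
        self.counter = 0                  # 64 step counter 95

    def load_from_computer(self, word_numbers):
        """Store the word numbers for the next word period."""
        self.pending = list(word_numbers)

    def on_signal_78(self):
        """Word period elapsed: pending word numbers become active."""
        self.active = list(self.pending)

    def on_signal_88(self):
        """Start signal: return (channel, word number) as read address."""
        channel = self.counter
        self.counter = (self.counter + 1) % len(self.active)
        return channel, self.active[channel]


ctl = InputControl(channels=4)
ctl.load_from_computer([10, 20, 30, 40])
ctl.on_signal_78()
ctl.load_from_computer([11, 21, 31, 41])  # does not disturb current word
reads = [ctl.on_signal_88() for _ in range(5)]
print(reads)
```

Keeping the pending and active sets separate lets the computer deliver
the next word numbers at any time during the current word period.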
Referring again to FIG. 7, the speech parameters read out from the
magnetic drum 60 and converted into parallel signals set by set
are written in the addresses of the respective parameter buffer
memories (64-1) . . . (64-8), which correspond to the eight cyclic
store arrangements. Accordingly, each address in these
memories (64-1) . . . (64-8) includes 55 bits for one set of speech
parameters. Of course the above described operation is performed in
parallel for the eight cyclic store arrangements so that one set of
speech parameters for each of the 3,840 words is written
in parameter buffer memories (64-1) . . . (64-8) in the manner
described above. This writing operation into the parameter buffer
memories is completed within one frame period of 15 milliseconds.
Then a read out cycle begins for the parameter buffer memories
(64-1) . . . (64-8) for each output channel. While this read out
cycle proceeds, the speech parameters for the next frame
period read out from the magnetic drum 60 are written in the other
plane of parameter buffer memories (64-1) . . . (64-8),
respectively, having two planes as above described. During the read
out cycle, the contents of registers (91-1), (91-2) . . . (91-64) of
the input control 71 are transferred to parameter buffer memories
according to the order of the output channels under the control of
signal 88 from sequence control 72 to read the contents (speech
parameters of one set) of the addresses of parameter buffer
memories (64-1) . . . (64-8) and the read out contents are sent to
the speech synthesizer 65. As described above in detail, on
reception of the read out contents, the synthesizer 65 operates to
synthesize PCM speech codes, for example 120 PCM samples, which
should be
produced in one frame period. These synthesized codes are
successively stored in addresses of the output buffer memory 66,
said addresses being indicated by 120 step counter 81, 64 step
counter 82 and flip-flop 83 of the sequence control 72.
Each address of the output buffer memory 66 comprises, for example,
8 bits, enough to store one PCM speech code. This operation is
performed in a period corresponding to 1/64 of one frame (15
milliseconds) or 234 microseconds. As a result, during one frame
period, this operation is performed for 64 output channels on the
time division basis. Thus, 120 PCM samples for each output channel
are written in the addresses corresponding to respective output
channels of the output buffer memories, during one frame period, or
15 milliseconds. The contents of the output buffer memory 66 are
read out on the time division basis in synchronism with gate
signals (G-1), (G-2) . . . (G-64) of PAM gates (68-1), (68-2) . . .
(68-64) of respective output channels, under the control of the
sequence control 72. The read out signals are converted into PAM
signals by D-A converter 67, and these are supplied to the output
channels as continuous speech waves through the corresponding ones
of low pass filters (69-1), (69-2) . . . (69-64).
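The two-plane operation of the parameter and output buffer memories
corresponds to what is commonly called double buffering. The following
is a generic sketch of that technique, not a description of the actual
memory circuits; the class name and its methods are hypothetical.

```python
class TwoPlaneBuffer:
    """Memory with two planes: one is written while the other is read."""

    def __init__(self, size):
        self.planes = [[None] * size, [None] * size]
        self.write_plane = 0  # role of the plane-select flip-flop

    def write(self, address, value):
        """Write into the plane currently selected for writing."""
        self.planes[self.write_plane][address] = value

    def read(self, address):
        """Read from the other plane, filled during the previous frame."""
        return self.planes[1 - self.write_plane][address]

    def swap(self):
        """Exchange the roles of the planes at a frame boundary."""
        self.write_plane ^= 1


buf = TwoPlaneBuffer(size=4)
buf.write(0, "frame 1")
before_swap = buf.read(0)      # other plane is still empty
buf.swap()
buf.write(0, "frame 2")        # goes to the other plane
after_swap = buf.read(0)       # data written before the swap
print(before_swap, after_swap)
```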
The series of operations described above is repeated with the frame
period of 15 milliseconds to provide a speech wave of the duration
of the words for respective output channels. The word numbers of
the words to be treated next have already been transferred from
the electronic computer to the registers (93-1), (93-2) . . .
(93-64) of the input control 71 in response to the transfer request
signal 78 from the word synchronizer 70 before commencement of the
treatment of the next words. By repeating these operations with a
period equal to the duration of the word (1.98 seconds, for
example), compiled audio messages are sent to respective output
channels CH-1, CH-2 . . . CH-64.
Although in the above described example of the audio response
apparatus, a magnetic drum was used as the memory for speech
parameters, it will be clear that any other type of memory may
be used so long as it can record the speech parameters in the form
of cyclic store arrangement.
Another embodiment of the audio response apparatus utilizing a
magnetic drum as the memory for storing speech parameters will be
described hereunder. Unlike the first embodiment, in which
the speech parameters extracted from respective words were recorded
at intervals on the tracks of the magnetic drum, in this
modification these speech parameters are recorded continuously,
starting from a particular address. More particularly, each word is
recorded continuously without any overlap in 132 blocks in the case
where the duration of each word is 1.98 seconds, for example,
starting from the first address of the drum which is predetermined
for each word. When the word numbers of the words to be sent to
output channels are transferred from the electronic computer,
speech parameters (consisting of 132 sets, each) for the designated
words for the output channels are read out from the magnetic drum
and are stored in the parameter buffer memory. Thereafter, just in
the same manner as in the first embodiment the speech signal is
synthesized for each one frame, stored in the output buffer memory,
and is sent to the output channel as a continuous speech signal for
the designated word through the D-A converter, the PAM gate and the
low pass filter.
FIG. 14 shows another embodiment comprising a magnetic drum 60 for
storing sets of speech parameters for respective words, a parameter
buffer memory 98 for temporarily storing the speech parameters of
the words selectively read out from the magnetic drum 60, an input
control 100 for storing information sent from the electronic
computer to designate the word numbers and for sending the read out
address to the parameter buffer memory 98 at definite times, and a
magnetic drum control 97 responsive to the command from the input
control 100 for reading the contents of magnetic drum 60 and writing
them in the parameter buffer memory 98. The modification further
comprises a synthesizer 65 for synthesizing a speech (120 samples)
of one frame (15 milliseconds) from one set of speech parameters
read out from the parameter buffer memory, said synthesizer
including digital filters for synthesizing the speech signal on the
time division basis for each output channel, an output buffer
memory 66 for temporarily storing a group of PCM codes corresponding
to the speech signal synthesized by the synthesizer 65, a D-A
converter 67 for converting digital codes read out from the output
buffer memory into analogue signals, PAM gates (68-1), (68-2) . . .
(68-64) for distributing analogue signals from D-A converter 67
among respective output channels CH-1, CH-2 . . . CH-64, low pass
filters (69-1), (69-2) . . . (69-64) for converting the outputs
from respective PAM gates into a continuous waveform and a sequence
control 99 for controlling various component parts described
above.
Of these component parts, magnetic drum 60, parameter buffer memory
98, output buffer memory 66, D-A converter 67, PAM gates (68-1) . .
. (68-64) and low pass filters (69-1), (69-2) . . . (69-64) are
also conventional ones widely used in electronic computers and PCM
transmission systems. Magnetic drum control 97 is substantially
identical to a conventional magnetic drum channel device. In the
conventional computer, in order to read the magnetic drum by means
of a magnetic drum channel device and to store the read out
information in the main memory (corresponding to the parameter
buffer memory 98 shown in FIG. 14) it is necessary to provide some
means to give the address to the magnetic drum channel device for
reading the drum, number of words and write address of the main
memory, but with the magnetic drum control 97 shown in FIG. 14, the
number of words to be read is constant (132 words for the speech
unit of length of 1.98 seconds) which is determined by the duration
of the speech unit and the write address of the parameter buffer
memory varies regularly so that it is not necessary to designate
these values by the input control 100. The sequence control 99 is
substantially identical to the sequence control 72 of the first
embodiment except that it is controlled by independent clock
signals (in other words not synchronized with the revolution of the
magnetic drum.)
FIG. 15 shows the detail of the input control 100. Information
sent from the electronic computer for designating the words to be
sent to respective output channels is stored in registers (104-1),
(104-2) . . . (104-64) corresponding to respective output channels
CH-1, CH-2 . . . CH-64. This information is transferred to
registers (102-1), (102-2) . . . (102-64) through gates (103-1),
(103-2) . . . (103-64) operated by the overflow signal 111 (this
signal also acts as the transfer request signal for the electronic
computer) generated by a 132 step counter 108 at a period of the
duration of the word. The information is then successively
transferred to the magnetic drum control 97 in the order of
registers (102-1), (102-2) . . . (102-64), thus reading the
magnetic drum. The input control start signals 88 sent from the
sequence control 99 at an interval of 234 microseconds are
counted by 132 step counter 105 and 64 step counter 106. The
content of the 64 step counter 106 is decoded by a decoder 112 to
produce gate signals (110-1), (110-2) . . . (110-64) for opening
gates (101-1), (101-2) . . . (101-64) at an interval of about 30
milliseconds whereby to successively send the contents of registers
(102-1), (102-2) . . . (102-64) to the magnetic drum control 97.
All parameters of a word designated by the contents of registers
(102-1), (102-2) . . . (102-64) are required to be read out within
the duration of the word (1.98 seconds) and stored in the parameter
buffer memory 98. However, a magnetic drum generally has a
relatively long access time, so that it takes a maximum of about 25
milliseconds to give the information designating the word from
the input control 100 to the magnetic drum control 97 and to read
all parameters of the word and store them in the parameter buffer
memory 98. Since, in this case, gates (101-1), (101-2) . . .
(101-64) are opened at an interval of about 30 milliseconds, there
is sufficient time to read the magnetic drum 60.
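The timing margin described here can be verified numerically. The
sketch assumes that the start signal interval is exactly 120 / 512 KHz
(about 234 microseconds) as in the first embodiment, although the
sequence control 99 runs from an independent clock; the variable names
are hypothetical.

```python
# Assumed exact interval of start signal 88, in microseconds.
START_INTERVAL_US = 120 / 512_000 * 1e6

# 132 step counter 105 overflows once per gate interval.
gate_interval_ms = 132 * START_INTERVAL_US / 1e3

# 64 step counter 106 steps through all output channels.
full_cycle_s = 64 * gate_interval_ms / 1e3

MAX_DRUM_READ_MS = 25  # worst-case time to fetch one word, per the text
slack_ms = gate_interval_ms - MAX_DRUM_READ_MS

print(gate_interval_ms, full_cycle_s, slack_ms)
```

Under this assumption the 64 gate intervals add up to the 1.98 second
word duration, and each drum read finishes with several milliseconds
to spare.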
Input control start signals 88 are also counted by 64 step counter
107 and 132 step counter 108, and the contents of these counters
are sent to parameter buffer memory 98 as an address thereof to be
read at this time. Since the writing operation of the speech
parameters from the magnetic drum 60 and the reading operation of
the content of the address designated by the input control 100 are
performed in parallel, the parameter buffer memory 98 is provided
with two planes, as in the first embodiment. To select either one
of these planes there is provided a flip-flop 109 which reverses
the polarity of the output in response to the overflow signal 111
from the 132 step counter 108.
During the period in which the magnetic drum 60 is read out by the
contents of the registers (102-1), (102-2) . . . (102-64),
information transferred from the electronic computer for
designating the next words is received and stored in registers
(104-1), (104-2) . . . (104-64). In this manner, information for
designating words is successively received from the electronic
computer to send different audio messages designated thereby to
respective output channels. It is of course possible to substitute
a magnetic disc storage for the magnetic drum to store the speech
parameters.
As above described, according to the novel audio response
apparatus, speech signals are recorded as compressed information by
using partial autocorrelation coefficients as parameters so that it
is possible to economically accommodate and read out a great many
words. In addition, since a digital speech synthesizer is used, one
single synthesizer can be used in common for many output channels,
64 for example, on the time division basis, which is extremely
economical.
It is to be understood that the invention is by no means limited
to the particular embodiments illustrated but that many changes and
alterations may be made within the spirit and scope of the
invention as defined in the appended claims.
* * * * *