U.S. patent number 3,786,195 [Application Number 05/171,571] was granted by the patent office on 1974-01-15 for variable delay line signal processor for sound reproduction.
This patent grant is currently assigned to Cambridge Research and Development Group, D. T. Liquidating Partnership, Sanford D. Greenberg. Invention is credited to Murray M. Schiffman.
United States Patent 3,786,195
Schiffman
January 15, 1974
Please see images for: (Certificate of Correction)
VARIABLE DELAY LINE SIGNAL PROCESSOR FOR SOUND REPRODUCTION
Abstract
An electrical delay line which has variable delay controlled by
a signal input thereto is connected in a sound signal channel for
signals such as human speech, to compress or expand the sound
signal waveform depending on whether the time delay in the line is
increased or decreased. By periodically sweeping the delay line
from minimum to maximum time delay or vice-versa, repeated segments
of a continuous sound signal waveform are processed so that an
output audio signal can be obtained having the original frequency
components of the signal and occupying a time duration which is
equal to or smaller or larger than the original sound sequence with
the successive segments of the signal processed by the variable
delay line assembled with regard both to the significant parameters
of human speech or other coding and the electrical conditions
imposed by the system to produce a composite audio signal which is
an intelligible replica of the original and substantially free of
annoying aberrations introduced by the delay line processor.
Variable delay using analog or digital signal storage is also
provided.
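Illustrative note (not part of the patent text): the effect summarized in the abstract can be sketched with an idealized delay line whose output is y(t) = x(t - tau(t)), where tau(t) changes linearly with time. This model, the function names, and the parameter values are assumptions chosen for illustration; they do not describe the patented circuit. Under this model, a delay that increases with slope k lowers each frequency component by the factor (1 - k), and a decreasing delay raises it.

```python
import math

def swept_delay_output(signal, t, slope, tau0=0.0):
    """Idealized variable delay line: y(t) = x(t - tau(t)), tau(t) = tau0 + slope * t."""
    return signal(t - (tau0 + slope * t))

def count_zero_crossings(samples):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))

def estimate_output_frequency(freq_in, slope, duration=0.01, rate=200_000):
    """Estimate the output frequency of a sine tone passed through the swept delay."""
    def tone(t):
        return math.sin(2 * math.pi * freq_in * t)
    # Sample at half-step offsets so no sample falls exactly on t = 0.
    times = [(n + 0.5) / rate for n in range(int(duration * rate))]
    out = [swept_delay_output(tone, t, slope) for t in times]
    # Each full cycle contributes two zero crossings.
    return count_zero_crossings(out) / (2 * duration)

if __name__ == "__main__":
    print(estimate_output_frequency(2000.0, 0.5))   # increasing delay: near 1000 Hz
    print(estimate_output_frequency(2000.0, -0.5))  # decreasing delay: near 3000 Hz
```

The zero-crossing count is only a rough frequency estimate over a short window, which is why the comments say "near" rather than giving exact values.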
Inventors: Schiffman; Murray M. (Newton, MA)
Assignee: Cambridge Research and Development Group (Westport, CT)
          Greenberg; Sanford D. (Washington, DC)
          D. T. Liquidating Partnership (New York, NY)
Family ID: 22624258
Appl. No.: 05/171,571
Filed: August 13, 1971
Current U.S. Class: 704/211; G9B/21; 704/E21.017; 704/502
Current CPC Class: G11B 5/00 (20130101); H03H 7/30 (20130101); H04B 1/66 (20130101); G11B 21/00 (20130101); G10L 21/04 (20130101); H03H 11/265 (20130101); H03K 4/502 (20130101); H03K 7/08 (20130101); H04B 1/662 (20130101); H04B 3/10 (20130101)
Current International Class: G10L 21/04 (20060101); G10L 21/00 (20060101); H03K 7/08 (20060101); H03H 7/30 (20060101); H03K 4/00 (20060101); H03H 11/26 (20060101); H04B 1/66 (20060101); H03K 7/00 (20060101); H03K 4/502 (20060101); H04B 3/10 (20060101); H04B 3/04 (20060101); G11B 21/00 (20060101); G11B 5/00 (20060101); G10l 001/06 ()
Field of Search: ;179/15.55T,15.55R,15BW ;178/5.4HE,6.6TC,DIG.3
Primary Examiner: Claffy; Kathleen H.
Assistant Examiner: Leaheey; Jon Bradford
Attorney, Agent or Firm: Pfund, Esq.; Charles E. Chittick,
Thompson & Pfund
Claims
I claim:
1. A processor for electric signals representing coded audible
signals such as speech or the like comprising:
means for deriving said electric signals as analog representations
of said audible signals with the frequency components of said
electric signals related by a given factor to the frequency
components of said audible signals;
variable delay line means having an input and an output, said input
coupled to said means for deriving said electric signals for
propagating representations of said electric signals to said output
with controllable time delay;
means for controlling said delay line means for periodic linear
variation of said time delay between predetermined delay values;
and
means coupled to said output of said delay line means and
responsive substantially only to signals propagating through said
delay line means which have been subject to unidirectional
variation of said time delay for producing output signals
reproducible as an audible representation of said electric signals
having frequency components altered by substantially said factor to
approximate the frequency components of said audible signals.
2. A processor for electric signals representing coded audible
signals such as speech or the like comprising:
means for deriving said electric signals as analog representations
of said audible signals with the frequency components of said
electric signals related by a given factor to the frequency
components of said audible signals;
variable delay line means having an input and an output, said input
coupled to said means for deriving said electric signals for
propagating representations of said electric signals to said output
with controllable time delay;
means for controlling said delay line means for periodic linear
variation of said time delay between predetermined delay values;
and
means coupled to said output of said delay line means and
responsive substantially only to signals propagating through said
delay line means which have been subject to unidirectional
variation of said time delay for producing output signals
reproducible as an audible representation of said electric signals
having frequency components altered by substantially said factor to
approximate the frequency components of said audible signals;
and
in which said means for controlling said delay line means provides
periods of linearly changing unidirectional delay variation
alternating with reset intervals wherein said delay line is
returned to an initial predetermined delay value and including
means for blanking output signals from said output of said delay
line means to substantially zero amplitude during said reset
intervals.
3. Apparatus according to claim 2 in which said means for blanking
includes means for gradually restoring the amplitude of said output
signals to full amplitude after said reset intervals.
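Illustrative note (not part of the claims): the timing recited in claims 2 and 3 can be pictured as a sawtooth delay with a gain envelope that is zero during each reset interval and then ramps back up. The sketch below is a minimal model under stated assumptions: the linear return during reset, the linear shape of the gain ramp, the function names, and the millisecond values are all hypothetical and are not specified by the claims.

```python
def sawtooth_delay(t, period, reset, d_min, d_max):
    """Delay ramps linearly from d_min to d_max, then returns to d_min during reset.

    The linear return during the reset interval is an assumption; the claims only
    require that the line be returned to its initial delay value.
    """
    phase = t % period
    sweep = period - reset
    if phase < sweep:
        return d_min + (d_max - d_min) * phase / sweep
    return d_max - (d_max - d_min) * (phase - sweep) / reset

def blanking_gain(t, period, reset, restore):
    """Output gain: 0 during reset, linear ramp for `restore` after it, else 1."""
    phase = t % period
    sweep = period - reset
    if phase >= sweep:
        return 0.0              # blanked to zero amplitude during the reset interval
    if phase < restore:
        return phase / restore  # gradual restoration after the reset interval
    return 1.0

if __name__ == "__main__":
    # Hypothetical values in milliseconds: 20 ms cycle, 2 ms reset, 1 ms restore ramp.
    for t in (0.5, 10.0, 19.0):
        print(t, sawtooth_delay(t, 20.0, 2.0, 0.0, 6.0), blanking_gain(t, 20.0, 2.0, 1.0))
```

Multiplying the delay line output by `blanking_gain` suppresses the segment produced while the delay is being returned to its starting value.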
4. Apparatus according to claim 1 and including input high pass
filter means, said filter means substantially eliminating
components of said electrical signals from propagating through said
delay line means which are below a frequency approximately equal to
said factor times the reciprocal of the period of said linear
variation.
5. Apparatus according to claim 4 and including band pass output
filter means for limiting said output signals to a predetermined
audio bandwidth.
6. Apparatus according to claim 1 and including:
input high pass filter means having adjustable low frequency cutoff
interposed between said means for deriving said electric signals
and said input of said variable delay line means;
means for adjusting the effective rate of said periodic linear
variation to adjust said factor by which frequencies propagated
through said delay line are altered; and
means coupling said cutoff adjusting means and said effective rate
adjusting means for related variation to eliminate low frequency
components which would otherwise produce frequency components in
said delay line means below an audio low frequency limit related to
said factor and the period of said variation.
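Illustrative note (not part of the claims): claim 4 places the high-pass cutoff at approximately the factor times the reciprocal of the sweep period, and claim 6 couples the cutoff adjustment to the rate adjustment. A minimal sketch of such a ganged control follows. The class name, the field names, and the use of the claim 48 expression, read here as 2(c - 1)/(c + 1), for the sweep slope are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class GangedControl:
    """Hypothetical single control: setting `factor` moves cutoff and slope together."""
    sweep_period_s: float
    factor: float = 1.0

    @property
    def highpass_cutoff_hz(self):
        # Claim 4: approximately the factor times the reciprocal of the sweep period.
        return self.factor / self.sweep_period_s

    @property
    def delay_slope(self):
        # Expression from claim 48 as read here: 2 * (c - 1) / (c + 1).
        c = self.factor
        return 2.0 * (c - 1.0) / (c + 1.0)

if __name__ == "__main__":
    control = GangedControl(sweep_period_s=0.02, factor=2.0)
    print(control.highpass_cutoff_hz, control.delay_slope)
    control.factor = 1.5  # one adjustment changes both derived values
    print(control.highpass_cutoff_hz, control.delay_slope)
```

Deriving both values from one stored factor is one way to keep the cutoff and the sweep rate from drifting apart; a mechanical or electrical coupling would serve the same purpose in hardware.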
7. Apparatus according to claim 6 in which said means for
controlling said delay line means provides periods of linearly
changing unidirectional delay variation alternating with reset
intervals wherein said delay line is returned to an initial
predetermined delay value and including means for blanking output
signals from said output of said delay line means to substantially
zero amplitude during said reset intervals.
8. Apparatus according to claim 7 in which said means for blanking
includes means for gradually restoring the amplitude of said output
signals to full amplitude after said reset intervals.
9. Apparatus according to claim 2 in which said means for blanking
includes means responsive to the level of said output signals being
zero at the beginning of said reset interval to initiate said
blanking and means responsive to the level of said output signals
being zero at the end of said reset interval to terminate said
blanking.
10. Apparatus according to claim 2 and including means for
introducing an audio signal segment for combining with said
audible representation of said electric signals during said reset
interval.
11. A processor for electric signals representing coded audible
signals such as speech or the like comprising:
means for deriving said electric signals as analog representations
of said audible signals with the frequency components of said
electric signals related by a given factor to the frequency
components of said audible signals;
variable delay line means having an input and an output, said input
coupled to said means for deriving said electric signals for
propagating representations of said electric signals to said output
with controllable time delay;
means for controlling said delay line means for periodic linear
variation of said time delay between predetermined delay values;
and
means coupled to said output of said delay line means and
responsive substantially only to signals propagating through said
delay line means which have been subject to unidirectional
variation of said time delay for producing output signals
reproducible as an audible representation of said electric signals
having frequency components altered by substantially said factor to
approximate the frequency components of said audible signals;
and
in which said means for controlling said delay line means provides
periods of linearly changing unidirectional delay variation
alternating with reset intervals wherein said delay line is
returned to an initial predetermined delay value and further
including:
second delay line means coupled to said means for deriving said
electric signals; and
means for interrupting signals from the output of said variable
delay line means and substituting signals from the output of said
second delay line means during said reset intervals to produce said
output signals reproducible as said audible representation.
12. Apparatus according to claim 11 in which said second delay line
means include a variable delay line and including means for
controlling said second delay line means for periodic linear
variation of said time delay between predetermined delay values,
said variation of said second delay line means occurring during
said reset intervals for the first recited said delay line
means.
13. A processor for electric signals representing coded audible
signals such as speech or the like, said electric signals being
analog representations of said audible signals with the frequency
components of said electric signals related by a given factor to
the frequency components of said audible signals comprising:
first and second variable delay line means coupled to a source of
said electric signals for propagating said electric signals therein
with controllable time delay;
means for controlling said first and second delay line means for
periodic linear variation of said time delay between predetermined
delay values, said means for controlling said first and second
delay line means providing periods of linearly changing
unidirectional delay variation alternating with reset intervals
wherein said delay lines are returned to an initial predetermined
delay value, said reset intervals for each said delay line means
occurring within the period of the unidirectional delay variation
of the other said delay line means; and
means coupled to the outputs of said first and second delay line
means and responsive substantially only to signals propagating
through said delay line means which have been subject to
unidirectional variation of said delay for producing an audible
representation of said electric signals having frequency components
altered by substantially said factor to approximate the frequency
components of said audible signals.
14. Apparatus according to claim 13 in which said first and second
delay line means vary between the same said predetermined delay
values alternately with substantially equal periods of said linear
variation.
15. Apparatus according to claim 14 in which said reset intervals
are substantially shorter than the periods of said linear variation
and occur for each said delay line means approximately at the
mid-point of said linear variation for the other said delay line
means and including:
separate audio reproducer means coupled to the respective outputs
of said first and second delay line means for producing
simultaneously separate versions of said audible representation
from each said reproducer means.
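Illustrative note (not part of the claims): claims 13 to 15 describe two delay lines swept alternately so that each line resets near the mid-point of the other line's sweep. One way to picture this is to offset the second line's cycle by half a period and pass only a line that is currently sweeping. The half-period offset, the preference for the first line, and all names below are assumptions for this sketch.

```python
def is_sweeping(t, period, reset, offset=0.0):
    """True while a line is in its linear sweep, False during its reset interval."""
    return (t - offset) % period < period - reset

def usable_lines(t, period, reset):
    """Sweep status of two lines whose cycles are offset by half a period."""
    first = is_sweeping(t, period, reset)
    second = is_sweeping(t, period, reset, offset=period / 2)
    return first, second

def select_output(t, period, reset, first_sample, second_sample):
    """Pass only a line that is sweeping; prefer the first line when both are."""
    first, second = usable_lines(t, period, reset)
    if first:
        return first_sample
    if second:
        return second_sample
    return 0.0  # both resetting; cannot happen when reset < period / 2

if __name__ == "__main__":
    # Hypothetical values in milliseconds: 20 ms cycle, 2 ms reset.
    for t in (5.0, 9.0, 19.0):
        print(t, usable_lines(t, 20.0, 2.0))
```

With a reset interval shorter than half the period, at least one line is always in its sweep, so the combined output has no gaps. Claim 15 instead feeds each line to its own reproducer, in which case no selection step is needed.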
16. Apparatus according to claim 11 and including:
input high pass filter means having adjustable low frequency cutoff
interposed between said means for deriving said electric signals
and said input of said variable delay line means;
means for adjusting the effective rate of said periodic linear
variation to adjust said factor by which frequencies propagated
through said delay line are altered; and
means coupling said cutoff adjusting means and said effective rate
adjusting means for related variation to eliminate low frequency
components which would otherwise produce frequency components in
said delay line means below an audio low frequency limit related to
said factor and the period of said variation.
17. Apparatus according to claim 16 in which said second delay line
means is also a variable delay line and including means for
controlling said second delay line means for periodic linear
variation of said time delay between predetermined delay values,
said variation of said second delay line means occurring during
said reset intervals for the first recited said delay line
means.
18. A processor for electric signals representing coded audible
signals such as speech or the like, said electric signals being
analog representations of said audible signals with the frequency
components of said electric signals related by a given factor to
the frequency components of said audible signals comprising:
first and second variable delay line means coupled to a source of
said electric signals for propagating said electric signals therein
with controllable time delay;
input high pass filter means having adjustable low frequency cutoff
interposed between said source and the input of said variable delay
line means;
means for controlling said first and second delay line means for
periodic linear variation of said time delay between predetermined
delay values, said means for controlling said first and second
delay line means providing periods of linearly changing
unidirectional delay variation alternating with reset intervals
wherein said delay lines are returned to an initial predetermined
delay value, said reset intervals for each said delay line means
occurring within the period of the unidirectional delay variation
of the other said delay line means;
means for adjusting the effective rate of said periodic linear
variation to adjust the factor by which frequencies propagated
through said delay line are altered;
means coupling said cutoff adjusting means and said effective rate
adjusting means for related variation to eliminate low frequency
components which would otherwise produce frequency components in
said delay line means below an audio low frequency limit related to
said factor and the period of said variation; and,
means coupled to the outputs of said first and second delay line
means and responsive substantially only to signals propagating
through said delay line means which have been subject to
unidirectional variation of said delay for producing an audible
representation of said electric signals having frequency components
altered by substantially said given factor to approximate the
frequency components of said audible signals.
19. Apparatus according to claim 18 in which said first and second
delay line means vary between the same said predetermined delay
values alternately with substantially equal periods of said linear
variation.
20. Apparatus according to claim 19 in which said reset intervals
are substantially shorter than the periods of said linear variation
and occur for each said delay line means approximately at the
mid-point of said linear variation for the other said delay line
means and including:
separate audio reproducer means coupled to the respective outputs
of said first and second delay line means for producing
simultaneously separate versions of said audible representation
from each said reproducer means.
21. Apparatus according to claim 11 in which said second delay line
means is a fixed delay line the input of which is coupled to an
output of said variable delay line means.
22. Apparatus according to claim 21 and including:
input high pass filter means having adjustable low frequency cutoff
interposed between said source and the input of said variable delay
line means;
means for adjusting the effective rate of said periodic linear
variation to adjust said factor by which frequencies propagated
through said delay line are altered; and
means coupling said cutoff adjusting means and said effective rate
adjusting means for related variation to eliminate low frequency
components which would otherwise produce frequency components in
said delay line means below an audio low frequency limit related to
said factor and the period of said variation.
23. A processor for recordings representing coded audible signals
such as speech or the like comprising:
playback means operable to reproduce said recording at a rate
faster by a given factor than the recording rate thereby producing
electric signals which are the time compressed analog of said
audible signals with frequency components increased by said
factor;
variable delay line means coupled to said playback means for
propagating said electric signals therein with controllable time
delay;
means for controlling said delay line means for periodic linear
increase of said time delay between predetermined delay values;
and,
means coupled to the output of said delay line means and responsive
substantially only to signals propagating through said delay line
means which have been subject to said increase of said time delay
for producing an audible representation of said electrical signals
having frequency components reduced substantially by said factor to
approximate the frequency components of said audible signals.
24. Apparatus according to claim 23 and including input high pass
filter means, said filter means substantially eliminating
components of said electrical signals from propagating through
said delay line means which are below a frequency approximately
equal to said factor times the reciprocal of the period of said
linear variation.
25. Apparatus according to claim 24 and including band pass output
filter means limiting said output representation to a predetermined
audio bandwidth.
26. Apparatus according to claim 23 and including:
input high pass filter means having adjustable low frequency cutoff
interposed between said playback means and the input of said
variable delay line means;
means for adjusting the effective rate of said periodic linear
variation to adjust said factor by which frequencies propagated
through said delay line are altered; and
means coupling said cutoff adjusting means and said effective rate
adjusting means for related variation to eliminate low frequency
components which would otherwise produce frequency components in
said delay line means below an audio low frequency limit related to
said factor and the period of said variation.
27. Apparatus according to claim 26 in which said means for
controlling said delay line means provides periods of linearly
changing unidirectional delay variation alternating with reset
intervals wherein said delay line is returned to an initial
predetermined delay value and including means for blanking output
signals from said output of said delay line means to substantially
zero amplitude during said reset intervals.
28. Apparatus according to claim 27 in which said means for
blanking includes means for gradually restoring the amplitude of
said output to full amplitude after said reset intervals.
29. Apparatus according to claim 2 in which said means for blanking
includes means responsive to the level of said output signals being
zero at the beginning of said reset interval to initiate said
blanking and means responsive to the level of said output signals
being zero at the end of said reset interval to terminate said
blanking.
30. Apparatus according to claim 29 and including means for
introducing an audio signal segment into said output signals and
said audible representation of said electric signals during said
reset interval.
31. Apparatus according to claim 23 in which said means for
controlling said delay line means provides periods of linearly
changing unidirectional delay variation alternating with reset
intervals wherein said delay line is returned to an initial
predetermined delay value and including:
second delay line means coupled to a source of said electric
signals; and
means for interrupting signals from the output of said variable
delay line means and substituting signals from the output of said
second delay line means during said reset intervals to produce said
audible representation.
32. Apparatus according to claim 31 in which said second delay line
means includes a variable delay line and including means for
controlling said second delay line means for periodic linear
variation of said time delay between predetermined delay values,
said variation of said second delay line means occurring during
said reset intervals for the first recited said delay line
means.
33. A processor for recordings representing coded audible signals
such as speech or the like comprising:
playback means operable to reproduce said recording at a rate
faster by a given factor than the recording rate thereby producing
electric signals which are the time compressed analog of said
audible signals with frequency components increased by said
factor;
first and second variable delay line means coupled to said playback
means for propagating said electric signals therein with
controllable time delay;
means for controlling said first and second delay line means for
periodic linear variation of said time delay between predetermined
delay values, said means for controlling said first and second
delay line means providing periods of linearly changing
unidirectional delay variation alternating with reset intervals
wherein said delay lines are returned to an initial predetermined
delay value, said reset intervals for each said delay line means
occurring within the period of the unidirectional delay variation
of the other said delay line means; and
means coupled to the outputs of said first and second delay line
means and responsive substantially only to signals propagating
through said delay line means which have been subject to
unidirectional variation of said delay for producing an audible
representation of said electric signals having frequency components
reduced by substantially said factor to approximate the frequency
components of said audible signals.
34. Apparatus according to claim 33 in which said first and second
delay line means vary between the same said predetermined delay
values alternately with substantially equal periods of said linear
variation.
35. Apparatus according to claim 34 in which said reset intervals
are substantially shorter than the periods of said linear variation
and occur for each said delay line means approximately at the
mid-point of said linear variation for the other said delay line
means and including:
separate audio reproducer means coupled to the respective outputs
of said first and second delay line means for producing
simultaneously separate versions of said audible representation
from each said reproducer means.
36. Apparatus according to claim 31 and including:
input high pass filter means having adjustable low frequency cutoff
interposed between said source and the input of said variable delay
line means;
means for adjusting the effective rate of said periodic linear
variation to adjust said factor by which frequencies propagated
through said delay line are altered; and
means coupling said cutoff adjusting means and said effective rate
adjusting means for related variation to eliminate low frequency
components which would otherwise produce frequency components in
said delay line means below an audio low frequency limit related to
said factor and the period of said variation.
37. Apparatus according to claim 36 in which said second delay line
means is also a variable delay line and including means for
controlling said second delay line means for periodic linear
variation of said time delay between predetermined delay values,
said variation of said second delay line means occurring during
said reset intervals for the first recited said delay line.
38. Apparatus according to claim 18 in which said source of said
electric signals comprises playback means operable to reproduce a
recording at a rate faster by a given factor than the recording
rate, thereby producing electric signals which are the time
compressed analog of said audible signals with frequency components
increased by said given factor.
39. Apparatus according to claim 18 in which said first and second
delay line means vary between the same said predetermined delay
values alternately with substantially equal periods of said linear
variation.
40. Apparatus according to claim 39 in which said reset intervals
are substantially shorter than the periods of said linear variation
and occur for each said delay line means approximately at the
mid-point of said linear variation for the other said delay line
means and including:
separate audio reproducer means coupled to the respective outputs
of said first and second delay line means for producing
simultaneously separate versions of said audible representation
from each said reproducer means.
41. Apparatus according to claim 40 in which said second delay line
means is a fixed delay line the input of which is coupled to an
output of said variable delay line means.
42. Apparatus according to claim 41 and including:
input high pass filter means having adjustable low frequency cutoff
interposed between said source and the input of said variable delay
line means;
means for adjusting the effective rate of said periodic linear
variations to adjust said factor by which frequencies propagated
through said delay line are altered; and
means coupling said cutoff adjusting means and said effective rate
adjusting means for related variation to eliminate low frequency
components which would otherwise produce frequency components in
said delay line means below an audio low frequency limit related to
said factor and the period of said variation.
43. A processor for recordings representing coded audible signals
such as speech or the like comprising:
playback means operable to reproduce said recording at a rate
slower by a given factor than the recording rate thereby producing
electric signals which are the time expanded analog of said audible
signals with frequency components decreased by said factor;
variable delay line means coupled to said playback means for
propagating said electric signals therein with controllable time
delay;
means for controlling said delay line means for periodic linear
decrease of said time delay between predetermined delay values;
and
means coupled to the output of said delay line means and responsive
substantially only to signals propagating through said delay line
means which have been subject to said decrease of said time delay
for producing an audible representation of said electrical signals
having frequency components increased substantially by said factor
to approximate the frequency components of said audible
signals.
44. Apparatus according to claim 43 in which said means for
controlling said delay line means provides periods of linearly
changing unidirectional delay variation alternating with reset
intervals wherein said delay line is returned to an initial
predetermined delay value and including:
second delay line means coupled to a source of said electric
signals; and
means for interrupting signals from the output of said variable
delay line means and substituting signals from the output of said
second delay line means during said reset intervals to produce said
audible representation.
45. Apparatus according to claim 44 in which said second delay line
means include a variable delay line and including means for
controlling said second delay line means for periodic linear
variation of said time delay between predetermined delay values,
said variation of said second delay line means occurring during
said reset intervals for the first recited said delay line
means.
46. Apparatus according to claim 44 in which said second delay line
means is a fixed delay line the input of which is coupled to an
output of said variable delay line means.
47. Apparatus according to claim 1 in which said delay line means
is controlled such that:
the time delay difference between said delay values is $d \times T$,
where $d = 2\,\frac{c-1}{c+1}$, $c$ is said factor and $T$ is said period; and
the period of said variation is greater than the period of the
lowest input frequency component to be processed through said line
divided by said factor.
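Illustrative note (not part of the claims): the delay difference recited in claim 47 can be evaluated numerically. The sketch below reads the claim's expression as d = 2(c - 1)/(c + 1); that reading, the function names, and the example values are assumptions.

```python
def delay_fraction(c):
    """d = 2 * (c - 1) / (c + 1), the expression in claim 47 as read here."""
    return 2.0 * (c - 1.0) / (c + 1.0)

def delay_difference(c, period):
    """Time delay difference d * T between the two predetermined delay values."""
    return delay_fraction(c) * period

if __name__ == "__main__":
    # Hypothetical example: factor c = 2 with a 30 ms sweep period.
    print(delay_fraction(2.0), delay_difference(2.0, 0.030))
    # A factor below one gives a negative d, i.e. a delay that decreases.
    print(delay_fraction(0.5))
```

For c = 1 the expression gives d = 0, which corresponds to a fixed delay and no frequency change.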
48. Apparatus according to claim 1 in which:
said factor is a number greater than one for time compression and
less than one for time expansion;
the period of said variation is greater than the period of the
lowest frequency component of said electric signals appearing at
said output of said delay line; and
said unidirectional variation of said delay is given by
$2\,\frac{c-1}{c+1}\,t$ where $t$ is the time and $c$ is said factor.
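Illustrative note (not part of the claims): reading the expression in claim 48 as 2(c - 1)/(c + 1) multiplied by t, and evaluating it at the end of one sweep period T, gives the delay difference d × T recited in claim 47. The symbol tau for the delay and the numerical values are assumptions introduced for this worked example.

```latex
\tau(t) = 2\,\frac{c-1}{c+1}\,t,
\qquad
\tau(T) = 2\,\frac{c-1}{c+1}\,T = d\,T .

% Example with c = 2 and T = 30 ms:
d = 2\cdot\frac{2-1}{2+1} = \frac{2}{3},
\qquad
\tau(T) = \frac{2}{3}\cdot 30\ \text{ms} = 20\ \text{ms}.
```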
49. A processor according to claim 1 in which said variable delay
line means is an analog shift register and said means for
controlling said delay line means is a variable frequency clock
pulse generator having a frequency that periodically varies
inversely with time.
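Illustrative note (not part of the claims): claim 49 recites a clock frequency that varies inversely with time. For a shift register of n stages clocked at a steady frequency f, the transit delay is n / f, so a clock frequency proportional to 1 / (d0 + k t) yields a delay that grows linearly with time. The sketch below uses this quasi-static approximation, which ignores the change in clock rate while a sample is in transit; the approximation, the names, and the values are assumptions.

```python
def clock_frequency(t, stages, d0, slope):
    """Clock frequency that makes the quasi-static delay equal d0 + slope * t."""
    return stages / (d0 + slope * t)

def register_delay(stages, f_clk):
    """Transit time through `stages` register stages at a steady clock frequency."""
    return stages / f_clk

if __name__ == "__main__":
    # Hypothetical 512-stage register, 5 ms initial delay, delay slope 0.5.
    for t in (0.0, 0.005, 0.010):
        f = clock_frequency(t, 512, 0.005, 0.5)
        print(t, f, register_delay(512, f))
```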
50. A processor according to claim 11 in which both of said delay
line means are analog shift registers each having a variable
frequency clock pulse generator.
51. Apparatus according to claim 1 in which said variable delay
line means is a parallel r-word-n-stage shift register and
including input A/D means for converting said electric signals into
a succession of r-bit-digital words; output D/A means for
converting the digital word from the n-th stage into an analog
signal and variable frequency clock signal means for transferring
the r-bit words from stage to stage through said n stages to effect
linear variation of time delay for said signal.
52. Apparatus according to claim 1 in which said variable delay
line means comprises input A/D converter means, serializer means to
produce a digital word sequence, a serial digital shift register,
parallelizer means to produce a parallel digital word and a D/A
converter means; said control means comprising a variable frequency
clock generator for said shift register.
53. Apparatus according to claim 1 in which said delay line means
comprises an analog storage matrix and including write control
means for entering said electric signals into said matrix at a
first predetermined clock rate; and read control means for reading
said electric signals from said matrix at a different second
predetermined rate.
54. Apparatus according to claim 1 in which said variable delay
line means is a digital storage matrix having A/D input means and
D/A output means; and means for writing digital information into
storage in said matrix at one rate and reading stored information
out of said matrix at another rate.
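Illustrative note (not part of the claims): claims 53 and 54 recite a storage matrix written at one rate and read at another. If sample number n is written at time n / write_rate and read at time n / read_rate, the delay it experiences grows linearly with n, which is the same kind of linearly changing delay obtained from a swept delay line. The sketch below assumes both processes start together at sample 0 and that storage is large enough; these conditions and the names are assumptions.

```python
def write_time(n, write_rate):
    """Time at which sample n is written into storage."""
    return n / write_rate

def read_time(n, read_rate):
    """Time at which sample n is read back out of storage."""
    return n / read_rate

def delay_for_sample(n, write_rate, read_rate):
    """Delay experienced by sample n: read time minus write time."""
    return read_time(n, read_rate) - write_time(n, write_rate)

if __name__ == "__main__":
    # Hypothetical rates: write at 20 kHz, read at 10 kHz (output frequencies halved).
    for n in (0, 10_000, 20_000):
        print(n, delay_for_sample(n, 20_000.0, 10_000.0))
```

Reading more slowly than writing lowers the output frequencies by the ratio read_rate / write_rate and makes the delay increase; reading faster does the opposite and is only possible for samples already stored.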
55. The method of processing random speech signals to convey
information intelligibly to a human listener at a rate different
than the normal speaking equivalent for said information and
without objectionable alteration of the frequency components of
said equivalent as reproduced for the listener comprising the steps
of:
developing a full speech signal train of said information with the
elapsed time for said signal train altered into a time interval
differing by a predetermined factor from that of the normal
speaking equivalent for said signal train thereby changing by said
factor the frequency of the spectral components in said signal
train relative to said equivalent;
processing said signal train by a linear periodic time delay
function to alter a regular succession of predetermined length
segments of said signal train into segments approximating a
continuous signal with the frequency components in said continuous
signal altered by said factor relative to said components in said
signal train to approximate the components of said normal speaking
equivalent; and
reproducing said continuous signal as an intelligible
representation of said information content with the elapsed time
altered by said factor but with the frequency components unchanged
relative to said equivalent.
56. Apparatus according to claim 2 in which said means for blanking
includes means responsive to said output signals for initiating and
terminating said blanking for zero output signal levels and for
signal excursion adjacent said zero levels in the same
direction.
57. Apparatus according to claim 43, in which said means for
controlling said delay line means provides periods of linearly
changing unidirectional delay variation alternating with reset
intervals wherein said delay line is returned to an initial
predetermined delay value and including means for blanking output
signals from said output of said delay line means to substantially
zero amplitude during said reset intervals.
58. A processor for electric signals representing coded audible
signals such as speech or the like, said electric signals being
analog representations of said audible signals with the frequency
components of said electric signals related by a given factor to
the frequency components of said audible signals comprising:
two similar delay lines coupled to a source of said electric
signals for propagating said electric signals between the input and
output terminals thereof with controllable frequency
transformation;
means for controlling said delay lines alternately for periodic
propagation of said electric signals to produce predetermined
frequency transformation of signals emerging at the outputs of said
delay lines; and
means coupled to said outputs of both said delay line means and
responsive substantially only to signals propagating through said
delay lines which have been subject to said predetermined frequency
transformation for producing an audible representation of said
electric signals having frequency components altered by
substantially said factor to approximate the frequency components
of said audible signals.
59. Apparatus according to claim 58 in which said two delay lines
are analog shift registers.
60. Apparatus according to claim 58 in which said analog shift
registers are controlled alternately by a clock frequency
periodically varied between predetermined limits.
61. The method of processing random speech signals to convey
information intelligibly to a human listener at a rate different
than the normal speaking equivalent for said information and
without objectionable alteration of the frequency components of
said equivalent as reproduced for the listener comprising the steps
of:
developing a full speech signal train of said information with the
elapsed time for said signal train altered into a time interval
differing by a predetermined factor from that of the normal
speaking equivalent for said signal train thereby changing by said
factor the frequency of the spectral components in said signal
train relative to said equivalent;
propagating said signal train through a storage channel within
which it is subject to a periodic frequency transformation function
to alter a regular succession of predetermined length input
segments of said signal train into output segments approximating a
continuous signal with the frequency components in said continuous
signal altered by the inverse of said factor relative to said
components in said signal train to approximate the components of
said normal speaking equivalent; and
reproducing said continuous signal as an intelligible audible
representation of the information content of said signal train but
with the frequency components unchanged relative to said
equivalent.
62. The method according to claim 61 in which said storage channel
comprises analog shift register means controlled by a clock
frequency periodically varied between predetermined frequency
values.
63. The method according to claim 62 in which said input segments
are propagated alternately through separate analog shift registers
for frequency transformation.
64. Apparatus according to claim 48 in which said delay line means
comprises analog shift register means and said periodic variation
of said time delay is obtained by controlling the clock frequency
for said shift register means to provide said unidirectional
variation of said delay to be:
$$N\,\frac{p-1}{p}\left[\frac{1}{f_t}-\frac{1}{f_o}\right] = 2\,\frac{c-1}{c+1}\,t$$
where $f_t$ is the clock frequency at time t, N is the number of
stages of phase p and $f_o$ is the initial clock frequency.
65. In a processor for electric signals such as those representing
the coded audible sounds of speech or the like, said electric
signals being analog representations of said audible sounds with
the frequency components of said electric signals related by a
given factor to the frequency components of said audible sounds,
the improvement comprising:
controllable delay means having an input and an output, said input
coupled to a source of said electric signals for passing signals
from said source into said delay means, and said delay means
passing signals therein to said output with controllable time
delay;
means for controlling said delay means with repetitive variation of
said controllable time delay between predetermined delay values the
period of said variation being greater than the period of the
lowest frequency component of said electric signals at said output
of said delay means thereby progressively delaying signals during
each period of said repetitive variation as they pass through said
delay means to obtain a predetermined frequency transformation of
said electric signals as determined by the relation: ##SPC3##
where c is said factor, f(t) is the time delay function
representing said variation of said controllable time delay,
$t_{in}$ is the input time of a given signal element to said delay
means and $t_{out}$ is the output time for said given signal
element; and
means coupled to said output of said delay means and responsive
substantially only to signals at said output having said
predetermined frequency transformation for producing a composite
output signal representation of said electric signals having
frequency components altered by substantially said factor to
approximate the frequency components of said audible sounds.
Description
BACKGROUND OF THE INVENTION
The field of this invention is the processing of human speech
signals or similar signals for ultimate comprehension by the human
listener at substantially the natural or normal frequency component
distribution but at time interval durations which are usually
different from the original time duration of the speech
utterance.
Sound compression and expansion systems which utilize relative
motion between a magnetic tape record medium and the air gap of the
pick-up head which senses the recorded signal on the magnetic
medium are well known as exemplified by the patent to Schuller
2,352,023. Devices of this type suffer from the usual operational,
cost and weight limitations involved in equipment which utilizes
substantial mechanical motion components. A delay line version of
time compression and expansion for real time signals is also well
known as shown, for example, in the patent to French et al.
1,671,151, where a voice signal is propagated along a delay line
and a movable pick-up repeatedly scans the delay line to sense the
signal propagating therethrough with the relative velocity between
the pick-up head and the propagation velocity of the wave in the
medium giving bandwidth compression or expansion for the purpose of
transmitting the signal over a narrow band telephone line. Later
workers in the field eliminated the mechanical motion portions of
systems such as French et al. by substituting electronic switching
sequentially along taps on the electrical delay line thereby
providing frequency compression or expansion of the signal for
transmission over a narrow band line. The application of the
sequentially scanned tapped delay line for modifying the time
duration of recorded speech signals without altering the frequency
components thereof is disclosed by Greenberg et al. U.S. Pat. No.
3,480,737. By relating the speed of scan of the delay line to the
velocity of propagation in the delay medium and the relative speed
of the reproduction of a recorded message compared to the speech
utterance from which it originated Greenberg et al. achieve time
expansion or compression of a recorded speech signal without
altering the frequency components thereof.
Another form of frequency-time transformation is known in the art
using a signal controlled variable time delay line for error
correction. Systems of this type detect an unwanted frequency
effect due to time irregularities in the pulse train of a
repetitive signal or such variations in an audio system where the
speed of the record medium past the pick-up head is subject to
periodic variations which result in the production of the audible
irregularity known as "wow." In reproducing the original signal
these systems eliminate speed errors by servo-control of the time
delay of a delay line which is interposed in the signal channel.
Audio systems which employ a reference or timing signal track with
variable delay line compensation for playback speed are shown for
example in FIG. 9 of Coleman, Jr. U.S. Pat. No. 3,202,769 who shows
an open loop servo. Woodruff U.S. Pat. No. 3,347,997 shows a closed
loop servo which adjusts playback speed to compensate for the
relatively low frequency "wow" component whereas imperfections of a
high frequency nature (i.e., "flutter") are compensated by a
variable delay line. Such compensation systems depend for their
operation on the repetitive character of the error signal and thus
by providing a delay line of maximum delay time adequate to
compensate the maximum expected error do not encounter the problem
which would be presented for a system that required indefinite
increase in the delay quantity to continuously modify the
time-frequency character of a signal.
The systems described in the patents to Schuller, French et al. and
Greenberg et al., when used to reduce the frequency of a speech
signal, while compressing the time in which a given segment of
speech is reproduced, inevitably involve discarding a portion of
the original speech wave. The ratio of the speech signal discarded
to that which is used is directly related to the compression ratio
and the discard loss is inherently and fundamentally related to
this process of reducing the frequency and compressing the time for
the processing of a given passage of speech. Since the portion of
the speech which is reproduced alternates with portions which are
discarded, merging the reproduced sections into continuous time
slots presents some difficulty and various solutions have been
offered.
Thus Schuller suggests a skewed air gap in his rotating magnetic
pick-up head or skewed tape approach to the point of contact with
the rotating air gap so that the arrival and departure of the
magnetic tape record relative to the air gap will occur gradually
as the skew provides a transition from zero to full air gap contact
with the recording medium.
The patent to French et al. suggests a number of alternatives
including the use of two spaced transducers rotating in unison with
respect to the delay medium such that the message reproduced in
compressed time in one transducer has superposed thereon from the
other transducer the message which would ordinarily have been
discarded by the first transducer.
These efforts to compensate for the discontinuity caused by
periodically discarding portions of the continuous speech wave
have, in general, been possible only because the output signal is
derived from a sampling along the propagating speech signal path
since with such sampling the signal stored in the line can be
readily accessed at any point along the line. For the
above-mentioned variable delay line correction systems, discard has
not been considered or compensated.
BRIEF SUMMARY OF THE INVENTION
The present invention provides compression-expansion systems for
speech or other coded signals in which the active frequency-time
conversion element is a signal responsive delay line which is
directly interposed in the path between the signal source and the
ultimate reproducer or utilization device which receives the
converted speech wave. Such a system does not avoid the inherent
problem in time compression occasioned by the discontinuity
resulting from discarding alternate portions of the speech wave.
The discard portion of the speech wave may be stored or cancelled
in the delay line or diverted from entering the input terminal of
the delay line which is directly in the signal channel. In any case
this signal and transients produced by line switching must be
discarded as the variable delay line is repetitively controlled by
the delay signal between its minimum and maximum delay values. Line
switching and discard occur coincidentally with the requirement for
making contiguous two originally spaced portions of the speech wave
and hence both of these functions are accomplished by a number of
alternative embodiments herein disclosed which function in a manner
consistent with requirements imposed by the parameters of the
speech signal itself.
Accordingly, it is the principal object of the present invention to
provide a speech compression-expansion system which utilizes a
signal controlled variable delay line located directly in the
signal channel between the signal source and sound reproducer which
delay line is repeatedly sequenced between maximum and minimum
delay values to modify the frequency-time characteristic of the
sound reproduced from the original signal.
It is a feature of the invention to control the input and output
bandwidth of the system for the speech frequencies which are
required to be reproduced for intelligibility in relation to the
compression ratio thereby to exclude those frequencies which would
produce distortion and intermodulation due to insufficient sampling
rate or excessive phase shift per delay stage for the high
frequencies and inadequate output chunk length to attain frequency
conversion for the lowest frequencies.
A further feature of the invention provides maximum discard
intervals as determined by the maximum delay line length which in
relation to the compression ratio limits the actual original speech
message discard to a value which minimizes the loss of significant
cues or transitions in the speech code thereby minimizing the loss
of information content transferred to the listener.
A still further feature of the present invention is to provide, in
a system utilizing variable delay line speech compression or
expansion, for signal processing at the point of juncture of two
reproduced speech portions to suppress distracting noise components
and also to avoid the introduction of false cues which could modify
the information conveyed in the subsequent speech segment. To this
end, if required to suppress such noise, the transition between
successive reproduced speech samples can be modified by simple
transfer function selection or control, or the transition can be
eased by the introduction of synthetic or speech-derived signal
portions to approximate a smooth transition within a time interval
which does not lose actual cues and under such conditions that do
not introduce false cues.
These and other features and advantages of the invention will be
apparent from the following detailed description taken in
conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIGS. 1(a) to 1(h) show diagrams representing the reproduction of a
message recorded on magnetic tape and the operation of the system
of the present invention at different compression ratios.
FIGS. 2(a) and 2(b) are diagrams for compression and expansion,
respectively, showing input-output signal time relations.
FIG. 3(a) shows a set of curves for a specific maximum time delay
relating various parameters involved in the processing of speech at
various compression ratios greater than one and FIG. 3(b) indicates
similar relations for ratios less than one (i.e., expansion).
FIGS. 4(a) to 4(f) show waveforms useful in describing forms of
processing of a transition between adjacent reproduced speech
samples.
FIGS. 5(a) to 5(d) show a set of curves representing active
processing of the transition between adjacent samples.
FIGS. 6(a) to 6(e) show waveforms useful in describing the use of
two delay lines to effect transition between adjacent speech
samples.
FIG. 7 shows a block diagram of a speech compressor-expander system
in accordance with the invention.
FIGS. 8(a) to 8(d) show waveforms useful in describing the
operation of the system of FIG. 7.
FIG. 9 shows a block diagram of a dual delay line system in
accordance with the invention.
FIGS. 10(a) to 10(d) show waveforms useful in describing the
operation of the system of FIG. 9 for compression.
FIGS. 11(a) to 11(d) show waveforms useful in describing the
operation of the system of FIG. 9 for expansion.
FIG. 12 is a partial block diagram of a modification.
FIGS. 13(a) to 13(c) show waveforms useful in describing the
operation of the circuit of FIG. 12 for compression.
FIGS. 14(a) to 14(c) show waveforms useful in describing the
operation of the modification of FIG. 12 for expansion.
FIG. 15 is a partial diagram of a dual delay line binaural
system.
FIGS. 16(a) to 16(f) show waveforms useful in describing the
operation of the system of FIG. 15.
FIG. 17 is a partial block diagram of a speech processor in
accordance with the invention using an analog shift register as the
variable delay element.
FIG. 18 is a block diagram showing gap filling with signal
continuity in a system similar to FIG. 17.
FIG. 19 is a partial showing of an embodiment of the invention with
variable delay provided by an r-bit parallel digital shift
register.
FIG. 20 shows an embodiment of the invention with variable delay
provided by a serial digital shift register.
FIG. 21 shows an embodiment using an analog storage memory matrix
for variable delay.
FIG. 22 shows an embodiment using an r-bit digital random access
memory.
FIG. 23 shows a logic diagram of a directional zero signal level
gating control in a dual line system.
FIG. 24 shows waveforms useful in describing the operation of the
circuit of FIG. 23.
FIG. 25 shows graphically the clock frequency and maximum signal
frequency for the system of FIG. 17.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
The description of the preferred embodiments will be preceded by a
discussion of the parameters of the speech signal particularly as
they relate to speech compression for reproducing a given speech
message in a shorter period of time. Because of the fundamental and
unavoidable limitations involved in speech compression, the
following discussion will proceed with primary attention directed
to the method and apparatus used in compression mode. Compression
mode operation in the time domain results in the discard of a
fraction of the original information directly proportional to the
compression factor which is also the factor by which the time to
present a given speech sequence is decreased. The method and
apparatus are usable however in expansion mode and the
considerations involved for use of reproduced signals which occupy
a greater length of time than the original speech utterance will be
described separately hereinafter. The system is also capable of
frequency transformation without a corresponding time change to
achieve a desired frequency signal, such as may be involved in
generating speech in a medium having a propagation velocity
different than air.
Referring to FIG. 1, a particular system using a delay line which
provides maximum time delay of 6ms for the final portion of the
sample is shown. Assuming that the speech signal to be processed is
limited to frequency components between 333 and 5,000 Hz, certain
parameters of the playback system for compression can be defined. A
magnetic tape 21 has the speech signal recorded thereon of which
the lowest frequency component at 333 Hz is depicted by sine wave
22 with the tape being drawn past a pick-up transducer 23 as it is
wound on a take-up reel 24 at speed S. The electrical signal
produced by the transducer 23 passes through a compression
processor 25 and is reproduced as an audible signal from speaker
26.
The system 23-26 shown on line (a) of FIG. 1 just described
reproduces the recorded signal on the tape 21 without frequency or
time change if the speed of take-up reel 24 draws the tape past the
transducer 23 at the speed of recording S and for this condition
processor 25 would introduce a fixed constant delay time of any
value. Thus in line (b) of FIG. 1 where c=1, the reproduction of
the 333 Hz sinusoidal signal without change other than fixed phase
delay (which has been ignored) is shown.
For speech compression the tape speed is increased by a factor c
and the processor 25 is operated to vary the delay linearly from
its minimum to maximum value. As shown in curves (c), (d) and (e)
of FIG. 1, a compression ratio of c = 2 restores, for a 6ms final
signal delay requiring 8ms delay line, a 12ms portion of the
original recorded wave 22 and in so doing retains half and discards
half the amount of signal originally occupying 24ms of recorded
time. This retained portion is designated a "chunk" and is shown in
speeded form prior to processing on line (c) of FIG. 1 to include
cycles number 1, 2, 3 and 4. By virtue of the compression process
and since the maximum final signal delay is maintained at 6ms
corresponding to 8ms in the delay line at the end of the sample, a
6ms portion of the original information representing 12ms at the
recorded speed is discarded and on line (c) is labelled "discard."
This discard includes cycles 5, 6, 7 and 8 of the original wave 22
and represents the gap in the information content between
successive chunks which are reproduced as audible signals. This
audible output is represented on line (d) where the chunk is
depicted as a piece of tape 31 played at speed 2S and containing
cycles 1-4 which after processing is effectively stretched into a
piece of tape 32 occupying the original 12ms of recorded time and
containing the cycles 1-4 at their original recorded frequency. In
line (d) it will be noted that the next cycle reproduced is cycle
number 9 of the original wave after cycles 5-8 inclusive have been
discarded. The representation in line (d) of a smooth transition
between the end of cycle 4 and the start of cycle 9 should not be
taken as representative of real signal conditions as would be
obvious from a consideration of an actual signal as opposed to the
idealized signals presented in FIG. 1.
Lines (f), (g) and (h) of FIG. 1 illustrate the situation which
prevails when the compression ratio is equal to five with the tape
speed drawn past transducer 23 at five times the recorded speed S.
With a final signal delay of 6 ms maximum this compression ratio
results in a chunk length of 1.5 ms containing 2 1/2 cycles of the
333 Hz wave 22 of line (a) and again a discard interval of 6 ms equal
to the final signal delay and corresponding to a delay line length
of 10ms at the end of the sample. The information gap, however, has
increased to the point where the last half of cycle number 3 and
first half of cycle number 13 and all the intervening information
in the original recorded wave have been lost in the discard and
this gap in the message represents 30ms of the originally recorded
speech utterance.
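The cycle counts quoted for FIG. 1 can be checked with a few lines of
arithmetic. The following is a minimal sketch, assuming (as the text
does) a 3 ms period for the 333 Hz wave and a chunk that lasts the
final signal delay divided by (c - 1) in playback time; the helper
name `fig1_cycles` is hypothetical and is not part of the patent.

```python
from fractions import Fraction

def fig1_cycles(c, final_delay_ms=6, period_ms=3):
    """Return (retained, discarded) cycles of the lowest-frequency wave.

    Hypothetical helper. Assumes the chunk lasts final_delay/(c - 1) ms
    of playback time, the relation implied by the FIG. 1 examples.
    """
    c = Fraction(c)
    chunk_playback_ms = Fraction(final_delay_ms) / (c - 1)  # tape at c times S
    chunk_recorded_ms = c * chunk_playback_ms               # same chunk at speed S
    discard_recorded_ms = c * final_delay_ms                # gap in the message
    return chunk_recorded_ms / period_ms, discard_recorded_ms / period_ms

for ratio in (2, 5):
    kept, lost = fig1_cycles(ratio)
    print(f"c = {ratio}: {float(kept)} cycles kept, {float(lost)} cycles discarded")
```

For c = 2 this gives 4 cycles kept and 4 discarded, and for c = 5 it
gives 2.5 cycles kept and 10 discarded, matching lines (c) to (h) of
FIG. 1.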
The relations among the parameters of a speech compression system
and those pertaining to the information content of the speech
coding interrelate in a manner to specify optimum conditions and
place outside limits on the mode of operation of the systems of the
invention for a given intelligibility factor. These parameters can
be examined with respect to a particular system for various
compression ratios and for this purpose the system parameters for a
system using a delay line with a maximum final signal delay
$\Delta T_{max}$ of 6 ms are set forth in the following table.
TABLE I
Typical Parameters for Speech Compressor
__________________________________________________________________________________________________________________
Comp.    d      Line            Chunk/Discard             Chunk/Discard                 Sample   Rep     Cycles per
Ratio           Length          (Playback Time)           (Recording Time)              Period   Rate    Sample
c               $d\,T_{out}$    $T_{in}/\Delta T_{max}$   $T_{out}/c\,\Delta T_{max}$   T        1/T     ($f_{min}$ =
                (ms)            (ms)                      (ms)                          (ms)     (Hz)    333 Hz)
__________________________________________________________________________________________________________________
1.25     2/9    6 2/3           24/6                      30/7.5                        30       33.3    10
1.5      2/5    7 1/5           12/6                      18/9                          18       55.6    6
2        2/3    8               6/6                       12/12                         12       83.3    4
3        1      9               3/6                       9/18                          9        111     3
4        6/5    9 3/5           2/6                       8/24                          8        125     2 2/3
5        4/3    10              1.5/6                     7.5/30                        7 1/2    133     2 1/2
__________________________________________________________________________________________________________________
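The entries of Table I follow from the compression ratio and the
maximum final signal delay alone. The sketch below recomputes one row
under the relations used in this description (delay rate
d = 2(c - 1)/(c + 1), chunk of 6/(c - 1) ms in playback time, line
length d times the sample period). The function name `table_row` and
its dictionary keys are hypothetical, and 333 Hz is taken as 1000/3 Hz
so that its period is exactly 3 ms.

```python
from fractions import Fraction

def table_row(c, delta_t_max_ms=6, f_min_hz=Fraction(1000, 3)):
    """Recompute one row of Table I (hypothetical helper, times in ms)."""
    c = Fraction(c)
    d = 2 * (c - 1) / (c + 1)                   # delay rate
    t_in = Fraction(delta_t_max_ms) / (c - 1)   # chunk, playback time
    t_out = c * t_in                            # chunk, recording time (sample period)
    return {
        "d": d,
        "line_length_ms": d * t_out,
        "chunk_playback_ms": t_in,
        "discard_playback_ms": Fraction(delta_t_max_ms),
        "chunk_recorded_ms": t_out,
        "discard_recorded_ms": c * delta_t_max_ms,
        "rep_rate_hz": 1000 / t_out,
        "cycles_per_sample": f_min_hz * t_out / 1000,
    }

for ratio in ("1.25", "1.5", "2", "3", "4", "5"):
    row = table_row(ratio)
    print(ratio, *(f"{name}={float(value):.4g}" for name, value in row.items()))
```

Exact fractions are used so that values such as 6 2/3 and 2 2/3 are
reproduced without rounding.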
The basis for the frequency-time transformation employed in the
present invention can be derived as follows. Consider a sine wave
$V = E \sin \omega t$ recorded with a tape recorder. If the tape is
played back at c times the original recording rate, the result
is
$$V = E \sin c\,\omega t \qquad (1)$$
where c is called the compression ratio. If c > 1 the time is
compressed for any given speech passage and if c < 1 time is
expanded by the factor e where e = 1/c.
If the signal is then applied to a delay line in which the delay of
the line is caused to increase linearly with time at the rate d, so
as to cause the average delay of the signal to be c' which
represents the delay any point on the waveform will experience in
passing through the line, then the signal (1) becomes
$$V = E \sin (c - c')\,\omega t \qquad (2)$$
The original signal is restored if the delay is
$$c't = c't_{(\text{to restore})} = (c - 1)\,t \qquad (3)$$
such that
$$c' = \frac{d}{2}\,(c + 1), \quad \text{the average delay rate of the line} \qquad (4)$$
which is derived as one half the sum of the final and initial delay
values of the delay line, which when multiplied by time, t, thus
yields the total delay, c't.
FIG. 2(a) shows a plot, for a given signal sample, of signal output
time, $t_{out}$, vs. the corresponding input time, $t_{in}$. Thus a
line with a slope of 4 represents a signal of four times the original
frequency or speed of presentation, and one-fourth the periodicity,
while a line with a slope of 1 represents the resultant restored,
or unchanged, signal. In order to convert such a signal (as
represented by the line I of slope c = 4) to one corresponding to
line II with a slope of 1, and with a corresponding frequency
decrease, it is necessary to increasingly delay the input signal,
$c\,t_{in}$, by an amount $c't$ [or $(c-1)t$] as shown at line III.
Thus a signal chunk, $T_{in}$, has an ordinate which
intersects III at the ordinate value $c'T_{in}$ and this value when
added to the time abscissa value at the point $cT_{in}$ on I,
delays the signal to $T_{out}$ on line II. The delay $dt$ introduced
by the delay line is shown by line IV. Such a delay line has the
effect of delaying the instantaneous signal, $t_i$, through a
linearly increasing amount $d \cdot t$ for the interval from $t_{in}$
to $t_{out}$ as shown by line IV. Thus, as in the case of the end
signal at time t = T, one-half the sum of the initial delay,
$dT_{in}$, and the final delay, $dT_{out}$, yields an average delay
value on line IV of $c'T_{in}$, the amount required for
restoration.
Hence,
$$c'T_{in} = \frac{dT_{in} + dT_{out}}{2}$$
$$(c-1)\,T_{in} = \frac{d}{2}\,(T_{in} + cT_{in})$$
$$d = 2\,\frac{c-1}{c+1}$$
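The averaging argument above can be checked numerically. This is a
minimal sketch, assuming the slope-4 example of FIG. 2(a) and an
arbitrary chunk entry time of 2 ms; the function names `delay_rate`
and `average_delay` are hypothetical.

```python
def delay_rate(c):
    """Delay rate d = 2(c - 1)/(c + 1) for compression ratio c."""
    return 2.0 * (c - 1.0) / (c + 1.0)

def average_delay(c, t_in):
    """Mean of the line delay d*t at entry (t_in) and exit (t_out = c*t_in)."""
    d = delay_rate(c)
    t_out = c * t_in
    return (d * t_in + d * t_out) / 2.0

c, t_in = 4.0, 2.0                 # slope-4 example; t_in in ms (arbitrary choice)
required = (c - 1.0) * t_in        # delay needed to move line I onto line II
print(delay_rate(c), average_delay(c, t_in), required)
```

The average delay returned by `average_delay` agrees with the
required delay (c - 1) times the entry time, which is the condition
the delay rate was derived from.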
In more general terms the restoration may be achieved by
cumulatively delaying an input signal, $t_{in}$, by an amount
##SPC1##
For a linearly varied delay line with a rate of change in delay, d,
$f(t) = d \cdot t$ and ##SPC2##
from which we obtain (4)
$$c'T_{in} = \frac{d}{2}\,(c + 1)\,T_{in}$$
Hence
$$d = 2\,\frac{c - 1}{c + 1} \qquad (7)$$
Thus for $c \geq 1$, $0 \leq d < 2$;
for $c < 1$, $-2 < d < 0$.
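The stated bounds on d can be read off by rewriting the expression
for the delay rate. The short derivation below is a sketch that uses
only the delay-rate formula and assumes c > 0; the spot values are
taken from Table I and from the expansion example of FIG. 2(b).

```latex
d(c) = \frac{2(c-1)}{c+1} = 2 - \frac{4}{c+1}
% d(c) increases monotonically with c for c > -1, hence:
d(1) = 0, \qquad \lim_{c \to \infty} d(c) = 2, \qquad \lim_{c \to 0^{+}} d(c) = -2
% spot values
d(2) = \tfrac{2}{3}, \qquad d(5) = \tfrac{4}{3}, \qquad d\!\left(\tfrac{1}{4}\right) = -\tfrac{6}{5}
```

Compression therefore always calls for an increasing delay at a rate
below 2, and expansion for a decreasing delay at a rate above -2.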
In FIG. 2(b) the corresponding relations for signal expansion are
shown. Line $\mathrm{I}_e$ with a slope of one-fourth represents a
signal of one-fourth the original frequency or speed of presentation.
In order to convert such a signal to one corresponding to line II with
a slope of 1, and with a corresponding frequency increase, it is
necessary to decreasingly delay the input signal, $c\,t_{in}$
($= (1/e)\,t_{in} = \tfrac{1}{4}\,t_{in}$), by an amount $c't$
($= \frac{1-e}{e}\,t = -\tfrac{3}{4}\,t$) from an initial delay of
$c'T_{out}$. This amount of delay, $c't'$, at any point shifts the
signal to the corresponding ordinate value on line II. The delay
$dt'$ introduced by the delay line is shown by line $\mathrm{IV}_e$.
Such a delay line has the effect of delaying the instantaneous signal
by a linearly decreasing amount $d \cdot t'$ for the interval from
$t_{in}$ to $t_{out}$ as shown by line $\mathrm{IV}_e$. Thus, as in
the case of the initial signal at time t = 0, one half the sum of the
initial delay, $-d \cdot T_{in}$, and its final delay, $-dT_{out}$,
yields an average delay value on line $\mathrm{IV}_e$ of
$-c'T_{in}$.
Hence,
$$c'T_{in} = \frac{dT_{in} + dT_{out}}{2}$$
$$\frac{1-e}{e}\,T_{in} = \frac{d}{2}\left[T_{in} + \frac{T_{in}}{e}\right]$$
$$d = 2\,\frac{1-e}{1+e}$$
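The expansion result is the compression formula with c replaced by
1/e. A minimal sketch of that consistency check follows, assuming
the factor-of-four expansion of FIG. 2(b); both function names are
hypothetical.

```python
def expansion_delay_rate(e):
    """Delay rate d = 2(1 - e)/(1 + e) for expansion factor e = 1/c."""
    return 2.0 * (1.0 - e) / (1.0 + e)

def compression_delay_rate(c):
    """Delay rate d = 2(c - 1)/(c + 1) for compression ratio c."""
    return 2.0 * (c - 1.0) / (c + 1.0)

e = 4.0   # slope of one-fourth in FIG. 2(b)
print(expansion_delay_rate(e), compression_delay_rate(1.0 / e))
```

Both calls give a delay rate of -6/5, a negative value, which
corresponds to a delay that decreases during each sample.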
The process of linearly increasing time delay cannot continue
indefinitely, and from time to time the delay line must be returned
to its original length. If this process is repeated at periodic
intervals, provided the interval is longer than the period of the
lowest frequency component of the signal, chunks of the original
signal will be played back at the angular frequency $(c - c')\omega$
and the rest discarded. When (3) and (4) are satisfied, the system
operates as though sections were cut out of the original tape,
pasted together, and played back at normal speed. The sections of
signal are heard at the correct frequency but the information is
transmitted in a shorter time (if c>1). The speech has been
compressed to 1/c of its original length.
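The cut-and-paste picture in the preceding paragraph can be sketched
on a list of samples. This is an idealised model of the output only,
not of the delay line hardware: it assumes compression (c of at least
1), keeps a chunk of `chunk_len` samples out of every c times
`chunk_len` samples of the original recording, and omits any
processing of the transition between chunks. The function name
`cut_and_splice` is hypothetical.

```python
def cut_and_splice(samples, chunk_len, c):
    """Keep chunk_len samples out of every c*chunk_len and join the chunks.

    Hypothetical helper modelling the idealised compressor output;
    transition processing between chunks is deliberately left out.
    """
    if c < 1:
        raise ValueError("this sketch covers compression only (c >= 1)")
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    stride = round(c * chunk_len)      # kept chunk plus discard, in samples
    out = []
    for start in range(0, len(samples), stride):
        out.extend(samples[start:start + chunk_len])
    return out

original = list(range(20))             # stand-in for 20 samples of recorded speech
print(cut_and_splice(original, chunk_len=2, c=5))   # [0, 1, 10, 11]
```

With c = 5 the 20 input samples are reduced to 4, that is to 1/c of
the original length, and every kept sample retains its original
spacing.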
The values set forth in Table I have been plotted in FIG. 3(a). For
any given compression ratio the sample time is given by the curve
$T_{out}$ and the chunk length is shown by the curve $T_{in}$. The
difference between these two curves is the discard which is equal
to the final delay to the signal at the end of the sample period
(6 ms in the example shown in FIG. 3(a)). Entering the curve at any
compression ratio, such as c = 5 in FIG. 3(a), one obtains the
chunk and discard times for the tape running at c times the
recorded speed and these values projected to the time axis show the
actual original recorded time for the respective chunk and discard
portions. As indicated for c = 5 the chunk is 1.5 ms long and the
discard is 6 ms long representing respectively 7.5 ms of recorded
and reproduced information and 30 ms of discarded information. This
latter value is represented by the quantity $c\,\Delta T_{max}$
which is also plotted in FIG. 3(a).
For a speech signal in which the lowest frequency, 333 Hz, has a
period of 3 ms, a chunk length of 1.5 ms at c = 5 corresponding to
7.5 ms of recorded time will contain 2.5 cycles of the 333 Hz
signal. For any higher frequency components in the speech signal
more cycles will be contained in the 1.5 ms chunk. The length of
the chunk should exceed the period of the lowest frequency to be
passed (i.e., should include at least a full cycle) otherwise
satisfactory compression will not be obtained. As indicated in FIG.
3(a) below the time axis at 3 ms, a 333 Hz signal processed at
sample periods approaching 3 ms would, with its samples reassembled
accordingly, produce a compressed output of poor quality since the
sampling would then be causing a disruptive discontinuity for
nearly every cycle of the 333 Hz signal processed. Sample periods
less than 3ms would not permit completion of any one cycle so that
the resultant reassembled output would not only contain the said
disruption but would also begin to exhibit a basic change in its
frequency characteristic in the form of waveform compression by
truncation to produce false frequencies. While this condition does
not represent a real condition for a speech wave due to the
complexity of the waveforms, this principle is controlling and
sample periods less than the period of the lowest frequency wave in
the speech signal will not provide proper compression.
Sample periods greater than the period of the lowest frequency wave
will produce compression and an interval of disruption exists from
the region where the sample period is only slightly greater than
the period of the lowest frequency wave as indicated on the time
axis between 3 ms and 6 ms in FIG. 3(a). The result obtained within
this period of disruption is a distorted expanded wave in which the
effect of disjunctions between samples becomes extremely severe as
the single cycle point is approached and diminishes as the number
of cycles in the sample increases. As a practical matter two and
one-half cycles per sample is indicated as the desired limit in
FIG. 3(a) but in general the more cycles in the sample the less the
disturbance factor.
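The cycle count discussed above can be checked numerically. The sketch below is a simplification under stated assumptions: it divides the recorded chunk time by the period of the lowest frequency, and it sorts the result into three bands using the one-cycle minimum and the two and one-half cycle guideline mentioned in the text. The band labels and function names are hypothetical.

```python
def cycles_in_chunk(recorded_chunk_ms, period_ms):
    """Number of cycles of a wave of the given period within the recorded chunk."""
    return recorded_chunk_ms / period_ms


def classify_chunk(cycles, desired_cycles=2.5):
    """Rough classification; the labels are illustrative, not from the source."""
    if cycles <= 1.0:
        return "no proper compression"   # chunk does not exceed one period
    if cycles < desired_cycles:
        return "disruption likely"       # between one cycle and the guideline
    return "meets guideline"


cycles = cycles_in_chunk(7.5, 3.0)       # 7.5 ms recorded chunk, 3 ms period
print(cycles, classify_chunk(cycles))
```

With the 7.5 ms recorded chunk and the 3 ms period of the 333 Hz component, the count is 2.5 cycles, which is the limit indicated in FIG. 3(a).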
In order to avoid the extreme distortion produced by waves which
have a longer wavelength than the sample period, these lower
frequencies should be filtered out before the speech signal enters
the delay line otherwise these disjointed and highly distorted
waves will be propagated down the line and intermodulate with the
desired signal and may severely degrade the system performance.
For lower values of the compression ratio than c = 5, and keeping
.DELTA.T.sub.max = 6ms, the chunk length increases with the result
that the actual time sample increases to greater than 7.5 ms and
therefore more than the minimum number of cycles for the lowest
frequency wave component will be present in the chunk. Thus it
would be at the user's option to operate the line over less than
the 6ms indicated delay for .DELTA.T.sub.max to reduce the amount
of discard.
Considering the discard portion of the sample as a constant 6ms
long at the compressed rate of playback, the actual information
loss is the compression ratio times 6ms so that with c = 5 the
actual information discarded for each sample is 30ms of recorded
time. As shown on the time axis of FIG. 2 this is the interval from
7.5 ms to 37.5 ms and the relation of this loss of information to
the intelligibility of the reproduced speech signal must be
examined.
In general, human speech represents an extremely complex coding of
a relatively limited set of sounds called phonemes which taken in
context with the various attributes of the speech code such as the
voiced-unvoiced components, pitch, formant frequencies and the
continuum of sound pattern represented by sound energy (and the
absence thereof) connected by the all important transitions
between the temporal components thereof constitutes an acoustic
stream of infinite variety and versatility. The ability of the
human ear to receive this acoustic message and the ear-brain system
to decode the message is not altogether understood since it appears
that the readily comprehended information rate far exceeds the mere
acoustic response characteristics of the ear as a receiver.
Fortunately, the ability of the ear-brain system to comprehend the
message which is conveyed by human speech signals is sufficiently
good to permit large portions of the actual acoustic stream to be
lost or discarded without significant loss in the perception and
comprehension of the message information content of the acoustic
signal. Since the comprehension of message content decreases more
rapidly than the recognition of individual words as the message is
presented to the listener at increasing rate, the problem
associated with the discard of a portion of the signal stream can
be resolved in favor of comprehension and short of the point where
intelligibility of individual words deteriorates. This latter point
is reached where the loss or alteration of transitions or other
cues representing the connection between a consonant and vowel
sound results effectively from the discard of much or all of a
given cue or cues so as to alter the apparent information content
of contiguous concatenated chunks. Even before the point of absolute
loss of intelligibility is reached the limit of tolerance due to
discomfort for sustained listening occurs as a result of the
unnatural sounds and the fatigue which develops in the intense
concentration required in attempting to extract the information
content in the presence of excessive time clipping.
For the purpose of speech compression the loss of intelligibility
can be associated with discarding portions of the message
containing significant cues or phonemes which components vary in
length with the shortest being approximately 10ms to 20ms long.
These short cues do not dominate speech but occur with sufficient
regularity to make their systematic loss undesirable and hence a
desirable upper limit for the discard period would be considered to
be 30 ms and preferably closer to 15ms. With this limit set for
intelligibility of the reproduced syllables and words the rate of
presenting a given message can be increased to the comprehension
limit for any given listener and degree of difficulty of the
subject matter with minimum concern for the limitation which would
be imposed by permitting loss or distortion of the word content or
the generation of false cues from the concatenated message chunks.
FIG. 3(a) indicates the recording time discard relation to
compression ratio as the linear function c.DELTA.T.sub.max with the
range from 15 ms to 30 ms designated the discard uncertainty range.
Thus the 6ms discard at c = 5 projects to include the real time
recording interval from time t = 7.5 to t = 37.5 which approaches
the upper limit permitted for discard without undue loss of
intelligibility as required not to contribute significantly to the
loss of comprehension in the message perceived. Smaller values of c
result in smaller actual discard time and hence the intelligibility
is improved especially for those cues which are at the lower end of
the time scale, i.e., in the neighborhood of 10ms.
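The relation between compression ratio and recorded discard can be sketched as follows. This is a minimal illustration, assuming only the linear relation c .DELTA.T.sub.max and the 15 ms to 30 ms range stated above; the zone labels and the function name are hypothetical.

```python
def discard_zone(c, delay_max_ms, lower_ms=15.0, upper_ms=30.0):
    """Recorded-time discard (c * delay_max_ms) and where it falls
    relative to the 15 ms to 30 ms discard uncertainty range."""
    recorded = c * delay_max_ms
    if recorded < lower_ms:
        zone = "below uncertainty range"
    elif recorded <= upper_ms:
        zone = "within uncertainty range"
    else:
        zone = "above upper limit"
    return recorded, zone


for ratio in (2, 5, 6):
    print(ratio, discard_zone(ratio, 6.0))
```

With a 6 ms final delay, c = 5 lands exactly on the 30 ms upper limit, while c = 2 gives a 12 ms discard, below the range.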
While Table I and FIG. 3(a) represent parameters for a typical
speech compression system having a final signal delay of 6ms and
define the limits of operation within fairly narrow limits, it will
be appreciated that the principles involved can be adapted for use
over a wider range of operation. Thus the variation of the actual
frequency band of the speech signal and the maximum length of the
delay line are both important design factors which influence the
selection of the chunk-to-discard ratio and sample period for a
given range of the compression ratio c. On the other hand, the
actual frequency range of the signal has an important bearing on
the design of the delay line which must accommodate the frequency
spectrum present in the signal as to such quantitative and
qualitative factors as the voice pitch, the presence of all or only
some of the formant frequencies for an individual voice and the
width of the signal spectrum over which linear phase-frequency
properties must be preserved. The ultimate system used however will
embody design choices of the factors involved within the broad
limits herein defined.
FIG. 3(b) is a plot of corresponding relations for signal expansion
showing the initial gap, output chunk and maximum delay line length
variation with expansion ratio e for a given input sample interval
T.sub.in. The output gap occurs at the start of each sample period
and thereafter for the balance of the sample period the reduced
frequency time-expanded output chunk appears. The maximum delay d
T.sub.in required is also shown as a function of the expansion
ratio e.
One aspect of the speech compression system described in connection
with FIG. 1 has not been treated, namely, the audible output of the
transducer 26 when the variable delay processing unit 25 is
switched from maximum to minimum delay at the end of the sample
period. Just prior to switching the delay line is loaded with the
speech signal which is to be discarded and if the line is
instantaneously switched to zero delay all of this information
unless cancelled or predeleted, will be presented in highly
condensed form in the output signal. As a practical matter with
conventional delay lines utilizing R and L or C components there
will be a time interval required for switching the line from
maximum to minimum delay and it has been found that even if the
line does not contain signal information this switching of a line
has a significant minimum time constant associated with it which
produces a disturbing transient audible in the output signal. The
repetition rate of this transient is the reciprocal of the sample
period. Because of the limitations imposed by the parameters of the
system as previously set forth herein, this switching frequency and
spectral components of the transient itself will always be within
the audio range and thus present as a highly disagreeable
intermodulation component in the audio output of the device. The
present invention provides a number of implementations for
transient suppression and message gap bridging arrangements for the
purpose of minimizing the disagreeable noise effects involved. In
more elaborate systems the substitution of pseudo or real message
components further improves the transition from one sample to the
next and can be adapted to fill in a portion of what is discarded
in the compression process.
Referring now to FIG. 4 a portion of the 333 Hz wave at the
transition point illustrated in FIG. 1(d) has been reproduced in
which cycle 4 and cycle 9 of the original recorded 333 Hz wave are
shown as a smooth uninterrupted sine wave. The junction between the
end of cycle 4 and the beginning of cycle 9 at point 41, although
shown as a continuous portion of the sine wave, is in actuality, as
previously stated, almost never so related in the non-selective
periodic sampling of independent complex waveforms and thus instead
of a smooth transition point 41 a disjunction between the end of
one chunk and the beginning of the next chunk in successive samples
is to be expected. This disjunction could undoubtedly be
accommodated with no loss of intelligibility if the transient from
switching the line (either loaded or unloaded) did not have to be
dealt with at exactly this point in time. Since this transient is
responsible for a highly annoying audible output from the system it
must be eliminated and for this purpose a gating signal as
indicated in FIG. 4(b) may be applied symmetrically with respect to
the transition point 41 to produce the output signal shown in FIG.
4(c). By making the gate long enough to encompass the transient
resulting from switching the line, the audible noise so generated
is eliminated. The improvement obtained by this expedient, while
significant, is not ideal since the introduction of the gate signal
within the audio range is itself audible as a repetitive
disjunctive gap which intermodulates with the audio signal. This
effect can be reduced by using an output filter designed for the
particular repetition rate and gate width to smooth the abrupt
transition shown in FIG. 4(c) and this output response is indicated
in FIG. 4(d).
A further improvement is possible by using the gating signal as a
gain control signal and tapering the "off" and perhaps the "on"
transitions of the gate so that a gradual transition of the audio
output from "off" to "on" is accomplished and a relatively smooth
transition as indicated in FIG. 4(f) results. The object is to
minimize the gap effect which in itself has an audio characteristic
and can act like a cue. Tapering the trailing edge of the gate
helps this considerably whereas an anticipating start (or relative
delay of the speech signal) would be preferable for gradual onset
for the leading edge. With these relatively simple expedients the
smooth transition between adjacent chunks which are disjunctively
joined by the operation of the compression-discard process is
achieved in a manner which is satisfactory for many
applications.
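The tapered gate described above can be illustrated in discrete-time form. The sketch below is not the disclosed circuit; it assumes a sampled signal, a hard "off" edge, and a linear taper on the trailing edge of the gate only. All names and the sample counts are hypothetical.

```python
def tapered_gate(n_samples, gate_start, gate_len, taper_len):
    """Gain per sample: 1.0 before the gate, 0.0 during it, then a linear
    rise back to 1.0 over taper_len samples (tapered trailing edge)."""
    gate_end = gate_start + gate_len
    gain = []
    for i in range(n_samples):
        if i < gate_start:
            g = 1.0
        elif i < gate_end:
            g = 0.0                                   # transient is blanked here
        elif i < gate_end + taper_len:
            g = (i - gate_end + 1) / (taper_len + 1)  # gradual onset
        else:
            g = 1.0
        gain.append(g)
    return gain


def apply_gate(signal, gain):
    """Use the gate as a gain control signal on the audio samples."""
    return [s * g for s, g in zip(signal, gain)]


gain = tapered_gate(12, gate_start=4, gate_len=3, taper_len=3)
print(gain)
```

A gradual onset for the leading edge would require the gate to anticipate the transition, or the speech signal to be delayed relative to it, as noted above; the sketch omits this.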
Referring now to FIG. 5, the more elaborate arrangements for
bridging the gap between adjacent samples will be described. As
shown in FIG. 5(a) a disjunctive transition which is the norm to be
expected represents a sharp discontinuity in the message signal and
has superposed thereon the noise transient from switching the line
as previously described. By introducing a gate signal FIG. 5(b) of
sufficient width to encompass the line switching transient and
conditioning the gate to coincide with a zero level and same
direction of change for the adjacent signals being processed a zero
level gating transition as shown in FIG. 5(c) can be achieved. This
transition which is free of line switching noise and essentially
continues an existing zero amplitude signal level during the
interval of the gate has been found to provide little or no
disturbance to the average listener.
Because of the nature of the human hearing phenomenon, particularly
the ability of the ear to synthesize the message it is
concentrating upon even in the presence of noise, it may be useful
in certain circumstances to introduce a pseudo or real message
component in the zero level interval indicated in FIG. 5(c). For
this purpose suitably selected noise or signal components of
approximately the same amplitude and frequency can be inserted in
what is otherwise a quiet gap interval in the message stream and
this arrangement of the invention is indicated in FIG. 5(d). Where
the gap is to be filled with noise components, a suitable source
and symmetrical switching to introduce noise from the source into
the signal channel can be readily applied during the gating
interval.
FIG. 6 represents a preferred form of gap filling where two signal
controlled delay lines are used. The speech signal is applied to
both delay lines designated channel A and channel B in FIGS. 6(a)
and 6(b) respectively and these two lines are signal controlled to
have symmetrical complementary gain characteristics and overlapping
variable delay characteristics as shown in FIGS. 6(c) and 6(d).
Here the delay control signals as shown in FIG. 6(d) are phased to
overlap at least an amount corresponding to the transition portion
of the gain control characteristics of FIG. 6(c). The outputs of
both delay channels A and B are combined to produce the combined
output shown in FIG. 6(e).
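The complementary gain characteristics of the two channels can be sketched as a linear crossfade. This is an illustration under assumptions: the source does not specify the shape of the gain transition, so a linear ramp is used here, and the function names and sample counts are hypothetical.

```python
def complementary_gains(n_samples, fade_start, fade_len):
    """Gains for channels A and B that always sum to one; channel B
    fades in linearly while channel A fades out."""
    gain_a, gain_b = [], []
    for i in range(n_samples):
        b = min(1.0, max(0.0, (i - fade_start) / fade_len))
        gain_a.append(1.0 - b)
        gain_b.append(b)
    return gain_a, gain_b


def combine(chan_a, chan_b, gain_a, gain_b):
    """Sum the two gain-controlled channel outputs into one output."""
    return [ga * a + gb * b
            for a, b, ga, gb in zip(chan_a, chan_b, gain_a, gain_b)]


gain_a, gain_b = complementary_gains(8, fade_start=2, fade_len=4)
combined = combine([1.0] * 8, [1.0] * 8, gain_a, gain_b)
print(gain_a, gain_b, combined)
```

Because the gains are complementary, two channels carrying the same level produce a combined output of constant level through the transition.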
Generally the delay lines used for channels A and B in FIG. 6 will
comprise one full length delay line and one relatively shorter
length delay line for storing the signal used for gap
filling purposes. This arrangement will reduce the cost of the
equipment represented by the multiple section delay lines necessary
to obtain the required maximum delay length for system performance
requirements. On the other hand, for systems where cost is not a
primary factor, two equal full length variable delay lines can be
employed and their control signals can be alternately applied so
that the signal channel is through first one and then the other
delay line thereby giving a full signal period for switching the
inactive delay line back to minimum delay condition prior to its
use for signal transmission again. For such symmetrical delay lines
it may still be useful to provide some overlap during the
transition as indicated in FIG. 6(d) with appropriate gain control
signals applied as indicated in FIG. 6(c).
Referring now to FIG. 7, a basic speech compression-expansion
system in accordance with the invention will be described. This
system comprises a variable speed playback device 51 which is
indicated to be a tape transport with a manual select speed control
input 52. The signal derived from transporting the tape past a
magnetic transducer is applied to an AGC amplifier 53 which also
passes the signal through a band pass filter having an adjustable
low and high frequency cutoff. The selection of the cutoff
frequencies for the filter may be operated from manual control 52
in conjunction with the selection of playback speed for the
playback device 51. The manual control 52 also supplies an
amplitude control signal to a fine voice pitch adjust control 54
which supplies on line 55 a signal to control the end amplitude of
the linearly increasing waveform which controls the variable delay
line as hereinafter described.
The signal after passing through the amplifier and filter 53 enters
a variable delay line 56 which can be signal controlled between
minimum and maximum delay limits. This control signal applied on
line 57 is derived from a ramp level and amplitude changer 58 which
receives as its input either a compression triangular waveform on
line 59 or an expansion inverse of the waveform on line 59 which
appears on line 61 after passing through an inverter 62. One or the
other of the lines 59 and 61 is energized with a ramp waveform
depending upon the setting of a switch 63 which supplies the basic
ramp waveform from ramp pulse train generator 64. The repetition
period of the ramp waveform is selectable by a manual control 65. A
pulse coincident with the reset of the linear portion of the ramp
waveform appears on line 66 and is applied to a blanking pulse
generator 67 to produce a blanking pulse output the width of which
can be controlled by manual adjustment 68 and which is synchronized
with the input pulse on line 66.
The output of the variable delay line 56 is applied to a blanking
circuit and amplifier 71 which transmits or blocks the signal
depending upon the blanking pulse B applied on line 72 from
generator 67 and when the blanking pulse is not present (B̄) the delayed
signal is applied to a speech bandpass filter 73 the output of
which is applied to an audio reproducer 74.
In addition to the amplitude excursion established for the linear
voltage ramp signal from generator 64 which is controlled by manual
control 52 the absolute level of the voltage applied can be
controlled by level adjust means 60. The variable delay line 56
will generally be of any known type and in particular may be 360 RC
filter stages where the shunt resistor is provided by a FET or
other semiconductor device which varies resistance in response to a
controlled voltage or current. Such delay lines generally perform
best with respect to distortion of the signal passing therethrough
if the phase delay per stage is kept well below the maximum
possible value of 90.degree.. Accordingly, the line can be designed
to operate with 45.degree. to 60.degree. phase delay per stage
maximum and the number of stages is then determined as greater than
the quantity: N > (6 or 8) c f.sub.max .DELTA.T.sub.max. In the
above inequality the digits 6 and 8 represent the number of stages
per electrical cycle of the highest frequency to be passed
corresponding to a phase delay of 60.degree. or 45.degree.,
respectively, as the maximum phase shift per stage which is to be
utilized; the quantity c is the compression ratio; the quantity
f.sub.max is the highest frequency being passed by the line; and
.DELTA.T.sub.max is the maximum signal delay desired as dictated by
the maximum permissible discard interval previously specified. Many
other forms of delay line constructions which are capable of being
signal controlled are known in the art and the present invention is
not to be considered as limited to any particular form of delay
line.
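The stage-count inequality above lends itself to a simple calculation. In the sketch below the factor 6 or 8 is obtained as 360 degrees divided by the phase delay per stage. The example values c = 5 and .DELTA.T.sub.max = 6 ms follow the earlier discussion, whereas the 3000 Hz upper frequency is an assumption chosen only for illustration and does not appear in the source.

```python
def min_delay_stages(c, f_max_hz, delay_max_ms, phase_per_stage_deg=60.0):
    """Lower bound on the number of stages N from
    N > (360 / phase) * c * f_max * delay_max."""
    stages_per_cycle = 360.0 / phase_per_stage_deg   # 6 at 60 deg, 8 at 45 deg
    cycles_in_line = c * f_max_hz * delay_max_ms / 1000.0
    return stages_per_cycle * cycles_in_line


# f_max = 3000 Hz is an assumed value, not taken from the source.
stages_60 = min_delay_stages(5, 3000.0, 6.0, 60.0)
stages_45 = min_delay_stages(5, 3000.0, 6.0, 45.0)
print(stages_60, stages_45)
```

Under these assumed values the bound is 540 stages at 60 degrees per stage and 720 stages at 45 degrees per stage; the number of stages actually used must exceed the bound.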
Referring now to FIGS. 8(a) and 8(b) the operation of the system of
FIG. 7 will be described. The sample period waveform 81 has an
adjustable period set by control 65 for producing an asymmetrical
sawtooth waveform 82 which produces a relatively long negative
going linear voltage followed by a shorter positive going linear
voltage. This waveform is used directly on line 59 for speech
compression while, after inversion in inverter 62, its inverse is
used on line 61 for expansion. The expansion waveform is indicated
in dotted lines at 83 in FIG. 8(a). For a variable delay line 56
which increases delay as the control voltage becomes more negative,
the waveforms 82 and 83 have the proper sense for controlling the
delay interval and the magnitude of the delay is determined by
amplitude control 52 relative to a voltage level set by the level
adjust 60. Thus the operating point in the excursion of the
waveform 82 is selected for a given compression ratio in
conjunction with the sample period which will be a predetermined
combination for any given compression ratio assuming the maximum
delay .DELTA.T.sub.max in the line 56 is a fixed value as obtained
by selecting line length according to the value d T.sub.out
as given in FIG. 3(a) and Table I for the desired compression
ratio. If the maximum delay to the signal is not maintained
constant the discard period will change correspondingly as is
evident from the description of FIG. 1 and corresponding
adjustments in the amplitude of the wave will be required to give
the slope d required for a compression ratio c. Similar
considerations apply for the slope of curve 83 which must be set at
its corresponding value d for an expansion ratio e.
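The role of the slope d can be illustrated with an idealized model. The sketch below is not the disclosed circuit: it assumes an ideal delay that increases linearly from zero as d times the elapsed time, and it takes d = 1 - 1/c, a value inferred from the c = 5 example (d T.sub.out = 6 ms with T.sub.out = 7.5 ms) rather than stated in this form in the source. A tone recorded at 333 Hz and played back at c times speed then emerges at its original frequency. All names are hypothetical.

```python
import math


def delay_slope(c):
    """Assumed ramp slope d (delay per unit time) for compression ratio c."""
    return 1.0 - 1.0 / c


def delayed_output(t_ms, c, recorded_hz):
    """Output at time t_ms of a tone recorded at recorded_hz, played back
    at c times speed and passed through a delay ramping as d * t."""
    played_hz = c * recorded_hz                 # frequency raised by fast playback
    delay_ms = delay_slope(c) * t_ms            # linearly increasing delay
    return math.sin(2.0 * math.pi * played_hz * (t_ms - delay_ms) / 1000.0)


c = 5
sample_ms = 7.5
print(delay_slope(c) * sample_ms)               # final delay at end of sample
print(delayed_output(1.0, c, 333.0))
```

Within one sample period the output of this model follows sin(2 pi 333 t), the original recorded wave, and the delay at the end of the 7.5 ms sample is 6 ms.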
The operation of blanking pulse generator 67 is shown to produce a
pulse 84 in FIG. 8(b) of predetermined width in response to the
start pulse of the sample period signal 81 received on line 66.
This pulse may be applied in gain control fashion to the circuit 71
with modified trailing edge as previously described to reduce the
transient signal and provide a gradual onset of voice sound signals
which are passed to the transducer 74. The width B of the blanking
pulse is selected with control 68 and is normally made of
sufficient duration to permit the short steep linear portion of the
ramp waveform to return the delay line 56 to its zero or minimum
delay condition and dissipate the signal energy therein (or the
transient caused by switching the line itself) prior to enabling
the signal channel which energizes the transducer 74 with the
subsequent speech signal segments.
The blanking period B and enabled period B̄ for expansion mode are
shown in FIG. 8(c). The expanded chunks with an initial output gap
are shown in FIG. 8(d).
The system of FIG. 7 can also be used to substitute noise or pseudo
signal gap filling signals corresponding to the system described in
FIG. 5. For this purpose a source 75 of such signals is arranged to
supply the input signal to filter 73 during the blanking interval.
By means of a switch 76 this gap filling during the blanking
interval can be made optional. The gap filling signal 75 can also
be derived from the message signal output of amplifier 53.
Referring now to FIG. 9, a modified form of the invention
particularly suitable for accomplishing the various gap filling
procedures for the speech compression systems previously described
will be disclosed. Portions of FIG. 9 which are essentially the
same as those described in FIG. 7 have corresponding reference
numerals and accordingly only the additions and changes will be
further described. In addition to the variable delay line 56 a
second variable delay line 91 receives the signal wave from
amplifier 53. The output of the delay lines 56 and 91 are applied
respectively to complementary blanking circuits 92 and 93. Signals
passed by these blanking circuits 92 and 93 are amplified and
filtered in element 73 and passed to the acoustic reproducer 74 as
heretofore described.
A pulse train generator 94 produces a pulse wave train as shown in
FIG. 10(a) having a selectable pulse repetition rate determined by
the setting of manual control 65 thereby establishing the basic
sample period. The output pulse from generator 94 is delayed in
delay unit 95 and applied to a first ramp generator 96 and in
undelayed form is applied to a second ramp generator 97. The ramp
generators 96 and 97 are subject to waveform level control from
manual adjust element 60 and ramp linear wave amplitude control
from the manual adjust element 52. As previously stated, the fine
pitch adjustment 54 may be provided for slightly modifying the ramp
slope as a voice pitch adjustment by effectively altering the
frequency conversion over a small range. In addition the blanking
width interval of each generator is adjustable with controls 68 and
70 respectively. The outputs of the ramp generators 96 and 97 are
applied respectively to delay lines 56 and 91 to control the time
delay of signals passing through the respective lines in accordance
with the control signals applied. By means of c or e select
controls the sense of the slope of the ramp waveforms can be
selected for compression or expansion.
The level and amplitude controls for setting the respective ramp
generators 96 and 97 are preferably relatively adjustable to permit
selection of the relationship between the two ramp waveforms. By
making the delay and phasing of the unit 95 adjustable any desired
delay line overlap can also be achieved. It is also possible to
rearrange the components to have the complementary gating at the
inputs of the two delay lines 56 and 91 with the outputs switched
to be combined in a common channel to amplifier 73. This
alternative discards the portion of the speech signal that is not
utilized by each line before it enters the line and thus eliminates
the necessity for dissipating these portions when the lines are
switched between active periods.
Referring now to FIG. 10, the operation of the speech compression
system of FIG. 9 will be described. The pulse train generator 94
produces the timing waveform of FIG. 10(a). This pulse triggers the
transition of waveform C2 in pulse ramp generator 97 which produces
the blanking pulse indicated in FIG. 10(c) with the predetermined
width of B and B̄ being determined by the blanking pulse width
control 68. After the delay indicated in FIG. 10(b) the pulse from
generator 94 triggers the ramp generator 96 to produce the waveform
C1 shown in FIG. 10(b). With this arrangement the control wave C1
for the delay line 56 is overlapped in time by waveform C2 having
slope in the same sense and bridging the steep return slope
waveform of ramp wave C1. With the asymmetrical time intervals
shown in FIG. 10, the arrangement for gap filling modes of
operations shown in FIGS. 5 and 6 can be practiced. By making the
waveforms C1 and C2 have symmetrical rising and falling portions
the arrangement is suitable for alternate switching of the lines 56
and 91 to provide alternate compressed (or expanded) chunks of the
speech sample. The choice of the relative lengths of sample through
line 56 and 91 will generally be dictated by manufacturing costs
for the delay line. Thus for a main delay line 56 of adequate
length for the compression ratio desired, a relatively shorter line
91 used only for gap filling purposes will generally be more
economical. On the other hand, two full length lines which are
alternately active to pass speech sample chunks thereby providing
adequate time for the non-active line to be returned to its minimum
delay condition will provide for smooth transitions, any desired
overlap and the maximum time interval for discharging the line to
minimum delay condition prior to its processing the next speech
sample. The action of the system of FIG. 9 in the gap filling mode
is indicated in FIG. 10(d) and generally corresponds to that
previously described with respect to FIG. 5.
The operation of the system of FIG. 9 for speech expansion, i.e.,
increasing the time duration for a given speech utterance and
increasing the frequency components thereof from a reproducer
running at a slower than recorded rate is shown in FIG. 11. Here
the ramp generators 96 and 97 have inverted outputs to produce the
expansion waveforms E1 and E2 shown in FIGS. 11(a) and 11(c)
respectively and the blanking waveform has been made symmetrical
such that the delay lines 56 and 91 are used alternately for
approximately equal periods. By the nature of speech expansion, a
gap in the signal output will always occur since the lines are
controlled to change from maximum delay at the start of the sample
to minimum or zero delay at the end of the sample. Thus when the
line is switched to maximum delay there will inevitably be a time
gap before delayed signal emerges from the output end of the line.
Applying the control sequence indicated at FIG. 11 the speech
samples processed by lines 56 and 91 are overlapped so as to fill
the gap as indicated in FIG. 11(d) by the solid and dotted outlined
signal chunks E.sub.1' and E.sub.2'. The presence of a
slight overlap in the reproduced signal does not significantly
interfere with intelligibility since it generally is not noticeable
and at worst may result in a slight echo effect of the type
commonly encountered in a telephone conversation. The time expanded
speech waveform obtained using the mode of operation indicated in
FIG. 11 is useful for the recognition and comprehension of
difficult passages and for analysis and study of foreign languages
and the like.
The system shown in FIG. 12 represents a simplification of the
system of FIG. 9 where a fixed delay line 101 is used in place of
the second variable delay line 91 of FIG. 9. The control of
blanking circuits 92', 93' is simplified in that the variable width
blanking gate B as derived from pulse train generator 94
correspondingly produces gaps in the output signals which have been
delayed by passage through variable delay line 56. The fixed delay
of line 101 is selected to further delay some portion of the signal
emerging from the delay line 56 by an amount sufficient to fill the
gap caused by blanking pulse B thereby essentially repeating some
portion of each message chunk while the variable delay line 56 is
switched back to its minimum delay condition. Again, this
repetition is not objectionable and may merely introduce a slight
echo effect which is much less objectionable than the presence of
the gap in the message signal. This sequence of operation is shown
in FIG. 13 where the variable chunk C.sub.v and the fixed chunk
C.sub.F alternate in supplying the output.
The expansion mode for operation of the circuit of FIG. 12 is shown
in FIG. 14 where the ramp signals are inverted for the expansion
waveform which controls the delay line 56 to vary from maximum
delay to minimum delay over the linear ramp portion E shown in FIG.
14(a). The blanking waveform B is selected to pass some portion of
the signal chunk through the appropriate amount of delay to fill
the gap between chunks in the output as indicated in FIG. 14(c).
Thus the output is composed of chunks E.sub.F and E.sub.v in
alternation for continuous signal.
The system of FIG. 12 could be further simplified by eliminating
delay line 101 and conditioning gate 93' to introduce in the gap
interval any pseudo or noise signal from a suitable source which
would simulate the frequency content of the actual speech signal.
While this version would be less desirable than using the actual
speech signal for gap filling it would, nevertheless, be better
than reproducing the speech signal with the message gaps present
since the audible effect of gaps becomes detrimental to recognition
of the message content, especially at high compression ratios. This
modification gives a mode of operation similar to the optional
noise gap filling described for FIG. 7.
FIG. 15 shows a modification of the invention for binaural
processing. The speech signal from the band pass filter 53 is
applied to symmetrical variable delay lines VDL.sub.1 and VDL.sub.2
controlled by waveform generator 102. The output of VDL.sub.1 is applied
as an input to gates 103 and 105. The output of VDL.sub.2 is
applied as an input to gates 104 and 106. The delay line VDL.sub.1
is controlled for linear variation of delay according to the
waveform of FIG. 16(c). The delay line VDL.sub.2 is controlled for
linear variation of delay according to the waveform of FIG. 16(d).
Each of these waveforms has its rapid return transition at the
mid-point of the linear delay portion of the other waveform.
The gates 103 and 106 are controlled by the gating waveform B.sub.1
and its complement, shown in FIG. 16(e). Gate 103 passes signal during
B.sub.1 and is blocked during its complement. Gate 106 is blocked
during B.sub.1 and passes signal during its complement. Amplifier 107 combines
the outputs of gates 103 and 106 and applies the combined signal to
an audio reproducer 108.
The gates 104 and 105 are controlled by the gating waveform
B.sub.2 and its complement, shown in FIG. 16(f). Gate 104 passes signal
during B.sub.2 and is blocked during its complement. Gate 105 is blocked
during B.sub.2 and passes signal during its complement. Amplifier 109
combines the outputs of gates 104 and 105 and applies the combined
signal to an audio reproducer 110.
The system of FIG. 15 operates to reproduce the entire original
signal (for compression ratio equal to two) since each delay line
processes the portion which is the discard for the other line as is
evident from FIGS. 16(a) and 16(b). For compression ratios greater
than two some message discard occurs and for ratios less than two
the overlap or message duplication increases in the output. By
listening binaurally, however, the intelligibility is enhanced
since the overall discard is eliminated (or greatly reduced for the
higher compression ratios) and the overlap or repeat of message
portions is not detrimental to word detection by the listener.
A binaural system without supplemental gap filling (as just
described) would be achieved by removing gates 105 and 106 in FIG.
15. The lines VDL.sub.1 and VDL.sub.2 would supply the processed
signal in alternation to the respective output transducers 108 and
110 for binaural output.
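
The discard and overlap behavior described for FIG. 15 can be
illustrated with a rough bookkeeping sketch. It assumes, as a
simplification not stated in this form in the text, that each delay
line retains a fraction 1/c of the message at compression ratio c, so
that two staggered lines together retain 2/c. The function name and
the returned keys are hypothetical.

```python
def binaural_coverage(c: float) -> dict:
    """Rough message bookkeeping for two staggered delay lines.

    Assumption (ours): one line retains 1/c of the message at
    compression ratio c, so two lines together retain 2/c.
    """
    if c < 1.0:
        raise ValueError("compression ratio must be at least 1")
    combined = 2.0 / c
    return {
        "retained": min(1.0, combined),      # fraction of message heard
        "discard": max(0.0, 1.0 - combined), # fraction lost by both lines
        "overlap": max(0.0, combined - 1.0), # fraction heard in both ears
    }


if __name__ == "__main__":
    for ratio in (1.6, 2.0, 4.0):
        print(ratio, binaural_coverage(ratio))
```

Under this assumption a ratio of two gives neither discard nor
overlap, which agrees with the statement that each line processes the
portion discarded by the other.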
FIG. 17 shows the invention using a form of delay line capable of
processing speech signals in a manner which greatly reduces the
problems associated with discarding stored information in the line.
The system shown in FIG. 17 comprises an analog shift register
having a plurality of stages ASR.sub.1, ASR.sub.2, ASR.sub.n, which
has a speech signal input applied on line 111 and a compressed or
expanded speech signal output on line 112. Alternate stages of the
delay line are clocked by two-phase clocking signals applied at
lines 113 and 114 as derived from a shift frequency generator 115.
The frequency variation of the generator 115 is such that the
inverse of the clocking frequency, namely the pulse-to-pulse period,
varies as a linear function of time, with the frequency varying from
high frequency to low frequency for compression and from low
frequency to high frequency for expansion mode operation.
The analog shift register shown in FIG. 17 is of the general type
disclosed in the article by F. L. J. Sangster published in the 1970
IEEE International Solid-State Circuits Conference Proceedings,
pages 74-75 and 185. Such shift registers sample an analog signal
and pass the sampled value along the line at the clock rate by a
charge or charge deficit storage mode which permits the signal
sample to be recovered at the output of the delay line after a time
delay which is proportional to the inverse of the clocking rate. In
the present invention, with the clocking rate varied so that the
inverse thereof is a linear function of time, the delay line operates
to expand or compress the speech signal and by tailoring the length
of the line and the repetition rate of the linear control function
in accordance with the principles herein set forth, continuous
processing of random speech signals is achieved. In the compression
mode, at the end of each linear segment of the control function
generated by generator 115, all stages of the delay line may be
reset if a reset input is available, or the line may be simply
emptied during the blanking period as the line is reloaded with the
beginning of the next segment of the speech signal at the high
frequency clock rate. This rate may be made rapid enough to make
the required blanking so short as to be aurally imperceptible. Thus
the problems associated with gap filling or smoothing or blanking
may be minimized in this version of the invention.
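
The behavior of the clocked line of FIG. 17 can be illustrated with a
minimal timing sketch. It assumes an idealized line in which a sample
entering on one clock pulse leaves exactly a fixed number of pulses
later, and a clock whose pulse-to-pulse period grows linearly with
time. The function name and all parameter values are hypothetical.

```python
from collections import deque


def asr_delays(samples_held, period0, period_slope, duration):
    """Return (exit_time, delay) pairs for an idealized clocked line.

    samples_held : number of clock pulses a sample spends in the line
    period0      : initial pulse-to-pulse period in seconds
    period_slope : growth of the period per second of elapsed time
    duration     : length of the sweep in seconds
    """
    t = 0.0
    ticks = deque(maxlen=samples_held + 1)  # most recent pulse times
    result = []
    while t < duration:
        ticks.append(t)
        if len(ticks) == samples_held + 1:
            # The sample leaving now entered samples_held pulses ago.
            result.append((t, t - ticks[0]))
        t += period0 + period_slope * t  # period is linear in time
    return result


if __name__ == "__main__":
    sweep = asr_delays(100, 1e-5, 1e-3, 0.01)
    print("first delay:", sweep[0][1], "last delay:", sweep[-1][1])
```

With a positive slope the delay grows throughout the sweep, which
corresponds to the compression mode; a zero slope gives a constant
delay equal to the number of pulses times the period.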
The design parameters for the analog shift register (ASR) can be
established by applying the criteria previously set forth, as will
now be described.
The instantaneous delay .tau.(t) of an analog shift register at
time t is .tau.(t) = dt + .tau..sub.o, where d is the rate of change
of delay and .tau..sub.o is the initial delay. In the ASR with N
stages, the delay dt becomes N [(p-1)/p] (1/f.sub.t), where f.sub.t
is the frequency of the clock shift signal at time t and p is the
number of phases. Defining N [(p-1)/p] as N', the delay becomes
.tau.(t) = N' [(1/f.sub.t) + (1/f.sub.o)], where f.sub.o is the
initial clock shift frequency.
The slopes of the time delay function for restoring the original
speech frequencies are the same quantities as before. For
compression:
c't = .DELTA.T.sub.t = total delay to signal entering at time t
= (c-1)t for compression ratio c, and
d = 2(c-1)/(c+1) = (N'/t) [(1/f.sub.t) - (1/f.sub.o)]
For expansion:
c't = .DELTA.T.sub.t = [(1-e)/e] t for expansion ratio e, and
d = 2(1-e)/(1+e) = (N'/t) [(1/f.sub.t) - (1/f.sub.o)]
Hence the inverse of the shift frequency as a linear function of
time multiplied by N', produces the delay needed to obtain
compression or expansion of the speech wave with original speech
frequencies restored.
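
As an illustration, the relation d t = N' [(1/f.sub.t) -
(1/f.sub.o)] can be solved for the clock frequency at any time in the
sweep. The following sketch takes the slope expressions given above
for compression and expansion as its starting point; the function
names and the numerical values in the example are hypothetical.

```python
def effective_stages(n_stages, phases):
    """N' = N (p - 1) / p for an N-stage, p-phase shift register."""
    return n_stages * (phases - 1) / phases


def delay_slope(ratio, mode="compression"):
    """Slope d of the delay function for a given ratio."""
    if mode == "compression":
        return 2.0 * (ratio - 1.0) / (ratio + 1.0)
    if mode == "expansion":
        return 2.0 * (1.0 - ratio) / (1.0 + ratio)
    raise ValueError("mode must be 'compression' or 'expansion'")


def clock_frequency(t, f_start, n_eff, d):
    """Solve d*t = N' (1/f_t - 1/f_o) for f_t."""
    period = 1.0 / f_start + d * t / n_eff
    if period <= 0.0:
        raise ValueError("sweep runs past the point where the period is zero")
    return 1.0 / period


if __name__ == "__main__":
    n_eff = effective_stages(512, 2)          # assumed 512-stage, two-phase line
    d = delay_slope(2.0, "compression")
    for t_ms in (0.0, 0.5, 1.0):
        f_t = clock_frequency(t_ms * 1e-3, 100e3, n_eff, d)
        print(f"t = {t_ms} ms, clock = {f_t:.1f} Hz")
```

For compression the slope is positive and the computed clock falls
from its starting value; for expansion the slope is negative and the
clock rises, in agreement with the sweep directions stated for FIG.
17.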
Referring to FIG. 18, the reset time t.sub.N to clear N stages is
t.sub.N = (N'/2) [(1/f.sub.N) + (1/f.sub.o)], which represents the
time needed for the first N pulses to refill the line. By
suppressing the sample-switching transients by filtering or
blanking or any of the other methods herein described, and holding
t.sub.N to less than 0.2 ms, the effect of gap intermodulation
previously discussed becomes virtually imperceptible.
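
A short numerical check of the reset time may be helpful. The sketch
below evaluates t.sub.N = (N'/2) [(1/f.sub.N) + (1/f.sub.o)] and
compares it with the 0.2 ms figure. Reading f.sub.N as the clock
frequency at the end of the refill is an assumption, and the function
names and example values are hypothetical.

```python
def reset_time(n_eff, f_end, f_start):
    """t_N = (N'/2) * (1/f_N + 1/f_o), in seconds."""
    return 0.5 * n_eff * (1.0 / f_end + 1.0 / f_start)


def gap_within_limit(t_n, limit=0.2e-3):
    """True when the refill time is below the stated 0.2 ms figure."""
    return t_n < limit


if __name__ == "__main__":
    for n_eff in (16, 256):
        t_n = reset_time(n_eff, 100e3, 100e3)  # assumed 100 kHz refill clock
        print(n_eff, t_n, gap_within_limit(t_n))
```

The example suggests that a long line refilled at a modest clock rate
exceeds the limit, so the refill clock would have to be raised in
proportion to the number of stages.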
There is a limit to be observed for the sampling clock frequency to
accommodate the highest signal frequency f.sub.max to be passed by
the line. Referring to FIG. 25, the signal frequency for compression
decreases linearly as it passes through the delay line, as indicated
by line 201. The clock frequency varies as a hyperbolic function
indicated at 202 and must throughout the sample period be equal to or
greater than the values of line 201, for which the relation is given
by f.sub.t .gtoreq. [2p/(p-1)] f.sub.max
(which for a two-phase ASR in which p = 2 is 4f.sub.max) to provide
for at least two samples per cycle of f.sub.max throughout.
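
The sampling limit can be evaluated directly. The sketch below
computes the lowest permissible clock frequency from f.sub.t .gtoreq.
[2p/(p-1)] f.sub.max; the function name and the 5 kHz example value
are hypothetical.

```python
def min_clock_frequency(f_max, phases=2):
    """Lowest clock frequency giving two samples per cycle of f_max."""
    if phases < 2:
        raise ValueError("at least two clock phases are required")
    return 2.0 * phases / (phases - 1) * f_max


if __name__ == "__main__":
    # Assumed 5 kHz upper speech frequency, two-phase clocking.
    print(min_clock_frequency(5000.0, phases=2))
```

In a sweep the check applies to the lowest clock frequency reached,
that is, the end of the sweep for compression and the start of the
sweep for expansion.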
FIG. 18 shows a modification of the analog shift register
embodiment shown in FIG. 17, but with provision for the exact
substitution of processed signal at the initiation of reset for the
delay line. The analog shift register delay line 121 processes
input signals from line 122 in accordance with the variable pulse
frequency derived from square wave generator 123 as
previously described in connection with FIG. 17. The pulse
frequency is such that the pulse spacing varies linearly as
indicated from pulse signal source 124 where the reciprocal of
frequency is shown to be linear with respect to time. At a point
125 on the analog shift register the line branches to have
duplicate shift register stage paths 126 and 127. The number of
stages required in each of the boxes represented by 126 and 127 is
sufficient to continue processing the signal while the line 121 is
being reset. The branch outputs from stages 126 and 127 are subject
to complementary gating controls 128 and 129 and when gated to pass
signal both supply input signals to a combining amplifier 130.
In addition to the control of the main line 121-126 from generators
123 and 124, the branch line 127 is controlled through the
complementary (not-B) gate 131 from pulse generator 124, which
triggers a second square wave generator 132. When gating unit 133 is
enabled during the B interval, the trigger rate for generator 132 is
instead derived from a fixed pulse generator 134. The pulse
generator 134 may also operate at
the pulse rate 1/f.
The operation of the system of FIG. 18 can be described with
reference to the waveform associated with output line 135. For a
given sample period the variation in frequency of generator 124
starts and controls the ASR line 121 as previously described. For
this condition the complementary (not-B) gates are permitting signals
to pass and the output of stage 126 is transferred to the input of
amplifier 130, thus producing the frequency converted output signal
indicated during the sample period of the waveform. At the same time
the not-B control for gate 131 permits the same control pulse signal from
generator 124 to trigger generator 132 thereby keeping the branch
ASR stages 127 operating in synchronism with the corresponding
stages 126. The blanking control B in gate 129, however, prevents
the output from stage 127 from reaching the input of amplifier 130.
At the blanking or reset period for the main ASR line 121 and
generator 124, the B and not-B gates switch, thereby interrupting
signal flow from stage 126 to amplifier 130 and permitting signal flow
from stage 127 to the input of amplifier 130. Since stages 126 and
127 were in synchronism, this switching will be between identical
signals and thus imperceptible at the output line 135 of the amplifier 130.
At the same time the B and not-B switching in gates 133 and 131
interrupts trigger pulses from generator 124 and passes trigger
pulses from generator 134 to the square wave generator 132. This
switching of control assures that the generator 132 will continue
to process signal in stages 127 while the generator 124 can be
reset for the start of the next sample. At the end of the blanking
period there will be some discontinuity as the B and not-B gates switch
back to their original state, thereby returning control to pulse
generator 124, inasmuch as the start of the next sample period will
not produce signals which exactly coincide with the signals
terminated at the end of the blanking pulse which were under
control of the pulse generator 134.
FIG. 19 shows a modification of the present invention in which the
variable delay line under the control of the variable frequency
generator 136 operates with a 1/f repetition rate control function
exactly analogous to that described with reference to FIG. 17. In
FIG. 19 instead of carrying the analog signal through the
successive stages of the shift register, the input signal on line
137 is first converted into a digital word in an A/D converter 138,
the parallel output of which applies a parallel word to the input
registers of the first stage 139 with this digital value
transferred sequentially through the serial stages until it reaches
an output D/A converter 140 where it is converted into an analog
signal on output line 141. This operation is entirely analogous to
the system described with reference to FIG. 17 except for the
coding of the information as it passes through the serial stages
which are driven with the variable clock frequency to provide the
frequency conversion required. One advantage of the system of FIG.
19 is in the provision of a reset signal from generator 136 on line
142 which can be applied to all the registers of all the stages
simultaneously thereby effecting an instantaneous clearing and
resetting of the line at the end of the sample period.
FIG. 20 shows a modification of the invention which is analogous to
that shown in FIG. 19 except that the digital signal is processed
serially through a serial shift register 150 after the digital
output of A/D 138 has been serialized in a serializer 151. The
shift register 150 is under the control of the shift frequency
generator 136 including a reset input line 142. The output of the
serial digital shift register 150 is applied to a parallelizer 152
which converts the serial bit string into a parallel digital word
for conversion by D/A 140 into the required analog output signal on
line 141.
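
The signal path of FIG. 20 can be sketched in software as a chain of
conversion, serialization, serial shifting, and reconversion. The
sketch assumes an 8-bit unsigned code for a signal in the range -1 to
+1 and most-significant-bit-first serialization, neither of which is
specified in the text; all function names are hypothetical, and the
variable clock is omitted so that only the data path is shown.

```python
from collections import deque


def to_word(x, bits=8):
    """A/D step: map a value in [-1, 1] to an unsigned integer code."""
    levels = (1 << bits) - 1
    x = max(-1.0, min(1.0, x))
    return round((x + 1.0) / 2.0 * levels)


def to_analog(word, bits=8):
    """D/A step: map an unsigned integer code back to [-1, 1]."""
    levels = (1 << bits) - 1
    return word / levels * 2.0 - 1.0


def serialize(word, bits=8):
    """Serializer: parallel word to a list of bits, MSB first."""
    return [(word >> (bits - 1 - i)) & 1 for i in range(bits)]


def parallelize(bit_list):
    """Parallelizer: list of bits (MSB first) back to a word."""
    word = 0
    for b in bit_list:
        word = (word << 1) | b
    return word


def serial_shift_register(bit_stream, length):
    """Delay a bit stream by `length` clock pulses."""
    reg = deque([0] * length, maxlen=length)
    out = []
    for b in bit_stream:
        out.append(reg[0])  # oldest bit leaves the register
        reg.append(b)       # new bit enters; maxlen drops the oldest
    return out


if __name__ == "__main__":
    words = [to_word(x) for x in (-0.5, 0.0, 0.75)]
    stream = [b for w in words for b in serialize(w)]
    delayed = serial_shift_register(stream + [0] * 16, 16)[16:]
    print([to_analog(parallelize(delayed[i:i + 8])) for i in range(0, 24, 8)])
```

The recovered values differ from the inputs only by the quantizing
error of the assumed 8-bit code.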
FIG. 21 shows an embodiment of the invention in which an analog
storage matrix with provision for addressable read-in and read-out
signal storage is provided. A charge storage matrix 161 is shown
having a plurality of X write lines 162 and a second plurality of Y
write lines 163 the intersections of which define the matrix
address at which are located analog storage elements. Typically, an
analog storage matrix will have a capacitor charge storage device
at each intersection of the X and Y lines defining the matrix for
storing an analog value represented by the charge on the capacitor.
Each such memory location is also accessed by a plurality of X read
lines 164 and corresponding plurality of Y read lines 165 where the
intersection of the lines 164 and 165 correspond with the location
of the charge storage elements located at the intersections of the
write lines 162 and 163.
To store an analog signal in a charge storage memory 161 an analog
signal input on line 166 is applied and the instantaneous value
thereof is stored in the charge storage element associated with the
coincidentally energized X and Y write line intersection from an X
counter-write enable 167 and a Y counter-write enable 168.
Typically the X and Y counters 167 and 168 will be operating at a
predetermined pulse rate derived from a pulse generator 169 with
the number of pulses in the X line sequencing the X write lines 162
after which the Y counter 168 is stepped and the next row of X
intersections with the then active Y line being energized by the
next sequence of pulses from the pulse generator 169. Accordingly,
the memory 161 has a storage capacity of X .times. Y number of
storage elements corresponding to the number of intersections of
the X and Y lines. With pulse generator 169 operating at a constant
frequency this write-in of the analog signal on line 166 occurs at
a predetermined rate and storage capacity is selected to store a
signal sampled in accordance with the general requirements
hereinbefore set forth.
A frequency converted output signal is derived from output line 171
which receives serially from the charge storage elements of the
memory 161 the analog values stored therein as the matrix
intersections are sequenced in regular order by the operation of X
counter-read enable control 172 and Y counter-read enable control
173. The pulse rate for counters 172 and 173 is determined in
accordance with a ramp voltage generator 174 controlling a voltage
control oscillator 175 with the variable rate selected to produce
the desired signal compression or expansion in accordance with the
principles of this invention. For this purpose a rate control
potentiometer 176 is provided for selection of the ramp voltage
slope in generator 174, and this control will generally be ganged
with the tape playback speed control 52 described with
reference to the embodiment shown in FIG. 7. This dual control
function is indicated by line 177. A further control from the rate
control potentiometer 176 is applied on line 178 to the pulse
generator 169 to control its pulse repetition rate in relation to
the maximum read-out rate established by control of the ramp
voltage generator 174 and oscillator 175. In particular the writing
pulse rate must be maintained higher than the maximum read-out
pulse rate in order to avoid having the read-out sequencing
overtake the write-in entry. As soon as any storage element has
been read it is available for storing the value on the next signal
sequence and can be either reset upon read-out or upon entry of the
next write signal. The ramp generator 174 provides a reset on line
179 for resetting the counters at the end of each ramp voltage
period for the start of the next signal sample storage
sequence.
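
The requirement that read-out never overtake write-in can be checked
with a simple timing sketch. It assumes that both counters start from
zero at the beginning of a ramp, that write pulses arrive at a constant
rate, and that the read oscillator frequency ramps linearly between
two values; the function names and the rates in the example are
hypothetical.

```python
def read_times(f_start, f_end, ramp_period, max_count):
    """Times of successive read pulses from a linearly ramped oscillator."""
    times, t = [], 0.0
    while len(times) < max_count and t < ramp_period:
        f = f_start + (f_end - f_start) * t / ramp_period
        t += 1.0 / f
        times.append(t)
    return times


def read_overtakes_write(f_write, f_start, f_end, ramp_period, cells):
    """True if any cell would be read before it has been written."""
    reads = read_times(f_start, f_end, ramp_period, cells)
    for k, t_read in enumerate(reads, start=1):
        t_write = k / f_write  # the k-th cell is written at this time
        if t_read < t_write:
            return True
    return False


if __name__ == "__main__":
    # Read rate ramps from 20 kHz to 40 kHz over a 10 ms ramp.
    print(read_overtakes_write(50e3, 20e3, 40e3, 0.01, 1000))
    print(read_overtakes_write(25e3, 20e3, 40e3, 0.01, 1000))
```

A write rate above the maximum read rate is sufficient to keep the
read counter behind the write counter throughout the ramp, as the
first case illustrates; in the second case the write rate falls
between the two read rates and the read counter overtakes it partway
through the ramp.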
FIG. 22 shows a system employing a random access memory 181 with
write and read controls 182 and 183 which operates in a manner
analogous to that described for the system of FIG. 21. Since the
memory 181 stores binary information, however, the input signal on
line 184 is required to be converted in A/D converter 185 and the
corresponding output must be converted in D/A converter 186. The
sequencing of write-in and read-out for the memory matrix generally
corresponds to that previously described for FIG. 21.
Where gap filling is employed a particular means for minimizing the
disturbance caused by the discontinuity at the beginnings and/or
the ends of the signal samples is illustrated in FIG. 23 with
gating signal control indicated in FIG. 24. The logic control and
sequencing elements therein are so configured as to cause the
primary signal sample 191 to end at a zero crossing and the
supplementary gap filling signal 192 to start at its next zero
crossing in the same direction and then at the end of the blanking
period for the primary signal for said supplementary signal to end
in a zero crossing to be followed by the new primary signal sample
191 at its next zero crossing in the same direction. Thus, the
source signals 193 and 194 after filtering through low pass filters
195 and 196 to remove processing and spurious high frequency
components are fed into their respective gates 197 and 198 and
voltage comparators 199 and 200, the latter units so connected to
ground by directional circuitry 201 and 202 as to trigger pulse
generator (PG) 203 or 204 whenever a positive going zero crossing
by the respective signal 193 or 194 occurs. Gates 197 and 198 are
actuated so as to pass signals 193 and 194 by the reset output
lines 205 and 206 from flip flops 207 and 208. Flip flop 207 is set
by a pulse on line 209 from gate 211 as conditioned by the reset
output line 213 from flip flop 208 and by the inverted output 216
of the sample period pulse train generator 219. Flip flop 207 is
reset by the pulse output of gate 217 as conditioned by the direct
output 215 of the PTG 219. Similarly flip flop 208 is set by pulse
210 from gate 212 as conditioned by the reset output of line 214
from flip flop 207 and by the direct output 215 of PTG 219. Flip
flop 208 is reset by the pulse output of gate 218 as conditioned by
the inverted output on line 216 of PTG 219. Gate pairs 211 and 217,
or 212 and 218 are pulsed by the output of pulse generator 203 or
204, respectively, whenever a positive going signal zero crossing
occurs as described above. Thus with flip flop 207 in the set
condition allowing the primary signal 191 to pass and with flip
flop 208 in the reset condition blocking the filler signal 192,
when PTG output line 215 goes positive as at 220, gate 217 allows
the next pulse from PG 203 to reset flip flop 207, thus blocking
primary signal to line 191. At the same time, gate 212 is
conditioned to pass the next pulse from PG.sub.2 204 to set flip
flop 208 and allow the supplementary signal to pass until the end
of the gap period. At this time the inverted PTG output 216 goes
positive as at 221 allowing gate 218 to pass the next pulse from
PG.sub.2 204 and reset flip flop 208, cutting off the supplementary
signal 192 and conditioning gate 211 to pass the next pulse from PG
203. This sets flip flop 207 allowing primary signal 191 to pass to
amplifier 222 and be outputted at 223. The process then repeats in
the sequence described above.
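
The switching sequence of FIGS. 23 and 24 can be approximated in
software by a small state machine operating on sampled signals. The
sketch below is a simplified model, not the disclosed logic circuit:
it assumes the two signals and the gap-period waveform are available
as equal-length sample lists, and it outputs zero in the brief
interval when neither signal is gated through. All names are
hypothetical.

```python
import math


def positive_zero_crossing(prev, cur):
    """True when the signal passes from negative to non-negative."""
    return prev < 0.0 <= cur


def splice_at_zero_crossings(primary, filler, blank):
    """Switch between primary and filler only at positive-going zero
    crossings, following the gap waveform `blank` (True during the gap)."""
    out = []
    primary_on, filler_on = True, False
    for i in range(len(primary)):
        if i > 0:
            p_cross = positive_zero_crossing(primary[i - 1], primary[i])
            f_cross = positive_zero_crossing(filler[i - 1], filler[i])
            if blank[i]:
                if primary_on and p_cross:
                    primary_on = False      # primary ends at its crossing
                elif not primary_on and not filler_on and f_cross:
                    filler_on = True        # filler starts at its crossing
            else:
                if filler_on and f_cross:
                    filler_on = False       # filler ends at its crossing
                elif not filler_on and not primary_on and p_cross:
                    primary_on = True       # primary resumes at its crossing
        if primary_on:
            out.append(primary[i])
        elif filler_on:
            out.append(filler[i])
        else:
            out.append(0.0)
    return out


if __name__ == "__main__":
    rate = 8000
    t = [i / rate for i in range(rate)]
    primary = [math.sin(2 * math.pi * 50 * x) for x in t]
    filler = [math.sin(2 * math.pi * 70 * x + 1.0) for x in t]
    blank = [0.43 <= x < 0.62 for x in t]
    out = splice_at_zero_crossings(primary, filler, blank)
    print(max(abs(b - a) for a, b in zip(out, out[1:])))
```

Because every transition occurs where the outgoing and incoming
signals are both near zero, the largest sample-to-sample step in the
output is no greater than that of the signals themselves.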
While the invention has been described with reference to
frequency-time transformations of the original signal, the
disclosed embodiments are also useful for frequency transformation,
where required, due to other factors such as a change in the
velocity of propagation of the sound waves. For example, a human
breathing in an artificial atmosphere such as one having a high
helium content speaks with a higher than normal voice pitch but
with other parameters substantially unchanged. By using the speech
compression modes provided by the present invention, such speech
can be restored to its normal frequency range without change in
time scale.
Obviously, the methods and apparatus herein disclosed can be used
for coded audible signals other than speech such as music, for
example, by taking due consideration of the corresponding
parameters significant to comprehension as herein disclosed. Many
other modifications can also be made without departing from the
scope of the invention as defined in the appended claims.
* * * * *