U.S. patent number 3,673,331 [Application Number 05/003,991] was granted by the patent office on 1972-06-27 for identity verification by voice signals in the frequency domain.
This patent grant is currently assigned to Texas Instruments Incorporated. Invention is credited to George D. Hair, James U. Kincaid.
United States Patent 3,673,331
Hair, et al.
June 27, 1972
IDENTITY VERIFICATION BY VOICE SIGNALS IN THE FREQUENCY DOMAIN
Abstract
Voice verification is accomplished at a plurality of spaced
apart facilities each having a plurality of terminals. Multiplexing
structure interconnects the terminals through a communications link
to a central processing station. Analog reproductions of voices
transmitted from the terminals are converted into digital signals.
The digital signals are transformed into the frequency domain at
the central processing station. Predetermined features of the
transformed signals are compared with stored predetermined features
of each voice to be verified. A verify or non-verify signal is then
transmitted to the particular terminal in response to the
comparison of the predetermined features.
Inventors: Hair; George D. (Irving, TX), Kincaid; James U. (Richardson, TX)
Assignee: Texas Instruments Incorporated (Dallas, TX)
Family ID: 21708579
Appl. No.: 05/003,991
Filed: January 19, 1970
Current U.S. Class: 704/246; 704/203; 704/238; 704/249; 704/272; 704/E17.005
Current CPC Class: G10L 17/02 (20130101); G06F 3/16 (20130101); G07C 9/257 (20200101)
Current International Class: G10L 17/00 (20060101); G06F 3/16 (20060101); G07C 9/00 (20060101); G10l 001/04 (); G10l 001/08 ()
Field of Search: ;179/15A,1VS,15B ;340/146.3,148,149,152,153 ;324/77
References Cited
Other References
Shively, A Digital Processor to Generate Spectra in Real Time, IEEE
Transactions on Computers, pp. 485-491, 5/68.
Dietrich and Maiwald, Digitalized Sound Spectrograph Using FFT and
Multiprint Techniques, JASA, p. 308, 11/68.
Primary Examiner: Claffy; Kathleen H.
Assistant Examiner: Leaheey; Jon Bradford
Claims
What is claimed is:
1. A method of voice verification comprising:
converting analog representations of a voice into digital
signals,
transforming said digital signals into the frequency domain,
comparing predetermined features comprising corresponding points of
a spectral estimate of said transformed signals which are
characteristic of differences between individual voices with stored
predetermined features of the voice to be verified, and
generating a verify or non-verify signal in response to the
comparison of said predetermined features.
2. The method of claim 1 wherein said step of transforming
comprises Fourier transformation.
3. The method of claim 1 wherein said corresponding points comprise
selected segments of selected phonemes.
4. A method for verifying the voice of an individual
comprising:
converting the voice into analog electrical signals,
converting said electrical signals into digital signals,
sampling spaced apart portions of said digital signals,
Fourier transforming the sampled portions of said digital signals
into the frequency domain,
smoothing the frequency transformed signals,
forming a spectral estimate of the smoothed signals,
comparing the spectral estimate with a stored spectral signal
representative of the individual, and
generating a verify or non-verify signal in response to the
comparison.
5. The method of claim 4 wherein said Fourier transforming utilizes
the Cooley-Tukey algorithm.
6. The method of claim 4 wherein said step of smoothing
comprises:
convolving a real smoothing function with said frequency
transformed signals.
7. The method of claim 4 wherein said step of forming a spectral
estimate comprises:
multiplying the smoothed signals by the complex conjugate of said
signals.
8. The method of claim 4 wherein said step of comparing
comprises:
computing a non-negative single valued function of the Euclidian
distance between vectors corresponding to said spectral estimate
and said stored spectral signal.
9. The method of claim 8 and further comprising:
generating a verify signal only if said non-negative single valued
function of the Euclidian distance is equal to or less than a
preset threshold value.
10. The method of claim 4 and further comprising:
updating said stored spectral signal in response to variations in
said spectral estimate and said stored spectral signal.
11. A method of voice verification for a plurality of stations
comprising:
transmitting station and person identification signals from a
station to a central processing station,
in response to said identification signals generating signals
representative of a predetermined series of words to be spoken by
the identified person,
displaying the series of words to the identified person,
transmitting spoken words by the identified person to said
processing station,
comparing said spoken words with stored representations of the
words previously spoken by the identified person, and
transmitting verification signals in response to the comparison to
the station.
12. The method of claim 11 wherein said station and person
identification signals are transmitted in response to insertion of
a uniquely coded card into a station.
13. The method of claim 12 wherein said station and person
identification signals are transmitted in response to operation of
numerical registers at the station.
14. The method of claim 12 wherein said predetermined series of
words are visually displayed from a panel at the station in a
random manner.
15. The method of claim 12 and further comprising:
transmitting additional numerical signals from a station to the
central processing station, wherein operations may be performed
thereon if the identity of the person is verified.
16. A system for voice verification comprising:
means for converting analog reproductions of a voice into digital
signals,
means for transforming said digital signals into the frequency
domain,
means for comparing predetermined features of said transformed
signals which are characteristic of differences between individual
voices with stored predetermined features of the voice to be
verified,
fast Fourier transformation means for transforming said digital
signals into the frequency domain,
means for forming a spectral estimate of points of selected
phonemes of said digital signals, and
means for comparing the Euclidian distance between points of said
spectral estimate and corresponding points of said stored
predetermined features, and
means for generating a verify or non-verify signal in response to
the comparison of said predetermined features.
17. A system for voice verification comprising:
means for converting analog reproductions of a voice into digital
signals,
means for transforming said digital signals into the frequency
domain,
means for comparing predetermined features of said transformed
signals which are characteristic of differences between individual
voices with stored predetermined features of the voice to be
verified,
means for generating a verify or non-verify signal in response to
the comparison of said predetermined features, and
means for varying the stored predetermined features in response to
the comparison of said predetermined features.
18. A system for verifying the voice of an individual
comprising:
means for converting the voice into analog electrical signals,
means for converting said electrical signals into digital
signals,
means for sampling spaced apart portions of said digital
signals,
Fourier transform means for transforming the sampled portions of
said digital signals into the frequency domain,
means for smoothing the frequency transformed signals,
means for forming a spectral estimate of the smoothed signals,
means for comparing the spectral estimate with a stored spectral
signal representative of the individual, and
means for generating a verify or non-verify signal in response to
the comparison.
19. The system of claim 18 wherein said Fourier transform means
operates according to the Cooley-Tukey algorithm.
20. The system of claim 18 wherein said means for smoothing
comprises:
means for convolving a real smoothing function with said frequency
transformed signals.
21. The system of claim 18 wherein said means for forming a
spectral estimate comprises:
means for multiplying the smoothed signals by the complex conjugate
of said signals.
22. The system of claim 18 wherein said means for comparing
comprises:
means for computing a non-negative single valued function of the
Euclidian distance between vectors corresponding to said spectral
estimate and said stored spectral signal.
23. The system of claim 22 and further comprising:
means for generating a verify signal only if said non-negative
single valued function of the Euclidian distance is equal to or
less than a preset threshold value.
24. The system of claim 18 and further comprising:
means for updating said stored spectral signal in response to
variations in said spectral estimate and said stored spectral
signal.
25. A system for voice verification comprising:
a plurality of spaced apart facilities each requiring verification
of the identities of a predetermined group of people,
each facility having a plurality of terminals for receiving
identification and voice information and for indicating
verification signals,
multiplexing means interconnecting the terminals at each facility
to a communications link,
a central processing station for receiving and transmitting signals
over the communications link,
means at said central processing station responsive to
identification information from a terminal for requesting voice
information at said terminal,
frequency conversion means at said central processing station for
converting voice information transmitted from said terminal into
the frequency domain,
means at said central processing station for comparing the
frequency converted information with stored identification
information, and
means for generating verification signals for transmission to said
terminal in response to said comparison.
26. The system of claim 25 wherein each of said terminals at a
facility are connected via wirelines to said multiplexer.
27. The system of claim 25 wherein said communications link
comprises an electromagnetic wave communication system.
28. The system of claim 25 and further comprising:
reversible multiplexer means located at said central processing
station.
29. The system of claim 25 and further comprising:
gate means at said terminals for allowing entrance only upon the
receipt of favorable verification signals.
30. In a system for voice verification, the combination
comprising:
a terminal for connection through a communications link with a
central voice verification station,
means at said terminal for transmitting a unique identification
signal representative of an individual to the central station,
display means at said terminal responsive to a control signal from
the central station for displaying in random order a plurality of
required words to be spoken by the individual,
microphone means at said terminal for transmitting the spoken
required words to the central station, and
means at said terminal for indicating verification signals
transmitted from the central station in response to the spoken
required words.
31. The combination of claim 30 wherein said display means
comprises:
a panel located on said terminal and including a plurality of
selectively energizable word display portions, said control signal
sequentially energizing said portions in a random order to reduce
the possibility of fraudulent operation of the system.
32. The combination of claim 31 and further comprising:
keyboard means disposed on said terminal to enable the transmission
of supplemental information to the central station.
33. The combination of claim 30 wherein said means for transmitting
a unique identification signal comprises circuitry responsive to
the insertion of a coded card.
34. The combination of claim 30 wherein said means for transmitting
a unique identification signal comprises selectively operable
register means disposed for manual operation on said terminal.
35. The combination of claim 30 wherein said means for indicating
verification signals includes means for controlling the unlatching
of a passageway.
36. The combination of claim 30 wherein said means for indicating
verification signals energizes a visual sign.
Description
This invention relates to voice verification, and more particularly
to a method and system of voice verification from a number of
spaced apart locations.
It is necessary in a number of applications to positively identify
or recognize an individual by the use of some unique
non-transferable characteristic. For instance, one such application
is the identity verification and selective admittance of employees
to industrial plants or high security civilian or military areas.
Another type of situation includes high volume retail credit
transactions such as in large department stores or the like. It has
heretofore been known to recognize an individual's physical
characteristics by personal knowledge or by visual comparison with
a photograph and description, and it has also been heretofore known
to compare an individual's handwriting or fingerprints with known
samples. However, these prior techniques have not only been
relatively time consuming and cumbersome, but have often not been
completely satisfactory with respect to accuracy.
Systems have also been heretofore developed wherein voice
signatures of individuals are stored and then compared with spoken
words of the individual. An example of such a prior system is
disclosed in U. S. Pat. No. 3,466,394, issued to Walter K. French
on Sept. 9, 1969. However, many such prior systems, as exemplified
by the French patent, attempt to verify speech by matching voltage
amplitude peaks and valleys. The present invention recognizes that
improved verification results are provided by translating voice
data into the frequency domain and then making relatively simple
comparisons thereof to provide verification data. Prior
verification systems have also not provided a practical
verification technique which may be used to perform voice
verification over large geographic areas. Moreover, previous voice
verification systems have not recognized the requirement for remote
terminals which include counterfeit preventing devices therein.
In accordance with the present invention, a method and apparatus of
voice verification is provided wherein analog reproductions of a
voice are converted into digital signals. The digital signals are
transformed into the frequency domain, and predetermined features
of the transformed signals are compared with stored predetermined
features of the voice to be verified. A "verify" or "non-verify"
signal is then generated in response to the comparison of these
predetermined features.
In accordance with a more specific aspect of the invention, a
method and system for voice verification includes the conversion of
the voice into analog electrical signals. The analog electrical
signals are converted into digital signals which are sampled at a
predetermined rate. The sampled signals are Fourier transformed
into the frequency domain and then smoothed. A smoothed spectral
estimate of the signal is formed, the spectral estimate being
compared with a stored spectral signal representative of the
individual to be verified. A "verify" or "non-verify" signal is
then generated in response to this comparison.
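The processing chain just summarized can be illustrated with a minimal Python sketch. It is not the patented implementation: the naive DFT stands in for a fast Fourier transform, and the three-point kernel, the frame length, the synthetic test tones and the threshold are all assumptions chosen only for illustration.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (O(n^2)); a stand-in for an FFT."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def smooth(spectrum, kernel=(0.25, 0.5, 0.25)):
    """Circularly convolve the complex spectrum with a real kernel (assumed)."""
    n = len(spectrum)
    half = len(kernel) // 2
    return [
        sum(w * spectrum[(k - (j - half)) % n] for j, w in enumerate(kernel))
        for k in range(n)
    ]


def spectral_estimate(samples):
    """Transform, smooth, then multiply each bin by its complex conjugate."""
    smoothed = smooth(dft(samples))
    return [(z * z.conjugate()).real for z in smoothed]


def distance(a, b):
    """Euclidean distance between two spectral-estimate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verify(samples, stored_estimate, threshold):
    """True ("verify") when the distance does not exceed the threshold."""
    return distance(spectral_estimate(samples), stored_estimate) <= threshold


def tone(cycles, n=64):
    """Synthetic test signal: a sine with `cycles` periods in n samples."""
    return [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]


enrolled = spectral_estimate(tone(5))            # stored spectral signal
print(verify(tone(5), enrolled, threshold=1.0))  # same signal: True
print(verify(tone(9), enrolled, threshold=1.0))  # different signal: False
```

A real voice frame would replace the synthetic tones, and the threshold would have to be set from enrollment data rather than picked by hand.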
In accordance with another aspect of the invention, a method and
system of voice verification is provided for a plurality of remote
spaced apart facilities each requiring verification of the
identities of a predetermined group of people. Each facility
includes a plurality of terminals for receiving identification and
voice information and for indicating verification signals.
Multiplexing circuitry interconnects the terminals at each facility
to a communications link which joins with a central processing
station. The central processing station is responsive to
identification information from each of the terminals and requests
voice signals to be input at the terminals. Voice signals
transmitted from a terminal are converted into the frequency domain
at the central processing station and are compared with stored
spectral signals. Verification signals are then transmitted from
the central processing station to the terminal in response to the
comparison.
In accordance with yet another aspect of the invention, terminals
are provided for interconnection to a central voice verification
station. Structure is provided at each terminal for transmitting to
the central station a unique identification signal representative
of an individual. Display structure is provided at the terminals
which is responsive to a control signal from the central station,
the display structure then displaying in random order a plurality
of required words to be uttered by the individual. A microphone is
disposed at each terminal for transmitting the spoken required
words to the central station. Display structure at the terminal
then indicates verification signals transmitted from the central
station in response to the spoken required words.
For a more complete understanding of the present invention and for
further objects and advantages thereof, reference may now be made
to the following description taken in conjunction with the
accompanying drawings, in which:
FIG. 1 illustrates a block diagram of the basic components of a
system constructed in accordance with the present invention;
FIG. 2 is a somewhat diagrammatic illustration of the use of the
present invention with a plurality of spaced apart facilities each
having a number of spaced apart terminals connected therewith;
FIG. 3 is a perspective view of one embodiment of a terminal for
use with the present invention;
FIG. 4 is a top inverted view of the terminal shown in FIG. 3;
FIG. 5 is a perspective view of another embodiment of a terminal
for use with the invention;
FIG. 6 is a block diagram of a terminal and multiplexing
arrangement for use with the preferred embodiment of the
invention;
FIG. 7 is a block diagram of the central processing station for use
with the preferred embodiment of the invention;
FIG. 8 is a flow diagram for storing voice signatures in a digital
computer according to the invention; and
FIG. 9 is a flow diagram for comparison of spoken words with stored
words in a digital computer according to the invention.
Referring to FIG. 1, a plurality of spaced apart terminals 10, 12
and 14 are connected to a reversible multiplexing system 16. The
number of terminals used with the invention will, of course, depend
upon the type of voice verification application desired. For
instance, a large number of terminals may be spaced around a retail
store for use with verifying identity and credit, as well as for
implementing accounting and inventory control procedures.
Alternatively, the terminals may comprise a plurality of gate
stations in a security area which may not be opened until an
individual is properly verified according to the invention. The
terminals can be also utilized as voting terminals which provide
verification of voter identity as well as vote compilation.
Individuals desiring voice verification speak required words into
the terminals. The signals flowing from the various terminals are
suitably multiplexed by the multiplexer 16 and fed through a
transmission link 18 to a reversible demultiplexing station 20
where the signals are demultiplexed. The transmission link 18 may
comprise any electromagnetic wave communication system, including
radio wave, microwave, laser or the like. The demultiplexed signals
are fed to a central processing station 22, where the words spoken
at terminals 10, 12 and 14 are compared with previously stored
voice information contained in permanent data file units 24. The
central processing station 22 also inputs into the remote
surveillance system 26 to provide additional information if
required, particularly in the industrial security application.
After the comparison of the voice inputs with the stored voice
information, the central processor 22 transmits verification
signals through the reversible demultiplexer station 20, the
transmission link 18 and the multiplexer 16 to the respective
terminal.
FIG. 2 illustrates in more detail a total system implementing the
present technique over a wide geographic area. A plurality of
facilities 30, 32, 34 and 36 are spaced about a centrally located
facility shown generally by the numeral 38. Each of the facilities
has within its general area a group of individuals with whom it is
necessary to perform identification operations. For instance, each
of the facilities 30, 32, 34 and 36 may comprise a separate
industrial complex requiring identification verification as a part
of overall security operations. Alternatively, the facilities 30,
32, 34 and 36 may comprise retail store operations wherein it is
necessary to verify identification and credit during
the conduct of business.
Each facility has a plurality of terminals associated therewith.
Each of the terminals 30a-n at facility 30 is joined to a facility
transmitting and receiving station 40 by wire connections.
Similarly, each of the terminals 32a-n at facility 32 is joined to
central transmitting and receiving station 42. Likewise, the
terminals at facilities 34 and 36 are respectively connected to
transmitting and receiving stations 44 and 46. Each of the
transmitting and receiving stations 40, 42, 44 and 46 communicates
via radio waves with a transmitting and receiving station 48
located at the central processing station facility 38.
In the preferred embodiment, the wirelines connecting the terminals
to each central facility transmitting and receiving station
comprise 15 kc studio quality transmission lines currently used by
commercial radio stations between their studio and transmitter
locations. Additionally, the communications link between the
outlying facilities and the central processing station comprises
microwave communication systems. It is also noted that the central
processing station facility 38 may include a number of outlying
terminals which are connected thereto by wire or other suitable
connections. It will thus be seen that the present invention allows
a number of outlying facilities to utilize a central processing
station for voice verification, the details of operation of which
will be later described.
FIGS. 3 and 4 illustrate one embodiment of a terminal 50 for use in
a retail store. The terminal 50 comprises a case which houses
simple logic and light energizing circuits. A microphone 52 is
connected to the side of the casing which is faced by the
individual to be verified. A receptacle 54 is provided in the
central portion of the casing to receive an identification card
carried by the individual to be verified. This identification card
may comprise a conventional credit card having punched holes using
light and photocell matrix logic, or magnetic encoding techniques,
which actuate switches according to the code imprinted thereon.
When it is desired to charge an item bought in the store, the
purchaser places a card in the receptacle 54. Terminal 50 is
energized and a "go ahead" signal is displayed. At this time, the
clerk or salesman may enter alphanumeric information for sales and
inventory control into the terminal. The unit senses the uniquely
coded card and sends an identifying signal via the wirelines or
other suitable transmitting link to the transmitting and receiving
station of that particular facility. The identifying card signals,
along with signals identifying the particular terminal involved,
are transmitted over the communications link to the central
processing station. This identifying signal is entered into the
computer at the central station and utilized to retrieve a stored
voice signature of the particular individual involved. The computer
then sends an acknowledging signal to the terminal and a panel
58a-e is illuminated to cue the individual to speak the required
words in a random sequential order. At the illumination of each
word panel, the individual speaks the required word into the
microphone 52. This voice data is then transmitted over the
communications link to the central processing station.
An important aspect of the invention is that the word panels 58a-e
are energized in a random order to reduce the possibility of use of
a counterfeit voice source, such as a tape recorder or the like.
The panels 56, 58a-e, 60, 62 and 64 in the simplest aspect may
comprise glass panels bearing legends disposed in front of a light
source. Suitable filtering networks are connected at each light
source so that signals bearing different frequencies will energize
different ones of the panels. A frequency coded signal is then
provided by the central processing station in order to sequentially
energize the panels in a random order.
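The random cueing described above can be sketched in Python as follows. The panel legends, the control-tone frequencies and the function names are hypothetical; the source states only that differently coded frequencies light different panels and that the order is random.

```python
import random

# Hypothetical panel legends and control-tone frequencies (Hz); the source
# names neither the words nor the frequencies.
PANEL_WORDS = ["north", "seven", "yellow", "market", "river"]
PANEL_TONES_HZ = [1000, 1200, 1400, 1600, 1800]


def random_cue_sequence(n_panels, rng=None):
    """Return every panel index exactly once, in an unpredictable order."""
    rng = rng or random.SystemRandom()
    order = list(range(n_panels))
    rng.shuffle(order)
    return order


def control_tones(order, tones_hz=PANEL_TONES_HZ):
    """Map the cue order to the tone that would light each panel in turn."""
    return [tones_hz[i] for i in order]


order = random_cue_sequence(len(PANEL_WORDS))
for index, tone_hz in zip(order, control_tones(order)):
    print(f"send {tone_hz} Hz -> light panel 58{'abcde'[index]}: {PANEL_WORDS[index]}")
```

Because the order is drawn afresh for each attempt, a recording of an earlier session is unlikely to match the sequence in which the words are requested.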
A "thank you" panel 60 may also be energized by the central control
station. If a "non-verify" decision is made and the decision
measure is within a predetermined distance of the "verify"
threshold, a signal is transmitted to the proper terminal and the
"please repeat" panel 56 is energized. Also, if a "non-verify"
signal is generated due to background noise or the like, then the
"please repeat" panel 56 is energized to enable additional tries
for verification. The number of additional tries may be left to the
store clerk's or salesman's discretion. Referring to FIG. 4, it will
be seen that the sales clerk is provided with a "verified" panel 62
and a "non-verified" panel 64. After comparison of the voice
signals transmitted by the purchasing individual, the central
station selectively energizes either panels 60 and 62 or 56 and 64.
If the individual is verified due to the lighting up of panels 60
and 62, the sales clerk has the option to enter additional
inventory control and sales information on a keyboard 66. Panels
67a-b are provided to indicate to the clerk whether price or
inventory control information is presently being input into the
system. A number of buttons 68 are also provided to initiate the
input and inventory information transmission and to request return
alphanumeric signals. The keyboard 66 and the buttons 68 may
comprise capacitively actuated switches or any other suitable
conventional means.
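The three outcomes described above ("verify", "please repeat" and "non-verify") can be expressed as a small decision rule. The sketch below is illustrative only: the function name, the `repeat_margin` parameter and the retry counter are assumptions, since the source specifies only that a near miss within a predetermined distance of the threshold prompts a repeat.

```python
def decide(distance, threshold, repeat_margin, tries_left):
    """Classify one verification attempt.

    distance      -- decision measure for this attempt (smaller is better)
    threshold     -- largest distance still accepted as "verify"
    repeat_margin -- assumed width of the near-miss band above the threshold
    tries_left    -- assumed number of repeats still allowed by the clerk
    """
    if distance <= threshold:
        return "verify"
    if distance <= threshold + repeat_margin and tries_left > 0:
        return "please repeat"
    return "non-verify"


print(decide(0.8, threshold=1.0, repeat_margin=0.3, tries_left=2))  # verify
print(decide(1.2, threshold=1.0, repeat_margin=0.3, tries_left=2))  # please repeat
print(decide(1.2, threshold=1.0, repeat_margin=0.3, tries_left=0))  # non-verify
print(decide(2.0, threshold=1.0, repeat_margin=0.3, tries_left=2))  # non-verify
```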
It should be understood that, instead of the identification card
shown in FIGS. 3 and 4, a series of
manually operable buttons or registers could alternatively be
provided at the terminal. The purchasing individual could then set
the buttons or registers to a particular unique code for
identification purposes. Also, it will be seen that more or fewer
required words than the five words illustrated could be presented
to the purchasing individual in a variety of ways. For instance, a
cathode ray tube could be utilized to project a desired sequence of
required words to be spoken by the purchasing individual. The
particular required words utilized with the terminals will, of
course, be carefully chosen in order to provide the required number
and type of phonemes most useful for accurate voice
identification.
FIG. 5 illustrates another embodiment of a terminal for use with
the invention in a high security area. An opening is formed in
walls 70 surrounding the high security area, with the pair of
spaced apart semicircular enclosure members 72a-b surrounding a
revolving door 74. The partitions forming the revolving door 74 are
provided with a series of slotted portions 76 which mesh with
outwardly extending members 78 which are rigidly attached to the
wall 70.
The members 78 allow entrance through the revolving door 74 on only
one side. A terminal 80 constructed in accordance with the
invention is disposed on the gate for voice verification purposes.
A person wishing to enter through the gate inserts a card in a slot
82 in the terminal, the card indicating the purported identity of
the person. The data processing station then sequentially
illuminates a plurality of word panels 86. As each panel is
illuminated, the individual speaks the required word into a
microphone 88. As previously indicated, the word panels 86 are
randomly energized in order to reduce the possibility of voice
counterfeiting. If the remote data processing center verifies the
voice of the individual with respect to the stored voice signals of
the individual, a "proceed" light panel 84 is illuminated and the
revolving door 74 is unlatched to allow entrance of the individual
therethrough. Light panel 90 may be energized to display "please
repeat" in the manner previously described. Any type of suitable
latch may be utilized to lock the revolving door against
unauthorized admittance, such as a movable latch controlled by a
solenoid. A surveillance television camera 94 is operated to
indicate when an individual is passing through the gate or to
provide a visual indication of an individual who is not verified by
the system. The camera 94 also prevents entry through the gate of
more than one person. When a "non-verify" decision is received,
"repeat" panel 90 is illuminated.
FIG. 6 illustrates in detail the interconnections between the
terminals of the invention, the reversible multiplexer system, and
the transmission link which extends to the central processing
center. A start signal is initially generated by the depression of
a button on a terminal or by the insertion of a uniquely coded card
into the terminal. The start signal operates a service request
signal generator 100 which generates a suitable signal to an
input-output interface and frequency multiplexer circuit 102.
Generator 100 may comprise a frequency tone generator of the type
used in conventional touch tone telephone systems. Multiplexer 102
comprises any suitable type of reversible analog multiplexer, of
which there are a number of commercially available units made by
different manufacturers.
Also applied as an input into the terminal is a voice input which
is fed through a microphone to a signal conditioning circuit 104,
which may comprise for instance AGC circuitry with preamplification
and bandpass filtering, and other noise reduction circuitry. The
output of the signal conditioning circuitry 104 is also fed through
the frequency multiplexer 102. Alphanumeric user and machine inputs
are fed from the remote terminal into an alphanumeric data block
signal generator 106. Generator 106 preferably will comprise a
frequency shift keying system such as the conventional touch-tone
system for generating tones in response to alphanumeric inputs. The
output of the signal generator 106 is also fed through the
multiplexer 102.
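As an illustration of tone signaling of alphanumeric inputs, the following Python sketch generates the two-tone signal for one keypad character. The row and column frequencies are the standard touch-tone (DTMF) values; the sample rate, tone duration, amplitude and function names are assumptions, and nothing here is taken from the circuitry of generator 106.

```python
import math

# Standard touch-tone (DTMF) row and column frequencies in Hz.
ROWS_HZ = (697, 770, 852, 941)
COLS_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def key_frequencies(key):
    """Return the (row, column) frequency pair for one keypad character."""
    if len(key) == 1:
        for r, row in enumerate(KEYPAD):
            c = row.find(key)
            if c != -1:
                return ROWS_HZ[r], COLS_HZ[c]
    raise ValueError(f"not a keypad character: {key!r}")


def key_tone(key, duration_s=0.05, sample_rate=8000):
    """Sum the two sinusoids for `key`; rate and duration are assumed values."""
    low, high = key_frequencies(key)
    n = int(duration_s * sample_rate)
    return [
        0.5 * (math.sin(2 * math.pi * low * t / sample_rate)
               + math.sin(2 * math.pi * high * t / sample_rate))
        for t in range(n)
    ]


print(key_frequencies("5"))  # (770, 1336)
print(len(key_tone("5")))    # 400 samples at the assumed rate and duration
```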
Signals fed from the central processing center back through the
frequency multiplexer 102 are demultiplexed thereby and are applied
through a terminal command and control circuit 108. The circuit 108
comprises simple logic circuitry to turn signal generator 106 on
after reception of a ready acknowledgment from the central
processing center. Additionally, the command and control circuitry
108 decodes command signals and controls the operation of the
visual signal panels on the terminal. These signal panels are
termed, for the purposes of this disclosure, the queue and decision
display circuit 110. Display 110 thus includes the randomly
operated required word displays and also the verification panels on
the terminal. The terminal command and control 108 also controls
the operation of the admittance gate 112 at an industrial site, or
alternatively, the sales ticket data operation 114 in a retail
site.
The multiplexed signals from the frequency multiplexer 102 are fed
through an input interface and channel separator 116. Additionally,
multiplexed signals from other terminals at the facility are fed
into the interface and channel separator 116. The signals fed from
each of the terminals at the facility are separated into various
frequency channels by the separator 116. The start request and
other signals are fed through the signaling channel interface
circuitry 118 to the service request sensor circuit 120. Suitable
circuitry for use as the service request sensor is a detector or
recognizer for the output of the touch-tone system utilized in
generator 106. Such a detector may comprise a narrow band filter,
along with squaring and integrating circuits and time gate
sampling.
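By way of illustration only, the detector just described may be
sketched in software. The sketch assumes a sampled signal, and the
names tone_energy and tone_present, the sample rate, the gate length
and the threshold are all hypothetical. Correlation against a
quadrature pair at the tone frequency stands in for the narrow band
filter; it is not asserted to be the circuit of the disclosure.

```python
import math

def tone_energy(samples, sample_rate, tone_hz):
    """Normalized energy of `samples` at `tone_hz` over one time gate.

    Correlating with a cosine/sine pair at the tone frequency plays the
    role of the narrow band filter, the running sums integrate over the
    gate, and squaring makes the result independent of phase.
    """
    in_phase = 0.0
    quadrature = 0.0
    for k, s in enumerate(samples):
        angle = 2.0 * math.pi * tone_hz * k / sample_rate
        in_phase += s * math.cos(angle)
        quadrature += s * math.sin(angle)
    n = len(samples)
    return (in_phase ** 2 + quadrature ** 2) / (n * n)

def tone_present(samples, sample_rate, tone_hz, threshold=0.1):
    """Sample the integrated energy at the end of the gate."""
    return tone_energy(samples, sample_rate, tone_hz) >= threshold

# Example gate: 50 ms of a 941 Hz tone at 8,000 samples/second
# (arbitrary values chosen for illustration).
RATE = 8000
gate = [math.sin(2.0 * math.pi * 941.0 * k / RATE) for k in range(400)]
```

A unit-amplitude tone at the probed frequency yields an energy near
0.25, while a tone a few hundred hertz away yields a value close to
zero, so a fixed threshold between the two separates them.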
The voice signals and alphanumeric user and machine inputs are fed
through a data channel interface 122 to a signal conditioning
circuit 124. Some amplification may be required at interface 122.
Again, filtering and other conditioning operations similar to those
provided by circuit 104 are performed by the circuit 124 on the
voice signals in order to enhance the signal to noise ratio. The
other alphanumeric inputs may be amplified by circuit 124. The
service request sensor 120 generates a tone signal which is fed to
the system command and control circuitry 126. Receipt of a "use"
signal by circuitry 126 initiates channel assignment. Circuitry 126
also provides clocking and control functions for the multiplexing
operations, and generally coordinates movement of signal flow by
opening and inhibiting the various channels. The conditioned data
signals are fed from circuit 124 to the transmitter channel
assignment circuitry 128 wherein the data signals are fed to
available frequency channels for transmission. Clocking and control
signals for the transmission are provided by the command control
system 126. System 126 also maintains memory of which frequency
channels are unassigned and thus available.
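The channel bookkeeping attributed to system 126 may be pictured as
follows. This is a minimal sketch only; the class ChannelPool, its
methods and the use of terminal identifiers as keys are hypothetical
and are not taken from the disclosure.

```python
class ChannelPool:
    """Remembers which frequency channels are unassigned."""

    def __init__(self, channel_count):
        self._free = list(range(channel_count))
        self._assigned = {}  # terminal id -> channel number

    def assign(self, terminal_id):
        """Return the terminal's channel, or None if all are busy."""
        if terminal_id in self._assigned:
            return self._assigned[terminal_id]
        if not self._free:
            return None
        channel = self._free.pop(0)
        self._assigned[terminal_id] = channel
        return channel

    def release(self, terminal_id):
        """Return a terminal's channel to the unassigned list."""
        channel = self._assigned.pop(terminal_id, None)
        if channel is not None:
            self._free.append(channel)
```

A terminal that requests service while every channel is in use
receives no assignment in this sketch; how the actual system treats
that case is not stated.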
The command and control system 126 also actuates a "use" signal
generator 130 which provides a "use" tone signal to be fed to the
central processing station via the transmitter channel assignment
circuit 128. The multiplexed signals are again conditioned by
preamplification and bandpass filtering by a conditioning circuit
132. The conditioned signals are then fed through a microwave
transmitter 134 to the transmitting antenna 136 for transmission to
the central processing center.
Acknowledgment and verification signals transmitted from the
central processing center are received by a receiver antenna 138,
and fed through a microwave receiver 140 to a receiver channel
assignment circuit 142. Control of the receiver channel assignment
circuit 142 is provided by the transmitter channel assignment
circuit 128. Ready or acknowledgment tone signals are fed to a tone
sensing circuit 144, of the type previously described, which
generates a signal which is routed through the command and control
system 126. This signal then actuates the alphanumeric request tone
signal generator 146 and the queue tone signal generators 148. It
will be understood that a plurality of queue signal generators will
be provided, with one generator being provided for each of the
required word panels on a terminal. The queue signal generators 148
generate tone signals of different frequency in order to actuate
selected ones of the lights behind the word panels on the
terminal.
The output signals from the tone signal generators 146 and 148 are
fed through the signaling channel interface 118 and the channel
separator 116 to the remote terminal. After one of the queue signal
generators 148 has operated and the resulting voice signals have
been transmitted to and received by the central processing center,
a tone indication is provided through the
receiver channel assignment 142 and is fed to the sense queue
circuit 150. The sense queue circuit senses the tone and then
generates tone signals which sequentially actuate the next desired
signal from the queue signal generators 148. After the desired
number of required spoken words have been received by the central
processing center, a decision tone signal on verification is
transmitted from the processing center, received by the receiver
antenna 138 and fed to the sense decision circuit 152. This
decision circuit 152 feeds a signal through the system command and
control circuit 126 to actuate the decision signal generator 154.
The decision signal is then fed through the signaling channel
interface 118 to the remote terminal for actuating the visual
verification display panel at the display 110. Additionally, other
acknowledgment tone signals, inventory control information,
termination tone signals and the like are fed through the receiver
channel assignment 142 to the return data circuit 158 for suitable
display at the terminal displays 112 or 114. The circuit 158
decodes the coded tone signals applied thereto.
FIG. 7 illustrates the circuitry at the central processing station
of the invention. A microwave receiving antenna 160 receives the
microwave signal transmissions from the various terminal
facilities. A microwave receiver 162 transmits the received signals
through a receiver channel interface 164. The channel interface 164
is automatically set by the channel assignment circuit 128 (FIG. 6)
and feeds the signals through a data interface and channel
assignment circuit 166. The receiver channel interface 164 also
provides a signal to the use sensor 168, which determines which
time-shared input line, and thus which facility and terminal, is
requesting service. The sensor 168 then generates a tone signal
which notifies the system command and control circuit 170 in order
to schedule the requested job and initiate action thereon. Circuit
170 then notifies assignment circuit 166 for channel
assignment.
The voice signals fed through the data interface system 166 are fed
to a voice data analog-to-digital converter 172. In a typical
system, conversion is provided at the converter 172 at an 8,000 to
20,000 samples/second rate. Selection of portions of the voice
signals will be based on starting at an arbitrary time following or
preceding signal onset as determined by threshold logic. The
digitized signals are then fed through a buffer 174 which stores
the digital signals until the signals are accepted by a fast
Fourier transform circuit 176. Suitable buffer and fast Fourier
transform systems are commercially available from such
manufacturers as the Digital Systems Division of Texas Instruments
Incorporated of Houston, Texas. The resulting spectra are fed to an
identification and data processor 178, wherein the spectra are
stored on magnetic discs or the like and utilized for voice
verification in the manner to be subsequently described in detail.
The data processor in the preferred embodiment of the invention
will comprise a properly programmed digital computer.
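The selection of a portion of the digitized voice signal relative to
signal onset may be sketched as follows. The function select_segment
and its parameters are hypothetical, and a simple magnitude
threshold is assumed for the threshold logic, which the disclosure
does not specify further.

```python
def select_segment(samples, threshold, offset, length):
    """Return `length` samples starting `offset` samples after onset.

    Onset is taken as the first sample whose magnitude reaches
    `threshold`. A negative `offset` starts the window before onset.
    Returns None if no onset is found or if the window would run off
    either end of the buffer.
    """
    onset = next(
        (k for k, s in enumerate(samples) if abs(s) >= threshold), None
    )
    if onset is None:
        return None
    start = onset + offset
    if start < 0 or start + length > len(samples):
        return None
    return samples[start:start + length]
```

For a transform of T points, length would be set to T; a window that
begins before onset presumes that the buffer retains samples
received ahead of the threshold crossing.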
Several Fourier transform units may be required in circuit 176.
Each fast Fourier transform unit will be capable of producing
Fourier transforms in the range of about 50 channels of 1,000 word
speech samples every half second. A 20 kHz sampling rate and a 0.05
second transform time-gate then produces 1,000 samples for each
phoneme transformed. Assuming 10 phoneme spectra for each
individual verified, this yields about 30 verifications per second
as a typical maximum verification rate for the system utilizing
three fast Fourier transform systems. The time required for
processing each set of ten spectra and making the verification for
an individual would be about 0.03 seconds. This time is based upon
forty words per spectrum, or a total of 400 words per person,
required to represent the pertinent spectral features. Thus, most
conventional general purpose computers may easily keep up with
three fast Fourier transform systems all working at maximum
capacity. With computers of much larger capacity, separate fast
Fourier transform units would not be necessary, as such
transformation functions may be done by properly programming the
digital computer.
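The figures quoted in this paragraph can be checked with a few lines
of arithmetic. The constant names are hypothetical; the values are
those stated above.

```python
SAMPLE_RATE_HZ = 20_000         # sampling rate
TIME_GATE_S = 0.05              # transform time-gate
CHANNELS_PER_HALF_SECOND = 50   # capacity of one transform unit
TRANSFORM_UNITS = 3
PHONEMES_PER_PERSON = 10
WORDS_PER_SPECTRUM = 40
PROCESSING_TIME_S = 0.03        # computer time per verification

# Samples in one transformed phoneme: 20,000 x 0.05
samples_per_phoneme = round(SAMPLE_RATE_HZ * TIME_GATE_S)

# Each unit completes 50 transforms per half second, i.e. 100/second.
transforms_per_second = TRANSFORM_UNITS * CHANNELS_PER_HALF_SECOND * 2

# Ten phoneme spectra are needed for each individual verified.
verifications_per_second = transforms_per_second // PHONEMES_PER_PERSON

# Storage per person for the pertinent spectral features.
words_per_person = WORDS_PER_SPECTRUM * PHONEMES_PER_PERSON

# Fraction of each second the computer spends on verification.
processor_load = verifications_per_second * PROCESSING_TIME_S
```

The load works out to roughly nine tenths of the available time,
which agrees with the statement that a general purpose computer can
keep up with three transform units at maximum capacity.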
Alphanumeric tone signals fed through the data interface 166 are
decoded into digital signals and fed through a buffer 180 to the
identification and data processor 178. Additionally, portions of
the alphanumeric signals are fed through buffer 180 for storage in
a memory file 182. The data processor 178 may recall prestored
spectral data from the memory file 182 when desired. Additional
auxiliary data storage is provided by the storage 184, and the
auxiliary data may be processed at a later time by processor 178 or
yet another specially programmed general purpose computer. Buffers
and storage circuits 180, 182 and 184 may comprise conventional
magnetic disc or core storage, or banks of shift register
strings.
The operation of each of the buffers and other circuits of the
invention is controlled and addressed by clock and timing signals
provided by the command control system 170. The command control
system 170 also initiates the ready signal generator 186 which
supplies a tone signal through the transmitter channel interface
system 188. System 188 is automatically set by the operation of
circuit 142 (FIG. 6). The tone signal is then assigned to a channel
for transmission and is fed through a signal conditioning circuit
190 for filtering and waveshaping operations. The signal is then
fed to the microwave transmitter 192 for transmission via the
transmitter antenna 194 to the respective facility and
terminal.
Upon answer from the terminal, the system command and control 170
feeds signals through an exhaustive random order selector circuit
196 which generates a series of five random control pulses to the
queue signal generators 198. An exemplary circuit for use as
selector circuit 196 is a continuously cycling five position shift
register. When it is desired to generate a random word signal, the
instantaneous position of the register is output as the random
number. This position is stored by logic circuitry. For the next
random number to be generated, the instantaneous position of the
register is output and then compared with the stored position. If
the two positions are different, the second position is output as a
random number. If the positions are the same, the instantaneous
position of the register is again detected. This cycle is repeated
until five random numbers have been generated, after which the
system is reset. The generators 198 include a generator of a
different tone frequency for each of the required words to be
displayed at the terminal. The different frequency signals are
transmitted through the channel interface 188 for transmission via
the microwave communication link. Thus, each required word panel is
energized during each verification cycle, but the word order is
random. After the system has received the voice signals and made
the verification comparison at the data processor 178, the system
command and control 170 initiates a decision tone signal from the
tone generator 200 for transmission to the respective terminal.
Additionally, the system command and control 170 controls the
operation of a return data buffer 202, which may be a magnetic
storage, such that return data may be fed from the data processor
178 to the respective terminal.
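The behavior of the exhaustive random order selector 196 may be
sketched in software as follows. The function name and the use of a
pseudo-random sampling instant are hypothetical. The sketch further
assumes that a sampled position is compared against every position
already issued in the cycle, so that each word panel is energized
exactly once; the text above speaks only of "the stored position".

```python
import random

def exhaustive_random_order(positions=5, rng=None):
    """Return every register position exactly once, in random order.

    The continuously cycling register is modeled by reading a counter
    at an arbitrary instant. A position that was already issued is
    discarded and the register is read again.
    """
    rng = rng or random.Random()
    issued = []
    while len(issued) < positions:
        tick = rng.randrange(1_000_000)  # arbitrary sampling instant
        position = tick % positions      # instantaneous position
        if position not in issued:
            issued.append(position)
    return issued  # the caller resets by requesting a new cycle
```

In hardware the randomness comes from the unpredictable instant at
which the register is read; the random number generator above merely
stands in for that timing.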
The operation of the system will now be described in detail with
reference to FIGS. 6 and 7. The person to be identified approaches
the remote terminal and initiates a start signal by the depression
of a button or the like. The start signal is fed through the
service request signal generator 100 which applies a signal through
the interface and frequency multiplexer 102. The channel separator
116 separates the start signal from other signal information being
supplied from other remote terminals and feeds the start signal
through the interface 118 to the service request sensor 120. A
signal is then initiated through the command and control circuit
126 to the use signal generator 130, which generates a "use" signal
for application to the transmitter channel assignment circuit 128.
A transmitter channel is assigned upon receipt of the service
request by circuit 126. The "use" signal is transmitted from
antenna 136 via the microwave link to the receiver antenna 160
shown in FIG. 7.
The "use" signal is then transmitted through the channel interface
circuit 164 to the use sensor 168. The resulting output signal from
the sensor 168 is fed to the system command and control circuit
170, wherein the location and identity of the terminal is stored
and utilized for scheduling of the job. Scheduling signals are fed
from the command and control circuit 170 to the channel assignment
circuit 166. Additionally, the command and control system 170
initiates a signal from a ready signal generator 186 which is fed
through the channel system 188 and transmitted via the antenna 194
to the terminal. This ready signal is received by the antenna 138
(FIG. 6) and fed to the sense ready circuit 144, which applies an
indication to the system command and control 126.
The alphanumeric request signal generator 146 is then energized to
request alphanumeric data blocks through the channel interface 118
and channel separator 116, the multiplexer 102 and through the
command and control 108. These user inputs, which may comprise the
purported identity contained on the card inserted into the
terminal, are fed through signal generator 106. As noted, generator
106 in the preferred embodiment generates tone frequencies which
are fed through the multiplexing circuitry and through the data
channel interface 122 for transmission from the antenna 136 to the
central processing station.
The purported identity and other alphanumeric signals are received
by the receiver antenna 160 and suitably demultiplexed and fed
through the alphanumeric data buffer 180 to the system command and
control 170. The signals are then stored, and in response thereto a
signal is fed to the random order selector 196. This selector then
feeds a first randomly selected control signal to initiate
operation of one of the queue signal generators 198. The selected
queue signal generator generates a unique tone frequency which is
fed through the channel interface 188 and transmitted via the
transmitter antenna 194. The selected queue signal is received by
the receiver antenna 138 and fed through the receiver channel
assignment system 142 to the sense queue circuit 150. The sense
queue signal is fed through the command and control system 126 to
the queue signal generators 148.
Only one of the signal generators 148 is initiated by the
particular queue signal tone, and this particular generator
produces a signal which is fed through the channel interface 118
and through the terminal command and control circuit 108 to the
queue display 110. A single one of the required word panels is then
energized at the terminal. For instance, referring to FIG. 3, a
selected one of the word panels 58a-e is energized. The individual
to be verified then speaks the required word through the
microphone, the resulting voice input being fed through the signal
conditioning circuit 104 and through the multiplexing circuit 102
to the data channel interface 122. The voice signals are suitably
conditioned for noise reduction through the conditioning unit 124
and fed through the channel assignment circuit 128 for transmission
via the transmitting antenna 136.
The required word is received at the receiver antenna 160 and fed
through the data interface and channel assignment circuit 166 which
has been preset by the initial start signal. Circuit 166 routes the
word signal to the voice data analog-to-digital converter 172. The
digital signals are stored in the buffer 174 until they may be
transformed into the frequency domain by the fast Fourier transform
circuit 176. The transformed signals are then fed to the
identification and data processor 178 for processing in the manner
to be subsequently described. The command and control system 170
senses the reception and storing of the required word, and then
pulses the random order selector 196 in order that a second queue
signal may be generated from the signal generators 198. The signal
is again a unique tone frequency which is transmitted to the
terminal, in the manner previously described, so that a second one
of the queue signal generators 148 will be energized to display a
second required word at the terminal.
The individual to be verified again repeats the required word
through the microphone and the voice input is processed,
multiplexed and transmitted to the central processing station. The
voice signal is digitally converted in the unit 172 and transformed
into the frequency domain by the circuit 176. The second spoken
word is then stored in the identification and data processor 178
and the cycle is repeated again. This required word generation and
resulting voice response by the individual to be verified is
continued until all of the required spoken words are received and
stored in the identification and data processor. In the embodiment
shown in FIG. 3, this would comprise five spoken words. However, it
will be understood that more or fewer words may be required for
varying applications of the invention.
After all the required words have been spoken and stored, the
stored spectral signals for the individual are retrieved from the
memory file 182. The stored signals relative to the individual are
then compared with the spoken words within the data processor 178
in the manner to be subsequently described. The data processor 178
then makes a verification decision and transmits the verification
signal through the system command control 170 and to the decision
signal generator 200. If the person's purported identity is
verified, the generator 200 generates one signal which is fed to
the channel assignment circuit 188 and transmitted via the
microwave link to the receiver antenna 138 at the terminal. The
favorable decision is fed through the sense decision circuit 152
and through the command and control system 126 to the decision
signal generator 154.
In response to the favorable decision signal, generator 154
generates an indication which is fed through the interface 118, the
channel separator 116, the multiplexer 102 and the terminal command
and control 108 to either the gate control 112 or to the sales
ticket data 114. If "no verification" is decided by the data
processor 178, the decision signal generator 200 generates a
"non-verify" signal, which is displayed by 110 and 112, and the
gate control 112 is not actuated, or no sales ticket data is
presented to unit 114. Additionally, the "please repeat" panel will
be energized.
In a credit environment, if the data processor 178 generates a
"verify" signal, the system command and control 170 actuates the
auxiliary data storage to store the input signals for inventory and
credit control purposes.
Upon indication of a decision signal, the terminal generates a
clear indication which is fed through the service request signal
generator loop to the central processing station to release the
channel assignment for the terminal. Return signals fed from the
data processor may at this time be fed through the return data
buffer 202 and through the return data unit 158 for display at the
terminal, primarily at display 114.
It will thus be seen that the present system may have many
applications for use as an industrial security system, or for use
in credit verification for retail stores and the like.
Alternatively, the system can be used as a verification unit in
apartment houses to control the entrance thereto. The system may
also be used between banks in order to keep credit balances and the
like on a real-time basis. For security purposes, codes or
scrambling may be utilized in the microwave communications link to
reduce the possibility of interception and use by unauthorized
persons.
The identification and data processor 178 may take on various
forms, as for instance, a special purpose digital computer.
However, it is thought that for most purposes it will be desirable
to properly program a general purpose digital computer to perform
the voice verification functions of the invention. FIGS. 8 and 9
thus illustrate flow diagrams which may be used to program such a
digital computer.
Referring to FIG. 8, a flow diagram is illustrated whereby the
digital processor may be provided with a stored memory of a voice,
termed a voice signature, of an individual. In practice, the
individual speaks the required words into a microphone, after which
the words are converted into digital form by a conventional
analog-to-digital converter. Assuming M utterances of each of N
phonemes, T points of the digital words are stored at 300. These T
points comprise an arbitrary number to give sufficient information,
but in practice T has been set to 1,024. At step 302, m is set
equal to one and n is set equal to one at 304. At 306, the spectrum
of the mth utterance of the nth phoneme is computed and stored as
the Fourier transform .PHI..sub.n (i), wherein i = a.sub.n, . . .
,b.sub.n.
At 308, n is incremented once and a decision is made at 310 as to
whether or not n = N. If not, the next phoneme is transformed at
306. When the last phoneme of the first utterance is transformed, n
= N, and m is incremented once at 312. The decision is made at 314
as to whether or not m = M. If not, the next utterance is Fourier
transformed, phoneme by phoneme, in the manner described. After the
last utterance has been Fourier transformed, the reference spectra
are computed at 316, as indicated, and are stored to provide
.PHI..sub.n.sup.REF (i) for use as a reference voice signature in
the manner to be subsequently described. The operation at 316 is an
arithmetic average of repetitive utterances made by the individual,
to provide more meaningful stored data. Spectrum smoothing will
generally also be conducted on the transform data to make the data
more compatible with the input data treated in the manner as shown
in FIG. 9.
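The loop of FIG. 8 may be sketched as follows. This is a minimal
illustration under stated assumptions: the function names are
hypothetical, a direct transform is used in place of a fast
algorithm for clarity, spectrum smoothing is omitted, and the
quantity averaged is assumed to be the energy at each frequency.

```python
import cmath

def dft_band(samples, lo, hi):
    """Direct Fourier transform at bins lo..hi inclusive (slow)."""
    t_points = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * i * t / t_points)
            for t, x in enumerate(samples))
        for i in range(lo, hi + 1)
    ]

def power_spectrum(samples, lo, hi):
    """Energy at each bin lo..hi."""
    return [abs(c) ** 2 for c in dft_band(samples, lo, hi)]

def build_reference(utterances, bands):
    """Average M utterances into one reference spectrum per phoneme.

    utterances[m][n] holds the T samples of phoneme n in utterance m;
    bands[n] is the (a_n, b_n) bin range retained for phoneme n.
    """
    m_count = len(utterances)
    reference = []
    for n, (lo, hi) in enumerate(bands):
        spectra = [power_spectrum(u[n], lo, hi) for u in utterances]
        reference.append([sum(col) / m_count for col in zip(*spectra)])
    return reference
```

The outer loop over n and the inner pass over the M utterances
correspond to steps 302 through 314, and the column-wise mean
corresponds to the arithmetic average at 316.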
Referring to FIG. 9, it will be assumed that the required spoken
words have been transmitted to the central data processor and have
been digitally transformed. T points of each of the N preselected
phonemes are stored at 400 as previously noted. In one embodiment T
comprised the value of 1,024, with from five to 10 phonemes being
utilized. n is set to one at 402 and Fourier transformation is
accomplished on the nth phoneme utilizing the Cooley-Tukey
algorithm according to the formula:
The Fourier transforms of the nth phonemes are smoothed at 406
according to the formula:
wherein i = a.sub.n, a.sub.n + x, . . . , b.sub.n, and H(m) is real.
This involves convolving the real H(m) function with .PHI..sub.n '(i)
such that only frequencies of interest are examined. It will
be noted that frequencies a.sub.n and b.sub.n may differ for
different phonemes and for various applications desired.
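The smoothing step may be sketched as a discrete convolution
evaluated only at the retained frequencies. The function name is
hypothetical, and the sketch assumes an odd-length kernel centered on
the output bin, with bins outside the transform treated as zero; the
disclosure does not state the form of H(m).

```python
def smooth_spectrum(raw, kernel, lo, hi, step=1):
    """Convolve complex bins `raw` with the real `kernel`.

    Output is produced only at bins lo, lo + step, ..., up to hi,
    corresponding to i = a_n, a_n + x, ..., b_n.
    """
    half = len(kernel) // 2
    out = []
    for i in range(lo, hi + 1, step):
        acc = 0j
        for m, h in enumerate(kernel):
            j = i - (m - half)  # input bin weighted by H(m)
            if 0 <= j < len(raw):
                acc += h * raw[j]
        out.append(acc)
    return out
```

Restricting the output range to a_n through b_n, and optionally
stepping by x, is what confines the later comparison to the
frequencies of interest for each phoneme.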
At 408, a spectral estimate of the nth phoneme is formed according
to the formula:
.PHI..sub.nn (i) = .PHI..sub.n (i) .PHI..sub.n *(i)     (3)
wherein * denotes the complex conjugate and i = a.sub.n, a.sub.n
+x,...,b.sub.n.
This step involves multiplying the smoothed function by its complex
conjugate. The spectral estimate .PHI..sub.nn (i) is stored and n
is incremented at 410. A decision is made at 412 as to whether or
not n = N. If not, the subsequent phonemes are Fourier transformed
at 404 and the cycle is continued until all phonemes have been
processed. At 414, the reference spectra .PHI..sub.nn.sup.REF (i)
are called from memory for the particular individual to be
verified. The observed spectra are then compared at 416 to the
reference spectra in the following manner:
This involves comparing the Euclidean distance between
corresponding vectors in the .PHI..sub.nn (i) multidimensional
feature space, whose coordinates are the energy-densities at the
particular frequencies of each of the phoneme spectra. The square
of the Euclidean distance, or another suitable non-negative
single-valued function of the Euclidean distance, is compared at 418
against the predetermined threshold value. This value is arbitrary
and is defined by previously made experiments on large portions of
the population. This threshold may, in some instances, be unique for
each individual and may be stored within the computer. If the
squared distance is greater than the threshold value, a "not
verified" signal is transmitted at 420. If the squared distance is
less than or equal to the threshold, a "verify" signal is
transmitted at 422 in the manner previously described.
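Steps 408 through 422 may be sketched as follows. The function names
are hypothetical, and the sketch assumes that the comparison is a sum
of squared differences taken over all phonemes and all retained
frequencies, which is one reading of the Euclidean distance described
above.

```python
def spectral_estimate(smoothed):
    """Each complex bin times its complex conjugate (a real energy)."""
    return [(c * c.conjugate()).real for c in smoothed]

def squared_distance(observed, reference):
    """Squared Euclidean distance between observed and reference.

    Both arguments hold one list of energies per phoneme.
    """
    return sum(
        (o - r) ** 2
        for obs_n, ref_n in zip(observed, reference)
        for o, r in zip(obs_n, ref_n)
    )

def verify(observed, reference, threshold):
    """True for "verify", False for "not verified"."""
    return squared_distance(observed, reference) <= threshold
```

A per-individual threshold, where used, would simply be retrieved
with the reference spectra and passed in as threshold.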
In the preferred embodiment of the invention, the reference spectra
stored within the computer are updated at 424 in accordance with the
results of the decision at 418. This is accomplished according to
the following:
wherein
i = a.sub.n,...,b.sub.n
n = 1,2,...,N
P = number of verifications.
The stored reference data is then changed to compensate for changes
in voices due to age and the like. In some instances, it may be
desirable to track changes in voices faster by utilizing only the
most recent past, as for instance six months, when updating the
reference spectra at 424.
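One way the update at 424 might be realized is a running average over
the P verifications made so far. This is an assumption offered for
illustration only; the function name is hypothetical, and the sketch
supposes that the update is applied only after a "verify" decision.

```python
def update_reference(reference, observed, p):
    """Fold one verified observation into the stored reference.

    `p` is the number of verifications already averaged into
    `reference`; the result weights the new spectra by 1 / (p + 1).
    """
    return [
        [(p * r + o) / (p + 1) for r, o in zip(ref_n, obs_n)]
        for ref_n, obs_n in zip(reference, observed)
    ]
```

Holding p at a fixed ceiling instead of letting it grow would give
recent utterances more weight, which is one hypothetical way to track
voice changes faster, as contemplated above.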
Whereas the present invention has been described with respect to
specific embodiments thereof, it will be understood that various
changes and modifications will be suggested to one skilled in the
art, and it is intended to encompass such changes and modifications
as fall within the scope of the appended claims.
* * * * *