U.S. patent application number 10/213486 was filed with the patent office on 2003-02-20 for dialog-based voiceprint security for business transactions.
Invention is credited to Buffum, Chuck, Calvin, Nathaniel, Gould, Craig, King, Jeff, Levy, Jared, Lipin, David.
Application Number | 20030037004 10/213486 |
Family ID | 26908129 |
Filed Date | 2003-02-20 |
United States Patent Application | 20030037004 |
Kind Code | A1 |
Buffum, Chuck ; et al. | February 20, 2003 |
Dialog-based voiceprint security for business transactions
Abstract
A system for biometrically securing business transactions uses
speech recognition and voiceprint authentication to biometrically
secure a transaction from a variety of client devices in a variety
of media. A voiceprint authentication server receives a request
from a third party requestor to authenticate a previously enrolled
end user of a client device. A signature collection applet presents
the user a randomly generated signature string, prompting the user
to speak the string, and recording the user's utterance as he speaks. After
transmittal to the authentication server, the signature string is
recognized using voice recognition software, and compared with a
stored voiceprint, using voiceprint authentication software. An
authentication result is reported to both user and requestor.
Voiceprints are stored in a repository along with the associated
user data. Enrollment is by way of a separate enrollment applet,
wherein the end user provides user information and records a
voiceprint, which is subsequently stored.
Inventors: |
Buffum, Chuck; (San Jose, CA) ;
Levy, Jared; (Mountain View, CA) ;
Calvin, Nathaniel; (Sunnyvale, CA) ;
Gould, Craig; (Campbell, CA) ;
King, Jeff; (Mountain View, CA) ;
Lipin, David; (San Carlos, CA) |
Correspondence Address: |
GLENN PATENT GROUP
3475 EDISON WAY, SUITE L
MENLO PARK, CA 94025
US |
Family ID: |
26908129 |
Appl. No.: |
10/213486 |
Filed: |
August 6, 2002 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
60312363 | Aug 14, 2001 | |
Current U.S. Class: | 705/51 ; 382/115 |
Current CPC Class: | G07C 9/37 20200101; G10L 17/06 20130101; G06Q 20/4014 20130101; G06F 21/32 20130101 |
Class at Publication: | 705/51 ; 382/115 |
International Class: | G06F 017/60 |
Claims
1. A system for securing a transaction, comprising: an
authentication server, wherein said server receives a request from
a requestor over a network to authenticate an enrolled user of a
client device, based on said user's voiceprint; and a dialog-based
signature collection component adapted to present said user a
random signature string and record said signature string as the
user speaks it, said authentication server comprising: a recognizer
adapted to recognize the recorded signature string; and a
voiceprint authenticator adapted to compare said recorded signature
string with a stored voiceprint of said user; wherein said user is
authenticated or rejected based on result of said comparison.
2. The system of claim 1, wherein said requestor comprises a server
running a business application.
3. The system of claim 1, wherein said requestor comprises a
telephony server.
4. The system of claim 1, wherein said requestor communicates with
said server over either a data network or a voice network.
5. The system of claim 1, wherein said client device communicates
with said server over either a data network or a voice network.
6. The system of claim 1, wherein said transaction comprises a
business transaction.
7. The system of claim 1, wherein authentication is requested by
signaling said authentication server an identifier, network address
and, optionally, a device signature associated with said user.
8. The system of claim 1, wherein said signature collection
component comprises an interactive signature collection applet
instantiated on said client device.
9. The system of claim 8, wherein said signature collection applet
is adapted to: present the user a randomly generated signature
string sent from said server; prompt the user to speak the randomly
generated signature string; record the said spoken signature string
as an audio file, using device-specific recording objects; gather
device signature configuration by reading configuration of said
client device; and send said audio file and said device
configuration to said authentication server.
10. The system of claim 8, wherein said signature collection applet
is adapted to report an authentication result to the user.
11. The system of claim 8, wherein said signature collection applet
is one of: pre-installed on said client device; and served up from
said authentication server.
12. The system of claim 8, wherein said signature collection applet
is software and device-specific.
13. The system of claim 8, wherein said client device includes: a
software environment capable of running said signature collection
applet; sufficient memory to run said applet; audio recording
capabilities; readable device configuration data; and capability to
send said signature string and said device configuration data.
14. The system of claim 8, wherein said client device comprises any
of: a personal computer; a WAP (wireless access protocol)
telephone; a PDA (personal digital assistant); and a conventional
telephone.
15. The system of claim 1, wherein said authentication server
further comprises: a dispatcher; at least one authentication
client; an enrollment client; and a voiceprint server.
16. The system of claim 15, wherein said dispatcher comprises: a
load balancer, said load balancer adapted to receive authentication
requests and direct said requests to available authentication
clients; a logger, said logger adapted to log system load data; and
a watcher, said watcher adapted to: start and shut down
authentication clients in response to system demand; terminate
unresponsive authentication clients; and restart authentication
clients that terminate abnormally.
17. The system of claim 15, wherein said authentication client
comprises: a transaction manager; a logger; a random signature
generator; an applet launcher; an authentication requester; and a
dialog manager.
18. The system of claim 17, wherein said transaction manager is
adapted to: receive an authentication request from said dispatcher;
respond to a progress query from a requestor; and return an
authentication result to a requestor.
19. The system of claim 17, wherein said logger is adapted to: log
authentication transaction data.
20. The system of claim 17, wherein said random signature generator
is adapted to: generate random signatures for use by signature
collection applets.
21. The system of claim 17, wherein said authentication requestor
is adapted to: pass an audio file to said voiceprint server along
with identifier of associated user; and identify best acoustic
model for recognition and verification based on a client device
signature.
22. The system of claim 17, wherein said dialog manager is adapted
to: supply dialog box content to applets; evaluate recognition and
verification results as compared to configured thresholds;
determine authentication result and potential need for retries; and
interact with other clients to manage data and decision flow.
23. The system of claim 17, wherein said enrollment client
comprises: an enrollment manager; a logger; an identity manager; an
applet launcher; an enrollment requestor; and a dialog manager.
24. The system of claim 23, wherein said enrollment manager is
adapted to: receive enrollment request and device signature from a
user; manage enrollment dialog with a user; and confirm completion
of enrollment process.
25. The system of claim 23, wherein said logger is adapted to: log
enrollment transaction data.
26. The system of claim 23, wherein said identity manager is
adapted to: manage user-specific data necessary to confirm identity
of a user during enrollment process.
27. The system of claim 23, wherein said applet launcher is adapted
to: serve up a device appropriate enrollment applet to a client
device based on device signature and network address; receive
signature files from said applet; and send enrollment result to
said applet for display.
28. The system of claim 23, wherein said enrollment requestor is
adapted to: pass an audio file to said voiceprint server along with
user identity information requesting recognition and verification
results; and use device signature to identify best acoustic models
for recognition and verification.
29. The system of claim 28, wherein said dialog manager is adapted
to: supply dialog box content to applets; evaluate recognition and
verification results as compared to configured thresholds;
determine authentication result and potential need for retries;
interact with other clients to manage data and decision flow; and
store a voiceprint in a repository.
30. The system of claim 15, wherein said recognizer comprises at
least one computer readable speech recognition program, wherein
said speech recognition program recognizes said recorded signature
string.
31. The system of claim 30, wherein said voiceprint authenticator
comprises at least one computer-readable program for voiceprint
authentication.
32. The system of claim 31, wherein said voiceprint server
comprises: means for receiving files from said authentication
client for recognition and verification; a software interface that
integrates with API's to said speech recognition and voiceprint
authentication programs to pass audio files and select appropriate
grammars and acoustic models and to receive recognition and
verification results; and a voiceprint manager for managing storage
and retrieval of voiceprints from a depository.
33. The system of claim 32, wherein said server computes confidence
values for recognition and authentication, wherein minimum and
maximum thresholds are set for each confidence value.
34. The system of claim 33, wherein: if both confidence values
exceed corresponding maximum thresholds, the user is authenticated;
if both confidence levels fall below corresponding minimum
thresholds, the user is rejected as an imposter; and if one or both
confidence levels fall between thresholds, the user is prompted to
re-record the signature string.
35. The system of claim 1, further comprising means for: secure
transmission among said requestor, said authentication server and
said client.
36. A method for securing a transaction, comprising: receiving a
request over a network at an authentication server from a requestor
to authenticate an enrolled user of a client device, based on said
user's voiceprint; instantiating a dialog-based signature
collection component on said client device; presenting a random
signature string and recording said signature string as the user
speaks it, recognizing said recorded signature string by a
recognizer at said authentication server; comparing said recorded
signature string with a stored voiceprint of said user by a
voiceprint authenticator at said authentication server; and
authenticating or rejecting said user based on an authentication
result.
37. The method of claim 36, wherein said requestor comprises a
server running a business application.
38. The method of claim 36, wherein said requestor comprises a
telephony server.
39. The method of claim 36, wherein said requestor communicates
with said server over either a data network or a voice network.
40. The method of claim 36, wherein said client device communicates
with said server over either a data network or a voice network.
41. The method of claim 36, wherein said transaction comprises a
business transaction.
42. The method of claim 36, wherein authentication is requested by
signaling said authentication server an identifier, network address
and optionally, a device signature associated with said user.
43. The method of claim 36, wherein said signature collection
component comprises an interactive signature collection applet.
44. The method of claim 43, wherein the step of presenting and
recording comprises the steps of: presenting the user a randomly
generated signature string sent from said server; prompting the
user to speak the randomly generated signature string; recording
the spoken signature string as an audio file, using device-specific
recording objects; gathering a device signature by reading
configuration of said client device; and sending said audio file
and said device configuration to said authentication server.
45. The method of claim 43, further comprising the step of
reporting the authentication result to the user through the
signature collection applet.
46. The method of claim 43, wherein said signature collection
applet is one of: pre-installed on said client device; and served
up from said authentication server.
47. The method of claim 43, wherein said signature collection
applet is software and device-specific.
48. The method of claim 43, wherein said client device comprises
any of: a personal computer; a WAP (wireless access protocol)
telephone; a PDA (personal digital assistant); and a
conventional telephone.
49. The method of claim 36, further comprising the step of: on said
authentication server, providing any of: a dispatcher; at least one
authentication client; an enrollment client; and a voiceprint
server.
50. The method of claim 49, said step of providing a dispatcher
comprising the steps of: receiving an authentication request at
said dispatcher; and directing said request to an available
authentication client.
51. The method of claim 49, said step of providing a dispatcher
comprising the steps of: logging system load data; starting and
shutting down authentication clients in response to system demand;
terminating unresponsive authentication clients; and restarting
authentication clients that terminate abnormally.
52. The method of claim 49, said step of providing an
authentication client comprising the steps of: receiving an
authentication request from said dispatcher; responding to a
progress query from a requestor; and returning an authentication
result to a requestor.
53. The method of claim 49, said step of providing an
authentication client comprising the step of: logging
authentication transaction data.
54. The method of claim 49, said step of providing an
authentication client comprising the step of: generating random
signatures for use by signature collection applets.
55. The method of claim 49, said step of providing an
authentication client comprising the steps of: passing an audio
file to said voiceprint server along with identifier of associated
user; and identifying best acoustic model for recognition and
verification based on a client device signature.
56. The method of claim 49, said step of providing an
authentication client comprising the steps of: supplying dialog box
content to signature collection applets; evaluating recognition and
verification results as compared to configured thresholds;
determining authentication result and potential need for retries;
and interacting with other clients to manage data and decision
flow.
57. The method of claim 49, said step of providing an enrollment
client comprising the steps of: receiving enrollment request and
device signature from a user; managing enrollment dialog with a
user; and confirming completion of enrollment process.
58. The method of claim 49, said step of providing an enrollment
client comprising the step of: logging enrollment transaction
data.
59. The method of claim 49, said step of providing an enrollment
client comprising the step of: managing user-specific data
necessary to confirm identity of a user during enrollment
process.
60. The method of claim 49, said step of providing an enrollment
client comprising the steps of: serving up a device appropriate
enrollment applet to a client device based on device signature and
network address; receiving signature files from said applet; and
sending enrollment result to said applet for display.
61. The method of claim 49, said step of providing an enrollment
client comprising the steps of: passing an audio file to said
voiceprint server along with user identity information requesting
recognition and verification results; and using device signature to
identify best acoustic models for recognition and verification.
62. The method of claim 49, said step of providing an enrollment
client comprising the steps of: supplying dialog box content to
applets; evaluating recognition and verification results as
compared to configured thresholds; determining authentication
result and potential need for retries; interacting with other
clients to manage data and decision flow; and storing a voiceprint
in a repository.
63. The method of claim 49, said recognizer comprising at least one
computer readable speech recognition program.
64. The method of claim 63, wherein said voiceprint authenticator
comprises at least one computer-readable program for voiceprint
authentication.
65. The method of claim 64, the step of providing a voiceprint
server comprising the steps of: receiving files from said
authentication client for recognition and verification; providing a
software interface that integrates with API's to said speech
recognition and voiceprint authentication programs to pass audio
files and select appropriate grammars and acoustic models and to
receive recognition and verification results; and providing a
voiceprint manager for managing storage and retrieval of
voiceprints from a depository.
66. The method of claim 65, further comprising the step of:
computing confidence values for recognition and authentication,
wherein minimum and maximum thresholds are set for each confidence
value.
67. The method of claim 66, further comprising one of the steps of:
if both confidence values exceed corresponding maximum thresholds,
authenticating the user; if both confidence levels fall below
corresponding minimum thresholds, rejecting the user as an
imposter; and if one or both confidence levels fall between
thresholds, prompting the user to re-record the signature
string.
68. The method of claim 36, further comprising the step of:
providing a secure transmission environment among said requestor,
said authentication server and said client.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional Patent
Application No. 60/312,363, filed Aug. 14, 2001.
BACKGROUND OF THE INVENTION
[0002] 1. Field of the Invention
[0003] The invention relates generally to the field of biometric
authentication. More particularly, the invention relates to a
system and method for securing business transactions using
dialog-based voice recognition and voiceprint authentication.
[0004] 2. Description of Related Technology
[0005] The ability to positively and reliably authenticate an
individual is of utmost importance in areas such as e-commerce and
financial services provided in a networked environment.
Conventional shared secret authentication technology involves
numerous disadvantages that motivate a continuing search for more
reliable authentication technologies. For example, passwords and
PINs (personal identification numbers), while easily implemented,
are easily compromised. Often, workplaces having aggressive
password policies requiring passwords to be changed frequently also
discourage easily remembered passwords. Thus, the inconvenience of
trying to remember a password is such that end-users often write
their password down so that they won't forget it. It is extremely
common to see a user's password displayed in their office in plain
view, perhaps on a POST-IT note affixed to their desk. Furthermore,
replacing the passwords of those who have forgotten theirs is a
significant expense in many organizations. The same problems are
encountered in e-commerce and financial service environments.
Conventional authentication methods render it relatively simple for
a party to masquerade as someone else, resulting in serious
invasions of privacy, and often inflicting grave financial or
reputational harm.
[0006] Biometric authentication, the use of unique physical
characteristics to verify an individual's identity, is receiving an
increasing amount of attention. The use of fingerprints to
positively identify an individual has been known for several
hundred years. T. Tabuki, Verification server for use in
authentication on networks, U.S. Pat. No. 5,987,232 (Nov. 16, 1999)
describes the use of signatures to authenticate users requesting
network access. The user records his or her signature by means of
an electronic signature tablet. The recorded signature is then
verified on a verification server. R. Glass, M. Salganicoff, U. Cahn
von Seelen, Method and apparatus for securely transmitting and
authenticating biometric data over a network, U.S. Pat. No.
6,332,193 (Dec. 18, 2001) describes use of a retinal scan to
authenticate a user requesting network access. Y. Yu, S. Wong, M.
Hoffberg, Web-based, biometric authentication system and method,
U.S. Pat. No. 6,182,076 (Jan. 30, 2001) describes a biometric
authentication architecture implemented as middleware that employs
encryption and passwords to lessen the possibility that a user's
biometric data will be compromised while being transmitted to an
authentication center.
[0007] A disadvantage to most current biometric authentication
technologies is that they are subject to compromise. A user's
biometric data can be intercepted and misused in the same way that
a password can. In order to minimize such possibility, as described
in the references above, measures must be taken to make sure that
the biometric data is securely transmitted, and is authentic,
requiring measures such as encryption, watermarking and passwords.
It would be advantageous to provide a simple, reliable way of
minimizing the possibility that biometric data has been
compromised, or is not authentic.
[0008] Another disadvantage of most biometric authentication
schemes is that the biometric templates are stored independently of
their associated user data. The biometric data received from a user
desiring authentication is first matched with a template from the
template database. Subsequently, the individual associated with the
matching template is provided. While such methodology is well
suited for biometric identification, it is resource intensive. It
would be desirable to provide a way of granting direct access to
a particular user's biometric template without first matching the
templates.
[0009] Biometric authentication schemes are often implemented as
middleware in a network environment. It would be desirable to
provide a server-based architecture wherein the server is optimized
for biometric authentication.
[0010] A still further disadvantage to most biometric
authentication schemes is that they require dedicated sensing
devices, such as specialized cameras for retinal scans and
digitizing tablets for signatures. Often these devices are
difficult to implement and maintain, requiring special software
drivers and frequent calibration and adjustment. Thus, it would be
an advance to provide a means of biometric authentication that
doesn't require specialized input devices.
[0011] Use of biometric authentication has been limited to granting
access, often to a data network. It would be desirable to provide
security for business transactions over either voice or data
networks based on biometric authentication.
SUMMARY OF THE INVENTION
[0012] A system for biometrically securing business transactions
uses speech recognition and voiceprint authentication to
biometrically secure a transaction from a variety of client devices
in a variety of media. A voiceprint authentication server receives
a request from a third party requestor, often a server running a
business application, to authenticate a previously enrolled end
user of a client device. In response, the authentication server
instantiates a signature collection applet on the client device.
Any client having audio recording capabilities, a software
environment and memory capable of running the applet, readable
configuration data that can serve as a device signature, and the
ability to send the signature is suitable for the invention. During
an interactive dialog, the signature collection applet presents the user a
randomly generated signature string, prompting the user to speak
the string, and recording the user's utterance as he speaks. The
dialog-driven nature of the signature gathering process, coupled
with the use of a randomly generated signature string, provides an
important liveness check. While the invention is completely
compatible with industry standards for secure transmission and
digital signatures, the liveness check provides a high degree of
security for the collected voice data independently of other
security measures.
[0013] After being transmitted to the authentication server, the
signature string is first recognized using voice recognition
software, and the string subsequently compared with a stored
voiceprint, using voiceprint authentication software. Based on the
comparison, an authentication result is reported to the user and
the requestor. Voiceprints are stored in a repository along with
the associated user data. The invention is capable of operating
over one or both of a data network and a voice network.
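Storing each voiceprint together with its user record means a template can be fetched directly by user identifier rather than found by matching against every stored template. The following is a minimal sketch of that idea only; the class names, field names, and in-memory dictionary are illustrative assumptions and are not taken from the application.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class EnrolledUser:
    """Hypothetical record: user data and voiceprint kept side by side."""
    user_id: str
    display_name: str
    voiceprint: bytes  # opaque template produced at enrollment (format assumed)


class VoiceprintRepository:
    """In-memory stand-in for the repository; a real system would persist it."""

    def __init__(self) -> None:
        self._records: Dict[str, EnrolledUser] = {}

    def store(self, user: EnrolledUser) -> None:
        # Keyed by user identifier, so no template matching is needed later.
        self._records[user.user_id] = user

    def fetch(self, user_id: str) -> Optional[EnrolledUser]:
        # Direct lookup: returns None when the user was never enrolled.
        return self._records.get(user_id)


repository = VoiceprintRepository()
repository.store(EnrolledUser("user-1001", "Example User", b"\x01\x02\x03"))
record = repository.fetch("user-1001")
```

Because the lookup is keyed on the identifier supplied by the requestor, the cost of retrieving a template does not grow with the number of enrolled users.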
[0014] Enrollment is by way of a separate dialog-based enrollment
applet, wherein the end user provides user information and records
a voiceprint, which is subsequently stored.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] FIG. 1 provides a functional flow diagram of a system for
biometrically securing business transactions using speech
recognition and voiceprint authentication according to the
invention;
[0016] FIG. 2 illustrates the architecture of a voiceprint
authentication server according to the invention;
[0017] FIGS. 3A-F show a series of screen shots corresponding to
the steps of an enrollment process according to the invention;
and
[0018] FIGS. 4A-C show a series of screen shots corresponding to
the steps of an authentication process according to the
invention.
DETAILED DESCRIPTION
[0019] The invention provides a system and method that use voice
recognition and voiceprint technologies to biometrically secure
transactions from a variety of devices over a variety of media. The
system, as described herein below, is compatible with industry
standards for secure transmission, digital signatures, etc., and
can be combined with other biometric and data security techniques
to provide improved levels of security to a variety of
transactions, for example electronic business transactions.
[0020] Referring now to FIG. 1, shown is a functional flow diagram
of the invented system 100.
[0021] 1. Request for authentication:
[0022] Any business application 101, running on any server, can
request authentication for any specific pre-enrolled user at any
time. This may occur prior to transaction completion (e.g.,
checking out a shopping cart, trading stocks, transferring funds). The
business application requests authentication by signaling the
voiceprint server with an identifier for a specific user at a
specific network address.
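As a rough illustration of the request described above, the sketch below models the data a business application might signal to the voiceprint server: a user identifier, a network address, and (per claims 7 and 42) an optional device signature. The class name, field names, and validation function are assumptions made for this example; the application does not define a wire format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AuthenticationRequest:
    """Hypothetical shape of an authentication request."""
    user_id: str                            # identifier of the pre-enrolled user
    network_address: str                    # where the client device can be reached
    device_signature: Optional[str] = None  # optional, as in claims 7 and 42


def validate_request(request: AuthenticationRequest) -> None:
    """Raise ValueError if either mandatory field is empty."""
    if not request.user_id:
        raise ValueError("user_id is required")
    if not request.network_address:
        raise ValueError("network_address is required")


# 192.0.2.15 is a documentation-only address used purely as a placeholder.
request = AuthenticationRequest(user_id="user-1001", network_address="192.0.2.15")
validate_request(request)
```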
[0023] 2. Serve up signature collection applet.
[0024] A voiceprint server 102 instantiates a signature collection
applet on a client device 103 at the network address provided by
the business application. There are several varieties of the
applet, depending upon the kind of end-user device (PC, Palm,
telephone, etc.) and the software supporting the business
transaction on that device (browser, client software, etc.). Any
client having audio recording capabilities, a software environment
and memory capable of running the applet, readable configuration
data that can serve as a device signature, and the ability to send
the signatures is suitable for the invention. FIGS. 4A-C provide
screen shots of an exemplary user interface to the signature
collection applet. As FIG. 4A shows, the signature collection
applet alerts the user, giving the user the option to continue 401.
Following the initial alert, the voiceprint server 102 randomly
generates a signature string used to collect speech data by the
signature collection applet.
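The application does not say what form the random signature string takes. The sketch below assumes, purely for illustration, a string of spoken digits shown in short groups; the function name and default lengths are likewise assumptions. It draws from Python's `secrets` module so that the string is hard to predict, which is what makes a pre-recorded utterance unlikely to match.

```python
import secrets


def generate_signature_string(length: int = 8, group: int = 4) -> str:
    """Return `length` random digits, space-separated into groups of `group`.

    The digit alphabet and the grouping are assumptions for this sketch.
    """
    if length <= 0 or group <= 0:
        raise ValueError("length and group must be positive")
    digits = "".join(secrets.choice("0123456789") for _ in range(length))
    # Group the digits so the prompt is easier to read aloud.
    return " ".join(digits[i:i + group] for i in range(0, length, group))


prompt = generate_signature_string()
```

A fresh string would be generated for every attempt, including retries, so that no two prompts can be assumed to repeat.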
[0025] 3. The applet collects a voice and data signature.
[0026] The signature collection applet (device- and
software-specific) renders a dialog box on the client device
displaying the randomly generated string 402, prompting the user to
click the `record` button 403, and speak the signature string 402
into the device microphone. The dialog box prompts the user to
click a `stop` button 405 when finished recording and then click a
`submit` button 404 (FIG. 4B). In the event that the user needs to
repeat the process, he starts over by clicking a `try again` button
406.
[0027] In the case of a telephone device, the system calls the
device and asks the end-user to speak the signature string.
[0028] The applet uses device-specific recording objects to record
the speech as an audio file, for example a WAV file, although other
file formats are consistent with the spirit and scope of the
invention. The applet also reads the device configuration data
(e.g., Windows registry) to generate a device signature.
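The two artifacts the applet produces can be sketched as follows. The example wraps raw audio samples in a WAV container using Python's standard `wave` module and derives a device signature from configuration data. Hashing the configuration with SHA-256, the 8 kHz 16-bit mono format, and the configuration keys are all assumptions for illustration; the application only states that readable configuration data serves as the device signature.

```python
import hashlib
import io
import json
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 8000) -> bytes:
    """Wrap raw 16-bit mono PCM samples in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)          # mono
        wav.setsampwidth(2)          # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()


def device_signature(config: dict) -> str:
    """Derive a stable signature from device configuration data.

    Serializing with sorted keys makes the result independent of key order.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


silence = b"\x00\x00" * 80  # 80 silent samples standing in for recorded speech
audio_file = pcm_to_wav_bytes(silence)
signature = device_signature({"os": "example-os", "audio_device": "example-mic"})
```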
[0029] The user's signature audio file and device's configuration
data are sent by the applet to the voiceprint server 102. While the
data transfer and authentication processing are occurring, the
applet displays a "processing" message (not shown) on the end-user
device 103.
[0030] 4. The authentication server receives the audio file and
device configuration from the applet.
[0031] The server passes the audio file, device configuration, and
signature string to the speech recognition and voiceprint
authentication software, and requests recognition with respect to
the recorded signature string and verification with respect to the
specific user's voiceprint (already on record, as described
below).
[0032] The speech recognition software returns a recognition result
that is compared against a configurable recognition confidence
level; and the voiceprint authentication software returns a
verification result that is compared against a configurable
verification confidence level. If both confidence levels exceed
established thresholds, the server determines that the user is
authenticated. If the confidence levels of both results are below
their minimum respective thresholds (also configurable), the user
is rejected as an imposter. If one or both of the confidence levels
are between the threshold values, the voiceprint server generates a
new random signature string and retries the process. The number of
retries is configurable.
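The three-way decision and the retry loop described above can be sketched as follows. The names, the 0-to-1 confidence scale, and the threshold values are assumptions. Two cases are not spelled out in the text and are resolved here by assumption: a mixed result (one score above its maximum, the other below its minimum) is treated as a retry, and a user who exhausts the configured retries is rejected.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Tuple


class Outcome(Enum):
    AUTHENTICATED = "authenticated"
    REJECTED = "rejected"
    RETRY = "retry"


@dataclass(frozen=True)
class Thresholds:
    minimum: float
    maximum: float


def decide(recognition: float, verification: float,
           rec_t: Thresholds, ver_t: Thresholds) -> Outcome:
    """Apply the minimum/maximum thresholds to both confidence values."""
    if recognition > rec_t.maximum and verification > ver_t.maximum:
        return Outcome.AUTHENTICATED
    if recognition < rec_t.minimum and verification < ver_t.minimum:
        return Outcome.REJECTED
    return Outcome.RETRY  # anything else: prompt with a new signature string


def authenticate(score_attempt: Callable[[], Tuple[float, float]],
                 rec_t: Thresholds, ver_t: Thresholds,
                 max_retries: int = 2) -> Outcome:
    """Run up to 1 + max_retries attempts.

    `score_attempt` is a hypothetical callback that collects one utterance
    for a fresh random string and returns (recognition, verification).
    """
    for _ in range(max_retries + 1):
        recognition, verification = score_attempt()
        outcome = decide(recognition, verification, rec_t, ver_t)
        if outcome is not Outcome.RETRY:
            return outcome
    return Outcome.REJECTED  # assumption: exhausted retries count as rejection


rec_thresholds = Thresholds(minimum=0.3, maximum=0.8)
ver_thresholds = Thresholds(minimum=0.4, maximum=0.85)
scores = iter([(0.5, 0.5), (0.9, 0.9)])  # first attempt inconclusive, second clear
result = authenticate(lambda: next(scores), rec_thresholds, ver_thresholds)
```

Keeping `decide` separate from the retry loop lets the thresholds and the retry count be configured independently, as the paragraph above describes.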
[0033] When the server has made its determination, the
authentication result is sent to the applet, as shown in FIG. 4C.
The applet displays the result ("You have been authenticated" or
"We were unable to authenticate your voice") and then terminates.
In addition, the voiceprint server sends the authentication result
to the business application.
[0034] Referring now to FIG. 2, shown is a block diagram that
illustrates the various server-side 200 components:
[0035] DISPATCHER (201). Sub-components within the dispatcher
include:
[0036] A load balancer--receives authentication requests and
directs them to available authentication clients 202;
[0037] A system logger--logs system load data for performance and
data analysis; and
[0038] A watcher--starts and shuts down authentication clients in
response to system demand, terminates unresponsive authentication
clients, and restarts authentication clients that terminate
abnormally.
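The dispatcher's three sub-components can be illustrated with a small in-process sketch. Here "authentication clients" are plain objects rather than separate processes, and the scheduling policy (first idle client, start a new one when none is idle) is an assumption; the application does not specify how the load balancer chooses a client. All class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List


@dataclass
class AuthClient:
    name: str
    busy: bool = False
    responsive: bool = True


@dataclass
class Dispatcher:
    clients: List[AuthClient] = field(default_factory=list)
    load_log: List[int] = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def _start_client(self) -> AuthClient:
        client = AuthClient(name=f"auth-{next(self._ids)}")
        self.clients.append(client)
        return client

    def dispatch(self) -> AuthClient:
        """Load balancer: pick an idle client, starting one if none is idle."""
        client = next((c for c in self.clients if c.responsive and not c.busy), None)
        if client is None:
            client = self._start_client()  # watcher-style scale-up on demand
        client.busy = True
        # System logger: record how many clients are busy after this request.
        self.load_log.append(sum(c.busy for c in self.clients))
        return client

    def watch(self) -> None:
        """Watcher: terminate unresponsive clients and start replacements."""
        for client in [c for c in self.clients if not c.responsive]:
            self.clients.remove(client)
            self._start_client()


dispatcher = Dispatcher()
first = dispatcher.dispatch()
second = dispatcher.dispatch()
first.responsive = False  # simulate a client that has stopped responding
dispatcher.watch()
```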
[0039] AUTHENTICATION CLIENT (202). Sub-components within the
authentication client include:
[0040] A transaction manager--receives authentication request from
the business application 101, responds to progress queries from the
business application, and returns the authentication result to the
business application;
[0041] A logger--logs the authentication transaction data for
performance and data analysis;
[0042] A random signature generator--generates the random
signatures for use by the signature collection applets;
[0043] An applet launcher--using device configuration data and the
network address, the applet launcher serves up the appropriate
signature collection applet to the end user device and receives the
signature files from the applet. It also sends the authentication
result to the applet for display;
[0044] An authentication requester--passes the audio file and the
user id to the voiceprint server, requesting recognition and
verification results. Uses the device signature, as appropriate, to
identify the best acoustic models for recognition and verification;
and
[0045] A dialog manager--stuffs dialog box content into the
applets. Evaluates recognition and verification results as compared
to configured thresholds. Determines the authentication result and
the potential need for retries. Interacts with other client
components to manage the data and decision flow.
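The random signature generator listed above might look like the following sketch. It assumes the signature is a string of decimal digits, consistent with the "digit string" the authentication applet asks the user to speak; the function names and the default length of six digits are hypothetical.

```python
import secrets

DIGITS = "0123456789"


def generate_signature(length=6):
    """Return a random digit string for the user to speak.

    The length is an assumed default; the text does not specify one.
    secrets.choice draws from the operating system's secure random
    source, so an attacker cannot predict the next string.
    """
    return "".join(secrets.choice(DIGITS) for _ in range(length))


def spaced(signature):
    """Separate the digits with spaces so each is read individually."""
    return " ".join(signature)
```

A fresh string should be generated for every attempt, including retries, so that a recording of an earlier attempt cannot be replayed.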
[0046] ENROLLMENT CLIENT (203). Sub-components within the
enrollment client include:
[0047] An enrollment manager--receives the enrollment request and
device-specific data from the user, manages the dialog with the
user, and confirms the completion of the enrollment process;
[0048] A logger--logs the enrollment transaction data for
performance and data analysis;
[0049] An identity manager--manages the user-specific data
necessary to confirm the identity of the user during the enrollment
process;
[0050] An applet launcher--using device configuration data and the
network address, the applet launcher serves up the appropriate
applet (enrollment, FIG. 3) to the end user device and receives the
signature files from the applet. It also sends the enrollment
result to the applet for display;
[0051] An enrollment requestor--passes the audio file and the user
identity information to the voiceprint server, requesting
recognition and verification results. Uses the device signature, as
appropriate, to identify the best acoustic models for recognition
and verification; and
[0052] A dialog manager--stuffs dialog box content into the
applets, evaluates recognition and verification results as compared
to configured thresholds, determines the enrollment result and the
potential need for retries, and interacts with other client
components to manage the data and decision flow. Stores the
voiceprint in the repository.
[0053] AUTHENTICATION APPLET (FIG. 4)
[0054] Construction--there are many device-specific authentication
applets, for PCs, Palms, Microsoft CE devices, WAP phones and other
portable devices capable of recording speech. In addition,
telephony servers such as voice mail and IVR systems are supported
with authentication applets to allow voiceprint security for
messaging, IVR or even agent-handled voice transactions;
[0055] Instantiation--the applet is served up by the authentication
client to run on the target device or has been pre-installed on the
target device. It is provided a random signature and dialog content
by the authentication client;
[0056] Dialog--presents text instructing the user to speak the
specified digit string and presents the results; additionally,
handles any retries required;
[0057] Records the speech--using device-specific resources (e.g.,
Windows recorder), records the utterance and formats it into an
audio file;
[0058] Device configuration--reads the device configuration
information and prepares it for transmission to the authentication
client for use as a device signature; and
[0059] Data transfer--transfers the audio file and device
configuration to the authentication client.
[0060] VOICEPRINT SERVER (204)
[0061] Receives files--receives data from the authentication client
for recognition and verification;
[0062] Software interface--integrates with speech recognition and
verification API's to pass audio files and select appropriate
grammars and acoustic models. Also receives recognition and
verification results; and
[0063] Voiceprint manager--manages storage and retrieval of
voiceprints from the data repository.
[0064] The invention further includes a number of API's
(application program interfaces), among them:
[0065] AUTHENTICATION REQUEST API
[0066] The business application requests authentication, sending
the following information to the authentication client:
[0067] User id;
[0068] Network address;
[0069] Device configuration (if known).
[0070] The authentication client responds with the authentication
result as follows:
[0071] User id;
[0072] Authentication pass/fail, or one of various errors (e.g.,
invalid user ID).
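The request and response fields listed above can be modeled as two small record types. This is a sketch under stated assumptions: the class names, the `AuthStatus` values, and the `handle_request` helper with its `enrolled_users` and `verify` parameters are hypothetical, and the text names only "invalid user ID" as an example error.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AuthStatus(Enum):
    PASS = "pass"
    FAIL = "fail"
    INVALID_USER_ID = "invalid_user_id"  # the one error the text names


@dataclass(frozen=True)
class AuthenticationRequest:
    user_id: str
    network_address: str
    device_configuration: Optional[str] = None  # sent only "if known"


@dataclass(frozen=True)
class AuthenticationResponse:
    user_id: str
    status: AuthStatus


def handle_request(request, enrolled_users, verify):
    """Answer one request from the business application.

    enrolled_users is an assumed collection of known user ids, and
    verify is a hypothetical callable that runs the voice dialog and
    returns True when the user is authenticated.
    """
    if request.user_id not in enrolled_users:
        return AuthenticationResponse(request.user_id, AuthStatus.INVALID_USER_ID)
    status = AuthStatus.PASS if verify(request) else AuthStatus.FAIL
    return AuthenticationResponse(request.user_id, status)
```

Echoing the user id in the response, as the text specifies, lets the business application match a result to its pending request.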
[0073] APPLET API
[0074] The signature collection and enrollment applets instantiate
on the end-user device 103 and use device specific resources as
follows:
[0075] Audio recorder (Windows media recorder, etc.); and
[0076] Device configuration file.
[0077] VOICEPRINT SERVER API
[0078] The voiceprint server interacts with speech recognition and
voiceprint verification software 206 using their API's 205. It
sends the following data:
[0079] Recognition request with grammar name and audio file;
[0080] Verification request with user ID and audio file; and
[0081] Results with confidence scores returned to server.
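The exchange between the voiceprint server and the two speech engines can be sketched as below. The record types mirror the fields listed above; the names `RecognitionRequest`, `VerificationRequest`, `Scores` and `score_utterance` are assumptions, and `recognize` and `verify` are hypothetical callables standing in for the vendor API's 205, each returning a confidence score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognitionRequest:
    grammar_name: str
    audio: bytes


@dataclass(frozen=True)
class VerificationRequest:
    user_id: str
    audio: bytes


@dataclass(frozen=True)
class Scores:
    recognition: float
    verification: float


def score_utterance(user_id, audio, grammar_name, recognize, verify):
    """Send the same audio to both engines and collect their scores.

    recognize and verify are stand-ins for the speech recognition and
    voiceprint verification software; their real interfaces are not
    described in the text.
    """
    recognition = recognize(RecognitionRequest(grammar_name, audio))
    verification = verify(VerificationRequest(user_id, audio))
    return Scores(recognition, verification)
```

Both requests carry the same audio file, so a single utterance yields the recognition and the verification confidence that the dialog manager compares against its thresholds.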
[0082] Enrollment Applet
[0083] As described above, users must enroll their voiceprint on
the system before they can be authenticated. FIGS. 3A-F depict the
various stages of the enrollment process from the user perspective,
showing dialog boxes as they are presented to the user. As FIG. 3A
shows, the user first provides his Account ID 301 and password 302.
As in FIG. 3B, the user is prompted to supply an enrollment number
303. As in FIGS. 3C-E, the user then records his voiceprint, using
controls corresponding to the user interface of the signature
collection applet: `record` 304, `stop` 305, `try again` 306 and
`submit` 307.
Recording of the voiceprint includes the following steps, for each
of which the user receives a prompt:
[0084] Record account ID (FIG. 3C);
[0085] Record the numbers 0-9 a first time (FIG. 3D); and
[0086] Record the numbers 0-9 a second time (FIG. 3E).
[0087] As the user finishes each utterance, he presses the `stop`
button to terminate recording, and presses the `submit` button to
send the recorded utterance. After the voiceprint is successfully
enrolled and stored, the user receives a confirmation (FIG.
3F).
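The three recording steps of enrollment can be expressed as a fixed sequence that the applet walks through. This is only an illustrative sketch: the step identifiers, the `collect_enrollment` function and the `record_utterance` callable are hypothetical, and the figure labels are carried along purely as documentation.

```python
# Each step pairs an assumed identifier with the figure that shows its prompt.
ENROLLMENT_STEPS = [
    ("account_id", "FIG. 3C"),
    ("digits_first_pass", "FIG. 3D"),
    ("digits_second_pass", "FIG. 3E"),
]


def collect_enrollment(record_utterance):
    """Record one utterance per step, in order.

    record_utterance is a hypothetical callable that prompts the user
    for the given step and returns the submitted audio as bytes.
    """
    recordings = {}
    for step, _figure in ENROLLMENT_STEPS:
        audio = record_utterance(step)
        if not audio:
            raise ValueError(f"no audio captured for step {step!r}")
        recordings[step] = audio
    return recordings
```

Only after all three recordings have been collected would the voiceprint be enrolled and the confirmation of FIG. 3F be shown.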
[0088] One skilled in the art will appreciate that the use of a
random signature string for authentication, coupled with the
requirement that the upper confidence thresholds for both
recognition of the signature string and verification of the user be
exceeded, provides an important liveness check, greatly minimizing
the possibility that a user's voiceprint will be compromised. In
particular, the requirement that the system recognizes the
signature string with a high degree of confidence provides
assurance that the recorded string is genuine.
[0089] Although the invention has been described herein with
reference to certain preferred embodiments, one skilled in the art
will readily appreciate that other applications may be substituted
for those set forth herein without departing from the spirit and
scope of the present invention. Accordingly, the invention should
only be limited by the Claims included below.
* * * * *