U.S. patent application number 14/142190 was filed with the patent office on 2013-12-27 and published on 2015-01-29 as publication number 20150032238 for a method and device for audio input routing.
This patent application is currently assigned to Motorola Mobility LLC. The applicant listed for this patent is Motorola Mobility LLC. The invention is credited to Michael P. Labowicz, Kazuhiro Ondo, and Hideki Yoshino.
Application Number | 14/142190
Publication Number | 20150032238
Family ID | 52390940
Publication Date | 2015-01-29

United States Patent Application 20150032238
Kind Code | A1
Ondo; Kazuhiro; et al.
January 29, 2015
Method and Device for Audio Input Routing
Abstract
A method on a mobile device for processing an audio input is
described. A trigger for the audio input is received. At least one
parameter is determined for an audio processor based on at least
one input characteristic for the audio input. The audio input is
routed to the audio processor with the at least one parameter.
Inventors: | Ondo; Kazuhiro; (Buffalo Grove, IL); Labowicz; Michael P.; (Palatine, IL); Yoshino; Hideki; (Lake Zurich, IL)

Applicant:
Name | City | State | Country | Type
Motorola Mobility LLC | Libertyville | IL | US |

Assignee: | Motorola Mobility LLC (Libertyville, IL)

Family ID: | 52390940
Appl. No.: | 14/142190
Filed: | December 27, 2013
Related U.S. Patent Documents

Application Number | Filing Date | Patent Number
61857696 | Jul 23, 2013 |
61889938 | Oct 11, 2013 |
Current U.S. Class: | 700/94
Current CPC Class: | G10L 15/20 20130101; H04M 1/6041 20130101; G10L 15/28 20130101; H04M 1/271 20130101; G06F 16/60 20190101; H04M 2250/74 20130101
Class at Publication: | 700/94
International Class: | G06F 17/30 20060101 G06F017/30
Claims
1. A method in a mobile device for processing an audio input, the
method comprising: receiving a trigger for the audio input;
determining at least one parameter for an audio processor based on
at least one input characteristic for the audio input; routing the
audio input to the audio processor with the at least one
parameter.
2. The method of claim 1 wherein routing the audio input comprises:
routing a first portion of the audio input from a buffer associated
with a source microphone; routing a second portion of the audio
input from the source microphone.
3. The method of claim 2 further comprising: initializing the audio
processor with the at least one parameter; buffering the audio
input in the buffer while the audio processor initializes; wherein
routing the second portion of the audio input comprises routing the
second portion of the audio input from the source microphone when
the audio processor has initialized.
4. The method of claim 3 wherein initializing the audio processor
comprises sending a wakeup signal to the audio processor.
5. The method of claim 1 wherein determining the at least one
parameter comprises: performing a lookup for the at least one input
characteristic with an identifier associated with the audio
input.
6. The method of claim 1 wherein determining the at least one
parameter comprises: dynamically determining the at least one input
characteristic based on the trigger.
7. The method of claim 1 wherein the at least one input
characteristic comprises at least one of a sampling rate, quality
indicator, frequency range, codec type, background noise level,
compression feature, noise separation feature, or noise canceling
feature for the audio input or a transmission latency associated
with the audio input.
8. The method of claim 1 wherein routing the audio input comprises
receiving the audio input from a remote microphone via a wireless
transceiver; wherein determining the at least one parameter
comprises determining the at least one parameter based on the
wireless transceiver.
9. The method of claim 1 wherein routing the audio input comprises
obtaining the audio input as a pre-recorded audio file.
10. The method of claim 1 wherein determining the at least one
parameter comprises selecting an audio codec for the audio
processor.
11. A method in a mobile device for processing an audio input, the
method comprising: receiving a trigger for the audio input;
selecting a microphone from a set of microphones based on the
trigger, wherein the set of microphones comprises a local
microphone of the mobile device and a remote microphone;
determining at least one parameter for an audio processor based on
the selected microphone; receiving the audio input from the
selected microphone; providing the audio input to the audio
processor with the at least one parameter.
12. The method of claim 11 wherein the remote microphone is a
Bluetooth-enabled microphone; wherein receiving the audio input
comprises receiving the audio input over a Bluetooth connection to
the Bluetooth-enabled microphone.
13. The method of claim 12 wherein determining the at least one
parameter comprises selecting a sampling rate associated with the
Bluetooth-enabled microphone if the Bluetooth-enabled microphone
is selected.
14. The method of claim 12 wherein determining the at least one
parameter comprises selecting a sampling rate associated with the
local microphone if the local microphone is selected.
15. The method of claim 11 further comprising selecting the audio
processor based on the selected microphone.
16. A mobile device for processing an audio input, the mobile
device comprising: a non-transitory memory; a processor configured
to retrieve instructions from the memory; a local microphone; a
wireless transceiver; and an audio processor; wherein: the mobile
device is configured to receive a trigger for the audio input,
wherein the trigger indicates whether the local microphone or the
wireless transceiver is an input source for the audio input; the
mobile device is configured to determine at least one parameter for
the audio processor based on at least one input characteristic of
the input source; the mobile device is configured to route the
audio input from the input source, based on the trigger, to the
audio processor with the at least one parameter.
17. The mobile device of claim 16 wherein the mobile device is
configured to receive the audio input as an analog audio input from
the local microphone.
18. The mobile device of claim 16 wherein the mobile device is
configured to receive the audio input as an analog audio input
through the wireless transceiver from a remote microphone.
19. The mobile device of claim 16 wherein the mobile device is
configured to receive the audio input as a digital audio input
through the wireless transceiver from a remote microphone.
20. The mobile device of claim 19 wherein the wireless transceiver
is one of a Bluetooth transceiver, Wi-Fi transceiver, or cellular
transceiver.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] The present application claims priority to U.S. Provisional
Patent Application 61/857,696, filed Jul. 23, 2013, and U.S.
Provisional Patent Application 61/889,938, filed Oct. 11, 2013, the
contents of which are hereby incorporated by reference herein.
TECHNICAL FIELD
[0002] The present disclosure relates to processing audio signals
and, more particularly, to methods and devices for routing audio
signals including voice or speech.
BACKGROUND
[0003] Although speech recognition has been around for decades, the
quality of speech recognition software and hardware has only
recently reached a high enough level to appeal to a large number of
consumers. One area in which speech recognition has become very
popular in recent years is the smartphone and tablet computer
industry. Using a speech recognition-enabled device, a consumer can
perform such tasks as making phone calls, writing emails, and
navigating with GPS, strictly by voice.
[0004] Traditional voice recognition systems may receive an audio
input from several input sources, such as a built-in microphone, a
Bluetooth headset, or wired headset. However, the behavior of the
traditional systems is typically the same regardless of the input
source except that a voice response may be played back from a
different audio output, such as a phone speaker or the
corresponding headset.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
[0005] While the appended claims set forth the features of the
present techniques with particularity, these techniques, together
with their objects and advantages, may be best understood from the
following detailed description taken in conjunction with the
accompanying drawings of which:
[0006] FIG. 1 is a block diagram illustrating a mobile device,
according to an embodiment;
[0007] FIG. 2 is a block diagram of example components of a mobile
device, according to an embodiment;
[0008] FIG. 3 is a block diagram of a mobile device for receiving
audio input from multiple input sources, according to an
embodiment;
[0009] FIG. 4 illustrates a process flow of a method for audio
input routing that may be performed by the mobile device of FIG. 3,
according to an embodiment.
DETAILED DESCRIPTION
[0010] Turning to the drawings, wherein like reference numerals
refer to like elements, techniques of the present disclosure are
illustrated as being implemented in a suitable environment. The
following description is based on embodiments of the claims and
should not be taken as limiting the claims with regard to
alternative embodiments that are not explicitly described
herein.
[0011] When a user provides speech for voice recognition by a
mobile device, the speech is converted by a microphone to an audio
input (e.g., an analog or digital signal). The audio input may be
further processed, such as converted from analog-to-digital or
encoded using an audio codec, before the mobile device recognizes
the speech with an audio processor. Various input characteristics
for the audio input affect its quality or recognition capability,
such as a sampling rate or frequency range. These input
characteristics may be dependent on the quality or features of the
microphone or other components in an audio input path between the
microphone and the audio processor.
[0012] The various embodiments described herein allow a mobile
device to determine the input characteristics for the audio input
and to recognize the speech based on those characteristics. The
mobile device configures or "tunes" the audio processor, for
example, to improve accuracy, increase speed, or reduce power
consumption for the voice recognition. The mobile device in one
example performs a lookup of predetermined input characteristics
for an input source, such as a microphone with a fixed sampling
rate. In another example, the mobile device dynamically determines
the input characteristics, for example, based on information
associated with the audio input.
[0013] The mobile device receives a trigger for the audio input
that indicates which input source will provide the audio input,
such as a microphone or headset. The trigger may further indicate
an audio input path for the audio input (e.g., a wired path or
wireless path). The mobile device determines at least one parameter
for the audio processor based on input characteristics for the
audio input or audio input path. The parameters may include the
input characteristics themselves, such as a sampling rate of a
microphone or latency of the audio input path. In another example,
the parameter is an indicator for which voice recognition engine
the audio processor should use, such as a high quality or low
quality voice recognition engine. The mobile device routes the
audio input to the audio processor, which then performs voice
recognition based on the parameters.
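The flow described above can be sketched as follows. This is a minimal illustrative sketch, not the patent's implementation; the source names, characteristic values, and engine labels are assumptions for illustration only.

```python
# Hypothetical sketch of the trigger -> parameter -> routing flow described
# above. Input sources, rates, and engine names are illustrative assumptions.

PRESET_CHARACTERISTICS = {
    "local_mic": {"sampling_rate_hz": 16000, "path_latency_ms": 5},
    "bt_headset": {"sampling_rate_hz": 8000, "path_latency_ms": 40},
}

def determine_parameters(source: str) -> dict:
    """Look up input characteristics and derive audio-processor parameters."""
    chars = PRESET_CHARACTERISTICS[source]
    # Choose a voice recognition engine based on input quality, as the
    # description suggests (high quality vs. low quality engine).
    engine = "high_quality" if chars["sampling_rate_hz"] >= 16000 else "low_quality"
    return {"sampling_rate_hz": chars["sampling_rate_hz"], "engine": engine}

def handle_trigger(trigger: dict) -> dict:
    """Receive a trigger, determine parameters, and route the audio input."""
    source = trigger["source"]  # the trigger indicates the input source
    params = determine_parameters(source)
    # Routing is represented here as returning the chosen configuration.
    return {"route_from": source, "params": params}
```

A lookup table stands in for the "predetermined input characteristics" case; the dynamic-determination case described above would populate the same dictionary at runtime instead.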
[0014] In one embodiment, a mobile device receives a trigger for an
audio input. The mobile device determines at least one parameter
for an audio processor based on at least one input characteristic
for the audio input. The mobile device routes the audio input to
the audio processor with the at least one parameter.
[0015] In another embodiment, a mobile device receives a trigger
for an audio input. The mobile device selects a microphone from a
set of microphones based on the trigger. The set of microphones
includes a local microphone of the mobile device and a remote
microphone. The mobile device determines at least one parameter for
an audio processor based on the selected microphone. The mobile
device receives the audio input from the selected microphone and
provides the audio input to the audio processor with the at least
one parameter.
[0016] Referring to FIG. 1, there is illustrated a perspective view
of an example mobile device 100. The mobile device 100 may be any
type of device capable of storing and executing multiple
applications. Examples of the mobile device 100 include, but are
not limited to, mobile devices, smart phones, smart watches,
wireless devices, tablet computing devices, personal digital
assistants, personal navigation devices, touch screen input device,
touch or pen-based input devices, portable video and/or audio
players, and the like. It is to be understood that the mobile
device 100 may take the form of a variety of form factors, such as,
but not limited to, bar, tablet, flip/clam, slider, rotator, and
wearable form factors.
[0017] For one embodiment, the mobile device 100 has a housing 101
comprising a front surface 103 which includes a visible display 105
and a user interface. For example, the user interface may be a
touch screen including a touch-sensitive surface that overlays the
display 105. For another embodiment, the user interface or touch
screen of the mobile device 100 may include a touch-sensitive
surface supported by the housing 101 that does not overlay any type
of display. For yet another embodiment, the user interface of the
mobile device 100 may include one or more input keys 107. Examples
of the input key or keys 107 include, but are not limited to, keys
of an alpha or numeric keypad or keyboard, physical keys,
touch-sensitive surfaces, mechanical surfaces, multipoint
directional keys and side buttons or keys 107. The mobile device
100 may also comprise a speaker 109 and microphone 111 for audio
output and input at the surface. It is to be understood that the
mobile device 100 may include a variety of different combinations of
displays and interfaces.
[0018] The mobile device 100 includes one or more sensors 113
positioned at or within an exterior boundary of the housing 101.
For example, as illustrated by FIG. 1, the sensor or sensors 113
may be positioned at the front surface 103 and/or another surface
(such as one or more side surfaces 115) of the exterior boundary of
the housing 101. The sensor or sensors 113 may include an exterior
sensor supported at the exterior boundary to detect an
environmental condition associated with an environment external to
the housing. The sensor or sensors 113 may also, or in the
alternative, include interior sensors supported within the
exterior boundary (i.e., internal to the housing) to detect a
condition of the device itself. Examples of the sensors 113 are
described below in reference to FIG. 2.
[0019] Referring to FIG. 2, there is shown a block diagram
representing example components (e.g., internal components) 200 of
the mobile device 100 of FIG. 1. In the present embodiment, the
components 200 include one or more wireless transceivers 201, one
or more processors 203, one or more memories 205, one or more
output components 207, and one or more input components 209. As
already noted above, the mobile device 100 includes a user
interface, including the touch screen display 105 that comprises
one or more of the output components 207 and one or more of the
input components 209. Also as already discussed above, the mobile
device 100 includes a plurality of the sensors 113, several of
which are described in more detail below. In the present
embodiment, the sensors 113 are in communication with (so as to
provide sensor signals to or receive control signals from) a sensor
hub 224.
[0020] Further, the components 200 include a device interface 215
to provide a direct connection to auxiliary components or
accessories for additional or enhanced functionality. In addition,
the internal components 200 include a power source or supply 217,
such as a portable battery, for providing power to the other
internal components and allow portability of the mobile device 100.
As shown, all of the components 200, and particularly the wireless
transceivers 201, processors 203, memories 205, output components
207, input components 209, sensor hub 224, device interface 215,
and power supply 217, are coupled directly or indirectly with one
another by way of one or more internal communication link(s) 218
(e.g., an internal communications bus).
[0021] Further, in the present embodiment of FIG. 2, the wireless
transceivers 201 particularly include a cellular transceiver 211
and a Wi-Fi transceiver 213. Although in the present embodiment the
wireless transceivers 201 particularly include two of the wireless
transceivers 211 and 213, the present disclosure is intended to
encompass numerous embodiments in which any arbitrary number of
(e.g., more than two) wireless transceivers employing any arbitrary
number of (e.g., two or more) communication technologies are
present. More particularly, in the present embodiment, the cellular
transceiver 211 is configured to conduct cellular communications,
such as 3G, 4G, 4G-LTE, vis-a-vis cell towers (not shown), albeit
in other embodiments, the cellular transceiver 211 can be
configured to utilize any of a variety of other cellular-based
communication technologies such as analog communications (using
AMPS), digital communications (using CDMA, TDMA, GSM, iDEN, GPRS,
EDGE, etc.), or next generation communications (using UMTS, WCDMA,
LTE, IEEE 802.16, etc.) or variants thereof.
[0022] By contrast, the Wi-Fi transceiver 213 is a wireless local
area network (WLAN) transceiver configured to conduct Wi-Fi
communications in accordance with the IEEE 802.11 (a, b, g, or n)
standard with access points. In other embodiments, the Wi-Fi
transceiver 213 can instead (or in addition) conduct other types of
communications commonly understood as being encompassed within
Wi-Fi communications such as some types of peer-to-peer (e.g.,
Wi-Fi Peer-to-Peer) communications. Further, in other embodiments,
the Wi-Fi transceiver 213 can be replaced or supplemented with one
or more other wireless transceivers configured for non-cellular
wireless communications including, for example, wireless
transceivers employing ad hoc communication technologies such as
HomeRF (radio frequency), Home Node B (3G femtocell), Bluetooth, or
other wireless communication technologies such as infrared
technology. Although in the present embodiment each of the wireless
transceivers 201 serves as or includes both a respective
transmitter and a respective receiver, it should be appreciated
that the wireless transceivers are also intended to encompass one
or more receiver(s) that are distinct from any transmitter(s), as
well as one or more transmitter(s) that are distinct from any
receiver(s). In one example embodiment encompassed herein, the
wireless transceiver 201 includes at least one receiver that is a
baseband receiver.
[0023] Exemplary operation of the wireless transceivers 201 in
conjunction with others of the components 200 of the mobile device
100 can take a variety of forms and can include, for example,
operation in which, upon reception of wireless signals (as
provided, for example, by remote device(s)), the internal
components detect communication signals and the transceivers 201
demodulate the communication signals to recover incoming
information, such as voice or data, transmitted by the wireless
signals. After receiving the incoming information from the
transceivers 201, the processors 203 format the incoming
information for the one or more output components 207. Likewise,
for transmission of wireless signals, the processors 203 format
outgoing information, which can but need not be activated by the
input components 209, and convey the outgoing information to one or
more of the wireless transceivers 201 for modulation so as to
provide modulated communication signals to be transmitted. The
wireless transceiver(s) 201 convey the modulated communication
signals by way of wireless (as well as possibly wired)
communication links to other devices (e.g., remote devices). The
wireless transceivers 201 in one example allow the mobile device
100 to exchange messages with remote devices, for example, a remote
network entity (not shown) of a cellular network or WLAN network.
Examples of the remote network entity include an application
server, web server, database server, or other network entity
accessible through the wireless transceivers 201 either directly or
indirectly via one or more intermediate devices or networks (e.g.,
via a WLAN access point, the Internet, LTE network, or other
network).
[0024] Depending upon the embodiment, the output and input
components 207, 209 of the components 200 can include a variety of
visual, audio, or mechanical outputs. For example, the output
device(s) 207 can include one or more visual output devices such as
a cathode ray tube, liquid crystal display, plasma display, video
screen, incandescent light, fluorescent light, front or rear
projection display, and light emitting diode indicator, one or more
audio output devices such as a speaker, alarm, or buzzer, or one or
more mechanical output devices such as a vibrating mechanism or
motion-based mechanism. Likewise, by example, the input device(s)
209 can include one or more visual input devices such as an optical
sensor (for example, a camera lens and photosensor), one or more
audio input devices such as a microphone, and one or more
mechanical input devices such as a flip sensor, keyboard, keypad,
selection button, navigation cluster, touch pad, capacitive sensor,
motion sensor, and switch.
[0025] As already noted, the various sensors 113 in the present
embodiment can be controlled by the sensor hub 224, which can
operate in response to or independent of the processor(s) 203.
Examples of the various sensors 113 may include, but are not
limited to, power sensors, temperature sensors, pressure sensors,
moisture sensors, ambient noise sensors, motion sensors (e.g.,
accelerometers or Gyro sensors), light sensors, proximity sensors
(e.g., a light detecting sensor, an ultrasound transceiver or an
infrared transceiver), other touch sensors, altitude sensors, one
or more location circuits/components that can include, for example,
a Global Positioning System (GPS) receiver, a triangulation
receiver, an accelerometer, a tilt sensor, a gyroscope, or any
other information collecting device that can identify a current
location or user-device interface (carry mode) of the mobile device
100.
[0026] With respect to the processor(s) 203, the processor(s) can
include any one or more processing or control devices such as, for
example, a microprocessor, digital signal processor, microcomputer,
application-specific integrated circuit, etc. The processors 203
can generate commands, for example, based on information received
from the one or more input components 209. The processor(s) 203 can
process the received information alone or in combination with other
data, such as information stored in the memories 205. Thus, the
memories 205 of the components 200 can be used by the processors
203 to store and retrieve data.
[0027] Further, the memories (or memory portions) 205 of the
components 200 can encompass one or more memory devices of any of a
variety of forms (e.g., read-only memory, random access memory,
static random access memory, dynamic random access memory, etc.),
and can be used by the processors 203 to store and retrieve data.
In some embodiments, one or more of the memories 205 can be
integrated with one or more of the processors 203 in a single
device (e.g., a processing device including memory or
processor-in-memory (PIM)), albeit such a single device will still
typically have distinct portions/sections that perform the
different processing and memory functions and that can be
considered separate devices. The data that is stored by the
memories 205 can include, but need not be limited to, operating
systems, applications, and informational data.
[0028] Each operating system includes executable code that controls
basic functions of the mobile device 100, such as interaction among
the various components included among the components 200,
communication with external devices or networks via the wireless
transceivers 201 or the device interface 215, and storage and
retrieval of applications and data, to and from the memories 205.
Each application includes executable code that utilizes an
operating system to provide more specific functionality, such as
file system service and handling of protected and unprotected data
stored in the memories 205. Such operating system or application
information can include software update information (which can be
understood to potentially encompass updates to either
application(s) or operating system(s) or both). As for
informational data, this is non-executable code or information that
can be referenced or manipulated by an operating system or
application for performing functions of the mobile device 100.
[0029] It is to be understood that FIG. 2 is provided for
illustrative purposes only and for illustrating components of a
mobile device in accordance with various embodiments, and is not
intended to be a complete schematic diagram of the various
components required for a mobile device. Therefore, a mobile
device can include various other components not shown in FIG. 2, or
can include a combination of two or more components or a division
of a particular component into two or more separate components, and
still be within the scope of the disclosed embodiments.
[0030] Turning to FIG. 3, a block diagram 300 illustrates another
embodiment of the mobile device 100 along with a remote microphone
321. As shown in FIG. 3, the mobile device 100 comprises the
wireless transceiver 201, the memory 205, and the device interface
215. The mobile device further comprises an audio processor 303, a
buffer 305, and a local microphone 311. The audio processor 303 in
one example is an instance of the processor 203, such as a digital
signal processor or an application processor. The buffer 305 in one
example is an instance of the memory 205 that is available as an
intermediate buffer for audio inputs from the local microphone 311.
The local microphone 311 is an instance of the microphone 111. In
one example, multiple instances of the local microphone 311
cooperate to provide the audio input.
[0031] The remote microphone 321 is remotely located from the
mobile device 100 and not integrated with the mobile device 100.
For example, the remote microphone 321 comprises a headset
accessory for the mobile device 100. In the implementation shown in
FIG. 3, the remote microphone 321 is a wireless-enabled headset
that provides the audio input to the mobile device 100 via the
wireless transceiver 201 (e.g., a Bluetooth transceiver via a
synchronous connection-oriented link). In another example, the
remote microphone 321 is a wired headset that provides the audio
input to the mobile device 100 via the device interface 215. In yet
another example, the remote microphone 321 is located in an
electronic device, such as a voice-activated household appliance
(e.g., a television, thermostat, entertainment console, or lighting
system), an automobile, desktop computer, or other devices as will
be apparent to those skilled in the art. In this case, the remote
microphone 321 provides the audio input to the mobile device 100
via the wireless transceiver 201, such as the cellular transceiver
211, WLAN transceiver 213, or a Bluetooth transceiver. In one
example, a plurality of remote microphones 321 cooperates to
provide the voice input. For example, multiple microphones 321 may
be spread throughout a user's home to provide a voice activation
capability.
[0032] Referring to FIG. 3, a plurality of audio input paths are
available for providing audio input to the mobile device 100. A
first path starts with the local microphone 311 and proceeds
"directly" to the audio processor 303 (e.g., without substantial
processing by other components). In this case, the audio input may
be provided to the audio processor 303 substantially in real-time.
In one example, the mobile device 100 comprises an Integrated
Interchip Sound ("I2S") bus for providing the audio input from the
local microphone 311 to the audio processor 303.
[0033] A second path starts with the local microphone 311 and
proceeds through the buffer 305 to reach the audio processor 303.
In this case, the mobile device 100 stores the audio input in the
buffer 305 before providing the audio input to the audio processor
303. Storage in an intermediate buffer (e.g., the buffer 305 or the
memory 205) allows the mobile device 100 additional time for
initializing the audio processor 303 or to reduce the effects of
high latency in receiving the audio input. A third path is based on
both the first path and the second path. In this case, a first
portion of the audio input is provided via the second path and
buffered, for example, while the audio processor 303 is
initialized. Once initialized, the mobile device 100 uses the first
path to provide a second portion of the audio input directly from
the local microphone 311. The mobile device 100 in one example uses
the third path to provide a "one-shot" voice recognition feature of
an always-on voice system. For example, the mobile device 100 may
listen for a trigger phrase (e.g., "OK Google Now") and then buffer
a command phrase that occurs after the trigger phrase while the
audio processor 303 is initialized.
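The "third path" handoff described above can be sketched as a small state machine: audio frames are buffered until the processor signals readiness, after which the buffered first portion is drained and subsequent frames flow directly. This is an illustrative sketch under assumed names, not the patent's implementation.

```python
# Illustrative sketch of the buffered-then-direct "third path": buffer the
# first portion of audio while the audio processor initializes, then switch
# to direct routing for the second portion. Class and method names are
# assumptions for illustration.
from collections import deque

class BufferedHandoff:
    def __init__(self):
        self.buffer = deque()          # stands in for buffer 305
        self.processor_ready = False
        self.processed = []            # frames delivered to the processor

    def on_audio_frame(self, frame):
        if self.processor_ready:
            self.processed.append(frame)   # second portion: direct path
        else:
            self.buffer.append(frame)      # first portion: buffered path

    def on_processor_initialized(self):
        # Drain the buffered first portion before accepting direct frames,
        # preserving the original ordering of the audio input.
        while self.buffer:
            self.processed.append(self.buffer.popleft())
        self.processor_ready = True
```

In the always-on "one-shot" scenario, the frames arriving before `on_processor_initialized` would correspond to the command phrase captured while the processor wakes up.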
[0034] A fourth path for audio input starts with the remote
microphone 321 and proceeds through the device interface 215 to the
audio processor 303. In this case, the remote microphone 321 or the
device interface 215 may include a buffer (not shown) for buffering
a portion of the audio input. A fifth path starts with the remote
microphone 321 and proceeds through the wireless transceiver 201 to
the audio processor 303. A sixth path starts with the memory 205
and proceeds to the audio processor 303. In this case, a software
program or application records or stores the audio input in the
memory 205. Upon receipt of the trigger (e.g., a software trigger
or inter-process communication trigger), the mobile device 100
obtains the pre-recorded audio input from the memory 205. For the
paths described herein, the audio input or a portion thereof may be
stored in the memory 205 for access by the audio processor 303
while still being considered "direct" processing. The mobile device
100 in one example stores the audio input in the memory 205 to
reduce the effects of high latency (e.g., from a remote microphone
321 over a Wi-Fi connection).
[0035] Turning to FIG. 4, a process flow 400 illustrates a method
for audio input routing that may be performed by the mobile device
100. The mobile device 100 receives (402) a trigger for an audio
input. The trigger indicates which input source will provide the
audio input, such as the local microphone 311, the remote
microphone 321, or an application via the memory 205. The trigger
may further indicate an audio input path for the audio input (e.g.,
a wired path or wireless path), as described above, or other input
characteristics for the audio input. The audio input path may also
include multiple sub-paths for using multiple microphones in
cooperation. Based on the audio trigger, the mobile device selects
(403) the input source (or multiple sources) for the audio input.
The trigger may be a processor interrupt, Bluetooth multi-function
button trigger, software trigger, inter-process communication
trigger, push notification, button press, audio keyword detection
indicator (e.g., "OK Google Now"), or other user input.
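The source-selection step (403) amounts to mapping a heterogeneous trigger to an input source. A hedged sketch, with trigger identifiers and source names invented for illustration:

```python
# Hypothetical mapping from trigger types listed above to input sources.
# The trigger identifiers and source names are illustrative assumptions.
TRIGGER_SOURCES = {
    "keyword_detected": "local_mic",          # e.g. "OK Google Now"
    "bt_multifunction_button": "remote_mic",  # Bluetooth headset button
    "software_trigger": "memory",             # pre-recorded audio file
    "push_notification": "local_mic",
}

def select_input_source(trigger_type: str) -> str:
    # Fall back to the local microphone for unrecognized triggers.
    return TRIGGER_SOURCES.get(trigger_type, "local_mic")
```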
[0036] The mobile device 100 determines (404) whether the audio
input path has an available buffer, such as the buffer 305, a
buffer integrated with the remote microphone 321, or a buffer
integrated with the wireless transceiver 201. The mobile device 100
determines (406) at least one input characteristic for the audio
input. Examples of the input characteristic include a sampling rate
(e.g., 8 kHz, 44.1 kHz), quality indicator (e.g., "High
Definition"), frequency range (e.g., 300 Hz to 6 kHz), codec type,
background noise level, compression feature (e.g., compression type
or ratio), noise separation feature, or noise canceling feature for
the audio input. The input characteristics may also include a
transmission latency for the corresponding audio input path. The
mobile device 100 in one example performs a lookup of predetermined
input characteristics with an identifier of the input source, such
as a microphone name or model number. In another example, the
mobile device 100 dynamically determines the input characteristics,
for example, based on information associated with the audio input.
In this case, the mobile device 100 may determine a sampling rate
or codec based on the audio input, such as from a header portion of
a file that contains the audio input.
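Both approaches described above, a static lookup keyed on the source identifier and a dynamic determination from a file header, can be sketched as follows. The source identifiers and characteristic values are hypothetical; the dynamic case here reads the sampling rate from a WAV header as one concrete example.

```python
import io
import wave

# Predetermined characteristics keyed by input-source identifier
# (illustrative identifiers and values only).
KNOWN_SOURCES = {
    "BT-HS-100": {"sampling_rate": 8000, "codec": "CVSD"},
    "local_mic": {"sampling_rate": 16000, "codec": "PCM"},
}

def input_characteristics(source_id, wav_bytes=None):
    """Return input characteristics by lookup; for an unknown source
    with a supplied WAV file, read the sampling rate from its header
    (the dynamic determination described above)."""
    if source_id in KNOWN_SOURCES:
        return dict(KNOWN_SOURCES[source_id])
    if wav_bytes is not None:
        with wave.open(io.BytesIO(wav_bytes), "rb") as w:
            return {"sampling_rate": w.getframerate(), "codec": "PCM"}
    raise ValueError("unknown source: %s" % source_id)
```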
[0037] After determination (406) of the input characteristics, the
mobile device 100 determines (408) at least one audio parameter for
the audio processor 303. The audio parameters are used by the audio
processor 303 for performing the voice recognition. The audio
parameters may include one or more of the input characteristics,
such as the sampling rate, frequency range, or availability of an
intermediate buffer (e.g., the buffer 305). In one example, the
local microphone 311 and the remote microphone 321 support
different sampling rates. In this case, the mobile device 100
selects a sampling rate based on the microphone used for the audio
input. For example, a Bluetooth headset may support an 8 kHz
sampling rate while the local microphone 311 supports a 16 kHz
sampling rate.
If the intermediate buffer is available or the remote microphone
321 is selected, the audio parameters may further include an
indication of the paths used for the audio input. In another
example, the audio parameter is an indicator for which voice
recognition engine the audio processor should use, such as a high
quality or low quality voice recognition engine. If the audio input
path includes a wireless path (e.g., Bluetooth, Wi-Fi, or
cellular), the mobile device 100 in one example selects the
parameters based on a latency of the wireless path.
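The parameter derivation of step 408 can be sketched as below. The thresholds (16 kHz, 100 ms) and field names are hypothetical choices made for illustration; the application does not prescribe specific values.

```python
def audio_parameters(characteristics):
    """Derive illustrative audio-processor parameters from the input
    characteristics, per step 408. Thresholds are hypothetical."""
    rate = characteristics.get("sampling_rate", 8000)
    latency_ms = characteristics.get("latency_ms", 0)
    return {
        "sampling_rate": rate,
        # A low sampling rate or a high-latency path suggests the
        # lower-quality (faster, lower-power) recognition engine.
        "engine": "high_quality" if rate >= 16000 and latency_ms < 100
                  else "low_quality",
        # Buffer the input when the path latency is significant.
        "use_buffer": latency_ms >= 100,
    }
```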
[0038] The mobile device 100 initializes (410) the audio processor
303 with the audio parameters. Using the audio parameters, the
mobile device 100 configures or "tunes" the audio processor 303,
for example, to improve accuracy, increase speed, or reduce power
consumption for the voice recognition. In one example, the mobile
device 100 sends a wakeup signal to the audio processor 303 for the
initialization. In another example, the mobile device 100 passes
the audio parameters in a function call to a voice recognition
application running on the audio processor 303. While
initialization (410) is shown after the determination (408) of the
audio parameters, in alternative implementations the mobile device
100 begins the initialization after receiving (402) the audio
trigger. In this case, the mobile device 100 may perform the
initialization in two or more steps, such as waking the audio
processor 303 followed by configuration with the audio parameters.
The audio processor 303 may also be started or "running" prior to
the audio trigger, for example, as a background process or service,
or in response to another voice recognition request.
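The two-step initialization described above, waking the audio processor and then configuring it with the audio parameters, can be sketched as a small state machine. The class, states, and methods are hypothetical illustrations, not an implementation from the application.

```python
class AudioProcessorStub:
    """Minimal state machine for the two-step initialization: wake the
    processor first, then configure it with the audio parameters."""

    def __init__(self):
        self.state = "sleeping"
        self.params = None

    def wake(self):
        """Step one: the wakeup signal brings the processor out of sleep."""
        if self.state == "sleeping":
            self.state = "awake"

    def configure(self, params):
        """Step two: tune the processor with the audio parameters."""
        if self.state == "sleeping":
            raise RuntimeError("wake the processor before configuring it")
        self.params = params
        self.state = "ready"
```

Splitting initialization this way lets the wakeup begin as soon as the trigger is received, while configuration waits for the parameters determined in step 408.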
[0039] When the audio processor 303 is ready to receive the audio
input, the mobile device 100 routes (412) the audio input from the
input source to the audio processor 303. In one example, the mobile
device 100 streams the audio input to the audio processor 303
substantially in real-time. In another example, the mobile device
100 receives the audio input as an analog audio input from the
local microphone 311 or the remote microphone 321 (via the
transceiver 201). In yet another example, the mobile device 100
receives the audio input as a digital audio input from the local
microphone 311 or the remote microphone 321 (via the transceiver
201). Where the audio input has two or more input sources such as
an intermediate buffer and a direct input (e.g., the local
microphone 311), the mobile device 100 instructs the audio
processor 303 to change from processing the audio input from the
intermediate buffer to the direct input.
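The handoff described above, draining the intermediate buffer before switching to the direct input, can be sketched as a generator. The function and its tags are hypothetical illustrations.

```python
def route_audio(buffered_chunks, live_chunks):
    """Yield audio chunks for the processor: first the audio captured
    in the intermediate buffer before the processor was ready, then
    the chunks arriving on the direct input. Each chunk is tagged
    with its origin for illustration."""
    for chunk in buffered_chunks:
        yield ("buffer", chunk)   # audio captured during initialization
    for chunk in live_chunks:
        yield ("direct", chunk)   # seamless switch to the direct input
```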
[0040] The mobile device 100 may also provide a prompt or
indication to the user when the mobile device 100 is ready to
receive the audio input. For example, if an intermediate buffer is
not available, the mobile device 100 may provide an audio
indication, instruction, or "beep" once the audio processor 303 has
been initialized. In another implementation, the mobile device 100
provides instructions to the user upon receipt of the trigger.
[0041] It can be seen from the foregoing that a method and system
for audio input routing have been described. In view of the many
possible embodiments to which the principles of the present
discussion may be applied, it should be recognized that the
embodiments described herein with respect to the drawing figures
are meant to be illustrative only and should not be taken as
limiting the scope of the claims. Therefore, the techniques as
described herein contemplate all such embodiments as may come
within the scope of the following claims and equivalents
thereof.
[0042] The apparatus described herein may include a processor, a
memory for storing program data to be executed by the processor, a
permanent storage such as a disk drive, a communications port for
handling communications with external devices, and user interface
devices, including a display, touch panel, keys, buttons, etc. When
software modules are involved, these software modules may be stored
as program instructions or computer readable code executable by the
processor on a non-transitory computer-readable media such as
magnetic storage media (e.g., magnetic tapes, hard disks, floppy
disks), optical recording media (e.g., CD-ROMs, Digital Versatile
Discs (DVDs), etc.), and solid state memory (e.g., random-access
memory (RAM), read-only memory (ROM), static random-access memory
(SRAM), electrically erasable programmable read-only memory
(EEPROM), flash memory, thumb drives, etc.). The computer readable
recording media may also be distributed over network coupled
computer systems so that the computer readable code is stored and
executed in a distributed fashion. This computer readable recording
media may be read by the computer, stored in the memory, and
executed by the processor.
[0043] The disclosed embodiments may be described in terms of
functional block components and various processing steps. Such
functional blocks may be realized by any number of hardware and/or
software components configured to perform the specified functions.
For example, the disclosed embodiments may employ various
integrated circuit components, e.g., memory elements, processing
elements, logic elements, look-up tables, and the like, which may
carry out a variety of functions under the control of one or more
microprocessors or other control devices. Similarly, where the
elements of the disclosed embodiments are implemented using
software programming or software elements, the disclosed
embodiments may be implemented with any programming or scripting
language such as C, C++, JAVA.RTM., assembler, or the like, with
the various algorithms being implemented with any combination of
data structures, objects, processes, routines or other programming
elements. Functional aspects may be implemented in algorithms that
execute on one or more processors. Furthermore, the disclosed
embodiments may employ any number of conventional techniques for
electronics configuration, signal processing and/or control, data
processing and the like. Finally, the steps of all methods
described herein may be performed in any suitable order unless
otherwise indicated herein or otherwise clearly contradicted by
context.
[0044] For the sake of brevity, conventional electronics, control
systems, software development and other functional aspects of the
systems (and components of the individual operating components of
the systems) may not be described in detail. Furthermore, the
connecting lines, or connectors shown in the various figures
presented are intended to represent exemplary functional
relationships and/or physical or logical couplings between the
various elements. It should be noted that many alternative or
additional functional relationships, physical connections or
logical connections may be present in a practical device. The words
"mechanism", "element", "unit", "structure", "means", "device",
"controller", and "construction" are used broadly and are not
limited to mechanical or physical embodiments, but may include
software routines in conjunction with processors, etc.
[0045] No item or component is essential to the practice of the
disclosed embodiments unless the element is specifically described
as "essential" or "critical". It will also be recognized that the
terms "comprises," "comprising," "includes," "including," "has,"
and "having," as used herein, are specifically intended to be read
as open-ended terms of art. The use of the terms "a" and "an" and
"the" and similar referents in the context of describing the
disclosed embodiments (especially in the context of the following
claims) is to be construed to cover both the singular and the
plural, unless the context clearly indicates otherwise. In
addition, it should be understood that although the terms "first,"
"second," etc. may be used herein to describe various elements,
these elements should not be limited by these terms, which are only
used to distinguish one element from another. Furthermore,
recitation of ranges of values herein is merely intended to serve
as a shorthand method of referring individually to each separate
value falling within the range, unless otherwise indicated herein,
and each separate value is incorporated into the specification as
if it were individually recited herein.
[0046] The use of any and all examples, or exemplary language
(e.g., "such as") provided herein, is intended merely to better
illuminate the disclosed embodiments and does not pose a limitation
on the scope of the disclosed embodiments unless otherwise claimed.
Numerous modifications and adaptations will be readily apparent to
those of ordinary skill in this art.
* * * * *