U.S. patent application number 14/458894 was filed with the patent office on 2014-08-13 and published on 2014-12-04 for mobile device localization using audio signals.
The applicant listed for this patent is Microsoft Corporation. Invention is credited to Philip Gosset, Dinan Gunawardena, Timothy Regan, Stuart Taylor, Eno Thereska.
Application Number | 20140355785 14/458894 |
Family ID | 47006316 |
Filed Date | 2014-08-13 |
United States Patent Application | 20140355785 |
Kind Code | A1 |
Taylor; Stuart; et al. | December 4, 2014 |
MOBILE DEVICE LOCALIZATION USING AUDIO SIGNALS
Abstract
Mobile device localization using audio signals is described. In
an example, a mobile device is localized by receiving a first audio
signal captured by a microphone located at the mobile device and a
second audio signal captured from a further microphone. A
correlation value between the first audio signal and second audio
signal is computed, and this is used to determine whether the
mobile device is in proximity to the further microphone. In one
example, the mobile device can receive the audio signals from the
further microphone and calculate the correlation value. In another
example, a server can receive the audio signals from the mobile
device and the further microphone and calculate the correlation
value. In examples, the further microphone can be a fixed
microphone at a predetermined location, or the further microphone
can be a microphone located in another mobile device.
Inventors: |
Taylor; Stuart (Cambridge, GB); Regan; Timothy (Cambridge, GB);
Gosset; Philip (Stroud, GB); Gunawardena; Dinan (Cambridge, GB);
Thereska; Eno (Cambridge, GB) |
Applicant: |
Name | City | State | Country | Type |
Microsoft Corporation | Redmond | WA | US | |
Family ID: | 47006316 |
Appl. No.: | 14/458894 |
Filed: | August 13, 2014 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
13089033 | Apr 18, 2011 | 8830792 |
14458894 | | |
Current U.S. Class: | 381/92 |
Current CPC Class: | H04W 64/00 20130101; G01S 3/8006 20130101; G01S 5/18 20130101 |
Class at Publication: | 381/92 |
International Class: | G01S 3/80 20060101 G01S003/80; H04W 64/00 20060101 H04W064/00 |
Claims
1. A computer-implemented method of localizing a mobile device,
comprising: receiving, at a processor, a first audio signal
captured by a microphone located at the mobile device; receiving,
at the processor, a second audio signal captured from a second
microphone; and determining whether the mobile device is in
proximity to the second microphone based on the first audio signal
and second audio signal.
2. A method according to claim 1, further comprising computing a
correlation value for the first audio signal and second audio
signal, computing the correlation value comprising calculating a
cross-correlation between the first audio signal and second audio
signal.
3. A method according to claim 1, wherein the first audio signal
comprises ambient noise from the vicinity of the mobile device, and
the second audio signal comprises ambient noise from the vicinity
of the further microphone.
4. A method according to claim 1, wherein the further microphone is
non-mobile and is associated with a predefined location.
5. A method according to claim 4, wherein the predefined location
is a room of an indoor environment.
6. A method according to claim 4, wherein the processor is located
at a server, and wherein the step of receiving the first audio
signal comprises receiving the first audio signal via a wireless
interface, and the step of receiving the second audio signal
comprises receiving the second audio signal via an
analog-to-digital converter.
7. A method according to claim 4, wherein the processor is located
at the mobile device, and wherein the step of receiving the second
audio signal comprises receiving the second audio signal via a
wireless communication interface at the mobile device.
8. A method according to claim 1, wherein the processor is located
at the mobile device, and the further microphone is located in a
further mobile device, such that the step of determining determines
whether the mobile device is in proximity to the further mobile
device.
9. A method according to claim 1, further comprising: receiving, at
the processor, a third audio signal captured from an additional
microphone; and computing a correlation value for the first audio
signal and third audio signal.
10. A method according to claim 9, further comprising: computing a
further correlation value for the first audio signal and second
audio signal; and comparing the correlation value and the further
correlation value, wherein the step of determining comprises
determining that the mobile device is in proximity to the further
microphone if the second audio signal has a higher degree of
correlation with the first audio signal than the third audio
signal.
11. A method according to claim 1, further comprising determining
an audio fingerprint for the first and second audio signals prior
to computing a correlation value for the first audio signal and
second audio signal.
12. A method according to claim 1, further comprising at least one
of normalizing and filtering at least one of the first and second
audio signals prior to computing a correlation value for the first
audio signal and second audio signal.
13. A method according to claim 1, further comprising transforming
the first and second audio signals into frequency domain signals
prior to computing a correlation value for the first audio signal
and second audio signal.
14. A method according to claim 1, further comprising applying a
time-shift to at least one of the first and second audio signals
prior to computing a correlation value for the first audio signal
and second audio signal.
15. A mobile device, comprising: a microphone arranged to capture a
first audio signal from the vicinity of the mobile device; a
communication interface arranged to receive a second audio signal
captured from a further microphone; a processor connected to the
microphone and the communication interface, and arranged to
determine whether the mobile device is in proximity to the further
microphone based on the first audio signal and the second audio
signal.
16. A mobile device according to claim 15, wherein the further
microphone is located at a further mobile device, and the processor
is arranged to determine whether the mobile device is in proximity
to the further mobile device.
17. A method according to claim 1, wherein the further microphone
is non-mobile and is associated with a predefined location, and the
processor is arranged to determine whether the mobile device is in
proximity to the predefined location.
18. A mobile device according to claim 15, wherein the mobile
device is a mobile telephone or laptop computer.
19. An indoor positioning system, comprising: a plurality of fixed
microphones, each located in a different room of an indoor
environment, and each arranged to capture audio signals from its
respective room; a wireless interface arranged to receive an audio
signal from a mobile device having a microphone arranged to capture
the audio signal from the vicinity of the mobile device; and a
computing device connected to the plurality of fixed microphones
and the wireless interface, and arranged to receive each of the
fixed microphone audio signals and the mobile device audio signal,
determine a selected fixed microphone providing the audio signal
based on the mobile device audio signal and one or more of the
fixed microphone audio signals, and outputting the room associated
with the selected fixed microphone as the mobile device
location.
20. An indoor positioning system according to claim 19, wherein at
least one of the plurality of fixed microphones is located in a
landline telephone or telephone conferencing device.
Description
RELATED APPLICATIONS
[0001] This application is a continuation of, and claims priority
to, U.S. patent application Ser. No. 13/089,033, filed Apr. 18,
2011, and entitled "MOBILE DEVICE LOCALIZATION USING AUDIO
SIGNALS." The disclosure of the above-identified application is
hereby incorporated by reference in its entirety as if set forth
herein in full.
BACKGROUND
[0002] Positioning systems and techniques enable the location of
devices to be determined and utilized to provide useful services.
For example, the global positioning system (GPS) uses signals from
a constellation of satellites to localize a receiver to within a
few tens of meters. However, whilst systems such as GPS work
effectively in open, outdoor environments, they typically do not
operate well in indoor environments due to a lack of line-of-sight
to the satellites.
[0003] Whilst alternative positioning techniques, such as those
based on cell site identities, can be used indoors, these techniques
tend to have a lower accuracy than GPS and are more unpredictable
due to uneven radio propagation. As location-based services become
more pervasive and useful, it is therefore beneficial to be able to
determine the position of a mobile device in indoor environments,
without the addition of complex or expensive infrastructure or
hardware at the mobile device.
[0004] The embodiments described below are not limited to
implementations which solve any or all of the disadvantages of
known positioning techniques.
SUMMARY
[0005] The following presents a simplified summary of the
disclosure in order to provide a basic understanding to the reader.
This summary is not an extensive overview of the disclosure and it
does not identify key/critical elements of the invention or
delineate the scope of the invention. Its sole purpose is to
present a selection of concepts disclosed herein in a simplified
form as a prelude to the more detailed description that is
presented later.
[0006] Mobile device localization using audio signals is described.
In an example, a mobile device is localized by receiving a first
audio signal captured by a microphone located at the mobile device
and a second audio signal captured from a further microphone. A
correlation value between the first audio signal and second audio
signal is computed, and this is used to determine whether the
mobile device is in proximity to the further microphone. In one
example, the mobile device can receive the audio signals from the
further microphone and calculate the correlation value. In another
example, a server can receive the audio signals from the mobile
device and the further microphone and calculate the correlation
value. In examples, the further microphone can be a fixed
microphone at a predetermined location, or the further microphone
can be a microphone located in another mobile device.
[0007] Many of the attendant features will be more readily
appreciated as the same becomes better understood by reference to
the following detailed description considered in connection with
the accompanying drawings.
DESCRIPTION OF THE DRAWINGS
[0008] The present description will be better understood from the
following detailed description read in light of the accompanying
drawings, wherein:
[0009] FIG. 1 illustrates a schematic diagram of an indoor
positioning system using a central server for location calculation
relative to fixed microphones;
[0010] FIG. 2 illustrates a schematic diagram of an indoor
positioning system using a mobile device for location calculation
relative to fixed microphones;
[0011] FIG. 3 illustrates a schematic diagram of an indoor
positioning system using a mobile device for location calculation
relative to other mobile devices;
[0012] FIG. 4 illustrates a flow chart of a process for determining
a location of a mobile device using audio signals;
[0013] FIG. 5 illustrates a functional block diagram of an indoor
localizer; and
[0014] FIG. 6 illustrates an exemplary computing-based device in
which embodiments of the mobile device localization technique may
be implemented.
[0015] Like reference numerals are used to designate like parts in
the accompanying drawings.
DETAILED DESCRIPTION
[0016] The detailed description provided below in connection with
the appended drawings is intended as a description of the present
examples and is not intended to represent the only forms in which
the present example may be constructed or utilized. The description
sets forth the functions of the example and the sequence of steps
for constructing and operating the example. However, the same or
equivalent functions and sequences may be accomplished by different
examples.
[0017] Although the present examples are described and illustrated
herein as being implemented in a mobile computing system, the
system described is provided as an example and not a limitation. As
those skilled in the art will appreciate, the present examples are
suitable for application in a variety of different types of
embedded or dedicated systems in which indoor positioning is
useful.
[0018] Within most indoor environments there exists low level
background acoustic noise, in addition to louder acoustic sounds
coming from, for example, people, music, TVs, etc. These sounds are
often unique to a certain physical space. For example, the sounds
present in a kitchen may generally be different from those in a
living room. These sounds can therefore be utilized as part of a
positioning system to determine which room of an indoor environment
a user is located in.
[0019] In order to utilize audio signals to determine the position
of a user, a mobile device capable of sampling the ambient noise in
the area of the user can be used. Many users already possess such a
device in the form of a mobile telephone or other portable
computing device (such as a laptop computer or tablet device).
These devices generally already comprise microphones, and are able
to sample audio signals.
[0020] The techniques described below localize a user by comparing
audio signals captured by a mobile microphone associated with the
user with audio signals captured by other microphones (which can be
fixed or mobile) in order to determine a relative location of the
user to the other microphones. FIGS. 1 to 3 below describe three
different example positioning systems utilizing this technique, and
a method for determining location from the audio signals is
described with reference to FIGS. 4 and 5.
[0021] Reference is first made to FIG. 1, which illustrates a first
example indoor positioning system. FIG. 1 shows a schematic diagram
of an indoor positioning system using a central server for location
calculation relative to fixed microphones. The example of FIG. 1
illustrates an indoor environment 100 comprising a first room 102,
a second room 104 and a third room 106. In other examples a
different number of rooms in different configurations can be
present in the indoor environment.
[0022] A mobile device 108 associated with a user comprises a
microphone 110. The microphone 110 is able to capture audio from
the vicinity of the user. In the example of FIG. 1, the mobile
device 108 is located in the second room 104, and is able to
capture audio signals from within this room. In the example of FIG.
1, the mobile device is a mobile telephone. However, in other
examples, the mobile device can be a laptop computer, tablet
device, or any other type of mobile computing device. In further
examples, the mobile device can be a dedicated device for using
audio signals for localization.
[0023] The system of FIG. 1 aims to determine the location of the
mobile device 108 in terms of which room the mobile device is
located in. By proxy, this can also be used to estimate the
location of the user, as the user is likely to be in the same room
as the mobile device.
[0024] Each room of the indoor environment 100 comprises a
microphone. For example, the first room 102 comprises microphone
112, the second room 104 comprises microphone 114, and the third
room 106 comprises microphone 116. In this example, these room
microphones are fixed, and associated with predefined locations
(e.g. the rooms in which they are placed). The example of FIG. 1
shows one microphone in each room. In alternative examples, rooms
or spaces in the indoor environment can comprise more than one
microphone. The microphones within the rooms are able to capture
ambient noise from within the room. In examples, this can comprise
both background noise (e.g. appliances, air conditioning, music
etc.) as well as foreground noise (e.g. speech).
[0025] In one example, one or more of the room microphones may be
dedicated microphones placed in the room for the purposes of
determining location. In other examples, one or more of the
microphones may already be present in equipment located in the
rooms. This may be any fixed device having audio capture
capabilities. For example, in the case of an indoor environment
that is an office, each room may have conferencing equipment or
landline telephones present. Such equipment already comprises
microphones able to capture audio from the room.
[0026] The indoor positioning system of FIG. 1 further comprises a
computing device 118 such as a server. The computing device is
executing localizer functionality 120, which is arranged to compare
the audio signals from the microphone 110 in the mobile device 108
and the further microphones (112, 114, 116), and determine the
location of the mobile device 108. More detail on the operation of
the localizer functionality is described with reference to FIGS. 4
and 5 below.
[0027] The computing device 118 receives the audio signal from the
mobile device 108 via a wireless interface 122. The wireless
interface 122 may be located at or in the computing device 118, or
remote from it (e.g. connected over a communications network). The
wireless interface 122 is arranged to receive signals transmitted
from the mobile device 108. These signals can comprise audio
information captured by the microphone 110, or data derived
therefrom. The user of the mobile device 108 can be prompted to
provide consent for the audio signal to be transmitted to the
computing device 118. The audio signal received at the wireless
interface 122 from the mobile device 108 is provided to the
localizer functionality 120 at the computing device 118.
[0028] In one example, the wireless interface 122 can be in the
form of an access point, and the wireless interface 122 can
communicate with the mobile device using any suitable short range
communication technique such as WiFi or Bluetooth. In alternative
examples, the wireless interface 122 can be in the form of a base
station, and the wireless interface 122 can communicate with the
mobile device using any suitable cellular communication technique
such as GSM, GPRS, UMTS, WiMAX or LTE.
[0029] The computing device 118 is connected to the room
microphones (112, 114, 116) and receives the audio signals from
these microphones and provides them to the localizer functionality
120. The computing device 118 may be connected to the room
microphones (112, 114, 116) directly or via a communication
network. In the example of FIG. 1, the microphones provide an
analogue signal directly to the computing device 118, and these are
sampled as digital data by an analogue-to-digital converter 124
(ADC). The digital representation of the analogue audio signals
from the microphones is then provided to the localizer
functionality for processing.
[0030] In alternative examples, the microphones can each be
provided with individual, local ADCs, such that they each transmit
digital audio data to the computing device (either directly or via
a network). In further examples, the room microphones can also be
wireless, and transmit the audio signals to the computing device
118 wirelessly (e.g. to the wireless interface 122), rather than
using a wired connection.
[0031] As the computing device 118 (e.g. server) of FIG. 1 receives
the audio signals and determines a location for the mobile device
108, the example positioning system of FIG. 1 represents a
centralized architecture. The system of FIG. 1 may therefore be
suitable for a controlled environment such as a home or office,
where the computing device receiving the audio signals is
maintained locally. In such scenarios the users can explicitly
consent to the capture of ambient audio signals by the
microphones.
[0032] Reference is now made to FIG. 2, which illustrates a second
example positioning system. FIG. 2 shows a schematic diagram of an
indoor positioning system using a mobile device for location
calculation relative to fixed microphones. Unlike the system of
FIG. 1, the system of FIG. 2 does not utilize a central server for
location calculation.
[0033] FIG. 2 again shows the indoor environment 100 comprising the
first room 102, the second room 104, the third room 106, and the
mobile device 108 associated with the user. The mobile device 108
again comprises microphone 110, and the first room 102 comprises
microphone 112, the second room 104 comprises microphone 114, and
the third room 106 comprises microphone 116. As with FIG. 1, the
room microphones are fixed and are associated with their respective
rooms. Note that in other examples a different number of rooms or
microphones in different configurations can be present.
[0034] As before, the mobile device microphone 110 is able to
capture ambient audio from the vicinity of the user, and the room
microphones are able to capture ambient audio from within their
rooms. The system of FIG. 2 aims to determine the location of the
mobile device 108 in terms of which room the mobile device is
located in.
[0035] Rather than communicating with a central computing device,
in the example of FIG. 2, the room microphones (112, 114, 116) are
each connected to a transmitter 202, and the transmitter
communicates with the mobile device 108. In this configuration, the
room microphones each transmit their audio signals directly to the
mobile device 108. For example, the room microphones may transmit
their audio signals using a short range wireless communication
technique such as WiFi or Bluetooth. A corresponding receiver
arranged to receive the audio signals from the transmitters 202 is
present at the mobile device 108.
[0036] In alternative examples, rather than using a separate
transmitter 202 for each microphone, the microphones can be
connected to a common transmitter or access point that transmits
the audio signals for a plurality of microphones.
[0037] In the example of FIG. 2, the mobile device 108 executes the
localizer functionality 204. The localizer functionality 204 is
similar to that described above with reference to FIG. 1, and is
described below in more detail with reference to FIGS. 4 and 5. The
localizer functionality 204 receives the audio signal from the
microphone 110 in the mobile device 108 that describes the ambient
noise in the vicinity of the mobile device 108. The localizer
functionality 204 also receives the audio signals from each of the
room microphones 112, 114, 116 that describe the ambient noise in
the vicinity of each of the rooms of the indoor environment
100. The localizer functionality 204 then compares these audio
signals (as described below) to determine the location of the
mobile device.
[0038] Therefore, the system of FIG. 2 enables the determination of
the location of the mobile device 108 in the indoor environment (in
terms of which room it is located in). This is achieved without the
use of a central server, as the processing for the localization
functionality is performed at the mobile device 108. This means
that the audio signal from the mobile device microphone 110 is not
transmitted outside the mobile device 108, which may be useful in
scenarios that are not under the control of the user (e.g. not in a
home/office environment), although user consent can again be
obtained prior to audio capture.
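The room-level comparison described for FIGS. 1 and 2 can be sketched as follows. This is a minimal illustration only, assuming the audio signals are already available as equal-rate lists of digital samples and that a zero-lag Pearson correlation coefficient is used as the correlation value; the function names `correlation` and `locate_room`, the room labels, and the sample values are hypothetical and do not come from the description.

```python
import math


def correlation(a, b):
    """Zero-lag Pearson correlation coefficient of two sample lists.

    Assumption: both lists are non-empty and sampled at the same rate;
    the longer list is truncated to the length of the shorter one.
    """
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a constant (e.g. silent) signal cannot be correlated
    return sum(x * y for x, y in zip(da, db)) / denom


def locate_room(mobile_audio, room_audio):
    """Return the room whose microphone signal correlates best with the
    mobile device signal, together with the score for every room."""
    scores = {room: correlation(mobile_audio, samples)
              for room, samples in room_audio.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical samples: the mobile device hears almost the same audio as
# the microphone in room 104.
mobile = [0.0, 0.9, 0.1, -0.8, -0.2, 0.7, 0.3, -0.6]
rooms = {
    "room_102": [0.5, -0.4, 0.6, 0.1, -0.7, 0.2, -0.3, 0.4],
    "room_104": [0.1, 1.0, 0.0, -0.9, -0.1, 0.8, 0.2, -0.5],
    "room_106": [-0.2, 0.1, -0.1, 0.2, 0.0, -0.1, 0.1, 0.0],
}
best_room, room_scores = locate_room(mobile, rooms)
print(best_room, room_scores)
```

The same function could run at the server of FIG. 1 or at the mobile device of FIG. 2; only the place where the signals are collected differs.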
[0039] Reference is now made to FIG. 3, which illustrates a third
example positioning system. FIG. 3 shows a schematic diagram of an
indoor positioning system that calculates location relative to
other mobile devices. The system of FIG. 3 differs from those in
FIG. 1 or 2 in that the system is not aiming to determine the
location of the mobile device 108 in terms of which room of an
indoor environment the mobile device is in, but rather the system
aims to determine which other mobile devices it is in proximity
to.
[0040] The use of acoustic signals for this purpose enables a more
representative location to be determined for indoor environments.
For example, low power radio signals can be transmitted between
mobile devices to ascertain whether they are in proximity. However,
these signals pass readily through walls, floors, windows, and
other internal structures in indoor environments. Therefore, when
radio signals are used it may appear that certain mobile devices
are in proximity, whereas they are actually in different rooms or
on different floors. This results in a difference between what the
user perceives as being other mobile devices in proximity, and what
the positioning system determines. Acoustic signals are more
readily attenuated by indoor structures, and are therefore suitable
for determining proximity between mobile devices that matches the
user's perception.
[0041] FIG. 3 again shows the indoor environment 100 comprising the
first room 102, the second room 104, the third room 106, and the
mobile device 108 associated with the user. The mobile device 108
again comprises microphone 110 that is able to capture ambient
audio from the vicinity of the user. However, fixed room
microphones are not present.
[0042] The example of FIG. 3 includes three further mobile devices:
a first further mobile device 302 comprising microphone 304; a
second further mobile device 306 comprising microphone 308; and a
third further mobile device 310 comprising microphone 312. Each of
these further mobile devices 302, 306, 310 can communicate with the
mobile device 108 of the user. For example, the further mobile
devices 302, 306, 310 may communicate with the mobile device 108
using a short range wireless communication technology such as WiFi,
Bluetooth or similar. The further mobile devices 302, 306, 310
transmit audio signals from their respective microphones 304, 308,
312 to the mobile device 108, with the consent of the associated
user. These audio signals contain information on the ambient audio
in the vicinity of the associated further mobile device.
[0043] In the FIG. 3 example, the mobile device 108 executes
localizer functionality 314. The localizer functionality 314 is
similar to that described above with reference to FIGS. 1 and 2
(and described below in more detail with reference to FIGS. 4 and
5). The localizer functionality 314 receives the audio signal from
the microphone 110 in the mobile device 108 that describes the
ambient noise in the vicinity of the mobile device 108. The
localizer functionality 314 also receives the audio signals from
each of the microphones 304, 308, 312 that describe the ambient
noise in the vicinity of each of the further mobile devices
302, 306, 310. The localizer functionality 314 then compares these
audio signals (as described in more detail below) to determine the
relative location of the mobile device.
[0044] In the illustrative example of FIG. 3, the mobile device 108
is present in the second room 104. Also present in the second room
104 is the second further mobile device 306. When the mobile device
108 compares the audio signal from its own microphone 110 with that
from the microphone 308 of the second further mobile device 306 it
can determine that these are sufficiently similar to indicate that
the mobile device 108 is close to (e.g. in the same room as) the
second further mobile device 306.
[0045] Conversely, in the example of FIG. 3, the first further
mobile device 302 and third further mobile device 310 are located
in different rooms to the mobile device 108. They are still
sufficiently close to the mobile device 108 that the radio signals
comprising the audio from the first further mobile device 302 and
third further mobile device 310 are received at the mobile device
108. However, because they are located in different rooms to the
mobile device 108, the audio signals from the first further mobile
device 302 and third further mobile device 310 are different from
the audio captured by the mobile device microphone 110. The
localizer functionality 314 therefore does not consider the mobile
device 108 to be in close proximity to the first further mobile
device 302 and third further mobile device 310.
[0046] This matches the user's perception of the relative spatial
locations of the mobile devices. The user perceives that the second
further mobile device 306 is close by, as it is in the same room,
but does not consider the first further mobile device 302 or third
further mobile device 310 to be close as they are in a different
room and cannot be seen (despite the fact that they may be
spatially nearby).
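The proximity decision between two mobile devices can be illustrated with the sketch below. It assumes, purely for illustration, that similarity is measured as the peak of a normalized cross-correlation over a small range of time lags (to tolerate unsynchronized capture) and that a fixed threshold separates "same room" from "different room". The threshold of 0.7, the lag range, and the names `normalized_xcorr_peak` and `in_proximity` are hypothetical choices, not values taken from the description.

```python
import math


def normalized_xcorr_peak(a, b, max_lag):
    """Largest normalized cross-correlation of a and b over lags in
    [-max_lag, max_lag]. Negative correlations are treated as no match,
    so the result lies between 0.0 and 1.0."""
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if norm == 0.0:
        return 0.0
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                total += x * b[j]
        best = max(best, total / norm)
    return best


def in_proximity(own_audio, other_audio, threshold=0.7, max_lag=3):
    """Decide whether two devices hear sufficiently similar ambient audio."""
    return normalized_xcorr_peak(own_audio, other_audio, max_lag) >= threshold


# Hypothetical samples: the second device hears the same sound, delayed by
# two samples and at half the amplitude; the third hears unrelated audio.
own = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0, 0.0]
same_room = [0.0, 0.0, 0.0, 0.5, 0.0, -0.5, 0.0, 0.5, 0.0, -0.5]
other_room = [0.3, -0.1, 0.2, 0.1, -0.3, 0.0, 0.1, -0.2, 0.2, -0.1]

print(in_proximity(own, same_room))
print(in_proximity(own, other_room))
```

Normalizing by the signal energies makes the decision insensitive to differences in microphone gain, which is one reason a normalized measure is assumed here.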
[0047] The system of FIG. 3 does not provide an absolute location
for the mobile device 108 in an indoor environment, but does
provide a relative location between mobile devices. In other words,
the positioning system of FIG. 3 determines which mobile devices
are in sufficiently close proximity to experience similar ambient
noises (e.g. in the same room). This relative location information
can be used to provide location based services such as sharing of
documents or presentation materials with participants of a meeting
that are all in the same room, without sharing with other mobile
devices that are outside the room.
[0048] In some examples, to avoid sharing audio signals with
unknown mobile devices, each mobile device can be arranged to only
send an audio signal to another mobile device if the user has
expressly permitted the communication, or if the other mobile
device is pre-approved, e.g. by listing the other mobile device in
its address book.
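The sharing rule above amounts to a simple check before any audio is transmitted. The following sketch assumes device identifiers are plain strings and that the address book and the set of explicitly permitted devices are held as sets; all names and identifiers are hypothetical.

```python
def may_send_audio(recipient_id, address_book, explicitly_permitted):
    """Return True only if the recipient is pre-approved (listed in the
    address book) or the user has expressly permitted the communication."""
    return recipient_id in address_book or recipient_id in explicitly_permitted


def approved_recipients(nearby_ids, address_book, explicitly_permitted):
    """Filter the devices in radio range down to those allowed to receive audio."""
    return [device for device in nearby_ids
            if may_send_audio(device, address_book, explicitly_permitted)]


# Hypothetical identifiers for illustration.
address_book = {"device_302", "device_306"}
explicitly_permitted = {"device_310"}
nearby = ["device_306", "device_310", "device_999"]

recipients = approved_recipients(nearby, address_book, explicitly_permitted)
print(recipients)
```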
[0049] In an alternative example to that shown in FIG. 3, a
centralized architecture using a server can be used, similar to
that shown in FIG. 1. In such an example, each of the mobile
devices 108, 302, 306, 310 transmits their audio signals to a
computing device such as a server, which executes the localizer
functionality and determines the relative locations.
[0050] Systems such as that shown in FIG. 3, in which mobile
devices compare audio signals, can be used to generate a relative
topography for a group of users of mobile devices. For example,
each mobile device can use the audio signals to determine which
other mobile devices it is in proximity to. This information can be
collated and used to generate an overall topography for all users,
indicating who is near whom.
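One way to collate the per-device results into an overall topography is sketched below. It assumes each device reports the set of other devices it considers to be in proximity, merges these reports into an undirected graph, and groups devices by connected component. Grouping by connected component is an assumption made for this illustration (proximity need not be transitive in practice), and the function names and device labels are hypothetical.

```python
from collections import defaultdict


def build_topography(proximity_reports):
    """Merge per-device reports {device: set of nearby devices} into an
    undirected adjacency mapping."""
    graph = defaultdict(set)
    for device, nearby in proximity_reports.items():
        graph[device]  # make sure devices with no neighbours still appear
        for other in nearby:
            graph[device].add(other)
            graph[other].add(device)
    return dict(graph)


def proximity_groups(graph):
    """Return the connected components of the graph as sorted lists."""
    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


# Hypothetical reports mirroring FIG. 3: devices 108 and 306 share a room.
reports = {
    "108": {"306"},
    "306": {"108"},
    "302": set(),
    "310": set(),
}
topography = build_topography(reports)
groups = proximity_groups(topography)
print(groups)
```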
[0051] Note that in further examples, combinations of fixed and
mobile microphones can also be used. For example, the examples of
FIGS. 2 and 3 can be combined, such that some or all rooms have
fixed microphones, and the further mobile devices also send their
audio signals to the mobile device 108. This enables determination
of both absolute and relative locations of the mobile device
108.
[0052] Reference is now made to FIG. 4, which illustrates a flow
chart of a process for determining a location of a mobile device
using audio signals. The process of FIG. 4 can be implemented by
the localizer functionality 120, 204, 314 as mentioned above with
reference to FIGS. 1, 2 and 3, located at the mobile device 108 or
computing device 118. The process of FIG. 4 may be implemented in
software, hardware, firmware, or any suitable combination thereof,
as described in more detail with reference to FIG. 5 below.
[0053] The audio signal from the mobile device 108 to be localized
is received 402. This audio signal originates from the microphone
110 in the mobile device 108 as described above. The audio signal
received can be in the form of digital samples of the analogue
audio signal captured by the microphone 110. The audio signals from
one or more further microphones are also received 404. These audio
signals are those received from, for example, the fixed room
microphones 112, 114, 116 in the examples of FIGS. 1 and 2, or the
mobile device microphones 304, 308, 312 in the example of FIG. 3.
These audio signals can be in the form of digital samples of the
captured analogue audio.
[0054] Optional signal processing can then be applied 406 to either
or both of the audio signals from the mobile device 108 and the
further microphones. The signal processing that can be applied
includes (but is not limited to) one or more of encryption, audio
fingerprinting, filtering, normalization, time-shifting, and domain
transformation.
[0055] An encryption operation can be used to ensure that ambient
audio signals captured by the microphones cannot readily be
intercepted during transmission between elements of the
localization system. In some examples, encryption can be performed
locally at the microphones, such that only secure audio signals are
transmitted (wired or wirelessly) from the microphones.
[0056] An audio fingerprinting operation can determine
a "signature" for each audio signal. This is also known as
content-based audio identification (CBID). Audio fingerprinting
operations extract representative features from the audio signals.
The audio fingerprint therefore characterizes the audio signal
without retaining the information content (e.g. any captured
speech) within the signal. If an audio fingerprint operation is
used, then the signatures of the audio signals can be compared,
rather than the original captured audio. Examples of features that
can be extracted from audio signals in an audio fingerprinting
operation include (but are not limited to): Mel-frequency cepstrum
coefficients (MFCC); spectral flatness measures (SFM); band
representative vectors; and hash strings. Note that, in some
examples, the audio fingerprinting operation can be performed
locally at the microphones, to ensure that only signals without
information content are sent from the microphones.
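As a minimal sketch of the general idea, the following derives a
short bit-string signature from frame energies and compares two
signatures. This is a deliberately simplified stand-in and is not an
MFCC, SFM or band representative vector implementation; the function
names and the frame size are assumptions made for illustration.

```python
def energy_fingerprint(samples, frame_size):
    """Return a bit string with one bit per pair of adjacent frames.

    A bit is '1' when a frame has more energy than the frame before it.
    """
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(sum(s * s for s in frame))
    return "".join(
        "1" if cur > prev else "0" for prev, cur in zip(energies, energies[1:])
    )


def fingerprint_similarity(fp_a, fp_b):
    """Fraction of matching bits over the common length (1.0 = identical)."""
    length = min(len(fp_a), len(fp_b))
    if length == 0:
        return 0.0
    matches = sum(1 for x, y in zip(fp_a, fp_b) if x == y)
    return matches / length


# Two captures of the same ambient pattern at different volumes,
# and one capture with a different pattern (all values made up).
room_a = [0.1] * 4 + [0.9] * 4 + [0.2] * 4 + [0.8] * 4
room_a_quieter = [0.5 * s for s in room_a]
room_b = [0.9] * 4 + [0.1] * 4 + [0.8] * 4 + [0.2] * 4

fp_a = energy_fingerprint(room_a, frame_size=4)
fp_a_quieter = energy_fingerprint(room_a_quieter, frame_size=4)
fp_b = energy_fingerprint(room_b, frame_size=4)
```

Only the bit strings need to be transmitted and compared, so the
captured samples themselves do not leave the microphone in this
sketch.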
[0057] In examples, filtering operations can be applied to one or
more of the audio signals to filter one or more frequency bands.
Selecting certain frequency bands of the audio signal to retain can
be used to enhance the audio signals by focusing the analysis on
representative frequency bands that characterize locations. For
example, a high-pass filter can be used to remove low frequency
portions of the signal that may propagate more easily through
internal building structures, leaving higher frequency signals that
do not pass between rooms readily. In another example, band-stop
filters can be used to remove frequency bands associated with human
speech, such that mainly background noise is retained in the audio
signals.
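The high-pass example can be sketched as follows, assuming a simple
first-order (RC-style) filter. The description does not specify a
filter design; the 300 Hz cut-off, the 8 kHz sample rate and the
function names are assumptions made for illustration.

```python
import math


def high_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order RC-style high-pass filter over a list of samples."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    filtered = [samples[0]]
    for i in range(1, len(samples)):
        # Each output follows changes in the input and decays otherwise.
        filtered.append(alpha * (filtered[-1] + samples[i] - samples[i - 1]))
    return filtered


def rms(samples):
    """Root-mean-square level of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


rate = 8000
low_tone = [math.sin(2 * math.pi * 50 * n / rate) for n in range(800)]
high_tone = [math.sin(2 * math.pi * 3000 * n / rate) for n in range(800)]

# Ratio of output level to input level for each test tone.
low_ratio = rms(high_pass(low_tone, 300, rate)) / rms(low_tone)
high_ratio = rms(high_pass(high_tone, 300, rate)) / rms(high_tone)
```

In this sketch the 50 Hz tone is strongly attenuated while the 3 kHz
tone passes largely unchanged, so the comparison is weighted towards
the higher frequencies.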
[0058] In other examples, the filtering performed can be based on
amplitude, i.e. volume level, of the audio signals. For example,
only the portions of the audio signals that are less than a
selected amplitude can be retained by the filters. This enables
foreground audio to be removed from the audio signals, and only
background audio signals are retained.
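A minimal sketch of such amplitude-based filtering is given below.
Replacing loud samples with zero, rather than dropping them, is an
assumption made here so that the remaining samples keep their
positions in time; the function name and the amplitude limit are
likewise illustrative.

```python
def keep_background(samples, max_amplitude):
    """Retain samples quieter than max_amplitude and zero out the rest."""
    return [s if abs(s) < max_amplitude else 0.0 for s in samples]


# Made-up capture: quiet background with two loud foreground samples.
captured = [0.01, -0.02, 0.8, -0.9, 0.03]
background = keep_background(captured, max_amplitude=0.1)
```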
[0059] In further examples, a normalization operation can be
performed on the audio signals. A normalization operation can
equalize the amplitude of the different audio signals. For example,
this can normalize the peak level or a mean level (e.g. RMS) of the
audio signals. The normalization can, in other examples,
alternatively (or additionally) be performed in the frequency domain,
such that the frequency range of the audio signals is equalized.
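Peak and RMS normalization in the time domain can be sketched as
follows. The function names and the target level of 1.0 are
assumptions made for illustration.

```python
import math


def normalize_peak(samples, target_peak=1.0):
    """Scale samples so that the largest magnitude equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent or empty signal: nothing to scale
    return [s * target_peak / peak for s in samples]


def normalize_rms(samples, target_rms=1.0):
    """Scale samples so that their RMS level equals target_rms."""
    if not samples:
        return []
    level = math.sqrt(sum(s * s for s in samples) / len(samples))
    if level == 0.0:
        return list(samples)
    return [s * target_rms / level for s in samples]


quiet = [0.1, -0.2, 0.05]
peak_normalized = normalize_peak(quiet)
rms_normalized = normalize_rms([3.0, 4.0])
```

After either operation, signals captured at different volume levels
are brought to a common level before they are compared.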
[0060] A time-shift operation can be applied to the audio signals
in yet further examples. The time-shift can be used to more
accurately align (i.e. synchronize) the samples of the audio
signals originating from different sources. For example, in the
case of FIG. 1, the fixed room microphones are shown providing
their audio signals to the computing device via direct wired
connections. This therefore results in these audio signals arriving
with minimal time-lag. However, the audio signals from the
microphone 110 in the mobile device 108 are sent over a wireless
link. The processing involved in coding, transmitting and
subsequently receiving and decoding the audio signals over the
wireless link introduces a time-lag for this audio signal, relative
to the others received more directly. To counteract such time
differences, a time shift can be applied to one or more of the
audio signals (e.g. the audio signals from the fixed room
microphones), such that the audio signals are time-aligned.
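One way such a time shift could be determined and applied is sketched
below: the lag that maximizes the sample-by-sample product sum is
estimated, and the earlier-arriving signal is delayed by that amount.
The description does not state how the shift is chosen, so the
estimation method, the function names and the sample values are
assumptions made for illustration.

```python
def estimate_lag(reference, delayed, max_lag):
    """Return the lag in samples (0..max_lag) at which `delayed`
    best matches `reference`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = sum(r * d for r, d in zip(reference, delayed[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_signal(samples, lag):
    """Delay a signal by `lag` samples, padding the start with silence
    and keeping the original length."""
    if lag <= 0:
        return list(samples)
    return [0.0] * lag + list(samples[:len(samples) - lag])


# Made-up samples: the wired signal arrives promptly, while the
# wireless (mobile) copy of the same sound arrives three samples late.
wired = [0.0, 1.0, 3.0, -2.0, 4.0, 0.0, -1.0, 2.0, 0.0, 0.0]
mobile = [0.0, 0.0, 0.0] + wired[:7]

lag = estimate_lag(wired, mobile, max_lag=5)
wired_aligned = delay_signal(wired, lag)
```

Here the wired signal is delayed to match the mobile signal, which
corresponds to applying the time shift to the fixed room microphone
signals.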
[0061] A domain transformation operation can be applied in some
examples to transform the audio signals from the time-domain to the
frequency domain. The audio signals are then subsequently compared
in the frequency domain rather than time domain. By processing the
audio signals in the frequency domain, information such as speech
in the audio signals is not directly derivable. A transformation
from the time-domain to the frequency-domain can be performed
using, for example, a fast Fourier transform.
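The transformation can be sketched as follows using a direct discrete
Fourier transform. This computes the same magnitude spectrum that a
fast Fourier transform would produce, only more slowly; it is used
here purely to keep the sketch self-contained. The function name and
the test tone are assumptions made for illustration.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude spectrum (bins 0..N/2) via a direct O(N^2) DFT."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(total))
    return magnitudes


# A 64-sample tone that completes exactly five cycles.
tone = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
spectrum = dft_magnitudes(tone)
peak_bin = spectrum.index(max(spectrum))
```

The resulting magnitude spectra, rather than the time-domain samples,
can then be supplied to the comparison.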
[0062] Note that some or all of these signal processing operations
can also be performed locally at the microphones, as well as at the
localizer functionality.
[0063] Following the optional signal processing operations, the
various audio signals are compared. To do this, a correlation
between the audio signal from the mobile device 108 and the audio
signals from each of the further microphones is computed 408. In
one example, the correlation calculation can be in the form of a
cross-correlation calculation. For example, the cross-correlation
between two functions (e.g. audio signals), f and g, can be found
using the following definition:
$$(f \star g)[n] \overset{\text{def}}{=} \sum_{m=-\infty}^{\infty} f^{*}[m]\, g[n+m]$$
[0064] Where n is a time lag between the two functions, and f* is
the complex conjugate of f.
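For finite sequences of samples, the cross-correlation at a given lag
can be sketched directly from this definition. Treating samples
outside the captured range as zero, and the function name, are
assumptions made for illustration.

```python
def cross_correlation(f, g, lag):
    """Sum over m of conj(f[m]) * g[lag + m] for finite sequences.

    Samples outside the range of g are treated as zero.
    """
    total = 0
    for m, f_m in enumerate(f):
        index = lag + m
        if 0 <= index < len(g):
            # .conjugate() is available on int, float and complex values.
            total += f_m.conjugate() * g[index]
    return total


f = [1, 2, 3]
g = [0, 1, 2, 3, 0]  # the same pattern as f, one sample later
values = {lag: cross_correlation(f, g, lag) for lag in range(-1, 3)}
best_lag = max(values, key=values.get)
```

In this sketch the value is largest at a lag of one sample, which is
the offset at which the two sequences line up.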
[0065] The output of the correlation calculations is a set of
values that indicates the degree of similarity between the audio
signal from the mobile device 108 and the audio signals from each
of the further microphones. The set of correlation values are then
compared to determine 410 which of the further microphones the
mobile device is in proximity to. This can be achieved by selecting
the further microphone providing the audio signal that has the
highest degree of correlation with the audio signal from the mobile
device. In a further example, a threshold correlation value can
also be set, such that the mobile device is determined to be in
proximity to one or more further microphones for which the degree
of correlation exceeds the threshold.
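Both selection approaches can be sketched as follows. The function
names, the correlation values and the threshold of 0.5 are
assumptions made for illustration.

```python
def closest_microphone(correlations):
    """Return the microphone with the highest correlation value,
    or None if no values are available."""
    if not correlations:
        return None
    return max(correlations, key=correlations.get)


def microphones_in_proximity(correlations, threshold):
    """Return every microphone whose correlation exceeds the threshold."""
    return sorted(
        mic for mic, value in correlations.items() if value > threshold
    )


# Made-up correlation values for the further microphones 112, 114, 116.
correlations = {"112": 0.82, "114": 0.55, "116": 0.10}
nearest = closest_microphone(correlations)
nearby = microphones_in_proximity(correlations, threshold=0.5)
```

The first function yields a single location, whereas the second can
report proximity to several further microphones at once, or to none.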
[0066] Even in an example where multiple further microphones are
present in a single room, the correlation will be greatest for the
further microphone that is closest to the mobile device. This is
because the ambient noise can vary even within the confines of a
single room. Therefore, this technique can also be used to provide
localization within a single room environment.
[0067] The determined location in terms of a relative proximity to
one of the further microphones can then be output from the
localizer functionality and utilized in any suitable location-based
services. As noted above, the output location can be transformed
into an absolute location in the case of fixed microphones, as the
location (e.g. in terms of rooms) of the fixed microphones is
known. Alternatively, the output location can be in the form of a
relative location in the case of mobile microphones, for example in
terms of a proximity to one or more other mobile devices.
[0068] Reference is now made to FIG. 5, which illustrates a
functional block diagram of the localizer functionality
implementing the flowchart of FIG. 4. FIG. 5 is illustrated in the
context of the example system of FIGS. 1 and 2, with fixed
microphones. Note that a similar structure also applies when the
audio signals are received from other mobile devices rather than
the fixed microphones (e.g. in the case of FIG. 3).
[0069] FIG. 5 shows an audio signal 502 captured by the mobile
device 108. FIG. 5 also shows an audio signal 504 captured by the
microphone 112 in the first room 102, an audio signal 506 captured
by the microphone 114 in the second room 104, and an audio signal
508 captured by the microphone 116 in the third room 106.
[0070] Each of the audio signals 502, 504, 506, 508 can be in the
form of digital samples of ambient sounds from a short period of
time. In some examples, the time period over which the sound is
sampled can be sufficiently short that no significant information
content can be obtained from any speech that is captured by the
microphones.
[0071] The audio signals 502, 504, 506, 508 are then each provided
to optional signal processing blocks 510, which can apply one or
more of the signal processing operations described above. These
include (but are not limited to) an audio fingerprint operation
512, a time-shift operation 514, a normalize operation 516, a
filter operation 518, a domain transform operation 520, and an
encryption operation 521.
[0072] Following signal processing (if applied), each of the audio
signals 504, 506, 508 from the rooms is separately applied to one
input of a respective correlator 522. The audio signal 502 from the
mobile device 108
is applied to the other input of each correlator 522. The
correlator 522 outputs the correlation between the signals applied
at its inputs. The output from each correlator 522 is provided to a
selector 524. The selector 524 compares the correlation between the
mobile device audio signal 502 and each of the room audio signals
504, 506, 508, and outputs the room having the highest degree of
correlation as the location for the mobile device 108.
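The correlator and selector arrangement can be sketched end to end as
follows. The correlator here is a zero-lag normalized correlation
(Pearson coefficient), which is one possible choice and is an
assumption of this sketch, as are the function names and the
synthetic signals, in which the mobile device captures a scaled and
noisier copy of the sound in room 104.

```python
import math
import random


def correlator(signal_a, signal_b):
    """Zero-lag normalized correlation of two signals, in [-1, 1]."""
    n = min(len(signal_a), len(signal_b))
    if n == 0:
        return 0.0
    a, b = signal_a[:n], signal_b[:n]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def selector(mobile_signal, room_signals):
    """Run one correlator per room; return (best_room, all_scores)."""
    scores = {
        room: correlator(mobile_signal, signal)
        for room, signal in room_signals.items()
    }
    return max(scores, key=scores.get), scores


rng = random.Random(0)  # fixed seed so the synthetic example is repeatable
rooms = {
    room: [rng.uniform(-1.0, 1.0) for _ in range(500)]
    for room in ("102", "104", "106")
}
mobile = [0.6 * s + 0.1 * rng.uniform(-1.0, 1.0) for s in rooms["104"]]

location, scores = selector(mobile, rooms)
```

With these synthetic signals, the selector outputs room 104, since
the mobile device signal is derived from the sound captured there.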
[0073] Reference is now made to FIG. 6, which illustrates various
components of an exemplary computing device 600 which may be
implemented as any form of a computing and/or electronic device,
and in which embodiments of the indoor localization technique may
be implemented. For example, the computing device 600 of FIG. 6 can
be the centralized computing device 118 of FIG. 1, or the mobile
device 108 of FIG. 2 or 3.
[0074] Computing device 600 comprises one or more processors 602
which may be microprocessors, controllers or any other suitable
type of processors for processing computer-executable instructions
to control the operation of the device in order to perform indoor
localization. In some examples, for example where a system on a
chip architecture is used, the processors 602 may include one or
more fixed function blocks (also referred to as accelerators) which
implement a part of the indoor localization methods in hardware
(rather than software or firmware).
[0075] The computing device 600 comprises a communication interface
604, which is arranged to communicate with one or more
communication networks. For example, the communication interface
can be a wireless communication interface arranged to communicate
wirelessly with one or more mobile devices or microphones (e.g. as
shown in FIGS. 1-3). The communication interface may also
communicate with one or more wired communication networks (e.g. the
internet).
[0076] The computing device 600 also comprises an input interface
606 arranged to receive input from one or more devices or data
sources, such as the microphones 112, 114, 116 as shown in FIG.
1. An output interface 608 may also optionally be provided and
arranged to provide output to, for example, a storage device or
display system integral with or in communication with the computing
device. The display system may provide a graphical user interface,
or other user interface of any suitable type although this is not
essential.
[0077] The computer executable instructions may be provided using
any computer-readable media that is accessible by computing device
600. Computer-readable media may include, for example, computer
storage media such as memory 610 and communications media. Computer
storage media, such as memory 610, includes volatile and
non-volatile, removable and non-removable media implemented in any
method or technology for storage of information such as computer
readable instructions, data structures, program modules or other
data. Computer storage media includes, but is not limited to, RAM,
ROM, EPROM, EEPROM, flash memory or other memory technology,
CD-ROM, digital versatile disks (DVD) or other optical storage,
magnetic cassettes, magnetic tape, magnetic disk storage or other
magnetic storage devices, or any other non-transmission medium that
can be used to store information for access by a computing device.
In contrast, communication media may embody computer readable
instructions, data structures, program modules, or other data in a
modulated data signal, such as a carrier wave, or other transport
mechanism. As defined herein, computer storage media does not
include communication media. Although the computer storage media
(memory 610) is shown within the computing device 600 it will be
appreciated that the storage may be distributed or located remotely
and accessed via a network or other communication link (e.g. using
communication interface 604).
[0078] Platform software comprising an operating system 612 or any
other suitable platform software may be provided at the computing
device to enable application software 614 to be executed on the
device. The memory 610 can store executable instructions to
implement the functionality of a correlator 616 for comparing audio
signals, selection logic 618 for comparing correlation values and
determining a location, and optional signal processing logic 620
for implementing the signal processing operations described above.
The memory 610 can also provide a data store 622, which can be used
to provide storage for data used by the processors 602 when
performing the indoor localization techniques.
[0079] The term `computer` is used herein to refer to any device
with processing capability such that it can execute instructions.
Those skilled in the art will realize that such processing
capabilities are incorporated into many different devices and
therefore the term `computer` includes PCs, servers, mobile
telephones, personal digital assistants and many other devices.
[0080] The methods described herein may be performed by software in
machine readable form on a tangible storage medium e.g. in the form
of a computer program comprising computer program code means
adapted to perform all the steps of any of the methods described
herein when the program is run on a computer and where the computer
program may be embodied on a computer readable medium. Examples of
tangible (or non-transitory) storage media include disks, thumb
drives, memory, etc., and do not include propagated signals. The
software can be suitable for execution on a parallel processor or a
serial processor such that the method steps may be carried out in
any suitable order, or simultaneously.
[0081] This acknowledges that software can be a valuable,
separately tradable commodity. It is intended to encompass
software, which runs on or controls "dumb" or standard hardware, to
carry out the desired functions. It is also intended to encompass
software which "describes" or defines the configuration of
hardware, such as HDL (hardware description language) software, as
is used for designing silicon chips, or for configuring universal
programmable chips, to carry out desired functions.
[0082] Those skilled in the art will realize that storage devices
utilized to store program instructions can be distributed across a
network. For example, a remote computer may store an example of the
process described as software. A local or terminal computer may
access the remote computer and download a part or all of the
software to run the program. Alternatively, the local computer may
download pieces of the software as needed, or execute some software
instructions at the local terminal and some at the remote computer
(or computer network). Those skilled in the art will also realize
that, by utilizing conventional techniques, all or a portion of the
software instructions may be carried out by a dedicated circuit,
such as a DSP, programmable logic array, or the like.
[0083] Any range or device value given herein may be extended or
altered without losing the effect sought, as will be apparent to
the skilled person.
[0084] Although the subject matter has been described in language
specific to structural features and/or methodological acts, it is
to be understood that the subject matter defined in the appended
claims is not necessarily limited to the specific features or acts
described above. Rather, the specific features and acts described
above are disclosed as example forms of implementing the
claims.
[0085] It will be understood that the benefits and advantages
described above may relate to one embodiment or may relate to
several embodiments. The embodiments are not limited to those that
solve any or all of the stated problems or those that have any or
all of the stated benefits and advantages. It will further be
understood that reference to `an` item refers to one or more of
those items.
[0086] The steps of the methods described herein may be carried out
in any suitable order, or simultaneously where appropriate.
Additionally, individual blocks may be deleted from any of the
methods without departing from the spirit and scope of the subject
matter described herein. Aspects of any of the examples described
above may be combined with aspects of any of the other examples
described to form further examples without losing the effect
sought.
[0087] The term `comprising` is used herein to mean including the
method blocks or elements identified, but that such blocks or
elements do not comprise an exclusive list and a method or
apparatus may contain additional blocks or elements.
[0088] It will be understood that the above description of a
preferred embodiment is given by way of example only and that
various modifications may be made by those skilled in the art. The
above specification, examples and data provide a complete
description of the structure and use of exemplary embodiments of
the invention. Although various embodiments of the invention have
been described above with a certain degree of particularity, or
with reference to one or more individual embodiments, those skilled
in the art could make numerous alterations to the disclosed
embodiments without departing from the spirit or scope of this
invention.
* * * * *