U.S. patent application number 14/099900 was filed with the patent office on 2013-12-06 and published on 2014-08-07 as publication number 20140218281 for systems and methods for eye gaze determination.
This patent application is currently assigned to Eyefluence, Inc. The applicant listed for this patent is Eyefluence, Inc. Invention is credited to Gholamreza Amayeh, Dave Leblanc, Zhiming Liu, Michael Vacchina, and Steve Wood.
Application Number: 14/099900
Publication Number: 20140218281
Family ID: 50884065
Publication Date: 2014-08-07

United States Patent Application 20140218281
Kind Code: A1
Amayeh; Gholamreza; et al.
August 7, 2014
SYSTEMS AND METHODS FOR EYE GAZE DETERMINATION
Abstract
Devices and methods are provided for eye tracking and gaze
determination. In one embodiment, a method for compensating for
movement of a wearable eye tracking device relative to a user's eye
is provided that includes wearing a wearable device on a user's
head such that one or more endo-cameras are positioned to acquire
images of one or both of the user's eyes, and an exo-camera is
positioned to acquire images of the user's surroundings;
calculating the location of features in a user's eye that cannot be
directly observed from images of the eye acquired by an
endo-camera; and spatially transforming camera coordinate systems
of the exo- and endo-cameras to place calculated eye features in a
known location and alignment.
Inventors: Amayeh; Gholamreza (Reno, NV); Leblanc; Dave (Reno, NV); Liu; Zhiming (Reno, NV); Vacchina; Michael (Reno, NV); Wood; Steve (Sunnyvale, CA)

Applicant: Eyefluence, Inc. (Reno, NV, US)

Assignee: Eyefluence, Inc. (Reno, NV)

Family ID: 50884065

Appl. No.: 14/099900

Filed: December 6, 2013
Related U.S. Patent Documents

Application Number | Filing Date | Patent Number
61734354 | Dec 6, 2012 |
61734294 | Dec 6, 2012 |
61734342 | Dec 6, 2012 |
Current U.S. Class: 345/156
Current CPC Class: A61B 3/14 20130101; G03B 2213/025 20130101; G03B 17/38 20130101; G06F 1/163 20130101; G06F 3/013 20130101
Class at Publication: 345/156
International Class: G06F 3/01 20060101 G06F003/01
Government Interests
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH AND
DEVELOPMENT
[0002] The U.S. Government may have a paid-up license in this
invention and the right in limited circumstances to require the
patent owner to license others on reasonable terms as provided for
by the terms of Department of Defense (US Army) Contract No.
W81XWH-05-C-0045, U.S. Department of Defense Congressional Research
Initiative Nos. W81XWH-06-2-0037, W81XWH-09-2-0141, and
W81XWH-11-2-0156; and U.S. Department of Transportation
Congressional Research Initiative Agreement Award No. DTNH
22-05-H-01424.
Claims
1. A method for eye tracking, comprising: a) calibrating a wearable
device before the wearable device is worn by a user; b) placing the
wearable device on a user's head adjacent one or both of the user's
eyes; c) calibrating the wearable device after placing the wearable
device on the user's head; d) detecting at least one eye feature of
a first eye of the user's eyes; e) performing a compensation
algorithm; and f) calculating a gaze direction of the user.
2. The method of claim 1, wherein step c) includes at least one of:
i) identifying one or more glints reflected off one or both eyes of
the user; and ii) calibrating between an endo-camera configured to
acquire images of one eye of the user and an exo-camera configured
to acquire images of the user's surroundings.
3. The method of claim 1, wherein step a) comprises computer vision
methods.
4. The method of claim 2, wherein step a) is completed after
manufacturing the wearable device and before first use of the
wearable device.
5. The method of claim 1, wherein step c) comprises estimating a head pose of the user wearing the wearable device.
6. The method of claim 1, wherein step e) comprises at least one of
normalization, denormalization, and spatial transform to correct
for movement between the eye and the eye tracking camera.
7. The method of claim 1, wherein step f) comprises calculating a
target region within a real or virtual surface or volume, which
includes at least one of construction of a vector in space,
mapping, and interpolation.
8. A system for eye tracking, comprising: a wearable device
configured to be worn on a user's head; an exo-camera on the
wearable device configured to provide images of a user's
surroundings when the wearable device is worn by the user; an
endo-camera on the wearable device configured to provide images of
a first eye of the user when the wearable device is worn by the
user; and one or more processors configured for: a) calibrating the
wearable device before the wearable device is worn by a user; b)
calibrating the wearable device after placing the wearable device
on the user's head; c) detecting at least one eye feature of a
first eye of the user's eyes; d) performing a compensation
algorithm; and e) calculating a gaze direction of the user.
9. A method for compensating for movement of a wearable eye
tracking device relative to a user's eye, comprising: wearing a
wearable device on a user's head such that one or more endo-cameras
are positioned to acquire images of one or both of the user's eyes,
and an exo-camera is positioned to acquire images of the user's
surroundings; calculating the location of features in a user's eye
that cannot be directly observed from images of the eye acquired by
an endo-camera; and spatially transforming camera coordinate
systems of the exo- and endo-cameras to place calculated eye
features in a known location and alignment.
Description
RELATED APPLICATION DATA
[0001] This application claims benefit of co-pending provisional
applications Ser. Nos. 61/734,354, 61/734,294, and 61/734,342, all
filed Dec. 6, 2012. This application is also related to
applications Ser. Nos. 12/715,177, filed Mar. 1, 2010, 13/290,948,
filed Nov. 7, 2011, and U.S. Pat. No. 6,541,081. The entire
disclosures of these references are expressly incorporated by
reference herein.
FIELD OF THE INVENTION
[0003] The present invention relates generally to systems and
methods for eye tracking that are implemented for gaze
determination, e.g., determining locations in space or object(s)
being viewed by one or both eyes. In particular, the
gaze-determination systems and methods herein may enable
point-of-gaze determination in a wearable device without the need
for head-tracking after calibration.
BACKGROUND
[0004] The systems and methods herein relate to gaze tracking
using a wearable eye-tracking device that utilizes head-pose
estimation to improve gaze accuracy. The use of head-tracking
allows the system to know the user's head position in relation to
the monitor. This enables the user to accurately interact with an
electronic display or other monitor (e.g., control a pointer) using
his/her gaze.
[0005] Many wearable eye-tracking devices do not include head pose
estimation. However, minor shifts in head pose can introduce
ambiguity in eye trackers that use only the eye's visual axis when
determining the gaze vector. Knowledge of the head pose can extend
the range of accuracy of a gaze-tracking system.
SUMMARY
[0006] The present invention is directed to systems and methods for
eye tracking that are implemented for gaze determination, e.g.,
determining locations in space or object(s) being viewed by one or
both eyes. In particular, the gaze-determination systems and
methods herein may enable point-of-gaze determination in a wearable
device without the need for head-tracking after calibration.
[0007] In accordance with an exemplary embodiment, a method is
provided for eye tracking that includes one or more steps, such as
calibrating a wearable device before the wearable device is worn by
a user; placing the wearable device on a user's head adjacent one
or both of the user's eyes; calibrating the wearable device after
placing the wearable device on the user's head; detecting at least
one eye feature of a first eye of the user's eyes; performing a
compensation algorithm; and calculating a gaze direction of the
user.
[0008] In accordance with another embodiment, a system is provided
for eye tracking that includes a wearable device configured to be
worn on a user's head; an exo-camera on the wearable device
configured to provide images of a user's surroundings when the
wearable device is worn by the user; an endo-camera on the wearable
device configured to provide images of a first eye of the user when
the wearable device is worn by the user; and one or more processors
configured for one or more of calibrating the wearable device
before the wearable device is worn by a user; calibrating the
wearable device after placing the wearable device on the user's
head; detecting at least one eye feature of a first eye of the
user's eyes; performing a compensation algorithm; and calculating a
gaze direction of the user.
[0009] In accordance with still another embodiment, a method is
provided for compensating for movement of a wearable eye tracking
device relative to a user's eye that includes wearing a wearable
device on a user's head such that one or more endo-cameras are
positioned to acquire images of one or both of the user's eyes, and
an exo-camera is positioned to acquire images of the user's
surroundings; calculating the location of features in a user's eye
that cannot be directly observed from images of the eye acquired by
an endo-camera; and spatially transforming camera coordinate
systems of the exo- and endo-cameras to place calculated eye
features in a known location and alignment.
[0010] Other aspects and features of the present invention will
become apparent from consideration of the following description
taken in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] The invention is best understood from the following detailed
description when read in conjunction with the accompanying
drawings. It will be appreciated that the exemplary apparatus shown
in the drawings is not necessarily drawn to scale, with emphasis
instead being placed on illustrating the various aspects and
features of the illustrated embodiments.
[0012] FIGS. 1A and 1B are perspective and back views,
respectively, of an exemplary embodiment of a wearable gaze
tracking device.
[0013] FIG. 2 is a flowchart showing an exemplary method for gaze
tracking using a wearable device, such as that shown in FIGS. 1A
and 1B.
[0014] FIG. 3 is a flowchart showing an exemplary method for gaze
mapping that may be included in the method shown in FIG. 2.
[0015] FIG. 4 is a flowchart showing an exemplary method for pupil
detection that may be included in the method shown in FIG. 2.
[0016] FIGS. 5 and 6 are schematic representations showing a
projected pupil point on a virtual plane after normalization and
denormalization using a method, such as that shown in FIG. 2.
DETAILED DESCRIPTION OF THE EXEMPLARY EMBODIMENTS
[0017] The present invention may provide apparatus, systems, and
methods for head tracking and gaze tracking that include one or
more of the following features:
[0018] gaze tracking in a system that allows unrestricted movement of the head;
[0019] gaze tracking in a system that is robust to small shifts in frame position relative to the face for a given user;
[0020] gaze point registration with the scene image, with or without head tracking; and
[0021] the storage of the user's calibration data for use with a single headset at a later time.
[0022] One of the hurdles to accurate gaze-mapping in a mobile
wearable eye-tracking device is finding a user-friendly method to
determine head pose information. In many cases, a user is
comfortable with a short user-specific calibration. The main
advantage of the gaze determination method disclosed herein is that
point-of-regard may be maintained with or without head tracking
after calibration. This is accomplished by estimating the point in
space where the user is looking and projecting it onto the scene
image. This allows for gaze determination in a wide variety of
environments, not restricted to a computer desk.
[0023] Turning to the drawings, FIGS. 1A and 1B show an exemplary
embodiment of a wearable gaze-tracking device 10 that includes a
wearable device 12, e.g., a frame for glasses (as shown), or a
mask, a headset, a helmet, and the like that is configured to be
worn on a user's head (not shown); an exo-camera 20 (mounted on the
device to image the user's surroundings); and one or more endo-cameras
30 (mounted on the device to image one or both of the user's
eyes). In addition, the device 10 may include one or more light
sources, processors, memory, and the like (not shown) coupled to
other components for operating the device 10 and/or performing the
various functions described herein. Exemplary components, e.g.,
wearable devices, cameras, light sources, processors, communication
interfaces, and the like, that may be included in the device 10 are
disclosed in U.S. Pat. Nos. 6,541,081 and 7,488,294, and U.S.
Publication Nos. 2011/0211056 and 2013/0114850, the entire
disclosures of which are expressly incorporated by reference
herein.
[0024] Turning to FIG. 2, an exemplary method for gaze mapping and
determination is shown. Although the steps are shown in an exemplary sequence, the steps may optionally be performed in a different order than that shown. Generally, the method includes a calibration step 110 in which the wearable device (e.g., device 10) is calibrated, a marker detection step 112, a pupil detection step 114, a glint detection step 116, a normalization step 118, a user calibration step 120, a gaze mapping step 122, and a three-dimensional (3D) point-of-regard (POR) step 124. Step 112, head pose estimation, typically operates substantially continuously. Once the user has placed the device upon their head or face, gaze determination (steps 114-124), including user calibration step 120, generally begins with i) pupil detection 114 and ii) glint location (identifying glints reflected off of one or both eyes in images acquired by the endo-camera(s) 30), where i) and ii) may also be performed in reverse order (glint detection before pupil detection). The camera-to-camera calibration step 110 (calibrating the endo-camera(s) 30 and exo-camera 20) is generally performed before the user places the wearable device on their face, e.g., as described below.
[0025] Illumination Source Calibration Step: The first step in
calibrating the glint locations in endo-camera images with the
light source locations on the wearable device is to acquire a set
of perspective images with a secondary reflective surface and light
source(s). For example, images of a mirror placed near the working
distance of the camera may be acquired, where the mirror's edges
are surrounded by LEDs and the mirror is placed in front of the
camera such that the glint-LEDs may be seen in the image. The
second step is to use a software program to mark and extract the
positions of the light sources surrounding the mirror and the
reflections in the mirror of the glint-generating light sources on
the wearable device. The next step is to determine the homography
between the image and the plane of the reflective surface. The
aforementioned homography is subsequently applied to the glint
light source in the image plane to get the three-dimensional (3D)
point corresponding to the light source on the reflective surface.
With the 3D locations in space, the ray originating at the light
source that generated the glint on the reflective surface may be
determined. These steps are repeated for each of the perspective
images. The intersection of the calculated ray vectors is
determined for each glint source for each perspective image
acquired.
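By way of illustration (not part of the original disclosure), the homography step above might be sketched as follows; the coordinates, units, and the choice of OpenCV routines are assumptions of this sketch:

```python
# Hypothetical sketch of the homography step; all values are illustrative.
import numpy as np
import cv2

# Marked positions of the mirror-edge light sources in the image (pixels).
image_pts = np.array([[102, 88], [531, 80], [540, 410], [95, 418]], dtype=np.float32)
# Known positions of the same light sources on the mirror plane (mm).
plane_pts = np.array([[0, 0], [120, 0], [120, 90], [0, 90]], dtype=np.float32)

# Homography between the image and the plane of the reflective surface.
H, _ = cv2.findHomography(image_pts, plane_pts)

# Apply the homography to a glint seen in the image plane to recover the
# corresponding point on the reflective surface; with the mirror pose
# known, this yields a 3D point and hence a ray back toward the
# generating light source.
glint_img = np.array([[[330.0, 245.0]]], dtype=np.float32)
glint_on_mirror = cv2.perspectiveTransform(glint_img, H)
print(glint_on_mirror)  # point on the mirror plane, in mm
```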
[0026] Camera-Camera Calibration Step: Two standard checkerboards,
and/or other known geometric patterns, are positioned such that one
pattern substantially fills the field of view of each of the
exo-camera 20 and the endo-camera(s) 30, e.g., positioned at a near
optimal working distance of the respective camera, i.e., the object
is at near best focus. The position of the checkerboards remains
substantially fixed during camera-to-camera calibration. The
wearable device is moved between the patterns, while several sets
of images with the patterns in full view are acquired from both the
endo-camera(s) 30 and exo-camera 20 (eye and scene camera,
respectively). Each set of images yields a set of 3 equations.
Multiple sets of images yield an overdetermined matrix of the form
Ax=B. The matrix equation may be solved with SVD to get the
camera-to-camera transformation.
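The over-determined system Ax=B can be solved in the standard way; a minimal numerical sketch follows (the matrix sizes and placeholder data are assumptions, not values from the disclosure):

```python
# Minimal sketch: solving the overdetermined system Ax = B by SVD.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 12))  # placeholder: 10 image sets x 3 equations each
B = rng.standard_normal(30)

# Least-squares solution via the pseudo-inverse built from the SVD.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
x = Vt.T @ ((U.T @ B) / s)

# np.linalg.lstsq(A, B, rcond=None) yields the same least-squares solution.
```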
[0027] In addition, the calibration step 110 may then include a
User-Specific Calibration. In this step, codes displayed on the
monitor in the exo-camera images are registered with an established
monitor plane. This provides an estimate of head-pose at each
calibration and test point in the user's calibration session. The codes may take the form of a variety of patterns comprising contrasting geometric features. The patterns may be displayed on the monitor, constructed of other materials and attached to the monitor, arranged as a series of light sources in a pattern around the monitor, and the like. Additionally, head pose may be estimated using an accelerometer, MEMS device, or other orientation sensor. In the past, accelerometers were bulky, but their overall footprint has been significantly reduced with the incorporation of MEMS technology.
[0028] In an exemplary embodiment, user-specific calibration may be
performed with mapping techniques, wherein mapping refers to a
mathematical function. The function takes raw data as its variable
and evaluates to calibrated points. For example, a polynomial fit
is applied to an entire space, and an output value for any point
within that space is determined by the function.
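As a hypothetical sketch of such a mapping, a second-order bivariate polynomial could be fitted by least squares; the polynomial order and feature set are assumptions, since the disclosure does not specify them:

```python
# Illustrative polynomial mapping from raw pupil coordinates to calibrated points.
import numpy as np

def design_matrix(x, y):
    # Second-order bivariate polynomial basis (assumed order).
    return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])

def fit_mapping(raw_xy, screen_xy):
    # Least-squares fit of one polynomial per output coordinate.
    A = design_matrix(raw_xy[:, 0], raw_xy[:, 1])
    coeffs, *_ = np.linalg.lstsq(A, screen_xy, rcond=None)
    return coeffs  # shape (6, 2)

def apply_mapping(coeffs, raw_xy):
    # Evaluate the fitted function at any point within the space.
    return design_matrix(raw_xy[:, 0], raw_xy[:, 1]) @ coeffs
```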
[0029] In another exemplary embodiment, user-specific calibration
may be performed with interpolation. While mapping covers an entire
space of interest, interpolation is performed in a piecewise
fashion on specific subregions and localized data. For example, the
entire space may be subdivided into four subregions, and linear
fits may be applied to each of those subregions by using a weighted
average of the corner points of each region. If the number of
subregions is increased, the interpolation approaches the
polynomial fit of the prior exemplary embodiment.
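A minimal sketch of one such piecewise scheme is bilinear interpolation within a single subregion from its four calibrated corner values; the unit-square parameterization is an assumption of this sketch:

```python
# Bilinear interpolation inside one calibration subregion.
import numpy as np

def bilinear(c00, c10, c01, c11, u, v):
    """Corner values at local coordinates (0,0), (1,0), (0,1), (1,1);
    (u, v) are local coordinates within the subregion, in [0, 1]."""
    c00, c10, c01, c11 = (np.asarray(c, dtype=float) for c in (c00, c10, c01, c11))
    return ((1 - u) * (1 - v) * c00 + u * (1 - v) * c10
            + (1 - u) * v * c01 + u * v * c11)
```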
[0030] In another exemplary embodiment, user-specific calibration
may be performed with machine learning. While machine learning may
appear to behave like mathematical functions as applied to the
mapping method, machine learning techniques may internally
represent highly irregular mappings that would otherwise require
extremely complex mathematical equations like discontinuous
functions and high-order polynomials. Machine learning techniques
also make no assumptions about the types of equations they will
model, meaning that the training procedure is identical regardless
of the type of mapping it will ultimately represent. This
eliminates, among other things, the need for the author to
understand the relationship between inputs and outputs. They may
also execute very quickly, making them useful in high-performance
applications.
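For illustration only, a small regressor could stand in for the machine-learning mapping; scikit-learn, the network size, and the placeholder data are all assumptions of this sketch:

```python
# Hypothetical machine-learning calibration mapping.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
raw = rng.random((200, 2))     # placeholder raw pupil features
screen = rng.random((200, 2))  # placeholder calibrated targets

# The training procedure is identical regardless of how irregular the
# underlying raw-to-calibrated mapping is.
model = MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=2000)
model.fit(raw, screen)
predicted = model.predict(raw[:5])
```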
[0031] Head-Pose Estimation Step: In the head-pose estimation step
112 in FIG. 2, each eye image in the video sequence is first
pre-processed and has a threshold applied to acquire marker
candidates as contours. The candidates are evaluated for contour
size, roundness, and corner count. The final candidates are
extracted and matched to marker codes stored within the system
directories. The user's head pose and orientation are calculated
relative to the marker corners.
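A rough sketch of the marker-candidate screening described above, using OpenCV contours; the threshold method, size limit, and roundness cutoff are illustrative assumptions:

```python
# Illustrative marker-candidate extraction for head-pose estimation.
import math
import cv2

def marker_candidates(gray):
    # Pre-process and threshold the frame (Otsu threshold assumed).
    blur = cv2.GaussianBlur(gray, (5, 5), 0)
    _, bw = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(bw, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

    candidates = []
    for c in contours:
        area = cv2.contourArea(c)
        if area < 100:                             # contour-size screen
            continue
        peri = cv2.arcLength(c, True)
        roundness = 4 * math.pi * area / (peri * peri)
        approx = cv2.approxPolyDP(c, 0.03 * peri, True)
        if len(approx) == 4 and roundness > 0.5:   # corner-count screen
            candidates.append(approx)              # match against stored codes next
    return candidates
```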
[0032] The following steps may occur during and/or after the user
calibration step:
[0033] Pupil Detection Step: An exemplary embodiment of the pupil
detection step 114 in FIG. 2 is shown in FIG. 4. One potential
method for pupil detection is to first apply a blob detector, e.g.,
MSER, to a downsized and thresholded image to identify regions
similar in features to a pupil from endo-camera images. The blob
detector may, for example, be constrained to find circularity
(e.g., eccentricity, low order moments, and the like) and stable
regions that resemble a pupil. After a suitable region of interest
is identified, an algorithm such as Dense Stage I Starburst may be
applied to find pupil edges, while ignoring glints. Finally, an
ellipse is fitted to the pupil edge, for example using methods such
as RANSAC or Hough transforms. Exemplary methods are disclosed in
Chinese Publication No. CN102831610 and U.S. Pat. No. 7,110,568,
the entire disclosures of which are expressly incorporated by
reference herein.
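A rough sketch of this pipeline using OpenCV's MSER detector and ellipse fitting; the parameters are illustrative, and the Starburst-style edge refinement is only indicated by a comment:

```python
# Illustrative pupil detection: blob detection, then ellipse fit.
import cv2

def detect_pupil(eye_gray):
    small = cv2.resize(eye_gray, None, fx=0.5, fy=0.5)  # downsized image
    mser = cv2.MSER_create()
    regions, _ = mser.detectRegions(small)

    # Keep a stable, roughly circular region that resembles a pupil.
    best = None
    for pts in regions:
        if len(pts) < 5:
            continue
        (_, _), (w, h), _ = cv2.minAreaRect(pts)
        if min(w, h) > 0 and min(w, h) / max(w, h) > 0.7:  # circularity screen
            best = pts
            break
    if best is None:
        return None

    # ... a Starburst-style radial edge search that skips glint pixels
    # would refine the pupil boundary here before the final fit ...
    return cv2.fitEllipse(best)  # least-squares ellipse fit
```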
[0034] Glint Detection Step: For the glint detection step 116 of FIG. 2, in an exemplary embodiment, an adaptive threshold is first applied to a subwindow of the full-resolution image, where the threshold value is based on the mean and median intensity of the iris. The image contrast is enhanced. Then, the glints are segmented out of the image through a combination of the threshold and edge detection. Dilation and erosion filters are applied to the segmented glints to remove noise. The contours of the glint candidates are determined. The glint candidates are then screened for predetermined parameters of actual glints, e.g., size, shape, eccentricity, and the like. The actual glints are selected from the final pool of candidates based on geometric constraints.
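An illustrative sketch of this segmentation; the threshold offset, kernel size, and screening limits are assumptions:

```python
# Illustrative glint segmentation and screening.
import cv2
import numpy as np

def detect_glints(iris_window):
    # Threshold based on the mean and median intensity of the iris
    # (the +40 offset is an assumed margin for bright glints).
    t = 0.5 * (np.mean(iris_window) + np.median(iris_window)) + 40
    _, bw = cv2.threshold(iris_window, float(t), 255, cv2.THRESH_BINARY)

    # Dilation and erosion filters to remove noise from segmented glints.
    kernel = np.ones((3, 3), np.uint8)
    bw = cv2.erode(cv2.dilate(bw, kernel), kernel)

    contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    glints = []
    for c in contours:
        area = cv2.contourArea(c)
        if not (2 <= area <= 100):                 # size screen
            continue
        (x, y), r = cv2.minEnclosingCircle(c)
        if area / (np.pi * r * r + 1e-9) > 0.5:    # shape/eccentricity screen
            glints.append((x, y))
    return glints
```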
[0035] Cornea Center Calculation Step: Next, the normalization step
118 of FIG. 2 may be performed. The locations of the light sources on the device 10 that produce the glints reflected at the anterior corneal surface, and the intrinsic parameters of the eye tracking camera, are known. The cornea is assumed to be substantially spherical. Each glint establishes a trajectory of possible cornea center positions in 3D space. Each trajectory pair
generates a 3D location on which the cornea center resides. The
corneal center coordinates are calculated using the aforementioned
information together with a default corneal radius of curvature
that matches the population average.
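As a simplified numerical sketch of combining such trajectories, the cornea center may be estimated as the point closest, in a least-squares sense, to all candidate rays; this formulation is a simplification for illustration, not the exact derivation of the disclosure:

```python
# Least-squares point closest to a set of 3D rays (candidate trajectories).
import numpy as np

def closest_point_to_rays(origins, directions):
    """origins, directions: (N, 3) arrays defining the candidate rays."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, directions):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)  # projector orthogonal to the ray
        A += P
        b += P @ o
    return np.linalg.solve(A, b)        # estimated cornea center
```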
[0036] Once the gaze vector in the endo-camera coordinate system is
obtained, it may be mapped either to a point-of-regard (POR) overlay, or to the monitor plane if mouse or pointer control is required. In the case of two-dimensional (2D) POR and pointer control, head pose continues to be calculated. In either scenario, accurate gaze determination with unrestricted head movement may be accomplished through proper normalization and denormalization of the endo- (toward the eye) and exo- (outward-looking) camera spaces relative to a virtual plane. The image pupil point is projected onto the virtual plane (the mapping between the endo-, exo-, and virtual coordinate spaces is determined during calibration). Then the gaze point is found by intersecting the line formed by the cornea and the virtual-plane point with the monitor.
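The final step is ordinary line-plane geometry; a minimal sketch follows, with all coordinates purely illustrative:

```python
# Intersection of the cornea-to-virtual-plane-point line with the monitor plane.
import numpy as np

def line_plane_intersection(p0, p1, plane_origin, plane_normal):
    d = p1 - p0
    denom = np.dot(plane_normal, d)
    if abs(denom) < 1e-9:
        return None  # line parallel to the plane
    t = np.dot(plane_normal, plane_origin - p0) / denom
    return p0 + t * d

gaze = line_plane_intersection(
    np.array([0.0, 0.0, 0.0]),    # cornea center (illustrative)
    np.array([10.0, 5.0, 40.0]),  # projected pupil point on the virtual plane
    np.array([0.0, 0.0, 600.0]),  # monitor-plane origin (mm)
    np.array([0.0, 0.0, 1.0]))    # monitor-plane normal
```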
[0037] Because the frames are not fixed to the person, the user
could move the frames while still looking at the same spot on the
virtual plane. A processor analyzing the endo-camera images would
detect that the center of the eye moved and project it to a
different spot on the virtual plane. To rectify this problem, the
center of the pupil is normalized. The cornea center is used as a
reference point, and in every frame it is transformed to a specific,
predetermined position. The normalized pupil position is then
determined on the shifted cornea image.
[0038] Essentially, the normalization puts the cornea in the same
position in every frame of the endo-camera images, e.g., within an
x-y-z reference frame. First, the normalization includes a rotation
that will rotate the cornea about the origin and put it on the
z-axis. This rotation is determined by restricting the rotation to
a combination of rotation around the x-axis followed by rotation
around the y-axis. The translation is determined by calculating the
required translation to move the rotated cornea to a predetermined
value on the z-axis. Because of the rotation done before
translation, the translation only contains a z value. FIG. 5 shows
how the pupil position may be found on the cornea.
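A sketch of that rotation-then-translation; the target depth on the z-axis is an arbitrary illustrative constant:

```python
# Normalization transform: rotate the cornea center onto the z-axis
# (x-rotation, then y-rotation), then translate along z only.
import numpy as np

def normalization_transform(cornea, target_z=30.0):
    c = np.asarray(cornea, dtype=float)
    ax = np.arctan2(c[1], c[2])       # rotation about x zeroing the y component
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(ax), -np.sin(ax)],
                   [0, np.sin(ax),  np.cos(ax)]])
    p = Rx @ c
    ay = np.arctan2(-p[0], p[2])      # rotation about y zeroing the x component
    Ry = np.array([[np.cos(ay), 0, np.sin(ay)],
                   [0, 1, 0],
                   [-np.sin(ay), 0, np.cos(ay)]])
    R = Ry @ Rx
    t = np.array([0.0, 0.0, target_z - (R @ c)[2]])  # z-only translation
    return R, t                        # so that R @ c + t = (0, 0, target_z)
```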
[0039] To determine the pupil position on the cornea, the pupil position on the image plane is retrieved in 3D, and then the line formed by the imaged pupil point and the origin is intersected with the non-normalized cornea. That point on the cornea is then normalized along with the cornea.
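This line-sphere intersection might be sketched as follows; the population-average corneal radius of roughly 7.8 mm is a commonly used value, assumed here for illustration:

```python
# Intersect the camera-origin-to-imaged-pupil line with the cornea sphere.
import numpy as np

def pupil_on_cornea(pupil_image_pt_3d, cornea_center, radius=7.8):
    d = pupil_image_pt_3d / np.linalg.norm(pupil_image_pt_3d)  # ray from origin
    # Solve |t*d - center|^2 = radius^2 for t.
    b = -2.0 * np.dot(d, cornea_center)
    c = np.dot(cornea_center, cornea_center) - radius**2
    disc = b * b - 4.0 * c
    if disc < 0:
        return None                   # the line misses the cornea sphere
    t = (-b - np.sqrt(disc)) / 2.0    # nearer root: front surface of the cornea
    return t * d
```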
[0040] Once the cornea and pupil are normalized on the cornea, the
next step is to determine the normalized pupil on the image plane,
e.g., at the normalization step 118 shown in FIG. 2. This is the
intersection of the line formed by the normalized pupil on the
cornea and origin with the image plane. FIG. 5 demonstrates
this.
[0041] Normalization puts the cornea in a specific position in the endo-camera coordinate system. Since the cornea does not move relative to the screen, the screen moves as well. Both are fixed in space for the instance of this frame. The cameras and virtual plane are all fixed together, as are the frames. So when normalization moves the cornea into the specific position in the endo-camera coordinate system, it is functionally the same as the cornea remaining still and the coordinate system moving. The new normalized pupil center is projected onto the virtual plane, but because the virtual plane moved with the endo coordinate system, the gaze point at this stage would be wrong. The virtual plane must now be denormalized to return it to the proper position for the gaze point, e.g., as shown in FIG. 6.
[0042] Normalization Step: FIG. 3 shows an exemplary method for
performing the normalization step 118 shown in FIG. 2. The cornea
center is rotated about the origin to lie on the z-axis in the
endo-camera coordinate system (eye camera coordinate system).
Rotation is performed about the x-axis first, then the y-axis. The
rotated cornea position is translated to a constant, predefined
position along the z-axis. The next step is to transform the pupil center data from image pixels to the image plane in units of millimeters. Now the point where the line through the endo-camera center and the pupil center on the image plane intersects the cornea may be determined. The cornea is assumed to be a sphere of a given radius centered at the normalized cornea center position. The intersection point is endo-normalized, scaled such that it lies on the image plane, and transformed back into pixels. The normalized pupil is then projected onto a virtual
plane, where the polynomial projection function is user-dependent
and generated during user calibration. The display origin and
normal vector are transformed to the exo-camera coordinate system
(scene camera coordinate system). The next step is to transform the
cornea center to exo-camera coordinates, followed by transforming
the endo-normalization into the exo-camera coordinate system to
obtain the exo-normalization transformation. The inverse of the
exo-normalization transformation is applied to the projected
normalized pupil point in the exo-camera coordinate system, e.g.,
as shown in FIG. 6. The intersection of the line (exo-cornea and
de-normalized projected normalized pupil) with the exo-screen plane
is determined. The final step is to transform the result of that intersection to the screen coordinate system of the monitor, and then to pixels, to obtain the gaze point on the monitor.
[0043] For practical implementation, a mobile gaze-determination
system must be robust to small shifts in frame position relative to
the face for a given user in addition to accommodating unrestricted
head movement. Both conditions may be accomplished through proper normalization of the endo- (toward the eye) and exo- (outward-looking) spaces relative to the viewing plane.
[0044] For 3D POR, gaze point is determined by convergence of the
left and right eye gaze vectors. The information may then be
relayed to the user through the mobile device as an overlay on the
exo-camera (scene) video images.
[0045] Point-of-Regard Step: Next, at step 124 of FIG. 2, a 3D POR overlay may be generated. The left gaze line is defined by the de-normalized projected normalized pupil and the cornea in the exo-camera coordinate system for the left eye. The same procedure is applied to the right eye. The intersection (or closest point of approach) between the two lines is determined and then projected onto the exo-camera images.
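Since two 3D gaze lines rarely intersect exactly, the closest point of approach may be computed; the following sketch returns the midpoint of the mutual perpendicular (a standard formulation, used here for illustration):

```python
# Closest point of approach between the left and right gaze lines.
import numpy as np

def gaze_convergence(p1, d1, p2, d2):
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    a = np.dot(d1, d2)
    b = np.dot(d1, p2 - p1)
    e = np.dot(d2, p2 - p1)
    denom = 1.0 - a * a
    if abs(denom) < 1e-9:
        return (p1 + p2) / 2.0        # parallel gaze lines: fall back to midpoint
    t1 = (b - a * e) / denom
    t2 = (a * b - e) / denom
    return ((p1 + t1 * d1) + (p2 + t2 * d2)) / 2.0  # 3D point of regard
```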
[0046] When the point of gaze data is integrated into a more
elaborate user interface with cursor control, eye movements may be
used interchangeably with other input devices, e.g., that utilize
hands, feet, and/or other body movements to direct computer and
other control applications.
[0047] It will be appreciated that elements or components shown
with any embodiment herein are exemplary for the specific
embodiment and may be used on or in combination with other
embodiments disclosed herein.
[0048] While the invention is susceptible to various modifications
and alternative forms, specific examples thereof have been shown in
the drawings and are herein described in detail. It should be
understood, however, that the invention is not to be limited to the
particular forms or methods disclosed, but to the contrary, the
invention is to cover all modifications, equivalents and
alternatives falling within the scope of the appended claims.
* * * * *