U.S. patent application number 11/141640, for a binaural horizontal perspective display, was filed with the patent office on 2005-05-31 and published on 2005-12-22.
Invention is credited to Nancy L. Clemens and Michael A. Vesely.
Application Number | 20050281411 11/141640 |
Document ID | / |
Family ID | 35462954 |
Filed Date | 2005-05-31 |
United States Patent Application | 20050281411 |
Kind Code | A1 |
Vesely, Michael A. ; et al. | December 22, 2005 |
Binaural horizontal perspective display
Abstract
The present invention discloses a three dimensional display system
comprising a three dimensional horizontal perspective display and a
3-D audio system, such as binaural simulation, to lend realism to
the three dimensional display. The three dimensional display system
can further comprise a second display, together with a curvilinear
blending display section to merge the various images. The
multi-plane display surface can accommodate the viewer by adjusting
the various images and 3-D sound according to the viewer's eyepoint
and earpoint locations.
Inventors: | Vesely, Michael A. (Santa Cruz, CA); Clemens, Nancy L. (Santa Cruz, CA) |
Correspondence Address: | Tue Nguyen, 496 Olive Ave., Fremont, CA 94539, US |
Family ID: | 35462954 |
Appl. No.: | 11/141640 |
Filed: | May 31, 2005 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
60576187 | Jun 1, 2004 | |
60576189 | Jun 1, 2004 | |
60576182 | Jun 1, 2004 | |
60576181 | Jun 1, 2004 | |
Current U.S. Class: | 381/61; 359/445; 381/17; 463/35 |
Current CPC Class: | H04N 13/383 20180501; G02B 30/56 20200101; G06T 3/4038 20130101; H04R 2499/15 20130101; H04N 13/395 20180501; G02B 30/50 20200101; G06F 3/011 20130101; H04S 7/302 20130101; H04S 7/304 20130101; H04N 13/363 20180501; H04S 7/30 20130101; G06T 15/20 20130101; G02B 30/40 20200101; G06F 3/0346 20130101; H04S 7/303 20130101; G06F 3/04815 20130101; H04N 13/361 20180501; H04N 13/279 20180501; H04N 13/356 20180501; H04S 2420/01 20130101 |
Class at Publication: | 381/061; 381/017; 359/445; 463/035 |
International Class: | H03G 003/00; H04R 005/00 |
Claims
What is claimed is:
1. A horizontal perspective display system comprising a first real
time display to display horizontal perspective images according to
a predetermined projection eyepoint; and a 3-D audio simulation
system providing 3-D sound to a predetermined projection earpoint,
the 3-D sound corresponding to the horizontal perspective
images.
2. A horizontal perspective display system comprising a first real
time display to display horizontal perspective images according to
a predetermined projection eyepoint; an eyepoint input device for
accepting an input eyepoint location wherein the displayed images
can be adjusted using the input eyepoint as the projection
eyepoint; and a 3-D audio simulation system providing 3-D sound to
a predetermined projection earpoint, the 3-D sound corresponding to
the horizontal perspective images.
3. A display system as in claim 2 wherein the eyepoint input device
further functions as an earpoint input device for accepting an
input earpoint location wherein the 3-D sound can be adjusted using
the input earpoint as the projection earpoint.
4. A display system as in claim 2 wherein the 3-D audio simulation
system comprises two sound channels and a HRTF (head related
transfer function) filter.
5. A display system as in claim 2 wherein the 3-D audio simulation
system comprises a 3-D loudspeaker audio system or a 3-D headphone
audio system.
6. A display system as in claim 2 further comprising a computer
system for receiving the input eyepoint location from the eyepoint
input device, calculating the horizontal perspective projection
images according to the input eyepoint location, and outputting the
images to the display wherein the displayed images are adjusted in
real time using the input eyepoint as the projection eyepoint.
7. A display system as in claim 3 further comprising a computer
system for receiving the input eyepoint location from the eyepoint
input device, calculating the horizontal perspective projection
images according to the input eyepoint location, and outputting the
images to the display wherein the displayed images are adjusted in
real time using the input eyepoint as the projection eyepoint; and
calculating the 3-D sound according to the input eyepoint location,
and outputting the 3-D sound wherein the 3-D sound is adjusted in
real time using the input eyepoint as the projection earpoint.
8. A display system as in claim 2 wherein the eyepoint input device
is a manual input device whereby the eyepoint input location is
manually entered.
9. A display system as in claim 3 wherein the eyepoint input device
is a manual input device whereby the eyepoint and earpoint input
locations are manually entered.
10. A display system as in claim 8 or 9 wherein the manual input
device is a computer peripheral or a wireless computer
peripheral.
11. A display system as in claim 8 or 9 wherein the manual input
device is selected from a group consisting of a keyboard, a stylus,
a keypad, a computer mouse, a computer trackball, a tablet, and a
pointing device.
12. A display system as in claim 2 wherein the eyepoint input
device is an automatic input device whereby the automatic input
device automatically extracts the eyepoint location from the
viewer.
13. A display system as in claim 3 wherein the eyepoint input
device is an automatic input device whereby the automatic input
device automatically extracts the eyepoint and earpoint locations
from the viewer.
14. A display system as in claim 12 or 13 wherein the automatic
input device is selected from a group consisting of a
radio-frequency tracking device, an infrared tracking device, and a
camera tracking device.
15. A display system as in claim 2 further comprising an image
input device for accepting an image command; wherein the computer
system further accepts an image command from the image input
device, calculating a horizontal perspective projection image
according to the image command using the input eyepoint location as
the projection eyepoint before outputting the image to the
display.
16. A display system as in claim 15 wherein the image command
includes image magnification, image movement, image rotation
command and command to display another predetermined image.
17. A display system as in claim 2 further comprising a second
display positioned at an angle to the first display.
18. A display system as in claim 17 further comprising a third
curvilinear display blending the first and the second displays.
19. A horizontal perspective display system comprising a first real
time display to display horizontal perspective images according to
a predetermined projection eyepoint; a 3-D audio simulation system
providing 3-D sound to a predetermined projection earpoint, the 3-D
sound corresponding to the horizontal perspective images; and an
earpoint input device for accepting an input earpoint location
wherein the 3-D sound can be adjusted using the input earpoint as
the projection earpoint.
20. A display system as in claim 19 wherein the earpoint input
device further functions as an eyepoint input device for accepting
an input eyepoint location wherein the display images can be
adjusted using the input eyepoint as the projection eyepoint.
Description
[0001] This application claims priority from U.S. provisional
applications Ser. No. 60/576,187 filed Jun. 1, 2004, entitled
"Multi plane horizontal perspective display"; Ser. No. 60/576,189
filed Jun. 1, 2004, entitled "Multi plane horizontal perspective
hand on simulator"; Ser. No. 60/576,182 filed Jun. 1, 2004,
entitled "Binaural horizontal perspective display"; and Ser. No.
60/576,181 filed Jun. 1, 2004, entitled "Binaural horizontal
perspective hand on simulator" which are incorporated herein by
reference. This application is related to co-pending applications
Ser. No. 11/098,681 filed Apr. 4, 2005, entitled "Horizontal
projection display"; Ser. No. 11/098,685 filed Apr. 4, 2005,
entitled "Horizontal projection display", Ser. No. 11/098,667 filed
Apr. 4, 2005, entitled "Horizontal projection hands-on simulator";
Ser. No. 11/098,682 filed Apr. 4, 2005, entitled "Horizontal
projection hands-on simulator"; "Multi plane horizontal perspective
display" filed May 27, 2005; "Multi plane horizontal perspective
hand on simulator" filed May 27, 2005; "Binaural horizontal
perspective display" filed May 27, 2005; and "Binaural horizontal
perspective hand on simulator" filed May 27, 2005.
FIELD OF INVENTION
[0002] This invention relates to a three-dimensional display
system, and in particular, to a multiple view display system.
BACKGROUND OF THE INVENTION
[0003] Ever since humans began to communicate through pictures,
they faced a dilemma of how to accurately represent the
three-dimensional world they lived in. Sculpture was used to
successfully depict three-dimensional objects, but was not adequate
to communicate spatial relationships between objects and within
environments. To do this, early humans attempted to "flatten" what
they saw around them onto two-dimensional, vertical planes (e.g.
paintings, drawings, tapestries, etc.). Scenes where a person stood
upright, surrounded by trees, were rendered relatively successfully
on a vertical plane. But how could they represent a landscape,
where the ground extended out horizontally from where the artist
was standing, as far as the eye could see?
[0004] The answer is three dimensional illusions. The two
dimensional pictures must provide a number of cues of the third
dimension to the brain to create the illusion of three dimensional
images. This effect of third dimension cues is realistically
achievable because the brain is quite accustomed to it. The three
dimensional real world is always already converted into a two
dimensional (e.g. height and width) projected image at the retina,
a concave surface at the back of the eye. From this two dimensional
image, the brain, through experience and perception, generates the
depth information to form the three dimensional visual image from
two types of depth cues: monocular (one eye perception) and
binocular (two eye perception). In general, binocular depth cues
are innate and biological while monocular depth cues are learned
and environmental.
[0005] The major binocular depth cues are convergence and retinal
disparity. The brain measures the amount of convergence of the eyes
to provide a rough estimate of the distance since the angle between
the line of sight of each eye is larger when an object is closer.
The disparity of the retinal images due to the separation of the
two eyes is used to create the perception of depth. The effect is
called stereoscopy where each eye receives a slightly different
view of a scene, and the brain fuses them together using these
differences to determine the ratio of distances between nearby
objects.
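As an illustration of the convergence cue described above (not part
of the original disclosure), the following Python sketch computes
the angle between the two lines of sight for an object straight
ahead of the viewer. The function name and the 0.065 m default eye
separation are assumptions chosen for the example; the formula is
ordinary symmetric-fixation geometry, not something this
application specifies.

```python
import math

def convergence_angle_deg(distance_m, eye_separation_m=0.065):
    """Angle (degrees) between the two lines of sight when both eyes
    fixate a point straight ahead at `distance_m`.

    Assumes the point lies on the midline between the eyes; the
    0.065 m default separation is an illustrative value only.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    half_angle = math.atan(eye_separation_m / (2.0 * distance_m))
    return math.degrees(2.0 * half_angle)

# The angle shrinks as the object moves away, which is why
# convergence gives only a rough distance estimate for far objects.
for d in (0.25, 0.5, 1.0, 4.0):
    print(f"{d:5.2f} m -> {convergence_angle_deg(d):6.2f} deg")
```

The printed angles decrease monotonically with distance, matching
the statement that the angle is larger when an object is closer.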
[0006] Binocular cues provide a very powerful perception of depth.
However, there are also depth cues that work with only one eye,
called monocular depth cues, which create an impression of depth on
a flat image. The major monocular cues are: overlapping, relative
size, linear perspective, and light and shadow. When an object is
viewed partially covered, this pattern of blocking is used as a cue
to determine that the object is farther away. When two objects are
known to be the same size and one appears smaller than the other,
this pattern of relative size is used as a cue to assume that the
smaller object is farther away. The cue of relative size also
provides the basis for the cue of linear perspective where the
farther away the lines are from the observer, the closer together
they will appear since parallel lines in a perspective image appear
to converge towards a single point. The light falling on an object
from a certain angle could provide the cue for the form and depth
of an object. The distribution of light and shadow on an object is
a powerful monocular cue for depth provided by the biologically
correct assumption that light comes from above.
[0007] Perspective drawing, together with relative size, is most
often used to achieve the illusion of three dimension depth and
spatial relationships on a flat (two dimension) surface, such as
paper or canvas. Through perspective, three dimension objects are
depicted on a two dimension plane, but "trick" the eye into
appearing to be in three dimension space. The first theoretical
treatise for constructing perspective, De Pictura, was published in
the early 1400's by the architect Leone Battista Alberti. Since
the introduction of his book, the details behind "general"
perspective have been very well documented. However, the fact that
there are a number of other types of perspectives is not well
known. Some examples are military, cavalier, isometric, and
dimetric, as shown at the top of FIG. 1.
[0008] Of special interest is the most common type of perspective,
called central perspective, shown at the bottom left of FIG. 1.
Central perspective, also called one-point perspective, is the
simplest kind of "genuine" perspective construction, and is often
taught in art and drafting classes for beginners. FIG. 2 further
illustrates central perspective. Using central perspective, the
chess board and chess pieces look like three dimension objects,
even though they are drawn on a two dimensional flat piece of
paper. Central perspective has a central vanishing point, and
rectangular objects are placed so their front sides are parallel to
the picture plane. The depth of the objects is perpendicular to the
picture plane. All parallel receding edges run towards a central
vanishing point. The viewer looks towards this vanishing point with
a straight view. When an architect or artist creates a drawing
using central perspective, they must use a single-eye view. That
is, the artist creating the drawing captures the image by looking
through only one eye, which is perpendicular to the drawing
surface.
[0009] The vast majority of images, including central perspective
images, are displayed, viewed and captured in a plane perpendicular
to the line of vision. Viewing the images at an angle different
from 90° would result in image distortion, meaning a square would
be seen as a rectangle when the viewing surface is not
perpendicular to the line of vision. However, there is a little
known class of images, which we call "horizontal perspective",
where the image appears distorted when viewed head on, but displays
a three dimensional illusion when viewed from the correct viewing
position. In horizontal perspective, the angle between the viewing
surface and the line of vision is preferably 45° but can be almost
any angle, and the viewing surface is preferably horizontal (hence
the name "horizontal perspective"), but it can be any surface, as
long as the line of vision forms a non-perpendicular angle to it.
[0010] Horizontal perspective images offer a realistic three
dimensional illusion, but are little known primarily due to the
narrow viewing location (the viewer's eyepoint has to coincide
precisely with the image projection eyepoint), and the complexity
involved in projecting the two dimensional image or the three
dimension model into the horizontal perspective image.
[0011] The generation of horizontal perspective images requires
considerably more expertise than that of conventional
perpendicular images. The conventional perpendicular images can be
produced directly from the viewer or camera point. One need simply
open one's eyes or point the camera in any direction to obtain the
images. Further, with much experience in viewing three dimensional
depth cues from perpendicular images, viewers can tolerate a
significant amount of distortion generated by deviations from
the camera point. In contrast, the creation of a horizontal
perspective image requires much manipulation. A conventional
camera, by projecting the image into the plane perpendicular to the
line of sight, would not produce a horizontal perspective image.
Making a horizontal drawing requires much effort and is very time
consuming. Further, since humans have limited experience with
horizontal perspective images, the viewer's eye must be positioned
precisely where the projection eyepoint is to avoid image
distortion. Therefore horizontal perspective, with its
difficulties, has received little attention.
[0012] For realistic three dimensional display, binaural or three
dimensional audio simulation is also needed.
SUMMARY OF THE INVENTION
[0013] The present invention recognizes that the personal computer
is perfectly suitable for horizontal perspective display. It is
personal, thus it is designed for the operation of one person, and
the computer, with its powerful microprocessor, is well capable of
rendering various horizontal perspective images to the viewer.
[0014] Thus the present invention discloses a three dimension
display system comprising at least a display surface displaying
three dimensional horizontal perspective images. The
other display surfaces can display two dimensional images, or
preferably three dimensional central perspective images. Further,
the display surfaces can have a curvilinear blending display
section to merge the various images. The multi-plane display system
can comprise various camera eyepoints, one for the horizontal
perspective images, one for the central perspective images, and
optionally one for the curvilinear blending display surface. The
multi-plane display surface can further adjust the various images
to accommodate the position of the viewer. By changing the
displayed images to keep the camera eyepoints of the horizontal
perspective and central perspective images in the same position as
the viewer's eye point, the viewer's eye is always positioned at
the proper viewing position to perceive the three dimensional
illusion, thus minimizing viewer's discomfort and distortion. The
display can accept manual input such as a computer mouse,
trackball, joystick, tablet, etc. to re-position the horizontal
perspective images. The display can also automatically re-position
the images based on an input device automatically providing the
viewer's viewpoint location.
[0015] Further, the display also includes three dimensional
audio such as binaural simulation to lend realism to the three
dimensional display.
BRIEF DESCRIPTION OF THE DRAWINGS
[0016] FIG. 1 shows the various perspective drawings.
[0017] FIG. 2 shows a typical central perspective drawing.
[0018] FIG. 3 shows the comparison of central perspective (Image A)
and horizontal perspective (Image B).
[0019] FIG. 4 shows the central perspective drawing of three
stacking blocks.
[0020] FIG. 5 shows the horizontal perspective drawing of three
stacking blocks.
[0021] FIG. 6 shows the method of drawing a horizontal perspective
drawing.
[0022] FIG. 7 shows a horizontal perspective display and a viewer
input device.
[0023] FIG. 8 shows a horizontal perspective display, a
computational device and a viewer input device.
[0024] FIG. 9 shows mapping of the 3-d object onto the horizontal
plane.
[0025] FIG. 10 shows the projection of 3-d object by horizontal
perspective.
[0026] FIG. 11 shows the simulation time of the horizontal
perspective.
[0027] FIG. 12 shows an embodiment of the present invention
multi-plane display.
[0028] FIG. 13 shows the horizontal perspective and central
perspective projection on the present invention multi-plane
display.
DETAILED DESCRIPTION OF THE INVENTION
[0029] The present invention discloses a multi-plane display system
comprising at least two display surfaces, one of which is capable
of projecting a three dimensional illusion based on horizontal
perspective projection.
[0030] In general, the present invention multi-plane display system
can be used to display three dimensional images and has obvious
utility to many industrial applications such as manufacturing
design reviews, ergonomic simulation, safety and training, video
games, cinematography, scientific 3D viewing, and medical and other
data displays.
[0031] Horizontal perspective is a little-known perspective, of
which we found only two books that describe its mechanics:
Stereoscopic Drawing (©1990) and How to Make Anaglyphs
(©1979, out of print). Although these books describe this
obscure perspective, they do not agree on its name. The first book
refers to it as a "free-standing anaglyph," and the second, a
"phantogram." Another publication called it "projective anaglyph"
(U.S. Pat. No. 5,795,154 by G. M. Woods, Aug. 18, 1998). Since
there is no agreed-upon name, we have taken the liberty of calling
it "horizontal perspective." Normally, as in central perspective,
the plane of vision, at right angle to the line of sight, is also
the projected plane of the picture, and depth cues are used to give
the illusion of depth to this flat image. In horizontal
perspective, the plane of vision remains the same, but the
projected image is not on this plane. It is on a plane angled to
the plane of vision. Typically, the image would be on the ground
level surface. This means the image will be physically in the third
dimension relative to the plane of vision. Thus horizontal
perspective can be called horizontal projection.
[0032] In horizontal perspective, the object is to separate the
image from the paper, and fuse the image to the three dimension
object that projects the horizontal perspective image. Thus the
horizontal perspective image must be distorted so that the visual
image fuses to form the free standing three dimensional figure. It
is also essential the image is viewed from the correct eye points,
otherwise the three dimensional illusion is lost. In contrast to
central perspective images which have height and width, and project
an illusion of depth, and therefore the objects are usually
abruptly projected and the images appear to be in layers, the
horizontal perspective images have actual depth and width, and
illusion gives them height, and therefore there is usually a
graduated shifting so the images appear to be continuous.
[0033] FIG. 3 compares key characteristics that differentiate
central perspective and horizontal perspective. Image A shows key
pertinent characteristics of central perspective, and Image B shows
key pertinent characteristics of horizontal perspective.
[0034] In other words, in Image A, the real-life three dimension
object (three blocks stacked slightly above each other) was drawn
by the artist closing one eye, and viewing along a line of sight
perpendicular to the vertical drawing plane. The resulting image,
when viewed vertically, straight on, and through one eye, looks the
same as the original image. In Image B, the real-life three
dimension object was drawn by the artist closing one eye, and
viewing along a line of sight 45° to the horizontal drawing
plane. The resulting image, when viewed horizontally, at 45°
and through one eye, looks the same as the original image.
[0035] One major difference between central perspective shown in
Image A and horizontal perspective shown in Image B is the
location of the display plane with respect to the projected three
dimensional image. In horizontal perspective of Image B, the
display plane can be adjusted up and down, and therefore the
projected image can be displayed in the open air above the display
plane, i.e. a physical hand can touch (or more likely pass through)
the illusion, or it can be displayed under the display plane, i.e.
one cannot touch the illusion because the display plane physically
blocks the hand. This is the nature of horizontal perspective, and
as long as the camera eyepoint and the viewer eyepoint are at the
same place, the illusion is present. In contrast, in central
perspective of Image A, the three dimensional illusion is likely to
be only inside the display plane, meaning one cannot touch it. To
bring the three dimensional illusion outside of the display plane
to allow the viewer to touch it, the central perspective would need
an elaborate display scheme such as surround image projection and
a large volume.
[0036] FIGS. 4 and 5 illustrate the visual difference between using
central and horizontal perspective. To experience this visual
difference, first look at FIG. 4, drawn with central perspective,
through one open eye. Hold the piece of paper vertically in front
of you, as you would a traditional drawing, perpendicular to your
eye. You can see that central perspective provides a good
representation of three dimension objects on a two dimension
surface.
[0037] Now look at FIG. 5, drawn using horizontal perspective, by
sitting at your desk and placing the paper lying flat
(horizontally) on the desk in front of you. Again, view the image
through only one eye. This puts your one open eye, called the eye
point, at approximately a 45° angle to the paper, which is
the angle that the artist used to make the drawing. To get your
open eye and its line-of-sight to coincide with the artist's, move
your eye downward and forward closer to the drawing, about six
inches out and down and at a 45° angle. This will result in
the ideal viewing experience where the top and middle blocks will
appear above the paper in open space.
[0038] Again, the reason your one open eye needs to be at this
precise location is because both central and horizontal perspective
not only define the angle of the line of sight from the eye point;
they also define the distance from the eye point to the drawing.
This means that FIGS. 4 and 5 are drawn with an ideal location and
direction for your open eye relative to the drawing surfaces.
However, unlike central perspective where deviations from position
and direction of the eye point create little distortion, when
viewing a horizontal perspective drawing, the use of only one eye
and the position and direction of that eye relative to the viewing
surface are essential to seeing the open space three dimension
horizontal perspective illusion.
[0039] FIG. 6 is an architectural-style illustration that
demonstrates a method for making simple geometric drawings on paper
or canvas utilizing horizontal perspective. FIG. 6 is a side view
of the same three blocks used in FIG. 5. It illustrates the actual
mechanics of horizontal perspective. Each point that makes up the
object is drawn by projecting the point onto the horizontal drawing
plane. To illustrate this, FIG. 6 shows a few of the coordinates of
the blocks being drawn on the horizontal drawing plane through
projection lines. These projection lines start at the eye point
(not shown in FIG. 6 due to scale), intersect a point on the
object, then continue in a straight line to where they intersect
the horizontal drawing plane, which is where they are physically
drawn as a single dot on the paper. When an architect repeats this
process for each and every point on the blocks, as seen from the
drawing surface to the eye point along the line-of-sight, the
horizontal perspective drawing is complete, and looks like FIG.
5.
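The projection-line construction described above can be sketched in
a few lines of Python. This is a minimal illustration under stated
assumptions, not the applicants' implementation: the drawing plane
is taken to be z = 0, the eye point sits at positive z, units are
arbitrary, and the function name project_to_horizontal_plane is
hypothetical.

```python
def project_to_horizontal_plane(eye, point):
    """Return the (x, y) dot on the plane z = 0 where the straight
    line from `eye` through `point` meets the drawing plane.

    `eye` and `point` are (x, y, z) tuples. The eye must be above
    the plane and the point must be lower than the eye, otherwise
    the projection line never reaches the plane beyond the point.
    """
    ex, ey, ez = eye
    px, py, pz = point
    if ez <= 0:
        raise ValueError("eye point must be above the drawing plane")
    if pz >= ez:
        raise ValueError("point must be lower than the eye point")
    # eye + t * (point - eye) has z == 0 when t = ez / (ez - pz)
    t = ez / (ez - pz)
    return (ex + t * (px - ex), ey + t * (py - ey))

# Eye placed so its line of sight to the origin is at 45 degrees.
eye = (0.0, -10.0, 10.0)
for label, pt in (("on plane", (0.0, 0.0, 0.0)),
                  ("above plane", (0.0, 0.0, 1.0)),
                  ("below plane", (0.0, 0.0, -1.0))):
    print(label, project_to_horizontal_plane(eye, pt))
```

A point on the plane maps to itself, a point above the plane is
drawn farther from the eye, and a point below the plane is drawn
nearer to the eye. Moving `eye` changes every drawn dot, which is
why the drawing looks correct only from the projection eye point.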
[0040] Notice that in FIG. 6, one of the three blocks appears below
the horizontal drawing plane. With horizontal perspective, points
located below the drawing surface are also drawn onto the
horizontal drawing plane, as seen from the eye point along the
line-of-sight. Therefore when the final drawing is viewed, objects
not only appear above the horizontal drawing plane, but may also
appear below it as well, giving the appearance that they are
receding into the paper. If you look again at FIG. 5, you will
notice that the bottom box appears to be below, or go into, the
paper, while the other two boxes appear above the paper in open
space.
[0041] The generation of horizontal perspective images requires
considerably more expertise than that of central perspective
images. Even though both methods seek to provide the viewer the
three dimension illusion that results from the two dimensional
image, central perspective images produce directly the three
dimensional landscape from the viewer or camera point. In contrast,
the horizontal perspective image appears distorted when viewed
head on, but this distortion has to be precisely rendered so that
when viewed at a precise location, the horizontal perspective
produces a three dimensional illusion.
[0042] The present invention multi-plane display system promotes
horizontal perspective projection viewing by providing the viewer
with the means to adjust the displayed images to maximize the
illusion viewing experience. By employing the computation power of
the microprocessor and a real time display, the horizontal
perspective display of the present invention, shown in FIG. 7,
comprises a real time electronic display 100 capable of re-drawing
the projected image, together with a viewer's input device 102 to
adjust the horizontal perspective image. By re-displaying the
horizontal perspective image so that its projection eyepoint
coincides with the eyepoint of the viewer, the horizontal
perspective display of the present invention can ensure the minimum
distortion in rendering the three dimension illusion from the
horizontal perspective method. The input device can be manually
operated where the viewer manually inputs his or her eyepoint
location, or changes the projection image eyepoint to obtain the
optimum three dimensional illusion. The input device can also be
automatically operated where the display automatically tracks the
viewer's eyepoint and adjusts the projection image accordingly. The
multi-plane display system removes the constraint that viewers
keep their heads in relatively fixed positions, a constraint
that creates much difficulty in the acceptance of displays
requiring a precise eyepoint location, such as horizontal
perspective or hologram displays.
[0043] The horizontal perspective display system, shown in FIG. 8,
can further comprise a computation device 110 in addition to the
real time electronic display device 100 and a projection image
input device 112 providing input to the computational device 110 to
calculate the projection images for display, providing a
realistic, minimum distortion three dimensional illusion to the
viewer by making the viewer's eyepoint coincide with the projection
image eyepoint. The system can further comprise an image
enlargement/reduction input device 115, or an image rotation input
device 117, or an image movement device 119 to allow the viewer to
adjust the view of the projection images.
[0044] The input device can be operated manually or automatically.
The input device can detect the position and orientation of the
viewer's eyepoint, to compute and to project the image onto the
display according to the detection result. Alternatively, the input
device can be made to detect the position and orientation of the
viewer's head along with the orientation of the eyeballs. The input
device can comprise an infrared detection system to detect the
position of the viewer's head to allow the viewer freedom of head
movement. Other embodiments of the input device can use a
triangulation method of detecting the viewer eyepoint location,
such as a CCD camera providing position data suitable for the head
tracking objectives of the invention. The input device can also be
manually operated by the viewer, such as a keyboard, mouse,
trackball, joystick, or the like, to indicate the correct display
of the horizontal perspective display images.
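The application does not detail the triangulation method, so the
following Python sketch is only one possible reading of it: a
simplified planar case in which two sensors on a common baseline
each report a bearing angle to the viewer, and the eyepoint is
taken as the intersection of the two sight lines. The sensor
layout, the angle convention (degrees, measured counterclockwise
from the baseline), and the name triangulate_2d are all assumptions
made for the example.

```python
import math

def triangulate_2d(x1, bearing1_deg, x2, bearing2_deg):
    """Intersect two sight lines from sensors at (x1, 0) and (x2, 0).

    Each bearing is the angle of the sight line measured
    counterclockwise from the positive x axis (the baseline).
    Returns the (x, y) intersection point.
    """
    a1 = math.radians(bearing1_deg)
    a2 = math.radians(bearing2_deg)
    # 2-D cross product of the two unit directions, sin(a2 - a1)
    denom = math.sin(a2 - a1)
    if abs(denom) < 1e-12:
        raise ValueError("sight lines are parallel; no unique fix")
    # Distance along the first sight line to the intersection
    s = (x2 - x1) * math.sin(a2) / denom
    return (x1 + s * math.cos(a1), s * math.sin(a1))

# Sensors 1 unit apart; a viewer midway between them and half a
# unit away is seen at 45 and 135 degrees respectively.
print(triangulate_2d(0.0, 45.0, 1.0, 135.0))
```

A practical tracker would work in three dimensions and would have
to cope with measurement noise; this sketch shows only the
geometric idea of recovering a position from two bearings.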
[0045] The head or eye-tracking system can comprise a base unit and
a head-mounted sensor on the head of the viewer. The head-mounted
sensor produces signals showing the position and orientation of the
viewer in response to the viewer's head movement and eye
orientation. These signals can be received by the base unit and are
used to compute the proper three dimensional projection images. The
head or eye tracking system can comprise infrared cameras to capture
images of the viewer's eyes. Using the captured images and other
techniques of image processing, the position and orientation of the
viewer's eyes can be determined, and then provided to the base
unit. The head and eye tracking can be done in real time at small
enough time intervals to provide continuous tracking of the
viewer's head and eyes.
[0046] The multi-plane display system comprises a number of new
computer hardware and software elements and processes, and together
with existing components creates a horizontal perspective viewing
simulator. For the viewer to experience these unique viewing
simulations, the computer hardware viewing surface is preferably
situated horizontally, such that the viewer's line of sight is at a
45° angle to the surface. Typically, this means that the
viewer is standing or seated vertically, and the viewing surface is
horizontal to the ground. Note that although the viewer can
experience hands-on simulations at viewing angles other than
45° (e.g. 55°, 30° etc.), 45° is the optimal
angle for the brain to recognize the maximum amount of spatial
information in an open space image. Therefore, for simplicity's
sake, we use "45°" throughout this document to mean "an
approximate 45 degree angle". Further, while a horizontal viewing
surface is preferred since it simulates viewers' experience with
the horizontal ground, any viewing surface could offer a similar
three dimensional illusion experience. The horizontal perspective
illusion can appear to be hanging from a ceiling by projecting the
horizontal perspective images onto a ceiling surface, or appear to
be floating from a wall by projecting the horizontal perspective
images onto a vertical wall surface.
[0047] The viewing simulations are generated within a three
dimensional graphics view volume, both situated above and below the
physical viewing surface. Mathematically, the computer-generated x,
y, z coordinates of the Angled Camera point form the vertex of an
infinite "pyramid", whose sides pass through the x, y, z
coordinates of the Reference/Horizontal Plane. FIG. 9 illustrates
this infinite pyramid, which begins at the Angled Camera point and
extends through the Far Clip Plane. The viewing volume is defined
by a Comfort Plane, a plane on top of the viewing volume, which is
appropriately named because its location within the pyramid
determines the viewer's personal comfort, i.e. how their eyes,
head, body, etc. are situated while viewing and interacting with
simulations.
[0048] For the viewer to view open space images on their physical
viewing device, the device must be positioned properly, which usually
means the physical Reference Plane is placed horizontally to the ground.
Whatever the viewing device's position relative to the ground, the
Reference/Horizontal Plane must be at approximately a 45°
angle to the viewer's line-of-sight for optimum viewing.
[0049] One way the viewer might perform this step is to position
their CRT computer monitor on the floor in a stand, so that the
Reference/Horizontal Plane is horizontal to the floor. This example
uses a CRT-type television or computer monitor, but it could be any
type of viewing device, display screen, monochrome or color
display, luminescent, TFT, phosphorescent, computer projector or
other method of image generation in general, providing a viewing
surface at approximately a 45° angle to the viewer's
line-of-sight.
[0050] The display needs to know the viewer's eyepoint to properly
display the horizontal perspective images. One way to do this is
for the viewer to supply the horizontal perspective display with
their eye's real-world x, y, z location and line-of-sight
information relative to the center of the physical
Reference/Horizontal Plane. For example, the viewer tells the
horizontal perspective display that their physical eye will be
located 12 inches up, and 12 inches back, while looking at the
center of the Reference/Horizontal Plane. The horizontal
perspective display then maps the computer-generated Angled Camera
point to the viewer's eyepoint physical coordinates and
line-of-sight. Another way is for the viewer to manually adjust
an input device such as a mouse, while the horizontal perspective
display adjusts its image projection eyepoint until the proper
eyepoint location is experienced by the viewer. A third way is
to use triangulation with an infrared device or camera to
automatically locate the viewer's eyes.
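The example eyepoint in the preceding paragraph (12 inches up and 12
inches back from the center of the Reference/Horizontal Plane) can be
checked against the 45° viewing angle with a short calculation. The
following is only a minimal sketch: the function name is hypothetical,
and it assumes the eye lies directly behind the plane's center so that
the angle depends only on the "back" and "up" distances.

```python
import math

def line_of_sight_angle_deg(eye_back_in, eye_up_in):
    """Angle, in degrees, between the viewing surface and the line from
    the eye to the center of the Reference/Horizontal Plane.

    Assumes the eye is eye_back_in inches behind the center and
    eye_up_in inches above the surface (an assumed convention)."""
    return math.degrees(math.atan2(eye_up_in, eye_back_in))

# Eyepoint from the example: 12 inches up and 12 inches back.
angle = line_of_sight_angle_deg(12.0, 12.0)
print(round(angle, 1))  # 45.0
```

Equal "up" and "back" distances always give a 45° line of sight under
these assumptions, whatever the actual distance to the display.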
[0051] FIG. 10 is an illustration of the horizontal perspective
display that includes all of the new computer-generated and real
physical elements as described in the steps above. It also shows
that a real-world element and its computer-generated equivalent are
mapped 1:1 and together share a common Reference Plane. The full
implementation of this horizontal perspective display results in
real-time computer-generated three dimensional graphics appearing
in open space on and above a viewing device's surface, which is
oriented approximately 45° to the viewer's
line-of-sight.
[0052] The present invention also allows the viewer to move around
the three dimensional display and yet suffer no great distortion
since the display can track the viewer eyepoint and re-display the
images correspondingly, in contrast to the conventional prior art
three dimensional image display, where the image would be computed and
projected as seen from a single viewing point, and thus any
movement by the viewer away from the intended viewing point in
space would cause gross distortion.
[0053] The display system can further comprise a computer capable
of recalculating the projected image given the movement of the
eyepoint location. The horizontal perspective images can be very
complex, tedious to create, or created in ways that are not natural
for artists or cameras, and therefore require the use of a computer
system for the tasks. To display a three-dimensional image of an
object with complex surfaces or to create an animation sequence
would demand a lot of computational power and time, and therefore
it is a task well suited to the computer. Three dimensional capable
electronics and computing hardware devices and real-time
computer-generated three dimensional computer graphics have
advanced significantly recently with marked innovations in visual,
audio and tactile systems, and have produced excellent hardware
and software products to generate realism and more natural
computer-human interfaces.
[0054] The multi-plane display system of the present invention is
not only in demand for entertainment media such as televisions,
movies, and video games but is also needed in various fields
such as education (displaying three-dimensional structures) and
technological training (displaying three-dimensional equipment).
There is an increasing demand for three-dimensional image displays,
which can be viewed from various angles to enable observation of
real objects using object-like images. The horizontal perspective
display system is also capable of substituting a computer-generated
reality for the viewer's observation. The systems may include audio,
visual, motion and inputs from the user in order to create a
complete experience of three dimensional illusion.
[0055] The input for the horizontal perspective system can be a two
dimensional image, several images combined to form one single three
dimensional image, or a three dimensional model. The three
dimensional image or model conveys much more information than
a two dimensional image, and by changing the viewing angle, the viewer
will get the impression of seeing the same object from different
perspectives continuously.
[0056] The multi-plane display system can further provide multiple
views or "Multi-View" capability. Multi-View provides the viewer
with multiple and/or separate left- and right-eye views of the same
simulation. Multi-View capability is a significant visual and
interactive improvement over the single eye view. In Multi-View
mode, both the left eye and right eye images are fused by the
viewer's brain into a single, three-dimensional illusion. The
problem of the discrepancy between accommodation and convergence of
eyes, inherent in stereoscopic images, leading to the viewer's eye
fatigue with large discrepancy, can be reduced with the horizontal
perspective display, especially for motion images, since the
position of the viewer's gaze point changes when the display scene
changes.
[0057] In Multi-View mode, the objective is to simulate the actions
of the two eyes to create the perception of depth, namely that the left
eye and the right eye see slightly different images. Thus
Multi-View devices that can be used in the present invention
include methods with glasses, such as the anaglyph method, special
polarized glasses or shutter glasses, and methods without glasses,
such as a parallax stereogram, a lenticular method, and a mirror
method (concave and convex lens).
[0058] In the anaglyph method, a display image for the right eye and a
display image for the left eye are respectively
superimpose-displayed in two colors, e.g., red and blue, and
observation images for the right and left eyes are separated using
color filters, thus allowing a viewer to recognize a stereoscopic
image. The images are displayed using the horizontal perspective
technique with the viewer looking down at an angle. As with the one eye
horizontal perspective method, the eyepoint of the projected images
has to coincide with the eyepoint of the viewer, and therefore
the viewer input device is essential in allowing the viewer to
observe the three dimensional horizontal perspective illusion. Since
the early days of the anaglyph method, there have been many
improvements, such as in the spectrum of the red/blue glasses and
display, to generate much more realism and comfort for the viewers.
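The superimposition of two eye views in two colors can be illustrated
with a small sketch. It assumes, purely for illustration, that each eye
view is a grayscale image stored as a list of rows of 0-255 intensities,
that the right-eye view goes to the red channel and the left-eye view to
the blue channel, and that the green channel is left empty. The function
name is hypothetical, and which color is assigned to which eye only has
to match the filters in the glasses.

```python
def anaglyph(left_img, right_img):
    """Superimpose two same-sized grayscale eye views into one RGB image.

    Assumed convention: the right-eye view drives the red channel and
    the left-eye view drives the blue channel; green is unused."""
    return [
        [(right_px, 0, left_px)
         for left_px, right_px in zip(left_row, right_row)]
        for left_row, right_row in zip(left_img, right_img)
    ]

# Two tiny 2x2 eye views with made-up intensities.
left = [[10, 20], [30, 40]]
right = [[50, 60], [70, 80]]
combined = anaglyph(left, right)
print(combined[0][0])  # (50, 0, 10)
```

A red filter over one eye and a blue filter over the other then pass
only the channel intended for that eye.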
[0059] In the polarized glasses method, the left eye image and the
right eye image are separated by the use of mutually extinguishing
polarizing filters such as orthogonal linear polarizers, circular
polarizers, or elliptical polarizers. The images are normally projected
onto screens with polarizing filters and the viewer is then
provided with corresponding polarized glasses. The left and right
eye images appear on the screen at the same time, but only the left
eye polarized light is transmitted through the left eye lens of the
eyeglasses and only the right eye polarized light is transmitted
through the right eye lens.
[0060] Another method of stereoscopic display is the image sequential
system. In such a system, the images are displayed sequentially
between left eye and right eye images rather than superimposing
them upon one another, and the viewer's lenses are synchronized
with the screen display to allow the left eye to see only when the
left image is displayed, and the right eye to see only when the
right image is displayed. The shuttering of the glasses can be
achieved by mechanical shuttering or with liquid crystal electronic
shuttering. In the shutter glasses method, display images for the
right and left eyes are alternately displayed on a CRT in a time
sharing manner, and observation images for the right and left eyes
are separated using time sharing shutter glasses which are
opened/closed in a time sharing manner in synchronism with the
display images, thus allowing an observer to recognize a
stereoscopic image.
[0061] Another way to display stereoscopic images is by an optical
method. In this method, display images for the right and left eyes,
which are separately displayed on a viewer using optical means such
as prisms, mirrors, lenses, and the like, are superimpose-displayed as
observation images in front of an observer, thus allowing the
observer to recognize a stereoscopic image. Large convex or concave
lenses can also be used where two image projectors, projecting left
eye and right eye images, provide focus to the viewer's left
and right eye respectively. A variation of the optical method is
the lenticular method, where the images form on cylindrical lens
elements or a two dimensional array of lens elements.
[0062] FIG. 11 is a horizontal perspective display focusing on how
the computer-generated person's two eye views are projected onto
the Horizontal Plane and then displayed on a stereoscopic 3D
capable viewing device. FIG. 11 represents one complete display
time period. During this display time period, the horizontal
perspective display needs to generate two different eye views,
because in this example the stereoscopic 3D viewing device requires
a separate left- and right-eye view. There are existing
stereoscopic 3D viewing devices that require more than a separate
left- and right-eye view, and because the method described here can
generate multiple views it works for these devices as well.
[0063] The illustration in the upper left of FIG. 11 shows the
Angled Camera point for the right eye after the first (right)
eye-view to be generated. Once the first (right) eye view is
complete, the horizontal perspective display starts the process of
rendering the computer-generated person's second eye (left-eye)
view. The illustration in the lower left of FIG. 11 shows the
Angled Camera point for the left eye after the completion of this
time. But before the rendering process can begin, the horizontal
perspective display makes an adjustment to the Angled Camera point.
This is illustrated in FIG. 11 by the left eye's x coordinate being
incremented by two inches. This difference between the right eye's
x value and the left eye's x+2" is what provides the two-inch
separation between the eyes, which is required for stereoscopic 3D
viewing. The distances between people's eyes vary, but in the above
example we are using the average of 2 inches. It is also possible
for the viewer to supply the horizontal perspective display with
their personal eye separation value. This would make the x value
for the left and right eyes highly accurate for a given viewer and
thereby improve the quality of their stereoscopic 3D view.
[0064] Once the horizontal perspective display has incremented the
Angled Camera point's x coordinate by two inches, or by the
personal eye separation value supplied by the viewer, the rendering
continues by displaying the second (left-eye) view.
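The adjustment of the Angled Camera point between the two eye views can
be sketched as follows. This is an illustration only: the function name
and the example coordinates (an eye 12 inches back and 12 inches up,
with "back" written as a negative y value) are assumptions, while the
two-inch default and the increment of the x coordinate come from the
description above.

```python
def stereo_camera_points(right_eye, eye_separation_in=2.0):
    """Return the (right, left) Angled Camera points as (x, y, z) tuples.

    The left-eye point is the right-eye point with its x coordinate
    incremented by the eye separation, in inches."""
    x, y, z = right_eye
    left_eye = (x + eye_separation_in, y, z)
    return right_eye, left_eye

# Default two-inch average separation.
right, left = stereo_camera_points((0.0, -12.0, 12.0))
print(left)  # (2.0, -12.0, 12.0)

# A viewer-supplied personal eye separation value.
_, custom_left = stereo_camera_points((0.0, -12.0, 12.0), 2.5)
print(custom_left)  # (2.5, -12.0, 12.0)
```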
[0065] Depending on the stereoscopic 3D viewing device used, the
horizontal perspective display continues to display the left- and
right-eye images, as described above, until it needs to move to the
next display time period. An example of when this may occur is if
the bear cub moves his paw or any part of his body. Then a new and
second Simulated Image would be required to show the bear cub in
its new position. This new Simulated Image of the bear cub, in a
slightly different location, gets rendered during a new display
time period. This process of generating multiple views via the
nonstop incrementing of display time continues as long as the
horizontal perspective display is generating real-time simulations
in stereoscopic 3D.
[0066] By rapidly displaying the horizontal perspective images, a three
dimensional illusion of motion can be realized. Typically, 30 to 60
images per second would be adequate for the eye to perceive motion.
For stereoscopy, the same display rate is needed for superimposed
images, and twice that rate would be needed for the time sequential
method.
[0067] The display rate is the number of images per second that the
display completely generates and displays; its reciprocal, the display
time, is the time needed for one image. This is
similar to a movie projector, which displays an image 24 times a
second. Therefore, 1/24 of a second is required for one
image to be displayed by the projector. But the display time could
be a variable, meaning that depending on the complexity of the view
volumes it could take 1/12 or 1/2 a second for the
computer to complete just one display image. Since the display is
generating a separate left and right eye view of the same image,
the total display time is twice the display time for one eye
image.
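The arithmetic relating display time, number of eye views, and display
rate can be written out as a brief sketch. The function names are
hypothetical, and exact fractions are used only so that the results
print as simple ratios.

```python
from fractions import Fraction

def total_display_time(per_view_time_s, views=2):
    """Time to generate and display all eye views of one image."""
    return per_view_time_s * views

def required_display_rate(images_per_second, time_sequential):
    """Display rate needed for stereoscopy: unchanged for superimposed
    images, doubled for the time sequential method."""
    return images_per_second * 2 if time_sequential else images_per_second

# One eye view takes 1/24 s, so a left/right pair takes 1/12 s.
print(total_display_time(Fraction(1, 24)))  # 1/12

# 30 images per second becomes 60 for time sequential display.
print(required_display_rate(30, time_sequential=True))  # 60
```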
[0068] The present invention further discloses a Multi-Plane
display comprising a horizontal perspective display together with a
non-horizontal central perspective display. FIG. 12 illustrates an
example of the present invention Multi-Plane display in which the
Multi-Plane display is a computer monitor that is approximately "L"
shaped when open. The end-user views the L-shaped computer monitor
from its concave side and at approximately a 45° angle to
the bottom of the "L," as shown in FIG. 12. From the end-user's
point of view the entire L-shaped computer monitor appears as one
single and seamless viewing surface. The bottom leg of the L display,
positioned horizontally, shows a horizontal perspective image, and
the other leg of the L display shows a central perspective image.
The edge between the two display segments is preferably smoothly joined
and can also have a curvilinear projection to connect the two
displays of horizontal perspective and central perspective.
[0069] The Multi-Plane display can be made with one or more
physical viewing surfaces. For example, the vertical leg of the "L"
can be one physical viewing surface, such as a flat panel display,
and the horizontal leg of the "L" can be a separate flat panel
display. The edge of the two display segments can be a non-display
segment, and therefore the two viewing surfaces are not continuous.
Each leg of a Multi-Plane display is called a viewing plane, and as
you can see in the upper left of FIG. 25 there is a vertical
viewing plane and a horizontal viewing plane, where a central
perspective image is generated on the vertical plane and a
horizontal perspective image is generated on the horizontal plane,
and the two images are then blended where the planes meet, as
illustrated in the lower right of FIG. 12.
[0070] FIG. 12 also illustrates that a Multi-Plane display is
capable of generating multiple views, meaning that it can display
single-view images, i.e. a one-eye perspective like the simulation
in the upper left, and/or multi-view images, i.e. separate right
and left eye views like the simulation in the lower right. And when
the L-shaped computer monitor is not being used by the end-user it
can be closed and look like the simulation in the lower left.
[0071] FIG. 13 is a simplified illustration of the present
invention Multi-Plane display. In the upper right of FIG. 13 is an
example of a single-view image of a bear cub that is displayed on
an L-shaped computer monitor. Normally a single-view or one eye
image would be generated with only one camera point, but as you can
see there are at least two camera points for the Multi-Plane
display even though this is a single-view example. This is because
each viewing plane of a Multi-Plane device requires its own
rendering perspective. One camera point is for the horizontal
perspective image, which is displayed on the horizontal surface,
and the other camera point is for the central perspective image,
which is displayed on the vertical surface.
[0072] To generate both the horizontal perspective and central
perspective images requires the creation of two camera eyepoints
(which can be the same or different) as shown in FIG. 13 for two
different and separate camera points labeled OSI and CPI. The
vertical viewing plane of the L-shaped monitor, as shown at the
bottom of FIG. 13, is the display surface for the central
perspective images, and thus there is a need to define another
common reference plane for this surface. As discussed above, the
common reference plane is the plane where the images are displayed,
and the computer needs to keep track of this plane for the
synchronization of the locations of the displayed images and the
real physical locations. With the L-shaped Multi-Plane device and
the two display surfaces, the Simulation can generate the three
dimensional images, a horizontal perspective image using the (OSI)
camera eyepoint, and a central perspective image using the (CPI)
camera eyepoint.
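One way to picture the two rendering perspectives is as two ray-plane
intersections: a point of the simulated object is drawn where the line
from the camera eyepoint through that point meets the display surface.
The sketch below is an illustration under stated assumptions, not the
rendering method of the disclosure. It assumes a coordinate frame with x
to the right, y increasing away from the viewer, and z up; the
horizontal surface lies in the plane z = 0 and the vertical surface in
the plane y = 0; and the OSI and CPI eyepoints are the same point, which
the text allows. The function name is hypothetical.

```python
def project_through_eye(eye, point, axis, plane_value=0.0):
    """Intersect the line from eye through point with the axis-aligned
    plane coordinate[axis] == plane_value; return the (x, y, z) hit."""
    denom = point[axis] - eye[axis]
    if denom == 0:
        raise ValueError("line of sight is parallel to the plane")
    t = (plane_value - eye[axis]) / denom
    return tuple(e + t * (p - e) for e, p in zip(eye, point))

eye = (0.0, -12.0, 12.0)  # 12 inches back and 12 inches up (assumed frame)

# Horizontal perspective: project onto the horizontal surface z = 0.
osi_hit = project_through_eye(eye, (0.0, -6.0, 4.0), axis=2)
print(osi_hit)  # (0.0, -3.0, 0.0)

# Central perspective: project onto the vertical surface y = 0.
cpi_hit = project_through_eye(eye, (2.0, -4.0, 8.0), axis=1)
print(cpi_hit)  # (3.0, 0.0, 6.0)
```

Because both projections use the same viewer eyepoint in this sketch,
an object that spans the seam lands at matching positions on the two
surfaces.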
[0073] The multi-plane display system can further include a
curvilinear connection display section to blend the horizontal
perspective and the central perspective images together at the
location of the seam in the "L," as shown at the bottom of FIG. 13.
The multi-plane display system can continuously update and display
what appears to be a single L-shaped image on the L-shaped
Multi-Plane device.
[0074] Furthermore, the multi-plane display system can comprise
multiple display surfaces together with multiple curvilinear
blending sections as shown in FIG. 14. The multiple display
surfaces can be a flat wall, multiple adjacent flat walls, a dome,
or a curved wraparound panel.
[0075] The present invention multi-plane display system thus can
simultaneously project a plurality of three dimensional images
onto multiple display surfaces, one of which is a horizontal
perspective image. Further, it can be a stereoscopic multiple
display system allowing viewers to use their stereoscopic vision
for three dimensional image presentation.
[0076] Since the multi-plane display system comprises at least two
display surfaces, various requirements need to be addressed to
ensure high fidelity in the three dimensional image projection. The
display requirements are typically geometric accuracy, to ensure
that objects and features of the image are correctly positioned;
edge match accuracy, to ensure continuity between display surfaces;
no blending variation, to ensure no variation in luminance in the
blending section of various display surfaces; and field of view, to
ensure a continuous image from the eyepoint of the viewer.
[0077] Since the blending section of the multi-plane display system
is preferably a curved surface, some distortion correction could be
applied in order for the image projected onto the blending section
surface to appear correct to the viewer. There are various
solutions for providing distortion correction to a display system,
such as using a test pattern image, designing the image projection
system for the specific curved blending display section, using
special video hardware, or utilizing a piecewise-linear approximation
for the curved blending section. Still another distortion
correction solution for the curved surface projection is to
automatically compute image distortion correction for any given
position of the viewer eyepoint and the projector.
[0078] Since the multi-plane display system comprises more than one
display surface, care should be taken to minimize the seams and
gaps between the edges of the respective displays. To avoid seam
or gap problems, there could be at least two image generators
generating adjacent overlapped portions of an image. The overlapped
image is calculated by an image processor to ensure that the
projected pixels in the overlapped areas are adjusted to form the
proper displayed images. Another solution is to control the degree
of intensity reduction in the overlapping region to create a smooth
transition from the image of one display surface to the next.
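Controlling the intensity reduction across an overlap can be sketched
with a linear cross-fade, in which the weight of one image falls from 1
to 0 across the overlap while the weight of the other rises from 0 to 1.
The linear ramp, the function names, and the sample pixel values are
assumptions chosen for illustration; the text does not prescribe a
particular ramp shape.

```python
def crossfade_weights(overlap_px):
    """Per-pixel (weight_a, weight_b) pairs across an overlap region.

    Each pair sums to 1 so the combined luminance stays constant."""
    if overlap_px < 2:
        raise ValueError("overlap must be at least 2 pixels wide")
    last = overlap_px - 1
    return [(1 - i / last, i / last) for i in range(overlap_px)]

def blend_overlap(a_pixels, b_pixels):
    """Blend two equally long rows of overlapping pixel intensities."""
    weights = crossfade_weights(len(a_pixels))
    return [wa * a + wb * b
            for (wa, wb), a, b in zip(weights, a_pixels, b_pixels)]

# A five-pixel overlap between a brighter and a darker image edge.
blended = blend_overlap([200] * 5, [100] * 5)
print(blended)  # [200.0, 175.0, 150.0, 125.0, 100.0]
```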
[0079] For realistic three dimensional display, binaural or three
dimensional audio simulation is also included. The present
invention also provides the means to adjust the binaural or 3D audio
to ensure proper sound simulation.
[0080] Similar to vision, hearing using one ear is called monaural
and hearing using two ears is called binaural. Hearing can provide
the direction of the sound sources, but with poorer resolution than
vision; the identity and content of a sound source such as speech
or music; and the nature of the environment, such as a normal room
or an open field, via echoes and reverberation.
[0081] The head and ears, and sometimes the shoulders, function as an
antenna system to provide information about the location, distance
and environment of the sound sources. The brain can properly
interpret the various kinds of sound arriving at the head, such as
direct sounds, sounds diffracted around the head and by
interaction with the outer ears and shoulders, different sound
amplitudes and different arrival times of the sounds. These acoustic
modifications are called `sound cues` and serve to provide us the
directional acoustic information of the sounds.
[0082] Basically, the sound cues are related to timing, volume,
frequency and reflection. In timing cues, the ears recognize the
time the sound arrives and assume that the sound comes from the
closest source. Further, with the two ears separated by about 8 inches,
the delay of the sound reaching one ear with respect to the
other ear can give a cue about the location of the sound source.
The timing cue is stronger than the level cue in the sense that the
listener locates the sound based on the first wave that reaches the
ear, regardless of the loudness of any later arriving waves. In
volume (or level) cues, the ears recognize the volume (or loudness)
of the sound and assume that the sound is coming from the loudest
direction. With the binaural (two ears) effect, the amplitude
difference between the ears is a strong cue for the localization of
the sound source. In frequency (or equalization) cues, the ears
recognize the frequency balance of the sound as it arrives in each
ear, since frontal sounds are directed into the eardrums, while rear
sounds bounce off the external ear and thus have a high frequency
roll off. In reflection cues, the sound bounces off various surfaces
and is either dispersed or absorbed in varying degrees before
reaching the ears multiple times. These reflections off the walls of
the room and the foreknowledge of the difference between the way
various floor coverings sound also contribute to localization. In
addition, the body, especially the head, can move relative to the
sound source to help locate the sound.
[0083] The above various sound cues are scientifically classified
into three types of spatial hearing cues: interaural time
differences (ITDs), interaural level differences (ILDs), and
head-related transfer functions (HRTFs). ITDs relate to the time
for a sound to reach the ears and the time difference for reaching
both ears. ILDs refer to the amplitude in the frequency spectrum of
sound reaching the ears and also the amplitude differences of the
sound frequencies as heard in both ears. HRTFs can provide the
perception of distance by the changes in the timbre and distance
dependencies, the time delay and directions of direct sound and
reflections in echoic environments.
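The interaural time and level differences can be given rough numbers
with a simplified model. The sketch below is an assumption-laden
illustration rather than a model taken from this disclosure: it treats
the ITD as the extra straight-line path to the far ear for a distant
source, ignoring diffraction around the head; it uses the ear separation
of about 8 inches mentioned above; and it assumes a speed of sound of
343 m/s. The function names are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0        # assumed value for room-temperature air
EAR_SEPARATION_M = 8 * 0.0254     # about 8 inches, converted to meters

def itd_seconds(azimuth_deg, ear_separation_m=EAR_SEPARATION_M):
    """Interaural time difference for a distant source at the given
    azimuth (0 = straight ahead, 90 = directly to one side)."""
    path_difference = ear_separation_m * math.sin(math.radians(azimuth_deg))
    return path_difference / SPEED_OF_SOUND_M_S

def ild_db(amplitude_near, amplitude_far):
    """Interaural level difference, in decibels, from the sound
    amplitudes at the nearer and farther ear."""
    return 20.0 * math.log10(amplitude_near / amplitude_far)

print(itd_seconds(0.0))                  # 0.0 for a source straight ahead
print(round(itd_seconds(90.0) * 1e6))    # ITD in microseconds at 90 degrees
print(round(ild_db(2.0, 1.0), 2))        # level difference for a 2:1 ratio
```

Under these assumptions the largest ITD, for a source directly to one
side, is a little under 0.6 milliseconds.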
[0084] The HRTFs are a collection of spatial cues for a particular
listener, including ITDs, ILDs and the reflections, diffractions
and damping caused by the listener's body, head, outer ears and
shoulder. The external ear, or pinna, has a significant
contribution to the HRTFs. Higher frequency sounds are filtered by
the pinna to provide the brain a way to perceive the lateral
position, or azimuth, and elevation of the sound source since the
response of the pinna filter is highly dependent on the overall
direction of the sound source. The head can account for a reduced
amplitude of various frequencies of the sounds since the sound has
to go through or around the head in order to reach the ear. The
overall effects of head shadowing contribute to the perception of
linear distance and direction of a sound source. Further, sound
frequencies in the range of 1-3 kHz are reflected from the shoulder
to produce echoes representing a time delay dependent on the
elevation of the sound source. The reflections from surfaces in the
world and the reverberation also seem to affect the localization
judgement of sound distance and direction.
[0085] In addition to these cues, the movement of the head to help
locate a sound source is a key factor, together
with the vision to confirm the sound direction. For a 3D immersion,
all mechanisms to localize the sounds are always in play and should
normally agree. If not, there would be some discomfort and
confusion.
[0086] Although we can hear with one ear, hearing with two ears is
clearly better. Many of the sound cues are related to the binaural
perception depending on both the relative loudness of sound and the
relative time of arrival of sound at each ear. Thus the
binaural performance is clearly superior for the localization of
single or multiple sound sources, for the formation of the room
environment, for the separation of signals coming from multiple
incoherent and coherent sound sources, and for the enhancement of a
chosen signal in a reverberant environment.
[0087] Mathematically speaking, HRTF is the frequency response of
the sound waves as received by the ears. By measuring the HRTF of a
particular listener, and by synthesizing it electronically using
digital signal processing, the sounds can be delivered to a
listener's ears via headphones or loudspeakers to create a virtual
sound image in three dimensions.
[0088] The sound transformation to the ear canal, i.e. HRTF
frequency response, can be measured accurately by using small
microphones in the ear canals. The measured signal is then
processed by a computer to derive the HRTF frequency responses for
the left and right ears corresponding to the sound source
location.
[0089] Thus a 3D audio system works by using the measured HRTFs as
the audio filters or equalizers. When a sound signal is processed
by the HRTF filters, the sound localization cues are reproduced,
and the listener should perceive the sound at the location
specified by the HRTFs. This method of binaural synthesis works
extremely well when the listener's own HRTFs are used to synthesize
the localization cues. However, measuring HRTFs is a complicated
procedure, so 3D audio systems typically use a single set of HRTFs
previously measured from a particular human or manikin subject.
Thus the HRTF sometimes needs to be changed to accurately respond
to a particular listener. The tuning of an HRTF function can be
accomplished by providing various sound source locations and
environments and asking the listener to identify them.
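Using HRTFs as audio filters can be sketched in the time domain, where
filtering a signal is a convolution with an impulse response (the
time-domain counterpart of a frequency response). The sketch below is a
toy illustration: the two impulse responses are made-up values, not
measured HRTF data, chosen so that the right ear receives a delayed and
attenuated copy of the sound; the function names are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two sequences of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def binaural_render(mono, ir_left, ir_right):
    """Filter one mono signal with a left and a right impulse response
    to produce the pair of signals delivered to the two ears."""
    return convolve(mono, ir_left), convolve(mono, ir_right)

mono = [1.0, 0.5]
ir_left = [1.0]             # toy response: sound arrives unchanged
ir_right = [0.0, 0.0, 0.5]  # toy response: two samples later, half level

left, right = binaural_render(mono, ir_left, ir_right)
print(left)   # [1.0, 0.5]
print(right)  # [0.0, 0.0, 0.5, 0.25]
```

The delay and attenuation in the right channel stand in for the time and
level differences that a measured set of HRTFs would encode for a source
on the listener's left.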
[0090] A 3D audio system should provide the ability for the
listener to define a three-dimensional space, to position multiple
sound sources and that listener in that 3D space, and to do it all
in real-time, or interactively. Besides 3D audio systems, other
technologies such as stereo extension and surround sound could offer
some aspects of 3D positioning or interactivity.
[0091] Extended stereo processes an existing stereo (two channel)
soundtrack to add spaciousness and to make it appear to originate
from outside the left/right speaker locations through fairly
straight-forward methods. Some of the characteristics of the
extended stereo technology include the size of the listening area
(called sweet spot), the amount of spreading of stereo images, the
amount of tonal changes, the amount of lost stereo panning
information, and the ability to achieve effect on headphones as
well as speakers.
[0092] Surround sound creates a larger-than-stereo sound stage
with a surround sound 5-speaker setup. Additionally, virtual
surround sound systems use 3D audio technology to create the
illusion of five speakers emanating from a regular set of stereo
speakers, therefore enabling a surround sound listening experience
without the need for a five speaker setup. The characteristics of
the surround sound technology include the presentation accuracy,
the clarity of spatial imaging, and the size of the listening
area.
[0093] For a better 3D audio system, audio technology needs to create
a life-like listening experience by replicating the 3D audio cues
that the ears hear in the real world, allowing non-interactive
and interactive listening and positioning of sounds anywhere in the
three-dimensional space surrounding a listener.
[0094] The head tracker function is also very important to provide
perceptual room constancy to the listener. In other words, when
listeners move their heads around, the signals would change so that
the perceived auditory world maintains its spatial position. To this
end, the simulation system needs to know the head position in order
to be able to control the binaural impulse responses adequately.
Head position sensors therefore have to be provided. The impression
of being immersed is of particular relevance for applications in
the context of virtual reality.
[0095] A replica of a sound field can be produced by putting an
infinite number of microphones everywhere. After being stored on a
recorder with an infinite number of channels, this recording can
then be played back through an infinite number of point-source
loudspeakers, each placed exactly as its corresponding microphone
was placed. As the number of microphones and speakers is reduced,
the quality of the sound field being simulated suffers. By the time
we are down to two channels, height cues have certainly been lost
and instead of a stage that is audible from anywhere in the room we
find that sources on the stage are now only localizable if we
listen along a line equidistant from the last two remaining
speakers and face them.
[0096] However, only two channels should be adequate, since if we
deliver the exact sound required to simulate a live performance at
the entrance to each ear canal, then since we only have two ear
canals, we should only need to generate two such sound fields. In
other words, since we can hear three-dimensionally in the real
world using just two ears, it must be possible to achieve the same
effect from just two speakers or a set of headphones.
[0097] Headphone reproduction thus differs from loudspeaker
reproduction, since microphones for headphone reproduction should
be spaced about seven inches apart, for a normal ear separation,
and microphones for loudspeaker reproduction should be spaced about
seven feet apart. Further, loudspeakers suffer from crosstalk, and
therefore some signal conditioning such as crosstalk cancellation
will be needed for a 3D loudspeaker setup.
[0098] Loudspeaker 3D audio systems are extremely effective in
desktop computing environments. This is because there is usually
only a single listener (the computer user) who is almost always
centered between the speakers and facing forward towards the
monitor. Thus, the primary user gets the full 3D effect because the
crosstalk is properly cancelled. In typical 3D audio applications,
like video gaming, friends may gather around to watch. In this
case, the best 3D audio effects are heard by others when they are
also centered with respect to the loudspeakers. Off-center
listeners may not get the full effect, but they still hear a high
quality stereo program with some spatial enhancements.
[0099] To achieve 3D audio, the speakers are typically arranged
surrounding the listener in about the same horizontal plane, but
could be arranged to completely surround the listener, from the
ceiling to the floor to the surrounding walls. Optionally, the
speakers can also be put on the ceiling, on the floor, arranged in
an overhead dome configuration, or arranged in a vertical wall
configuration. Further, beam transmitted speakers can be used
instead of headphones. Beam transmitted speakers offer the listener
freedom of movement without crosstalk between speakers, since each
beam transmitted speaker provides a tight beam of sound.
[0100] Generally, a minimum of four loudspeakers are required to
achieve a convincing 3-D audio experience, while some researchers
are using twenty or more speakers in an anechoic chamber to
recreate acoustic environments with much greater precision.
[0101] The main advantages of multi-speaker playback are:
[0102] There is no dependence on the individual subject's HRTF,
since the sound field is created without any reference to
individual listeners.
[0103] The subject is free to turn their head, and even move about
within a limited range.
[0104] In some cases, more than one subject can listen to the
system simultaneously.
[0105] Many crosstalk cancellers are based on a highly simplified
model of crosstalk, for example modeling crosstalk as a simple
delay and attenuation process, or a delay and a lowpass filter.
Other crosstalk cancellers have been based on a spherical head
model. As with binaural synthesis, crosstalk cancellation
performance is ultimately limited by the variation in the size and
shape of human heads.
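The simplest model mentioned above, crosstalk as a pure delay and
attenuation, can be sketched as follows. This is an illustrative
sketch under stated assumptions, not the canceller of any
particular system: the head is assumed symmetric, `gain` and
`delay` (in samples) are hypothetical parameters, and both function
names are invented for the example.

```python
def simulate_crosstalk(spk_left, spk_right, gain, delay):
    """Model each ear as hearing its own speaker plus a delayed,
    attenuated copy of the opposite speaker."""
    ear_left, ear_right = [], []
    for i in range(len(spk_left)):
        cross_r = spk_right[i - delay] if i >= delay else 0.0
        cross_l = spk_left[i - delay] if i >= delay else 0.0
        ear_left.append(spk_left[i] + gain * cross_r)
        ear_right.append(spk_right[i] + gain * cross_l)
    return ear_left, ear_right


def cancel_crosstalk(bin_left, bin_right, gain, delay):
    """Pre-process binaural signals so that, after the modelled
    crosstalk, each ear receives its intended signal.

    Recursive form: each speaker feed subtracts the crosstalk that
    the opposite speaker feed will produce. Assumes delay >= 1 and
    abs(gain) < 1 so the recursion stays stable.
    """
    if delay < 1:
        raise ValueError("delay must be at least one sample")
    spk_left, spk_right = [], []
    for i in range(len(bin_left)):
        prev_r = spk_right[i - delay] if i >= delay else 0.0
        prev_l = spk_left[i - delay] if i >= delay else 0.0
        spk_left.append(bin_left[i] - gain * prev_r)
        spk_right.append(bin_right[i] - gain * prev_l)
    return spk_left, spk_right
```

Passing the output of `cancel_crosstalk` through
`simulate_crosstalk` with the same `gain` and `delay` returns the
original binaural signals. With a real head the two parameters only
approximate the true acoustic path, which is the limitation noted
above.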
[0106] 3D audio simulation can be accomplished by the following
steps:
[0107] Input the characteristics of the acoustic space.
[0108] Determine the sequence of sound arrivals that occur at the
listening position. Each sound arrival will have the following
characteristics: (a) time of arrival, based on the distance
travelled by the echo-path, (b) direction of arrival, (c)
attenuation (as a function of frequency) of the sound due to the
absorption properties of the surfaces encountered by the
echo-path.
[0109] Compute the impulse response of the acoustic space
incorporating the multiple sound arrivals.
[0110] The results from the FIR filter are played back to a
listener. In the case where the impulse responses were computed
using a dummy head response, the results are played over headphones
to the listener. In this case, the equalisation required for the
particular headphones is also applied.
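The step of computing an impulse response from the sequence of
sound arrivals can be sketched as below. This is a simplified
illustration: the frequency-dependent attenuation of each arrival
is collapsed to a single gain, the direction of arrival is ignored,
and the speed of sound is assumed to be 343 m/s. The function name
and the `(path_length_m, gain)` tuple layout are hypothetical.

```python
SPEED_OF_SOUND = 343.0  # metres per second, assumed value


def impulse_response(arrivals, sample_rate):
    """Build FIR taps from a non-empty list of (path_length_m, gain).

    Each arrival contributes one tap whose index is its travel time
    in samples and whose value is its (broadband) gain.
    """
    delays = [round(path / SPEED_OF_SOUND * sample_rate)
              for path, _ in arrivals]
    taps = [0.0] * (max(delays) + 1)
    for delay, (_, gain) in zip(delays, arrivals):
        taps[delay] += gain  # arrivals landing on one sample add up
    return taps
```

For example, a direct path and a single wall reflection would be
passed as a two-element list, and the resulting taps would then be
used as the FIR filter coefficients.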
[0111] The simulation of an acoustic environment involves one or
more of the following functions:
[0112] Processing an audio source input and presenting it to the
subject through a number of loudspeakers (or headphones) with the
intention of making the sound source appear to be located at a
particular position in space.
[0113] Processing multiple input audio sources in such a way that
each source is independently located in space around the
subject.
[0114] Enhanced processing to simulate some aspects of the room
acoustics, so that the user can acoustically sense the size of the
room and the nature of the floor and wall coverings.
[0115] The capability for the subject to move (perhaps within a
limited range) and turn his/her head so as to focus attention on
some aspects of the sound source characteristics or room
acoustics.
[0116] Binaural simulation is generally carried out using the sound
source material free from any unwanted echoes or noise. The sound
source material can then be replayed to a subject, using the
appropriate HRTF filters, to create the illusion that the source
audio is originating from a particular direction. The HRTF
filtering is achieved by simply convolving the audio signal with
the pair of HRTF responses (one HRTF filter for each channel of the
headphone).
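The convolution described above can be sketched as follows, using
direct time-domain convolution for clarity. The impulse responses
passed in are placeholders supplied by the caller, not measured
HRTF data, and the function names are hypothetical.

```python
def convolve(signal, kernel):
    """Direct-form convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(kernel):
            out[i + j] += s * h
    return out


def binaural_render(mono, hrir_left, hrir_right):
    """Filter one mono source with a pair of head-related impulse
    responses, giving one output channel per ear."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)
```

A real-time system would more likely use block-based FFT
convolution; the direct form is shown only because it is the
shortest correct statement of the operation.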
[0117] The eyes and ears often perceive an event at the same time.
Seeing a door close, and hearing a shutting sound, are interpreted
as one event if they happen synchronously. If we see a door shut
without a sound, or we see a door shut in front of us, and hear a
shutting sound to the left, we get alarmed and confused. In another
scenario, we might hear a voice in front of us, and see a hallway
with a corner; the combination of audio and visual cues allows us
to figure out that a person might be standing around the corner.
Together, synchronized 3D audio and 3D visual cues provide a very
strong immersion experience. Both 3D audio and 3D graphics systems
can be greatly enhanced by such synchronization.
[0118] Improved playback through headphones can be achieved through
the use of head tracking. This technique makes use of continuous
measurements of the orientation of a subject's head, and adapts the
audio signals being fed to the headphones appropriately. Binaural
signals should allow a subject to discriminate easily between left
and right sound source locations, but the ability to discriminate
between front and back, and high and low sound sources
is generally only possible if head movement is permitted. Whilst
multiple speaker playback methods solve this problem to a large
degree, there are still many applications where headphone playback
is preferable, and head tracking can then be used as a valuable
tool for improving the quality of the 3-D playback.
[0119] The simplest form of head tracking binaural system is one
which simply simulates anechoic HRTFs, and changes the HRTF
functions rapidly in response to the subject's head movements. This
HRTF switching can be achieved through a lookup table, with
interpolation used to resolve angles that are not represented in
the HRTF table.
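A lookup table with interpolation, as described above, might look
like the following sketch. It assumes a table keyed by azimuth in
degrees within [0, 360), with equal-length impulse responses, and
it uses plain linear interpolation between the two nearest measured
angles. Linear interpolation of impulse responses is a
simplification chosen for illustration; the function name and table
layout are hypothetical.

```python
def interpolate_hrir(table, azimuth_deg):
    """Return an impulse response for any azimuth from a non-empty
    table mapping measured azimuths (degrees) to lists of taps."""
    angles = sorted(table)
    az = azimuth_deg % 360.0
    # Nearest measured angles below and above, wrapping around 360.
    lower = max((a for a in angles if a <= az), default=angles[-1])
    upper = min((a for a in angles if a > az), default=angles[0])
    span = (upper - lower) % 360.0
    if span == 0.0:  # only one measured angle in the table
        return list(table[lower])
    weight = ((az - lower) % 360.0) / span
    return [(1.0 - weight) * lo + weight * hi
            for lo, hi in zip(table[lower], table[upper])]
```

On each head-tracker update the tracked angle would be passed in
and the returned response used for the next block of audio.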
[0120] Simulation of room acoustics over headphones with head
tracking becomes more difficult because the direction of arrival of
the early reflections is also important in making the result sound
realistic. Many researchers believe that the echoes in the
reverberant tail of the room response are generally so diffuse that
there is no requirement for this part of the room response to be
tracked with the subject's head movements.
[0121] An important feature of any head tracking playback system is
the delay from the subject's head movement to the change in the audio
response at the headphones. If this delay is excessive, the subject
can experience a form of virtual motion sickness and general
disorientation.
[0122] Audio cues change dramatically when a listener tilts or
rotates his or her head. For example, quickly turning the head 90
degrees to look to the side is the equivalent of a sound traveling
from the listener's side to the front in a split second. We often
use head motion to track sounds or to search for them. The ears
alert the brain about an event outside of the area that the eyes
are currently focused on, and we automatically turn to redirect our
attention. Additionally, we use head motion to resolve ambiguities:
a faint, low sound could be either in front or back of us, so we
quickly and sub-consciously turn our head a small fraction to the
left, and we know if the sound is now off to the right, it is in
the front, otherwise it is in the back. One of the reasons why
interactive audio is more realistic than pre-recorded audio
(soundtracks) is the fact that the listener's head motion can be
properly simulated in an interactive system (using inputs from a
joystick, mouse, or head-tracking system).
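The front/back disambiguation described above can be worked through
numerically. The sketch below assumes a deliberately simple model
in which the left/right cue is proportional to the sine of the
source azimuth relative to the head, with azimuth and head yaw both
measured clockwise from straight ahead (positive to the right).
The model and both function names are assumptions made for
illustration.

```python
import math


def lateral_cue(source_az_deg, head_yaw_deg):
    """Signed left/right cue: positive means the source is heard
    toward the right ear, negative toward the left ear."""
    return math.sin(math.radians(source_az_deg - head_yaw_deg))


def resolve_front_back(cue_after_left_turn):
    """After a small head turn to the left, a source that moves
    toward the right is in front; otherwise it is behind."""
    return "front" if cue_after_left_turn > 0 else "back"
```

With the head facing forward, sources at 0 and 180 degrees both
give a cue of essentially zero, which is the ambiguity. After a 10
degree turn to the left (`head_yaw_deg=-10`), the source at 0
degrees gives a positive cue and the source at 180 degrees a
negative one.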
[0123] The HRTF functions are performed using digital signal
processing (DSP) hardware for real time performance. Typical
features of the DSP are that the direct sound must be processed to give
the correct amplitude and perceived direction, the early echoes
must arrive at the listener with appropriate time, amplitude and
frequency response to give the perception of the size of the spaces
(as well as the acoustic nature of the room surfaces), and the late
reverberation must be natural and correctly distributed in 3-D
around the listener. The relative amplitude of the direct sound
compared to the remainder of the room response helps to provide the
sensation of distance.
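The distance cue described above can be illustrated with a short
calculation. The sketch assumes a free-field point source, so the
direct sound amplitude falls in inverse proportion to distance, and
a reverberant level that stays constant throughout the room. Both
are simplifying assumptions, and the function name and default
values are hypothetical.

```python
import math


def direct_to_reverb_db(distance_m, reverb_level=0.1,
                        ref_distance_m=1.0):
    """Direct-to-reverberant ratio in decibels.

    The direct amplitude is taken as 1.0 at ref_distance_m and
    scaled by the inverse-distance law; reverb_level is the assumed
    constant amplitude of the rest of the room response.
    """
    direct = ref_distance_m / distance_m
    return 20.0 * math.log10(direct / reverb_level)
```

Under these assumptions each doubling of distance lowers the ratio
by about 6 dB, so a simulated source can be made to sound farther
away by reducing the direct sound relative to the room response.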
[0124] Thus 3D audio simulation can provide a binaural gain so
that the exact same audio content is more audible and intelligible
in the binaural case, because the brain can localize and therefore
"single out" the binaural signal, while the non-binaural signal
gets washed into the noise. Further, the listener would still be
able to tune into and understand individual conversations, because
they are still spatially separated, and "amplified by" binaural
gain, an effect called the cocktail party effect. Binaural
simulation also can provide faster reaction time because such a
signal mirrors the ones received in the real world. In addition,
binaural signals can convey positional information: a binaural
radar warning sound can warn a user about a specific object that is
approaching (with a sound that is unique to that object), and
naturally indicate where that object is coming from. Also, listening
to binaural simulation can be less fatiguing, since we are used to
hearing sounds that originate outside of our heads, as is the
case with binaural signals. Mono or stereo signals appear to come
from inside a listener's head when using headphones, and produce
more strain than a natural sounding, binaural signal. And lastly, 3D
binaural simulation can provide an increased perception and
immersion in a higher quality 3D environment when visuals are shown
in sync with binaural sound.
* * * * *