United States Patent Application 20220171520
Kind Code: A1
Lee; Wen-Chieh Geoffrey
Published: June 2, 2022

Pervasive 3D Graphical User Interface Configured for Machine Learning
Abstract

A three-dimensional graphical user interface (3D GUI) configured to be used by a computer, a display system, an electronic system, or an electro-mechanical system. The 3D GUI provides an enhanced user-engaging experience while enabling a user to manipulate the motion of an object of arbitrary size and a multiplicity of independent degrees of freedom, using sufficient degrees of freedom to represent the motion. The 3D GUI incorporates machine learning (ML) functionality, including the support vector machine (SVM) and the convolutional neural network (CNN), which provides intelligent control of robot kinematics and computer graphics, as well as the ability of the user to more quickly learn the more subtle applications of 3D computer graphics.
Inventors: Lee; Wen-Chieh Geoffrey (Taipei City, TW)
Applicant: Lee; Wen-Chieh Geoffrey (Taipei City, TW)
Family ID: 1000006137802
Appl. No.: 17/671,292
Filed: February 14, 2022
Related U.S. Patent Documents

Application Number | Filing Date  | Patent Number
16/164,928         | Oct 19, 2018 | 11,307,730
17/671,292         | Feb 14, 2022 |
Current U.S. Class: 1/1
Current CPC Class: G06V 10/94 (20220101); G06F 3/0346 (20130101); G06F 3/04815 (20130101); G06K 9/6269 (20130101); G06T 7/246 (20170101); G06T 2207/30241 (20130101); G06K 9/628 (20130101); G06T 15/20 (20130101); G06K 9/6217 (20130101); G06T 2207/20084 (20130101)
International Class: G06F 3/04815 (20060101); G06K 9/62 (20060101); G06T 15/20 (20060101); G06T 7/246 (20060101); G06F 3/0346 (20060101)
Claims
1. A system comprising: a memory and at least one processor coupled
to the memory in a computer, a display system, an electronic
system, or an electro-mechanical system, configured to present on a
display device a three-dimensional graphical user interface (3D
GUI); wherein said 3D GUI is configured to allow maneuvering an
object in a 3D space represented by said 3D GUI by a motion of at
least three independent degrees of freedom, said motion being
characterized by either linear or non-linear motion vectors, or
both; and said space being augmented by additional dimensions for
characterizing features, wherein said linear and non-linear motion
vectors represent translational and rotational motion respectively
and are capable of being generated by a single gestural motion of a
navigational device on a reference surface without applying the
input of other motion detection devices.
2. The system of claim 1 further comprising a neural network module that is loaded into the memory of said system or implemented as a separate device/subsystem, including a graphics processing unit (GPU) or an application-specific integrated circuit (ASIC), electronically linked to said system; wherein said neural network module carries a specific artificial intelligence function whereby, through a method of machine learning or an equivalent learning method, said neural network module is able to classify a plurality of 3D objects presentable by said 3D GUI; wherein at least one property of said 3D objects is identified by said computer or said separate device/subsystem as a feature vector; and wherein the status of said feature vector can be configured by said computer or said separate device/subsystem, making said system able to control an output signal, or the kinematics of an object undergoing a motion.
3. The system of claim 2, wherein: when in software formation, said
neural network module may be incorporated by said system with a
plurality of other software modules in a layered configuration;
wherein said software modules are stored in the memory of said
system or in said separate device/subsystem, and wherein each of said software modules is dedicated to a unique functionality of said system, and at least two of said unique functionalities are associated with providing the perspectives of said 3D GUI and robotic kinematics.
4. The system of claim 2: wherein said neural network module is
characterized as an SVM (support vector machine), CNN
(convolutional neural network), or a machine learning method that
has a net effect equivalent to that of said SVM or said CNN.
5. The system of claim 2: wherein a first exemplary set of said 3D
objects, either represented by a plurality of graphical vectors or
a set of image data in said 3D GUI, denote a unique group of
interactive beings, such beings including a bouquet of flowers
configured to interact with a butterfly; wherein said first
exemplary set of 3D objects are classified by said neural network
module; and wherein, using a machine learning process, said first
exemplary set of 3D objects are identified by said neural network
module as a plurality of distinct species, based on the information
provided by said feature vector; wherein a cursor is configured by
said 3D GUI to act as a second set of 3D objects including a
plurality of butterflies, configured to interact with said first
exemplary set of 3D objects according to some of its feature
vectors that are identifiable by said 3D GUI.
6. The system of claim 5, wherein said interactive beings denote a
cluster of plants including a bouquet of flowers, a group of
animals, including a group of bees, a set of biological entities,
including cells in a medical image, or a few typical cartoon
characters including Tinker Bell and Winnie the Pooh, that have
their own personalities or some unique properties identifiable by
said 3D GUI.
7. A computer-implemented method for three dimensional (3D)
graphical rendering of objects on a display, comprising the steps
of: rendering a plurality of three dimensional graphical vectors
referenced to at least one vanishing point; wherein a position
of said vanishing point(s) can be manipulated by an artificial
intelligence technique; and dividing said plurality of three
dimensional graphical vectors tracked by said method into one or
more classes, each of which forms a margin with one another that is
configured to be recognized by an AI (artificial intelligence)
method; wherein, when said margin reaches different values, said
method recognizes that occurrence and generates graphical rendering
effects, or supports levels of interaction between a user and said
method.
8. The computer-implemented method of claim 7 wherein said
artificial intelligence method is a support vector machine
(SVM).
9. The computer-implemented method of claim 7 wherein said
artificial intelligence technique is a convolutional neural network
(CNN), in which at least one of its output signals is not decided
by an optimized value of the margin of support vector machine
(SVM).
10. A computer-implemented neural signal processing system
configured to reduce the processing load or time carried out by
said computer for classifying a plurality of neural signals while
maintaining the accuracy of results of said processing within a
user-acceptable range, comprising: using the classification
functionality of a support vector machine (SVM) stored as a module
in a 3D GUI, either in hardware or software formation, creating a
plurality of multidimensional feature vectors based on a set of raw
input data comprising a 3D image, or a 3D vector graphic, or
acoustic data in multiple frequency channels, or a vector field,
all of whose profiles can be mapped to a 2D image frame;
designating a plurality of vanishing points in said 2D image frame,
such that the apparent degrees of freedom of said raw input data
after being mapped to said 2D image frame follow a consistent trend
of decreasing toward one of said vanishing points, by manipulating
the positions of said vanishing points in said 2D image frame
automatically, or by an in-situ manual process using a 3D navigational device that provides means of changing said 2D image
frame by more than three degrees of freedom, the total dimension or
size of the vector space constructed by said plurality of
multidimensional feature vectors can be manipulated and reduced,
which subsequently causes the processing load of said computer in said computer-implemented neural network system to be reduced correspondingly while the accuracy of the result of said neural network system is still maintained at a level acceptable to the user.
11. A computer-implemented method for neural network signal
processing using a computer configured to utilize a three
dimensional graphical user interface (3D GUI) shown on a display,
said method comprising the steps of: using the classification
functionality of a support vector machine downloaded in a module in
said 3D GUI, creating a plurality of multidimensional feature
vectors based on a set of raw input data comprising a 3D image, or
a 3D vector graphic, or acoustic data in multiple frequency
channels, or a vector field, all of whose profiles can be mapped to
a 2D image frame; designating a plurality of vanishing points in
said 2D image frame, such that the apparent degrees of freedom of
said raw input data after being mapped to said 2D image frame
follow a consistent trend of decreasing toward one of said
vanishing points, manipulating the positions of said vanishing
points in said 2D image frame automatically, or by an in-situ
manual process using a 3D navigational device that provides means of
changing said 2D image frame by more than three degrees of freedom,
whereby the total dimension or size of the vector space constructed
by said plurality of multidimensional feature vectors can be
manipulated and reduced from 3D to 2.5D, which reduction
subsequently causes a processing load of said computer in said
computer-implemented neural network processing system to be reduced
correspondingly while the accuracy of the result of said neural network system is still maintained at a level acceptable to the user.
12. The computer-implemented method of claim 11: wherein said 3D
GUI communicates with a 3D navigational device that is controllably
moving along a tinted 2D reference surface, accessing said set of
raw input data and, by touching a surface element of said 3D
navigational device, altering intensities of a system of
illumination within said 3D navigational device thereby changing
the 3D address of said vanishing point(s), causing the total
dimension or size of said vector space constructed by said
plurality of multidimensional feature vectors to be reduced.
Description
[0001] This is a Divisional Application of U.S. patent application
Ser. No. 16/164,928 filed on Oct. 19, 2018, which is herein
incorporated by reference in its entirety and assigned to a common
assignee.
1. RELATED APPLICATIONS
[0002] The present disclosure relates to the following US patent
applications and US Patents, all of which are owned by the owner of
the instant application, and all of which are incorporated by
reference in their entirety: docket no. NU11-002, U.S. Pat. No. 9,720,525, filed on May 29, 2012; docket no. NU11-006, Ser. No. 13/834,085, filed on Mar. 15, 2013; docket no. NU11-007, U.S. Pat. No. 9,733,727, filed on Oct. 17, 2013; docket no. NU11-009, Ser. No. 14/294,369, filed on Jun. 3, 2014; docket no. NU11-010, U.S. Pat. No. 9,703,396, filed on Jul. 12, 2013; and docket no. NU17-001, Ser. No. 16/056,752, filed on Aug. 7, 2018.
2. TECHNICAL FIELD
[0003] The present disclosure relates to a three-dimensional
graphical user interface (3D GUI) for a computer, an electronic
display, a control system or an electro-mechanical system that
incorporates an artificial intelligence feature in its data processing
module. The 3D GUI provides an absolute address and linear and
non-linear motion vectors for describing the motion of a
3-dimensional (3D) object with at least three independent degrees
of freedom and moving in accord with three-dimensional kinematics
and visualized in a graphic rendering device. When the presently
disclosed 3D GUI analyzes a plurality of neural signals whose network profile can be mapped onto a display device used by said 3D GUI, the performance of said 3D GUI is greatly enhanced by a 3D zone whose profile or dimension is defined by said 3-dimensional (3D) object; the level of engagement between the user and the computer carrying such a 3D GUI is thus augmented.
3. BACKGROUND
[0004] A Graphical User Interface (GUI) generally denotes a
software module embedded in an electronic system such as a computer
or, more specifically, in its operating system, or embedded in a
cloud of servers. The ultimate object of the GUI is to enable its
user to engage with the graphical features presented in a
displaying device associated with the electronic system, such as
icons, menu bars, title bars or ribbons. In a broader sense, said graphical features comprise both those generated by graphical vectors and those acquired or measured by an instrument (e.g. a raster-scanned image). In an even broader
sense, a GUI can not only provide these graphical features to a
user, but it can also provide the user with access to non-graphical
functionalities, such as audio, speech recognition, fingerprint
reading, intelligent agents, robotic manipulation, the use of
advanced techniques of analysis such as machine learning or neural
networks, the use of automated functions such as turning an
electronic device on or off, or even surveying the habits/desires
of a user. We consider a well-designed GUI to be one that engages
its user(s) relatively easily, initiating many intuitive/direct
interactions. For decades, the GUI of a computer has been in
two-dimensional (2D) format (e.g. its icons, cursors, etc., are all
in 2D format). In recent years, the computer industry has started
embracing two streams of innovations, i.e., 3D digital graphics,
and artificial intelligence. With the arrival of the era of 3D
digital graphics, there has been a corresponding need for the
development of a user-engaging type of 3D GUI, allowing for new
features such as moving a 3D cartoon character or manipulating a
robot following the instruction of the user, all in an intuitive,
direct, real-time, and intelligent manner. The arrival of artificial intelligence techniques further augments the fundamental capability of said 3D GUI, making the interactions between a computer and its user even more versatile and efficient. The prior art discloses many approaches to improving the design and versatility of GUIs, but these efforts do not provide the
capabilities to be presented herein. For example, Ullman (U.S. Pat.
No. 9,405,430) discloses a GUI that includes a menu tree to reduce
the distance that a cursor has to move during an instruction
selecting process. Anzures (U.S. Pat. No. 8,736,561) discloses a
method of adjusting properties, content or context of a graphical
object. Tseng (U.S. Pat. No. 8,954,887) discloses a GUI that
pops-up a new window when a touch-sensitive screen is pressed for
an extended period of time. Kushman (U.S. Pat. No. 9,189,254)
discloses an automated tool that can interact with a plurality of
users on a web server through the use of a GUI by each user. Fostall
(U.S. Pat. No. 9,690,446) discloses a plurality of profiles of
finger gestures that are detected by a touch-sensitive display
panel to make the use of a GUI more intuitive. Matthews (U.S. Pat.
No. 8,527,896) discloses a GUI having an icon that can be made to
visually hover over other icons so that the user is informed that
the position of his cursor is over that icon. Mohammed (U.S. Pat.
No. 9,904,874) discloses a neural network system that provides a
time-domain-to-frequency-domain converter for the input signals
prior to extracting features from the input signals as a means of
reducing the loading on the processors of the neural network
system.
[0005] FIG. 1D schematically shows a conventional two-dimensional
(2D) graphical displaying device (115) such as a monitor. FIG. 1D
also shows that the GUI (105) that is applied to the displaying
device (115) is also a 2D GUI. Correspondingly, as FIG. 1D further
shows, the formats of the graphical features (e.g. icon 108) within
that GUI (105) are also in a 2D format. Based on this 2D design
correspondence, the motion vector provided by the conventional
navigational device (such as a mouse) shown in FIG. 1A (101) is in
2D format as well, as further shown in FIG. 1C. During operation, a
user moves a navigational device (101), such as a mouse, on a
two-dimensional (2D) planar reference surface, such as a mouse pad
or a desktop surface (104). The mouse (101) compares a series of
images of the surface captured by its image sensor (102) as it
moves along the reference plane (104) and sends relative motion
vectors to the electronic system or to a cloud of servers (i.e., a
group of servers linked by a network, such as the internet, or a
means of equivalent effect). Upon the receipt of the motion vector
data by the computer shown in FIG. 1D (112), the cursor, shown as
(111) in FIG. 1B, will be moved on the 2D GUI (105) accordingly. In
further detail, as FIG. 1C shows, when the mouse (101) is moved on
a mouse pad or a desktop surface (104) by a 2D motion vector with
components (.DELTA.u, .DELTA.v), it creates a corresponding
positional motion vector (.DELTA.x, .DELTA.y) of the cursor (111)
that appears on the 2D GUI (105). When a conventional 2D
navigational device (101) is used by a 3D GUI, such as the one that
will be described herein and which is pictured schematically for
reference hereinafter as (207) in FIG. 2A, several technological
challenges will be encountered: first, a significant amount of CPU
(central processing unit) or GPU (graphic processing unit) power
will be consumed by the matrix (i.e., array, tensor) transformation
process required for the 2D mouse data to be converted to a 3D
format for the subsequent use by the 3D GUI. Secondly, perhaps even
more importantly, the conventional 2D mouse (101) cannot provide
the angular displacement data for a 3D GUI. Lastly, there is a
major limitation on the conventional 2D navigational device (101)
in that it lacks a comprehensive means to provide a depth value
(z). A further shortcoming of the conventional 2D GUI and a
fundamental strength of the present 3D GUI, is the ability to
control a device such as a robot which is 3-dimensional and has
many degrees of freedom. Such a robot is shown in FIG. 3. In the
present 3D GUI, the end effector of said robot can be envisioned as
a 3D cursor. When the number of degrees of freedom of said robot is very high, the interaction between said 3D cursor and the object carried by said 3D GUI becomes very complicated. Under this circumstance, the artificial intelligence process module (610) can rapidly derive the resultant motions/status of said 3D object, i.e., it can help the presently disclosed 3D GUI derive the output signal of a neural network at a speed and accuracy much higher than those of the prior art. Buttressed by this innovative feature, new applications such as a user-engaging video game, an interactive cartoon that carries future-proof capability, or an intelligent diagnosis system for medical images can reach a performance that is unprecedented in the prior art.
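The matrix-transformation burden described above can be made concrete with a minimal Python sketch (an illustration only, not part of the original disclosure; the function name and camera values are hypothetical). Each 2D mouse displacement must be lifted into 3D by a per-event matrix transform, and neither the depth component nor any angular displacement can be recovered from the two mouse deltas, which are exactly the shortcomings noted above.

```python
import numpy as np

def lift_2d_motion(view_rotation: np.ndarray, du: float, dv: float) -> np.ndarray:
    """Lift a 2D mouse displacement (du, dv) into 3D world coordinates.

    view_rotation is the 3x3 rotation of the world space camera. The depth
    component must be guessed (here: 0), and no rotational degree of freedom
    can be recovered from the two mouse deltas.
    """
    planar = np.array([du, dv, 0.0])      # depth (z) is unknown
    return view_rotation.T @ planar       # per-event matrix transform (CPU/GPU cost)

# Hypothetical camera: yawed 30 degrees about the vertical axis
theta = np.radians(30)
R = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
              [ 0.0,           1.0, 0.0          ],
              [-np.sin(theta), 0.0, np.cos(theta)]])
print(lift_2d_motion(R, du=3.0, dv=-1.0))  # a 3D translation only; no angles
```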
4. SUMMARY
[0006] To address the shortcomings of the conventional GUI, it is an
object of the present disclosure to provide a "pervasive" (i.e.,
comprehensive and fully integrated) 3-dimensional graphical user
interface (3D GUI) for a computer, electronic control system, or
electro-mechanical system that enhances the user's engagement
experience by allowing the user to manipulate the motions of an
object by sufficient degrees of freedom, regardless of its size,
e.g. from an object as small as a single pixel to one that is as
large as a network of computers. For all future reference herein,
the 3D GUI provided by this disclosure is the one represented
schematically as (207) in FIG. 2A. It will hereinafter simply be
referred to as "the presently disclosed 3D GUI" or, more simply,
the 3D GUI.
[0007] To achieve the above object, the 3D GUI will provide
absolute addresses and linear and non-linear motion vectors for a
3D object, enabling a user to gain an extraordinary and
"transparent" experience of engaging directly with that 3D object
so that there is no conscious experience that a GUI is being used.
Further, when providing input to the 3D GUI by using the high
resolution and high sensitivity 3D navigational device, whose
functionality is fully disclosed by docket number NU11-009, Ser.
No. 14/294,369 which is fully incorporated herein by reference (and
will be further discussed below), the presently disclosed 3D GUI
will provide its fullest capabilities and advantages. It will then
be able to provide an absolute address for an object and the
positional accuracy of that object will be kept constant during the
entirety of its motion, instead of the accuracy of the motion
continually deteriorating as a result of successive approximations.
This motional accuracy is a result of the 3D navigational device
being moved on a specially tinted reference surface. Still further,
the presently disclosed 3D GUI can provide a 2.5D coordinate system
(a 2D system with a separate rotational axis) to help the user
learn by interacting with 3D scenery, i.e., renderings that are
created using 3D vector graphics. By manipulating a perspective
angle by moving a world space camera using linear and non-linear
motion vectors in six degrees of freedom, the 3D GUI is able to
classify a plurality of 3D graphical vectors into several classes,
i.e., the basic graphical entities that are used to construct the
3D vector graphics and/or 3D motion vectors selected for denoting
the levels of user engagement.
5. BRIEF DESCRIPTION OF DRAWINGS
[0008] FIGS. 1A, B, C, and D schematically depict elements
associated with a conventional 2D GUI that uses a 2D navigational
device to maneuver a cursor;
[0009] FIGS. 2A, B, C and D schematically depict elements
associated with the presently disclosed 3D GUI that uses a 3D
navigational device to provide 3D motion vectors for an object
having six degrees of freedom (DOF);
[0010] FIG. 3A schematically shows a robot that can be directly
manipulated by the presently disclosed 3D GUI;
[0011] FIG. 3B shows an alternative structure of the end of the arm
of the robot of FIG. 3A which has a different set of descriptive
coordinates corresponding to a different set of matrices.
[0012] FIG. 4A schematically shows layers of the 3D GUI based on a
windowing system, in which a specific GUI layer may be positioned between the input device and the kernel of an operating system, designed for controlling the user's viewing experience; several vendors
in this market segment are also listed;
[0013] FIG. 4B schematically shows an application programming interface (API) that
bridges different types of input devices with the presently
disclosed 3D GUI;
[0014] FIG. 4C illustrates a hardware environment in which the
present 3D GUI operates.
[0015] FIG. 5A schematically shows that the graphical objects in
the presently disclosed 3D GUI (i.e., roses 1701L and R) are
interacting with an approaching object (1707) by a matrix multiplication process;
[0016] FIG. 5B schematically shows that the graphical objects in
the presently disclosed 3D GUI (spots and asteroids) are classified
into different classes (i.e., 1701L and R) in the feature vector
space, allowing for fast and accurate engagement with the
approaching object (1707);
[0017] FIG. 5C schematically shows the typical processing steps
taken by the presently disclosed neural network process module (610
in FIG. 4B) to adjust the accuracy and reliability of the result of
neural signals (i.e., manipulating the multi-dimensional feature vectors by, e.g., a convolutional process, implemented in steps 1714S; kernel functions Kx, implemented in steps 1715; and weighting factors, implemented in steps 1717; etc.);
[0018] FIG. 5D schematically shows that the presently disclosed 3D
GUI is able to reduce the dimension of a vector graphic from 3D to
2.5D, such that the loading on the neural network module (610) is
reduced effectively and efficiently;
[0019] FIG. 6 schematically shows how the apparent "leeway" between two objects (i.e., circles J' and K') in a 2.5D coordinate system changes in accordance with the variation of the perspective angle δ;
[0020] FIG. 7 schematically depicts the directionality of the motion of an object in a 3D space with regard to the vanishing point when a world space camera (embodied as the cartoon Genie) makes a relative motion with regard to the same vanishing point;
[0021] FIG. 8 schematically depicts a method of using a projected plane (i.e., the X(3D)-Z(3D) plane) to analyze the sweeping angle dδ of the perspective angle of a 2.5D coordinate system when the Genie's line of sight sweeps by an angle dΩ;
[0022] FIG. 9 schematically shows how the apparent dimension of an object (i.e., circle A) in a 2.5D coordinate system changes in accordance with the variation of the perspective angle δ;
[0023] FIGS. 10 and 11 schematically show that some non-linear motions of objects are manifested much more strongly than others when the perspective angle δ is changed.
6. DETAILED DESCRIPTION
[0024] The present disclosure describes a three-dimensional (3D)
graphical user interface (3D GUI) of an electronic system, such as
a computer, shown schematically in FIG. 2A as (207). This device
provides the absolute address and linear and non-linear motion
vectors for a 3D object, which gives its user the extraordinary
experience of engaging directly with that 3D object. As an example,
a cartoon "Genie" (204) is shown being made to move from a position
(x; y; z) along various directions n and the genie's "flying
carpet" (201) is being controlled by the Genie (as will be
discussed further below) and made to move along a plane abc, in
direction vectors n and n' etc. The 3D GUI described herein not
only provides the means for constructing and manipulating 3D
graphical constructions (i.e., a Genie or 3D scenery), which is
fully described in related docket no. NU17-001, Ser. No. 16/056,752
which is fully incorporated herein by reference, but it also
provides a complete methodology by which 3D objects, with many
degrees of freedom, such as robots, can be manipulated and
controlled. The present disclosure will concentrate on showing how
the presently disclosed 3D GUI can control a robot intelligently by
application of Machine Learning (ML), a powerful tool of artificial
intelligence (AI).
[0025] FIG. 4A shows a typical GUI in software layer formation,
running on Hardware 620. Hardware 620 is further shown and
described in FIG. 4C. As FIG. 4A shows, a GUI is a plurality of
layers of software lying between the input devices (601) and the
kernel (605) of an operating system (e.g. Windows, Linux, OS,
Android); note that Microsoft Corp. refers to its operating system
which comprises the Kernel 605 and GUI 207 as WINDOWS. In the
generic definition of a GUI, a window is a region of a screen
(i.e., 207 in FIG. 2A) that is allocated to a specific application;
a window manager (e.g. 604) is system software that controls the
placement and appearance of windows within a windowing system in a
graphical user interface (e.g. 207). The typical types of window
managers comprise the stacking type, tiling type, dynamic type, or
the composite type. For the detailed characteristics of a GUI,
readers may refer to the Wikipedia article titled "Graphical User
Interface." Note that although conventional art tends to implement
the above described layers of functions as software (e.g. 602, 603,
and 604, of FIG. 4A), it does not rule out the possibility that a
next generation 3D GUI (207) implements certain of these layers
(i.e., internal process modules of FIG. 4B, such as Support Vector
Machine 616, Neural Network 610, etc.) in hardware (e.g. an Application-Specific IC, ASIC).
[0026] Referring now more particularly to FIG. 4C, hardware 620 (as
shown in FIG. 4A) is (as referred variously herein) a computer,
display system, electronic system, or electro-mechanical system, or
more generally for purposes of this disclosure--a computing device.
The computing device typically includes a central processing unit
(CPU) 1402, a main memory (1404), input/output devices (1406A/B),
input/output ports (1408A/B), memory I/O (1410), a bridge (1412),
and a cache memory (1414) in communication with the central
processing unit (1402). The central processing unit (1402) is any logic circuitry that responds to and processes instructions received from the main memory (1404), and which reads and writes data to and from that memory (1404). The main memory (1404) may include one or more memory chips capable of storing data and allowing any storage location to be directly accessed by the main processor (1402). The graphical user interface of the disclosure is typically
displayed on an I/O device (1406A) such as an electronic display.
Input device 601 (from FIG. 4A) similarly is represented in FIG. 4C
as another I/O device (1406B), which interacts with CPU (1402).
6.1 Embedding Robot Kinematics in a 3D GUI
[0027] As robots become more and more common in our daily lives, the conventional methods (e.g. algorithms and/or software) used to calculate and control the position/motion of a robot are inadequate, as they have no effective way to manipulate the position or motion of a robot in a real-time manner. For applications that require in-situ monitoring and/or control of the kinematics of a robot, the presently disclosed 3D GUI becomes a time-saving and welcome
device. FIG. 3A schematically shows an exemplary "robot", e.g. a
robotic arm, that can benefit from the presently disclosed 3D GUI
(e.g. a six-joint PUMA® robot, hereinafter referred to as robot
700). FIG. 3B shows an alternative drawing of the end of the robot
arm in FIG. 3A requiring a different matrix formulation to describe
the motion of the gripper at the termination of the arm. NU17-001
fully describes an introduction to robot kinematics as applied by
the 3D GUI of this disclosure. For convenience, this section 6.1
repeats some of the introductory material presented in section 6.7
of NU17-001, but the following material in section 6.2 of this
disclosure will expand upon NU17-001 and disclose additional
capabilities of the 3D GUI.
[0028] As FIG. 3A shows, the motion of the respective joints or
elbows of the robot (700) can be described by their
rotational/orientation angles (i.e., θ1, θ2, θ3, θ4, θ5, and θ6).
When the six joints are linked in the way that FIG. 3A depicts, the
associated matrix operation of each respective joint can be
expressed as:
$$
^0A_1=\begin{bmatrix} C_1&0&-S_1&0\\ S_1&0&C_1&0\\ 0&-1&0&H\\ 0&0&0&1 \end{bmatrix}\quad
^1A_2=\begin{bmatrix} C_2&-S_2&0&L_{elbow1}C_2\\ S_2&C_2&0&L_{elbow1}S_2\\ 0&0&1&d\\ 0&0&0&1 \end{bmatrix}\quad
^2A_3=\begin{bmatrix} C_3&0&S_3&a_3C_3\\ S_3&0&-C_3&a_3S_3\\ 0&1&0&0\\ 0&0&0&1 \end{bmatrix}
$$
$$
^3A_4=\begin{bmatrix} C_4&0&-S_4&0\\ S_4&0&C_4&0\\ 0&-1&0&L_{elbow2}\\ 0&0&0&1 \end{bmatrix}\quad
^4A_5=\begin{bmatrix} C_5&0&S_5&0\\ S_5&0&-C_5&0\\ 0&1&0&0\\ 0&0&0&1 \end{bmatrix}\quad
^5A_6=\begin{bmatrix} C_6&-S_6&0&0\\ S_6&C_6&0&0\\ 0&0&1&d_{gripper}\\ 0&0&0&1 \end{bmatrix}\qquad(1)
$$
[0029] where C stands for the cosine function and S stands for the sine function; $L_{elbow1}$ is the length of the elbow linking joint 1 (i.e., the origin of $x_1$-$y_1$-$z_1$) and joint 2 (i.e., the origin of $x_2$-$y_2$-$z_2$); $L_{elbow2}$ is the length of the elbow linking joint 3 (i.e., the origin of $x_3$-$y_3$-$z_3$) and joint 4 (i.e., the origin of $x_4$-$y_4$-$z_4$); and the subscripts 1, 2, 3, 4, 5, and 6 in Eq. (1) denote the rotational angles θ1, θ2, θ3, θ4, θ5, and θ6, respectively. So, when robot (700) is moving, the corresponding kinematics can be expressed by the following matrix multiplication, i.e.,
$$
T_0^i = {}^0A_1\,{}^1A_2\,{}^2A_3\,{}^3A_4\cdots = \prod_{j=1}^{i}{}^{j-1}A_j = \begin{bmatrix} R_{11}&R_{12}&R_{13}&X\\ R_{21}&R_{22}&R_{23}&Y\\ R_{31}&R_{32}&R_{33}&Z\\ 0&0&0&1 \end{bmatrix};\quad i=1,2,\ldots,n \qquad(2)
$$
When i = n, we obtain the T matrix, i.e., $T_0^n$, which provides the positional and rotational information of $P_{end}$, i.e., the end point of robot (700), with respect to the base coordinate system (i.e., O of FIG. 3A). Note that the parameters $R_{11}$ through $R_{33}$, X, Y, and Z of the $T_0^i$ matrix of Eq. (2) can be directly applied to Eq. (3); this means that the presently disclosed 3D GUI can control the motion of robot (700) directly. Alternatively, said parameters $R_{11}$ through $R_{33}$, X, Y, and Z of the $T_0^i$ matrix can be transformed into other formats; two corresponding ones are shown in Eqs. (3) and (4), respectively. Readers are advised that when i is less than n, said $T_0^i$ matrix denotes the position and rotation of an internal joint, i.e., $^0A_1$, $^1A_2$, $^2A_3$, $^3A_4$, $^4A_5$, respectively. Special notice is further advised that, using the 3D navigational device described in docket no. NU11-009, Ser. No. 14/294,369, the presently disclosed 3D GUI can impart physical meaning to the above stated parameters by considering said T matrix in the following formation:
$$
T=\begin{bmatrix} x_n & y_n & z_n & p_n\\ 0&0&0&1 \end{bmatrix}=\begin{bmatrix} n & s & a & p\\ 0&0&0&1 \end{bmatrix}=\begin{bmatrix} n_x&s_x&a_x&p_x\\ n_y&s_y&a_y&p_y\\ n_z&s_z&a_z&p_z\\ 0&0&0&1 \end{bmatrix}\qquad(3)
$$
where:
[0030] n is the normal vector of the hand. If we use a parallel-jaw hand, n will be orthogonal to the fingers of the robot. FIG. 3B shows the direction of $\vec{n}$.
[0031] s is the sliding vector of the hand. It points in the direction of the gripper (e.g. a simplified finger) for the opening and closing movement; FIG. 3B shows the direction of $\vec{s}$.
[0032] a is the approach vector of the hand. It points in the direction normal to the palm of the hand (the rotating plane denoted by the $y_5$ and $z_5$ axes); FIG. 3B shows the direction of $\vec{a}$.
[0033] p is the position vector of the hand. It points from the origin of the base coordinate system (i.e., point O of FIG. 3B) to the center of the hand coordinate system when the gripper is closed (i.e., $P_{end}$). Specifically, $\vec{n}=[n_x, n_y, n_z]$, $\vec{a}=[a_x, a_y, a_z]$, $\vec{s}=[s_x, s_y, s_z]$, $\vec{p}=[p_x, p_y, p_z]$, where:
$$
\begin{aligned}
n_x &= C_1[C_{23}(C_4C_5C_6-S_4S_6)-S_{23}S_5C_6]-S_1[S_4C_5C_6+C_4S_6]=R_{11}\\
n_y &= S_1[C_{23}(C_4C_5C_6-S_4S_6)-S_{23}S_5C_6]+C_1[S_4C_5C_6+C_4S_6]=R_{21}\\
n_z &= -S_{23}(C_4C_5C_6-S_4S_6)-S_{23}S_5C_6=R_{31}\\
s_x &= C_1[-C_{23}(C_4C_5C_6-S_4S_6)+S_{23}S_5C_6]-S_1[-S_4C_5C_6+C_4S_6]=R_{12}\\
s_y &= S_1[-C_{23}(C_4C_5C_6+S_4S_6)+S_{23}S_5C_6]+C_1[-S_4C_5C_6+C_4S_6]=R_{22}\\
s_z &= S_{23}(C_4C_5C_6+S_4S_6)+S_{23}S_5C_6=R_{32}\\
a_x &= C_1(C_{23}C_4C_5+S_{23}C_5)+S_1S_4S_5=R_{13}\\
a_y &= S_1(C_{23}C_4C_5+S_{23}C_5)+C_1S_4S_5=R_{23}\\
a_z &= -S_{23}C_4S_5+C_{23}C_5=R_{33}\\
p_x &= C_1[d_{gripper}(C_{23}C_4S_5+S_{23}C_5)+S_{23}d_4+a_3d_4+a_3C_{23}+a_2C_2]-S_1(d_{gripper}S_4S_5+d_2)\\
p_y &= S_1[d_{gripper}(C_{23}C_4S_5+S_{23}C_5)+S_{23}d_4+a_3d_4+a_3C_{23}+a_2C_2]+C_1(d_{gripper}S_4S_5+d_2)\\
p_z &= d_{gripper}(C_{23}C_5-S_{23}C_4S_5)+C_{23}d_4-a_3d_{23}+a_2S_2+H
\end{aligned}\qquad(4)
$$
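The kinematic chain of Eqs. (1) through (3) can be sketched numerically as follows (an illustration, not part of the original disclosure; the joint angles and Denavit-Hartenberg parameters are hypothetical values, not those of the robot above). The sketch chains six link matrices into the T matrix of Eq. (2) and reads the n, s, a, and p vectors off its columns per Eq. (3).

```python
import numpy as np

def dh_matrix(theta, d, a, alpha):
    """Homogeneous link transform (Denavit-Hartenberg convention),
    the general form of each A matrix in Eq. (1)."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([[ct, -st * ca,  st * sa, a * ct],
                     [st,  ct * ca, -ct * sa, a * st],
                     [0.0,      sa,       ca,      d],
                     [0.0,     0.0,      0.0,    1.0]])

# Hypothetical joint angles and link parameters for a six-joint arm
thetas = np.radians([10, 20, 30, 40, 50, 60])
params = [  # (d, a, alpha) per joint; illustrative values only
    (0.67, 0.0,  -np.pi / 2),
    (0.15, 0.43,  0.0),
    (0.0,  0.02,  np.pi / 2),
    (0.43, 0.0,  -np.pi / 2),
    (0.0,  0.0,   np.pi / 2),
    (0.06, 0.0,   0.0),
]

T = np.eye(4)
for theta, (d, a, alpha) in zip(thetas, params):
    T = T @ dh_matrix(theta, d, a, alpha)   # running product of A matrices, Eq. (2)

n, s, a_vec, p = T[:3, 0], T[:3, 1], T[:3, 2], T[:3, 3]  # columns per Eq. (3)
print("end-effector position p =", p)
```

In a 3D GUI of the kind described here, the p vector would drive the 3D cursor (the end effector), while n, s, and a would fix its gesture.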
6.2 Using Robot Kinematics in Conjunction with Machine Learning
[0034] In related disclosure, docket no. NU17-001, Ser. No.
16/056,752 which is fully incorporated herein by reference, we have
introduced the present 3D GUI and addressed its capabilities when
used in conjunction with a 3D navigational device that can provide
absolute addresses. We have also described many of the capabilities
of that 3D GUI as related to its ability to create and manipulate
3D scenery. Further, we have also introduced the manner in which
robot kinematics can be embedded in the 3D GUI so that mechanical
systems having many degrees of freedom (DOF) can be accommodated by
the GUI seamlessly, transparently and pervasively. We now show that
this same ability to deal with robot kinematics, when augmented by
the additional capabilities of artificial intelligence such as
Machine Learning, provides a user with unprecedented modes of
interaction with that 3D scenery and, in addition, even provides a
method for teaching the user how to interact with that 3D
scenery.
[0035] This section begins by showing how a 3D GUI can use a 2.5D
coordinate system (a 2D system with an additional axis of rotation)
to help a user and, even more generally, a viewer, to learn how to
interact with 3D scenery (i.e., a rendering created using 3D vector
graphics) effectively and efficiently. By manipulating a
perspective angle properly, the 3D GUI disclosed herein is able to
classify a plurality of graphical vectors (i.e., the basic entities
that construct 3D vector graphics) and/or motion vectors selected
for denoting the levels of user engagement, into several classes.
When these classes are separated by clear margins, the presently
disclosed 3D GUI achieves an optimal condition in which to render
key graphical features and, thereby, for engaging with the viewer
most effectively. Here the difference between a graphical vector and an ordinary Euclidean vector (e.g. a motion vector) must be recalled. Graphical vectors are not the same as mathematical Euclidean vectors: they are constructed from Euclidean vectors, but a graphical vector is usually hidden by the software that actually creates the graphs, e.g. the 2D and 3D vector graphics. For a detailed analysis of these distinctions, the reader may refer to Rex van der Spuy, "Advanced Game Design with Flash", Apress, p. 306, ISBN 978-1-4302-2739-7 (https://books.google.com/books?id=Xsheyw3JJrMC&pg=PA306).
[0036] To teach the presently disclosed 3D GUI and its user how to
manipulate a perspective angle intelligently, there are two
fundamental requirements that must be concurrently met. First, the
world space camera used by the 3D GUI and described in related
application NU17-001 should be treated as a realistic 3D entity, so
that it can be moved by the continual translational motion vectors
and rotational motion vectors of six degrees of freedom, much like
the robot introduced in FIG. 7A of NU17-001 and shown here as FIG.
3. Second, the world space camera should be an "intelligent" one;
that is, one that, by using a state-of-the-art machine learning algorithm, can classify the graphical objects/graphical vectors selected by the 3D GUI into a plurality of classes, so that the process of classification enables the user to learn from the 3D scenery effectively and efficiently.
[0037] In previously disclosed docket NU17-001 Ser. No. 16/056,752
(as noted above), the presently disclosed 3D GUI focused on the
manipulation of the position or motion of a structured entity,
i.e., a robot (shown here as FIG. 3), whose degrees of freedom at all joints and elbows have been clearly defined. The present
disclosure focuses on how that same 3D GUI supports an operator
interacting with a 3D graphical entity whose essential features may
not be as clearly defined as by a conventional 2D GUI, but which,
after mapping, can be denoted by the feature vectors in a
substantially higher dimensional space (e.g. one having hundreds,
even thousands, of coordinates)--so high a dimension that the
viewer may not be able to know the exact dimension of said feature
vector space. In nature, we recognize the existence of an object by
its 3D location and 3D motion vectors. To analyze the location and
motion vectors of a fairly large number of objects in a 3D space, a
supervised learning process can play a vital role (e.g. the
"support vector machine", or SVM). On the other hand, a learning
process as such can consume a large amount of the calculation power
of the computer (e.g. CPU, GPU, etc.). If a GUI designer is not
aware of such a situation, they may not feel the imminent demand
for the industry to develop an intelligent 3D GUI such as disclosed
herein to meet the upcoming challenges of pervasive 3D graphics. In
the graphic rendering industry, it is very common that a 3D scenery
contains many delicate details (e.g. 3D vector graphics) and the
way our brains comprehend them is strongly associated with the
perspective angles we take to see them.
[0038] Perspective angle is a unique feature embedded in a 2.5D
graphic rendering; we human beings rely on the perspective angle to
comprehend the 3D world. In this section, we will combine the merit of the perspective angle with a state-of-the-art learning algorithm (i.e., the support vector machine, SVM). With this unique combination of the two arts, a next-generation 3D GUI (such as the presently disclosed 3D GUI) can demonstrate unprecedented technological superiority over its predecessors.
[0039] When the computer industry fully enters the 3D graphics regime, the soaring amount of data generated by its 3D objects creates a unique space containing a large quantity of feature vectors (which are not to be confused with the graphical vectors). Graphical vectors are those used to depict the appearance of static objects; two exemplary graphical vectors are arrows (1712) and (1713) of FIG. 5A. In the feature vector space, a motion vector is considered to represent the identity of an object, and it has an importance equivalent to that of said graphical vector in its own space.
[0040] Eq. (5) (below) is extracted from the related application, docket no. NU17-001, Ser. No. 16/056,752, where it appears as Eq. (8). As this equation and FIG. 5A both show, the motion vector of a 3D object can be denoted by a (3×3) matrix.

$$
P' = R\,P + T:\quad \begin{bmatrix} X'\\ Y'\\ Z' \end{bmatrix} = \begin{bmatrix} R_{11} & R_{12} & R_{13}\\ R_{21} & R_{22} & R_{23}\\ R_{31} & R_{32} & R_{33} \end{bmatrix} \begin{bmatrix} X\\ Y\\ Z \end{bmatrix} + \begin{bmatrix} T_X\\ T_Y\\ T_Z \end{bmatrix} \qquad (5)
$$
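As a quick numerical check of Eq. (5) (illustrative values only, not from the disclosure), the following snippet rotates a point 90 degrees about the Z axis and then translates it:

```python
import numpy as np

# Rotation R: 90 degrees about the Z axis; translation T (illustrative values)
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
T = np.array([1.0, 2.0, 3.0])

P = np.array([1.0, 0.0, 0.0])   # original 3D point
P_prime = R @ P + T             # Eq. (5): P' = R P + T
print(P_prime)                  # -> [1. 3. 3.]
```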
[0041] Graphical vectors can also be denoted by (3×3)
matrices. So, by mapping, the motion vector and graphical vector of
a 3D object may, taken together, constitute a feature vector space.
Note that the feature vector space may have a substantially large
number of dimensions (e.g. >>6); if one uses physical
connotation to depict such a situation, the dimension of the space
established by said feature vectors can be so high that sometimes a
viewer may not have the mental capacity to understand the
"kinetics", or "kinematics", among them quickly enough. Thus, the
following challenge for the 3D GUI arises: how can it present a 3D graphical image that the viewer can comprehend more effectively (e.g. at a pace that matches the substantially large flow of the information delivered)? As one may deduce from the above, the solution has to be a mathematically sound methodology that allows the perspective angles to be adjusted intelligently. Thus, the present disclosure has to provide a method that enhances the efficiency of a viewer's learning of 3D vector graphics by perspective rendering techniques. To meet the above goal, the
presently disclosed 3D GUI has incorporated algorithms of
artificial intelligence including the Support Vector Machine (i.e.,
616 of FIG. 4B), Neural Network (610), and means of perspective
adjustment (i.e., 607), all in its internal process modules (615).
It is thus an objective of this section to illustrate how these
modules collaborate to meet the goal.
[0042] To a conventional GUI designer, changing a perspective angle
manually, as by turning a rotational knob, seems not to be a
difficult task (e.g. as when moving the world camera). In
conventional art (e.g. a conventional video game), there have been
quite a few software features that are designed for changing a
scene quickly (e.g. WINDOW 1 (210) and WINDOW 2 (211) of FIG. 2A).
Meanwhile, what really challenges a new generation of GUIs is how
to change the perspective angle both proactively and intelligently;
this has not been addressed by the conventional art. Here the
reader must be advised that toggling the scenes (i.e., selecting
different windows) by clicking the instructions is an artificial
means--it is not a realistic viewing experience that a person has
in daily life. In the real world, many dynamic situations may take
place in a 3D environment, and they are equally, if not more,
important than the static scenes. For example, when an internet
service provider is providing a movie that is composed of many
dynamic situations for a viewer, some objects may be visible to the
viewer only from one perspective angle; others may be visible to
said viewer from another. Hence, a realistic way of rendering a
movie containing a plurality of dynamic situations is allowing a
viewer to change his/her perspectives to scenery by walking close
to or away from an object (which is a translational motion), or
rotating his/her head to different directions (which is a
rotational motion)--these are all preferably implemented by the
continuous motions of a camera (contrary to hopping through a
plurality of static windows) instead of merely swapping the scenes.
In the conventional GUIs that claim to have a certain amount of 3D perspective capability, there is a software module entitled
"world camera" (i.e., perspective camera) that is responsible for
doing the job of changing perspective angles. However, conventional
art lacks an effective and comprehensive means to maneuver the
world camera intelligently. The rapid scene hopping caused by the
conventional art often leads to a chaotic or dizzying viewing
experience for the viewer. In the presently disclosed 3D GUI, the
motion of the world camera is as though it is being held by a
sturdy robot whose body is invisible to the viewer. Thus, the
translational and rotational motions of the scene are controlled in
a continual and smooth manner. This feature is achieved by the
collaboration of several processing modules of the 3D GUI.
Specifically, as FIG. 4B shows, inside the presently disclosed 3D
GUI there are specific processing modules designed for Perspectives
(607), Robotics (608), Neural Network (610), and Support Vector
Machine (SVM, 616). To make the above concept more concrete to the
viewer (i.e., an effective process of maneuvering the perspective
angles of a scene provides an extraordinary comprehension for a
dynamic situation), a unique cartoon character (i.e., Genie 204) is
created by the 3D GUI, it is a 3D object without a fixed body
formation, but its center of body (e.g. torso 218) can be moved by
the translational and rotational motion vectors provided by the
associated 3D navigational device (202) of FIG. 2D. Thus the job of
generating the translational motion and rotational motion vectors
for said world camera thereby can be done in a live manner.
[0043] Referring now back to FIG. 2A, a 3D cartoon character, Genie
(204), is generated for such a purpose (navigating a world camera
in 3D space). When a user clicks the oil lamp (216), using
navigational device (202) of FIG. 2B the cartoon Genie (204)
appears, thereby the user can cause the Genie (204) to change his
perspective angle toward different scenes, such as the one
containing the flying carpet (201), or do something else. If the
operator clicks the oil lamp (216) again, the cartoon Genie (204)
will disappear; consequently, the Genie's perspective angle
adjusting function will be inactivated. In the 3D GUI, the
provision of said cartoon Genie (204) denotes that an electronic
system (200) carrying the presently disclosed 3D GUI has a unique
ability to change the perspective angle intelligently and
automatically. In addition to the automatic process, the presently
disclosed 3D navigational device (202) may change Genie's position
in a 3D space (i.e., x', y', z') by dragging the Genie (204)
manually. Essentially, the entire process of manually dragging the
cartoon Genie (204) is similar to that of maneuvering the flying
carpet (201). What is more important to Genie (204) is that his
track can be analyzed and stored by the presently disclosed 3D GUI,
so that an intelligent maneuvering plan for Genie's perspective
angle can be created by the presently disclosed 3D GUI (note that
this process may require the support of some artificial
intelligence functionality, which would be contained in an internal
process module (610), shown schematically in FIG. 4B, which is
embedded in the presently disclosed 3D GUI). The detailed steps of
such an internal process module are the following. If an operator
is intending to create a "script" for Genie's maneuvering plan, in
the initial step, the operator may move the 3D cursor (209) to a
position from which it is suitable to aim the 3D cursor (209) at
Genie (204) directly; thereafter the operator may click the mouse button (215) to notify the 3D GUI that Genie (204) has been designated for service. Then, by wiggling the finger (206), one
may cause the normal vector of said Genie (204) to change (e.g.
from {right arrow over (n'')} to {right arrow over (n''')} in FIG.
2A); as we have learned from the former paragraphs, looking at FIG.
2D an operator can wiggle their finger (206) while moving the body
of the presently disclosed 3D navigational device (202) over the
reference surface (205). Corresponding to the combined effect of
the operator's finger's motion and hand maneuvering motion over the
reference surface (205), Genie's body (204) is moved in 3D manner
concurrently. In Genie's eyes (its image can be displayed by a
secondary 3D GUI, e.g. 207W1 of FIG. 8, which is a window incorporated within the primary 3D GUI as a separate window; or it can be presented in juxtaposition with the primary 3D GUI as a second tiled window; or it can be presented in a swapping process with the primary window 207, etc.), the combined effect of the
above stated translational and rotational movement generates an
extraordinary viewing experience of scenery (e.g. Genie's
perception of the location of the flying carpet 201 is constantly
changing due to the maneuvering movement of the 3D navigational
device 202). In the past, there have been some graphical sketching programs providing the so-called world space camera as a preliminary means of changing the perspective angles; this kind of camera only provides very primitive functions compared to those of the presently disclosed Genie (204). Having no knowledge of the scene, it is not intelligent enough to determine where or how to look into a scene effectively. Today's computer confronts a vastly large amount of data provided by the internet (the computer industry often calls this phenomenon "Big Data"); when a computer is inundated by a large amount of data, using conventional means such as relational database analysis to sort out the information is cumbersome, and it often leads to results that are not very useful in the end. Providing a means for a computer to visually investigate a dynamic situation from different perspective angles thus becomes a stitch in time for such a situation (i.e., Big Data analysis). Fundamentally speaking, without the presently disclosed
3D navigational device (202), the above stated conventional world
space camera cannot change the scenes (i.e., the above stated
secondary 3D GUI) by manipulating the translational and rotational
motions simultaneously. Hindered by such a shortcoming, prior art
cannot generate a script of motion for said world space camera to
move like a live Genie (204). Lacking a script to "steer" Genie
(204), the conventional world space camera cannot provide any
"scene plot" for a viewer to learn from the "kinetics" of a scene
(note that there is a great deal of "kinetics" that affect the
characters of a cartoon movie or video game; however, the state of
art digital graphic industry has not exploited this characteristic
effectively). For example, when watching the Disney® cartoon
movie "Peter Pan", some viewers may like to follow where Peter Pan
flies; others may like to follow where Tinker Bell or Wendy has
gone. The presently disclosed Genie (204) is in fact a unique
software module embedded in a 3D GUI that, without verbose
education, provides an intuitive suggestion that the user adjust
their perception to an imaginary 3D world. Still further, it takes
no effort for an operator to comprehend that Genie (204) can be
enlarged or shrunken (i.e., by zooming in and out of a scene); this feature fits the scenario of watching the cartoon movie "Alice in Wonderland" well. Still further, it takes no effort for an
operator to understand that said Genie (204) can "transit" through
different worlds with no difficulty, and this feature fits the
theme of Disney's cartoon movie "Fantasia" quite well. Thus, a 3D
GUI affiliated with the presently disclosed cartoon feature Genie
(204) can help a viewer transit through various world zones without
being befuddled by the drastic changes of scene. The conventional
world space camera does not have such a capability in that the body
of said camera is considered as a mathematical point; this makes it
difficult for the operator to adjust its gesture and position
easily. In an envisioned Disney movie "Fantasia® 3D", the viewer
becomes a virtual Tinker Bell, immersing herself in an imaginary 3D
environment, observing all the 3D objects surrounding her (e.g.
flowers, etc.) from different perspective angles. In physics, the
address of a static object can be denoted by its whereabouts in a
three dimensional coordinate system (i.e., X, Y, and Z). As to the
essential graphical entities carried by said static object, they
are often denoted by the vectors derived from said address; we call the space constituted by such vectors the graphical vector space (the objects formed by these graphical vectors are called vector graphics). If the geometrical relationship of a plurality of
objects (e.g. vector graphics) is not complicated (i.e., said
objects are separated by a linear margin in the formation of a wide
and clear band), a viewer may comprehend said graphical vector
space with no difficulty. When the relationship is quite
complicated (e.g. that margin is a non-linear one), the artificial
intelligence module (e.g. 610 in FIG. 4B) of the presently
disclosed 3D GUI may jump into play and map the above stated graphical vectors to another feature vector space, which has the benefit of separating said objects more clearly. One can
characterize said mapping process as the following: the dimension
of said feature vector space will be increased by said mapping
process--it can be relatively high, and such a high dimension
denotes the fundamental methodology that an artificially
intelligent being such as a computer program, or a biologically
intelligent being, such as a human, uses to learn about the world.
Put differently, the above stated mapping process can be deemed a transition of an object through different spaces. One comes to the understanding that the dimension of the realistic world of matter (at rest or in slow motion) as perceived by human eyes cannot be higher than three. But the way a 3D object is
interacting with the world (i.e., its motion vectors) can be
denoted by a space of vectors whose dimension is far higher than
three (e.g. the typical motion of a 3D object has six degrees of
freedom; if said object is accelerating, then the degrees of
freedom of said motion can be higher than six). With this basic
understanding in mind, a high quality 3D GUI as disclosed herein
treats static and moving objects by different means. When the
presently disclosed 3D GUI is depicting a static object, using its
absolute address in three-dimensional format, i.e., X, Y, and Z,
meets the purpose. When the presently disclosed 3D GUI is
depicting a moving object, it is able to depict the dynamics
of the motions of said object by using six or more degrees of
freedom (i.e., each degree of freedom of said motion may constitute
one dimension of said feature vector space). In the end, the
dimension of the feature vector space can be far higher than six.
When a 3D scenery is composed of a plurality of objects (e.g.
vector graphics), the margins between these objects can be quite
complicated (e.g. a non-linear band that zig-zags through said
objects). The conventional (2D) GUI does not know how to deal with
such a situation because it treats all 3D objects as static points,
i.e., its dimension is limited to two or three. What is worse, when
the conventional GUI treats the above stated objects as a set of
mathematical points, the points themselves have NO sensitivity to
rotation, which makes it difficult for the conventional GUI to
adjust the gesture of said 3D objects or the perspective angle
easily. Characterized in physical terms, the problem of the
conventional GUI is this: the dimension of the feature vector space
of a plurality of objects can be so high that the conventional GUI
has no effective means to characterize the graphical feature
vectors carried therein. To solve this problem, the present high
quality 3D GUI has to utilize an intelligent process to adjust the
perspective angle, such that a plurality of objects/motions can be
classified into several classes when they are presented to the
viewer by varying graphical vectors (graphical vectors are not the
Euclidean vectors on the screen that are directly perceivable to
human eyes, but they can be comprehended by a human through
visualization or imagination).
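As a concrete illustration of the above mapping idea, the following minimal sketch (Python with synthetic data; an assumption for illustration, not part of the disclosed implementation) shows two classes of graphical vectors separated by a non-linear (circular) margin in the original 2D space becoming separable by a simple plane after mapping to a 3D feature vector space:

```python
# Minimal sketch: a non-linear margin in the graphical vector space
# becomes a linear margin after mapping to a higher-dimensional
# feature vector space. All data here are synthetic.
import numpy as np

rng = np.random.default_rng(0)

# Class A: points near the origin; class B: points on a surrounding
# ring. The margin between them is a circle, i.e., non-linear in (x, y).
a = rng.normal(0.0, 0.3, size=(50, 2))
t = rng.uniform(0, 2 * np.pi, size=50)
b = np.column_stack([2.0 * np.cos(t), 2.0 * np.sin(t)])
b += rng.normal(0.0, 0.1, size=b.shape)

def phi(p):
    """Map 2D graphical vectors to 3D feature vectors."""
    x, y = p[:, 0], p[:, 1]
    return np.column_stack([x, y, x**2 + y**2])

# In feature space the plane z = 2.5 separates the classes linearly.
print("class A below plane:", bool(np.all(phi(a)[:, 2] < 2.5)))
print("class B above plane:", bool(np.all(phi(b)[:, 2] > 2.5)))
```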
[0044] A Support Vector Machine (SVM) is a machine learning process
that has been widely used in such fields as pattern recognition,
forecasting, classification, and regression analysis. The SVM has
proven quite useful in the above applications because in many cases
its performance is superior to that of comparable prior-art
approaches, such as conventional statistical models. The detailed
theory of the SVM was developed by Corinna Cortes and Vladimir
Vapnik and published in: "Support-Vector Networks", Machine
Learning, 20, 273-297 (1995).
[0045] In its general formulation, an SVM is defined as a maximum
margin classifier (see 1711 of FIG. 5B as an example), whose
decision function is a hyperplane (e.g. 1710) in a feature vector
space (e.g. the X^d-Y^d (d>3) space of FIG. 5B). At the maximal
value of the margin, the hyperplane divides a plurality of feature
vectors into different classes (see K_1701L and K_1701R as
examples). Using an SVM requires teaching. In the graphical vector
space, the teaching may be given as a labeled training data set
{x_i, y_i}_{i=1..n}, where x_i ∈ ℝ^N and y_i ∈ {-1, +1}; given a
nonlinear mapping Φ(·)--a situation common to most 3D graphical
rendering devices--the SVM method solves the following:

$$\min_{w,\,\xi,\,b}\left\{\frac{1}{2}\|w\|^{2}+C\sum_{i=1}^{n}\xi_{i}\right\} \tag{6}$$

[0046] which is constrained to:

$$y_{i}\{\langle\Phi(x_{i}),w\rangle+b\}\geq 1-\xi_{i}\quad\forall i=1\ldots n \tag{7}$$

$$\xi_{i}\geq 0\quad\forall i=1\ldots n, \tag{8}$$
[0047] The parameters w and b in Eqs. (6) and (7) denote a linear
classifier in ℝ^N since x_i is in ℝ^N; ξ_i is a positive-valued
slack variable that denotes the permitted errors of classification.
We may now consider a set of learning data (x_1, y_1), . . . ,
(x_n, y_n) ∈ χ × 𝒴, where x_i is an input taken from χ, and
y_i ∈ 𝒴 is called the output, denoting which class said input
belongs to. In the present case, x_i denotes the respective
graphical vectors in a GUI. Per the theory of artificial
intelligence, a machine learning process is one that uses the above
stated pairs of learning data to construct a model or function to
predict the output, i.e., y_test ∈ 𝒴, of yet-to-come test data
x_test ∈ χ. To develop
a machine learning process that generalizes well, an SVM may use
the so-called kernel method to exploit the structure of the data as
a means to find the similarity between pairs of said learning data.
When χ denotes a space of graphical vectors that are used to
construct a complicated 3D graphical entity, there may not be a
notion of similarity in said χ space; to cope with this problem, an
intelligent GUI will map said learning data (x_n, y_n) to a feature
vector space ℋ using a means of mapping, e.g. Φ: χ → ℋ, x ↦ Φ(x).
The similarity between the elements in ℋ, i.e., the feature
vectors, can now be expressed using the associated dot product
⟨·,·⟩. Henceforth, we may define a function K: χ × χ → ℝ that
computes said similarity, such that (x, x') ↦ K(x, x'). This
function K is typically called the kernel function by the industry;
using the fundamental property of the "dotting" process, K
satisfies:

$$K(x, x') = \langle\Phi(x), \Phi(x')\rangle. \tag{9}$$
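As an illustrative aside, Eq. (9) can be verified numerically for a kernel whose feature map is known in closed form; the degree-2 polynomial kernel below is a standard textbook choice, assumed here only for demonstration:

```python
# Numerical check of Eq. (9): a kernel evaluation K(x, x') equals the
# dot product of the explicitly mapped feature vectors Phi(x), Phi(x').
import numpy as np

def K(x, xp):
    # Homogeneous polynomial kernel of degree 2 on 2D inputs.
    return float(np.dot(x, xp)) ** 2

def phi(x):
    # Its explicit feature map (3D feature vector space).
    return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

x, xp = np.array([1.0, 2.0]), np.array([3.0, -1.0])
print(K(x, xp))                        # (1*3 + 2*(-1))^2 = 1.0
print(float(np.dot(phi(x), phi(xp))))  # same value, per Eq. (9)
```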
[0048] Thus, one understands that the mapping Φ is a feature
map, and thereby the space ℋ is a feature vector space. In a 3D GUI,
the graphical vectors of the essential features of the 3D graphical
objects are converted (i.e., transformed) to said feature vectors by a
mapping process; the associated scheme can be selected by the GUI
designer (this process usually cannot be implemented as an in-situ
one; each application program may have its own mapping process; to
expedite the process, a possible way is adding an ASIC, i.e., an
application-specific IC, to an electronic system to handle this
job). When a GUI classifies a plurality of feature vectors into
several classes (typically using a GUI dedicated to this process),
a layout of multi-class graphical entities is established in said
feature vector space. This is a space that may not be directly
perceivable by the viewer via a displaying device made of pixels,
but it is indeed one that is understandable by the software of said
GUI. For example, when a flower bouquet is composed of roses,
tulips, and lilies, a high quality SVM-GUI (i.e., a GUI
specifically associated with SVM) is able to sort them into
different classes (e.g. class 1: roses, class 2: tulips, and class
3: lilies). When a cursor is embodied as a butterfly by the high
quality SVM-GUI, the interactions between the different kinds of
flowers and the butterfly (i.e., cursor) can be different.
Fundamentally speaking, such a unique capability of the presently
disclosed GUI (207) is attributed to its capability of recognizing
different classes of said feature vectors, and thereby the
presently disclosed GUI (207) can support an operator, i.e., the
butterfly, enabling it to navigate through said bouquet in an
interactive manner.
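A hedged sketch of such an SVM-GUI classification, using scikit-learn's SVC on synthetic placeholder feature vectors (the clusters and feature values below are invented for illustration and are not data from this disclosure):

```python
# Sketch: classify flower feature vectors into rose/tulip/lily classes
# with a kernelized SVM, then query the class of the object under the
# cursor. All numbers are synthetic placeholders.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
classes = {0: "rose", 1: "tulip", 2: "lily"}

# Three synthetic clusters of graphical feature vectors (e.g. hue,
# petal-shape descriptors), one cluster per flower class.
centers = np.array([[0.0, 0.0], [3.0, 0.0], [1.5, 3.0]])
X = np.vstack([c + rng.normal(0, 0.4, size=(30, 2)) for c in centers])
y = np.repeat([0, 1, 2], 30)

svm = SVC(kernel="rbf", C=1.0).fit(X, y)   # kernel method per Eq. (9)

cursor_contact = np.array([[2.9, 0.2]])    # feature vector under cursor
print(classes[int(svm.predict(cursor_contact)[0])])  # expected: "tulip"
```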
[0049] In a relatively simple situation, such a plurality of
flowers may have clear margins in the graphical vector space so
that the above stated classification does not have to be elevated
to the feature vector space. Using a linear algorithm, an SVM can
do a decent job of separating the flowers directly. When said
margin is not linear in the graphical vector space, using the
kernel method to exploit the above attribute (i.e., linear
classification) in a higher dimensional space, i.e., the feature
vectors space, and thereafter constructing a linear algorithm
therein, may result in the non-linear algorithm addressing a
complicated (i.e., non-linear margin) situation in the graphical
vector space successfully. From a mathematical point of view, the
above stated kernel method relies on the notion of similarity among
said feature vectors. In the following, we show how that similarity
is associated with the vector dotting (dot product) process.
[0050] Let us consider a set of teaching data (x_1, y_1), . . . ,
(x_n, y_n) ∈ χ × 𝒴, wherein the x_i are inputs taken from χ and the
y_i ∈ 𝒴 are called the outputs (i.e., the classification). In the
field of artificial intelligence, a machine learning process
denotes a unique methodology that uses the above stated teaching
data pairs to construct a model or function to predict on test
examples x ∈ χ, which are unseen at the moment of learning but are
expected to come afterwards. To construct a machine learning
process that generalizes well, a kernel method (i.e., a module of
software that contains a kernel function) can be used to exploit
the structure of said learning data and thus define a similarity
between said pairs of teaching data. In the following, we
demonstrate the vital role the vector dotting process plays in said
kernel function.
[0051] A real symmetric n×n matrix K whose elements are
K(x_i, x_j) is called positive definite if, for all c_1,
. . . , c_n ∈ ℝ,

$$\sum_{i,j=1}^{n} c_{i}\,c_{j}\,K(x_{i},x_{j}) \geq 0 \tag{10}$$
[0052] With this attribute kept in mind, a GUI design engineer
understands that algorithms operating on the data in terms of dot
products can employ any positive definite kernel function by simply
replacing the dot product formulation ⟨Φ(x), Φ(x')⟩ with kernel
evaluations K(x, x'); this is a technique the industry calls "the
kernel trick" (NOT to be confused with the kernel of the operating
system 605 in FIG. 4A, e.g. Linux, XNU, etc.). If a GUI design
engineer applies some algebra to said kernel functions, he/she may
find that said SVM kernel functions are very useful in terms of
teaching a GUI how to present a complicated graph efficiently. We
demonstrate this in the following. Let K_1 and K_2 be two positive
definite kernels on χ × χ, A a symmetric positive semi-definite
matrix, d(x_i, x_j) the result of a dotting process that
constitutes a proper distance, f any function with support in χ,
and μ > 0. Then, the following functions are also kernel
functions:
$$K(x,x') = K_{1}(x,x') + K_{2}(x,x') \tag{11}$$

$$K(x,x') = \mu\,K_{1}(x,x') \tag{12}$$

$$K(x,x') = K_{1}(x,x') \times K_{2}(x,x') \tag{13}$$

$$K(x,x') = x^{\top} A\, x' \tag{14}$$

$$K(x,x') = K_{1}(f(x), f(x')) \tag{15}$$
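The closure properties in Eqs. (11)-(13) can be checked numerically against the positive-definiteness criterion of Eq. (10); the sketch below (toy data; RBF and linear base kernels are assumed choices) confirms that each combined Gram matrix has no significantly negative eigenvalue:

```python
# Numerical check that the kernel combinations of Eqs. (11)-(13) stay
# positive definite in the sense of Eq. (10).
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(20, 3))

def gram(k):
    return np.array([[k(a, b) for b in X] for a in X])

k1 = lambda a, b: np.exp(-np.sum((a - b) ** 2))  # RBF base kernel
k2 = lambda a, b: float(np.dot(a, b))            # linear base kernel
mu = 2.5                                         # scaling factor, Eq. (12)

combos = {
    "sum, Eq. (11)":     lambda a, b: k1(a, b) + k2(a, b),
    "scaled, Eq. (12)":  lambda a, b: mu * k1(a, b),
    "product, Eq. (13)": lambda a, b: k1(a, b) * k2(a, b),
}
for name, k in combos.items():
    min_eig = float(np.linalg.eigvalsh(gram(k)).min())
    print(name, "min eigenvalue ~", round(min_eig, 9))  # >= 0 up to noise
```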
[0053] The above basic properties of kernel functions help a GUI
designer develop measures to refine the similarity among the
feature vectors, making them better fitted to the fundamental
characteristics of a graphical entity. For example, a GUI designer
can apply dedicated kernel functions to different portions of the
feature vector space and sum them using Eq. (11). In addition, a
scaling factor can be applied to a kernel function using Eq. (12).
Thus, one comes to an understanding that the vector dotting process
indeed plays the central role for an SVM analyzing graphical
objects. Note that the prior art (e.g. Vapnik's publication)
exploits said dotting process only by algebraic means (e.g. Eqs.
(11)-(15)). To further enhance its utility, the presently disclosed
3D GUI conducts said dotting process by manipulating the
perspective angle in the GUI (e.g. FIG. 6). Compared to the prior
art, performing said dotting process in the geometrical domain,
i.e., the way the presently disclosed 3D GUI does, provides a more
intuitive and more powerful means for a 3D GUI to render a 3D
scenery (e.g. one constructed using 3D vector graphics); the
operator thus can learn and interact with said 3D vector graphics
more effectively and efficiently.
[0054] Generally speaking, the cartoon feature Genie (204) is
watching the scene directly in front of him; rear or side views are
not usually necessary in ordinary situations. In this situation,
Genie's head does not have to turn, and the above stated
translational and rotational motion vectors generated by the
presently disclosed 3D navigational device (202) provide sufficient
degrees of freedom (i.e., six) for an operator to maneuver Genie's
body (204) anywhere in a 3D space. If an operator desires to turn
Genie's head (217) while moving his body (204) by said
translational and rotational motion vectors, then Genie's head
(217) gains the extra capability of seeing the side or rear view.
This feature is helpful, but it comes at a price: Genie (204) needs
one more degree of freedom (this motion can be deemed the seventh
DOF). In the present 3D GUI such an additional degree of freedom is
provided by the parameter θ' of FIG. 7 (in fact, it is generated by
the parameter ω of the presently disclosed 3D navigational device).
We approach the perspective angle analysis in two steps. In the
first step, we focus on the fundamental utilities of the
perspective angle. In the second, we extend the above stated
feature to various applications.
For example, the above stated extra degree of freedom, i.e., the
parameter θ', can be used by a video game player to make the
cartoon character Peter Pan swing his dagger while flying in the
3D space. Without such a parameter θ', the prior art does not
have any effective means to manipulate the motions of Peter Pan's
torso and hand separately.
[0055] Referring again now to FIG. 7, when an operator aims the 3D
cursor (209) at Genie (204) and clicks the mouse button to call for
Genie's services, the subsequent action of rotating (i.e.,
spinning) the body of the presently disclosed 3D navigation device
(202) over the reference surface (205 or 205T) by an angle ω
will lead to a corresponding spinning motion of Genie's head (217)
or torso (218) by an angle θ' (the remaining portions of
Genie's body may also be spun, but this is not within the scope of
the present 3D GUI). When Genie's head (217) turns toward a
specific direction, the objects (P201A) and (P201B) in Genie's eyes
will make corresponding motions in the opposite direction.
As the exemplary case of FIG. 11 shows, when Genie's head (217) is
turning counterclockwise with respect to the pivot axis
Pivot_Genie, the motions of said objects (P201A) and (P201B) as
perceived by Genie (204) will be in the opposite direction, i.e.,
clockwise (-θ'), relative to said turning direction of Genie's head
(i.e., θ'). Determining the magnitude of said motion of these
objects requires one to assign a vanishing point in the 3D space
first. In the case of FIG. 11, said vanishing point has been
assigned to the origin of the 3D coordinate system x-y-z, i.e.,
O_3D. Note that FIG. 7 is merely a contextual illustration; the
respective objects in FIG. 7 are not drawn to a realistic scale
(e.g. Genie may be much farther away from said vanishing point
O_3D than FIG. 7 suggests). Nevertheless, the geometrical relationship
among said objects still holds well in FIGS. 11 through 16, and,
most importantly, such a relationship complies with the fundamental
rules of perspective sketching.
[0056] In the art of perspective sketching, a vanishing point is
one that is so far away from the viewer (in this case, the Genie
204) that two objects located near the vanishing point are
perceived by the viewer as having converged to one spot, i.e.,
there is no longer a distinctive differentiation between the two
objects. As FIG. 7 shows, such a distance is denoted as d_O3D. As
FIG. 7 also shows, the translational motion vectors of said objects
(P201A) and (P201B) as perceived by the viewer (Genie 204) are
V_201A(d_O3D, -θ') and V_201B(d_O3D, -θ'), respectively. The
parameters in the parentheses indicate that said motion vectors
(i.e., V_201A and V_201B) are functions of the relative distance
between said vanishing point O_3D and said objects, which is
largely equal to d_O3D. In addition, the function is affected by
the turning motion of Genie's head, i.e., by θ'.
[0057] FIG. 8 shows an exemplary case of how an intelligent
perspective angle adjusting process takes place in the presently
disclosed 3D GUI. Referring now to FIG. 8, two objects in circular
shapes are presented by the presently disclosed 3D GUI (207), i.e.,
J.sub.(3D) and K.sub.(3D), respectively. Between the two circles,
there is a line linking their centroids, i.e.,
J_(3D)K_(3D). Before the spinning action of Genie's head
(217) takes place, the geographical center of Genie's two eyes,
herein E_G, is generally aligned with the centroids J_(3D)
and K_(3D); thus, from where E_G is, the two circles
J_(3D) and K_(3D) appear to overlap one
another. As FIG. 8 also shows, the diameter of circle K_(3D) is
substantially larger than that of circle J_(3D); hence, Genie
(204) is not able to recognize the existence of the circle
J_(3D) by his own observation. The same situation happens with
the line J_(3D)K_(3D); specifically, from the standpoint of
E_G, line J_(3D)K_(3D) looks like nothing but a
dot/point to Genie (204). Imagine said circle J_(3D) and said
line J.sub.(3D)K.sub.(3D) are a unique atom and bond in a large
molecule which contains a myriad of atoms and bonds (e.g. a spiral
of DNA); if an operator intends to investigate certain
properties of said atom J_(3D) or said bond
J_(3D)K_(3D), the conventional GUI is virtually useless in
that it has no clue as to how to present them in a
proper manner. To enable the viewer to observe a particular object
that lies within a myriad of objects, an automatic and intelligent
perspective angle-adjusting feature would play a vital role in a
high quality 3D GUI (e.g. 207). For example, as FIG. 8 shows, in
order to reveal the circular shape of circle J.sub.(3D), and the
length of J.sub.(3D)K.sub.(3D), the Genie (204) has to move his
torso (218) laterally (i.e., along the direction of x axis) and
turn his head (217) by an angle .theta.', such that the objects
presented in the presently disclosed 3D GUI (207), i.e., circles
J.sub.(3D) and K.sub.(3D), as perceived by Genie (204) from the new
location are separated from one another by a discernable distance.
We now denote two points, i.e., E.sub.1(3D) and E.sub.2(3D), on the
presently disclosed 3D GUI (207) as the ones that engage the direct
contact with the line of sight of E.sub.G before and after Genie
(204) turns his head (217). As FIG. 7 shows, before Genie (204)
turns his head (217), his line of sight is intersecting with the
presently disclosed 3D GUI (207) at the point E.sub.1(3D). After
Genie (204) has turned his head (217) by an angle d.OMEGA., his
line of sight intersects with the presently disclosed 3D GUI (207)
at a new point, i.e., E_2(3D). Corresponding to the sweeping
movement of Genie's line of sight, which is denoted as the vector
E_1(3D)E_2(3D) by the presently disclosed 3D GUI
(207), the objects presented therein, i.e., circles J_(3D) and
K.sub.(3D), and line J.sub.(3D)K.sub.(3D), are subjected to their
respective relative motion vectors with regard to the terminal
point of line of eye sight E.sub.G; in this situation, as has been
explained in the above, the magnitude of said relative motion
vector is a function of the relative distance between said objects
and said vanishing point O.sub.3D. Here the reader is reminded that
the presently disclosed 3D GUI (207) is in effect a plurality of
layers of software modules that are designed to handle 3D
positional/motional data. In other words, the 3D GUI is able to
process 3D positional/motional data in the format of, say, x, y,
and z; a conventional GUI is a 2D software-using device, it can
only handle 2D positional data, i.e., x and y. When it comes to
presenting 3D positional data by the 3D GUI (207) onto a displaying
device whose physical display format is 2D (e.g. a flat Liquid
Crystal Displaying Panel), a process of converting the 3D
positional data to a 2.5D formation accord with the fundamental
rules of perspective sketching is required (e.g. certain modules in
the the 3D GUI 207, such as the Display Server 603, etc., may be
responsible for performing that task). As FIG. 13 shows, an
X.sub.(2.5D)-Y.sub.(2.5D)-Z.sub.(2.5D) coordinate system is
embodied in the window (207W1), which is in effect an element of
usable area on the displaying device allocated to it by the 3D GUI
(207). There are some general geometrical relationships between the
3D coordinate system of a realistic world and the 2.5D coordinate
system X.sub.(2.5D)-Y.sub.(2.5D)-Z.sub.(2.5D). As a rule of thumb,
the X.sub.(2.5D) axis and Y.sub.(2.5D) axis of window (207W1)
correspond to the X.sub.(3D) and Y.sub.(3D) axes of the realistic
world. As to the third axis of said 2.5D coordinate system, i.e.,
Z.sub.(2.5D), it is drawn as a slanted line on the geographical
plane of X.sub.(2.5D)-Y.sub.(2.5D), intersecting with said two axes
at the origin O.sub.(2.5D), which in some situations is also
defined by the present 3D GUI as the vanishing point of the
perspective sketch (i.e., 207W1). As FIG. 8 shows, the angle of
intersection between said Z.sub.(2.5D) axis and said X.sub.(2.5D)
axis is denoted as .delta.. When the value of .delta. changes, we
denote the process as one associated with perspective angle
changing. In the following, we will elaborate how an object in a
2.5D coordinate system changes its physical location and size in a
displaying device (i.e., X.sub.displaying device, Y.sub.displaying
device) when the angle of intersection .delta. is changed. To
reiterate, a conventional GUI does not know all this because size
and directionality do not apply to its basic graphical element,
i.e., the pixel or voxel. The presently disclosed 3D GUI (207)
considers said pixels and voxels as physical objects; thereby,
size and directionality matter to the basic graphical element of a
perspective sketch.
[0058] As FIG. 7 shows, the size of an object displayed in a 2.5D
coordinate system is inversely proportional to its relative
distance to the viewer (i.e., the viewer's eye, E_G). Put another
way, the size of an object in a 2.5D coordinate
system is proportional to its relative distance to said vanishing
point, i.e., O_2.5D. In practice, when a GUI design engineer is
investigating the size of an object in a 2.5D coordinate system, an
easy and effective way to do the job is to look into the geometrical
relationship between said object and said 2.5D vanishing point,
O.sub.2.5D. FIGS. 7 and 9 depict how such a geometrical
relationship is established.
[0059] Referring now to FIG. 7, when a person projects several
objects in a 3D coordinate system (i.e., the
X.sub.(3D)-Y.sub.(3D)-Z.sub.(3D) coordinate system) onto its
X.sub.(3D)-Z.sub.(3D) plane, the above stated geometrical
relationship can be revealed more clearly to the viewer. On said
X.sub.(3D)-Z.sub.(3D) plane, the projected length of the line
linking points M.sub.(3D) and N.sub.(3D) is
|M.sub.(X-Z)N.sub.(X-Z)|. The length of this projected line by and
large tells us how large said circle K.sub.(3D) may be. The other
circle J.sub.(3D) is located relatively closer to the vanishing
point O.sub.(3D) (i.e., the origin point of the
X.sub.(3D)-Y.sub.(3D)-Z.sub.(3D) coordinate system) as compared to
the location of circle K.sub.(3D). When one looks into said
X.sub.(3D)-Z.sub.(3D) plane, the projected diameter of said circle
J_(3D) is |J_(X-Z)|; by appearance, said projected diameter
is substantially shorter than line M_(X-Z)N_(X-Z). Thus,
we can understand that there are two fundamental reasons that cause
said circle J_(3D) to appear smaller than K_(3D) in the
eyes of Genie (204): first, the physical dimensions of circles
J_(3D) and K_(3D), and, second, their relative distances to
the origin of the X_(3D)-Y_(3D)-Z_(3D) coordinate
system, i.e., O_3D. In this case, both causes lead to a common
consequence: circle K.sub.(3D) blocks Genie's line of sight,
preventing him from seeing the circle J.sub.(3D). The remedy for
this problem is to move Genie's line of sight away from the point
E.sub.1(3D). FIG. 8 shows a 2.5D coordinate system (i.e.,
X.sub.(2.5D)-Y.sub.(2.5D)-Z.sub.(2.5D) used by a window 207W1,
which is a segment of the presently disclosed 3D GUI). Here we are
using a secondary window (207W1) to depict such a 2.5D coordinate
system; as a matter of fact, the same rule (i.e., depicting a 3D
scene by 2.5D coordinate system) can be applied to the other window
(e.g. 207W2 of FIG. 12), or even the entire 3D GUI. Referring again
to FIG. 13, within said window (207W1), a coordinate system
X.sub.displaying device-Y.sub.displaying device is used to denote
the geographical address of the objects lying therein.
Mathematically, the 2.5D coordinate system and the geographical
address of the objects in the presently disclosed 3D GUI satisfy
the following equations:

$$X_{\text{displaying device}} = X_{(2.5D)} + Z_{(2.5D)}\cdot\cos\delta \tag{16}$$

$$Y_{\text{displaying device}} = Y_{(2.5D)} - Z_{(2.5D)}\cdot\sin\delta \tag{17}$$

[0060] where the symbol "·" denotes multiplication; δ is the
intersecting angle between said X_(2.5D) and Z_(2.5D) axes;
X_displaying device and Y_displaying device denote the
physical address of the object lying in the displaying device
(i.e., 207W1); the parameters X_(2.5D), Y_(2.5D), and
Z_(2.5D) are the coordinate values of said object in said 2.5D
window (207W1).
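A direct transcription of Eqs. (16) and (17) as a small function may help; the numeric inputs below are illustrative assumptions only:

```python
# Sketch of Eqs. (16)-(17): map a point in the 2.5D coordinate system of
# window 207W1 to its physical address on the displaying device, given
# the perspective angle delta between the X_(2.5D) and Z_(2.5D) axes.
import math

def to_display(x25, y25, z25, delta):
    x_disp = x25 + z25 * math.cos(delta)   # Eq. (16)
    y_disp = y25 - z25 * math.sin(delta)   # Eq. (17)
    return x_disp, y_disp

# The same 2.5D point rendered at two perspective angles.
print(to_display(1.0, 1.0, 2.0, math.radians(45)))
print(to_display(1.0, 1.0, 2.0, math.radians(60)))
```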
[0061] As FIGS. 7 and 8 show, when Genie (204) turns his head
(217) by an angle dΩ, the above stated intersecting angle
δ (i.e., ∠E_1(X-Z)O_(3D)E_2(X-Z)) is
changed by an amount dδ. For simplicity of analysis,
we let all objects in the 3D space have no relative motion
with respect to Genie (204); this keeps the coordinate values of
said objects in said 3D coordinate system (i.e., X_(3D),
Y_(3D), and Z_(3D) of FIG. 7) unchanged while Genie (204)
turns his head (217); in the meantime, the point at which the
presently disclosed 3D GUI (207) engages the direct contact with
the line of sight of E_G moves from E_1(3D)
to E_2(3D) as Genie (204) turns his head (217) by said
angle dΩ. In Genie's eyes, said turning action of head (217)
leads to a rotational movement of the Z_(2.5D) axis of said
2.5D coordinate system (i.e., 207W1). As a consequence, such a
rotational movement of said Z_(2.5D) axis with respect to said
Y_(2.5D) axis leads to an apparent translational motion of
all objects in said 2.5D coordinate system. To depict the above
phenomenon clearly, the following equations, Eqs. (18) and (19),
are the differentiations of Eqs. (16) and (17):

$$dX_{\text{displaying device}} = dX_{(2.5D)} + dZ_{(2.5D)}\cdot\cos(\delta) - Z_{(2.5D)}\cdot\sin(\delta)\,d\delta \tag{18}$$

$$dY_{\text{displaying device}} = dY_{(2.5D)} - dZ_{(2.5D)}\cdot\sin(\delta) - Z_{(2.5D)}\cdot\cos(\delta)\,d\delta \tag{19}$$
[0062] where the parameter d.delta. denotes the change of the angle
intersected by the X.sub.(2.5D) and Z.sub.(2.5D) axes of FIG. 9;
dX.sub.displaying device and dY.sub.displaying device denote the
changes of the address values of an object in the displaying device
of FIG. 9 (i.e., 207W1).
[0063] A GUI design engineer can now adjust the apparent locations
of the objects in the displaying device. For example, when an
object is unmoved in the 3D space (i.e., dX_(2.5D)=0,
dY_(2.5D)=0, and dZ_(2.5D)=0) but there is a non-zero
value of dδ (i.e., dδ≠0), then, as Eqs. (18) and (19)
reveal, the above situation will lead to non-zero values of
dX_displaying device and dY_displaying device
(dX_displaying device≠0 and dY_displaying
device≠0), which means that the object will perform an
apparent translational motion in window (207W1). Alternatively,
when an object is moving by itself during the period of said
dδ, or, said another way, when Genie (204) observes a
relative motion with respect to said realistic object (denoted by
the relocation of E_G of FIG. 6), the parameters
dX_(2.5D), dY_(2.5D), and dZ_(2.5D) of Eqs. (18) and
(19) will have non-zero values, which subsequently leads to
variations of the physical coordinate values of said realistic
object in the displaying device of FIG. 7; in later
paragraphs, we will use FIG. 9 to explain such a situation more
clearly.
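Eqs. (18) and (19) can likewise be transcribed as a sketch; it shows that an object static in 3D still acquires an apparent on-screen motion when only dδ is non-zero (all magnitudes are assumed for illustration):

```python
# Sketch of Eqs. (18)-(19): differential motion on the displaying device.
import math

def display_motion(z25, delta, dx25=0.0, dy25=0.0, dz25=0.0, ddelta=0.0):
    dx = dx25 + dz25 * math.cos(delta) - z25 * math.sin(delta) * ddelta
    dy = dy25 - dz25 * math.sin(delta) - z25 * math.cos(delta) * ddelta
    return dx, dy

# A static object (all 2.5D differentials zero) under a small head turn:
print(display_motion(z25=2.0, delta=math.radians(45),
                     ddelta=math.radians(1)))  # non-zero apparent motion
```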
[0064] In order to analyze how the intersecting angle dδ is
generated in a lucid manner, one may look into the
X_(3D)-Y_(3D) plane of the 3D coordinate system of FIG. 8
(i.e., X_(3D)-Y_(3D)-Z_(3D)). As FIG. 8 shows, there are
two lines intersecting one another at the vanishing
point O_(3D), i.e., O_(3D)E_1(X-Z)
and O_(3D)E_2(X-Z), respectively;
note that the two terminal points of said two lines, i.e., E_1(X-Z)
and E_2(X-Z), reveal that dδ is generated by the sweeping
action of Genie's line of eye sight. When Genie's line of eye sight
sweeps from E_G E_1(3D) to E_G E_2(3D), an angle
∠E_1(3D)O_(3D)E_2(3D) is formed in the 3D space;
this angle is too small to be presented clearly by said
X_(3D)-Y_(3D)-Z_(3D) coordinate system, so we
may turn to its projected image on the
X_(3D)-Z_(3D) plane, i.e.,
∠E_1(X-Z)O_(3D)E_2(X-Z), which reveals its shape
as well as its motion more clearly. We now denote
∠E_1(X-Z)O_(3D)E_2(X-Z) as dδ; it is in
effect the rotational (sweeping) angle of the Z_(2.5D) axis of
FIG. 8, and it is now clear that it is caused by the spinning action
of Genie's head (217) by said angle dΩ. Taking this effect to
FIG. 9: when the Z_(2.5D) axis of the 2.5D coordinate system of
FIG. 9 moves, all objects in said 2.5D coordinate system will
move accordingly. Correlating this effect to FIG. 7: when the
motion of said Z_(2.5D) axis is caused by the spinning action
of Genie's head, the entire process of rotating said Z_(2.5D)
axis with regard to the pivot axis Pivot_Genie will make all
objects in said window (207W1) move in the direction opposite to
the spinning motion of said Genie's head (217). With that basic
understanding in mind, we can now move on to analyze the motions of
objects that have specific volumes or dimensions.
[0065] As FIG. 9 shows, when the 3D GUI uses a 2.5D perspective
sketching methodology to depict an object in a scene, by the
fundamental rule of 2.5D sketching, said object will have different
apparent sizes (i.e., area A ≠ A') when its relative
distance to the vanishing point O_(2.5D) changes. FIG. 10 shows
how a viewer's capability of differentiating the objects in a 2.5D
coordinate system is affected by the perspective angle (e.g.
δ_1). As FIG. 7 shows, in order to change the perspective
angle toward an object in a 2.5D sketch, there are two ways to
do so:
(1) Relocate Genie (204) from one place to another in the 3D space
(e.g. move from E_G1 (δ_1) to E_G2
(δ_2)); (2) turn Genie's head (217) by an angle dΩ,
such that said perspective angle δ_1 is changed by an
angle dδ.
[0066] The above two methods can be implemented concurrently, and
they can be done manually or automatically. When a GUI adjusts
said perspective angle manually, the above two methods are
implemented by an operator using the presently disclosed navigation
device (202); when a computer (200 of FIG. 2C) is to
adjust said perspective angle automatically, it mainly relies on
algorithms developed based on Eqs. (18) and (19) to
achieve the goal.
[0067] Referring now to FIG. 9, a 2.5D coordinate system is formed
by three axes, i.e., X_(2.5D), Y_(2.5D), and Z_(2.5D, EG1).
In order to generate the perspective-sketching viewing
experience for a viewer, a unique angle of intersection between said
X_(2.5D) axis and Z_(2.5D, EG1) axis is applied to FIG. 9,
i.e., δ_1. Namely, this angle δ_1 denotes the
perspective angle of said perspective sketching; at the perspective
angle δ_1 (e.g. δ_1 ≈ 45°), the
direct viewing point of Genie (204) is located at E_G1
(δ_1). Within said perspective sketch (i.e., 207W1), two
circles, i.e., J_(2.5D) and K_(2.5D), are placed next to
each other; their apparent dimensions as shown by the displaying
panel (207W1_Displaying device) are denoted as |J_(2.5D)|
and |K_(2.5D)|, respectively. As FIG. 10 further reveals, at
the moment said perspective angle δ_1 is about
45°, |J_(2.5D)| and |K_(2.5D)| appear as if they are
linked to one another; the viewer will have difficulty telling
whether said circles J_(2.5D) and K_(2.5D) are one object or
two. When Genie (204) intends to move his body to different
locations (e.g. from E_G2 to E_Gx) to seek a better
view of said circles, he will have a range of perspective angles
(i.e., δ_1 is varying) that allows him to tell whether said
two circles are one or two objects (e.g. between δ_2 and
δ_x). Upon taking the initial step of making the body
movement, as FIG. 7 shows, Genie (204) has two choices, i.e., making
said perspective angle δ_1 larger by moving E_G1 to the
right, or making said perspective angle δ_1 smaller by
moving Genie's body to the left (the directions right and left are
arbitrarily chosen by the present 3D GUI for ease of narration; the
realistic direction would have to be determined by the relative
positions of O_(2.5D) and E_G1 in the 3D space).
Glancing at the scene by turning the head (217) helps Genie (204)
judiciously decide in which direction he should move his body.
Referring to FIG. 10 again, when Genie's line of sight changes
from E_G E_1 to E_G E_2, a sweeping angle of Genie's line of sight
is formed, i.e., dΩ. Corresponding to said sweeping action of
Genie's line of sight, a rotational angle dδ of the
Z_(2.5D) axis of the X_(2.5D)-Y_(2.5D)-Z_(2.5D)
coordinate system of FIG. 10 shall take place. Per that rotational
movement, circles J_(2.5D) and K_(2.5D) are moved to new
locations, eventually becoming J' and K'. Note carefully that at
the new perspective angle (i.e., δ_1+dδ), the
apparent dimensions of said two circles, i.e., |J'| and |K'|,
do not affect one another; the evidence is that there is a leeway
D_J'K' between said two circles (|J_(2.5D)| and |K_(2.5D)|
have no such leeway). Thus, one comes to an understanding
that sweeping the perspective angle δ_1 by a small angle
dδ can provide a leading index for an intelligent 3D GUI to
generate an intelligent motion plan for the world camera (i.e.,
Genie 204), which has the merit of revealing the shape or
dimensions of the 3D objects presented therein effectively and
efficiently (i.e., akin to future-proofing techniques). In
practice, said change of the rotational angle dδ of the
Z_(2.5D) axis depends upon several parameters of the
presently disclosed 3D GUI. Referring now to FIG. 10 again, when
the viewing distance (i.e., VD) between the terminal point of the
line of sight, e.g. E_Gx, and the displaying device
(207W1_Displaying device) is substantially larger (e.g. several
tens of meters) than the length |E_1E_2|, which is the distance
between the two points on said displaying panel (e.g. a few mm)
that engage direct contact with said line of sight of E_G before
and after Genie (204) turns his head (217), the sweeping angle of
Genie's line of sight, dΩ, can be calculated as:

$$d\Omega = \frac{|\overrightarrow{E_{1}E_{2}}|}{VD} \tag{20}$$
Corresponding to the above stated sweeping action of Genie's line
of sight by angle dΩ, the perspective angle δ_1
shall be changed by an amount dδ, whose value can be
calculated by:

$$d\delta = \frac{|\overrightarrow{E_{1}E_{2}}|}{DOS} = \frac{|\overline{E_{1}E_{2}}|}{|\overline{X_{(2.5D)}Z_{(2.5D,\,90°)}}|} \tag{21}$$

where DOS denotes the depth of scene of window (207W1), whose value
is |X_(2.5D)Z_(2.5D, 90°)|. For example, a typical
value of said DOS is several kilometers (km). The reader should
also note that one of the main differences between FIGS.
10 and 11 is the incorporation of DOS. What FIG. 11 shows is the
outlook of the presently disclosed 3D GUI (207); what FIG. 10 shows
is the geographical relationship and certain scalar parameters used
by the presently disclosed 3D GUI (207).
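A minimal sketch of Eqs. (20) and (21), under the stated assumption that VD and DOS are much larger than |E₁E₂| (the magnitudes below merely echo the mm / tens-of-meters / kilometers scales mentioned in the text):

```python
# Sketch of Eqs. (20)-(21): sweep of the line of sight and the resulting
# perspective-angle change.
def sweep_angles(e1e2, vd, dos):
    d_omega = e1e2 / vd    # Eq. (20): line-of-sight sweep angle
    d_delta = e1e2 / dos   # Eq. (21): perspective-angle change
    return d_omega, d_delta

# |E1E2| of a few mm, VD of several tens of meters, DOS of ~2 km.
d_omega, d_delta = sweep_angles(e1e2=0.005, vd=30.0, dos=2000.0)
print(d_omega, d_delta)    # radians; d_delta is far smaller than d_omega
```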
[0068] In FIG. 6, a path of Genie's movement (i.e., Path_EG)
is generated to provide a unique viewing
experience for the viewer of the displaying device
(207W1_Displaying device). In practice, path Path_EG can be
generated manually or automatically; the winding profile of path
Path_EG denotes that the presently disclosed 3D GUI (207) is able
to reveal 3D objects/motion in a proactive, intelligent manner. As
the computer industry enters the 3D graphics regime, this kind of
phenomenon (i.e., one object blocking the image of another) is
practically inevitable. Hence, an intelligent 3D GUI must
include the ability to help the viewer differentiate the objects in
a 2.5D perspective sketch more easily.
[0069] The situation depicted in FIG. 6 is a relatively simple
one--there are only two objects in the window (207W1). In practical
situations, a complicated scene may be composed of a myriad of
objects (e.g. a Big Data set), each of which may have a unique
motion vector of its own. In this situation, the collaborative
functionality of the Neural Network (610) and the Support Vector
Machine (616) of FIG. 4B comes into play. In a neural network
system, a computer does not seek a surefire answer derived solely
from a human being's knowledge (e.g. the use of linear algebra).
Instead, the Neural Network (610) performs a supervised
learning process to approach a satisfying result. For example, when
the presently disclosed 3D GUI implements a supervised learning
process, it will seek a function of Genie's trajectory (i.e.,
Path_EG of FIG. 6) based on former
experience, i.e., a set of (input, output) data. Specifically, said
input data has to do with the address of E_G and the
perspective angle δ_1; said output data has to do with
the dimensions of the targeted objects whose images are being
projected on the displaying device 207W1_Displaying device
(e.g. |J'|, |K'|, and D_J'K', etc.). FIGS. 10 and 11 depict a dilemma
in which an object (i.e., line J_(2.5D)K_(2.5D)) is
"sandwiched" by two objects, i.e., circles J_(2.5D) and
K_(2.5D). In this situation, as FIG. 11 shows, Genie may only
have a fair chance to see the whole area of said two circles (i.e.,
J' and K') after changing the perspective angle from
δ_1 to δ'. But Genie may not have a fair chance to
see the whole profile of J'K' regardless of whether said perspective
angle is δ_1 or δ'; i.e., in both situations, a
large portion of said line J'K' is blocked by the circle K'.
Seeking a perspective angle that shows the whole length of line
J'K' is literally impractical even for a high-caliber 3D GUI (or it
may require Genie 204 to travel a long distance to meet the goal,
i.e., relocate from E_G0 to E_Gn in FIG. 14). In this case,
inference would be a more practical way for a viewer to comprehend
a 3D scene. Referring back to FIG. 6, when E_G is moved by a
vector E_1E_2, then, per the relative motion
between E_G and the origin of the coordinate system, circle
K_(2.5D) shall be moved by a vector -E_1E_2. As to circle
J_(2.5D), per the same motion vector E_1E_2 of E_G, it will be
moved by a vector

$$-\frac{|\overline{O_{(2.5D)}J_{(2.5D)}}|}{|\overline{O_{(2.5D)}K_{(2.5D)}}|}\,\overrightarrow{E_{1}E_{2}}.$$

The dissimilarity of said motion vectors of said circles
J_(2.5D) and K_(2.5D) denotes a non-linear motion of the
line linking them, i.e., J_(2.5D)K_(2.5D). When a 3D scene
presents a plurality of objects that have various non-linear
motions, one may infer that the objects presented therein have
unique gestures. Together, the patterns and motion vectors of the
objects constitute our preliminary comprehension of the world by
visualization.
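As a hedged illustration of the supervised learning step described above (all training pairs below are synthetic placeholders standing in for the GUI's real projection data), a small regressor can learn to predict the apparent dimensions from E_G's address and the perspective angle δ₁:

```python
# Sketch: learn a model from (input, output) pairs -- input: E_G address
# plus perspective angle delta_1; output: apparent sizes |J'|, |K'| and
# gap D_J'K' on the display -- to guide an intelligent camera motion plan.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
X = rng.uniform(-1.0, 1.0, size=(200, 4))   # (x_EG, y_EG, z_EG, delta_1)
# A made-up smooth dependency standing in for the true projection rules.
Y = np.column_stack([
    0.5 + 0.2 * np.sin(X[:, 3]),        # |J'|
    1.0 + 0.3 * np.cos(X[:, 3]),        # |K'|
    0.4 * np.abs(X[:, 0] - X[:, 1]),    # gap D_J'K'
])
model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=3000,
                     random_state=0).fit(X, Y)
print(model.predict([[0.1, -0.2, 0.5, 0.8]]))  # predicted apparent sizes
```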
[0070] We now take the above stated envisioned "Fantasia® 3D"
as an example. In this envisioned cartoon movie, each cluster of
objects (e.g. flowers) denotes a unique class of objects, whose
essential geometrical property (i.e., the graphical vectors) is
denoted by their projected lengths on the X_(2.5D) axis.
Referring back to FIG. 9, circles A and A' are two objects being
looked at by Genie (204) from different perspective angles
(δ). Per the above stated methodology, circles A and A' are
denoted by two graphical vectors, i.e., D_A and D_A',
respectively. As such, the projected lengths of said two graphical
vectors on the X_(2.5D) axis are D_A·X̂_(2.5D) and
D_A'·X̂_(2.5D), respectively. The magnitudes of
D_A·X̂_(2.5D) and D_A'·X̂_(2.5D) denote the apparent
sizes of said objects A and A'; that is, if they are relatively
large, the essential properties of said objects represented by said
graphical vectors D_A and D_A' can be recognized by the viewer
more easily, and vice versa. Referring now to FIG. 14, consider circles J_(2.5D)
and K_(2.5D) as two clusters of flowers (in other words, the
graphical pattern of said two clusters is not necessarily as simple
as two circles); their essential features are denoted by the
graphical vectors D_J and D_K, respectively (we are using one
vector to characterize each cluster of flowers; in fact, a GUI
designer can use as many graphical vectors as he/she wants; as a
rule of thumb, the more graphical vectors one uses to depict an
object, the more detail a viewer can learn about said object
(especially its motions) from different perspective angles, the
cost being the increased calculation power). As FIG. 10 shows, the
projected lengths of said graphical vectors D_J and D_K on the X
axis of the window (207W1), i.e., 207W1_displaying device, are
|J_(2.5D)| and |K_(2.5D)|, respectively. Eqs. (22) and (23) depict
the mapping process and dotting process associated with the above
result. As one may notice, the projected lengths of circles
J_(2.5D) and K_(2.5D) are keenly affected by the dot product
between the Z_(2.5D, EG1) axis and the unit vector of the
+X_(2.5D) axis of the 2.5D coordinate system shown in FIG. 6.
Specifically, one may use the unit vectors of said two axes of the
2.5D coordinate system to depict their mathematical relationship:
$$|\overrightarrow{J_{(2.5D)}}| = \big(\Phi:\overrightarrow{D_{J}}\rightarrow\overrightarrow{Z_{(2.5D,EG1)}}\big)\cdot\hat{X}_{(2.5D)} = \big|\Phi:\overrightarrow{D_{J}}\rightarrow\overrightarrow{Z_{(2.5D,EG1)}}\big|\,\big|\hat{X}_{(2.5D)}\big|\cos\delta_{1} = D_{J,Z(2.5D)}\cos\delta_{1} \tag{22}$$

$$|\overrightarrow{K_{(2.5D)}}| = \big(\Phi:\overrightarrow{D_{K}}\rightarrow\overrightarrow{Z_{(2.5D,EG1)}}\big)\cdot\hat{X}_{(2.5D)} = \big|\Phi:\overrightarrow{D_{K}}\rightarrow\overrightarrow{Z_{(2.5D,EG1)}}\big|\,\big|\hat{X}_{(2.5D)}\big|\cos\delta_{1} = D_{K,Z(2.5D)}\cos\delta_{1} \tag{23}$$
[0071] where Φ is a function that maps the graphical vectors
D_J and D_K to the Z_(2.5D) axis, X̂_(2.5D) is the unit
vector of the X_(2.5D) axis, and |J_(2.5D)| and |K_(2.5D)| are the
apparent sizes of said two clusters of flowers (circles) as
perceived by Genie (204). So, Eqs. (22) and (23) justify what we
have discussed before: the means of 3D graphical rendering
keenly affects the level of comprehension of a viewer.
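A small sketch of the dotting process in Eqs. (22)-(23) (the vector magnitudes are assumed for illustration): the apparent size of each cluster's graphical vector shrinks with cos δ₁ as the perspective angle grows:

```python
# Sketch of Eqs. (22)-(23): apparent size = projected magnitude * cos(delta_1).
import math

def apparent_size(d_along_z, delta1):
    return d_along_z * math.cos(delta1)

d_j, d_k = 0.8, 2.0   # magnitudes of graphical vectors D_J, D_K (made up)
for deg in (30, 45, 60):
    d1 = math.radians(deg)
    print(deg, round(apparent_size(d_j, d1), 3),
          round(apparent_size(d_k, d1), 3))
```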
[0072] Of course, the above disclosed methodology can be applied to
more than two objects on a screen. In this section we elaborate
that when a data analyst intends to separate a plurality of
objects into a few classes (not necessarily graphical ones), whose
essential features can be characterized by some characteristic
vectors (e.g. hue index, genome, etc.), then, according to the
theory of the SVM (the "support vector machine"), the analyst may
first map said characteristic vectors to a feature vector space
whose dimension is higher than that of the graphical vectors; the
dotting process may proceed afterwards. In the above stated case,
Genie (204) intends to separate a plurality of objects (e.g.
clusters of flowers) by a specific perspective angle; namely, said
separating process is carried out by a dotting process from the
Ẑ_(2.5D, EG1) axis to the X̂_(2.5D) axis. The entire process is
literally a machine learning one that aims to divide a plurality of
objects into multiple classes. It is important to acknowledge that
said dotting process is keenly related to the 2.5D coordinate
system embedded in the images captured by our retina, which is
literally a 2D organ. So, the presently disclosed invention denotes
a revolutionary technology for a computer or electromechanical
system to engage with the user, in which certain 3D patterns, i.e.,
the essential features of the objects, etc., can interact with the
users via the deliberately adjusted perspective angles of a 2.5D
coordinate system. Here the readers must be advised that a 3D
object has three degrees of freedom for its whereabouts (i.e., X, Y,
and Z); in the meantime, it has six degrees of freedom for its
respective motions in the same space; these fundamental properties
will all be taken into account by the presently disclosed 3D GUI
(207).
[0073] The above stated method of depicting a plurality of 3D
objects by carefully controlling the way their graphical vectors
are presented to the viewer (e.g. |J_(2.5D)|, |K_(2.5D)|, etc.) is
implemented by computing the dot products between the unit vectors
of the X and Z axes of a 2.5D coordinate system (i.e.,
X̂_(2.5D)·Ẑ_(2.5D)); this methodology mimics the kernel trick used
by the SVM. As has already been demonstrated in the earlier
paragraphs, the fundamental value of the kernel trick is attributed
to its capability of taking the dot products of the feature
vectors, which are extracted from the graphical vectors. Here the
presently disclosed invention extends the utility of the SVM by
exploiting the strong relationship between the 3D coordinate system
of the real world and a 2.5D coordinate system used by a graphic
rendering feature (i.e., the 3D GUI). This is indeed a gift that
Mother Nature gives to humans. Counted by number of species, most
creatures in nature use compound eyes; in their compound eyes, the
photoreceptors are wired to the neurons in direct ways--although
this makes compound eyes more responsive to the optical flows in
ambient light (i.e., allowing said creatures to escape from
predators more easily), these creatures' level of comprehension of
their surroundings is far lower than that of a human being--this
fundamental drawback has to do with the missing dotting process in
their respective neural systems (having no 2.5D coordinate system,
they have no way to adjust the perspective angle to assess the
similarity among different graphical entities). In the present
disclosure, we use the SVM (616) to elucidate the fundamental
advantages of using artificial intelligence to reinforce a viewer's
capability of learning from and interacting with a 3D scenery. In
the AI industry, there are other methodologies that can do similar
jobs; this disclosure does not rule out the options of using them.
[0074] In the above paragraphs, the disclosure focused on
elaborating the unique methodology developed to reinforce a
viewer's learning experience through interaction with 3D scenery.
In the latter paragraphs of this section, the presently disclosed
3D GUI focuses on elaborating the means to make a computer's (i.e.,
a machine's) learning experience or interaction with a 3D scenery
proceed effectively. FIG. 5A shows an exemplary graphical rendering
process of the presently disclosed 3D GUI. A cursor (1707A) is
approaching two roses, i.e., (1701L) and (1701R); the 3D
geometrical patterns of the roses are quite complicated in that
there are many petals in each rose. If a GUI is meant to use the
conventional methodology to sketch these petals, the most common
method is using a polygon network (see FIG. 10A in related
application NU17-001, e.g. vertex 1010 of FIG. 10B). The positions
and orientations of the respective vertexes of said polygon network
may serve as the nodes that engage with an operator/cursor to
modify the associated 3D graphical pattern (e.g. 1001). Such a
method works fine for static models, but it may not be suitable
for continually modifying a polygon network when the processing
time/power of an application is limited (this is a typical case for
video games). As one can imagine, continually modifying the
position, i.e., P_2 (X_F2, Y_F2, Z_F2), and
orientation (e.g. the parameters 1709) of a large number of
vertexes imposes a heavy load on the CPU and GPU; the processing
time required for such a task can be extended when the number of
said vertexes is quite large. To cope with this problem, the
presently disclosed 3D GUI may extract a few graphical
vectors/motion vectors from said roses (e.g. the normal vector
N_1701L, 1712, and 1713, etc.) to meet the goal. Using selected
graphical vectors/motion vectors of an object (e.g. 1701L) to
engage with a 3D cursor (1707A) may make the process of
calculating the effect of an engagement much more efficient; this
capability not only reduces the number of vertexes to be processed,
but also enhances the viewer's comprehension of the interaction
(the result is more predictable and understandable).
[0075] Here the readers are advised that both a 3D graphical vector
(e.g. normal vector N_1701L) and a 3D motion vector (denoted by
the coordinate system 1707 of cursor 1707A) can be denoted by
matrices having the same dimensions (e.g. 3×3). Thus, a
straightforward matrix multiplying process (e.g. multiplying the
normal vector N_1701L by said motion vector of cursor 1707A,
etc.) may generate the resultant motion vector in a relatively
short period of time. In the presently disclosed 3D GUI, one can
execute this kind of process for the graphical objects (e.g. the
petals of the rose 1701R) that are selected for cursor engagement;
the ones that are not selected can be waived from such a process. By
designating objects for said matrix multiplying process, the
interaction between a portion of a complicated large object (e.g.
petals of rose 1701R) and said cursor (1707A) can be calculated
in a locally constrained manner, which is much faster and more
understandable to the user. In Eq. (16), the presently disclosed
invention demonstrates that the parameters n, a, and s of the
matrix T can be used to denote the normal vector, sliding vector,
and approaching vector of an end effector of a robot. In FIG. 5A,
we are using two local coordinate systems, i.e., (1708) and (1709),
to denote the matrixes of each of said roses (1701L) and (1701R)
respectively. So, by multiplying matrix (1707) with (1708), or
multiplying matrix (1707) with (1709), based on Eq. (16), the
presently disclosed 3D GUI is able to generate many kinds of
interactions between said cursor (1707) and said roses (e.g. 1701L
or 1701R), with their respective results predictable and
understandable by the operator.
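A minimal sketch of this matrix-multiplication interaction, assuming simple 3×3 orientation matrices for the cursor (1707) and for a rose's local frame (1708); the specific matrices are illustrative, not the disclosed ones:

```python
# Sketch: multiply the cursor's 3x3 motion matrix with an object's local
# 3x3 frame to obtain the resultant interaction, instead of updating
# every vertex of the object's polygon network.
import numpy as np

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

cursor = rot_z(np.radians(15))   # motion matrix of cursor (1707), assumed
rose_local = np.eye(3)           # local frame of rose 1701L (1708), assumed

result = cursor @ rose_local     # resultant matrix: the interaction,
print(result)                    # applied only to objects selected for
                                 # cursor engagement
```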
To further make such an interaction sensitive to the distance
between said roses and said cursor (1707), the presently disclosed
3D GUI has allocated three zones of engagement, each of which
denotes a different level of engagement (i.e., (1702), (1703), and
(1704)). As FIG. 5A shows, zone (1702) is dedicated to rose
(1701L), zone (1703) is dedicated to rose (1701R), and zone (1704)
is dedicated to a region that engages both roses (1701L) and
(1701R). Note that these zones are by and large related to the
relative distances between the centroids of said roses (i.e., P_1
(X_F1, Y_F1, Z_F1) and P_2 (X_F2, Y_F2, Z_F2)) and said cursor
(1707). As an example, when said cursor is moved from (1707A) to
(1707B), its tip, i.e., P_c (X_c, Y_c, Z_c), enters the effective
zone of engagement (1704), so the presently disclosed 3D GUI (207)
will wake up the associated processes of interactions for both
roses (1701L) and (1701R), such as swaying, etc. When the cursor
(1707B) further approaches either zone (1702) or (1703), the level
of engagement between said cursor and one of the roses (i.e.,
either rose (1701L) or (1701R)) will be enhanced; certain actions
such as flower blooming may proceed accordingly.
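The zone logic itself can be sketched as a distance test against the rose centroids P₁ and P₂ (radii and positions below are assumed values for illustration):

```python
# Sketch: decide which engagement processes to wake up from the distance
# between the cursor tip P_c and the rose centroids P_1, P_2.
import numpy as np

p1 = np.array([0.0, 0.0, 0.0])   # centroid of rose 1701L (assumed)
p2 = np.array([2.0, 0.0, 0.0])   # centroid of rose 1701R (assumed)
r_single, r_shared = 0.8, 1.5    # zone radii (assumed)

def engaged(pc):
    d1, d2 = np.linalg.norm(pc - p1), np.linalg.norm(pc - p2)
    if d1 < r_single:                        # zone 1702: rose 1701L only
        return ["1701L"]
    if d2 < r_single:                        # zone 1703: rose 1701R only
        return ["1701R"]
    if d1 < r_shared and d2 < r_shared:      # zone 1704: both roses
        return ["1701L", "1701R"]
    return []

print(engaged(np.array([1.0, 0.3, 0.0])))   # inside shared zone 1704
```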
When the presently disclosed 3D GUI engages with a fairly large
number of objects (e.g. hundreds or thousands of said roses), and
said objects may be changing their respective patterns or colors
during the course of engagement, designating so many zones of
engagement for each flower may not be a practical way of
processing. To cope with this problem, the presently disclosed 3D
GUI (207) uses artificial intelligence (e.g. SVM, Convolutional
Neural Network (CNN), etc.) to classify some of the objects by
their respective graphical vectors (which may comprise the color
indexes as well), images, and motion vectors. FIG. 5A shows an
exemplary feature vector space established by the selected
graphical vectors and motion vectors of roses (1701L) and (1701R).
In certain applications, said feature vectors can be established by
a set of realistic data (e.g. images, etc.) measured by an
instrument (e.g. DICOM data, a set of image data in JPG format;
DICOM is the standard for the communication and management of
medical imaging information and related data) rather than being
generated by any 3D graphical vectors; the above stated artificial
intelligence means (e.g. a process module 610 that carries the CNN
feature, etc.) still can work effectively and efficiently, i.e.,
classifying the respective objects in the presently disclosed 3D
GUI (207), with the occasional help of human judgment. In the
following, for the brevity of explanation, we use the graphical
vectors to illustrate the merit of the presently disclosed
invention. Nevertheless, readers are advised that such a simplified
way of narration does not rule out the utilities of the other types
of data, such as the realistic images. As FIG. 5B shows, using the
kernel trick taught in the former paragraphs (i.e., mapping said
vectors to a higher dimensional space), the roses fall into the
class of (1701L), which are denoted by the spots in FIG. 5B, and
the ones falling into the class of (1701R), which are denoted by
the asteroids (stars) in FIG. 5B, can form a distinct boundary
(1711) in the feature vector space (denoted by the coordinate
system X.sup.d(d>3)-Y.sup.d(d>3); thus, using a hyperplane
(1710), the presently disclosed 3D GUI can determine which class of
said roses is engaging with the cursor (1707). At this stage (i.e.,
FIGS. 5A and 5B), the interaction between an operator (i.e., cursor
1707B) and a 3D vector graphic in the presently disclosed 3D GUI
(e.g. the rose 1701L or 1701R of FIG. 5A) is denoted by matrix
operations (e.g. multiplying 1707, 1708, and 1709, which are all
3.times.3 matrices). The resultant product is still a matrix; it
denotes multiple interactions between an operator, which in effect
is the 3D cursor (e.g. 1707B) in the presently disclosed 3D GUI
(207), and said 3D vector graphic (e.g. rose 1701L or 1701R). FIGS.
5C and 5D further depict the neural signal processing steps taken
by the presently disclosed 3D GUI (207) for an operator (i.e., the
cursor 1707B) to engage with a 3D vector graphic (i.e., 1714; as
has been stated in the above, 1714 may also denote a set of data
pertaining to a realistic object; for ease of explanation, we
proceed using the vector graphics in the following discussion). As
FIG. 5D shows, the content or context of a 3D vector graphic (1723)
can be denoted by a plurality of 3D features, e.g. eyes, lips, etc.
To extract said 3D features, the presently disclosed 3D GUI
designates a few 3D zones (e.g. 1723D1, 1723D2, 1723D3, and 1723D4,
etc.) in said 3D vector graphic (1723). As FIG. 5D further shows,
within each said 3D zone, a 3D feature is constructed by a set of
3D graphical vectors (e.g. 1723GV-X). When any of said 3D graphical
vectors changes its properties (e.g. length, direction, color
index, etc.), said 3D features shall be adjusted accordingly.
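By way of illustration only, the following minimal Python sketch (the class names and the encoding are hypothetical, not part of the present disclosure) shows one plausible way a 3D feature could be recomputed from its constituent graphical vectors whenever one of those vectors changes a property such as length, direction, or color index:

```python
import numpy as np

class GraphicalVector:
    """A hypothetical 3D graphical vector carrying a color index."""
    def __init__(self, direction, length, color_index):
        self.direction = np.asarray(direction, dtype=float)
        self.length = float(length)
        self.color_index = float(color_index)

    def as_components(self):
        # Flatten geometry and color into one descriptor.
        return np.concatenate([self.direction * self.length,
                               [self.color_index]])

class Zone3D:
    """A hypothetical 3D zone (e.g. an eye region) built from vectors."""
    def __init__(self, vectors):
        self.vectors = vectors

    def feature(self):
        # The feature is recomputed from the current vector properties,
        # so any edit to a graphical vector adjusts the derived feature.
        return np.concatenate([v.as_components() for v in self.vectors])

eye = Zone3D([GraphicalVector([1, 0, 0], 2.0, 0.3),
              GraphicalVector([0, 1, 0], 1.5, 0.7)])
before = eye.feature()
eye.vectors[0].length = 2.5   # the designer edits one graphical vector
after = eye.feature()         # the 3D feature adjusts accordingly
print(before, after, sep="\n")
```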
In practice, a GUI designer can designate many subtle variations to
said 3D graphical vectors (e.g. 1723GV-X) so that the corresponding
3D features can be used to denote a rich set of facial expressions
(e.g. sad, happy, etc.) of said human head (1723). In the present
invention, the variations of the feature vectors derived from said
graphical vectors are denoted as being the neural input signals;
the variations of said human expression are denoted as the final
neural output signal. Within the presently disclosed 3D GUI, there
may be several layers of neural nodes (e.g. 1716) that are arranged
in a serial or parallel manner to process said neural input
signals (this type of multilayered processing of neural signals
constitutes deep machine learning). During operation, the presently
disclosed 3D GUI first converts said features (i.e., a set of
graphical vectors) into a plurality of neural input signals; when
said neural input signals pass through said layers of neural nodes
to become the final neural output signals, some functions in the
presently disclosed 3D GUI may be activated or deactivated in
accordance with the final neural output signals; thus, the operator
of the presently disclosed 3D GUI gets an impression that a
computer carrying the presently disclosed 3D GUI (207) is able to
perform some intelligent functions based on the neural signals,
e.g. the variation of the facial expression of said human head
(1723). For example, when the expression of said human head (1723)
is happy, cheerful background music can be played; when the
expression of said human head (1723) is sad, consoling background
music can be played. In the presently disclosed 3D GUI, there is a
3D cursor (1707B) that facilitates the interactions between the
operator and said vector graphic (1714); this 3D cursor (1707B) is
different from the one used by the prior art (e.g. a cursor in a 2D
GUI) in that it designates a 3D zone (1707C) instead of merely a
point in the 3D space to interact/engage with a 3D graphical entity
(e.g. a 3D feature). When a 3D cursor (1707B) accesses a 3D zone
(e.g. 1723D1, 1723D2, 1723D3, or 1723D4), the features contained
therein can be adjusted by the operator (i.e., by forms of matrix
operation); this denotes that the presently disclosed 3D GUI has
effective means to manipulate the above-stated neural signals. In
practice, said 3D zone (e.g. 1723D1) may have a specific pattern
(e.g. a rectangular box with a specific length, width, and height)
that is designated by the presently disclosed 3D GUI, but it can be
changed whenever the operator requires. FIG. 5C shows the typical
steps that the presently disclosed 3D GUI takes to process said
neural input signals. When a 3D cursor (1707B) picks out a 3D zone
(e.g. 1707C) for analysis, it creates a neural input signal as
stated in the above; the typical methods comprise using techniques
such as the convolutional neural network (CNN) to derive the
signals (1714-1, 1714-2, 1714-3, 1714-4, etc.). Typical CNN
activation functions are the hyperbolic tangent function, the
sigmoid function, etc. In the
presently disclosed 3D GUI, or more specifically, the Neural
Network internal process module (610) in FIG. 3B, a layer such as
(1714S) can be called the CNN layer; the remaining layers (e.g.
1715, 1716, 1717, etc.) may carry similar functions, but
occasionally they may omit said convolutional functionality in an
attempt to save processing power. The merit of a CNN is akin to
applying a non-linear function on the raw input data (e.g., a 3D
vector graphic, or a set of DICOM images, etc.), which may help the
presently disclosed 3D GUI to extract certain features (e.g.
corners, serifs of texts, or a type of medicine flowing in an
organ, etc.) from the 3D vector graphics more reliably. In the
present invention, the neural input signals of said 3D zones
(1714-1, 1714-2, 1714-3, 1714-4, etc.) are also called the feature
vectors; contextually, a vector graphic (e.g. human head 1723) can
be denoted by a plurality of said feature vectors. Likewise, in a
medical image, there could be a plurality of feature vectors
embedded therein. Taking FIG. 5C as an example, the feature vectors
are, respectively, 1714-1, 1714-2, and 1714-3; together, said
feature vectors construct a feature vector space (1714S) whose
dimension can be very high. Within the internal process module of
the Neural Network (610), FIG. 4B, several neural input signals can
be linked to a common neural node to denote their combined effect;
in a deep learning machine (e.g. CNN), an output signal per said
combined effect can be linked to the input of another neural node;
by doing so repetitively, a layered structure of a neural network
can be formed. When said neural signals are propagating through
said layers of neural nodes, some unique functions can be applied
to the respective neural signals to enhance/suppress their
influences on the final neural output signals. For example, the
feature vectors (e.g. 1714-1, 1714-2, etc.) of FIG. 5C can be
processed by the Kernel functions K.sub.x (i.e., 1715,
x=1~n). Still further, the feature vectors after being
processed by said Kernel functions K.sub.x can be adjusted by the
weight factors W.sub.x (1717) (i.e., multiplying W.sub.x by the
output signals of the first-layer neural nodes (1716)). Still further, the
Neural Network module (610) can adjust the threshold value (not
shown) for each of the respective neural nodes. In the exemplary
case of FIG. 5D, there are four 3D zones in the human head 1723
(i.e., 1723D1, 1723D2, 1723D3, and 1723D4, respectively); said four
3D zones may denote four neural input signals of a neural network,
whereby four feature vectors (1714-1, 1714-2, 1714-3, and 1714-4)
are generated in the corresponding feature vector space (1714S).
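As a hedged sketch of this step (the zone labels and signal values below are hypothetical), the four per-zone neural input signals can be collected into the feature vector space as follows:

```python
import numpy as np

# Hypothetical neural input signals for the four 3D zones of the human
# head (1723); each has already been reduced to a fixed-length vector.
zone_features = {
    "1723D1_right_eye": np.array([0.8, 0.1]),
    "1723D2_left_eye":  np.array([0.7, 0.2]),
    "1723D3_lips":      np.array([0.4, 0.9]),
    "1723D4_hair":      np.array([0.5, 0.5]),
}

# Stacking the four feature vectors (1714-1 ... 1714-4) yields points in
# the feature vector space (1714S); its dimension grows with the number
# of zones and the length of each per-zone signal.
feature_space = np.stack(list(zone_features.values()))
print(feature_space.shape)   # (4, 2)
```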
Applying the knowledge learned from the support vector machine, when
said Kernel functions K.sub.x (1715) and said weight factors
W.sub.x (1717) are set at proper conditions, an entire feature
vector space can be represented by (1725). As FIG. 5C shows, within
said feature vector space (1725), the feature vector (1719-1)
denotes a final neural output signal that is located on the right
side of the hyperplane (1721), which actually designates a unique
status of the neural network (610). Likewise, the feature vector
(1719-2) is a final neural output signal that is located on the
left side of the hyperplane (1721), which actually designates
another status of the neural network (610) that is different from
that of (1719-1). As FIG. 5C shows, said two statuses of the neural
network (610) are separated from one another by two margin lines
(1718-1 and 1718-2); the gap between said two margin lines is
denoted by two unique feature vectors which belong to the opposite
classes but are the closest to one another (i.e., 1719-1 and
1719-2) among all feature vectors. When said gap of margin is wide
enough to be recognized clearly by the internal process module of
support vector machine (616), the corresponding neural output
signals (i.e., all feature vectors in 1725) can be used to turn on
or off certain functions of a computer or electronic system
accurately and reliably.
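To make this classification step concrete, the sketch below substitutes scikit-learn's SVC for the internal process module of the support vector machine (616); the synthetic data, the class layout, and the on/off rule are illustrative assumptions, not the specific implementation of the present disclosure:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic feature vectors for two classes of final neural output
# signals, standing in for the two sides of the hyperplane (1721).
class_a = rng.normal(loc=(-1.0, -1.0), scale=0.3, size=(50, 2))
class_b = rng.normal(loc=(1.0, 1.0), scale=0.3, size=(50, 2))
X = np.vstack([class_a, class_b])
y = np.array([0] * 50 + [1] * 50)

# An RBF kernel implicitly maps the vectors into a higher-dimensional
# space (the kernel trick), where a separating hyperplane is sought.
svm = SVC(kernel="rbf", C=1.0).fit(X, y)

# The margin is set by the support vectors closest to the boundary.
print("support vectors per class:", svm.n_support_)

def toggle_function(feature_vector):
    # The side of the hyperplane on which an output signal falls decides
    # whether a given function of the system is turned on or off.
    side = svm.decision_function([feature_vector])[0]
    return "ON" if side > 0 else "OFF"

print(toggle_function([0.9, 1.1]))    # near class_b -> ON
print(toggle_function([-1.2, -0.8]))  # near class_a -> OFF
```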
[0076] In practice, the format and resolution of said feature
vector can have many varieties (e.g. a real number between zero and
one, etc.). For example, a feature vector 1714-2 (x, x=0.0~1.0; y,
y=0.0~1.0) can be used to denote neural signal data that represent
the condition of the left eye (1723D2) of the human head (1723).
Likewise, the feature vector 1714-1 (x, x=0.0~1.0; y, y=0.0~1.0)
can be used to denote neural signal data that represent the
condition of the right eye (1723D1) of the human head (1723). When
a neural node combines the above two
neural input signals with different weight factors (e.g. W.sub.x
1717), it may generate a variety of neural output signals;
extending this scenario to a fairly large number of neural input
signals, the vast variety of the corresponding neural output
signals can be used to denote very complicated conditions (e.g. the
facial expression of said human head (1723) in a sad, happy,
pondering, frowning, or frightened mood, etc.). When the number of
said neural input signals is very large (e.g. there are far more
feature vectors than 1714-1, 1714-2, 1714-3, and 1714-4), we may
characterize the corresponding vector spaces (1714S for the input
signals, 1725 for the output signals) as high-dimensional ones. The
advantages and drawbacks of a very high-dimensional feature vector
space are the following. When said dimension is very high, it denotes an
advantage that the presently disclosed 3D GUI (207) is able to
generate many functions based on the status of said neural output
signals. Prior art (i.e., conventional GUI) has never reached such
a profound level of interaction between an operator and a 3D GUI.
The fundamental drawback of a neural network having a very
high-dimensional feature vector space is that the processing load
on the CPU and GPU increases drastically. To accommodate the issue
that said dimension of the feature vectors may vary in different
applications, the presently disclosed 3D GUI (207) may temporarily
disable the neural nodes that are unused as a contingency for
reducing the processing load of the CPU and GPU. For example, the
3D zone (1723D4) is located on the back of said human head (1723);
it can be used to denote the hair of said human head (1723), e.g.
curl, straight, trimmed, etc. When an interaction between the user
and the presently disclosed 3D GUI (207) has nothing to do with
hair, the processing steps of generating the neural signals
associated with the 3D zone (1723D4) can be temporarily disabled;
this in turn increases the processing speed of the neural network
module (610). When the above method is insufficient to tackle the
situation (e.g. said 3D vector graphic 1723 is relatively
complicated), the presently disclosed 3D GUI (207) uses graphical
means (i.e., perspective sketching) to reduce the dimension of said
feature vector space.
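Before turning to that graphical means, the node-disabling contingency and the weighted combination of the two eye signals described above can be sketched as follows (the zone names, weights, and threshold are hypothetical):

```python
import numpy as np

# Per-zone neural input signals; each is an (x, y) pair in [0.0, 1.0].
signals = {
    "1723D1_right_eye": np.array([0.8, 0.1]),
    "1723D2_left_eye":  np.array([0.7, 0.2]),
    "1723D3_lips":      np.array([0.4, 0.9]),
    "1723D4_hair":      np.array([0.5, 0.5]),
}

# The current interaction has nothing to do with hair, so zone 1723D4
# is temporarily disabled and its signal is never computed or combined.
active = {k: v for k, v in signals.items() if k != "1723D4_hair"}

# A single neural node combining the two eye signals with weight
# factors W_x and a sigmoid activation (one hedged reading of the text).
w_right, w_left, bias = 0.6, 0.4, -0.5
z = (w_right * active["1723D1_right_eye"].mean()
     + w_left * active["1723D2_left_eye"].mean())
output = 1.0 / (1.0 + np.exp(-(z + bias)))   # sigmoid activation
print(f"node output: {output:.3f}")
```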
[0077] At this stage, the 3D vector graphic (1723) of FIG. 5D is
still in 3D form, i.e., all the features contained therein,
and their respective graphical vectors, are denoted by the X, Y,
and Z values of a 3D coordinate system. Upon assessing the content
or context of a 3D vector graphic by the neural network, the
presently disclosed 3D GUI (207) provides a unique and convenient
means to reduce the dimension of a vector graphic (e.g. 1723) from
3D to 2.5D without degrading the performance of the neural network
module (610) too much; by doing so, the overall performance (e.g.
speed, power consumption, etc.) of the presently disclosed 3D GUI
(207) can be increased without losing the accuracy of assessing the
respective neural input signals. Before we elaborate on such a
unique feature in further detail, we may refer back to section 6.7 of
NU17-001 to recite the methodology used by the presently disclosed
3D GUI (207) to manipulate the apparent dimension of a vector;
specifically, said method has to do with the perspective sketching
techniques. As is discussed in section 6.7 of related application
NU17-001, Ser. No. 16/056,752, which is fully incorporated herein
by reference, the presently disclosed 3D GUI is able to manifest
the sensation of three dimensions of a 3D scene by classifying its
graphical vectors in accordance with the perspective angles (i.e.,
aligning some edge lines to the vanishing points and/or vanishing
lines). From a more mathematical point of view, this feature is in
fact the result of a mapping process from a 3D vector graphic to a
2.5D one. As section 6.7 of NU17-001 further explained, the location
of the vanishing points and vanishing lines in a perspective
graphic (e.g. FIG. 10J in that application) affects the viewer's
comprehension profoundly. Such a rule of graphical sketching also
affects the level of comprehension of a machine that uses
artificial intelligence to assess the content or context of a 3D
vector graphic, or a 3D image acquired by an instrument; the
fundamental reason is that a 2.5D graphical perspective sketch
bears the capability to control the sensation of three dimensions
of a 3D scene, regardless of whether the viewer is a
live person or a machine. Applying this methodology to the current
section, the presently disclosed 3D GUI maps the 3D graphical
vectors (e.g. 1723GV-X) of the vector graphic (1723) to a 2D image
frame (i.e., 1724); by deliberately choosing the locations of the
respective vanishing points (i.e., VP1, VP2, and VP3), the
processing load on the neural network module (610) is greatly
reduced. In order to illustrate such a merit more clearly, a
contour box (1723-CB) is added to FIG. 5C to denote several
principal graphical vectors therein, i.e., (1722-X), (1722-Y), and
(1722-Z). These principal graphical vectors have dual utilities; as
FIG. 5C shows, said principal graphical vectors (1722-X), (1722-Y),
and (1722-Z) are the normal vectors of the three facets of said
contour box 1723-CB (thereby they are the X, Y, and Z axes of the
coordinate system of said contour box); in this respect, said
principal graphical vectors are 3D graphical entities. On the other
hand, said three principal vectors can be projected onto the 2D
image frame (1724); in this respect, said principal graphical
vectors (1722-X), (1722-Y), and (1722-Z) are 2D graphical entities
representing the vanishing lines of the 2.5D coordinate system.
Once said vanishing lines have been projected onto said 2D image
frame (1724), the remaining 3D graphical vectors (e.g. 1723GV-X)
may follow the same processing steps to map themselves onto said 2D
image frame (1724). Thus, when the neural network process module
(610) is extracting feature vectors from the image frame (1724),
the vector graphic contained therein appears to the viewer as being
a 3D graphic (e.g. 1723), but it is already a 2.5D graphical
entity. In our discussions in related docket no. NU17-001 fully
incorporated herein, we have explained that vanishing points (e.g.
VP1, VP2, and VP3, etc.) are used by the graphical artists to
denote the converging effect of the basic graphical element (pixel
or voxel) in a 2.5D perspective sketch. When we take this graphical
art to a physicist, we will receive the explanation that the
degrees of freedom of the pixel or voxel in said 2.5D perspective
sketch decrease in accordance with their distance to said
vanishing points. Conventional GUIs do not know all this; they
treat each pixel or voxel as a mathematical point, to which size
and direction are irrelevant. The presently disclosed 3D GUI treats
each pixel or voxel as a real object. In the present disclosure,
the mathematical formulas denoting the relationship among the X, Y,
and Z values of a 2.5D coordinate system are Eqs (16) and (17). As
is stated in the above, when a 3D vector graphic is projected onto
a 2D image frame (e.g. 1724) by way of perspective sketching, the
degrees of freedom of the respective pixels decrease in accordance
with a unique profile designated by said vanishing points. After
the presently disclosed 3D GUI (207) has projected the 3D graphical
vectors (e.g. 1723GV-1) onto said 2D image frame (1724), the
features contained in the respective 3D zones (e.g. 1723D1) have
all been transformed into 2D ones (more specifically, 2.5D ones).
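A simple pinhole-style perspective projection, given below as an illustrative stand-in (the disclosure's own mapping places the vanishing points VP1, VP2, and VP3 deliberately), exhibits the relevant property: projected points converge as their depth increases, so features that look three-dimensional occupy a lower-dimensional (2.5D) representation:

```python
import numpy as np

def perspective_project(points_3d, focal_length=1.0, camera_z=5.0):
    """Project 3D points onto a 2D image frame (pinhole model).

    Parallel 3D lines converge toward vanishing points in the projected
    image, the property the dimension reduction above relies on.
    """
    pts = np.asarray(points_3d, dtype=float)
    depth = camera_z - pts[:, 2]        # distance from the camera plane
    scale = focal_length / depth        # foreshortening factor
    return pts[:, :2] * scale[:, None]  # (x, y) on the image frame

# Two endpoints of a 3D graphical vector (cf. 1723GV-X).
segment = np.array([[1.0, 1.0, 0.0],
                    [1.0, 1.0, 3.0]])
print(perspective_project(segment))
```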
When the dimension of a graphical vector is reduced (i.e., from 3
to 2.5), the dimension of the corresponding feature vectors will be
reduced accordingly. In FIG. 5D, we use the Kernel function K.sub.x to
reduce the dimension of the feature vectors. In the earlier
paragraphs of the present section, we have explained that the
fundamental merit of the Kernel function of an SVM is equivalent to
the dotting (i.e., dot-product) process of two vectors (exemplary ones are Eqs. 11
through 15). In FIG. 10, we have demonstrated that manipulating the
perspective angle of a graphical entity (e.g. J.sub.2.5D,
K.sub.2.5D) in a 2.5D displaying device (207W1) as shown in FIG. 10
is equivalent to performing the dotting process between the
graphical vectors (e.g. {right arrow over (D.sub.J )} and {right
arrow over (D.sub.K)}) and the unit vector of its X.sub.(2.5) axis.
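The stated equivalence can be checked numerically; in the sketch below (synthetic numbers, hypothetical variable names), rotating the unit vector of the X.sub.(2.5) axis, i.e., changing the perspective angle, changes a graphical vector's projected component exactly as a dot product would:

```python
import numpy as np

d_j = np.array([3.0, 1.0])                 # a graphical vector in the image plane
for angle in (0.0, np.pi / 6, np.pi / 3):  # candidate perspective angles
    x_axis_25d = np.array([np.cos(angle), np.sin(angle)])  # unit vector of the X_2.5 axis
    print(f"angle={angle:.2f} rad, projected component={d_j @ x_axis_25d:.3f}")
```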
Applying these understandings to FIG. 5D, the presently disclosed
3D GUI generates a unique Kernel trick; this Kernel trick uses
geometrical means (not the algebraic ones as Eq. 11 through 15 have
shown) to map a 3D vector graphic (e.g. 1723) onto a 2.5D image
frame (1724). When a vector graphic lying in a 2D image frame (e.g.
1724) uses a 2.5D coordinate system to represent the 3D graphical
vectors, we call such a graphical vector a 2.5D one; the feature
vector and vector graphic derived from said 2.5D vector graphic
henceforth are 2.5D ones as well. Thus, the methodology developed by
the present invention (i.e., a Kernel trick of SVM developed from
perspective sketching) is a straightforward and powerful way to
reduce the total dimension of the feature vector space generated
from a raw 3D vector graphic (e.g. 1723), or a set of data (e.g.
DICOM) that denote an object in the 3D space. To appreciate the
merit of such a methodology from a physical point of view, one has to
first understand the essential properties of the vanishing point.
As has been explained previously, a vanishing point does not carry
any information pertaining to size or directionality; that is, when
the distance between the viewer and said vanishing point is
relatively large, there is no way for said viewer to differentiate
two neighboring objects; this denotes that the apparent degree of
freedom at said vanishing point is essentially zero. Thus,
transforming a 3D vector graphic into a 2.5D one in accordance with
the rule of perspective sketching is a very powerful method to
generate the features that appear to the viewer as the 3D ones, but
in reality the dimensions of said feature vectors have been
reduced. There is a side effect of this methodology: the contour
smearing effect may take place when said Kernel trick is overdone.
Nevertheless, by carefully choosing said perspective angle,
the presently disclosed 3D GUI (207) can control said contour
smearing effect to a reasonably low level. Enlightened by the
vector classifying power of SVM, and the finesse used by the
graphical artists of impressionism, the presently disclosed 3D GUI
(207) adds a few vanishing points (e.g. VP1, VP2, and VP3) to a 3D
vector graphic (e.g. 1723) as an unprecedented means to reduce the
dimension/resolution of the corresponding feature vectors.
Buttressed by the collaboration with the internal process
module of Perspective (607), and the support of the internal
process module of Support Vector Machine (616), the dimension of
the feature vector space is reduced effectively and certain unique
sensations of graphics may be rendered to the viewer; in this
situation, the performance of the presently disclosed 3D GUI (207)
has been enhanced profoundly.
[0078] Readers are advised that a feature vector does not always
have to be derived from the graphical vectors (e.g. 1723GV-X);
there are other signals, such as the realistic images, sound (e.g.
multi-channel sound), or motion vectors, etc., that can serve as
the source of a feature vector. As has been stated in the earlier
paragraph of this section 6.2, when a 3D scene presents a
plurality of objects that have various non-linear motions, it may
be inferred that the objects presented therein have unique
gestures. Together, the patterns (i.e., features in a vector
graphic) and the motion vectors of the objects (some motion vectors
can be generated by the presently disclosed 3D GUI directly)
constitute our preliminary comprehension of the world through
visualization.
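As a hedged illustration (hypothetical values), a pattern feature and a motion vector can be concatenated into a single feature vector so that the same classification machinery reasons about pattern and motion jointly:

```python
import numpy as np

# Hypothetical per-object descriptors: a pattern feature extracted from
# the vector graphic and a motion vector supplied by the 3D GUI.
pattern_feature = np.array([0.12, 0.85, 0.40])   # e.g. shape/color summary
motion_vector = np.array([0.02, -0.10, 0.05])    # e.g. non-linear sway

combined = np.concatenate([pattern_feature, motion_vector])
print(combined.shape)   # (6,)
```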
[0079] Note that the magnitude and direction of the motion vectors
of the objects engaging with the cursor (1707) will not be affected
by the perspective angle. In other words, while an operator is
using the cursor (1707) to engage with the 3D objects, Genie (204)
can freely move around to seek the best perspective angle to
present the result; the two processes (i.e., object engagement and
perspective angle adjustment) do not interfere with one another.
[0080] Based on the above scenario, one comes to an understanding
that, supported by the extraordinary machine learning capability of
the presently disclosed 3D GUI (207), the interaction between an
operator and the objects presented/controlled by a 3D graphical
rendering system using the presently disclosed 3D GUI, such as a
movie player of "Fantasia.RTM. 3D", a 3D medical image rendering
system, a graphic sketching system, or a manipulator of a
sophisticated robot, etc., would be far more intuitive and
pervasive than that offered by its predecessors, and this merit
goes beyond the scope of intelligent perspective angle adjustment.
[0081] As is understood by a person skilled in the art, the
sections of the present disclosure are illustrative of the present
disclosure rather than being limiting of the present disclosure.
Revisions and modifications may be made to methods, processes,
materials, structures, and dimensions through which is made and
used a 3D GUI that imparts linear and nonlinear motion vectors
corresponding to different degrees of freedom of a 3-dimensional
object to its basic graphical elements, such as pixels, voxels, or
to a complete 3D maneuverable system such as a robot and includes
the artificial intelligence methodology of machine learning (ML),
support vector machine (SVM), and convolutional neural network
(CNN) to enable a more complete, yet comprehensible control of
complex systems such as 3D graphics and robots, while still
providing such methods, processes, materials, structures and
dimensions in accordance with the present disclosure as defined by
the appended claims.
* * * * *