U.S. patent application number 13/042196 was filed with the patent office on 2012-05-31 for system and method for cognitive processing for data fusion.
Invention is credited to Tuan A. DUONG, Vu A. Duong.
Application Number | 20120136913 13/042196 |
Document ID | / |
Family ID | 44649765 |
Filed Date | 2012-05-31 |
United States Patent
Application |
20120136913 |
Kind Code |
A1 |
DUONG; Tuan A. ; et
al. |
May 31, 2012 |
SYSTEM AND METHOD FOR COGNITIVE PROCESSING FOR DATA FUSION
Abstract
A system and method for cognitive processing of sensor data. A
processor array receiving analog sensor data and having
programmable interconnects, multiplication weights, and filters
provides for adaptive learning in real-time. A static random access
memory contains the programmable data for the processor array and
the stored data is modified to provide for adaptive learning.
Inventors: |
DUONG; Tuan A.; (Glendora,
CA) ; Duong; Vu A.; (Rosemead, CA) |
Family ID: |
44649765 |
Appl. No.: |
13/042196 |
Filed: |
March 7, 2011 |
Related U.S. Patent Documents
|
|
|
|
|
|
Application
Number |
Filing Date |
Patent Number |
|
|
61314055 |
Mar 15, 2010 |
|
|
|
Current U.S.
Class: |
708/819 ;
708/835 |
Current CPC
Class: |
G06N 3/063 20130101;
G11C 11/54 20130101; G06N 3/0454 20130101 |
Class at
Publication: |
708/819 ;
708/835 |
International
Class: |
G06G 7/16 20060101
G06G007/16 |
Goverment Interests
STATEMENT OF GOVERNMENT GRANT
[0002] The invention described herein was made in the performance
of work under a NASA contract, and is subject to the provisions of
Public Law 96-517 (35 USC 202) in which the Contractor has elected
to retain title.
Claims
1. (canceled)
2. A system for processing sensor data comprising: an input/output
bus, wherein the input/output bus receives sensor data and outputs
processed sensor data; a processor array, wherein the processor
array receives analog signals containing sensor data from the
input/output bus and the processor array comprises one or more
matrices of analog multiplication nodes and transfer function
elements and the processor array outputs the processed sensor data
from the one or more matrices of analog multiplication nodes and
transfer function elements; and a memory containing data values
controlling configurations of the processor array, wherein the
processor array comprises: a first array block, the first array
block comprising: a first programmable switch array configured to
receive the sensor data; a square array of first array analog
multiplication nodes, wherein each first array multiplication node
selectively receives analog signals from the first programmable
switch array, or one or more first array multiplication nodes, or
the first programmable switch array and one or more first array
multiplication nodes; and one or more first array transfer function
elements, wherein each first array transfer function elements
receives analog signals from one of the first array analog
multiplication nodes, and a second array block, the second array
block comprising: a second programmable switch array configured to
receive signals from the one or more first array transfer function
elements; a cascaded array of second array analog multiplication
nodes; one or more second array transfer function elements, wherein
each second array transfer function element receives analog signals
from one of the second array analog multiplication nodes; wherein
each second array multiplication node selectively receives analog
signals from the second programmable switch array, one or more
second array multiplication nodes, the second programmable switch
array and one or more second array multiplication nodes, or one of
the one or more second array transfer function elements.
3. The system according to claim 2, wherein outputs from one or
more of the first array transfer function elements are selectively
coupled to the first programmable switch array as inputs to the
first programmable switch array.
4. The system according to claim 2, wherein outputs from one or
more of the second array transfer function elements, or outputs
from one or more of the second array transfer function elements, or
outputs from one or more of the second array transfer function
elements and outputs from one or more of the second array transfer
function elements are selectively coupled to the second
programmable switch array as inputs to the second programmable
switch array.
5. The system according to claim 3, wherein outputs from one or
more of the second array transfer function elements, or outputs
from one or more of the second array transfer function elements, or
outputs from one or more of the second array transfer function
elements and outputs from one or more of the second array transfer
function elements are selectively coupled to the second
programmable switch array as inputs to the second programmable
switch array and selectively coupled to the first programmable
switch array as inputs to the first programmable switch array.
6. The system according to claim 2, wherein the memory contains
data values for selectively controlling the first programmable
switch array or the second programmable switch array or the first
programmable switch array and the second programmable switch
array.
7. The system according to claim 2, wherein the memory contains
data values for controlling multiplication factors for one or more
of the first array multiplication nodes and second array
multiplication nodes.
8. The system according to claim 7, wherein one or more of the
first array multiplication nodes and second array multiplication
nodes comprise a current mode multiplying digital to analog
converter having a digital multiplication factor stored in the
memory.
9. The system according to claim 2, wherein data values in the
memory are modified based upon the processed sensor data.
10. The system according to claim 2, wherein the transfer function
elements comprise programmable analog filters.
11. A method for processing sensor data comprising: receiving
sensor data in a plurality of first analog data streams; applying a
first set of multiplicative weights to the plurality of first
analog data streams to produce a plurality of first multiplied
analog data streams; filtering the plurality of first multiplied
analog data streams based on a first set of filter characteristics
to produce a plurality of preprocessed analog data streams;
providing a plurality of second analog data streams based on the
plurality of preprocessed analog data streams; applying a second
set of multiplicative weights to the plurality of second analog
data streams to produce a second plurality of multiplied analog
data streams; filtering the second plurality of multiplied analog
data streams based on a second set of filter characteristics to
produce a plurality of processed analog data streams; and
outputting the plurality of processed analog data streams to
produce processed sensor data.
12. The method according to claim 11, wherein the plurality of
first analog data streams comprises: sensor data, sensor data and
the plurality of preprocessed analog data streams; sensor data and
the plurality of processed data streams; or sensor data and the
plurality of preprocessed analog data streams and the plurality of
processed data streams.
13. The method according to claim 12, wherein the plurality of
second analog data streams comprises: the plurality of preprocessed
analog data streams, or the plurality of preprocessed analog data
streams and the plurality of processed analog data streams.
14. The method according to claim 13, wherein applying a first set
of multiplicative weights comprises selectively directing the
plurality of analog streams to a plurality of first multiplication
nodes, wherein each first multiplication node multiplies a first
multiplication node input by a weight selected from the first set
of multiplicative weights to produce a first multiplication node
output, wherein the first multiplication node input comprises: one
analog data steam of the plurality of analog data streams; a first
multiplication node output from another first multiplication node;
or a sum of one analog data steam of the plurality of analog data
streams and a first multiplication node output from another first
multiplication node, and wherein first multiplication node outputs
from selected first multiplication nodes comprise the first
plurality of multiplied analog data streams.
15. The method according to claim 13, wherein filtering the first
plurality of multiplied analog data streams comprises: providing an
array of first analog filters, each first analog filter having a
first analog filter input and a first analog filter output and a
first analog filter characteristic selected from the selected first
filter characteristics; and, directing each multiplied analog data
stream of the first plurality of multiplied analog data streams
into a first analog filter input of a selected first analog filter,
wherein first analog filter outputs comprise the plurality of
preprocessed analog data streams.
16. The method according to claim 13, wherein applying a second set
of multiplicative weights comprises: selectively directing the
plurality of preprocessed analog data streams to a plurality of
second multiplication nodes, wherein each second multiplication
node multiplies a second multiplication node input by a weight
selected from the second set of multiplicative weights to produce a
second multiplication node output, wherein the second
multiplication node input comprises: one preprocessed analog data
steam of the plurality of preprocessed analog data streams; a
second multiplication node output from another second
multiplication node; or a sum of one preprocessed analog data steam
of the plurality of preprocessed analog data streams and a second
multiplication node output from another second multiplication node,
and wherein second multiplication node outputs from selected second
multiplication nodes comprise the second plurality of multiplied
analog data streams.
17. The method according to claim 13, wherein filtering the second
plurality of multiplied analog data streams comprises: providing an
array of second analog filters, each second analog filter having a
second analog filter input and a second analog filter output and a
second analog filter characteristic selected from the selected
second filter characteristics; and, selectively directing each
multiplied analog data stream of the second plurality of multiplied
analog data streams to a second analog filter input of a selected
second analog filter or to a selected third multiplication node of
a plurality of third multiplication nodes, wherein each third
multiplication node multiplies a third multiplication node input by
a weight selected from a third set of multiplicative weights
selected from the second filter characteristics to produce a third
multiplication node output, wherein the third multiplication node
input comprises: a selected second analog filter output; a third
multiplication node output from another third multiplication node;
or a sum of a selected second analog filter output and a third
multiplication node output from another third multiplication node;
and selectively directing each second analog filter output to a
selected third multiplication node of a plurality of third
multiplication nodes or to the plurality of processed analog data
streams.
18. The method according to claim 11, further comprising selecting
or modifying at least one of the following sets based upon the
processed sensor data: the first set of multiplicative weights; the
first set of filter characteristics; the second set of
multiplicative weights; and, the second set of filter
characteristics.
19. The method according to claim 11, wherein the processed sensor
data provides knowledge about a sensor environment and the method
further comprises selecting or modifying at least one of the
following sets based upon the sensor environment knowledge: the
first set of multiplicative weights; the first set of filter
characteristics; the second set of multiplicative weights; and, the
second set of filter characteristics.
20. The method according to claim 11, further comprising converting
sensor data received in a digital format to an analog format.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is related to and claims the benefit of the
following copending and commonly assigned U.S. Provisional Patent
Application: U.S. Patent Application No. 61/314,055, titled "Real
Time Cognitive Computing Architecture for Data Fusion in a Dynamic
Environment," filed on Mar. 15, 2010; the entire contents of which
is incorporated herein by reference.
BACKGROUND
[0003] 1. Field
[0004] This disclosure relates to a system and method for real time
cognitive processing for data fusion in a dynamic environment. More
particularly, the present disclosure describes a method and system
for real-time, adaptive, intelligent, low power, high productive
and miniaturized processing using custom VLSI design for target
detection and classification.
[0005] 2. Description of Related Art
[0006] A general purposed computer can typically simulate or
support just about any application through iterated computation in
a sequential manner. Such general purpose computers typically use
an architecture known as a Von Neumann architecture that consists
of input/output, memory, and an Arithmetic Logic Unit. The Von
Neumann architecture (or Von Neumann machine) is a sequential
computation architecture. One draw back of the Von Neumann
architecture is that it is slow, regardless of computer speed.
[0007] To deal with complex data fusion applications as required in
military applications, particularly remote, real time applications
related to the dynamic environment, the Von Neumann machine may not
be effective for demands such as compactness, real time processing,
adaptive system and low power. There are many types of data (e.g.,
IR, LIDAR, RADAR, Visual, Olfactory) that need to be processed and
fused in real time. In data processing and fusion, especially for
sensors, time can be critical to every millisecond. The speed
requirements may present a challenge for a digital computer and the
architecture of a system as a whole. For example, sensors may
collect analog data that will be converted to a digital format
before sending the data to the digital machine in a sequential
manner for algorithmic computation. Each sequential step requires a
delay and processing time to digest data, and finally, the solution
that is provided by the computer may no longer be valid.
[0008] In contrast, a neural network architecture is a parallel
processing technique and its performance can be much faster as
compared with the digital machines. See, for example, J. J.
Hopfield, "Neural networks and physical systems with emergent
collective computational abilities," Proc. Natl. Acad. Sci. USA,
vol. 79, pp. 2554-2558, 1982 and T. A. Duong, S. P. Eberhardt, T.
Daud, and A. Thakoor, "Learning in neural networks: VLSI
implementation strategies," Fuzzy logic and Neural Network
Handbook, Chap. 27, Ed: C. H. Chen, McGraw-Hill, 1996. However,
neural network hardware is typically not as fully-programmable as a
digital computer. In neural network, one computer (or set of
computers) may perform the learning and download a weight set to a
neural network chip (see for example, E. Fiesler, T. Duong, and A.
Trunov, "Design of neural network-based microchip for color
segmentation," SPIE Proceeding of Applications and Science of
Computational Intelligence part III, Vol. 4055, pp. 228-237,
Florida, May 2000, and T. X. Brown, M. D. Tran, T. A. Duong, T.
Daud, and A. P. Thakoor, Cascaded VLSI neural network chips:
Hardware learning for pattern recognition and classification,
Simulation, 58, 340-347,1992), while another computer (or set of
computers) can perform on-chip learning with limited programming
capability (i.e., not flexible for other applications) (see, for
example, T. A. Duong, T. Daud, "On-Chip Learning of Hyper-Spectra
Data for Real-Time Target Recognition", Proceeding of IASTED
Intelligent System and Control, pp. 305-308, Honolulu, Hi., August
14-17, 2000; B. Girau, "On-Chip learning of FPGA-Inspired neural
nets," Proceeding of International of Neural Networks IJCNN'2001,
Vol. 1, pp. 212-215, 2001; C. Lu, B. Shi, and L. Chen, "A
Programmable on-chip learning neural network with enhance
characteristics," The 2001 IEEE Inter. Symposium Circuit and
Systems ISCAS 2001, Vol. 2, pp. 573-576, 2001; and G. M. Bo, D. D.
Caviglia, and M. Valle, "An on-chip learning neural network," Proc.
Inter. Neural Networks IJCNN'2000, Vol. 4, pp. 66-71, 2000). A
neural network hardware implementation also has a two-fold problem:
reliable learning techniques in limited weight space for learning
network convergence in a parallel architecture (see, for example,
T. A. Duong and Allen R. Stubberud, "Convergence Analysis Of
Cascade Error Projection--An Efficient Learning Algorithm For
Hardware Implementation," International Journal of Neural System,
Vol. 10, No. 3, pp. 199-210, June 2000; Hoehfeld, M. and Fahlman,
S., "Learning with limited numerical precision using the
cascade-correlation algorithm," IEEE Trans. Neural Networks, vol.
3, No. 4, pp. 602-611, 1992; and P. W. Hollis, J. S. Harper, and J.
J. Paulos, "The Effects of Precision Constraints in a
Backpropagation learning Network," Neural Computation, vol. 2, pp.
363-373, 1990), and a flexible architecture to solve a wide range
of problems. To break these problems, one must devise a reliable
learning neural network technique that is able to learn under a
limited weight space in milliseconds and a novel architecture that
is fully programmable through instruction sets, from which the real
time adaptive network in a chip can be achieved to solve a real
time applications in a dynamic environment.
[0009] Hence, there is a need in the art for a fully programmable
processing architecture that can address complex data fusion
applications in a dynamic environment.
SUMMARY
[0010] Embodiments of the present invention comprise a system and
method for cognitive processing of sensor data. A processor array
receiving analog sensor data and having programmable interconnects,
multiplication weights, and filters may provide for adaptive
learning in real-time. A static random access memory may contain
the programmable data for the processor array and the stored data
is modified to provide for adaptive learning.
[0011] One embodiment of the present invention is a system for
processing sensor data comprising: an input/output bus, wherein the
input/output bus receives sensor data and outputs processed sensor
data; a processor array, wherein the processor array receives
analog signals containing sensor data from the input/output bus and
the processor array comprises one or more matrices of analog
multiplication nodes and transfer function elements and the
processor array outputs processed sensor data from the one or more
matrices of analog multiplication nodes and transfer function
elements; and a memory containing data values controlling
configurations of the processor array.
[0012] Another embodiment of the present invention is a method for
processing sensor data comprising: receiving sensor data in a
plurality of first analog data streams; applying a first set of
multiplicative weights to the plurality of first analog data
streams to produce a plurality of first multiplied analog data
streams; filtering the plurality of first multiplied analog data
streams based on a first set of filter characteristics to produce a
plurality of preprocessed analog data streams; providing a
plurality of second analog data streams based on the plurality of
preprocessed analog data streams; applying a second set of
multiplicative weights to the plurality of second analog data
streams to produce a second plurality of multiplied analog data
streams; filtering the second plurality of multiplied analog data
streams based on a second set of filter characteristics to produce
a plurality of processed analog data streams; and outputting the
plurality of processed analog data streams to produce processed
sensor data.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
[0013] FIG. 1 shows a system block diagram including a Cognitive
Computing Architecture.
[0014] FIG. 2 shows a Cognitive Computing Architecture.
[0015] FIG. 3 shows an architecture for a Hybrid Intelligent
Processor Array.
[0016] FIG. 4 shows a block diagram of a Multiplying
Digital-to-Analog Converter.
[0017] FIG. 5 shows an electrical schematic of a 10-bit Multiplying
Digital-to-Analog Converter.
[0018] FIGS. 6A-6C show the Elaine image and results from
reconstructing the Elaine image.
[0019] FIG. 6D shows the results of a 10 component vector extracted
from the Elaine image shown in FIG. 6A.
[0020] FIGS. 7A-7C show a tank image and results from
reconstructing the tank image.
[0021] FIG. 7D shows the results of a 10 component vector extracted
from the tank image shown in FIG. 7A.
[0022] FIG. 8A shows an image of the MARS yard at the Jet
Propulsion Lab at 3 PM.
[0023] FIG. 8B shows the segmented output of the image shown in
FIG. 8A.
[0024] FIGS. 9A-9C show the 4 PM scene of FIG. 8A and the segmented
results of that scene.
[0025] FIGS. 10A-10C show the 5PM scene of FIG. 8A and the
segmented results of that scene.
[0026] FIGS. 11A-19B show various views of a scene and an object to
be detected and tracked within the various views of that scene.
[0027] FIGS. 20A-20C show the mixing of odorants within an
environment, the detection of those odorants, and the actual
odorant composition of the environment.
DETAILED DESCRIPTION
[0028] A system block diagram comprising an embodiment of the
present invention having a Cognitive Computing Architecture is
shown in FIG. 1. As shown on FIG. 1, the system 100 comprises an
input block 110, an output block 120, a data bus block 130, a
processing block 140 and a storage block 150. The input block 110
may comprise sensing devices such as Infrared (IR) sensors, Light
Detection and Ranging (LIDAR) sensors, Radio Detection and Ranging
(RADAR) sensors, visuals sensors, chemical sensors, bio-sensors,
olfactory sensors, and any other such sensors. The output block 120
may comprise devices that provide output signals to other receiving
elements. The output signals may include, but are not limited to,
visual indicators, electrical signals, mechanical actuation
signals, radio-frequency signals. The other receiving elements may
comprise elements such as machines, humans, or computing devices.
The processing block 140 may be configured to process fully
parallel analog data from the input block 110 via the data bus
block 130 to allow real time processing. The processing block 140
may comprise processing techniques such as Principal Component
Analysis (PCA), Independent Component Analysis (ICA), Neural
Network (NN), Genetic Algorithm (GA), or other techniques. In a
preferred embodiment of the present invention, the processing block
140 is capable of reconfiguration and adaptation as required when a
target is changing in a dynamic environment. Moreover, the
processing block 140 or portions of the processing block 140 may be
configured to a particular architecture as determined by processing
requirements, e.g. preprocessing, processing, static or adaptive
requirements. The storage block 150 may store knowledge of data
collected, instead of raw or information data as typically done
with current computing architectures, e.g. digital machine. When
required, knowledge data in the storage block 150 can be
reconstructed to obtain the raw data. This capability may be
considered as a cognitive approach.
[0029] An embodiment of the present invention comprises a Cognitive
Computing Architecture 200 which includes a Digital/Analog I/O Bus
210, Hybrid Intelligent Processor Array 220, and Static Random
Access Memory (SRAM) 230 is shown in FIG. 2. These elements are
described in additional detail below. In the architecture depicted
in FIG. 2, the I/O Bus 210 may be configured to receive digital or
analog data, but the Processor Array 220 is configured to operate
on analog data. Therefore a digital to analog converter 250 may be
used to convert digital input data into analog data for processing
by the Processor Array 220 or other portions of the architecture.
The architecture may also include an analog to knowledge element
240 that provides the ability to extract knowledge features (in a
digital representation) from an analog stream for storage in or
access by the SRAM 230.
[0030] As shown in FIG. 2, the I/O bus 210 is able to accommodate
both analog and digital signals. With this hybrid I/O scheme, the
Processor Array 220 can reduce I/O bottleneck and perform fully
parallel and high speed computation for adaptive learning and
execution operations, while maintaining a digital data format for
instruction sets and network reconfiguration. Moreover, the hybrid
I/O will be a natural retrofit to multiple sensory data types
(e.g., IR, visual data etc.). The I/O bus may accommodate several
streams of analog input data. In a preferred embodiment, the input
data bus may comprise 128 analog data lines, which, if a 10 bit
digital resolution were assumed, would comprise a 1280 bit
(128.times.10 bits) data bus.
[0031] The SRAM 230 may be to store the instruction codes, system
configurations, and/or digital data sets. It may also reconfigure
the network within the Processor Array 220 so that it can be
programmed to a particular set of processors suitable for a
particular application using a reconfigurable computing technique
(such as Field Programmable Array, System on a Chip, network
processor, etc.). The SRAM may also store feature data, not raw
data as in traditional approaches. Accesses to the SRAM would then
allow reconstruction of the raw information if needed. The form of
feature data is the knowledge as selected information and enables
the readiness of use as cognitive processing.
[0032] The Hybrid Intelligent Processor Array 220 is an adaptive
processor comprising a matrix of, preferably, 10-bit connection
strength (weight) and an array of processing units. Depending on
the data type(s) from one or multiple sources, one or several data
filters can be reconfigured to perform adaptive learning in real
time. The data rates of different types of sensory data do not
necessarily have to be the same. To perform the real time adaptive
learning, the instruction set is stored in SRAM and may run as fast
as the raw data rate in the sensor I/O channels. The Processor
Array 220 configuration allows for high speed, fully parallel, and
low power processing since sensory input signals are in analog,
fully parallel and asynchronous.
[0033] FIG. 3 shows an architecture for a Hybrid Intelligent
Processor Array 220 according to an embodiment of the present
invention. In this architecture, there are two blocks, A-block 310,
and B-block 320. The A-block 310 comprises a programmable switch
array 311, multiplication nodes 313, and transfer function elements
315. The A-block 310 receives data from the I/O bus that is routed
through the switch array 311 to the multiplication nodes 313. The
multiplication factors of each multiplication node 313 are defined
by matrix W.sub.1. The results are routed to transfer function
elements 315 implementing a transfer function F(.). In a preferred
embodiment of the present invention, the transfer function elements
315 comprise programmable analog filters. The transfer function
F(.) may comprise linear, sigmoidal, Gaussian functions or other
transfer functions. For example, to convert a signal in the time
domain to a frequency domain such as Discrete Cosine Transform
(DCT), a particular set of coefficient values are downloaded to the
W.sub.1 array, and the multiplication between the weight matrix and
the input vector will result in a conversion with the output
transfer function F(.) as a linear function. Moreover, the output
from F(.) can be fedback to the input via a set of switches
programmed in the SRAM 230. This feature allows users to be able to
manipulate block-A 310 in a high order nonlinear network as needed.
For example, block-A may be configured as cascading neural network
architecture (see, for example, S. E. Fahlman and C. Lebiere, "The
Cascade Correlation learning architecture," Advances in Neural
Information Processing Systems II, Ed: D. Touretzky, Morgan
Kaufmann, San Mateo, Calif., pp. 524-532, 1990 or 19. T. A. Duong
and Allen R. Stubberud, "Convergence Analysis Of Cascade Error
Projection--An Efficient Learning Algorithm For Hardware
Implementation," International Journal of Neural System, Vol. 10,
No. 3, pp. 199-210, June 2000) or a multi layered neural network
(see, for example, B. Widrow, "Generalization and information
storage in networks of ADALINE neurons," Ed: G. T. Yovitt,
"Self-Organizing Systems," Spartan Books, Washington D.C.,
1962)
[0034] The B-block 320 is cascaded with the A-block 310 in which
the output of the A-block 310 can be inputted into the B-block 320.
The array of the B-block 320 is similar to the A-block 310, where
the B-block 320 comprises a programmable switch array 321,
multiplication nodes 323, and transfer function elements 325. In a
preferred embodiment of the present invention, the transfer
function elements 325 comprise programmable analog filters.
However, the B-block 320 is preferably implemented with a cascading
architecture array rather than the squared array of the A-block
310. The transfer function G(.) is also a programmable function.
The output of the B-block 320 can be programmed to feedback to the
input of the B-block 320 and/or the input of the A-block 310. The
programmable switches 311, 321 allow the reconfiguration of the
Processor Array into a more complicated network as needed.
[0035] From block-A and block-B, each switch can be programmed to
become a multi-layered neural network (MLP), cascading neural
network (CNN), principle component analysis (PCA), independent
component analysis, least mean-squared network (LMS), DCT, FFT,
etc.
[0036] The multiplication nodes 313, 323 may be implemented as
10-bit Multiplying Digital to Analog Converters (MDACs). A block
diagram of an exemplary 10-bit MDAC 400 is shown in FIG. 4. The
MDAC 400 implements a multiplier 401 between analog input current
and a 10-bit digital weight D stored in SRAM 407, and the output of
the MDAC is in a current mode. Since the MDAC operates in a current
mode, multiple sources may provide current to the MDAC 400 and the
MDAC 400 may provide current to multiple devices. Such a capability
supports the multiple node arrays discussed above. Switches 403,
405 allow the reconfiguration of the MDAC 400 to load different
multiplication factors and can be controlled from a part of SRAM
block. This allows the system architecture to be flexible to
achieve various tasks. The 10-bit digital weight of the MDAC 400 is
provided by the elements of the W.sub.1 and W.sub.2 matrices. This
MDAC implementation provides that the processing array may be
compact, consume low power, have a relatively simple architecture,
and exhibit parallelism.
[0037] FIG. 5 shows an electrical schematic of an exemplary MDAC
500. In this design, the coefficient W.sub.N.sup.nk can be
digitized and stored into SRAM (D0-D9) and the digital signal input
is converted into current I.sub.in and 32*I.sub.in for optimizing
the design space. The multiplier is used to multiply W.sub.k and
I.sub.in and its result in the current mode is I.sub.out. In FIG.
5, the dotted block 510 is an SRAM design, which will take
Data.sub.in to D.sub.i when Load is high and store D.sub.i when
Load is low (and Load.sub.B is high).
[0038] As digital W.sub.k is written into the SRAM array and
I.sub.in is available, multiplication is accomplished by
conditionally scaling the input current I.sub.in by a series of
current mirrored transistors 511. For each current mirror, a pass
transistor 513 controlled by one bit from the SRAM which
conditionally allows the current to be placed on a common summation
line. The bits in the digital word from LSB to MSB are connected to
1, 2, 4 . . . and 512 current mirror transistors, so that the input
current is scaled by the appropriate amount. To optimize the number
of current mirror transistors (the total is 1023 current minor
transistors) for space and speed due to gate capacities of minor
transistors, the 10-bit multiply is split into two 5-bit multiplies
(two of 31 current mirror transistors) and each of them (two) is
taken through different input currents: I.sub.in or 32*I.sub.in.
The resulting summation current is unipolar. However, a current
steering differential transistor pair 521, controlled by a tenth
bit of the digital word, determines the direction of the current
output, such that two-quadrant multiplication is accomplished
(-1024 to 1024 levels).
[0039] In embodiments according to the present invention, data can
come directly from sensors in an analog form. Using an analog I/O
bus (such as that shown in FIG. 2), the I/O bus can digest
information flow as fast as the sampling rate of the sensors.
However, the raw data may expose some difficulties for reliable
processing. One technique often used is a preprocessing step and
this step can vary from one type of data to another. For example,
the principle component analysis can be a useful tool for visual
data by reducing the background, while still maintaining the
principle information for processing. In such a technique, block-A
310 as shown in FIG. 3 may be used to implement preprocessing. In
addition, the independent component analysis can be useful for
audio sensory data to reduce cross talk and background noise. When
the preprocessed filter is reconfigured and ready, a real time
adaptive learning for preprocessing will take place to adapt needed
information. When finished, processing units (which may be
implemented in block-B 320 as shown in FIG. 3) can be reconfigured
accordingly, and the learning of a suitable filter to solve the
problem will be accomplished. The learning iterations and
reconfiguration system can be coded as machine (i.e., instruction)
code to provide instructional steps for a particular application.
This machine will act as switching control to activate the
information flow path for adaptive learning step. Further, multiple
instantiations of the architecture shown in FIG. 3 may be cascaded
into a larger system as needed for other applications.
[0040] A feature of embodiments of the present invention is the
ability to dynamically reconfigure the architecture of the
Processor Array. That is, in some embodiments, the data received by
the Processor Array may result in the configuration of the switch
array, weights used by the multiplication nodes, or the transfer
functions of the transfer function elements being changed in
real-time. Hence, the Processor Array may be flexibly configured to
implement learning and be adaptive to the data received and the
processing to be provided. This flexibility is typically not
present in other digitally-based hardware platforms, such as Field
Programmable Gate Arrays, Digital Signal Processors,
microprocessors, etc.
[0041] As indicated above, the Processing Array may be considered
as providing preprocessing with block-A and processing with
block-B. The discussion below presents results that may be obtained
in using the Processing Array in various applications.
[0042] A first application is the use of block-A as a real time
adaptive PCA filter for feature extraction and data compression.
Two gray scale images were used: Elaine (as shown in FIG. 6A and
tank (as shown in FIG. 7A). The Elaine image consisted of
256.times.256 gray pixels and each pixel had 8-bit quantization.
The tank was a 512.times.512pixel image with 8-bit gray
scale/pixel.
[0043] An input vector as row data with 64 pixels/vector was used
to construct the training vector set. When the training vector set
is available, the algorithm as shown in EQ. 1 below is applied to
extract the principal vector. Analysis showed that the maximum
number of iterations required was 150 of learning repetitions and
the first 20 component vectors are extracted. Feature extraction
using PCA is a well known approach (see, for example, T. A. Duong
and Allen R. Stubberud, "Convergence Analysis Of Cascade Error
Projection--An Efficient Learning Algorithm For Hardware
Implementation," International Journal of Neural System, Vol. 10,
No. 3, pp. 199-210, June 2000) and is based on the most expressive
features (eigen vectors with the largest eigenvalues).
[0044] The first 10-component vector extracted from Elaine image
using the block-A architecture is projected onto the first
10-component from MATLAB (inner product) and its results are shown
in FIG. 6D. The first 10-component vector extracted from tank image
using MATLAB and the block-A architecture and the projection
between principal vectors are shown in FIG. 7D. As orthogonal
characteristics between principal vectors, if the learning
component vector and the component vector from MATLAB are the same
order and identical (or close to identical), the expected inner
product should be close to +/-1; otherwise, it should be close to
zero. As shown in FIG. 6D and FIG. 7D, there are ten values (+/-1)
and the rest (70 values are close to zero) from which it may be
concluded that the block-A architecture can extract the feature
vector with the same results as the MATLAB technique.
[0045] A comparison of feature extraction and image reconstruction
between a known MATLAB technique and use of the block-A configured
for PCA was also performed. In this case, the first 20-component
vector from full set of 64 component vectors was extracted. The
full image was then constructed from the extracted first 20
component principal vector. FIG. 6B shows the Elaine image and FIG.
7B shows the Tank image constructed using the MATLAB technique.
FIG. 6C shows the Elaine image and FIG. 7C shows the Tank image
constructed using the block-A configured for PCA. These images show
that the block-A implementation provides results similar to known
techniques, but, given the high speed nature of the architecture,
at a higher speed.
[0046] Another application is to configure block-B for color
segmentation. A challenging aspect in color segmentation is when
the light intensity and resolution are dynamically changing. It is
easily recognized that the initial knowledge used to train the
network will have very little effect at a new location and
therefore will need to be updated through learning of newly
extracted data.
[0047] Color segmentation may be applied to a spacecraft guidance
application. In this case, the adaptation process that can aid in
spacecraft guidance may be described as follows: when the network
that has acquired current knowledge at time t.sub.0 is used to test
the subsequent image at time t.sub.0+.DELTA.t, segmentation results
from the image at t.sub.0+.DELTA.t will be used to extract the
training set to update the previous knowledge of the network at
time t.sub.0. This process of repeatedly segmenting and updating is
performed until a spacecraft attempting a landing on a surface
reaches its destination.
[0048] While the process of segmenting and updating are desired
characteristics of an adaptive processor, one issue is how often
such updates are necessary. The frequency of updates has a direct
impact on power consumption. More power is consumed if updates are
performed between each sequential image. The problem with
infrequent updates, however, is that the network may not
interpolate easily based upon new images from which the newly
segmented data may be insufficient for training. To find the
optimal sampling rate in a landing application, .DELTA.t should be
"sufficiently small" and will depend upon the landing velocity and
other environmental changes. The sampling rate becomes significant
in the actual design and development of a spacecraft landing
system.
[0049] To judge the applicability of the Hybrid Processor Array
described above to a color segmentation environment that may be
presented in a spacecraft landing environment, a simulation study
was conducted based upon images synthetically derived at JPL. A
digital camera was used to obtain a set of images at different
times of the day, namely, at 3:00 PM, 4:00 PM, and 5:00 PM, and at
different locations within the MARS YARD at the Jet Propulsion
Laboratory. These images were used to validate the neural network
performance with and without adaptive capabilities. In addition,
images were acquired at 15 second intervals to capture effects of
light intensity, contrast, and resolution for adaptive learning
step.
[0050] To classify each pixel, a pixel to be classified and its
immediate neighbors were used to form a 3.times.3 sub-window as the
input training pattern (thus each input pattern has 27 elements
from 3.times.3 of RGB pixel). Based on a previous study, the
3.times.3 RGB input pattern was found to be the optimal size in
comparison to using a single RGB input, a 5.times.5 RGB sub-window,
or a 7.times.7 RGB sub-window. In this study, the objective was to
segment the image into three groups: "Rock1", "Rock2", and "Sand".
The topology of the studied network was a 27.times.5.times.3
cascading architecture neural network, having 27 inputs, 5 cascaded
hidden units, and 3 output units.
[0051] FIG. 8A shows a 3:00 PM image. From this image, 408 patterns
for training data, 588 patterns for cross validation, and 1200
patterns for testing were sampled and collected. With these sample
sets, the learning was completed with 91% correct in training, 90%
correct in validation, and 91% correct in testing. After training
was performed, the segmented output of the original image of FIG.
8A is shown in FIG. 8B.
[0052] With the knowledge acquired from FIG. 8A, the network was
tested with the image input shown in FIG. 9A, which was collected
at 4:00 PM. The output result is shown in FIG. 9B (no intermediate
adaptive step was performed). FIG. 9C is the output result with the
network acquired from the intermediate knowledge through adaptive
learning.
[0053] In a similar manner, the original image shown in FIG. 10A
was collected at 5 PM. FIG. 10B is the segmented image with the
previous training set at 4 PM and FIG. 10C is the segmented image
with an intermediate adaptive step.
[0054] Based on the aforementioned results, it may be concluded
that the adaptive technique is needed to obtain better segmentation
when the environment is changing rapidly. For a MARS landing
application, the lander would be moving with a velocity towards its
landing surface. This would require an increased frequency of
updates as the lander approaches the landing site. As indicated,
embodiments of the present invention may support the adaptive
technique that provides for the increased frequency of updates for
the MARS lander application.
[0055] Still another application of an embodiment of the present
invention is dynamic detection and tracking. Such an application
may be based on the adaptive shape feature detection technique as
U.S. patent application Ser. No. 11/498,531 filed Aug. 1, 2006
titled "Artificial Intelligence Systems for Identifying Objects."
Such a technique may be considered as more robust than local
tracking approaches, e.g., particular local pixel group.
[0056] FIG. 11B shows an object present in FIG. 11A that is to be
detected and tracked in subsequent images. Given the initial image
in FIG. 11D, the shape feature of the object may be processed using
in the preprocessing block (i.e., Block A) of the Hybrid Processor
Array discussed above for detection and tracking. When processing
of the images of FIG. 11A and FIG. 11B is completed, the result is
tested in the scene of shown in FIG. 12A and autonomously found the
similar object shown in FIG. 12B. The object shown in FIG. 12B was
adapted for a current dynamic and used to test FIG. 13A and
autonomously found the similar object shown in FIG. 13B from the
object shown in FIG. 12B. Remaining FIGS. 14A through 19B show the
detection and tracking of the objects shown in FIGS. 14B, 15B, 16B,
17B, 18B, and 19B from the scenes shown in FIGS. 14A, 15A, 16A,
17A, 18A, and 19A. From these figures, it can be seen that the
Hybrid Processor Array may successfully detect the moving car in
the dynamic scene.
[0057] In another application of an embodiment of the present
invention, analysis of sensor responses may be employed for the
detection of chemical compounds in an open and changing
environment, such as a building or a geographical area where air
exchange is not controlled and limited. To search for a chemical
compound to determine if it exists in the operating environment, a
Spatial Invariant Independent Component Analysis (SPICA), such as
that described in T. A. Duong, M. Ryan and V. A. Duong, "Smart
ENose for Detection of Selective Chemicals in an Unknown
Environment." Special issue of Journal of Advanced Computational
Intelligence and Intelligent Informatics, Vol. 11, No. 10, pp.
1197-1203, 07, may be used to separate and detect the mixtures in
the open environment.
[0058] SPICA can be embedded in an embodiment of the Hybrid
Processor Array described above using both Blocks A and B. FIG. 20A
shows the mixing of two unknown odorants at unknown concentrations.
FIG. 20B shows the odorants detected by the SPICA processing
performed by an embodiment of the present invention. FIG. 20C shows
the actual odorant composition of these mixtures. Hence, it may be
concluded that an embodiment of the present invention may be used
for detecting chemicals in an unknown environment.
[0059] No limitation is intended by the description of exemplary
embodiments which may have included tolerances, feature dimensions,
specific operating conditions, engineering specifications, or the
like, and which may vary between implementations or with changes to
the state of the art, and no limitation should be implied
therefrom. In particular it is to be understood that the
disclosures are not limited to particular compositions or
biological systems, which can, of course, vary. This disclosure has
been made with respect to the current state of the art, but also
contemplates advancements and that adaptations in the future may
take into consideration of those advancements, namely in accordance
with the then current state of the art. It is intended that the
scope of the invention be defined by the Claims as written and
equivalents as applicable. It is also to be understood that the
terminology used herein is for the purpose of describing particular
embodiments only, and is not intended to be limiting. Reference to
a claim element in the singular is not intended to mean "one and
only one" unless explicitly so stated. As used in this
specification and the appended claims, the singular forms "a,"
"an," and "the" include plural referents unless the content clearly
dictates otherwise. The term "several" includes two or more
referents unless the content clearly dictates otherwise. Unless
defined otherwise, all technical and scientific terms used herein
have the same meaning as commonly understood by one of ordinary
skill in the art to which the disclosure pertains.
[0060] Moreover, no element, component, nor method or process step
in this disclosure is intended to be dedicated to the public
regardless of whether the element, component, or step is explicitly
recited in the Claims. No claim element herein is to be construed
under the provisions of 35 U.S.C. Sec. 112, sixth paragraph, unless
the element is expressly recited using the phrase "means for . . .
" and no method or process step herein is to be construed under
those provisions unless the step, or steps, are expressly recited
using the phrase "comprising step(s) for . . . "
[0061] A number of embodiments of the disclosure have been
described. Nevertheless, it will be understood that various
modifications may be made without departing from the spirit and
scope of the present disclosure. Accordingly, other embodiments are
within the scope of the following claims.
* * * * *